Class SingleKernelFitness<T extends Comparable<T>>
Type Parameters:
T - the fitness value type, must be Comparable for optimization algorithms
SingleKernelFitness provides a comprehensive framework for implementing fitness evaluation using a single OpenCL kernel. It manages the complete lifecycle of GPU computation including data loading, kernel execution, and result extraction, making it suitable for most GPU-accelerated evolutionary algorithm scenarios.
Key features:
- Single kernel execution: Executes one OpenCL kernel per fitness evaluation
- Data management: Handles static data, dynamic data, and result allocation
- Memory lifecycle: Automatic cleanup of OpenCL memory objects
- Multi-device support: Supports concurrent execution across multiple devices
- Local memory: Configurable local memory allocation for kernel optimization
Data flow architecture:
- Static data: Algorithm parameters loaded once before all evaluations
- Dynamic data: Population data loaded before each generation
- Local memory: Work group local memory allocated based on kernel requirements
- Result data: Output buffers allocated for fitness results and intermediate data
Typical usage pattern:
    // Requires java.util.stream.Collectors and java.util.stream.IntStream

    // Define kernel and data configuration
    SingleKernelFitnessDescriptor descriptor = SingleKernelFitnessDescriptor.builder()
        .kernelName("fitness_evaluation")
        .addDataLoader(0, populationDataLoader)
        .addStaticDataLoader(1, parametersDataLoader)
        .addResultAllocator(2, fitnessResultAllocator)
        .kernelExecutionContextComputer(executionContextComputer)
        .build();

    // Define fitness extraction from GPU results.
    // Arrays.stream has no float[] overload, so the values are boxed by index.
    FitnessExtractor<Double> extractor = (context, kernelCtx, executor, generation, genotypes, results) -> {
        float[] fitnessValues = results.extractFloatArray(context, 2);
        return IntStream.range(0, fitnessValues.length)
            .mapToObj(i -> Double.valueOf(fitnessValues[i]))
            .collect(Collectors.toList());
    };

    // Create single kernel fitness evaluator
    SingleKernelFitness<Double> fitness = SingleKernelFitness.of(descriptor, extractor);
Kernel execution workflow:
- Initialization: Load static data once before all evaluations
- Data preparation: Load generation-specific data and allocate result buffers
- Kernel setup: Configure kernel arguments with data references
- Execution: Launch kernel with optimized work group configuration
- Result extraction: Extract fitness values from GPU memory
- Cleanup: Release generation-specific memory resources
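The ordering of these steps can be illustrated with a small, self-contained sketch. It does not use OpenCL or the genetics4j types: LifecycleSketch is a hypothetical stand-in whose methods borrow the hook names of this class and only record when they run. The mapping of workflow steps to hooks is an assumption drawn from the hook descriptions further down this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: records the order in which the lifecycle hooks run.
// No OpenCL call is made; each hook only appends a line to the log.
class LifecycleSketch {
    final List<String> log = new ArrayList<>();

    void beforeAllEvaluations() {
        log.add("beforeAllEvaluations: load static data");
    }

    void beforeEvaluation(long generation) {
        log.add("generation " + generation + " beforeEvaluation: load data, allocate results");
    }

    void compute(long generation) {
        log.add("generation " + generation + " compute: set arguments, run kernel, extract fitness");
    }

    void afterEvaluation(long generation) {
        log.add("generation " + generation + " afterEvaluation: release data and results");
    }

    void afterAllEvaluations() {
        log.add("afterAllEvaluations: release static data");
    }

    // Drives the hooks in the order described by the workflow above.
    static List<String> run(int generations) {
        LifecycleSketch fitness = new LifecycleSketch();
        fitness.beforeAllEvaluations();
        for (long generation = 0; generation < generations; generation++) {
            fitness.beforeEvaluation(generation);
            fitness.compute(generation);
            fitness.afterEvaluation(generation);
        }
        fitness.afterAllEvaluations();
        return fitness.log;
    }

    public static void main(String[] args) {
        run(2).forEach(System.out::println);
    }
}
```

Static data is touched exactly twice regardless of the number of generations, while the three per-generation hooks repeat for every evaluation.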
Memory management strategy:
- Static data persistence: Static data remains allocated across generations
- Dynamic allocation: Generation data is allocated and released per evaluation
- Result buffer reuse: Result buffers can be reused with proper sizing
- Automatic cleanup: Memory is automatically released in lifecycle methods
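A minimal sketch of this bookkeeping follows, under stated assumptions: devices are identified by a plain string, Buffer is a hypothetical placeholder for an OpenCL memory object, and the per-device maps keyed by kernel argument index are an illustrative layout, not necessarily the one used by this class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping sketch. Buffer stands in for an OpenCL memory object
// and only tracks whether it has been released.
class MemoryLifecycleSketch {
    static final class Buffer {
        final int sizeInBytes;
        boolean released = false;

        Buffer(int sizeInBytes) {
            this.sizeInBytes = sizeInBytes;
        }

        void release() {
            released = true;
        }
    }

    // device name -> kernel argument index -> buffer
    final Map<String, Map<Integer, Buffer>> staticData = new HashMap<>();
    final Map<String, Map<Integer, Buffer>> data = new HashMap<>();

    // Allocates a buffer and registers it under the given device and argument index.
    Buffer allocate(Map<String, Map<Integer, Buffer>> target, String device, int argumentIndex, int sizeInBytes) {
        Buffer buffer = new Buffer(sizeInBytes);
        target.computeIfAbsent(device, d -> new HashMap<>()).put(argumentIndex, buffer);
        return buffer;
    }

    // Releases every buffer registered for the device and forgets them.
    void clear(Map<String, Map<Integer, Buffer>> target, String device) {
        Map<Integer, Buffer> buffers = target.remove(device);
        if (buffers != null) {
            buffers.values().forEach(Buffer::release);
        }
    }

    public static void main(String[] args) {
        MemoryLifecycleSketch memory = new MemoryLifecycleSketch();
        Buffer parameters = memory.allocate(memory.staticData, "gpu0", 1, 64);
        Buffer population = memory.allocate(memory.data, "gpu0", 0, 1024);

        memory.clear(memory.data, "gpu0"); // end of one generation
        System.out.println("population released: " + population.released);
        System.out.println("parameters released: " + parameters.released);

        memory.clear(memory.staticData, "gpu0"); // end of all evaluations
        System.out.println("parameters released: " + parameters.released);
    }
}
```

Keeping static and generation data in separate maps lets the per-generation cleanup release everything it owns without having to know which buffers must survive.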
Performance optimization features:
- Asynchronous execution: Kernel execution returns CompletableFuture for pipeline processing
- Work group optimization: Configurable work group sizes for optimal device utilization
- Memory coalescing: Support for optimized memory access patterns
- Local memory utilization: Efficient use of device local memory for performance
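To make the asynchronous point concrete, here is a hedged sketch of how CompletableFuture results from two devices could be combined while keeping population order. evaluateOnDevice is a hypothetical placeholder that squares its inputs on a worker thread instead of launching a kernel, and the two-way split is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncPartitionSketch {
    // Hypothetical stand-in for a kernel launch on one device: squares each input.
    static CompletableFuture<List<Double>> evaluateOnDevice(List<Double> partition, ExecutorService executor) {
        return CompletableFuture.supplyAsync(() -> {
            List<Double> fitness = new ArrayList<>(partition.size());
            for (double value : partition) {
                fitness.add(value * value);
            }
            return fitness;
        }, executor);
    }

    // Splits the population in two, evaluates both halves concurrently, and
    // concatenates the results so they keep the original population order.
    static List<Double> evaluate(List<Double> population, ExecutorService executor) {
        int middle = population.size() / 2;
        CompletableFuture<List<Double>> first = evaluateOnDevice(population.subList(0, middle), executor);
        CompletableFuture<List<Double>> second =
                evaluateOnDevice(population.subList(middle, population.size()), executor);
        return first.thenCombine(second, (a, b) -> {
            List<Double> all = new ArrayList<>(a);
            all.addAll(b);
            return all;
        }).join();
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            System.out.println(evaluate(List.of(1.0, 2.0, 3.0, 4.0), executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the futures are combined in partition order rather than completion order, the faster device finishing first does not reorder the fitness values.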
Field Summary
Fields (modifier and type, then field name):
private final FitnessExtractor<T> fitnessExtractor
private final Map<Device, KernelExecutionContext> kernelExecutionContexts
static final org.apache.logging.log4j.Logger logger
private final SingleKernelFitnessDescriptor singleKernelFitnessDescriptor
Constructor Summary
SingleKernelFitness(SingleKernelFitnessDescriptor _singleKernelFitnessDescriptor, FitnessExtractor<T> _fitnessExtractor)
    Constructs a SingleKernelFitness with the specified kernel descriptor and fitness extractor.
Method Summary
void afterAllEvaluations(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService)
    Per-device cleanup hook called for each OpenCL execution context at the end.
void afterEvaluation(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
    Per-device cleanup hook called after each device partition evaluation.
void beforeAllEvaluations(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService)
    Per-device initialization hook called for each OpenCL execution context.
void beforeEvaluation(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
    Per-device preparation hook called before each device partition evaluation.
protected void clearData
protected void clearResultData(Device device)
protected void clearStaticData(Device device)
CompletableFuture<List<T>> compute(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
    Performs the actual fitness computation using OpenCL kernels on the GPU.
static <U extends Comparable<U>> SingleKernelFitness<U> of(SingleKernelFitnessDescriptor singleKernelFitnessDescriptor, FitnessExtractor<U> fitnessExtractor)
    Creates a new SingleKernelFitness instance with the specified configuration.

Methods inherited from class net.bmahe.genetics4j.gpu.spec.fitness.OpenCLFitness
afterAllEvaluations, afterEvaluation, beforeAllEvaluations, beforeEvaluation
Field Details
logger
public static final org.apache.logging.log4j.Logger logger
singleKernelFitnessDescriptor
fitnessExtractor
staticData
data
resultData
kernelExecutionContexts
Constructor Details
SingleKernelFitness
public SingleKernelFitness(SingleKernelFitnessDescriptor _singleKernelFitnessDescriptor, FitnessExtractor<T> _fitnessExtractor)
Constructs a SingleKernelFitness with the specified kernel descriptor and fitness extractor.
Parameters:
_singleKernelFitnessDescriptor - configuration for kernel execution and data management
_fitnessExtractor - function to extract fitness values from GPU computation results
Throws:
IllegalArgumentException - if any parameter is null
Method Details
clearStaticData
clearData
clearResultData
beforeAllEvaluations
public void beforeAllEvaluations(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService)
Description copied from class: OpenCLFitness
Per-device initialization hook called for each OpenCL execution context.
This method is called once for each OpenCL device that will be used for fitness evaluation. It allows device-specific initialization such as memory allocation, buffer creation, and device-specific resource setup.
Typical use cases:
- Allocate GPU memory buffers that persist across generations
- Pre-load static data to GPU memory
- Initialize device-specific data structures
- Set up device-specific kernels or configurations
Memory allocated in this method should typically be released in the corresponding OpenCLFitness.afterAllEvaluations(OpenCLExecutionContext, ExecutorService) method.
Overrides:
beforeAllEvaluations in class OpenCLFitness<T extends Comparable<T>>
Parameters:
openCLExecutionContext - the OpenCL execution context for a specific device
executorService - the executor service for asynchronous operations
beforeEvaluation
public void beforeEvaluation(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
Description copied from class: OpenCLFitness
Per-device preparation hook called before each device partition evaluation.
This method is called for each device before evaluating its assigned partition of the population. It provides access to the device context and the specific genotypes that will be evaluated on this device.
Typical use cases:
- Transfer genotype data to device memory
- Update device-specific parameters for this generation
- Prepare input buffers with population data
- Set up kernel arguments that vary by generation
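As an illustration of preparing an input buffer with population data, the sketch below packs a population into one contiguous float array before it would be copied to the device. It is a sketch under assumptions: genotypes are represented as plain double[] chromosomes of equal length, and GenotypeBufferSketch and flatten are hypothetical names, not part of this library.

```java
import java.util.Arrays;

class GenotypeBufferSketch {
    // Packs one chromosome per genotype into a single row-major float array,
    // the layout a kernel would index as genotypeIndex * chromosomeLength + gene.
    static float[] flatten(double[][] population, int chromosomeLength) {
        float[] packed = new float[population.length * chromosomeLength];
        for (int i = 0; i < population.length; i++) {
            if (population[i].length != chromosomeLength) {
                throw new IllegalArgumentException(
                        "Genotype " + i + " has length " + population[i].length
                                + " but expected " + chromosomeLength);
            }
            for (int j = 0; j < chromosomeLength; j++) {
                // Narrowing to float matches the float-based kernels assumed here
                packed[i * chromosomeLength + j] = (float) population[i][j];
            }
        }
        return packed;
    }

    public static void main(String[] args) {
        double[][] population = { { 1.0, 2.0 }, { 3.0, 4.0 } };
        System.out.println(Arrays.toString(flatten(population, 2)));
    }
}
```

A flat layout keeps the host-to-device copy to a single transfer and lets each work item locate its genotype with one multiplication.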
Overrides:
beforeEvaluation in class OpenCLFitness<T extends Comparable<T>>
Parameters:
openCLExecutionContext - the OpenCL execution context for this device
executorService - the executor service for asynchronous operations
generation - the current generation number (0-based)
genotypes - the partition of genotypes to be evaluated on this device
compute
public CompletableFuture<List<T>> compute(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
Description copied from class: OpenCLFitness
Performs the actual fitness computation using OpenCL kernels on the GPU.
This is the core method that implements GPU-based fitness evaluation. It receives a partition of the population and must return corresponding fitness values using OpenCL kernel execution on the specified device.
Implementation requirements:
- Return order: Fitness values must correspond to genotypes in the same order
- Size consistency: Return exactly one fitness value per input genotype
- Asynchronous execution: Use the executor service for non-blocking GPU operations
- Error handling: Handle GPU errors gracefully and provide meaningful exceptions
Common implementation pattern:
- Data transfer: Copy genotype data to GPU memory
- Kernel setup: Configure kernel arguments and work group parameters
- Kernel execution: Launch OpenCL kernels for fitness computation
- Result retrieval: Read fitness values from GPU memory
- Data conversion: Convert GPU results to appropriate fitness type
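The requirements and pattern above can be sketched without any GPU dependency. In this hedged example the kernel parameter is a hypothetical function standing in for data transfer, kernel launch, and read-back; genotypes are plain float[] arrays; and the size check shows one possible way to enforce the one-value-per-genotype rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class ComputeContractSketch {
    // 'kernel' is a hypothetical stand-in for transfer + kernel launch + read-back.
    static CompletableFuture<List<Double>> compute(List<float[]> genotypes,
            Function<List<float[]>, float[]> kernel, ExecutorService executor) {
        return CompletableFuture.supplyAsync(() -> {
            float[] raw = kernel.apply(genotypes);
            // Size consistency: exactly one fitness value per input genotype
            if (raw.length != genotypes.size()) {
                throw new IllegalStateException(
                        "Expected " + genotypes.size() + " fitness values but got " + raw.length);
            }
            // Data conversion: raw[i] stays paired with genotypes.get(i)
            List<Double> fitness = new ArrayList<>(raw.length);
            for (float value : raw) {
                fitness.add((double) value);
            }
            return fitness;
        }, executor);
    }

    // Example stand-in kernel: the fitness of a genotype is the sum of its genes.
    static float[] sumKernel(List<float[]> genotypes) {
        float[] result = new float[genotypes.size()];
        for (int i = 0; i < genotypes.size(); i++) {
            for (float gene : genotypes.get(i)) {
                result[i] += gene;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            List<float[]> genotypes = List.of(new float[] { 1f, 2f }, new float[] { 3f, 4f });
            System.out.println(compute(genotypes, ComputeContractSketch::sumKernel, executor).join());
        } finally {
            executor.shutdown();
        }
    }
}
```

A failure inside the asynchronous task surfaces to the caller as a CompletionException whose cause is the original exception, which keeps the error message attached to the failing evaluation.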
Specified by:
compute in class OpenCLFitness<T extends Comparable<T>>
Parameters:
openCLExecutionContext - the OpenCL execution context providing device access
executorService - the executor service for asynchronous operations
generation - the current generation number for context
genotypes - the genotypes to evaluate on this device
Returns:
a CompletableFuture that will complete with fitness values for each genotype
afterEvaluation
public void afterEvaluation(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
Description copied from class: OpenCLFitness
Per-device cleanup hook called after each device partition evaluation.
This method is called for each device after its partition evaluation completes, providing an opportunity for device-specific cleanup and resource management.
Typical use cases:
- Clean up temporary GPU memory allocations
- Log device-specific performance metrics
- Update device-specific statistics or state
- Perform device-specific validation or debugging
Overrides:
afterEvaluation in class OpenCLFitness<T extends Comparable<T>>
Parameters:
openCLExecutionContext - the OpenCL execution context for this device
executorService - the executor service for asynchronous operations
generation - the current generation number (0-based)
genotypes - the partition of genotypes that were evaluated on this device
afterAllEvaluations
public void afterAllEvaluations(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService)
Description copied from class: OpenCLFitness
Per-device cleanup hook called for each OpenCL execution context at the end.
This method is called once for each OpenCL device when fitness evaluation is complete, providing an opportunity to clean up device-specific resources that were allocated in OpenCLFitness.beforeAllEvaluations(OpenCLExecutionContext, ExecutorService).
Typical use cases:
- Release GPU memory buffers and resources
- Clean up device-specific data structures
- Log device-specific performance summaries
- Ensure no GPU memory leaks occur
This method should ensure proper cleanup even if exceptions occurred during evaluation, as it may be the only opportunity to prevent resource leaks.
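One way to meet this requirement is to release every resource independently, so that a failure on one buffer does not leave the others allocated. The following is a minimal sketch, assuming resources can be modeled as AutoCloseable; CleanupSketch and releaseAll are hypothetical names and not part of this class.

```java
import java.util.Arrays;
import java.util.List;

class CleanupSketch {
    // Closes every resource. A failure on one resource is recorded and does not
    // prevent the remaining resources from being closed.
    static void releaseAll(List<? extends AutoCloseable> resources) {
        RuntimeException failure = null;
        for (AutoCloseable resource : resources) {
            try {
                if (resource != null) { // tolerate partially initialized state
                    resource.close();
                }
            } catch (Exception e) {
                if (failure == null) {
                    failure = new RuntimeException("Cleanup failed for at least one resource");
                }
                failure.addSuppressed(e);
            }
        }
        if (failure != null) {
            throw failure;
        }
    }

    public static void main(String[] args) {
        AutoCloseable first = () -> System.out.println("released first buffer");
        AutoCloseable broken = () -> {
            throw new IllegalStateException("release failed");
        };
        AutoCloseable last = () -> System.out.println("released last buffer");
        try {
            releaseAll(Arrays.asList(first, broken, last));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage() + ", suppressed: " + e.getSuppressed().length);
        }
    }
}
```

Collecting failures as suppressed exceptions reports every release error once the loop has finished, instead of stopping at the first one.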
Overrides:
afterAllEvaluations in class OpenCLFitness<T extends Comparable<T>>
Parameters:
openCLExecutionContext - the OpenCL execution context for this device
executorService - the executor service for asynchronous operations
of
public static <U extends Comparable<U>> SingleKernelFitness<U> of(SingleKernelFitnessDescriptor singleKernelFitnessDescriptor, FitnessExtractor<U> fitnessExtractor)
Creates a new SingleKernelFitness instance with the specified configuration.
Type Parameters:
U - the fitness value type
Parameters:
singleKernelFitnessDescriptor - configuration for kernel execution and data management
fitnessExtractor - function to extract fitness values from GPU computation results
Returns:
a new SingleKernelFitness instance
Throws:
IllegalArgumentException - if any parameter is null