Class OpenCLFitness<T extends Comparable<T>>
- Type Parameters:
T - the type of fitness values produced; must be comparable for selection operations
- Direct Known Subclasses:
MultiStageFitness, SingleKernelFitness
OpenCLFitness provides the framework for evaluating population fitness using OpenCL kernels executed on GPU devices. This class defines the lifecycle and coordination patterns needed for efficient GPU-based fitness computation, including resource management, data transfer, and kernel execution orchestration.
The fitness evaluation lifecycle consists of several phases:
- Global initialization: One-time setup before any evaluations (beforeAllEvaluations())
- Per-device initialization: Setup for each OpenCL device context
- Generation setup: Preparation before each generation evaluation
- Computation: Actual fitness evaluation using OpenCL kernels
- Generation cleanup: Cleanup after each generation evaluation
- Per-device cleanup: Cleanup for each OpenCL device context
- Global cleanup: Final cleanup after all evaluations (afterAllEvaluations())
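The ordering of these phases can be made concrete with a small trace. The sketch below is not genetics4j code: `LifecycleSketch`, `runLifecycle`, and the device names are hypothetical, and the exact interleaving (in particular, visiting devices one after another rather than concurrently) is an assumption inferred from the hook descriptions on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that records the order in which lifecycle hooks fire.
// It mirrors the hook names of OpenCLFitness but never touches OpenCL.
final class LifecycleSketch {

    static List<String> runLifecycle(List<String> devices, int generations) {
        List<String> events = new ArrayList<>();

        // Global initialization, then one initialization per device
        events.add("beforeAllEvaluations()");
        for (String device : devices) {
            events.add("beforeAllEvaluations(" + device + ")");
        }

        for (int generation = 0; generation < generations; generation++) {
            // Global generation setup, per-device work, global generation cleanup
            events.add("beforeEvaluation(gen=" + generation + ")");
            for (String device : devices) {
                events.add("beforeEvaluation(" + device + ", gen=" + generation + ")");
                events.add("compute(" + device + ", gen=" + generation + ")");
                events.add("afterEvaluation(" + device + ", gen=" + generation + ")");
            }
            events.add("afterEvaluation(gen=" + generation + ")");
        }

        // Per-device cleanup, then global cleanup
        for (String device : devices) {
            events.add("afterAllEvaluations(" + device + ")");
        }
        events.add("afterAllEvaluations()");
        return events;
    }

    public static void main(String[] args) {
        runLifecycle(List.of("gpu0", "gpu1"), 1).forEach(System.out::println);
    }
}
```

In this trace the global hooks come first and last, with the per-device hooks nested inside them.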
Key responsibilities for implementations:
- Data preparation: Convert genotypes to GPU-compatible data formats
- Memory management: Allocate and manage GPU memory buffers
- Kernel execution: Configure and execute OpenCL kernels with appropriate parameters
- Result extraction: Retrieve and convert fitness values from GPU memory
- Resource cleanup: Ensure proper cleanup of GPU resources
Common implementation patterns:
public class MyGPUFitness extends OpenCLFitness<Double> {

    private CLData inputBuffer;
    private CLData outputBuffer;

    @Override
    public void beforeAllEvaluations(OpenCLExecutionContext context, ExecutorService executor) {
        // Allocate GPU memory buffers that persist across generations
        int maxPopulationSize = getMaxPopulationSize();
        inputBuffer = CLData.allocateFloat(context, maxPopulationSize * chromosomeSize);
        outputBuffer = CLData.allocateFloat(context, maxPopulationSize);
    }

    @Override
    public CompletableFuture<List<Double>> compute(OpenCLExecutionContext context,
            ExecutorService executor, long generation, List<Genotype> genotypes) {
        return CompletableFuture.supplyAsync(() -> {
            // Transfer genotype data to GPU
            transferGenotypesToGPU(context, genotypes, inputBuffer);
            // Execute fitness evaluation kernel
            executeKernel(context, "fitness_kernel", genotypes.size());
            // Retrieve results from GPU
            return extractFitnessValues(context, outputBuffer, genotypes.size());
        }, executor);
    }

    @Override
    public void afterAllEvaluations(OpenCLExecutionContext context, ExecutorService executor) {
        // Clean up GPU memory
        inputBuffer.release();
        outputBuffer.release();
    }
}
Performance optimization strategies:
- Memory reuse: Allocate buffers once in beforeAllEvaluations(OpenCLExecutionContext, ExecutorService) and reuse them across generations
- Asynchronous execution: Use CompletableFuture for non-blocking GPU operations
- Batch processing: Process entire populations in single kernel launches
- Memory coalescing: Organize data layouts for optimal GPU memory access patterns
- Kernel optimization: Design kernels to maximize GPU utilization and minimize divergence
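As an illustration of the memory coalescing point, one common layout stores gene j of every individual contiguously, so that neighbouring work-items (one per individual) read neighbouring addresses. This is a general GPU technique rather than something this class prescribes. `CoalescedLayout` and its methods are hypothetical names, and the sketch assumes a rectangular population in which every individual has the same number of genes.

```java
// Hypothetical helper: converts an individual-major population into a
// gene-major ("structure of arrays") buffer suitable for coalesced reads.
final class CoalescedLayout {

    // Position of gene `gene` of individual `individual` in the gene-major buffer.
    static int index(int individual, int gene, int populationSize) {
        return gene * populationSize + individual;
    }

    // Assumes every row of `population` has the same length.
    static float[] toGeneMajor(float[][] population) {
        int populationSize = population.length;
        int chromosomeSize = populationSize == 0 ? 0 : population[0].length;
        float[] out = new float[populationSize * chromosomeSize];
        for (int i = 0; i < populationSize; i++) {
            for (int j = 0; j < chromosomeSize; j++) {
                out[index(i, j, populationSize)] = population[i][j];
            }
        }
        return out;
    }
}
```

A kernel reading this layout would fetch gene j of individual i from offset `j * populationSize + i`.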
Error handling and robustness:
- GPU errors: Handle OpenCL errors gracefully and provide meaningful error messages
- Memory management: Ensure proper cleanup even in exceptional circumstances
- Device failures: Support graceful degradation when GPU devices fail
- Timeout handling: Implement appropriate timeouts for long-running kernels
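For the timeout point, one possible approach is to wrap the future returned by compute with `CompletableFuture.orTimeout` (available since Java 9) and translate failures into an exception that names the device. `KernelTimeoutSketch`, `withTimeout`, and the `deviceName` parameter are hypothetical. Note that `orTimeout` completes the future it is called on and does not stop a kernel that is already running on the device.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical wrapper around a fitness future; not part of genetics4j.
final class KernelTimeoutSketch {

    static <T> CompletableFuture<List<T>> withTimeout(
            CompletableFuture<List<T>> evaluation, long timeout, TimeUnit unit, String deviceName) {
        return evaluation
                // Fails the future with a TimeoutException if it is still pending after the delay
                .orTimeout(timeout, unit)
                .exceptionally(error -> {
                    // Failures raised inside async stages arrive wrapped in CompletionException
                    Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                            ? error.getCause()
                            : error;
                    String reason = (cause instanceof TimeoutException)
                            ? "timed out after " + timeout + " " + unit
                            : "failed: " + cause.getMessage();
                    throw new IllegalStateException(
                            "Fitness evaluation on " + deviceName + " " + reason, cause);
                });
    }
}
```

Callers that `join()` the returned future see a `CompletionException` whose cause is the `IllegalStateException` built above.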
Multi-device considerations:
- Device-specific setup: Separate contexts and buffers for each device
- Load balancing: Coordinate with the framework's automatic population partitioning
- Resource isolation: Ensure proper isolation of resources between devices
- Synchronization: Coordinate results from multiple devices
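This page does not describe how the framework partitions the population, so the following is only an assumed scheme: a contiguous, near-even split whose per-device results are concatenated in partition order, so that fitness values line up with the original genotypes. `PartitionSketch` and `computeOnDevice` are hypothetical stand-ins for the framework's internals and for compute.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical illustration of splitting a population across devices.
final class PartitionSketch {

    // Splits the population into deviceCount contiguous slices whose sizes differ by at most one.
    static <G> List<List<G>> partition(List<G> population, int deviceCount) {
        if (deviceCount <= 0) {
            throw new IllegalArgumentException("deviceCount must be positive");
        }
        List<List<G>> parts = new ArrayList<>();
        int base = population.size() / deviceCount;
        int remainder = population.size() % deviceCount;
        int start = 0;
        for (int d = 0; d < deviceCount; d++) {
            int size = base + (d < remainder ? 1 : 0);
            parts.add(population.subList(start, start + size));
            start += size;
        }
        return parts;
    }

    // Launches every partition first, then joins in partition order to preserve ordering.
    static <G, T> List<T> evaluate(List<G> population, int deviceCount,
            Function<List<G>, CompletableFuture<List<T>>> computeOnDevice) {
        List<CompletableFuture<List<T>>> futures = new ArrayList<>();
        for (List<G> part : partition(population, deviceCount)) {
            futures.add(computeOnDevice.apply(part));
        }
        List<T> merged = new ArrayList<>(population.size());
        for (CompletableFuture<List<T>> future : futures) {
            merged.addAll(future.join());
        }
        return merged;
    }
}
```

Because results are merged by partition index rather than by completion time, a slow device delays the merge but cannot reorder the fitness values.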
Field Summary
- static final org.apache.logging.log4j.Logger logger
Constructor Summary
- OpenCLFitness()
Method Summary
- void afterAllEvaluations(): Global cleanup hook called once after all fitness evaluations complete.
- void afterAllEvaluations(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService): Per-device cleanup hook called for each OpenCL execution context at the end.
- void afterEvaluation(long generation, List<Genotype> genotypes): Global cleanup hook called after each generation evaluation.
- void afterEvaluation(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes): Per-device cleanup hook called after each device partition evaluation.
- void beforeAllEvaluations(): Global initialization hook called once before any fitness evaluations begin.
- void beforeAllEvaluations(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService): Per-device initialization hook called for each OpenCL execution context.
- void beforeEvaluation(long generation, List<Genotype> genotypes): Global preparation hook called before each generation evaluation.
- void beforeEvaluation(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes): Per-device preparation hook called before each device partition evaluation.
- abstract CompletableFuture<List<T>> compute(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes): Performs the actual fitness computation using OpenCL kernels on the GPU.
Field Details
logger
public static final org.apache.logging.log4j.Logger logger
Constructor Details
OpenCLFitness
public OpenCLFitness()
Method Details
beforeAllEvaluations
public void beforeAllEvaluations()
Global initialization hook called once before any fitness evaluations begin.
This method is called once at the beginning of the evolutionary algorithm execution, before any OpenCL contexts are created or evaluations are performed. Use this method for global initialization that applies to all devices and generations.
Typical use cases:
- Initialize problem-specific constants or parameters
- Load reference data or configuration
- Set up logging or monitoring infrastructure
- Validate problem constraints or requirements
This method is called on the main thread before any concurrent operations begin.
beforeAllEvaluations
public void beforeAllEvaluations(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService)
Per-device initialization hook called for each OpenCL execution context.
This method is called once for each OpenCL device that will be used for fitness evaluation. It allows device-specific initialization such as memory allocation, buffer creation, and device-specific resource setup.
Typical use cases:
- Allocate GPU memory buffers that persist across generations
- Pre-load static data to GPU memory
- Initialize device-specific data structures
- Set up device-specific kernels or configurations
Memory allocated in this method should typically be released in the corresponding afterAllEvaluations(OpenCLExecutionContext, ExecutorService) method.
- Parameters:
openCLExecutionContext - the OpenCL execution context for a specific device
executorService - the executor service for asynchronous operations
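The allocate-once idea also applies on the host side. The sketch below uses a direct `java.nio.FloatBuffer` as a staging area that is sized for the largest expected population and refilled each generation. `ReusableStagingBuffer` is a hypothetical class: it only illustrates the reuse pattern and does not allocate device memory or use the genetics4j buffer types.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical host-side staging buffer, allocated once and refilled per generation.
final class ReusableStagingBuffer {

    private final FloatBuffer buffer;

    // Allocate once, sized for the largest population that will ever be evaluated.
    ReusableStagingBuffer(int maxPopulationSize, int chromosomeSize) {
        buffer = ByteBuffer.allocateDirect(maxPopulationSize * chromosomeSize * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    // Refill the same buffer; the returned view is ready to be read from position 0.
    FloatBuffer load(float[] flatGenes) {
        if (flatGenes.length > buffer.capacity()) {
            throw new IllegalArgumentException(
                    "Population needs " + flatGenes.length + " floats, capacity is " + buffer.capacity());
        }
        buffer.clear();
        buffer.put(flatGenes);
        buffer.flip();
        return buffer;
    }
}
```

Every call to `load` returns the same underlying buffer, so no allocation happens inside the per-generation path.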
beforeEvaluation
public void beforeEvaluation(long generation, List<Genotype> genotypes)
Global preparation hook called before each generation evaluation.
This method is called before fitness evaluation of each generation, providing an opportunity for global preparation that applies across all devices. It receives the generation number and the complete population for context.
Typical use cases:
- Update generation-specific parameters or configurations
- Log generation start or population statistics
- Prepare global data structures for the upcoming evaluation
- Implement adaptive behavior based on generation number
- Parameters:
generation - the current generation number (0-based)
genotypes - the complete population to be evaluated
beforeEvaluation
public void beforeEvaluation(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
Per-device preparation hook called before each device partition evaluation.
This method is called for each device before evaluating its assigned partition of the population. It provides access to the device context and the specific genotypes that will be evaluated on this device.
Typical use cases:
- Transfer genotype data to device memory
- Update device-specific parameters for this generation
- Prepare input buffers with population data
- Set up kernel arguments that vary by generation
- Parameters:
openCLExecutionContext - the OpenCL execution context for this device
executorService - the executor service for asynchronous operations
generation - the current generation number (0-based)
genotypes - the partition of genotypes to be evaluated on this device
compute
public abstract CompletableFuture<List<T>> compute(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
Performs the actual fitness computation using OpenCL kernels on the GPU.
This is the core method that implements GPU-based fitness evaluation. It receives a partition of the population and must return corresponding fitness values using OpenCL kernel execution on the specified device.
Implementation requirements:
- Return order: Fitness values must correspond to genotypes in the same order
- Size consistency: Return exactly one fitness value per input genotype
- Asynchronous execution: Use the executor service for non-blocking GPU operations
- Error handling: Handle GPU errors gracefully and provide meaningful exceptions
Common implementation pattern:
- Data transfer: Copy genotype data to GPU memory
- Kernel setup: Configure kernel arguments and work group parameters
- Kernel execution: Launch OpenCL kernels for fitness computation
- Result retrieval: Read fitness values from GPU memory
- Data conversion: Convert GPU results to appropriate fitness type
- Parameters:
openCLExecutionContext - the OpenCL execution context providing device access
executorService - the executor service for asynchronous operations
generation - the current generation number for context
genotypes - the genotypes to evaluate on this device
- Returns:
a CompletableFuture that will complete with fitness values for each genotype
- Throws:
RuntimeException - if GPU evaluation fails or setup errors occur
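The implementation pattern for compute can be walked through with a CPU stand-in for the kernel. Everything in this sketch is hypothetical: genotypes are represented as plain `float[]` rows instead of `Genotype`, `sumOfSquaresKernel` is ordinary Java playing the role of an OpenCL kernel, and no device memory is involved. It is only meant to show the five steps and the ordering and size requirements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical CPU-only walk-through of the compute steps; not genetics4j code.
final class ComputeSketch {

    // CPU stand-in for a kernel: "work-item" i sums the squared genes of individual i.
    static float[] sumOfSquaresKernel(float[] input, int populationSize, int chromosomeSize) {
        float[] output = new float[populationSize];
        for (int i = 0; i < populationSize; i++) {
            float sum = 0f;
            for (int j = 0; j < chromosomeSize; j++) {
                float gene = input[i * chromosomeSize + j];
                sum += gene * gene;
            }
            output[i] = sum;
        }
        return output;
    }

    static CompletableFuture<List<Double>> compute(
            ExecutorService executor, List<float[]> genotypes, int chromosomeSize) {
        return CompletableFuture.supplyAsync(() -> {
            int populationSize = genotypes.size();

            // 1. Data transfer: pack genotypes into one contiguous buffer, in input order
            float[] input = new float[populationSize * chromosomeSize];
            for (int i = 0; i < populationSize; i++) {
                System.arraycopy(genotypes.get(i), 0, input, i * chromosomeSize, chromosomeSize);
            }

            // 2-3. Kernel setup and execution: one work-item per individual
            float[] output = sumOfSquaresKernel(input, populationSize, chromosomeSize);

            // 4. Result retrieval: enforce exactly one fitness value per genotype
            if (output.length != populationSize) {
                throw new IllegalStateException(
                        "Expected " + populationSize + " fitness values, got " + output.length);
            }

            // 5. Data conversion: float results to the declared fitness type, same order
            List<Double> fitness = new ArrayList<>(populationSize);
            for (float value : output) {
                fitness.add((double) value);
            }
            return fitness;
        }, executor);
    }
}
```

Because the output index equals the input index at every step, the returned list is aligned with the genotypes that were passed in.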
afterEvaluation
public void afterEvaluation(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService, long generation, List<Genotype> genotypes)
Per-device cleanup hook called after each device partition evaluation.
This method is called for each device after its partition evaluation completes, providing an opportunity for device-specific cleanup and resource management.
Typical use cases:
- Clean up temporary GPU memory allocations
- Log device-specific performance metrics
- Update device-specific statistics or state
- Perform device-specific validation or debugging
- Parameters:
openCLExecutionContext - the OpenCL execution context for this device
executorService - the executor service for asynchronous operations
generation - the current generation number (0-based)
genotypes - the partition of genotypes that were evaluated on this device
afterEvaluation
public void afterEvaluation(long generation, List<Genotype> genotypes)
Global cleanup hook called after each generation evaluation.
This method is called after fitness evaluation of each generation completes across all devices, providing an opportunity for global cleanup and statistics collection that applies to the entire population.
Typical use cases:
- Log generation completion and performance metrics
- Update global statistics or progress tracking
- Perform global validation or debugging
- Clean up generation-specific global resources
- Parameters:
generation - the current generation number (0-based)
genotypes - the complete population that was evaluated
afterAllEvaluations
public void afterAllEvaluations(OpenCLExecutionContext openCLExecutionContext, ExecutorService executorService)
Per-device cleanup hook called for each OpenCL execution context at the end.
This method is called once for each OpenCL device when fitness evaluation is complete, providing an opportunity to clean up device-specific resources that were allocated in beforeAllEvaluations(OpenCLExecutionContext, ExecutorService).
Typical use cases:
- Release GPU memory buffers and resources
- Clean up device-specific data structures
- Log device-specific performance summaries
- Ensure no GPU memory leaks occur
This method should ensure proper cleanup even if exceptions occurred during evaluation, as it may be the only opportunity to prevent resource leaks.
- Parameters:
openCLExecutionContext - the OpenCL execution context for this device
executorService - the executor service for asynchronous operations
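One way to make this cleanup robust is to attempt every release even when an earlier one fails, and report the failures afterwards. The sketch below models resources as `AutoCloseable` for illustration only; `CleanupSketch` and `releaseAll` are hypothetical, and the actual buffer types may expose a different release method.

```java
import java.util.List;

// Hypothetical helper that releases every resource even if some releases fail.
final class CleanupSketch {

    static void releaseAll(List<? extends AutoCloseable> resources) {
        RuntimeException failure = null;
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                // Remember the first failure, attach later ones, and keep releasing
                if (failure == null) {
                    failure = new IllegalStateException("Failed to release one or more resources", e);
                } else {
                    failure.addSuppressed(e);
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

With this shape, a failure while releasing the input buffer does not prevent the output buffer from being released.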
afterAllEvaluations
public void afterAllEvaluations()
Global cleanup hook called once after all fitness evaluations complete.
This method is called once at the end of the evolutionary algorithm execution, after all OpenCL contexts have been cleaned up and all evaluations are complete. Use this method for final global cleanup and resource deallocation.
Typical use cases:
- Clean up global resources and data structures
- Log final performance summaries and statistics
- Save results or generate reports
- Perform final validation or cleanup
This method is called on the main thread after all concurrent operations complete.