Class GPUFitnessEvaluator<T extends Comparable<T>>
- Type Parameters:
T
- the type of fitness values produced, must be comparable for selection operations
- All Implemented Interfaces:
FitnessEvaluator<T>
GPUFitnessEvaluator implements the core FitnessEvaluator interface to provide GPU acceleration for fitness computation in evolutionary algorithms. This evaluator manages the complete OpenCL lifecycle, from device discovery and kernel compilation to memory management and resource cleanup.
Key responsibilities include:
- OpenCL initialization: Platform and device discovery, context creation, and kernel compilation
- Resource management: Managing OpenCL contexts, command queues, programs, and kernels
- Population partitioning: Distributing work across multiple OpenCL devices
- Asynchronous execution: Coordinating concurrent GPU operations with CPU-side logic
- Memory lifecycle: Ensuring proper cleanup of GPU resources
Architecture overview:
- Initialization (preEvaluation()): Discover platforms/devices, compile kernels, create contexts
- Evaluation (evaluate(long, java.util.List<net.bmahe.genetics4j.core.Genotype>)): Partition population, execute fitness computation on GPU
- Cleanup (postEvaluation()): Release all OpenCL resources and contexts
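The three phases can be pictured as a setup/evaluate/cleanup loop. The following is a minimal sketch, not the genetics4j implementation: `PhasedEvaluator`, `LifecycleSketch`, and the `int[]` genotype stand-in are hypothetical names introduced for illustration, and it assumes setup runs once before the first generation and cleanup once after the last.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the three-phase lifecycle; not the genetics4j interface.
interface PhasedEvaluator {
    void preEvaluation();
    List<Double> evaluate(long generation, List<int[]> genotypes);
    void postEvaluation();
}

class LifecycleSketch {

    // Records the order in which the phases run so the sequence can be inspected.
    static final List<String> calls = new ArrayList<>();

    // A toy evaluator whose "fitness" is the sum of each genotype's values.
    static PhasedEvaluator recordingEvaluator() {
        return new PhasedEvaluator() {
            @Override
            public void preEvaluation() {
                calls.add("pre");
            }

            @Override
            public List<Double> evaluate(long generation, List<int[]> genotypes) {
                calls.add("evaluate:" + generation);
                List<Double> fitness = new ArrayList<>();
                for (int[] genotype : genotypes) {
                    double sum = 0;
                    for (int value : genotype) {
                        sum += value;
                    }
                    fitness.add(sum);
                }
                return fitness;
            }

            @Override
            public void postEvaluation() {
                calls.add("post");
            }
        };
    }

    // Runs setup once, evaluates every generation, and always runs cleanup.
    static List<Double> run(PhasedEvaluator evaluator, List<int[]> population, int generations) {
        List<Double> last = new ArrayList<>();
        evaluator.preEvaluation();
        try {
            for (long generation = 0; generation < generations; generation++) {
                last = evaluator.evaluate(generation, population);
            }
        } finally {
            // Cleanup runs even if an evaluation throws.
            evaluator.postEvaluation();
        }
        return last;
    }
}
```

The `try`/`finally` is the point of the sketch: cleanup is reached whether or not an evaluation fails.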
Multi-device support:
- Device filtering: Selects devices based on user-defined criteria (type, capabilities)
- Load balancing: Automatically distributes population across available devices
- Parallel execution: Concurrent fitness evaluation on multiple GPUs or devices
- Asynchronous coordination: Non-blocking execution with CompletableFuture-based results
Resource management patterns:
- Lazy initialization: OpenCL resources created only when needed
- Automatic cleanup: Guaranteed resource release through lifecycle methods
- Error recovery: Robust handling of OpenCL errors and device failures
- Memory optimization: Efficient GPU memory usage and transfer patterns
Example usage in GPU EA system:
// GPU configuration with OpenCL kernel
Program fitnessProgram = Program.ofResource("/kernels/optimization.cl");
GPUEAConfiguration<Double> config = GPUEAConfigurationBuilder.<Double>builder()
    .program(fitnessProgram)
    .fitness(new MyGPUFitness())
    // ... other EA configuration
    .build();

// Execution context with device preferences
GPUEAExecutionContext<Double> context = GPUEAExecutionContextBuilder.<Double>builder()
    .populationSize(2000)
    .deviceFilter(device -> device.type() == DeviceType.GPU)
    .platformFilter(platform -> platform.profile() == PlatformProfile.FULL_PROFILE)
    .build();

// Evaluator handles all OpenCL lifecycle automatically
GPUFitnessEvaluator<Double> evaluator = new GPUFitnessEvaluator<>(context, config, executorService);

// Used by EA system - lifecycle managed automatically
EASystem<Double> system = EASystemFactory.from(config, context, executorService, evaluator);
Performance characteristics:
- Initialization overhead: One-time setup cost for OpenCL compilation and context creation
- Scalability: Performance scales with population size and problem complexity
- Memory bandwidth: Optimal for problems with high computational intensity
- Concurrency: Supports concurrent evaluation across multiple devices
Error handling:
- Device failures: Graceful degradation when devices become unavailable
- Memory errors: Proper cleanup and error reporting for GPU memory issues
- Compilation errors: Clear error messages for kernel compilation failures
- Resource leaks: Guaranteed cleanup even in exceptional circumstances
-
Field Summary
Fields
(package private) final List<org.jocl.cl_command_queue> clCommandQueues
(package private) final List<org.jocl.cl_context> clContexts
(package private) final List<OpenCLExecutionContext> clExecutionContexts
(package private) final List<org.jocl.cl_program> clPrograms
private final ExecutorService executorService
private final GPUEAConfiguration<T> gpuEAConfiguration
private final GPUEAExecutionContext<T> gpuEAExecutionContext
static final org.apache.logging.log4j.Logger logger
-
Constructor Summary
Constructors
GPUFitnessEvaluator(GPUEAExecutionContext<T> _gpuEAExecutionContext, GPUEAConfiguration<T> _gpuEAConfiguration, ExecutorService _executorService)
- Constructs a GPU fitness evaluator with the specified configuration and execution context.
-
Method Summary
evaluate(long generation, List<Genotype> genotypes)
- Evaluates fitness for a population of genotypes using GPU acceleration.
private String loadResource(String filename)
void postEvaluation()
- Cleans up OpenCL resources and releases GPU memory after evaluation completion.
void preEvaluation()
- Initializes OpenCL resources and prepares GPU devices for fitness evaluation.
-
Field Details
-
logger
public static final org.apache.logging.log4j.Logger logger
-
gpuEAExecutionContext
-
gpuEAConfiguration
-
executorService
-
selectedPlatformToDevice
-
clContexts
-
clCommandQueues
-
clPrograms
-
clKernels
-
clExecutionContexts
-
-
Constructor Details
-
GPUFitnessEvaluator
public GPUFitnessEvaluator(GPUEAExecutionContext<T> _gpuEAExecutionContext, GPUEAConfiguration<T> _gpuEAConfiguration, ExecutorService _executorService)
Constructs a GPU fitness evaluator with the specified configuration and execution context.
Initializes the evaluator with GPU-specific configuration and execution parameters. The evaluator will use the provided executor service for coordinating asynchronous operations between CPU and GPU components.
The constructor performs minimal initialization; the actual OpenCL setup occurs during preEvaluation() to follow the fitness evaluator lifecycle pattern.
- Parameters:
_gpuEAExecutionContext
- the GPU execution context with device filters and population settings
_gpuEAConfiguration
- the GPU EA configuration with OpenCL program and fitness function
_executorService
- the executor service for managing asynchronous operations
- Throws:
IllegalArgumentException
- if any parameter is null
-
-
Method Details
-
loadResource
-
grabProgramSources
-
preEvaluation
public void preEvaluation()
Initializes OpenCL resources and prepares GPU devices for fitness evaluation.
This method performs the complete OpenCL initialization sequence:
- Platform discovery: Enumerates available OpenCL platforms
- Device filtering: Selects devices based on configured filters
- Context creation: Creates OpenCL contexts for selected devices
- Queue setup: Creates command queues with profiling and out-of-order execution
- Program compilation: Compiles OpenCL kernels from source code
- Kernel preparation: Creates kernel objects and queries execution info
- Fitness initialization: Calls lifecycle hooks on the fitness function
Device selection process:
- Applies platform filters to discovered OpenCL platforms
- Enumerates devices for each qualifying platform
- Applies device filters to select appropriate devices
- Validates that at least one device is available
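The selection process above amounts to two nested filters followed by a non-empty check. The sketch below is an illustration under assumptions: `Platform`, `Device`, and `select` are simplified, hypothetical types, not the genetics4j or JOCL classes, and the profile and type are plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class DeviceSelectionSketch {

    // Hypothetical, simplified description of a device.
    static final class Device {
        final String name;
        final String type; // e.g. "GPU" or "CPU"

        Device(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // Hypothetical, simplified description of a platform and its devices.
    static final class Platform {
        final String profile; // e.g. "FULL_PROFILE"
        final List<Device> devices;

        Platform(String profile, List<Device> devices) {
            this.profile = profile;
            this.devices = devices;
        }
    }

    // Applies the platform filter first, then the device filter, and fails if nothing is left.
    static List<Device> select(List<Platform> platforms,
                               Predicate<Platform> platformFilter,
                               Predicate<Device> deviceFilter) {
        List<Device> selected = new ArrayList<>();
        for (Platform platform : platforms) {
            if (!platformFilter.test(platform)) {
                continue; // platform rejected: none of its devices are considered
            }
            for (Device device : platform.devices) {
                if (deviceFilter.test(device)) {
                    selected.add(device);
                }
            }
        }
        if (selected.isEmpty()) {
            throw new IllegalStateException("No device matches the configured filters");
        }
        return selected;
    }
}
```

Failing fast with `IllegalStateException` mirrors the documented behavior when no compatible device is found.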
The method creates separate OpenCL contexts for each selected device to enable concurrent execution and optimal resource utilization. Each context includes compiled programs and kernel objects ready for fitness evaluation.
- Specified by:
preEvaluation in interface FitnessEvaluator<T extends Comparable<T>>
- Throws:
IllegalStateException
- if no compatible devices are found
RuntimeException
- if OpenCL initialization, program compilation, or kernel creation fails
-
evaluate
Evaluates fitness for a population of genotypes using GPU acceleration.
This method implements the core fitness evaluation logic by distributing the population across available OpenCL devices and executing fitness computation concurrently. The evaluation process follows these steps:
- Population partitioning: Divides genotypes across available devices
- Parallel dispatch: Submits evaluation tasks to each device asynchronously
- GPU execution: Executes OpenCL kernels for fitness computation
- Result collection: Gathers fitness values from all devices
- Result aggregation: Combines results preserving original order
Load balancing strategy:
- Automatically calculates partition size based on population and device count
- Round-robin assignment of partitions to devices for balanced workload
- Asynchronous execution allows devices to work at their optimal pace
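One plausible reading of this strategy is shown below as a sketch. The exact formula used by the evaluator is not stated here, so the ceiling division and the class and method names (`PartitionSketch`, `partitionSize`, `partition`, `deviceFor`) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionSketch {

    // Ceiling division, so that every genotype lands in some partition.
    static int partitionSize(int populationSize, int deviceCount) {
        if (deviceCount <= 0) {
            throw new IllegalArgumentException("deviceCount must be positive");
        }
        return (populationSize + deviceCount - 1) / deviceCount;
    }

    // Splits the population into consecutive sublists of at most partitionSize elements.
    static <G> List<List<G>> partition(List<G> population, int partitionSize) {
        if (partitionSize <= 0) {
            throw new IllegalArgumentException("partitionSize must be positive");
        }
        List<List<G>> partitions = new ArrayList<>();
        for (int start = 0; start < population.size(); start += partitionSize) {
            int end = Math.min(start + partitionSize, population.size());
            partitions.add(population.subList(start, end));
        }
        return partitions;
    }

    // Round-robin: partition i is assigned to device i modulo the device count.
    static int deviceFor(int partitionIndex, int deviceCount) {
        return partitionIndex % deviceCount;
    }
}
```

Because the partitions are consecutive slices, concatenating their results in partition order restores the original genotype order.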
The method coordinates with the configured fitness function through lifecycle hooks:
- beforeEvaluation(): Called before each device partition evaluation
- compute(): Executes the actual GPU fitness computation
- afterEvaluation(): Called after each device partition completes
Concurrency and performance:
- Multiple devices execute evaluation partitions concurrently
- CompletableFuture-based coordination for non-blocking execution
- Automatic workload distribution across available GPU resources
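The CompletableFuture-based coordination can be sketched with the standard library alone. This is an illustration rather than the evaluator's actual code: `AsyncEvaluationSketch` and `evaluateAll` are hypothetical names, and the per-partition function stands in for the GPU dispatch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Function;

class AsyncEvaluationSketch {

    // Submits every partition to the executor, then joins the futures in submission
    // order so the combined result keeps the original genotype order.
    static <G, F> List<F> evaluateAll(List<List<G>> partitions,
                                      Function<List<G>, List<F>> evaluatePartition,
                                      ExecutorService executor) {
        List<CompletableFuture<List<F>>> futures = new ArrayList<>();
        for (List<G> partition : partitions) {
            // Each partition is evaluated independently and may finish in any order.
            futures.add(CompletableFuture.supplyAsync(() -> evaluatePartition.apply(partition), executor));
        }

        List<F> results = new ArrayList<>();
        for (CompletableFuture<List<F>> future : futures) {
            // join() blocks until this partition is done and rethrows failures unchecked.
            results.addAll(future.join());
        }
        return results;
    }
}
```

Joining in submission order, rather than completion order, is what preserves the mapping between genotypes and fitness values even when a later partition finishes first.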
- Specified by:
evaluate in interface FitnessEvaluator<T extends Comparable<T>>
- Parameters:
generation
- the current generation number for context and logging
genotypes
- the population of genotypes to evaluate
- Returns:
- fitness values corresponding to each genotype in the same order
- Throws:
IllegalArgumentException
- if genotypes is null or empty
RuntimeException
- if GPU evaluation fails or OpenCL errors occur
-
postEvaluation
public void postEvaluation()
Cleans up OpenCL resources and releases GPU memory after evaluation completion.
This method performs comprehensive cleanup of all OpenCL resources in the proper order to prevent memory leaks and ensure clean shutdown. The cleanup sequence follows OpenCL best practices for resource deallocation:
- Fitness cleanup: Calls lifecycle hooks on the fitness function
- Kernel release: Releases all compiled kernel objects
- Program release: Releases compiled OpenCL programs
- Queue release: Releases command queues and pending operations
- Context release: Releases OpenCL contexts and associated memory
- Reference cleanup: Clears internal data structures and references
Resource management guarantees:
- All GPU memory allocations are properly released
- OpenCL objects are released in dependency order to avoid errors
- No resource leaks occur even if individual cleanup operations fail
- Evaluator returns to a clean state ready for potential reinitialization
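A common way to get the "no leaks even if one step fails" property is to isolate each release step. The sketch below assumes that approach; `CleanupSketch`, `Step`, and `releaseAll` are hypothetical names, and each `Runnable` stands in for an OpenCL release call on a kernel, program, command queue, or context.

```java
import java.util.ArrayList;
import java.util.List;

class CleanupSketch {

    // Hypothetical named release step standing in for an OpenCL release call.
    static final class Step {
        final String name;
        final Runnable release;

        Step(String name, Runnable release) {
            this.name = name;
            this.release = release;
        }
    }

    // Runs every step in the given (dependency) order. A failing step is recorded
    // instead of rethrown, so the remaining resources are still released.
    static List<String> releaseAll(List<Step> steps) {
        List<String> failures = new ArrayList<>();
        for (Step step : steps) {
            try {
                step.release.run();
            } catch (RuntimeException e) {
                failures.add(step.name + ": " + e.getMessage());
            }
        }
        return failures;
    }
}
```

The caller passes the steps in the documented order (kernels, programs, queues, contexts) and can log the returned failures without aborting shutdown.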
The method coordinates with the configured fitness function to ensure any fitness-specific resources (buffers, textures, etc.) are also properly cleaned up through the afterAllEvaluations() lifecycle hook.
- Specified by:
postEvaluation in interface FitnessEvaluator<T extends Comparable<T>>
- Throws:
RuntimeException
- if cleanup operations fail (logged but not propagated to prevent interference with EA system shutdown)
-