Interface KernelInfo
- All Known Implementing Classes:
ImmutableKernelInfo
KernelInfo encapsulates the device-specific compilation and execution characteristics of an OpenCL kernel, providing essential information for optimal work group configuration and resource allocation in GPU-accelerated evolutionary algorithms. This information is determined at kernel compilation time and varies by device.
Key kernel characteristics include:
- Work group constraints: Maximum and preferred work group sizes for efficient execution
- Memory usage: Local and private memory requirements per work-item
- Performance optimization: Preferred work group size multiples for optimal resource utilization
- Resource validation: Constraints for validating kernel launch parameters
Kernel optimization considerations for evolutionary algorithms:
- Work group sizing: Configure launch parameters within device-specific limits
- Memory allocation: Ensure sufficient local memory for parallel fitness evaluation
- Performance tuning: Align work group sizes with preferred multiples
- Resource planning: Account for per-work-item memory requirements
Common usage patterns for kernel configuration:
// Query kernel information after compilation
KernelInfo kernelInfo = kernelInfoReader.read(deviceId, kernel, "fitness_evaluation");
// Cap the work group size at both the kernel-specific and the device-wide limit
long maxWorkGroupSize = Math.min(kernelInfo.workGroupSize(), device.maxWorkGroupSize());
// Round down to the preferred multiple; keep the cap if it is smaller than the multiple
long preferredMultiple = kernelInfo.preferredWorkGroupSizeMultiple();
long optimalWorkGroupSize = maxWorkGroupSize >= preferredMultiple
    ? (maxWorkGroupSize / preferredMultiple) * preferredMultiple
    : maxWorkGroupSize;
// Round the global size up to a whole number of work groups, as OpenCL 1.x requires;
// the kernel must ignore work-items whose index is >= populationSize
long populationSize = 1000;
long globalWorkSize = ((populationSize + optimalWorkGroupSize - 1) / optimalWorkGroupSize) * optimalWorkGroupSize;
// Estimate memory requirements: local memory is reported per work group, private memory per work-item
long localMemPerGroup = kernelInfo.localMemSize();
long totalPrivateMem = kernelInfo.privateMemSize() * globalWorkSize;
// Configure kernel execution with validated parameters
clEnqueueNDRangeKernel(commandQueue, kernel, 1, null,
new long[]{globalWorkSize}, new long[]{optimalWorkGroupSize}, 0, null, null);
Performance optimization workflow:
- Kernel compilation: Compile kernel for target device
- Information query: Read kernel-specific execution characteristics
- Work group optimization: Calculate optimal work group size based on preferences
- Memory validation: Ensure memory requirements fit within device limits
- Launch configuration: Configure kernel execution with optimized parameters
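The sizing steps of this workflow can be sketched as plain arithmetic. This is a minimal sketch, not part of the documented API: `LaunchPlan` and its method names are hypothetical, and the inputs are assumed to come from `workGroupSize()`, `preferredWorkGroupSizeMultiple()`, and a device-level maximum queried elsewhere.

```java
// Hypothetical helper for illustration; not part of the KernelInfo API.
final class LaunchPlan {

    /**
     * Largest work group size within both limits, rounded down to the
     * preferred multiple when the limit is at least one multiple wide.
     */
    static long localSize(long kernelMax, long deviceMax, long preferredMultiple) {
        long cap = Math.min(kernelMax, deviceMax);
        if (preferredMultiple <= 0 || cap < preferredMultiple) {
            return cap; // the hint cannot be honored, so fall back to the hard limit
        }
        return (cap / preferredMultiple) * preferredMultiple;
    }

    /** Smallest multiple of localSize that covers problemSize work-items. */
    static long globalSize(long problemSize, long localSize) {
        long groups = (problemSize + localSize - 1) / localSize; // ceiling division
        return groups * localSize;
    }
}
```

When the global size is padded this way, the kernel needs a bounds check so that the extra work-items do nothing.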
Memory management considerations:
- Local memory: Shared among work-items in the same work group
- Private memory: Individual memory per work-item
- Total allocation: Local memory is consumed once per work group, private memory once per work-item
- Device limits: Validate against device memory constraints
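A rough way to apply these considerations is shown below. It is a sketch under stated assumptions: `MemoryBudget` is a hypothetical helper, the device's local memory capacity is assumed to be available from a separate device query, and the private memory figure is only a lower bound because `privateMemSize()` reports a minimum.

```java
// Hypothetical helper for illustration; not part of the KernelInfo API.
final class MemoryBudget {

    /** True if one work group's local memory fits within the device's local memory. */
    static boolean localMemoryFits(long kernelLocalMemBytes, long deviceLocalMemBytes) {
        return kernelLocalMemBytes <= deviceLocalMemBytes;
    }

    /**
     * Lower-bound estimate of private memory across all work-items.
     * Throws ArithmeticException if the product overflows a long.
     */
    static long privateMemoryEstimate(long privateMemBytesPerItem, long workItems) {
        return Math.multiplyExact(privateMemBytesPerItem, workItems);
    }
}
```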
Error handling and validation:
- Work group limits: Ensure launch parameters don't exceed kernel limits
- Memory constraints: Validate total memory usage against device capabilities
- Performance degradation: Monitor for suboptimal work group configurations
- Resource conflicts: Handle multiple kernels competing for device resources
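The first of these checks can be done before enqueueing, so that a bad configuration fails with a readable message instead of an OpenCL error code. The sketch below is illustrative only: `LaunchValidator` is a hypothetical name, and the divisibility check assumes uniform work groups, as OpenCL 1.x requires.

```java
// Hypothetical helper for illustration; not part of the KernelInfo API.
final class LaunchValidator {

    /** Throws IllegalArgumentException if the launch parameters violate kernel limits. */
    static void validate(long globalSize, long localSize, long kernelMaxWorkGroupSize) {
        if (globalSize <= 0 || localSize <= 0) {
            throw new IllegalArgumentException("Work sizes must be positive");
        }
        if (localSize > kernelMaxWorkGroupSize) {
            throw new IllegalArgumentException(
                "Work group size " + localSize + " exceeds kernel limit " + kernelMaxWorkGroupSize);
        }
        if (globalSize % localSize != 0) {
            throw new IllegalArgumentException(
                "Global size " + globalSize + " is not a multiple of work group size " + localSize);
        }
    }
}
```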
Method Summary
- static ImmutableKernelInfo.Builder builder(): Creates a new builder for constructing KernelInfo instances.
- long localMemSize(): Returns the amount of local memory in bytes used by this kernel.
- String name(): Returns the name of the kernel function.
- long preferredWorkGroupSizeMultiple(): Returns the preferred work group size multiple for optimal kernel execution performance.
- long privateMemSize(): Returns the minimum amount of private memory in bytes used by each work-item.
- long workGroupSize(): Returns the maximum work group size that can be used when executing this kernel on the device.
Method Details
name
String name()
Returns the name of the kernel function.
- Returns:
- the kernel function name as specified in the OpenCL program
workGroupSize
long workGroupSize()
Returns the maximum work group size that can be used when executing this kernel on the device.
This value is the maximum number of work-items that a work group can contain when executing this specific kernel on the target device. It may be smaller than the device's general maximum work group size because of kernel-specific resource requirements.
- Returns:
- the maximum work group size for this kernel
preferredWorkGroupSizeMultiple
long preferredWorkGroupSizeMultiple()
Returns the preferred work group size multiple for optimal kernel execution performance.
For best performance, the work group size should be a multiple of this value. It is a performance hint that typically corresponds to the device's native execution width (for example, the warp or wavefront size), and following it can improve resource utilization and memory coalescing.
- Returns:
- the preferred work group size multiple for performance optimization
localMemSize
long localMemSize()
Returns the amount of local memory in bytes used by this kernel.
Local memory is shared among all work-items in a work group and includes both statically allocated local variables and dynamically allocated local memory passed as kernel arguments. This value is used to validate that the total local memory usage doesn't exceed the device's local memory capacity.
- Returns:
- the local memory usage in bytes per work group
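When a kernel takes a dynamically sized local buffer, the local memory budget can limit the work group size before the kernel's own limit does. The sketch below is a hypothetical helper, not part of this API. It assumes a fixed static usage plus a constant number of dynamic local bytes per work-item, and a device capacity obtained from a separate device query.

```java
// Hypothetical helper for illustration; not part of the KernelInfo API.
final class LocalMemorySizing {

    /**
     * Largest work group size whose local memory fits the device budget,
     * given staticBytes of fixed usage and bytesPerItem of dynamic local
     * memory per work-item. Returns 0 if the static usage alone does not fit.
     */
    static long maxGroupSizeForLocalMemory(long deviceLocalMemBytes, long staticBytes,
                                           long bytesPerItem, long kernelMaxWorkGroupSize) {
        if (staticBytes > deviceLocalMemBytes) {
            return 0; // the kernel cannot run with this much static local memory
        }
        if (bytesPerItem <= 0) {
            return kernelMaxWorkGroupSize; // no dynamic usage, only the kernel limit applies
        }
        long fit = (deviceLocalMemBytes - staticBytes) / bytesPerItem;
        return Math.min(fit, kernelMaxWorkGroupSize);
    }
}
```

The result is an upper bound. It can still be rounded down to the preferred work group size multiple afterwards.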
privateMemSize
long privateMemSize()
Returns the minimum amount of private memory in bytes used by each work-item.
Private memory is individual to each work-item and includes local variables, function call stacks, and other per-work-item data. This value helps estimate the total memory footprint when launching kernels with large work group sizes.
- Returns:
- the private memory usage in bytes per work-item
builder
static ImmutableKernelInfo.Builder builder()
Creates a new builder for constructing KernelInfo instances.
- Returns:
- a new builder for creating kernel information objects