The OpenCL Runtime
Command Queues [5.1]
cl_command_queue clCreateCommandQueue (
cl_context context, cl_device_id device,
cl_command_queue_properties properties,
cl_int *errcode_ret)
properties: CL_QUEUE_PROFILING_ENABLE,
CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
cl_int clRetainCommandQueue (
cl_command_queue command_queue)
cl_int clReleaseCommandQueue (
cl_command_queue command_queue)
cl_int clGetCommandQueueInfo (
cl_command_queue command_queue,
cl_command_queue_info param_name,
size_t param_value_size,
void *param_value,
size_t *param_value_size_ret)
param_name: CL_QUEUE_CONTEXT,
CL_QUEUE_DEVICE,
CL_QUEUE_REFERENCE_COUNT,
CL_QUEUE_PROPERTIES
OpenCL™ (Open Computing Language) is a multi-vendor open standard for
general-purpose parallel programming of heterogeneous systems that include
CPUs, GPUs, and other processors. OpenCL provides a uniform programming
environment for software developers to write efficient, portable code for
high-performance compute servers, desktop computer systems, and handheld devices.
[n.n.n] refers to the section in the API Specification available at
www.khronos.org/opencl.
Program Objects
Create Program Objects [5.6.1]
cl_program clCreateProgramWithSource (
cl_context context, cl_uint count, const char **strings,
const size_t *lengths, cl_int *errcode_ret)
cl_program clCreateProgramWithBinary (
cl_context context, cl_uint num_devices,
const cl_device_id *device_list, const size_t *lengths,
const unsigned char **binaries, cl_int *binary_status,
cl_int *errcode_ret)
cl_int clRetainProgram (cl_program program)
cl_int clReleaseProgram (cl_program program)
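The count, strings, and lengths arguments of clCreateProgramWithSource work together: a NULL lengths array, or a zero entry in it, means the corresponding string is null-terminated. The following is a minimal plain-C sketch of that interpretation, not OpenCL host code. The helper name total_source_length is hypothetical, and unsigned stands in for cl_uint so the block compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: total number of source bytes described by a
 * (count, strings, lengths) triple as it would be handed to
 * clCreateProgramWithSource. A NULL lengths array, or a zero entry,
 * means that string is null-terminated. */
size_t total_source_length(unsigned count, const char **strings,
                           const size_t *lengths)
{
    size_t total = 0;
    assert(strings != NULL);
    for (unsigned i = 0; i < count; ++i) {
        if (lengths == NULL || lengths[i] == 0)
            total += strlen(strings[i]);   /* null-terminated string */
        else
            total += lengths[i];           /* explicit length in bytes */
    }
    return total;
}
```

An explicit length makes it possible to pass only part of a larger text buffer as program source.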
Build Program Executable [5.6.2]
cl_int clBuildProgram (cl_program program,
cl_uint num_devices, const cl_device_id *device_list,
const char *options, void (CL_CALLBACK*pfn_notify)
(cl_program program, void *user_data),
void *user_data)
Build Options [5.6.3]
Preprocessor: (-D processed in order listed in clBuildProgram)
-D name -D name=definition -I dir
Optimization options:
-cl-opt-disable -cl-strict-aliasing
-cl-mad-enable -cl-no-signed-zeros
-cl-finite-math-only -cl-fast-relaxed-math
-cl-unsafe-math-optimizations
Math Intrinsics:
-cl-single-precision-constant -cl-denorms-are-zero
Warning request/suppress:
-w -Werror
Control OpenCL C language version:
-cl-std=CL1.1 // OpenCL 1.1 specification.
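The options listed above reach clBuildProgram as a single space-separated string in its options argument. Below is a minimal sketch, in plain C, of assembling such a string. The helper name make_build_options and the macro name TILE are assumptions made for illustration; only the flags themselves come from the list above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a build-options string into buf.
 * Returns 0 on success, -1 if the string did not fit in cap bytes.
 * TILE is an illustrative macro name, not part of OpenCL. */
int make_build_options(char *buf, size_t cap, int tile, int fast_math)
{
    assert(buf != NULL && cap > 0);
    int n = snprintf(buf, cap, "-D TILE=%d -cl-std=CL1.1%s",
                     tile, fast_math ? " -cl-fast-relaxed-math" : "");
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

The resulting buffer would be passed as the options argument of clBuildProgram. Checking the snprintf return value guards against a silently truncated flag.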
Query Program Objects [5.6.5]
cl_int clGetProgramInfo (cl_program program,
cl_program_info param_name, size_t param_value_size,
void *param_value, size_t *param_value_size_ret)
param_name: CL_PROGRAM_REFERENCE_COUNT,
CL_PROGRAM_{CONTEXT, NUM_DEVICES, DEVICES},
CL_PROGRAM_{SOURCE, BINARY_SIZES, BINARIES}
(Program Objects Continue >)
The OpenCL Platform Layer
The OpenCL platform layer implements platform-specific features that allow applications to query OpenCL devices and device configuration
information, and to create OpenCL contexts using one or more devices.
Contexts [4.3]
cl_context clCreateContext (
const cl_context_properties *properties, cl_uint num_devices,
const cl_device_id *devices, void (CL_CALLBACK*pfn_notify)
(const char *errinfo, const void *private_info,
size_t cb, void *user_data),
void *user_data, cl_int *errcode_ret)
properties: CL_CONTEXT_PLATFORM, CL_GL_CONTEXT_KHR,
CL_CGL_SHAREGROUP_KHR, CL_{EGL, GLX}_DISPLAY_KHR,
CL_WGL_HDC_KHR
cl_context clCreateContextFromType (
const cl_context_properties *properties,
cl_device_type device_type, void (CL_CALLBACK *pfn_notify)
(const char *errinfo, const void *private_info, size_t cb,
void *user_data),
void *user_data, cl_int *errcode_ret)
properties: See clCreateContext
cl_int clRetainContext (cl_context context)
cl_int clReleaseContext (cl_context context)
cl_int clGetContextInfo (cl_context context,
cl_context_info param_name, size_t param_value_size,
void *param_value, size_t *param_value_size_ret)
param_name: CL_CONTEXT_REFERENCE_COUNT,
CL_CONTEXT_{DEVICES, PROPERTIES}, CL_CONTEXT_NUM_DEVICES
Querying Platform Info and Devices [4.1, 4.2]
cl_int clGetPlatformIDs (cl_uint num_entries,
cl_platform_id *platforms, cl_uint *num_platforms)
cl_int clGetPlatformInfo (cl_platform_id platform,
cl_platform_info param_name, size_t param_value_size,
void *param_value, size_t *param_value_size_ret)
param_name: CL_PLATFORM_{PROFILE, VERSION},
CL_PLATFORM_{NAME, VENDOR, EXTENSIONS}
cl_int clGetDeviceIDs (cl_platform_id platform,
cl_device_type device_type, cl_uint num_entries,
cl_device_id *devices, cl_uint *num_devices)
device_type: CL_DEVICE_TYPE_{CPU, GPU},
CL_DEVICE_TYPE_{ACCELERATOR, DEFAULT, ALL}
cl_int clGetDeviceInfo (cl_device_id device,
cl_device_info param_name, size_t param_value_size,
void *param_value, size_t *param_value_size_ret)
param_name: CL_DEVICE_TYPE,
CL_DEVICE_VENDOR_ID,
CL_DEVICE_MAX_COMPUTE_UNITS,
CL_DEVICE_MAX_WORK_ITEM_{DIMENSIONS, SIZES},
CL_DEVICE_MAX_WORK_GROUP_SIZE,
CL_DEVICE_{NATIVE, PREFERRED}_VECTOR_WIDTH_CHAR,
CL_DEVICE_{NATIVE, PREFERRED}_VECTOR_WIDTH_SHORT,
CL_DEVICE_{NATIVE, PREFERRED}_VECTOR_WIDTH_INT,
CL_DEVICE_{NATIVE, PREFERRED}_VECTOR_WIDTH_LONG,
CL_DEVICE_{NATIVE, PREFERRED}_VECTOR_WIDTH_FLOAT,
CL_DEVICE_{NATIVE, PREFERRED}_VECTOR_WIDTH_DOUBLE,
CL_DEVICE_{NATIVE, PREFERRED}_VECTOR_WIDTH_HALF,
CL_DEVICE_MAX_CLOCK_FREQUENCY,
CL_DEVICE_ADDRESS_BITS,
CL_DEVICE_MAX_MEM_ALLOC_SIZE,
CL_DEVICE_IMAGE_SUPPORT,
CL_DEVICE_MAX_{READ, WRITE}_IMAGE_ARGS,
CL_DEVICE_IMAGE2D_MAX_{WIDTH, HEIGHT},
CL_DEVICE_IMAGE3D_MAX_{WIDTH, HEIGHT, DEPTH},
CL_DEVICE_MAX_SAMPLERS,
CL_DEVICE_MAX_PARAMETER_SIZE,
CL_DEVICE_MEM_BASE_ADDR_ALIGN,
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE,
CL_DEVICE_SINGLE_FP_CONFIG,
CL_DEVICE_GLOBAL_MEM_CACHE_{TYPE, SIZE},
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE,
CL_DEVICE_GLOBAL_MEM_SIZE,
CL_DEVICE_MAX_CONSTANT_{BUFFER_SIZE, ARGS},
CL_DEVICE_LOCAL_MEM_{TYPE, SIZE},
CL_DEVICE_ERROR_CORRECTION_SUPPORT,
CL_DEVICE_PROFILING_TIMER_RESOLUTION,
CL_DEVICE_ENDIAN_LITTLE,
CL_DEVICE_AVAILABLE,
CL_DEVICE_COMPILER_AVAILABLE,
CL_DEVICE_EXECUTION_CAPABILITIES,
CL_DEVICE_QUEUE_PROPERTIES,
CL_DEVICE_{NAME, VENDOR, PROFILE, EXTENSIONS},
CL_DEVICE_HOST_UNIFIED_MEMORY,
CL_DEVICE_OPENCL_C_VERSION,
CL_DEVICE_VERSION,
CL_DRIVER_VERSION, CL_DEVICE_PLATFORM
Buffer Objects
Elements of a buffer object can be a scalar or vector data type or
a user-defined structure. Elements are stored sequentially and
are accessed using a pointer by a kernel executing on a device.
Data is stored in the same format as it is accessed by the kernel.
Create Buffer Objects [5.2.1]
cl_mem clCreateBuffer (cl_context context,
cl_mem_flags flags, size_t size, void *host_ptr,
cl_int *errcode_ret)
cl_mem clCreateSubBuffer (cl_mem buffer,
cl_mem_flags flags,
cl_buffer_create_type buffer_create_type,
const void *buffer_create_info, cl_int *errcode_ret)
flags for clCreateBuffer and clCreateSubBuffer:
CL_MEM_READ_WRITE,
CL_MEM_{WRITE, READ}_ONLY,
CL_MEM_{USE, ALLOC, COPY}_HOST_PTR
Read, Write, Copy Buffer Objects [5.2.2]
cl_int clEnqueueReadBuffer (
cl_command_queue command_queue, cl_mem buffer,
cl_bool blocking_read, size_t offset, size_t cb,
void *ptr, cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
cl_int clEnqueueWriteBuffer (
cl_command_queue command_queue, cl_mem buffer,
cl_bool blocking_write, size_t offset, size_t cb,
const void *ptr, cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
cl_int clEnqueueReadBufferRect (
cl_command_queue command_queue, cl_mem buffer,
cl_bool blocking_read, const size_t buffer_origin[3],
const size_t host_origin[3], const size_t region[3],
size_t buffer_row_pitch, size_t buffer_slice_pitch,
size_t host_row_pitch, size_t host_slice_pitch,
void *ptr, cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
cl_int clEnqueueWriteBufferRect (
cl_command_queue command_queue, cl_mem buffer,
cl_bool blocking_write, const size_t buffer_origin[3],
const size_t host_origin[3], const size_t region[3],
size_t buffer_row_pitch, size_t buffer_slice_pitch,
size_t host_row_pitch, size_t host_slice_pitch,
const void *ptr, cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
cl_int clEnqueueCopyBuffer (
cl_command_queue command_queue,
cl_mem src_buffer, cl_mem dst_buffer, size_t src_offset,
size_t dst_offset, size_t cb,
cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
cl_int clEnqueueCopyBufferRect (
cl_command_queue command_queue,
cl_mem src_buffer, cl_mem dst_buffer,
const size_t src_origin[3], const size_t dst_origin[3],
const size_t region[3], size_t src_row_pitch,
size_t src_slice_pitch, size_t dst_row_pitch,
size_t dst_slice_pitch, cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
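The *Rect calls above locate a region through an origin and two pitches. The sketch below models, in plain C, how the starting byte offset is derived: a row pitch of 0 defaults to region[0], and a slice pitch of 0 defaults to region[1] * row_pitch. The function name rect_byte_offset is hypothetical. The block mirrors the offset formula only and does not call OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the first byte of a rectangular
 * region. origin[0] and region[0] are in bytes; origin[1]/region[1] count
 * rows and origin[2]/region[2] count slices. */
size_t rect_byte_offset(const size_t origin[3], const size_t region[3],
                        size_t row_pitch, size_t slice_pitch)
{
    assert(origin != NULL && region != NULL);
    if (row_pitch == 0)
        row_pitch = region[0];               /* tightly packed rows */
    if (slice_pitch == 0)
        slice_pitch = region[1] * row_pitch; /* tightly packed slices */
    return origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
}
```

The same computation applies independently to the buffer side and the host side, each with its own origin and pitches.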
Map Buffer Objects [5.2.2]
void * clEnqueueMapBuffer (
cl_command_queue command_queue, cl_mem buffer,
cl_bool blocking_map, cl_map_flags map_flags,
size_t offset, size_t cb, cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event,
cl_int *errcode_ret)
Retain, Release, and Unmap Memory Objects [5.4.1-2]
cl_int clRetainMemObject (cl_mem memobj)
cl_int clReleaseMemObject (cl_mem memobj)
cl_int clSetMemObjectDestructorCallback (
cl_mem memobj, void (CL_CALLBACK *pfn_notify)
(cl_mem memobj, void *user_data),
void *user_data)
cl_int clEnqueueUnmapMemObject (
cl_command_queue command_queue, cl_mem memobj,
void *mapped_ptr, cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
Query Buffer Object [5.4.3]
cl_int clGetMemObjectInfo (cl_mem memobj,
cl_mem_info param_name, size_t param_value_size,
void *param_value, size_t *param_value_size_ret)
param_name: CL_MEM_{TYPE, FLAGS, SIZE, HOST_PTR},
CL_MEM_{MAP, REFERENCE}_COUNT, CL_MEM_OFFSET,
CL_MEM_CONTEXT, CL_MEM_ASSOCIATED_MEMOBJECT
OpenCL API 1.1 Quick Reference Card - Page 1
©2010 Khronos Group - Rev. 0711 www.khronos.org/opencl
Supported Data Types
Built-in Scalar Data Types [6.1.1]
OpenCL Type             API Type    Description
bool                    --          true (1) or false (0)
char                    cl_char     8-bit signed
unsigned char, uchar    cl_uchar    8-bit unsigned
short                   cl_short    16-bit signed
unsigned short, ushort  cl_ushort   16-bit unsigned
int                     cl_int      32-bit signed
unsigned int, uint      cl_uint     32-bit unsigned
long                    cl_long     64-bit signed
unsigned long, ulong    cl_ulong    64-bit unsigned
float                   cl_float    32-bit float
half                    cl_half     16-bit float (for storage only)
size_t                  --          32- or 64-bit unsigned integer
ptrdiff_t               --          32- or 64-bit signed integer
intptr_t                --          signed integer
uintptr_t               --          unsigned integer
void                    void        void
Built-in Vector Data Types [6.1.2]
OpenCL Type  API Type    Description
charn        cl_charn    8-bit signed
ucharn       cl_ucharn   8-bit unsigned
shortn       cl_shortn   16-bit signed
ushortn      cl_ushortn  16-bit unsigned
intn         cl_intn     32-bit signed
uintn        cl_uintn    32-bit unsigned
longn        cl_longn    64-bit signed
ulongn       cl_ulongn   64-bit unsigned
floatn       cl_floatn   32-bit float
Other Built-in Data Types [6.1.3]
OpenCL Type Description
image2d_t 2D image handle
image3d_t 3D image handle
sampler_t sampler handle
event_t event handle
Reserved Data Types [6.1.4]
OpenCL Type                                                           Description
booln                                                                 boolean vector
double, doublen                                                       OPTIONAL 64-bit float, vector
halfn                                                                 16-bit, vector
quad, quadn                                                           128-bit float, vector
complex half, complex halfn, imaginary half, imaginary halfn          16-bit complex, vector
complex float, complex floatn, imaginary float, imaginary floatn      32-bit complex, vector
complex double, complex doublen, imaginary double, imaginary doublen  64-bit complex, vector
complex quad, complex quadn, imaginary quad, imaginary quadn          128-bit complex, vector
floatnxm                                                              n*m matrix of 32-bit floats
doublenxm                                                             n*m matrix of 64-bit floats
long double, long doublen                                             64 - 128-bit float, vector
long long, long longn                                                 128-bit signed
unsigned long long, ulong long, ulong longn                           128-bit unsigned
Operators [6.3]
These operators behave as in C99, except that operands may
include vector types when possible:
+ - * % / -- ++ == != &
~ ^ > < >= <= | ! && ||
?: >> << , = op= sizeof
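With vector operands, arithmetic operators act on each component independently. Relational operators return a signed integer vector whose components are -1 (all bits set) where the comparison holds and 0 where it does not, whereas scalar comparisons yield 1 or 0. The plain-C sketch below models that behaviour for a four-component vector. The struct and function names are hypothetical stand-ins for float4, int4, and the built-in operators.

```c
#include <stdint.h>

typedef struct { float   s[4]; } float4_model;  /* stand-in for float4 */
typedef struct { int32_t s[4]; } int4_model;    /* stand-in for int4   */

/* Models a + b on float4: component-wise addition. */
float4_model add4(float4_model a, float4_model b)
{
    float4_model r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = a.s[i] + b.s[i];
    return r;
}

/* Models a < b on float4: -1 per component if true, 0 if false. */
int4_model less4(float4_model a, float4_model b)
{
    int4_model r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = (a.s[i] < b.s[i]) ? -1 : 0;
    return r;
}
```

The all-bits-set result is what lets a vector comparison be used directly as a mask in bitwise operations.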
Address Space Qualifiers [6.5]
__global, global __local, local
__constant, constant __private, private
Function Qualifiers [6.7]
__kernel, kernel
__attribute__((vec_type_hint(type))) //type defaults to int
__attribute__((work_group_size_hint(X, Y, Z)))
__attribute__((reqd_work_group_size(X, Y, Z)))
Conversions & Type Casting Examples [6.2]
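T a = (T)b;               // Scalar to scalar, or scalar to vector
T a = convert_T(b);       // Explicit conversion
T a = convert_T_R(b);     // R is rounding mode
T a = as_T(b);            // Reinterpret the bits of b as type T
T a = convert_T_sat_R(b); // Saturated conversion; R is rounding mode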
R can be one of the following rounding modes:
_rte to nearest even
_rtz toward zero
_rtp toward + infinity
_rtn toward - infinity
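To make the difference between the conversion forms concrete, here is a plain-C model of convert_uchar_sat_R applied to a float, together with a model of as_uint. This is a sketch under assumptions: the enum and function names are hypothetical, and the code imitates the semantics on the host rather than calling the OpenCL C built-ins.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical enum naming the four rounding-mode suffixes. */
typedef enum { RTE, RTZ, RTP, RTN } round_mode;

/* Models convert_uchar_sat_R(x): round according to r, then saturate
 * the result to the range 0..255. */
uint8_t convert_uchar_sat_model(float x, round_mode r)
{
    if (!(x > 0.0f)) return 0;    /* negatives, zero, and NaN give 0 */
    if (x >= 255.0f) return 255;  /* saturate at the top of the range */
    int i = (int)x;               /* truncate toward zero */
    float frac = x - (float)i;
    switch (r) {
    case RTE: if (frac > 0.5f || (frac == 0.5f && (i & 1))) ++i; break;
    case RTP: if (frac > 0.0f) ++i; break;
    case RTZ:                     /* for non-negative x, toward zero and */
    case RTN: break;              /* toward -infinity both truncate      */
    }
    return (uint8_t)i;
}

/* Models as_uint(f): same 32 bits, reinterpreted, no value conversion. */
uint32_t as_uint_model(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Note how 2.5 rounds to 2 under _rte (ties go to the even neighbour) while 3.5 rounds to 4.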
Program Objects (continued)
cl_int clGetProgramBuildInfo (cl_program program,
cl_device_id device, cl_program_build_info param_name,
size_t param_value_size, void *param_value,
size_t *param_value_size_ret)
param_name: CL_PROGRAM_BUILD_{STATUS, OPTIONS, LOG}
Unload the OpenCL Compiler [5.6.4]
cl_int clUnloadCompiler (void)
Vector Addressing Equivalencies
Numeric indices are preceded by the letter s or S, e.g.: s1. Swizzling, duplication, and nesting are allowed, e.g.: v.yx, v.xx, v.lo.x
         v.lo          v.hi          v.odd         v.even
float2   v.x, v.s0     v.y, v.s1     v.y, v.s1     v.x, v.s0
float3*  v.s01, v.xy   v.s23, v.zw   v.s13, v.yw   v.s02, v.xz
float4   v.s01, v.xy   v.s23, v.zw   v.s13, v.yw   v.s02, v.xz
float8   v.s0123       v.s4567       v.s1357       v.s0246
float16  v.s01234567   v.s89abcdef   v.s13579bdf   v.s02468ace
*When using .lo or .hi with a 3-component vector, the .w component is undefined.
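The equivalencies above follow a simple index rule: .lo and .hi take the lower and upper halves, while .even and .odd take alternating components. A plain-C sketch of that rule over an array of floats is shown below. The enum and function names are hypothetical, and a 3-component vector would be treated as a 4-component one whose last component is undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selector for the four half-vector suffixes. */
typedef enum { SEL_LO, SEL_HI, SEL_EVEN, SEL_ODD } half_sel;

/* Copies the selected half of an n-component vector (n = 2, 4, 8, 16)
 * into out, which must have room for n / 2 floats. */
void select_half(const float *v, size_t n, half_sel sel, float *out)
{
    size_t half = n / 2;
    assert(v != NULL && out != NULL && n % 2 == 0);
    for (size_t i = 0; i < half; ++i) {
        switch (sel) {
        case SEL_LO:   out[i] = v[i];         break;
        case SEL_HI:   out[i] = v[i + half];  break;
        case SEL_EVEN: out[i] = v[2 * i];     break;
        case SEL_ODD:  out[i] = v[2 * i + 1]; break;
        }
    }
}
```

For an 8-component input, SEL_ODD picks indices 1, 3, 5, 7, matching v.s1357 in the table.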
Vector Component Addressing [6.1.7]
Vector Components
             0          1          2          3          4     5     6     7     8     9     10          11          12          13          14          15
float2 v;    v.x, v.s0  v.y, v.s1
float3 v;    v.x, v.s0  v.y, v.s1  v.z, v.s2
float4 v;    v.x, v.s0  v.y, v.s1  v.z, v.s2  v.w, v.s3
float8 v;    v.s0       v.s1       v.s2       v.s3       v.s4  v.s5  v.s6  v.s7
float16 v;   v.s0       v.s1       v.s2       v.s3       v.s4  v.s5  v.s6  v.s7  v.s8  v.s9  v.sa, v.sA  v.sb, v.sB  v.sc, v.sC  v.sd, v.sD  v.se, v.sE  v.sf, v.sF
Kernel and Event Objects
Create Kernel Objects [5.7.1]
cl_kernel clCreateKernel (cl_program program,
const char *kernel_name, cl_int *errcode_ret)
cl_int clCreateKernelsInProgram (cl_program program,
cl_uint num_kernels, cl_kernel *kernels,
cl_uint *num_kernels_ret)
cl_int clRetainKernel (cl_kernel kernel)
cl_int clReleaseKernel (cl_kernel kernel)
Kernel Args. & Object Queries [5.7.2, 5.7.3]
cl_int clSetKernelArg (cl_kernel kernel, cl_uint arg_index,
size_t arg_size, const void *arg_value)
cl_int clGetKernelInfo (cl_kernel kernel,
cl_kernel_info param_name, size_t param_value_size,
void *param_value, size_t *param_value_size_ret)
param_name: CL_KERNEL_FUNCTION_NAME,
CL_KERNEL_NUM_ARGS, CL_KERNEL_REFERENCE_COUNT,
CL_KERNEL_CONTEXT, CL_KERNEL_PROGRAM
cl_int clGetKernelWorkGroupInfo (
cl_kernel kernel, cl_device_id device,
cl_kernel_work_group_info param_name,
size_t param_value_size, void *param_value,
size_t *param_value_size_ret)
param_name: CL_KERNEL_WORK_GROUP_SIZE,
CL_KERNEL_COMPILE_WORK_GROUP_SIZE,
CL_KERNEL_{LOCAL, PRIVATE}_MEM_SIZE,
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE
Execute Kernels [5.8]
cl_int clEnqueueNDRangeKernel (
cl_command_queue command_queue,
cl_kernel kernel, cl_uint work_dim,
const size_t *global_work_offset,
const size_t *global_work_size,
const size_t *local_work_size,
cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
cl_int clEnqueueTask (
cl_command_queue command_queue, cl_kernel kernel,
cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
cl_int clEnqueueNativeKernel (
cl_command_queue command_queue, void (*user_func)(void *),
void *args, size_t cb_args, cl_uint num_mem_objects,
const cl_mem *mem_list, const void **args_mem_loc,
cl_uint num_events_in_wait_list,
const cl_event *event_wait_list, cl_event *event)
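In OpenCL 1.1, when local_work_size is supplied to clEnqueueNDRangeKernel, each global_work_size entry must be evenly divisible by the matching local size. A common host-side pattern is to round the problem size up to the next multiple and have the kernel skip the excess work-items. The sketch below shows the rounding step in plain C. The function name round_up_global is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest multiple of local that is >= n. */
size_t round_up_global(size_t n, size_t local)
{
    assert(local > 0);
    return ((n + local - 1) / local) * local;
}
```

With n = 1000 and a local size of 64, this yields 1024. The kernel would then compare its global ID against the true element count, passed in as an extra argument, before touching memory.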
Event Objects [5.9]
cl_event clCreateUserEvent (cl_context context,
cl_int *errcode_ret)
cl_int clSetUserEventStatus (cl_event event,
cl_int execution_status)
cl_int clWaitForEvents (cl_uint num_events,
const cl_event *event_list)
cl_int clGetEventInfo (cl_event event,
cl_event_info param_name, size_t param_value_size,
void *param_value, size_t *param_value_size_ret)
param_name: CL_EVENT_COMMAND_{QUEUE, TYPE},
CL_EVENT_{CONTEXT, REFERENCE_COUNT},
CL_EVENT_COMMAND_EXECUTION_STATUS
cl_int clSetEventCallback (cl_event event,
cl_int command_exec_callback_type,
void (CL_CALLBACK *pfn_event_notify)
(cl_event event, cl_int event_command_exec_status,
void *user_data),
void *user_data)
cl_int clRetainEvent (cl_event event)
cl_int clReleaseEvent (cl_event event)
Out-of-order Execution of Kernels & Memory Object Commands [5.10]
cl_int clEnqueueMarker (
cl_command_queue command_queue,
cl_event *event)
cl_int clEnqueueWaitForEvents (
cl_command_queue command_queue,
cl_uint num_events, const cl_event *event_list)
cl_int clEnqueueBarrier (
cl_command_queue command_queue)
Profiling Operations [5.11]
cl_int clGetEventProfilingInfo (cl_event event,
cl_profiling_info param_name,
size_t param_value_size, void *param_value,
size_t *param_value_size_ret)
param_name: CL_PROFILING_COMMAND_QUEUED,
CL_PROFILING_COMMAND_{SUBMIT, START, END}
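Each profiling param_name above returns a 64-bit counter (cl_ulong) in nanoseconds, and the values are available only for commands enqueued on a queue created with CL_QUEUE_PROFILING_ENABLE. A minimal sketch of turning a START/END pair into milliseconds follows. The function name elapsed_ms is hypothetical, and uint64_t stands in for cl_ulong so the block compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts two nanosecond timestamps, as returned for
 * CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, into elapsed
 * milliseconds. */
double elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns);
    return (double)(end_ns - start_ns) / 1.0e6;
}
```

Subtracting QUEUED from START in the same way gives the time a command spent waiting before execution began.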
Flush and Finish [5.12]
cl_int clFlush (cl_command_queue command_queue)
cl_int clFinish (cl_command_queue command_queue)
Preprocessor Directives & Macros [6.9]
#pragma OPENCL FP_CONTRACT on-off-switch
on-off-switch: ON, OFF, DEFAULT
__FILE__                 Current source file
__LINE__                 Integer line number
__OPENCL_VERSION__       Integer version number
__CL_VERSION_1_0__       Substitutes integer 100 for version 1.0
__CL_VERSION_1_1__       Substitutes integer 110 for version 1.1
__ENDIAN_LITTLE__        1 if device is little endian
__kernel_exec(X, typen)  Same as: __kernel __attribute__((work_group_size_hint(X, 1, 1))) \
                                  __attribute__((vec_type_hint(typen)))
__IMAGE_SUPPORT__        1 if images are supported
__FAST_RELAXED_MATH__    1 if -cl-fast-relaxed-math optimization option is specified
Common Built-in Functions [6.11.4]
T is type float or floatn (or optionally double, doublen, or halfn).
Optional extensions enable double, doublen, and halfn types.
T clamp (T x, T min, T max)                          Clamp x to range given by min, max
floatn clamp (floatn x, float min, float max)
doublen clamp (doublen x, double min, double max)
halfn clamp (halfn x, half min, half max)
T degrees (T radians)                                radians to degrees
T max (T x, T y)                                     Max of x and y
floatn max (floatn x, float y)
doublen max (doublen x, double y)
halfn max (halfn x, half y)
T min (T x, T y)                                     Min of x and y
floatn min (floatn x, float y)
doublen min (doublen x, double y)
halfn min (halfn x, half y)
T mix (T x, T y, T a)                                Linear blend of x and y
floatn mix (floatn x, float y, float a)
doublen mix (doublen x, double y, double a)
halfn mix (halfn x, half y, half a)
T radians (T degrees)                                degrees to radians
T step (T edg
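For readers checking kernel results on the host, the scalar float forms of clamp, mix, degrees, and radians listed above can be modelled in a few lines of plain C. This is a sketch under assumptions: the _model function names and the MODEL_PI constant are hypothetical, and the formulas are min(max(x, min), max) for clamp and x + (y - x) * a for mix.

```c
#include <assert.h>

#define MODEL_PI 3.14159265358979323846f   /* illustrative constant */

/* Models clamp(x, lo, hi) as min(max(x, lo), hi); requires lo <= hi. */
float clamp_model(float x, float lo, float hi)
{
    assert(lo <= hi);
    float t = (x < lo) ? lo : x;   /* max(x, lo) */
    return (t > hi) ? hi : t;      /* min(t, hi) */
}

/* Models mix(x, y, a): linear blend, x when a = 0 and y when a = 1. */
float mix_model(float x, float y, float a)
{
    return x + (y - x) * a;
}

/* Models degrees(r) and radians(d). */
float degrees_model(float r) { return r * (180.0f / MODEL_PI); }
float radians_model(float d) { return d * (MODEL_PI / 180.0f); }
```

The vector forms apply the same arithmetic to each component, with the scalar min, max, or a argument shared across components.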