ocl_kernels

OpenCL Kernel Examples

This section contains OpenCL Kernel Examples.

Examples Table:

Example Description Key Concepts/Keywords
cl_array_partition This example shows how to use array partitioning to improve the performance of a kernel.

Key Concepts

Keywords

cl_burst_rw This is a simple example of using the AXI4-master interface for burst reads and writes.

Key Concepts

Keywords

cl_dataflow_func This is a simple example of vector addition that demonstrates Dataflow functionality in an OpenCL kernel. OpenCL Dataflow allows the user to run multiple functions together to achieve higher throughput.

Key Concepts

Keywords

cl_dataflow_subfunc This is a simple example of vector addition that demonstrates how OpenCL Dataflow allows the user to run multiple sub-functions together to achieve higher throughput.

Key Concepts

  • SubFunction Level Parallelism

Keywords

cl_gmem_2banks This example demonstrates how to use a 2ddr XSA and how to create buffers in each DDR.

Key Concepts

Keywords

cl_helloworld This example is a simple OpenCL application. It will highlight the basic flow of an OpenCL application.

Key Concepts

cl_lmem_2rw This is a simple example of vector addition that demonstrates how to utilize both ports of local memory.

Key Concepts

Keywords

cl_loop_reorder This is a simple example of matrix multiplication (Row x Col) that demonstrates how loop reordering achieves a better pipeline initiation interval (II).

Key Concepts

Keywords
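
The loop-reordering idea behind cl_loop_reorder can be sketched in plain C. This is a software model under stated assumptions, not the kernel shipped with the example: the names `MAT_N`, `mmult_ijk`, and `mmult_ikj` are hypothetical, and integers are used so the two results can be compared exactly. In the (i, j, k) order the innermost loop accumulates into one scalar, so every iteration depends on the previous one; when the addition takes more than one cycle (for example with floating-point data), that dependency can limit the II. In the (i, k, j) order each innermost iteration updates a different accumulator, so consecutive iterations are independent.

```c
#define MAT_N 2  /* hypothetical matrix dimension, kept small for illustration */

/* Baseline order (i, j, k): the innermost loop accumulates into a single
   scalar, which creates a loop-carried dependency on `sum`. */
static void mmult_ijk(int a[MAT_N][MAT_N], int b[MAT_N][MAT_N],
                      int c[MAT_N][MAT_N]) {
    for (int i = 0; i < MAT_N; i++) {
        for (int j = 0; j < MAT_N; j++) {
            int sum = 0;
            for (int k = 0; k < MAT_N; k++) {
                sum += a[i][k] * b[k][j];
            }
            c[i][j] = sum;
        }
    }
}

/* Reordered (i, k, j): the innermost loop walks across a row of partial
   sums, so iteration j does not read what iteration j-1 just wrote. */
static void mmult_ikj(int a[MAT_N][MAT_N], int b[MAT_N][MAT_N],
                      int c[MAT_N][MAT_N]) {
    for (int i = 0; i < MAT_N; i++) {
        int row_sum[MAT_N] = {0};
        for (int k = 0; k < MAT_N; k++) {
            for (int j = 0; j < MAT_N; j++) {
                row_sum[j] += a[i][k] * b[k][j];
            }
        }
        for (int j = 0; j < MAT_N; j++) {
            c[i][j] = row_sum[j];
        }
    }
}
```

Both functions compute the same product; only the order in which the partial products are accumulated differs, which is what gives the compiler room to pipeline the innermost loop.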

cl_partition_cyclicblock This example shows how to use block and cyclic array partitioning to improve the performance of a kernel.

Key Concepts

Keywords
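
To make the difference between block and cyclic partitioning concrete, the index mapping can be modeled in plain C. This is a minimal sketch, assuming the array size is an exact multiple of the partition factor; `slot_t`, `cyclic_slot`, and `block_slot` are hypothetical names, not part of the example's source.

```c
/* Location of one array element after partitioning:
   which physical memory (bank) it lives in and where inside that bank. */
typedef struct {
    int bank;
    int offset;
} slot_t;

/* Cyclic partitioning: consecutive elements are dealt out to banks
   round-robin, so elements i, i+1, ..., i+factor-1 land in distinct banks. */
static slot_t cyclic_slot(int i, int factor) {
    slot_t s = { i % factor, i / factor };
    return s;
}

/* Block partitioning: the array is cut into `factor` contiguous chunks,
   so neighbouring elements usually share a bank.
   Assumes size % factor == 0. */
static slot_t block_slot(int i, int size, int factor) {
    int per_bank = size / factor;
    slot_t s = { i / per_bank, i % per_bank };
    return s;
}
```

For a 16-element array split by a factor of 4, element 6 maps to bank 2, offset 1 under cyclic partitioning, and to bank 1, offset 2 under block partitioning. Which scheme helps depends on the access pattern: a loop that reads neighbouring elements in the same cycle generally benefits from the cyclic layout, while accesses that are `size / factor` apart favor the block layout.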

cl_shift_register This example demonstrates how to shift values in registers in each clock cycle.

Key Concepts

Keywords
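
A shift register can be sketched in plain C as a small array whose contents all move one position per loop iteration. The sketch below applies it to a FIR-style filter, which is an assumption chosen for illustration; `TAPS` and `fir_shift` are hypothetical names, and the code is a software model rather than the example's kernel.

```c
#define TAPS 4  /* hypothetical number of filter taps */

/* Each outer iteration models one step: every stored value moves one
   position, the new sample enters at position 0, and all taps are read
   together to form the output. */
static void fir_shift(const int *in, int *out, int n, const int coeff[TAPS]) {
    int shift_reg[TAPS] = {0};

    for (int i = 0; i < n; i++) {
        /* Shift from the far end so no value is overwritten before it moves. */
        for (int t = TAPS - 1; t > 0; t--) {
            shift_reg[t] = shift_reg[t - 1];
        }
        shift_reg[0] = in[i];

        /* Multiply-accumulate across all taps. */
        int acc = 0;
        for (int t = 0; t < TAPS; t++) {
            acc += shift_reg[t] * coeff[t];
        }
        out[i] = acc;
    }
}
```

With all coefficients set to 1, the input 1, 2, 3, 4, 5 produces the running four-sample sums 1, 3, 6, 10, 14. Because the array is small and every element is touched in every iteration, a hardware compiler can keep it in registers instead of a memory with a limited number of ports.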

cl_systolic_array This is a simple example of matrix multiplication (Row x Col) to help developers learn systolic-array-based algorithm design. Note: Systolic-array-based algorithm design is well suited to FPGAs.

cl_wide_mem_rw This is a simple example of vector addition that demonstrates wide memory access using the uint16 data type. Based on the input argument type, the V++ compiler will figure out the memory data width between global memory and the kernel. For this example, the uint16 data type is used, so the memory data width will be 16 x (integer bit size) = 16 x 32 = 512 bits.

Key Concepts

Keywords
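
The width arithmetic for cl_wide_mem_rw can be checked with a plain C model. This is a sketch only: `wide_word_t` is a hypothetical stand-in for the OpenCL `uint16` vector type (16 lanes of 32-bit unsigned integers), and `vadd_wide` is a hypothetical name for a vector addition that handles one wide word per iteration.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 16  /* lanes in one wide word, mirroring OpenCL's uint16 */

/* 16 lanes x 32 bits = 512 bits per word. */
typedef struct {
    uint32_t lane[LANES];
} wide_word_t;

/* Vector addition on wide words: each outer iteration models one
   512-bit read from each input and one 512-bit write to the output. */
static void vadd_wide(const wide_word_t *a, const wide_word_t *b,
                      wide_word_t *out, size_t n_words) {
    for (size_t w = 0; w < n_words; w++) {
        for (int l = 0; l < LANES; l++) {
            out[w].lane[l] = a[w].lane[l] + b[w].lane[l];
        }
    }
}
```

Here `sizeof(wide_word_t) * 8` evaluates to 512, matching the 16 x 32 calculation, and a buffer of n 32-bit elements needs only n / 16 wide transfers instead of n narrow ones.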