Build & Install OpenCL on Jetson TK1

Sep 7, 2021

OpenCL 1.2 via PoCL (CPU ARM Cortex A15 and GPU NVIDIA Kepler GK20a via CUDA backend)

OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. [Wikipedia].

NVIDIA does not ship an OpenCL driver for the Jetson TK1 under L4T, so OpenCL is not available on this board out of the box. Luckily, there are several open-source OpenCL implementations that can fill the gap. One of them is PoCL (Portable Computing Language).

PoCL (Portable Computing Language)

PoCL is a portable open source (MIT-licensed) implementation of the OpenCL standard (1.2 with some 2.0 features supported). In addition to being an easily portable multi-device (truely heterogeneous) open-source OpenCL implementation, a major goal of this project is improving interoperability of diversity of OpenCL-capable devices by integrating them to a single centrally orchestrated platform. [PoCL Page]

Prerequisites

Upgrade your Jetson TK1 to Ubuntu 16.04 by following this tutorial: "Jetson TK1 Upgrade OS to Ubuntu 16.04" (yunusmuhammad007.medium.com).

Make sure you can compile CUDA programs with nvcc by following this tutorial: "Jetson TK1 Use CUDA 6.5 After Upgrade to Ubuntu 16.04" (yunusmuhammad007.medium.com).

Installation

PoCL uses Clang and LLVM to compile kernels, so we will use Clang+LLVM 8 here. First set the LLVM version in the shell you will use for the following commands,

LLVM_VERSION=8

Install dependency libraries,

sudo apt install -y build-essential ocl-icd-libopencl1 cmake git pkg-config libclang-${LLVM_VERSION}-dev clang-${LLVM_VERSION} llvm-${LLVM_VERSION} make ninja-build ocl-icd-libopencl1 ocl-icd-dev ocl-icd-opencl-dev libhwloc-dev zlib1g zlib1g-dev clinfo dialog apt-utils libxml2-dev llvm-${LLVM_VERSION}-dev libncurses5
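
Before moving on, it can help to confirm that the versioned tools actually landed on the PATH. The following is a minimal sketch, assuming the clang-8 and llvm-8 packages install binaries named clang-8 and llvm-config-8 (the usual Debian/Ubuntu naming). The loop and the missing counter are illustrative only.

```shell
# Assumption: Ubuntu's versioned packages install clang-8 and llvm-config-8.
LLVM_VERSION="${LLVM_VERSION:-8}"
missing=0
for tool in "clang-${LLVM_VERSION}" "llvm-config-${LLVM_VERSION}" cmake git pkg-config; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
        missing=$((missing + 1))
    fi
done
echo "tools missing: $missing"
```

If any tool is reported missing, rerun the apt command in the same shell where LLVM_VERSION is set, since an empty variable changes the package names.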

Download & Build PoCL 1.7

Download the PoCL 1.7 source from GitHub,

cd ~
git clone --single-branch --branch release_1_7 https://github.com/pocl/pocl.git

Patch PoCL for Jetson TK1 (ARMv7 + CUDA 6.5) by creating a file named pocl-cuda6.5.patch with the following content,

Then apply the patch to lib/CL/devices/cuda/pocl-cuda.c in the PoCL source tree with the following command,

sudo patch -N ~/pocl/lib/CL/devices/cuda/pocl-cuda.c pocl-cuda6.5.patch

If the patch applies cleanly, you should see output like this,

patching file /home/ubuntu/pocl/lib/CL/devices/cuda/pocl-cuda.c

Configure & Build,

cd pocl
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/usr/local/pocl/ -DENABLE_CUDA=ON ..
make
sudo make install
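
If CMake does not pick up LLVM 8 on its own, PoCL's build accepts a WITH_LLVM_CONFIG option naming the llvm-config binary to use. The sketch below assumes the versioned binary is called llvm-config-8, as installed by the llvm-8 package. Whether your board needs this at all is an assumption, and the script only prints the resulting command rather than running it.

```shell
# Assumption: llvm-config-8 is the binary provided by the llvm-8 package.
LLVM_VERSION="${LLVM_VERSION:-8}"
LLVM_CONFIG="$(command -v "llvm-config-${LLVM_VERSION}" || true)"

CMAKE_ARGS="-DCMAKE_INSTALL_PREFIX=/usr/local/pocl/ -DENABLE_CUDA=ON"
if [ -n "$LLVM_CONFIG" ]; then
    # Point PoCL's CMake at the versioned llvm-config explicitly.
    CMAKE_ARGS="$CMAKE_ARGS -DWITH_LLVM_CONFIG=$LLVM_CONFIG"
fi

# Print the command to run from inside the build directory.
echo "cmake $CMAKE_ARGS .."
```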

Create a pocl.icd file in /etc/OpenCL/vendors/,

sudo mkdir -p /etc/OpenCL/vendors/
cd /etc/OpenCL/vendors/
sudo nano pocl.icd

Paste the following path,

/usr/local/pocl/lib/libpocl.so

Save pocl.icd and exit the editor.
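
A quick sanity check for this step: the ICD loader reads the .icd files under /etc/OpenCL/vendors/ and opens the library named inside each one, so both the file and the library it points to must exist. The sketch below assumes the file holds an absolute path, as in this tutorial. check_icd is a hypothetical helper name introduced for illustration.

```shell
# check_icd is a hypothetical helper, not part of PoCL or ocl-icd.
check_icd() {
    icd_file="$1"
    if [ ! -f "$icd_file" ]; then
        echo "missing ICD file: $icd_file"
        return 1
    fi
    # The first line of the .icd file is the library the loader will open.
    lib_path="$(head -n 1 "$icd_file")"
    if [ ! -e "$lib_path" ]; then
        echo "library not found: $lib_path"
        return 1
    fi
    echo "ok: $icd_file -> $lib_path"
}

check_icd /etc/OpenCL/vendors/pocl.icd || true
```

A "library not found" message here usually means the install prefix used in the cmake step and the path pasted into pocl.icd do not match.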

Installation Test

Run clinfo,

clinfo

Result should look like this,

We can see that two devices are detected: the CPU (ARM Cortex A15) and the GPU (NVIDIA Kepler GK20a) using CUDA 6.5.
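
The full clinfo report is long, so it can be easier to spot the two devices by filtering it down to the name and type lines. This is a minimal sketch, assuming clinfo's default human-readable layout with "Platform Name", "Device Name" and "Device Type" labels. summarize_devices is a hypothetical helper name.

```shell
# summarize_devices is a hypothetical helper that keeps only name/type lines.
summarize_devices() {
    grep -E '^ *(Platform Name|Device Name|Device Type)' || true
}

if command -v clinfo >/dev/null 2>&1; then
    clinfo | summarize_devices
else
    echo "clinfo is not installed"
fi
```
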
Create a small C program that queries device info and save it as query_device_info.c,
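
A minimal sketch of such a program, written out with a shell heredoc so it can be pasted straight into the terminal. It is not necessarily the original listing: the mapping of "Hardware version" to CL_DEVICE_VERSION, "Software version" to CL_DRIVER_VERSION, "OpenCL C version" to CL_DEVICE_OPENCL_C_VERSION and "Parallel compute units" to CL_DEVICE_MAX_COMPUTE_UNITS is an assumption inferred from the output format shown further down, and print_string is a helper name introduced here.

```shell
# Writes a hypothetical query_device_info.c; a sketch, not the author's file.
cat > query_device_info.c <<'EOF'
#define CL_TARGET_OPENCL_VERSION 120
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

/* Print one string-valued device property as " <n>.<k> <label>: <value>". */
static void print_string(int n, int k, const char *label,
                         cl_device_id dev, cl_device_info param)
{
    char value[1024] = "";
    clGetDeviceInfo(dev, param, sizeof value, value, NULL);
    printf(" %d.%d %s: %s\n", n, k, label, value);
}

int main(void)
{
    cl_uint num_platforms = 0;
    if (clGetPlatformIDs(0, NULL, &num_platforms) != CL_SUCCESS || num_platforms == 0) {
        fprintf(stderr, "No OpenCL platform found\n");
        return 1;
    }
    cl_platform_id *platforms = malloc(num_platforms * sizeof *platforms);
    if (platforms == NULL)
        return 1;
    clGetPlatformIDs(num_platforms, platforms, NULL);

    int n = 0; /* running device number across all platforms */
    for (cl_uint p = 0; p < num_platforms; p++) {
        cl_uint num_devices = 0;
        if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 0, NULL,
                           &num_devices) != CL_SUCCESS || num_devices == 0)
            continue;
        cl_device_id *devices = malloc(num_devices * sizeof *devices);
        if (devices == NULL)
            break;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, num_devices, devices, NULL);

        for (cl_uint d = 0; d < num_devices; d++) {
            char name[256] = "";
            cl_uint units = 0;
            n++;
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof name, name, NULL);
            printf("%d. Device: %s\n", n, name);
            print_string(n, 1, "Hardware version", devices[d], CL_DEVICE_VERSION);
            print_string(n, 2, "Software version", devices[d], CL_DRIVER_VERSION);
            print_string(n, 3, "OpenCL C version", devices[d], CL_DEVICE_OPENCL_C_VERSION);
            clGetDeviceInfo(devices[d], CL_DEVICE_MAX_COMPUTE_UNITS,
                            sizeof units, &units, NULL);
            printf(" %d.4 Parallel compute units: %u\n", n, units);
        }
        free(devices);
    }
    free(platforms);
    return 0;
}
EOF
```

The heredoc delimiter is quoted ('EOF') so the shell leaves the printf format strings and other C text untouched.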

Compile it with gcc, linking against the ocl-icd loader,

gcc query_device_info.c -o query_device_info `pkg-config --libs --cflags OpenCL`

Then run the compiled binary,

./query_device_info

Output should look like this,

1. Device: pthread-cortex-a15
 1.1 Hardware version: OpenCL 1.2 pocl HSTR: pthread-armv7-unknown-linux-gnueabihf-cortex-a15
 1.2 Software version: 1.7
 1.3 OpenCL C version: OpenCL C 1.2 pocl
 1.4 Parallel compute units: 4
2. Device: GK20A
 2.1 Hardware version: OpenCL 1.2 pocl HSTR: CUDA-sm_32
 2.2 Software version: 1.7
 2.3 OpenCL C version: OpenCL C 1.2 pocl
 2.4 Parallel compute units: 1

Additional Note

Building with CUDA enabled (-DENABLE_CUDA=ON) used to fail for me every time, which is related to this issue: https://github.com/pocl/pocl/issues/600 [Solved by applying the patch to PoCL 1.7 and building with Clang+LLVM 8]
More about PoCL CUDA: http://portablecl.org/docs/html/cuda.html
Another write-up that successfully builds PoCL with CUDA on the Jetson TX1: https://highlevel-synthesis.com/2018/08/31/how-to-install-pocl-on-jetson-tx1/
PoCL installation procedure: http://portablecl.org/docs/html/install.html
https://yunusmuhammad007.medium.com/build-and-install-opencl-on-jetson-nano-10bf4a7f0e65