Build & Install OpenCL in Jetson TK1

Muhammad Yunus
3 min read · Sep 7, 2021


OpenCL 1.2 via PoCL (CPU ARM Cortex A15 and GPU NVIDIA Kepler GK20a via CUDA backend)

OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. [Wikipedia].

NVIDIA does not release an OpenCL implementation for the Jetson TK1 under L4T OS, so OpenCL is not available on this board out of the box. Luckily, there are several open source OpenCL implementations that can help us solve this issue. One of the options is PoCL (Portable Computing Language).

PoCL (Portable Computing Language)

PoCL is a portable open source (MIT-licensed) implementation of the OpenCL standard (1.2 with some 2.0 features supported). In addition to being an easily portable multi-device (truly heterogeneous) open-source OpenCL implementation, a major goal of this project is improving interoperability of diversity of OpenCL-capable devices by integrating them to a single centrally orchestrated platform. [PoCL Page]

Prerequisites

  • Upgrade your Jetson TK1 to Ubuntu 16.04 by following this tutorial,
  • Ensure you can compile a CUDA program with nvcc by following this tutorial.

Installation

  • PoCL uses Clang+LLVM for compilation, and this guide uses Clang+LLVM 8. Set the LLVM version in your shell first,
LLVM_VERSION=8
  • Install the dependencies,
sudo apt install -y build-essential ocl-icd-libopencl1 cmake git pkg-config libclang-${LLVM_VERSION}-dev clang-${LLVM_VERSION} llvm-${LLVM_VERSION} make ninja-build ocl-icd-dev ocl-icd-opencl-dev libhwloc-dev zlib1g zlib1g-dev clinfo dialog apt-utils libxml2-dev llvm-${LLVM_VERSION}-dev libncurses5
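Before moving on to the build, it can help to confirm that the tools from the prerequisites and the install step are actually on the PATH. The helper below is a minimal sketch, not part of the original setup: check_tools is a hypothetical function name, and the assumption is that the Ubuntu packages provide versioned binaries such as clang-8 and llvm-config-8.

```shell
# check_tools NAME...: print every command that is not on the PATH.
# Returns 0 when all commands are found, 1 otherwise.
# (check_tools is a hypothetical helper name used for illustration.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

On the Jetson it could be called as check_tools nvcc cmake pkg-config clinfo "clang-${LLVM_VERSION}" "llvm-config-${LLVM_VERSION}". If nvcc is reported missing, the CUDA bin directory is probably not in PATH.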

Download & Build PoCL 1.7

  • Download the PoCL 1.7 source from GitHub,
cd ~
git clone --single-branch --branch release_1_7 https://github.com/pocl/pocl.git
  • Patch PoCL for the Jetson TK1 (ARMv7 + CUDA 6.5) by creating a pocl-cuda6.5.patch file with the following content,
  • Then apply the patch to lib/CL/devices/cuda/pocl-cuda.c in the pocl tree with the following command,
sudo patch -N ~/pocl/lib/CL/devices/cuda/pocl-cuda.c pocl-cuda6.5.patch
  • If the patch applies successfully, you should see output like this,
patching file /home/ubuntu/pocl/lib/CL/devices/cuda/pocl-cuda.c
  • Configure & Build,
cd pocl
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/usr/local/pocl/ -DENABLE_CUDA=ON ..
make
sudo make install
  • Create a pocl.icd file in /etc/OpenCL/vendors/,
sudo mkdir -p /etc/OpenCL/vendors/
cd /etc/OpenCL/vendors/
sudo nano pocl.icd
  • Paste the following path,
/usr/local/pocl/lib/libpocl.so
  • Save the pocl.icd file and exit.
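The ICD file is just a one-line text file that tells the ocl-icd loader where libpocl.so lives, so it can also be written without an editor. The function below is a minimal sketch of that idea. write_icd is a hypothetical helper name, and the optional third argument is an assumed convention for passing a command prefix such as sudo.

```shell
# write_icd VENDOR_DIR LIB_PATH [PREFIX]
# Create VENDOR_DIR/pocl.icd containing LIB_PATH on a single line.
# PREFIX is an optional command prefix (for example "sudo") for
# root-owned directories. (Hypothetical helper, for illustration only.)
write_icd() {
    vendor_dir=$1
    lib_path=$2
    prefix=${3:-}
    # $prefix is left unquoted on purpose so an empty value disappears.
    $prefix mkdir -p "$vendor_dir" || return 1
    printf '%s\n' "$lib_path" | $prefix tee "$vendor_dir/pocl.icd" > /dev/null
}
```

With the paths used in this article, the call would be write_icd /etc/OpenCL/vendors /usr/local/pocl/lib/libpocl.so sudo, and cat /etc/OpenCL/vendors/pocl.icd should then print the library path.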

Installation Test

  • Run clinfo,
clinfo
  • Result should look like this,
  • We can see that two devices are detected: the CPU (ARM Cortex A15) and the GPU (NVIDIA Kepler GK20a) using CUDA 6.5.
  • Create a small C program to query device info, named query_device_info.c,
  • Compile it with gcc, linking against the ocl-icd loader,
gcc query_device_info.c -o query_device_info `pkg-config --libs --cflags OpenCL`
  • Then run the compiled binary,
./query_device_info
  • Output should look like this,
1. Device: pthread-cortex-a15
1.1 Hardware version: OpenCL 1.2 pocl HSTR: pthread-armv7-unknown-linux-gnueabihf-cortex-a15
1.2 Software version: 1.7
1.3 OpenCL C version: OpenCL C 1.2 pocl
1.4 Parallel compute units: 4
2. Device: GK20A
2.1 Hardware version: OpenCL 1.2 pocl HSTR: CUDA-sm_32
2.2 Software version: 1.7
2.3 OpenCL C version: OpenCL C 1.2 pocl
2.4 Parallel compute units: 1
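As a quick cross-check that does not need the C program, the device names can also be pulled out of the clinfo output. This is a rough sketch under one assumption: that your clinfo version prints each device as a line starting with two spaces followed by "Device Name", which may differ between clinfo releases. list_devices is a hypothetical helper name.

```shell
# list_devices: read clinfo output on stdin and print one device name per line.
# Assumes top-level device entries look like "  Device Name      <name>"
# (two-space indent); more deeply indented repeats are ignored.
list_devices() {
    sed -n 's/^  Device Name  *//p'
}
```

Used as clinfo | list_devices, it should list pthread-cortex-a15 and GK20A on a setup matching the one described here.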

Additional Note
