Quick Start

Summary

Check the Requirements
Download the Container
Get the Application
Find an FPGA Kernel
Integrate Application and FPGA Kernel
Run on FPGAs

1. Check the Requirements

To get started with the framework, you need:

One or more servers with:
1. Singularity CE - Singularity Download
2. A compatible MPICH installation (MPICH v. 4.0.2) - MPICH Download
3. AMD Vitis Tools 2022.1 (or later) - Vitis Page
4. AMD XRT version 2.13.466 (or later) - XRT Page
5. One or more AMD Alveo boards

2. Download the Container

For this tutorial, we will be using the OMPC FPGA container. It includes all the tools needed to run applications using the framework. We suggest using Singularity.

Download the container using Singularity:

singularity pull docker://pedroohr/runtime-fpga:latest

3. Get the Application

As an example, let’s start with a basic vector addition application. The following figure shows the application:

A basic CPU kernel for the application can be implemented as follows:

void vadd_cpu(int *A, int *B, int *C, int size) {
   for (int i = 0; i < size; i++)
      C[i] = A[i] + B[i];
}

It is called somewhere in the application as:

vadd_cpu(A, B, C, N);

Note

This application is already available in the container at /examples/vadd/vadd_cpu.cpp.

You can also find a fully functional implementation example here: vadd_cpu

4. Find an FPGA Kernel

The framework can use any FPGA kernel as an alternative to a defined CPU function, provided the two share equivalent prototypes.

Tip

Application developers can always change the CPU function to match the desired FPGA kernel, even if some of the arguments are not used in the CPU implementation.

Suppose we found an FPGA implementation of the vadd kernel (in this case, an HLS version of the kernel):
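
To illustrate this tip, the sketch below assumes a hypothetical FPGA kernel, scale_fpga, that takes an extra burst argument the CPU version has no use for. The names scale_fpga, scale_cpu, run_scale_cpu and the burst parameter are invented for this example and are not part of the framework. The CPU function accepts the extra argument and ignores it, so the two prototypes stay equivalent.

```cpp
#include <vector>

// Hypothetical FPGA kernel prototype (assumption, for illustration only).
// The last argument only matters to the hardware implementation.
void scale_fpga(int *in, int *out, int size, int burst);

// CPU function adjusted to share the same prototype.
// The last parameter is left unnamed because the CPU code does not use it.
void scale_cpu(int *in, int *out, int size, int /*burst*/) {
   for (int i = 0; i < size; i++)
      out[i] = 2 * in[i];
}

// Hypothetical helper that runs the CPU version on a small input.
std::vector<int> run_scale_cpu(std::vector<int> in) {
   std::vector<int> out(in.size(), 0);
   scale_cpu(in.data(), out.data(), static_cast<int>(in.size()), /*burst=*/16);
   return out;
}
```

The value passed for burst has no effect on the CPU result. It is only there so that the call site looks the same for both implementations.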

void vadd_fpga(int *A, int *B, int *C, int size) {
#pragma HLS INTERFACE m_axi port = A
#pragma HLS INTERFACE m_axi port = B
#pragma HLS INTERFACE m_axi port = C
#pragma HLS INTERFACE s_axilite port = return
   for (int i = 0; i < size; i++)
      C[i] = A[i] + B[i];
}

That implementation can be compiled using the AMD Vitis™ compiler. The command below shows how to compile for the AMD Alveo U55C board.

v++

Note

This kernel implementation is already available in the container at /examples/vadd/fpga_kernel.cpp.

You can also find the kernel implementation here: vadd_cpu.cpp
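
Before starting a hardware build, the HLS kernel from this section can be checked on the host. A regular C++ compiler ignores the HLS pragmas, so vadd_fpga compiles as an ordinary function and its output can be compared with vadd_cpu. The sketch below is a minimal check of that kind. The helper kernels_agree is a name introduced here for illustration and is not part of the framework.

```cpp
#include <vector>

// CPU reference from this tutorial.
void vadd_cpu(int *A, int *B, int *C, int size) {
   for (int i = 0; i < size; i++)
      C[i] = A[i] + B[i];
}

// HLS kernel from this tutorial. On the host the pragmas are ignored
// (the compiler may warn about unknown pragmas).
void vadd_fpga(int *A, int *B, int *C, int size) {
#pragma HLS INTERFACE m_axi port = A
#pragma HLS INTERFACE m_axi port = B
#pragma HLS INTERFACE m_axi port = C
#pragma HLS INTERFACE s_axilite port = return
   for (int i = 0; i < size; i++)
      C[i] = A[i] + B[i];
}

// Hypothetical helper: returns true when both functions produce
// identical results for n elements.
bool kernels_agree(int n) {
   std::vector<int> A(n), B(n), C_cpu(n, 0), C_fpga(n, 0);
   for (int i = 0; i < n; i++) {
      A[i] = i;
      B[i] = 2 * i;
   }
   vadd_cpu(A.data(), B.data(), C_cpu.data(), n);
   vadd_fpga(A.data(), B.data(), C_fpga.data(), n);
   return C_cpu == C_fpga;
}
```

This only checks functional equivalence of the C++ source. It says nothing about how the kernel behaves or performs once it is synthesized.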

5. Integrate Application and FPGA Kernel

Integrating the FPGA kernel takes only a few lines of code.

To tell the program that the FPGA kernel should be used as an alternative to the CPU kernel, we need two lines of code (lines 1 and 2):

void vadd_fpga(int *A, int *B, int *C, int size);
#pragma omp declare variant(vadd_fpga) match(device={arch(alveo)})
void vadd_cpu(int *A, int *B, int *C, int size) {
   for (int i = 0; i < size; i++)
      C[i] = A[i] + B[i];
}

Finally, at the line where the function is called, we need to create an OpenMP target task (line 1) and establish a synchronization point (line 4), so the program knows when to execute the kernels.

#pragma omp target map(to: A[:N], B[:N]) map(tofrom: C[:N]) nowait
vadd_cpu(A, B, C, N);

#pragma omp taskwait

Important

Observe how the original call to vadd_cpu does not change, even when using FPGAs!

Note

This application is already available in the container at /examples/vadd/vadd_fpga.cpp.

You can also find a fully functional implementation example here: vadd_fpga.cpp
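
The two snippets from this section can be put together as in the sketch below. It is a minimal illustration and not the contents of vadd_fpga.cpp. The run_vadd helper, the input values, and the error count are assumptions added here. If the file is built with a compiler that is not given -fopenmp, the OpenMP pragmas are ignored and the call simply runs vadd_cpu on the host, which is a convenient way to check the surrounding logic.

```cpp
#include <vector>

// Lines taken from this section: declare the FPGA kernel and register it
// as a variant of the CPU function for the alveo architecture.
void vadd_fpga(int *A, int *B, int *C, int size);
#pragma omp declare variant(vadd_fpga) match(device={arch(alveo)})
void vadd_cpu(int *A, int *B, int *C, int size) {
   for (int i = 0; i < size; i++)
      C[i] = A[i] + B[i];
}

// Hypothetical helper: launches one target task over N elements and
// returns the number of wrong results (0 means success).
int run_vadd(int N) {
   std::vector<int> a(N), b(N), c(N, 0);
   for (int i = 0; i < N; i++) {
      a[i] = i;
      b[i] = N - i;   // every sum should equal N
   }
   int *A = a.data();
   int *B = b.data();
   int *C = c.data();

   // Target task, as in the tutorial; the call itself is unchanged.
#pragma omp target map(to: A[:N], B[:N]) map(tofrom: C[:N]) nowait
   vadd_cpu(A, B, C, N);

   // Synchronization point: wait for the task before reading C.
#pragma omp taskwait

   int errors = 0;
   for (int i = 0; i < N; i++)
      if (C[i] != N)
         errors++;
   return errors;
}
```

Reading C only after the taskwait matters because the target task is created with nowait, so the results are not guaranteed to be available before that point.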

6. Run on FPGAs

To run the application using the FPGA kernel, first compile it and then run it, both using the provided container.

Compile it using Singularity:

singularity exec runtime-fpga_latest.sif clang++ -fopenmp -fopenmp-targets=alveo -fno-openmp-new-driver vadd_fpga.cpp -o vadd_fpga

Run it using Singularity:

# Runs using 1 worker node containing FPGAs
mpirun -np 2 singularity exec runtime-fpga_latest.sif ./vadd_fpga

Important

Currently, applications are run using mpirun, and the number of nodes (the value passed to -np) is always 1 + the number of workers. For example, two worker nodes require -np 3.

That is it! Happy coding with FPGAs!