For information on how to set up a node and server, please refer to the JPPF documentation.
The node requires some additional configuration. Since the APARAPI library loads a native library, the file "aparapi.jar" must be added directly to the node's classpath.
If you simply keep it in the client's classpath, the node will attempt to load it once for each distinct client; this will only work the first time and will fail on all subsequent attempts.
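The reason is a JVM rule: a given native library can be bound to at most one class loader at a time. The minimal sketch below (not part of the sample, with an assumed library name) illustrates the kind of failure that occurs when a second, client-scoped class loader triggers the same load.

  public class NativeLoadOnce {
    static {
      // Succeeds when this class is loaded once, e.g. by the node's own class loader.
      // If a different class loader (such as one created for a second client) later loads
      // its own copy of this class and runs this initializer, the JVM throws
      // java.lang.UnsatisfiedLinkError: Native Library ... already loaded in another classloader
      System.loadLibrary("aparapi_x86_64"); // actual library name depends on the platform
    }
  }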
For your convenience, we have included a set of files that will take care of this:
- copy the file jppf-node.properties in GPU/config/node to your node's config/ folder, replacing the existing configuration file with the new version. You will notice that this file has APARAPI-specific settings in the jppf.jvm.options property (a sketch of these settings follows this list)
- copy the file GPU/lib/aparapi.jar to your node's lib/ folder
- copy the appropriate native library for your platform from GPU/lib/ to the node's lib/ folder:
  - aparapi_x86.dll or aparapi_x86_64.dll for 32- or 64-bit Windows platforms
  - libaparapi_x86.so or libaparapi_x86_64.so for 32- or 64-bit Linux
  - libaparapi_x86_64.dylib for 64-bit Mac OS
- Once this is done, you can start the server and node, then run the sample by typing "run.bat" on Windows or "./run.sh" on Linux/Unix
- During the execution, the node will print out a message indicating whether the task was actually executed on a GPU, plus additional information on the OpenCL devices available to the platform
- if the task cannot be executed on a GPU, it will fall back to executing in a Java thread pool (i.e. CPU-bound)
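As an illustration, the APARAPI-specific part of the node configuration might look like the sketch below. The exact options in the provided jppf-node.properties may differ; the heap size and -D setting shown here are assumptions for illustration only.

  # JVM options passed to the node's JVM (illustrative values only)
  # -Djava.library.path=lib lets the JVM find the native APARAPI library copied into lib/
  jppf.jvm.options = -Xmx256m -Djava.library.path=lib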
This sample doesn't have a graphical user interface; however, you can modify some of its parameters in the JPPF configuration file:
- open the file "config/jppf-client.properties" in a text editor
- at the end of the file, you will see the following properties:
  # number of jobs to submit in sequence
  iterations = 10
  # number of tasks in each job
  tasksPerJob = 1
  # the size of the matrices to multiply
  matrixSize = 1500
  # execution mode, either GPU or JTP (Java Thread Pool)
  execMode = GPU
- You can experiment with various values, for instance to measure the JTP vs. GPU execution speedup. You may find that for relatively small matrix sizes there is no speedup, due to the overhead of generating and compiling the OpenCL code
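For reference, a JPPF client application can read these properties through JPPF's configuration API. The sketch below only illustrates that pattern; it is not the actual code of AparapiRunner.java, and the class name ConfigSketch is made up.

  import org.jppf.utils.JPPFConfiguration;
  import org.jppf.utils.TypedProperties;

  public class ConfigSketch {
    public static void main(String[] args) {
      // read the demo parameters from the JPPF client configuration file
      TypedProperties config = JPPFConfiguration.getProperties();
      int iterations = config.getInt("iterations", 10);
      int tasksPerJob = config.getInt("tasksPerJob", 1);
      int matrixSize = config.getInt("matrixSize", 1500);
      String execMode = config.getString("execMode", "GPU");
      System.out.printf("iterations=%d, tasksPerJob=%d, matrixSize=%d, execMode=%s%n",
          iterations, tasksPerJob, matrixSize, execMode);
    }
  }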
The sample is made of the following source files (an illustrative kernel sketch follows this list):
- MatrixKernel.java: this is the class that will be translated into OpenCL code
- GeneratedOpenCL.c: the OpenCL code generated from MatrixKernel, which is what actually executes on the GPU
- AparapiTask.java: the JPPF task which invokes the GPU bindings API
- AparapiRunner.java: the JPPF client application which submits the jobs to the grid
- SquareMatrix.java: a simple representation of a square dense matrix, whose values are stored in a one-dimensional float array
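To give an idea of what such a kernel looks like, here is an illustrative APARAPI matrix-multiplication kernel written against the com.amd.aparapi API. It is only an approximation of MatrixKernel.java, not a copy of it; the class name MatrixKernelSketch and all values are made up for this example.

  import com.amd.aparapi.Kernel;
  import com.amd.aparapi.Range;

  // Illustrative only: dense square-matrix multiplication c = a x b,
  // with all matrices stored as one-dimensional float arrays of length size * size.
  public class MatrixKernelSketch extends Kernel {
    private float[] a, b, c;
    private int size;

    public MatrixKernelSketch(float[] a, float[] b, float[] c, int size) {
      this.a = a; this.b = b; this.c = c; this.size = size;
    }

    @Override
    public void run() {
      // each OpenCL work item computes one cell of the result matrix
      int row = getGlobalId() / size;
      int col = getGlobalId() % size;
      float sum = 0f;
      for (int k = 0; k < size; k++) {
        sum += a[row * size + k] * b[k * size + col];
      }
      c[row * size + col] = sum;
    }

    public static void main(String[] args) {
      int size = 4;
      float[] a = new float[size * size], b = new float[size * size], c = new float[size * size];
      for (int i = 0; i < size * size; i++) { a[i] = i; b[i] = 1f; }
      MatrixKernelSketch kernel = new MatrixKernelSketch(a, b, c, size);
      // request GPU execution; APARAPI falls back to a Java thread pool (JTP) if no usable GPU is found
      kernel.setExecutionMode(Kernel.EXECUTION_MODE.GPU);
      kernel.execute(Range.create(size * size));
      System.out.println("actual execution mode: " + kernel.getExecutionMode());
      kernel.dispose();
    }
  }

After execute() returns, getExecutionMode() reports whether the kernel actually ran on the GPU or fell back to JTP, which matches the kind of information the node prints during the run.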
To generate the Javadoc, from a command prompt, type: "ant javadoc"
If you need more insight into the code of this demo, you can consult the source, or have a look at the API documentation.
In addition, there are two places you can go to: