Unsigned values dysfunction #457
Comments
Thanks for the report. At first glance, this code cannot be parallelized:
public static void unsignedByte(ByteArray a, IntArray result) {
    for (@Parallel int i = 0; i < 3; i++) {
        // result[0] is a position shared across all threads: it needs a blocking access.
        result.set(0, result.get(0) | ((a.get(i) & 0xFF) << (8*i)));
        result.set(i+1, a.get(i) & 0xFF);
    }
}
The TornadoVM Parallel API does not offer barriers, but the TornadoVM Kernel API does. Besides, TornadoVM supports atomics for the OpenCL backend.
Note that TornadoVM does not resolve data dependencies. It takes annotations as hints to parallelize. It is up to the user to ensure those regions can be parallelized. This is a similar concept to OpenMP or OpenACC. |
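One way to avoid the shared write is to let every iteration write only to its own slot and to combine the partial values sequentially afterwards. The following is a minimal sketch in plain Java, not TornadoVM code: the class and method names are hypothetical, `IntStream.parallel()` merely stands in for a parallel loop, and the input is assumed to hold at most 4 bytes so that the shifts stay within an int.

```java
import java.util.stream.IntStream;

public class IndependentWrites {
    // Each iteration i writes only to shifted[i], so no two iterations touch the same slot.
    public static int[] shiftEachByte(byte[] a) {
        int[] shifted = new int[a.length];
        IntStream.range(0, a.length).parallel()
                 .forEach(i -> shifted[i] = (a[i] & 0xFF) << (8 * i));
        return shifted;
    }

    // Sequential combine step: the only place where a single value accumulates.
    public static int combine(int[] shifted) {
        int packed = 0;
        for (int part : shifted) {
            packed |= part;
        }
        return packed;
    }

    public static void main(String[] args) {
        byte[] a = {(byte) 0x80, (byte) 0xFF, (byte) 0x01};
        int packed = combine(shiftEachByte(a));
        System.out.println(Integer.toHexString(packed)); // prints 1ff80
    }
}
```

Splitting the work this way keeps the parallel part free of data dependencies; the OR-reduction is cheap enough to run on the host for a handful of elements.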
Thanks for your response. I modified the test case to make it parallelizable and added a control group to verify that the code executed correctly:
import uk.ac.manchester.tornado.api.ImmutableTaskGraph;
import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.exceptions.TornadoExecutionPlanException;
import uk.ac.manchester.tornado.api.types.arrays.ByteArray;
import uk.ac.manchester.tornado.api.types.arrays.IntArray;
import java.util.Arrays;
import java.util.Random;
public class Main {
    public static void main(String[] args) throws TornadoExecutionPlanException {
        Random r = new Random();
        int size = 32;
        ByteArray a = new ByteArray(size);
        // First half of input data includes bytes using all 8 bits, last half includes bytes using less than 8 bits
        for(int i = 0; i < size/2; i++) {
            a.set(i, (byte) (128 + r.nextInt(128)));
            a.set(i+size/2, (byte) (r.nextInt(128)));
        }
        IntArray theoreticalResult = new IntArray(size);
        IntArray theoreticalControlResult = new IntArray(size);
        IntArray actualResult = new IntArray(size);
        IntArray actualControlResult = new IntArray(size);
        theoreticalResult.init(0);
        theoreticalControlResult.init(0);
        actualResult.init(0);
        actualControlResult.init(0);
        TaskGraph graph = new TaskGraph("s0")
            .transferToDevice(DataTransferMode.FIRST_EXECUTION, a, actualResult, actualControlResult)
            .task("t0", Main::unsignedByte, a, actualResult, actualControlResult, size)
            .transferToHost(DataTransferMode.EVERY_EXECUTION, actualResult, actualControlResult);
        ImmutableTaskGraph immutableTaskGraph = graph.snapshot();
        try(TornadoExecutionPlan executionPlan = new TornadoExecutionPlan(immutableTaskGraph)) {
            executionPlan.execute();
            unsignedByte(a, theoreticalResult, theoreticalControlResult, size);
        }
        if(Arrays.equals(theoreticalControlResult.toHeapArray(), actualControlResult.toHeapArray())) {
            System.out.println("Executed successfully");
        } else {
            System.out.println("Error during execution");
        }
        for(int i = 1; i < size; i++) {
            System.out.println("Expected byte " + i + " : " + Integer.toBinaryString(theoreticalResult.get(i-1)) + "\nFound : " + Integer.toBinaryString(actualResult.get(i-1)));
        }
    }
    public static void unsignedByte(ByteArray a, IntArray result, IntArray controlResult, int size) {
        for(@Parallel int i = 0; i < size; i++) {
            result.set(i, a.get(i) & 0xFF);
            controlResult.set(i, 2 * a.get(i));
        }
    }
}
Here is the output:
As you can see, the bytes aren't unsigned well by the |
OK, thanks for the update. I will need to analyse the generated code and check whether any compiler phase changes this. We will take a look. |
Hello, after looking at the following generated kernel
I saw that the result int |
Thanks for the report. It is possible that the code generator emits the wrong signed value. TornadoVM generates the code as it sees it in the Graal IR / Tornado IR, but we might have missed something. Have you checked with any other backend, like OpenCL or SPIR-V? |
The problem is the same in OpenCL:
|
I'm stupid... I forgot that |
I think there should be a way to register a TornadoVM/Graal plugin to do so, or at least to force it from an API. We haven't looked at this yet, but we will. A temporary solution could be to use the TornadoVM prebuilt API, in which you can pass the GPU code directly: |
Hi @Benco11-developement, we just merged a fix for this. You can take a look at our tests: |
Describe the bug
Converting bytes to unsigned values does not work as it normally should.
How To Reproduce
Here is a test case that builds a 24-bit unsigned int from 3 bytes:
Result:
Expected behavior
Bytes should be converted to their unsigned values by the & 0xFF mask.
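For reference, this is the behaviour the report expects, shown as a minimal plain-Java sketch that runs on the JVM only and does not involve TornadoVM. The class and method names are hypothetical; `Byte.toUnsignedInt` is the standard library equivalent of the mask.

```java
public class UnsignedByteSemantics {
    // A Java byte is signed. It is sign-extended to int before '&' is applied,
    // so masking with 0xFF keeps only the low 8 bits and yields a value in 0..255.
    public static int maskToUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 200;                        // stored as -56
        System.out.println(b);                      // prints -56
        System.out.println(maskToUnsigned(b));      // prints 200
        System.out.println(Byte.toUnsignedInt(b));  // prints 200
    }
}
```

A kernel that drops the mask, or applies it after the value has already been treated as a signed int, would produce the first output instead of the second for any byte with its top bit set.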
Computing system setup (please complete the following information):
Additional context
Changing the backend does not help