Add tests for reduction with sum of squares (RSS), compute of RNSNorm, and RNSNorm fused with Matmul #593

mikepapadim · 2024-11-24T11:52:09Z

Description

This PR adds tests that check the code generation and the validity of the results for some common operations required in LLM architectures.

Reduction with Sum of Squares (RSS) - Implements and tests RSS, checking the accuracy and stability of the results when the generated code uses snippets for local memory.
Computation of RNSNorm - Introduces tests for RNSNorm, validating that the reduction above can be combined with a serial kernel.
Fused RNSNorm with Matmul - Adds tests for the RNSNorm function when fused with matrix multiplication (Matmul), ensuring compatibility and efficiency.
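
As a rough illustration of the first two items, the sketch below emulates the two-phase pattern in plain Java: a per-work-group tree reduction of squared values in a buffer that stands in for local memory, followed by a serial step that turns the partial sums into a scaling factor. This is not the code in this PR. The class and method names are hypothetical, the group size is assumed to be a power of two, and the formula 1 / sqrt(sum / size + eps) is the usual RMSNorm definition, assumed here because the PR's serial kernel takes `size` and `eps`.

```java
public class SumOfSquaresSketch {

    // Phase 1 (hypothetical): each work-group squares its slice of the input and
    // reduces it with a tree reduction. The `local` array stands in for local memory.
    // Assumes groupSize is a power of two.
    static float[] partialSumsOfSquares(float[] x, int groupSize) {
        int numGroups = (x.length + groupSize - 1) / groupSize;
        float[] partial = new float[numGroups];
        float[] local = new float[groupSize];
        for (int g = 0; g < numGroups; g++) {
            for (int l = 0; l < groupSize; l++) {
                int gid = g * groupSize + l;
                // Out-of-range threads contribute zero.
                local[l] = gid < x.length ? x[gid] * x[gid] : 0.0f;
            }
            for (int stride = groupSize / 2; stride > 0; stride /= 2) {
                for (int l = 0; l < stride; l++) {
                    local[l] += local[l + stride];
                }
            }
            partial[g] = local[0];
        }
        return partial;
    }

    // Phase 2 (hypothetical): a serial step sums the partial results and
    // computes the assumed RMSNorm scaling factor 1 / sqrt(sum / size + eps).
    static float scalingFactor(float[] partial, int size, float eps) {
        float ss = 0.0f;
        for (float p : partial) {
            ss += p;
        }
        ss = ss / size + eps;
        return 1.0f / (float) Math.sqrt(ss);
    }

    public static void main(String[] args) {
        float[] x = { 1.0f, 2.0f, 3.0f, 4.0f };
        float[] partial = partialSumsOfSquares(x, 2);
        System.out.println(scalingFactor(partial, x.length, 1e-5f));
    }
}
```

Splitting the work this way keeps the parallel phase free of cross-group synchronization; only the short serial step sees all partial sums.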

Backend/s tested

Mark the backends affected by this PR.

OpenCL
PTX
SPIRV

OS tested

Mark the OS where this PR is tested.

Linux
OSx
Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

Yes
No

How to test the new patch?

tornado-test -V uk.ac.manchester.tornado.unittests.reductions.TestReductionsFloats#testReduceSumSquares

tornado-test -V uk.ac.manchester.tornado.unittests.compute.LLMFusedKernelsTest

jjfumero · 2024-11-25T08:41:36Z

...-unittests/src/main/java/uk/ac/manchester/tornado/unittests/compute/LLMFusedKernelsTest.java

+            }
+        }
+
+            private static void finalSum(KernelContext context, FloatArray reduce, int size, float eps) {


Fix format of the file.

jjfumero · 2024-11-25T08:42:04Z

...-unittests/src/main/java/uk/ac/manchester/tornado/unittests/compute/LLMFusedKernelsTest.java

+                float expected = outputSeqLogits.get(i);           // Expected value from the sequential output
+                float actual = outputLogits.get(i);               // Actual value from the RNS output
+
+//                assertEquals("Mismatch at index " + i, expected, actual, 1f); // Allow some tolerance


Remove comment

I guess the test should assert the expected and actual values, right?
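
One way to picture the check being asked for is an element-wise comparison with an absolute tolerance, similar in spirit to the commented-out `assertEquals(message, expected, actual, delta)` call. The helper below is a hypothetical, self-contained sketch in plain Java, not code from this PR; the tolerance to use is left to the author.

```java
public class ToleranceCheckSketch {

    // Returns the index of the first element whose absolute difference exceeds
    // delta, or -1 if all elements match. A NaN on either side counts as a mismatch.
    static int firstMismatch(float[] expected, float[] actual, float delta) {
        if (expected.length != actual.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        for (int i = 0; i < expected.length; i++) {
            if (!(Math.abs(expected[i] - actual[i]) <= delta)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        float[] expected = { 1.0f, 2.0f, 3.0f };
        float[] actual = { 1.0f, 2.001f, 3.5f };
        System.out.println(firstMismatch(expected, actual, 0.01f));
    }
}
```

A tolerance of `1f`, as in the commented-out line, would hide fairly large errors, so a tighter bound may be worth considering if the kernels allow it.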

jjfumero · 2024-11-25T08:44:25Z

Include the new test in the test-suite:
tornado-assembly/src/bin/tornado-test

jjfumero · 2024-11-25T08:46:38Z

The new test enters an infinite loop when running with the SPIR-V backend:

tornado-test -V uk.ac.manchester.tornado.unittests.compute.LLMFusedKernelsTest
/home/juan/tornadovm/TornadoVM/bin/sdk/bin/tornado --jvm "-Xmx6g -Dtornado.recover.bailout=False -Dtornado.unittests.verbose=True "  -m  tornado.unittests/uk.ac.manchester.tornado.unittests.tools.TornadoTestRunner  --params "uk.ac.manchester.tornado.unittests.compute.LLMFusedKernelsTest"

The PTX and OpenCL backends run fine.

stratika

I think we should add the new LLMFusedKernelsTest class to tornado-test so that it runs when Jenkins runs the unit tests. In my setup, the tests pass for PTX, but the tests in the LLMFusedKernelsTest class do not finish when running with SPIR-V. I guess they are not supported for SPIR-V?

stratika · 2024-12-09T11:24:01Z

...-unittests/src/main/java/uk/ac/manchester/tornado/unittests/compute/LLMFusedKernelsTest.java

+     * </code>
+     */
+
+    public class LLMFusedKernelsTest extends TornadoTestBase {


To keep consistency with other test classes, I would suggest moving "Test" to the beginning of the class name.

stratika · 2024-12-09T11:36:22Z

...-unittests/src/main/java/uk/ac/manchester/tornado/unittests/compute/LLMFusedKernelsTest.java

+            public static void normalizeAndScale(KernelContext context,
+                    FloatArray out, FloatArray input, FloatArray weight, FloatArray scalingFactorBuffer,
+                    int size, float eps) {
+
+                int globalIdx = context.globalIdx;
+
+                if (globalIdx < size) {
+                    float scaledValue = weight.get(globalIdx) * (scalingFactorBuffer.get(0) * input.get(globalIdx));
+                    out.set(globalIdx, scaledValue);
+                }
+            }


Fix code formatting.
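
For reference when validating the kernel quoted above, its element-wise step corresponds to the sequential loop sketched below. This is a hypothetical plain-Java equivalent using `float[]` in place of `FloatArray` and a scalar in place of `scalingFactorBuffer.get(0)`; it is not part of this PR.

```java
public class NormalizeAndScaleReference {

    // Sequential counterpart of the quoted kernel body:
    // out[i] = weight[i] * (scalingFactor * input[i])
    static void normalizeAndScale(float[] out, float[] input, float[] weight, float scalingFactor) {
        for (int i = 0; i < input.length; i++) {
            out[i] = weight[i] * (scalingFactor * input[i]);
        }
    }

    public static void main(String[] args) {
        float[] input = { 2.0f, 4.0f };
        float[] weight = { 0.5f, 0.25f };
        float[] out = new float[input.length];
        normalizeAndScale(out, input, weight, 0.5f);
        System.out.println(out[0] + " " + out[1]);
    }
}
```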

stratika · 2024-12-09T11:36:56Z

...-unittests/src/main/java/uk/ac/manchester/tornado/unittests/compute/LLMFusedKernelsTest.java

+
+        @Test
+        public void testRNSNorm() throws TornadoExecutionPlanException {
+            final int size = 2048;


I think we should add the following, unless it is supported:

assertNotBackend(TornadoVMBackendType.SPIRV);

stratika · 2024-12-09T11:37:02Z

...-unittests/src/main/java/uk/ac/manchester/tornado/unittests/compute/LLMFusedKernelsTest.java

+
+        @Test
+        public void testRNSNormFusedWithMatMul() throws TornadoExecutionPlanException {
+            final int size = 2048;


I think we should add the following, unless it is supported:

assertNotBackend(TornadoVMBackendType.SPIRV);

stratika · 2024-12-09T11:37:38Z

...-unittests/src/main/java/uk/ac/manchester/tornado/unittests/compute/LLMFusedKernelsTest.java

+                float expected = outputSeqLogits.get(i);           // Expected value from the sequential output
+                float actual = outputLogits.get(i);               // Actual value from the RNS output
+
+//                assertEquals("Mismatch at index " + i, expected, actual, 1f); // Allow some tolerance


I guess the test should assert the expected and actual values, right?

jjfumero · 2024-12-10T14:21:23Z

@mikepapadim , is this ready?

mikepapadim added 2 commits November 23, 2024 19:09

Add test to ensure sum of squares reduction code gen works

57faae1

Add test to mimic the computation in llama3 forward method

02186f9

mikepapadim added the tests label Nov 24, 2024

mikepapadim requested review from jjfumero, mairooni and stratika November 24, 2024 11:52

mikepapadim self-assigned this Nov 24, 2024

jjfumero reviewed Nov 25, 2024

jjfumero requested changes Nov 25, 2024

stratika reviewed Dec 9, 2024
