RTLF is a tool for statistically evaluating timing measurements with a type-1 error rate bounded by an input parameter.
Please note: RTLF is a research tool intended for developers, pentesters, administrators and researchers. There is no GUI.
To use RTLF, it is necessary to have R (https://www.r-project.org/) and the tidyverse (https://www.tidyverse.org/) library installed.
In order to run RTLF, you need to execute the R file:
Rscript <script_name> <alpha> <input_file> <output_file>
Here, alpha defines the threshold for the type-1 error rate.
We provide a Dockerfile, allowing you to run RTLF directly:
docker build -t rtlf .
docker run -v <source>:<target> rtlf <alpha> <input_file> <output_file>
The input file must be a CSV file that follows a specific format. It should be structured as follows:
V1,V2
X,494602
X,481100
Y,531296
Y,539770
...
X describes the first series of measurements, and Y represents the second series. Both are compared to each other. The measurements of X and Y do not have to be listed as separate blocks.
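As a minimal sketch of producing such an input file, the following Python snippet writes two measurement series in the V1,V2 layout shown above. The function name write_rtlf_input is hypothetical and not part of RTLF; the sample values are the ones from the example.

```python
import csv
import io


def write_rtlf_input(stream, x_measurements, y_measurements):
    """Write two measurement series in the V1,V2 layout described above.

    Hypothetical helper for illustration; it is not shipped with RTLF.
    """
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["V1", "V2"])        # header row as in the example
    for value in x_measurements:
        writer.writerow(["X", value])    # first series
    for value in y_measurements:
        writer.writerow(["Y", value])    # second series


# Write to an in-memory buffer; pass an open file object to write to disk.
buffer = io.StringIO()
write_rtlf_input(buffer, [494602, 481100], [531296, 539770])
csv_text = buffer.getvalue()
print(csv_text)
```

The sketch writes X and Y as separate blocks for simplicity; as noted above, the rows may also be interleaved.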
The R script reports whether the test has determined a difference between X and Y and, if so, which deciles (vector indices 0 to 8) indicate differences. To facilitate automated evaluation of multiple CSV files, RTLF uses exit codes to signal its result: 11 indicates a difference, while 10 indicates no difference.
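The exit codes can be consumed from a wrapper script. The following Python sketch shows one possible way to do this; the function names are hypothetical, the path to the RTLF R file must be supplied by the caller, and treating any exit code other than 10 or 11 as an error is an assumption rather than documented behavior.

```python
import subprocess

# Exit codes as documented above.
EXIT_DIFFERENCE = 11
EXIT_NO_DIFFERENCE = 10


def interpret_exit_code(code):
    """Map an RTLF exit code to a short label."""
    if code == EXIT_DIFFERENCE:
        return "difference"
    if code == EXIT_NO_DIFFERENCE:
        return "no difference"
    # Assumption: anything else (e.g. an R error) is treated as a failure.
    return "error"


def run_rtlf(script_path, alpha, input_file, output_file):
    """Run RTLF once via Rscript and return the interpreted result.

    script_path is the RTLF R file; its location is up to the caller.
    """
    completed = subprocess.run(
        ["Rscript", script_path, str(alpha), input_file, output_file],
        check=False,  # exit codes 10 and 11 are results, not failures
    )
    return interpret_exit_code(completed.returncode)
```

To evaluate several CSV files, call run_rtlf once per file and collect the returned labels.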
Note that the tidyverse library may print some information about conflicting packages to the console when R starts. This should not affect the evaluation.
The output file is an RDATA file that contains a list of five vectors. You can inspect it using RStudio or plain R (or Rscript) like this:
load("[FILE_PATH].RDATA", data <- new.env())
data[["output"]]
The output looks like this, for example:
[[1]]
[1] 0 0 1 1 1 1 0 0 0
[[2]]
[1] 14 12 14 22 22 26 24 12 14
[[3]]
[1] 22 14 12 12 14 16 26 26 56
[[4]]
[1] 22 14 10 12 12 16 26 26 56
[[5]]
[1] 22 12 12 12 14 16 24 24 56
The output contains five vectors ([[1]] to [[5]]) with nine entries each. Each entry corresponds to one of the nine deciles we consider (10%, 20%, ..., 90%). The meaning of each vector is best explained starting with the last one:
- [[5]]: This vector shows the output of our empirical bootstrap for dataset Y, i.e., it expresses the variance we expect to occur for each of the deciles when measuring this distribution.
- [[4]]: This vector likewise shows the output of our empirical bootstrap for dataset X.
- [[3]]: This vector contains the entry-wise maximum of [[4]] and [[5]] and is used as our decision rule. Effectively, it encompasses the differences we simulated within X and Y through our bootstrap and thus determines which differences between X and Y we consider expected variance. The test considers a difference significant only if the difference between a decile of X and Y exceeds the expected variance of that decile.
- [[2]]: This vector contains the difference between the deciles of X and Y.
- [[1]]: This vector contains a bitmap that states for which deciles the test determined a significant difference (= test decision). This is the case for each vector entry where the difference in [[2]] exceeds the expected variance in [[3]].
For the example above, the test indicates a difference for the 30%, 40%, 50%, and 60% deciles: the differences between X and Y were 14, 22, 22, and 26 respectively, while the tolerable differences (based on the bootstrap) were 12, 12, 14, and 16 for these deciles.
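The relationship between the five vectors can be checked by replaying the comparison on the example output. The following Python sketch is not the RTLF implementation; it only recomputes [[3]] and [[1]] from [[2]], [[4]], and [[5]] as described above, and the function name decide is hypothetical.

```python
def decide(diffs, boot_x, boot_y):
    """Recompute thresholds ([[3]]) and test decisions ([[1]])."""
    # [[3]]: entry-wise maximum of the two bootstrap vectors.
    thresholds = [max(a, b) for a, b in zip(boot_x, boot_y)]
    # [[1]]: 1 where the observed difference exceeds the threshold.
    decisions = [1 if d > t else 0 for d, t in zip(diffs, thresholds)]
    return thresholds, decisions


# Values copied from the example output above.
diffs = [14, 12, 14, 22, 22, 26, 24, 12, 14]    # [[2]]
boot_x = [22, 14, 10, 12, 12, 16, 26, 26, 56]   # [[4]]
boot_y = [22, 12, 12, 12, 14, 16, 24, 24, 56]   # [[5]]

thresholds, decisions = decide(diffs, boot_x, boot_y)

# Translate vector indices 0 to 8 into decile labels 10% to 90%.
flagged = [f"{(i + 1) * 10}%" for i, bit in enumerate(decisions) if bit]

print(thresholds)  # [22, 14, 12, 12, 14, 16, 26, 26, 56]
print(decisions)   # [0, 0, 1, 1, 1, 1, 0, 0, 0]
print(flagged)     # ['30%', '40%', '50%', '60%']
```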