
Poor performance on importing big records #1428

Closed
YorikSar opened this issue Jul 3, 2023 · 4 comments
Labels
P2 major: an upcoming release

Comments

@YorikSar
Member

YorikSar commented Jul 3, 2023

I'm trying to generate a big Nickel file with metadata for all packages in nixpkgs. When I import it, Nickel takes a relatively long time to do anything with it. To test this, I removed all values and kept only the nested record structure. See nixpkgs.zip for example records. no_recurse.ncl doesn't include nested attribute sets from nixpkgs and is thus smaller, only ~17k lines/fields. with_recurs.ncl includes all nested attribute sets, totalling ~60k lines/fields. Both take very similar time to process, though:

% echo '(import "no_recurse.ncl").hello' | time nickel export
{}
 export  3.71s user 0.09s system 99% cpu 3.818 total 
% echo '(import "with_recurs.ncl").hello' | time nickel export
{}
 export  4.15s user 0.11s system 99% cpu 4.290 total

This makes me think that the time is not dominated by the size of the input, but rather by some fixed cost of the input merely being big.

We would like to use this in nickel-nix to provide a better interface for importing packages from Nixpkgs.

@yannham
Member

yannham commented Jul 3, 2023

Here are two flamegraphs of exporting full_nixpkgs.ncl, in release mode and in debug mode. While nothing looks obviously wrong at first sight, we spend a lot of time hashing and querying hashmaps (which is expected, given the size of the records, but still probably too much). I'll try to see how many "generated variables" there are: they might cause many more environment accesses when exporting, which we could avoid entirely if we had a proper AST/bytecode for the runtime.

Also, exporting currently performs substitution, which might not be the most efficient approach. We might also skip the validation pass altogether and just fail during serialization when we encounter something we can't handle.

flamegraphs.zip

@yannham
Member

yannham commented Jul 3, 2023

So, simple tests show that neither parsing nor program transformations seem to be at fault here. Evaluation is:

$ time nickel -f full_nixpkgs.ncl
nickel -f full_nixpkgs.ncl  9,71s user 0,22s system 99% cpu 10,017 total
$ time nickel -f full_nixpkgs.ncl export
nickel -f full_nixpkgs.ncl   8,80s user 0,23s system 98% cpu 9,150 total
$ time nickel <<< 'std.deep_seq (import "full_nixpkgs.ncl") 1'
1
nickel <<<   8,46s user 0,09s system 99% cpu 8,553 total
$ time nickel <<< 'std.seq (import "full_nixpkgs.ncl") 1'
1
nickel <<<   5,12s user 0,07s system 99% cpu 5,196 total
$ time nickel <<< 'let x = import "full_nixpkgs.ncl" in 2'
2
nickel <<<   0,71s user 0,08s system 99% cpu 0,794 total

I put them in order from slowest to fastest:

  1. The first example fully (deeply) evaluates the term, substitutes all variables, and pretty-prints the result. This is the slowest.
  2. The second example fully (deeply) evaluates the term, substitutes all variables, and exports it. The serialization code seems at least more efficient than the pretty-printer, which isn't totally surprising; still the same order of magnitude.
  3. The third example fully evaluates the import but doesn't use the result. The time is very close to exporting (within a few tenths of a second, and for some reason there is high variability across runs), so the culprit seems to be evaluation.
  4. Very interestingly, the fourth example only shallowly evaluates the main record. This should be quite fast, but it still takes more than 5 seconds (against 8+ for deep_seq). Without the last test, we might conclude that parsing + program transformations also play a role. But see the next one.
  5. The last one imports the file but doesn't use it at all. This means the imported file is parsed, typechecked and transformed, but not evaluated. This "only" takes 0.79s. I don't know if that is reasonable, but it's one order of magnitude faster than the full export.

Given 5., I strongly suspect our share normal form transformation plays a role here. Since the record has around 70 to 80k entries, this transformation generates 70 to 80k leading let-bindings with generated variables, inserting them into the environment only to immediately look them up again.
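
For readers unfamiliar with the transformation, here is a toy model of what a share-normal-form pass does to a record literal. This is a hypothetical Python sketch for illustration only (Nickel's actual implementation is in Rust and its AST differs); the `%gen_` naming for generated variables is an assumption:

```python
# Toy sketch of a share-normal-form pass (NOT Nickel's real code):
# every field expression of a record is hoisted into a fresh
# generated let-binding, so the field's value can be shared.
import itertools

_counter = itertools.count()

def fresh_var():
    # Hypothetical naming scheme for compiler-generated variables.
    return f"%gen_{next(_counter)}"

def share_normalize(record):
    """Turn {f1 = e1, ..., fn = en} into
    let %g1 = e1 in ... let %gn = en in {f1 = %g1, ..., fn = %gn},
    represented here as (bindings, record-of-variables)."""
    bindings = []
    new_record = {}
    for field, expr in record.items():
        var = fresh_var()
        bindings.append((var, expr))
        new_record[field] = ("var", var)
    return bindings, new_record

bindings, rec = share_normalize({"a": ("num", 1), "b": ("num", 2)})
```

With a record of 80k fields, this model produces 80k bindings, matching the count of environment insertions described above.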

Additionally, because we don't have let-blocks, the interpreter doesn't know that all those lets are independent. I believe each new let-binding opens a new scope, which means the environment, instead of being one hashmap with 80k entries, is more like a linked list of 80k nodes where each node is a hashmap with a single element... if that's true, it's a miracle that the whole thing runs under 10 seconds 🥵
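
The cost difference between the two environment shapes can be modeled with a small sketch (an assumed structure for illustration, not Nickel's real data types): in a chain of single-entry scopes, looking up the oldest binding walks the entire chain, while a flat hashmap answers in one probe.

```python
# Toy model of the two environment shapes: a chain of one-entry
# scopes (as produced by N nested lets) vs. one flat hashmap.

class ScopeChain:
    """Linked list of scopes; each scope is a dict with one entry."""
    def __init__(self):
        self.head = None  # node = (scope_dict, parent_node)

    def bind(self, name, value):
        # Each let opens a fresh scope on top of the chain.
        self.head = ({name: value}, self.head)

    def lookup(self, name):
        # Walk the chain from the innermost scope outward,
        # counting how many scopes we traverse.
        node, steps = self.head, 0
        while node is not None:
            scope, parent = node
            steps += 1
            if name in scope:
                return scope[name], steps
            node = parent
        raise KeyError(name)

n = 10_000
chain, flat = ScopeChain(), {}
for i in range(n):
    chain.bind(f"x{i}", i)
    flat[f"x{i}"] = i

# The oldest binding sits at the tail: the lookup visits all n scopes.
_, steps = chain.lookup("x0")
```

Here `steps` comes out equal to `n`, so a program that touches every binding does quadratic work overall, whereas the flat hashmap stays at one probe per lookup.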

This needs more investigation, but one obvious fix (or at least a substantial improvement) that has been at the back of my mind for some time would be to have a dedicated construct in the AST to hold closures, so that we don't have to store them artificially in the environment anymore (in fact, the best would be to have several ASTs, with one specifically tailored for evaluation). From there, we could even try #837 again.
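
A minimal sketch of that idea, assuming a thunk-based lazy interpreter (a toy Python model; Nickel's interpreter is written in Rust and its actual types differ): an AST node that owns its thunk directly, avoiding the insert-then-lookup round-trip through the environment.

```python
# Hypothetical sketch of a dedicated closure construct in the AST
# (not Nickel's real AST): the node carries a memoized thunk inline,
# so evaluating it needs no generated variable and no environment access.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Thunk:
    """Lazily evaluated, memoized computation (a closure)."""
    compute: Callable[[], Any]
    _value: Any = None
    _forced: bool = False

    def force(self):
        if not self._forced:
            self._value = self.compute()
            self._forced = True
        return self._value

@dataclass
class ClosureNode:
    """AST node holding its thunk directly."""
    thunk: Thunk

node = ClosureNode(Thunk(lambda: 1 + 1))
result = node.thunk.force()
```

Forcing the node evaluates the computation once and caches it; subsequent forces return the cached value, which is the sharing behavior the generated let-bindings were emulating through the environment.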

@yannham
Member

yannham commented Oct 16, 2023

Generated variables indeed seem to be the culprit: on the HEAD of #1647 (soon to be merged), which gets rid of them, I get:

$ time target/release/nickel export <<< '(import "no_recurse.ncl").hello' 
{}
target/release/nickel export <<< '(import "no_recurse.ncl").hello'  0,13s user 0,06s system 99% cpu 0,183 total
$ time target/release/nickel export <<< '(import "with_recurs.ncl").hello'
{}
target/release/nickel export <<< '(import "with_recurs.ncl").hello'  0,39s user 0,09s system 99% cpu 0,487 total

@YorikSar, are those timings reasonable, or would you need a more drastic reduction to make this usable?

@YorikSar
Member Author

Well, a 10x speedup is surely great, and now that it's sub-second, it should be acceptable from a UX perspective to use these files in Nickel. Now that we have this option, we'll need to decide whether we want to switch to it. Thanks!
