Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support uploading filtered paths to the store #755

Draft
wants to merge 20 commits into
base: master
Choose a base branch
from

Conversation

layus
Copy link
Collaborator

@layus layus commented Nov 13, 2020

This refactors the store to make it composable with arbitrary mtl monad stacks, with the added constraint that addToStore takes a filtering fucntion FilePath -> PathFilter -> m Bool which is not MonadBaseControl compatible, and cannot be lifted (the monad is in a negative/contravariant position).

The solution involves a RemoteStoreT transformer, a MonadRemoteStore monad and still lacks a proper generic MonadStore which I would like to make generic across all the store implementations (in-memeory / read-only / remote daemon / etc.)

Possibly doing something very wrong here, but it works.

based on top of #554

@layus
Copy link
Collaborator Author

layus commented Nov 15, 2020

At this point everything works well. But hell, it made the code sooooo slow. --eval -E '"${(import /home/gmaudoux/projets/nixpkgs { }).hello}"' is now at 50s (from 22 before). And that code does not even use nar streaming, which is the very reason why the code is so complicated.

The obvious solution is to use a handler. But we may also get good performance with effect systems, where all the hype is ATM :-).

Unless someone has good insights on how to use mtl stacks correctly. My intuition tells me that there is something very wrong about the implem in this PR.

@bqv Fun stuff is also happening here, and in the sibling hnix-store PR haskell-nix/hnix-store#72

@layus
Copy link
Collaborator Author

layus commented Dec 19, 2021

I made a second pass over this code, and I cannot reproduce the slower code issue. Here is a benchmark I made with four setups. Note that the first two cannot upload filtered paths to the store, because they cannot evaluate a nix expression (the filter function) while streaming the NAR to the store).

  1. store-fake uses the read-only store, and does not call the nix daemon.
  2. store-remote-restart keeps the monad stacks separate, and starts a new connection to the daemon on each store api call.
  3. threaded merges both monad stacks. It is the code we have here, and amounts to HNixT (StoreT IO))
  4. dedaerht is threaded, but the other way around (hence the name), and amounts to StoreT (HnixT IO))

The conclusion is that dedaerht is 10-20% faster than threaded. it is also the only implementation that beats store-remote-restart, which tells that it is possible to make things so wrong that they are slower that starting a socket connection each time...

I remains unsure about the benchmarks.threaded usually runs in 5.5s on my machine, and in this test it was 7.5s. This is an inconsistency that bothers me a lot. The machine has tons of ram, and a recent CPU, and yet running the same git revision twice does not yield identical results. Are GHC build non-deterministic. Is there a time limit to GHC optimization phases. This remains unclear.

image

Benchmark 1: ./cabal-run threaded
  Time (mean ± σ):      7.598 s ±  0.036 s    [User: 7.234 s, System: 0.296 s]
  Range (min … max):    7.572 s …  7.623 s    2 runs
 
Benchmark 2: ./cabal-run dedaerht
  Time (mean ± σ):      6.336 s ±  0.332 s    [User: 6.032 s, System: 0.244 s]
  Range (min … max):    6.102 s …  6.571 s    2 runs
 
Benchmark 3: ./cabal-run store-fake
  Time (mean ± σ):      6.010 s ±  0.080 s    [User: 5.704 s, System: 0.303 s]
  Range (min … max):    5.953 s …  6.067 s    2 runs
 
Benchmark 4: ./cabal-run store-remote-restart
  Time (mean ± σ):      7.232 s ±  0.156 s    [User: 5.147 s, System: 0.344 s]
  Range (min … max):    7.122 s …  7.343 s    2 runs
 
Summary
  './cabal-run store-fake' ran
    1.05 ± 0.06 times faster than './cabal-run dedaerht'
    1.20 ± 0.03 times faster than './cabal-run store-remote-restart'
    1.26 ± 0.02 times faster than './cabal-run threaded'

This is not the expected gain. My next move is probably going to be an iplementation of all of this stuff with https://github.com/arybczak/effectful, as the monad transformers we use are trivial.

@layus
Copy link
Collaborator Author

layus commented Dec 19, 2021

Oh, and that means that all the implementations are about 42 times slower than nix-instantiate. Is that a progress over last time ? I think so :-). Thanks @Anton-Latukha for your optimizations of the code.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Dec 19, 2021

Well, I did not do the optimization per say, but an organization & refactoring and some basic literal type design. I just as also could increase some laziness somewhere, while also GHC may have an easier time optimizing & fusing away some parts of the code after all the refactoring.

I hoped that having a pretty clean refactored codebase would make contributions easier. I do not require anyone to follow the coding style :P.

Sorry, that codebase was desynced from the PR so much. & that the CI got broken, I found difficulty in debugging undocumented changes to Nix & debugging its C++ code to make it work.

@Anton-Latukha Anton-Latukha mentioned this pull request Jan 17, 2022
13 tasks
@Anton-Latukha
Copy link
Collaborator

In this PR is implementation of builtins.unsafeDiscardOutputDependency & #128, so would cherry-pick/steal that code in nearest time.

@Anton-Latukha
Copy link
Collaborator

As always. I do not think about performance, having the functionality & tests is more important. After the code works, the performance can be reached.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jan 17, 2022

The PR uses unknown Store.Core & Store.Remote state, the code state used seems to be not in releases.

PathType used in the PR is absent in hnix-store git history diffs.

Thankfully data type information is fully contained in:

pathTypeStr :: PathType -> (NValue t f m)
pathTypeStr = nvStr . principledMakeNixStringWithoutContext . \case
  Regular -> "regular"
  Directory -> "directory"
  Symlink -> "symlink"
  Unknown -> "unknown

@Anton-Latukha
Copy link
Collaborator

The unsafeDiscardOutputDependency got ported.

Anton-Latukha added a commit to Anton-Latukha/hnix that referenced this pull request Jan 17, 2022
Courtesy of `layus`, ported from implementation in
haskell-nix#755.
Anton-Latukha added a commit to Anton-Latukha/hnix that referenced this pull request Jan 17, 2022
Courtesy of `layus`, ported from implementation in
haskell-nix#755.
@Anton-Latukha
Copy link
Collaborator

Also seems like successfully ported path #1032.

Anton-Latukha added a commit to Anton-Latukha/hnix that referenced this pull request Jan 17, 2022
Courtesy of `layus`, ported from implementation in
haskell-nix#755.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants