[DO-NOT-MERGE] .ef optimization experiments #12907

Draft: wants to merge 1 commit into main

wmitsuda
Member

No description provided.

@wmitsuda added the do-not-merge label (PR that is in a merge-able state but is waiting for something else to take place before merging) Nov 28, 2024
@wmitsuda
Member Author

note: code quality is "hacky", pls do not consider for review yet, only for evaluation of the overall solution + validating the strategy

@wmitsuda
Member Author

now that prototype 1 is out, let me rebase it onto main, as it is 20 days out of date...

@wmitsuda force-pushed the wmitsuda/ef-optimization-experiments branch 5 times, most recently from 3ea3702 to 9e5031a, December 7, 2024 00:33
@wmitsuda force-pushed the wmitsuda/ef-optimization-experiments branch 3 times, most recently from 9e7c1b2 to ae6c0bf, December 10, 2024 00:26
AskAlexSharov pushed a commit that referenced this pull request Dec 10, 2024
As part of #12907 I'll have
other iterator-like sequence implementations.

It makes sense to generalize ErrEliasFanoIterExhausted, move it
to a common package, and reuse it, rather than creating a separate
IteratorExhausted-like error for each implementation.
@wmitsuda force-pushed the wmitsuda/ef-optimization-experiments branch from ae6c0bf to 6d7f604, December 10, 2024 03:52
@wmitsuda
Member Author

Current status:

  • Implemented collate/merge support in addition to reading from snapshots. In theory it is feature complete; it now needs a full validation of correctness, optimizations, code polish, etc.
  • Hid everything behind a feature flag: --experimental.ef-optimization
    • That means if someone runs this branch as-is against an existing node, it should behave as if this PR were not applied, and the data should be written/read like regular E3.
    • If the feature flag is activated, new data moved from mdbx to snapshots will be written in the new optimized format (simple sequences/rebased elias-fano)
    • New merged files will be converted during the merge process.
    • Existing snapshots will not be touched; read support is backwards compatible.
    • New data is NOT backwards compatible: you can't disable the flag and go back to regular E3. DO BACKUP YOUR NODE BEFORE TRYING THIS PR.
  • Did some simple tests so far on holesky:
    • Ran this PR with the flag DISABLED, waited for 1 collation, and compared with a regular node in order to validate that feature flag disabled == working as before.
    • Ran this PR with the flag ENABLED, waited for 1 collation, 1 step was moved to new format.
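
The flag behaviour described in the status list (writes switch format only when the flag is on, reads accept both formats) could be sketched roughly as below. This is an illustration under loud assumptions, not the PR's actual code: the one-byte format tag, the `encode`/`decode` functions, and the "base value plus 32-bit offsets" layout are all hypothetical, and reading "rebased" as "stored relative to the first value" is my interpretation, not something stated in this thread.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// Hypothetical format tags; the real on-disk layout is not described here.
const (
	tagLegacy  byte = 0 // stand-in for the regular encoding: absolute uint64 values
	tagRebased byte = 1 // stand-in for the optimized encoding: base + uint32 offsets
)

// encode writes the legacy layout unless the (hypothetical) optimized flag
// is set. vals is assumed to be sorted in ascending order.
func encode(vals []uint64, optimized bool) ([]byte, error) {
	if !optimized || len(vals) == 0 {
		out := make([]byte, 1+8*len(vals))
		out[0] = tagLegacy
		for i, v := range vals {
			binary.BigEndian.PutUint64(out[1+8*i:], v)
		}
		return out, nil
	}
	base := vals[0]
	out := make([]byte, 1+8+4*len(vals))
	out[0] = tagRebased
	binary.BigEndian.PutUint64(out[1:], base)
	for i, v := range vals {
		if v < base || v-base > math.MaxUint32 {
			return nil, errors.New("value does not fit as a 32-bit offset from base")
		}
		binary.BigEndian.PutUint32(out[9+4*i:], uint32(v-base))
	}
	return out, nil
}

// decode accepts both layouts regardless of the flag, which is what makes
// reads backwards compatible in this sketch.
func decode(data []byte) ([]uint64, error) {
	if len(data) == 0 {
		return nil, errors.New("empty input")
	}
	switch data[0] {
	case tagLegacy:
		body := data[1:]
		if len(body)%8 != 0 {
			return nil, errors.New("corrupt legacy payload")
		}
		vals := make([]uint64, len(body)/8)
		for i := range vals {
			vals[i] = binary.BigEndian.Uint64(body[8*i:])
		}
		return vals, nil
	case tagRebased:
		if len(data) < 9 || (len(data)-9)%4 != 0 {
			return nil, errors.New("corrupt rebased payload")
		}
		base := binary.BigEndian.Uint64(data[1:9])
		vals := make([]uint64, (len(data)-9)/4)
		for i := range vals {
			vals[i] = base + uint64(binary.BigEndian.Uint32(data[9+4*i:]))
		}
		return vals, nil
	default:
		return nil, fmt.Errorf("unknown format tag %d", data[0])
	}
}

func main() {
	vals := []uint64{1_000_000, 1_000_007, 1_000_042}
	legacy, _ := encode(vals, false)
	optimized, _ := encode(vals, true)
	fmt.Println(len(legacy), len(optimized)) // prints "25 21"
	got, _ := decode(optimized)
	fmt.Println(got) // prints "[1000000 1000007 1000042]"
}
```

The sketch also shows why the new data is not readable by older code: a reader that only knows the legacy layout has no way to interpret the rebased payload.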

Next steps:

  • Do longer tests (to validate multiple merges) + data deep comparison with regular E3.
  • Bootstrap a new holesky node with --no-downloader in order to simulate a snapshotter + full comparison with a regular E3 node. Not sure if my hardware will do it in a sane timeframe, but I'll try it.

@AskAlexSharov
Collaborator

@wmitsuda FYI: the following commands may help to test the files: erigon snapshots integrity --check=InvertedIndex and erigon snapshots integrity --check=HistoryNoSystemTxs
