Reduce memory usage of zoekt-webserver #86
Comments
Could be especially true with NVMe storage becoming more available and so quick.
There are also options for compressed posting lists. For example, https://github.com/dgryski/go-postings
@dgryski thanks for the link, I remember reading that code a while ago! (Thanks for all the interesting stuff you post btw, I'm a reader). However, the posting lists are already disk backed. It is a Line 31 in dd7d981
I'd do profiling before throwing out ideas, and classify the problems more precisely. Do you get OOM for queries with many results? For literal substring search or regex matching? Are the queries complex over few shards or simple over many shards?
The content and posting lists are mmap'ed, so they are effectively already FS-based. Is the OOM because it faults in too many mmap'd pages, or because the search generates too much intermediate data?
Re. FS-based data structures, what does that mean? FS entities (dirnames, inodes) are very large compared to the data (lots of int32s) in the index, so moving them to FS will almost certainly increase overall memory use.
Fair enough! I thought I'd do this just in case it triggered some ideas you had already thought about.
I don't know! It would be over many shards though. The instance had 20k+ repos indexed on a machine with 80 GB of RAM. As I said, I haven't actually dug into this yet; I just wanted to pre-empt discussion.
Sorry, "FS based" is misleading. I meant that instead of unmarshalling the whole
Apologies if I prematurely filed this issue. I was just showing intent to work on this.
Remember that for each shard, the map[ngram] might be small, so a btree might well create unnecessary
One random idea: you could use uint32 instead of uint64 for shards that are pure ASCII; or maybe have a map[uint32] and a map[uint64] for ASCII vs. Unicode trigrams.
Is this with your RPC-powered server? If so, it will have to copy around the content for the matches. If you can avoid (de)serializing data you'd avoid a lot of garbage. Try to see if you can repro the problem with the plain zoekt webserver, or whether it is a Sourcegraph-specific problem.
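To make the uint32-vs-uint64 key idea concrete, here is a minimal Go sketch. It assumes a trigram of three runes is packed at 21 bits per rune into a uint64, and that an ASCII-only shard could instead pack three bytes into a uint32. The names posting, trigramIndex, packRunes and packASCII are hypothetical and are not taken from the zoekt source.

```go
package main

import "fmt"

// posting is a hypothetical map value: two 32-bit uints.
type posting struct{ off, sz uint32 }

// packRunes packs three runes into a uint64 key, 21 bits per rune
// (21 bits cover every Unicode code point up to 0x10FFFF).
func packRunes(a, b, c rune) uint64 {
	return uint64(a)<<42 | uint64(b)<<21 | uint64(c)
}

// packASCII packs three ASCII bytes into a uint32 key, 8 bits per byte.
func packASCII(a, b, c byte) uint32 {
	return uint32(a)<<16 | uint32(b)<<8 | uint32(c)
}

// trigramIndex uses the narrow map for ASCII-only shards and the wide map otherwise.
type trigramIndex struct {
	ascii map[uint32]posting // non-nil only for ASCII-only shards
	wide  map[uint64]posting
}

func newTrigramIndex(asciiOnly bool) *trigramIndex {
	if asciiOnly {
		return &trigramIndex{ascii: make(map[uint32]posting)}
	}
	return &trigramIndex{wide: make(map[uint64]posting)}
}

func isASCII(r rune) bool { return r >= 0 && r < 0x80 }

// add stores p under the trigram; it reports false if a non-ASCII
// trigram is offered to an ASCII-only index.
func (t *trigramIndex) add(a, b, c rune, p posting) bool {
	if t.ascii != nil {
		if !isASCII(a) || !isASCII(b) || !isASCII(c) {
			return false
		}
		t.ascii[packASCII(byte(a), byte(b), byte(c))] = p
		return true
	}
	t.wide[packRunes(a, b, c)] = p
	return true
}

// lookup finds the posting for a trigram. A non-ASCII trigram can never
// match in an ASCII-only shard, so that case returns early.
func (t *trigramIndex) lookup(a, b, c rune) (posting, bool) {
	if t.ascii != nil {
		if !isASCII(a) || !isASCII(b) || !isASCII(c) {
			return posting{}, false
		}
		p, ok := t.ascii[packASCII(byte(a), byte(b), byte(c))]
		return p, ok
	}
	p, ok := t.wide[packRunes(a, b, c)]
	return p, ok
}

func main() {
	idx := newTrigramIndex(true)
	idx.add('f', 'o', 'o', posting{off: 0, sz: 12})
	p, ok := idx.lookup('f', 'o', 'o')
	fmt.Println(p, ok)
}
```

Since the choice of map is made per shard, the on-disk format would not necessarily have to change; only the in-memory key type differs.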
No worries. I'm curious to see what you come up with.
FYI I did some memory profiling on an instance with 38k shards. I took a heap profile after startup, before any search queries had run. It was using about 40 GB; the biggest offender by far was the trigram map.
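For anyone wanting to take a comparable measurement, one way to capture a heap profile from inside a Go program is the standard runtime/pprof package. This is a generic sketch and not necessarily how the profile above was taken; the helper name heapProfileBytes and the output file name heap.pprof are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// heapProfileBytes returns a heap profile of the current process.
// Running a GC first makes the profile reflect live objects more closely.
func heapProfileBytes() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := heapProfileBytes()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// heap.pprof is an arbitrary file name chosen for this example.
	if err := os.WriteFile("heap.pprof", data, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("wrote %d bytes to heap.pprof\n", len(data))
}
```

The resulting file can be inspected with `go tool pprof heap.pprof`, where `top` lists the largest allocation sites.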
Yeah my suspicion is the long tail of shards are small. So indeed the overall performance may degrade quite a bit if we naively switch to btrees.
This is a neat idea. I guess that is roughly a 1/4 memory saving (the values in the map are two 32-bit uints). Could even do this without changing the index format, which makes it quite attractive.
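The 1/4 figure follows from simple per-entry arithmetic: an 8-byte key plus an 8-byte value is 16 bytes, while a 4-byte key plus the same value is 12 bytes. A small Go check, assuming the value is a struct of two uint32s (the type name value is hypothetical) and ignoring the map's own bucket overhead:

```go
package main

import (
	"fmt"
	"unsafe"
)

// value is a hypothetical stand-in for the map value: two 32-bit uints.
type value struct{ a, b uint32 }

// entryBytes is the raw payload of one map entry: key plus value.
// It deliberately ignores Go's internal map bucket overhead.
func entryBytes(keySize uintptr) uintptr {
	return keySize + unsafe.Sizeof(value{})
}

// saving returns the fraction of payload bytes saved by switching
// from uint64 keys to uint32 keys.
func saving() float64 {
	wide := entryBytes(unsafe.Sizeof(uint64(0)))
	narrow := entryBytes(unsafe.Sizeof(uint32(0)))
	return float64(wide-narrow) / float64(wide)
}

func main() {
	fmt.Printf("payload saving per entry: %.2f\n", saving())
}
```

The real saving depends on how the Go runtime lays out map buckets, so this is only an estimate of the payload reduction.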
Indeed this is. We did notice a bug recently where we set max matches far too high, which would have led to high per-request usage. However, the RPC layer is actually quite efficient. gob is nice :) So I would be surprised if this was sg specific, outside of us being silly with the search options.
@dgryski thanks again for this. The approach zoekt uses is already varint encoding on deltas (just not group varint). However, I've been experimenting with dividing the posting lists into blocks (like go-postings) to speed up |
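For readers unfamiliar with the technique, varint encoding on deltas can be sketched in Go as follows. This illustrates the general idea using the standard encoding/binary package; it is not zoekt's actual encoder, and the function names encodeDeltas and decodeDeltas are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeDeltas stores a sorted list of offsets as varint-encoded gaps
// between consecutive values, so small gaps take a single byte.
func encodeDeltas(sorted []uint32) []byte {
	buf := make([]byte, 0, len(sorted))
	var tmp [binary.MaxVarintLen32]byte
	var prev uint32
	for _, v := range sorted {
		n := binary.PutUvarint(tmp[:], uint64(v-prev))
		buf = append(buf, tmp[:n]...)
		prev = v
	}
	return buf
}

// decodeDeltas reverses encodeDeltas by summing the gaps back up.
func decodeDeltas(buf []byte) ([]uint32, error) {
	var out []uint32
	var prev uint32
	for len(buf) > 0 {
		d, n := binary.Uvarint(buf)
		if n <= 0 {
			return nil, errors.New("malformed varint")
		}
		prev += uint32(d)
		out = append(out, prev)
		buf = buf[n:]
	}
	return out, nil
}

func main() {
	offsets := []uint32{3, 10, 300, 70000}
	enc := encodeDeltas(offsets)
	dec, err := decodeDeltas(enc)
	fmt.Println(len(enc), dec, err)
}
```

Because decoding is strictly sequential, finding one position means walking the list from the start; splitting the list into blocks, as go-postings does, is one way to allow skipping ahead.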
A tunable we can look into is setting |
I want to investigate reducing the memory usage of zoekt-webserver when there are many shards. I haven't investigated this properly yet, but have noticed OOMs and I am wondering if there are some wins here.
Ideas before profiling:
map[ngram]. I suspect the working set of ngrams over a time period is relatively small compared to the number of unique ngrams in a shard. So the performance implications of an FS-based data structure for map[ngram] may be minimal.
@hanwen if you have any ideas to focus my investigation that would be great.