I replaced the string search functions with the "hash chain" algorithm used by gzip (described in https://www.rfc-editor.org/rfc/rfc1951.html#section-4). Essentially, substrings of length 3 (the minimum match length) in the 0x1000-byte compression window are placed into linked lists by hash code, so we can quickly skip to the next candidate instead of searching the whole window.
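For readers unfamiliar with the technique, here is a minimal sketch of a gzip-style hash chain (newest candidate first). It is not the code in this PR: the hash function, table size, and the names `HashChains`, `hash3`, and `max_len` are assumptions made for illustration.

```rust
const WINDOW_SIZE: usize = 0x1000;
const MIN_MATCH: usize = 3;
const HASH_BITS: u32 = 12; // assumed table size, not taken from the PR
const NIL: usize = usize::MAX;

/// Hash the 3 bytes starting at `pos` (assumed multiplicative hash).
fn hash3(data: &[u8], pos: usize) -> usize {
    let v = ((data[pos] as u32) << 16) | ((data[pos + 1] as u32) << 8) | (data[pos + 2] as u32);
    (v.wrapping_mul(0x9E37_79B1) >> (32 - HASH_BITS)) as usize
}

struct HashChains {
    head: Vec<usize>, // most recent position in each hash bucket
    prev: Vec<usize>, // prev[pos] = previous position with the same hash
}

impl HashChains {
    fn new(len: usize) -> Self {
        Self { head: vec![NIL; 1 << HASH_BITS], prev: vec![NIL; len] }
    }

    /// Record position `pos`. Call this after searching at `pos`.
    fn insert(&mut self, data: &[u8], pos: usize) {
        if pos + MIN_MATCH > data.len() {
            return; // fewer than 3 bytes left, nothing to hash
        }
        let h = hash3(data, pos);
        self.prev[pos] = self.head[h];
        self.head[h] = pos;
    }

    /// Return (distance, length) of the longest match for data[pos..],
    /// visiting only positions whose first 3 bytes hash to the same bucket.
    fn longest_match(&self, data: &[u8], pos: usize, max_len: usize) -> Option<(usize, usize)> {
        if pos + MIN_MATCH > data.len() {
            return None;
        }
        let limit = max_len.min(data.len() - pos);
        let mut best = None;
        let mut best_len = MIN_MATCH - 1;
        let mut cand = self.head[hash3(data, pos)];
        // Chains run newest to oldest, so stop at the first out-of-window node.
        while cand != NIL && pos - cand <= WINDOW_SIZE {
            let len = (0..limit).take_while(|&i| data[cand + i] == data[pos + i]).count();
            if len > best_len {
                best_len = len;
                best = Some((pos - cand, len));
                if len == limit {
                    break; // cannot do better than the maximum length
                }
            }
            cand = self.prev[cand];
        }
        best
    }
}
```

Hash collisions are harmless here because every candidate is still compared byte by byte; the chains only reduce how many positions get compared.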
This did require some changes from gzip though, since its implementation is pretty quirky. The biggest change is that the window is searched front-to-back here, instead of back-to-front as in gzip. When searching back-to-front you can let the hash chains grow indefinitely and garbage-collect the hash nodes whenever it is convenient, but here we keep both head and tail pointers so we can garbage-collect from the head as soon as a byte falls out of the window. (Alternatively, we could still search back-to-front and get rid of the abort-early check when we reach the maximum match length. I didn't think of this until just now, so I haven't tried it.)
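The head/tail variant can be sketched roughly as follows. Again, this is an illustration under assumptions rather than the PR's code: `WindowChains`, `hash3`, the hash function, and the `max_len` parameter are hypothetical, and `push` is assumed to be called for every position in increasing order.

```rust
const WINDOW_SIZE: usize = 0x1000;
const MIN_MATCH: usize = 3;
const HASH_BITS: u32 = 12; // assumed table size, not taken from the PR
const NIL: usize = usize::MAX;

/// Hash the 3 bytes starting at `pos` (assumed multiplicative hash).
fn hash3(data: &[u8], pos: usize) -> usize {
    let v = ((data[pos] as u32) << 16) | ((data[pos + 1] as u32) << 8) | (data[pos + 2] as u32);
    (v.wrapping_mul(0x9E37_79B1) >> (32 - HASH_BITS)) as usize
}

struct WindowChains {
    head: Vec<usize>, // oldest in-window position per bucket
    tail: Vec<usize>, // newest in-window position per bucket
    next: Vec<usize>, // next[pos % WINDOW_SIZE] = next newer position, same hash
}

impl WindowChains {
    fn new() -> Self {
        Self {
            head: vec![NIL; 1 << HASH_BITS],
            tail: vec![NIL; 1 << HASH_BITS],
            next: vec![NIL; WINDOW_SIZE],
        }
    }

    /// Append `pos` to its chain, first evicting the position that leaves the window.
    fn push(&mut self, data: &[u8], pos: usize) {
        if pos >= WINDOW_SIZE {
            let old = pos - WINDOW_SIZE;
            let h = hash3(data, old);
            // The oldest node of a chain is always its head, so eviction is O(1).
            if self.head[h] == old {
                self.head[h] = self.next[old % WINDOW_SIZE];
                if self.head[h] == NIL {
                    self.tail[h] = NIL;
                }
            }
        }
        if pos + MIN_MATCH > data.len() {
            return; // fewer than 3 bytes left, nothing to hash
        }
        let h = hash3(data, pos);
        self.next[pos % WINDOW_SIZE] = NIL;
        if self.tail[h] == NIL {
            self.head[h] = pos;
        } else {
            self.next[self.tail[h] % WINDOW_SIZE] = pos;
        }
        self.tail[h] = pos;
    }

    /// Search oldest to newest; on ties the oldest (farthest) match wins.
    fn longest_match(&self, data: &[u8], pos: usize, max_len: usize) -> Option<(usize, usize)> {
        if pos + MIN_MATCH > data.len() {
            return None;
        }
        let limit = max_len.min(data.len() - pos);
        let mut best = None;
        let mut best_len = MIN_MATCH - 1;
        let mut cand = self.head[hash3(data, pos)];
        while cand != NIL {
            let len = (0..limit).take_while(|&i| data[cand + i] == data[pos + i]).count();
            if len > best_len {
                best_len = len;
                best = Some((pos - cand, len));
                if len == limit {
                    break; // abort early at the maximum match length
                }
            }
            cand = self.next[cand % WINDOW_SIZE];
        }
        best
    }
}
```

Because every chain holds only in-window positions, the search loop needs no distance check, and the `next` array can be a ring buffer of `WINDOW_SIZE` slots: a position and the one evicted to make room for it share the same slot.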
I used a script to test on OOT segments (https://gist.github.com/cadmic/ab7a8b2ce576f0c2ccd1f8b18f06dff0, run `cargo build --release` first). Everything matches and the user (CPU) time to run the script went from 95.75s to 4.55s.