block: reduce zstd compression allocations #4092

Open
jbowens opened this issue Oct 21, 2024 · 2 comments

jbowens commented Oct 21, 2024

I haven't looked at the API yet, but it seems like we're holding ZSTD wrong. It's heap allocating several times every time we compress something. Presumably we can recycle these allocations by using the API correctly.

ROUTINE ======================== github.com/DataDog/zstd.NewWriterLevelDict in /Users/jackson/go/pkg/mod/github.com/!data!dog/[email protected]/zstd_stream.go
   3228945    3228945 (flat, cum) 43.07% of Total
         .          .    120:func NewWriterLevelDict(w io.Writer, level int, dict []byte) *Writer {
         .          .    121:   var err error
         .          .    122:   ctx := C.ZSTD_createCStream()
         .          .    123:
         .          .    124:   // Load dictionnary if any
         .          .    125:   if dict != nil {
         .          .    126:           err = getError(int(C.ZSTD_CCtx_loadDictionary(ctx,
         .          .    127:                   unsafe.Pointer(&dict[0]),
         .          .    128:                   C.size_t(len(dict)),
         .          .    129:           )))
         .          .    130:   }
         .          .    131:
         .          .    132:   if err == nil {
         .          .    133:           // Only set level if the ctx is not in error already
         .          .    134:           err = getError(int(C.ZSTD_CCtx_setParameter(ctx, C.ZSTD_c_compressionLevel, C.int(level))))
         .          .    135:   }
         .          .    136:
   1024124    1024124    137:   return &Writer{
         .          .    138:           CompressionLevel: level,
         .          .    139:           ctx:              ctx,
         .          .    140:           dict:             dict,
         .          .    141:           srcBuffer:        make([]byte, 0),
   1046993    1046993    142:           dstBuffer:        make([]byte, CompressBound(1024)),
         .          .    143:           firstError:       err,
         .          .    144:           underlyingWriter: w,
   1157828    1157828    145:           resultBuffer:     new(C.compressStream2_result),
         .          .    146:   }
         .          .    147:}
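
For context, a minimal sketch of the pattern the profile points at (assuming the caller constructs a streaming Writer per block, which is what the allocation counts above suggest; the block contents and level are placeholders):

package main

import (
	"bytes"
	"log"

	"github.com/DataDog/zstd"
)

// compressBlock mirrors the allocation-heavy pattern in the profile: every
// call constructs a fresh streaming Writer, which allocates a new
// ZSTD_CStream, srcBuffer, dstBuffer, and compressStream2_result.
func compressBlock(block []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := zstd.NewWriterLevel(&buf, 3)
	if _, err := w.Write(block); err != nil {
		w.Close()
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	out, err := compressBlock([]byte("example block contents"))
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("compressed to %d bytes", len(out))
}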

Jira issue: PEBBLE-282

@petermattis

We should be using zstd.CompressLevel (https://github.com/DataDog/zstd/blob/1.x/zstd.go#L95) which offers a similar API to snappy.Encode.
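
A minimal sketch of what the one-shot API looks like (the reused destination buffer and level here are illustrative, not Pebble's actual call site):

package main

import (
	"log"

	"github.com/DataDog/zstd"
)

func main() {
	src := []byte("example block contents")

	// Like snappy.Encode, CompressLevel accepts an optional destination
	// buffer; holding onto one across calls avoids the per-call allocations
	// made on the streaming Writer path.
	dst := make([]byte, zstd.CompressBound(len(src)))
	compressed, err := zstd.CompressLevel(dst, src, 3)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("compressed %d -> %d bytes", len(src), len(compressed))
}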


jbowens commented Nov 18, 2024

> We should be using zstd.CompressLevel (https://github.com/DataDog/zstd/blob/1.x/zstd.go#L95) which offers a similar API to snappy.Encode.

We might be able to get better performance by using (and reusing) a zstd.Ctx manually.

https://github.com/DataDog/zstd/blob/091c219d0f2a0cbb662944b65ca0838920d2f857/zstd.h#L251-L254
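
A sketch of what reusing a context could look like (the sync.Pool and helper are hypothetical, just to illustrate amortizing the ZSTD_CCtx allocation across blocks):

package main

import (
	"log"
	"sync"

	"github.com/DataDog/zstd"
)

// ctxPool recycles compression contexts so the underlying ZSTD_CCtx is
// allocated once per idle worker rather than once per compressed block.
var ctxPool = sync.Pool{
	New: func() interface{} { return zstd.NewCtx() },
}

// compressBlock is a hypothetical helper that borrows a context, compresses
// src into dst, and returns the context to the pool.
func compressBlock(dst, src []byte) ([]byte, error) {
	ctx := ctxPool.Get().(zstd.Ctx)
	defer ctxPool.Put(ctx)
	return ctx.CompressLevel(dst, src, 3)
}

func main() {
	src := []byte("example block contents")
	out, err := compressBlock(make([]byte, zstd.CompressBound(len(src))), src)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("compressed %d -> %d bytes", len(src), len(out))
}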

On a tangentially related note, we currently prefix zstd payloads with a varint-encoded length of the decompressed payload. I don't think this should be necessary, because zstd encodes a Frame_Content_Size field that indicates the uncompressed content size. We could contribute a func upstream to extract the frame content size, allowing us to allocate an appropriately-sized buffer in the cache.
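
For illustration, a sketch of what such an upstream helper might look like, wrapping ZSTD_getFrameContentSize via cgo (the Go function name FrameContentSize is hypothetical; the sentinel values mirror ZSTD_CONTENTSIZE_UNKNOWN/ZSTD_CONTENTSIZE_ERROR from zstd.h):

package zstd

/*
#include "zstd.h"
*/
import "C"

import (
	"errors"
	"unsafe"
)

const (
	contentSizeUnknown = ^uint64(0)     // ZSTD_CONTENTSIZE_UNKNOWN
	contentSizeError   = ^uint64(0) - 1 // ZSTD_CONTENTSIZE_ERROR
)

// FrameContentSize is a hypothetical helper returning the decompressed size
// recorded in a zstd frame header, letting a caller allocate an
// appropriately-sized destination buffer without a separate varint prefix.
func FrameContentSize(src []byte) (uint64, error) {
	if len(src) == 0 {
		return 0, errors.New("zstd: empty input")
	}
	size := uint64(C.ZSTD_getFrameContentSize(unsafe.Pointer(&src[0]), C.size_t(len(src))))
	switch size {
	case contentSizeUnknown:
		return 0, errors.New("zstd: frame content size not recorded")
	case contentSizeError:
		return 0, errors.New("zstd: invalid frame")
	}
	return size, nil
}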

@nicktrav nicktrav self-assigned this Nov 19, 2024