You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Unless the encoded type can be trivially transmuted to a slice of bytes (str, [u8], U32<NativeEndian>, ...) we have to allocate a temporary buffer where the value is written.
There is no way to get a temporary buffer without going through the global allocator.
Allocating a temporary buffer incurs an extra copy.
For example, most architectures in use are Little-Endian (x86, Arm), however, for lexicographic ordering to correctly sort integers, we have to store them in Big-Endian. Since this incurs a byte-swap on Little-Endian architectures we cannot simply return a pointer to the integer itself, instead we have to allocate a temporary buffer where we store the byte-swapped result. That's one heap allocation for each integer, something which could have been done trivially on the stack or in a CPU register.
Optimizations
In most cases we know a-priori the maximum size of the needed allocation:
LMDB forces keys to be 511 bytes or less.
Primitive types and arrays have a compile-time known size.
In some cases it may be preferable to run serialization in two phases:
First we serialize to a "null" writer which counts the number of bytes needed for the value.
With this size in hand, we can have LMDB reserve a memory area in the database which fits the serialized value (using MDB_RESERVE).
We then serialize our value directly into the reserved space.
A new trait
With the above in mind, we can make a few changes to the BytesEncode trait (naming TBD):
/// Set to `true` if the value can be trivially casted to `&[u8]` (e.g., `str`, `[u8]`, ...)./// This is used as a hint by the library to avoid using intermediate buffers.constZERO_COPY:bool = false;/// A hint to the database about the final size of the encoded value./// Can be used to reduce allocations and memory copies./// /// Must be exactly equal to the final encoded size,/// otherwise encoding will result in an error.fnsize_hint(item:&Self::EItem) -> Option<usize>{None}/// Encode the value into a pre-allocated buffer./// The buffer could be allocated on the stack if we are encoding keys (511 bytes),/// or allocated on the heap (with an initial capacity equal to `size_hint`).fnencode_writer<W: std::io::Write>(item:&Self::EItem,writer:&mutW) -> Result<()>{// the default impl forwards to the old method
writer.write_all(Self::bytes_encode(item)?)?;Ok(())}
Based on the value of ZERO_COPY we can decide between either calling bytes_encode and directly get our slice, or using a pre-allocated buffer with encode_writer. (If the )
We could also consider making encode_writer the required method, and define bytes_encode in terms of encode_writer with Vec as our Writer. However, this would be a breaking change.
Example Usage
Let's look at how we might implement Database::put with the above definition (pseudocode):
fnput(db,key:implBytesEncode,value:implBytesEncode){ifKey::ZERO_COPY{
key_bytes = key.encode_bytes()?;// should yield &[u8] without allocation}else{letmut buffer = [0u8;511];letmut array_writer = std::io::Cursor::new(&mut buffer);
key.encode_writer(&mut array_writer);
key_bytes = &buffer[..array_writer.position()];}if !Value::ZERO_COPY{ifletSome(size) = value.size_hint(){// allocate space directly in DB for value before writingreturn db.put_reserve(key, size, |space| value.encode_writer(space));}}// Should yield `&[u8]` directly if ZERO_COPY is true.// Otherwise, the implementation likely makes better decisions about allocating a `Vec`.// Although, we could possibly pre-allocate a small buffer on the stack, something the implementation cannot.
value_bytes = key.encode_bytes()?;mdb_put(db, key_bytes, value_bytes)?;}
Note that we can avoid all temporary allocations and intermediate memory copies in all cases except one: if the resulting size of value is unknown.
The text was updated successfully, but these errors were encountered:
Thank you for this proposal @nolanderc, very appreciated 😃
However, there is one minor issue with the ZERO_COPY parameter; it should be possible to dynamically determine it from the given Self::Item. I advise you to look at the Meilisearch use case, which is probably the project that uses heed the most. We use a CboRoaringBitmap (using RoaringBitmaps) codec that conditionally stores integers one after the others (when there are not many) or uses the default serialization method.
I am pretty sure we can apply this optimization to the Meilisearch codebase and see the performance gains 🤩
The
BytesEncode
trait is currently defined as follows:heed/heed-traits/src/lib.rs
Lines 20 to 26 in 3406383
There are some issues with this definition:
str
,[u8]
,U32<NativeEndian>
, ...) we have to allocate a temporary buffer where the value is written.For example, most architectures in use are Little-Endian (x86, Arm), however, for lexicographic ordering to correctly sort integers, we have to store them in Big-Endian. Since this incurs a byte-swap on Little-Endian architectures we cannot simply return a pointer to the integer itself, instead we have to allocate a temporary buffer where we store the byte-swapped result. That's one heap allocation for each integer, something which could have been done trivially on the stack or in a CPU register.
Optimizations
In most cases we know a-priori the maximum size of the needed allocation:
In some cases it may be preferable to run serialization in two phases:
MDB_RESERVE
).A new trait
With the above in mind, we can make a few changes to the
BytesEncode
trait (naming TBD):Based on the value of
ZERO_COPY
we can decide between either callingbytes_encode
and directly get our slice, or using a pre-allocated buffer withencode_writer
. (If the )We could also consider making
encode_writer
the required method, and definebytes_encode
in terms ofencode_writer
withVec
as ourWrite
r. However, this would be a breaking change.Example Usage
Let's look at how we might implement
Database::put
with the above definition (pseudocode):Note that we can avoid all temporary allocations and intermediate memory copies in all cases except one: if the resulting size of
value
is unknown.The text was updated successfully, but these errors were encountered: