Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Add support for multi get operation for database queries #2396

Closed

Conversation

netrome
Copy link
Contributor

@netrome netrome commented Oct 25, 2024

Linked Issues

Closes #2344

Description

This PR adds support for getting batches of values from the storage through a new StorageBatchInspect trait in fuel-core-storage.

This trait is implemented for StructuredStorage and GenericDatabase through a new get_batch method added to the KeyValueInspect and BluePrintInspect traits. The get_batch method is implemented with the multi-get operation for the RocksDB based storage implementations, and uses a default blanket implementation for in-memory implementations.

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog

Before requesting review

  • I have reviewed the code myself

@netrome netrome linked an issue Oct 25, 2024 that may be closed by this pull request
@netrome netrome changed the base branch from release/v0.40.0 to master October 25, 2024 09:12
@netrome netrome force-pushed the 2344-add-support-for-multi-get-operation-in-the-database branch from f9e0289 to 2532b06 Compare October 25, 2024 12:31
@netrome netrome self-assigned this Oct 29, 2024
@netrome netrome changed the title 2344 add support for multi get operation in the database feat: Add support for multi get operation for database queries Oct 29, 2024
@netrome netrome force-pushed the 2344-add-support-for-multi-get-operation-in-the-database branch 3 times, most recently from 97f786f to 5405e6c Compare October 29, 2024 09:40
@netrome netrome marked this pull request as ready for review October 29, 2024 21:22
@netrome netrome force-pushed the 2344-add-support-for-multi-get-operation-in-the-database branch from 2920ae8 to cff7b47 Compare October 29, 2024 21:23
@netrome netrome requested a review from a team October 29, 2024 21:23
@netrome netrome force-pushed the 2344-add-support-for-multi-get-operation-in-the-database branch from cff7b47 to b23e4be Compare October 30, 2024 09:24
@netrome netrome requested a review from a team October 31, 2024 09:57
crates/storage/src/codec.rs Show resolved Hide resolved
@@ -41,7 +44,7 @@ pub trait StorageColumn: Copy + core::fmt::Debug {

/// The definition of the key-value inspection store.
#[impl_tools::autoimpl(for<T: trait> &T, &mut T, Box<T>)]
pub trait KeyValueInspect {
pub trait KeyValueInspect: Send + Sync {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is very sad that we need to require Send + Sync only because BoxedIter requires Send. Maybe we can define non Send boxed iterator instead and use it(because it seems we don't need Send feature, but maybe I'm wrong)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this bound is just to satisfy the BoxedIter requirement. I can see if I can define a non-send one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hitting some walls with this implementation. Will try again tomorrow with fresh eyes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright my mistake was trying to change some iterators that we turn into streams, and they need to be Send. Now I have managed to get rid of these trait bounds cbb6efc.

Now we have two boxed iterators: BoxedIter which doesn't require Send, and BoxedIterSend for the cases when you need Send.

Comment on lines +181 to +183
<Self as StorageBatchInspect<OldTransactions>>::get_batch(self, ids)
.map(|result| result.and_then(|opt| opt.ok_or(not_found!(OldTransactions))))
.into_boxed()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want, you can add the same syntax sugars that we did with storage_as_ref to avoid <Self as StorageInspect<M>> usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting, that would be nice. I'll look into it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out this is not as trivial as I had thought, as we're hitting some lifetime issues.

The problem

get_batch returns an iterator bound to the lifetime of the self parameter. I.e. the storage we call it on. This is necessary since we do self.get(...) within the KeyValueInspect::get_batch implementation.

So while we can implement StorageBatchInspect for the StorageRef type as:

impl<'a, S, Type> StorageBatchInspect<Type> for StorageRef<'a, S, Type>
where
    S: StorageBatchInspect<Type>,
    Type: Mappable,
{
    #[inline(always)]
    fn get_batch<'b, Iter>(
        &'b self,
        _keys: Iter,
    ) -> impl Iterator<Item = Result<Option<Type::OwnedValue>>> + 'b
    where
        Iter: 'b + IntoIterator<Item = &'b Type::Key>,
        Type::Key: 'b,
    {
        None.into_iter() // Note that we'd need access to `self.0` here which requires VM changes, or copying the `StorageRef` type into this crate. But that's a separate, and very manageable issue
    }
}

when we try to use it as

    fn old_transactions<'a>(
        &'a self,
        ids: BoxedIter<'a, &'a TxId>,
    ) -> BoxedIter<'a, StorageResult<Transaction>> {
        self.storage::<OldTransactions>()
            .get_batch(ids)
            .map(|result| result.and_then(|opt| opt.ok_or(not_found!(OldTransactions))))
            .into_boxed()
    }

we're hitting this issue

error[E0716]: temporary value dropped while borrowed
   --> crates/fuel-core/src/service/adapters/graphql_api/off_chain.rs:183:9
    |
179 |       fn old_transactions<'a>(
    |                           -- lifetime `'a` defined here
...
183 |           self.storage::<OldTransactions>()
    |           -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |           |
    |  _________creates a temporary value which is freed while still in use
    | |
184 | |             .get_batch(ids)
    | |___________________________- argument requires that borrow lasts for `'a`
...
187 |       }
    |       - temporary value is freed at the end of this statement

Going forward

For this PR I suggest we don't go down this rabbit hole, but I'd be happy to do it as a follow-up if you think it's possible/interesting to explore it. Seems like a quite low prio though.

Copy link
Collaborator

@xgreenx xgreenx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to get some benchmarks(I guess you need to add new one that will use GraphQL) =)

netrome and others added 2 commits October 31, 2024 13:03
Since this PR add `into_bytes` for the encoder, we can optimize the
batch mutate operations as well
#2396 (review)
@netrome
Copy link
Contributor Author

netrome commented Oct 31, 2024

It would be nice to get some benchmarks(I guess you need to add new one that will use GraphQL) =)

Sure thing. I'm not sure if it makes sense to use criterion/cargo bench for end-to-end performance testing though as it is more geared towards micro-benchmarking, and we need to do something like the following.

  1. Populate the database with a bunch of coins/transactions/messages etc.
  2. Call the DatabaseMessages::message_batch, DatabaseCoins::coins etc. methods and record the lookup times.

I'll look into our options for this type of workload. Let me know if you have any opinions or thoughts on this.

@netrome netrome force-pushed the 2344-add-support-for-multi-get-operation-in-the-database branch from 32993e0 to 3a3d967 Compare November 1, 2024 09:02
@netrome netrome force-pushed the 2344-add-support-for-multi-get-operation-in-the-database branch from 3a3d967 to cbb6efc Compare November 1, 2024 09:05
@netrome netrome requested review from a team and xgreenx November 1, 2024 09:10
@netrome netrome changed the title feat: Add support for multi get operation for database queries perf: Add support for multi get operation for database queries Nov 1, 2024
@netrome
Copy link
Contributor Author

netrome commented Nov 9, 2024

It would be nice to get some benchmarks(I guess you need to add new one that will use GraphQL) =)

Sure thing. I'm not sure if it makes sense to use criterion/cargo bench for end-to-end performance testing though as it is more geared towards micro-benchmarking, and we need to do something like the following.

1. Populate the database with a bunch of coins/transactions/messages etc.

2. Call the `DatabaseMessages::message_batch`, `DatabaseCoins::coins` etc. methods and record the lookup times.

I'll look into our options for this type of workload. Let me know if you have any opinions or thoughts on this.

I'll aim at defining these workloads as a stand-alone binary, which would allow us to use hyperfine to actually execute the benchmarks and interpret the results.

crates/client/src/client/schema/primitives.rs Show resolved Hide resolved
crates/client/src/client/schema/primitives.rs Show resolved Hide resolved
crates/fuel-core/src/graphql_api/database.rs Outdated Show resolved Hide resolved
crates/storage/src/codec.rs Outdated Show resolved Hide resolved
storage: &'a S,
keys: Iter,
column: S::Column,
) -> impl Iterator<Item = crate::Result<Option<M::OwnedValue>>> + 'a
Copy link
Contributor

@acerone85 acerone85 Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
) -> impl Iterator<Item = crate::Result<Option<M::OwnedValue>>> + 'a
) -> impl IntoIterator<Item = crate::Result<Option<M::OwnedValue>>> + 'a

Iterators implement IntoIterator, so if you change the signature to return a IntoIterator then StorageBatchInspect becomes slightly more flexible, as implementations can return either an iterator or a collection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general principle, I like interfaces to be very general in what they accept but prefer them to be specific in what they return. (see Robusness principle). That's why I typically accept IntoIterator in parameters, because as you say then the caller doesn't have to care about doing `.into_iter()´ explicitly if they have a collection or anything.

Therefore, for return values it's better in my opinion to return the Iterator directly since it is more specific. Otherwise, all callers are basically forced to call .into_iter() before they can do anything with the returned iterator, which is less ergonomic. Any implementer of the trait can (and should) always call .into_iter() if they are about to return a collection. Again, not doing so would just force this burden upon the user of the function which is less than ideal without providing any benefit whatsoever.

Let me know if you disagree and I'll be happy to further discuss this.

let transactions: Vec<_> = self
.on_chain
.transactions(tx_ids.iter().into_boxed())
.collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you can avoid collecting here since later you iterate on transactions again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unforutnately I need to both iterate over the transactions and zip them with the on_chain_results - so we need to iterate twice over the tersults. We could potentially do something fancy with Itertools::tee, but I think this solution is simpler and easier on the eyes.

@netrome netrome force-pushed the 2344-add-support-for-multi-get-operation-in-the-database branch from 36386af to c3b7264 Compare November 15, 2024 20:40
@netrome
Copy link
Contributor Author

netrome commented Nov 15, 2024

Got some initial benchmark results in #2433, and it doesn't look good for the multi-get implementation. So far it's either slower or the same for all workloads I've tried. I've done the following:

  • Measure end to end query times for getting transactions by owner over GraphQL, database populated by submitting a bunch of transactions (100k or 1M in most tests). with the same owner.
  • Measure transaction lookup times using a ReadView directly reading a database pre-populated with 1M transactions (also tried different parameters here).
  • Measure coin lookup times using a ReadView directly reading a database pre-populated with 1M coins. Tried with different parameters, and also with concurrent reads. Experimented with enabling/disabling caching in rocksdb.

I'm all ears if anyone has any suggestions for workloads to try, but until we can find evidence that this change actually is an improvement in any way we should put this PR on hold.

@rymnc
Copy link
Member

rymnc commented Nov 25, 2024

Got some initial benchmark results in #2433, and it doesn't look good for the multi-get implementation. So far it's either slower or the same for all workloads I've tried. I've done the following:

  • Measure end to end query times for getting transactions by owner over GraphQL, database populated by submitting a bunch of transactions (100k or 1M in most tests). with the same owner.
  • Measure transaction lookup times using a ReadView directly reading a database pre-populated with 1M transactions (also tried different parameters here).
  • Measure coin lookup times using a ReadView directly reading a database pre-populated with 1M coins. Tried with different parameters, and also with concurrent reads. Experimented with enabling/disabling caching in rocksdb.

I'm all ears if anyone has any suggestions for workloads to try, but until we can find evidence that this change actually is an improvement in any way we should put this PR on hold.

this is interesting, do you find any deviation from your impl and the benchmark code in

fn get_block_multi_get_method(
database: &RocksDb<BenchDatabase>,
height: BlockHeight,
) -> Result<Block> {
let height_key = height.to_bytes();
let raw_block = database
.get(&height_key, BenchDbColumn::FuelBlocks)?
.ok_or(anyhow!("empty raw block"))?;
let block: CompressedBlock = postcard::from_bytes(raw_block.as_slice())?;
let tx_ids = block.transactions().iter();
let raw_txs = database.multi_get(BenchDbColumn::Transactions.id(), tx_ids)?;
let txs: Vec<Transaction> = raw_txs
.iter()
.flatten()
.map(|raw_tx| postcard::from_bytes::<Transaction>(raw_tx.as_slice()))
.try_collect()?;
Ok(block.uncompress(txs))
}
? I would imagine the performance gains would be similar though

@netrome
Copy link
Contributor Author

netrome commented Nov 27, 2024

this is interesting, do you find any deviation from your impl and the benchmark code in

fn get_block_multi_get_method(
database: &RocksDb<BenchDatabase>,
height: BlockHeight,
) -> Result<Block> {
let height_key = height.to_bytes();
let raw_block = database
.get(&height_key, BenchDbColumn::FuelBlocks)?
.ok_or(anyhow!("empty raw block"))?;
let block: CompressedBlock = postcard::from_bytes(raw_block.as_slice())?;
let tx_ids = block.transactions().iter();
let raw_txs = database.multi_get(BenchDbColumn::Transactions.id(), tx_ids)?;
let txs: Vec<Transaction> = raw_txs
.iter()
.flatten()
.map(|raw_tx| postcard::from_bytes::<Transaction>(raw_tx.as_slice()))
.try_collect()?;
Ok(block.uncompress(txs))
}

? I would imagine the performance gains would be similar though

Yes, that code only tests the multi-get operation directly. I have no doubt that the multi-get operation gives a performance benefit in isolation, but with the current change it seems to be a net-negative performance effect. I suspect it's the boxed iterators, since each boxed iterator requires a vtable lookup to reference a single element if I'm not mistaken, and I would not be surprised if this overhead is enough to negate any positive effects of the multi-get. But we need to investigate this further before we can conclude anything for certain, and on my side this is de-prioritized in favor of the fraud-proving work.

With some larger refactors in the GraphQL crate, we should be able to avoid the boxed iterators - but we need to decide how to proceed and prioritize that work first.

@netrome
Copy link
Contributor Author

netrome commented Nov 27, 2024

Closing this PR for now, and we can re-open it when we want to revisit this.

@netrome netrome closed this Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for multi-get operation in the Database
5 participants