
fatxpool: handling limits and priorities improvements #6405

Merged
merged 42 commits into master from mku-fatxpool-priorities on Dec 3, 2024

Conversation

michalkucharczyk (Contributor)

@michalkucharczyk michalkucharczyk commented Nov 7, 2024

This PR provides a number of improvements around handling limits and priorities in the fork-aware transaction pool.

Notes to reviewers.

Following are the notable changes:

  1. Better support for Usurped transactions

    When any view reports a Usurped transaction (one replaced by another with a higher priority), it is removed from all views, including inactive ones. Removal is implemented by simply submitting the usurping transaction to all views. It is also ensured that a usurped transaction will not sneak into the view_store of a newly created view (this is why ViewStore::pending_txs_replacements was added). A sketch of this flow follows the list below.

  2. TimedTransactionSource introduced

    Every view now carries information about when a transaction entered the pool. Limit enforcement (currently applied only to future transactions) uses this timestamp to find the worst transactions. A common timestamp ensures a coherent assessment of a transaction's importance across different views. It could later also be used to select which ready transaction shall be dropped. See the second sketch below this list.

  3. DroppedWatcher: improved logic for future transactions

    For a future transaction, if the last view referencing it is removed, the transaction is dropped from the pool. This prevents unincluded and never-promoted future transactions from staying in the pool for a long time (a related sketch appears in the review discussion further down).
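For illustration, a minimal sketch of the Usurped-transaction flow from point 1. Apart from ViewStore::pending_txs_replacements (named in this description), all types and method names here are hypothetical stand-ins, not the actual polkadot-sdk API:

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

struct View;

impl View {
    // Submitting the higher-priority usurper makes the view's underlying
    // pool evict the lower-priority (usurped) transaction.
    fn submit_one(&self, _tx: &[u8]) { /* elided */ }
}

#[derive(Default)]
struct ViewStore {
    active_views: HashMap<Hash, View>,
    inactive_views: HashMap<Hash, View>,
    // Replacements still in flight; consulted when a new view is created,
    // so the usurped transaction cannot sneak back into its view_store.
    pending_txs_replacements: HashMap<Hash, Vec<u8>>,
}

impl ViewStore {
    fn replace_transaction(&mut self, usurped: Hash, usurper: Vec<u8>) {
        // Remember the replacement for any view created while it is in flight.
        self.pending_txs_replacements.insert(usurped, usurper.clone());
        // Submit the usurper to every view, active and inactive, so the
        // usurped transaction is removed everywhere.
        for view in self.active_views.values().chain(self.inactive_views.values()) {
            view.submit_one(&usurper);
        }
    }
}
```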
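And a sketch of the TimedTransactionSource idea from point 2. Only TimedTransactionSource::new_external(false) is taken from the diff reviewed below; the fields, the remaining source variants, and the sorting helper are assumptions:

```rust
use std::time::Instant;

type TxHash = u64;

#[derive(Clone, Copy, Debug)]
enum TransactionSource {
    External,
    Local,
    InBlock,
}

#[derive(Clone, Debug)]
struct TimedTransactionSource {
    source: TransactionSource,
    // When the transaction entered the pool. One pool-wide timestamp, rather
    // than a per-view one, gives every view the same notion of "oldest".
    timestamp: Option<Instant>,
}

impl TimedTransactionSource {
    fn new_external(with_timestamp: bool) -> Self {
        Self {
            source: TransactionSource::External,
            timestamp: with_timestamp.then(Instant::now),
        }
    }
}

// Limit enforcement can then sort future transactions so the worst (oldest)
// come first; `None` timestamps sort before all `Some` values.
fn sort_worst_first(txs: &mut [(TxHash, TimedTransactionSource)]) {
    txs.sort_by_key(|(_, src)| src.timestamp);
}
```

Because every view sees the same timestamp, "which future transaction is the worst" gets the same answer on every fork.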

And some minor changes:

  1. simplified the flow in update_view_with_mempool (removed code duplication + minor bug fix),
  2. graph::BasePool: improved handling of priorities for future transactions (previously a transaction with a lower priority was reported as failed),
  3. graph::listener: dedicated limit_enforced/usurped/dropped calls added,
  4. a flaky test fixed,
  5. new tests added.

related to: #5809

@michalkucharczyk michalkucharczyk changed the title from "fatxpool: improvements in handling of priorities" to "fatxpool:limits and priorities improvements" Nov 14, 2024
@michalkucharczyk michalkucharczyk changed the title from "fatxpool:limits and priorities improvements" to "fatxpool: limits and priorities improvements" Nov 14, 2024
@michalkucharczyk michalkucharczyk changed the title from "fatxpool: limits and priorities improvements" to "fatxpool: handling limits and priorities improvements" Nov 14, 2024
@michalkucharczyk (Contributor Author)

/cmd prdoc --bump major --audience node_dev

@michalkucharczyk michalkucharczyk added the R0-silent (Changes should not be mentioned in any release notes) and T0-node (This PR/Issue is related to the topic “node”) labels Nov 14, 2024
@michalkucharczyk michalkucharczyk marked this pull request as ready for review November 14, 2024 17:11
@lexnv (Contributor) left a comment


LGTM! Not entirely familiar with the code; some minor nits that could all be ignored / followed up later if need be 🙏

Some((
    // These transactions are coming from retracted blocks, we should
    // simply consider them external.
    TimedTransactionSource::new_external(false),
Contributor


Should probably not be changed since it was like this before, but any idea why this is External instead of InBlock?

Contributor Author


From what I understand, it could be InBlock. But I don't understand all the consequences of changing this value, as it is passed on to the runtime side. I'd rather not touch it in this PR :)

    self.future_transaction_views.entry(tx_hash).or_default().insert(block_hash);
},
TransactionStatus::Ready | TransactionStatus::InBlock(..) => {
    if let Some(mut views) = self.future_transaction_views.remove(&tx_hash) {
Contributor


DQ: Why do we move the entire list of views from future to ready here? This is just one view reporting the transaction as ready now, right? I would expect that we only remove that one block hash from the future transactions view list.

Contributor Author


Good question.

It is about whether we want to drop a transaction when the last view referencing it is dropped.

For a transaction that is future, and was never seen as ready on any fork, we want to remove it from the pool when the last view referencing it is gone. We basically don't want to keep future transactions around for too long. If the transaction is no longer in any view, it was likely pushed out by newly incoming transactions.

If the transaction was seen as ready on any fork, it means that it could soon be included in some new fork that pops up. This is why, in the DroppedWatcher, I decided to treat such a transaction as ready. Due to this we will keep the transaction in the mempool for a while: we will not send a dropped event to the pool.

I treat the internal mempool as a secondary, best-effort buffer for transactions, meaning that it can hold ready transactions that are currently not referenced by any view (note that this applies when limits are hit). When blocks are built, those transactions will be resubmitted and will get into a view. The other, simpler approach would be to drop everything that is not referenced by views. Maybe that is the way to go; I'll give it more consideration when all the groundwork is done.

Please also note that hitting limits on the transaction pool is not a typical case. I would not expect to hit limits even during spamming tests. Hitting limits may occur during some attack, or maybe when block production is stopped. It should not even happen when finality is stalled (because FinalityTimeout events shall be triggered and listeners will be dropped; this is still to be tested).

I am still experimenting here a bit, so this code may get reworked.
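For illustration, a minimal sketch of the bookkeeping described above. The future_transaction_views field matches the diff fragment; the ready_transaction_views counterpart and both method signatures are assumptions, not the actual polkadot-sdk code:

```rust
use std::collections::{HashMap, HashSet};

type TxHash = u64;
type BlockHash = u64;

#[derive(Default)]
struct DroppedWatcher {
    // Views in which the transaction is known only as future.
    future_transaction_views: HashMap<TxHash, HashSet<BlockHash>>,
    // Views in which it was seen as ready (or in a block) at least once.
    ready_transaction_views: HashMap<TxHash, HashSet<BlockHash>>,
}

impl DroppedWatcher {
    // A view reported the transaction as Ready or InBlock.
    fn on_ready(&mut self, tx_hash: TxHash, block_hash: BlockHash) {
        let views = self.ready_transaction_views.entry(tx_hash).or_default();
        views.insert(block_hash);
        // Once seen as ready on any fork, treat the transaction as ready
        // everywhere: migrate the entire set of views that reported it as
        // future (this is the move the question above asks about).
        if let Some(future_views) = self.future_transaction_views.remove(&tx_hash) {
            views.extend(future_views);
        }
    }

    // A view was removed. Returns true if the transaction should be dropped
    // from the pool: it was only ever future, and no view references it now.
    fn on_view_removed(&mut self, tx_hash: TxHash, block_hash: BlockHash) -> bool {
        if let Some(views) = self.future_transaction_views.get_mut(&tx_hash) {
            views.remove(&block_hash);
            if views.is_empty() {
                self.future_transaction_views.remove(&tx_hash);
                return !self.ready_transaction_views.contains_key(&tx_hash);
            }
        }
        false
    }
}
```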

Contributor


Okay, I could follow this explanation, but it's kind of complicated behaviour; maybe a comment in the code would help unsuspecting developers :D

@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/12140375234
Failed job name: fmt

@michalkucharczyk (Contributor Author)

/cmd fmt


github-actions bot commented Dec 3, 2024

Command "fmt" has started 🚀 See logs here


github-actions bot commented Dec 3, 2024

Command "fmt" has finished ✅ See logs here

@michalkucharczyk michalkucharczyk added this pull request to the merge queue Dec 3, 2024
Merged via the queue into master with commit 41a5d8e Dec 3, 2024
198 of 201 checks passed
@michalkucharczyk michalkucharczyk deleted the mku-fatxpool-priorities branch December 3, 2024 17:53
Labels
R0-silent (Changes should not be mentioned in any release notes), T0-node (This PR/Issue is related to the topic “node”)