Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flow resimulation HLD. #460

Closed
wants to merge 18 commits into from
Closed

Conversation

r12f
Copy link
Collaborator

@r12f r12f commented Nov 11, 2023

This is the follow up flow resimulation design for packet flow: #449.

Motivation

In current DASH pipeline, the flow resimulation is designed to rely on a incomplete flow table (ConnTrack Lookup stage), so both ACL and mapping table update can take effect immediately. But this design could not provide a good way to:

  1. Describe or leverage the fast path in the ASIC.
  2. Resimulate all flows in a pipeline.
  3. Provide a generic way for policy-based flow resimulation.
  4. Support learning-based flow resimulation, such as tunnel learning.
  5. Handle cases where we would like to maintain PCC.
  6. ...

Changes

The HLD is trying to describe how flow resimulation should work and the changes that we need in the pipeline.

The flow API doc is not merged into DASH yet. For reference, here is the PR, which will be moved into DASH repo later: sonic-net/SONiC#1483.

@r12f r12f changed the title Flow resimuation HLD. Flow resimulation HLD. Nov 11, 2023
@r12f
Copy link
Collaborator Author

r12f commented Nov 16, 2023

Hi @budgrise and @chrispsommers , I have updated the doc to add new ways for reduce the memory usage of flow tracking key in the match stage entry, as well as added the clarification for hash table. Feel free to check it out again and hopefully this addresses your concerns.

documentation/dataplane/dash-flow-resimulation.md Outdated Show resolved Hide resolved

```json
"DASH_SAI_ENI_TABLE|123456789012": {
"flow_incarnation_id": 0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides ID, a max resimulation rate should be added to limit the burst of resimulated flows, as they usually compete with the CPS of new flows. A user should be able to tune this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is actually a good idea. the only thing is that we could not drop the request and have to queue the request that cannot be handled during the time. not sure will this cause the queue being built up and causing problems.

also, do we need this if passive tracking is used? since all request are batched and queued already.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the second thought, this should be captured as part of the scaling requirement for DASH. the real question is - should this count as part of CPS or not. I will go and confirm this, then come back.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rate limiting takes into assumption asynchronous requests (packets are mirrored instead of queueing the original).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have confirmed that it is indeed counted as part of CPS requirement, so I have added it into the scaling requirement in sonic-dash-hld.

Also updated the rate limiting mechanism and design too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marian-pritsak , not sure this is nature enough for networking scenarios. although I feel token bucket is a good choice for this scenario, but it doesn't seem to be matching any existing rate limiting mechanisms.


Some flows can be very low volume and causing flow incarnation id stops working. Say, if the flow incarnation id is 8 bits, and not it is set to 1, a low volume flow is created. And after 256 flow resimulation calls, the next packet finally arrives and see the same id, which bypasses the flow resimulation process.

To solve this, whenever the incarnation id overflows, we will need to treat all flows as resimulated. Implementation-wise, this can be done by adding another bit to indicate overflow happened, then reset the flow incarnation id to 0 in each flow during the next flow aging process. After flow aging is done, we can reset the overflow bit.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would the user know when to clear the overflow flag?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the overflow flag will be cleaned up by data plane app after flow aging process is completed. it mostly works as full flow resimulation requested flag. maybe we should just use that instead of introducing a new flag.

ideally, we should use increase on these attributes, but SAI doesn't support this operation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, rename the flag should make things more clear. will update the doc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the flag is renamed also explain how the workflow works.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then how does a user know the flow resimulation finished?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, i guess we will also need to add a callback on the switch then. otherwise, the user won't be able to know indeed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although, what would be the granularity for reporting. and how to identify which request we are working... I need to think more on that.

documentation/dataplane/dash-flow-resimulation.md Outdated Show resolved Hide resolved
documentation/dataplane/dash-flow-resimulation.md Outdated Show resolved Hide resolved
documentation/dataplane/dash-flow-resimulation.md Outdated Show resolved Hide resolved
documentation/dataplane/dash-flow-resimulation.md Outdated Show resolved Hide resolved
* @default 0
*/
SAI_ENI_ATTR_FLOW_INCARNATION_ID,

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering why the id is exposed through the SAI API. Shouldn't the size of the id be a choice of the vendor, e.g. whether it is 8 bits, 16 bits, 7 bits or some other efficient encoding? Could this SAI API instead be a "pulse" or request that says that the id should be incremented? This means the upper layers would not need to track the id and when wrap happens.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also wondering why resimulation (even if just a request) like in my previous comment is part of the SAI API. Could the vendor's code determine when a resimulation is needed? For example, a SAI call changes the routing table for an ENI. As a result of this, the vendor code increments the ENI.id. Changes could be batched for efficiency.

Is the SAI API specified like this because the SDN controller is in charge of resimulation? In other words, the controller is going to configure multiple objects through the SAI API and then increment the resimulation id?

How does the SDN controller make the id wrap/"full resimulation" a rare case? Is it rare because (say with an 8-bit id) it happens only 1 out of every 256 id increments? Instead, is it possible that the id can be incremented slowly enough (i.e. batches are large enough) that it can be guaranteed that the overlap case never happens?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, what you are saying it right. i think the same too but haven't get the chance to update the doc yet. will do.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the doc is updated now.

@r12f r12f force-pushed the user/r12f/resimulation branch from c9180b3 to da87947 Compare December 8, 2023 01:06
@r12f
Copy link
Collaborator Author

r12f commented Mar 11, 2024

due to new requirement showing up, i am going to close this PR for now and we will come back later.

@r12f r12f closed this Mar 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants