-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OA crash handling to reinitialize port through xcvrd #1432
Conversation
@prgeor @shyam77git @jaganbal-a @amnsinghal - It will be great if you can help in reviewing this HLD |
1. XCVRD main thread init | ||
- XCVRD main thread creates the key CMIS_REINIT_REQUIRED in PORT_TABLE:\<port\> (APPL_DB) with value as true for ports which do NOT have this key present | ||
- XCVRD main thread creates the key MEDIA_SETTINGS_SYNC_STATUS in PORT_TABLE:\<port\> (APPL_DB) with value MEDIA_SETTINGS_DEFAULT for ports which do NOT have this key present. | ||
- For transceivers which do not support media settings, MEDIA_SETTINGS_SYNC_STATUS will stay with value MEDIA_SETTINGS_DEFAULT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: For transceivers which do not "require" media settings, since this not a module feature, it is a requirement on the NPU side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified this now.
| MEDIA_SETTINGS_DONE | Orchagent after applying the SI settings | CmisManagerTask for proceeding to CMIS_STATE_DP_DEINIT from CMIS_STATE_MEDIA_SETTINGS_WAIT | | ||
|
||
2. SfpStateUpdateTask thread will notify the media settings to OA based on the value of PORT_TABLE:\<port\>.MEDIA_SETTINGS_SYNC_STATUS | ||
If PORT_TABLE:\<port\>.MEDIA_SETTINGS_SYNC_STATUS != MEDIA_SETTINGS_DONE, notify media settings will be invoked and will be set to MEDIA_SETTINGS_NOTIFIED for a port supporting media settings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also check for MEDIA_SETTINGS_NOTIFIED also? Could there be a scenario where SfpStateUpdateTask tries to renotify media settings before MEDIA_SETTINGS_DONE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SfpStateUpdateTask could try to renotify the media settings before MEDIA_SETTINGS_DONE if xcvrd crashes just after notifying the media settings (in this case, MEDIA_SETTINGS_SYNC_STATUS would be MEDIA_SETTINGS_NOTIFIED).
In such case, OA would apply the media settings but CmisManagerTask will not perform CMIS initialization (along with handling port toggle message from OA) since CmisManagerTask is already dead.
Eventually, once xcvrd spawn again, SfpStateUpdateTask would renotify the media settings to OA followed by CmisManagerTask proceeding with CMIS initialization.
Hence, to allow port toggle message to be sent from OA to CmisManagerTask and allow CMIS initialization in the above xcvrd crash scenario, I am currently not checking for MEDIA_SETTINGS_NOTIFIED. Do you think there could be an issue if we apply the media settings twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In xcvrd restart scenario, re-applying or re-notifying media settings might have an impact to the system i.e.
OA is just a conduit. Re-applying would lead to re-notifying media_settings from OA->syncd->SAI->SDK and in turn making port toggle.
Should avoid port toggling here
SfpStateUpdateTask ->> SfpStateUpdateTask : event = SFP_STATUS_REMOVED | ||
SfpStateUpdateTask -x STATE_DB : Delete TRANSCEIVER_INFO table for the port | ||
par CmisManagerTask, SfpStateUpdateTask | ||
CmisManagerTask ->> CmisManagerTask : Transition CMIS SM to CMIS_STATE_REMOVED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we set CMIS_REINIT_REQUIRED to True upon module removal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not currently changing the value of CMIS_REINIT_REQUIRED after setting it as False since the purpose of this flag is to drive CMIS re-initialization and not CMIS initialization.
The existing flow in the CMIS SM already ensures that CMIS initialization will be performed after module insertion.
Let me know your thoughts on this.
| Value | Modifier thread and event | Consumer thread and purpose | | ||
|:-----------------------:|:------------------------------------------------------:|:--------------------------------------------------------------------------------------------:| | ||
| MEDIA_SETTINGS_DEFAULT | XCVRD main thread during cold start of xcvrd | XCVRD main thread during boot-up for deciding to notify media settings | | ||
| | SfpStateUpdateTask during transceiver removal | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please correct indentation (unnecessary gaps between words)
Also bullet the two threads for readability, as:
- XCVRD main thread....
- SfpStateUpdateTask thread...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed this now.
|
||
| Value | Modifier thread and event | Consumer thread and purpose | | ||
|:-----------------------:|:------------------------------------------------------:|:--------------------------------------------------------------------------------------------:| | ||
| MEDIA_SETTINGS_DEFAULT | XCVRD main thread during cold start of xcvrd | XCVRD main thread during boot-up for deciding to notify media settings | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this row, I believe consumer thread captures both uses cases - boot-up and transceiver insertion.
In that case, please add 'transceiver insertion' here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a check in OA as well to perform port toggle and set MEDIA_SETTINGS_DONE only if MEDIA_SETTINGS_SYNC_STATUS == MEDIA_SETTINGS_NOTIFIED?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed, SfpStateUpdateTask will perform apply media settings upon transceiver insertion irrespective of the value of MEDIA_SETTINGS_SYNC_STATUS (unlike the case of SfpStateUpdateTask checking the value during boot-up).
3. The OA upon receiving media settings will | ||
- Disable port admin status | ||
- Apply SI settings | ||
- PORT_TABLE:\<port\>.MEDIA_SETTINGS_SYNC_STATUS = MEDIA_SETTINGS_DONE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[1] Please elaborate "Apply SI settings" to something on lines...Ask SAI-SDK to apply SI settings
[2] Add Next bullet - Based on the update/response as SUCCESS from SAI call,
set PORT_TABLE:<port>.MEDIA_SETTINGS_SYNC_STATUS = MEDIA_SETTINGS_DONE
Enable port admin status
In case of failure update/response from SAI call, what would be the system flow? how would the failure be handled?
I beleive host_tx_ready settings is part of this workflow. Please add it here to the workflow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed this now.
- MEDIA_SETTINGS_SYNC_STATUS != MEDIA_SETTINGS_DONE | ||
If all the above conditions are true, CMIS SM transitions to CMIS_STATE_MEDIA_SETTINGS_WAIT state. | ||
If port doesn't require media settings to be applied, CMIS SM will proceed with normal code flow (transitions to CMIS_STATE_DP_DEINIT) | ||
Overall, no functionality change related to CMIS SM transitions is intended for ports not supporting media settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"If port doesn't require media settings to be applied,..."
This should be checked as part of previous point/bullet (i.e. the port supports media settings...)
bail out from there itself (i.e. no further action) in case port doesn't require media settings
``` | ||
|
||
## Test plan and expectation | ||
| Event | APPL_DB cleared | Xcvrd restarted | Media renotify | MEDIA_SETTINGS_SYNC_STATUS value on xcvrd boot-up for initialized transceiver | CMIS re-init triggered | Link flap | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change Media renotify to Media settings renotify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed this now
| Swss restart | Y | Y | Y | MEDIA_SETTINGS_DEFAULT | Y | Y | | ||
| Syncd restart | Y | Y | Y | MEDIA_SETTINGS_DEFAULT | Y | Y | | ||
| config reload | Y | Y | Y | MEDIA_SETTINGS_DEFAULT | Y | Y | | ||
| Cold reboot | Y | Y | Y | MEDIA_SETTINGS_DEFAULT | Y | Y | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config reload and cold reboot triggers/use-cases doesn't have link flap (until link come operationally up and then go down post bring up).
Since all SW components went down along with HW ones (NPU, PHY, optics etc.), so should mark this case as N/A under this link flap column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed this now.
CMIS_STATE_DP_TXON --> CMIS_STATE_DP_ACTIVATE | ||
CMIS_STATE_DP_ACTIVATE --> CMIS_STATE_READY | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not related to this specific design but what is the reason for setting CMIS_STATE_READY when either "host_tx_ready" is not yet TRUE or admin_status is not UP ? Why do we skip to the final state of the CMIS state machine in this use-case ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this was initially implemented, the thought at that time was to move the CMIS SM to a fixed state rather than waiting for an event.
However, if you think we should create a new state to handle host_tx_ready and admin_status rather than moving to CMIS_STATE_READY if either fields have the value false/down respectively, I can plan to implement it accordingly. Let me know your opinion on this.
- XCVRD main thread creates the key CMIS_REINIT_REQUIRED in PORT_TABLE:\<port\> (APPL_DB) with value as true for ports which do NOT have this key present | ||
- XCVRD main thread creates the key MEDIA_SETTINGS_SYNC_STATUS in PORT_TABLE:\<port\> (APPL_DB) with value MEDIA_SETTINGS_DEFAULT for ports which do NOT have this key present. | ||
- For transceivers which do not require media settings, MEDIA_SETTINGS_SYNC_STATUS will stay with value MEDIA_SETTINGS_DEFAULT | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To know that the XCVRD main thread might need to parse "media_settings.json" file. Ports which have no match in this file don't require media settings. Is it the intention to do ? I know that with recent changes parsing/setting of media settings moved from init code of the XCVRD main thread to init of SfpStateUpdate task. May be this setting after parsing of .json file should be done there ? Or MEDIA_SETTING_DEFAULT can be put for all ports regardless media settings for them in .json file ? Then if SfpStateUpdate taks finds data for a port in .json it will set MEDIA_SETTING_NOTIFIED ? What is the impact of having MEDIA_SETTING_DEFAULT for all ports ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To know that the XCVRD main thread might need to parse "media_settings.json" file. Ports which have no match in this file don't require media settings. Is it the intention to do ?
[MP] Yes - if the media settings of a transceiver are not part of "media_settings.json", we would still keep the port with the value MEDIA_SETTING_DEFAULT
I know that with recent changes parsing/setting of media settings moved from init code of the XCVRD main thread to init of SfpStateUpdate task. May be this setting after parsing of .json file should be done there ? Or MEDIA_SETTING_DEFAULT can be put for all ports regardless media settings for them in .json file ?
[MP] I think it is better to initialize this key to MEDIA_SETTING_DEFAULT from xcvrd main thread since this key is used by both SfpStateUpdateTask and CmisManagerTask for handling media settings application for a port and these threads are running parallelly.
Then if SfpStateUpdate taks finds data for a port in .json it will set MEDIA_SETTING_NOTIFIED ? What is the impact of having MEDIA_SETTING_DEFAULT for all ports ?
[MP] Yes, once SfpStateUpdateTask find the media settings in .json file, the value of MEDIA_SETTINGS_SYNC_STATUS changes to MEDIA_SETTING_NOTIFIED. I think there is no harm in setting to MEDIA_SETTING_DEFAULT for all ports irrespective of its media settings requirement from the xcvrd main thread perspective. Also, the advantage with this approach is we do not introduce any race conditions between OA, SfpStateUpdateTask and CmisManagerTask threads related to the role or sequence in modifying the value of MEDIA_SETTINGS_SYNC_STATUS .
|:-----------------------:|:------------------------------------------------------:|:--------------------------------------------------------------------------------------------:| | ||
| MEDIA_SETTINGS_DEFAULT | XCVRD main thread during cold start of xcvrd | XCVRD main thread during boot-up for deciding to notify media settings | | ||
| | SfpStateUpdateTask during transceiver removal | | | ||
| MEDIA_SETTINGS_NOTIFIED | SfpStateUpdateTask while updating the media settings | Not being used currently | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this state used for? no consumer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This state is currently unused. We can use this for debugging purpose for now to check if XCVRD has applied and notified the media settings to OA
…teUpdateTask and CmisManagerTask threads
…g module v/s NPU SI settings in future
end | ||
``` | ||
|
||
## XCVRD termination during syncd/swss/orchagent crash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if any SFF complaint transceiver requires different SI value to be programmed to NPU. eg copper module with different length without ANLT may need a different SI on NPU serdes.
This approach is tied with cmis manager/CMIS complaint modules only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The below will take care of handling SFF compliant transceivers requiring different NPU SI settings. Let me know if I am missing something here.
"For non-CMIS SM driven transceivers, SfpStateUpdateTask thread will update NPU SI settings in the PORT_TABLE (APPL_DB) and notify to OA based on the value of PORT_TABLE:.NPU_SI_SETTINGS_SYNC_STATUS"
6. XCVRD will subscribe to PORT_TABLE in APPL_DB and trigger self-restart if the PORT_TABLE is deleted for the namespace. | ||
All threads will be gracefully terminated and xcvrd deinit will be performed followed by issuing a SIGABRT to ensure XCVRD is restarted automatically by supervisord. After respawn, CMIS re-init and NPU_SI_SETTINGS notified is triggered for the ports belonging to the affected namespace | ||
|
||
7. syncd/swss/orchagent restart clears the entire APPL-DB, including “NPU_SI_SETTINGS_SYNC_STATUS” and "CMIS_REINIT_REQUIRED" in PORT_TABLE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is referring to "syncd/swss/orchagent crash" handling scenario.
Here "syncd/swss/orchagent restart" is mentioned.
- Does 'restart' of these docker containers has the same workflow?
- Please check from present codebase/image working state as well and ensure same/different handling (if any)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, 'restart' of these docker containers has the same workflow.
I have now verified from that restarting swss indeed clears the APPL_DB. I have also added "restart swss" as part of my test plan to cover this scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This solution looks compliated. How about following:
- If xcvrd crashes and restarts, it always notify media settings. And orchagent should keep a cache of media settings, only apply media settings when configuration changes
- If a cable is plug out and plug in, xcvrd should remove media settings from APP DB and re-apply it.
- if orchagent crashes, it should re-read from redis automatically, so, no issue.
Please share your thought. Thanks.
Hi Junchao, With the above suggestion, we will not be able to hold CMIS initialization (for CMIS based modules) while the NPU SI settings are being applied by the orchagent. To solve this issue (link flap), the current HLD has a separate code path for CMIS modules to notify orchagent with NPU SI settings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mihirpat1 can you add revision 1.0 to the Revision table at the top of this document?
@mihirpat1 please update the PR details (including OA PR as well) |
In addition to the above, XCVRD can miss the port toggle message being sent from orchagent to xcvrd if the CmisTaskManager thread is busy handling other tasks. |
|
||
When syncd/swss/orchagent crashes, all ports in the corresponding namespace will be reinitialized by xcvrd irrespective of the current state of the port. All the corresponding ports are expected to experience link down until the initialization is complete. | ||
If just xcvrd crashes and restarts, then forced re-initialization (CMIS reinit + NPU SI settings notification) of ports will not be performed. Hence, the ports will not experience link downtime during scenario. | ||
CMIS_REINIT_REQUIRED and NPU_SI_SETTINGS_SYNC_STATUS keys in PORT_TABLE:\<port\> (APPL_DB) are used to determine if port re-initialization is required or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mihirpat1 any specific reason why APPL_DB and not STATE_DB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the description for it now.
| Value | Modifier thread and event | Consumer thread and purpose | | ||
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- | | ||
| NPU_SI_SETTINGS_DEFAULT | 1\. XCVRD main thread during cold start of XCVRD<br>2. SfpStateUpdateTask during transceiver removal | XCVRD main thread during boot-up for deciding to notify NPU SI settings | | ||
| NPU_SI_SETTINGS_NOTIFIED | 1\. SfpStateUpdateTask while updating and notifying the NPU SI settings for non-CMIS SM driven transceivers<br> 2. CmisManagerTask while updating and notifying the NPU SI settings for CMIS SM driven transceivers | Not being used currently | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also mention in the table we SfpStateUpdateTask for notifying SI setting for non-CMIS transceviers for historical reason
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the description for it now.
|
||
5. The CmisManagerTask thread will set “CMIS_REINIT_REQUIRED" to false after CMIS SM reaches to a steady state (CMIS_STATE_UNKNOWN, CMIS_STATE_FAILED, CMIS_STATE_READY and CMIS_STATE_REMOVED) for the corresponding port | ||
|
||
6. XCVRD will subscribe to PORT_TABLE in APPL_DB and trigger self-restart if the PORT_TABLE is deleted for the namespace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mihirpat1 in which all cases PORT_TABLE deletion happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have covered the same in "Test plan and expectation" section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mihirpat1 can you also describe the rationale behind restarting Xcvrd in case swss/syncd/OA restart/crash?
If just xcvrd crashes and restarts, then forced re-initialization (CMIS reinit + NPU SI settings notification) of ports will not be performed. Hence, the ports will not experience link downtime during scenario. | ||
CMIS_REINIT_REQUIRED and NPU_SI_SETTINGS_SYNC_STATUS keys in PORT_TABLE:\<port\> (APPL_DB) are used to determine if port re-initialization is required or not. | ||
- CMIS_REINIT_REQUIRED key states if CMIS re-initialization is required for a port after xcvrd is spawned. CMIS_REINIT_REQUIRED helps in mainly driving CMIS re-initialization after syncd/swss/orchagent crash since it will allow reinitializing ports belonging to the relevant namespace of the crashing process. This key is not planned to drive CMIS initialization after transceiver insertion. | ||
- NPU_SI_SETTINGS_SYNC_STATUS key is used as a means to communicate the status of applying NPU SI settings for a transceiver requiring NPU SI settings. This key is used to update the NPU SI settings application status between SfpStateUpdateTask, CmisManagerTask and Orchagent. In case of warm reboot or xcvrd restart, this key will prevent application of NPU SI settings on the port if the settings are already applied. In case of transceiver insertion, NPU SI settings will be applied irrespective of the NPU SI settings application status for the port. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mihirpat1 this key prevents additional link flap ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the description for it now.
I have added it now. |
Added OA PR for now. Will add other PRs once I upload. |
@eddyk-nvidia - It will be great if you can help in reviewing this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
In case of continuous restart of xcvrd, both the keys will still hold the same value as before the restart. This would ensure that the port re-initialization is resumed from the last known state. | ||
|
||
Rationale for choosing APPL_DB over STATE_DB for storing CMIS_REINIT_REQUIRED and NPU_SI_SETTINGS_SYNC_STATUS keys | ||
- APPL_DB is the first DB which gets cleared as part of SWSS crash handling. This inturn helps XCVRD to detect DB removal and terminate child threads and itself earlier than other DBs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mihirpat1 should we reconsider APPL_DB because, i do see oper status in APPL_DB as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prgeor - I have now updated the HLD with the usage of STATE_DB for storing the NPU_SI_SETTINGS_SYNC_STATUS
and CMIS_REINIT_REQUIRED
keys.
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- | | ||
| NPU_SI_SETTINGS_DEFAULT | 1\. XCVRD main thread during cold start of XCVRD<br>2. SfpStateUpdateTask during transceiver removal | XCVRD main thread during boot-up for deciding to notify NPU SI settings | | ||
| NPU_SI_SETTINGS_NOTIFIED | 1\. SfpStateUpdateTask while updating and notifying the NPU SI settings for non-CMIS SM driven transceivers (this approach was chosen to preserve the existing behavior for non-CMIS SM driven transceivers)<br> 2. CmisManagerTask while updating and notifying the NPU SI settings for CMIS SM driven transceivers | Not being used currently | | ||
| NPU_SI_SETTINGS_DONE | Orchagent after applying the SI settings | CmisManagerTask for proceeding from CMIS_STATE_NPU_SI_SETTINGS_WAIT to CMIS_STATE_DP_INIT state | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mihirpat1 does platform that does not need SI settings, do those port's CMIS state machine transition to DP init?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - For platforms or ports not requiring NPU SI settings, the CMIS state machine will transition to DP init state (i.e. skip transition to CMIS_STATE_NPU_SI_SETTINGS_WAIT)
* OA crash handling to reinitialize port through xcvrd * Added table to list values for MEDIA_SETTINGS_SYNC_STATUS * Fixed typo * Addressed PR comment * Addressed PR comments and separated media notification between SfpStateUpdateTask and CmisManagerTask threads * Modified media settings to NPU SI settings for clearly differentiating module v/s NPU SI settings in future * Fixed typo * Addressed PR comments * Reverted unrelated changeset * Addressed PR comments * Added pre-requisite section and added testcase for OA restart * Corrected typo in XCVRD init sequence diagram * Added description and functioning related to is_npu_si_settings_update_required * Fixed typo * Addressed review comments * Replaced usage of PORT_TABLE for storing the keys from APPL_DB to STATE_DB
Proposal to reinitialize ports through xcvrd during OA crash.
The current proposal is created to resolve sonic-net/sonic-platform-daemons#356