-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NROP CR status is not aligned with the specs #920
Comments
shajmakh
added a commit
to shajmakh/numaresources-operator
that referenced
this issue
Oct 17, 2024
We have been seeing several cases where the operator status doesn't get updated correctly yet it's been hard to reproduce and track down the reason (see https://issues.redhat.com/browse/OCPBUGS-16058 & https://issues.redhat.com/browse/CNF-9080). During the updates of the controller tests to extend the coverage of reflecting the node group config under the status, with further debugging it turned out that updating the operator status was skipped in case of 2 similar successive conditions. This is wrong because each condition status can vary with the reasons, for example, the operator is successful with one set of NodeGroupConfig and later is configured with another successful set of settings, which we still want to reflect. Fortunately this is just a status reflection that explains why the RTE daemonset for example gets updated even if the status on the NROP CR is not up-to-date. Fix this by relaxing the updateStatus() function not to make it a hard requirement on the successive conditions to unmatch, and thus allowing always update the status after the end of each reconciliation loop. candidate fix for openshift-kni#920 Signed-off-by: Shereen Haj <[email protected]>
shajmakh
added a commit
to shajmakh/numaresources-operator
that referenced
this issue
Oct 17, 2024
We have been seeing several cases where the operator status doesn't get updated correctly yet it's been hard to reproduce and track down the reason (see https://issues.redhat.com/browse/OCPBUGS-16058 & https://issues.redhat.com/browse/CNF-9080). During the updates of the controller tests to extend the coverage of reflecting the node group config under the status, with further debugging it turned out that updating the operator status was skipped in case of 2 similar successive conditions. This is wrong because each condition status can vary with the reasons, for example, the operator is successful with one set of NodeGroupConfig and later is configured with another successful set of settings, which we still want to reflect. Fortunately this is just a status reflection that explains why the RTE daemonset for example gets updated even if the status on the NROP CR is not up-to-date. Fix this by relaxing the updateStatus() function not to make it a hard requirement on the successive conditions to unmatch, and thus allowing always update the status after the end of each reconciliation loop. candidate fix for openshift-kni#920 Signed-off-by: Shereen Haj <[email protected]>
shajmakh
added a commit
to shajmakh/numaresources-operator
that referenced
this issue
Oct 17, 2024
We have been seeing several cases where the operator status doesn't get updated correctly yet it's been hard to reproduce and track down the reason (see https://issues.redhat.com/browse/OCPBUGS-16058 & https://issues.redhat.com/browse/CNF-9080). During the updates of the controller tests to extend the coverage of reflecting the node group config under the status, with further debugging it turned out that updating the operator status was skipped in case of 2 similar successive conditions. This is wrong because each condition status can vary with the reasons, for example, the operator is successful with one set of NodeGroupConfig and later is configured with another successful set of settings, which we still want to reflect. Fortunately this is just a status reflection that explains why the RTE daemonset for example gets updated even if the status on the NROP CR is not up-to-date. Fix this by relaxing the updateStatus() function not to make it a hard requirement on the successive conditions to unmatch, and thus allowing always update the status after the end of each reconciliation loop. candidate fix for openshift-kni#920 Signed-off-by: Shereen Haj <[email protected]>
shajmakh
added a commit
to shajmakh/numaresources-operator
that referenced
this issue
Oct 21, 2024
We have been seeing several cases where the operator status doesn't get updated correctly yet it's been hard to reproduce and track down the reason (see https://issues.redhat.com/browse/OCPBUGS-16058 & https://issues.redhat.com/browse/CNF-9080). During the updates of the controller tests to extend the coverage of reflecting the node group config under the status, with further debugging it turned out that updating the operator status was skipped in case of 2 similar successive conditions. This is wrong because each condition status can vary with the reasons, for example, the operator is successful with one set of NodeGroupConfig and later is configured with another successful set of settings, which we still want to reflect. Fortunately this is just a status reflection that explains why the RTE daemonset for example gets updated even if the status on the NROP CR is not up-to-date. Fix this by relaxing the updateStatus() function not to make it a hard requirement on the successive conditions to unmatch, and thus allowing always update the status after the end of each reconciliation loop. candidate fix for openshift-kni#920 Signed-off-by: Shereen Haj <[email protected]>
shajmakh
added a commit
to shajmakh/numaresources-operator
that referenced
this issue
Oct 21, 2024
We have been seeing several cases where the operator status doesn't get updated correctly yet it's been hard to reproduce and track down the reason (see https://issues.redhat.com/browse/OCPBUGS-16058 & https://issues.redhat.com/browse/CNF-9080). During the updates of the controller tests to extend the coverage of reflecting the node group config under the status, with further debugging it turned out that updating the operator status was skipped in case of 2 similar successive conditions. This is wrong because each condition status can vary with the reasons, for example, the operator is successful with one set of NodeGroupConfig and later is configured with another successful set of settings, which we still want to reflect. Fortunately this is just a status reflection that explains why the RTE daemonset for example gets updated even if the status on the NROP CR is not up-to-date. Fix this by relaxing the updateStatus() function not to make it a hard requirement on the successive conditions to unmatch, and thus allowing always update the status after the end of each reconciliation loop. candidate fix for openshift-kni#920 Signed-off-by: Shereen Haj <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
steps to reproduce:
this is not particularly to Tolerations but can happen with the other config fields too.
The text was updated successfully, but these errors were encountered: