Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config-manager: allow configuring NRI timeouts. #318

Merged

Conversation

klihub
Copy link
Collaborator

@klihub klihub commented May 17, 2024

Allow setting NRI plugin registration and request timeouts in the CRI-O or containerd configuration files. Require neither or both timeouts to be given. If both are given, verify that registration timeout is larger than the request timeout.

@klihub klihub requested a review from fmuyassarov May 17, 2024 09:02
@klihub klihub force-pushed the devel/config-manager/nri-plugin-timeouts branch from 7cd6cf4 to b19dce3 Compare May 17, 2024 11:24
@klihub
Copy link
Collaborator Author

klihub commented May 17, 2024

@fmuyassarov Looking at the updated docs, one question immediately came to my mind related to the updated helm charts. Considering the new timeout values in Values.nri.config, would it be better/more consistent to use .Values.nri.config.patch: true instead of the current .Values.nri.patchConfig ?

@klihub klihub marked this pull request as ready for review May 17, 2024 12:32
@klihub klihub requested review from fmuyassarov and askervin May 17, 2024 12:33
Copy link
Collaborator

@fmuyassarov fmuyassarov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks already good. I did a quick review and will do it again. Left some minor comments or nits.

cmd/config-manager/main.go Outdated Show resolved Hide resolved
cmd/config-manager/main.go Outdated Show resolved Hide resolved
cmd/config-manager/main.go Show resolved Hide resolved
deployment/helm/memory-qos/README.md Show resolved Hide resolved
deployment/helm/memory-qos/templates/daemonset.yaml Outdated Show resolved Hide resolved
deployment/helm/template/templates/daemonset.yaml Outdated Show resolved Hide resolved
deployment/helm/template/values.yaml Outdated Show resolved Hide resolved
deployment/helm/topology-aware/templates/daemonset.yaml Outdated Show resolved Hide resolved
deployment/helm/topology-aware/values.yaml Outdated Show resolved Hide resolved
deployment/operator/README.md Outdated Show resolved Hide resolved
@fmuyassarov
Copy link
Collaborator

fmuyassarov commented May 17, 2024

@fmuyassarov Looking at the updated docs, one question immediately came to my mind related to the updated helm charts. Considering the new timeout values in Values.nri.config, would it be better/more consistent to use .Values.nri.config.patch: true instead of the current .Values.nri.patchConfig ?

Yes, makes sense. That way we are more consistent indeed. I would also rather rename word patch to something more descriptive but at the same time I don't have a good suggestion to offer :)

UPDATE: A slight problem (if we can really call it a problem) is that adding .config. makes it more deeper which might be something that users don't like. But that might be not be true really. Just thinking out loud.

@klihub klihub force-pushed the devel/config-manager/nri-plugin-timeouts branch 3 times, most recently from 3a7af7c to 9a3fbc0 Compare May 17, 2024 17:27
@klihub
Copy link
Collaborator Author

klihub commented May 17, 2024

UPDATE: A slight problem (if we can really call it a problem) is that adding .config. makes it more deeper which might be something that users don't like. But that might be not be true really. Just thinking out loud.

Yes, I had the reaction when updating the daemonset files with templating conditionals. I still decided to stick with it for consistency.

@klihub klihub requested a review from fmuyassarov May 17, 2024 17:56
@klihub
Copy link
Collaborator Author

klihub commented May 17, 2024

@fmuyassarov @askervin I fixed a few gotchas we neither noticed nor thought of. Fixed those after testing this in a cluster. I'd consider it a good enough first try unless you guys spot something I still missed.

@klihub klihub marked this pull request as draft May 17, 2024 19:09
Allow setting NRI plugin registration and request timeouts in
the CRI-O or containerd configuration files. Require neither
or both timeouts to be given. If both are given, verify that
registration timeout is larger than the request timeout.

Signed-off-by: Krisztian Litkey <[email protected]>
@klihub klihub force-pushed the devel/config-manager/nri-plugin-timeouts branch from 9a3fbc0 to a1ed304 Compare May 17, 2024 19:31
@klihub klihub marked this pull request as ready for review May 17, 2024 19:31
@klihub
Copy link
Collaborator Author

klihub commented May 17, 2024

@fmuyassarov @askervin It would be nice if we could enforce with the values JSON schema that either both or neither configuration timeout parameters should be set. But I suspect the schema might not be able to express such dependencies...

@klihub klihub force-pushed the devel/config-manager/nri-plugin-timeouts branch from a1ed304 to b4fde25 Compare May 20, 2024 09:23
@klihub klihub marked this pull request as draft May 20, 2024 09:33
Update all plugin Helm charts, to allow the config-manager to
patch plugin registration and request timeouts in the runtime
configuration.

Aiming for more consistent value structure by switching from
these NRI-related bits:

  nri:
    patchRuntimeConfig: false
    pluginIndex: 90

to these new ones:

  nri:
    plugin:
      index: 90
    runtime:
      patchConfig: false
      config:
        pluginRegistrationTimeout: 5s
        pluginRequestTimeout: 2s

There would be room for further later improvements by moving
the bits for enabling plugin test APIs to nri.plugin, too...

Signed-off-by: Krisztian Litkey <[email protected]>
@klihub klihub force-pushed the devel/config-manager/nri-plugin-timeouts branch from b4fde25 to 843bf29 Compare May 20, 2024 09:43
@klihub klihub marked this pull request as ready for review May 20, 2024 09:43
@klihub klihub force-pushed the devel/config-manager/nri-plugin-timeouts branch from 843bf29 to 05d3d66 Compare May 20, 2024 10:00
@klihub klihub requested a review from askervin May 20, 2024 10:10
@klihub klihub requested a review from fmuyassarov May 20, 2024 10:10
klihub added 2 commits May 21, 2024 10:51
Update runtime config patching bits in CRD.

Signed-off-by: Krisztian Litkey <[email protected]>
Update to use updated Values.nri.runtime.patchConfig instead of
Values.nri.patchRuntimeConfig.

Signed-off-by: Krisztian Litkey <[email protected]>
@klihub klihub force-pushed the devel/config-manager/nri-plugin-timeouts branch from 05d3d66 to d63bba3 Compare May 21, 2024 07:53
Copy link
Collaborator

@fmuyassarov fmuyassarov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@klihub klihub requested review from marquiz and fmuyassarov May 21, 2024 10:15
@fmuyassarov fmuyassarov merged commit 1f45c70 into containers:main May 21, 2024
3 checks passed
@klihub klihub deleted the devel/config-manager/nri-plugin-timeouts branch May 22, 2024 07:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants