Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store provided environment of key/value kind into a k8s Secret owned by the spawned pod #471

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

tirumerla
Copy link

@tirumerla tirumerla commented Dec 23, 2020

what

  • Create a secret with all the env. variables and bind this to the pod

why

  • Currently single user pod manifest show all the env. variables in plain text. To fix this issue and to follow the security standards and best practices all the env. variables are pulled into a secret which gives more flexibility for access policies.

process

  • Capture all the environment variables, split them into two dictionaries. One containing env. variables with just the "value" and the other containing env. variables with "valueFrom". Create a secret for each user and pass the 1st dictionary as stringData and Create a list of env. variables from the 2nd dictionary. At container creation pass both env ( list of env. variables from 2nd dict ) and envFrom ( secret holding all the env. variables from 1st dict ).

references

See #398 for details.

@welcome
Copy link

welcome bot commented Dec 23, 2020

Thanks for submitting your first pull request! You are awesome! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly.
welcome
You can meet the other Jovyans by joining our Discourse forum. There is also a intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@consideRatio
Copy link
Member

consideRatio commented Dec 23, 2020

Quick note before i forget: there is work done in #409 about a k8s secret resource to go with the pod. Edit: aaah nice you have seen!


I want us to still allow for env set through refs to other secrets etc. I have not yet inspected the code (on mibile atm), but id like some discussion on the choice for the API you suggest kubespawner should implement.

@tirumerla
Copy link
Author

Edit: aaah nice you have seen!

Thanks @consideRatio, Yep made changes on top of this.

I want us to still allow for env set through refs to other secrets etc. I have not yet inspected the code (on mibile atm), but id like some discussion on the choice for the API you suggest kubespawner should implement.

Sounds good, let me know once you review the code. We can take it from there based on your feedback. :)

@tirumerla
Copy link
Author

Hey @consideRatio @yuvipanda, Happy new year :) Did you guys get a chance to look at this?

@consideRatio
Copy link
Member

@tirumerla thank you for your work. Is it okay if I suggest what could cause issues, and ask you to comment if you think there will be issues?

Main concern, will this PR stop supporting usage of the feature developed in #426 which was a long standing feature requested? Imagine that the user wants to set environment variables, and have them be set through the secret, but also be able to set any kind of environment variable directly - such as a reference to a configmap or another secret? Will this PR cause such conflicts?

I hope you can excuse me for being a bit lazy and not investigating this in detail by code inspection!

@consideRatio consideRatio changed the title read env from secrets Store provided environment of key/value kind into a k8s Secret owned by the spawned pod Jan 5, 2021
@tirumerla
Copy link
Author

Will this PR cause such conflicts?

@consideRatio Thank you for your response. This is a good point and a great catch - I will have to make some changes, any environment variable that is passed needs to be validated in get_env and construct the environment dict which is then passed to make_secret as opposed to validating in make_pod

I hope you can excuse me for being a bit lazy and not investigating this in detail by code inspection!

Not a problem :)

@tirumerla
Copy link
Author

@consideRatio I have updated my code to allow the feature developed in #426. Let me know what you think about it.

'key': 'password',
},
},
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you ensure old tests are retained and only update what changed? I believe the output 581-584 is what really change, but not the others, right?

                        {
                            'name': 'TEST_KEY_1',
                            'value': 'TEST_VALUE',
                        },

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you ensure old tests are retained and only update what changed?

Yes good point, fixing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the tests, added one more test to spawner to verify 'valueFrom' env. variables.

@@ -671,9 +657,8 @@ def make_owner_reference(name, uid):

def make_secret(
name,
str_data,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let this be named string_data to not create a deviation from the naming convention of k8s secret resources.

env['JUPYTERHUB_SSL_CERTFILE'] = self.secret_mount_path + "ssl.crt"
env['JUPYTERHUB_SSL_CLIENT_CA'] = (self.secret_mount_path + "notebooks-ca_trust.crt")

return env, prepared_env
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider introducing get_env_from alongside get_env here, I dislike the complexity we embed in this by returning a tuple.

I think prepared_env also should be named more appropriately now that it means specifically env_from.

The Container object has env and envFrom, both being arrays, but of the kind EnvVar and EnvFromSource.

I'd love for this source code to be simpler to understand rather than more complicated, of course adding this feature will make it more complicated no matter what, but how to minimize the complexity of understanding this code in general in the future is a focus of mine.

I have no concrete suggested action =/ but thats what goes in my mind while looking at this code base which I have struggled to understand in the past and think is quite hard to grasp still.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider introducing get_env_from alongside get_env here, I dislike the complexity we embed in this by returning a tuple.

Yes, this is a good point. I thought about this, creating a new get_env_from just for env. variables using "valueFrom" but in the moment i went with getting something up quicker rather than simpler :). You're right, i will take a look at this to make it simpler.

@@ -13,6 +13,7 @@ def test_make_simplest_pod():
assert api_client.sanitize_for_serialization(make_pod(
name='test',
image='jupyter/singleuser:latest',
env=[],
cmd=['jupyterhub-singleuser'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a consequence of passing an empty array to the V1Container object, can you make it not happen for both env and env_from in case those are empty arrays?


return env

def get_env_value_from(self):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@consideRatio I've created a separate get_env_value_from to filter only the env. variables which has value_from, hence the name :). Now get_env will filter only the env. variables with values but not value_from. Also i'm using V1EnvVar in this since we already have the right format for it to process.

For example.
This "TEST_KEY_2": {'valueFrom': {'secretKeyRef': {'name': 'my-test-secret', 'key': 'password'}}} being formatted as {"name": "TEST_KEY_2", "value_from": {'secretKeyRef': {'name': 'my-test-secret', 'key': 'password'}}}. I'm creating list of these env. variables containing value_from which are then passed to V1Container object in env variable.

@tirumerla
Copy link
Author

I have also tested this internally creating configmap of changed files spawner.py and objects.py , mounting these on the hub pod, logged in as a single user and spawned the single user pod. I saw the secret with all env. variables getting created, I was also able to test executing env from my notebook cell on the single-user pod to verify if i can see all the env. variables from secret and also "valueFrom".

@tirumerla
Copy link
Author

hi @consideRatio any updates on this?

@consideRatio
Copy link
Member

Hi @tirumerla! Sorry I have no update from my end.

I think this feature adds value but it also add quite significant complexity related to a not so well tested feature (internal_ssl, and the recent addition of a k8s secret). Due to this, I've not personally given it much priority.

@tirumerla
Copy link
Author

Hi @tirumerla! Sorry I have no update from my end.

I think this feature adds value but it also add quite significant complexity related to a not so well tested feature (internal_ssl, and the recent addition of a k8s secret). Due to this, I've not personally given it much priority.

Thanks @consideRatio. This makes sense. We will use this feature internally for now then.

@sstarcher
Copy link
Contributor

This would be a nice security improvement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants