Add AWS ECR support (round two) #1055
…support # Conflicts: # binderhub/app.py
…gistry for secret management
…nt ConfigException when outside k8s
Cc @scottyhq
Yes, we are running this code on AWS with ECR and no further modifications.
@default("ecr_client")
def _get_ecr_client(self):
    return boto3.client("ecr", region_name=self.aws_region)
Is boto3 async?
Not officially. There are third-party wrappers like aioboto3 - https://github.com/terrycain/aioboto3 - but we may not want to go there.
BinderHub is written using Tornado, so we can't make network calls with a library that blocks. If we do, the whole BinderHub process will block while that network request is in flight. We either need to use a threadpool to execute the boto3 calls or use an async library that you can await.
Depending on the complexity of the requests you are making, another option would be to implement the HTTP call yourself using the AsyncHTTPClient class.
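For illustration, here is a minimal sketch of the threadpool option using only the standard library. It is not the code from this PR: `blocking_fetch_token` is a hypothetical stand-in for a blocking SDK call such as a boto3 request, and the region name and timings are made up. Tornado runs on the asyncio event loop, so the same `run_in_executor` pattern should apply inside a BinderHub handler.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_fetch_token(region):
    """Hypothetical stand-in for a blocking SDK call (e.g. a boto3 request)."""
    time.sleep(0.2)  # simulates network latency
    return f"token-for-{region}"


# A small shared pool; the size here is an arbitrary choice for the sketch.
executor = ThreadPoolExecutor(max_workers=2)


async def fetch_token(region):
    loop = asyncio.get_running_loop()
    # The blocking call runs in a worker thread, so the event loop stays free.
    return await loop.run_in_executor(executor, blocking_fetch_token, region)


async def heartbeat(ticks):
    # Records each time the event loop gets to run while the call is in flight.
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def main():
    ticks = []
    token, _ = await asyncio.gather(fetch_token("us-east-1"), heartbeat(ticks))
    return token, len(ticks)


token, tick_count = asyncio.run(main())
print(token, tick_count)
```

If `blocking_fetch_token` were called directly inside the coroutine instead, the heartbeat could not run until the call returned, which is the blocking behaviour described above.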
Ah, I see. There is really only one request that needs to be made currently, and it's relatively simple to do ourselves. Unless you see value in integrating something like aioboto3, we can implement the request ourselves using AsyncHTTPClient. Drawing inspiration from elsewhere, it seems this is the route that FargateSpawner for JupyterHub took as well when making AWS calls - ref: https://github.com/uktrade/fargatespawner/blob/c614a54ffd80d0fb8886d1ef9e8de2c938de7759/fargatespawner/fargatespawner.py#L322-L346.
If you approve of this path, I will start work on implementing and testing it.
Sounds good to me.
After digging into AWS's request signing process - ref: https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html - I realized that reimplementing the request ourselves may not be the best approach. Additionally, the Kubernetes call is also not async, so I switched to using a threadpool as you suggested. It has now been implemented and tested.
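To give a sense of why hand-rolled signing adds up, here is a rough sketch of just one step of Signature Version 4: deriving the signing key through a chain of HMAC-SHA256 operations. It is not code from this PR. The secret key, date, and string to sign are placeholder values, and a real request would also need a canonical request, a properly formatted string to sign, and the matching Authorization header.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key: date -> region -> service -> 'aws4_request'."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Hex-encoded HMAC-SHA256 of the string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder inputs for illustration only; not real credentials.
key = derive_signing_key("EXAMPLE-SECRET-KEY", "20200101", "us-east-1", "ecr")
signature = sign(key, "placeholder string to sign")
print(len(key), len(signature))
```

Building the canonical request and string to sign correctly is the more error-prone part, which is consistent with the decision to keep boto3 and move its calls onto a threadpool.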
# Conflicts: # requirements.txt
…nt ConfigException when outside k8s
…rhub into aws-ecr-support
Rebase to current and requested modifications
for more information, see https://pre-commit.ci
This PR sparked a lot of interest in the past, and it would be really neat to allow Binder to run with AWS ECR.
As for the mock testing, I fully agree that this would certainly be nice to have. However, the Moto library has not implemented all the calls that would be necessary to test here. I looked for other solutions but could not come up with much. To reiterate, my organization has been using this fork for a while and it has been working perfectly. Maybe @teticio or @TomasBeuzen can confirm. I will be sure to test again with this updated version and report back. Would love to hear some feedback!
Coming back to this after a while to take a fresh look, and after seeing #1506 (comment), I was wondering if we could move the ECR token request to the Builder and pass in the credentials as an environment variable or volume instead of overwriting the Kubernetes secret. It's a bigger change, but I've just done some work to refactor the build classes (#1518), and I think further changes to KubernetesBuildExecutor to support custom registries or passing tokens are reasonable, especially if the outcome is cleaner code. A few ideas:
@yuvipanda what do you think? I'm leaning towards option (a), and modifying repo2docker to optionally login to the registry before pushing, but I'm very keen to hear other ideas.
This is the sort of repo2docker change I had in mind if we go down the route of passing the credentials into the build pod:
Any updates on this PR? We would really love to have that support!
@omri-shilton I'm working on an alternative approach:
Closes #705 and follow-on/alternative to #920.
Includes updated user documentation on how to configure AWS IAM and BinderHub to use AWS ECR as the Docker registry. Modifications are additive, in that existing configurations should not be affected, and have been tested with ECR.