Enabling per hub storage quotas #764

Closed

sgibson91 opened this issue Oct 19, 2021 · 16 comments
Comments

@sgibson91
Member

sgibson91 commented Oct 19, 2021

October 2023 Update

Current proposals:

Work already done:


Description

Right now in our NFS servers, user home dirs expand until the whole NFS storage is consumed, and then we either expand the storage further or delete stuff. This isn't necessarily a problem, except that the storage may not be equally distributed between hubs or even users! A slightly nicer solution may be to implement storage quotas on a per-user or per-hub basis, so that we can at least communicate to Community Representatives how much storage they are allocated, and they can work from there.

This is a topic that has already had a lot of discussion in Pangeo pangeo-data/pangeo-cloud-federation#654

I already did some work to bring the nfs-server-provisioner helm chart into our infrastructure. We time-boxed the effort to get that working and instead fell back on using Google Cloud Filestore on the Pangeo deployment. I believe achieving such quotas will involve more work to make the nfs-server-provisioner chart functional.

Value / benefit

  • Can more clearly communicate how much storage a given Hub Community can expect
  • No single user or hub can bogart the NFS storage with lots of data

Implementation details

No response

Tasks to complete

No response

Updates

No response

@damianavila
Contributor

A slightly nicer solution may be to implement some storage quotas on a per user or per hub basis

I like that idea!

@sgibson91
Member Author

sgibson91 commented Aug 10, 2022

@damianavila I wonder if we can bring this issue into our backlog following https://2i2c.freshdesk.com/a/tickets/171, since disk space filling up is most often felt by hubs on shared clusters, and it feels especially unfair that the hub community reporting it may not even have been the one to cause it.

@damianavila
Contributor

I added it to the backlog and raised the priority! Thanks for bringing this one to our attention, @sgibson91.

@yuvipanda
Member

My current worry is that the nfs external provisioner seems a bit abandoned (kubernetes-sigs/nfs-ganesha-server-and-external-provisioner#106). @consideRatio got some patches into it, so he might have an idea of how active it is?

The other problem I have with it is that it generates directory names with randomized chars, so if we lose the PVC objects (by recreating the k8s cluster, for example) we can no longer map users to their home directories! Currently that is not the case - we can recreate the k8s cluster and not lose any user data. This complicates backup quite a bit.

@yuvipanda
Copy link
Member

Here's a different approach to try:

For shared clusters

  1. Add a deployment to basehub that deploys an NFS server (nfs-kernel-server or nfs-ganesha). This would just attach to a PVC. So we'll have 1 NFS server per hub, which isolates hubs from trampling on each other in private spaces.
  2. Assuming we use XFS for the PVC, we can have a sidecar that runs xfs_quota in a loop to set per-user limits (a rough sketch follows below)

The advantages of this approach are that it's much simpler than the external provisioner, doesn't require us to maintain an NFS server by hand, and doesn't keep state in the kubernetes cluster that is required to associate users' home directories with the users. Given that the provisioner doesn't seem to be maintained upstream, I think we have a much better chance of maintaining this than of maintaining the external provisioner itself.
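
To make step 2 above a bit more concrete, here is a minimal, hypothetical sketch of what such a quota sidecar loop could look like, assuming the PVC is formatted as XFS and mounted with project quotas enabled (`prjquota`) at `/export`, with one directory per user under `/export/home`. The mount point, limit, and project-id scheme below are illustrative assumptions, not decisions made in this thread:

```python
"""
Hypothetical sketch of a per-user quota sidecar (assumptions: XFS PVC
mounted with prjquota at /export, one home directory per user under
/export/home, xfs_quota available and sufficient pod privileges).
"""
import subprocess
import time
from pathlib import Path

EXPORT_ROOT = Path("/export")      # XFS mount point (assumed)
HOMES = EXPORT_ROOT / "home"       # one directory per user (assumed)
HARD_LIMIT = "10g"                 # example per-user hard block limit
BASE_PROJECT_ID = 1000             # arbitrary starting project id


def xfs_quota(command: str) -> None:
    """Run a single xfs_quota expert-mode command against the export."""
    subprocess.run(
        ["xfs_quota", "-x", "-c", command, str(EXPORT_ROOT)],
        check=True,
    )


def enforce_quotas() -> None:
    home_dirs = sorted(p for p in HOMES.iterdir() if p.is_dir())
    for project_id, home_dir in enumerate(home_dirs, start=BASE_PROJECT_ID):
        # Mark the directory tree as belonging to this project id
        xfs_quota(f"project -s -p {home_dir} {project_id}")
        # Cap the project's disk usage
        xfs_quota(f"limit -p bhard={HARD_LIMIT} {project_id}")


if __name__ == "__main__":
    while True:
        enforce_quotas()
        time.sleep(300)  # re-apply periodically to pick up new users
```

A real implementation would need stable project ids per user (enumeration order shifts as directories come and go), error handling, and limits read from hub configuration; this only shows the shape of the loop.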

For non-shared clusters

I think using cloud-based filestores (EFS, etc.) is still more appropriate when we aren't operating shared clusters. For those, disk space management should focus on reporting. I'd suggest we write a prometheus exporter that basically runs du and reports per-user disk space, and we can expose this to users via grafana. That way, we can answer the question of 'who is eating disk space?' and have alerts if necessary.
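
As a rough sketch of what such an exporter could look like (this is not the implementation that was eventually written; the mount path, metric name, and port are made up for illustration, and a real exporter would need rate-limiting so the du scans don't hammer the NFS server):

```python
"""
Minimal sketch of a du-based per-user disk usage exporter.
Assumes home directories are mounted at /export/home, one per user.
"""
import subprocess
import time
from pathlib import Path

from prometheus_client import Gauge, start_http_server

HOMES = Path("/export/home")  # assumed mount point of the shared volume

home_dir_size_bytes = Gauge(
    "home_directory_size_bytes",  # hypothetical metric name
    "Disk space used by a user's home directory",
    ["username"],
)


def collect_sizes() -> None:
    for home_dir in HOMES.iterdir():
        if not home_dir.is_dir():
            continue
        # `du -sb` prints "<bytes>\t<path>" for the directory
        output = subprocess.run(
            ["du", "-sb", str(home_dir)],
            capture_output=True, text=True, check=True,
        ).stdout
        home_dir_size_bytes.labels(username=home_dir.name).set(int(output.split()[0]))


if __name__ == "__main__":
    start_http_server(9888)  # arbitrary port serving /metrics
    while True:
        collect_sizes()
        time.sleep(3600)  # this data does not need to be fresh
```

In Grafana this gauge could then be plotted per username, with an alert threshold attached if needed.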

@yuvipanda
Member

There is a lot of precedent for running this kind of 'NFS server in a pod' setup - https://github.com/appscode/third-party-tools/blob/master/storage/nfs/artifacts/nfs-server.yaml for example. It'll have to run as a privileged pod, but it's totally doable. It'll also allow us to monitor free space on the disk easily.

@sgibson91
Member Author

Thank you for the proposal @yuvipanda! I like it for three reasons:

  1. It resolves the systemic unfairness for storage on shared clusters
  2. For dedicated clusters, I think Community Reps will really appreciate the grafana-based reporting structure you suggest, particularly around managing costs
  3. It avoids putting unnecessary and difficult maintenance on the engineers

I believe we should try and work towards implementing an MVP of this sooner rather than later!

@Vaibhav1919

Vaibhav1919 commented Nov 11, 2022

@yuvipanda @consideRatio
Is it possible to restrict EFS/NFS data per kubernetes pod?

@sgibson91
Member Author

is there a rough value for how much persistent space each user has in /home/jovyan? This will help with brainstorming our final configuration.

from https://2i2c.freshdesk.com/a/tickets/271

I think Yuvi has already lined up a great proposal above, and we just need to assign someone to try to implement that.

@Vaibhav1919

Vaibhav1919 commented Nov 13, 2022

@yuvipanda @sgibson91 @consideRatio
Is there a way we can use the NFS external provisioner and assign a quota for each user without pre-creating an XFS volume from a Google PD?

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Jun 6, 2023
This deploys https://github.com/yuvipanda/prometheus-dirsize-exporter,
an *efficient* per-user homedirectory stats (size, no. of files,
last modified date, etc) collector. It is capped at performing no
more than 250 IO operations per second, to not overwhelm NFS
servers. Metrics are refreshed every 2h after completion, although
on large servers (like LEAP), they can take many many hours to
complete with just 250 IO operations per second. This is perfectly
fine though, as we do not need 'up to date' information. Trading
off metric latency for minimal resource usage is pretty good
here.

Ref 2i2c-org#764
@yuvipanda
Member

From #764 (comment):

I'd suggest we write a prometheus exporter that basically runs du and reports per-user disk space, and we can expose this to users via grafana.

I wrote this today! https://github.com/yuvipanda/prometheus-dirsize-exporter. Has some performance optimizations as well, although could do more. #2621 deploys it to our clusters.

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Jun 20, 2023
@GeorgianaElena
Member

Just wanted to raise the priority of building an alerting and possibly a notification system on top of the really great and useful Grafana dashboard.

This is motivated by @jbusecke's ticket https://2i2c.freshdesk.com/a/tickets/995 about the challenges of manually monitoring usage and notifying users one by one in the context of an increasing user base.

I was thinking that one step in this direction is to try to enable Grafana alerting on the usage dashboard, by setting a per-user limit above which it would notify Julius.

Anyway, I believe this is something to have on our radar for next quarter. cc @damianavila, who I remember mentioning that improving our monitoring and alerting systems might be a goal of a future quarter.

@damianavila damianavila moved this to Todo 👍 in Sprint Board Sep 28, 2023
@sgibson91 sgibson91 added the nominated-to-be-resolved-during-q4-2023 Nomination to be resolved during q4 goal of reducing the technical debt label Oct 18, 2023
@consideRatio
Contributor

@yuvipanda we transitioned away from having a few in-cluster NFS servers, I recall.

What is your take currently on the previously proposed implementation in #764 (comment)?

@yuvipanda
Member

@consideRatio that's still the only possible path forward for per-hub quotas. It is probably also a quarter's worth of work, and not high priority right now. It's also just per-hub quotas; I think per-user quotas should be handled separately (and perhaps be more 'alerts' than actual quotas). I think this issue can be scoped down to only discussing per-hub quotas, and left to be prioritized in the future.

@consideRatio
Contributor

@GeorgianaElena do you want to open a dedicated issue to represent the ideas in #764 (comment)? I don't think we will manage to track that effectively as part of this already complicated issue.

@consideRatio consideRatio removed the nominated-to-be-resolved-during-q4-2023 Nomination to be resolved during q4 goal of reducing the technical debt label Oct 19, 2023
@consideRatio consideRatio changed the title Enabling per user/per hub storage quotas Enabling per hub storage quotas Oct 19, 2023
@sgibson91 sgibson91 removed this from Sprint Board Jun 18, 2024
@yuvipanda
Member

This is being handled via NASA-IMPACT/veda-jupyterhub#41
