Rate-limiting in combination with auto-scaling #4270
-
Hi, we have currently implemented rate limiting on our Ingress resources using http- and location-snippets (with the `limit_req` directives). That works fine so far: requests are limited properly per IP. However, we also have auto-scaling configured for our ingress controller pods. The issue is that the rate limits applied via snippets are per pod. Once the autoscaler spawns additional pods, the total request limit rises, because requests are now distributed across more pods, each with its own individual rate limit. This means that by simply sending enough requests to trigger a scale-up, you can raise and therefore circumvent the rate limit. Any suggestions on how to solve this?
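For reference, the per-pod setup described above typically boils down to something like the following sketch (zone name, size, and rate are placeholders):

```nginx
# Added via the controller ConfigMap's http-snippets: each pod allocates
# its own shared-memory zone, which is exactly why the limit is per pod.
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

# Added via location-snippets on the Ingress resource:
limit_req zone=per_ip burst=20 nodelay;
```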
Replies: 2 comments 5 replies
-
Rate limiting is by nature per pod (or per instance if you are running machines). The possible workarounds depend on your applications. For example, if you deploy as a DaemonSet and configure your load balancer with session persistence, it will tend to consistently steer a client to a particular node/pod, resulting in a more consistent application of the limit. We also have some customers who run Deployments and divide the rate limit by the number of pods in the deployment, which gives an imprecise but better approximation (a SWAG). Additional thoughts?
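One way to get the session-persistence behavior mentioned above is ClientIP affinity on the Service in front of the controller pods. This is a sketch with hypothetical names; whether affinity survives end-to-end also depends on your cloud load balancer:

```yaml
# Sketch: keep a given client IP pinned to one controller pod, so that
# pod's per-pod limit applies consistently to that client.
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress          # hypothetical Service name
  namespace: nginx-ingress
spec:
  type: LoadBalancer
  selector:
    app: nginx-ingress
  ports:
    - name: http
      port: 80
      targetPort: 80
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800    # default affinity window of 3 hours
```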
-
That's what I feared. What about using VirtualServer resources instead of Ingresses (https://docs.nginx.com/nginx-ingress-controller/configuration/virtualserver-and-virtualserverroute-resources/)? Are their rate-limit policies (https://docs.nginx.com/nginx-ingress-controller/configuration/policy-resource/) also per pod, or do they automatically account for the number of active pods? One idea I had was to build some kind of Kubernetes controller that constantly monitors the number of active controller pods and automatically adjusts the rate limit in the ConfigMap. But that feels a little dirty to me.
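The core of that "dirty" controller idea is just rendering a new snippet from the replica count. A minimal sketch of that piece in Python (zone name, key, and global budget are hypothetical; a real reconciler would read the ready-replica count from the Kubernetes API and PATCH the controller's ConfigMap with the result):

```python
# Hypothetical aggregate budget we want enforced across ALL controller pods.
GLOBAL_LIMIT_RPS = 100

def render_http_snippets(ready_replicas: int) -> str:
    """Render the http-snippets value for the controller ConfigMap.

    Integer division rounds down so the aggregate never exceeds the
    budget; the floor of 1 r/s avoids an invalid zero rate.
    """
    rate = max(1, GLOBAL_LIMIT_RPS // max(1, ready_replicas))
    return f"limit_req_zone $binary_remote_addr zone=per_ip:10m rate={rate}r/s;"

# e.g. a 100 r/s budget spread across 4 ready pods -> 25 r/s per pod
print(render_http_snippets(4))
```

The division is the same rough per-pod SWAG discussed above; it only tracks scaling events instead of being set once by hand.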
Optimally, we want NGINX Plus to handle this using zone_sync as that will be the most accurate.
Alternatively, my reading is that you are trying to achieve global rate-limit behavior with the free edition.
There is one free edition customer I know who drives their scaling with automation (not HPA) and thus reconfigures the rate limit settings when a scaling action happens.
The controller does not 'automagically' attempt to do this math. As you say, we would need some type of Controller/Operator process to drive it from the outside. Helm is the only place where all of this is templated together; otherwise these are distinct manifests.
We leave it in your hands to do that math today.
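For completeness, the NGINX Plus zone_sync approach mentioned above looks roughly like this (service names and ports are placeholders): the `sync` parameter replicates the shared-memory zone across instances, making the limit cluster-wide rather than per pod.

```nginx
# Sketch, NGINX Plus only; all names are placeholders.
stream {
    resolver kube-dns.kube-system.svc.cluster.local valid=5s;
    server {
        listen 0.0.0.0:12345;
        zone_sync;
        # A headless Service that resolves to every controller pod:
        zone_sync_server nginx-ingress-headless.nginx-ingress.svc.cluster.local:12345 resolve;
    }
}
http {
    # "sync" shares this zone across all synced instances:
    limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s sync;
}
```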
If using VirtualServe…