Replies: 21 comments
-
@dcowden Maybe the following approach can work for you? If you want to drain particular pods, you change the corresponding Ingress resource by adding an annotation that specifies which pods to drain using a label query. For example:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: cafe-ingress
      annotations:
        kubernetes.io/ingress.class: "nginx"
        nginx.com/drain: "version=0.1"
    spec:
      rules:
      - host: "cafe.example.com"
        http:
          paths:
          - path: /tea
            backend:
              serviceName: tea-svc
              servicePort: 80

In this case, the Ingress controller will drain all the pods corresponding to tea-svc with the label version=0.1. Please note that this is not available yet, but we can add it.
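For context, "draining" here maps to the drain parameter on an upstream server, an NGINX Plus feature (see the ngx_http_upstream_module docs). The rendered upstream might look roughly like the following sketch, with the upstream name and pod addresses purely illustrative:

    upstream default-cafe-ingress-tea-svc {
        server 10.16.0.5:80 drain;   # pod matching version=0.1: serves existing sticky sessions only
        server 10.16.0.6:80;         # remaining pods continue to receive new sessions
    }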
-
Hmm, that's an interesting approach, but I don't think it would work well for us. Today we use fairly conventional deployments, in which the deployment controller scales pods up and down. Under the hood it does this with ReplicaSets, I think. We do not re-publish our Ingresses as part of deployments, and this approach would require doing that. Given our current flow, it would be much more seamless if a pod in TERMINATING status were automatically drained. This would cover several situations.
In reality, if you are running sticky sessions, I can't think of any case where you wouldn't want to drain a pod when it is terminating rather than immediately removing it from service. In practice there is still other work needed to make it work, because Kubernetes has to have a way to know when it's OK to actually kill the pod. This is accomplished by registering a preStop hook, which runs and waits for all of the active sessions to be gone. If the hook finishes, or the pod-kill grace period expires, Kubernetes kills the pod, which makes it fail its health checks, and it is removed from NGINX.
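For illustration, wiring up such a preStop hook might look roughly like the sketch below; the script path, image name, and grace period are assumptions, not something prescribed by the controller:

    apiVersion: v1
    kind: Pod
    metadata:
      name: tomcat-app
    spec:
      # give the pod plenty of time to bleed off sessions before it is force-killed
      terminationGracePeriodSeconds: 43200
      containers:
      - name: tomcat
        image: example/tomcat-app:1.0
        ports:
        - containerPort: 8080
        lifecycle:
          preStop:
            exec:
              # blocks until active sessions reach zero or the script's own timeout expires
              command: ["/bin/sh", "-c", "/opt/drain.sh"]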
-
@dcowden thanks for providing more details. It looks like it is possible to accomplish session draining through the Ingress controller. Unfortunately, once a pod enters the terminating state, its endpoint is removed from Kubernetes, which makes the Ingress controller remove that endpoint from the NGINX configuration. Thus, in order to retain that endpoint in the NGINX configuration when it is being removed, we must make an additional query to the Kubernetes API to check whether the corresponding pod is in the terminating state. If it is, we need to drain it instead of removing it. Also, once the pod is successfully removed, we need to make sure that it is also removed from the NGINX configuration. Do you think the logic above will cover your use case? I can prepare a PR which implements that logic and share it with you, if you'd like to test it.
-
Yes, I think that's the logic... at least as near as I can tell without actually implementing it. I'd be happy to test it. I'm also open to alternate ways of working if they can accomplish the objective with less work. As a side note, this use case once again validates the decision NOT to use the k8s Service abstraction, because it's pretty clear that the endpoint would become inaccessible. IIRC, there's a 'use service=true' flag, which would be incompatible with using this functionality.
-
Hello, I am facing the same scenario as @dcowden, with a Java app using Tomcat with sticky sessions. Has this been implemented already?
-
I also have this requirement for Tomcat instances that require session affinity. During deployment, existing bound sessions should still be routed to the same instances, with new sessions being routed to the new instances. Could we get an update on this, please?
-
@tkgregory @victor-frag we are using https://github.com/jcmoraisjr/haproxy-ingress, which implements this functionality. We've been using it in production for a while; it's been stable, well supported, and actively updated.
-
I'd like to understand this as well. As @pleshakov suggested, it should be possible using the same approach taken for HAProxy.
I'd also add that this should happen only if session affinity is enabled, and that there should be no need for an additional query to the Kubernetes API, since ingress controllers should be automatically notified when pods enter the termination phase.
-
@dcowden, I'm figuring out how to use HAProxy to keep the application alive until the sessions terminate. But at the moment, when I deploy a new application version, the old pods are terminated and the new ones are started, regardless of whether there are active sessions. Thx
-
Hi @amodolo, in our case we wrote a small servlet, deployed with the app, that returns the number of sessions left using JMX. There are other ways for sure, but it works OK for us until we can do better.
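As a rough sketch of that idea (the MBean context path and the URL mapping are assumptions; adjust them to the actual web app), a servlet can read Tomcat's active session count from the Manager MBean over JMX:

    import java.io.IOException;
    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet("/internal/active-sessions")
    public class ActiveSessionsServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            try {
                MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
                // Tomcat exposes the session manager of each web app as a Manager MBean;
                // "/app" is a placeholder for the real context path.
                ObjectName manager = new ObjectName("Catalina:type=Manager,host=localhost,context=/app");
                Object active = mbs.getAttribute(manager, "activeSessions");
                resp.setContentType("text/plain");
                resp.getWriter().print(String.valueOf(active));
            } catch (Exception e) {
                // Surface failures as a 500 so a drain script can treat non-2xx as "stop waiting".
                resp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR, e.getMessage());
            }
        }
    }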
-
I've just implemented your solution and it seems to work like a charm. Thx a lot (also for the super fast response 😄)
-
@amodolo glad it worked for you! FWIW, we have been using this solution in production for about a year now. We run a 24x7 platform, but humans do not work 24x7. We simply wait for sessions to die, or 12 hours, whichever comes first. When we execute a build, we'll have extra pods out there serving the old workloads for half a day until they die. It works pretty well. The main negative (and why it's not THE solution) is that it limits your iteration velocity for new code in production to about once a day, which is a bit of a limitation.
-
The main negative aspect of this solution is this: suppose you have one server with 2 active sessions and you are rolling out a new application version. The old pod will enter drain mode until the sessions die (or the grace period is over). Suppose also that the sticky session is based on the cookie generated by HAProxy. In this configuration, if one of the two users logs out of the application, HAProxy's cookie is not removed until the user closes the browser (because it is a session cookie); so if that user logs out and then logs in again (without closing the browser), they will be balanced to the same old pod.
-
Yes, we use HAProxy in rewrite cookie mode, with a separate cookie. I think using JSESSIONID would work too. Another requirement for us is that when a particular user on an old pod logs out and logs back in, we want to be guaranteed that they switch to a new pod. That ends up being important sometimes.
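A rough sketch of such a setup with haproxy-ingress cookie affinity might look like the following; the annotation keys and values here are assumptions and should be checked against the haproxy-ingress documentation before use:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: app-ingress
      annotations:
        kubernetes.io/ingress.class: "haproxy"
        ingress.kubernetes.io/affinity: "cookie"
        ingress.kubernetes.io/session-cookie-name: "SERVERID"      # separate affinity cookie, not JSESSIONID
        ingress.kubernetes.io/session-cookie-strategy: "rewrite"   # HAProxy rewrites the cookie value
    spec:
      rules:
      - host: "app.example.com"
        http:
          paths:
          - path: /
            backend:
              serviceName: app-svc
              servicePort: 8080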
-
Do you mind explaining more about how you wrote this preStop hook? If I do an exec, I would have to have a script in the same Docker image as my Tomcat; otherwise, if I do an HTTP call that doesn't return, the call will time out. How did you do it: did you add a script in your Tomcat container that calls Tomcat? Could it be done with another container in the pod? Thanks
-
Hi @miclefebvre
Yes, our script is in the same container as Tomcat, and we hook a drain script to a preStop. This script calls a URL provided by Tomcat that responds with the number of user sessions remaining. When there are no more sessions, or when we have reached our timeout, we finish draining. The important bit of our drain.sh script is that polling loop (see the sketch after this comment).
It's worth noting that we terminate if we receive a non-2XX from Tomcat; that's there in case Tomcat has become unresponsive while we're draining, which happened once in production. If Tomcat is already hosed, then the user sessions there don't matter. That might seem unlikely, but in our case sessions last a LONG time (our users are typically active for nearly an entire business day).
Our pod has only one container, but I suppose it would work with multiple containers, because it's based on making an HTTP call into Tomcat.
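As a minimal sketch only (the endpoint path, poll interval, and timeout are assumptions, not the actual values from that script), a drain loop of the kind described might look like:

    #!/bin/sh
    # Wait for active Tomcat sessions to reach zero, give up at a deadline,
    # and stop waiting if the app stops answering with a 2xx.
    DEADLINE=$(( $(date +%s) + 43200 ))    # 12-hour cap, matching the termination grace period
    while [ "$(date +%s)" -lt "$DEADLINE" ]; do
      SESSIONS=$(curl -sf http://localhost:8080/internal/active-sessions) || exit 0
      [ "$SESSIONS" -le 0 ] && exit 0
      sleep 30
    done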
-
Thanks a lot @dcowden, I will give it a try. But we are using Jib for our base image, and I'm not sure if there's curl or anything like that in it. I'll see if we can do this in another container or if I should change my base image.
-
Hi @dcowden, would you like to share your sticky session configuration for HAProxy ingress?
-
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.
-
Hello, does anyone know if this feature has been added in any NGINX ingress controller implementation, like HAProxy has?
-
I think this is valuable to keep around as a general-purpose behavior with no dependency on sticky sessions, since how the backend/upstream pod shuts down is up to the application developer or operator, and the ingress controller should simply behave consistently whether the upstream takes 2 minutes, 2 hours, or 2 days to bleed off. I believe the current behavior is to remove the upstream when it is not in the ready state, which is different from the drain behavior of NGINX. To update this for the current state of the API, we would need to set an upstream to drain when the pod state is terminating according to EndpointSlices. This way any preStop hooks or other flow can be executed, as outlined here:
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
https://nginx.org/en/docs/http/ngx_http_upstream_module.html#server
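For reference, the terminating state is surfaced directly on EndpointSlice endpoints (discovery.k8s.io/v1), so a controller can map it to drain without extra API queries. A terminating endpoint looks roughly like this, with names and addresses illustrative:

    apiVersion: discovery.k8s.io/v1
    kind: EndpointSlice
    metadata:
      name: tea-svc-abc12
      labels:
        kubernetes.io/service-name: tea-svc
    addressType: IPv4
    endpoints:
    - addresses: ["10.16.0.5"]
      conditions:
        ready: false        # taken out of normal rotation
        serving: true       # still able to serve already-bound sessions
        terminating: true   # the backing pod is shutting down
    ports:
    - name: http
      port: 80
      protocol: TCP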
-
We have a legacy application (Tomcat/Java) which needs sticky sessions. When we deploy new versions of our applications, we need to stop sending new connections to a server while still sending bound sessions to the old server. Please note: this is not referring to in-flight requests; we need the active Tomcat sessions to expire, which normally takes a few hours.
This is possible using the NGINX drain command, which will send bound connections to the old server but send new ones elsewhere. But in Kubernetes, calling a command on the ingress controller is not part of the deployment flow. To do it with current tools, we would need to add a preStop hook to our application. In that hook, we'd need to access the ingress controller and ask it to drain with an API call. We'd rather not introduce the ability for applications to call APIs on the ingress controller.
When Kubernetes terminates a pod, it enters the TERMINATING status. In nearly all cases, when sticky sessions are enabled, the desired functionality is probably to put the associated pod into drain mode. Is this possible with the NGINX Plus ingress controller?
We currently use the Kubernetes-maintained NGINX ingress controller. This feature would make it worth the money to use NGINX Plus.
Aha! Link: https://nginx.aha.io/features/IC-110
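For completeness, the drain mechanism referred to in the question is an NGINX Plus feature and can also be toggled at runtime through the NGINX Plus API, roughly as follows; the API version, upstream name, and server ID are placeholders:

    # mark server 0 of the "tea-svc" upstream as draining:
    # bound sessions keep going to it, new sessions go elsewhere
    curl -X PATCH -d '{"drain": true}' \
      http://nginx-plus-host:8080/api/8/http/upstreams/tea-svc/servers/0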