
BUG: Zookeeper-operator does not restart pods when static configs are changed #454

Open
hoyhbx opened this issue Apr 1, 2022 · 4 comments · May be fixed by #469
@hoyhbx
Contributor

hoyhbx commented Apr 1, 2022

Description

When I change fields under spec.config, zookeeper-operator does not issue a rolling update to roll out the changed config. For example, if I change the spec.config.commitLogCount field to 100, the operator reconciles the ConfigMap to reflect the change, so /conf/zoo.cfg in the pod (where the ConfigMap is mounted) has commitLogCount set to 100. But /data/conf/zoo.cfg, which is the config actually used by ZooKeeper, still has commitLogCount set to the default value of 500.

Steps to reproduce:

  1. Deploy a simple zookeeper cluster with the following yaml:
apiVersion: "zookeeper.pravega.io/v1beta1"
kind: "ZookeeperCluster"
metadata:
  name: "zookeeper"
spec:
  replicas: 3
  2. Change the ZookeeperCluster yaml to set commitLogCount to 100:
apiVersion: "zookeeper.pravega.io/v1beta1"
kind: "ZookeeperCluster"
metadata:
  name: "zookeeper"
spec:
  replicas: 3
  config:
    commitLogCount: 100
  3. Observe that the pods are not restarted; exec into a pod and check the config used by ZooKeeper in /data/conf/zoo.cfg:
metricsProvider.exportJvmInfo=true
dataDir=/data
4lw.commands.whitelist=cons, envi, conf, crst, srvr, stat, mntr, ruok
syncLimit=2
commitLogCount=500
metricsProvider.httpPort=7000
snapSizeLimitInKb=4194304
standaloneEnabled=false
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
initLimit=10
minSessionTimeout=4000
snapCount=10000
admin.serverPort=8080
autopurge.purgeInterval=1
maxSessionTimeout=40000
maxCnxns=0
globalOutstandingLimit=1000
reconfigEnabled=true
skipACL=yes
autopurge.snapRetainCount=3
tickTime=2000
quorumListenOnAllIPs=false
preAllocSize=65536
maxClientCnxns=60
dynamicConfigFile=/data/conf/zoo.cfg.dynamic.100000010
root@zookeeper-0:/data/conf# cat zoo.cfg | grep con
4lw.commands.whitelist=cons, envi, conf, crst, srvr, stat, mntr, ruok
reconfigEnabled=true
dynamicConfigFile=/data/conf/zoo.cfg.dynamic.100000010

Importance

should-have

Location

Zookeeper-operator is missing the functionality to restart the pods when the config is changed.

Suggestions for an improvement

We suggest attaching a hash of the config as an annotation on the ZooKeeper StatefulSet's pod template, so that when the config changes, the changed annotation triggers the StatefulSet's rolling update.
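
A minimal sketch of what this would produce on the generated StatefulSet, following the common checksum-annotation pattern (the annotation key and hash value below are illustrative, not names the operator currently uses; the operator would recompute the hash from the rendered spec.config whenever it changes):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: "zookeeper"
spec:
  template:
    metadata:
      annotations:
        # Illustrative annotation: the operator recomputes this value from the
        # ConfigMap data, so any change to spec.config changes the pod template
        # and the StatefulSet performs a rolling update.
        zookeeper.pravega.io/config-hash: "<sha256-of-rendered-zoo.cfg>"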

@derekm
Contributor

derekm commented Apr 1, 2022

@hoyhbx -- Your suggestion for improvement sounds sensible. Is this something you could implement and provide a PR for?

Thanks!

@anishakj
Contributor

anishakj commented Apr 1, 2022

@hoyhbx One more question: could you please try changing triggerRollingRestart to true and see whether the pods are restarted automatically?
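
For reference, a sketch of this manual workaround, assuming triggerRollingRestart is a boolean directly under spec (the field name comes from this thread; its exact placement in the spec is an assumption):

apiVersion: "zookeeper.pravega.io/v1beta1"
kind: "ZookeeperCluster"
metadata:
  name: "zookeeper"
spec:
  replicas: 3
  # Setting this to true asks the operator to perform a rolling restart of the
  # pods, which picks up the changed config.
  triggerRollingRestart: true
  config:
    commitLogCount: 100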

@hoyhbx
Contributor Author

hoyhbx commented Apr 1, 2022

I tried changing triggerRollingRestart to true, and the operator successfully triggers a rolling restart. The changed config is rolled out correctly.

I think it would be nice for zookeeper-operator to automatically trigger a rolling update when the config is changed. I have seen this behavior in many other operators, such as RabbitMQ's cluster operator and https://github.com/banzaicloud/koperator.

We are happy to provide a PR for this issue.

hoyhbx linked pull request #469 on Apr 28, 2022 that will close this issue
@fujin

fujin commented Sep 15, 2023

Just hit this, any chance that we can get #469 merged?
