Using source code deployment, after upgrading from version 3.8.1 to 3.8.2, the Kafka component cannot start and remains in Restarting state #2903

Open
xiaobeiy opened this issue Nov 29, 2024 · 1 comment

@xiaobeiy

OpenIM Server Version

3.8.2

Operating System and CPU Architecture

Linux (AMD)

Deployment Method

Source Code Deployment

Bug Description and Steps to Reproduce

The deployment was running version 3.8.1 (source code deployment). The upgrade steps were:

  1. Download version 3.8.2 from the release and unzip it.
  2. Copy the data directory from 3.8.1 to the root directory of 3.8.2.
  3. Start the containers with the docker compose up -d command. About 10 seconds after startup, the Kafka component enters the Restarting state and stays there.
    Kafka logs:
    kafka 23:06:12.85 INFO ==> ** Starting Kafka **
    [2024-11-28 23:06:13,750] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
    [2024-11-28 23:06:14,111] INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
    [2024-11-28 23:06:14,317] INFO Registered signal handlers for TERM, INT, HUP (org.apache.kafka.common.utils.LoggingSignalHandler)
    [2024-11-28 23:06:14,322] INFO [ControllerServer id=0] Starting controller (kafka.server.ControllerServer)
    [2024-11-28 23:06:14,791] INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas)
    [2024-11-28 23:06:14,856] INFO [SocketServer listenerType=CONTROLLER, nodeId=0] Created data-plane acceptor and processors for endpoint : ListenerName(CONTROLLER) (kafka.network.SocketServer)
    [2024-11-28 23:06:14,859] INFO [SharedServer id=0] Starting SharedServer (kafka.server.SharedServer)
    [2024-11-28 23:06:14,921] ERROR [SharedServer id=0] Got exception while starting SharedServer (kafka.server.SharedServer)
    java.io.IOException: Could not read file /bitnami/kafka/data/__cluster_metadata-0/00000000000006012624-0000000019.checkpoint
    at kafka.log.LogLoader.$anonfun$removeTempFilesAndCollectSwapFiles$2(LogLoader.scala:228)
    at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:985)
    at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
    at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:984)
    at kafka.log.LogLoader.removeTempFilesAndCollectSwapFiles(LogLoader.scala:226)
    at kafka.log.LogLoader.load(LogLoader.scala:101)
    at kafka.log.UnifiedLog$.apply(UnifiedLog.scala:1804)
    at kafka.raft.KafkaMetadataLog$.apply(KafkaMetadataLog.scala:588)
    at kafka.raft.KafkaRaftManager.buildMetadataLog(RaftManager.scala:270)
    at kafka.raft.KafkaRaftManager.<init>(RaftManager.scala:170)
    at kafka.server.SharedServer.start(SharedServer.scala:247)
    at kafka.server.SharedServer.startForController(SharedServer.scala:129)
    at kafka.server.ControllerServer.startup(ControllerServer.scala:197)
    at kafka.server.KafkaRaftServer.$anonfun$startup$1(KafkaRaftServer.scala:95)
    at kafka.server.KafkaRaftServer.$anonfun$startup$1$adapted(KafkaRaftServer.scala:95)
    at scala.Option.foreach(Option.scala:407)
    at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:95)
    at kafka.Kafka$.main(Kafka.scala:113)
    at kafka.Kafka.main(Kafka.scala)
    [2024-11-28 23:06:14,926] INFO [ControllerServer id=0] Waiting for controller quorum voters future (kafka.server.ControllerServer)
    [2024-11-28 23:06:14,926] INFO [ControllerServer id=0] Finished waiting for controller quorum voters future (kafka.server.ControllerServer)
    [2024-11-28 23:06:14,928] ERROR Encountered fatal fault: caught exception (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
    java.lang.NullPointerException: Cannot invoke "kafka.raft.KafkaRaftManager.apiVersions()" because the return value of "kafka.server.SharedServer.raftManager()" is null
    at kafka.server.ControllerServer.startup(ControllerServer.scala:210)
    at kafka.server.KafkaRaftServer.$anonfun$startup$1(KafkaRaftServer.scala:95)
    at kafka.server.KafkaRaftServer.$anonfun$startup$1$adapted(KafkaRaftServer.scala:95)
    at scala.Option.foreach(Option.scala:407)
    at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:95)
    at kafka.Kafka$.main(Kafka.scala:113)
    at kafka.Kafka.main(Kafka.scala)
    kafka 23:06:16.74 INFO ==>
    kafka 23:06:16.74 INFO ==> Welcome to the Bitnami kafka container
    kafka 23:06:16.74 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
    kafka 23:06:16.74 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
    kafka 23:06:16.75 INFO ==>
    kafka 23:06:16.75 INFO ==> ** Starting Kafka setup **
    kafka 23:06:16.89 INFO ==> Initializing KRaft storage metadata
    kafka 23:06:16.89 INFO ==> Formatting storage directories to add metadata...
    All of the log directories are already formatted.
    kafka 23:06:18.88 INFO ==> ** Kafka setup finished! **

kafka 23:06:18.90 INFO ==> ** Starting Kafka **
[2024-11-28 23:06:19,800] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2024-11-28 23:06:20,151] INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
[2024-11-28 23:06:20,331] INFO Registered signal handlers for TERM, INT, HUP (org.apache.kafka.common.utils.LoggingSignalHandler)
[2024-11-28 23:06:20,336] INFO [ControllerServer id=0] Starting controller (kafka.server.ControllerServer)
[2024-11-28 23:06:20,859] INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas)
[2024-11-28 23:06:20,948] INFO [SocketServer listenerType=CONTROLLER, nodeId=0] Created data-plane acceptor and processors for endpoint : ListenerName(CONTROLLER) (kafka.network.SocketServer)
[2024-11-28 23:06:20,951] INFO [SharedServer id=0] Starting SharedServer (kafka.server.SharedServer)
[2024-11-28 23:06:21,028] ERROR [SharedServer id=0] Got exception while starting SharedServer (kafka.server.SharedServer)
java.io.IOException: Could not read file /bitnami/kafka/data/__cluster_metadata-0/00000000000006012624-0000000019.checkpoint
at kafka.log.LogLoader.$anonfun$removeTempFilesAndCollectSwapFiles$2(LogLoader.scala:228)
at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:985)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:984)
at kafka.log.LogLoader.removeTempFilesAndCollectSwapFiles(LogLoader.scala:226)
at kafka.log.LogLoader.load(LogLoader.scala:101)
at kafka.log.UnifiedLog$.apply(UnifiedLog.scala:1804)
at kafka.raft.KafkaMetadataLog$.apply(KafkaMetadataLog.scala:588)
at kafka.raft.KafkaRaftManager.buildMetadataLog(RaftManager.scala:270)
at kafka.raft.KafkaRaftManager.<init>(RaftManager.scala:170)
at kafka.server.SharedServer.start(SharedServer.scala:247)
at kafka.server.SharedServer.startForController(SharedServer.scala:129)
at kafka.server.ControllerServer.startup(ControllerServer.scala:197)
at kafka.server.KafkaRaftServer.$anonfun$startup$1(KafkaRaftServer.scala:95)
at kafka.server.KafkaRaftServer.$anonfun$startup$1$adapted(KafkaRaftServer.scala:95)
at scala.Option.foreach(Option.scala:407)
at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:95)
at kafka.Kafka$.main(Kafka.scala:113)
at kafka.Kafka.main(Kafka.scala)
[2024-11-28 23:06:21,032] INFO [ControllerServer id=0] Waiting for controller quorum voters future (kafka.server.ControllerServer)
[2024-11-28 23:06:21,032] INFO [ControllerServer id=0] Finished waiting for controller quorum voters future (kafka.server.ControllerServer)
[2024-11-28 23:06:21,034] ERROR Encountered fatal fault: caught exception (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
java.lang.NullPointerException: Cannot invoke "kafka.raft.KafkaRaftManager.apiVersions()" because the return value of "kafka.server.SharedServer.raftManager()" is null
at kafka.server.ControllerServer.startup(ControllerServer.scala:210)
at kafka.server.KafkaRaftServer.$anonfun$startup$1(KafkaRaftServer.scala:95)
at kafka.server.KafkaRaftServer.$anonfun$startup$1$adapted(KafkaRaftServer.scala:95)
at scala.Option.foreach(Option.scala:407)
at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:95)
at kafka.Kafka$.main(Kafka.scala:113)
at kafka.Kafka.main(Kafka.scala)
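
The first error in each restart cycle is the `IOException: Could not read file ...checkpoint`; the later `NullPointerException` looks like a consequence of the Raft manager never being created. One possible cause is that the copied data directory has ownership or permission bits the Kafka process cannot read. The following is a minimal Python sketch for checking that on the host. It is not an OpenIM or Kafka tool, and it rests on assumptions: the host path of the Kafka data directory is passed in by the caller, the default UID 1001 and GID 0 are a guess at the user the Bitnami container runs as (the compose file may override this), and only the classic owner/group/other mode bits are checked (no ACLs, no SELinux, no parent-directory traversal bits).

```python
import os
import stat
import sys


def readable_by(st: os.stat_result, uid: int, gid: int) -> bool:
    """Approximate check: can the given uid/gid read a file with these stat bits?"""
    if uid == 0:
        return True  # root bypasses mode bits
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IRUSR)
    if st.st_gid == gid:
        return bool(st.st_mode & stat.S_IRGRP)
    return bool(st.st_mode & stat.S_IROTH)


def find_unreadable(root: str, uid: int = 1001, gid: int = 0):
    """List (path, owner_uid, owner_gid, mode) for files the uid/gid cannot read.

    uid=1001 / gid=0 are assumed defaults, not values taken from this issue.
    """
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # e.g. a broken symlink or a file we cannot stat at all
                problems.append((path, None, None, "unstattable"))
                continue
            if not readable_by(st, uid, gid):
                problems.append((path, st.st_uid, st.st_gid, stat.filemode(st.st_mode)))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_perms.py <host path of the Kafka data directory>
    for entry in find_unreadable(sys.argv[1]):
        print(entry)
```

If the script reports the `.checkpoint` file from the log, comparing its owner and mode with the same file in the original 3.8.1 directory should show whether the copy changed them.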

Screenshots Link

No response

@xiaobeiy xiaobeiy added the bug Categorizes issue or PR as related to a bug. label Nov 29, 2024
@skiffer-git
Member

I’ve adjusted this:

KAFKA_NUM_PARTITIONS: 8
KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE: "true"

Not sure if it will have any impact.

If the historical data isn’t important, you can delete the components directory.
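
If deleting feels too final, a less destructive variant is to rename the directory so the containers start with fresh state while the old data stays on disk. A rough Python sketch follows; the path to the components directory is whatever your deployment uses (it is an argument here, not a value confirmed in this thread), the `.bak-<timestamp>` suffix is an arbitrary choice, and the containers should be stopped before running it.

```python
import time
from pathlib import Path


def set_aside(directory: str) -> Path:
    """Rename a directory to <name>.bak-<timestamp> and return the new path."""
    src = Path(directory)
    if not src.is_dir():
        raise FileNotFoundError(f"not a directory: {src}")
    # The suffix format is an illustrative choice, not an OpenIM convention.
    backup = src.with_name(f"{src.name}.bak-{time.strftime('%Y%m%d-%H%M%S')}")
    src.rename(backup)
    return backup
```

After the rename, starting the stack again should recreate the directory from scratch; the backup can be removed once the new deployment is confirmed to work.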
