ConsumeNAck drops the first message #182
The first time kafka-pixy gets a request to consume from a topic on behalf of a consumer group, it initialises the group's offsets to the head of all topic partitions.
Thanks for your response @horkhe. So does your statement imply that if we want to use this in a system that generates messages before the proxy is running, we will always drop the first message upon consumption?
One more time: when kafka-pixy gets a request to consume from a topic on behalf of a consumer group for the first time, it initialises the offsets to the head of all topic partitions. So whatever is already in the partitions won't be consumed; all messages produced to the topic after that will be consumed. However, if you stop consuming for a period longer than the retention configured for the __consumer_offsets system topic, the consumer group offsets stored in Kafka will expire and be removed by Kafka, and the next consume request will trigger re-initialisation of the topic offsets to the head of all partitions.
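For anyone who wants to verify that initialisation behaviour, one way is to compare the group's committed offset with the partition head using Sarama, which is already in use in this thread. Below is a minimal sketch; the broker address, topic, and group names are placeholders for whatever your test uses.

```go
package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	// Placeholder broker/topic/group names; a nil config means Sarama defaults.
	client, err := sarama.NewClient([]string{"localhost:9092"}, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Offset of the next message that would be written to partition 0 (the head).
	head, err := client.GetOffset("test-topic", 0, sarama.OffsetNewest)
	if err != nil {
		log.Fatal(err)
	}

	// Offset the consumer group has committed for the same partition.
	om, err := sarama.NewOffsetManagerFromClient("test-group", client)
	if err != nil {
		log.Fatal(err)
	}
	defer om.Close()
	pom, err := om.ManagePartition("test-topic", 0)
	if err != nil {
		log.Fatal(err)
	}
	defer pom.Close()
	committed, _ := pom.NextOffset() // with default config, -1 means nothing committed yet

	fmt.Printf("head=%d committed=%d\n", head, committed)
}
```

If the committed offset equals the head right after the first consume request, that matches the initialise-to-head behaviour described above.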
Thanks for helping me understand.
The behaviour I am seeing on my end doesn't really line up with what you are saying, though. Starting from a freshly created topic, if I generate 5 messages and then start the ConsumeLoop, the consumer group's partition offset is initialised (inconsistently) at either 0 or 1. After initialisation and catching up to the head, the consumer group offset works as expected. If I understand you correctly, consumer group initialisation will set the offset to latest, meaning I should not read anything at all, I suppose. For reference, I am using Sarama and relying on auto topic creation upon first message production. I have confirmed that all of my messages arrive in the freshly created topic using kafkacat.
We've encountered an issue that I suspect is the same as this one, and we have some more details to share. Summary of our test:
What we're seeing is that some of the messages produced don't get consumed. They seem to be 'lost'. Sometime later, some but not all of those 'lost' messages reappear. Later still, the remaining messages appear.

Here's an annotated log showing the effect: pixy-lost-msgs-issue.log. Here's the (fairly generic?) config file: krproxyconfig.yaml.txt. Kafka and ZooKeeper are running locally in a container in this test setup.

It seems clear that there's some unexpected buffering happening. I'm not very familiar with Kafka or Pixy, and this could easily be pilot error correctable by tuning the config or pointing out where we're being silly.

Side note, in case it's relevant: this test sends writes (produces) through the proxy. The system we're developing wouldn't do that; the pixy proxy would only be used for reads. Perhaps that would avoid the problem altogether?
This is looking more like a pixy bug related to buffering on the reading (not writing) side. Here's a sequence of test runs showing a common behaviour where the first of several messages doesn't get returned; the output now also includes the message offsets:
The pixy log had no new entries during those runs.
Then I reran the test. This time it read a message that had been written at a much earlier offset:
The pixy log had a couple of extra lines:
Several minutes later, after writing all this up, I reran the test and all the old messages arrived:
So it seems that some part of pixy is reading some messages and holding on to them for a while, letting other messages flow freely, before releasing the held ones later.
Random observations…
You have …
Interesting, thanks. A different developer did the initial work and I'd presumed the config file was generic, but that can't be the cause of the behaviour, as my recent testing has used no config file at all; the test starts the proxy itself, passing just what's needed. I've noticed something I think is relevant: the lost messages seem to correlate with there being more than one "Fetched subscriptions" entry reported in the log:
So I'd guess that the 'delayed' messages were read by a previous subscription, so they aren't read by the new subscription; then, when the previous subscription times out, the old messages are 'released'. Does that sound about right? Would a short consumer.subscription_timeout setting be the right approach?
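For reference, if subscription_timeout does turn out to be the relevant knob, a kafka-pixy YAML config excerpt might look like the sketch below. The key path simply mirrors the consumer.subscription_timeout name used above; the exact syntax and default value should be checked against the default config shipped in the repo.

```yaml
consumer:
  # Assumed meaning: how long a group member is kept subscribed after its
  # last consume request before its claimed partitions are released.
  subscription_timeout: 5s
```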
So you have two instances of kafka-pixy running?
That makes sense. We don't have two running, but we have been starting and stopping kafka-pixy on each test run. We found it doesn't terminate, so it's been killed; see #183. (Note that the initial symptoms were encountered with one long-running instance of kafka-pixy, so I'm doubtful this is the whole story.) I'll change the test to leave kafka-pixy running and get back to you. Thanks.
I am 100% confident that there is no problem with … EDIT: …
Thanks for the updates @horkhe. I can't reproduce the problem using the testconsumer and testproducer scripts, and I also can't reproduce it with our test script, our current client code, and a long-lived kafka-pixy process. I'm happy to put our issues down to pilot error. Feel free to close the case, unless @drjuarez has any further concerns. Thank you for your patience.
Hi, thank you for your work on this project. I am trying to run a simple test and have found that the KafkaPixyClient.ConsumeNAck function consistently drops the first message. My code follows the Go getting-started example in the repo, and I'm generating a few test messages using kafkacat.
Reproduce:
1. Create a fresh topic (via auto topic creation) and produce a few test messages with kafkacat while kafka-pixy and the consume loop are not running.
2. Start kafka-pixy and begin consuming with ConsumeNAck.

Results:
The first message is dropped.
NOTE: no dropping happens if the gRPC server is up while the messages are created.
Have any of you run into this issue before? I apologize if it's an error on my part; I look forward to hearing back.
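For context, the consume loop in question looks roughly like the sketch below, modelled on the Go getting-started example. The import path, port, and the ConsNAckRq/ConsRs field names are written from memory of the kafka-pixy gRPC proto and may differ; check gen/golang in the repo before relying on them.

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"

	// Assumed import path for the generated kafka-pixy gRPC bindings.
	pb "github.com/mailgun/kafka-pixy/gen/golang"
)

func main() {
	// 19091 is assumed to be kafka-pixy's default gRPC port.
	conn, err := grpc.Dial("localhost:19091", grpc.WithInsecure())
	if err != nil {
		log.Fatalf("dial kafka-pixy: %v", err)
	}
	defer conn.Close()
	clt := pb.NewKafkaPixyClient(conn)

	for {
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		// AutoAck acknowledges the previously returned message, so a message
		// that was delivered but never processed should be redelivered.
		rs, err := clt.ConsumeNAck(ctx, &pb.ConsNAckRq{
			Topic:   "test-topic",
			Group:   "test-group",
			AutoAck: true,
		})
		cancel()
		if err != nil {
			// Long-poll timeouts (no message available yet) also surface as
			// errors here; just log and retry.
			log.Printf("consume: %v", err)
			continue
		}
		log.Printf("partition=%d offset=%d msg=%s", rs.Partition, rs.Offset, rs.Message)
	}
}
```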