We are seeing data loss. We logged the data our producers sent to Kinesis and compared it with the data that arrived in our sinks. We are 100% sure the data was pushed to Kinesis, but about 1 minute of data never reached the sinks. What could be causing this?
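For anyone trying to reproduce the comparison described above, here is a minimal sketch of how the producer log could be checked against the sink. The record shape (a dict with an `id` and a `ts` field) and the function names are assumptions made for illustration, not something taken from our actual pipeline.

```python
from datetime import datetime


def find_missing_records(produced, consumed_ids):
    """Return produced records whose IDs never showed up in the sink, oldest first.

    `produced` is a list of dicts with hypothetical keys "id" and "ts";
    `consumed_ids` is any iterable of IDs read back from the sink.
    """
    consumed = set(consumed_ids)
    missing = [rec for rec in produced if rec["id"] not in consumed]
    return sorted(missing, key=lambda rec: rec["ts"])


def missing_window(missing):
    """Return (first_ts, last_ts) of the missing records, or None if nothing is missing."""
    if not missing:
        return None
    return missing[0]["ts"], missing[-1]["ts"]


if __name__ == "__main__":
    # Made-up sample data: five records, two of which never reach the sink.
    produced = [
        {"id": "a", "ts": datetime(2021, 1, 1, 12, 0, 0)},
        {"id": "b", "ts": datetime(2021, 1, 1, 12, 0, 30)},
        {"id": "c", "ts": datetime(2021, 1, 1, 12, 1, 0)},
        {"id": "d", "ts": datetime(2021, 1, 1, 12, 1, 30)},
        {"id": "e", "ts": datetime(2021, 1, 1, 12, 2, 0)},
    ]
    sink_ids = ["a", "b", "e"]
    lost = find_missing_records(produced, sink_ids)
    print([rec["id"] for rec in lost], missing_window(lost))
```

Knowing the exact time window of the gap makes it easier to line it up with resharding events or job restarts.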
@roncemer - Any idea why this is happening? My initial guess is Kinesis resharding, so I have added the option .option("kinesis.client.describeShardInterval", "500ms"), but I don't know if this will fix it.
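For context, this is roughly where that option sits in a PySpark streaming read. It is only a sketch: the helper name `kinesis_reader` is made up, and the other option names (`streamName`, `endpointUrl`, `startingposition`) are my assumption based on the connector's README, so check them against the version you run. Whether a shorter interval actually prevents the loss is not confirmed.

```python
def kinesis_reader(spark, stream_name, endpoint_url, describe_shard_interval="500ms"):
    """Build a streaming reader for the Kinesis source (hypothetical helper).

    `spark` is an existing SparkSession. Call .load() on the result to get
    the streaming DataFrame.
    """
    return (
        spark.readStream
        .format("kinesis")
        # Assumed option names; verify against the connector's documentation.
        .option("streamName", stream_name)
        .option("endpointUrl", endpoint_url)
        .option("startingposition", "TRIM_HORIZON")
        # The option from this comment: how often the source checks the
        # shard list, which is how it notices resharding.
        .option("kinesis.client.describeShardInterval", describe_shard_interval)
    )
```

Usage would look like `df = kinesis_reader(spark, "my-stream", "https://kinesis.us-east-1.amazonaws.com").load()`, where the stream name and region are placeholders.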
@success-m I accidentally had issues disabled on my repo. I enabled that feature. If you have a change you'd like to submit, feel free to issue a pull request against https://github.com/roncemer/spark-sql-kinesis and I will merge it and drop a new release as soon as I can get to it.
I am currently not using this project for anything (I switched to using Kinesis Firehose Delivery Streams with AWS Lambda functions, as it's cheaper and doesn't require any explicit checkpointing mechanism), so if you're interested in taking over the project, I would be happy to add you as a maintainer and provide instructions for packaging and publishing updated versions.
PS: I know the active repo is https://github.com/roncemer/spark-sql-kinesis, but I could not post issues there.
Please help