Fix data loss issue by ensuring proper locking and clearing of changes #314
Summary
This pull request fixes a data loss issue by changing when the lock is held. Specifically, it ensures that data retrieval and clearing are performed within the same lock scope, so concurrent modifications can no longer be lost between the two operations.
Changes
Modified lock management to ensure data is cleared within the same lock scope immediately after retrieval.
Previously, the lock was released after retrieving data, allowing other threads to modify the data before it was cleared. This caused data loss.
Problem Description
The original code acquires a lock, retrieves data from changes, releases the lock, and then reacquires the lock to clear the data. This creates a window in which other threads can add new events to changes.
As a result, any events added to changes during this gap are cleared without ever being retrieved, leading to data loss.
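The gap can be illustrated with a minimal, self-contained sketch. This is not the code from src/lib.rs: the `Changes` alias, the `drain_racy` function, and the `between` hook (which stands in for another thread running while the lock is released) are all hypothetical names chosen for illustration.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical stand-in for the watcher's shared set of pending events.
type Changes = Mutex<HashSet<String>>;

/// Racy pattern: copy under one lock scope, clear under a second one.
/// `between` runs while the lock is NOT held and simulates another thread.
fn drain_racy(changes: &Changes, between: impl FnOnce()) -> HashSet<String> {
    // First lock scope: copy the current events, then drop the guard.
    let snapshot = {
        let guard = changes.lock().unwrap();
        guard.clone()
    };

    // Gap: anything inserted here is absent from `snapshot`.
    between();

    // Second lock scope: clears everything, including events added in the gap.
    changes.lock().unwrap().clear();
    snapshot
}
```

If `between` inserts an event into the same set, that event is missing from the returned snapshot and is also gone from the shared set after the call, which is the loss described above.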
Reproduction Steps
Add Logging
Insert logging statements at the following points to visualize the issue:
- Line 178 in src/lib.rs.
- The py_changes assignment at line 331.
- The slf.borrow().clear(); call at line 340.
Logs Observed
When running the application in a test environment, the following logs indicate the problem.
These logs demonstrate that:
1. Data is retrieved from changes (to be passed to Python).
2. A new event is added to changes.
3. changes is cleared.
This order of operations leads to data loss because the new event is cleared without ever being passed to Python.
File Creation Script
Script Using watchfiles to Monitor Changes
Subprocess Script to Run Logging
Script to Extract Specific Log Entries
Fixed Implementation
The fix ensures that data retrieval and clearing are performed under the same lock:
This approach eliminates the data loss issue by ensuring no other thread can modify changes during the operation.
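A minimal sketch of the single-lock pattern, assuming the pending events live in a `Mutex<HashSet<String>>`. The function name `drain_atomic` and the element type are hypothetical and are not taken from the actual patch.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Copy and clear the pending events while holding one lock guard.
/// (Hypothetical helper; the real code works on the watcher's own state.)
fn drain_atomic(changes: &Mutex<HashSet<String>>) -> HashSet<String> {
    // The guard stays alive until the end of the function, so no other
    // thread can insert between the copy and the clear.
    let mut guard = changes.lock().unwrap();
    let snapshot = guard.clone();
    guard.clear();
    snapshot
}
```

Because the guard is held across both steps, an event inserted by another thread lands either before the copy (and is returned) or after the clear (and is picked up by the next call). `std::mem::take(&mut *guard)` would achieve the same effect without the copy.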