Data corruption: zeroes are sometimes written instead of the actual data #5038
Comments
It seems similar to a previous issue; it could be caused by a concurrent chunk commit that leads to file length changes. Would it read normally again after a while? |
No, the file that was corrupted in this way stays the same. I just checked the one that I tested this on. The flow has been:
|
@abbradar This is not a bug. JuiceFS provides close-to-open consistency: once a file is written and closed, the written data is guaranteed to be visible to subsequent opens and reads from any client.
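A minimal sketch of what this guarantee means in practice, assuming both hosts mount the same volume at /mnt/jfs-partition (the file name and payload are hypothetical, and each half would run on a different host):

```python
# Writer host: write and close the file.
with open("/mnt/jfs-partition/example-file", "wb") as f:  # hypothetical path
    f.write(b"payload")
# close() has returned here, so the written data is committed.

# Reader host: an open() issued after the writer's close() is guaranteed to
# see b"payload". A descriptor opened before the close (or held open across
# the write) gets no such guarantee under close-to-open consistency.
with open("/mnt/jfs-partition/example-file", "rb") as f:
    assert f.read() == b"payload"
```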
|
Do you mean that if the object storage itself exhibits this behavior and may leave corrupted zero-filled objects, then this propagates to the FS level? I would have expected the region to be random garbage, or a read error to happen, because we use JuiceFS encryption on top; please correct me if I'm wrong. BTW, does JuiceFS use the
EDIT: Sorry I didn't ask this at first, but just to be clear: do these inconsistencies only involve synchronization (say, if we update different parts of the file, the order may not be preserved, or the file might not be updated at all until closed), or can they lead to the client reading completely invalid data (zeroes or other garbage)? |
@jiefenghuang I just managed to reproduce the other behavior you mentioned. The setup was the same (one writing host and one reading host), but I wrote a small script for the reading side to detect zero-filled regions:

```python
import time
import sys

pat = b"\0"
f = open("/mnt/jfs-partition/test-file", "rb")
pos = 0
while True:
    chunk = f.read(2048)
    if len(chunk) == 0:
        # Reached the current end of file; wait for the writer to append more.
        time.sleep(1)
    elif chunk.count(pat) == len(chunk):
        # The whole 2 KiB chunk consists of zero bytes.
        print(f"Pos {pos}: Zeroes chunk detected! Size: {len(chunk)}", flush=True)
    pos += len(chunk)
```

On the writing side, I ran
And:
Notice that the zeroes begin at the chunk border (21196 + 15668 == 38912) and indeed stretch across the entire chunk. But when I reopened the file later, no zero regions were there! Can you point me to the known issue? |
It was a question asked by a user in a WeCom user group and has not been posted as a GitHub issue.
I've made some modifications; they may resolve your question. |
@jiefenghuang Thanks; I have several questions about these. For (1), with no write failures, does this property mean:
We use JuiceFS for writing append-only binary logs and read them sequentially on another host in real time. We are not concerned about the lack of a timing guarantee, but the possibility of reading invalid data breaks this pattern, so (2) is a problem for us. I would have expected JuiceFS to behave like a checksummed+journaled file system here, i.e., to exclude the possibility of corrupted data, so that the worst case becomes either "reading gives you EIO" or "the write happened partially/did not happen". Can you clarify a bit more how this works internally: what sequence of metadata and object operations can result in reading zero chunks, and what happens later when they get magically fixed? |
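To make the question concrete, here is a toy simulation of the race hypothesized earlier in the thread (a concurrent chunk commit making the new file length visible before the data itself). This is only an illustration under that assumption, not JuiceFS's actual implementation:

```python
# Toy model: if the length recorded in metadata gets ahead of the data that is
# actually readable, a concurrent reader sees the gap as zeroes, and the region
# "heals" once the data arrives.

class ToyFS:
    def __init__(self):
        self.meta_length = 0   # file length as recorded in metadata
        self.blocks = {}       # offset -> bytes actually available

    def commit_length(self, new_length):
        self.meta_length = new_length      # step 1: new length becomes visible

    def store_block(self, offset, data):
        self.blocks[offset] = data         # step 2: data becomes visible

    def read(self, offset, size):
        data = self.blocks.get(offset, b"")
        # anything inside meta_length with no backing data reads as zeroes
        return data.ljust(min(size, self.meta_length - offset), b"\0")

fs = ToyFS()
fs.commit_length(4096)             # writer's length update lands first
print(fs.read(0, 4096)[:8])        # reader sees only zero bytes for the region
fs.store_block(0, b"real data")    # data lands later
print(fs.read(0, 4096)[:9])        # now the same read returns b'real data'
```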
We could improve this case in the next 1.3 release. |
Thanks! We have lots of use cases for "one host appends, another reads", so we are motivated to help with this. |
What happened:
When writing a file on one host, a zero-filled region is sometimes written instead of the actual data.
What you expected to happen:
No data corruption.
How to reproduce it (as minimally and precisely as possible):
On one host:

```sh
pv -L 3k /dev/urandom > /mnt/jfs-partition/test-file
```

On another host:

```sh
while ! hexdump -C /mnt/jfs-partition/test-file | grep '00 00 00 00'; do sleep 1; done
```
After some time, the command on the second host will find a large cluster of zeroes (>1k) and stop. In the file, you see, for instance:
Anything else we need to know?
More details:
Unfortunately, we have many irrelevant logs on the writing server because it is a production host. These may be relevant (the file inode is 2439005):
What else do we plan to try:
Environment:
- JuiceFS version (`juicefs --version`) or Hadoop Java SDK version: juicefs version 1.2.0+2024-06-18.873c47b922ba (both hosts)
- OS (`cat /etc/os-release`): NixOS 24.11 (Vicuna) (both hosts)
- Kernel (`uname -a`): 6.1.90 (both hosts)