An workaround for conflict resolve failure #1087
base: release/hydrogen
Unfortunately this breaks some existing functionality (see the PR validation result). This area is very hard to touch without precise knowledge of what is going on. However, you noted that you are unable to reproduce the issue, so how are you testing this fix?
Yeah, you are right. We don't have precise knowledge of the details and the code, so we would like some help from the community to get this fixed properly in the end. Our situation is simple: we use only a limited set of the features supplied by Couchbase & Lite, so we don't need to consider every detail like you do. We are going to run the patched version on some devices for a relatively long time. We hope it will not ruin the "good data" 😄, and if we then see fewer conflict problems, that is good enough for us.
@borrrden Could we talk about this? I don't understand the test case that failed: core/Replicator/tests/ReplicatorLoopbackTest.cc, line 1257.
// For https://github.com/couchbase/sync_gateway/issues/3359
C4Slice docID = C4STR("Khan");
{
TransactionHelper t(db);
createRev(db, docID, C4STR("1-11111111"), kFleeceBody);
createConflictingRev(db, docID, C4STR("1-11111111"), C4STR("2-22222222"));
createConflictingRev(db, docID, C4STR("1-11111111"), C4STR("2-ffffffff"));
createConflictingRev(db, docID, C4STR("2-22222222"), C4STR("3-33333333"));
}
_expectedDocumentCount = 1;
runPullReplication();
c4::ref<C4Document> doc = c4doc_get(db2, docID, true, nullptr);
REQUIRE(doc);
C4Slice revID = C4STR("3-33333333");
CHECK(doc->selectedRev.revID == revID);
CHECK((doc->flags & kDocConflicted) == 0); // locally in db there is no conflict
After the transaction, the rev tree should look like this (and the loopback replication should not change the rev tree except remote tags):
It diverged at the second generation, and
And here is another confusion. If we call
Thank you
Your analysis is actually very on point, I didn't think most outside people would read so far into it. However there is one small gotcha that you missed. If I were to pull the document out of
The conflict was not inserted into
This is a very delicate situation that actually I was not involved in writing the solution for, so there are a lot of checks that happen here to ensure that this is really what happened (if you are curious you can see them inside of
By the way, where should I file new issues if they are about Android rather than Core?
Oh! I forgot about that and you caught me. ce-root is the correct repo for filing issues!
As I keep debugging, I wonder why C4DocPutRequest has an "allowConflict" field.
It has nothing to do with tagging or not tagging. That flag is simply whether or not to outright refuse to insert (
Yeah, you are right, I made a mistake. When replication runs, the conflict tag is important. We need it to tell two things:
So if the conflict tags break down, conflict resolution will not work, and our data eventually ends up in an inconsistent state. From my perspective, for conflict resolution to work, a leaf revision should be either CURRENT or CONFLICT. But there's an argument
I actually am unsure about that part of the code. I don't often get the chance to look at it. I wonder if @snej could answer.
Hey @workingenius, I'm impressed that you've dug so deeply into this code, some of which (RevTree) is very old and has a lot of history :) and technical debt associated with it. Let me try to explain. Conveniently, writing this will also help me remember some details that I haven't thought about in a while...

Your confusion about markConflict/isConflict has to do with design changes we made late in the 2.0 dev cycle that made conflict handling work differently than in 1.x or in CouchDB. Previously, the pull side of replication copied the entire remote rev-tree of a document; the goal was that every replica of a database has the same exact rev-tree.

In CBL 2.x we don't do that. Instead we only pull the active/current branch from SG. CBL also only allows one local branch -- the public API (CBL, not LiteCore) doesn't allow you to create local/local conflicts. So a conflict always exists only between the one local branch and the one remote branch. (There can be multiple remote branches if you replicate with multiple servers, but let's ignore that.)

Rather than rewrite RevTree according to the new way, we made smaller changes to support what we needed. (The rewrite is happening in 3.0; see the new version-vector code on the dev branch. This mode abandons rev-trees entirely.)

The failing test case "Server Conflict Branch-Switch" is for handling a condition that can happen on Sync Gateway. On SG, when a doc is in conflict between leaf revisions A and B, the conflict can be resolved by adding a child to either A or B, making that new rev current, and adding a tombstone to the other. That means that at one point the doc's current revision can be A, and then later B' (the child of B), where B' is not a descendant of A. When the CBL replicator pulls B' it needs to detect this and edit the rev-tree so that the conflicting branch now has leaf B', not A, because that's the current server revision that the client needs to resolve with.

(to be continued)
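The branch-switch condition described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyRevTree`, `descendsFrom`, and `isBranchSwitch` are hypothetical names invented for this example, not part of LiteCore's RevTree or replicator code, and the real checks are more involved. The idea is simply that the newly pulled revision B' counts as a branch switch when walking up its parent chain never reaches the remote leaf A that the client was tracking.

```cpp
#include <map>
#include <string>

// Toy rev tree: maps each revision ID to its parent's ID ("" for the root).
// Illustrative model only; this is not LiteCore's RevTree class.
using ToyRevTree = std::map<std::string, std::string>;

// Returns true if `ancestor` is `rev` itself or appears on rev's parent chain.
bool descendsFrom(const ToyRevTree& tree, std::string rev,
                  const std::string& ancestor) {
    while (!rev.empty()) {
        if (rev == ancestor) return true;
        auto it = tree.find(rev);
        if (it == tree.end()) return false;  // unknown revision: stop walking
        rev = it->second;                    // step up to the parent
    }
    return false;                            // walked past the root
}

// Hypothetical helper: true when the pulled revision does not extend the
// remote leaf the client was previously tracking, i.e. the server has
// moved its current revision to a different branch.
bool isBranchSwitch(const ToyRevTree& tree,
                    const std::string& previousRemoteLeaf,
                    const std::string& pulledRev) {
    return !descendsFrom(tree, pulledRev, previousRemoteLeaf);
}
```

With a tree in which `2-A` and `2-B` are both children of `1-root` and `3-Bprime` is a child of `2-B`, `isBranchSwitch(tree, "2-A", "3-Bprime")` is true, while `isBranchSwitch(tree, "2-B", "3-Bprime")` is false because B' simply extends B.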
Anyway! So after all this explanation and diagrams, I'm looking at the issue #1084 you reported. I agree that it's wrong for the
Issue here: https://github.com/couchbase/couchbase-lite-core/issues/1084
As we don't know the real cause of the problem, we came up with a workaround.
Every time a rev tree is saved, check whether the conflict tags are in a consistent state, and fix them if they are not.
It may not be the best solution, so please check it. Hope it helps.
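To make the idea of a check-and-repair pass concrete, here is a minimal sketch on a toy model. It is not the code in this PR and does not use LiteCore's types: `ToyRev`, `findLeaves`, and `repairConflictTags` are hypothetical names, and the invariant it enforces (the current leaf carries no conflict tag, every other leaf does) is the simplified rule proposed in the conversation, which is an assumption here. It ignores cases such as tombstoned leaves, so treat it as an illustration of the approach only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy revision record; a stand-in for illustration, not LiteCore's Rev.
struct ToyRev {
    std::string revID;
    int parent;       // index of the parent in the vector, or -1 for the root
    bool isConflict;  // stand-in for the conflict tag discussed in the thread
};

// For each revision, reports whether it is a leaf (no revision has it as parent).
std::vector<bool> findLeaves(const std::vector<ToyRev>& revs) {
    std::vector<bool> isLeaf(revs.size(), true);
    for (const ToyRev& r : revs)
        if (r.parent >= 0) isLeaf[static_cast<std::size_t>(r.parent)] = false;
    return isLeaf;
}

// Enforces the assumed invariant: the current leaf is untagged and every
// other leaf is tagged as a conflict. Returns the number of tags changed,
// so 0 means the tree was already consistent under this rule.
std::size_t repairConflictTags(std::vector<ToyRev>& revs,
                               std::size_t currentLeaf) {
    const std::vector<bool> isLeaf = findLeaves(revs);
    std::size_t changed = 0;
    for (std::size_t i = 0; i < revs.size(); ++i) {
        if (!isLeaf[i]) continue;  // interior revisions are left alone
        const bool shouldBeConflict = (i != currentLeaf);
        if (revs[i].isConflict != shouldBeConflict) {
            revs[i].isConflict = shouldBeConflict;
            ++changed;
        }
    }
    return changed;
}
```

A caller would run `repairConflictTags` just before persisting the tree and could log a nonzero return value, which would help show how often the tags drift in practice.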