What does the data format look like? #36
Comments
Hello! This is a sample repository. There is only one leaf. Commits and merges are made under special constraints. Please try a quick example: https://github.com/sosuisen/git-documentdb/tree/main/examples The synchronization example is "sync". Many comments and the process flow for this example are here. The theory is described below: The basic idea is simple, but many practical problems make it difficult.
Thanks for all the links. I've read the blog post and I generally understand the concept. I also looked over the example sync file. But I'm still not clear on what the
For additional context, by now I have built several data storage projects on top of git. I'm quite familiar with CRDTs and the theory behind all this. I'm really just curious about how you have solved the specific issues between git commits, merges, and so on. For example, one approach would be to simply store CRDTs in git, each in a separate file, named with a UUID, and then parse them all every time. This should guarantee no merge conflicts. But I imagine there are better strategies than this, and I assume you're doing something more elegant than this. :-)
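The "one file per UUID" approach mentioned above can be sketched as follows. This is a minimal in-memory illustration, not anything taken from git-documentdb: the `Tree` type stands in for a git working tree, and `newId`, `addRecord`, `mergeTrees`, and `loadAll` are hypothetical helper names.

```typescript
// A working tree modeled as path -> file content (hypothetical stand-in for a git checkout).
type Tree = { [path: string]: string };

// Stand-in for a real UUID generator; a real project would use a proper UUID library.
function newId(): string {
  let id = "";
  for (let i = 0; i < 32; i++) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return id;
}

// Each write creates a brand-new file, so two replicas never touch the same path.
function addRecord(tree: Tree, record: object): string {
  const path = newId() + ".json";
  tree[path] = JSON.stringify(record);
  return path;
}

// Merging is a union of files. A conflict is only possible if both sides
// hold the same path with different content, which unique names avoid.
function mergeTrees(ours: Tree, theirs: Tree): Tree {
  const merged: Tree = { ...ours };
  for (const path of Object.keys(theirs)) {
    if (path in merged && merged[path] !== theirs[path]) {
      throw new Error("conflict at " + path);
    }
    merged[path] = theirs[path];
  }
  return merged;
}

// "Parse them all every time": rebuild application state from every file.
function loadAll(tree: Tree): object[] {
  return Object.keys(tree).sort().map((path) => JSON.parse(tree[path]));
}

// Two replicas write independently, then merge without conflicts.
const laptop: Tree = {};
const phone: Tree = {};
addRecord(laptop, { op: "add", item: "pen" });
addRecord(phone, { op: "add", item: "ink" });
console.log(loadAll(mergeTrees(laptop, phone)).length);
```

The cost of this design is visible in `loadAll`: every read has to parse every file, which is the inefficiency the comment above alludes to.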
They are data for an app called inventory-manager.
New documents are added to
The documents under itemCollection are typed by using TypeScript.
The git-documentdb does not have a schema, but the documents under the
I hope this is helpful.
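As a rough illustration of what a typed document under itemCollection might look like on disk, here is a small sketch. Everything in it is an assumption for illustration: the `Item` fields, the `toFile` helper, the `<collection>/<_id>.json` layout, and the two-space JSON formatting are hypothetical and are not confirmed by this thread.

```typescript
// Hypothetical document type; the field names are illustrative only.
interface Item {
  _id: string;
  name: string;
  stock: number;
}

// Assumed layout: one JSON file per document at <collection>/<_id>.json.
function toFile(collection: string, doc: Item): { path: string; content: string } {
  return {
    path: collection + "/" + doc._id + ".json",
    content: JSON.stringify(doc, null, 2),
  };
}

const file = toFile("itemCollection", { _id: "item01", name: "pen", stock: 3 });
console.log(file.path); // itemCollection/item01.json
console.log(file.content);
```

The TypeScript interface only constrains the application code; nothing in the sketch enforces a schema on the files themselves.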
Thanks for taking the time to answer my questions, I appreciate it. Unfortunately, I'm still not getting it. I see your examples. They seem to be examples of how to use the API. My goal is to understand how the API results in changes on disk. Have you written a custom git merge, so that merges are always successful? I saw that there's a "last write wins" approach. But I'm not clear how these API calls result in git commits, and how those commits get merged. For example, could I break the document db by pulling it in a terminal, editing some files, committing and pushing? I see the
Maybe I've totally misunderstood this project. I thought it was an implementation of CRDT-like functionality on top of git. So I imagined that I could create a collection, push to it from multiple machines, and then use a standard git merge, be guaranteed there were no conflicts, and have my data updated. But maybe this is incorrect?
Indeed, there might be some misunderstandings. This project is a wrapper for Git, but it imposes specific restrictions to achieve CRDT behavior.
First, CRDT behavior cannot be achieved by simply using Git freely; certain rules are necessary. In this project, I describe the rules that humans typically use to assess and operate Git synchronization, with the goal of achieving CRDT behavior. The project automates Git by implementing these rules, including conflict checking. If a conflict arises, it is resolved automatically. All possible scenarios in Git synchronization are documented as rules. You can find rules 1 to 4 here: How to Use Git as an Offline-First Database.
Second, this project aims to use Git through the API of a document database. Therefore, only JSON or Markdown with Front Matter can be stored in the Git repository. Members within a JSON document are evaluated for merging individually. The 3-way merge of JSON arrays or objects is implemented using ot-json1 (This will probably confuse you even more..). For lengthy texts, a character-level merge strategy is employed, generating a single text that combines both versions. In the event of conflicts, the default strategy is "last write wins."
Finally, git-documentdb is an API. Therefore, it must be incorporated into your program. The
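To make the idea of merging JSON members individually more concrete, here is a simplified sketch of a member-wise three-way merge with a configurable winner for true conflicts. It is not git-documentdb's implementation and does not use ot-json1; `mergeThreeWay` and its helpers are hypothetical names, arrays and long texts are treated as opaque values, and "last write wins" is approximated by choosing one side explicitly.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isObject(v: Json | undefined): v is JsonObject {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Structural equality via serialization; sensitive to key order, good enough for a sketch.
function equal(a: Json | undefined, b: Json | undefined): boolean {
  return JSON.stringify(a) === JSON.stringify(b);
}

// Merge each member separately, using the common ancestor (base) to see who changed what.
function mergeThreeWay(
  base: JsonObject,
  ours: JsonObject,
  theirs: JsonObject,
  winner: "ours" | "theirs"
): JsonObject {
  const result: JsonObject = {};
  for (const key of Object.keys({ ...base, ...ours, ...theirs })) {
    const b: Json | undefined = base[key];
    const o: Json | undefined = ours[key];
    const t: Json | undefined = theirs[key];
    let merged: Json | undefined;
    if (equal(o, t)) {
      merged = o; // both sides agree
    } else if (equal(b, o)) {
      merged = t; // only theirs changed (or deleted) this member
    } else if (equal(b, t)) {
      merged = o; // only ours changed (or deleted) this member
    } else if (isObject(o) && isObject(t)) {
      merged = mergeThreeWay(isObject(b) ? b : {}, o, t, winner); // recurse into nested objects
    } else {
      merged = winner === "ours" ? o : t; // true conflict: the configured side wins
    }
    if (merged !== undefined) result[key] = merged;
  }
  return result;
}

const base = { name: "pen", qty: 1, note: "x" };
const ours = { name: "pen", qty: 2, note: "x" };
const theirs = { name: "Pen", qty: 3 };
console.log(JSON.stringify(mergeThreeWay(base, ours, theirs, "theirs")));
// {"name":"Pen","qty":3}
```

In this example only `qty` is a true conflict, since both sides changed it; `name` and the deleted `note` were changed on one side only, so they merge cleanly regardless of the winner.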
Okay, now I understand better, thanks. This package must handle all the merging so that it can apply its own merge logic. I see. The reference to ot-json1 is helpful. I guess your API creates ot-json1 operations, which in turn get applied to whatever is in the JSON that's stored in git. So the real history of what has changed is also visible inside git. It's an interesting approach. When I designed my systems, I assumed that handling merges would be too complex, but perhaps this was a mistake. I will contemplate this further. Thanks again for taking the time to explain the code. I'm really fond of storing my personal data inside git; I find it to be a great tool for transparency, easily self-hosted, and so on. I'll consider whether moving some of my applications onto this package makes sense. Or maybe building new ones with it.
I also love managing personal data with Git.
Just discovered this by searching for "git crdt". I've been dreaming about this kind of git-based approach to data storage for some time now, so it's great to discover that somebody has already solved many of the challenges!
What does the data look like inside the git repo? Is there any kind of example repo that shows the data somewhere?