Initialize collab document and undo as early as possible #151

msurdi · 2022-05-25T08:26:01Z

fixes #136

Description

This pull request changes the yjs document and undo manager setup so that it happens as soon as the feature is enabled, instead of once the the connection is ready.

Doing it this way, allows for building up the history and syncing the yjs document even if the connection cannot be stablished, or if the connection happens at a later time due to a poor internet connection.

Testing

First, Start by reproducing the problem (by checking out trunk, instead of this branch)

Access any sandboxed instance of the editor (such as a P2 post).
On the browser's dev tools, enable logging with: localStorage.setItem("debug", "iso-editor:*")
Make sure you have "verbose logs" active, otherwise you won't see these logs:.
On the browser's dev tools, set network throttling to "Slow 3G"
Reload the page, and as soon as the editor starts accepting input, quickly type a few characters, and hit CMD+z for undoing. Keep doing this repeatedly until the editor is fully loaded, and the WebSocket connections are up.

At this point you will notice the problem in the debug logs, the first few "undos" before the connection was up, will read as Undoing from redux-undo state and at some point, after the WebSocket connection is stablished, the messages will become Undoing from yjs undoManager.

This means that any redo buffers (up to now, stored in redux-undo) we might have by the time the connection is ready will become inaccessible due to the undo system change, which I think is at least part of the general data loss problem we're investigating (Imagine a user that due to network/proxy issues cannot connect to the WebSocket, and due to some system or network change suddenly connects 20m later).

Finally, check out this branch, and repeat the testing process. There should be no single instance of Undoing from redux-undo state logs, because even before the WebSocket connection is active, we're already updating the yjs document and tracking do/undo events with yjs' own undo system.

johngodley · 2022-05-25T09:13:34Z

I've tried this out in the storybook (I was able to see the problem there too) and it seems to break undo there, with this console error:

Uncaught (in promise) wrong state

I'm not sure if the storybook needs changing in some way.

The error aside, I think this only partially solves #136, specifically the delay between the editor being available and the Yjs undo manager being active.

I don't think it helps the situation where collab is not enabled (or included) at the beginning of the editing session.

My own preference is that we always use redux-undo until collab is actively being used (i.e. we have a peer). However, I understand this is more complicated as it would mean copying redux-undo history into Yjs. I'm not sure how practical this is, but doing it would also solve the situation of collab not being enabled from the beginning.

This does lead me to wonder if it is possible to use the Yjs undo manager directly (i.e. without importing all of Yjs), and ditch redux-undo? Having two undo managers with slightly different behaviour doesn't seem ideal, and as Yjs is the one shouting the loudest then maybe it's easier to go with that. We're only using redux-undo from convenience (it exists and works) rather than necessity.

Just to clarify my original bug report, I don't think the change in undo manager itself causes data loss in the sense that content is immediately lost from the editor - the content will be the same. However, content is lost from the undo history, and it's possible this could fall on an unfortunate time when the user has done something terrible to their content and wants to undo, but can't. As such I think this is more responsible for the strange undo behaviour we've seen rather than old content being published.

msurdi · 2022-05-25T12:06:51Z

I've tried this out in the storybook (I was able to see the problem there too) and it seems to break undo there, with this console error:

I understand this is happening because, after these changes, we will also need to call startSharing earlier (as soon as yjs is initialized).

My own preference is that we always use redux-undo until collab is actively being used (i.e. we have a peer). However, I understand this is more complicated as it would mean copying redux-undo history into Yjs. I'm not sure how practical this is, but doing it would also solve the situation of collab not being enabled from the beginning.

I wonder if the complexity of getting this right and the maintenance cost of dealing with two undo systems and making them play nice between them is worth the benefits here. Personally, I don't think so, or at least not right now given how complex this is showing to be. But that's just my opinion, as a noob to this codebase.

This does lead me to wonder if it is possible to use the Yjs undo manager directly (i.e. without importing all of Yjs), and ditch redux-undo? Having two undo managers with slightly different behaviour doesn't seem ideal, and as Yjs is the one shouting the loudest then maybe it's easier to go with that. We're only using redux-undo from convenience (it exists and works) rather than necessity.

100% agree, and I think this is also what @mirka suggested. Even if we can't load "just" the undo part and still have to load the full yjs setup, I'd say that's an acceptable trade off if that simplifies the codebase and solves the data loss problems. We can always look for optimizing it later, right?

Just to clarify my original bug report, I don't think the change in undo manager itself causes data loss in the sense that content is immediately lost from the editor - the content will be the same. However, content is lost from the undo history, and it's possible this could fall on an unfortunate time when the user has done something terrible to their content and wants to undo, but can't. As such I think this is more responsible for the strange undo behaviour we've seen rather than old content being published.

Yes, I understood this and this is what I meant with "data loss" in this PR, the fact that things you had in your undo/redo history are gone, even if not shown on screen, that was certainly some "data" you knew it was there and suddenly it is not there anymore.

So, in order to move forward... what do you think about:

Going all-in on Yjs, loading it from the get go and use it even if no one else is connected.
Removing redux-undo
If the data loss problems are solved, then we look for optimizations such as only loading the minimal requirements for the undo system until there are active peers or write an abstraction on top of it which makes the switch easier, etc?

Or do you folks have any other recommended action plan? I'm all 👂👂

Thanks for your help!

johngodley · 2022-05-25T12:35:26Z

Going all-in on Yjs, loading it from the get go and use it even if no one else is connected.

It's worth bearing in mind that there are other users of this repo apart from P2. Currently Yjs is tree-shaken out of any client that doesn't use it. I don't know if this will still apply if we switch to Yjs for the undo manager - there could be a bundle cost.

Doing it in a branch and using that for P2 may be a good way to test the theory.

mirka · 2022-05-25T20:48:01Z

My own preference is that we always use redux-undo until collab is actively being used (i.e. we have a peer). However, I understand this is more complicated as it would mean copying redux-undo history into Yjs. I'm not sure how practical this is, but doing it would also solve the situation of collab not being enabled from the beginning.

I'm also hoping this is possible! I think it is, although I'm not sure if the performance will be acceptable.

As Matias says, it's a lot easier if we could just keep Yjs running for everybody, but it's important to consider that:

a lot of users won't need collab at all
even if they do want collab sometimes, the majority of times they won't be using it

So it would be ideal if we didn't have to penalize the majority use case with an unused 24kb Yjs bundle. The current tree-shaking addresses 1 but not 2. It would be cool if we could optimize the 2 case at some point. But since we do already penalize everyone in the 2 case, I think it's fine to circumvent redux-undo entirely there.

Maybe we could ask Tag1 about this in a consulting session. I'll add it to the list of topics.

I understand this is happening because, after these changes, we will also need to call startSharing earlier (as soon as yjs is initialized).

The current implementation doesn't initialize the Yjs doc content until the connection is established. This logic was inherited from AsBlocks, but I guess this is to ensure that only the first user to join the channel gets to initialize the Yjs doc with their own editor content. It does kind of presume that the connection will be established immediately, and that seems counter to what Yjs is capable of (merging in spite of unreliable networks). So it might be ok to remove this safeguard. In any case we'll want to do some additional testing around this behavior if we end up reworking the logic.

msurdi · 2022-05-26T10:46:50Z

The current implementation doesn't initialize the Yjs doc content until the connection is established. This logic was inherited from AsBlocks, but I guess this is to ensure that only the first user to join the channel gets to initialize the Yjs doc with their own editor content. It does kind of presume that the connection will be established immediately, and that seems counter to what Yjs is capable of (merging in spite of unreliable networks). So it might be ok to remove this safeguard. In any case we'll want to do some additional testing around this behavior if we end up reworking the logic.

The more I'm looking into this, the more I think this change I'm proposing will make things worse . Because we'd have different peers competing for "initializing" the Yjs document, and nothing will prevent each peer ending up with its own independent doc, right?

I've been reading through Yjs docs, and some discussions on their forums (like this, this, this, and in particular from this comment it looks like there is no easy way out of the problem of the persistence and initial loading of the content given we're not storing the Yjs document to be used as the source of truth when a new peer joins the session (so this is why we rely on the first connected peer, right?).

I was wondering if having a custom block type which would persist the Yjs document and generate the actual blocks "on the fly" would work, but right now I'm too far to validate this by myself, and seems like a huge departure of what we have.

I'll close this PR as I don't think it will lead us to a better place right now. I will try to look at this again from different angles with all the new knowledge and understanding I'm getting from the code, your comments, and what I'm reading around.

Thanks a lot for your help 🙇

msurdi · 2022-05-27T08:26:57Z

For future reference, here is another comment from the Yjs main developer recommending how to load and store collaborative documents on the backend.

The more I learn about this the more I lean towards thinking we do need to keep around the full YDoc data structure to be able to consistently load, save and do/undo without data loss. But I'm on my first baby steps on this still, so don't take me too seriously.

Initialize collab document and undo as early as possible

08c4601

msurdi requested review from mirka and johngodley May 25, 2022 08:26

msurdi closed this May 26, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initialize collab document and undo as early as possible #151

Initialize collab document and undo as early as possible #151

msurdi commented May 25, 2022

johngodley commented May 25, 2022

msurdi commented May 25, 2022

johngodley commented May 25, 2022

mirka commented May 25, 2022

msurdi commented May 26, 2022

msurdi commented May 27, 2022

Initialize collab document and undo as early as possible #151

Initialize collab document and undo as early as possible #151

Conversation

msurdi commented May 25, 2022

Description

Testing

johngodley commented May 25, 2022

msurdi commented May 25, 2022

johngodley commented May 25, 2022

mirka commented May 25, 2022

msurdi commented May 26, 2022

msurdi commented May 27, 2022