As discussed with @snomos and @Trondtr today, here are some optimizations that my version has, and that are needed:
1) Ability to only check the selected paragraphs
This is important if you're working on a particular part of a long document, and don't want to see markings for other parts.
Technically, it looks at the current selection (or cursor if there is no selection) and expands that range to encompass all paragraphs touched by it. Paragraphs, because you need context to perform corrections, but there is no way to determine sentence boundaries without a full analysis anyway, so paragraph is the smallest usable chunk.
Outlook: Can't access selection - have to get whole document.
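A minimal sketch of the expansion step in 1), assuming a simplified model where the document is one plain string with paragraphs separated by `\n` and the selection is a pair of caret offsets. `TextSpan`, `paragraphSpans` and `expandToParagraphs` are hypothetical names for illustration only; a real add-in would go through the host's own selection and paragraph APIs.

```typescript
// Hypothetical model: caret offsets into a plain-text document.
// A cursor without a selection has start === end.
interface TextSpan {
  start: number;
  end: number;
}

// Split the text on "\n" and record where each paragraph starts and ends.
function paragraphSpans(text: string): TextSpan[] {
  const spans: TextSpan[] = [];
  let start = 0;
  for (const para of text.split("\n")) {
    spans.push({ start: start, end: start + para.length });
    start += para.length + 1; // skip the "\n" separator
  }
  return spans;
}

// Return every paragraph touched by the selection (or containing the cursor).
function expandToParagraphs(text: string, sel: TextSpan): TextSpan[] {
  return paragraphSpans(text).filter(
    (p) => p.start <= sel.end && sel.start <= p.end
  );
}
```

With the text `"abcd\nefgh\nijkl"`, a selection from offset 3 to 8 expands to the first two paragraphs, and a bare cursor at offset 11 expands to the third one only.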
2) Don't recheck checked paragraphs - cache them
If a paragraph has been checked and there are no changes to it, there is no reason to send it to the backend again.
Technically, every paragraph is stored alongside the backend result for that paragraph, keyed on a hash of the paragraph. Then, when preparing the payload of what to send to the backend, if the hash is in the cache, skip appending this paragraph to the payload.
When parsing the result from the backend, fill in missing paragraphs from the cache. It's important they are still parsed as-if they were from the backend, because the user still wants to see the markings.
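A rough sketch of the caching in 2), under stated assumptions: the `Marking` shape, the djb2-style hash and all function names here are made up for illustration, and the backend is assumed to return one list of markings per paragraph sent, in order. The paragraph text is stored next to the result, so a hash collision falls back to a cache miss instead of showing the wrong markings.

```typescript
// Hypothetical shape of one marking; the real backend format is not shown here.
interface Marking {
  start: number;
  end: number;
  message: string;
}

interface CacheEntry {
  text: string;
  markings: Marking[];
}

// Paragraph cache, keyed on a hash of the paragraph text.
const paragraphCache: Record<string, CacheEntry | undefined> = {};

// djb2-style 32-bit string hash, used only as a stand-in hash function.
function hashParagraph(text: string): string {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h << 5) + h + text.charCodeAt(i)) | 0;
  }
  return (h >>> 0).toString(16);
}

// Cached markings for a paragraph, or undefined if it was never checked.
function lookup(text: string): Marking[] | undefined {
  const entry = paragraphCache[hashParagraph(text)];
  return entry !== undefined && entry.text === text ? entry.markings : undefined;
}

// Only paragraphs that are not in the cache go into the payload.
function buildPayload(paragraphs: string[]): string[] {
  return paragraphs.filter((p) => lookup(p) === undefined);
}

// Store the fresh results, then read every paragraph back through the cache,
// so cached and fresh paragraphs are handled by the same code path.
function mergeResults(
  paragraphs: string[],
  payload: string[],
  fresh: Marking[][]
): Marking[][] {
  payload.forEach((text, i) => {
    paragraphCache[hashParagraph(text)] = { text: text, markings: fresh[i] };
  });
  return paragraphs.map((p) => lookup(p) || []);
}
```

After one round, editing a single paragraph means `buildPayload` returns just that paragraph, while `mergeResults` still yields markings for all of them.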
3) Asynchronous progress in chunks of 1 KiB
In order to get near-instantaneous results and let the user take action as soon as possible, break up the payload to the backend into max 1 KiB chunks (or larger if it's a fast backend). This also avoids many timeout issues.
Technically, well, really just as said. Cycle goes sendTexts -> parseResult -> sendTexts -> parseResult ... until there are no more paragraphs to be sent. Notice this plays well with caching.
We show a progress bar underneath the current markings, so people can see more is coming. We use 4 KiB chunks for GrammarSoft backends, but I've found 1 KiB is max for the Greenlandic backend to feel responsive, and I bet that holds for the other FST-based backends.
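One way the chunked cycle could look, as a sketch only. `sendTexts` and `parseResult` are the names used above, but their signatures here are assumptions, as are `utf8Length`, `chunkParagraphs`, `checkInChunks` and the `onProgress` callback that would drive the progress bar. Paragraphs are never split, so a single paragraph larger than the limit is sent as its own oversized chunk.

```typescript
// UTF-8 byte length, computed by hand so the sketch needs no runtime APIs.
function utf8Length(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const c = text.charCodeAt(i);
    if (c < 0x80) bytes += 1;
    else if (c < 0x800) bytes += 2;
    else if (c >= 0xd800 && c <= 0xdbff) {
      bytes += 4; // surrogate pair: one 4-byte code point
      i++;
    } else bytes += 3;
  }
  return bytes;
}

// Greedily pack whole paragraphs into chunks of at most maxBytes.
function chunkParagraphs(paragraphs: string[], maxBytes: number = 1024): string[][] {
  const chunks: string[][] = [];
  let current: string[] = [];
  let size = 0;
  for (const p of paragraphs) {
    const len = utf8Length(p);
    if (current.length > 0 && size + len > maxBytes) {
      chunks.push(current);
      current = [];
      size = 0;
    }
    current.push(p);
    size += len;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Assumed signatures: sendTexts hands the raw backend reply to a callback.
type SendTexts = (chunk: string[], done: (raw: string) => void) => void;
type ParseResult = (chunk: string[], raw: string) => void;

// sendTexts -> parseResult -> sendTexts -> ... until no chunks are left.
function checkInChunks(
  chunks: string[][],
  sendTexts: SendTexts,
  parseResult: ParseResult,
  onProgress: (done: number, total: number) => void
): void {
  let index = 0;
  const next = (): void => {
    if (index >= chunks.length) return;
    const chunk = chunks[index];
    sendTexts(chunk, (raw: string) => {
      parseResult(chunk, raw); // markings for this chunk can be shown right away
      index++;
      onProgress(index, chunks.length);
      next();
    });
  };
  next();
}
```

Passing `4096` as `maxBytes` would give the 4 KiB chunks mentioned for the faster backends. Feeding `chunkParagraphs` only the paragraphs that missed the cache is what makes this combine well with 2).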
Other notes while I remember
Showing all markings in the sidebar at once is maybe bad. I can imagine that doesn't scale when working with large documents.
No way to add a word to a user dictionary. Though, you don't currently have a user system at all, so fair 'nuff. But that is something Greenlandic users have asked for - even for company-wide dictionaries.
The vast majority of UI and code can be shared across add-ins, and even loaded from the same HTTPS source, so that most changes can be done without needing to go via Google or Microsoft's approval.