Refactor sync #3312
base: main
Very high-level look, but this already looks very promising!
Currently, Pretranslation is tightly integrated with sync in order to minimize time between exposing new strings for localization and pretranslating them. What's your plan with this?
pontoon/sync/sync_project.py
Outdated
    paths = get_paths(project, checkouts)
except Exception as e:
    log.error(f"{log_prefix} {e}")
    project_sync_log.skip()
Nit: when we refactor the sync log, we should distinguish between skip() (no changes needed during sync) and fail() (sync failed due to an error).
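To illustrate the distinction, here is a minimal sketch. It is not Pontoon's actual sync log model: `SyncStatus`, `ProjectSyncLog`, its `fail()` and `done()` methods, and `run_sync` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class SyncStatus(Enum):
    PENDING = "pending"
    SYNCED = "synced"
    SKIPPED = "skipped"  # sync ran, but there was nothing to change
    FAILED = "failed"    # sync was interrupted by an error


@dataclass
class ProjectSyncLog:
    """Hypothetical stand-in for a per-project sync log entry."""

    status: SyncStatus = SyncStatus.PENDING
    error: Optional[str] = None

    def skip(self) -> None:
        self.status = SyncStatus.SKIPPED

    def fail(self, error: Exception) -> None:
        self.status = SyncStatus.FAILED
        self.error = str(error)

    def done(self) -> None:
        self.status = SyncStatus.SYNCED


def run_sync(get_paths: Callable[[], list[str]]) -> ProjectSyncLog:
    """Record an error as a failure and an empty result as a skip."""
    log = ProjectSyncLog()
    try:
        paths = get_paths()
    except Exception as e:
        log.fail(e)  # an error is not the same as "nothing to do"
        return log
    if paths:
        log.done()
    else:
        log.skip()
    return log
```

With this split, a dashboard could count skipped and failed syncs separately instead of reporting both as skips.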
pontoon/sync/checkouts.py
Outdated
if db_repo.last_synced_revisions is None:
    self.prev_commit = None
else:
    pc = db_repo.last_synced_revisions.get("single_locale", None)
We should drop the single_locale bit if we're resolving #3303.
Agreed. But let's do it as a follow-up; I would much prefer being able to do this refactor without any db changes, so that it's easier to roll back if necessary.
    removed_source_paths: set[str],
    now: datetime,
) -> None:
    lc_readonly = set(
Nit: There are quite a few prefixes and acronyms used in this patch (lc_, mod_, tx, etc.) that are not immediately understandable like some of the more established ones (pk, db, etc.).
Fair point. lc is short for "locale code", mod for "modified", and tx for "translation", but those are by no means obvious.
For local variables, I have a strong preference for keeping their names short, and relying on the reader to be able to figure out the meaning from where the values come from. Would documenting these somewhere be a sufficient improvement, or do you think that some or all of these are just too obscure? Or are there alternative shortenings that would be better?
For me even single-letter variables are acceptable in cases like this. I'd find l and t easier to understand than lc and tx.
I don't think we should document that.
Ah, I'd missed that! Yeah, that needs to happen the same as before.
1f24718 to 397e5f5
This is now at the dangerous stage of looking like it works. But some verification work still remains:
# FIXME: zip downloads should only be for projects with 2..10 resources
from pontoon.sync.utils import download_translations_zip
This is a compromise reached after discussion with @mathjazz elsewhere. We'll be refactoring this code again as a part of the subsequent data model refactor, at which point we'll be able to introduce download support that won't require us to re-fetch any files.
if project.pretranslation_enabled and changed_paths:
    # Pretranslate changed and added resources for all locales
    pretranslate(project, changed_paths)
The new sync is retaining slightly different information than before, so it makes more sense to trigger the pretranslation for all entities in changed resources. The pretranslation will then filter out the entities and locales for which pretranslation is not needed or wanted.
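As a rough illustration of that split, the sketch below passes everything in the changed resources to a filter that keeps only the pairs still needing work. The `Entity` class, the `pretranslation_candidates` helper, and the "already translated" criterion are assumptions made for this example, not the filtering that Pontoon's pretranslation actually performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a source string in a resource."""

    key: str
    path: str


def pretranslation_candidates(
    entities: list[Entity],
    changed_paths: set[str],
    pretranslate_locales: list[str],
    translated: set[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Return (entity key, locale code) pairs that still need pretranslation.

    Sync only reports which resources changed; the filtering by entity
    and locale happens here, on the pretranslation side.
    """
    return [
        (entity.key, locale)
        for entity in entities
        if entity.path in changed_paths
        for locale in pretranslate_locales
        if (entity.key, locale) not in translated
    ]
```

The caller only needs to know which paths changed, which matches the information the new sync retains.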
From my own experimentation, it looks like there's at least some timing issue due to the git clone that's now needed. Retrying got
Does this work even now? As in, does uploading a resource with a string removed remove it from Pontoon?
I uploaded a resource which had a translation that didn't exist in Pontoon.
Can we use
We should update the README file.
The stats update should now be much more robust, and a bit simpler -- albeit written in SQL, as I couldn't figure out the corresponding Django invocations. In particular the This also guarantees that the work is entirely done within the database, where even for the most complex projects it should take at most a few seconds even though the update is applied to the entire current project.
A "messy local setup" as mentioned above should no longer be able to produce the error that @flodolo encountered.
As a minor regression, gettext plurals are now counted as separate entries. This means that translating from English to a locale with more plural categories may result in an
Confirming that this worked locally 👍🏼
Correction: the error now shows up when you update any string (e.g. reject one), e.g.
Note that
Also for the same new file (not committed to the repo). Trying to download translations will crash Pontoon, because the file is not in the repo.
EDIT: it also fails after syncing to the repo, so the file is actually there.
Update: Not sure what happened, but I think my local DB is toasted (too many tests). Creating a new project with a different slug worked 🤷🏼
    lc_str = f"{len(updated_locales)} localizations"
else:
    lc_str = ", ".join(f"{loc.name} ({loc.code})" for loc in updated_locales)
commit_msg = f"Pontoon/{project.name}: Update {lc_str}"
I'm not a fan of this commit message.
Pontoon/Firefox (local): Update Italian (it)
I find the existing one more readable:
Pontoon: Update Italian (it) localization of Firefox (local)
Have you considered what the message looks like if it contains changes for multiple locales? With the old format, it quickly gets much more variable and clunkier.
Pontoon only commits one locale at a time now. Is that changing with the new code?
I'm guessing yes, given lc_str = "1 locale" if lc_count == 1 else f"{lc_count} locales".
I feel like we have discussed this at some point, but it's been long enough that I don't know if that happened, or where 🤔
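To compare the two formats side by side, here is a small sketch. The `Locale` class, both function names, and the `max_listed` threshold are assumptions for illustration; the diff above does not show the condition that switches to the "N localizations" form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locale:
    """Hypothetical stand-in for a locale with a display name and a code."""

    name: str
    code: str


def new_style_message(
    project_name: str, locales: list[Locale], max_listed: int = 3
) -> str:
    """Project first, then either the listed locales or just their count."""
    if len(locales) > max_listed:  # threshold is an assumption
        lc_str = f"{len(locales)} localizations"
    else:
        lc_str = ", ".join(f"{loc.name} ({loc.code})" for loc in locales)
    return f"Pontoon/{project_name}: Update {lc_str}"


def old_style_message(project_name: str, locale: Locale) -> str:
    """The existing single-locale format quoted in this thread."""
    return (
        f"Pontoon: Update {locale.name} ({locale.code}) "
        f"localization of {project_name}"
    )


italian = Locale("Italian", "it")
french = Locale("French", "fr")
print(new_style_message("Firefox (local)", [italian]))
print(new_style_message("Firefox (local)", [italian, french]))
print(old_style_message("Firefox (local)", italian))
```

For a single locale the two formats carry the same information; they only diverge once a commit covers several locales, which is the case the old format has no natural place for.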
Closes #2023 -- entity- and translation-level changes are logged in ActionLog
Fixes #2057
Fixes #2068
Fixes #2078
Closes #2083 -- source changes are now sync'd to targets eagerly
Fixes #2087 -- git file moves are caught, but not copies
Closes #2110 -- the remaining sync tests are largely replaced here
Closes #2129 -- refactoring the sync changes the performance characteristics completely
Fixes #2169
Fixes #2175
Fixes #2189
Fixes #2211
Fixes #2242
Closes #2285 -- not relevant after the refactor
Fixes #2641
Fixes #3302
Fixes #3449
This is effectively a rewrite of the sync_project() function that's currently here, and which ends up calling most of the code under pontoon/sync/.
The end results of the code here should be the same as currently, but the implementation is completely new, and does things in a different order.
Per-locale repositories are dropped here, as per #3303.
Explicitly left out of this PR but liable to change later: