Revisit locale file loading #6328
Comments
Assigning you, @jonasraoni; please have a review and do some research, but let's talk approaches before you dive into coding. |
Knowing which locale files are already loaded at a given point is not feasible, so I normally use a "trial and error" approach; when a given template/class is reused somewhere else, the issue arises again. An unneeded require might go unnoticed, etc. I'd say that we're safe to load everything in memory (unless we have some keys with tons of text), but I'll check the impact.
|
In principle yes, but we shouldn't be writing/maintaining that in our own software when there are widely accepted standards; part of the goal of this issue is to move considerably closer to just using existing frameworks/toolsets for managing translations. We don't have so many translation strings that standard stuff like Laravel and/or gettext aren't usable.
Most packages using Weblate only have a few components (a.k.a. separate locale files) and longer term I think we should move towards that. (Again, we'll need to make sure we stage up any comprehensive locale file changes so that they suit our release plans, but that's not to say that we shouldn't do it.) I think the ideal structure would be to have one or two locale files each for OJS/OPS/OMP, plus one or two for pkp-lib, plus one for each plugin. Reducing the number of files considerably and possibly just loading them all would help resolve the
The default translation plugin is a stepping stone towards this. Translators have often complained that it's hard to know where in the interface a term appears in order to understand what they're translating; an OJS install to explore that shows locale keys would help. The other part that's harder to solve is having a locale key (e.g. something untranslated) and needing to answer the question: where does this appear? (Without randomly exploring, that is.) It's not necessary to solve this for this issue, but since you asked :) |
The last problem is tough... To have some numbers in mind... Excluding the emails, we have around 1489 keys in The first part of the problem, to have a custom OJS, will probably involve updating the The second part, to reach an interface from a key, will require a great idea haha! I've left one idea below to start thinking about it, maybe a light will appear at the end of the tunnel 😫 Run the Cypress tests with a special flag, which will monitor updates to the elements (perfectly possible with the current Web APIs), looking for translated elements... Whenever something unique is found, the item will be highlighted and a screenshot with its name will be taken. Even if it misses some things, I think it might catch a lot of things. The remaining keys, which escaped the first filter, could be manually associated with an instruction (e.g. "open the submission menu"), a URL, or grouped based on the place where they were found (e.g. if a key was found in "submissionReview.tpl", it must appear somewhere in the submission review). |
^ Good ideas, but definitely outside of the core scope of this issue! I did a proof of concept for automatic screenshot generation for documentation using Travis testing, and when we come to revisiting that, we should talk about this too. |
Follow my considerations for some items after an initial evaluation:
The Laravel function has basically the same arguments and under the hood it gets a "translator" object from the resolver/container, so we could override/extend this key in the container, then the patch wouldn't be needed... Otherwise, I think it's better to just namespace our translation function (and probably also rename it).
I checked the Laravel's code to load the json files and didn't see anything special... If it's too slow for us, it's possible to provide a custom loader.
I see this is a requirement =]
Yeah, the gettext format supports it out of the box. The Laravel has a less flexible system to define plurals (just specific amounts + ranges), but I've inspected the code and it has an internal embedded support for pluralization based on the locale, so it should work fine...
The Laravel Gettext (I didn't find anything else) code isn't huge, and the commits to make it work with the newest versions of Laravel basically just required bumps in the composer file (in case the company supporting it fades away)... But internally it's using the native
Due to these drawbacks I'd vote against using the native gettext in favor of Laravel (it has a native way of adding extra json paths to search for locales) 🤔
Here's a problem for adopting Laravel. I checked the Weblate integration and it doesn't support the Laravel format. As a fallback, they have an API, but tinkering with a new integration for many repositories will probably be more time-consuming than writing code.
Summary
I didn't find a way that fits perfectly, so any solution will require some patching here and there... Before writing code, it might be worth checking whether Weblate is able to provide support for the Laravel format. If not, I think we'll have to keep using the gettext parser (which I don't see as a bad option), address the goals/bonus points and try to replace/remove some local code with external packages. Another idea I can try to explore is to convert the gettext format to the Laravel format on-demand. |
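To make the "convert on-demand" idea concrete, here is a minimal sketch, not a proposal for the actual implementation. It assumes the target is Laravel's flat JSON translation file (one key/value object per locale) and it only understands single-line `msgid`/`msgstr` pairs; plurals, contexts and multi-line strings are ignored. The function name `poToLaravelJson` and the sample key are hypothetical.

```php
<?php
// Hypothetical helper: converts the simplest .po entries into a flat JSON object.
// Assumption: every entry is a single-line msgid followed by a single-line msgstr.
function poToLaravelJson(string $po): string
{
    $translations = [];
    // Capture each msgid/msgstr pair that sits on two consecutive lines.
    preg_match_all('/^msgid "(.*)"\R^msgstr "(.*)"/m', $po, $matches, PREG_SET_ORDER);
    foreach ($matches as [, $id, $text]) {
        if ($id === '') {
            continue; // the empty msgid is the .po header, not a translation
        }
        // .po strings use C-style escapes such as \" and \n.
        $translations[stripcslashes($id)] = stripcslashes($text);
    }
    // Cast to object so an empty result is still encoded as {} rather than [].
    return json_encode((object) $translations, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);
}

$po = "msgid \"\"\nmsgstr \"\"\n\nmsgid \"common.save\"\nmsgstr \"Salvar\"\n";
echo poToLaravelJson($po), PHP_EOL;
```

A real converter would more likely reuse an existing .po parser than a regular expression; the point here is only that the mapping itself is small.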
Thanks, @jonasraoni -- so if I'm following, we can adopt the Laravel
If that's correct, then I propose that we decide: Let's adopt Laravel's What do you think about performance? I think this is going to drive some of our toolset choices, maybe more so than philosophical json-vs-po considerations. Apparently The more I think about it, the more the downsides you've pointed out with PHP-native gettext (e.g. interprocess problems) are prohibitive. We'll be working with a lot of people on shared hosts without much control of their servers. So I suggest we also decide to avoid PHP-native gettext in favour of either a PHP-based parser (as we currently use) or another file format. |
I think that's fine, I don't see any problems with adopting it.
I didn't see such optimizations in the Laravel Translator, but AFAICS we can provide a custom FileLoader to it... Last time I checked, the performance of a About the configuration, in order to have the best performance I think we should try to fit under the default limits of: Instead of counting the number of PHP files that we have on our system + the ones generated by the FileCache, Smarty cache, etc. we could get the status from a heavy installation... If we're far from the defaults, then we can't expect the opcache to work miracles, but we can still try to decrease the usage (e.g. merging several locale files into one; moving not so important, but large, caches out of this method; etc). According to #7111, I assume we're going to use Stash. I had a quick look and their code is using file locks + a handmade var_export, strange that performance dropped so much. We can try to debug the source of slowness and provide a report for them.
I prefer the po, it's more generic/technology agnostic and has community tooling, but going against the flow has its price. ps: As I didn't write anything (I was just going to check if the Laravel Gettext works fine, but stopped in the middle of the process), those are just my assumptions after reading documentation, comments and skimming through code. |
We could either use Laravel Cache or Stash; both are mature and widely adopted. Both support PSR-6, so swapping one for the other shouldn't be too bad for A/B testing. Even though we have (old) proof-of-concept code for Stash, I'm currently leaning further towards Laravel, as we're already using so much more of that toolset and adopting as much as possible of the Laravel translation/loading/caching toolset will be sensible to coders rather than a mix of Laravel and Stash (and our third-party stuff). |
Laravel uses Flysystem's Filesystem to load translations in |
That's filesystem metadata caching, not file contents; we'd be using Laravel Cache or Stash to parse .po (or JSON) localization content only when necessary, and if we continue using .php-based file caches, to allow PHP to opcode compile and cache using those mechanisms. (I think filesystem metadata caching might be helpful on network filesystems, but I suspect file caches will perform badly there for other reasons.) |
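As a rough illustration of the ".php-based file cache" idea, the sketch below parses a locale file only when the cached copy is missing or older than its source, and stores the result as a plain PHP file so the opcache can compile and keep it. It is an assumption-laden sketch, not how Laravel Cache or Stash work internally; `loadLocaleWithCache` and the `$parse` callback are hypothetical names.

```php
<?php
// Hypothetical helper: none of these names come from PKP, Laravel or Stash.
function loadLocaleWithCache(string $sourceFile, string $cacheFile, callable $parse): array
{
    clearstatcache(); // avoid stale stat() results within the same request
    // Note: modification times have one-second resolution, so an edit made in the
    // same second the cache was written would not be detected by this check.
    $isFresh = is_file($cacheFile) && filemtime($cacheFile) >= filemtime($sourceFile);
    if ($isFresh) {
        // A PHP file that returns an array can be compiled and cached by the opcache.
        return include $cacheFile;
    }

    $translations = $parse($sourceFile); // the expensive step (.po or JSON parsing)

    // Write to a temporary file and rename it, so readers never see a partial file.
    $temporaryFile = $cacheFile . '.' . uniqid('', true) . '.tmp';
    file_put_contents($temporaryFile, '<?php return ' . var_export($translations, true) . ';');
    rename($temporaryFile, $cacheFile);
    if (function_exists('opcache_invalidate')) {
        opcache_invalidate($cacheFile, true); // drop any previously compiled copy
    }
    return $translations;
}
```

This keeps the current behaviour where a modified locale file is simply re-cached on the next request; whether the timestamp check is cheap enough on network filesystems is a separate question.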
Just to confirm, the
Ok! I'll try to give preference to Laravel instead of adding a new package. |
@asmecher about the pluralization...
private function _parseExpression(string $expression): callable
{
    // Accept only the characters a plural expression needs: operators, digits, parentheses, spaces and "n"
    if (preg_match('/[^<>|%&!?:=n()\d ]/', $expression)) {
        throw new InvalidArgumentException('Invalid expression');
    }
    // Turn "n" into the PHP variable $n and split on the ":" of each ternary
    $pieces = explode(':', str_replace('n', '$n', $expression));
    $last = array_pop($pieces);
    $expression = '';
    // Open a parenthesis after every ":" so that nested ternaries group to the right
    foreach ($pieces as $piece) {
        $expression .= "$piece:(";
    }
    $expression .= $last . str_repeat(')', count($pieces));
    // $n is read by the evaluated code
    return function ($n) use ($expression) {
        return eval('return ' . $expression . ';');
    };
}
It basically just replaces n with $n and adds parentheses to the else part of the ternary operator (it's just a draft; I'll review it and add an extra check to ensure there's just a single "n", etc.) and I don't like using
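To show why the extra parentheses matter, here is a small standalone sketch that performs only the string rewrite from the draft above (no `eval`), applied to the commonly used gettext plural rule for Polish. gettext expressions follow C, where nested ternaries group to the right; PHP historically grouped them to the left and, since PHP 8, rejects unparenthesized nesting altogether. `parenthesizeTernary` and `$polishPluralIndex` are hypothetical names used only for this illustration.

```php
<?php
// Hypothetical helper mirroring only the string rewrite of the draft (no eval).
function parenthesizeTernary(string $expression): string
{
    $pieces = explode(':', str_replace('n', '$n', $expression));
    $last = array_pop($pieces);
    $result = '';
    foreach ($pieces as $piece) {
        $result .= "$piece:("; // wrap everything after each ":" in parentheses
    }
    return $result . $last . str_repeat(')', count($pieces));
}

// Plural rule for Polish as it is usually written in a .po header.
$polish = 'n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2';
echo parenthesizeTernary($polish), PHP_EOL;

// The same rule written by hand as PHP, with the parentheses the rewrite adds.
$polishPluralIndex = fn (int $n): int =>
    $n == 1 ? 0 : ($n % 10 >= 2 && $n % 10 <= 4 && ($n % 100 < 10 || $n % 100 >= 20) ? 1 : 2);

foreach ([1, 2, 5, 12, 22] as $n) {
    echo $n, ' => form ', $polishPluralIndex($n), PHP_EOL;
}
```

The returned index selects which of the translated plural forms to use for a given count.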
|
What about changing the locale file format (cached, that is) to store plural forms 1:1 with how they're supported in PO files? That feels like a more natural solution than generating and interpreting PHP. |
Hmm, I think I didn't understand very well what you said. I just wrote this function to handle complex plural rules (e.g. Anyway, we can revisit this later when I open the PR. About the cache: I setup the Laravel cache and compiled dynamically all the locale files (from OJS, pkp/lib and plugins) into a single cached resource. For now I have a working prototype and I've addressed several items of the issue, except:
Then I still need to:
|
Are you sure we need to parse |
I've filed the apparently missing functionality on the php gettext library here: php-gettext/Gettext#273 The maintainer of that package has typically been quite responsive, so we'll see what they say! |
After taking a look at how weird some languages build their plurals, I'm sure haha 😅 Looks like they've dropped support for their Given it's a generic library, I think their version:
Given that I already made my choice, I'll just leave other possibilities that I considered:
|
@jonasraoni, Oscar (the maintainer of that toolset) has confirmed that the Translator class is still available but has hived off to another package: https://github.com/php-gettext/Translator/blob/master/src/Translator.php#L186 We should use that library rather than reinventing the wheel; if you think there are improvements to be made there, I'd suggest filing them as issues. |
The What about the Laravel initialization? I believe I'll probably need to modify it, so if you know about any strong reason to keep it this way, it will be helpful. |
Go ahead and modify it; it wasn't necessary to initialize the container before OJS/OMP/OPS was installed in the past, but now it will be for the sake of translations. |
- Introduced a Laravel Facade - Removed the need of loading specific components - Added isInstalled and isUpgrading to PKPApplication
…calization pkp/pkp-lib#6328 Updated localization
… upgrade/install status
… backwards compatibility with external plugins
@jonasraoni, I've merged almost all the PRs listed at the top, but don't forget some unaddressed comments on the following (which can be addressed after merge):
I haven't merged the following:
The custom locale plugin is also not working and needs attention: https://github.com/pkp/customLocale |
Thanks! I'll check the plugins soon (customLocale + defaultTranslation) and probably this side-issue: #7673. |
Exciting to see this go in! Congrats @jonasraoni. 🎉 As part of the work on this, can you please review and provide some documentation on these changes? I've filed an issue on our docs repo: pkp/pkp-docs#923 Don't worry about writing perfect English. Just get the substance of the changes, as well as some code samples, into a PR and I will do the copyediting. |
I've filed an issue against the |
The security issue was merged (php-gettext/Translator#5); just a reminder to update the packages later. |
I've updated the I've created an extra issue to handle the unused keys (#7837), I'll proceed to the docs. |
Revisit locale file loading.
Goals:
__(...) translation function (perhaps we could use Laravel's instead, which means we could get rid of this helper removal patch)
AppLocale/PKPLocale classes could be provided by a 3rd-party implementation
Bonus points:
AppLocale::requireComponents(...) calls, which are currently used to load each .po file when required. This might require making performance good enough that we can simply load everything.
_YY only when necessary to distinguish between two regions, e.g. pt_BR vs. pt_PT.)
.po file format supports this.)
Gotchas:
gettext toolset may not respond well to .po/.mo files changing while the server runs. Need to assess how users will be affected by a change from the current behaviour, in which a .po file is re-cached when modified with no fuss.
gettext, then .po files need to be compiled to .mo files in a normal workflow. This is a requirement that users haven't had before (and locale file modifications aren't an uncommon job).
PRs
Applications
Modules
Plugins
Missing a PR for the https://github.com/pkp/defaultTranslation