Translation server migration weblate
Migrate the translation server to Weblate (from Pootle)
Status
Migration step (2023-10)
Investigating the process (2023-09-29)
The test server is up.
Version: Weblate 5.0.2
Proposal
Weblate structure
There are 3 concepts in Weblate: projects, components, and (since v5) categories.
Categories group components within a project. A component represents a po file.
We could then have:
projects: koha, koha-manual
categories: koha/main, koha/23.11, koha/23.05, koha-manual/main
components: koha/main/installer, koha/main/messages, koha/23.11/installer, koha/23.11/messages, etc.
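To make the naming concrete, here is a minimal sketch that builds the component paths from a project, its categories and the po file names. The helper name `component_paths` is made up for illustration; it is not part of Weblate or of our scripts.

```python
from itertools import product


def component_paths(project, categories, po_names):
    """Return one 'project/category/component' path per (category, po file) pair."""
    return [
        f"{project}/{category}/{name}"
        for category, name in product(categories, po_names)
    ]


if __name__ == "__main__":
    # Two versions (categories) and two po files (components) of the koha project.
    for path in component_paths("koha", ["main", "23.11"], ["installer", "messages"]):
        print(path)
```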
Koha changes
I suggest creating 2 new repositories: koha-i18n and koha-l10n.
- koha-i18n will group the scripts we need for the translation workflow (run on the Weblate server and eventually by the release maintainers and the doc team)
- koha-l10n will hold a copy of the po files: 1 git branch per version, with all files in the root directory (example: https://gitlab.com/joubu/koha-l10n/-/tree/main)
Migration
It's quite trivial to retrieve the users from Pootle and create them in Weblate. However it will be tricky to retrieve the activities, the permissions, etc. Can we live without that?
How to sync
Weblate can track a git repository, and update its DB when the repo is updated. What I am suggesting is the following workflow:
A push to the Koha git repository will trigger an update of koha-l10n (retrieve Weblate's translations, run our `translate update` to generate new strings, merge both, push the result to koha-l10n), which will in turn auto-update Weblate's translations.
In a second step we could even ask Weblate to push to the repo when new strings are translated (translators will "see" their changes and will be able to test them easily).
We can script and interact with Weblate easily, via their REST API https://docs.weblate.org/en/weblate-5.0.2/api.html or a wlc CLI tool https://github.com/WeblateOrg/wlc
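The "merge both" step of the workflow can be pictured with a very simplified model. In practice this is the job of gettext's msgmerge (which also does fuzzy matching, not shown here); the function name and the dictionaries below are only an illustration.

```python
def merge_translations(weblate_po, new_msgids):
    """Simplified merge of existing translations with a freshly extracted string list.

    weblate_po: dict msgid -> msgstr, as retrieved from Weblate
    new_msgids: msgids found in the current Koha source
    Existing translations are kept, new strings start untranslated,
    and strings that disappeared from the source are dropped.
    """
    return {msgid: weblate_po.get(msgid, "") for msgid in new_msgids}


if __name__ == "__main__":
    from_weblate = {"Save": "Enregistrer", "Old string": "Ancienne chaîne"}
    merged = merge_translations(from_weblate, ["Save", "Cancel"])
    print(merged)
```

The result is what would be pushed to koha-l10n, and what Weblate would then pick up.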
Pros/Cons
Advantages
- A copy of the up-to-date po files will be available (can retrieve directly instead of running the translate script)
- More automated process: no need to wait for anybody to update the strings
- Can catch regression before the release (we could have tests on top of generated po files)
- It's a first move in the direction of removing the po files from the git history
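As an example of a test we could run on top of the generated po files, here is a sketch of a placeholder check. The regular expression only covers simple `%s` / `%d` / `%1$s` placeholders, which is an assumption on my side; a real test would read the entries from the po files instead of taking two strings.

```python
import re

# Matches %s, %d and positional forms such as %1$s.
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[sd]")


def placeholder_mismatch(msgid, msgstr):
    """Return True when a translation does not use the same placeholders as its source."""
    if not msgstr:
        # Untranslated entries are not regressions.
        return False
    return sorted(PLACEHOLDER.findall(msgid)) != sorted(PLACEHOLDER.findall(msgstr))


if __name__ == "__main__":
    print(placeholder_mismatch("%s items", "%s exemplaires"))
    print(placeholder_mismatch("%s items", "exemplaires"))
```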
Cons
- Updating everything on each push will certainly bring the server down; we may want to run it only when templates have changed, or once a day (once a week?)
- po files are in two different places (not a big deal, they are two different things anyway: one contains the "moving" files, the other is for releasing)
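The "only if templates are changed" idea could be as simple as looking at the list of files touched by a push (for instance the output of `git diff --name-only`). A minimal sketch, where the list of suffixes is a guess and not an agreed list:

```python
from pathlib import PurePosixPath

# Hypothetical list of file types that can carry translatable strings.
TRANSLATABLE_SUFFIXES = {".tt", ".inc", ".js", ".pm", ".pl", ".xml"}


def needs_po_update(changed_files):
    """Return True if at least one changed file may contain translatable strings."""
    return any(
        PurePosixPath(path).suffix in TRANSLATABLE_SUFFIXES for path in changed_files
    )


if __name__ == "__main__":
    print(needs_po_update(["koha-tmpl/intranet-tmpl/prog/en/modules/about.tt"]))
    print(needs_po_update(["README.md"]))
```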
Problems
I am facing different problems in this implementation.
We have been waiting for version 5 to get the "categories" concept. But I have the feeling that it's not ready yet, or that it's a quick "hack" to answer some people's needs (if you are working on the Weblate project and reading this, don't get me wrong, I am a developer and I do understand that :D). I am still not sure whether it's not fully ready yet or whether I am missing something.
Examples:
- When you create a new component you provide a source repo and a branch. You can link to an existing Weblate component, to avoid cloning the repo several times. However I have not managed to point to a component that is within a category. See https://docs.weblate.org/en/latest/admin/projects.html#source-code-repository, https://github.com/WeblateOrg/weblate/discussions/9556#discussioncomment-7143904 and https://github.com/WeblateOrg/weblate/discussions/10112
- I have not managed to retrieve all the components of a project for a specific language (so all the po files of main for es-ES for instance). You can run `wlc download koha-light/main%252Finstaller --output output_directory` or `wlc download koha-light/main%252Finstaller/es --output output_directory`. But `wlc download koha-light/main/es --output output_directory` does not work, and neither does "http://localhost:8888/projects/koha-light/-/fr/". It is not a problem for now as we can still download them one by one, but that can take more (network) time than a single big zip file containing all the translations. Maybe not a problem if it's running on the same server (?).
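The `%252F` in the working commands is simply the "/" between the category and the component, percent-encoded twice ("/" becomes "%2F", then "%2F" becomes "%252F"). A small sketch to build such paths when downloading the components one by one; the helper name `wlc_object_path` is made up, only the encoding itself comes from the commands above.

```python
from urllib.parse import quote


def wlc_object_path(project, category, component, language=None):
    """Build 'project/category%252Fcomponent[/language]' as used with wlc download."""
    # Encode the "category/component" part twice: "/" -> "%2F" -> "%252F".
    nested = quote(quote(f"{category}/{component}", safe=""), safe="")
    parts = [project, nested]
    if language:
        parts.append(language)
    return "/".join(parts)


if __name__ == "__main__":
    print(wlc_object_path("koha-light", "main", "installer"))
    print(wlc_object_path("koha-light", "main", "installer", "es"))
```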
Manual
The structure of the koha-manual project will be adjusted:
- Merge into a single branch (only one branch will be translatable): postponed, we are keeping the old branches for now
- Switch from "main" to "main" for the git branch
- Merge of all the PO files into a single one (locales/*/LC_MESSAGES/all.po): not done, we are keeping the split into different PO files/components, see doc meeting from 2023-10-25.
- Forgotten languages will be added (sk, sv, pt)
Later
- Automatic suggestions?
- Propagate translations to other components (again, does it work with categories?)
- Retrieve the PO files from the sandboxes (so that translators will see their changes "immediately")
Migration steps
Done
- Old "koha manual" projects disabled on pootle (only "Koha manual 22.11" is kept)
- koha-l10n and koha-manual-l10n created in https://gitlab.com/koha-community
- Pre-production server is up and running (share the URL on request, ask Joubu)
- Write migration script:
- Create teams/groups
- Adjust permissions (admin of a team + member of a team)
- Adjust users info (full_name, date_joined, last_login)
- Adjusted permissions:
- Guest (non logged in user): Nothing
- User (logged in user): Add suggestion
- Member of a language team: Power User for the given language (for both projects koha and koha-manual)
- Admin of a language team: Add/remove members within the team
- Move users
- Export pootle users and inject them into weblate's DB
- Run migration scripts to adjust permissions and users info
- koha-manual
- Switch to "main"
- Remove PO files from koha-manual (use koha-manual-l10n instead)
- Set up a cronjob to run koha-manual-i18n/weblate-sync.pl nightly (02:00 UTC)
- project koha-manual22.11 is frozen on Pootle
- Document the new workflow for RM and RMaints
- Take control of stable branches and:
- Push Koha bug 34959
- Push Koha bug 35024 - translations in PO files will no longer be wrapped
- Push Koha bug 35043 - tab character
- Push Koha bug 35079 - Add option to gulp tasks po:update and po:create to control if POT should be built
- Send an email to RM and RMaints about the new workflow (after October releases)
- Freeze Pootle, download po files, create component in Weblate, have branches in koha-l10n
- Remove po files from the codebase (bug 35174) and from all branches
- Adjust Release maintenance and Release management to remove ref to po files
- Adjust "Release day actions/checks" https://tree.taiga.io/project/koha-ils/task/106
- Set up a cronjob to run koha-i18n/update-po.pl nightly (to start, then we will see if a trigger is a better option)
- Post-push hook, generate tarballs when a new tag is pushed
- Pull latest changes from pootle, merge and push them to koha-l10n and koha-manual-l10n
- Adjust server's name/url (nginx, weblate docker env, .config/weblate) + DNS move
Next
- Set up/request an SMTP server (Weblate will need to send emails)
- Ensure .po files are correctly generated (junitmsgfmt)
- Post-push hook in koha: when a tag is pushed adjust koha-l10n/changelog, and add the same tag as in Koha (see with Mason)
Architecture
This is an adjusted copy of the file "README" that is at the root of the server.
The server is (will be) translate.koha-community.org to host Weblate.
It is hosted by BibLibre.
Installation
Weblate runs in Docker, using the official Weblate image.
The `weblate-docker` project is in `/home/weblate/weblate-docker` and the container can be started up with the following command:
`docker compose -f docker-compose-https.yml -f docker-compose-https.override.yml up`
Env variables should be set in `docker-compose-https.override.yml`.
Migration
See Translation server migration weblate#Migration
Weblate config
Projects
There are 2 projects: `koha` and `koha-manual`
`koha` uses categories for versions, then components are .po files
For instance:
- koha/main/pref
- koha/23.11/about
`koha-manual` does not use categories as we are supporting only one version of the manual (branch `main`)
We also have one component per .po file.
For instance:
- koha-manual/about
- koha-manual/searching
Projects config
- Allow translation propagation is ON, so one string in one component should be propagated to other components \o/
- One component is tracking a po file in a git repository for a specific branch.
For instance `koha/main/pref` will track `koha-l10n/main/*-pref.po` (* is the different languages)
As we don't want to clone one repo per component we can use a specific syntax to reuse one from another component
For instance: `weblate://koha/main/installer`
IMPORTANT:
`koha/main` has the repo object in `koha/main/installer`
`koha-manual` has the repo object in `koha-manual/about`
Do not remove them!
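To summarise the rules above, here is a sketch of the repository settings a `koha` component would get. The keys `repo`, `branch` and `filemask` are meant to mirror the corresponding Weblate component settings, and the assumption that the branch name equals the category name (one git branch per version) is mine; the function itself is hypothetical and not part of our scripts.

```python
KOHA_L10N_URL = "https://gitlab.com/koha-community/koha-l10n.git"
REPO_HOLDER = "installer"  # component that owns the real clone, do not remove it


def koha_component_settings(version, component):
    """Sketch of the repository settings for the component koha/<version>/<component>."""
    if component == REPO_HOLDER:
        # Only this component clones the git repository.
        repo = KOHA_L10N_URL
    else:
        # All the others reuse that clone through the weblate:// syntax.
        repo = f"weblate://koha/{version}/{REPO_HOLDER}"
    return {
        "repo": repo,
        "branch": version,                # assumption: one git branch per version
        "filemask": f"*-{component}.po",  # * is the language
    }


if __name__ == "__main__":
    print(koha_component_settings("main", "installer"))
    print(koha_component_settings("main", "pref"))
```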
Users and groups/teams
There is one team per language, and each team has the "Power User" role on the language
https://docs.weblate.org/en/latest/admin/access.html#list-of-privileges-and-built-in-roles
There are also team administrators. For instance here is the member list of the German team: https://test-translate.biblibre.com/teams/44/#users
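The permission model used for the migration (guest, logged in user, team member, team admin) can be summarised with a small sketch. This is only a model of the rules listed on this page, not how Weblate stores them; the function name and the "all languages" key are made up.

```python
def access_for(logged_in, member_of=(), admin_of=()):
    """Return the permissions of one user, keyed by language.

    member_of / admin_of: language codes of the teams the user belongs to / administers.
    """
    if not logged_in:
        # Guest: nothing.
        return {}
    # Any logged in user can add suggestions.
    access = {"all languages": ["Add suggestion"]}
    for lang in member_of:
        access.setdefault(lang, []).append("Power User")
    for lang in admin_of:
        access.setdefault(lang, []).append("Add/remove team members")
    return access


if __name__ == "__main__":
    print(access_for(False))
    print(access_for(True, member_of=["de"], admin_of=["de"]))
```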
Git integration
The `koha` project is using the [koha-l10n](https://gitlab.com/koha-community/koha-l10n.git) git repo and `koha-manual` is using [koha-manual-l10n](https://gitlab.com/koha-community/koha-manual-l10n.git).
Each change in Weblate generates a commit in those repos (not immediately, however).
A user `koha-weblate` has been created on GitLab and has push permissions to those 2 repositories.
It should be the only user with push permission to these repos! You should not need to modify them!
See also "Projects config".
Translate server structure
FS
Everything you need is in /home/weblate
- README
This file
- bin
bin/koha-manual-update.sh is run nightly at 02:00 UTC
Script that syncs the l10n files and generates the translated versions
- environments
Python env to generate the manual (~/environments/sphinx-koha-manual/)
- koha
Koha src used to generate new strings (gulp po:update)
- koha-i18n
Clone of https://gitlab.com/koha-community/koha-i18n.git
Used by bin/koha-manual-update.sh to update Koha's .po files
- koha-l10n
Clone of https://gitlab.com/koha-community/koha-l10n.git
Used by koha-i18n/update-po.pl to retrieve new strings from Weblate
- koha-manual
Clone of https://gitlab.com/koha-community/koha-manual.git
Koha manual src used to generate new strings
- koha-manual-i18n
Clone of https://gitlab.com/koha-community/koha-manual-i18n.git
- koha-manual-l10n
Clone of https://gitlab.com/koha-community/koha-manual-l10n.git
Used by koha-manual-i18n/weblate-sync.pl to retrieve new strings from Weblate
- logs
logs/cronjobs/ contains the output of bin/koha-manual-update.sh
- weblate-docker
Clone of https://github.com/WeblateOrg/docker-compose.git
To start the Weblate containers
Syncing steps
For Koha
- bin/koha-update.sh is triggered by a nightly cron
- Run koha-i18n/weblate-sync.pl
- Fetching koha to retrieve the changes
- Lock Weblate (koha project only)
- Push Weblate changes to koha-l10n (`wlc push`)
- Retrieve .po files from koha-l10n, copy them to misc/translator/po
- Generate .pot to get new strings and merge (`gulp po:extract`, `gulp po:update --generate-pot=never`)
- Retrieve new .po files and push to koha-l10n
- Unlock Weblate
For Koha manual
- bin/koha-manual-update.sh is triggered by a nightly cron
- Run koha-manual-i18n/weblate-sync.pl
- Fetching koha-manual to retrieve the changes
- Lock Weblate (koha-manual project only)
- Push Weblate changes to koha-manual-l10n (`wlc push`)
- Retrieve .po files from koha-manual-l10n, copy them to koha-manual/locales
- Generate .pot to get new strings and merge (make gettext, msgmerge, etc.)
- Retrieve new .po files and push to koha-manual-l10n
- Unlock Weblate
- `make all_html && make all_epub` to generate the HTML and EPUB versions
- push html and epub to koha-community.org:/var/www/manual/latest
wlc
wlc is a Weblate command-line client using Weblate's REST API.
https://github.com/WeblateOrg/wlc/
Its config is stored in `/home/weblate/.config/weblate`
It is used by the i18n scripts.
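The important point of the syncing steps is that Weblate must always be unlocked again, even when the update fails in the middle. A minimal sketch of that sequence around wlc, assuming `wlc lock`, `wlc push` and `wlc unlock` are given the project name; `sync_project` and its `update_po` callback are hypothetical, the real logic lives in the weblate-sync.pl scripts.

```python
import subprocess


def sync_project(project, update_po, run=subprocess.run):
    """Lock the project, push Weblate's changes, update the .po files, always unlock."""
    run(["wlc", "lock", project], check=True)
    try:
        # Push the pending Weblate commits to the l10n repository.
        run(["wlc", "push", project], check=True)
        # Placeholder for: copy the .po files, generate the new strings,
        # merge, and push the result to the l10n repository.
        update_po()
    finally:
        # Runs even if a step above raised, so translators are never left locked out.
        run(["wlc", "unlock", project], check=True)


if __name__ == "__main__":
    # Dry run: print the wlc commands instead of executing them.
    sync_project(
        "koha",
        update_po=lambda: print("(update .po files here)"),
        run=lambda cmd, check: print(" ".join(cmd)),
    )
```

Passing `run` as a parameter is only there to make the sequence easy to dry-run or test without a Weblate server.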
Monitoring
The server is currently monitored using Netdata. If you notice poor performance, please note the date and time and tell Joubu.
Questions/Notes/Discussion
Katrin
- Structure
- I think it would be important that you can easily work on all files for a specific version and see how much is done for that. More important than seeing the translation status of the same file across different versions.
- Joubu: Hum I am not sure. I am not sure this is possible exactly as you would expect. You can have this view to get an overview of "How much is 'main' translated in German". You will also need to see what is possible with the "Dashboard". And also linking to a weblate github issue that might be relevant for us.
- Katrin: Sorry, but I don't understand your comment. :( What I meant is that translators usually translate a specific version, not 'the OPAC files for different versions'. Maybe it could work if we made the projects "Koha Manual", "Koha 23.05", etc.? In Pootle the page I mean would be: https://translate.koha-community.org/de/23.05/
- Joubu: It will work the same
- Migration
- I believe we can do without the activities.
- For permissions it would be good if we could still limit by language. If we need to manually migrate those, maybe we could limit migration of users to those with somewhat recent activity.
- Joubu: Not done yet but I will see if it's possible to retrieve the permissions and adjust them, unless we want to take advantage of the migration to start over?
- Katrin: Not sure about that one, it could be quite disruptive, especially if we need to assign permissions for new registrations manually.
- In the sync step:
- If I understand that correctly the strings would be updated on every change. I believe that is ok, but we should still have a string freeze where the changes stop, so translators get a chance to catch up before a release. I think nightly or once a week would be enough, the important bit is that you get time before a release to get it done.
- Joubu: The update of the strings on Weblate means that we need to "freeze" Weblate during the process (to prevent conflicts). So we certainly don't want to do that on every push. I think once per day is good, so that translators can see their changes quickly. We need to pick a "good" slot, but it will never be good for everybody... I think I have greatly reduced the time this process takes, and I think it can be improved even more (i.e. don't freeze all languages but only the one we are processing, see bug 35079).
- Katrin: I am confused by "so that translators can see their changes quickly" - do you mean new strings? Because they would see their work on Weblate directly, right?
- The project koha-l10n will contain the PO files with the translations up-to-date (reflecting the state of Jenkins). In a later step we will be able to retrieve the PO from this repository and use them from ktd or the sandboxes: translators will be able to see what they have translated in the next days (compared to now, they need to wait for the next release).
- How would the translated po files be synced back into the Koha repository/made available for the packages?
- Joubu: Same as now, "manually" and right before the release