Exclude packages from language pack

Hey guys,
As some of you may already know I’m the Hebrew translator of Ubuntu.

Lately I’ve been seeing people help translate CLI-only templates (imported from the original packages), and there are also some strings I translated to Hebrew myself that appear only in the CLI.

With Hebrew we run into numerous problems when displaying or typing Hebrew in various terminals. Even when the Hebrew appears correct it might be reversed, there are no commands in Hebrew, and sometimes typing Hebrew in a terminal messes up the entire input and requires a restart.

Several approaches have been tried over the years: some packages are completely reversed (the translation was written in the correct order and then converted to left-to-right order using a script), and there has been some effort to transliterate part of the text. Bottom line, it’s not working and I have learned to live with that.

What I want to achieve is the ability to select a certain package and mark it as CLI-only, so that its translated strings never appear on screen and mess up the entire display.

There have been some attempts in the past to include MLTerm by default or to add LOCALE=en_US automatically so the strings won’t appear, but I want to try another approach: simply exclude certain templates from the language pack, so that nobody has to deal with a messy terminal.

And please don’t turn this into a thread about fixing RTL and Unicode display in terminals, because that is way too heavy and unrelated to this case; I’ll be glad to open a different thread for that. All I want to achieve here is to allow marking the CLI-only packages and excluding them. I don’t mind if Canonical wants to do it themselves, but so far it has been very unpleasant for the user, and I’m not going to walk through all the translated strings and revert them; that job is too messy to do through Rosetta/Launchpad.

Thanks!


While I understand what you say, I think that step would be too radical.

Also many CLI applications are prepared for translation and translators have translated them into a lot of languages. If I understand you correctly, you suggest that Ubuntu as a distro should simply ignore all those translations. There will never be a consensus about that. After all, not everyone in the world speaks English.

There is a simple way for any user (including myself) who prefers English in the terminal. Add these lines to ~/.bashrc:

LANGUAGE=en
LANG=en_US.UTF-8

I’m not asking to exclude all translations; I’m asking to exclude some specific languages that probably won’t display correctly in a terminal.

This is a well-known trick. The problem is that it’s a trick: new users might run into bad terminal behavior the first time they use the terminal, so for this to work it would have to be set by default for all new users.

I’m talking only about the average user and the way these things are handled. We deliver a bad experience for first-time users in certain languages, and I don’t have any control over it.

How can we make this thing more robust and built-in instead of asking the users to patch?

Primarily, this is an issue to resolve with the coordinator of your language team.
They should make the call whether to wait for a translation to reach, let’s say, 80% completion before they submit it to the project.

It could be possible to have a cut-off percentage in Ubuntu (i.e. in Launchpad), so that a language’s translations for a specific project are not included unless they cover, let’s say, over 80% of the messages.
That may have some merits, but it is not good because it adds all sorts of administrative burdens.
For example,

  1. There are translations for en_GB, en_AU, en_NZ, etc. Obviously, they use scripts to find words like initialization and translate to initialisation. They do not touch other messages that do not need any change. Their translation percentage is around 1% or less, so they would be affected by such a cut-off.
  2. Some projects have lots of messages, but the important UI messages could be far fewer than 80% of them. For example, snap-store appears to be OK with a selective 50% translation level.
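
To make the arithmetic behind such a cut-off concrete, here is a minimal shell sketch. The function name `meets_cutoff` and the message counts are made up for illustration, and this is not how Launchpad builds language packs; it only shows why a flat threshold would drop locales such as en_GB.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when translated/total reaches the cut-off.
# Usage: meets_cutoff TRANSLATED TOTAL [CUTOFF_PERCENT]
meets_cutoff() {
    translated=$1
    total=$2
    cutoff=${3:-80}
    # A template without messages has nothing to include.
    [ "$total" -gt 0 ] || return 1
    # Integer form of: translated / total >= cutoff / 100
    [ $((translated * 100)) -ge $((cutoff * total)) ]
}

# Made-up counts: an en_GB-style locale that changes 12 of 1200 messages (1%).
if meets_cutoff 12 1200; then
    echo "include"
else
    echo "exclude"
fi
```

With the default 80% threshold this prints `exclude`, even though a 1% en_GB translation is complete for its purpose. Passing a third argument (for example `meets_cutoff 600 1200 50`) models a per-project threshold like the 50% level mentioned for snap-store.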

All in all, contact the translation coordinator for the specific project, if one exists. For example, if the project is under the Translation Project umbrella, contact the translation coordinator for this. It would be bad form to contact a translator directly. There are such coordinators also for GNOME and KDE.

Your comment is a bit off-topic, because I was talking about a technical problem and you suggested a regulatory solution, but I will still answer.

Hebrew open source localization is a very small swamp. Most of the external projects that are included by default in Ubuntu are handled by me, to make sure they are aligned with the rest of the system and that the translation is upstream.

I can try contacting myself although I’m pretty sure I won’t answer.

If I understand correctly, you’re suggesting keeping them untranslated or copying the original string into the translation. That could work in some cases, but I’d rather have a simpler solution, such as marking a string or a whole template as irrelevant for Hebrew, so that new translators won’t invest time in a CLI app that users won’t benefit from.

I’ll explain briefly about Hebrew in the CLI:
Hebrew is written from right to left (hence RTL), and most terminal emulators do not support that, either because of encoding (Unicode) problems or because of an incorrect implementation of the BiDi (bidirectional) algorithm; this includes the Linux kernel’s most basic TTY/PTY/whatever.

Messages appear correctly on screen in the CLI in the following cases:

  1. The terminal fully supports Unicode (MLTerm)
  2. The CLI app is translated in reverse (dlroW olleH)
  3. FriBiDi is implemented correctly and the app is translated correctly.

Mixing those methods usually leads to a big mess and unreadable messages on screen.
The only places where Hebrew is supported correctly in the CLI are the Hebrew version of the Ubuntu CLI installer (and the name of the language in grub before that) and the Debian installer.
Besides these two, although there have been some nice attempts to fix this issue, it was never standardized, and it simply frustrates many Hebrew users daily.
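
As a rough illustration of the second method in the list above, and of why mixing methods breaks, the sketch below uses `rev` (from util-linux) to pre-reverse a message. The function name `visual_order` is made up for this example, the English string is only a stand-in for a Hebrew message, and real reversal scripts have to cope with more than this (line wrapping, embedded numbers and Latin words, placeholders such as %s).

```shell
#!/bin/sh
# Hypothetical helper: reverse each line so that a terminal which lays text
# out strictly left to right shows an RTL message in reading order.
visual_order() {
    printf '%s\n' "$1" | rev
}

msg='Hello World'               # stand-in for a Hebrew message
once=$(visual_order "$msg")     # what a "reversed" translation stores
twice=$(visual_order "$once")   # reversing the stored text a second time

echo "$once"
echo "$twice"
```

This prints `dlroW olleH` and then `Hello World`. The second line is roughly why a pre-reversed translation reads backwards on a terminal that does its own BiDi reordering: the text ends up being flipped twice.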


Consider this script (the execute bit needs to be set):

$ cat /usr/sbin/gnome-terminal
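#!/bin/sh -e
# Fall back to C.UTF-8 in the terminal for languages written right to left.
for langcode in ar ckb he; do
    if [ "${LANG%_*}" = "$langcode" ]; then
        # Switch only if a C.UTF-8 locale is installed; depending on the
        # system, locale -a may list it as C.UTF-8 or C.utf8.
        if locale -a | grep -Eqix 'C\.utf-?8'; then
            unset LANGUAGE
            LANG=C.UTF-8
        fi
        break
    fi
done
# Quote "$@" so that arguments containing spaces are passed on intact.
exec /usr/bin/gnome-terminal "$@"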

It’s a nice solution yet I want to verify several things:

  1. How can we make this decision clear enough to users so that they can opt out if needed?
  2. Can it survive updates?
  3. What if this user has both Hebrew and Spanish/German on the same computer? The fallback will be C.UTF-8 and not the other supported language.

And just a small fix to your script: there are several additional languages using RTL scripts:
yi fa ur dv (these are off the top of my head; I also know that Punjabi uses both the Gurmukhi script, which is the native script, and Shahmukhi, which is an Arabic script, and there are several other cases like that), and there are some very rare languages (alive but not so popular) such as N’Ko, Syriac (Aramaic) and some varieties of Azeri (spoken in both Azerbaijan and Iran).

There is a list on Wikipedia, but Ubuntu is translated into only a few of these languages.
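
To make the third question and the longer language list concrete, here is a minimal sketch of how a wrapper might pick the next non-RTL language from a `LANGUAGE`-style priority list instead of always falling back to C.UTF-8. The function names `is_rtl_lang` and `pick_terminal_language` are hypothetical, and the list of codes is only the one collected in this thread, not an authoritative list.

```shell
#!/bin/sh
# Succeeds if the language part of a locale name or LANGUAGE entry is one of
# the RTL codes mentioned in this thread (illustrative, not exhaustive).
is_rtl_lang() {
    case ${1%%[_.@]*} in            # he_IL.UTF-8 -> he, es -> es
        ar|ckb|he|yi|fa|ur|dv) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the first non-RTL entry of a colon-separated LANGUAGE-style list,
# or "en" when the list is empty or every entry is RTL.
pick_terminal_language() {
    old_ifs=$IFS
    IFS=:
    for entry in $1; do
        if [ -n "$entry" ] && ! is_rtl_lang "$entry"; then
            IFS=$old_ifs
            printf '%s\n' "$entry"
            return 0
        fi
    done
    IFS=$old_ifs
    printf 'en\n'
}

pick_terminal_language "he:es:de"
```

For a user whose priority list is `he:es:de` this prints `es`. Whether a wrapper should then export that value as `LANGUAGE` rather than forcing C.UTF-8 is a design choice that has not been tested here.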

Don’t know. This is the main weakness with the idea. Ideally there should be a GUI in Settings or the gnome-terminal preferences (or possibly Tweaks) where the user could select “English” or “System language”.

If we install it via a package it will indeed survive updates.

True. My spontaneous thought is that this would need to be handled manually by the user. I have seen many users ask how to get the terminal in English, but nobody who has asked about any other language.

I know, I know. OTOH, as long as there is no simple opt out, I’m disinclined to push this for all those languages without a clear demand from the translators/users of respective language.

As a first step I would be ready to install a script for Hebrew only. The language-selector package already carries a ‘Portuguese special’ (the /etc/profile.d/cedilla-portuguese.sh file) for quite another reason, and I can think of adding a ‘Hebrew special’. I leave the decision to you.

Do you think it’s worth conveying this idea to the gnome-terminal team? Do you know of any plans to allow “Tweaks”/“Plugins”/“Whatever”?

Yup, sounds promising, so uninstalling this package is basically your way to opt out of it?

Do you think it’s detectable by locale -a or it’s too broad?

Although that’s possible, I think we should always go to the source and try to get the cedilla and Hebrew bypasses implemented by the original maintainers, and see how they can handle it.

Thanks!

Nope, they think it should be handled in a different way:
https://gitlab.gnome.org/GNOME/gnome-terminal/-/issues/267

So you were turned down by the GNOME folks. I wasn’t too surprised.

Well, not if we would ship it with language-selector, which I had in mind. On Ubuntu, uninstalling that package would require that a bunch of other packages are uninstalled, which would break the desktop.

Opting out by uninstalling would need a separate package which no other package depends on, and that would mean quite some work.

I don’t understand that question.


Sorry, but I don’t think that’s good enough, and I don’t want to cause trouble by breaking existing packages (while potentially removing the entire UI :slight_smile: ). Is there a way to add some sort of locale setting that turns this behavior off, so that it inhibits the script’s action?
This way we can add the script and maintain control over it without breaking anything; it should be some sort of advanced regional override.
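
One way such an override could look, purely as a sketch: the wrapper checks for an opt-out marker before touching the locale. Both the environment variable `TERMINAL_KEEP_LOCALE` and the file name `terminal-keep-locale` are hypothetical; neither is an existing Ubuntu or GNOME setting.

```shell
#!/bin/sh
# Hypothetical opt-out marker file; the name is invented for this sketch.
optout_file="${XDG_CONFIG_HOME:-${HOME:-}/.config}/terminal-keep-locale"

# Succeeds when the wrapper should leave the user's locale untouched,
# i.e. when the (hypothetical) variable is set or the marker file exists.
user_opted_out() {
    [ -n "${TERMINAL_KEEP_LOCALE:-}" ] || [ -e "$optout_file" ]
}

if user_opted_out; then
    echo "keeping locale ${LANG:-unset}"
else
    echo "would switch the terminal to C.UTF-8"
fi
```

Under these assumptions a user could opt out with `touch ~/.config/terminal-keep-locale`, and a settings GUI could later create or remove the same file without the wrapper script changing.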

I was referring to locale -a. I was wondering whether it’s possible to detect these kinds of changes using this command, but I don’t think the question is relevant; feel free to ignore it.

You can bypass the script. Putting a shorter script in ~/bin is one way.

$ cat ~/bin/gnome-terminal
#!/bin/sh
exec /usr/bin/gnome-terminal "$@"

(the execute bit needs to be set on that too)

I’m pretty sure we can handle it in the background but I want an actual option in Gnome-Control-Center to reflect that, otherwise we’re creating something that is inaccessible to most users.

Understood. But then it would be a rather big project, so you need to find someone with sufficient skill who is ready to spend time on it.

I’m aware of that yet since it’s aimed at new users this is the best way I can think of.

Asking the users to launch nano or vim with sudo and change some words (or sudo sed) is not so user friendly as I see it.

@yaron: I find your position on this topic somewhat contradictory. If you are convinced that English in the terminal is a better default for Hebrew users than Hebrew, the logical consequence should be that changing the default to English would result in fewer users wanting to change it.

Btw, you don’t need root access to change files in your $HOME.

I understand what you’re saying. All I want is the most robust, easy and clean solution I can find; “hacking” these kinds of things makes Linux look bad, and I don’t want to give new or intermediate users that kind of experience.

I will try to get some input from Arabic or Persian speakers and see whether we should consider this approach or whether there are other ways to handle this.

Thank you!
