Commit graph

38 commits

Author SHA1 Message Date
Solomon Peachy
79e4e075e6 updatelang: Respect the target ordering in individual phrases
When matching the target id in a phrase, if there is more than one match
we always use the final one.  This allows us to easily specify a
default/wildcard entry that gets overridden by a target-specific one.

The list of targets was sorted alphabetically to ensure consistent ordering
from one run of the tooling to the next.

However, if a phrase contained both device-specific phrases as well as a
generic "feature" fallback, alphabetical sorting may screw things up, as
the "feature default" was no longer at the top of the list. This is
known to be an issue for LANG_TIME_SET_BUTTON and LANG_TIME_REVERT, but
may affect other phrases as well.

(To be blunt, we shouldn't be mixing feature and device-specific targets
 in this context.  The "feature default" should be removed in favor of
 target-specific entries, but in this specific case it looks to be a
 real PITA due to incomplete keymaps)

Consequently, work around this by sorting the target list within each phrase
based on the ordering in the master (i.e. English) language file.
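
The selection rule described above can be sketched as follows. This is a minimal Python illustration, not the Perl code in updatelang: the function names, the `ipodvideo` target, the strings, and the exact-match lookup are all assumptions made for the example.

```python
def order_targets(targets, master_order):
    """Sort a phrase's targets by their position in the master (English) file.

    Targets missing from the master list are placed last, alphabetically
    (an assumption for this sketch).
    """
    rank = {name: i for i, name in enumerate(master_order)}
    return sorted(targets, key=lambda t: (rank.get(t, len(rank)), t))


def pick_string(entries, master_order, device_ids):
    """Return the string for a device; the final matching entry wins."""
    chosen = None
    for target in order_targets(entries, master_order):
        if target == "*" or target in device_ids:
            chosen = entries[target]  # later matches override earlier ones
    return chosen


# Hypothetical phrase: wildcard, a feature default, then a device entry.
master_order = ["*", "rtc", "ipodvideo"]
entries = {"ipodvideo": "SELECT: Set", "rtc": "ON: Set", "*": "none"}
device = {"rtc", "ipodvideo"}  # device has the rtc feature and its own id

print(pick_string(entries, master_order, device))
```

With the master ordering the device-specific string is selected. Passing `sorted(entries)` as the ordering instead puts `rtc` last, so the feature default overrides the device entry, which is the failure mode the commit describes.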

Change-Id: Id32439c179a98663f414530fb36012f9b217c1b6
2025-10-02 12:36:49 -04:00
Solomon Peachy
c08bbaac49 Revert "lang: Complain if there are multiple target matches for a given string"
This reverts commit a88ef80560.
2025-05-04 09:21:18 -04:00
Solomon Peachy
a88ef80560 lang: Complain if there are multiple target matches for a given string
The tooling will always use the *final* match, which may or may not be
what is desired.  Treat this as a bug, and complain appropriately.

However, there is a special case.  The RTC set screen uses strings that
include the device button names. There should be an entry for the
specific device, but if not, we wanted to fall back to the string
specified by the 'rtc' feature flag as opposed to falling back to the
default, empty string.

To still support this, add a special "FALLBACK" value; if we end up
using this for a device, the tooling will treat this as a bug, and
complain accordingly.

This should fix FS#13615 and FS#13616, and may introduce build failures
on targets that are missing appropriate entries.  We'll see.

Change-Id: Ie78bb247f968e19d450a0fbf6e1177b6d01126a1
2025-05-04 08:53:22 -04:00
Solomon Peachy
12b9419006 Some fixes for language and voice scripts:
Languages:
  * Get rid of leading space on LANG_ID3_VBR [ " (VBR)" ]
  * Fix up sole user to insert the space programmatically
updatelang:
  * Strip leading and trailing spaces on all phrases except VOICE_PAUSE
voice.pl:
  * Debug logging with UTF-8 output
  * Explicitly delete tab character from voiced strings

Change-Id: Ie466793479ce15ce7a9553770583a070530e7afd
2025-04-29 20:03:11 -04:00
Solomon Peachy
81e050871b updatelang: Correct grammatical goofs in some of the errors
Change-Id: I3795d89e68453e636188b26a1620226a836c8a4d
2025-04-20 11:58:38 -04:00
Solomon Peachy
f563fe54c2 updatelang: Alter syntax for 'phrase missing entirely' errors
Instead of 'this phrase missing entirely [...]' followed by the
verbatim phrase copied from English, the message now reads
'the 'PHRASE_ID' is missing entirely [...]'.  This allows
the warning to be self-contained.

Change-Id: I413c29e0c1f6616e74d875d197b34c4724330d67
2025-04-19 21:59:01 -04:00
Solomon Peachy
eb2d596d72 updatelang: Normalize all strings in our lang files to NFC form.
Now no matter how [de]normalized the input strings are, we will
normalize them to the best of our ability in what we use.

This adds a dependency on Perl's Unicode::Normalize.
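
For illustration, this is what NFC normalization does to a decomposed string. The sketch uses Python's standard `unicodedata` module, not the Perl Unicode::Normalize module the commit refers to, and the helper name `to_nfc` is hypothetical.

```python
import unicodedata


def to_nfc(text):
    """Return text in Unicode Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)


decomposed = "e\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
composed = "\u00e9"     # precomposed U+00E9

print(decomposed == composed)          # False: different code point sequences
print(to_nfc(decomposed) == composed)  # True: both are now the same string
```

Normalizing every string on input means two visually identical translations compare equal regardless of how the contributor's editor encoded them.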

Change-Id: I13e275692ea33a463b19f3a499ea06ce1acbb44a
2024-10-22 07:25:41 -04:00
Solomon Peachy
a056150d52 updatelang: Flag '|' in voice strings too
Change-Id: Id82bf7bd19741e7275d188ceeea872ebeb30e1eb
2024-09-18 10:43:34 -04:00
Solomon Peachy
24ae4aee33 updatelang: Expand suspicious character tests.
* dest:  < >
* voice: [ ] < > { }

Change-Id: I97701e52807db996037b7542fb0b01f9db0dbc0f
2024-09-17 10:18:29 -04:00
Solomon Peachy
3c2a110728 updatelang: Add the ability to sort output file in the English file's order
Change-Id: Ia115549b96365cbee6f1f96c5b0351dcec538955
2024-07-28 20:45:34 -04:00
Solomon Peachy
9f366b1b8a voice: Flag ':' in voiced strings.
Also corrected the 100% languages:

   English, English-US, Chinese-Simplified, German, and Italian

(There is one Italian string I didn't know how to fix)

Change-Id: I958c6737810ad0199333d17fc092eab3120cef40
2024-07-21 11:22:24 -04:00
Solomon Peachy
c2c8fcb561 updatelang: ignore "same as english" flag when determining suggested voice string
Change-Id: I78d416679c64b837fff29d51e15e1dbd78f9fc0b
2024-06-24 13:21:27 -04:00
Solomon Peachy
bd2f5760ab updatelang: More special VOICE_LANG_NAME handling.
If someone submits an incomplete translation without a VOICE_LANG_NAME,
it will add the phrase but with a blank string.  In subsequent runs,
the blank string will be treated as an error.. and copied from English.

Make it so that if it is blank, it stays blank.

Change-Id: Ib4a6645a5a52c9d0ff6dcfd0702c2a507bf8d756
2024-06-20 22:08:30 -04:00
Solomon Peachy
e100daf343 voice: Voiced strings for INVALID_VOICE and LANGUAGE_NAME
* Voice generation script will create standalone .talk clips
* These talk clips will be included in the rockbox .zip file
* All .voice files will be included in the rockbox .zip file
* Added LANGUAGE_NAME for all languages in the nightly builds

This way, any voice pack installed will give you the language
voiced in the browser, and if the voice file fails to load you
will get a natively translated error message.

Change-Id: I6b627a51746cd088d6e200666dd326ea2745f55f
2024-06-20 17:31:31 -04:00
Solomon Peachy
d22dbe74cb updatelang: '~' is not a legal character in dest or voice strings
...Unless it's the very first character (and will get stripped).

So detect and complain about this!
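
A minimal sketch of such a check, in Python rather than the tool's Perl. The function name and the message wording are hypothetical.

```python
def check_tilde(text):
    """Return a list of complaints about '~' in a dest or voice string.

    A single leading '~' is allowed, since it is stripped later;
    a '~' anywhere else is reported.
    """
    body = text[1:] if text.startswith("~") else text
    if "~" in body:
        return ["'~' is only legal as the very first character"]
    return []


print(check_tilde("~Bluetooth"))  # []
print(check_tilde("Blue~tooth"))  # one complaint
```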

Change-Id: I5e333e8ee134160f64a67783b0d5aa564716d44e
2024-05-30 21:02:18 -04:00
Solomon Peachy
c615a02ee3 updatelang: Improve tests for illegal characters
Change-Id: I1a8ed93f1e7d6b449e634656c8ff087f28c259f5
2024-05-16 21:51:14 -04:00
Solomon Peachy
3a6ed727d4 lang: Add a special flag to differentiate "intentionally identical to english"
We normally treat "same as English" as a translation error that needs to be
corrected.  However, many languages effectively use English words as-is, so
we need a way of distinguishing the "intentionally the same" situations from
our tools "automatically copying missing translated strings from English" to
avoid blank or missing UI strings.

The solution is to make sure these "intentionally same as English"
strings are actually different.  This will be accomplished by prepending
'~' to these strings.  This special character is stripped from the
binary data files used by the player and the voice generation tools.
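
The marker scheme can be sketched as below. This is an illustrative Python model with hypothetical function names and status labels, not the actual tooling.

```python
def classify(english, translated):
    """Classify a translated string relative to its English source."""
    if translated.startswith("~"):
        return "intentionally same as English"
    if translated == english:
        return "untranslated"  # flagged as a problem to be corrected
    return "translated"


def for_output(translated):
    """Strip the leading '~' marker before the string is used."""
    return translated[1:] if translated.startswith("~") else translated


print(classify("Bluetooth", "Bluetooth"))   # untranslated
print(classify("Bluetooth", "~Bluetooth"))  # intentionally same as English
print(for_output("~Bluetooth"))             # Bluetooth
```

Because the marked string differs from the English one in the language file, a plain string comparison is enough to tell a deliberate choice from an automatic copy.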

Change-Id: I90088cbd74de0e5cb9d65f75f26afe04f7e301bf
2024-05-16 20:40:37 -04:00
Solomon Peachy
b32266b7db updatelang: Avoid some runtime warnings
...And add '"' to the suspicious character list

Change-Id: Ia8a790882085a6e82c89cae09164ddbccf36e47f
2024-05-01 09:11:58 -04:00
Solomon Peachy
0c0b1b1a6b updatelang: Sanity-check the translated LANG_VOICED_DATE_FORMAT
This must be *localized* not translated!

Change-Id: I961eac91356a4b3ba7bba9828df69a08ce273543
2024-04-30 21:26:59 -04:00
Solomon Peachy
34c6ee539f updatelang: Include the old/incorrect format specifier in the error message
Change-Id: Ic8ea9430e1412d98b518bcb2d8508ef459d1700a
2024-04-30 06:18:46 -04:00
Solomon Peachy
73a47a1b5e updatelang: Make sure translated string has the correct format
We do this by parsing out the format specifiers and making sure the
translation has the correct number, type, and order of specifiers.
Percent literals ('%%') are ignored.

Mis-matched formats can lead to much badness, so to be safe, use the
untranslated string instead and flag it as a problem on the translation
site.
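
One way to express this check, as a Python sketch. The regular expression, the function names, and the sample strings are assumptions for illustration; the Perl tool's own pattern may differ.

```python
import re

# Matches a '%%' literal or a printf-style conversion, including optional
# flags, width, precision and length modifier.
_SPEC = re.compile(
    r"%(?:%|[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|z)?[diouxXcsfeEgGp])"
)


def specifiers(text):
    """Return the format specifiers in text, in order, ignoring '%%'."""
    return [m for m in _SPEC.findall(text) if m != "%%"]


def checked_translation(english, translated):
    """Return (string_to_use, ok); fall back to English on a mismatch."""
    if specifiers(english) == specifiers(translated):
        return translated, True
    return english, False


print(checked_translation("%d files", "%d Dateien"))  # translation kept
print(checked_translation("%d files", "%s Dateien"))  # falls back to English
```

Comparing the two lists covers number, type, and order in a single equality test.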

Change-Id: Ib48c2e5c3502735b1c724dd3621549faa8b602b7
2024-04-29 22:03:12 -04:00
Solomon Peachy
9fd4782c6a updatelang: Complain about suspicious characters in voiced strings.
The main intent is to catch printf() format specifiers (i.e. '%')

Change-Id: I8ed54993431e5f4d35e98de8faa7690198d5947f
2024-04-29 17:14:15 -04:00
Solomon Peachy
5883cb4a52 languages: Prefer the translated <dest> over a <voice> that is identical to English
A lot of our translations have voice phrases that are identical
to English, even though they are translated in the display text.

In these scenarios, just use the translated text when generating
the voice files.  These will still be flagged as problems by the
translation web site!
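
The preference can be sketched like this. The Python function and its parameter names are hypothetical, and the sketch ignores any other special cases the real scripts handle.

```python
def text_to_voice(english_voice, translated_dest, translated_voice):
    """Pick the text handed to the voice generator for one phrase.

    A translated <voice> is used when it differs from the English one;
    otherwise the translated <dest> is preferred.
    """
    if translated_voice and translated_voice != english_voice:
        return translated_voice
    return translated_dest


# <voice> was left identical to English, so the translated <dest> is used.
print(text_to_voice("Settings", "Einstellungen", "Settings"))
```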

Change-Id: I39a9888eaad650e4c847cccc60bd89cf44ae150a
2021-09-29 10:54:19 -04:00
Solomon Peachy
757766e807 languages: Prefer translated <dest> over untranslated english <voice>
When a phrase is translated but the voice is not, default to using
the translated phrase over the untranslated English voice.

Change-Id: Ie2cb1c6d0c370f450586b8a4653f1a073f8aec9d
2021-09-29 10:07:51 -04:00
Solomon Peachy
70e72e01d2 talk: Add support for languages that swap the tens position in numbers
For example, English would say "231" as "two hundred thirty one" but
many other languages would say "two hundred one and thirty"

So, if VOICE_NUMERIC_TENS_SWAP_SEPARATOR is not an empty string, swap
the tens and ones position and use that string ("and" in the above
example) as the voiced separator.
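
The swap can be illustrated with a small number-to-words routine. This is a Python sketch only: it uses English words throughout and a hypothetical `say` function, and it is not the firmware's talk code.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def say(number, tens_swap_separator=""):
    """Spell out 0..999; a non-empty separator swaps tens and ones."""
    words = []
    hundreds, rest = divmod(number, 100)
    if hundreds:
        words += [ONES[hundreds], "hundred"]
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        if ones and tens_swap_separator:
            # Swapped order: ones, separator, tens
            words += [ONES[ones], tens_swap_separator, TENS[tens]]
        else:
            words.append(TENS[tens])
            if ones:
                words.append(ONES[ones])
    elif rest or not hundreds:
        words.append(ONES[rest])
    return " ".join(words)


print(say(231))         # two hundred thirty one
print(say(231, "and"))  # two hundred one and thirty
```

Using the separator string itself as the switch means a language opts in simply by translating VOICE_NUMERIC_TENS_SWAP_SEPARATOR to a non-empty value.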

Change-Id: I69f8064d44b3995827327cabae6ad352bf257d04
2021-09-28 17:25:28 -04:00
Solomon Peachy
bac897381c updatelang: Handle/flag the bad data that led to english-us breaking
Change-Id: Ifffea9557d50ab5a103e13473ebe074ae1aa7b6d
2021-03-05 17:43:32 +00:00
Solomon Peachy
b459ded533 updatelang: Fix a couple of typos in the output used by the translate site
Change-Id: I13fe3e106c128dbc646906b5cb2c9702feb6bda2
2020-12-12 22:26:35 -05:00
Solomon Peachy
25529e4fe0 lang: More automated rejiggering, USB_MODE_* is no longer ibasso-specific
Change-Id: I8e7eb3bb3c5ed61572c0ade4059c3e3527558932
2020-11-22 14:45:16 +00:00
Solomon Peachy
b7b0c7c648 languages: convert recording_swcodec -> recording
Change-Id: I481a53284d63457717f4a6524edc5b477f29a20a
2020-11-19 09:52:37 -05:00
Solomon Peachy
7ff3c94e13 lang: Make all swcodec &| lcd_bitmap strings default.
Change-Id: Id0a3282884c3e258c5b2f24b35aa7e618a8e8bbe
2020-11-17 11:06:09 -05:00
Solomon Peachy
f495c4846d updatelang: Fix the ignore list having issues with line endings.
Change-Id: Ib4add14ff7415c42d0cc2ec11ec918ec02fac72d
2020-07-28 18:55:16 -04:00
Solomon Peachy
530bc16679 updatelang: Extract langstr ignore list into a separate file so it can be shared
Change-Id: I4b77e1fe435e1f02df665f18e69b5c1db0a2e0b5
2020-07-28 11:11:39 -04:00
Solomon Peachy
2aeeeb43c9 updatelang: Fix false warnings about deprecated strings
Change-Id: Ia208909ed42dc7f9b8bd7d22ca88f1a1e47d0576
2020-07-28 09:17:46 -04:00
Solomon Peachy
5da59ce2fd updatelang: more tweaks for master language and sub-languages
Change-Id: I5af62b2f03bb4ee34518592e14c6ded3ccfea4e3
2020-07-28 00:12:23 -04:00
Solomon Peachy
f30f1bb467 updatelang: don't special-case english-us yet
Change-Id: If1a331d4f603154c036cd6c6b46f3a11e5e595e4
2020-07-27 22:17:40 -04:00
Solomon Peachy
cda5b055fe updatelang: Fix a few straggling issues
Change-Id: I549a33c94c339151cf5a74f13a8ecb73454bbfd4
2020-07-27 16:56:18 -04:00
Solomon Peachy
8159c9537f updatelang: Don't rely on non-core modules
Change-Id: I262f47e10aee51116375238b458270e92e25154d
2020-07-27 19:19:02 +00:00
Solomon Peachy
2305966d84 updatelang: New tool to update language files.
Change-Id: I3c18bb34770b4b4b321199149a2ea693dfbdb7f4
2020-07-27 14:58:38 -04:00