internals: Support characters beyond the first unicode plane (WIP)

We used 16-bit variables to store the 'character code' everywhere, but
16 bits cannot represent anything beyond U+FFFF.
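
As a minimal illustration of the limitation (not code from this patch; `store_in_16bit` is a hypothetical helper), storing a code point above U+FFFF in a 16-bit variable silently drops the upper bits:

```c
#include <stdint.h>

/* Hypothetical helper: store a code point in a 16-bit variable,
 * the way the old 'unsigned short' character-code variables did. */
static uint16_t store_in_16bit(uint32_t codepoint)
{
    /* The conversion keeps only the low 16 bits of the value. */
    return (uint16_t)codepoint;
}
```

With this helper, U+20AC survives unchanged, while U+1F600 comes back as 0xF600, which is a different (private-use) code point.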

This patch changes those variables to a custom type that can be 32 or 16
bits depending on the build, and adjusts numerous internal APIs and
data structures to match.  This includes:

 * utf8decode() and friends
 * on-screen keyboard
 * font manipulation, caching, rendering, and generation
 * VFAT code that parses and generates UTF-16 dirents
 * WIN32 simulator code that reads and writes UTF-16 filenames
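
The idea of a build-dependent character type can be sketched roughly as follows. This is an assumption-laden illustration, not the patch's code: `EXAMPLE_UNICODE32`, `EXAMPLE_REPLACEMENT` and `example_utf8decode` are hypothetical names, and only the `ucschar_t` name and the decoder's parameter shape are taken from the diff below.

```c
#include <stdint.h>

/* EXAMPLE_UNICODE32 is a hypothetical build switch; the real patch may
 * select the width differently. */
#ifdef EXAMPLE_UNICODE32
typedef uint32_t ucschar_t;
#else
typedef uint16_t ucschar_t;
#endif

#define EXAMPLE_REPLACEMENT 0xFFFDu /* U+FFFD REPLACEMENT CHARACTER */

/* Decode one UTF-8 sequence into *ucs and return a pointer to the next
 * unread byte. Simplified sketch: it does not reject overlong forms or
 * encoded surrogates. */
static const unsigned char *example_utf8decode(const unsigned char *utf8,
                                               ucschar_t *ucs)
{
    uint32_t code;
    int extra;
    unsigned char c = *utf8++;

    if (c < 0x80)                { code = c;        extra = 0; }
    else if ((c & 0xE0) == 0xC0) { code = c & 0x1F; extra = 1; }
    else if ((c & 0xF0) == 0xE0) { code = c & 0x0F; extra = 2; }
    else if ((c & 0xF8) == 0xF0) { code = c & 0x07; extra = 3; }
    else { *ucs = EXAMPLE_REPLACEMENT; return utf8; }

    while (extra-- > 0) {
        /* Every continuation byte must look like 10xxxxxx. */
        if ((*utf8 & 0xC0) != 0x80) {
            *ucs = EXAMPLE_REPLACEMENT;
            return utf8;
        }
        code = (code << 6) | (uint32_t)(*utf8++ & 0x3F);
    }

    /* In a 16-bit build, anything above U+FFFF cannot be stored. */
    if (code > (ucschar_t)-1)
        code = EXAMPLE_REPLACEMENT;

    *ucs = (ucschar_t)code;
    return utf8;
}
```

In this sketch the same source compiles for both widths: a 16-bit build substitutes U+FFFD for 4-byte sequences, while a 32-bit build returns the full code point.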

Note that this patch doesn't _enable_ >16-bit Unicode support; a follow-up
patch will turn that on for appropriate targets.

Known bugs:

  * Native players in 32-bit Unicode mode generate mangled filename
    entries if they include UTF-16 surrogate codepoints.  Root cause
    is unclear, and may reside in core dircache code.
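
For reference, the standard UTF-16 surrogate mapping that such filenames depend on can be sketched as below. This is not the VFAT or dircache code from the patch, and it does not diagnose the bug; `example_utf16_encode` and `example_utf16_decode_pair` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Encode one code point as UTF-16. Returns the number of 16-bit units
 * written (1 or 2), or 0 if the value is not an encodable scalar. */
static int example_utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                 /* lone surrogates are not scalars */
    if (cp > 0x10FFFF)
        return 0;                 /* outside the Unicode range */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;    /* fits in a single unit */
        return 1;
    }
    cp -= 0x10000;                /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}

/* Combine a high/low surrogate pair back into one code point.
 * The caller is assumed to have checked that both units are in range. */
static uint32_t example_utf16_decode_pair(uint16_t hi, uint16_t lo)
{
    return 0x10000u + (((uint32_t)(hi - 0xD800) << 10)
                       | (uint32_t)(lo - 0xDC00));
}
```

For example, U+1F600 encodes to the pair 0xD83D 0xDE00, and decoding that pair yields U+1F600 again.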

Needs testing on:

 * Windows simulator (16-bit and 32-bit)

Change-Id: I193a00fe2a11a4181ddc82df2d71be52bf00b6e6
Solomon Peachy 2024-12-17 08:55:21 -05:00
parent 94712b34d4
commit d05c59f35b
44 changed files with 480 additions and 335 deletions

@@ -225,7 +225,7 @@ void beep_play(unsigned int frequency, unsigned int duration, unsigned int ampli
 \param amplitude
 \description
-unsigned short *bidi_l2v( const unsigned char *str, int orientation )
+ucschar_t *bidi_l2v( const unsigned char *str, int orientation )
 \param str
 \param orientation
 \return
@@ -407,13 +407,13 @@ const struct cbmp_bitmap_info_entry *core_bitmaps
 \return
 \description
-const unsigned char *font_get_bits( struct font *pf, unsigned short char_code )
+const unsigned char *font_get_bits( struct font *pf, ucschar_t char_code )
 \param pf
 \param char_code
 \return
 \description
-const unsigned char* utf8decode(const unsigned char *utf8, unsigned short *ucs)
+const unsigned char* utf8decode(const unsigned char *utf8, ucschar_t *ucs)
 \group unicode stuff
 \param utf8
 \param ucs
@@ -747,7 +747,7 @@ int font_getstringsize(const unsigned char *str, int *w, int *h, int fontnumber)
 \return
 \description
-int font_get_width(struct font* pf, unsigned short char_code)
+int font_get_width(struct font* pf, ucschar_t char_code)
 \param pf
 \param char_code
 \return
@@ -972,7 +972,7 @@ bool is_diacritic(const unsigned short char_code, bool *is_rtl)
 \return
 \description
-int kbd_input(char* buffer, int buflen, unsigned short *kbd)
+int kbd_input(char* buffer, int buflen, ucschar_t *kbd)
 \group misc
 \param buffer
 \param buflen