internals: Support characters beyond the first Unicode plane (WIP)

We used 16-bit variables to store the 'character code' everywhere but
this won't let us represent anything beyond U+FFFF.

This patch changes those variables to a custom type that can be 32 or 16
bits depending on the build, and adjusts numerous internal APIs and
data structures to match.  This includes:

 * utf8decode() and friends
 * on-screen keyboard
 * font manipulation, caching, rendering, and generation
 * VFAT code that parses and generates UTF-16 dirents
 * WIN32 simulator code that reads and writes UTF-16 filenames
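
As a rough illustration of the build-dependent type described above, the
sketch below shows one way such a typedef could be set up.  Only the
ucschar_t name comes from the diff; the UNICODE32 macro, the unsigned
fixed-width types, and both helper functions are hypothetical and are
not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* UNICODE32 is a hypothetical build switch, used here for illustration only. */
#ifdef UNICODE32
typedef uint32_t ucschar_t;   /* wide enough for any code point up to U+10FFFF */
#else
typedef uint16_t ucschar_t;   /* limited to U+0000..U+FFFF */
#endif

/* Highest code point a single ucschar_t can hold in this build (hypothetical helper). */
static inline uint32_t ucschar_max(void)
{
    return (sizeof(ucschar_t) == 4) ? 0x10FFFFu : 0xFFFFu;
}

/* Byte size of a buffer holding nchars characters (hypothetical helper).
 * Deriving the size from the type avoids hard-coding 2 bytes per character. */
static inline size_t ucs_buf_bytes(size_t nchars)
{
    return nchars * sizeof(ucschar_t);
}
```

Code that sizes buffers with sizeof(ucschar_t) rather than
sizeof(unsigned short) keeps working in both build variants, which is
the pattern the kbd_helper.c hunks in this commit follow.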

Note that this patch doesn't _enable_ >16-bit Unicode support; a follow-up
patch will turn that on for appropriate targets.

Known bugs:

  * Native players in 32-bit Unicode mode generate mangled filename
    entries if the filenames include UTF-16 surrogate code points.  The
    root cause is unclear, and may reside in core dircache code.
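
For context on the surrogate handling mentioned above, the following
sketch shows the standard UTF-16 surrogate-pair arithmetic for code
points above U+FFFF.  It is not the VFAT or dircache code from this
patch and does not diagnose the bug; the function names utf16_encode
and utf16_decode are hypothetical.

```c
#include <stdint.h>

/* Encode one code point as UTF-16 (hypothetical helper).
 * Returns the number of 16-bit units written (1 or 2), or 0 if cp is
 * not a valid Unicode scalar value. */
static int utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                       /* out of range or a lone surrogate */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;          /* fits in a single unit */
        return 1;
    }
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}

/* Decode one code point from UTF-16 (hypothetical helper).
 * Returns the number of units consumed (1 or 2), or 0 on an unpaired
 * surrogate or empty input. */
static int utf16_decode(const uint16_t *in, int avail, uint32_t *cp)
{
    if (avail < 1)
        return 0;
    uint16_t hi = in[0];
    if (hi < 0xD800 || hi > 0xDFFF) {
        *cp = hi;                       /* ordinary BMP character */
        return 1;
    }
    if (hi >= 0xDC00 || avail < 2)
        return 0;                       /* low surrogate first, or truncated pair */
    uint16_t lo = in[1];
    if (lo < 0xDC00 || lo > 0xDFFF)
        return 0;                       /* high surrogate without a low one */
    *cp = 0x10000 + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
    return 2;
}
```

With 16-bit character variables a surrogate pair occupies two slots,
while a 32-bit ucschar_t holds the combined code point in one slot, so
code converting between the two has to pair and split surrogates
explicitly.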

Needs testing on:

 * Windows simulator (16-bit and 32-bit)

Change-Id: I193a00fe2a11a4181ddc82df2d71be52bf00b6e6
Solomon Peachy 2024-12-17 08:55:21 -05:00
parent 94712b34d4
commit d05c59f35b
44 changed files with 480 additions and 335 deletions


@@ -22,8 +22,8 @@
 #include "kbd_helper.h"
 /* USAGE:
-    unsigned short kbd[64];
-    unsigned short *kbd_p = kbd;
+    ucschar_t kbd[64];
+    ucschar_t *kbd_p = kbd;
     if (!kbd_create_layout("ABCD1234\n", kbd, sizeof(kbd)))
         kbd_p = NULL;
@@ -34,14 +34,14 @@
  * success returns size of buffer used
  * failure returns 0
  */
-int kbd_create_layout(const char *layout, unsigned short *buf, int bufsz)
+int kbd_create_layout(const char *layout, ucschar_t *buf, int bufsz)
 {
-    unsigned short *pbuf;
+    ucschar_t *pbuf;
     const unsigned char *p = layout;
     int len = 0;
     int total_len = 0;
     pbuf = buf;
-    while (*p && (pbuf - buf + (ptrdiff_t) sizeof(unsigned short)) < bufsz)
+    while (*p && (pbuf - buf + (ptrdiff_t) sizeof(ucschar_t)) < bufsz)
     {
         p = rb->utf8decode(p, &pbuf[len+1]);
         if (pbuf[len+1] == '\n')
@@ -60,7 +60,7 @@ int kbd_create_layout(const char *layout, unsigned short *buf, int bufsz)
         *pbuf = len;
         pbuf[len+1] = 0xFEFF; /* mark end of characters */
         total_len += len + 1;
-        return total_len * sizeof(unsigned short);
+        return total_len * sizeof(ucschar_t);
     }
     return 0;