[Libre-soc-isa] [Bug 794] SVP64 REMAP for utf8
bugzilla-daemon at libre-soc.org
Wed Mar 30 16:16:30 BST 2022
https://bugs.libre-soc.org/show_bug.cgi?id=794
--- Comment #6 from Jacob Lifshay <programmerjake at gmail.com> ---
additional links:
(WTF-8 is UTF-8 modified so that it can also represent unpaired surrogates,
as found in ill-formed UTF-16. This is useful for Windows file names,
Java/JS strings, etc.)
https://simonsapin.github.io/wtf-8/
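
A small illustration of the unpaired-surrogate case, as a sketch only: it
uses Python's built-in "surrogatepass" error handler, which is not a WTF-8
codec but happens to produce the same 3-byte form for a single lone
surrogate. (It would not merge a surrogate pair into one 4-byte sequence as
WTF-8 requires, so it is only meant for this one-code-point case.)

```python
# U+D800 is a lone high surrogate: not a Unicode scalar value,
# so Python's strict UTF-8 encoder refuses it.
lone = "\ud800"

try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# "surrogatepass" emits the generalized 3-byte encoding instead.
# For one unpaired surrogate this matches what WTF-8 would store.
encoded = lone.encode("utf-8", "surrogatepass")

print(strict_ok, encoded.hex())
```

The second byte comes out as A0, which is outside the 80..9F range that
Table 3-7 allows after a first byte of ED. That restriction is exactly what
WTF-8 relaxes.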
https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf
Table 3-7 (modified to put a star next to where the original used bold text)
Well-Formed UTF-8 Byte Sequences

Code Points          First Byte  Second Byte  Third Byte  Fourth Byte
U+0000..U+007F       00..7F
U+0080..U+07FF       C2..DF      80..BF
U+0800..U+0FFF       E0          *A0..BF      80..BF
U+1000..U+CFFF       E1..EC      80..BF       80..BF
U+D000..U+D7FF       ED          80..*9F      80..BF
U+E000..U+FFFF       EE..EF      80..BF       80..BF
U+10000..U+3FFFF     F0          *90..BF      80..BF      80..BF
U+40000..U+FFFFF     F1..F3      80..BF       80..BF      80..BF
U+100000..U+10FFFF   F4          80..*8F      80..BF      80..BF
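
The table can be turned into a validator fairly directly. The following is a
minimal sketch in Python, not anything proposed for SVP64 itself; the names
`_ROWS` and `is_well_formed_utf8` are made up for illustration. Each row
holds the first-byte range, the (possibly restricted) second-byte range, and
the number of trailing bytes. Third and fourth bytes are always 80..BF.

```python
# One entry per row of Table 3-7 (hypothetical layout, for illustration):
# (first lo, first hi, second lo, second hi, number of trailing bytes)
_ROWS = [
    (0x00, 0x7F, None, None, 0),
    (0xC2, 0xDF, 0x80, 0xBF, 1),
    (0xE0, 0xE0, 0xA0, 0xBF, 2),  # starred: rejects overlong 3-byte forms
    (0xE1, 0xEC, 0x80, 0xBF, 2),
    (0xED, 0xED, 0x80, 0x9F, 2),  # starred: rejects surrogates
    (0xEE, 0xEF, 0x80, 0xBF, 2),
    (0xF0, 0xF0, 0x90, 0xBF, 3),  # starred: rejects overlong 4-byte forms
    (0xF1, 0xF3, 0x80, 0xBF, 3),
    (0xF4, 0xF4, 0x80, 0x8F, 3),  # starred: rejects values above U+10FFFF
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data consists only of sequences listed in Table 3-7."""
    i = 0
    n = len(data)
    while i < n:
        first = data[i]
        for lo, hi, lo2, hi2, trail in _ROWS:
            if lo <= first <= hi:
                break
        else:
            # 80..C1 and F5..FF never start a well-formed sequence.
            return False
        if i + trail >= n:
            return False  # sequence is truncated
        if trail >= 1 and not (lo2 <= data[i + 1] <= hi2):
            return False
        # Any third and fourth bytes are plain continuation bytes.
        for k in range(2, trail + 1):
            if not (0x80 <= data[i + k] <= 0xBF):
                return False
        i += trail + 1
    return True
```

For example, `is_well_formed_utf8(b"\xed\xa0\x80")` is False because of the
starred 80..9F restriction after ED, while ordinary encoded text such as
`"é€".encode("utf-8")` passes.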