Case mapping
There are three sets of characters: upper case, lower case, and neither. toUpperCase() and toLowerCase() provide mappings between them. Let's define the following sets:
- is the set of all single-code-point Unicode characters.
- := are characters that are invariant under both toUpperCase() and toLowerCase(), such as numbers and emojis. These are "uninteresting" characters that aren't in the scope of discussion.
- is closed under both functions, i.e. there does not exist a character such that but or . This has been built into the data collector code shown below, so you can check that there are no logs in the browser console. (Note that also includes code points that aren't assigned to characters, so the exact set of characters is hard to get.)
- := . Characters in are never mapped to . Size: 2907
- := = To summarize, they are:
Character set (21)
- ǰ (U+01F0) → J̌ (U+004A U+030C) → ǰ (U+01F0)
- ΐ (U+0390) → Ϊ́ (U+03AA U+0301) → ΐ (U+0390)
- ΰ (U+03B0) → Ϋ́ (U+03AB U+0301) → ΰ (U+03B0)
- ẖ (U+1E96) → H̱ (U+0048 U+0331) → ẖ (U+1E96)
- ẗ (U+1E97) → T̈ (U+0054 U+0308) → ẗ (U+1E97)
- ẘ (U+1E98) → W̊ (U+0057 U+030A) → ẘ (U+1E98)
- ẙ (U+1E99) → Y̊ (U+0059 U+030A) → ẙ (U+1E99)
- ὐ (U+1F50) → Υ̓ (U+03A5 U+0313) → ὐ (U+1F50)
- ὒ (U+1F52) → Υ̓̀ (U+03A5 U+0313 U+0300) → ὒ (U+1F52)
- ὔ (U+1F54) → Υ̓́ (U+03A5 U+0313 U+0301) → ὔ (U+1F54)
- ὖ (U+1F56) → Υ̓͂ (U+03A5 U+0313 U+0342) → ὖ (U+1F56)
- ᾶ (U+1FB6) → Α͂ (U+0391 U+0342) → ᾶ (U+1FB6)
- ῆ (U+1FC6) → Η͂ (U+0397 U+0342) → ῆ (U+1FC6)
- ῒ (U+1FD2) → Ϊ̀ (U+03AA U+0300) → ῒ (U+1FD2)
- ῖ (U+1FD6) → Ι͂ (U+0399 U+0342) → ῖ (U+1FD6)
- ῗ (U+1FD7) → Ϊ͂ (U+03AA U+0342) → ῗ (U+1FD7)
- ῢ (U+1FE2) → Ϋ̀ (U+03AB U+0300) → ῢ (U+1FE2)
- ῤ (U+1FE4) → Ρ̓ (U+03A1 U+0313) → ῤ (U+1FE4)
- ῦ (U+1FE6) → Υ͂ (U+03A5 U+0342) → ῦ (U+1FE6)
- ῧ (U+1FE7) → Ϋ͂ (U+03AB U+0342) → ῧ (U+1FE7)
- ῶ (U+1FF6) → Ω͂ (U+03A9 U+0342) → ῶ (U+1FF6)
- Latin Extended-B
- LATIN SMALL LETTER J WITH CARON (U+01F0)
- Greek and Coptic
- GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS (U+0390)
- GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS (U+03B0)
- Latin Extended Additional
- 4 characters
- Greek Extended
- 14 characters
- := = The only character is:
Character set (1)
- İ (U+0130) → i̇ (U+0069 U+0307) → İ (U+0130)
- Latin Extended-A
- LATIN CAPITAL LETTER I WITH DOT ABOVE (U+0130)
- := = To summarize, they are:
Character set (79)
- ß (U+00DF) → SS (U+0053 U+0053) → ss (U+0073 U+0073)
- ʼn (U+0149) → ʼN (U+02BC U+004E) → ʼn (U+02BC U+006E)
- և (U+0587) → ԵՒ (U+0535 U+0552) → եւ (U+0565 U+0582)
- ẚ (U+1E9A) → Aʾ (U+0041 U+02BE) → aʾ (U+0061 U+02BE)
- ᾀ (U+1F80) → ἈΙ (U+1F08 U+0399) → ἀι (U+1F00 U+03B9)
- ᾁ (U+1F81) → ἉΙ (U+1F09 U+0399) → ἁι (U+1F01 U+03B9)
- ᾂ (U+1F82) → ἊΙ (U+1F0A U+0399) → ἂι (U+1F02 U+03B9)
- ᾃ (U+1F83) → ἋΙ (U+1F0B U+0399) → ἃι (U+1F03 U+03B9)
- ᾄ (U+1F84) → ἌΙ (U+1F0C U+0399) → ἄι (U+1F04 U+03B9)
- ᾅ (U+1F85) → ἍΙ (U+1F0D U+0399) → ἅι (U+1F05 U+03B9)
- ᾆ (U+1F86) → ἎΙ (U+1F0E U+0399) → ἆι (U+1F06 U+03B9)
- ᾇ (U+1F87) → ἏΙ (U+1F0F U+0399) → ἇι (U+1F07 U+03B9)
- ᾈ (U+1F88) → ἈΙ (U+1F08 U+0399) → ἀι (U+1F00 U+03B9)
- ᾉ (U+1F89) → ἉΙ (U+1F09 U+0399) → ἁι (U+1F01 U+03B9)
- ᾊ (U+1F8A) → ἊΙ (U+1F0A U+0399) → ἂι (U+1F02 U+03B9)
- ᾋ (U+1F8B) → ἋΙ (U+1F0B U+0399) → ἃι (U+1F03 U+03B9)
- ᾌ (U+1F8C) → ἌΙ (U+1F0C U+0399) → ἄι (U+1F04 U+03B9)
- ᾍ (U+1F8D) → ἍΙ (U+1F0D U+0399) → ἅι (U+1F05 U+03B9)
- ᾎ (U+1F8E) → ἎΙ (U+1F0E U+0399) → ἆι (U+1F06 U+03B9)
- ᾏ (U+1F8F) → ἏΙ (U+1F0F U+0399) → ἇι (U+1F07 U+03B9)
- ᾐ (U+1F90) → ἨΙ (U+1F28 U+0399) → ἠι (U+1F20 U+03B9)
- ᾑ (U+1F91) → ἩΙ (U+1F29 U+0399) → ἡι (U+1F21 U+03B9)
- ᾒ (U+1F92) → ἪΙ (U+1F2A U+0399) → ἢι (U+1F22 U+03B9)
- ᾓ (U+1F93) → ἫΙ (U+1F2B U+0399) → ἣι (U+1F23 U+03B9)
- ᾔ (U+1F94) → ἬΙ (U+1F2C U+0399) → ἤι (U+1F24 U+03B9)
- ᾕ (U+1F95) → ἭΙ (U+1F2D U+0399) → ἥι (U+1F25 U+03B9)
- ᾖ (U+1F96) → ἮΙ (U+1F2E U+0399) → ἦι (U+1F26 U+03B9)
- ᾗ (U+1F97) → ἯΙ (U+1F2F U+0399) → ἧι (U+1F27 U+03B9)
- ᾘ (U+1F98) → ἨΙ (U+1F28 U+0399) → ἠι (U+1F20 U+03B9)
- ᾙ (U+1F99) → ἩΙ (U+1F29 U+0399) → ἡι (U+1F21 U+03B9)
- ᾚ (U+1F9A) → ἪΙ (U+1F2A U+0399) → ἢι (U+1F22 U+03B9)
- ᾛ (U+1F9B) → ἫΙ (U+1F2B U+0399) → ἣι (U+1F23 U+03B9)
- ᾜ (U+1F9C) → ἬΙ (U+1F2C U+0399) → ἤι (U+1F24 U+03B9)
- ᾝ (U+1F9D) → ἭΙ (U+1F2D U+0399) → ἥι (U+1F25 U+03B9)
- ᾞ (U+1F9E) → ἮΙ (U+1F2E U+0399) → ἦι (U+1F26 U+03B9)
- ᾟ (U+1F9F) → ἯΙ (U+1F2F U+0399) → ἧι (U+1F27 U+03B9)
- ᾠ (U+1FA0) → ὨΙ (U+1F68 U+0399) → ὠι (U+1F60 U+03B9)
- ᾡ (U+1FA1) → ὩΙ (U+1F69 U+0399) → ὡι (U+1F61 U+03B9)
- ᾢ (U+1FA2) → ὪΙ (U+1F6A U+0399) → ὢι (U+1F62 U+03B9)
- ᾣ (U+1FA3) → ὫΙ (U+1F6B U+0399) → ὣι (U+1F63 U+03B9)
- ᾤ (U+1FA4) → ὬΙ (U+1F6C U+0399) → ὤι (U+1F64 U+03B9)
- ᾥ (U+1FA5) → ὭΙ (U+1F6D U+0399) → ὥι (U+1F65 U+03B9)
- ᾦ (U+1FA6) → ὮΙ (U+1F6E U+0399) → ὦι (U+1F66 U+03B9)
- ᾧ (U+1FA7) → ὯΙ (U+1F6F U+0399) → ὧι (U+1F67 U+03B9)
- ᾨ (U+1FA8) → ὨΙ (U+1F68 U+0399) → ὠι (U+1F60 U+03B9)
- ᾩ (U+1FA9) → ὩΙ (U+1F69 U+0399) → ὡι (U+1F61 U+03B9)
- ᾪ (U+1FAA) → ὪΙ (U+1F6A U+0399) → ὢι (U+1F62 U+03B9)
- ᾫ (U+1FAB) → ὫΙ (U+1F6B U+0399) → ὣι (U+1F63 U+03B9)
- ᾬ (U+1FAC) → ὬΙ (U+1F6C U+0399) → ὤι (U+1F64 U+03B9)
- ᾭ (U+1FAD) → ὭΙ (U+1F6D U+0399) → ὥι (U+1F65 U+03B9)
- ᾮ (U+1FAE) → ὮΙ (U+1F6E U+0399) → ὦι (U+1F66 U+03B9)
- ᾯ (U+1FAF) → ὯΙ (U+1F6F U+0399) → ὧι (U+1F67 U+03B9)
- ᾲ (U+1FB2) → ᾺΙ (U+1FBA U+0399) → ὰι (U+1F70 U+03B9)
- ᾳ (U+1FB3) → ΑΙ (U+0391 U+0399) → αι (U+03B1 U+03B9)
- ᾴ (U+1FB4) → ΆΙ (U+0386 U+0399) → άι (U+03AC U+03B9)
- ᾷ (U+1FB7) → Α͂Ι (U+0391 U+0342 U+0399) → ᾶι (U+1FB6 U+03B9)
- ᾼ (U+1FBC) → ΑΙ (U+0391 U+0399) → αι (U+03B1 U+03B9)
- ῂ (U+1FC2) → ῊΙ (U+1FCA U+0399) → ὴι (U+1F74 U+03B9)
- ῃ (U+1FC3) → ΗΙ (U+0397 U+0399) → ηι (U+03B7 U+03B9)
- ῄ (U+1FC4) → ΉΙ (U+0389 U+0399) → ήι (U+03AE U+03B9)
- ῇ (U+1FC7) → Η͂Ι (U+0397 U+0342 U+0399) → ῆι (U+1FC6 U+03B9)
- ῌ (U+1FCC) → ΗΙ (U+0397 U+0399) → ηι (U+03B7 U+03B9)
- ῲ (U+1FF2) → ῺΙ (U+1FFA U+0399) → ὼι (U+1F7C U+03B9)
- ῳ (U+1FF3) → ΩΙ (U+03A9 U+0399) → ωι (U+03C9 U+03B9)
- ῴ (U+1FF4) → ΏΙ (U+038F U+0399) → ώι (U+03CE U+03B9)
- ῷ (U+1FF7) → Ω͂Ι (U+03A9 U+0342 U+0399) → ῶι (U+1FF6 U+03B9)
- ῼ (U+1FFC) → ΩΙ (U+03A9 U+0399) → ωι (U+03C9 U+03B9)
- ff (U+FB00) → FF (U+0046 U+0046) → ff (U+0066 U+0066)
- fi (U+FB01) → FI (U+0046 U+0049) → fi (U+0066 U+0069)
- fl (U+FB02) → FL (U+0046 U+004C) → fl (U+0066 U+006C)
- ffi (U+FB03) → FFI (U+0046 U+0046 U+0049) → ffi (U+0066 U+0066 U+0069)
- ffl (U+FB04) → FFL (U+0046 U+0046 U+004C) → ffl (U+0066 U+0066 U+006C)
- ſt (U+FB05) → ST (U+0053 U+0054) → st (U+0073 U+0074)
- st (U+FB06) → ST (U+0053 U+0054) → st (U+0073 U+0074)
- ﬓ (U+FB13) → ՄՆ (U+0544 U+0546) → մն (U+0574 U+0576)
- ﬔ (U+FB14) → ՄԵ (U+0544 U+0535) → մե (U+0574 U+0565)
- ﬕ (U+FB15) → ՄԻ (U+0544 U+053B) → մի (U+0574 U+056B)
- ﬖ (U+FB16) → ՎՆ (U+054E U+0546) → վն (U+057E U+0576)
- ﬗ (U+FB17) → ՄԽ (U+0544 U+053D) → մխ (U+0574 U+056D)
- Latin-1 Supplement
- LATIN SMALL LETTER SHARP S (U+00DF)
- Latin Extended-A
- LATIN SMALL LETTER N PRECEDED BY APOSTROPHE (U+0149)
- Latin Extended Additional
- LATIN SMALL LETTER A WITH RIGHT HALF RING (U+1E9A)
- Armenian
- ARMENIAN SMALL LIGATURE ECH YIWN (U+0587)
- Greek Extended
- Greek small letters with ypogegrammeni (): U+1F{8,9,A}0 – U+1F{8,9,A}7, U+1F{B,C,F}2 – U+1F{B,C,F}4, U+1F{B,C,F}7 (the Iota subscript, COMBINING GREEK YPOGEGRAMMENI (U+0345) itself, maps to GREEK CAPITAL LETTER IOTA (U+0399) which will be discussed later)
- Greek capital letters with prosgegrammeni (): U+1F{8,9,A}8 – U+1F{8,9,A}F, U+1F{B,C,F}C
- Alphabetic Presentation Forms
- All Latin ligatures: U+FB00 – U+FB06
- All Armenian ligatures: U+FB13 – U+FB17
- := = ∅
- := . Conceptually, these are uppercase letters that just don't have a single Unicode code point.
- := . Conceptually, these are lowercase letters that just don't have a single Unicode code point.
- :=
- :=
- := . Size: 1464
- := . Size: 1485
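The groups listed above differ in two observable ways: whether a mapping expands to more than one code point, and whether mapping there and back recovers the original character (as with ǰ and İ) or not (as with ß). The following is a minimal sketch of that distinction, not the article's data collector. The helper names `up`, `lo`, `codePointCount`, and `describeExpansion` are hypothetical, and the NFC normalization follows the note on normalization in this section.

```javascript
// NFC-normalized case mappings (assumed to match the article's convention).
const up = (s) => s.toUpperCase().normalize("NFC");
const lo = (s) => s.toLowerCase().normalize("NFC");

// Number of code points (not UTF-16 code units) in a string.
const codePointCount = (s) => [...s].length;

// Hypothetical helper: describe how a single character behaves when one of
// its case mappings expands to several code points.
function describeExpansion(ch) {
  const upper = up(ch);
  const lower = lo(ch);
  return {
    upperExpands: codePointCount(upper) > 1,
    lowerExpands: codePointCount(lower) > 1,
    // Does mapping there and back recover the original character?
    roundTripsViaUpper: lo(upper) === ch,
    roundTripsViaLower: up(lower) === ch,
  };
}

console.log(describeExpansion("\u01F0")); // ǰ
console.log(describeExpansion("\u00DF")); // ß
console.log(describeExpansion("\u0130")); // İ
```

The round trips for ǰ and İ only succeed because the result is normalized to NFC, which recombines the base letter and the combining mark.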
Also define the following predicates:
Thus define the following sets:
- := . Size: 1821 (Note: Unicode utility lists 1831, with the extra 10 possibly being duplicates by normalization.)
- := . Size: 2223 (Note: Unicode utility lists 2233, with the extra 10 possibly being duplicates by normalization.)
- := . Size: 1350
- := . Size: 1441
Define the following terms:
- is upper case if .
- is lower case if .
- upper case and lower case are mutually exclusive: = ∅
- is cased if .
- is uncased if .
- is lowercase variant if .
- is uppercase variant if .
- is case-mapping variant if is either lowercase variant or uppercase variant.
- is case-mapping invariant if is neither lowercase variant nor uppercase variant.
NOTE: To maximize the number of single-code-point characters in discussion, we normalize the output with .normalize("NFC").
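To see what the normalization in the note does, here is a small sketch using ΐ (U+0390), one of the characters listed earlier. The helper `toCodePoints` is a hypothetical name introduced only for this illustration.

```javascript
// List the code points of a string as U+XXXX labels.
const toCodePoints = (s) =>
  [...s].map(
    (c) => "U+" + c.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );

// Uppercase ΐ (U+0390) without and with NFC normalization.
const raw = "\u0390".toUpperCase();
const composed = raw.normalize("NFC");

console.log(toCodePoints(raw));
console.log(toCodePoints(composed));
```

The normalized result has fewer code points than the raw one, because NFC combines the capital iota with the dialytika. It still does not collapse to a single code point, which is why the character remains in the multi-code-point lists.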
Case-mapping properties
Idempotence
The first invariant we want to establish is toUpperCase(toUpperCase(c)) == toUpperCase(c) and toLowerCase(toLowerCase(c)) == toLowerCase(c) for all .
- = ∅
- = ∅
This also means that if a character is uppercase variant, then it will not be the output of toUpperCase(); similarly, if a character is lowercase variant, then it will not be the output of toLowerCase().
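The idempotence property can be checked mechanically. The sketch below is one possible way to do it, assuming the NFC-normalized mappings used throughout this section. The function name `idempotenceViolations` and the choice of range are illustrative only.

```javascript
const up = (s) => s.toUpperCase().normalize("NFC");
const lo = (s) => s.toLowerCase().normalize("NFC");

// Collect the code points in [start, end] for which applying `fn` twice
// gives a different result than applying it once.
function idempotenceViolations(fn, start = 0, end = 0x10ffff) {
  const violations = [];
  for (let cp = start; cp <= end; cp++) {
    if (cp >= 0xd800 && cp <= 0xdfff) continue; // skip lone surrogates
    const once = fn(String.fromCodePoint(cp));
    if (fn(once) !== once) violations.push(cp);
  }
  return violations;
}

// Check the Latin blocks; widen the range to scan more of Unicode.
console.log(idempotenceViolations(up, 0x0000, 0x024f).length);
console.log(idempotenceViolations(lo, 0x0000, 0x024f).length);
```

An empty result for a range corresponds to the empty sets stated above, restricted to that range.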
Complementary ranges
The ranges of toUpperCase() and toLowerCase() are disjoint:
- = ∅
But, they are not partitions of :
- =
Character set (31)
Dž (U+01C5)Lj (U+01C8)Nj (U+01CB)Dz (U+01F2)ᾈ (U+1F88)ᾉ (U+1F89)ᾊ (U+1F8A)ᾋ (U+1F8B)ᾌ (U+1F8C)ᾍ (U+1F8D)ᾎ (U+1F8E)ᾏ (U+1F8F)ᾘ (U+1F98)ᾙ (U+1F99)ᾚ (U+1F9A)ᾛ (U+1F9B)ᾜ (U+1F9C)ᾝ (U+1F9D)ᾞ (U+1F9E)ᾟ (U+1F9F)ᾨ (U+1FA8)ᾩ (U+1FA9)ᾪ (U+1FAA)ᾫ (U+1FAB)ᾬ (U+1FAC)ᾭ (U+1FAD)ᾮ (U+1FAE)ᾯ (U+1FAF)ᾼ (U+1FBC)ῌ (U+1FCC)ῼ (U+1FFC)
27 of these characters are . The other 4 are:
- Latin Extended-B
- Ligatures (): U+01C5, U+01C8, U+01CB, U+01F2
These characters cannot be produced by toUpperCase() or toLowerCase() with any input, including themselves.
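One way to confirm that a character is never produced is to collect every single-code-point output of each function and test membership. This is a rough sketch under the same NFC assumption as the rest of the section. `singleCodePointOutputs` is a hypothetical helper, and scanning all code points takes a moment.

```javascript
const up = (s) => s.toUpperCase().normalize("NFC");
const lo = (s) => s.toLowerCase().normalize("NFC");

// True if the string consists of exactly one code point.
const isSingleCodePoint = (s) =>
  s.length === (s.codePointAt(0) > 0xffff ? 2 : 1);

// Collect every single-code-point result of `fn` over all code points,
// identity mappings included.
function singleCodePointOutputs(fn) {
  const outputs = new Set();
  for (let cp = 0; cp <= 0x10ffff; cp++) {
    if (cp >= 0xd800 && cp <= 0xdfff) continue; // skip lone surrogates
    const out = fn(String.fromCodePoint(cp));
    if (isSingleCodePoint(out)) outputs.add(out);
  }
  return outputs;
}

const upperOutputs = singleCodePointOutputs(up);
const lowerOutputs = singleCodePointOutputs(lo);

// ǅ (U+01C5) and ᾈ (U+1F88) versus their upper/lower counterparts.
for (const ch of ["\u01C5", "\u1F88", "\u01C4", "\u01C6"]) {
  console.log(ch, upperOutputs.has(ch), lowerOutputs.has(ch));
}
```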
Relationships between case-mapping variance and case
Does upper(lower) case imply upper(lower)case invariance?
Yes. toUpperCase() and toLowerCase() are identity functions on and , respectively.
- = ∅
- = ∅
This means upper case implies uppercase invariance, and lower case implies lowercase invariance.
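The claim can be spot-checked with Unicode property escapes. The sketch below assumes that "upper case" and "lower case" correspond to the general categories Uppercase_Letter (Lu) and Lowercase_Letter (Ll), and that only NFC-stable input characters are considered. Both are assumptions made for this illustration, and the names `isUpper`, `isLower`, and `ownCaseViolations` are hypothetical.

```javascript
// Assumption: "upper case" / "lower case" are read as general categories
// Uppercase_Letter (Lu) and Lowercase_Letter (Ll).
const isUpper = (ch) => /^\p{Lu}$/u.test(ch);
const isLower = (ch) => /^\p{Ll}$/u.test(ch);

const up = (s) => s.toUpperCase().normalize("NFC");
const lo = (s) => s.toLowerCase().normalize("NFC");

// Return the code points in [start, end] that are upper case but changed by
// toUpperCase(), or lower case but changed by toLowerCase().
function ownCaseViolations(start, end) {
  const found = [];
  for (let cp = start; cp <= end; cp++) {
    if (cp >= 0xd800 && cp <= 0xdfff) continue; // skip lone surrogates
    const ch = String.fromCodePoint(cp);
    if (ch.normalize("NFC") !== ch) continue; // keep only NFC-stable input
    if ((isUpper(ch) && up(ch) !== ch) || (isLower(ch) && lo(ch) !== ch)) {
      found.push(cp);
    }
  }
  return found;
}

console.log(ownCaseViolations(0x0000, 0x03ff).length);
```

The NFC-stability filter matters because a few letters, such as OHM SIGN (U+2126), normalize to a different code point and would otherwise look like violations.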
Does upper(lower) case imply lower(upper)case variance?
No. and are not proper subsets of :
- =
Character set (471)
ϒ (U+03D2)ϓ (U+03D3)ϔ (U+03D4)ℂ (U+2102)ℇ (U+2107)ℋ (U+210B)ℌ (U+210C)ℍ (U+210D)ℐ (U+2110)ℑ (U+2111)ℒ (U+2112)ℕ (U+2115)ℙ (U+2119)ℚ (U+211A)ℛ (U+211B)ℜ (U+211C)ℝ (U+211D)ℤ (U+2124)ℨ (U+2128)ℬ (U+212C)ℭ (U+212D)ℰ (U+2130)ℱ (U+2131)ℳ (U+2133)ℾ (U+213E)ℿ (U+213F)ⅅ (U+2145)𝐀 (U+1D400)𝐁 (U+1D401)𝐂 (U+1D402)𝐃 (U+1D403)𝐄 (U+1D404)𝐅 (U+1D405)𝐆 (U+1D406)𝐇 (U+1D407)𝐈 (U+1D408)𝐉 (U+1D409)𝐊 (U+1D40A)𝐋 (U+1D40B)𝐌 (U+1D40C)𝐍 (U+1D40D)𝐎 (U+1D40E)𝐏 (U+1D40F)𝐐 (U+1D410)𝐑 (U+1D411)𝐒 (U+1D412)𝐓 (U+1D413)𝐔 (U+1D414)𝐕 (U+1D415)𝐖 (U+1D416)𝐗 (U+1D417)𝐘 (U+1D418)𝐙 (U+1D419)𝐴 (U+1D434)𝐵 (U+1D435)𝐶 (U+1D436)𝐷 (U+1D437)𝐸 (U+1D438)𝐹 (U+1D439)𝐺 (U+1D43A)𝐻 (U+1D43B)𝐼 (U+1D43C)𝐽 (U+1D43D)𝐾 (U+1D43E)𝐿 (U+1D43F)𝑀 (U+1D440)𝑁 (U+1D441)𝑂 (U+1D442)𝑃 (U+1D443)𝑄 (U+1D444)𝑅 (U+1D445)𝑆 (U+1D446)𝑇 (U+1D447)𝑈 (U+1D448)𝑉 (U+1D449)𝑊 (U+1D44A)𝑋 (U+1D44B)𝑌 (U+1D44C)𝑍 (U+1D44D)𝑨 (U+1D468)𝑩 (U+1D469)𝑪 (U+1D46A)𝑫 (U+1D46B)𝑬 (U+1D46C)𝑭 (U+1D46D)𝑮 (U+1D46E)𝑯 (U+1D46F)𝑰 (U+1D470)𝑱 (U+1D471)𝑲 (U+1D472)𝑳 (U+1D473)𝑴 (U+1D474)𝑵 (U+1D475)𝑶 (U+1D476)𝑷 (U+1D477)𝑸 (U+1D478)𝑹 (U+1D479)𝑺 (U+1D47A)𝑻 (U+1D47B)𝑼 (U+1D47C)𝑽 (U+1D47D)𝑾 (U+1D47E)𝑿 (U+1D47F)𝒀 (U+1D480)𝒁 (U+1D481)𝒜 (U+1D49C)𝒞 (U+1D49E)𝒟 (U+1D49F)𝒢 (U+1D4A2)𝒥 (U+1D4A5)𝒦 (U+1D4A6)𝒩 (U+1D4A9)𝒪 (U+1D4AA)𝒫 (U+1D4AB)𝒬 (U+1D4AC)𝒮 (U+1D4AE)𝒯 (U+1D4AF)𝒰 (U+1D4B0)𝒱 (U+1D4B1)𝒲 (U+1D4B2)𝒳 (U+1D4B3)𝒴 (U+1D4B4)𝒵 (U+1D4B5)𝓐 (U+1D4D0)𝓑 (U+1D4D1)𝓒 (U+1D4D2)𝓓 (U+1D4D3)𝓔 (U+1D4D4)𝓕 (U+1D4D5)𝓖 (U+1D4D6)𝓗 (U+1D4D7)𝓘 (U+1D4D8)𝓙 (U+1D4D9)𝓚 (U+1D4DA)𝓛 (U+1D4DB)𝓜 (U+1D4DC)𝓝 (U+1D4DD)𝓞 (U+1D4DE)𝓟 (U+1D4DF)𝓠 (U+1D4E0)𝓡 (U+1D4E1)𝓢 (U+1D4E2)𝓣 (U+1D4E3)𝓤 (U+1D4E4)𝓥 (U+1D4E5)𝓦 (U+1D4E6)𝓧 (U+1D4E7)𝓨 (U+1D4E8)𝓩 (U+1D4E9)𝔄 (U+1D504)𝔅 (U+1D505)𝔇 (U+1D507)𝔈 (U+1D508)𝔉 (U+1D509)𝔊 (U+1D50A)𝔍 (U+1D50D)𝔎 (U+1D50E)𝔏 (U+1D50F)𝔐 (U+1D510)𝔑 (U+1D511)𝔒 (U+1D512)𝔓 (U+1D513)𝔔 (U+1D514)𝔖 (U+1D516)𝔗 (U+1D517)𝔘 (U+1D518)𝔙 (U+1D519)𝔚 (U+1D51A)𝔛 (U+1D51B)𝔜 (U+1D51C)𝔸 (U+1D538)𝔹 (U+1D539)𝔻 (U+1D53B)𝔼 (U+1D53C)𝔽 (U+1D53D)𝔾 (U+1D53E)𝕀 (U+1D540)𝕁 (U+1D541)𝕂 (U+1D542)𝕃 (U+1D543)𝕄 (U+1D544)𝕆 (U+1D546)𝕊 (U+1D54A)𝕋 (U+1D54B)𝕌 
(U+1D54C)𝕍 (U+1D54D)𝕎 (U+1D54E)𝕏 (U+1D54F)𝕐 (U+1D550)𝕬 (U+1D56C)𝕭 (U+1D56D)𝕮 (U+1D56E)𝕯 (U+1D56F)𝕰 (U+1D570)𝕱 (U+1D571)𝕲 (U+1D572)𝕳 (U+1D573)𝕴 (U+1D574)𝕵 (U+1D575)𝕶 (U+1D576)𝕷 (U+1D577)𝕸 (U+1D578)𝕹 (U+1D579)𝕺 (U+1D57A)𝕻 (U+1D57B)𝕼 (U+1D57C)𝕽 (U+1D57D)𝕾 (U+1D57E)𝕿 (U+1D57F)𝖀 (U+1D580)𝖁 (U+1D581)𝖂 (U+1D582)𝖃 (U+1D583)𝖄 (U+1D584)𝖅 (U+1D585)𝖠 (U+1D5A0)𝖡 (U+1D5A1)𝖢 (U+1D5A2)𝖣 (U+1D5A3)𝖤 (U+1D5A4)𝖥 (U+1D5A5)𝖦 (U+1D5A6)𝖧 (U+1D5A7)𝖨 (U+1D5A8)𝖩 (U+1D5A9)𝖪 (U+1D5AA)𝖫 (U+1D5AB)𝖬 (U+1D5AC)𝖭 (U+1D5AD)𝖮 (U+1D5AE)𝖯 (U+1D5AF)𝖰 (U+1D5B0)𝖱 (U+1D5B1)𝖲 (U+1D5B2)𝖳 (U+1D5B3)𝖴 (U+1D5B4)𝖵 (U+1D5B5)𝖶 (U+1D5B6)𝖷 (U+1D5B7)𝖸 (U+1D5B8)𝖹 (U+1D5B9)𝗔 (U+1D5D4)𝗕 (U+1D5D5)𝗖 (U+1D5D6)𝗗 (U+1D5D7)𝗘 (U+1D5D8)𝗙 (U+1D5D9)𝗚 (U+1D5DA)𝗛 (U+1D5DB)𝗜 (U+1D5DC)𝗝 (U+1D5DD)𝗞 (U+1D5DE)𝗟 (U+1D5DF)𝗠 (U+1D5E0)𝗡 (U+1D5E1)𝗢 (U+1D5E2)𝗣 (U+1D5E3)𝗤 (U+1D5E4)𝗥 (U+1D5E5)𝗦 (U+1D5E6)𝗧 (U+1D5E7)𝗨 (U+1D5E8)𝗩 (U+1D5E9)𝗪 (U+1D5EA)𝗫 (U+1D5EB)𝗬 (U+1D5EC)𝗭 (U+1D5ED)𝘈 (U+1D608)𝘉 (U+1D609)𝘊 (U+1D60A)𝘋 (U+1D60B)𝘌 (U+1D60C)𝘍 (U+1D60D)𝘎 (U+1D60E)𝘏 (U+1D60F)𝘐 (U+1D610)𝘑 (U+1D611)𝘒 (U+1D612)𝘓 (U+1D613)𝘔 (U+1D614)𝘕 (U+1D615)𝘖 (U+1D616)𝘗 (U+1D617)𝘘 (U+1D618)𝘙 (U+1D619)𝘚 (U+1D61A)𝘛 (U+1D61B)𝘜 (U+1D61C)𝘝 (U+1D61D)𝘞 (U+1D61E)𝘟 (U+1D61F)𝘠 (U+1D620)𝘡 (U+1D621)𝘼 (U+1D63C)𝘽 (U+1D63D)𝘾 (U+1D63E)𝘿 (U+1D63F)𝙀 (U+1D640)𝙁 (U+1D641)𝙂 (U+1D642)𝙃 (U+1D643)𝙄 (U+1D644)𝙅 (U+1D645)𝙆 (U+1D646)𝙇 (U+1D647)𝙈 (U+1D648)𝙉 (U+1D649)𝙊 (U+1D64A)𝙋 (U+1D64B)𝙌 (U+1D64C)𝙍 (U+1D64D)𝙎 (U+1D64E)𝙏 (U+1D64F)𝙐 (U+1D650)𝙑 (U+1D651)𝙒 (U+1D652)𝙓 (U+1D653)𝙔 (U+1D654)𝙕 (U+1D655)𝙰 (U+1D670)𝙱 (U+1D671)𝙲 (U+1D672)𝙳 (U+1D673)𝙴 (U+1D674)𝙵 (U+1D675)𝙶 (U+1D676)𝙷 (U+1D677)𝙸 (U+1D678)𝙹 (U+1D679)𝙺 (U+1D67A)𝙻 (U+1D67B)𝙼 (U+1D67C)𝙽 (U+1D67D)𝙾 (U+1D67E)𝙿 (U+1D67F)𝚀 (U+1D680)𝚁 (U+1D681)𝚂 (U+1D682)𝚃 (U+1D683)𝚄 (U+1D684)𝚅 (U+1D685)𝚆 (U+1D686)𝚇 (U+1D687)𝚈 (U+1D688)𝚉 (U+1D689)𝚨 (U+1D6A8)𝚩 (U+1D6A9)𝚪 (U+1D6AA)𝚫 (U+1D6AB)𝚬 (U+1D6AC)𝚭 (U+1D6AD)𝚮 (U+1D6AE)𝚯 (U+1D6AF)𝚰 (U+1D6B0)𝚱 (U+1D6B1)𝚲 (U+1D6B2)𝚳 (U+1D6B3)𝚴 (U+1D6B4)𝚵 (U+1D6B5)𝚶 (U+1D6B6)𝚷 (U+1D6B7)𝚸 (U+1D6B8)𝚹 (U+1D6B9)𝚺 (U+1D6BA)𝚻 (U+1D6BB)𝚼 
(U+1D6BC)𝚽 (U+1D6BD)𝚾 (U+1D6BE)𝚿 (U+1D6BF)𝛀 (U+1D6C0)𝛢 (U+1D6E2)𝛣 (U+1D6E3)𝛤 (U+1D6E4)𝛥 (U+1D6E5)𝛦 (U+1D6E6)𝛧 (U+1D6E7)𝛨 (U+1D6E8)𝛩 (U+1D6E9)𝛪 (U+1D6EA)𝛫 (U+1D6EB)𝛬 (U+1D6EC)𝛭 (U+1D6ED)𝛮 (U+1D6EE)𝛯 (U+1D6EF)𝛰 (U+1D6F0)𝛱 (U+1D6F1)𝛲 (U+1D6F2)𝛳 (U+1D6F3)𝛴 (U+1D6F4)𝛵 (U+1D6F5)𝛶 (U+1D6F6)𝛷 (U+1D6F7)𝛸 (U+1D6F8)𝛹 (U+1D6F9)𝛺 (U+1D6FA)𝜜 (U+1D71C)𝜝 (U+1D71D)𝜞 (U+1D71E)𝜟 (U+1D71F)𝜠 (U+1D720)𝜡 (U+1D721)𝜢 (U+1D722)𝜣 (U+1D723)𝜤 (U+1D724)𝜥 (U+1D725)𝜦 (U+1D726)𝜧 (U+1D727)𝜨 (U+1D728)𝜩 (U+1D729)𝜪 (U+1D72A)𝜫 (U+1D72B)𝜬 (U+1D72C)𝜭 (U+1D72D)𝜮 (U+1D72E)𝜯 (U+1D72F)𝜰 (U+1D730)𝜱 (U+1D731)𝜲 (U+1D732)𝜳 (U+1D733)𝜴 (U+1D734)𝝖 (U+1D756)𝝗 (U+1D757)𝝘 (U+1D758)𝝙 (U+1D759)𝝚 (U+1D75A)𝝛 (U+1D75B)𝝜 (U+1D75C)𝝝 (U+1D75D)𝝞 (U+1D75E)𝝟 (U+1D75F)𝝠 (U+1D760)𝝡 (U+1D761)𝝢 (U+1D762)𝝣 (U+1D763)𝝤 (U+1D764)𝝥 (U+1D765)𝝦 (U+1D766)𝝧 (U+1D767)𝝨 (U+1D768)𝝩 (U+1D769)𝝪 (U+1D76A)𝝫 (U+1D76B)𝝬 (U+1D76C)𝝭 (U+1D76D)𝝮 (U+1D76E)𝞐 (U+1D790)𝞑 (U+1D791)𝞒 (U+1D792)𝞓 (U+1D793)𝞔 (U+1D794)𝞕 (U+1D795)𝞖 (U+1D796)𝞗 (U+1D797)𝞘 (U+1D798)𝞙 (U+1D799)𝞚 (U+1D79A)𝞛 (U+1D79B)𝞜 (U+1D79C)𝞝 (U+1D79D)𝞞 (U+1D79E)𝞟 (U+1D79F)𝞠 (U+1D7A0)𝞡 (U+1D7A1)𝞢 (U+1D7A2)𝞣 (U+1D7A3)𝞤 (U+1D7A4)𝞥 (U+1D7A5)𝞦 (U+1D7A6)𝞧 (U+1D7A7)𝞨 (U+1D7A8)𝟊 (U+1D7CA) - =
Character set (782)
ĸ (U+0138)ƍ (U+018D)ƛ (U+019B)ƪ (U+01AA)ƫ (U+01AB)ƺ (U+01BA)ƾ (U+01BE)ȡ (U+0221)ȴ (U+0234)ȵ (U+0235)ȶ (U+0236)ȷ (U+0237)ȸ (U+0238)ȹ (U+0239)ɕ (U+0255)ɘ (U+0258)ɚ (U+025A)ɝ (U+025D)ɞ (U+025E)ɟ (U+025F)ɢ (U+0262)ɤ (U+0264)ɧ (U+0267)ɭ (U+026D)ɮ (U+026E)ɰ (U+0270)ɳ (U+0273)ɴ (U+0274)ɶ (U+0276)ɷ (U+0277)ɸ (U+0278)ɹ (U+0279)ɺ (U+027A)ɻ (U+027B)ɼ (U+027C)ɾ (U+027E)ɿ (U+027F)ʁ (U+0281)ʄ (U+0284)ʅ (U+0285)ʆ (U+0286)ʍ (U+028D)ʎ (U+028E)ʏ (U+028F)ʐ (U+0290)ʑ (U+0291)ʓ (U+0293)ʕ (U+0295)ʖ (U+0296)ʗ (U+0297)ʘ (U+0298)ʙ (U+0299)ʚ (U+029A)ʛ (U+029B)ʜ (U+029C)ʟ (U+029F)ʠ (U+02A0)ʡ (U+02A1)ʢ (U+02A2)ʣ (U+02A3)ʤ (U+02A4)ʥ (U+02A5)ʦ (U+02A6)ʧ (U+02A7)ʨ (U+02A8)ʩ (U+02A9)ʪ (U+02AA)ʫ (U+02AB)ʬ (U+02AC)ʭ (U+02AD)ʮ (U+02AE)ʯ (U+02AF)ϼ (U+03FC)ՠ (U+0560)ֈ (U+0588)ᴀ (U+1D00)ᴁ (U+1D01)ᴂ (U+1D02)ᴃ (U+1D03)ᴄ (U+1D04)ᴅ (U+1D05)ᴆ (U+1D06)ᴇ (U+1D07)ᴈ (U+1D08)ᴉ (U+1D09)ᴊ (U+1D0A)ᴋ (U+1D0B)ᴌ (U+1D0C)ᴍ (U+1D0D)ᴎ (U+1D0E)ᴏ (U+1D0F)ᴐ (U+1D10)ᴑ (U+1D11)ᴒ (U+1D12)ᴓ (U+1D13)ᴔ (U+1D14)ᴕ (U+1D15)ᴖ (U+1D16)ᴗ (U+1D17)ᴘ (U+1D18)ᴙ (U+1D19)ᴚ (U+1D1A)ᴛ (U+1D1B)ᴜ (U+1D1C)ᴝ (U+1D1D)ᴞ (U+1D1E)ᴟ (U+1D1F)ᴠ (U+1D20)ᴡ (U+1D21)ᴢ (U+1D22)ᴣ (U+1D23)ᴤ (U+1D24)ᴥ (U+1D25)ᴦ (U+1D26)ᴧ (U+1D27)ᴨ (U+1D28)ᴩ (U+1D29)ᴪ (U+1D2A)ᴫ (U+1D2B)ᵫ (U+1D6B)ᵬ (U+1D6C)ᵭ (U+1D6D)ᵮ (U+1D6E)ᵯ (U+1D6F)ᵰ (U+1D70)ᵱ (U+1D71)ᵲ (U+1D72)ᵳ (U+1D73)ᵴ (U+1D74)ᵵ (U+1D75)ᵶ (U+1D76)ᵷ (U+1D77)ᵺ (U+1D7A)ᵻ (U+1D7B)ᵼ (U+1D7C)ᵾ (U+1D7E)ᵿ (U+1D7F)ᶀ (U+1D80)ᶁ (U+1D81)ᶂ (U+1D82)ᶃ (U+1D83)ᶄ (U+1D84)ᶅ (U+1D85)ᶆ (U+1D86)ᶇ (U+1D87)ᶈ (U+1D88)ᶉ (U+1D89)ᶊ (U+1D8A)ᶋ (U+1D8B)ᶌ (U+1D8C)ᶍ (U+1D8D)ᶏ (U+1D8F)ᶐ (U+1D90)ᶑ (U+1D91)ᶒ (U+1D92)ᶓ (U+1D93)ᶔ (U+1D94)ᶕ (U+1D95)ᶖ (U+1D96)ᶗ (U+1D97)ᶘ (U+1D98)ᶙ (U+1D99)ᶚ (U+1D9A)ẜ (U+1E9C)ẝ (U+1E9D)ẟ (U+1E9F)ℊ (U+210A)ℎ (U+210E)ℏ (U+210F)ℓ (U+2113)ℯ (U+212F)ℴ (U+2134)ℹ (U+2139)ℼ (U+213C)ℽ (U+213D)ⅆ (U+2146)ⅇ (U+2147)ⅈ (U+2148)ⅉ (U+2149)ⱱ (U+2C71)ⱴ (U+2C74)ⱷ (U+2C77)ⱸ (U+2C78)ⱹ (U+2C79)ⱺ (U+2C7A)ⱻ (U+2C7B)ⳤ (U+2CE4)ꜰ (U+A730)ꜱ (U+A731)ꝱ (U+A771)ꝲ (U+A772)ꝳ (U+A773)ꝴ (U+A774)ꝵ (U+A775)ꝶ (U+A776)ꝷ (U+A777)ꝸ (U+A778)ꞎ (U+A78E)ꞕ (U+A795)ꞯ 
(U+A7AF)ꟓ (U+A7D3)ꟕ (U+A7D5)ꟺ (U+A7FA)ꬰ (U+AB30)ꬱ (U+AB31)ꬲ (U+AB32)ꬳ (U+AB33)ꬴ (U+AB34)ꬵ (U+AB35)ꬶ (U+AB36)ꬷ (U+AB37)ꬸ (U+AB38)ꬹ (U+AB39)ꬺ (U+AB3A)ꬻ (U+AB3B)ꬼ (U+AB3C)ꬽ (U+AB3D)ꬾ (U+AB3E)ꬿ (U+AB3F)ꭀ (U+AB40)ꭁ (U+AB41)ꭂ (U+AB42)ꭃ (U+AB43)ꭄ (U+AB44)ꭅ (U+AB45)ꭆ (U+AB46)ꭇ (U+AB47)ꭈ (U+AB48)ꭉ (U+AB49)ꭊ (U+AB4A)ꭋ (U+AB4B)ꭌ (U+AB4C)ꭍ (U+AB4D)ꭎ (U+AB4E)ꭏ (U+AB4F)ꭐ (U+AB50)ꭑ (U+AB51)ꭒ (U+AB52)ꭔ (U+AB54)ꭕ (U+AB55)ꭖ (U+AB56)ꭗ (U+AB57)ꭘ (U+AB58)ꭙ (U+AB59)ꭚ (U+AB5A)ꭠ (U+AB60)ꭡ (U+AB61)ꭢ (U+AB62)ꭣ (U+AB63)ꭤ (U+AB64)ꭥ (U+AB65)ꭦ (U+AB66)ꭧ (U+AB67)ꭨ (U+AB68)𝐚 (U+1D41A)𝐛 (U+1D41B)𝐜 (U+1D41C)𝐝 (U+1D41D)𝐞 (U+1D41E)𝐟 (U+1D41F)𝐠 (U+1D420)𝐡 (U+1D421)𝐢 (U+1D422)𝐣 (U+1D423)𝐤 (U+1D424)𝐥 (U+1D425)𝐦 (U+1D426)𝐧 (U+1D427)𝐨 (U+1D428)𝐩 (U+1D429)𝐪 (U+1D42A)𝐫 (U+1D42B)𝐬 (U+1D42C)𝐭 (U+1D42D)𝐮 (U+1D42E)𝐯 (U+1D42F)𝐰 (U+1D430)𝐱 (U+1D431)𝐲 (U+1D432)𝐳 (U+1D433)𝑎 (U+1D44E)𝑏 (U+1D44F)𝑐 (U+1D450)𝑑 (U+1D451)𝑒 (U+1D452)𝑓 (U+1D453)𝑔 (U+1D454)𝑖 (U+1D456)𝑗 (U+1D457)𝑘 (U+1D458)𝑙 (U+1D459)𝑚 (U+1D45A)𝑛 (U+1D45B)𝑜 (U+1D45C)𝑝 (U+1D45D)𝑞 (U+1D45E)𝑟 (U+1D45F)𝑠 (U+1D460)𝑡 (U+1D461)𝑢 (U+1D462)𝑣 (U+1D463)𝑤 (U+1D464)𝑥 (U+1D465)𝑦 (U+1D466)𝑧 (U+1D467)𝒂 (U+1D482)𝒃 (U+1D483)𝒄 (U+1D484)𝒅 (U+1D485)𝒆 (U+1D486)𝒇 (U+1D487)𝒈 (U+1D488)𝒉 (U+1D489)𝒊 (U+1D48A)𝒋 (U+1D48B)𝒌 (U+1D48C)𝒍 (U+1D48D)𝒎 (U+1D48E)𝒏 (U+1D48F)𝒐 (U+1D490)𝒑 (U+1D491)𝒒 (U+1D492)𝒓 (U+1D493)𝒔 (U+1D494)𝒕 (U+1D495)𝒖 (U+1D496)𝒗 (U+1D497)𝒘 (U+1D498)𝒙 (U+1D499)𝒚 (U+1D49A)𝒛 (U+1D49B)𝒶 (U+1D4B6)𝒷 (U+1D4B7)𝒸 (U+1D4B8)𝒹 (U+1D4B9)𝒻 (U+1D4BB)𝒽 (U+1D4BD)𝒾 (U+1D4BE)𝒿 (U+1D4BF)𝓀 (U+1D4C0)𝓁 (U+1D4C1)𝓂 (U+1D4C2)𝓃 (U+1D4C3)𝓅 (U+1D4C5)𝓆 (U+1D4C6)𝓇 (U+1D4C7)𝓈 (U+1D4C8)𝓉 (U+1D4C9)𝓊 (U+1D4CA)𝓋 (U+1D4CB)𝓌 (U+1D4CC)𝓍 (U+1D4CD)𝓎 (U+1D4CE)𝓏 (U+1D4CF)𝓪 (U+1D4EA)𝓫 (U+1D4EB)𝓬 (U+1D4EC)𝓭 (U+1D4ED)𝓮 (U+1D4EE)𝓯 (U+1D4EF)𝓰 (U+1D4F0)𝓱 (U+1D4F1)𝓲 (U+1D4F2)𝓳 (U+1D4F3)𝓴 (U+1D4F4)𝓵 (U+1D4F5)𝓶 (U+1D4F6)𝓷 (U+1D4F7)𝓸 (U+1D4F8)𝓹 (U+1D4F9)𝓺 (U+1D4FA)𝓻 (U+1D4FB)𝓼 (U+1D4FC)𝓽 (U+1D4FD)𝓾 (U+1D4FE)𝓿 (U+1D4FF)𝔀 (U+1D500)𝔁 (U+1D501)𝔂 (U+1D502)𝔃 (U+1D503)𝔞 (U+1D51E)𝔟 (U+1D51F)𝔠 (U+1D520)𝔡 (U+1D521)𝔢 (U+1D522)𝔣 
(U+1D523)𝔤 (U+1D524)𝔥 (U+1D525)𝔦 (U+1D526)𝔧 (U+1D527)𝔨 (U+1D528)𝔩 (U+1D529)𝔪 (U+1D52A)𝔫 (U+1D52B)𝔬 (U+1D52C)𝔭 (U+1D52D)𝔮 (U+1D52E)𝔯 (U+1D52F)𝔰 (U+1D530)𝔱 (U+1D531)𝔲 (U+1D532)𝔳 (U+1D533)𝔴 (U+1D534)𝔵 (U+1D535)𝔶 (U+1D536)𝔷 (U+1D537)𝕒 (U+1D552)𝕓 (U+1D553)𝕔 (U+1D554)𝕕 (U+1D555)𝕖 (U+1D556)𝕗 (U+1D557)𝕘 (U+1D558)𝕙 (U+1D559)𝕚 (U+1D55A)𝕛 (U+1D55B)𝕜 (U+1D55C)𝕝 (U+1D55D)𝕞 (U+1D55E)𝕟 (U+1D55F)𝕠 (U+1D560)𝕡 (U+1D561)𝕢 (U+1D562)𝕣 (U+1D563)𝕤 (U+1D564)𝕥 (U+1D565)𝕦 (U+1D566)𝕧 (U+1D567)𝕨 (U+1D568)𝕩 (U+1D569)𝕪 (U+1D56A)𝕫 (U+1D56B)𝖆 (U+1D586)𝖇 (U+1D587)𝖈 (U+1D588)𝖉 (U+1D589)𝖊 (U+1D58A)𝖋 (U+1D58B)𝖌 (U+1D58C)𝖍 (U+1D58D)𝖎 (U+1D58E)𝖏 (U+1D58F)𝖐 (U+1D590)𝖑 (U+1D591)𝖒 (U+1D592)𝖓 (U+1D593)𝖔 (U+1D594)𝖕 (U+1D595)𝖖 (U+1D596)𝖗 (U+1D597)𝖘 (U+1D598)𝖙 (U+1D599)𝖚 (U+1D59A)𝖛 (U+1D59B)𝖜 (U+1D59C)𝖝 (U+1D59D)𝖞 (U+1D59E)𝖟 (U+1D59F)𝖺 (U+1D5BA)𝖻 (U+1D5BB)𝖼 (U+1D5BC)𝖽 (U+1D5BD)𝖾 (U+1D5BE)𝖿 (U+1D5BF)𝗀 (U+1D5C0)𝗁 (U+1D5C1)𝗂 (U+1D5C2)𝗃 (U+1D5C3)𝗄 (U+1D5C4)𝗅 (U+1D5C5)𝗆 (U+1D5C6)𝗇 (U+1D5C7)𝗈 (U+1D5C8)𝗉 (U+1D5C9)𝗊 (U+1D5CA)𝗋 (U+1D5CB)𝗌 (U+1D5CC)𝗍 (U+1D5CD)𝗎 (U+1D5CE)𝗏 (U+1D5CF)𝗐 (U+1D5D0)𝗑 (U+1D5D1)𝗒 (U+1D5D2)𝗓 (U+1D5D3)𝗮 (U+1D5EE)𝗯 (U+1D5EF)𝗰 (U+1D5F0)𝗱 (U+1D5F1)𝗲 (U+1D5F2)𝗳 (U+1D5F3)𝗴 (U+1D5F4)𝗵 (U+1D5F5)𝗶 (U+1D5F6)𝗷 (U+1D5F7)𝗸 (U+1D5F8)𝗹 (U+1D5F9)𝗺 (U+1D5FA)𝗻 (U+1D5FB)𝗼 (U+1D5FC)𝗽 (U+1D5FD)𝗾 (U+1D5FE)𝗿 (U+1D5FF)𝘀 (U+1D600)𝘁 (U+1D601)𝘂 (U+1D602)𝘃 (U+1D603)𝘄 (U+1D604)𝘅 (U+1D605)𝘆 (U+1D606)𝘇 (U+1D607)𝘢 (U+1D622)𝘣 (U+1D623)𝘤 (U+1D624)𝘥 (U+1D625)𝘦 (U+1D626)𝘧 (U+1D627)𝘨 (U+1D628)𝘩 (U+1D629)𝘪 (U+1D62A)𝘫 (U+1D62B)𝘬 (U+1D62C)𝘭 (U+1D62D)𝘮 (U+1D62E)𝘯 (U+1D62F)𝘰 (U+1D630)𝘱 (U+1D631)𝘲 (U+1D632)𝘳 (U+1D633)𝘴 (U+1D634)𝘵 (U+1D635)𝘶 (U+1D636)𝘷 (U+1D637)𝘸 (U+1D638)𝘹 (U+1D639)𝘺 (U+1D63A)𝘻 (U+1D63B)𝙖 (U+1D656)𝙗 (U+1D657)𝙘 (U+1D658)𝙙 (U+1D659)𝙚 (U+1D65A)𝙛 (U+1D65B)𝙜 (U+1D65C)𝙝 (U+1D65D)𝙞 (U+1D65E)𝙟 (U+1D65F)𝙠 (U+1D660)𝙡 (U+1D661)𝙢 (U+1D662)𝙣 (U+1D663)𝙤 (U+1D664)𝙥 (U+1D665)𝙦 (U+1D666)𝙧 (U+1D667)𝙨 (U+1D668)𝙩 (U+1D669)𝙪 (U+1D66A)𝙫 (U+1D66B)𝙬 (U+1D66C)𝙭 (U+1D66D)𝙮 (U+1D66E)𝙯 (U+1D66F)𝚊 (U+1D68A)𝚋 (U+1D68B)𝚌 (U+1D68C)𝚍 (U+1D68D)𝚎 
(U+1D68E)𝚏 (U+1D68F)𝚐 (U+1D690)𝚑 (U+1D691)𝚒 (U+1D692)𝚓 (U+1D693)𝚔 (U+1D694)𝚕 (U+1D695)𝚖 (U+1D696)𝚗 (U+1D697)𝚘 (U+1D698)𝚙 (U+1D699)𝚚 (U+1D69A)𝚛 (U+1D69B)𝚜 (U+1D69C)𝚝 (U+1D69D)𝚞 (U+1D69E)𝚟 (U+1D69F)𝚠 (U+1D6A0)𝚡 (U+1D6A1)𝚢 (U+1D6A2)𝚣 (U+1D6A3)𝚤 (U+1D6A4)𝚥 (U+1D6A5)𝛂 (U+1D6C2)𝛃 (U+1D6C3)𝛄 (U+1D6C4)𝛅 (U+1D6C5)𝛆 (U+1D6C6)𝛇 (U+1D6C7)𝛈 (U+1D6C8)𝛉 (U+1D6C9)𝛊 (U+1D6CA)𝛋 (U+1D6CB)𝛌 (U+1D6CC)𝛍 (U+1D6CD)𝛎 (U+1D6CE)𝛏 (U+1D6CF)𝛐 (U+1D6D0)𝛑 (U+1D6D1)𝛒 (U+1D6D2)𝛓 (U+1D6D3)𝛔 (U+1D6D4)𝛕 (U+1D6D5)𝛖 (U+1D6D6)𝛗 (U+1D6D7)𝛘 (U+1D6D8)𝛙 (U+1D6D9)𝛚 (U+1D6DA)𝛜 (U+1D6DC)𝛝 (U+1D6DD)𝛞 (U+1D6DE)𝛟 (U+1D6DF)𝛠 (U+1D6E0)𝛡 (U+1D6E1)𝛼 (U+1D6FC)𝛽 (U+1D6FD)𝛾 (U+1D6FE)𝛿 (U+1D6FF)𝜀 (U+1D700)𝜁 (U+1D701)𝜂 (U+1D702)𝜃 (U+1D703)𝜄 (U+1D704)𝜅 (U+1D705)𝜆 (U+1D706)𝜇 (U+1D707)𝜈 (U+1D708)𝜉 (U+1D709)𝜊 (U+1D70A)𝜋 (U+1D70B)𝜌 (U+1D70C)𝜍 (U+1D70D)𝜎 (U+1D70E)𝜏 (U+1D70F)𝜐 (U+1D710)𝜑 (U+1D711)𝜒 (U+1D712)𝜓 (U+1D713)𝜔 (U+1D714)𝜖 (U+1D716)𝜗 (U+1D717)𝜘 (U+1D718)𝜙 (U+1D719)𝜚 (U+1D71A)𝜛 (U+1D71B)𝜶 (U+1D736)𝜷 (U+1D737)𝜸 (U+1D738)𝜹 (U+1D739)𝜺 (U+1D73A)𝜻 (U+1D73B)𝜼 (U+1D73C)𝜽 (U+1D73D)𝜾 (U+1D73E)𝜿 (U+1D73F)𝝀 (U+1D740)𝝁 (U+1D741)𝝂 (U+1D742)𝝃 (U+1D743)𝝄 (U+1D744)𝝅 (U+1D745)𝝆 (U+1D746)𝝇 (U+1D747)𝝈 (U+1D748)𝝉 (U+1D749)𝝊 (U+1D74A)𝝋 (U+1D74B)𝝌 (U+1D74C)𝝍 (U+1D74D)𝝎 (U+1D74E)𝝐 (U+1D750)𝝑 (U+1D751)𝝒 (U+1D752)𝝓 (U+1D753)𝝔 (U+1D754)𝝕 (U+1D755)𝝰 (U+1D770)𝝱 (U+1D771)𝝲 (U+1D772)𝝳 (U+1D773)𝝴 (U+1D774)𝝵 (U+1D775)𝝶 (U+1D776)𝝷 (U+1D777)𝝸 (U+1D778)𝝹 (U+1D779)𝝺 (U+1D77A)𝝻 (U+1D77B)𝝼 (U+1D77C)𝝽 (U+1D77D)𝝾 (U+1D77E)𝝿 (U+1D77F)𝞀 (U+1D780)𝞁 (U+1D781)𝞂 (U+1D782)𝞃 (U+1D783)𝞄 (U+1D784)𝞅 (U+1D785)𝞆 (U+1D786)𝞇 (U+1D787)𝞈 (U+1D788)𝞊 (U+1D78A)𝞋 (U+1D78B)𝞌 (U+1D78C)𝞍 (U+1D78D)𝞎 (U+1D78E)𝞏 (U+1D78F)𝞪 (U+1D7AA)𝞫 (U+1D7AB)𝞬 (U+1D7AC)𝞭 (U+1D7AD)𝞮 (U+1D7AE)𝞯 (U+1D7AF)𝞰 (U+1D7B0)𝞱 (U+1D7B1)𝞲 (U+1D7B2)𝞳 (U+1D7B3)𝞴 (U+1D7B4)𝞵 (U+1D7B5)𝞶 (U+1D7B6)𝞷 (U+1D7B7)𝞸 (U+1D7B8)𝞹 (U+1D7B9)𝞺 (U+1D7BA)𝞻 (U+1D7BB)𝞼 (U+1D7BC)𝞽 (U+1D7BD)𝞾 (U+1D7BE)𝞿 (U+1D7BF)𝟀 (U+1D7C0)𝟁 (U+1D7C1)𝟂 (U+1D7C2)𝟄 (U+1D7C4)𝟅 (U+1D7C5)𝟆 (U+1D7C6)𝟇 (U+1D7C7)𝟈 (U+1D7C8)𝟉 (U+1D7C9)𝟋 (U+1D7CB)𝼀 (U+1DF00)𝼁 
(U+1DF01)𝼂 (U+1DF02)𝼃 (U+1DF03)𝼄 (U+1DF04)𝼅 (U+1DF05)𝼆 (U+1DF06)𝼇 (U+1DF07)𝼈 (U+1DF08)𝼉 (U+1DF09)𝼋 (U+1DF0B)𝼌 (U+1DF0C)𝼍 (U+1DF0D)𝼎 (U+1DF0E)𝼏 (U+1DF0F)𝼐 (U+1DF10)𝼑 (U+1DF11)𝼒 (U+1DF12)𝼓 (U+1DF13)𝼔 (U+1DF14)𝼕 (U+1DF15)𝼖 (U+1DF16)𝼗 (U+1DF17)𝼘 (U+1DF18)𝼙 (U+1DF19)𝼚 (U+1DF1A)𝼛 (U+1DF1B)𝼜 (U+1DF1C)𝼝 (U+1DF1D)𝼞 (U+1DF1E)𝼥 (U+1DF25)𝼦 (U+1DF26)𝼧 (U+1DF27)𝼨 (U+1DF28)𝼩 (U+1DF29)𝼪 (U+1DF2A)
This means there are cased letters that are case-mapping invariant. Always-upper characters include:
- Greek and Coptic
- Variants of GREEK UPSILON: U+03D2 – U+03D4
- Letterlike Symbols
- EULER CONSTANT (U+2107)
- DOUBLE-STRUCK (ITALIC) CAPITAL {C,H,N,P,Q,R,Z,GAMMA,PI,D}: U+2102, U+210D, U+2115, U+2119, U+211A, U+211D, U+2124, U+213E, U+213F, U+2145
- SCRIPT CAPITAL {H,I,L,R,B,E,F,M}: U+210B, U+2110, U+2112, U+211B, U+212C, U+2130, U+2131, U+2133
- BLACK-LETTER CAPITAL {H,I,R,Z,C}: U+210C, U+2111, U+211C, U+2128, U+212D
- Mathematical Alphanumeric Symbols
- MATHEMATICAL {BOLD,ITALIC,BOLD ITALIC,SCRIPT,BOLD SCRIPT,FRAKTUR,DOUBLE-STRUCK,BOLD FRAKTUR,SANS-SERIF,SANS-SERIF BOLD,SANS-SERIF ITALIC,SANS-SERIF BOLD ITALIC,MONOSPACE} CAPITAL Latin alphabet: U+1D400 – U+1D419, U+1D434 – U+1D44D, U+1D468 – U+1D481, U+1D49C – U+1D4B5, U+1D4D0 – U+1D4E9, U+1D504 – U+1D51C, U+1D538 – U+1D550, U+1D56C – U+1D585, U+1D5A0 – U+1D5B9, U+1D5D4 – U+1D5ED, U+1D608 – U+1D621, U+1D63C – U+1D655, U+1D670 – U+1D689
- MATHEMATICAL {BOLD,ITALIC,BOLD ITALIC,SANS-SERIF BOLD,SANS-SERIF ITALIC} CAPITAL Greek alphabet: U+1D6A8 – U+1D6C0, U+1D6E2 – U+1D6FA, U+1D71C – U+1D734, U+1D756 – U+1D76E, U+1D790 – U+1D7A8
- MATHEMATICAL BOLD CAPITAL DIGAMMA (U+1D7CA)
Always-lower characters include:
- Latin Extended-A
- LATIN SMALL LETTER KRA (U+0138)
- Latin Extended-B
- LATIN SMALL LETTER TURNED DELTA (U+018D)
- LATIN SMALL LETTER LAMBDA WITH STROKE (U+019B)
- LATIN LETTER REVERSED ESH LOOP (U+01AA)
- LATIN SMALL LETTER T WITH PALATAL HOOK (U+01AB)
- LATIN SMALL LETTER EZH WITH TAIL (U+01BA)
- LATIN LETTER INVERTED GLOTTAL STOP WITH STROKE (U+01BE)
- LATIN SMALL LETTER {D,L,N,T} WITH CURL: U+0221, U+0234 – U+0236
- LATIN SMALL LETTER DOTLESS J (U+0237)
- LATIN SMALL LETTER {DB,QP} DIGRAPH: U+0238, U+0239
- IPA Extensions
- U+0250 – U+02AF, except 28 of them
- Greek and Coptic
- GREEK RHO WITH STROKE SYMBOL (U+03FC)
- Armenian
- ARMENIAN SMALL LETTER TURNED AYB (U+0560)
- ARMENIAN SMALL LETTER YI WITH STROKE (U+0588)
- Phonetic Extensions
- Latin letters, Greek letters, and Cyrillic letter: U+1D00 – U+1D2B
- Latin letter for American lexicography, Latin letters with middle tilde: U+1D6B – U+1D76
- LATIN SMALL LETTER TURNED G (U+1D77)
- Other phonetic symbols, except LATIN SMALL LETTER INSULAR G: U+1D7A – U+1D7F
- Phonetic Extensions Supplement
- Latin letters with palatal hook, except LATIN SMALL LETTER Z WITH PALATAL HOOK: U+1D80 – U+1D8D
- Latin letters with retroflex hook: U+1D8F – U+1D9A
- Latin Extended Additional
- Medievalist additions: U+1E9C, U+1E9D, U+1E9F
- ...
To make our discussions more meaningful, we will limit our future discussions to and instead of and , so that all sets in question are subsets of . Upper(Lower) case letters that are case-mapping variant must be lower(upper)case variant, because we already showed that they are upper(lower)case invariant.
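As a concrete illustration of cased letters that are case-mapping invariant, the sketch below tests a few of the always-upper and always-lower characters listed above. It reads "cased" as general category Lu or Ll, which is an assumption, and `isCaseMappingInvariant` is a hypothetical helper name.

```javascript
const up = (s) => s.toUpperCase().normalize("NFC");
const lo = (s) => s.toLowerCase().normalize("NFC");

// Assumption: "cased" means general category Lu or Ll.
const isCased = (ch) => /^[\p{Lu}\p{Ll}]$/u.test(ch);

// A character is case-mapping invariant if neither mapping changes it.
const isCaseMappingInvariant = (ch) => up(ch) === ch && lo(ch) === ch;

// ℂ (U+2102), 𝐀 (U+1D400), ĸ (U+0138), ᴀ (U+1D00), and plain "A" for contrast.
for (const ch of ["\u2102", "\u{1D400}", "\u0138", "\u1D00", "A"]) {
  console.log(ch, isCased(ch), isCaseMappingInvariant(ch));
}
```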
Does case-mapping variance imply casedness?
No. and are not partitions of : there are characters that are uncased, but are case-mapping variant:
- =
Character set (116)
Dž (U+01C5)Lj (U+01C8)Nj (U+01CB)Dz (U+01F2)ͅ (U+0345)ᾈ (U+1F88)ᾉ (U+1F89)ᾊ (U+1F8A)ᾋ (U+1F8B)ᾌ (U+1F8C)ᾍ (U+1F8D)ᾎ (U+1F8E)ᾏ (U+1F8F)ᾘ (U+1F98)ᾙ (U+1F99)ᾚ (U+1F9A)ᾛ (U+1F9B)ᾜ (U+1F9C)ᾝ (U+1F9D)ᾞ (U+1F9E)ᾟ (U+1F9F)ᾨ (U+1FA8)ᾩ (U+1FA9)ᾪ (U+1FAA)ᾫ (U+1FAB)ᾬ (U+1FAC)ᾭ (U+1FAD)ᾮ (U+1FAE)ᾯ (U+1FAF)ᾼ (U+1FBC)ῌ (U+1FCC)ῼ (U+1FFC)Ⅰ (U+2160)Ⅱ (U+2161)Ⅲ (U+2162)Ⅳ (U+2163)Ⅴ (U+2164)Ⅵ (U+2165)Ⅶ (U+2166)Ⅷ (U+2167)Ⅸ (U+2168)Ⅹ (U+2169)Ⅺ (U+216A)Ⅻ (U+216B)Ⅼ (U+216C)Ⅽ (U+216D)Ⅾ (U+216E)Ⅿ (U+216F)ⅰ (U+2170)ⅱ (U+2171)ⅲ (U+2172)ⅳ (U+2173)ⅴ (U+2174)ⅵ (U+2175)ⅶ (U+2176)ⅷ (U+2177)ⅸ (U+2178)ⅹ (U+2179)ⅺ (U+217A)ⅻ (U+217B)ⅼ (U+217C)ⅽ (U+217D)ⅾ (U+217E)ⅿ (U+217F)Ⓐ (U+24B6)Ⓑ (U+24B7)Ⓒ (U+24B8)Ⓓ (U+24B9)Ⓔ (U+24BA)Ⓕ (U+24BB)Ⓖ (U+24BC)Ⓗ (U+24BD)Ⓘ (U+24BE)Ⓙ (U+24BF)Ⓚ (U+24C0)Ⓛ (U+24C1)Ⓜ (U+24C2)Ⓝ (U+24C3)Ⓞ (U+24C4)Ⓟ (U+24C5)Ⓠ (U+24C6)Ⓡ (U+24C7)Ⓢ (U+24C8)Ⓣ (U+24C9)Ⓤ (U+24CA)Ⓥ (U+24CB)Ⓦ (U+24CC)Ⓧ (U+24CD)Ⓨ (U+24CE)Ⓩ (U+24CF)ⓐ (U+24D0)ⓑ (U+24D1)ⓒ (U+24D2)ⓓ (U+24D3)ⓔ (U+24D4)ⓕ (U+24D5)ⓖ (U+24D6)ⓗ (U+24D7)ⓘ (U+24D8)ⓙ (U+24D9)ⓚ (U+24DA)ⓛ (U+24DB)ⓜ (U+24DC)ⓝ (U+24DD)ⓞ (U+24DE)ⓟ (U+24DF)ⓠ (U+24E0)ⓡ (U+24E1)ⓢ (U+24E2)ⓣ (U+24E3)ⓤ (U+24E4)ⓥ (U+24E5)ⓦ (U+24E6)ⓧ (U+24E7)ⓨ (U+24E8)ⓩ (U+24E9)
They include , , and also:
- Number Forms
- Uppercase roman numerals (): U+2160 – U+216F
- Small roman numerals (): U+2170 – U+217F
- Enclosed Alphanumerics
- Circled Latin letters (): U+24B6 – U+24CF
- Circled small Latin letters (): U+24D0 – U+24E9
- Combining Diacritical Marks
- COMBINING GREEK YPOGEGRAMMENI (U+0345) (Previously mentioned)
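The characters in this list are neither Uppercase_Letter nor Lowercase_Letter, yet a case mapping still changes them. A short sketch, assuming as before that "cased" means general category Lu or Ll:

```javascript
// Assumption: "cased" means general category Lu or Ll.
const isCased = (ch) => /^[\p{Lu}\p{Ll}]$/u.test(ch);

const samples = [
  ["\u2160", (s) => s.toLowerCase()], // Ⅰ ROMAN NUMERAL ONE
  ["\u24B6", (s) => s.toLowerCase()], // Ⓐ CIRCLED LATIN CAPITAL LETTER A
  ["\u0345", (s) => s.toUpperCase()], // COMBINING GREEK YPOGEGRAMMENI
];

for (const [ch, fn] of samples) {
  const out = fn(ch);
  // Print input, whether it is cased, and the code point it maps to.
  console.log(
    ch.codePointAt(0).toString(16),
    isCased(ch),
    out.codePointAt(0).toString(16)
  );
}
```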
Are uppercase variance and lowercase variance mutually exclusive?
No. There are characters that are both uppercase variant and lowercase variant:
- = =
Character set (31)
- Dž (U+01C5) → DŽ (U+01C4), dž (U+01C6)
- Lj (U+01C8) → LJ (U+01C7), lj (U+01C9)
- Nj (U+01CB) → NJ (U+01CA), nj (U+01CC)
- Dz (U+01F2) → DZ (U+01F1), dz (U+01F3)
- ᾈ (U+1F88) → ἈΙ (U+1F08 U+0399), ᾀ (U+1F80)
- ᾉ (U+1F89) → ἉΙ (U+1F09 U+0399), ᾁ (U+1F81)
- ᾊ (U+1F8A) → ἊΙ (U+1F0A U+0399), ᾂ (U+1F82)
- ᾋ (U+1F8B) → ἋΙ (U+1F0B U+0399), ᾃ (U+1F83)
- ᾌ (U+1F8C) → ἌΙ (U+1F0C U+0399), ᾄ (U+1F84)
- ᾍ (U+1F8D) → ἍΙ (U+1F0D U+0399), ᾅ (U+1F85)
- ᾎ (U+1F8E) → ἎΙ (U+1F0E U+0399), ᾆ (U+1F86)
- ᾏ (U+1F8F) → ἏΙ (U+1F0F U+0399), ᾇ (U+1F87)
- ᾘ (U+1F98) → ἨΙ (U+1F28 U+0399), ᾐ (U+1F90)
- ᾙ (U+1F99) → ἩΙ (U+1F29 U+0399), ᾑ (U+1F91)
- ᾚ (U+1F9A) → ἪΙ (U+1F2A U+0399), ᾒ (U+1F92)
- ᾛ (U+1F9B) → ἫΙ (U+1F2B U+0399), ᾓ (U+1F93)
- ᾜ (U+1F9C) → ἬΙ (U+1F2C U+0399), ᾔ (U+1F94)
- ᾝ (U+1F9D) → ἭΙ (U+1F2D U+0399), ᾕ (U+1F95)
- ᾞ (U+1F9E) → ἮΙ (U+1F2E U+0399), ᾖ (U+1F96)
- ᾟ (U+1F9F) → ἯΙ (U+1F2F U+0399), ᾗ (U+1F97)
- ᾨ (U+1FA8) → ὨΙ (U+1F68 U+0399), ᾠ (U+1FA0)
- ᾩ (U+1FA9) → ὩΙ (U+1F69 U+0399), ᾡ (U+1FA1)
- ᾪ (U+1FAA) → ὪΙ (U+1F6A U+0399), ᾢ (U+1FA2)
- ᾫ (U+1FAB) → ὫΙ (U+1F6B U+0399), ᾣ (U+1FA3)
- ᾬ (U+1FAC) → ὬΙ (U+1F6C U+0399), ᾤ (U+1FA4)
- ᾭ (U+1FAD) → ὭΙ (U+1F6D U+0399), ᾥ (U+1FA5)
- ᾮ (U+1FAE) → ὮΙ (U+1F6E U+0399), ᾦ (U+1FA6)
- ᾯ (U+1FAF) → ὯΙ (U+1F6F U+0399), ᾧ (U+1FA7)
- ᾼ (U+1FBC) → ΑΙ (U+0391 U+0399), ᾳ (U+1FB3)
- ῌ (U+1FCC) → ΗΙ (U+0397 U+0399), ῃ (U+1FC3)
- ῼ (U+1FFC) → ΩΙ (U+03A9 U+0399), ῳ (U+1FF3)
In addition, as shown before, these are also characters that cannot be produced by toUpperCase() or toLowerCase() with any input, including themselves.
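The 31 characters above appear to coincide with the Unicode general category Titlecase_Letter (Lt). That identification is an observation made for this sketch rather than something stated in the text. Under that assumption, the set can be enumerated and checked as follows:

```javascript
// Assumption: the characters in question are exactly those in general
// category Titlecase_Letter (Lt).
const isTitlecase = /^\p{Lt}$/u;

const titlecase = [];
for (let cp = 0; cp <= 0x10ffff; cp++) {
  if (cp >= 0xd800 && cp <= 0xdfff) continue; // skip lone surrogates
  const ch = String.fromCodePoint(cp);
  if (isTitlecase.test(ch)) titlecase.push(ch);
}

// Keep those changed by both toUpperCase() and toLowerCase().
const variantBothWays = titlecase.filter(
  (ch) => ch.toUpperCase() !== ch && ch.toLowerCase() !== ch
);

console.log(titlecase.length, variantBothWays.length);
```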
Is Lower(Upper)Case_Letter always mapped to Upper(Lower)Case_Letter by toUpper(Lower)Case?
We already mentioned that certain upper-/lower-case letters are mapping invariant. Furthermore, there are plenty of characters in that are cased. Dropping those, the answer is yes. If the input is a Lowercase_Letter, the output of toUpperCase() is always an Uppercase_Letter. If the input is an Uppercase_Letter, the output of toLowerCase() is always a Lowercase_Letter.
- = ∅
- = ∅
Does toUpper(Lower)Case always produce Upper(Lower)case_Letter? Can it produce Lower(Upper)case_Letter?
(Again, disregarding multi-code-point characters) No and no (but yes, if you count case-mapping invariant but cased characters). and are proper subsets of and , respectively:
- = ∅
- = =
Character set (42)
Ⅰ (U+2160)Ⅱ (U+2161)Ⅲ (U+2162)Ⅳ (U+2163)Ⅴ (U+2164)Ⅵ (U+2165)Ⅶ (U+2166)Ⅷ (U+2167)Ⅸ (U+2168)Ⅹ (U+2169)Ⅺ (U+216A)Ⅻ (U+216B)Ⅼ (U+216C)Ⅽ (U+216D)Ⅾ (U+216E)Ⅿ (U+216F)Ⓐ (U+24B6)Ⓑ (U+24B7)Ⓒ (U+24B8)Ⓓ (U+24B9)Ⓔ (U+24BA)Ⓕ (U+24BB)Ⓖ (U+24BC)Ⓗ (U+24BD)Ⓘ (U+24BE)Ⓙ (U+24BF)Ⓚ (U+24C0)Ⓛ (U+24C1)Ⓜ (U+24C2)Ⓝ (U+24C3)Ⓞ (U+24C4)Ⓟ (U+24C5)Ⓠ (U+24C6)Ⓡ (U+24C7)Ⓢ (U+24C8)Ⓣ (U+24C9)Ⓤ (U+24CA)Ⓥ (U+24CB)Ⓦ (U+24CC)Ⓧ (U+24CD)Ⓨ (U+24CE)Ⓩ (U+24CF)
- = ∅
- = =
Character set (43)
ͅ (U+0345)ⅰ (U+2170)ⅱ (U+2171)ⅲ (U+2172)ⅳ (U+2173)ⅴ (U+2174)ⅵ (U+2175)ⅶ (U+2176)ⅷ (U+2177)ⅸ (U+2178)ⅹ (U+2179)ⅺ (U+217A)ⅻ (U+217B)ⅼ (U+217C)ⅽ (U+217D)ⅾ (U+217E)ⅿ (U+217F)ⓐ (U+24D0)ⓑ (U+24D1)ⓒ (U+24D2)ⓓ (U+24D3)ⓔ (U+24D4)ⓕ (U+24D5)ⓖ (U+24D6)ⓗ (U+24D7)ⓘ (U+24D8)ⓙ (U+24D9)ⓚ (U+24DA)ⓛ (U+24DB)ⓜ (U+24DC)ⓝ (U+24DD)ⓞ (U+24DE)ⓟ (U+24DF)ⓠ (U+24E0)ⓡ (U+24E1)ⓢ (U+24E2)ⓣ (U+24E3)ⓤ (U+24E4)ⓥ (U+24E5)ⓦ (U+24E6)ⓧ (U+24E7)ⓨ (U+24E8)ⓩ (U+24E9)
Uncased letters produced by to{Upper,Lower}Case are the same sets as discussed before: those characters that are uncased but case-mapping variant. To produce an uncased output, the input must be uncased too.
On the other hand, and , and are disjoint:
- = ∅
- = ∅
So toUpperCase() never produces a Lowercase_Letter, and toLowerCase() never produces an Uppercase_Letter.
Can toUpper(Lower)Case produce Upper(Lower)case_Letter from uncased characters?
Yes. Uncased letters may become cased after case mapping:
- = =
Character set (5)
DŽ (U+01C4)LJ (U+01C7)NJ (U+01CA)DZ (U+01F1)Ι (U+0399)
- = =
Character set (31)
dž (U+01C6)lj (U+01C9)nj (U+01CC)dz (U+01F3)ᾀ (U+1F80)ᾁ (U+1F81)ᾂ (U+1F82)ᾃ (U+1F83)ᾄ (U+1F84)ᾅ (U+1F85)ᾆ (U+1F86)ᾇ (U+1F87)ᾐ (U+1F90)ᾑ (U+1F91)ᾒ (U+1F92)ᾓ (U+1F93)ᾔ (U+1F94)ᾕ (U+1F95)ᾖ (U+1F96)ᾗ (U+1F97)ᾠ (U+1FA0)ᾡ (U+1FA1)ᾢ (U+1FA2)ᾣ (U+1FA3)ᾤ (U+1FA4)ᾥ (U+1FA5)ᾦ (U+1FA6)ᾧ (U+1FA7)ᾳ (U+1FB3)ῃ (U+1FC3)ῳ (U+1FF3)
These characters are also the characters that are uncased but case-mapping variant.
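A few of these uncased-to-cased mappings can be checked by hand. This is only an illustrative sketch: `category` is a hypothetical helper that collapses everything other than Uppercase_Letter and Lowercase_Letter into "other", which is how "uncased" is used in this section.

```javascript
// Classify a single-code-point string as Lu, Ll, or "other" (uncased here).
const category = (s) =>
  /^\p{Lu}$/u.test(s) ? "Lu" : /^\p{Ll}$/u.test(s) ? "Ll" : "other";

const samples = [
  ["\u01C5", "toUpperCase"], // titlecase digraph Dž
  ["\u0345", "toUpperCase"], // combining ypogegrammeni
  ["\u01C5", "toLowerCase"], // titlecase digraph Dž
  ["\u1F88", "toLowerCase"], // titlecase Greek alpha with prosgegrammeni
];

for (const [input, map] of samples) {
  const output = input[map]();
  console.log(`${map}: ${category(input)} -> ${category(output)}`);
}
```

Each sampled input is reported as "other", while its output is a cased letter.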
To summarize, for toUpperCase():
Input case\Output case | Upper case | Lower case | Uncased |
---|---|---|---|
Upper case | (identity) | Never | Never |
Lower case | (identity) | Never | |
Uncased | , U+0345 | Never | , , other identities |
And for toLowerCase():
Input case\Output case | Upper case | Lower case | Uncased |
---|---|---|---|
Upper case | (identity) | Never | |
Lower case | Never | (identity) | Never |
Uncased | Never | , | , , other identities |
Properties of characters that map to multiple characters
We now focus on these particular subsets:
-
- Lower case (thus lowercase invariant): = ∅
-
- Upper case (thus uppercase invariant): = ∅
-
- We already mentioned that is uncased, and both uppercase and lowercase variant. is lower case (thus lowercase invariant): = ∅
-
- Empty set
Define . So, we may characterize the domain and codomain of toUpperCase() and toLowerCase() as follows, where each piece has a disjoint domain and each except the last has a disjoint codomain:
There are other cases where multiple code points can be mapped to single code points, but they are not of interest here. We will discuss these multi-code-point characters soon.
We wonder if characters in always stay in . In order to narrow the codomain of the pieces marked with (*), we want to find if there are characters such that , or .
- = ∅
- =
Character set (1)
ẞ (U+1E9E)
There is exactly one: LATIN CAPITAL LETTER SHARP S (U+1E9E). The mappings of this character are:
- toUpperCase(): ẞ (U+1E9E) → ẞ (U+1E9E)
- toLowerCase(), then toUpperCase(): ẞ (U+1E9E) → ß (U+00DF) → SS (U+0053 U+0053)
i.e. it maps to a character in . There's no other character that maps to U+00DF:
- =
Character set (2)
ß (U+00DF)ẞ (U+1E9E)
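The sharp-s mappings described above are easy to confirm in a console. A small sketch follows; `hex` is a hypothetical formatting helper, not part of any API.

```javascript
// Format a string as a list of U+XXXX code points.
const hex = (s) =>
  [...s]
    .map((c) => "U+" + c.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"))
    .join(" ");

const capitalSharpS = "\u1E9E";
const upperOfCapital = capitalSharpS.toUpperCase();
const lowerOfCapital = capitalSharpS.toLowerCase();

console.log(hex(upperOfCapital)); // U+1E9E
console.log(hex(lowerOfCapital)); // U+00DF
console.log(hex(lowerOfCapital.toUpperCase())); // U+0053 U+0053
```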
The characters that map to are exactly and :
- =
Character set (54)
ᾀ (U+1F80)ᾁ (U+1F81)ᾂ (U+1F82)ᾃ (U+1F83)ᾄ (U+1F84)ᾅ (U+1F85)ᾆ (U+1F86)ᾇ (U+1F87)ᾈ (U+1F88)ᾉ (U+1F89)ᾊ (U+1F8A)ᾋ (U+1F8B)ᾌ (U+1F8C)ᾍ (U+1F8D)ᾎ (U+1F8E)ᾏ (U+1F8F)ᾐ (U+1F90)ᾑ (U+1F91)ᾒ (U+1F92)ᾓ (U+1F93)ᾔ (U+1F94)ᾕ (U+1F95)ᾖ (U+1F96)ᾗ (U+1F97)ᾘ (U+1F98)ᾙ (U+1F99)ᾚ (U+1F9A)ᾛ (U+1F9B)ᾜ (U+1F9C)ᾝ (U+1F9D)ᾞ (U+1F9E)ᾟ (U+1F9F)ᾠ (U+1FA0)ᾡ (U+1FA1)ᾢ (U+1FA2)ᾣ (U+1FA3)ᾤ (U+1FA4)ᾥ (U+1FA5)ᾦ (U+1FA6)ᾧ (U+1FA7)ᾨ (U+1FA8)ᾩ (U+1FA9)ᾪ (U+1FAA)ᾫ (U+1FAB)ᾬ (U+1FAC)ᾭ (U+1FAD)ᾮ (U+1FAE)ᾯ (U+1FAF)ᾳ (U+1FB3)ᾼ (U+1FBC)ῃ (U+1FC3)ῌ (U+1FCC)ῳ (U+1FF3)ῼ (U+1FFC)
Then we can refine the domain and codomain of toUpperCase() and toLowerCase() as follows, so that each piece has a disjoint domain and codomain:
Mapping graph
To study the injectivity/surjectivity of case mapping, we introduce the concept of a mapping graph, a directed graph . The vertices are all characters that we discussed, and the edges have two colors: iff , iff . Therefore, we can reformulate case-mapping variance as:
- A node is case-mapping invariant iff both of its out edges are self-loops. These are nodes in .
- A node is uppercase invariant iff its out -edge is a self-loop.
- A node is lowercase invariant iff its out -edge is a self-loop.
- A node is both uppercase and lowercase variant iff both of its out edges are not self-loops. These are the 31 characters mentioned before: .
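The four variance classes above translate directly into a check on a node's two out-edges. The sketch below uses made-up names (`outEdges`, `variance`) and plain strings as nodes; it is an illustration of the definitions, not the article's collector code.

```javascript
// The two out-edges of a node: its uppercase and lowercase mappings.
function outEdges(ch) {
  return { upper: ch.toUpperCase(), lower: ch.toLowerCase() };
}

// Classify a node by which of its out-edges are self-loops.
function variance(ch) {
  const { upper, lower } = outEdges(ch);
  if (upper === ch && lower === ch) return "invariant";
  if (upper === ch) return "uppercase invariant";
  if (lower === ch) return "lowercase invariant";
  return "both variant";
}

console.log(variance("1")); // invariant
console.log(variance("A")); // uppercase invariant
console.log(variance("a")); // lowercase invariant
console.log(variance("\u01C5")); // both variant (titlecase Dž)
```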
We also have the following observations:
- Edges can be self-referential. Each node except those in and has exactly one out -edge and exactly one out -edge. has a self-referential out -edge and no out -edge. has a self-referential out -edge and no out -edge.
- Each node has zero or more in edges colored either or .
- Idempotence: if a node has a non-self-referential in -edge, then its out -edge is self-referential. Similarly, if a node has a non-self-referential in -edge, then its out -edge is self-referential.
- Complementary ranges: if a node has a non-self-referential in -edge, then it has no non-self-referential in -edge. Similarly, if a node has a non-self-referential in -edge, then it has no non-self-referential in -edge.
- Closedness of : if a node has a non-self-referential in -edge, then it has no out -edge or its out -edge is non-self-referential. Similarly, if a node has a non-self-referential in edge colored , then it has no out -edge or its out -edge is non-self-referential.
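The idempotence observation can be phrased as: applying the same mapping twice gives the same result as applying it once. Here is a hedged sketch of how one might look for counterexamples. The names `codePoints` and `idempotenceViolations` are hypothetical, and multi-code-point outputs are skipped on the assumption that they are treated as dead ends, as described above.

```javascript
// Yield every code point except lone surrogates, as a string.
function* codePoints() {
  for (let cp = 0; cp <= 0x10ffff; cp++) {
    if (cp >= 0xd800 && cp <= 0xdfff) continue;
    yield String.fromCodePoint(cp);
  }
}

// Return inputs where mapping twice differs from mapping once.
// `map` is "toUpperCase" or "toLowerCase".
function idempotenceViolations(map, inputs = codePoints()) {
  const violations = [];
  for (const ch of inputs) {
    const once = ch[map]();
    if ([...once].length !== 1) continue; // assumption: skip multi-code-point targets
    if (once[map]() !== once) violations.push(ch);
  }
  return violations;
}

console.log(idempotenceViolations("toUpperCase").length);
console.log(idempotenceViolations("toLowerCase").length);
```

Passing a short array as `inputs` restricts the check to a handful of characters, which is handy when investigating one cluster at a time.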
We already established that:
- Nodes in have no in edges from other nodes.
- Nodes in have no in edges from other nodes.
Therefore, ignoring these nodes, the graph is inherently bipartite:
- Each uppercase invariant node points to a lowercase invariant node with an -edge and points to itself with a -edge. It has zero or more in -edges and no in -edge.
- Each lowercase invariant node points to an uppercase invariant node with a -edge and points to itself with an -edge. It has zero or more in -edges and no in -edge.
Connected subgraphs
A connected subgraph, called a cluster, formed by is recursively defined:
- is in the cluster.
- If is in the cluster, then all nodes that point to and those that points to are all in the cluster.
Nodes that are in the cluster formed by do not always form cycles through some series of toUpperCase/toLowerCase transformations, because they eventually map to a node in , which has no out -edge or no out -edge.
Character set (51)
- SS (U+0053 U+0053), ß (U+00DF), ẞ (U+1E9E)
- ʼN (U+02BC U+004E), ʼn (U+0149)
- ԵՒ (U+0535 U+0552), և (U+0587)
- Aʾ (U+0041 U+02BE), ẚ (U+1E9A)
- ἈΙ (U+1F08 U+0399), ᾀ (U+1F80), ᾈ (U+1F88)
- ἉΙ (U+1F09 U+0399), ᾁ (U+1F81), ᾉ (U+1F89)
- ἊΙ (U+1F0A U+0399), ᾂ (U+1F82), ᾊ (U+1F8A)
- ἋΙ (U+1F0B U+0399), ᾃ (U+1F83), ᾋ (U+1F8B)
- ἌΙ (U+1F0C U+0399), ᾄ (U+1F84), ᾌ (U+1F8C)
- ἍΙ (U+1F0D U+0399), ᾅ (U+1F85), ᾍ (U+1F8D)
- ἎΙ (U+1F0E U+0399), ᾆ (U+1F86), ᾎ (U+1F8E)
- ἏΙ (U+1F0F U+0399), ᾇ (U+1F87), ᾏ (U+1F8F)
- ἨΙ (U+1F28 U+0399), ᾐ (U+1F90), ᾘ (U+1F98)
- ἩΙ (U+1F29 U+0399), ᾑ (U+1F91), ᾙ (U+1F99)
- ἪΙ (U+1F2A U+0399), ᾒ (U+1F92), ᾚ (U+1F9A)
- ἫΙ (U+1F2B U+0399), ᾓ (U+1F93), ᾛ (U+1F9B)
- ἬΙ (U+1F2C U+0399), ᾔ (U+1F94), ᾜ (U+1F9C)
- ἭΙ (U+1F2D U+0399), ᾕ (U+1F95), ᾝ (U+1F9D)
- ἮΙ (U+1F2E U+0399), ᾖ (U+1F96), ᾞ (U+1F9E)
- ἯΙ (U+1F2F U+0399), ᾗ (U+1F97), ᾟ (U+1F9F)
- ὨΙ (U+1F68 U+0399), ᾠ (U+1FA0), ᾨ (U+1FA8)
- ὩΙ (U+1F69 U+0399), ᾡ (U+1FA1), ᾩ (U+1FA9)
- ὪΙ (U+1F6A U+0399), ᾢ (U+1FA2), ᾪ (U+1FAA)
- ὫΙ (U+1F6B U+0399), ᾣ (U+1FA3), ᾫ (U+1FAB)
- ὬΙ (U+1F6C U+0399), ᾤ (U+1FA4), ᾬ (U+1FAC)
- ὭΙ (U+1F6D U+0399), ᾥ (U+1FA5), ᾭ (U+1FAD)
- ὮΙ (U+1F6E U+0399), ᾦ (U+1FA6), ᾮ (U+1FAE)
- ὯΙ (U+1F6F U+0399), ᾧ (U+1FA7), ᾯ (U+1FAF)
- ᾺΙ (U+1FBA U+0399), ᾲ (U+1FB2)
- ΑΙ (U+0391 U+0399), ᾳ (U+1FB3), ᾼ (U+1FBC)
- ΆΙ (U+0386 U+0399), ᾴ (U+1FB4)
- Α͂Ι (U+0391 U+0342 U+0399), ᾷ (U+1FB7)
- ῊΙ (U+1FCA U+0399), ῂ (U+1FC2)
- ΗΙ (U+0397 U+0399), ῃ (U+1FC3), ῌ (U+1FCC)
- ΉΙ (U+0389 U+0399), ῄ (U+1FC4)
- Η͂Ι (U+0397 U+0342 U+0399), ῇ (U+1FC7)
- ῺΙ (U+1FFA U+0399), ῲ (U+1FF2)
- ΩΙ (U+03A9 U+0399), ῳ (U+1FF3), ῼ (U+1FFC)
- ΏΙ (U+038F U+0399), ῴ (U+1FF4)
- Ω͂Ι (U+03A9 U+0342 U+0399), ῷ (U+1FF7)
- FF (U+0046 U+0046), ff (U+FB00)
- FI (U+0046 U+0049), fi (U+FB01)
- FL (U+0046 U+004C), fl (U+FB02)
- FFI (U+0046 U+0046 U+0049), ffi (U+FB03)
- FFL (U+0046 U+0046 U+004C), ffl (U+FB04)
- ST (U+0053 U+0054), ſt (U+FB05), st (U+FB06)
- ՄՆ (U+0544 U+0546), ﬓ (U+FB13)
- ՄԵ (U+0544 U+0535), ﬔ (U+FB14)
- ՄԻ (U+0544 U+053B), ﬕ (U+FB15)
- ՎՆ (U+054E U+0546), ﬖ (U+FB16)
- ՄԽ (U+0544 U+053D), ﬗ (U+FB17)
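The recursive cluster definition above amounts to a connected-component search that ignores edge direction. The following sketch shows one way to compute it; `buildNeighbors` and `clusterOf` are hypothetical names. Only single code points are given out-edges, so multi-code-point strings such as SS act as dead ends (an assumption that follows the description above). Because it scans every code point, its output can differ from the lists in this article depending on the engine's Unicode version and on which characters are in scope.

```javascript
// Build an undirected adjacency map from all non-self case-mapping edges.
function buildNeighbors() {
  const neighbors = new Map();
  const link = (a, b) => {
    if (!neighbors.has(a)) neighbors.set(a, new Set());
    if (!neighbors.has(b)) neighbors.set(b, new Set());
    neighbors.get(a).add(b);
    neighbors.get(b).add(a);
  };
  for (let cp = 0; cp <= 0x10ffff; cp++) {
    if (cp >= 0xd800 && cp <= 0xdfff) continue; // skip lone surrogates
    const ch = String.fromCodePoint(cp);
    const upper = ch.toUpperCase();
    const lower = ch.toLowerCase();
    if (upper !== ch) link(ch, upper);
    if (lower !== ch) link(ch, lower);
  }
  return neighbors;
}

// Breadth-first search: everything reachable through in- or out-edges.
function clusterOf(start, neighbors) {
  const cluster = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of neighbors.get(node) ?? []) {
      if (!cluster.has(next)) {
        cluster.add(next);
        queue.push(next);
      }
    }
  }
  return [...cluster].sort();
}

const neighbors = buildNeighbors();
console.log(clusterOf("\u00DF", neighbors)); // [ 'SS', 'ß', 'ẞ' ]
```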
Simple mapping pairs
Define a mapping pair as a cluster of size 2. A mapping pair () satisfies:
- (the -edge of points to )
- (the -edge of points to )
- such that or (no other node points to or )
There are 1386 such pairs. Of these, 1322 are pairs of Uppercase_Letter and Lowercase_Letter, and the rest are:
Character set (64)
- İ (U+0130) — i̇ (U+0069 U+0307)
- ǰ (U+01F0) — J̌ (U+004A U+030C)
- ΐ (U+0390) — Ϊ́ (U+03AA U+0301)
- ΰ (U+03B0) — Ϋ́ (U+03AB U+0301)
- ẖ (U+1E96) — H̱ (U+0048 U+0331)
- ẗ (U+1E97) — T̈ (U+0054 U+0308)
- ẘ (U+1E98) — W̊ (U+0057 U+030A)
- ẙ (U+1E99) — Y̊ (U+0059 U+030A)
- ὐ (U+1F50) — Υ̓ (U+03A5 U+0313)
- ὒ (U+1F52) — Υ̓̀ (U+03A5 U+0313 U+0300)
- ὔ (U+1F54) — Υ̓́ (U+03A5 U+0313 U+0301)
- ὖ (U+1F56) — Υ̓͂ (U+03A5 U+0313 U+0342)
- ᾶ (U+1FB6) — Α͂ (U+0391 U+0342)
- ῆ (U+1FC6) — Η͂ (U+0397 U+0342)
- ῒ (U+1FD2) — Ϊ̀ (U+03AA U+0300)
- ῖ (U+1FD6) — Ι͂ (U+0399 U+0342)
- ῗ (U+1FD7) — Ϊ͂ (U+03AA U+0342)
- ῢ (U+1FE2) — Ϋ̀ (U+03AB U+0300)
- ῤ (U+1FE4) — Ρ̓ (U+03A1 U+0313)
- ῦ (U+1FE6) — Υ͂ (U+03A5 U+0342)
- ῧ (U+1FE7) — Ϋ͂ (U+03AB U+0342)
- ῶ (U+1FF6) — Ω͂ (U+03A9 U+0342)
- Ⅰ (U+2160) — ⅰ (U+2170)
- Ⅱ (U+2161) — ⅱ (U+2171)
- Ⅲ (U+2162) — ⅲ (U+2172)
- Ⅳ (U+2163) — ⅳ (U+2173)
- Ⅴ (U+2164) — ⅴ (U+2174)
- Ⅵ (U+2165) — ⅵ (U+2175)
- Ⅶ (U+2166) — ⅶ (U+2176)
- Ⅷ (U+2167) — ⅷ (U+2177)
- Ⅸ (U+2168) — ⅸ (U+2178)
- Ⅹ (U+2169) — ⅹ (U+2179)
- Ⅺ (U+216A) — ⅺ (U+217A)
- Ⅻ (U+216B) — ⅻ (U+217B)
- Ⅼ (U+216C) — ⅼ (U+217C)
- Ⅽ (U+216D) — ⅽ (U+217D)
- Ⅾ (U+216E) — ⅾ (U+217E)
- Ⅿ (U+216F) — ⅿ (U+217F)
- Ⓐ (U+24B6) — ⓐ (U+24D0)
- Ⓑ (U+24B7) — ⓑ (U+24D1)
- Ⓒ (U+24B8) — ⓒ (U+24D2)
- Ⓓ (U+24B9) — ⓓ (U+24D3)
- Ⓔ (U+24BA) — ⓔ (U+24D4)
- Ⓕ (U+24BB) — ⓕ (U+24D5)
- Ⓖ (U+24BC) — ⓖ (U+24D6)
- Ⓗ (U+24BD) — ⓗ (U+24D7)
- Ⓘ (U+24BE) — ⓘ (U+24D8)
- Ⓙ (U+24BF) — ⓙ (U+24D9)
- Ⓚ (U+24C0) — ⓚ (U+24DA)
- Ⓛ (U+24C1) — ⓛ (U+24DB)
- Ⓜ (U+24C2) — ⓜ (U+24DC)
- Ⓝ (U+24C3) — ⓝ (U+24DD)
- Ⓞ (U+24C4) — ⓞ (U+24DE)
- Ⓟ (U+24C5) — ⓟ (U+24DF)
- Ⓠ (U+24C6) — ⓠ (U+24E0)
- Ⓡ (U+24C7) — ⓡ (U+24E1)
- Ⓢ (U+24C8) — ⓢ (U+24E2)
- Ⓣ (U+24C9) — ⓣ (U+24E3)
- Ⓤ (U+24CA) — ⓤ (U+24E4)
- Ⓥ (U+24CB) — ⓥ (U+24E5)
- Ⓦ (U+24CC) — ⓦ (U+24E6)
- Ⓧ (U+24CD) — ⓧ (U+24E7)
- Ⓨ (U+24CE) — ⓨ (U+24E8)
- Ⓩ (U+24CF) — ⓩ (U+24E9)
These are the Roman numerals, the circled letters, the 21 characters listed earlier whose uppercase forms span multiple code points, and İ (U+0130).
Now, the remaining nodes are the complex cycles that neither have dead ends nor are simple mapping pairs. They are:
Character set (25)
- I (U+0049), i (U+0069), ı (U+0131)
- S (U+0053), s (U+0073), ſ (U+017F)
- µ (U+00B5), Μ (U+039C), μ (U+03BC)
- DŽ (U+01C4), Dž (U+01C5), dž (U+01C6)
- LJ (U+01C7), Lj (U+01C8), lj (U+01C9)
- NJ (U+01CA), Nj (U+01CB), nj (U+01CC)
- DZ (U+01F1), Dz (U+01F2), dz (U+01F3)
- ͅ (U+0345), Ι (U+0399), ι (U+03B9)
- Β (U+0392), β (U+03B2), ϐ (U+03D0)
- Ε (U+0395), ε (U+03B5), ϵ (U+03F5)
- Θ (U+0398), θ (U+03B8), ϴ (U+03F4), ϑ (U+03D1)
- Κ (U+039A), κ (U+03BA), ϰ (U+03F0)
- Π (U+03A0), π (U+03C0), ϖ (U+03D6)
- Ρ (U+03A1), ρ (U+03C1), ϱ (U+03F1)
- Σ (U+03A3), ς (U+03C2), σ (U+03C3)
- Φ (U+03A6), φ (U+03C6), ϕ (U+03D5)
- В (U+0412), в (U+0432), ᲀ (U+1C80)
- Д (U+0414), д (U+0434), ᲁ (U+1C81)
- О (U+041E), о (U+043E), ᲂ (U+1C82)
- С (U+0421), с (U+0441), ᲃ (U+1C83)
- Т (U+0422), т (U+0442), ᲄ (U+1C84), ᲅ (U+1C85)
- Ъ (U+042A), ъ (U+044A), ᲆ (U+1C86)
- Ѣ (U+0462), ѣ (U+0463), ᲇ (U+1C87)
- ᲈ (U+1C88), Ꙋ (U+A64A), ꙋ (U+A64B)
- Ṡ (U+1E60), ṡ (U+1E61), ẛ (U+1E9B)
Which characters are the upper(lower)case form of multiple characters?
Below are all characters that are the uppercase form of multiple characters.
Character set (53)
- I (U+0049): i (U+0069), ı (U+0131)
- S (U+0053): s (U+0073), ſ (U+017F)
- DŽ (U+01C4): Dž (U+01C5), dž (U+01C6)
- LJ (U+01C7): Lj (U+01C8), lj (U+01C9)
- NJ (U+01CA): Nj (U+01CB), nj (U+01CC)
- DZ (U+01F1): Dz (U+01F2), dz (U+01F3)
- Β (U+0392): β (U+03B2), ϐ (U+03D0)
- Ε (U+0395): ε (U+03B5), ϵ (U+03F5)
- Θ (U+0398): θ (U+03B8), ϑ (U+03D1)
- Ι (U+0399): ͅ (U+0345), ι (U+03B9)
- Κ (U+039A): κ (U+03BA), ϰ (U+03F0)
- Μ (U+039C): µ (U+00B5), μ (U+03BC)
- Π (U+03A0): π (U+03C0), ϖ (U+03D6)
- Ρ (U+03A1): ρ (U+03C1), ϱ (U+03F1)
- Σ (U+03A3): ς (U+03C2), σ (U+03C3)
- Φ (U+03A6): φ (U+03C6), ϕ (U+03D5)
- В (U+0412): в (U+0432), ᲀ (U+1C80)
- Д (U+0414): д (U+0434), ᲁ (U+1C81)
- О (U+041E): о (U+043E), ᲂ (U+1C82)
- С (U+0421): с (U+0441), ᲃ (U+1C83)
- Т (U+0422): т (U+0442), ᲄ (U+1C84), ᲅ (U+1C85)
- Ъ (U+042A): ъ (U+044A), ᲆ (U+1C86)
- Ѣ (U+0462): ѣ (U+0463), ᲇ (U+1C87)
- Ṡ (U+1E60): ṡ (U+1E61), ẛ (U+1E9B)
- Ꙋ (U+A64A): ᲈ (U+1C88), ꙋ (U+A64B)
- ἈΙ (U+1F08 U+0399): ᾀ (U+1F80), ᾈ (U+1F88)
- ἉΙ (U+1F09 U+0399): ᾁ (U+1F81), ᾉ (U+1F89)
- ἊΙ (U+1F0A U+0399): ᾂ (U+1F82), ᾊ (U+1F8A)
- ἋΙ (U+1F0B U+0399): ᾃ (U+1F83), ᾋ (U+1F8B)
- ἌΙ (U+1F0C U+0399): ᾄ (U+1F84), ᾌ (U+1F8C)
- ἍΙ (U+1F0D U+0399): ᾅ (U+1F85), ᾍ (U+1F8D)
- ἎΙ (U+1F0E U+0399): ᾆ (U+1F86), ᾎ (U+1F8E)
- ἏΙ (U+1F0F U+0399): ᾇ (U+1F87), ᾏ (U+1F8F)
- ἨΙ (U+1F28 U+0399): ᾐ (U+1F90), ᾘ (U+1F98)
- ἩΙ (U+1F29 U+0399): ᾑ (U+1F91), ᾙ (U+1F99)
- ἪΙ (U+1F2A U+0399): ᾒ (U+1F92), ᾚ (U+1F9A)
- ἫΙ (U+1F2B U+0399): ᾓ (U+1F93), ᾛ (U+1F9B)
- ἬΙ (U+1F2C U+0399): ᾔ (U+1F94), ᾜ (U+1F9C)
- ἭΙ (U+1F2D U+0399): ᾕ (U+1F95), ᾝ (U+1F9D)
- ἮΙ (U+1F2E U+0399): ᾖ (U+1F96), ᾞ (U+1F9E)
- ἯΙ (U+1F2F U+0399): ᾗ (U+1F97), ᾟ (U+1F9F)
- ὨΙ (U+1F68 U+0399): ᾠ (U+1FA0), ᾨ (U+1FA8)
- ὩΙ (U+1F69 U+0399): ᾡ (U+1FA1), ᾩ (U+1FA9)
- ὪΙ (U+1F6A U+0399): ᾢ (U+1FA2), ᾪ (U+1FAA)
- ὫΙ (U+1F6B U+0399): ᾣ (U+1FA3), ᾫ (U+1FAB)
- ὬΙ (U+1F6C U+0399): ᾤ (U+1FA4), ᾬ (U+1FAC)
- ὭΙ (U+1F6D U+0399): ᾥ (U+1FA5), ᾭ (U+1FAD)
- ὮΙ (U+1F6E U+0399): ᾦ (U+1FA6), ᾮ (U+1FAE)
- ὯΙ (U+1F6F U+0399): ᾧ (U+1FA7), ᾯ (U+1FAF)
- ΑΙ (U+0391 U+0399): ᾳ (U+1FB3), ᾼ (U+1FBC)
- ΗΙ (U+0397 U+0399): ῃ (U+1FC3), ῌ (U+1FCC)
- ΩΙ (U+03A9 U+0399): ῳ (U+1FF3), ῼ (U+1FFC)
- ST (U+0053 U+0054): ſt (U+FB05), st (U+FB06)
Below are all characters that are the lowercase form of multiple characters.
Character set (5)
- dž (U+01C6): DŽ (U+01C4), Dž (U+01C5)
- lj (U+01C9): LJ (U+01C7), Lj (U+01C8)
- nj (U+01CC): NJ (U+01CA), Nj (U+01CB)
- dz (U+01F3): DZ (U+01F1), Dz (U+01F2)
- θ (U+03B8): Θ (U+0398), ϴ (U+03F4)
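Lists like the two above can be produced by grouping inputs by their mapped output and keeping groups with more than one member. This is a minimal sketch with a hypothetical helper name, `preimageGroups`. Since it scans every code point, it may report entries beyond the lists above (for example, the Kelvin sign U+212A also lowercases to k), and results depend on the engine's Unicode version.

```javascript
// Group inputs by their non-identity mapped output and keep only
// outputs with several distinct inputs.
// `map` is "toUpperCase" or "toLowerCase".
function preimageGroups(map) {
  const groups = new Map();
  for (let cp = 0; cp <= 0x10ffff; cp++) {
    if (cp >= 0xd800 && cp <= 0xdfff) continue; // skip lone surrogates
    const input = String.fromCodePoint(cp);
    const output = input[map]();
    if (output === input) continue; // self-loops are not counted
    if (!groups.has(output)) groups.set(output, []);
    groups.get(output).push(input);
  }
  return new Map([...groups].filter(([, inputs]) => inputs.length > 1));
}

const sharedUpper = preimageGroups("toUpperCase");
const sharedLower = preimageGroups("toLowerCase");
console.log(sharedUpper.get("I")); // [ 'i', 'ı' ]
console.log(sharedLower.get("\u03B8")); // [ 'Θ', 'ϴ' ]
```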
What's the longest case-mapping chain?
A case-mapping chain is a sequence of distinct nodes such that . Invariant nodes in have case-mapping chains of length 1 (only the node itself). Simple mapping pairs have case-mapping chains of length 2 (the two nodes). The longest case-mapping chain has length 3, and there are many of them:
Character set (28)
- µ (U+00B5) → Μ (U+039C) → μ (U+03BC)
- ı (U+0131) → I (U+0049) → i (U+0069)
- ſ (U+017F) → S (U+0053) → s (U+0073)
- Dž (U+01C5) → DŽ (U+01C4) → dž (U+01C6)
- Lj (U+01C8) → LJ (U+01C7) → lj (U+01C9)
- Nj (U+01CB) → NJ (U+01CA) → nj (U+01CC)
- Dz (U+01F2) → DZ (U+01F1) → dz (U+01F3)
- ͅ (U+0345) → Ι (U+0399) → ι (U+03B9)
- ς (U+03C2) → Σ (U+03A3) → σ (U+03C3)
- ϐ (U+03D0) → Β (U+0392) → β (U+03B2)
- ϑ (U+03D1) → Θ (U+0398) → θ (U+03B8)
- ϕ (U+03D5) → Φ (U+03A6) → φ (U+03C6)
- ϖ (U+03D6) → Π (U+03A0) → π (U+03C0)
- ϰ (U+03F0) → Κ (U+039A) → κ (U+03BA)
- ϱ (U+03F1) → Ρ (U+03A1) → ρ (U+03C1)
- ϴ (U+03F4) → θ (U+03B8) → Θ (U+0398)
- ϵ (U+03F5) → Ε (U+0395) → ε (U+03B5)
- ᲀ (U+1C80) → В (U+0412) → в (U+0432)
- ᲁ (U+1C81) → Д (U+0414) → д (U+0434)
- ᲂ (U+1C82) → О (U+041E) → о (U+043E)
- ᲃ (U+1C83) → С (U+0421) → с (U+0441)
- ᲄ (U+1C84) → Т (U+0422) → т (U+0442)
- ᲅ (U+1C85) → Т (U+0422) → т (U+0442)
- ᲆ (U+1C86) → Ъ (U+042A) → ъ (U+044A)
- ᲇ (U+1C87) → Ѣ (U+0462) → ѣ (U+0463)
- ᲈ (U+1C88) → Ꙋ (U+A64A) → ꙋ (U+A64B)
- ẛ (U+1E9B) → Ṡ (U+1E60) → ṡ (U+1E61)
- ẞ (U+1E9E) → ß (U+00DF) → SS (U+0053 U+0053)
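Judging by the examples above, each step of a chain applies either toUpperCase() or toLowerCase(), and no node repeats. Under that reading, the longest chain from a given start can be found with a small depth-first search. The sketch below uses a hypothetical function name, `longestChain`, and assumes that multi-code-point nodes are dead ends.

```javascript
// Longest sequence of distinct nodes starting at `start`, where each step
// follows an uppercase or lowercase out-edge.
function longestChain(start, seen = new Set([start])) {
  let best = [start];
  // Assumption: multi-code-point nodes have no outgoing edges to follow.
  if ([...start].length !== 1) return best;
  for (const next of [start.toUpperCase(), start.toLowerCase()]) {
    if (seen.has(next)) continue;
    seen.add(next);
    const tail = longestChain(next, seen);
    seen.delete(next);
    if (tail.length + 1 > best.length) best = [start, ...tail];
  }
  return best;
}

console.log(longestChain("\u00B5")); // [ 'µ', 'Μ', 'μ' ]
console.log(longestChain("\u1E9E")); // [ 'ẞ', 'ß', 'SS' ]
console.log(longestChain("1")); // [ '1' ]
```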