NACLO 2023 - Problem G: Feathers of the Roseate Spoonbill

First note that the 16th-century and 20th-century writing systems are identical, except that the latter adds some diacritics, so the 20th-century system contains strictly more information. This is useful for G3.

Since the 20th-century system contains strictly more information, we should match it (rather than the 16th-century system) against the IPA, which is similarly rich in information. Below, I align the two by consonants and vowels. "tl", "ch", and "tz" are well-known to be complex consonants, so I've grouped them. In cases where the vowels don't match exactly (there are two vowels in the 20th-century system but only one in IPA), I include the previous or next consonant as well, because the missing vowel may have been absorbed into either of the neighboring sounds. For example, "hui" becomes /wi/, which may be either because h → /w/, ui → /i/, or because hu → /w/, i → /i/. "#" indicates a word boundary.

n i c tl āuh que ch ō  l i hui m o l o hua
n i k tɬ aːw ke  tʃ oː l i wi  m o l o wa

n i c t e ō  cui tl a ì  cui y a
n i k t e oː kʷi tɬ a iʔ kʷi j a

n i c que tz a l hui x t oi l p ī  z  #  i n  #  i c n īuh y ō  tl
n i k ke  ts a l wi  ʃ t oi l p iː s  #  i n  #  i k n iːw j oː tɬ

i n  #  t ēuc  p a n  #  n i c quī x t ī  z
i n  #  t eːkʷ p a n  #  n i k kiː ʃ t iː s

ō  t i y à  quê  #  y e  #  m i c tl ā  n
oː t i j aʔ keʔ  #  j e  #  m i k tɬ aː n

Here are all unique correspondences:

a → /a/
ā → /aː/
à → /aʔ/
āuh → /aːw/
c → /k/
ch → /tʃ/
cui → /kʷi/
e → /e/
ēuc → /eːkʷ/
hua → /wa/
hui → /wi/
i → /i/
ī → /iː/
ì → /iʔ/
īuh → /iːw/
l → /l/
m → /m/
n → /n/
o → /o/
ō → /oː/
p → /p/
que → /ke/
quê → /keʔ/
quī → /kiː/
t → /t/
tl → /tɬ/
tz → /ts/
x → /ʃ/
y → /j/
z → /s/

Notice a few patterns for the diacritics: macrons (horizontal bars), as in ā and ē, indicate length; grave accents (ì, à) and the circumflex (ê) both indicate a following glottal stop /ʔ/. We have a few complex segments left to dissect:

āuh → /aːw/
cui → /kʷi/
ēuc → /eːkʷ/
hua → /wa/
hui → /wi/
īuh → /iːw/
que → /ke/
quê → /keʔ/
quī → /kiː/

Note three things:

- The vowel rewrite is consistent with the standalone rules: ā → /aː/, i → /i/, etc.
- The disappearing vowel is always "u", and "u" never occurs as a standalone segment.
- "h" and "q" never occur as standalone segments either (we do have "h" in "ch", but that is a separate digraph). "c" does occur alone, as /k/, but next to "u" it corresponds to /kʷ/ instead.

This strongly suggests that "uh", "hu", "qu", "uc", "cu" are all complex segments that rewrite to a single consonant:

uh → /w/
hu → /w/
qu → /k/
uc → /kʷ/
cu → /kʷ/

This is already sufficient for G1, because every 20th-century segment now maps to exactly one IPA sound. For G1, we first need to look up each 16th-century word in the 20th-century system, because the latter carries the vowel length and glottalization that the former omits.

a. y e # n i hu ā l l â → /j e # n i w aː l l aʔ/
b. a n t o c n ī hu ā n i n → /a n t o k n iː w aː n i n/
c. qu i n # ì cu ā c → /k i n # iʔ kʷ aː k/
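
The G1 conversion can be sketched in Python as follows. This is a minimal sketch, not part of the original solution: the names CONSONANTS, MARKS, and to_ipa are hypothetical, spaces stand in for the word boundary "#", and greedy two-letter matching is assumed to be enough to segment the words in this problem.

```python
import unicodedata

# Consonant spellings from the correspondence list (hypothetical table name).
CONSONANTS = {
    "tl": "tɬ", "tz": "ts", "ch": "tʃ",
    "uh": "w", "hu": "w", "qu": "k", "uc": "kʷ", "cu": "kʷ",
    "c": "k", "l": "l", "m": "m", "n": "n", "p": "p",
    "t": "t", "x": "ʃ", "y": "j", "z": "s",
}
VOWELS = set("aeio")
# Combining macron -> length; combining grave and circumflex -> glottal stop.
MARKS = {"\u0304": "ː", "\u0300": "ʔ", "\u0302": "ʔ"}


def to_ipa(text):
    """Rewrite 20th-century spelling as IPA, one segment at a time."""
    # NFD splits each accented vowel into a base letter plus a combining mark.
    s = unicodedata.normalize("NFD", text)
    out = []
    i = 0
    while i < len(s):
        if s[i:i + 2] in CONSONANTS:
            # Try the two-letter window first so digraphs take priority.
            out.append(CONSONANTS[s[i:i + 2]])
            i += 2
        elif s[i] in CONSONANTS:
            out.append(CONSONANTS[s[i]])
            i += 1
        elif s[i] in MARKS:
            out.append(MARKS[s[i]])
            i += 1
        elif s[i] in VOWELS or s[i] == " ":
            out.append(s[i])
            i += 1
        else:
            raise ValueError(f"unexpected character {s[i]!r}")
    return "".join(out)
```

For example, to_ipa("ye nihuāllâ") returns "je niwaːllaʔ", which matches answer (a) with the spaces between segments removed.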

In G2, we need to do the reverse, which is a bit more involved because currently the same IPA sound can rewrite to multiple 20th-century segments:

Glottalization /ʔ/ can be either circumflex â or grave accent à.
/w/ can be either "uh" or "hu".
/k/ can be either "c" or "qu".
/kʷ/ can be either "cu" or "uc".

For the reverse mapping to be deterministic, each alternative must be selected by its context, so we list every context in which each spelling occurs.

Glottalization /ʔ/:
- Grave: "a ì cu", "y à qu", "# ì cu"
- Circumflex: "qu ê #", "l â #"
/w/:
- "uh": "ā uh qu", "ī uh y"
- "hu": "i hu i", "o hu a", "l hu i", "i hu ā", "ī hu ā"
/k/:
- "c": "i c tl", "i c t", "i c qu", "i c n", "o c n", "ā c #"
- "qu": "uh qu e", "c qu e", "c qu ī", "à qu ê", "# qu i"
/kʷ/:
- "cu": "ō cu i", "ì cu i", "ì cu ā"
- "uc": "ē uc p"
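
The contexts listed above can also be collected mechanically. The sketch below is illustrative only: ALIGNED re-types the segmented 20th-century lines from the alignments and from G1, and the helper names (segments, contexts, mark_contexts, is_vowel) are hypothetical.

```python
import unicodedata

GRAVE, CIRCUMFLEX = "\u0300", "\u0302"

# Segmented 20th-century data from the alignments and G1 ("#" = word boundary).
ALIGNED = [
    "n i c tl ā uh qu e ch ō l i hu i m o l o hu a",
    "n i c t e ō cu i tl a ì cu i y a",
    "n i c qu e tz a l hu i x t oi l p ī z # i n # i c n ī uh y ō tl",
    "i n # t ē uc p a n # n i c qu ī x t ī z",
    "ō t i y à qu ê # y e # m i c tl ā n",
    "y e # n i hu ā l l â",
    "a n t o c n ī hu ā n i n",
    "qu i n # ì cu ā c",
]


def segments(line):
    # NFC keeps each accented vowel as one code point; pad with boundaries.
    return ["#"] + unicodedata.normalize("NFC", line).split() + ["#"]


def contexts(target):
    """Return the set of (previous, target, next) triples found in ALIGNED."""
    found = set()
    for line in ALIGNED:
        segs = segments(line)
        for prev, seg, nxt in zip(segs, segs[1:], segs[2:]):
            if seg == target:
                found.add((prev, seg, nxt))
    return found


def mark_contexts(mark):
    """Triples whose middle segment carries the given combining mark."""
    found = set()
    for line in ALIGNED:
        segs = segments(line)
        for prev, seg, nxt in zip(segs, segs[1:], segs[2:]):
            if mark in unicodedata.normalize("NFD", seg):
                found.add((prev, seg, nxt))
    return found


def is_vowel(seg):
    # Strip any diacritic, then test the base letter.
    return unicodedata.normalize("NFD", seg)[0] in set("aeio")
```

Calling sorted(contexts("hu")) or sorted(mark_contexts(GRAVE)) lists the triples for one spelling, and is_vowel can then be applied to the third element of each triple to see whether a vowel follows.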

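For glottalization, the circumflex appears only on a word-final vowel, while the grave accent appears everywhere else (word-initially or word-internally). For /w/ and /kʷ/, the choice depends on whether a vowel follows: "hu" and "cu" are used before a vowel (whether or not a vowel also precedes), and "uh" and "uc" otherwise. /k/ looks the same in this data ("qu" before a vowel, "c" before a consonant or word boundary), but every attested "qu" precedes e or i, and no /k/ occurs before a or o. The answer to (b) below takes the narrower reading, as in Spanish-based spelling: "qu" before e and i, "c" elsewhere.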

Therefore:

a. /w eː w eʔ/ → "hu ē hu ê"
b. /tʃ o k o l aː tɬ/ → "ch o c o l ā tl"
c. /m i k tɬ aː n t eː kʷ tɬ i/ → "m i c tl ā n t ē uc tl i"
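
The reverse direction can be sketched the same way. This is a hedged illustration rather than the original author's method: all names (tokenize, spell_word, to_orthography, FRONT, and so on) are hypothetical, the input is assumed to be unspaced IPA with spaces between words, and /k/ is spelled "qu" only before e and i, which is what every attested "qu" context shows and what answer (b) requires.

```python
import unicodedata

MACRON, GRAVE, CIRCUMFLEX = "\u0304", "\u0300", "\u0302"
VOWELS = set("aeio")
FRONT = set("ei")  # assumption: "qu" is used only before e and i
CLUSTERS = {"tɬ", "tʃ", "ts", "kʷ"}  # IPA consonants written with two characters
SIMPLE = {"tɬ": "tl", "tʃ": "ch", "ts": "tz", "l": "l", "m": "m", "n": "n",
          "p": "p", "t": "t", "ʃ": "x", "j": "y", "s": "z"}


def tokenize(word):
    """Split an IPA word into consonants and vowels (with any ː or ʔ attached)."""
    tokens, i = [], 0
    while i < len(word):
        if word[i:i + 2] in CLUSTERS:
            j = i + 2
        else:
            j = i + 1
            # Attach length and glottal-stop marks to the preceding vowel.
            while word[i] in VOWELS and j < len(word) and word[j] in "ːʔ":
                j += 1
        tokens.append(word[i:j])
        i = j
    return tokens


def spell_word(word):
    """Spell one IPA word, choosing each variant from the following sound."""
    tokens = tokenize(word)
    out = []
    for k, tok in enumerate(tokens):
        # First character of the next token, or "#" at the end of the word.
        nxt = tokens[k + 1][0] if k + 1 < len(tokens) else "#"
        if tok[0] in VOWELS:
            if tok[1:] == "":
                out.append(tok)
            elif tok[1:] == "ː":
                out.append(tok[0] + MACRON)
            elif tok[1:] == "ʔ":
                # Circumflex only at the end of the word, grave elsewhere.
                out.append(tok[0] + (CIRCUMFLEX if nxt == "#" else GRAVE))
            else:
                raise ValueError(f"unsupported vowel marking {tok!r}")
        elif tok == "w":
            out.append("hu" if nxt in VOWELS else "uh")
        elif tok == "kʷ":
            out.append("cu" if nxt in VOWELS else "uc")
        elif tok == "k":
            out.append("qu" if nxt in FRONT else "c")
        else:
            out.append(SIMPLE[tok])
    # NFC recombines each base letter with its combining mark.
    return unicodedata.normalize("NFC", "".join(out))


def to_orthography(text):
    return " ".join(spell_word(w) for w in text.split())
```

With these rules, to_orthography("tʃokolaːtɬ") returns "chocolātl" and to_orthography("weːweʔ") returns "huēhuê", in line with answers (b) and (a).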

Finally, for G3, we have been doing this all along: the 20th-century system and IPA can each be rewritten deterministically into the other, so the mapping between them is a bijection. The 16th-century system drops the diacritics, so it is a lossy translation and cannot be deterministically rewritten back.
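
The loss of information can be illustrated with a short sketch. It assumes, following the first observation above, that the 16th-century spelling is simply the 20th-century spelling with its diacritics removed; the function name to_16th_century is hypothetical.

```python
import unicodedata


def to_16th_century(text):
    """Drop every combining mark, keeping only the base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Here "ā", "à", and "a" all come out as "a", so vowel length and glottal stops cannot be recovered from the 16th-century spelling alone.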