NACLO 2022 - Problem A: Lines in the Sand

I have to show this off. There turned out to be no Unicode characters for Avoiuli, so I had to make a custom font:

sab senta blong melenisian institiut blong tijim saen filosofi hiumaniti mo teknoloji lisa vilij lolovini

You can even copy out the text to see the full transcription!

As always, we need to answer two preliminary questions: is the script alphabetic or syllabic? Is it written left-to-right or right-to-left? Unlike many other writing system problems, we aren't given all the words present in the text, nor are we given any correspondences. We only know three words in the Latin alphabet: "filosofi", "institiut", "blong". This is enough to start: in "filosofi", the first and last syllables are the same, so whether the script is syllabic or alphabetic, the word must begin and end with the same glyph or glyph sequence. We find 9. filosofi, which starts and ends with fi, so this sequence means "fi". Since f never appears elsewhere while i appears in many other places, they stand for "f" and "i" respectively. Furthermore, this string has 8 glyphs, so this script is more or less alphabetic (unless vowels can be elided in certain contexts), and left-to-right. We have the following character matches: f = f, i = i, l = l, o = o, s = s.
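The search for "filosofi" can be sketched as matching repetition patterns. This is only a rough illustration: the glyph IDs (`g1`, `g2`, ...) and the tiny word list are made-up stand-ins, since the real glyphs have no Unicode code points, and `pattern` and `candidates` are names I invented for this sketch.

```python
def pattern(seq):
    """Replace each symbol by the order in which it first appeared."""
    first_seen = {}
    return tuple(first_seen.setdefault(s, len(first_seen)) for s in seq)


def candidates(target, glyph_words):
    """Glyph words whose repetition pattern fits the target in either direction."""
    wanted = {pattern(target), pattern(target[::-1])}
    return [w for w in glyph_words if pattern(w) in wanted]


# Hypothetical glyph IDs, not the actual Avoiuli glyphs from the problem.
glyph_words = [
    ("g1", "g2", "g3", "g4", "g5", "g4", "g1", "g2"),  # same shape as "filosofi"
    ("g6", "g3", "g4", "g7"),                          # no repeated glyphs
]

matches = candidates("filosofi", glyph_words)
print(matches)
```

Checking both directions up front is a design choice for the sketch: at this stage we don't yet know which way the script runs.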

Next, for "institiut", we search for a word with i in the 1st, 5th, and 7th positions. There are only two words that contain three i's: 4. institiut and 12. hiumaniti. Both have 9 glyphs, but neither has its i's in the positions "institiut" requires when read left to right. However, 4 also contains three t's, which we also need, and an s. So there's overwhelming evidence that 4 is "institiut". But when we match all the "i"s and "t"s, we get "T_ITITS_I" instead of "I_STITI_T". This means the language is... right to left?
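The mismatch is easy to reproduce by masking everything except the letters decoded so far. A minimal sketch, using Latin letters as stand-ins for the glyphs (`mask` and `positions` are helper names made up for this illustration):

```python
def mask(word, known):
    """Show known letters in upper case and everything else as '_'."""
    return "".join(ch.upper() if ch in known else "_" for ch in word)


def positions(word, letter):
    """1-based positions of a letter in a word."""
    return [i for i, ch in enumerate(word, start=1) if ch == letter]


known = "its"  # letters identified so far that occur in this word

expected = mask("institiut", known)        # what we hoped to see
observed = mask("institiut"[::-1], known)  # what a reversed word looks like

print(expected, observed)
print(positions("institiut", "i"), positions("hiumaniti", "i"))
```

The reversed mask is the one that shows up in the text, which is what suggests the word was written right to left.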

The revelation comes from noticing that the i on line 3 looks different from the one on line 4. More precisely: they are mirror images of each other. This happens for other glyphs as well, such as l and l. This means the script is actually written in alternating directions: left-to-right, then right-to-left, then left-to-right, and so on. This is known as boustrophedon, and is common in ancient writing because it minimizes the amount of hand movement needed when writing, important when writing on stone, clay, or here, sand. So, if this is the correct analysis, then we just need to read the second and fourth lines right-to-left. We get the matchings: n = n, t = t, u = u.
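Undoing the boustrophedon amounts to reversing every second line. Here is a small sketch under some loud assumptions: Latin letters stand in for glyphs, the mirroring of individual glyphs is ignored, and the rows are my reconstruction of how the text would look on the page, with lines 2 and 4 stored in visual order. `read_boustrophedon` is a name invented for this example.

```python
def read_boustrophedon(rows):
    """Reverse every second row (2nd, 4th, ...) so all rows read left-to-right."""
    return [row if i % 2 == 0 else row[::-1] for i, row in enumerate(rows)]


# Rows in visual order; rows 2 and 4 appear reversed on the page.
rows = [
    "sab senta blong",
    "tuititsni naisinelem",
    "blong tijim saen filosofi",
    "ijolonket om itinamuih",
    "lisa vilij lolovini",
]

text = " ".join(read_boustrophedon(rows))
print(text)
```

Note that reversing character by character only works here because no reversed row contains "ng"; in the script that is a single glyph, so a real decoder would reverse glyphs rather than Latin letters.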

Knowing what "l", "o", and "n" are, we can now search for "blong". Unfortunately, there's no word containing the sequence lon or lon. However, we do find two cases of blong which contain lo and are short enough, so b = b, ng = ng (a bigram).

Now we can plug in all the letters we know:

sa b se nta blong

mele nisia n institiut

blong tij im sae n filosofi

hiuma niti m o tek noloj i

lisa v ilij lolov ini

Since this is a creole language, and also because we aren't given any other means of translation anyway, the words in A2 are likely to be similar in Bislama as well. For example, for "humanity" and "technology", we see two long words: 12 hiumaniti (decoded as hiumaniti) and 10 teknoloji (decoded as teknoloji). So undoubtedly, h = h, m = m, a = a, e = e. We just have two more letters in "te_nolo_i", which are probably k = k and j = j.
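Filling in the last gaps is a matter of checking a guessed spelling against the partially decoded word. A rough sketch, again with Latin letters standing in for glyphs and "_" marking an undecoded glyph (`fits` and `fill_ins` are hypothetical helper names):

```python
import re


def fits(partial, guess):
    """True if the guess agrees with every decoded letter of the partial word."""
    regex = "".join("." if ch == "_" else re.escape(ch) for ch in partial)
    return re.fullmatch(regex, guess) is not None


def fill_ins(partial, guess):
    """Letters the guess assigns to the undecoded positions, in order."""
    return [g for p, g in zip(partial, guess) if p == "_"]


partial = "te_nolo_i"
guess = "teknoloji"  # guessed Bislama spelling of "technology"

if fits(partial, guess):
    print(fill_ins(partial, guess))
```

A guess that conflicts with any decoded letter, such as "hiumaniti" here, is rejected outright.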

At this point most of the text is decoded, so we can just match the remaining words by sound. 2. senta is decoded to "senta", which looks like "center". 7. tijim is decoded to "tijim", which looks like "teaching". In 14. vilij, we don't know what v is, but by process of elimination this has to be "vilij" for "village".

Here's the full transcription:

sab senta blong melenisian institiut blong tijim saen filosofi hiumaniti mo teknoloji lisa vilij lolovini

To find the Pacific area, refer to the map and recognize "melenisian" as "Melanesia".