Open round | 20 points | 50.00% | Problem statement | Official solution | Tags: Computational
We need to reverse-engineer the algorithm of each computer.
Now for C2. Here are a few hypotheses that don't work:
Our biggest problem is why it accepts "The linguist visited the spy." but rejects "The woodcarver visited the programmer.", when none of the words are unknown and both structures are highly regular. If it's not the word and not the structure, then it must be how the words are placed into the structure. Indeed, for a pattern like "The X Y the Z.", "linguist" has appeared as "X" (once), "visited" has appeared as "Y" (7 times), and "spy" has appeared as "Z" (twice), but "woodcarver" has never appeared as "X". Therefore, C2 takes distribution into account. To verify this hypothesis, we must count the distribution of each word in the training set.

Note that it also accepts "The main concierge saw the blacksmith.", which has an extra adjective, so the distribution is not about a word's absolute index in the sentence but about its position relative to landmarks. Let's suppose that C2 learned the following templates (I'm assuming it learned that the same categories B, C, and D can be reused across templates, instead of learning one category per template; this turned out to be a correct, or at least harmless, assumption):
| A | B | C | D | E |
|---|---|---|---|---|
| cheerful famous happy main talented | ballerina calligrapher cartoonist concierge detective haberdasher linguist programmer spy watchmaker | met saw visited | astronaut ballerina blacksmith linguist programmer spy woodcarver yodeler | asleep famous happy tall knowledgeable |
Indeed, we can verify that all sentences accepted by C2 follow one of these templates, and all sentences rejected by C2 violate these templates. Therefore:
31. "tall" not in A; (a) = U
32. Follows template 2; (b) = G
33. Doesn't follow a template; (c) = U
34. "yodeler" not in B; (d) = U
35. Follows template 1; (e) = G
36. Follows template 2; (g) = G
37. "talented" not in E; (j) = U
38. "cartoonist" not in D; (m) = U
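The checks above can be sketched mechanically. The category table is copied from above; the template shape encoded here, "The (A) B C the D.", is my own reconstruction from the accepted examples ("The linguist visited the spy.", "The main concierge saw the blacksmith."), so treat it as a placeholder for whatever C2 actually learned rather than the real grammar.

```python
# Category table from the text; TEMPLATE_1 is a hypothetical
# reconstruction of the attributive frame "The (A) B C the D."
CATS = {
    "A": {"cheerful", "famous", "happy", "main", "talented"},
    "B": {"ballerina", "calligrapher", "cartoonist", "concierge",
          "detective", "haberdasher", "linguist", "programmer",
          "spy", "watchmaker"},
    "C": {"met", "saw", "visited"},
    "D": {"astronaut", "ballerina", "blacksmith", "linguist",
          "programmer", "spy", "woodcarver", "yodeler"},
    "E": {"asleep", "famous", "happy", "tall", "knowledgeable"},
}

def matches(sentence, slots):
    """Check a sentence against a template given as a list of slots.
    A slot is a literal word, a category letter, or "A?" for optional."""
    words = sentence.rstrip(".").lower().split()

    def go(i, j):
        if j == len(slots):
            return i == len(words)
        slot = slots[j]
        optional = slot.endswith("?")
        name = slot.rstrip("?")
        ok = i < len(words) and (
            words[i] in CATS[name] if name in CATS else words[i] == name
        )
        if ok and go(i + 1, j + 1):
            return True
        return optional and go(i, j + 1)

    return go(0, 0)

TEMPLATE_1 = ["the", "A?", "B", "C", "the", "D"]
print(matches("The linguist visited the spy.", TEMPLATE_1))          # True
print(matches("The main concierge saw the blacksmith.", TEMPLATE_1)) # True
print(matches("The woodcarver visited the programmer.", TEMPLATE_1)) # False
```

The last rejection falls out for the reason given above: "woodcarver" is in D but not in B.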
In D2, "asleep" only appeared in position E; "happy" appeared in both B and E; "main" only appeared in position B. C2 judged both 39 and 40 G, so HIDDEN_WORD_1 = happy. It judged 41 U, which means HIDDEN_WORD_2 cannot occur in position E, so HIDDEN_WORD_2 = main. It judged 44 U, which means HIDDEN_WORD_3 cannot occur in position B, so HIDDEN_WORD_3 = asleep.
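This deduction can be run mechanically. The attested positions come from the text; the per-item encoding of C2's verdicts (which slot each test item put the hidden word in) is my own reconstruction, so the item groupings are assumptions.

```python
# Attested positions of each candidate word, from the distribution above.
positions = {"asleep": {"E"}, "happy": {"B", "E"}, "main": {"B"}}

def consistent(word, verdicts):
    # A word should be judged G in a slot iff it was attested there.
    return all((slot in positions[word]) == (v == "G")
               for slot, v in verdicts.items())

# Hypothetical encoding: items 39/40 place the word in B and E (both G),
# item 41 places it in E (U), item 44 places it in B (U).
for label, verdicts in [("HIDDEN_WORD_1", {"B": "G", "E": "G"}),
                        ("HIDDEN_WORD_2", {"E": "U"}),
                        ("HIDDEN_WORD_3", {"B": "U"})]:
    print(label, "=", [w for w in positions if consistent(w, verdicts)])
```

Only one candidate survives for each hidden word, matching the answers above.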
For me, "happy" is definitely both attributive and predicative; "main" is mostly only attributive (I can think of very limited contexts in which "he is main" is legal, usually as slang); "asleep" is mostly only predicative (the attributive version is "sleeping"). So I give the same judgments as C2 and differ from C3 on two.
For D4, here are the ciphered texts and their contexts:
The problem setup says that all these words have something to do with morphology. Therefore "QGTRHUU" has something to do with "small"; naturally the "TRHUU" part stands for "small", making "QG" stand for "en" (i.e., make), which is consistent with the later mention of "enlarge".[^1] Now we can just substitute these deciphered letters into the other words and keep guessing as more of each word is revealed:
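The mechanical part of this guessing loop is a partial substitution. A minimal sketch using only the mappings deduced so far (TRHUU = "small", QG = "en"); undeciphered letters stay uppercase so the gaps remain visible:

```python
# Letter mappings deduced above: TRHUU -> "small", QG -> "en".
key = {"T": "s", "R": "m", "H": "a", "U": "l", "Q": "e", "G": "n"}

def decipher(word):
    # Substitute known letters; leave unknown cipher letters as-is.
    return "".join(key.get(ch, ch) for ch in word)

print(decipher("QGTRHUU"))  # ensmall
```

Running every remaining ciphertext through `decipher` after each new guess is exactly the "keep guessing as more is revealed" step.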
[^1]: Funny aside: in my CS data structures class, the functions to shrink/grow a hash map are actually called "ensmallen"/"embiggen". I can never get over the double causative morphology.