Language Creation Workshop

There was a three-hour Language Creation Workshop at Renovation, this year’s Worldcon. That really wasn’t enough time to generate much of a language, even if we’d used a streamlined procedure. We were actually given a seven-stage process which included things like vowel shifts and borrowing from other languages. It was all very cool and educational, and would have made for a more interesting language in the end, but we only had time to complete three steps.

My main contribution was a system of pronouns. Nobody in the group wanted gender-based pronouns like “he” and “she”, and I think we were about to settle on one pronoun for people and another for everything else, when I suggested distinguishing between animate objects (like people or animals), things that change slowly (like an apple or snow), and things that are permanent (like a mountain). Our language ended up with five singular and five plural pronouns: nu (I), su (you), and three third-person pronouns, pe, te, and ke (pronounced pay, tay, kay). We also decided that our language formed plurals by adding -ish to words, so our plural pronouns were nish, sish, pish, tish, and kish.
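For anyone who thinks in code, the pronoun system can be sketched as a small lookup table. This is only an illustration: the glosses paraphrase the description above, and the rule that the final vowel drops before -ish is an assumption inferred from the listed plural forms, not something the workshop spelled out.

```python
# Singular pronouns from the workshop; the glosses paraphrase the post.
PRONOUNS = {
    "nu": "I",
    "su": "you",
    "pe": "third person, animate (people, animals)",
    "te": "third person, slowly changing (an apple, snow)",
    "ke": "third person, permanent (a mountain)",
}


def pluralize_pronoun(pronoun: str) -> str:
    """Assumed rule: drop the final vowel, then add -ish."""
    return pronoun[:-1] + "ish"


# Build the singular-to-plural table in the same order as PRONOUNS.
plurals = {singular: pluralize_pronoun(singular) for singular in PRONOUNS}

for singular, plural in plurals.items():
    print(f"{singular} -> {plural}  ({PRONOUNS[singular]})")
```

How -ish attaches to words that end in a consonant is not covered by this sketch, since the post gives only the pronoun forms.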

I expected to leave it at that, but in the small hours of the morning I found myself refining the system. I added pu, a definite third-person pronoun, which refers to a specific person. (So “I saw a hunter by the creek. He was carrying a rabbit.” would use pe. “I saw Sylvia hunting by the creek. She was carrying a rabbit.” would use pu.) That got me thinking about whether tu and ku ought to be words. I didn’t see any use for tu, except maybe for nature spirits or some deities, but it struck me that dead people would be referred to with ke, so ku would be a pronoun that could logically refer to ghosts. After a while, I also came up with the idea that anything that was cyclical would use te, so tu would be the definite pronoun for a person carrying out a repetitive task.

I thought that it would be nice to have a mirror-image pair of words for birth and death, deriving from the concepts of te-to-pe (growing womb to person) and pe-to-te (person to corpse). When I looked at my notes I saw that we’d already agreed on pjet as the word for birth, so, drawing on that, I added a short e to our phonology and came up with pjet or péjet as the word for birth, téjep as a verb meaning an ordinary or gradual death, kép as a verb meaning to die instantly, and téjet or tjet as the word for cycle or repetition.

I thought it would be nice if the same word could be used for “die” or “kill”, so I decided on a basic word order of actor-verb-thing being acted on. So “Kép nu” would be “I die”, and “Nu kép” would be “I kill”.

Next, I asked myself how I’d make sentences like “I give you the book” or “I make a basket” (from bark, say). I decided that what was being acted on was the book and the bark: the book was being made your possession, the bark was being transformed into a basket. So what I needed was some sort of marker to turn “yours” and “basket” into verbs. I decided on the suffix -ta. So: “Nu yours-ta book” and “Nu basket-ta bark” (or just, “Nu basket-ta”). (I don’t have words for “yours”, “basket”, and “bark”, but I’m just coming up with grammar here.)

I even decided that -ta could be attached to verbs: “I run” would be “Run nu”, but “I make myself run” would be “Nu run-ta nu”, and “I make you run” would be “Nu run-ta su”.
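The word-order rule and the -ta suffix are regular enough to sketch as two tiny functions. This is a toy illustration, not a real grammar engine: the function names `sentence` and `causative` are made up for this example, and English stand-ins like "run" and "bark" are used wherever the language has no word yet.

```python
from typing import Optional


def causative(word: str) -> str:
    """Attach the -ta marker, turning a word into a 'make it so' verb."""
    return f"{word}-ta"


def sentence(verb: str, actor: Optional[str] = None,
             patient: Optional[str] = None) -> str:
    """Join whichever roles are present in actor, verb, acted-on order."""
    words = [w for w in (actor, verb, patient) if w]
    text = " ".join(words)
    # Capitalize only the first letter, leaving the rest untouched.
    return text[:1].upper() + text[1:]


print(sentence("kép", patient="nu"))                             # "I die"
print(sentence("kép", actor="nu"))                               # "I kill"
print(sentence(causative("run"), actor="nu", patient="nu"))      # "I make myself run"
print(sentence(causative("run"), actor="nu", patient="su"))      # "I make you run"
print(sentence(causative("basket"), actor="nu", patient="bark")) # "I make a basket from bark"
```

The same verb serves for "die" and "kill" because only the position of nu changes, which is the point of fixing the word order.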

If I’d continued, I would have done prepositions next, and I was thinking about making a system where you might say “from the ground through me” for “above me” or “toward the lake from you” for “to your left” (if the lake happened to be on your left), but thankfully I had no more sleepless nights, so I haven’t done any more work on the language. I have enough commitments already. But I can easily see how this could become addictive.

My Precious

A seven-volume dictionary, A Magyar Nyelv Értelmező Szótára, just arrived from Hungary. 23 pounds, 1.4 ounces, 7362 pages.

I decided to test-drive it on a sentence from Ida, the book I used in my Translation Exercise #1: “Egy pohos úr a Kávékirályban göcögve nevette.” Neither “pohos” nor “göcög” is in my iPad dictionary, but my big Hungarian-English dictionary defines “pohos” as “pot/big-bellied, paunchy”. Even my best dictionary didn’t have “göcög”, but Googling turned up an 1897 dictionary which defined it as “magába fojtva nevet”, and my iPad dictionary does have “fojt”, which it defines as “choke, stifle, suffocate”, and “magába fojtja érzelmeit”, “repress/suppress one’s feelings, bottle up one’s feelings”, so I was fairly comfortable concluding that “göcögve nevette” could be translated as “stifled laughter”.

The new dictionary basically agrees on “pohos”, with the note that the word is “kissé rosszalló v. gúny” (slightly derogatory or derisive). For “göcög”, though, the definition is “(kisgyermek, kövér ember) jóízűen kacag, hogy a teste is rázkodik; döcög (5)”, with the sentence I quoted from Ida used as an example. This is exactly the opposite of what I thought it meant: “(small children, obese person) laugh heartily, so the body also shakes”. To be safe, I also looked up the fifth definition of “döcög”, which reads in part “teste rázkodik a nevetéstől”, “the body shaking from laughter”, and also “el-elfulladva, szakadozottan beszél”, which basically means that you’re laughing so hard you can’t speak or breathe.

Presumably, the definition which led me astray should have been interpreted as “laughing so hard you choke” (i.e., can’t breathe). The more complete definition is harder to misinterpret, so I think I can safely conclude that this purchase was well worth the money.

Oh, and the translation of that sentence, in context:

A stout gentleman in the Coffee King laughed heartily, his enormous belly rippling:

“The devil to these newspaper writers!” he said, throwing down the paper. “What great villains!”

The Problem With Flashcards

Estonian has no word for “bread”. This can pose a problem for the unwary translator. In particular, it can pose a problem for the makers of a flashcard program I bought, who translated the word “bread” as “leib” and illustrated it with a photograph of a loaf of sliced white bread.

“Leib” is the Estonian word for black bread. The Estonian word for white bread is “sai”. But if you look up “bread” in a typical English-Estonian dictionary, it may not explain that distinction. When an Estonian sees the English word “bread”, he thinks of leib, because that’s what Estonians eat. He knows, intellectually, that the word also encompasses sai, but the concept of bread as an American understands it, of leib-and-sai, doesn’t come to mind as readily or as naturally to an Estonian. So the poor maker of the flashcard program, who is after all just adapting the same English words and photos to dozens of different languages, looks up “bread” in the dictionary and gets the wrong answer. Or maybe he gives an Estonian consultant a list of words to be translated, but without photos attached, and gets the wrong answer. And even if he got the right answer, he wouldn’t be able to convey that right answer on a flashcard.

Conceptually, flashcards are based on the presumption that a word in one language has a unique translation into another language. That presumption is a useful approximation, in the same way that a spherical point mass is a useful approximation of a cow. And understanding why that presumption often fails illuminates why good machine translation is so difficult, and so far off. After all, simply to translate a sentence containing the word “bread” into Estonian requires information that may not be in the text—anywhere.
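The flashcard presumption can be put in data-structure terms. The sketch below is purely illustrative: the table contains only the one entry discussed here, and the names `FLASHCARD`, `SENSES`, and `translate` are invented for the example rather than taken from any real program.

```python
from typing import Optional

# What a flashcard assumes: exactly one translation per word.
FLASHCARD = {"bread": "leib"}

# What the bread example actually needs: translations keyed by sense.
SENSES = {
    "bread": {"black bread": "leib", "white bread": "sai"},
}


def translate(word: str, sense: Optional[str] = None) -> Optional[str]:
    """Return a translation only when the sense is known or unambiguous."""
    candidates = SENSES[word]
    if sense is not None:
        return candidates.get(sense)
    unique = set(candidates.values())
    # Without a sense, answer only if every sense agrees on one word.
    return unique.pop() if len(unique) == 1 else None


print(FLASHCARD["bread"])                  # always "leib", right or wrong
print(translate("bread"))                  # None: the word alone is ambiguous
print(translate("bread", "white bread"))   # "sai"
```

The second table cannot be queried without a sense, and the sense is exactly the information that may not be in the text.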