October 29, 2007

Different directions

One of the things I like learning is synonyms, which are plentiful in Thai, what with the combination of different speech registers and slews of Indic loanwords. I'm often surprised to make connections with words I've heard but didn't know the meaning of.

Case in point: the Indic-derived words for the compass directions:
North: อุดร (as in the name of the province, อุดรธานี, "northern city")
South: ทักษิณ (the name of a certain Prime Minister; also, the express train south is called the "Thaksin Express")
East: บูรพา (the author กุหลาบ สายประดิษฐ์ wrote under the pseudonym ศรีบูรพา, roughly, "the glorious east")
West: ปัจฉิม (most commonly seen in the phrase ปัจฉิมลิขิต or ป.ล., the Thai version of "p.s."--บูรพา and ปัจฉิม have alternate meanings of "front" and "back", respectively)
Northeast: อีสาน (the northeastern region of Thailand!)
Northwest: พายัพ (there's a Payap University in Chiangmai)
Southeast: อาคเนย์ (seen in the phrase เอเชียอาคเนย์ "Southeast Asia", or in the สนธิ-compound of the same meaning, อุษาคเนย์)
Southwest: หรดี (haven't seen this one around much, frankly)

October 28, 2007

Etymologist 11: ปฏิ- words and language modernization

Thai, like many languages, has been undergoing a conscientious process of modernization for several decades. This means there are people who intentionally coin new Thai words to correspond to words in English or other languages. The work is never really through, however, as new words are always coming up that need Thai equivalents. One option is to simply borrow the foreign word directly (such words are known as ทับศัพท์), but some view this as a corruption of the language. In Thailand, the many committees of the Royal Institute do most of this work. They give us words like โลกาภิวัตน์ for 'globalization', and คณิตกรณ์ for 'computer'. Sometimes they catch on (as with วัฒนธรรม 'culture' or นโยบาย 'policy'). Other times they fail miserably (คณิตกรณ์ 'computer' is a great example of failure--คอมพิวเตอร์, and its abbreviation คอม, are ubiquitous).

One possible strategy for simplifying this never-ending process is to create a set of tools in Thai to correspond to English. That is, to systematically use one Thai morpheme* to correspond to a given English morpheme. This sounds great in theory, but it's difficult in practice, because any given morpheme can have any number of meanings. Thus, this has only been done haphazardly in Thai.

Consider the example of ปฏิ-. It's a prefix borrowed from Indic, meaning (according to RID99) ตอบ, ทวน, กลับ. In several coinages, ปฏิ- corresponds to the prefix re- in English:

ปฏิรูป = reform
ปฏิวัติ = revolt
ปฏิกิริยา = reaction
ปฏิกรณ์ = reactor (e.g. nuclear)

These are calques (a.k.a. "loan translations") from English:
ปฏิ + รูป = re + form
ปฏิ + วัติ = re + volt (meaning 'turn back', related to revolve)
ปฏิ + กิริยา = re + action
ปฏิ + กรณ์ = re + actor/agent

Here are a few less common ones, which are looser calques:
ปฏิกรรมสงคราม = reparations
ปฏิสังขรณ์ = restore/renovate (a building)

There's also the common word ปฏิเสธ "reject", but it isn't a recent coinage, rather an existing word expanded to include the meaning "reject". It's a long-standing term for the "negative" mood in grammar (i.e. the opposite of "affirmative").

Beyond that, there are a number of relatively common Thai words with this prefix that don't correspond to re- words in English:

ปฏิทิน = calendar
ปฏิบัติ = to carry out, to put into practice
ปฏิญาณ = to vow, swear an oath

Given the success of neologisms like ปฏิรูป and ปฏิวัติ, you'd think it would be easy to expand ปฏิ- to other re- words like recycle, renew, etc. The problem you run into is that re- has a related meaning which doesn't quite match ปฏิ-: "again". In fact, this is probably the primary meaning of re- in most English words it's found in. So while ปฏิ- fits some words, it doesn't quite fit others.

But what can you expect? Systematically capturing the nuances of one language in another would require a pretty massive restructuring, for two languages as different as English and Thai. And for a language like Thai, language modernization has always been a tightrope walk across the fine line between keeping up with the pace of the world and sacrificing what Thais view as the most significant aspect of their society and culture: their language.

* A morpheme is the smallest meaningful language unit. For example, the word bicycles has three morphemes: bi-, cycle, and -s. This tells you it's a plurality of two-wheeled things.

October 27, 2007

Royal Institute Dictionary of New Words

Back in June of this year, the Royal Institute held a press conference on the topic การจัดทำพจนานุกรมคำใหม่ "Making a Dictionary of New Words" (the page about it on the RI site is here, the PDF press release is here). This was widely covered in the Thai press, but also widely misunderstood.

The main confusion appears to surround the Thai term ศัพท์บัญญัติ, and specifically the word บัญญัติ. The term is frequently translated as "coin", in the sense of "coining a word", but in English, to coin a word simply means to come up with it, while บัญญัติ carries an air of authority. More broadly applied, as a verb it means to prescribe or legislate, and as a noun, law or regulation. So ศัพท์บัญญัติ literally means something like prescribed words. Usually these take the form of technical vocabulary, coined into existence to match some equivalent term in English or another foreign language. This type of coining/prescribing is necessary in the process of modernizing a language like Thai.

Let me switch tracks for a second. Notice that in English (or at least American English as I'm familiar with it), we say "the dictionary". This use of the definite article implies that there is some sort of authoritative dictionary out there, the official arbiter of all things linguistically correct. I regularly hear the sentiment "that's not a real word". And who can forget that childhood taunt, "ain't ain't a word 'cause it ain't in the dictionary"? Popular opinion aside, though, English is notable because it doesn't have such a dictionary, nor does it have any organization empowered to regulate "standard" or "proper" use of the language.

Thus, for English, those who would seek to prescribe proper usage must assume the authority themselves. English dictionaries of the last century or more have not usually attempted to do this, though many still read them as if they did. Early notable lexicographers of English, namely Samuel Johnson and Noah Webster, included some element of prescriptivism in their dictionaries, but Webster was notable for bucking British prescriptive trends and helping to define American English as a separate linguistic entity, canonizing many spellings and pronunciations we still use today. More recently, lexicographers like James Murray (Oxford English Dictionary) and Philip Gove (Webster's Third) saw their tasks as being to describe the language as it was and is used, and employ usage notes to differentiate standard from non-standard. Still, it is the everyday users of the language who clamor for a linguistic king--demanding a final answer on things like who vs. whom (which provided great fodder for an excellent scene in a recent episode of The Office), whether ain't is "really a word", or how to "correctly" spell such-and-such word. While I believe that a standard language is important, and you will be judged socially and professionally by your ability to use the standard language conventionally, you, dear reader, may be picking up what I'm putting down: I don't much care for arbitrary prescriptivism.

Back to Thai. Unlike English, the Thai language has just such an organization. It is the Royal Institute. And surrounding the issue of this "new words" dictionary, I'm seeing confusion among the Thai people in the opposite direction from English. That is, since the Royal Institute exists, and is formally endowed with the power to prescribe proper, standard language use, people think that anything that comes from them is being prescribed. With respect to the forthcoming พจนานุกรมคำใหม่, this means that people are misunderstanding the publication of this volume as an endorsement of the words contained therein. And given the RI's past tendency to keep slang at arm's length, it's understandable. The inclusion of slang was one of the reasons why the Matichon Dictionary (พจนานุกรม ฉบับมติชน) was such big news upon its release.

There is a great mirror for this sort of controversy in English, actually. In 1961, Merriam-Webster published Webster's Third New International Dictionary, Unabridged (commonly known as Webster's Third or W3), which was met by a fair amount of criticism at its inclusion of--you guessed it--ain't, among other words. The funny thing was, ain't had been in dictionaries for a long time, but Webster's Third became a whipping boy for dictionary "permissiveness". (Read more on Wikipedia. This is also the subject of an excellent book.)

Things haven't gotten quite so far in Thailand (though the Royal Institute has taken some beatings over the years), and I doubt things will, unless the Institute announces plans to incorporate the contents of the "new words" dictionary into the next edition of RID (very unlikely). Nonetheless, some folks seem to be responding with displeasure that such an august institution as the RI would even bother with what they see as teenage language abominations. The Institute is quite clear on this though: this is not a collection of ศัพท์บัญญัติ, but rather a compilation of new words currently in use, ostensibly for the benefit of later generations, who may be confused by the wacky slang of today's popular media. (I think, in some ways, they're referring to themselves, being mostly elderly folks.)

Consider a couple of quotes from the web board of the Royal Institute website (my translations):
"อยากทราบว่าในพจนานุกรมฉบับใหม่ที่กำลังถกเถียงกันอยู่ตอนนี้ มีคำอะไรบ้างคะที่จะบัญญัติลง..."
(I'd like to know what words are prescribed in the new dictionary being argued over right now...)
"ต้องการทราบเรื่อง การบัญญัติคำศัพท์ใหม่ ที่ออกรายการช่อง 3..."
(I want to know about the prescribed new words that were reported about on channel 3...)

If this is what people took away from media reports on the June press conference, it seems that the press missed the message on this one. Now, there are those on the message board who are trying to point out that these aren't ศัพท์บัญญัติ, but I wouldn't be surprised if many folks are still under the misconception.

In late September, the Royal Institute had another press conference to show off the newly printed book (on the left in the photo at right). The formal premiere of the book was this past Monday, October 22nd, at the national book fair. It's available exclusively there until the fair ends (tomorrow).

In all, I think this is a good move on the part of the Royal Institute. I think it's a bit silly and unnecessary that they have to go through a bunch of logical gymnastics to justify the new dictionary, but I'm glad they did it. While this idea has basically been done before in the form of Matichon's พจนานุกรมนอกราชบัณฑิตยฯ (Dictionary of Words Not in RID, the out-of-print predecessor to Matichon's full dictionary), this will still be a legitimately valuable reference work for not just future generations of Thais (one of the Royal Institute's justifications for the book), but for all the Thai language learners who've ever worn a bald spot into their scalp scratching their heads over the seemingly bottomless fount of new Thai words. I hope เล่ม ๒ comes sooner rather than later.

[Note: I wrote this without having gotten my hands on the book. I bought a copy tonight. I'll post my thoughts soon.]

October 26, 2007

Classifiers! Get your classifiers!

The Royal Institute, your friendly neighborhood language regulating organization, is all about the dissemination of knowledge. Their website is, frankly, not very good (too much style over functionality--their fancy menus don't display correctly in any browser but IE, blech), but I'll be darned if it isn't packed with goodies. Difficult to access, often half-baked goodies, but goodies nonetheless.

Case in point: classifiers. In Thai we know them as ลักษณนาม [ลัก-ษะ-ณะ-นาม, in case the four consonants in a row were throwing you]. As the Thai word indicates, they are a type of นาม (noun, though this word also means name), which tells you a characteristic (ลักษณะ). So, ลักษณะ + นาม (the vowel disappears because of สมาส combining rules).

Now, classifiers can be quite a chore to remember. I mean, how often do you use กระบอก, the classifier for gun? Or remember that a pair of pants is only a ตัว, but a pair of socks is a คู่? And many a learner forgets, in the heat of the conversation, that a pencil is a แท่ง but a pen is a ด้าม.

There is some research that suggests that native speakers don't employ nearly so complex a set of classifiers as the textbooks would lead you to believe, and use a lot of generic and repeater classifiers (nouns which can be used as their own classifiers, like ร้านร้านหนึ่ง "a store"). Interestingly, it also shows that younger speakers are better at using the Royal Institute's prescribed classifiers (presumably because, statistically, younger Thais are better educated than their elders). In my uninformed opinion, it's a matter of situation. In a more formal situation, or especially in writing, it's more important to know the standard. In informal conversation, it's easy to get by using a lot of อัน and ตัว.

Which all brings me to this link. It is the full text of the Royal Institute's useful (if unimaginatively titled) booklet, ลักษณนาม ฉบับราชบัณฑิตยสถาน (roughly, The Royal Institute Book of Classifiers). It's the text of the sixth printing, to be precise. Kudos to them for putting it online. It is organized alphabetically by the word you want the classifier of. In other words, look for นาฬิกา to find เรือน. The most glaring error is the lack of searchability, followed closely by the decision to arbitrarily number the 22 pages of text, sometimes with multiple consonants-worth of classifiers on one numbered page. Even when you know the letter of the alphabet, you still have to hunt around to find exactly which page it's on. The next obvious shortcoming is the inability to see what words share a certain classifier. For example, to be able to look up in one shot all the words the Royal Institute says it's okay to use ตัว as a classifier for.

In an effort to remedy this, and completely without the Royal Institute's knowledge or permission, I've taken their text and put it in a Google Spreadsheet, viewable here. I've divided it up into different "sheets" by letter. This doesn't solve all of the problems, but it's a start. This should serve to make the Royal Institute's data more valuable to language learners like you and me. (Feel free to contact me for the XLS or DOC or TXT files I made from the RI data.)
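For the "which words share a classifier" problem, here is a minimal sketch in Python of how a noun-to-classifier list could be flipped around. It is only an illustration: the handful of pairs below come from the examples in this post (the Thai nouns for gun, pants, socks, pencil and pen are my own glosses, and the shirt entry is one I added), not from the Royal Institute's actual data, and the names NOUN_TO_CLASSIFIER and invert are made up for the example.

```python
from collections import defaultdict

# Illustrative noun -> classifier pairs, mostly from the examples in this post.
# The real booklet has far more entries; this is not the Royal Institute's data.
NOUN_TO_CLASSIFIER = {
    "นาฬิกา": "เรือน",   # clock / watch
    "ปืน": "กระบอก",     # gun
    "กางเกง": "ตัว",     # pants
    "ถุงเท้า": "คู่",      # socks
    "ดินสอ": "แท่ง",     # pencil
    "ปากกา": "ด้าม",     # pen
    "เสื้อ": "ตัว",       # shirt (added here so that ตัว has two nouns)
}


def invert(noun_to_classifier):
    """Group nouns under the classifier they take."""
    by_classifier = defaultdict(list)
    for noun, classifier in noun_to_classifier.items():
        by_classifier[classifier].append(noun)
    return dict(by_classifier)


if __name__ == "__main__":
    for classifier, nouns in invert(NOUN_TO_CLASSIFIER).items():
        print(classifier, "<-", ", ".join(nouns))
```

With the full booklet loaded the same way (say, from one of the TXT files), looking up every noun listed under ตัว becomes a single dictionary lookup. A noun that is allowed more than one classifier would need a list of classifiers as its value, which this sketch doesn't handle.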

And let me know in the comments if you find a particularly interesting classifier you didn't know existed before.

October 20, 2007

More Thai Wikimedia love

The other day I wrote about Thai Wikipedia, so I wanted to highlight another cool Wikimedia project: Thai Wiktionary. In a comment on the Thai Wikipedia post, Jason had a great thought:
The gap between Thai Wikipedia and English Wikipedia is an aspiring translator's dream: plenty of material and no penalty for mistakes.
Now, I knew that in theory, but it took him saying that to get me to fully realize it. So I've decided to hone my Thai skills and get more involved. For no real reason, I created an article on Herman Melville (though I focused mostly on a list of his works). I created a "navbox" for the important technological fields, which you can see at the bottom of pages like วิศวกรรมชีวเวช (biomedical engineering). Among other things.

Enter Thai Wiktionary. Very much in its infancy. Little consistency, and as of yesterday, a mere 248 entries. By comparison, English Wiktionary has 551,000, and little ol' Vietnamese Wiktionary has an astounding 225,000 words. So I decided to double Thai's number of entries. And right now it stands at 415. I'm not really that prolific--I added pages for almost 200 Thai abbreviations. I figured that was an easy way to start. Doesn't require much definition (at least not yet), just a single word saying what it stands for. Gets the job done on a basic level. Get involved! The important thing is to look at what's been done and copy the style and the standard.

When Thai Wiktionary hits 1,000 words it moves up to the next bracket on the homepage. I'm thinking an ETA of next week sounds reasonable. Wanna help?

October 16, 2007

Bangkok at nightfall .. กรุงเทพฯ ยามรัตติกาล

Even a seemingly endless sea of concrete and glass has its charm. Here are a couple of beautiful pictures of Bangkok at night from Wikipedia. I have a soft spot for this particular vantage point. The skytrain stop visible here is Chong Nonsi ช่องนนทรี. Before we were married, my wife used to work on the 48th floor of Empire Tower, the tall building on the left, and I used to regularly meet her around here after work. The view from the skytrain station ain't too shabby, either.

And here's Chinatown. Something about sunset shots. Beautiful.

Thai Wikipedia

I wonder how many people know about Thai Wikipedia. I recommend it--and I recommend getting involved. I've been a member since July 2006, and while I've only made 100 or so edits, I try to do what I can. Particularly, since I'm not quite confident enough to start full blown articles, I create a lot of redirect pages for things with more than one name, correct obvious typos, interlink with Wikipedias of other languages, and use the Thai info to create or expand English pages about various Thai things, like authors, or the Royal Institute.

On the Wikipedia frontpage, which lists languages in groups by number of articles, Thai is firmly in the 10,000+ group (with 28,135 as of just now). It has more articles than the respective Wikipedia sites of all of mainland Southeast Asia (assuming I'm interpreting their statistics pages correctly):

Vietnamese Wikipedia: 25,150 articles
Malay Wikipedia: 23,659 articles
Khmer Wikipedia: 373 articles
Lao Wikipedia: 228 articles
Burmese Wikipedia: 101 articles

As for the island portions of Southeast Asia, Tagalog Wikipedia has 9,722 articles, and only Indonesian Wikipedia has more than Thai, with 67,922 articles. Some of the other major Asian languages, like Japanese and Chinese, dwarf Southeast Asia's Wikipedias, but things are well on their way. Thai Wikipedia is a great resource, and great language practice.

Let me get you started: here's a link to the article on Bangkok.

October 15, 2007

Drawn out woooooords

There's something I've noticed about writing Thai. When one wants to indicate a word that is drawn out, like in the title of this post, you typically multiply the last consonant of the word. As in, คิดถึงมากกกกกก "I miss (you) soooooooo much". Now, in both languages, the sound you're actually drawing out is the vowel. That's why it's interesting, I guess. You might expect it to be written มาาาาาาก. And indeed, you can find instances of this on the internet--or a combination of both, e.g. มาาากกก--but not many, versus hundreds of thousands the other way. The exception to this is words that have no final consonant (a.k.a. open syllables). For example, อย่า would be อย่าาา, มา would be มาาาา, etc.

So, I wondered, how do you draw out syllables like รู้ in writing? It's an open syllable, but you can't draw extra vowels without extra consonants. And รู้รู้รู้ would mean repetition of the whole word, not drawing out the vowel. As it turns out, the internet contains instances of both รรรรรรู้ and รู้รรรรร. Which is predictable, since non-linear vowels don't have an obvious strategy. For โอ้โห, the results are similar: both โโโอ้โห and โอออ้โห turn up, among others.
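For the programmatically inclined, here is a rough Python sketch of the two conventions described above. The function names elongate and collapse are made up for illustration, and the rules are my own simplification: drawing a word out means repeating its last character, and undoing it means squeezing any run of three or more identical characters back down to one.

```python
import re


def elongate(word: str, extra: int) -> str:
    """Draw a word out the common way: repeat its last character `extra` times."""
    return word + word[-1] * extra


def collapse(text: str) -> str:
    """Squeeze any run of three or more identical characters down to one.

    Runs of exactly two are left alone, so a legitimately doubled letter
    (as in มากกว่า) survives.
    """
    return re.sub(r"(.)\1{2,}", r"\1", text)


if __name__ == "__main__":
    print(elongate("มาก", 5))       # final consonant repeated
    print(elongate("มา", 3))        # open syllable: the vowel is the last character
    print(collapse("โดนนนนนสุดๆ"))  # the billboard spelling, squeezed back
```

The sketch breaks down in exactly the cases that puzzled me. The last character of รู้ is a tone mark, so elongate just stacks tone marks, and collapse turns รู้รรรรร into รู้ร rather than รู้. Handling those properly would mean working with whole syllables rather than single characters.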

Of course, this is informal writing, and so there's probably no real standard. I have seen words drawn out like this in advertising, though. There's a billboard for condos near where I work that says in huge letters, โดนนนนนสุดๆ (โดน here is short for โดนใจ, "to be to one's liking").

As an exercise, I tried Googling the word มาก with varying numbers of extra ก's. And no matter how long I got, there were always results, with such expected culprits as "น่ารักมากกกกกกกกกกก", "สวยมากกกกกกกกกกกกกกกกกกก", "เยอะมากกกกกกกกกกกกกกก" etc. I also discovered that Google has a limit on the length of a single word. It's 42 characters. After that it tells you "X is too long a word. Try using a shorter word." It still carries out the search, though. So I tried even longer. And longer. And longer. I was still getting around 26,000 results! And then it happened: Google just couldn't take it anymore. It told me "Bad Request. Your client has issued a malformed or illegal request." The internet cops are knocking on my door as I type.

October 4, 2007

Translator, traitor

[When I said the blog would be sparse, I didn't mean the content. Or so I thought. Sorry about the dearth of posts lately. I've just gotten back to Bangkok and started work again in earnest. Life is busy. In the meantime, here is something I'm cross-posting from Language Scraps.]

I'm an amateur translator. Mostly Thai to English, though I've dabbled with English to Thai. Much more difficult. There are some short examples of my work (in English) on my personal blog here, here, and here. They're excerpts from books I've read, and there's another post about translating titles here.

How do you convey something in another language? There's an Italian saying, "traduttore, traditore." That is, "translator, traitor." I first heard this phrase in an essay by Umberto Eco, which decries, in part, the sentiment behind it, by encouraging the original author to be involved in the translation of his work. I particularly like this statement of Eco's:
The job of translation is a trial and error process, very similar to what happens in an Oriental bazaar when you are buying a carpet. The merchant asks 100, you offer 10 and after an hour of bargaining you agree on 50.
I think this is a great insight into the nature of translation. There is no Perfect Language, so there is no Perfect Translation. Eco's explanation of source-oriented vs. target-oriented translation was quite helpful for me.

This essay on the problem of translating puns gives the following dilemma, thought up by French philosopher Jacques Derrida. Translate the following phrase into English:
"Oui, oui, vous m'entendez bien, ce sont des mots français."
You might translate it as, "Yes, yes, you are receiving me well, these are French words." But wait, they're not anymore. In that case, is "Yes, yes, you are receiving me well, these are not French words" a better translation? What about "English words"? You will probably offend some portion of readers with any of these solutions. Eco mentions a similar problem in translating the French dialogue portions of War and Peace into French--there was a specific reason Tolstoy chose to include French dialogue, or why Eco chose to include Latin text in The Name of the Rose, and why the Latin was left untranslated in the English version of that novel.

Eco ends the essay with a lighthearted reference to the Italian maxim by calling translation "admirable treason." I just hope my efforts, leave much to be desired though they may, can too be called admirable. Or am I weaving the noose for my own traitorous neck?