Tripping the flight fantastic

A heads up--I'm leaving on a trip on Monday. My wife and I are spending about three weeks in the U.S. to visit family and so I can attend a conference in Maryland. I'll be presenting at SEALS XVII on a piece of research I've been working on (off and on) for more than a year. Expect more details after the fact, since I'm still working on the talk.

I'll try my best to write regular posts while I'm traveling. Part of my trip involves a cross-country drive, though, so bear with me if things get sporadic.

On books and my LibraryThing widget

There's something that bugs me about my blog--my LibraryThing widget links to Amazon, when almost none of the books in the widget have pages on that site.

So when you see an interesting cover and you try to click it, you get.... nothing.

The reason is simple--the widget code is provided by LibraryThing, and the vast majority of their covers come from Amazon. In turn, Amazon's terms of use are simple: use our cover images, link to our site. So LT assumes (and the statistics are on their side here) that your covers mostly come from Amazon.

But I've limited my widget to books in Thai or otherwise related to Thailand. And I got almost none of those cover images from Amazon. I've scanned and uploaded 800+ covers to LibraryThing. In fact, I started scanning and uploading all of my book covers, Thai and otherwise, after I joined LibraryThing. I liked the idea of seeing the cover of my exact copy on my virtual bookshelf. I enjoyed progressing toward the goal of cataloging my entire library. I finished the cataloging last year some time, and now I just add books as I buy them.

The point of all this is that I want to apologize to everyone who has been tricked into clicking on dead Amazon links by the widget. Allow me to point you to a "shelf view" of my Thai books directly in LibraryThing. The covers are somewhat bigger, which is nice, and you can click through to the "info" link to find out more. Title, author, ISBN, publication data, that sort of thing. (If you want still larger versions of my book covers, head here.)

In other book news, I plan to start a new feature where I profile Thai books on the blog. These will be kind of like reviews, but focused (at least at first) on books about the Thai language. High (perhaps even first) on my list is คลังคำ, a rather unconventional dictionary. My aim is to introduce you to good books you may not be familiar with, or even aware of.

More on searching the SEAlang dictionary

After the other day's post, I figured it would be good to do some exploring of search functions in SEAlang*'s Thai-English dictionary.

First up: IPA characters
Sometimes you know how a word is pronounced, but you aren't sure how it's spelled in Thai. The IPA search box can help you out. IPA refers to the International Phonetic Alphabet. You can use SEAlang's IPA search box to search by the sound of a word. (In reality, these are a modified form of IPA). The six special characters are: ə ɛ ɔ ʉ ŋ ʰ

ə = The "schwa" is an upside-down letter e, and represents the vowel เออะ (əə = เออ).
For example, เรอ would be rəə.
ɛ = The "script e" looks like a backwards 3, and represents the vowel แอะ (ɛɛ = แอ).
For example, แม่ would be mɛɛ.
ɔ = The "open o" looks like a backwards letter c, and represents the vowel เอาะ (ɔɔ = ออ).
For example, พ่อ would be pʰɔɔ.
ʉ = The "u-bar" is a u with a line through it, representing the vowel อึ (ʉʉ = อือ).
For example, คือ would be kʰʉʉ.
ŋ = The "eng" is an n with a tail, which represents the consonant ง.
For example, โง่ would be ŋoo (and งู would be ŋuu).
ʰ = This superscript h represents an aspirated sound. It must follow c, k, p, or t.
For example, ทัน = tʰan while ตัน = tan. (Likewise, ชัน = cʰan while จันทร์ = jan; พัน = pʰan while ปัน = pan; and คัน = kʰan while กัน = kan.)

You can produce these either by clicking the buttons above the search box, or by using the keyboard shortcut (note you must be typing in the IPA box or this won't work, and neither will the search):
ə = shift + a
ɛ = shift + e
ɔ = shift + o
ʉ = shift + u
ŋ = shift + n
ʰ = shift + h
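
To make the shortcut list concrete, here is a minimal Python sketch of the same substitution. It is only an illustration of the mapping listed above, not SEAlang's actual code, and the names SHORTCUTS and to_ipa are my own invention.

```python
# Hypothetical table built from the shortcut list above: a shifted
# (capital) letter stands in for one special character.
SHORTCUTS = {"A": "ə", "E": "ɛ", "O": "ɔ", "U": "ʉ", "N": "ŋ", "H": "ʰ"}

def to_ipa(typed):
    """Replace each shortcut capital with its special character."""
    return typed.translate(str.maketrans(SHORTCUTS))

print(to_ipa("saaN-san"))  # saaŋ-san
print(to_ipa("tHan"))      # tʰan
```

Lowercase letters and hyphens pass through untouched, so an ordinary syllable like tan comes out exactly as typed.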

Now I'm going to present some scenarios and explore how to get the desired results.

Q: You heard the word สร้างสรรค์ and you want to know how it's spelled.
A: In the IPA search box, you can use a couple of different methods. The easiest is to type the phonetics into the IPA box: saaŋ-san (you must separate syllables with the hyphen). Up comes สร้างสรรค์. But say you heard the word and weren't sure about the vowel length. You searched saŋ-san, but that only returned สังสรรค์, which isn't what you were looking for. Try the search again, this time clicking the V button in the "Approximate matching" section. This means it will find words with variations on vowel length or other plausible vowel variations. Sure enough, it returns two results: สร้างสรรค์ and สังสรรค์.
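
I don't know how SEAlang implements the V button, but the vowel-length half of the idea can be sketched in a few lines of Python. The vowel inventory, the function name loosen_vowels, and the three-word sample list are all assumptions of mine, and the sketch ignores the "other plausible vowel variations" entirely.

```python
import re

# Assumed vowel inventory for this romanization (not SEAlang's own list).
VOWELS = "aeiouəɛɔʉ"

def loosen_vowels(query):
    """Build a regex in which every vowel may be short or long."""
    pieces = []
    for m in re.finditer(rf"([{VOWELS}])\1*|.", query):
        if m.group(1):
            # A run of one vowel becomes "one or more of that vowel".
            pieces.append(m.group(1) + "+")
        else:
            pieces.append(re.escape(m.group(0)))
    return re.compile("".join(pieces))

# Hand-romanized sample entries, not real dictionary output.
headwords = {"saaŋ-san": "สร้างสรรค์", "saŋ-san": "สังสรรค์", "saŋ-kʰom": "สังคม"}

pattern = loosen_vowels("saŋ-san")
matches = [thai for ipa, thai in headwords.items() if pattern.fullmatch(ipa)]
print(matches)  # ['สร้างสรรค์', 'สังสรรค์']
```

The query saŋ-san turns into the pattern sa+ŋ\-sa+n, which accepts both the short and the long first vowel.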

Q: You want to know how many words begin with the letter ษ.
A: The phonetic searching is quite handy, but some of the search syntax can also be used with the native orthography. This is a good example of such a situation. The search ษ.* will match ษ followed by any combination of characters. (Turns out SEAlang isn't the best place to look for rare words like this, since it's based on a student's dictionary, not a comprehensive dictionary. But you can use this same technique, searching กระ.* for all the กระ-words, or ปฏิ.* for all the words that begin with ปฏิ-, etc.)
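
The same kind of prefix search is easy to try offline with Python's re module, whose period and asterisk behave the way this answer describes. This is a minimal sketch: the word list is a handful of words I picked by hand, and the function name search is made up for the example.

```python
import re

# A tiny hand-picked word list standing in for the dictionary's headwords.
WORDS = ["กระดาษ", "กระรอก", "กรอก", "ปฏิบัติ", "ปฏิเสธ"]

def search(pattern, words=WORDS):
    """Return the words that match the whole pattern."""
    return [w for w in words if re.fullmatch(pattern, w)]

print(search("กระ.*"))  # ['กระดาษ', 'กระรอก']
print(search("ปฏิ.*"))  # ['ปฏิบัติ', 'ปฏิเสธ']
```

Note that กรอก is left out of the first result: it starts with กร but not with กระ.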

Q: You want to find all words that begin and end with the sound /k/, with any vowel in the middle.
A: Time to use phonetic search again. This time, we'll see how the search kV*k does. The V represents any vowel, and the * means any number of them. This returns 52 results, and only matches words like กอก, but not words like กรอก, because กรอก has a second consonant before the vowel. To match both types of words, search k.*k, which will return 182 results, matching any consonants and vowels in between. Note, though, that it only matches one syllable, so a word like กระรอก isn't going to come up. To search for two-syllable words that begin and end with k, change your search term to k.*-.*k instead. This will find a first syllable with k followed by anything, and a last syllable with anything ending in k.
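
Here is a rough Python sketch of how the three searches differ. It is my own approximation, not SEAlang's engine: the vowel and consonant inventories, the to_regex helper, and the romanized sample words are all assumptions. The one deliberate choice is to translate the period as "any character except a hyphen," which mimics the way a period never reaches across a syllable boundary.

```python
import re

# Assumed inventories for illustration; SEAlang's own lists may differ.
VOWELS = "aeiouəɛɔʉ"
CONSONANTS = "bcdfhjklmnprstwyŋʰ"

def to_regex(pattern):
    """Translate a SEAlang-style pattern into a compiled Python regex."""
    table = {
        "V": f"[{VOWELS}]",      # any one vowel
        "C": f"[{CONSONANTS}]",  # any one consonant
        ".": "[^-]",             # any one character inside a syllable
        "*": "*",                # zero or more of the preceding item
    }
    return re.compile("".join(table.get(ch, re.escape(ch)) for ch in pattern))

# Hand-romanized sample words (my transcriptions, not dictionary output).
WORDS = {"kɔɔk": "กอก", "krɔɔk": "กรอก", "kra-rɔɔk": "กระรอก", "maa": "มา"}

def search(pattern):
    regex = to_regex(pattern)
    return [thai for ipa, thai in WORDS.items() if regex.fullmatch(ipa)]

for pattern in ("kV*k", "k.*k", "k.*-.*k"):
    print(pattern, search(pattern))
# kV*k ['กอก']
# k.*k ['กอก', 'กรอก']
# k.*-.*k ['กระรอก']
```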

Also notice that you can adjust the type of results you get using the radio buttons in the "Approximate matching" section. If you search kV*k, and select "syllable or longer," you'll get all results that contain a syllable that matches your criteria, such as มะกอก or ตะกัก.
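
The "syllable or longer" option can be pictured as testing each syllable separately instead of the word as a whole. The sketch below assumes hyphen-separated romanizations and writes kV*k directly as a Python regex; the vowel list, the function name, and the sample words are mine.

```python
import re

# kV*k written as a Python regex, using an assumed vowel inventory.
K_V_K = r"k[aeiouəɛɔʉ]*k"

def matches_any_syllable(regex, ipa):
    """True if at least one hyphen-separated syllable matches in full."""
    return any(re.fullmatch(regex, syl) for syl in ipa.split("-"))

# Hand-romanized sample words, not dictionary output.
WORDS = {"kɔɔk": "กอก", "ma-kɔɔk": "มะกอก", "ta-kak": "ตะกัก", "ma-muaŋ": "มะม่วง"}

whole_word = [t for ipa, t in WORDS.items() if re.fullmatch(K_V_K, ipa)]
or_longer = [t for ipa, t in WORDS.items() if matches_any_syllable(K_V_K, ipa)]

print(whole_word)  # ['กอก']
print(or_longer)   # ['กอก', 'มะกอก', 'ตะกัก']
```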

Q: You want to know some Thai words that come from Chinese.
A: Clear all the search fields, and head straight down to the "Tags" section. In the Etymology menu, select "Chinese," and click "Show all". Voila. It only has 60 words tagged as Chinese, but you can imagine how combining the search capability of SEAlang with the breadth of some other dictionaries would be a powerful combination.

Q: You want to find all the กระ- words that are nouns.
A: Head back down to the Tags, this time making sure to "Reset all" first and this time selecting "noun" from the "Part of speech" drop-down menu. Now head back up and type กระ.* and then press "Go!" You'll get 506 items returned, pared down from the 1299 you get if you search กระ.* for any part of speech. Try the search again, setting part of speech to "classifier". This time you'll get 9 classifiers that begin with กระ. Or clear out the search field and select "classifier" to see a list of all classifiers in the SEAlang dictionary. Piece of cake.
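
Combining a tag with a pattern amounts to applying two filters at once. Here is a minimal Python sketch under obvious assumptions: the four entries and their part-of-speech labels are ones I typed in by hand, and the field names headword and pos are hypothetical, not SEAlang's data format.

```python
import re

# Hand-made sample entries; the field names are hypothetical.
ENTRIES = [
    {"headword": "กระดาษ", "pos": "noun"},
    {"headword": "กระรอก", "pos": "noun"},
    {"headword": "กระโดด", "pos": "verb"},
    {"headword": "แมว", "pos": "noun"},
]

def search(pattern=".*", pos=None):
    """Filter by spelling pattern and, optionally, by part of speech."""
    return [
        e["headword"]
        for e in ENTRIES
        if re.fullmatch(pattern, e["headword"]) and (pos is None or e["pos"] == pos)
    ]

print(search("กระ.*"))              # ['กระดาษ', 'กระรอก', 'กระโดด']
print(search("กระ.*", pos="noun"))  # ['กระดาษ', 'กระรอก']
print(search(pos="noun"))           # ['กระดาษ', 'กระรอก', 'แมว']
```

The last call is the equivalent of clearing the search field and selecting only a tag.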

Q: Finally, someone wrote me to say they ran across the word ชรัว, which the dictionary says is pronounced [ชฺรัว]. The cluster /chr/ isn't supposed to exist in Thai, if you believe all the books out there. But lo and behold, here is a real live example. So how would we find other words pronounced with this cluster?
A: Doing a search for cʰr.* in the IPA field is easiest. It only returns one result, because, again, it's not a comprehensive dictionary, so it's missing a lot of obscure words. And yes, ชรัว is an example of an obscure word. I also don't think it's really pronounced with the cluster /chr/ in the real world. As is, it returns barely 100 Google hits. These seem to be mostly surnames, false hits (where Google has incorrectly detected word boundaries), and jocular spellings of the word ชัวร์ (meaning "sure," แน่นอน). You can search ชร.* for other words spelled with this combination of letters, and find ชรา, pronounced [ชะรา]. If you browse the ชร section of RID (you'll need to scroll down a bit in that link), you'll find more, but they're mostly used in poems, and have variants without this odd cluster.

The search tools are extremely powerful, and while they can take a while to get used to, they make searching vastly more useful. I'm interested to see what cool things other folks have been able to find using SEAlang's search tools. Drop them in the comments, if you please, along with any search quirks you can't figure out. I'm only a user myself (see the footnote), but I've played around with the dictionary long enough (and clicked all the ? buttons to read the instructions enough times) that I may be able to help out.

*In the interest of full disclosure, I should mention that I have worked as a research contractor for CRCL, the parent organization of the SEAlang projects, since February 2007. Previous to my employment, I used data provided by CRCL for research projects while working on my bachelor's degree at Dartmouth College and did a chunk of unpaid spec work. To date I haven't had any involvement with the dictionaries, although that might change at some future point. Right now, I'm just an avid (longtime) user. This is why I use the third person pronoun with respect to SEAlang.

Jokes! 4

I let the most recent batch of Thai jokes stew a bit long. Without further ado, let's jump right into the analysis of last time's jokes:

Q: ซุปอะไรมีสารอาหารครบทุกหมู่

What kind of soup has nutrients from all the food groups?
A: ซุปเปอร์มาเก็ต

The supermarket!
This joke works okay in English, although in Thai the loanword "soup" ซุป is spelled the same as the first syllable of the loanword "supermarket," so arguably it works better in Thai. This one's a simple pun to start off the batch.

Q: มีเงิน 20 บาท ให้น้องยี่สิบบาท จะเหลือเงินกี่บาท

You have 20 baht. You give your younger sister twenty baht. How much do you have left?
A: เหลือ 10 บาท (น้องชื่อยี่)
10 baht (Yi is your younger sister's name.)
This joke requires explanation even after the punchline, which means it's not a very good joke. More of a trick question. The Thai word for twenty is ยี่สิบ /yii-sip/, but the punchline tells you that ยี่ /yii/ is the person's name, and so ให้น้องยี่สิบบาท is meant to be interpreted as "You give your younger sister Yii ten baht." Never mind the fact that stress and intonation would probably differ if you were really talking about someone named Yii. In ยี่สิบ, ยี่ is unstressed and thus often is pronounced short. You would likely stress both ยี่ and สิบ if you meant a person named ยี่ (or reorder the sentence to avoid confusion). But now I'm taking all the fun out of the joke, aren't I? :P

Q: ขี่ช้างจับอะไร
What do you ride an elephant to catch?
A: จับให้แน่นๆเดี๋ยวตก
Hold on tight! You could fall!
This joke makes no sense in English. The setup is a reference to a famous idiom, ขี่ช้างจับตั๊กแตน, "ride an elephant to catch grasshoppers," meaning to go to a lot of expense or effort for something that gets very little results, since trying to catch grasshoppers on an elephant would be a rather futile activity. So the setup is phrased so that the listener thinks the answer is ตั๊กแตน "grasshoppers." The punchline is, instead, a play on จับ. It means "catch," but also "grasp, hold on to." The alternate interpretation of the setup is, "When you ride an elephant, what do you hold on to?" Even with this interpretation, the punchline is still a play on words, because the phrasing จับอะไร makes you expect a noun in the answer, regardless. You've got to hold something. Instead of จับ_____ (noun), we get จับให้แน่นๆ "Hold on tight!" Notice, though, that even if the listener, knowing this is the setup for a joke, does answer with something like "ears," the joke still works. If I think I'm all smart and say "the ears" instead of "grasshoppers," the punchline reads like, "Well hold on (to the ears) tight! You could fall!" Interesting.

Q: ยาอะไรใช้รักษาคนไม่ได้
What kind of medicine can't cure people?
A: ยามาฮ่า

This is a simple play on ยา "medicine" vs. ยา as the first syllable of the brand name Yamaha. Remember that modifiers come after the noun in Thai, so ยา____ would be the typical structure for the name of a type of medicine. For example, ยาพารา is colloquial for "paracetamol" (shortened from พาราเซตามอล, the alternate chemical name for what is known as acetaminophen in North America, the most famous brand of which is Tylenol).

Q: มีเงิน 15 บาท ไปซื้อขนมราคา 4.50 บาท จะได้เงินทอนเท่าไหร่
You have 15 baht. You buy candy that costs 4.50 baht. How much change do you get?
A: 50 สตางค์ (ก็ให้ไปห้าบาทไง)
50 satang (you gave them five baht!)
The setup sounds like a typical question you'd hear in an elementary school math class. The twist here is simple. If you have 15 baht, then you probably have two coins: a 10 baht and a 5 baht coin. (The common coin denominations nowadays are 1, 2, 5 and 10. The 2 baht coin is a recent addition, though there have been 2 baht coins in the past.) So the trick is simply that if you have a 5 baht coin and you buy a 4.50 baht candy, you're only going to give them 5 baht (not 15), and thus get half a baht (50 satang--that's Thai for cents) back. Simple, eh?

Now for a few more.

Q: ลิฟท์จอดที่ชั้น 30 มีคนเข้าไปถึง 20 คน ลิฟท์ร้อง บี๊บ บี๊บ คนยังไม่ทันออกเลย สลิงก็ขาด เสียก่อน ปรากฏว่าไม่มีใครบาดเจ็บเลยสักคน ถามว่าเพราะอะไร?

Q: อะไรเอ่ย เวลาเรายืนมันห้อย เวลาเราเดินมันแกว่ง?

Q: ยายพายเรือไปทำบุญที่วัด ปรากฏว่าเรือรั่วและกำลังจะจม ยายต้องเสียสละทิ้งของ สองอย่างระหว่างปิ่นโตกับดอกไม้ ถามว่ายายจะเสียอะไรจึงจะไปถึงวัด แน่ๆ?

Q: พระใช้อะไรตีระฆัง?

Until next time, have fun...

On the name of this blog

I'll admit, some might find the title of this blog, "Thai 101," misleading. But I have my reasons for naming it as such.

I consider myself a perpetual student of Thai. And by "Thai" I mean everything Thai. I've been studying the language since 2002, and I've lived in Thailand for 33 months out of those five years. Just shy of three years. (This sort of astounds me, too. Hardly seems that long.) And even if I've made progress in trying to master the language, I still find myself woefully lacking in many other aspects. The longer I'm here, the more I realize that there is a staggering amount I don't know about what we might term the Thai Experience--all those little things folks know that make a person a well-rounded, culturally literate member of Thai society. No two ways about it, I'm still a beginner.

Hence the name of the blog.

Thai 101 turns four months old tomorrow. This is my 37th post, making for an average of about two posts a week. I'm trying to post more regularly, and to continue to keep the content varied and worthwhile enough to maintain your interest and readership.

In the first "review" of my blog I've seen, user DavidHouston described the blog on the ThaiVisa forum as "a mixture of wonderment and erudition." I consider this a high compliment that I'll try to keep living up to, and I think the word 'wonderment' perfectly captures how I continue to feel about the language. As an example of my perhaps eccentric interest in the language, I spent 10 minutes transcribing graffiti in a bathroom stall at the mall last week. The cleaning lady hanging out by the sinks was probably wondering if she needed to call an ambulance.

This blog will usually assume its readers are familiar with the Thai script, but I try to include phonetic spellings in square brackets [] whenever I think a certain word is likely to be unfamiliar or difficult to pronounce. Personally, I feel that romanization is a learning crutch that should be abandoned as quickly as possible. I started really trying to learn to read after about two months (I had some limited exposure to the alphabet before that), and I can't say enough about how important I think reading the Thai script is.

Sometimes the script will confuse you--I learned the word อร่อย right off the bat, but once I saw it written, I doubted myself and thought I should be pronouncing the second syllable with a falling tone. I mispronounced it like that for a while, until I understood the spelling rule. But far and away, reading has helped my pronunciation and comprehension from day one. Then again, I am a visual learner. I participated in spelling bees as a kid. Associating a word with its spelling has always been natural for me.

Not everyone learns the same way, though, so allow me to suggest a couple of tools to make my blog a little friendlier:
This website is a struggling Thai learner's best friend. It's the Ellis Island for students of Thai. It beckons, Give me your tired, your poor, your huddled masses yearning to [learn Thai]... Whether you come from the romanization school of Mary Haas, David Smyth, or Benjawan Poomsan Becker, it accepts everyone. The ability to set your preferred romanization in the preferences is the true boon of this site, I think. This site can take whole chunks of Thai text (or the URL of a Thai-language webpage). It will parse and space the words, and give you float-over definitions and romanization. It's not perfect, but it's very good.

Increase Thai text size
(Firefox extension). First off, if you're not using Firefox, start. This extension also comes from Mike of You have to save it (right-click, save as) and then run it (open it with Firefox if it asks you). After you've installed it, you can right click on a page containing Thai text and it will increase the size of the Thai text. I don't like to make the Thai larger than the English when I write, so consider this little extension my excuse for doing things the way I like them.

These two links should help level the playing field for all potential readers. There are many other excellent learning tools which I haven't mentioned here. (If you have a website or tool you'd like me to examine in more detail on this blog, drop me a line.)

As always, I welcome input and feedback. This blog is my hobby, so I can keep on my toes with Thai and exercise my writing muscles regularly. On a side note, I'll be debuting my first piece of "professional" work with respect to Thai within the next couple of weeks (though 99% of the work I did on spec last year). It's an online version of a Thai-English dictionary from the 1840s. I'm excited. Stay tuned for that, too. [Update: The Jones Thai-English Dictionary is now up and running. I wrote about it in this post.]

New section of the Royal Institute website

It looks as if the Royal Institute is poised to open a new portion of its website under the banner รู้ รัก ภาษาไทย (Know, Love the Thai Language).

This website is clearly aimed at native speakers, and (if I had to guess) is probably an overdue project meant to coincide with National Thai Language Day this past July 29th (the release of the online version of RID99 was also meant to roughly coincide with this date, but it was released into the wild early--maybe too early). This annual day of observance for the language is fairly new, I believe, but I can't find any specifics about when it was established. I say it's new because my wife had never heard of a National Thai Language Day, and from what I saw, it didn't really get much press attention.

Right now the site's basically just a placeholder. There's not much real content yet--the phrase โปรดติดตามข้อมูลเนื้อหาในเร็วๆ นี้ ("Check back for content soon") is plastered everywhere.

There's a menu on the side with 12 items (English translations and comments mine):
หน้าหลัก (Main page)
ข่าวสารและกิจกรรม (News and activities)
รู้ รัก ภาษาไทย (Know, Love the Thai Language)
วิดีไทย (Video? Not sure if this is a play on วิดีทัศน์, the RI-coined term for "video," or what)
ภาษาไทยใช้ให้ถูก (Using the Thai language correctly)
สปอตรณรงค์ (Campaign spots--in the sense of a radio "spot" or advertisement; there are already two of these up for download)
กระดานสนทนา (Discussion board)
แนะนำเว็บไซต์ (Recommended websites)
บุคคลต้นแบบ (Exemplary individuals)
คลังความรู้ภาษาไทย (Treasury of Knowledge of the Thai language--links to a longstanding section of the Royal Institute website with articles about the language, some of which I've linked to before on this blog)
หลักเกณฑ์การใช้ภาษาไทย (Rules for using the Thai language)
คำต่างประเทศที่ใช้คำไทยแทนได้ (Foreign words that can be replaced with Thai words)

I'm particularly interested in what we'll see in the "Exemplary Individuals" section, and I think the last menu item is also intriguing. The French equivalent of the Royal Institute has a similar aversion to foreign (particularly English) phrases in colloquial speech, so it's not surprising to see the Royal Institute continue to try to convince the masses that a Thai phrase is inherently better than an English one, even though the "Thai" phrase will most likely be constructed from Pali/Sanskrit roots.

I have some issues with their website design and layout, but Thai websites in general are stuck in the 90s (ahem--blinking graphics, scrolling text, visit counters), so this is better than many (although Google isn't going to find the site if they insist on using graphical menu items and links all over--search engines need text to index). I'll reserve further judgment until we see more content rolled out. According to the counter, 672 visitors so far.

Stay tuned.

Words that end in สระอึ, SEAlang searching, and teaching a man to fish

The other day, a friend of mine emailed to ask me to help him find words that end in the vowel -ึ*, or สระอึ, as one would say it aloud in Thai. He had come up with two already, อึ and ตึ. Words that end in this vowel are quite rare, so I thought I'd share what I found.

First of all, I cheated. I can't say that I relied on my immeasurably vast knowledge of the language--though it would be nice if I had anything remotely close to that (that's what reference works are for). So I turned to the dictionary. Specifically, the online dictionary. It's exactly this sort of question which a well-designed online dictionary can handle far better than the traditional dead tree variety. Why? Because of a little thing called a wildcard.

Note that I say "well-designed dictionary." Not every online dictionary supports wildcards, and some don't support them as well as I'd like (ahem.. RID). The best that I've seen in this area is the SEAlang Thai-English dictionary, although it has a bit of a daunting interface at first.

Much of the basic search syntax of SEAlang is like traditional regular expression syntax. For example, a period (.) matches any character.
For wildcard searches on SEAlang you have to use the phonetic search box (they call it "IPA", though it's not exactly that), so I first tried .ʉ as my search string. If you're wondering how to make the special character for this vowel, SEAlang makes it easy. You can click the button above the search box, or you can use the shortcut key given in parentheses on the button and it will automatically convert to the desired character. In this case the shortcut is U (capital u). It magically becomes ʉ before your very eyes.

These are the results of my search: ตึ รึ ฮึ (also หึๆ as part of the phrase หัวเราะหึๆ)

Now, one thing I noticed is it didn't find อึ--but there's a simple reason for that. Phonetically speaking, there's no sound before the vowel (except possibly a glottal stop, but SEAlang's phonetics don't include that). Searching just ʉ does return อึ, though.

But that's not all. It occurred to me that the search string .ʉ wouldn't return words that began with a consonant cluster. So I tried .*ʉ next. The asterisk means "zero or more of the preceding character." In combination with the period, it amounts to "any combination of characters." The problem is, it causes the search to match the long vowel as well, since that is represented by two characters (ʉʉ). (Note this search does return อึ, since that would be the "zero" case.)

Finally I searched C*ʉ. The C in this case stands for any consonant. The asterisk makes it match zero or more consonants. Voila! Finally I've found the search I wanted all along. It matches all the previously found words, including อึ, as well as a newcomer: ครึ
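
For anyone who wants to see the three attempts side by side, here is a small Python sketch. It is only an approximation of what SEAlang does: the consonant inventory (with ʰ counted as a consonant so that clusters like kʰr match) is my assumption, the romanizations are my own, and มือ is thrown in as a long-vowel word that we do not want.

```python
import re

# Assumed consonant inventory; ʰ is included so kʰr counts as a cluster.
CONSONANTS = "bcdfhjklmnprstwyŋʰ"

# Hand-romanized sample words, not dictionary output.
WORDS = {"tʉ": "ตึ", "rʉ": "รึ", "hʉ": "ฮึ", "ʉ": "อึ", "kʰrʉ": "ครึ", "mʉʉ": "มือ"}

def search(regex):
    """Return the Thai words whose whole romanization matches the regex."""
    return [thai for ipa, thai in WORDS.items() if re.fullmatch(regex, ipa)]

print(search(r".ʉ"))                 # ['ตึ', 'รึ', 'ฮึ']
print(search(r".*ʉ"))                # ['ตึ', 'รึ', 'ฮึ', 'อึ', 'ครึ', 'มือ']
print(search(rf"[{CONSONANTS}]*ʉ"))  # ['ตึ', 'รึ', 'ฮึ', 'อึ', 'ครึ']
```

The first pattern misses อึ and ครึ, the second lets the long vowel of มือ slip through, and the third keeps the short-vowel words only.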

Now that we have the words from SEAlang, let's switch things up by looking them up in RID99 (my English translations):

ครึ [คฺรึ] (ปาก) ว. เก่าไม่ทันสมัย.
(Colloq.) adj/adv. Old and out of date.

ตึ, ตึ ๆ ว. ลักษณะกลิ่นเหม็นอย่างหนึ่งคล้ายกลิ่นเนื้อแห้ง, มักใช้ประกอบกับคํา เหม็น เป็น เหม็นตึ.
A kind of unpleasant odor, like the smell of dried meat; usu. with /men/ as /men tʉ/.

รึ [not in RID99] A common abbreviation of หรือ used in transcribing speech or informal writing. Frequently seen in the question phrase รึเปล่า.

ฤ ๑ [รึ] เป็นรูปสระในภาษาสันสกฤต เมื่อไทยนํามาใช้ออกเสียงเป็น ริ รึ หรือ เรอ เช่น ฤทธิ์ ฤดู ฤกษ์.
A Sanskrit vowel. When used in Thai, pronounced /ri/ or /rʉ/, or /rəə/, e.g. /rit/ /rʉduu/ /rəək/.

ฤ ๒ [รึ] (กลอน) ว. หรือ, ไม่, เช่น จะมีฤ ว่า จะมีหรือ, ฤบังควร ว่า ไม่บังควร.
(Poetic) adv. /rʉʉ/, /mai/, e.g. /ca mi rʉ/ "Will there be?", /rʉ baŋkhuan/ "not appropriate".

ฦ, ฦๅ ๑ วิธีเขียนเสียง ลึ ลือ แต่บัญญัติเขียนเป็นอีกรูปหนึ่งต่างหาก อนุโลมตามอักขรวิธีของสันสกฤต.
A way to transcribe the sound /lʉ/ or /lʉʉ/, but it is prescribed to be written another way, after the Sanskrit.

หึ, หึ ๆ ว. เสียงดังเช่นนั้น.
Adv. Onomat. A loud sound.
(Note the use of เช่นนั้น indicates onomatopoeia--literally meaning "like that," i.e. like the sound of the word.)

อึ (ปาก) ก. ถ่ายอุจจาระ (มักใช้แก่เด็ก). น. ขี้, อุจจาระ.
(Colloq.) v. To defecate (usu. of children). n. poop, feces.

ฮึ อ. คําที่เปล่งออกมาแสดงความไม่พอใจหรือแปลกใจ.
Interj. An expression of displeasure or surprise.

One last thing. The search string we used only covers one-syllable results. SEAlang has an easy way to fix this. Under the "Approximate matching" section, change the selection to "syllable or longer," which means it will now match any word that has any syllable ending in /ʉ/. There are a bunch of these. One of the most common is พฤหัส /pharʉhat/ "Thursday." However, it also gives us yet another word for our list: สะตึ. This one's in RID, too:

สะตึ, สะตึ ๆ (ปาก) ว. ไม่มีอะไรดี, ไม่ได้เรื่อง, ไม่มีค่า, เช่น หนังเรื่องนี้สะตึดูแล้วเสียดายเงิน ของสะตึ ๆ อย่างนี้ไม่ซื้อหรอก.
(Colloq.) Adj./adv. No good, useless, worthless, e.g. "This movie is no good--once you've watched it, you regret paying." "I don't buy useless stuff like this."

There are probably more hidden out there, including an elaborate version of ครึ I found in RID99 (คร่ำครึ ว. เก่าเกินไป, ไม่ทันสมัย). Can anyone else track down more?

* Romanized variously as /ʉ/ or /ɨ/ or even /y/, but they're just arbitrary symbols, really. Personally, I dislike using digraphs (like eu or ue) to represent this single sound. I think insofar as a romanization system is necessary for learners, a one-to-one correspondence of symbol to sound is best, à la basic IPA.

A clever movie title

After my earlier post, in which I briefly discussed translating titles, I had to report this oh-so clever use of the Thai language that I saw on a movie standee at MBK earlier this week.

The Luke Wilson/Kate Beckinsale horror vehicle Vacancy opens here soon, and the title in Thai is ห้องว่างให้เชือด. (เชือด means to slit, as in, e.g., the throat.) The clever twist is that not only is this a verbal play on the phrase ห้องว่างให้เช่า "vacancy," but it's a visual play on the orthography of the words เช่า and เชือด. Run with me for a second here.

On the standee at the mall, the title of the film was designed to look like the type of neon sign you'd see at any motel. It appeared, at first, to read ห้องว่างให้เช่า, but after a few seconds, a portion of the neon sign that was out would flick back on (in red to highlight the difference), revealing the true title of the film. Here's a picture to highlight what I mean:

See how you can simply add a few strokes to เช่า to turn it into เชือด? When I saw the blinking lights on the standee, it blew my mind for a few seconds. I was tickled by the sheer ingenuity of it. And so I share it with you. The movie looks like a stinker, though. :P

Usage Shmusage 2: ทรงพระเจริญ

With the Queen's 75th birthday just past, and the King's 80th birthday fast approaching, I want to take a deeper look at a phrase that has interested me for a while: ทรงพระเจริญ [ซง-พระ-จะ-เริน]. It is the Thai equivalent to "Long Live the King," but it isn't specific as to which royal person it references, so it can be used to refer to any member of the immediate royal family.

For one thing, I've wondered how long this phrase has been around. It seems highly possible (and even likely) that ทรงพระเจริญ is an example of ศัพท์บัญญัติ, or, words or phrases coined as Thai equivalents to foreign words or phrases. But it's also possible that the Thais were using this phrase before extensive contact with the West began. To date, I haven't found evidence one way or the other.

I also used to wonder about the underlying syntax of the phrase, and whether I was interpreting that syntax correctly. ทรงพระเจริญ is an example of ราชาศัพท์ [รา-ชา-สับ], or royal language--the vocabulary used when speaking or referring to royalty (or deity--the two concepts are closely intertwined in Thai thought). Typically, in using royal language, ทรง [ซง] is placed before common verbs to make them royal. Many verbs have specialized royal equivalents, though, in which case ทรง is usually unnecessary.

For example, when your average Joe is hungry, he will typically กิน ทาน or รับประทาน; a monk will ฉัน; but royalty will เสวย. Because there's a special word for it, ทรง isn't strictly necessary (but you'll still sometimes see ทรง used with royal verbs anyway). There are thousands and thousands of verbs, however, and using specialized alternatives for every one is difficult. That's where ทรง comes in handy. It still sets the agent of the verb apart as royal, but easily allows for an unlimited number of actions, such as ทรงเป็น ทรงทราบ ทรงเปิดเผย, etc. In this case, เจริญ means to grow or age. We see it in common language in the elaborate phrase เจริญเติบโต.

The part of ทรงพระเจริญ that I wondered about more, though, was the combination of both ทรง and พระ before a verb. I didn't recall seeing this kind of construction very frequently. As it turns out, it's not entirely uncommon, but it seems to appear as ทรงพระราช- [ซง-พระ-ราด-ชะ] more often. Other examples include:

ทรงพระราชทาน = to give/bestow
ทรงพระราชนิพนธ์ = to write/compose
ทรงพระราชสมภพ = to be born

Any of these can be used without ทรง, though, and in fact, apart from ทรงพระเจริญ, I haven't found any other ราชาศัพท์ phrase that uses ทรงพระ without ราช attached. I'm all ears (eyes?) for suggestions.

Now a bit more about the syntax. If this phrase is like its English counterpart, then it's an imperative phrase, or, in other words, a command. That isn't to say it's rude or forceful, simply that when we say "Long Live the King," the grammar of the phrase is commanding him to live long. In Thai, commands can either be marked or unmarked, which result in varying levels of forcefulness. Your basic (non-forceful) command is usually grammatically unmarked. For example, if I want to tell you "come here," I can just say มานี่. This is the form that ทรงพระเจริญ takes. A basic unmarked imperative.

A more forceful command uses the auxiliary จง, although this isn't used much in colloquial speech. Therefore we also see จงทรงพระเจริญ, which in this sense is kind of like the exclamation point on the phrase, if you will.

It's also very common, though, to see ขอ or ขอให้ at the beginning of the phrase, both as ขอ(ให้)ทรงพระเจริญ and ขอ(ให้)จงทรงพระเจริญ. When it's left off, it's still understood as if it were there, as opposed to a literal command. In a segment of her TV program ภาษาไทยวันละคำ*, Dr. Karnchana Nacaskul had this to say about ทรงพระเจริญ:
"อาจใช้เป็นคำถวายชัยมงคลแด่พระบาทสมเด็จพระเจ้าอยู่หัวและสมเด็จพระบรมราชินานาถ มีความหมายไปในเชิงว่า ขอให้มีสุขภาพดี ร่างกายแข็งแรง และมีอายุยืนยาว"

"It may be used as an expression of blessing for His Majesty the King and Her Majesty the Queen, having the meaning along the lines of, 'May you have good health, a strong body, and a long life.'"** [My translation.]
Here Dr. Karnchana limits the phrase to the king and queen, but I've seen it used with, for instance, the prince and Princess Sirindhorn, especially near their birthdays. I also found instances on the web using the phrase with the future heir, the prince's youngest son, who is now just over 2 years old.

Here's a breakdown of Google hits for different versions of the phrase:
ทรงพระเจริญ = 746,000
จงทรงพระเจริญ = 399,000
ขอทรงพระเจริญ = 62,900
ขอจงทรงพระเจริญ = 333,000
ขอพระองค์ทรงพระเจริญ = 276,000
ขอพระองค์จงทรงพระเจริญ = 76,400
ขอให้พระองค์ทรงพระเจริญ = 162,000
ขอให้พระองค์จงทรงพระเจริญ = 23,700

Most of these results are overlapping, since every one of them contains the smaller phrase ทรงพระเจริญ. But it shows us which phrases (in the world of the web) are more or less commonly used.

There are also several variations for what comes after ทรงพระเจริญ (if anything), but ยิ่งยืนนาน is perhaps the most common (ทรงพระเจริญยิ่งยืนนาน turns up 272,000 Google hits).

*This program was a 1-2 minute segment that ran after the news on Thailand's Channel 3 from 1984 to 1994. The entirety of the program was compiled into one volume and published by Chula U Press in 2005.
** ภาษาไทยวันละคำ ฉบับรวมเล่ม, p. 285.

[Hat tip to David for the post suggestion]

RID: Past, Present, Future

This post started as a reply to David's comment on this earlier post about the online version of the พจนานุกรม ฉบับราชบัณฑิตยสถาน, or Royal Institute Dictionary (see also my follow-up post). The reply started getting so long that I decided to give it the full treatment. David writes:

The RID is Thailand's official language standard and getting it on-line was a major step forward in creating accessibility for much of the Thai community. While the text itself is not very expensive at 600 baht for a work of this size, this is a volume not often found in Thai households. In addition portability of this gigantic text is a major issue. While the new Matichon dictionary represents a private industry attempt to update the venerable RDI, the Thai-speaking community really needs the RID as a baseline for the language.

He's right--most Thai households don't have this dictionary. And even if, say, you only were to count homes with a college-educated head of the household, I think most people still don't have it. Some dictionary, yes, but not this one.

One problem is that it's too big, physically. It's not printed on the thin "bible paper" that we are (or at least I am) used to seeing in English dictionaries. Two different members of the Royal Institute gave me different anecdotal reasons for this: first, that they wanted to use Thai paper (the thinner paper isn't produced domestically and must be ordered from abroad), which would certainly be a case of misguided nationalism, and second, that it was simply an error in communicating with the publishing company, นานมีบุ๊คส์ (Nanmee Books). I think the latter is more likely.

The size makes it unwieldy. It's around 1500 pages, but it would be less than half as thick on the thinner paper. Compare it to the Matichon Dictionary, which is around 1000 pages, but only 1/3 as thick (and much lighter and handier to use).

But besides its dimensions, I think the decades of (off-and-on) work on this dictionary represent a hugely squandered opportunity. In his comment, David continues:

Let's hope that the RI will get the expertise it needs in database and internet technology and the assistance of a dedicated staff to allow the dictionary to be disseminated to the on-line Thai community.

Again he's right. But not only do they need the technical know-how to move the dictionary into the digital age, but they need the lexicographic know-how to do a complete and utter reworking of the dictionary. Then it will truly be worthy to be had in the homes of every Thai family (and the Royal Institute can improve its reputation in dictionary reliability and quality--perhaps comparable to how Webster is a household name irrevocably associated with dictionaries in America, almost 200 years after his original dictionary).

I know a bit about the pedigree of the RID, because I wrote my college thesis on Thai dictionaries (which was, I think, also a semi-squandered opportunity--it should be better than it is, though at least it didn't take me decades to complete). This started as an independent research project in early 2005, when I received grant money from my university to research the topic in Bangkok. One part of my research was to interview members of the Royal Institute, and to attend meetings of the dictionary revision committee (คณะกรรมการชำระพจนานุกรม). The Royal Institute has many committees that its 150+ experts are members of. Besides the Institute's general dictionary, there are many specialized dictionaries, their many-volume encyclopedia, etc. But for many, their work is extra-curricular. These people are the cream of the crop in their fields, and they often balance this work with full-time jobs (though many are retired).

Each generation of the book (1950, 1982, 1999) has only been a revision of the previous one (they use the word ชำระ, which means "cleanse"). More than one interviewee made a point of clarifying this for me: whenever I used a phrase like "ทำพจนานุกรม" in an interview, they would point out that the work they were actually doing was to ชำระ, not ทำ.

I've studied the numbers pretty closely (I can post the data if folks are interested), and the increase in the scope and breadth of the dictionary with each generation, overall, is not overwhelmingly significant, though the changes between 1950 and 1982 are greater than the changes between 1982 and 1999 (which, despite its "official" release date, chosen to coincide with the King's 72nd birthday, was actually 4 years past schedule and first published in 2003).

The methodology is the biggest problem. It's a committee of 15-20 people who meet for four hours a week (two two-hour meetings), and who are all busy with other projects (the president of the committee, a nominally retired septuagenarian, was on 11 different committees at the Institute when I interviewed him). They go over the previous incarnation of the dictionary entry by entry in alphabetical order and edit definitions, add and remove entries, or add senses to existing entries. As everything is done by committee, they frequently discuss and argue over words and spellings and etymologies and how to phrase or rephrase the definition. But just as often it seemed the members would defer to the president after a period of debate, apparently not wanting to drag things on too long for any given entry.

Perhaps a bigger problem than the committee method is that there is no (and, so far as I have been able to ascertain, has never been any) systematic attempt to comb the language for all words, phrases and senses to include in the dictionary. To put it frankly, a full-time editor and staff might do in weeks or months what takes the Royal Institute years. But the reason they do it how they do it goes back to the idea of expertise and authority. Unfortunately, this limits the dictionary to the knowledge of a small group of people, experts though they may be.

The first edition of the Oxford English Dictionary took seventy years to finish, and they had thousands of volunteer readers sending in usage examples from works dating back throughout more than 700 years of the language. Thailand's written history is just as long, and modern technology makes it much quicker to do much of this work (and much more accurate).

For me, it boils down to this: If something is going to take years (or decades), let's get an OED out of it, or a Webster's Third out of it. (By the way, these books both have fascinating histories, which you can read about in any of several books on the subjects--for a "textbook" of lexicography in general, I recommend Landau--particularly the 2nd Edition, which includes much on modern corpus-based lexicographic techniques).

I am only so hard on RID because I see its potential. There are many, many, many areas that need improvement, both in the physical book and the online version. I'm rooting for them, but if a couple decades from now the Royal Institute is still plodding along in its set ways, others may have already taken up the torch. In fact, that may be what they need--a Thai dictionary that outpaces their technology and methodology to the point that the necessity of a full overhaul will finally be understood. Ultimately, this would be a good thing for both the language and its speakers.

More issues with RID online

Following up on two previous posts about the new online version of the 1999 Royal Institute Dictionary (พจนานุกรม ฉบับราชบัณฑิตยสถาน พ.ศ. 2542), I have a bit more to report on.

There's something off with their wildcard searching. Specifically, if you use a wildcard at the beginning of the search string, it doesn't work properly.

Compare, for example, the searches for ใจ* and *ใจ. One would expect the first to return results such as ใจร้อน and ใจเย็น. The second, words like น้อยใจ and สบายใจ.

Indeed, the search for ใจ* returns 60 results (I had to count--there's another feature request already--tell me the number of search results!). Including both the words I expected. It appears to be a complete and thorough search of the dictionary.

But *ใจ returns six results: ดลใจ ดีใจ ดีเนื้อดีใจ ดีอกดีใจ ดูใจ ได้ใจ. All from the letter ด. When you put the wildcard at the beginning of the string in the main search window, somehow it only searches this one letter of the alphabet.

There is a (tiresome) workaround--you can click on any letter of the alphabet to do a letter-specific search. I tried น and got 6 results, ช returns 7 results, etc. But it's highly inefficient to go clicking through every letter of the alphabet.

So why is there a problem with this? Well, if you think about it, this is a special kind of search. From the regular search box on the main page, this is the only type of search that requires searching the entire contents of the dictionary, since the desired results could begin with any letter. Any other regular word search allows the site to return quick results, by simply looking at the first letter and kicking the query straight to that letter of the alphabet.
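
To make that guess concrete, here's a minimal sketch in Python of a dictionary indexed by first letter. To be clear, this is only my assumption about how such a site might be organized, not the Royal Institute's actual code. The names first_letter, build_index and search are made up for illustration, and the word list is just a handful of the words mentioned above.

```python
from collections import defaultdict

# Vowels that are written before the consonant they belong to.
LEADING_VOWELS = "เแโใไ"

def first_letter(word):
    """Return the consonant a Thai dictionary would file this word under."""
    # Thai dictionaries alphabetize by the initial consonant, so a preposed
    # vowel such as ใ in ใจ is skipped when choosing the letter section.
    if len(word) > 1 and word[0] in LEADING_VOWELS:
        return word[1]
    return word[0]

def build_index(headwords):
    """Group headwords into one bucket per letter of the alphabet."""
    index = defaultdict(list)
    for word in headwords:
        index[first_letter(word)].append(word)
    return index

def search(index, pattern):
    """Hypothetical search supporting a single leading or trailing '*'."""
    if not pattern:
        return []
    if pattern.startswith("*"):
        # Leading wildcard: a match could begin with any letter,
        # so every bucket has to be scanned.
        suffix = pattern[1:]
        return [w for bucket in index.values() for w in bucket if w.endswith(suffix)]
    if pattern.endswith("*"):
        # Trailing wildcard: the first letter is known, so one bucket is enough.
        prefix = pattern[:-1]
        return [w for w in index.get(first_letter(prefix), []) if w.startswith(prefix)]
    # Plain word: again only one bucket needs checking.
    return [w for w in index.get(first_letter(pattern), []) if w == pattern]

headwords = ["ใจร้อน", "ใจเย็น", "น้อยใจ", "สบายใจ", "ดีใจ", "ดูใจ", "ได้ใจ"]
index = build_index(headwords)
prefix_hits = search(index, "ใจ*")
suffix_hits = search(index, "*ใจ")
```

In this sketch the leading-wildcard branch is the only one that loops over every bucket. If a site forgot that step and only ever looked in one bucket, you would get exactly the kind of one-letter results described above.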

One idea would be to use a simple radio button for search queries, allowing you to search for the whole word (currently the default), or partial word (which would be equivalent to searching for both ใจ* and *ใจ at the same time). Of course, even then it would be nice to still have wildcards available, to be able to search only for one or the other.

Now, a few things about definition search, which, it should be noted, is not full text search. It will only search the text of definitions. (Feature request #2: true full text search.) Now, wildcards may be unnecessary in this sort of search, since it's searching the full text, but I tried it anyway to see what happened. If you use a wildcard at the beginning of the search, it fails entirely:

Microsoft JScript runtime error '800a139a'

Unexpected quantifier

/new-search/meaning-search-all.asp, line 20234

They simply haven't programmed it to expect this sort of search, which would be a simple thing to fix.
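
The error message suggests (though I can't confirm it) that the query is being handed more or less directly to a regular expression engine, where a pattern can't begin with a bare *. Here's a small Python sketch of the same failure and one possible fix: translate the user's wildcard into a proper regex first. The function name wildcard_to_regex is my own invention, and Python's re module is standing in for whatever the site really uses.

```python
import re

def wildcard_to_regex(query):
    """Turn a user query with '*' wildcards into a safe compiled regex."""
    # Escape each literal piece so no user input is treated as regex syntax,
    # then let every '*' stand for "any run of characters".
    pieces = [re.escape(piece) for piece in query.split("*")]
    return re.compile("^" + ".*".join(pieces) + "$")

# Passing the raw query to the regex engine fails, much like the site does.
try:
    re.compile("*ใจ")
    error_message = None
except re.error as exc:
    error_message = str(exc)

# The translated pattern compiles and behaves like a wildcard search.
ends_with_jai = wildcard_to_regex("*ใจ")
starts_with_jai = wildcard_to_regex("ใจ*")
```

With this approach ends_with_jai.match("ดีใจ") succeeds and starts_with_jai.match("ดีใจ") returns None, while the raw pattern never gets past compilation.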

Next, I tried searching ใจ and ใจ* to see if the results are the same. They're not. Searching ใจ* returns false results (i.e. they don't have ใจ anywhere in them). Exactly how many is difficult to assess, because the dictionary is unfortunately programmed in a most unhelpful way. Definition search performs an independent search for every letter of the alphabet, but only returns the top five most relevant results. That's right--it doesn't give you results alphabetically, but rather by relevance rank (which are also included in all their sixteen-decimal-point, screen-real-estate-wasting glory). And there's no option to change or tweak this behavior (especially the 5-result-per-letter limit--talk about handicapping your own product).
Here's a screenshot:

So back to the false results when doing a definition search for ใจ*. Of the top five results for ก, only 3 actually contain ใจ. But of the false results, both contain the vowel ใ. It appears that the presence of the asterisk is causing it to analyze other characters that affect the relevance ranking, since * can be any character or string of characters. The search algorithm should be told either (a) to ignore * in the case of definition searches and just search ใจ, or (b) disqualify any results that don't contain the full text string. The math behind relevance ranking is getting in the way of the usability.
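
Either fix is only a few lines. Here's a rough Python sketch of both ideas: strip the asterisks to get the literal text, then throw out any ranked hit whose definition doesn't contain it. The function names and the sample hits are invented for illustration (the definitions are placeholders, not real RID entries), and I'm assuming the ranking step hands back headword, definition and score together.

```python
def literal_terms(query):
    """Option (a): ignore '*' and keep only the literal text of the query."""
    return [term for term in query.split("*") if term]

def filter_hits(query, ranked_hits):
    """Option (b): drop ranked hits that lack the full search string."""
    terms = literal_terms(query)
    return [
        (headword, definition, score)
        for headword, definition, score in ranked_hits
        if all(term in definition for term in terms)
    ]

# Made-up ranked results: (headword, definition text, relevance score).
# The second definition contains the vowel ใ but not the string ใจ.
ranked_hits = [
    ("entry-1", "xxx ใจ xxx", 0.91),
    ("entry-2", "xxx ใน xxx", 0.88),
    ("entry-3", "xxx ใจ xxx ใจ", 0.85),
]

kept = filter_hits("ใจ*", ranked_hits)
```

The relevance order of the surviving hits is untouched. The filter only removes entries that could never have been real matches.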

I'll be shooting off an email to the Royal Institute about this as soon as I get a second. In the meantime, drop a comment if you have any other additions (or subtractions) regarding my analysis.

On Vonnegut, bookstores, and translating titles

Today I made what I consider a good find. I enjoy collecting Thai translations of English-language literature, and I stumbled across a book of short stories by Kurt Vonnegut at the bookstore. I'm past my book-buying budget for the foreseeable future, but I bought it anyway. Hi ho.*

Thai bookstores are hit and miss. For one thing, they suffer from disorganization resulting from poor classification. For another, most are too small and poorly stocked. But these are symptoms of a larger problem: the reading culture in Thailand is anemic, and supply matches demand. From what I've seen, it seems books here are rarely printed in runs of more than a few thousand, though some well-known award winners are printed dozens of times over the years. That still only begins to add up to the first run of a new book by a major publisher in the States, where 50,000 is a cautious number.
The psychological upside for booklovers is that this gives book shopping in Thailand a treasure hunt feel. I enjoy browsing (and buying) literature the most, both translated and native, so whenever I pass a bookstore I haven't been to before, I try to stop in at least briefly. You never know what rarity you'll find, since there's no set of "classic literature" that every Thai bookstore is guaranteed to have. Having any classic literature is rare enough as it is. But if you let me loose at, say, the Chula Book Centre, I've been known to spend several hours there at a stretch, usually the lone foreigner in a sea of Thai bookworms, albeit a bit too big to comfortably browse amongst the store's narrow aisles. You'll most likely find me hovering there around the Thai language section, the dictionary section, the Western-literature-in-translation or the award-winning-Thai-literature section.

As an example of the "treasure hunt" style of bookshopping, over the past few months I've been purchasing books in the Lemony Snicket series in Thai, since I fully plan to see that my kid(s) read for leisure (my wife is pregnant with our first child, as it happens), and I want to have a good library. I now have books 1-7 (of 13), and I purchased those at 5 different stores. I haven't picked up the rest of the set yet because I have to budget my bookbuying, but I have yet to find any store that has all 13 in stock (although the B2S at Central Lad Phrao came close).

Back to Vonnegut. The stories are all from the 1968 collection Welcome to the Monkey House. It's only nine of the twenty-five stories from that book, but nine is better than none. It's always interesting to look at how titles are rendered in a foreign language. Rarely are they translated literally. Here are the titles, their translations, and the literal English meaning of the Thai, from a few of the Vonnegut short stories:
  • Harrison Bergeron (แอกแห่งความเสมอภาค = The yoke of equality)
  • Adam (กำเนิดอดัม = Birth of Adam)
  • The Manned Missiles (มนุษย์อวกาศ = Outer space humans)
  • Next Door (เพื่อนบ้านเรือนข้าง = Next-door neighbor)
And here are the Thai titles of a few English books:
  • The Old Man and the Sea (เฒ่าผจญทะเล = The old man braves the sea)
  • Charlotte's Web (แมงมุมเพื่อนรัก = Spider, beloved friend)
  • Charlie and the Chocolate Factory (โรงงานชอกโกแล็ต = Chocolate factory)
  • Forrest Gump (โลกใบใหม่ของฟอร์เร้สท์กั๊มป์ = Forrest Gump's new world)
  • Black Coffee (ฆาตกรไม่กลับใจ = The impenitent killer)
Translations of movie titles (in both directions) are also intriguing. But that's a post for another day.

Hi ho.

*This book was published in October 2006, so it wasn't meant to coincide with Vonnegut's death this past April. I just didn't find it until now, since the publisher, นาคร, is pretty small. It's not the first time Vonnegut has been translated into Thai, though the only previous instance I can find mention of is the story Tomorrow, Tomorrow and Tomorrow in the 1981 sci-fi short story collection ห้องอนาคต. S.E.A.Write Awardee Prabda Yoon (ปราบดา หยุ่น) is also translating Vonnegut's last book, A Man Without a Country, under the title ชายผู้ไร้ประเทศ on his blog--see, so far, chapters 1 2 3 4 5 and 6. Good for him. [Update: Chapters 7 and 8 are now up.]

Problems with RID Online

While an improvement in some ways over the previous (now defunct) incarnation of the Royal Institute Dictionary online, the newly-released RID 1999 online for now may have to remain in the "not ready for primetime" category.

I noticed a while back that it was missing large chunks of two letters of the alphabet: ต and ส.

ต Complete up through the entry ตา ๒. That means entries from ตาก to ไต้หวัน are missing. If you compare it to the paper version, that's 28 pages of the dictionary, or roughly 400 entries gone (one entry may include several or even dozens of subentries).

ส This one ends at สาด ๒. So we're missing from สาต to ไสว, a whopping 64 pages, or upwards of 1000 missing entries.

So what gives? I emailed the Royal Institute about this over a month ago, and no word yet. In fact, since I first noticed it, now the letter ต doesn't work at all (any word beginning with ต will return no results, and you can't browse the letter, either), while that large chunk of ส is still AWOL. Let's hope this means an update is on the way.

Etymologist 5: โลกาภิวัตน์

Globalization refers to the modern trend toward increased interdependency, connectivity and integration across the world, be it economic, political, or cultural. The world is becoming smaller, in other words, with our increased ability to communicate, travel, and otherwise interact with the rest of the globe.

The term for this in Thai, as coined by the Royal Institute, is โลกาภิวัตน์ (which has replaced an earlier term, โลกานุวัตร).

This word is an example of สนธิ in Thai. สนธิ (aka sandhi*) is a blanket term referring to various phonological processes that can occur when you combine two words into a compound. There are various types of sandhi: tone sandhi, vowel sandhi, etc. In the case of โลกาภิวัตน์, we see vowel sandhi, where the vowel changes when you combine the two roots.

The component parts here are โลก and อภิวัตน์. So why the long vowel in โลกา?

Thai words from Indic origins have a standalone form and a "combining form." This usually consists of an extra vowel at the end, that is hidden or suppressed in its standalone form (or when it's at the end of a compound). Sometimes it's written but not pronounced, as in ประวัติ, whose combining form is (usually) pronounced [ประ-วัด-ติ].

There are actually two types of compounding that Thai borrows from Indic languages (i.e. Pali and Sanskrit). Besides สนธิ, there is also สมาส [สะ-หมาด]. The difference is that in สมาส, the existing vowels of the component words don't change. (But we do get additional syllables from the combining form).

For โลก, the combining form is [โลก-กะ]. The word โลกทัศน์ "worldview" is an example of สมาส. โลก+ทัศน์ = [โลก-กะ-ทัด]. Because it is a สมาส compound, โลก takes its combining form.

Back to โลกาภิวัตน์. With สนธิ, you start with the combining forms of the component parts. Vowel sandhi in Thai has certain rules for what happens to two given vowels when you combine them. When you combine the two short อะ vowels (the อะ vowel at the end of โลก in its combining form, and the short vowel อะ at the beginning of อภิวัตน์) they become the long vowel อา. โลกกะ+อภิวัตน์ = โลกาภิวัตน์. This process also breaks โลก into two syllables, and อา becomes the vowel of the second syllable: โลกา. This, in turn, affects the tone. Notice that in โลกาภิวัตน์, โล-กา is two syllables, both mid tone.
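
For the programmers in the audience, the merge just described can be sketched in a few lines of Python working directly on the spelling. This is a toy, not a full account of Thai sandhi: the function name join_sandhi is made up, and the little rule table only covers อะ+อะ (described above) plus two more patterns that I'm inferring from other สนธิ compounds such as เดชานุภาพ and ราชูปถัมภ์. Anything else raises an error.

```python
SARA_AA = "\u0e32"      # า  long a
SARA_U = "\u0e38"       # ุ  short u
SARA_UU = "\u0e39"      # ู  long u
THANTHAKHAT = "\u0e4c"  # ์  silencing mark, dropped in the combining form
O_ANG = "\u0e2d"        # อ  silent carrier for an initial vowel

# Vowel that opens the second word -> vowel written in the compound.
# "" stands for the unwritten short a of a bare อ.
MERGED = {"": SARA_AA, SARA_AA: SARA_AA, SARA_U: SARA_UU}

def join_sandhi(first, second):
    """Join two Indic-origin words by merging the vowels at the seam."""
    if not second.startswith(O_ANG):
        raise ValueError("this sketch only handles second words that begin with อ")
    rest = second[1:]
    nxt = rest[:1]
    if nxt in (SARA_AA, SARA_U):
        opening = nxt
    elif nxt and "\u0e30" <= nxt <= "\u0e3a":
        # Some other written vowel follows อ; no rule for it here.
        raise ValueError("no rule in this sketch for that opening vowel")
    else:
        opening = ""  # bare อ, i.e. the short a of อะ
    tail = rest[len(opening):]
    stem = first.rstrip(THANTHAKHAT)
    # The final short a of the first word and the opening vowel of the
    # second collapse into one long vowel attached to the stem's last consonant.
    return stem + MERGED[opening] + tail

globalization = join_sandhi("โลก", "อภิวัตน์")
```

Feeding it โลก and อภิวัตน์ gives back โลกาภิวัตน์, with the อ of the second word gone and า sitting on ก.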

To break it down further, อภิวัตน์ is a compound of อภิ (meaning ยิ่ง or วิเศษ) and วัตน์ (meaning ความเป็นไป or ความเป็นอยู่). The combined meaning is การเข้าถึง, การแผ่ถึง, or การเอาชนะ. But in the case of อภิวัตน์ there is no sandhi going on, because no spelling/pronunciation change is needed. The compound อภิวัตน์ itself is formed by สมาส.

So the literal meaning of the entire word is "conquering the world" or "spreading throughout the world."

Other examples of สนธิ:
  • เดชานุภาพ (great power/might) is from เดช (power) + อานุภาพ (power, greatness)
  • ราชูปถัมภ์ (royal patronage) is from ราช (king) + อุปถัมภ์ (patronage, support)
  • พจนานุกรม (dictionary) is from พจน์ (word) + อนุกรม (order, or sequence)
  • มกราคม (January) is from มกร (dragon) + อาคม (arrival)
  • ธันวาคม (December) is from ธนู (bow) + อาคม (arrival)
Other examples of สมาส:
  • พุทธศาสนา [พุด-ทะ-สาด-สะ-หนา] (Buddhism) = พุทธ [พุด] (Buddha) + ศาสนา (religion). Cf. the non-compound form ศาสนาพุทธ [สาด-สะ-หนา-พุด]
  • เศรษฐกิจ [เสด-ถะ-กิด] (economy) = เศรษฐ [เสด] (literally excellent or best, in this sense, money) + กิจ (work, business).
With a bit of practice, it gets easier to unravel long words into more easily digestible parts.

[Sources: RID99, จดหมายข่าวราชบัณฑิตยสถาน ปีที่ ๔ ฉบับที่ ๔๑, ตุลาคม ๒๕๓๗.]
* It comes from Sanskrit, meaning "put together" or "combine." This root is also seen in the Thai word ปฏิสนธิ, meaning conceive (a child). ปฏิ+สนธิ = the result of combination, hence conceive.