March 26, 2008

Comparative Tai Source Book

Lately I've been reading William J. Gedney's Comparative Tai Source Book, a new publication from Thomas John Hudak. Gedney, who died in 1999, is a giant in the study of the Tai languages. His 1947 PhD thesis, Indic Loanwords in Spoken Thai, is still a good read more than 60 years later (available here).

This volume brings to completion a book of comparative Tai originally planned by Gedney. Hudak has organized Gedney's notes on 1159 cognate Tai words, making it easy to quickly compare the various cognate forms for a given word. Hudak has also added a chapter for each of the major branches of (Gedney's) division of the Tai language family: Southwestern, Central, and Northern. These chapters give detailed information about the phonology of each of the languages cited in the book.

Gedney collected comparative data on 19 Tai languages:

Southwestern Tai
  • Siamese (Standard Thai)
  • White Tai (Tai Khaw)
  • Black Tai (Tai Dam)
  • Shan
  • the Tai dialect of Nong Khai
  • Lue of Chieng Hung
  • Lue of Muong Yong
  • the Tai dialect of Chiengmai
Central Tai
  • the Tai dialect of Lei Ping
  • the Tai dialect of Lungming
  • the Tai dialect of Western Nung
  • the Tai dialect of Bac Va
  • the Tai dialect of Lungchow
  • the Tai dialect of Ping Siang
  • the Tai dialect of Ning Ming
Northern Tai
  • Yay
  • Saek
  • the Tai dialect of Wuming
  • the Tai dialect of Po-ai
Here's a map from the book of the distribution of these languages. Note that it's a map of just the 19 Tai languages Gedney collected data for, not all Tai languages. Obviously there are Tai speakers in Laos, too, among other places.


In reading through the book today, I discovered my new favorite word (okay, well, my favorite word for today). It's the Shan cognate of the word คา /khaa/ in Thai. The Thai meaning is 'stuck'. The Shan meaning is significantly more interesting. Here's the entry from the book:
0497 - stuck, A4
SW - S khaa¹; W, B kaa⁴; Sh kaa⁴ 'to escape, as an animal pierced by any weapon, and carrying the weapon in its flesh'; LNK khaa⁶; LMY kaa⁴
CN - LP khaa⁴; LM kaa⁴; WN kaa⁴, caa⁴; PS, NM kaa⁴
N - Y ka⁴; Sk khaa⁴
In case you missed that: to escape, as an animal pierced by any weapon, and carrying the weapon in its flesh. Granted, the data is 50 years old. I wonder if that word is still used much these days. Time to go ask my Shan-speaking friend.

March 23, 2008

Same same, but different

Sometimes, through some twist of semantic fate, a word can acquire two senses with opposite meanings. Like 'sanction', which can mean to approve or condemn. Or 'fine', meaning merely acceptable or exceptionally good. There's a word for these. They're called contronyms, or alternately auto-antonyms.

A couple of weeks ago, there was a brief discussion on ThailandQA about the auto-antonymy of the Thai word รื้อ, which can mean either to tear down, as in รื้อตึก 'raze a building', or to bring up, as in รื้อเรื่องเก่า 'revive old matters'.

Another auto-antonym occurred to me recently: ป้องกัน. RID defines it thusly (my translation):

ก. กั้นไว้เพื่อต้านทานหรือคุ้มครอง.
/pɔŋ kan/
v. to block, in order to oppose or protect.

So when you ป้องกัน something, you are either opposing it or protecting it. Quite different meanings. You can ป้องกันโรค 'protect against disease', or you can ป้องกันตัว 'protect yourself'. Here's a real life example of what can happen with careless translation of what I suspect was the word ป้องกัน in the source material. From a press release of the Public Relations Department:

Deputy Secretary-General of the Office of the Narcotics Control Board (ONCB) Pittaya Jinawat revealed points made in a tactical meeting with computer game related businesses focused on protecting and guarding against the pervasion of narcotics and also game addiction.


Game developers have affirmed that they are ready to produce more positive games that are age appropriate but also evoke family participation, which they believe will be the best way to safeguard deviant behavior. [Emphasis added]

I don't have the original Thai, so it's only a guess that the original word is ป้องกัน. But clearly they meant 'safeguard against deviant behavior'. What a difference a word makes.

Thinking more about it, there are other words that are sort of like auto-antonyms, but not exactly. For example, เหม็น. This word most commonly means 'to have an objectionable smell', as in ตัวคุณเหม็นบุหรี่ไปหมด 'you reek of cigarettes'; but it can also mean 'to find a smell objectionable', as in ขออนุญาตสูบบุหรี่ คุณเหม็นรึเปล่า 'Mind if I smoke? Does (the smell) bother you?'

It works similarly for หนัก, 'heavy/to find heavy': กล่องนี้หนักมาก 'this box is really heavy', vs. ยกไปเท่านี้ก่อน กลัวคุณจะหนัก 'that's enough to carry, I don't want it to be too heavy for you.'

There's something different going on with เหม็น and หนัก than with ป้องกัน, but I haven't thought of (or come across) a good way to classify it. Any ideas?

March 21, 2008

Broken transitivity

I noticed something about the verbs แตก /tɛɛk/ and หัก /hak/, which both mean 'break'. It's about transitivity.

For those of you who need a simple refresher course, a transitive verb is one which requires a direct object. You have to do it to something. For example, I can lift a box, but I can't just lift (unless it's understood from the context, but that's different).

An intransitive verb is one which does not require a direct object (but you may be able to specify an indirect object using a preposition). For example, I can complain, I can complain to you, but I can't complain you.

Etymologically, หัก is transitive (e.g. I broke the lamp), while แตก is intransitive (e.g. the lamp broke). English uses the same verb in both senses. The English verb is ambitransitive, in linguistics-speak. Thai often has two words where English has just one. Where English has he boiled water vs. the water boiled, Thai has เขาต้มน้ำ vs. น้ำเดือด (ต้ม /tom/ is transitive and เดือด /dʉat/ is intransitive).

However, หัก has come to be used quite commonly as an intransitive verb. While you can still หัก something, more often you ทำ X หัก ('cause X to break'). This matches the usage of แตก--typically, you ทำ X แตก (also 'cause X to break'). The most common transitive uses of หัก still around are figurative, like หักหลัง, to betray, literally to break (someone's) back; also หักคอ, หักใจ, หักอก, etc. หักอก /hak ok/ is interesting because หักอก means 'break (someone's) heart', while อกหัก /ok hak/ is just as common, meaning 'heartbroken'. This would've been a good one for my Semantic Switcheroo post. Google even turns up a small number of hits for ทำอกหัก 'cause (someone's) heart to break', too. It would appear that หัก is flirting with becoming entirely intransitive.

On the flip side, the traditionally intransitive verb แตก has developed some transitive senses. One in particular seems to be influenced by English. แตกแบงค์ /tɛɛk bɛŋ/ means to 'break a bill', one of the extremely vital services that 7-Eleven provides in Thailand. If you're about to get into a taxi and all you have is a 1000 baht bill (or even a 500 baht bill), you'd better go แตก that แบงค์ at Seven first. In a similar vein, I've also seen แตกวง /tɛɛk woŋ/ meaning 'to break up (a band)', as in Aerosmith ทำท่าจะแตกวง 'Aerosmith is acting like they're going to break up'. The intransitive form is วงแตก /woŋ tɛɛk/, as in Potato วงแตกแล้ว 'The band Potato broke up'.

These are limited uses of แตก as a transitive verb (there are possibly more, like แตกแถว and แตกฝูง, meaning to be different from the pack, or non-conformist), but they're very interesting developments nonetheless.

A kind of transitivity switcheroo.

March 18, 2008

Thai musician sampler: Loso โลโซ

Seems about time for another music video sampler. If you missed it, see the last installment, on the band Silly Fools. This time up is the band Loso (โลโซ).

The name of the band is a play on words. The rich and famous in Thailand are called ไฮโซ (hi-so), which the Thais clipped from the English phrase 'high society'.* To reflect their lower class roots as children of rice farmers, the band chose the name โลโซ.

For me, their best work is 2001's ปกแดง, a.k.a. the Red Album. And in browsing YouTube for music videos, I realized most of my favorite songs are from that one album, which turned out to be the band's last. They broke up in 2002, and Sek Loso embarked on a solo career, while the other two members of the band formed a new band, Fahrenheit.

เคยรักฉันบ้างไหม - Thai and English lyrics - from ปกแดง (2001)
I'm a sucker for rock songs with strings. This probably goes back to my Beatle-philia, and songs like Eleanor Rigby and A Day in the Life.

พันธุ์ทิพย์ - Thai and English Lyrics - from ปกแดง (2001)
Not the greatest song, mostly powered by its catchy chorus. It serves as a laundry list of the big shopping centers in Bangkok, because the whole premise is that the singer doesn't want to go to Pantip Plaza, since an ex-girlfriend who broke his heart has opened up shop there.

ฝนตกที่หน้าต่าง - Thai and English lyrics - from (2001)
A nice little ditty on the softer side.

จักรยานสีแดง - from the soundtrack of the film Red Bike Story (1997)
And finally, an earlier song from the movie that helped make Tata Young famous.

* At least, I've never heard 'hi-so' anywhere outside Thailand, and Google doesn't readily turn up evidence to the contrary. ไฮโซ is more about a specific way of life than actual wealth or status, however. The true ไฮโซ are the minuscule but uber-conspicuous group which consumes only the best and most expensive of everything, and thus drives popular taste in all matters of fashion and lifestyle. Pretty much the entire middle class stretches its means to try to have the appearance of being ไฮโซ. (The wannabe hi-so are called ไฮซ้อ, though I don't know where ซ้อ comes from here, other than that it's a play on โซ.)

March 12, 2008

Loanwords 4: English loanwords in 1892

There's something irreconcilably nerdy about reading the dictionary. What can I say, I like dictionaries. It's not like I read them cover to cover--I browse. Electronic dictionaries are good for many things, but I love the simple serendipity of flipping through a paper dictionary and finding great new words, or making unexpected discoveries.

I also have a thing for old dictionaries. Take my digital critical edition of the first Thai-English dictionary as proof of that. It's based on a mid-19th Century manuscript of unknown provenance in the British Museum. I gradually typed out the 500-page document over the course of 2006. It was roughly equal parts fascinating and tedious. I got pretty good at reading the chicken-scratch English. The Thai is much easier to read, ironically, despite a few orthographic quirks of the era.

I typed it up from a digital scan made of a microfilm copy of the manuscript. Since old dictionaries are so hard to find in the flesh--er, paper--a decent scan will do. And thanks to such scans I've been able to examine many early Thai dictionaries. No doubt, without this technology I never would've gotten to read through them closely even if I did find them in some library.

Recently I've been enjoying E. B. Michell's 1892 work A Siamese-English Dictionary, For the use of students in both languages. The book is in the public domain, and downloadable from Google Books within the United States, or viewable on SEAlang.
I don't know much about Michell other than what the title page says: "M.A., Barrister-at-Law, late Legal Adviser to His Siamese Majesty's Government." The Majesty in question here is King Chulalongkorn, or Rama V, who reigned from 1868 to 1910. Google tells me Michell's full name is Edward Blair Michell, and that's the extent of my knowledge of him.

I posted last month about finding 'copy' in this dictionary, spelled กอปี้, whereas today it's usually spelled ก๊อปปี้. As it turns out, there are a number of other loanwords that Michell says come from English. And interestingly, all of them are still used:

ไปรเวต = private; I've only seen this used nowadays to refer to casual attire. I first encountered it when my wife and I had pictures taken before our wedding. We had pictures taken in a few different outfits, including ชุดไปรเวต. This usage must be uniquely Thai, because 'private outfit' doesn't sound like anything I'd normally have my picture taken in.

แปลน = plan; I still hear this used as an alternative to แผน. I don't know the etymology of แผน, but it seems to be preferred as the native (or more native sounding?) alternative to แปลน.

แหม่ม [แหฺม่ม] = Ma'am; this has gone from referring to a woman Westerner to being a very popular girl's nickname.

= office; it's even still spelled this way, with the final ศ. You can usually spot a loanword as being of 19th Century origin by the presence of these less common letters usually reserved for loans from Pali and Sanskrit. Two other examples are โปลิศ 'police' and อังกฤษ 'English'.

บ๋อย = boy; this specifically refers to a servant boy or a waiter. I still hear this around.

บิล = bill; everybody knows this one, don't they? Pronounced 'bin' in the typical Thai way, and nowadays usually paired with 'check' เช็ค as เช็คบิล, used to ask for the check at a restaurant. In this context, 'check' and 'bill' are actually two words for the same thing. I would hypothesize that if 'bill' was already in the language, and so was 'check' in the verb sense 'to check, to examine', then English 'check please' in the restaurant setting prompted the birth of the quirky Thai-ism 'check bill', which in the Thai context literally means to check the bill.

แบงก์ = bank, meaning the financial institution; more commonly spelled แบงค์ nowadays.

ปิ่น = pin; used for one's hair. Immortalized in Thai in phrases like ปิ่นเกล้า pin klao, a pin for holding the hair in place when pulled up on the crown like a bun.

ฟุด = foot (the unit of measure); nowadays spelled ฟุต, reflecting the final t of the English spelling.

มรสุม [มอ-ระ-สุม] = monsoon; I don't think this is actually from English as Michell claims. Etymonline traces its route into English as Arabic > Portuguese > Dutch > English:
"trade wind of the Indian Ocean," 1584, from Du. monssoen, from Port. monçao, from Ar. mawsim "appropriate season" (for a voyage, pilgrimage, etc.), from wasama "he marked." When it blows from the southwest (April through October) it brings heavy rain, hence "the rainy season" (1747).
I'd say it's quite plausible that it came into Thai from Arabic, perhaps through Persian (which has many Arabic loans), since Thai has other words of purported Persian origin, like องุ่น 'grape', กุหลาบ 'rose', and กะหล่ำ 'cabbage'. Also notice that 'morasum' is slightly closer to 'mawsim' than to 'moncao' or 'monssoen' (but not conclusively so). If it's a newer loan, it may have come through Portuguese, which gave Thai at least one other early loanword, สบู่ 'soap'.

March 9, 2008

Google's new trick

Sometime last year, Google started segmenting Thai text in webpages it indexes, dramatically increasing the number of hits for pretty much every word out there.

Well, Google has a new trick.

I noticed this week (though I can't confirm exactly when it started) that when you search for a given word, Google returns words with different tone marks from the word you queried. This may be a good thing or a bad thing, depending on what you want to use Google for.

Fortunately, you can get around it by using quotes around your search term. The only exceptions I've noticed to this are non-dictionary words, for which Google still seems to think it knows better than the user.

A few test cases (feel free to replicate these at home):
  • I searched กึง (an onomatopoeic word for a loud noise). I was fishing for กึ่ง ('half'). Of the 100 results on the first page, only one contained my actual target. The rest were the much more common word กึ่ง.
  • I searched ดิ้น ('wriggle, struggle'), fishing for ดิน ('dirt'). Similar story to above.
  • I searched กิ๊ฟ (a non-dictionary word, seen in loans like 'gift shop'). Returns a lot of hits for things like Giffarine (spelled กิฟฟาริน). But this is where it gets weird. Putting quotes around it not only doesn't restrict the search to only my exact query, it actually increased the number of reported hits. I have no idea why.
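For the curious, here's a rough Python sketch of what tone-insensitive matching might look like. To be clear, this is only my guess at the general idea, not Google's actual method, and the function names are made up for illustration. It assumes the only thing being ignored is the four Thai tone marks (U+0E48 through U+0E4B).

```python
# Thai tone marks: mai ek, mai tho, mai tri, mai chattawa (U+0E48..U+0E4B).
TONE_MARKS = "\u0e48\u0e49\u0e4a\u0e4b"
_DROP_TONES = str.maketrans("", "", TONE_MARKS)


def strip_tone_marks(text: str) -> str:
    """Return text with the four Thai tone marks removed."""
    return text.translate(_DROP_TONES)


def same_ignoring_tones(query: str, candidate: str) -> bool:
    """Hypothetical matcher: True if the strings differ only in tone marks."""
    return strip_tone_marks(query) == strip_tone_marks(candidate)


if __name__ == "__main__":
    # The test cases from the list above.
    print(same_ignoring_tones("กึง", "กึ่ง"))
    print(same_ignoring_tones("ดิ้น", "ดิน"))
    # With the tone mark gone, the query is a prefix of the brand name.
    print("กิฟฟาริน".startswith(strip_tone_marks("กิ๊ฟ")))
```

Under this assumption, กึง and กึ่ง collapse to the same string, which would explain why the more common word crowds out the rarer one in the results.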
Another thing I've meant to write about is Google's spelling suggestions. One of the great things about search engines these days is the "did you mean..." feature, where they harvest the power of the gigantic language corpus that is the web to predict what you probably meant to search for if you misspell a word. I know Google does this for Thai, too (simple test search: ประมาน will return the suggestion ประมาณ). But it only offers suggestions for Thai search terms on the Thai version of their site. And since I usually use the main site, not the Thai one, I have no idea how long they've been doing this. For users of the Thai version out there, is this a longstanding feature or something new?

March 8, 2008

Semantic Switcheroo

The last post about Thai palindromes gave me another idea. Often when you put two Thai words together it means one thing, and reverse them and it means something else entirely. I thought it would be interesting to look at some examples of how different (or not) these meanings can be:

หมายเลข /maai leek/ vs. เลขหมาย /leek maai/
Both of these mean number, but the usage is slightly different. For one thing, you've got the word หมาย in there, which means aim, mark, intend. Both of these are a kind of assigned or ordered number.
  • หมายเลข is the formal word for number in 'telephone number' หมายเลขโทรศัพท์ (the colloquial word being เบอร์, coming from the last syllable of the English word 'number'). RID defines it as เลขลําดับ 'ordered number', and gives the examples of telephone number and contestant number one. The Thai word for First Lady (as in Laura Bush, but not for long) is สตรีหมายเลขหนึ่ง. Note that this is different from the typical ordinal numbers, which are ที่+number, e.g. ที่หนึ่ง 'first'.
  • เลขหมาย is most commonly heard in that obnoxious little soundbite, เลขหมายที่ท่านเรียกไม่สามารถติดต่อได้ในขณะนี้ 'The number you have dialed cannot be connected', which usually means the person's cell phone is off. But wait, you ask, shouldn't this be หมายเลข, because it's a phone number? You'd think so. That's where the switcheroo comes in. เลขหมาย is defined in RID as จํานวนตัวเลขที่กําหนดไว้ 'an assigned number', so maybe that's why. And now that I think about it, I think I've heard the above 'cannot be connected' message with both หมายเลข and เลขหมาย. Overall, this switcheroo is a bit nitpicky.
ดีใจ /dii jai/ vs. ใจดี /jai dii/
This one is a classic illustration of the difference between the two main categories of 'heart words' in Thai.* Most often, word+ใจ is a temporary emotional state, while ใจ+word indicates a more permanent personality trait. This is probably because ใจ+word is really ใจ+modifier, which is to say, it describes a characteristic of your heart/mind.
  • ดีใจ is perhaps best translated as 'glad', but often 'happy' works, too. 'I'm so glad you called' ดีใจจังที่คุณโทรมา
  • ใจดี is often translated as 'kind' or 'generous'. Literally, to have a good heart. Pretty straightforward.
หน้าเสีย /naa sia/ vs. เสียหน้า /sia naa/
  • หน้าเสีย means to look disappointed, crestfallen, etc. The หน้า here is literally the face, and เสีย covers a broad range of negative things, from รถเสีย 'broken-down car' to น้ำเสีย 'wastewater' and อาหารเสีย 'spoiled food'.
  • เสียหน้า means to 'lose face', that classic Asian concept. And 'lose face' works as a decent literal translation, too. The word เสีย is seen in related senses in expressions like เสียพ่อแม่ 'to lose one's parents (i.e. they died)', เสียเงิน 'to spend money',** and เสียตัว 'to lose one's virginity' (literally 'lose your body/self').
There are plenty more where these came from, so I'll probably revisit this topic again soon. If you think of any other interesting ones, leave a comment.

* If you prefer fancy linguistics terms, 'heart words' are a type of psycho-collocation (PDF alert).
** But notice that it doesn't mean 'lose money', which would be either ทำเงินหาย, if you literally misplaced some cash, or ขาดทุน, if you lost money in a business venture.

March 5, 2008

Thai palindromes

Jason sent me a link to an interesting blog post on Thai palindromes.
As you may recall from grade school, a palindrome is a word or phrase that reads the same forwards and backwards. To put it another way, it is a symmetrical word or phrase. Very easy ones include mom and dad; things get a little trickier with racecar, snack cans, or Dr. Awkward. Longer palindromes can be a lot of fun. Some of the famous ones include what might have been the first words of the first man: Madam, I'm Adam; or a potential title for a book about Teddy Roosevelt: A man, a plan, a canal, Panama.

With its non-linear orthographic complexities, Thai isn't well-suited for this type of wordplay, at least not as we define it in English. There are some palindromic Thai words, like กนก 'gold'. And it's possible to make things up that are readable, but don't make much sense as a phrase: e.g. บริการรากริบ. I did come up with one very simple one I'm pleased with. For those who dislike snoring, กรน: นรก. But the majority of Thai words are just nigh impossible to force into this mold, particularly words with complex vowels and การันต์ silent letters.
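If you want to test candidates yourself, here's a small Python sketch. It's my own rough approach, not anything standard, and the function names are invented for illustration. The assumption is that a Thai palindrome should be judged on written units as the eye sees them, so vowels and tone marks that sit above or below a consonant get glued to that consonant before reversing. Leading vowels like เ and แ are simply treated as their own units, which is one reason most Thai words fail the test.

```python
import unicodedata


def clusters(text):
    """Split text into letters, attaching combining marks to the letter before them."""
    units = []
    for ch in text:
        category = unicodedata.category(ch)
        if category.startswith("M"):
            # Above/below vowels and tone marks ride on the previous letter.
            if units:
                units[-1] += ch
        elif category.startswith("L"):
            units.append(ch)
        # Spaces and punctuation are ignored.
    return units


def is_visual_palindrome(text):
    """True if the sequence of written units reads the same in both directions."""
    units = clusters(text)
    return units == units[::-1]


def is_codepoint_palindrome(text):
    """Naive check: reverse the raw characters, ignoring spaces and punctuation."""
    chars = [ch for ch in text if unicodedata.category(ch)[0] in "LM"]
    return chars == chars[::-1]


if __name__ == "__main__":
    for phrase in ["กนก", "กรน: นรก", "บริการรากริบ"]:
        print(phrase, is_visual_palindrome(phrase), is_codepoint_palindrome(phrase))
```

The interesting case is บริการรากริบ: it passes the cluster-based check but fails the naive character reversal, because the vowel in ริ is stored after its consonant and ends up attached to the wrong letter when the raw string is flipped.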

So it's no surprise that the Thai palindromes aren't really palindromes. They're sentences (verses, really) that use the same syllables twice, the second time in reverse order from the first. Like this:


True palindromes are a type of orthographic wordplay, since they tend to have totally different word boundaries in the second half, and thus different meanings. These Thai 'palindromes' are restricted to different senses of the same syllables in the two halves. Notice that the example above tries to get around that fact by using ทุกข์ 'suffering' in one half, and ทุก 'every' in the other.

Here's another chunk:


They are very much in the style of traditional Thai poetry, with heavy alliteration and internal rhyme. I'm severely underpracticed with Thai poetry, so I find them a bit opaque, but I do think this is an interesting use of language.

I did a little digging around Wikipedia, and found a type of figure of speech that seems related to this Thai usage, perhaps more closely than palindrome: it's called antimetabole. For example:

Mankind must put an end to war or war will put an end to mankind.

Of course, antimetabole only requires that certain words be repeated in reverse order. So the Thai palindromes share features of both English palindromes and English antimetabole, but neither rhetorical device is a perfect fit.

As another blogger linked from the page above calls them, they're พาลินโดรมแบบไทย ๆ 'Thai-style palindromes'. So we'll leave it at that. Pretty cool.