Showing posts with label language. Show all posts

November 1, 2008

More free Thai, Khmer, and Vietnamese language courses

I've posted before about FSI-language-courses.com, a site dedicated to disseminating language courses prepared by the United States government (and thus in the public domain).

The site's owner seems to be AWOL, but one user on the site has continued his work of tracking down FSI books and tapes and digitizing them. In this thread on the site's forum, he posts external links to download more than two dozen new books and half a dozen more audio courses, until the site owner reemerges and uploads these materials to the site proper.

The following new materials for Southeast Asian language study are now available:

Audio for the Thai Basic Course Vol. 2 (part 1 -- 98MB and part 2 -- 79MB) -- a PDF of the accompanying book, along with Basic Course 1 PDF + MP3, and Intro to Thai Phonology MP3 are here.

Thai Reference Grammar by Richard B. Noss (10MB PDF) -- also available at SEAlang in DjVu format. This new version is very well done; the quality of the scan is much better, and it includes bookmarked sections within the PDF.

Contemporary Cambodian Introduction (42MB) -- grammatical sketch and Basic Course Vol. 1 PDF + MP3 and Basic Course Vol. 2 PDF are here.

And as long as we're in Southeast Asia, last time I didn't post about the Vietnamese materials available:
Vietnamese Basic Course Vol. 1 (PDF + 37 MP3 files)
Vietnamese Basic Course Vol. 2 (PDF)

And also not yet added to the site, but linked in the forum: audio files for Vietnamese Basic Course Vol. 2 (part 1 -- 81MB and part 2 -- 69MB).

Update:
The following Cambodian materials are also available directly as PDF scans from the U.S. government at www.eric.ed.gov (note that there is some overlap with FSI-language-courses.com):

Contemporary Cambodian: Introduction (670 pages, 1972)
Contemporary Cambodian: Grammatical Sketch (127 pages, 1972)
Contemporary Cambodian: The Land and the Economy (375 pages, 1973)
Contemporary Cambodian: The Social Institutions (392 pages, 1974)
Contemporary Cambodian: Political Institutions (387 pages, 1974)

The quality of these scans varies between acceptable and poor. Also, these predate the Khmer Rouge, so their factual content is probably quite out of date, unfortunately, but their value as advanced readers for the Khmer language remains.

Cassette tapes were also produced for the advanced books in the Contemporary Cambodian series, and can be purchased from ntis.gov, at a prohibitively expensive price ($120 per volume). These are, as is everything the U.S. government produces, public domain materials. I hope someone will set these free online in the future.

October 16, 2008

RID99 in retrospect: don't judge a book by its cover ... or thickness

Here's a blast from the past. A headline from Matichon Online, August 25, 2003 (click for the full article):

ร้อนๆ "พจนานุกรมราชบัณฑิต 2542" เพิ่มคำศัพท์ใหม่ 2 เท่าตัว

Hot off the presses: "Royal Institute Dictionary 1999" doubles number of entries


The dictionary is called the Royal Institute Dictionary 1999, despite being first published in 2003, because it was supposed to coincide with the king's sixth-cycle (72nd) birthday, but was years behind schedule.

Just one thing--the 1999 edition came nowhere close to doubling the number of entries.

From the article:
ก่อนหน้านี้ ราชบัณฑิตยสถาน ถูกวิพากษ์วิจารณ์อย่างหนัก ถึงความเป็นพวกหัวโบราณ เก่าเก็บ คร่ำครึ ไม่ปรับปรุงแก้ไขเพิ่มเติมคำศัพท์ต่างๆ ในภาษาไทย ให้ทันกระแสโลก คนไทยจึงทนอยู่กับพจนานุกรมฯ พ.ศ.2525 ซึ่งมีการแก้ไขและพิมพ์ใหม่ถึง 6 ครั้ง โดยครั้งที่ 6 พิมพ์เมื่อปี 2539 จำนวน 60,000 เล่ม รวมพิมพ์ 6 ครั้งเป็นทั้งหมด 280,000 เล่ม แต่คำศัพท์ต่างๆ แทบไม่มีการเปลี่ยนแปลง

พจนานุกรม ฉบับราชบัณฑิตยสถาน พ.ศ.2542 กลายเป็นพระเอกขี่ม้าขาว มาช่วยแก้ภาพพจน์ของราชบัณฑิตฯ เพราะมีคำศัพท์ใหม่ๆ บัญญัติไว้อย่างทันยุคทันสมัย ถึงแม้ไม่ใช่ทั้งหมด แต่ยังมากกว่าฉบับที่แล้วๆ มา

...

พจนานุกรม ฉบับล่าสุดนี้ จะเห็นว่ามีรูปเล่มหนามากกว่าเดิมถึงสองเท่า เนื่องจากมีการบัญญัติศัพท์ใหม่ๆ เพิ่มคำนิยามใหม่ๆ มากขึ้นเกือบเท่าตัว ...

My translation:
Prior to now, the Royal Institute has been severely criticized for being old-fashioned, fuddy-duddy, antiquated, failing to revise and expand the number of words in Thai at pace with the real world. The Thai people have put up with RID 1982, which has been revised and reprinted six times, most recently with a print run of 60,000 copies in 1996, bringing the total number of copies printed to 280,000. The lexical content remained virtually unchanged, however.

The Royal Institute Dictionary 1999 thus comes as the knight in shining armor to rescue the reputation of the Royal Institute, because there are many newly coined words that bring it up to date. It doesn't include every word, but it has more than past editions.

...

This latest edition of the dictionary is more than twice as thick as its predecessor, because there are nearly double the number of entries and definitions. ...

I don't think that RID99 really proved to be the savior they had hoped it would, and that this article so optimistically claimed. One of the comments on the Pantip thread this article is archived on is: น่าจะมีแบบ CD ROM นะ "there should be a CD-ROM version".

Indeed, there should have been. There still hasn't been one, although a web version of RID99 was finally released in February 2008, four-and-a-half years after the paper edition came out. (A web version of RID82 was available prior to that.)

The new RID was too large to become widely used. It is more than 1,400 pages, but it's also a solid four inches thick. Twice as thick as RID82.

So how about all the new words that RID99 supposedly had? I've written about wordcounts in RID before. By my count, the net increase in headwords from RID82 to RID99 was a mere 922 words: 19,526 versus 20,448. That represents an increase of less than 5%.

RID made more gains in subheads, however. RID is organized such that a word like ใจ is a headword, and ใจดี is one of its subheads. All subheads come within the particular headword entry, and begin with that headword. RID99 contains 18,753 subheads, compared to 14,387 in 1982, representing a 30% increase.

The combined total of heads and subheads grows from 33,913 to 39,201, an increase of just under 16%. Unless RID was suddenly writing much longer definitions, something else had to account for the increased thickness in RID. The answer is simple: the paper is thick.
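For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the increases from the counts quoted above. The dictionary and function names are just illustrative; the only data are the four figures already given in this post.

```python
# Headword and subhead counts for each edition, as quoted in the post.
rid82 = {"headwords": 19_526, "subheads": 14_387}
rid99 = {"headwords": 20_448, "subheads": 18_753}


def pct_increase(old, new):
    """Percentage growth going from old to new."""
    return (new - old) / old * 100


# Per-category growth between the two editions.
for key in ("headwords", "subheads"):
    gain = rid99[key] - rid82[key]
    print(f"{key}: +{gain:,} ({pct_increase(rid82[key], rid99[key]):.1f}%)")

# Combined growth across headwords and subheads.
total82 = sum(rid82.values())
total99 = sum(rid99.values())
print(f"combined: {total82:,} -> {total99:,} "
      f"({pct_increase(total82, total99):.1f}%)")
```

Run as written, this gives roughly 4.7% for headwords, 30.3% for subheads, and 15.6% combined, which is a long way from "double".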

I believe that this fact has been more of a detriment to its widescale adoption as a standard reference than the Royal Institute would like.

Think about it: over the course of 21 years, 280,000 copies of RID82 were printed and sold. The first print run of RID99 was a massive 200,000 copies. More than ten times the print run of a typical commercial book.

Five years later, the first printing has still not sold out. If they had printed a modest first run, they could have fixed this problem by now. Unfortunately, they invested millions of baht in the huge first printing, and surely can't justify reprinting before they've completely sold their stock.

Among the other comments on the Pantip thread are several references to how darn huge the book is. At least one commenter said she had planned to pick up a copy, but changed plans upon seeing its massive girth: "ไม่รู้จะยัดไว้ตรงไหนของบ้านอ่ะ" ("I don't know where on earth I'd put it.")

Compare that with the Matichon Dictionary. Publishing empire Matichon debuted its own dictionary in November 2004, barely a year after RID99, containing far more new words, including much more slang and colloquialisms. And despite being 1100 pages, it's less than half the thickness of RID99. It's attractively printed on so-called 'bible paper'.

Matichon wasn't trying to make a quick buck with its dictionary. Their dictionary had its origin in 1997, the year of the big "bubble burst"--ฟองสบู่แตก. Writer ขรรค์ชัย บุนปาน started Matichon (มติชน) with พงษ์ศักดิ์ พยัฆวิเชียร in 1978, when both men were in their early 30s. It took time to build up their publishing empire, but by the late 1990s, things were going well. And so, ขรรค์ชัย decided he was finally going to scratch a longstanding itch: to create a dictionary of his own. It would come to rival that of the Royal Institute, which in 1997 was 15 years into revising its latest edition.

He put together a team, including his childhood friend and longtime collaborator, writer สุจิตต์ วงศ์เทศ, literary scholar and professor ล้อม เพ็งแก้ว, Thai language scholar สันต์ จิตภาษา (pen name ภาษิต จิตภาษา), and others. สุพจน์ แจ้งเร็ว, editor of ศิลปวัฒนธรรม (Art & Culture Magazine), and who I had a chance to talk with about the origins of the Matichon Dictionary in early 2005, was appointed editor of the project.

The first fruit of their labor was published in 2000: a slim dictionary with a few thousand entries, พจนานุกรมนอกราชบัณฑิตฯ "Dictionary of Words Not in RID".

Interestingly, Matichon began its work in the same way that RID has produced its dictionary for 75 years: by having all the experts sit together and discuss each word. They discovered, of course, that this is an impossibly slow way to work, so they modified their method. Something RID could learn from, frankly. In the end, the project cost Matichon ten million baht, according to media reports.

How does Matichon's word count weigh up against RID's? According to Matichon, the total number of words in the dictionary is 39,515. (I counted slightly differently and came up with 40,502, because Matichon uses numbered senses under one entry where RID uses separate entries.)

My count puts Matichon with 1,301 more entries than RID99. Where RID is heavy on the literary vocabulary and archaisms, Matichon eschews many of these in favor of more modern colloquialisms and slang, including words like กิ๊ก, many of which were finally recognized by the Royal Institute in their supplementary volume, พจนานุกรมคำใหม่ เล่ม ๑ (Dictionary of New Words Vol. 1), published October 2007.

I wish there were more worthy rivals like Matichon for the venerable RID. Not because I want it to fail or be superseded, because I don't. I refer to it nearly every day (online, of course). But competition breeds innovation, and necessity brings about change. The greater the competition, the better Thai dictionaries will become. And it's folks like us--Joe the Dictionary User--who are the winners in that scenario.

August 11, 2008

Thai nicknames from English words: an eventually comprehensive list

Here's something that just occurred to me: compile a list of Thai nicknames that come from English words.

What this list includes: mostly nicknames of people I have actually met (or who are otherwise well-known) which come from English words.

What this list doesn't include: nicknames which are only meaningful in English as names (so Sarah is out, but Rose is in). I've also excluded names which have other meanings, but those meanings have no bearing on their selection for Thai nicknames (e.g. Peter).

This list is an ongoing project. Please leave a comment with other nicknames you know and I'll expand the list. I'll also add to it as I think of more or meet new people. I'm including the Thai spelling used by the person I know, but other variations are certainly possible. Nicknames I've heard of but whose owners I've never personally met are in green. Some words aren't English, like Benz or Olé, but I'm including them anyway because of the familiarity of these words to English speakers.

This also demonstrates how the spelling of English words in Thai often does not transparently reveal their actual pronunciation. So in some cases I've included the typical Thai pronunciation in square brackets.

And away we go:
A1 เอวัน

Apple แอปเปิ้ล [แอบเปิ้น or แอ๊บเปิ้น] or for short เปิ้ล [เปิ้น]
Bank แบงค์ [แบ๊ง] 
Benz เบนซ์ [เบ๊น] 
Bird เบิร์ด [เบิ๊ด] 
Bomb บอมบ์ [บ็อม] 
Bow โบว์
Boy บอย (not to be confused with บอยด์ Boyd; cf. บ๋อย)
Cartoon การ์ตูน or for short, ตูน 

Champ แชมป์ [แช้ม]
Cream ครีม
Earth เอิร์ธ [เอิ๊ด]
Firm เฟิร์ม 

First เฟิร์สท [เฟิ้ด]
Fuse ฟิวส์
God ก๊อด (his dad is Muslim.. that's the only explanation I can give)
Golf กอล์ฟ [ก๊อบ]

Guide ไกด์ [ไก๊]
Lily ลิลี่
Mafia มาเฟีย 

May เมย์ 
Ma'am แหม่ม
New นิว 
Note โน้ต or โน๊ต
Off อ๊อฟ 

Opal โอเปิ้ล [โอเปิ้น] or for short เปิ้ล [เปิ้น], also โอปอล [โอปอ]
Olé โอเล่

Peach พีช [พี้ด]
Rose โรซ [โร้ด] 
Stop สต๊อป
Tiger ไทเกอร์ [ไทเก้อ]
Time ไทม์ [ทาย]
Title ไตเติ้ล [ไตเติ้น] or for short เติ้ล [เติ้น]
Valentine ไทน์ [ทาย] (the person I know only ever used the short version) 

Yeast ยีสต์ [ยี้ด] (his dad worked as a baker)

Okay, now your turn. Let the list grow.

August 2, 2008

Should we fork or spoon? Musings on ช้อนส้อม

So here's something straight from my dinner table: when is a spoon not a spoon? When it's a fork, apparently.

On a couple of recent occasions now, I've been getting things ready at dinner time and found I'd forgotten to bring eating utensils with me to the table. Thai meals generally call for a fork and a spoon. So if my wife happened to be standing near the silverware, I would ask her to grab me ช้อนส้อม. Both times she brought me only a fork. And both times she's said, "You only asked for a fork."

Turns out, to her, ช้อนส้อม means 'fork'. If we break it down, ช้อน = spoon and ส้อม = fork.

There are a number of fixed phrases in Thai where adjoining two nouns creates a new coordinate noun, without the need to use the conjunction and. You see it in phrases like:
  • พ่อแม่ 'parents' (literally 'dad + mom')
  • พี่น้อง 'siblings' (literally 'older sibling + younger sibling')
  • สามีภรรยา 'husband and wife' (literally 'husband + wife')
  • ปู่ย่าตายาย 'grandparents' (literally 'paternal grandfather + paternal grandmother + maternal grandfather + maternal grandmother')
  • เสื้อผ้า 'clothes' (literally 'shirt + cloth') 
  • วัวควาย 'livestock' (literally 'cow + buffalo')
So I always thought ช้อนส้อม worked the same way. But not so with my wife. I must be sure to ask for fork and spoon (mandatory conjunction, in this case กับ = 'and'). She has been surprised that I don't know this, either.

And that's fine, of course. There's lots I don't know. But looking on the internet, there's no shortage of evidence that ช้อนส้อม is used to mean both fork and spoon. And ชุดช้อนส้อม can mean a fork-spoon pair, or the entire cutlery set (sometimes including a knife, sometimes multiple forks or spoons).

There's textual evidence, too. Clauses like ถ้าหากว่าช้อนส้อมคู่ไหนเริ่มเก่า "If any fork and spoon pair begins to get old..."

None of this is to say that I doubt my wife's knowledge of her own native language. I just don't really know if her usage is widespread or not. She's Bangkok born and raised.

If we analyze how ช้อนส้อม can mean just 'fork', we could say that fork is part of the class of things called ช้อน, where we have ช้อนชา 'teaspoon', ช้อนโต๊ะ 'tablespoon', ช้อนกลาง 'serving spoon', and ช้อนส้อม 'fork'.

Which raises the most important question of all: what on earth are we going to call sporks when they finally arrive in Thailand? Maybe ส้อมช้อน? I don't think ช้อม or ส้อน will go over well. (Note to self: invest in sporks.. those things are gonna be huge here some day.)

Readers, please make this query of the nearest Thai person: If I ask you to bring me ช้อนส้อม, are you going to bring me a fork, a spoon, or both?

August 1, 2008

Etymologist 14: Mangoes and cashews in Thai and Mon

The last week or so I've been spending a lot of time with Harry Shorto's A Dictionary of Modern Spoken Mon (1962). With its companion volume, A Dictionary of the Mon Inscriptions (1971), it makes for some interesting reading.

Personally, I don't know a lot about Mon, but historically it has had a lot of influence on Thai, which mostly goes unrecognized. The Mon were in Southeast Asia more than two millennia before the arrival of the Tai (ancestors of the Thai, Lao, Shan, etc.) from an area of what is now southern China.

Yesterday I noticed a connection, but of a different kind. It's a semantic similarity between modern spoken Thai and modern spoken Mon.

In Mon, just as in Thai, the word for 'cashew' comes from the word for 'mango'. The Thai word for 'mango' is มะม่วง /mamûaŋ/ (from หมาก /màak/ + ม่วง /mûaŋ/); the Mon word is /krɜk/. Now, obviously there's no common origin between those two words.

The mango/cashew connection presumably derives from their similar shape. The cashew nut is native to Brazil, and was spread throughout the world by the Portuguese, who were among the earliest Europeans in Southeast Asia. The cashew took well to the tropical climes of South and Southeast Asia. Today, Vietnam is the largest producer of cashews in the world, with more than four times the annual output of Brazil; India produces more than double that of Brazil.

(Interesting side note: In areas of Thailand's south, the cashew is called กาหยู /kaayǔu/, which, like English 'cashew', comes from the Portuguese word acajú, and ultimately from the indigenous Tupi word acajuba.)

The standard Thai word for 'cashew' is มะม่วงหิมพานต์ /mamûaŋ hǐmmaphaan/ 'Himmaphan mango'. The Thai word หิมพานต์ /hǐmmaphaan/ is from Himavanta, the name of the forest in Hindu mythology which lies at the base of Mount Meru.

The Mon word for 'cashew' is /krɜk soiŋkhɜ̀/. The phonology obscures it a bit, but the second word is from Sanskrit /siṃhala/ (corresponding to Thai สิงหล /sǐŋhǒn/). It is the Mon word for Ceylon/Sinhala, which was partially ruled by the Portuguese in the 17th Century, before the Dutch moved in, and then the British, before modern independence under the name Sri Lanka.

So for Thais, a cashew is a 'Himavanta mango', while for Mon it's a 'Sinhala mango'. The Mon name presumably comes from the nut's place of origin from the Mon perspective. I'm not sure how the Thai word came to refer to a mythical forest in the Jataka tales. Does anyone else have any insight into that?

Regardless, it's an interesting semantic similarity. I wonder if there was any influence in one direction or the other.

July 20, 2008

A Thai eggcorn: อุ้งมือ - อุ้มมือ

I've been fascinated with eggcorns ever since I first heard about them a couple years ago. For starters, here's the Wikipedia definition:

An eggcorn is "an idiosyncratic substitution of a word or phrase for a word or words that sound similar or identical in the speaker's dialect."

These linguistic nuggets get their name from one example of this phenomenon: A fair number of people think the word acorn is actually eggcorn. So the word "eggcorn" is an example of a linguistic eggcorn. These are listener's errors--mistakes we make when we hear words but never (or rarely) see them written. Our mind analyzes them in a new way that makes sense to us, but isn't strictly correct.

Usually there's a semantic connection, which is why it makes sense to us (acorns are vaguely egg-shaped, after all, so it's an understandable leap to make).


There are plenty of these in English: "all intents and purposes" becomes "all intensive purposes", "duct tape" is often called "duck tape" (there's even a Duck Tape brand of duct tape as a result).

Well, today I finally discovered a real live Thai eggcorn, and it was my mistake, to boot. I love discovering this stuff. I just don't know why I never realized it before.

Thai has the phrase อุ้งมือ, referring to the area formed by the cupped palm of the hand. If I found a tiny frog, say, and held it in my cupped hand, it's in my อุ้งมือ. I know this term and use it. And yet, somehow, I have been mistakenly using the phrase อุ้มพระหัตถ์. That's my eggcorn.

You see, I go to church at a Thai congregation, teach Sunday School in Thai, the whole nine yards. So I talk about religion in Thai a fair amount, and this involves knowing my share of ราชาศัพท์ (royal vocabulary), for use in speaking about kings or deities. The real phrase is อุ้งพระหัตถ์ (notice that's อุ้ง, not อุ้ม). Of course, พระหัตถ์ is simply the ราชาศัพท์ word for hand. อุ้งมือ and อุ้งพระหัตถ์ mean the same thing. One would use the phrase อุ้งพระหัตถ์ to say something like "it's in God's hands"--เรื่องนี้อยู่ในอุ้งพระหัตถ์ของพระผู้เป็นเจ้าแล้ว.

Today I found myself typing the phrase, and as soon as I typed it, I realized the mistake I've been making. Something about seeing it written down.

There's also a phonetic reason for why อุ้ง becomes อุ้ม. The word อุ้ง ends in the nasal sound [ŋ] or 'ng', a velar sound, and พระ begins with the sound [pʰ], a labial sound. When speaking quickly, you're already closing your lips to make the พ [pʰ] sound by the time you've even gotten the ง [ŋ] sound out. That leads to [ŋ] becoming [m], through the phonological process known as assimilation (the specific variety of assimilation here is labialization). This means one sound in a word becomes more like another nearby sound. That's why so many people pronounce sandwich as samwich--the [d] is elided, and the [n] is influenced by the [w] immediately following it, causing it to change to [m]. Basic phonetics. Maybe that sounds like mumbo jumbo to you, but it's true. I checked.

So I was changing อุ้ง to อุ้ม without really realizing it. And the semantic connection here is that อุ้ม means to hold or carry, as you do with a young child, so in my subconscious mind it kind of made sense, since we use our hands to carry things.

Interestingly, the same thing happens with อุ้งมือ. Because the sound immediately following ง is ม [m], it's also likely to assimilate to [m].


It turns out I'm not the only person who uses this Thai eggcorn. I Googled both "อุ้มมือ" and "อุ้มพระหัตถ์" in addition to the proper spellings. Here are the counts:

"อุ้งมือ" = 93,900 hits
"อุ้มมือ" = 4,660 hits

"อุ้งพระหัตถ์" = 717 hits
"อุ้มพระหัตถ์" = 36 hits

So while not particularly widespread, this is still a fairly common mistake to make. For all I know, I picked it up from someone else subconsciously. Obviously อุ้ง/อุ้มพระหัตถ์ is much less common than อุ้ง/อุ้มมือ, but I find it interesting that the ratio between the proper and eggcorn form is nearly identical in the two pairs--almost exactly 5%. If this is a representative sample (and I can't say that it is or isn't), as many as 1 in 20 Thais makes this mistake.
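Here's a quick sketch of how those ratios fall out of the hit counts above. The function names are my own, and Google hit counts are rough estimates at best, so treat the output as illustrative. I compute both the eggcorn-to-proper ratio (the roughly 5% quoted above) and the eggcorn's share of all hits for each pair, which is the figure that more directly matches "1 in 20".

```python
# Google hit counts as reported above (proper form, then eggcorn form).
hits = {
    "อุ้งมือ": 93_900,
    "อุ้มมือ": 4_660,
    "อุ้งพระหัตถ์": 717,
    "อุ้มพระหัตถ์": 36,
}

pairs = [("อุ้งมือ", "อุ้มมือ"), ("อุ้งพระหัตถ์", "อุ้มพระหัตถ์")]


def eggcorn_ratio(standard, eggcorn):
    """Eggcorn hits relative to hits for the proper spelling."""
    return hits[eggcorn] / hits[standard]


def eggcorn_share(standard, eggcorn):
    """Eggcorn hits as a fraction of all hits for the pair."""
    return hits[eggcorn] / (hits[standard] + hits[eggcorn])


for standard, eggcorn in pairs:
    print(f"{eggcorn}: ratio {eggcorn_ratio(standard, eggcorn):.2%}, "
          f"share {eggcorn_share(standard, eggcorn):.2%}")
```

Both ratios come out within a hair of 5%, and the shares land a little under that, at about 4.7 to 4.8%.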

I think it's so cool to learn stuff like this.

See the Eggcorn Database for many, many more examples of eggcorns from English.

June 22, 2008

1942 Thai spelling reform announcement

Below is the text of the announcement for the 1942 spelling reform that I first wrote about in February.

"Prime minister's office announcement on improving the Thai script" is dated May 29, 1942, and was published in the Royal Gazette on June 1, 1942. It outlines the new spelling rules, and bears the name of Field Marshal Plaek Phibunsongkhram. You can see the scan of the original document in PDF format on the Royal Gazette website.

The first page of the announcement.

I'll post a translation another time. For now, I've color-coded the text, because I think it makes for an interesting way of visually absorbing the announcement all at once.

Black = words not affected by the reform
Red = words affected by the reform
Blue = words not affected by the reform, but whose spelling has changed since 1942
Green = the letter ญ, the only letter actually altered by the reform (green because there's no simple way to take away the base and have it display correctly for most people)
Purple = words that should be in red, but force of habit caused the typist to "misspell" them (i.e. spell them the pre-reform way)

ประกาสสำนักนายกรัถมนตรี เรื่องการปรับปรุงตัวอักสรไทย

ด้วยรัถบาลพิจารนาเห็นว่า ภาสาไทยย่อมเป็นเครื่องหมายสแดงวัธนธรรมของชาติไทย สมควนได้รับการบำรุงส่งเสิมไห้แพร่หลายออกไปกว้างขวางยิ่งขึ้น ไห้สมกับความจเรินก้าวหน้าของชาติ ซึ่งกำลังขยายตัวออกไปไนปัจจุบัน เพราะฉะนั้นจึงได้ตั้งกรรมการส่งเสิมวัธนธรรมภาสาไทยขึ้นคนะหนึ่ง ดังมีรายชื่อแจ้งอยู่ในประกาสตั้งกรรมการส่งเสิมวัธนธรรมภาสาไทยนั้นแล้ว เพื่อร่วมกันพิจารนาหาทางปรับปรุงและส่งเสิมภาสาไทยไห้มีความจเรินก้าวหน้ายิ่งขึ้น อันที่จริงภาสาไทยก็เป็นภาสาที่มีสำเนียงไพเราะสละสลวย และมีความกว้างขวางของภาสาสมกับเป็นสมบัติของชาติไทยที่มีวัธนธรรมสูงอยู่แล้ว ยังขาดอยู่ก็แต่การส่งเสิมไห้แพร่หลาย สมควนแก่ความสำคันของภาสาเท่านั้น

กรรมการส่งเสิมวัธนธรรมภาสาไทยได้มีการประชุมกันเป็นครั้งแรก เมื่อวันที่ ๒๓ พรึสภาคม ๒๔๘๕ มีความเห็นไนชั้นต้นว่า สมควนจะปรับปรุงตัวอักสรไทยไห้กระทัดรัด เพื่อได้เล่าเรียนกันได้ง่ายยิ่งขึ้น ได้พิจารนาเห็นว่า ตัวสระและพยัชนะของภาสาไทยมีอยู่หลายตัวที่ซ้ำเสียงกันโดยไม่จำเป็น ถ้าได้งดไช้เสียบ้างก็จะเป็นความสดวกไนการสึกสาเล่าเรียนภาสาไทยไห้เป็นที่นิยมยิ่งขึ้น ตัวอักสรที่ควนงดไช้คือ--

สระ
สระ ใ, ฤ, ฤๅ, ฦ, ฦๅ รวม ๕ ตัว

พยัชนะ
พยัชนะ ฃ, ฅ, ฆ, ฌ, ฎ, ฏ, ฐ, ฑ, ฒ, ณ, ศ, ษ, ฬ รวม ๑๓ ตัว ส่วน ญ (หญิง) ไห้คงไว้ แต่ไห้ตัดเชิงออกเสีย คงเป็นรูป (ไม่มีเชิง)

ดังนั้น อักสรที่จะไช้ไนภาสาไทยจะมีดังต่อไปนี้

สระ
ะ (อะ) ั (อั-) า (อา) ิ (อิ) ี (อี) ึ (อึ) ื (อื) ุ (อุ) ู (อู) เ-ะ (เอะ) เ (เอ) แ-ะ (แอะ) แ (แอ) โ-ะ (โอะ) โ (โอ) เ-าะ (เอาะ) -อ (ออ) -ัวะ (อัวะ) -ัว (อัว) เ-ียะ (เอียะ) เ-ีย (เอีย) เ-ือะ (เอือะ) เ-ือ (เอือ) เ-อะ (เออะ) เ-อ (เออ) เ-ิ (เอิ-) ไ (ไอ) เ-า (เอา) -ำ (อำ)

พยัชนะ
ก ข ค ง
จ ฉ ช ซ
ด ต ถ ท ธ น
บ ป ผ ฝ พ ฟ ภ ม
ย ร ล ว ส ห อ ฮ

เมื่อได้งดไช้พยัชนะและสระบางตัวดังนี้แล้ว คนะกรรมการจึงได้วางหลักการเขียนหนังสือไทยไว้อย่างกว้าง ๆ ดังต่อไปนี้

คำที่เคยไช้สระ ใ (ไม้ม้วน) ไห้ไช้ ไ (ไม้มลาย) แทน

คำที่เคยไช้สระ ฤ ฤๅ ไห้ไช้ ร (เรือ) ประกอบสระตามกรนีที่ออกเสียงภาสาไทย เช่น
ไน พฤกษา ไช้ รึ เป็น พรึกสา
ไน ฤกษ์ ไช้ เริ เป็น เริกส์
ไน ฤทธิ์ ใช้ ริ เป็น ริทธิ์
ฤๅ ไช้ รือ

คำที่เคยไช้สระ ฦ ฦๅ ไห้ไช้ ล (ลิง) ประกอบสระตามกรนีที่ออกเสียงภาสาไทย เช่น ฦๅ ไช้ ลือ เป็นต้น

คำที่เคยไช้พยัชนะ ฆ (ระฆัง) ไช้ ค (ควาย) แทน เช่น เฆี่ยน ฆ้อง ไช้ เคี่ยน ค้อง เป็นต้น

คำที่เคยไช้พยัชนะ ฌ (เฌอ) ไช้ ช (ช้าง) แทน

คำที่เคยไช้พยัชนะวรรค ฎ (ชฎา) ไห้ไช้พยัชนะวรรค ด (เด็ก) แทน โดยลำดับ คือ
ฎ (ชฎา) ไห้ไช้ ด (เด็ก) เช่น ชดา
ฏ (ประฏัก) ไห้ไช้ ต (เต่า) เช่น ประตัก
ฐ (ฐาน) ไห้ไช้ ถ (ถุง) เช่น ฐาน ไห้ไช้ ถาน รัฐ ไห้ไช้ รัถ
ฑ (มณโฑ) ไนกรนีที่อ่านเป็นเสียง ด ไห้ไช้ ด (เด็ก) เช่น บันดิต ในกรนีที่อ่านเป็นเสียง ท ไห้ไช้ ท (ทหาน) เช่น ไพทูรย์
ฒ (ผู้เฒ่า) ไห้ไช้ ธ (ธง) เช่น เฒ่า ไห้ไช้ เธ่า วัฒนธรรม ไห้ไช้ วัธนธรรม
ณ (เณร) ไห้ไช้ น (หนู) เช่น ธรนี เป็นต้น

คำที่เคยไช้พยัชนะ ศ ษ ไห้ไช้ ส แทน

คำที่เคยไช้ ฬ (จุฬา) ไห้ไช้ ล (ลิง) แทน

อนึ่ง คำที่มิได้มาจากบาลี-สันสกริต ไห้เขียนตามระเบียบคำไทย เช่น บรร (ร หัน) ไห้เขียน บัน, ควร ไห้เขียน ควน, เสริม ไห้เขียน เสิม, เจริญ ไห้เขียน จเริน สำคัญ ให้เขียน สำคัน, ทหาร ไห้เขียน ทหาน, กระทรวง ไห้เขียน กระซวง ฯ ล ฯ ดังจะได้ประกาสหลักเกนท์ลเอียดต่อไป

คนะรัถมนตรีได้พิจารนาหลักที่คนะกรรมการส่งเสิมวัธนธรรมภาสาไทยเสนอมาข้างต้นนี้ มีความเห็นชอบด้วย จึงลงมติเป็นเอกฉันท์ไห้ไช้สระและพยัชนะไนภาสาไทยดังกล่าวนี้ ตั้งแต่วันที่ประกาสนี้เป็นต้นไป

ประกาส นะ วันที่ ๒๙ พรึสภาคม ๒๔๘๕
จอมพล ป. พิบูลสงคราม

นายกรัถมนตรี
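For the programmatically inclined, the one-to-one letter substitutions in the announcement can be sketched as a simple character mapping. This is only a rough illustration, not a full implementation of the reform: the names REFORM_1942 and apply_simple_rules are my own, and the sketch deliberately leaves out everything that depends on pronunciation or etymology (ฤ, ฤๅ, ฦ, ฦๅ, and ฑ), the reshaped ญ, the respelling of non-Pali-Sanskrit words like ควร to ควน, and ฃ and ฅ, for which the announcement names no replacement.

```python
# One-to-one substitutions stated in the 1942 announcement.
# Not covered: pronunciation-dependent letters (ฤ ฤๅ ฦ ฦๅ ฑ), the
# reshaped ญ, ฃ and ฅ (no replacement is given), and whole-word
# respellings such as ควร -> ควน.
REFORM_1942 = str.maketrans({
    "ใ": "ไ",  # mai muan -> mai malai
    "ฆ": "ค",
    "ฌ": "ช",
    "ฎ": "ด",
    "ฏ": "ต",
    "ฐ": "ถ",
    "ฒ": "ธ",
    "ณ": "น",
    "ศ": "ส",
    "ษ": "ส",
    "ฬ": "ล",
})


def apply_simple_rules(text: str) -> str:
    """Apply only the simple letter-for-letter rules of the reform."""
    return text.translate(REFORM_1942)


# Examples taken from the announcement itself.
for word in ["ฐาน", "รัฐ", "วัฒนธรรม", "เฆี่ยน", "ภาษาไทย"]:
    print(word, "->", apply_simple_rules(word))
```

A character-level mapping like this gets the examples in the announcement right, but it can't handle cases like the standalone word ณ, which shows up as นะ in the dateline above.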

See also: Simplified Thai spelling during World War II

June 19, 2008

Free Thai language courses (with audio)

This falls under the category of 'things I've known about forever and consistently forgotten to blog about': FSI-Language-Courses.com.

If you're not familiar with it, FSI is the Foreign Service Institute, the U.S. government's training center for foreign service officers—diplomats and the like. It was established 60 years ago to replace an earlier incarnation established in 1924. The Institute has developed language courses in a large number of languages. And as a government body, all of its work is in the public domain and freely distributable.

FSI Language Courses was started by Glen D. Fellows in 2006. Users on the site (including Glen) scan FSI coursebooks and record FSI audio tapes that they either buy (they're regularly sold by third parties at exorbitant prices) or check out from a library. The resulting pdf and mp3 files are then made available for everyone. No copyright. No cost. Simple. Brilliant.

The site has FSI course materials for 34 languages at the moment, including Thai, Lao and Cambodian. Granted, these courses are at the youngest a few decades old, but you can't beat the price.

The site has the following materials for Thai:
Introduction to Thai phonology (19 mp3 files, 3.5 hours)
Thai Basic Course Volume 1 (student text pdf, 426 pages; 20 mp3 files, 10 hours)
Thai Basic Course Volume 2 (student text pdf, 421 pages; audio needed)

And these materials for Lao:
Reading Lao (student text pdf, 492 pages; 79 mp3 files, 35 hours)*
Lao Basic Course Volume 1 (student text pdf, 448 pages; audio needed)

And materials for Cambodian:
Contemporary Cambodian: Grammatical Sketch (student text pdf, 125 pages)
Cambodian Basic Course Volume 1 (student text pdf, 453 pages; 45 mp3 files, 12.5 hours)
Cambodian Basic Course Volume 2 (student text pdf, 367 pages; audio needed)

Among the other languages you'll find materials for are Mandarin (listed as "Chinese"), Cantonese, Korean, Hindi, Swahili, Arabic, and a couple dozen more. Should keep any polyglot busy for years, really.

And if your eyes are rolling back in your head at the thought of right-click downloading all those files, then try DownThemAll, one of my favorite Firefox plug-ins. Just play nice and don't download everything at once.

*There is a typo in the URL for Tape 39 of Reading Lao, but you can manually correct it: change 396A to 096A in the filename.

June 18, 2008

My idea: Thai Video Transcripts

I've been in the U.S. for a month with my family. If you're clever enough to find my family blog, you can read about it (I'm making it marginally more difficult by not linking you in order to deter lazy stalkers). You'd think I'd have more time to post here, but that's not how it's turned out.

But I'd like to share an idea of mine with readers here.

Presenting my demo, sandbox, not-ready-for-primetime version of Thai Video Transcripts. I've been playing with it off and on for a little while, and sent the link to a number of people who I thought might be interested in the idea, to ask for feedback. Many responded with excellent suggestions and wishlists of features for a site like this. (I apologize for not properly replying to some of those responses.)

So what's it for? It's a place to collaboratively transcribe Thai videos found on sites like YouTube and kosanathai.com.

Maybe that description underwhelms you. So allow me to start at the beginning of my thought process.

In the last year or so, an incredible number of Thai-language videos have cropped up on YouTube (commercials, movie trailers, music videos, TV show clips, even full TV episodes and full movies). I'm sure this increase is in part due to the Great YouTube Drought of Ought-Seven (you know, due to the Streisand Effect and all). Nowadays, you can watch any of Thailand's big network TV shows within hours of their live airing (super-serialized into YouTube-size chunks).

For a second-language speaker, fast colloquial speech in videos is some of the hardest to understand. Scripted dialogue in sitcoms, talk show banter, and off-the-cuff wordplay on variety shows are different from speaking with acquaintances live and in person, who we know and who know we may not always catch every word. And you can't stop the flow to ask about a certain word or phrase.

I often rewatch a clip several times, trying to catch the parts that go over my head. Sometimes I figure it out, sometimes I don't.

So the idea is simple: create a site where anyone (that includes you) can help transcribe the video clips that are already out there. You type out the parts you know, and other people help fill in the gaps. If you can't type in Thai yet, or aren't up to the task of transcribing, you can still benefit from the work that others do. It's a wiki, so it's easy to edit and open to everyone. It's a way to take video clips aimed essentially at native speakers, and add value to them for language learners.

You can help even if you don't know Thai, for that matter. You can create new pages for videos you'd like transcribed, and once a transcript is there, you can copy-and-paste it into websites like thai2english.com to help make at least some sense of the content.

This brings me to another feature which several people suggested: collaborative translation. I think this is a natural extension of my idea, but I'm concerned it would be like trying to run before walking. But that's the great thing. It's a collaborative site, so anyone is welcome to try what they want.

Here's a sample of the type of thing I've done on the sandbox version of the site:



Transcript
[00:01] เจ็บที่สุด ... คือการเป็นอีโง่
[00:28] สายเลือดยากูซ่า[1]
[00:33] กับผู้หญิงทรยศ
[00:38] ให้กำเนิดเด็กพิเศษ
[00:38] มาตรวจดูแล้วนะครับ คิดว่าเขาน่าจะมีปัญหาในเรื่องการพัฒนาการทางสมองครับ
[01:01] ออทิสติก[2] ความบกพร่อง หรือพรจากสวรรค์
[01:13] จดจำทุกทักษะการต่อสู้
[01:28] เรียนรู้อย่างอัจฉริยะ
[01:51] จับตา
[01:51] ทุกการเคลื่อนไหว
[02:10] เล่นจริง
[02:18] เจ็บจริง
[02:38] จีจ้า ญานิน
[02:43] ฮิโรชิ อาเบะ ซูเปอร์สตาร์จากญี่ปุ่น
[02:50] ช็อกโกแลต
[02:55] ภาพยนตร์ โดย ปรัชญา ปิ่นแก้ว[3] จาก องค์บาก และ ต้มยำกุ้ง
[02:58] ตรุษจีน[4] 7 กุมภาพันธ์นี้ ทั้งประเทศ

[1] Yakuza, Japanese organized crime.
[2] Transcription of the English word 'autistic'.
[3] Thai film director and producer Prachya Pinkaew is known for such films as Ong-Bak and Tom-Yum-Goong.
[4] /trut ciin/, the Thai word for Chinese New Year. This is the release date for the movie (February 7, 2008).

Additional links
Chocolate at Wikipedia.
Chocolate at IMDb.
Get the idea? I've played around with different notations to indicate various things: personal names in green, difficult or unusual words in red. I also toyed with italics for words shown onscreen (as opposed to spoken words; this is a common issue for movie trailers), but I think it's kind of distracting. I'm not married to any of it, and I'm sure there are better ways of doing things that I haven't thought of. They don't have to be my ideas or my standards. Rather, whatever the community comes up with and decides works the best is what we'll use.

As I've said, this is just my sandbox version of my idea, but I'd like to invite everyone to come play around with it, too. Please, make new pages. Start transcribing, if you're so inclined.

I registered a domain name that I plan to eventually move this blog to. And I'll make a subdomain of that domain for the Thai Video Transcripts wiki, using MediaWiki software, which will allow me to better customize the features of the wiki (like hypertext footnotes, a la Wikipedia).
Whatever is done in the sandbox site will be moved to the final site when the time comes.

I've also been looking at other MediaWiki plug-ins that would let me do some pretty cool things. For example, if I'm reading their website correctly, Kaltura can overlay dynamic subtitles on Flash videos, which would mean being able to turn a static transcript into live subtitles, which would be awesome (but I think that would require actually hosting the videos myself, instead of piggybacking on YouTube). Another problem for another day.
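To make the transcript-to-subtitles idea a bit more concrete, here is a rough Python sketch. It assumes transcript lines shaped like the sample above ("[mm:ss] text") and writes them out in the common SubRip (.srt) subtitle format. The function names and the four-second default display time are my own placeholders, and this says nothing about how Kaltura or any wiki plug-in actually handles subtitles.

```python
import re

# Assumed line shape, taken from the sample transcript: "[mm:ss] text"
CUE_LINE = re.compile(r"^\[(\d{2}):(\d{2})\]\s*(.+)$")


def parse_transcript(text):
    """Return a list of (start_seconds, caption) tuples, skipping other lines."""
    cues = []
    for line in text.splitlines():
        match = CUE_LINE.match(line.strip())
        if match:
            minutes, seconds, caption = match.groups()
            cues.append((int(minutes) * 60 + int(seconds), caption))
    return cues


def srt_time(total_seconds):
    """Format whole seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},000"


def to_srt(cues, default_duration=4):
    """Build SRT text. A cue lasts default_duration seconds (a guess),
    cut short if the next cue starts sooner."""
    blocks = []
    for index, (start, caption) in enumerate(cues):
        end = start + default_duration
        if index + 1 < len(cues):
            next_start = cues[index + 1][0]
            if next_start > start:
                end = min(end, next_start)
        blocks.append(
            f"{index + 1}\n{srt_time(start)} --> {srt_time(end)}\n{caption}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = "[02:10] เล่นจริง\n[02:18] เจ็บจริง"
    print(to_srt(parse_transcript(sample)))
```

Footnote markers like [1] are left in the caption text as-is, and the transcript only records when a line starts, so the end times here are guesses rather than real timings.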

A few words about what I don't plan for this site to do: create videos. The whole point is to bring something to life as simply as possible that is self-sustaining and community based, built around a motivated group of Thai-language students (and, if we're lucky, some native speakers) who can collaborate to create a great learning resource for themselves and others.

So there you have it. My big idea (or, one of them). I think it's got a lot of potential. I'd be much obliged to receive any comments and suggestions you all may have.

May 20, 2008

Pen names of Thai writers

Thai writers very frequently use pen names (Thai: นามปากกา, literally 'pen name', likely a calque from English*), seemingly more often than writers in English do. Especially between the 1930s and 1960s, pen names were a small way of protecting oneself from the oppressive government. But the practice continues today. Let's look at a few:

กุหลาบ สายประดิษฐ์
Kulap Saipradit (1905-1974 -- See Thai/English Wikipedia), better known as ศรีบูรพา Siburapha, wrote more than fifteen novels, many short stories, and numerous non-fiction books, and even translated a few novels into Thai. A pioneering modern writer and prominent journalist, he was at times jailed and eventually spent the last 16 years of his life in exile in China. Thailand's military dictatorships were not kind to the new guard of intellectuals and writers, and "Siburapha" is a prime example. He was a prolific writer whom history has vindicated, and he is now rightly celebrated as one of the great writers of the mid-20th century. The pen name Siburapha is two words, Si ศรี "glory" and Burapha บูรพา "east", which together mean "glory of the east" or "glorious east". This is not a self-aggrandizing reference, but rather praise for the land of his birth--the orient, or east.

หม่อมหลวงบุปผา นิมมานเหมินท์ M.L. Buppha Nimmanhemin (1905-1963 -- See Thai Wikipedia) authored more than a dozen novels under the pen name ดอกไม้สด, which means "fresh flowers".


ก้าน พึ่งบุญ ณ อยุธยา Kan Phuengbun Na Ayutthaya (1905-1942, see Thai Wikipedia) is well known as ไม้ เมืองเดิม Mai Mueangdoem. The first name ไม้ /máay/ "wood", is a reference to his true first name, ก้าน /kâan/, which means "twig" or "stem". The last name means "old (former) city", which is a reference to his last name, ณ อยุธยา Na Ayutthaya. The former Thai capital of Ayutthaya is often called in Thai กรุงเก่า /kruŋ kàw/ (which also means "old city"). Thai surnames that begin with "Na" and are followed by a city name indicate that the family is descended from former royalty of that city. Besides Na Ayutthaya, you'll also see Na Songkhla, Na Lampang, etc.

มกุฏ อรฤด Makut Orarit (b. 1950) wrote the award-winning 1978 novel ผีเสื้อและดอกไม้ "Butterfly and Flowers", made into a film in 1985, under the name นิพพาน, the Thai word for "nirvana".

จิตร ภูมิศักดิ์ Chit Phumisak (1930-1966, see Thai/English Wikipedia) was another of Thailand's great thinkers. He was staunchly anti-nationalist, and as such he was persecuted by the harsh military dictatorships that ruled for decades in the wake of the 1932 fall of the absolute monarchy. He was shot to death at the age of 36. He used more than a dozen pen names in his life, but given my recent post about wordplay, the one I find most interesting is จักร ภูมิสิทธิ์ Chak Phumisit, a spoonerism (คำผวน) of his real name.

*Also known as นามแฝง "hidden name".

May 17, 2008

Thai movie title puns

[Note: I've continued to experiment with pop-up romanization for readers who don't read Thai. However, I've tweaked the system, opting to go for AUA romanization instead of the homebrewed modification of AUA that I had been using. Again, if this isn't your cup of tea, feel free to copy-and-paste the Thai into thai2english.com or thai-language.com, which allow users to select the romanization scheme of their preference.]

Thais love wordplay. From คำผวน (Thai spoonerisms) to อะไรเอ่ย jokes (of the "what's black and white and re(a)d all over" variety), wordplay abounds in Thai. And of all the forms of verbal trickery in Thai, the pun is as alive and well as any other.

Of course, I'm certain much punning whizzes over my head without my realizing it, but I catch enough to know it's common. One place I've been noticing puns lately is in movie titles.

Take the recent four-segment horror film, สี่แพร่ง (literally, "four paths", i.e. a four-way crossroads). Its English title is 4bia, a pun on the word "phobia". This is a pretty lame pun in English, but it works better in Thai. That's because the way Thais are taught English, the word "four" is written โฟร์, and pronounced just like the first syllable of "phobia". The word for phobia in Thai is, well, phobia, and "four" is such basic English that every Thai knows it. So it still counts as a pun by Thais for Thais. And it's a better pun in Thai than in English, at any rate.

Another title that I noticed: The Last Moment, the new film by Yuthlert Sippapak (ยุทธเลิศ สิปปภาค). Its Thai title is รัก/สาม/เศร้า. This is a pun on รักสามเส้า, the standard Thai phrase for "love triangle". RID99 has a quaint definition:

รักสามเส้า น. ความรักที่ชาย ๒ คนรักหญิงคนเดียวกัน หรือหญิง ๒ คนรักชายคนเดียวกัน.
love triangle n. love in which two men love the same woman, or two women love the same man.

Putting aside the fact that their definition denies the existence of love triangles in which one or more members are (gasp!) homosexual, a classic love triangle involves all three having a relationship of some kind with the other two, rather than simple two-guys-after-the-same-woman competition. One would expect the two men to be brothers or mortal enemies or bridge partners. Something like that.

Literally translated, the Thai phrase means "three-legged love". The pun in the film title is in the last word: เส้า "leg, support" is a homophone of เศร้า "sad, sorrowful". The implication is that it's a doomed love triangle. (Are they ever not?) Also note that the word เส้า is not the same word as เสา "column, pole, pillar", although the two words differ only in tone (and I wouldn't be surprised if they were etymologically related).

On an interesting side note, my wife thought that รักสามเศร้า (with the word meaning "sad") was the correct spelling of this phrase, and says she has thought so for as long as she's known the phrase. The word เส้า is rare in Thai, so she had reanalyzed it in a way that made sense to her, by substituting it with a homophone which she knew. This is a good example of folk etymology. I don't know if this misconception is at all widespread, but it's a fascinating possibility to think that this movie title might be a covert pun for some folks, and that it may even reinforce the misconception.

See also: this post from last year about a clever movie title, which involves both verbal and orthographic trickery.

March 26, 2008

Comparative Tai Source Book

Lately I've been reading William J. Gedney's Comparative Tai Source Book, a new publication from Thomas John Hudak. Gedney, who died in 1999, is a giant in the study of the Tai languages. His 1947 PhD thesis, Indic Loanwords in Spoken Thai, is still a good read more than 60 years later (available here).

This volume brings to completion a book of comparative Tai originally planned by Gedney. Hudak has organized Gedney's notes on 1159 cognate Tai words, making it easy to quickly compare the various cognate forms for a given word. Hudak has also added a chapter for each of the major branches of (Gedney's) division of the Tai language family: Southwestern, Central, and Northern. These chapters give detailed information about the phonology of each of the languages cited in the book.

Gedney collected comparative data on 19 Tai languages:

Southwestern Tai
  • Siamese (Standard Thai)
  • White Tai (Tai Khaw)
  • Black Tai (Tai Dam)
  • Shan
  • the Tai dialect of Nong Khai
  • Lue of Chieng Hung
  • Lue of Muong Yong
  • the Tai dialect of Chiengmai
Central Tai
  • the Tai dialect of Lei Ping
  • the Tai dialect of Lungming
  • the Tai dialect of Western Nung
  • the Tai dialect of Bac Va
  • the Tai dialect of Lungchow
  • the Tai dialect of Ping Siang
  • the Tai dialect of Ning Ming
Northern Tai
  • Yay
  • Saek
  • the Tai dialect of Wuming
  • the Tai dialect of Po-ai
Here's a map from the book of the distribution of these languages. Note that it's a map of just the 19 Tai languages Gedney collected data for, not all Tai languages. Obviously there are Tai speakers in Laos, too, among other places.


In reading through the book today, I discovered my new favorite word (okay, well, my favorite word for today). It's the Shan cognate of the word คา /khaa/ in Thai. The Thai meaning is 'stuck'. The Shan meaning is significantly more interesting. Here's the entry from the book:
0497 - stuck, A4
SW - S khaa¹; W, B kaa⁴; Sh kaa⁴ 'to escape, as an animal pierced by any weapon, and carrying the weapon in its flesh'; LNK khaa⁶; LMY kaa⁴
CN - LP khaa⁴; LM kaa⁴; WN kaa⁴, caa⁴; PS, NM kaa⁴
N - Y ka⁴; Sk khaa⁴
In case you missed that: to escape, as an animal pierced by any weapon, and carrying the weapon in its flesh. Granted, the data is 50 years old. I wonder if that word is still used much these days. Time to go ask my Shan-speaking friend.

March 23, 2008

Same same, but different

Sometimes, through some twist of semantic fate, a word can acquire two senses with opposite meanings. Like 'sanction', which can mean to approve or condemn. Or 'fine', meaning merely acceptable or exceptionally good. There's a word for these. They're called contronyms, or alternately auto-antonyms.

A couple of weeks ago, there was a brief discussion on ThailandQA about the auto-antonymy of the Thai word รื้อ, which can mean either to tear down, as in รื้อตึก 'raze a building', or to bring up, as in รื้อเรื่องเก่า 'revive old matters'.

Another auto-antonym occurred to me recently: ป้องกัน. RID defines it thusly (my translation):

ป้องกัน
ก. กั้นไว้เพื่อต้านทานหรือคุ้มครอง.
/pɔŋ kan/
v. to block, in order to oppose or protect.

So when you ป้องกัน something, you are either opposing it or protecting it. Quite different meanings. You can ป้องกันโรค 'protect against disease', or you can ป้องกันตัว 'protect yourself'. Here's a real life example of what can happen with careless translation of what I suspect was the word ป้องกัน in the source material. From a press release of the Public Relations Department:

Deputy Secretary-General of the Office of the Narcotics Control Board (ONCB) Pittaya Jinawat revealed points made in a tactical meeting with computer game related businesses focused on protecting and guarding against the pervasion of narcotics and also game addiction.

...

Game developers have affirmed that they are ready to produce more positive games that are age appropriate but also evoke family participation, which they believe will be the best way to safeguard deviant behavior. [Emphasis added]


I don't have the original Thai, so it's only a guess that the original word is ป้องกัน. But clearly they meant 'safeguard against deviant behavior'. What a difference a word makes.

Thinking more about it, there are other words that are sort of like auto-antonyms, but not exactly. For example, เหม็น. This word most commonly means 'to have an objectionable smell', as in ตัวคุณเหม็นบุหรี่ไปหมด 'you reek of cigarettes'; but it can also mean 'to find a smell objectionable', as in ขออนุญาตสูบบุหรี่ คุณเหม็นรึเปล่า 'Mind if I smoke? Does (the smell) bother you?'

It works similarly for หนัก, 'heavy/to find heavy': กล่องนี้หนักมาก 'this box is really heavy', vs. ยกไปเท่านี้ก่อน กลัวคุณจะหนัก 'that's enough to carry, I don't want it to be too heavy for you.'

There's something different going on with เหม็น and หนัก than with ป้องกัน, but I haven't thought of (or come across) a good way to classify it. Any ideas?

March 21, 2008

Broken transitivity

I noticed something about the verbs แตก /tɛɛk/ and หัก /hak/, which both mean 'break'. It's about transitivity.

For those of you who need a simple refresher course, a transitive verb is one which requires a direct object. You have to do it to something. For example, I can lift a box, but I can't just lift (unless it's understood from the context, but that's different).

An intransitive verb is one which does not require a direct object (but you may be able to specify an indirect object using a preposition). For example, I can complain, I can complain to you, but I can't complain you.

Etymologically, หัก is transitive (e.g. I broke the lamp), while แตก is intransitive (e.g. the lamp broke). English uses the same verb in both senses. The English verb is ambitransitive, in linguistics-speak. Thai often has two words where English has just one. Where English has he boiled water vs. the water boiled, Thai has เขาต้มน้ำ vs. น้ำเดือด (ต้ม /tom/ is transitive and เดือด /dʉat/ is intransitive).

However, หัก has come to be used quite commonly as an intransitive verb. While you can still หัก something, more often you ทำ X หัก ('cause X to break'). This matches the usage of แตก--typically, you ทำ X แตก (also 'cause X to break'). The most common transitive uses of หัก still around are figurative, like หักหลัง, to betray, literally to break (someone's) back; also หักคอ, หักใจ, หักอก, etc. หักอก /hak ok/ is interesting because it means 'break (someone's) heart', while the reversed order อกหัก /ok hak/, meaning 'heartbroken', is just as common. This would've been a good one for my Semantic Switcheroo post. Google even turns up a small number of hits for ทำอกหัก 'cause (someone's) heart to break', too. It would appear that หัก is flirting with becoming entirely intransitive.


On the flip side, the traditionally intransitive verb แตก has developed some transitive senses. One in particular seems to be influenced by English. แตกแบงค์ /tɛɛk bɛŋ/ means to 'break a bill', one of the extremely vital services that 7-Eleven provides in Thailand. If you're about to get into a taxi and all you have is a 1000 baht bill (or even a 500 baht bill), you'd better go แตก that แบงค์ at Seven first. In a similar vein, I've also seen แตกวง /tɛɛk woŋ/ meaning 'to break up (a band)', as in 'Aerosmith ทำท่าจะแตกวง' Aerosmith is acting like they're going to break up. The intransitive form is วงแตก /woŋ tɛɛk/, as in 'Potato วงแตกแล้ว' The band Potato broke up.

These are limited uses of แตก as a transitive verb (there are possibly more, like แตกแถว and แตกฝูง, meaning to be different from the pack, or non-conformist), but they're very interesting developments nonetheless.

A kind of transitivity switcheroo.

March 12, 2008

Loanwords 4: English loanwords in 1892

There's something irreconcilably nerdy about reading the dictionary. What can I say, I like dictionaries. It's not like I read them cover to cover--I browse. Electronic dictionaries are good for many things, but I love the simple serendipity of flipping through a paper dictionary and finding great new words, or making unexpected discoveries.

I also have a thing for old dictionaries. Take my digital critical edition of the first Thai-English dictionary as proof of that. It's based on a mid-19th Century manuscript of unknown provenance in the British Museum. I gradually typed out the 500-page document over the course of 2006. It was roughly equal parts fascinating and tedious. I got pretty good at reading the chicken-scratch English. The Thai is much easier to read, ironically, despite a few orthographic quirks of the era.

I typed it up from a digital scan made of a microfilm copy of the manuscript. Since old dictionaries are so hard to find in the flesh--er, paper--a decent scan will do. And thanks to such scans I've been able to examine many early Thai dictionaries. No doubt, without this technology I never would've gotten to read through them closely even if I did find them in some library.

Recently I've been enjoying E. B. Michell's 1892 work A Siamese-English Dictionary, For the use of students in both languages. The book is in the public domain, and downloadable from Google Books within the United States, or viewable on SEAlang.
I don't know much about Michell other than what the title page says: "M.A., Barrister-at-Law, late Legal Adviser to His Siamese Majesty's Government." The Majesty in question here is King Chulalongkorn, or Rama V, who reigned from 1868 to 1910. Google tells me Michell's full name is Edward Blair Michell, and that's the extent of my knowledge of him.

I posted last month about finding 'copy' in this dictionary, spelled กอปี้, whereas today it's usually spelled ก๊อปปี้. As it turns out, there are a number of other loanwords that Michell says come from English. And interestingly, all of them are still used:

ไปรเวต = private; I've only seen this used nowadays to refer to casual attire. I first encountered it when my wife and I had pictures taken before our wedding. We had pictures taken in a few different outfits, including ชุดไปรเวต. This usage must be uniquely Thai, because 'private outfit' doesn't sound like anything I'd normally have my picture taken in.

แปลน = plan; I still hear this used as an alternative to แผน. I don't know the etymology of แผน, but it seems to be preferred as the native (or more native sounding?) alternative to แปลน.

แหม่ม [แหฺม่ม] = Ma'am; this has gone from referring to a woman Westerner to being a very popular girl's nickname.

ออฟฟิศ = office; it's even still spelled this way, with the final ศ. You can usually spot a loanword as being of 19th Century origin by the presence of these less common letters usually reserved for loans from Pali and Sanskrit. Two other examples are โปลิศ 'police' and อังกฤษ 'English'.

บ๋อย = boy; this specifically refers to a servant boy or a waiter. I still hear this around.

บิล = bill; everybody knows this one, don't they? Pronounced 'bin' in the typical Thai way, and nowadays usually paired with 'check' เช็ค as เช็คบิล, used to ask for the check at a restaurant. In this context, 'check' and 'bill' are actually two words for the same thing. I would hypothesize that if 'bill' was already in the language, and so was 'check' in the verb sense 'to check, to examine', then English 'check please' in the restaurant setting influenced the birth of the quirky Thai-ism 'check bill', which in the Thai context means to literally check the bill.

แบงก์ = bank, meaning the financial institution; more commonly spelled แบงค์ nowadays.

ปิ่น = pin; used for one's hair. Immortalized in Thai in phrases like ปิ่นเกล้า pin klao, a pin for holding the hair in place when pulled up on the crown like a bun.

ฟุด = foot (the unit of measure); nowadays spelled ฟุต, reflecting the final t of the English spelling.

มรสุม [มอ-ระ-สุม] = monsoon; I don't think this is actually from English as Michell claims. Etymonline traces its route into English as Arabic > Portuguese > Dutch > English:
"trade wind of the Indian Ocean," 1584, from Du. monssoen, from Port. monçao, from Ar. mawsim "appropriate season" (for a voyage, pilgrimage, etc.), from wasama "he marked." When it blows from the southwest (April through October) it brings heavy rain, hence "the rainy season" (1747).
I'd say it's quite plausible that it came into Thai from Arabic, perhaps through Persian (which has many Arabic loans), since Thai has other words of purported Persian origin, like องุ่น 'grape', กุหลาบ 'rose', and กะหล่ำ 'cabbage'. Also notice that 'morasum' is slightly closer to 'mawsim' than to 'moncao' or 'monssoen' (but not conclusively so). If it's a newer loan, it may have come through Portuguese, which gave Thai at least one other early loanword, สบู่ 'soap'.