August 17, 2007

Words that end in สระอึ, SEAlang searching, and teaching a man to fish

The other day, a friend of mine emailed to ask me to help him find words that end in the vowel -ึ*, or สระอึ, as one would say it aloud in Thai. He had come up with two already, อึ and ตึ. Words that end in this vowel are quite rare, so I thought I'd share what I found.

First of all, I cheated. I can't say that I relied on my immeasurably vast knowledge of the language--though it would be nice if I had anything remotely close to that (that's what reference works are for). So I turned to the dictionary. Specifically, the online dictionary. It's exactly this sort of question which a well-designed online dictionary can handle far better than the traditional dead tree variety. Why? Because of a little thing called a wildcard.

Note that I say "well-designed dictionary." Not every online dictionary supports wildcards, and some don't support them as well as I'd like (ahem.. RID). The best that I've seen in this area is the SEAlang Thai-English dictionary, although it has a bit of a daunting interface at first.

Much of the basic search syntax of SEAlang is like traditional regular expression syntax. For example, a period (.) matches any character.
For wildcard searches on SEAlang you have to use the phonetic search box (they call it "IPA", though it's not exactly that), so I tried .ʉ as my first search string. If you're wondering how to make the special character for this vowel, SEAlang makes it easy. You can click the button above the search box, or you can type the shortcut key given in parentheses on the button, and it will automatically convert to the desired character. In this case the shortcut is U (capital u). It magically becomes ʉ before your very eyes.

These are the results of my search: ตึ รึ ฮึ (also หึๆ as part of the phrase หัวเราะหึๆ)

Now, one thing I noticed is it didn't find อึ--but there's a simple reason for that. Phonetically speaking, there's no sound before the vowel (except possibly a glottal stop, but SEAlang's phonetics don't include that). Searching just ʉ does return อึ, though.

But that's not all. It occurred to me that the search string .ʉ wouldn't return words that began with a consonant cluster. So I tried .*ʉ next. The asterisk means "zero or more of the preceding character." In combination with the period, it amounts to "any sequence of characters." The problem is that it causes the search to also match the long vowel, which is represented by two characters (ʉʉ). (Note that this search does return อึ, since that is the "zero" case.)

Finally I searched C*ʉ. The C in this case stands for any consonant. The asterisk makes it match zero or more consonants. Voila! Finally I've found the search I wanted all along. It matches all the previously found words, including อึ, as well as a newcomer: ครึ
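
For anyone who wants to see why the three search strings behave differently, here is a rough sketch in Python. It is not how SEAlang works internally; Python's re module simply stands in for its search engine. The transcriptions are simplified, tone-free guesses of my own, the consonant letters behind C are an assumption, and มือ /mʉʉ/ "hand" is thrown in only as a long-vowel word that should not make the final list.

```python
import re

# Simplified, tone-free transcriptions. These are illustrative assumptions,
# not actual SEAlang output.
WORDS = {
    "อึ": "ʉ",
    "ตึ": "tʉ",
    "รึ": "rʉ",
    "ฮึ": "hʉ",
    "ครึ": "khrʉ",
    "มือ": "mʉʉ",  # long vowel: should not appear in the final list
}

# Assumed stand-in for SEAlang's C ("any consonant") shortcut.
CONSONANT = "[bcdfghjklmnprstwyŋ]"


def search(pattern):
    """Return the Thai words whose whole transcription matches the pattern."""
    regex = re.compile(pattern.replace("C", CONSONANT))
    return [thai for thai, ipa in WORDS.items() if regex.fullmatch(ipa)]


print(search(".ʉ"))   # exactly one character before the vowel
print(search(".*ʉ"))  # anything before the vowel, so long ʉʉ sneaks in
print(search("C*ʉ"))  # zero or more consonants, then the short vowel
```

The first search misses อึ and ครึ, the second picks up มือ along with everything else, and only the third returns exactly the short-vowel words.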

Now that we have the words from SEAlang, let's switch things up by looking them up in RID99 (my English translations):

ครึ [คฺรึ] (ปาก) ว. เก่าไม่ทันสมัย.
(Colloq.) adj/adv. Old and out of date.

ตึ, ตึ ๆ ว. ลักษณะกลิ่นเหม็นอย่างหนึ่งคล้ายกลิ่นเนื้อแห้ง, มักใช้ประกอบกับคํา เหม็น เป็น เหม็นตึ.
A kind of unpleasant odor, like the smell of dried meat; usu. with /men/ as /men tʉ/.

รึ [not in RID99] A common abbreviation of หรือ used in transcribing speech or informal writing. Frequently seen in the question phrase รึเปล่า.

ฤ ๑ [รึ] เป็นรูปสระในภาษาสันสกฤต เมื่อไทยนํามาใช้ออกเสียงเป็น ริ รึ หรือ เรอ เช่น ฤทธิ์ ฤดู ฤกษ์.
A Sanskrit vowel. When used in Thai, pronounced /ri/ or /rʉ/, or /rəə/, e.g. /rit/ /rʉduu/ /rəək/.

ฤ ๒ [รึ] (กลอน) ว. หรือ, ไม่, เช่น จะมีฤ ว่า จะมีหรือ, ฤบังควร ว่า ไม่บังควร.
(Poetic) adv. /rʉʉ/, /mai/, e.g. /ca mi rʉ/ "Will there be?", /rʉ baŋkhuan/ "not appropriate".

ฦ, ฦๅ ๑ วิธีเขียนเสียง ลึ ลือ แต่บัญญัติเขียนเป็นอีกรูปหนึ่งต่างหาก อนุโลมตามอักขรวิธีของสันสกฤต.
A way to transcribe the sound /lʉ/ or /lʉʉ/, but it is prescribed to be written another way, after the Sanskrit.

หึ, หึ ๆ ว. เสียงดังเช่นนั้น.
Adv. Onomat. A loud sound.
(Note that the use of เช่นนั้น indicates onomatopoeia. It literally means "like that," i.e. like the sound of the word.)

อึ (ปาก) ก. ถ่ายอุจจาระ (มักใช้แก่เด็ก). น. ขี้, อุจจาระ.
(Colloq.) v. To defecate (usu. of children). n. poop, feces.

ฮึ อ. คําที่เปล่งออกมาแสดงความไม่พอใจหรือแปลกใจ.
Interj. An expression of displeasure or surprise.

One last thing. The search string we used only covers one-syllable results. SEAlang has an easy way to fix this. Under the "Approximate matching" section, change the selection to "syllable or longer," which means it will now match any word that has any syllable ending in /ʉ/. There are a bunch of these. One of the most common is พฤหัส /pharʉhat/ "Thursday." However, it also gives us yet another word for our list: สะตึ. This one's in RID, too:

สะตึ, สะตึ ๆ (ปาก) ว. ไม่มีอะไรดี, ไม่ได้เรื่อง, ไม่มีค่า, เช่น หนังเรื่องนี้สะตึดูแล้วเสียดายเงิน ของสะตึ ๆ อย่างนี้ไม่ซื้อหรอก.
(Colloq.) Adj./adv. No good, useless, worthless, e.g. "This movie is no good--once you've watched it, you regret paying." "I don't buy useless stuff like this."
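
Here is a small Python sketch of the difference between "any syllable ends in /ʉ/" and "the word itself ends in /ʉ/". I don't know how SEAlang actually splits words into syllables, so the syllable lists below are hand-made assumptions in a simplified, tone-free transcription, and the final-syllable filter is my own addition rather than a SEAlang option.

```python
import re

# Zero or more consonants followed by short ʉ; the consonant set is an assumption.
SHORT_U_SYLLABLE = re.compile(r"[bcdfghjklmnprstwyŋ]*ʉ")

# Hand-syllabified, tone-free transcriptions (illustrative assumptions).
WORDS = {
    "พฤหัส": ["pha", "rʉ", "hat"],
    "สะตึ": ["sa", "tʉ"],
    "ฤดู": ["rʉ", "duu"],
    "มือ": ["mʉʉ"],
}


def any_syllable_hits():
    """Words with at least one syllable ending in short ʉ."""
    return [
        thai
        for thai, syllables in WORDS.items()
        if any(SHORT_U_SYLLABLE.fullmatch(s) for s in syllables)
    ]


def final_syllable_hits():
    """Words whose last syllable ends in short ʉ."""
    return [
        thai
        for thai, syllables in WORDS.items()
        if SHORT_U_SYLLABLE.fullmatch(syllables[-1])
    ]


print(any_syllable_hits())    # the broad "syllable or longer" style of match
print(final_syllable_hits())  # only words that actually end in the vowel
```

The broad match is what turns up words like พฤหัส, so the results still have to be sifted by hand (or by something like the second function) to find the ones that belong on the list.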

There are probably more hidden out there, including an elaborate version of ครึ that I found in RID99 (คร่ำครึ ว. เก่าเกินไป, ไม่ทันสมัย. Adj./adv. Too old, out of date.) Can anyone else track down more?

* Romanized variously as /ʉ/ or /ɨ/ or even /y/, but they're just arbitrary symbols, really. Personally, I dislike using digraphs (like eu or ue) to represent this single sound. I think insofar as a romanization system is necessary for learners, a one-to-one correspondence of symbol to sound is best, à la basic IPA.
