We're nearing the end of my translations of the frontmatter from RID99 (พจนานุกรม ฉบับราชบัณฑิตยสถาน พ.ศ. ๒๕๔๒). Etymology's on the agenda today. The original Thai for this section is here. There is actually more to translate in the dead tree version of the RID99, including a synopsis of the history of RID. But the electronic text of that extra stuff isn't included in the web version, which makes it a little harder to work with. Maybe I'll get around to translating that in the future, though.
In this post I've tried something new: the blue text is direct quotes of full or partial entries from the dictionary, for easier scanning. Scattered individual Thai words are still black.
Part 6: Etymology
1. The origin of a word is given at the end of its entry, as an abbreviation in parentheses, e.g. สทึง [สะ-] น. แม่น้ำ, ใช้ว่า จทึง ฉทึง ชทึง ชรทึง สทิง หรือ สรทึง ก็มี. (ข. สทึง ว่า คลอง). or การะบุหนิง น. ดอกแก้ว. (ช).
One of the key things I learned from this section is why RID sometimes gives the spelling of the original word, like (ข. โปฺรส), and sometimes just gives the language, like (ป., ส.). If no spelling of the source word is given, then (so RID claims) the Thai spelling preserves the original spelling. In other words, it's a transliteration from the source language, which is often the case with Pali and Sanskrit words.
2. Any word that is given as being from another language in fact does not correspond exactly with the source word, because the words from languages such as Pali, Sanskrit, or Khmer that are borrowed in Thai are usually either shortened, changed orthographically, or changed phonetically, e.g. ธมฺม (Pali) and ธรฺม (Sanskrit) correspond to Thai ธรรม; โปฺรส (Khmer) corresponds to Thai โปรด. In giving the etymology, sometimes the spelling in the original language is given as well, e.g. ธรรม has (ส. ธรฺม; ป. ธมฺม), or โปรด has (ข. โปฺรส), in order to compare the original spelling with the spelling used in Thai. For loanwords which are written very close to the original language, only the source language is given, e.g. กฏุก only gives (ป.), and ศิขร only gives (ส.). If a word is both Pali and Sanskrit, then both languages are given, e.g. รจนา has (ป., ส.). If a word is partially Pali and partially Sanskrit, then the original spellings of both Pali and Sanskrit are given, e.g. ปราโมทย์ gives (ส. ปฺรโมทฺย; ป. ปาโมชฺช). If the spelling is Pali, but is very similar to Sanskrit, e.g. หทัย, then it is given as (ป.; ส. หฺฤทย), or if the spelling is Sanskrit but very similar to Pali, e.g. สตัมภ์, then it is given as (ส. สฺตมฺภ, สฺตมฺพ; ป. ถมฺภ).
3. For any word whose language of origin is uncertain, but which is written similarly to a word in another language, a note in parentheses compares it with this language or that language, e.g. กำปั่น น. เรือเดินทะเลขนาดใหญ่ชนิดหนึ่ง... (เทียบมลายู หรือฮินดูสตานี ว่า capel).
4. Some archaic words are written one way, but nowadays the spelling has changed, in which case both the archaic and the modern spelling may be included, e.g. วงษ์ (โบ) น. วงศ์. วงศ- and วงศ์ [วงสะ-, วง] น. เชื้อสาย, เหล่ากอ, ตระกูล. (ส. วํศ; ป. วํส). Or only the modern spelling may have an entry, and the archaic spelling is given in parentheses at the end of the definition, e.g. กำสรวล [–สวน] (แบบ) ก. โศกเศร้า, คร่ำครวญ, ร้องไห้, เช่น ไทกำสรดสงโรธ ท้ยนสงโกจกำสรวลครวญไปพลาง. (ม. คำหลวง ทานกัณฑ์). (โบ กำสรวญ).
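To make the notation in points 1 through 3 concrete, here is a minimal Python sketch that splits the text inside an etymology note into source languages and source spellings. It is only an illustration, not anything RID itself provides. The name parse_etymology and the layout of the result are hypothetical, the abbreviation table holds only the three abbreviations glossed above (ข. Khmer, ป. Pali, ส. Sanskrit), and it assumes that ว่า introduces the meaning of the source word, as in (ข. สทึง ว่า คลอง).

```python
import re

# Only the abbreviations glossed in the text above; any other abbreviation
# is passed through unchanged rather than guessed at.
LANGUAGES = {"ข": "Khmer", "ป": "Pali", "ส": "Sanskrit"}

# One or two Thai consonants followed by a period, then an optional remainder.
ABBREV = re.compile(r"^([ก-ฮ]{1,2})\.\s*(.*)$")


def parse_etymology(note):
    """Parse the text inside the parentheses, e.g. 'ส. ธรฺม; ป. ธมฺม'.

    Hypothetical helper: returns a dict with the raw note, a flag saying
    whether the origin is stated outright (False for a 'เทียบ' comparison),
    and a list of sources with their spellings and optional gloss.
    """
    note = note.strip()
    if note.startswith("เทียบ"):
        # Point 3: uncertain origin, given only as a comparison.
        return {"certain": False, "sources": [], "raw": note}

    sources = []
    for segment in note.split(";"):
        for part in segment.split(","):
            part = part.strip()
            if not part:
                continue
            found = ABBREV.match(part)
            if found:
                abbr, part = found.group(1), found.group(2)
                sources.append({
                    "language": LANGUAGES.get(abbr, abbr),
                    "spellings": [],
                    "gloss": None,
                })
            if part and sources:
                # Assumption: 'ว่า' separates the source spelling from its meaning.
                spelling, _, gloss = part.partition(" ว่า ")
                sources[-1]["spellings"].append(spelling)
                if gloss:
                    sources[-1]["gloss"] = gloss
    return {"certain": True, "sources": sources, "raw": note}


if __name__ == "__main__":
    for example in ["ป., ส.", "ส. ธรฺม; ป. ธมฺม", "ป.; ส. หฺฤทย", "ข. สทึง ว่า คลอง"]:
        print(parse_etymology(example))
```

A source with an empty spellings list corresponds to the case where RID gives only the language, i.e. where the Thai spelling is supposed to be very close to the original.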
By doing a simple analysis of the (very flawed) full online text of the dictionary, here's a count of Thai word origins from several languages:
Mistakes in my counting aside, clearly this is a significant weakness of RID.
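For the curious, a tally of this kind can be approximated along the following lines. This is a rough Python sketch under stated assumptions, not the exact procedure behind the figures in this post: count_origins, FINAL_NOTE and ABBREV are made-up names, each entry is assumed to be a single string, and only the last parenthesised note of an entry is inspected. That last assumption already miscounts entries such as กำสรวล above, whose final note is the archaic spelling (โบ กำสรวญ) rather than an etymology, which is one reason to treat any such count with suspicion.

```python
import re
from collections import Counter

# The last parenthesised note in an entry, allowing a trailing period.
FINAL_NOTE = re.compile(r"\(([^()]*)\)\.?\s*$")

# A language abbreviation at the start of a piece: one or two Thai consonants,
# an optional period, then whitespace or the end of the piece.
ABBREV = re.compile(r"^([ก-ฮ]{1,2})\.?(?=\s|$)")


def count_origins(entries):
    """Count entries per source-language abbreviation (hypothetical helper).

    Entries with a 'เทียบ' (compare) note are tallied under 'เทียบ';
    entries with no recognisable etymology note are tallied under 'none'.
    """
    counts = Counter()
    for entry in entries:
        found = FINAL_NOTE.search(entry)
        if found is None:
            counts["none"] += 1
            continue
        note = found.group(1).strip()
        if note.startswith("เทียบ"):
            counts["เทียบ"] += 1
            continue
        languages = set()
        for piece in re.split(r"[;,]", note):
            hit = ABBREV.match(piece.strip())
            if hit:
                languages.add(hit.group(1))
        if languages:
            # An entry marked (ป., ส.) counts once for each language.
            counts.update(languages)
        else:
            counts["none"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "การะบุหนิง น. ดอกแก้ว. (ช).",
        "วงศ์ [วง] น. เชื้อสาย, เหล่ากอ, ตระกูล. (ส. วํศ; ป. วํส).",
    ]
    print(count_origins(sample))
```

Note that a sketch like this cannot tell a language abbreviation from a citation abbreviation that happens to look the same, so the raw numbers still need a manual check.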
Point 3 notwithstanding, it often holds that if no connection is certain, no etymological info is given. The (เทียบ X) note is used just over 100 times. Even older but well-known loanwords, like the Thai numbers เอ็ด, ยี่ and สอง through เก้า, which were borrowed from Chinese, are implicitly claimed as Thai. This may have been acceptable 50+ years ago, when the conventional wisdom among Thai scholars was that Thai was a relative of Chinese, but it's hard to excuse nowadays.
In addition, to say there are only 400 words from Khmer in Thai is comical, and I'd be surprised if Malay has really had less of an influence than French--even in Bangkok. The English figure above can't be trusted at all, because RID doesn't have an automatic way to distinguish between words which are transliterations from English (e.g. โฮเต็ล) and words which are translated from English (e.g. โทรทัศน์). I'd have to go through and do a manual count to know that.
The number of Indic loans, at roughly 10,000 words, makes up 25% of the dictionary's total entries, which sounds reasonable. I'm sure that a relatively small number of these make up more than 25% of actual word usage in Thai based on frequency, however. It's also worth mentioning that many Indic words--or alternate versions of them--came into Thai by way of Khmer, which is ignored in RID.
All told, RID is a decent starting point, if woefully incomplete. Its analysis is mostly simplistic: it ignores how and when a particular word came into Thai, and it fails to give the meaning of the word in the original language.
Oh, yeah. And Happy Valentine's Day!