August 12, 2007

RID: Past, Present, Future

This post started as a reply to David's comment on this earlier post about the online version of the พจนานุกรม ฉบับราชบัณฑิตยสถาน, or Royal Institute Dictionary (see also my follow-up post). The reply started getting so long that I decided to give it the full treatment. David writes:

The RID is Thailand's official language standard and getting it on-line was a major step forward in creating accessibility for much of the Thai community. While the text itself is not very expensive at 600 baht for a work of this size, this is a volume not often found in Thai households. In addition portability of this gigantic text is a major issue. While the new Matichon dictionary represents a private industry attempt to update the venerable RID, the Thai-speaking community really needs the RID as a baseline for the language.

He's right--most Thai households don't have this dictionary. Even if you counted only homes with a college-educated head of household, I think most still wouldn't have it. Some dictionary, yes, but not this one.

One problem is that it's too big, physically. It's not printed on the thin "bible paper" that we are (or at least I am) used to seeing in English dictionaries. Two different members of the Royal Institute gave me different anecdotal reasons for this: first, that they wanted to use Thai paper (the thinner paper isn't produced domestically and must be ordered from abroad), which would certainly be a case of misguided nationalism, and second, that it was simply an error in communicating with the publishing company, นานมีบุ๊คส์ (Nanmee Books). I think the latter is more likely.

The size makes it unwieldy. It's around 1500 pages, but it would be less than half as thick on the thinner paper. Compare it to the Matichon Dictionary, which is around 1000 pages, but only 1/3 as thick (and much lighter and handier to use).

But beyond its dimensions, I think the decades of (off-and-on) work on this dictionary represent a hugely squandered opportunity. In his comment, David continues:

Let's hope that the RI will get the expertise it needs in database and internet technology and the assistance of a dedicated staff to allow the dictionary to be disseminated to the on-line Thai community.

Again he's right. But they need not only the technical know-how to move the dictionary into the digital age, but also the lexicographic know-how to do a complete and utter reworking of the dictionary. Then it will truly be worthy of a place in the home of every Thai family (and the Royal Institute can improve its reputation for dictionary reliability and quality--perhaps comparable to how Webster is a household name irrevocably associated with dictionaries in America, almost 200 years after his original dictionary).

I know a bit about the pedigree of the RID, because I wrote my college thesis on Thai dictionaries (which was, I think, also a semi-squandered opportunity--it should be better than it is, though at least it didn't take me decades to complete). This started as an independent research project in early 2005, when I received grant money from my university to research the topic in Bangkok. One part of my research was to interview members of the Royal Institute, and to attend meetings of the dictionary revision committee (คณะกรรมการชำระพจนานุกรม). The Royal Institute has many committees, on which its 150+ experts serve. Besides the Institute's general dictionary, there are many specialized dictionaries, a many-volume encyclopedia, etc. But for many members, this work is extracurricular. These people are the cream of the crop in their fields, and they often balance this work with full-time jobs (though many are retired).

Each generation of the book (1950, 1982, 1999) has only been a revision of the previous one (they use the word ชำระ, which means "cleanse"). More than one interviewee made a point of clarifying this for me: whenever I used a phrase like "ทำพจนานุกรม" ("make a dictionary") in an interview, they would point out that the work they were actually doing was to ชำระ, not ทำ.

I've studied the numbers pretty closely (I can post the data if folks are interested), and overall the dictionary's scope and breadth have not grown dramatically from one generation to the next, though the changes between 1950 and 1982 are greater than the changes between 1982 and 1999. (Despite its "official" 1999 date, chosen to coincide with the King's 72nd birthday, that edition was actually four years behind schedule and first published in 2003.)

The methodology is the biggest problem. It's a committee of 15-20 people who meet for four hours a week (two two-hour meetings), and who are all busy with other projects (the president of the committee, a nominally retired septuagenarian, was on 11 different committees at the Institute when I interviewed him). They go over the previous incarnation of the dictionary entry by entry in alphabetical order and edit definitions, add and remove entries, or add senses to existing entries. As everything is done by committee, they frequently discuss and argue over words and spellings and etymologies and how to phrase or rephrase the definition. But just as often it seemed the members would defer to the president after a period of debate, apparently not wanting to drag things on too long for any given entry.

Perhaps a bigger problem than the committee method is that there is no (and, so far as I have been able to ascertain, has never been any) systematic attempt to comb the language for all words, phrases and senses to include in the dictionary. To put it frankly, a full-time editor and staff might do in weeks or months what takes the Royal Institute years. But the reason they do it how they do it goes back to the idea of expertise and authority. Unfortunately, this limits the dictionary to the knowledge of a small group of people, experts though they may be.

The first edition of the Oxford English Dictionary took seventy years to finish, and they had thousands of volunteer readers sending in usage examples from works spanning more than 700 years of the language. Thailand's written history is just as long, and modern technology makes much of this work far quicker (and far more accurate).

For me, it boils down to this: If something is going to take years (or decades), let's get an OED out of it, or a Webster's Third out of it. (By the way, these books both have fascinating histories, which you can read about in any of several books on the subjects--for a "textbook" of lexicography in general, I recommend Landau--particularly the 2nd Edition, which includes much on modern corpus-based lexicographic techniques).

I am only so hard on the RID because I see its potential. There are many, many, many areas that need improvement, both in the physical book and the online version. I'm rooting for them, but if a couple of decades from now the Royal Institute is still plodding along in its set ways, others may have already taken up the torch. In fact, that may be what they need--a Thai dictionary that outpaces their technology and methodology to the point that the necessity of a full overhaul will finally be understood. Ultimately, this would be a good thing for both the language and its speakers.

