Obviously, wordcount isn't a good measure of dictionary quality. In Thai we don't see this as much. The notable exception that I've seen is Wit Thiengburanatham (วิทย์ เที่ยงบูรณธรรม), whose library-size Thai-English dictionary boasts on its front cover that it contains more than 80,000 words. Anyone who has used this dictionary knows what a large percentage of these are flora and fauna names, which I would guess is what makes up most of the difference. These words are certainly valuable to record, but it goes to show why a huge count doesn't necessarily mean more words that will be useful to the average user. But even if wordcounts aren't a good gauge of overall quality, they can still be a (very rough) gauge of coverage.
In 2005, during the early days of the project that became my senior thesis, I wanted to know how RID stacked up. Believe it or not, in this computer age, I counted the entries. I had no digital texts I could use, and since I watched a lot of TV anyway, whenever I found myself with downtime I'd count away, doing each letter of the alphabet separately, periodically jotting the number down on the corner of the page so I wouldn't lose my count. I kept a table on the flyleaf. I did one pass to count headwords (e.g. ใจ) and another pass to count subheads (e.g. ใจดี). Oh yeah, and did I mention I did this for the 1950, 1982 and 1999 editions? I know, it's embarrassing to admit, like saying I translated Harry Potter into Klingon. But lest you should mourn for my social life, I was engaged when I started and married by the time I finished. My wife thought I was crazy. Still does, most of the time. I also wonder.
Nowadays, I have digital texts of RID82 and RID99 (although the latter has lots of missing entries). And though I could check my counts, I haven't. The percentage difference is sure to be small. Or maybe I'm just scared I'll discover that I don't know how to count as well as I thought I did.
The curious can see letter-by-letter breakdowns in the spreadsheet I made at the time, which I've uploaded to Google Docs. (It also contains a partial count of Matichon.)
It's interesting to note that while there were significant gains in both headwords and subheads between RID50 and RID82, there were fewer than a thousand new headwords in RID99, but upwards of 5,000 new subheads. The differences aren't just new words, though. There are lots of changes (large and small) to the definitions between editions, mostly in the direction of beefing them up and adding more senses. And even though the total number of heads and subheads in RID99 is less than half of what Dr. Wit's dictionary claims, this does not even begin to capture the number of senses in the dictionary, so it's difficult to compare the two.
In the end, though, it's clear RID isn't anywhere near a complete record of the Thai language. It's probably a nearly complete record of Standard Thai, that mythical creature that is taught to all and spoken by none. Actual Thai is much harder to put in a book, much harder even to define.
Contenders in the commercial sector like Matichon are helpful, because by taking the descriptive approach to lexicography, they're helping force the Royal Institute to come out of its prescriptive shell, as seen in last year's release of their Dictionary of New Words, Vol. 1 (พจนานุกรมคำใหม่ เล่ม ๑). That's a good thing, because unless they continue to adapt their methods, their dictionary will become obsolete. If they keep up the same pace as with their last dictionary, the next edition of RID will be released in 2024. And if there are only a few thousand new words and compounds in it, people will wonder if the decades of work were worth the trouble.