More Weird Word Work


Seeking words with relatively few distinct letters

In contrast to my post about words with lots of LOJO (Letters Occurring Just Once), this item is about words that have (for their length) very few different letters. An example that will be familiar to regular readers is alfalfa, which has a total of 7 letters, but only 3 distinct letters: a, l and f.

Now, it seemed to me not very interesting to simply count the number of distinct letters in various words. If you do that, any three-letter word would score as low as alfalfa, or lower. Instead, I compute what I call the Distinct Letter Index (DLI), which is the ratio of distinct letters to total letters. So the word alfalfa has a DLI of 3/7, or about 0.43. On the other hand, a typical three-letter word such as box has a DLI of 3/3, or 1.
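For anyone who wants to play along, here is a minimal sketch of the DLI calculation in Ruby. It is not the actual script used for this post. The method name dli is made up for illustration, and it assumes that case is ignored and that anything other than the letters a-z is skipped.

```ruby
# Distinct Letter Index: number of distinct letters divided by total letters.
# Assumptions (not from the post): case-insensitive, non-letters are skipped.
def dli(word)
  letters = word.downcase.scan(/[a-z]/)
  return nil if letters.empty?          # no letters, no meaningful score
  letters.uniq.size.to_f / letters.size
end

puts dli("alfalfa").round(2)  # 3 distinct letters out of 7 => 0.43
puts dli("box")               # 3 distinct letters out of 3 => 1.0
```

Returning nil for a string with no letters is just one possible choice; it avoids dividing by zero.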

Using a Ruby script to compute DLI scores for my lists of English words, I collected the lowest-scoring words. Many of these are relatively short words with repeated syllables, such as booboo and muumuu (each with a DLI of about 0.333). The somewhat unlikely word senselessnesses has a very low DLI of about 0.267, and there are many other low-scoring words that end with -lessnesses.
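The ranking step might look something like the sketch below. Again, this is an illustration under assumptions, not the script behind these results: the method name lowest_dli and the tiny sample list are hypothetical, entries containing anything but letters are skipped, and ties are broken alphabetically.

```ruby
# DLI as defined above: distinct letters divided by total letters.
def dli(word)
  letters = word.downcase.chars
  letters.uniq.size.to_f / letters.size
end

# Hypothetical helper: the `count` words with the lowest DLI, lowest first.
# Entries that are not purely alphabetic are skipped; ties sort alphabetically.
def lowest_dli(words, count = 10)
  words
    .select { |w| w.match?(/\A[a-z]+\z/i) }
    .map { |w| [w, dli(w)] }
    .sort_by { |w, score| [score, w] }
    .first(count)
end

sample = %w[alfalfa box booboo muumuu senselessnesses kinnikinnik bearberry]
lowest_dli(sample, 3).each do |word, score|
  puts format("%-16s %.3f", word, score)
end
```

With a real word list, one word per line, the sample array could be replaced by something like File.readlines("words.txt", chomp: true), where words.txt is a placeholder file name.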

Very long words tend to show up, because they generally have many repeated letters, and the classic pneumonoultramicroscopicsilicovolcanoconiosis has a quite low score of about 0.311 (as does its plural, which differs only in having an e instead of an i in the penultimate position).

The word kinnikinnik has a very low score of about 0.273, and it has several variant spellings with somewhat higher scores. The word can refer either to certain smoking mixtures used by Native Americans, or to a plant also known as bearberry, which itself has a fairly low score of 5/9, or about 0.56.

Finally, one of my all-time favorite words, humuhumunukunukuapuaa (the recently reinstated state fish of Hawaii), is among the 40 lowest-scoring words I've found, at about 0.333.

Posted: Mon - April 2, 2007 at 08:51 PM       by email
