Distribution of letters in the English language

Some letters in the English language appear more often at the beginning of words. Some appear more often at the end, and others show up in the middle. Using the Brown corpus from the Natural Language Toolkit, David Taylor took a closer look at letter position and usage.

I’ve had many “oh, yeah” moments looking over the graphs. For example, words almost never begin with “x”, but it’s quite common as the second letter. There’s a little hump near the beginning of “u” that’s caused by its proximity to “q”, which is most common at the beginning of a word. When you remove “q” from the dataset, the hump disappears. “F” occurs toward the extremes, especially in prepositions (“for”, “from”, “of”, “off”) but rarely just before the middle.
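To get a feel for how counts like these might be produced, here is a minimal Python sketch. It is not Taylor's actual method. The function names (position_histogram, without_letter), the five-bin split, and the toy word list are all assumptions made for illustration. With NLTK installed and the corpus downloaded, the real word list could come from nltk.corpus.brown.words() instead.

```python
from collections import Counter, defaultdict


def position_histogram(words, bins=5):
    """Count how often each letter lands in each relative-position bin.

    Bin 0 is the start of a word and bin (bins - 1) is the end.
    The binning scheme is an illustrative assumption, not Taylor's.
    """
    hist = defaultdict(Counter)
    for word in words:
        word = word.lower()
        # Skip tokens with digits, punctuation, or non-ASCII characters.
        if not (word.isascii() and word.isalpha()):
            continue
        n = len(word)
        for i, ch in enumerate(word):
            # Integer arithmetic maps index i in [0, n-1] to a bin in [0, bins-1].
            hist[ch][(i * bins) // n] += 1
    return hist


def without_letter(words, letter):
    """Return only the words that do not contain the given letter."""
    return [w for w in words if letter not in w.lower()]


# Toy word list (hypothetical). To use the Brown corpus instead:
#   import nltk; nltk.download("brown")
#   sample = nltk.corpus.brown.words()
sample = ["quick", "queen", "up", "box", "of", "for"]

all_words = position_histogram(sample)
no_q = position_histogram(without_letter(sample, "q"))

# Compare where "u" falls with and without the "q" words.
print("u, all words:", sorted(all_words["u"].items()))
print("u, no q words:", sorted(no_q["u"].items()))
```

Comparing the two printed lines shows the idea behind the “q” check: drop the words containing “q” and the count of “u” in the second bin goes away. Relative bins are used so that words of different lengths can be compared on the same scale.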

Next step: letter proximity.
