Find the names in your data with Mr. People

Inspired by Shan Carter’s simple data converter, appropriately named Mr. Data Converter, Matthew Ericson just put Mr. People online. The tool lets you paste a list of names, and it will parse the first and last name, suffix, title, and other parts for you. You can even have multiple names in a single row.

Ericson explains: “Years ago, while trying to clean up the names of donors in campaign finance data from the Federal Election Commission, I hacked together a Perl module, loosely based on the Lingua-EN-NameParse module, to standardize names. One port to Ruby later, I’ve finally put together a Web front end for it.”
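To get a feel for what this kind of name parsing involves, here is a minimal Python sketch. It is not Mr. People's actual code (that lives in Perl and Ruby), and the `TITLES` and `SUFFIXES` lists, the `parse_name` function, and the `_key` helper are all made up for illustration. It assumes names arrive in "Title First Middle Last, Suffix" order and recognizes at most one title and one suffix.

```python
# Hypothetical lookup tables for this sketch; a real parser needs far larger lists.
TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof", "rev", "hon"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "md", "phd", "esq"}


def _key(token):
    """Normalize a token for table lookup: drop periods, lowercase."""
    return token.replace(".", "").lower()


def parse_name(raw):
    """Split one name into title, first, middle, last, and suffix.

    Assumes "Title First Middle Last, Suffix" order. Inverted forms
    such as "Smith, John" are not handled by this sketch.
    """
    # Treat commas as plain separators, then split on whitespace.
    tokens = raw.replace(",", " ").split()
    parts = {"title": "", "first": "", "middle": "", "last": "", "suffix": ""}

    # Peel off a single leading title, if the first token is a known one.
    if tokens and _key(tokens[0]) in TITLES:
        parts["title"] = tokens.pop(0)

    # Peel off a single trailing suffix, but only if at least a first
    # and a last name would remain afterwards.
    if len(tokens) > 2 and _key(tokens[-1]) in SUFFIXES:
        parts["suffix"] = tokens.pop()

    # The first remaining token is the first name, the final one is the
    # last name, and whatever sits in between is treated as the middle.
    if tokens:
        parts["first"] = tokens.pop(0)
    if tokens:
        parts["last"] = tokens.pop()
    parts["middle"] = " ".join(tokens)
    return parts


if __name__ == "__main__":
    for name in ["Dr. Martin Luther King, Jr.", "Mr. John Q. Public III"]:
        print(parse_name(name))
```

Because the sketch peels off only one title and takes only the final token as the last name, stacked titles and multi-word surnames will come out wrong. That is roughly why a real parser ends up with so many special cases.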

Getting data in the right format, whether for analysis or visualization, can be a huge pain. Imagine: all the data you need is right in front of you, but you can’t do anything with it yet because, as is often the case, it’s not in a nice, tidy rectangular format. So anything that makes this easier and quicker is an instant bookmark for me.

[Mr. People via @mericson]

3 Comments

  • Joshua Muskovitz November 8, 2010 at 12:25 pm

    Alas, it fails on “Mr. Dr. Professor Patrick Star”.

  • There are always lots of edge cases in names, so you’ll probably always need some manual work and a lot of checking.

    E.g. María de las Mercedes d’Orléans y Borbón

    It also omits ecclesiastical titles such as The Venerable William Smith (Ven. William Smith).
