Find the names in your data with Mr. People

Posted to Apps  |  Nathan Yau

Inspired by Shan Carter’s simple data converter, appropriately named Mr. Data Converter, Matthew Ericson just put Mr. People online. The tool lets you paste a list of names, and it will parse the first and last name, suffix, title, and other parts for you. You can even have multiple names in a single row.

Years ago, while trying to clean up the names of donors in campaign finance data from the Federal Election Commission, I hacked together a Perl module — loosely based on the Lingua-EN-NameParse module — to standardize names. One port to Ruby later, I’ve finally put together a Web front end for it.
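To give a rough sense of what this kind of parsing involves, here is a minimal Python sketch. It is not the code behind Mr. People or the Perl and Ruby modules mentioned above. The `parse_name` function and the short `TITLES` and `SUFFIXES` lists are hypothetical, made up for illustration, and the sketch assumes names arrive in "First Last" order.

```python
# Hypothetical, deliberately short lookup lists; a real parser needs far more.
TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof", "rev"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "phd", "md", "esq"}


def _key(token):
    """Normalize a token for lookup: drop periods and lowercase it."""
    return token.replace(".", "").lower()


def parse_name(raw):
    """Split one name into title, first, middle, last, and suffix.

    Assumes "First Last" order; "Last, First" input is not handled.
    """
    tokens = raw.replace(",", " ").split()
    title, suffix = [], []

    # Peel leading titles, always leaving at least one token for the name.
    while len(tokens) > 1 and _key(tokens[0]) in TITLES:
        title.append(tokens.pop(0))

    # Peel trailing suffixes the same way, preserving their original order.
    while len(tokens) > 1 and _key(tokens[-1]) in SUFFIXES:
        suffix.insert(0, tokens.pop())

    return {
        "title": " ".join(title),
        "first": tokens[0] if tokens else "",
        "middle": " ".join(tokens[1:-1]),
        "last": tokens[-1] if len(tokens) > 1 else "",
        "suffix": " ".join(suffix),
    }


print(parse_name("Dr. Martin Luther King, Jr."))
# {'title': 'Dr.', 'first': 'Martin', 'middle': 'Luther', 'last': 'King', 'suffix': 'Jr.'}
```

Even this toy version hints at why a dedicated tool is handy: multi-word surnames, several names in one row, and titles missing from the lists would all need extra rules.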

Getting data into the right format, whether for analysis or visualization, can be a huge pain. Imagine. All the data you need is right in front of you, but you can’t do anything with it yet because, as is often the case, it’s not in a nice and pretty rectangular format. So anything that makes this easier and quicker is an instant bookmark for me.

[Mr. People via @mericson]

3 Comments

  • Joshua Muskovitz November 8, 2010 at 12:25 pm

    Alas, it fails on “Mr. Dr. Professor Patrick Star”.

  • There are always lots of edge cases in names, so you’ll probably always need some manual work and a lot of checking.

    E.g. María de las Mercedes d’Orléans y Borbón

    It also omits ecclesiastical titles such as The Venerable William Smith (Ven. William Smith).
