Find the names in your data with Mr. People

November 8, 2010  |  Online Applications

Inspired by Shan Carter's simple data converter, appropriately named Mr. Data Converter, Matthew Ericson just put Mr. People online. The tool lets you paste a list of names, and it will parse the first and last name, suffix, title, and other parts for you. You can even have multiple names in a single row.

Years ago, while trying to clean up the names of donors in campaign finance data from the Federal Election Commission, I hacked together a Perl module — loosely based on the Lingua-EN-NameParse module — to standardize names. One port to Ruby later, I've finally put together a Web front end for it.
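
To give a rough sense of what this kind of parsing involves, here is a minimal sketch in Python. It is not Mr. People's code or the Perl module's logic. The `parse_name` function and the tiny `TITLES` and `SUFFIXES` lists are made up for illustration, and it assumes names arrive one at a time in "First Last" order.

```python
# Assumed, deliberately tiny lookup tables; a real parser needs far larger ones.
TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof", "professor", "rev", "hon"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "md", "phd", "esq"}


def _key(token):
    """Normalize a token for lookup: drop periods and lowercase."""
    return token.replace(".", "").lower()


def parse_name(raw):
    """Split one name into title, first, middle, last, and suffix."""
    # Treat commas as plain separators so "King, Jr." tokenizes cleanly.
    tokens = raw.replace(",", " ").split()

    # Peel known titles off the front, always leaving at least one token.
    titles = []
    while len(tokens) > 1 and _key(tokens[0]) in TITLES:
        titles.append(tokens.pop(0))

    # Peel known suffixes off the end, again leaving at least one token.
    suffixes = []
    while len(tokens) > 1 and _key(tokens[-1]) in SUFFIXES:
        suffixes.insert(0, tokens.pop())

    # Whatever remains: first token, last token, and everything in between.
    return {
        "title": " ".join(titles),
        "first": tokens[0] if tokens else "",
        "middle": " ".join(tokens[1:-1]),
        "last": tokens[-1] if len(tokens) > 1 else "",
        "suffix": " ".join(suffixes),
    }


print(parse_name("Dr. Martin Luther King, Jr."))
```

Even this toy version shows why the problem gets messy. It assumes the last remaining token is the surname, so "Last, First" input and multi-part surnames would come out wrong without extra rules.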

Getting data into the right format, whether for analysis or visualization, can be a huge pain. Imagine. All the data you need is right in front of you, but you can't do anything with it yet because, as is often the case, it's not in a nice and pretty rectangular format. So anything that makes this easier and quicker is an instant bookmark for me.

[Mr. People via @mericson]

3 Comments

  • Joshua Muskovitz November 8, 2010 at 12:25 pm

    Alas, it fails on “Mr. Dr. Professor Patrick Star”.

  • There are always lots of edge cases in names, so you’ll probably always need some manual work and a lot of checking.

    E.g. María de las Mercedes d’Orléans y Borbón

    It also omits ecclesiastical titles such as The Venerable William Smith (Ven. William Smith).

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.