  • GeoCommons 2.0, now with more mapping features

    June 6, 2011  |  Online Applications

    Harvard distance from subway

    GeoCommons, an open repository of data and maps, launched version 2.0 this week, which is more feature-rich and robust than the first. Two of the major updates have to do with the fast-changing data landscape: amount of data and browser technology.

  • DataWrangler for your data formatting needs

    May 26, 2011  |  Online Applications

    Formatting data is a necessary pain, so anything that makes formatting easier is always welcome. Data Wrangler, from the Stanford Visualization Group, is the latest in the growing set of tools to get your data the way you need it (so that you can get to the fun part already). It's similar to Google Refine in that they're both browser-based, but my first impression is that Data Wrangler is more lightweight and it feels more responsive.

  • Google Correlate lets you see how your data relates to search queries

    May 25, 2011  |  Online Applications

    Influenza search - Google Correlate

    A while back, Google showed with Google Flu Trends how influenza outbreaks correlated with searches for flu-related terms. It helped researchers and policy-makers estimate flu activity much sooner than with previous methods. Google Correlate is the evolution of Flu Trends in that now you can correlate search trends with not just flu cases, but with your own data or other search queries.
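
    To make "correlate" concrete, here is a minimal sketch that assumes plain Pearson correlation between one series you supply and a search-volume series. The function name and every number below are made up for illustration; this is not Google's implementation, only the basic statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Made-up weekly values, for illustration only.
flu_cases = [12, 18, 30, 45, 40, 22]
flu_searches = [110, 150, 260, 390, 350, 200]
unrelated_searches = [300, 280, 310, 290, 305, 295]

print(round(pearson(flu_cases, flu_searches), 3))
print(round(pearson(flu_cases, unrelated_searches), 3))
```

    A value near 1 means the two series rise and fall together, while a value near 0 means there is little linear relationship.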

  • Sorting algorithms demonstrated with Hungarian folk dance

    April 14, 2011  |  Coding

    Bubble sort dance

    We've seen sorting algorithms visualized and auralized, but now it's time to see them through the spirit of Hungarian folk dance. In a series of four videos (so far), folks at Sapientia University in Romania demonstrate how different sorting algorithms work with numbered people dancing around and arranging themselves from least to greatest.

    See them in action in the video below. This one is for Bubble-sort. They move with such zest.
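
    For reference, the bubble sort that the dancers act out can be sketched in a few lines of Python. This is a minimal textbook-style version, not code from the videos; the function name and the dancer numbers are illustrative.

```python
def bubble_sort(values):
    """Sort values in ascending order with bubble sort; returns a new list."""
    items = list(values)
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            # Compare each neighboring pair and swap if they are out of order,
            # much like two numbered dancers stepping forward and trading places.
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:
            # A full pass without swaps means the list is already sorted.
            break
    return items


# Illustrative line-up of numbered dancers.
dancers = [3, 0, 1, 8, 7, 2, 5, 4, 6, 9]
print(bubble_sort(dancers))
```

    After each pass, the largest remaining number has "bubbled" to the end of the unsorted part, so the next pass can stop one position earlier.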

  • Infochimps R package for easy access to API

    March 29, 2011  |  Software

    The data marketplace Infochimps recently expanded their API to include datasets such as Twitter People Search and IP to demographic. To get that data into R, you could easily download the full dataset and import, but why do that when you can connect to the Infochimps API directly from R? Drew Conway recently updated his own R package, available on GitHub, to allow for new API calls, so now it's even easier to explore 60,000+ UFO sightings.

    [Zero Intelligence Agents via @jakeporway]

  • Open-source Data Science Toolkit

    March 25, 2011  |  Software

    Pete Warden does the data community a solid and wraps up a collection of open-source tools in the Data Science Toolkit to parse, geocode, and process data.

    A collection of the best open data sets and open-source tools for data science, wrapped in an easy-to-use REST/JSON API with command line, Python and Javascript interfaces. Available as a self-contained VM or EC2 AMI that you can deploy yourself.

    Many of the services are available via public APIs, but the usual benefits apply of running your own service such as privacy, independence, and no limits. Hit your machine with as many requests as you want. The code is available in its entirety on GitHub.

    [Data Science Toolkit via @JanWillemTulp]

  • Code to make your own movie barcodes available

    March 16, 2011  |  Coding

    Austin Powers - Jay Roach (1997)

    You know those compressed movie barcodes that we saw last week? Here's a Python script by Benoît Romito to make your own. Run a .avi format movie through, and voila. Free gift idea: digitize some old home movies and make a personalized barcode for your family.
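
    Romito's script isn't reproduced here, but the general idea can be sketched under a stated assumption: each frame is reduced to a single average color, and those colors are lined up left to right in playback order. The function names and the tiny hand-made frames are hypothetical; a real script would also need a video decoding library to read frames from the .avi file.

```python
def average_color(frame):
    """Average the (r, g, b) pixels of one frame, given as rows of pixels."""
    pixels = [pixel for row in frame for pixel in row]
    if not pixels:
        raise ValueError("frame has no pixels")
    count = len(pixels)
    return tuple(
        sum(pixel[channel] for pixel in pixels) // count for channel in range(3)
    )


def movie_barcode(frames, step=1):
    """Return one average color per sampled frame, in playback order."""
    return [average_color(frame) for frame in frames[::step]]


# Three tiny made-up 2x2 "frames" in shades of red, green, and blue.
frames = [
    [[(255, 0, 0), (200, 0, 0)], [(255, 0, 0), (200, 0, 0)]],
    [[(0, 255, 0), (0, 200, 0)], [(0, 255, 0), (0, 200, 0)]],
    [[(0, 0, 255), (0, 0, 200)], [(0, 0, 255), (0, 0, 200)]],
]
print(movie_barcode(frames))
```

    Each color in the returned list would become one thin vertical stripe of the barcode; raising `step` samples fewer frames and gives a narrower image.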

  • WeatherSpark for more graphs about the weather than you will ever need

    March 14, 2011  |  Online Applications


    You know Matthew Ericson's simple weather mashup? It shows only what you need to know for the day. WeatherSpark is the opposite of that.

  • Data-Driven Documents for visualization in the browser

    March 9, 2011  |  Software

    Voronoi diagram

    As we know, browsers keep getting better, and it grows easier every day to visualize data natively in the browser, where you used to have to use Flash. Early on, the JavaScript visualization libraries felt clunky compared to their Flash counterparts, but the roles are changing. Libraries such as Protovis, Polymaps, and Processing.js help you make full use of modern browsers' functionality.

    Mike Bostock, who had a big hand in those first two, recently made Data-Driven Documents, or D3 for short, available to play with.

  • RStudio: a new IDE for R that makes coding easier

    March 2, 2011  |  Software

    RStudio in windows

    I tweeted this out earlier, but people are really excited about RStudio, an integrated development environment (IDE) that has the potential to make R coding and development a whole lot easier.

  • Every baseball game and play since 1951 on your iPad

    February 22, 2011  |  Infographics, Software

    Phillies Pennant

    If you love baseball and have an iPad, you need Pennant, a project by Steve Varga. The app lets you explore every game and play since 1951. See the numbers for your favorite player or team with just a few taps or swipes while you're plopped on your couch watching the game. Imagine: one hand with an ice cold beverage, iPad on your lap, and the game on in front of you.

  • Google opens up Public Data Explorer to your data

    February 17, 2011  |  Online Applications

    Public data explorer

    With Google's recent data-related offerings, it shouldn't come as much of a surprise that they've opened up their Public Data Explorer so that you can upload your own data. Previously, it was only available when you searched for something like "GDP" and a related dataset was supplied by an official provider.

    [W]e’re opening the Public Data Explorer to your data. We’re making a new data format, the Dataset Publishing Language (DSPL), openly available, and providing an interface for anyone to upload their datasets. DSPL is an XML-based format designed from the ground up to support rich, interactive visualizations like those in the Public Data Explorer. The DSPL language and upload interface are available in Google Labs.

    In terms of visualization, there isn't anything new. You've got your maps, bar charts, and time series line charts, with the checkboxes on the left (like the snapshot below). Then there are the chart types available via the charting API.

  • Find more of the data you need with DataMarket

    January 31, 2011  |  Data Sources, Online Applications

    Add another online destination to find the data that you need. DataMarket launched back in May with Icelandic data, but just a few days ago relaunched with data of the international variety. They tout 100 million time series datasets and 600 million facts. I'm not totally sure what that means (100 million lines, sets of lines?), but I take it that means a lot.

    Just over 2 years and countless cups of coffee after we started coding, DataMarket.com launches with international data. You can now find, visualize and download data from many of the world’s most important data providers on our site.

    At first glance, DataMarket feels a lot like the now-defunct Swivel. Search for the data you want and you get back a list of datasets. The focus on time series only is actually a plus, though, in that they can provide more specific tools to visualize and explore. The current toolset isn't going to blow you away, but it's not bad.

  • This Tract provides a view of Census data on your block

    January 6, 2011  |  Online Applications

    Tract map

    This Tract, by Michal Migurski of Stamen, with some help from Craig Mod, lets you view details of your block by way of Census data. It's still using 2000 data but was built in anticipation of the 2010 release, which should come in a couple of months. So we'll probably see some improvements from now until then.

    Enter your location or browse the slippy map for information on race, income, gender, education, age, and housing. There are also aggregates for your Census tract, county, state, and country.

  • Search how phrases have been used via Google Ngram Viewer

    December 20, 2010  |  Online Applications

    Ngram - kindergarten

    Language changes. Culture changes. And we can see some of these changes via what authors write about in books over the years. Google's Book Ngram Viewer lets you search through this data, and shows a graph similar to the output of Google Trends. The above shows the trends for nursery school, kindergarten, and child care:

    This shows trends in three ngrams from 1950 to 2000: "nursery school" (a 2-gram or bigram), "kindergarten" (a 1-gram or unigram), and "child care" (another bigram). What the y-axis shows is this: of all the bigrams contained in our sample of books written in English and published in the United States, what percentage of them are "nursery school" or "child care"? Of all the unigrams, what percentage of them are "kindergarten"? Here, you can see that use of the phrase "child care" started to rise in the late 1960s, overtaking "nursery school" around 1970 and then "kindergarten" around 1973. It peaked shortly after 1990 and has been falling steadily since.
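
    The y-axis described above is a share of all n-grams of the same length, which can be sketched as follows. This is a minimal illustration, assuming simple lowercase, whitespace-split tokens; the function name and the one-line "books" are made up, and the real Ngram Viewer works on a far larger corpus with its own tokenization.

```python
from collections import Counter


def ngram_share(texts, phrase):
    """Percentage of all n-grams (n = number of words in phrase) equal to phrase."""
    target = tuple(phrase.lower().split())
    n = len(target)
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        # Slide a window of n words across the text to collect its n-grams.
        counts.update(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    total = sum(counts.values())
    return 100.0 * counts[target] / total if total else 0.0


# Made-up one-line "books" grouped by publication year.
books_by_year = {
    1960: ["the nursery school opened", "kindergarten starts in the fall"],
    1990: ["child care costs rose", "child care and kindergarten"],
}
for year, texts in sorted(books_by_year.items()):
    print(
        year,
        round(ngram_share(texts, "child care"), 1),
        round(ngram_share(texts, "kindergarten"), 1),
    )
```

    Note that "child care" is measured against all bigrams and "kindergarten" against all unigrams, which is why the two percentages come from different denominators.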

    Find anything interesting?

  • Advanced visualization without programming – Impure

    December 2, 2010  |  Online Applications

    Color map

    Programming can be tough in the beginning, which can make advanced visualization beyond the Excel spreadsheet hard to come by. Bestiario tries to make it easier with their most recent creation Impure:

    Impure is a visual programming language aimed to gather, process and visualize information. With impure is possible to obtain information from very different sources; from user owned data to diverse feeds in internet, including social media data, real time or historical financial information, images, news, search queries and many more.

    It's not a plug-and-play application, but it's not scripting in a text editor either. Think of it as somewhere in between (hence the visual programming language). They've taken the logic behind code and encapsulated it into modules or structures, and you can piece them together like a puzzle. The interface kind of reminds me of Yahoo Pipes.

  • R is the need-to-know stat software

    November 17, 2010  |  Software, Statistics

    This Forbes post on the greatness that is R is being passed around by every statistician and his mother today.

    It's not that this type of analysis wasn't possible before — statisticians have existed, and commercial software has been available to support them, for decades. The fact that R is free to use, free to modify, and its source is open to view, extend and improve means students, stock traders-in-training and fantasy football junkies can familiarize themselves with the software. They can write programs against it. They're likely to continue that usage into their professional lives. When they share their work, the community, down the line, benefits. And the virtuous cycle strengthens.

    What's your favorite (graphical) use of R?

  • Format and clean your data with Google Refine

    November 16, 2010  |  Software

    When we first learn how to deal with data in school, it's nicely formatted and fits perfectly into a rectangular spreadsheet. Then when we start to deal with real data, we find missing values, inconsistencies, and for some reason it doesn't plug straight into our software. What the heck?

    The caveman way to fix this problem is to open Excel and manually edit everything. Some ad hoc code can often fix your problems, but still that takes time and can be a pain. Google Refine, the Googley evolution of Freebase Gridworks, can help you.

  • Find the names in your data with Mr. People

    November 8, 2010  |  Online Applications

    Inspired by Shan Carter's simple data converter, appropriately named Mr. Data Converter, Matthew Ericson just put Mr. People online. The tool lets you paste a list of names, and it will parse the first and last name, suffix, title, and other parts for you. You can even have multiple names in a single row.

    Years ago, while trying to clean up the names of donors in campaign finance data from the Federal Election Commission, I hacked together a Perl module — loosely based on the Lingua-EN-NameParse module — to standardize names. One port to Ruby later, I've finally put together a Web front end for it.

    Getting data in the right format, whether for analysis or visualization, can be a huge pain. Imagine. All the data you need is right in front of you, but you can't do anything with it yet, because as is often the case, it's not in a nice and pretty rectangular format. So anything that makes this easier and quicker is an instant bookmark for me.

    [Mr. People via @mericson]

  • Why everyone should learn programming

    October 28, 2010  |  Coding

    Daniel Shiffman, assistant professor at the NYU Interactive Telecommunications Program, talks programming, computation, data, and why everyone should learn programming in this interview by Mark Webster.

    It's not just about saving time. There are certain things you can discover and be creative with with computation that you can't by hand. They both go together.

    Watch the four-minute interview below. The excitement in Shiffman's voice alone might make you want to learn some Processing (for which he wrote a useful book).

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.