  • Infochimps R package for easy access to API

    March 29, 2011  |  Software

    The data marketplace Infochimps recently expanded their API to include datasets such as Twitter People Search and IP to demographic. To get that data into R, you could easily download the full dataset and import, but why do that when you can connect to the Infochimps API directly from R? Drew Conway recently updated his own R package, available on GitHub, to allow for new API calls, so now it's even easier to explore 60,000+ UFO sightings.

    [Zero Intelligence Agents via @jakeporway]

  • Open-source Data Science Toolkit

    March 25, 2011  |  Software

    Pete Warden does the data community a solid and wraps up a collection of open-source tools in the Data Science Toolkit to parse, geocode, and process data.

    A collection of the best open data sets and open-source tools for data science, wrapped in an easy-to-use REST/JSON API with command line, Python and Javascript interfaces. Available as a self-contained VM or EC2 AMI that you can deploy yourself.

    Many of the services are available via public APIs, but the usual benefits apply of running your own service such as privacy, independence, and no limits. Hit your machine with as many requests as you want. The code is available in its entirety on GitHub.

    [Data Science Toolkit via @JanWillemTulp]

  • Code to make your own movie barcodes available

    March 16, 2011  |  Coding

    Austin Powers - Jay Roach (1997)

    You know those compressed movie barcodes that we saw last week? Here's a Python script by Benoît Romito to make your own. Run a .avi format movie through, and voila. Free gift idea: digitize some old home movies and make a personalized barcode for your family.
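If you're curious how a barcode like this comes together, here's a minimal sketch of one common approach: boil each frame down to its average color and line the colors up in playback order. This is not Romito's script. The function names are made up for illustration, and the frames are plain nested lists of RGB tuples, so you would still need a video decoder to get real frames out of an .avi file.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Frame = List[List[Pixel]]      # a frame is a list of pixel rows


def average_color(frame: Frame) -> Pixel:
    """Mean RGB color over every pixel in one frame (integer division)."""
    pixels = [px for row in frame for px in row]
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) // n for c in range(3))


def movie_barcode(frames: List[Frame], step: int = 1) -> List[Pixel]:
    """One color stripe per sampled frame, in playback order.

    `step` is a hypothetical sampling knob: keep every step-th frame.
    """
    return [average_color(frame) for frame in frames[::step]]


# Tiny synthetic "movie": a red frame, a blue frame, and a half-and-half frame.
red = [[(255, 0, 0)] * 2] * 2
blue = [[(0, 0, 255)] * 2] * 2
mixed = [[(255, 0, 0), (0, 0, 255)]] * 2

stripes = movie_barcode([red, blue, mixed])
print(stripes)
```

Each tuple in `stripes` would become one thin vertical stripe in the final image. Drawing the stripes is left out here to keep the sketch free of image libraries.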

  • WeatherSpark for more graphs about the weather than you will ever need

    March 14, 2011  |  Online Applications

    Weatherspark

    You know Matthew Ericson's simple weather mashup? It shows only what you need to know for the day. WeatherSpark is the opposite of that.

  • Data-Driven Documents for visualization in the browser

    March 9, 2011  |  Software

    Voronoi diagram

    As we know, browsers keep getting better, and it grows easier every day to visualize data natively in the browser, where you used to have to use Flash. In the early going, the JavaScript visualization libraries felt clunky compared to their Flash counterparts, but the roles are changing. There's Protovis, Polymaps, and Processing.js to help you make full use of modern browsers' functionality.

    Mike Bostock, who had a big hand in those first two, recently made Data-Driven Documents, or D3 for short, available to play with.

  • RStudio: a new IDE for R that makes coding easier

    March 2, 2011  |  Software

    RStudio in windows

    I tweeted this out earlier, but people are really excited about RStudio, an integrated development environment (IDE) that has the potential to make R coding and development a whole lot easier.

  • Every baseball game and play since 1951 on your iPad

    February 22, 2011  |  Infographics, Software

    Phillies Pennant

    If you love baseball and have an iPad, you need Pennant, a project by Steve Varga. The app lets you explore every game and play since 1951. See the numbers for your favorite player or team with just a few taps or swipes while you're plopped on your couch watching the game. Imagine: one hand with an ice cold beverage, iPad on your lap, and the game on in front of you.

  • Google opens up Public Data Explorer to your data

    February 17, 2011  |  Online Applications

    Public data explorer

    With Google's recent data-related offerings, it shouldn't come as much of a surprise that they've opened up their Public Data Explorer so that you can upload your own data. Previously, it was only available when you searched for something like "GDP" and a related dataset was supplied by an official provider.

    [W]e’re opening the Public Data Explorer to your data. We’re making a new data format, the Dataset Publishing Language (DSPL), openly available, and providing an interface for anyone to upload their datasets. DSPL is an XML-based format designed from the ground up to support rich, interactive visualizations like those in the Public Data Explorer. The DSPL language and upload interface are available in Google Labs.

    In terms of visualization, there isn't anything new. You've got your maps, bar charts, and time series line charts, with the checkboxes on the left (like the snapshot below). Then there are the chart types available via the charting API.

  • Find more of the data you need with DataMarket

    January 31, 2011  |  Data Sources, Online Applications

    Add another online destination to find the data that you need. DataMarket launched back in May with Icelandic data, but just a few days ago relaunched with data of the international variety. They tout 100 million time series datasets and 600 million facts. I'm not totally sure what that means (100 million lines, sets of lines?), but I take it that means a lot.

    Just over 2 years and countless cups of coffee after we started coding, DataMarket.com launches with international data. You can now find, visualize and download data from many of the world’s most important data providers on our site.

    At first glance, DataMarket feels a lot like the now-defunct Swivel. Search for the data you want and you get back a list of datasets. The focus on only time series, though, is actually a plus in that they can provide more specific tools to visualize and explore. The current toolset isn't going to blow you away, but it's not bad.

  • This Tract provides a view of Census data on your block

    January 6, 2011  |  Online Applications

    Tract map

    This Tract, by Michal Migurski of Stamen, with some help from Craig Mod, lets you view details of your block by way of Census data. It's still using 2000 data but was built in anticipation of the 2010 release, which should come in a couple of months. So we'll probably see some improvements from now until then.

    Enter your location or browse the slippy map for information on race, income, gender, education, age, and housing. There are also aggregates for your Census tract, county, state, and country.

  • Search how phrases have been used via Google Ngram Viewer

    December 20, 2010  |  Online Applications

    Ngram - kindergarten

    Language changes. Culture changes. And we can see some of these changes via what authors write about in books over the years. Google's Book Ngram Viewer lets you search through this data, and shows a graph similar to the output of Google Trends. The above shows the trends for nursery school, kindergarten, and child care:

    This shows trends in three ngrams from 1950 to 2000: "nursery school" (a 2-gram or bigram), "kindergarten" (a 1-gram or unigram), and "child care" (another bigram). What the y-axis shows is this: of all the bigrams contained in our sample of books written in English and published in the United States, what percentage of them are "nursery school" or "child care"? Of all the unigrams, what percentage of them are "kindergarten"? Here, you can see that use of the phrase "child care" started to rise in the late 1960s, overtaking "nursery school" around 1970 and then "kindergarten" around 1973. It peaked shortly after 1990 and has been falling steadily since.
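To make the y-axis arithmetic concrete, here's a rough sketch that computes the share of a phrase among all n-grams of the same length in a piece of text. It's a toy under stated assumptions: whitespace tokenization, one short made-up sentence instead of a book corpus, and hypothetical function names. The real viewer does this per publication year over a far bigger sample.

```python
from collections import Counter


def ngrams(tokens, n):
    """All runs of n consecutive tokens, each joined back into one string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_share(text, phrase):
    """Percentage of same-length n-grams in `text` that equal `phrase`."""
    tokens = text.lower().split()
    n = len(phrase.split())          # 1 for a unigram, 2 for a bigram, ...
    grams = ngrams(tokens, n)
    if not grams:
        return 0.0
    return 100.0 * Counter(grams)[phrase.lower()] / len(grams)


# Made-up sample text, only for illustration.
sample = ("child care costs rose while nursery school "
          "and child care enrollment grew")

for phrase in ("child care", "nursery school", "kindergarten"):
    print(phrase, round(ngram_share(sample, phrase), 2))
```

Note that "kindergarten" is measured against all unigrams while the other two are measured against all bigrams, which is the same distinction the quote draws.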

    Find anything interesting?

  • Advanced visualization without programming – Impure

    December 2, 2010  |  Online Applications

    Color map

    Programming can be tough in the beginning, which can make advanced visualization beyond the Excel spreadsheet hard to come by. Bestiario tries to make it easier with their most recent creation Impure:

    Impure is a visual programming language aimed to gather, process and visualize information. With impure is possible to obtain information from very different sources; from user owned data to diverse feeds in internet, including social media data, real time or historical financial information, images, news, search queries and many more.

    It's not a plug-and-play application, but it's not scripting in a text editor either. Think of it as somewhere in between (hence the visual programming language). They've taken the logic behind code and encapsulated it into modules or structures, and you can piece them together like a puzzle. The interface kind of reminds me of Yahoo Pipes.

  • R is the need-to-know stat software

    November 17, 2010  |  Software, Statistics

    This Forbes post on the greatness that is R is being passed around by every statistician and his mother today.

    It's not that this type of analysis wasn't possible before — statisticians have existed, and commercial software has been available to support them, for decades. The fact that R is free to use, free to modify, and its source is open to view, extend and improve means students, stock traders-in-training and fantasy football junkies can familiarize themselves with the software. They can write programs against it. They're likely to continue that usage into their professional lives. When they share their work, the community, down the line, benefits. And the virtuous cycle strengthens.

    What's your favorite (graphical) use of R?

  • Format and clean your data with Google Refine

    November 16, 2010  |  Software

    When we first learn how to deal with data in school, it's nicely formatted and fits perfectly into a rectangular spreadsheet. Then when we start to deal with real data, we find missing values, inconsistencies, and for some reason it doesn't plug straight into our software. What the heck?

    The caveman way to fix this problem is to open Excel and manually edit everything. Some ad hoc code can often fix your problems, but still that takes time and can be a pain. Google Refine, the Googley evolution of Freebase Gridworks, can help you.

  • Find the names in your data with Mr. People

    November 8, 2010  |  Online Applications

    Inspired by Shan Carter's simple data converter, appropriately named Mr. Data Converter, Matthew Ericson just put Mr. People online. The tool lets you paste a list of names, and it will parse the first and last name, suffix, title, and other parts for you. You can even have multiple names in a single row.

    Years ago, while trying to clean up the names of donors in campaign finance data from the Federal Election Commission, I hacked together a Perl module — loosely based on the Lingua-EN-NameParse module — to standardize names. One port to Ruby later, I've finally put together a Web front end for it.

    Getting data in the right format, whether for analysis or visualization, can be a huge pain. Imagine. All the data you need is right in front of you, but you can't do anything with it yet, because as often is the case, it's not in a nice and pretty rectangular format. So anything that makes this easier and quicker is an instant bookmark for me.
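For a feel of what this kind of name parsing involves, here's a bare-bones sketch. It is not Ericson's parser (his is Perl and Ruby, and surely handles far more cases). The title and suffix lists and the `parse_name` function are assumptions made up for this example.

```python
# Illustrative lookup lists; a real parser would need many more entries.
TITLES = {"mr", "mrs", "ms", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def parse_name(raw):
    """Split one name string into title, first, middle, last, and suffix."""
    # Drop commas and periods around each word, then discard empty pieces.
    parts = [p.strip(",.") for p in raw.split()]
    parts = [p for p in parts if p]

    name = {"title": "", "first": "", "middle": "", "last": "", "suffix": ""}
    if parts and parts[0].lower() in TITLES:
        name["title"] = parts.pop(0)
    if parts and parts[-1].lower() in SUFFIXES:
        name["suffix"] = parts.pop()
    if parts:
        name["first"] = parts.pop(0)
    if parts:
        name["last"] = parts.pop()
    # Whatever is left between first and last counts as the middle name(s).
    name["middle"] = " ".join(parts)
    return name


parsed = parse_name("Dr. Martin Luther King, Jr.")
print(parsed)
```

This simple version assumes "First Last" order and one name per string. Names written "Last, First" or several names in one row, which Mr. People handles, would need extra rules.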

    [Mr. People via @mericson]

  • Why everyone should learn programming

    October 28, 2010  |  Coding

    Daniel Shiffman, assistant professor at the NYU Interactive Telecommunications Program, talks programming, computation, data, and why everyone should learn programming in this interview by Mark Webster.

    It's not just about saving time. There are certain things you can discover and be creative with with computation that you can't by hand. They both go together.

    Watch the four-minute interview below. The excitement in Shiffman's voice alone might make you want to learn some Processing (which he wrote a useful book for).

  • How people in your area spend money

    October 28, 2010  |  Online Applications

    San Francisco spending

    The personal finance site Mint aggregates spending data from four million users. At the individual level, Mint is useful in that it brings all of your finances into one place. Zoom out and aggregate, and you have spending for a city or a state. This is what Mint Data does.

  • Find your flight via visual interface

    October 21, 2010  |  Misc. Visualization, Online Applications

    hipmunk flight search

    Booking flights became so much easier when it all shifted online, but it hasn't changed in years. You put in your preferred dates and times and you get a long list of options. Oftentimes those listings can be a pain as you browse through all of your options. Oh the burden of choice. Hipmunk tries to make flight search easier with a visual interface.

    As usual, you enter your origin and destination, but instead of plain HTML tables, you get something like the above, and you can sort the options from least to greatest amount of agony. Rectangle lengths represent flight times and are color-coded by airline. Flights with the same takeoff and arrival times but a higher price are hidden to help you narrow down your options more quickly.
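The hiding rule is easy to picture in code. Here's a small sketch, assuming each flight is a dictionary with `depart`, `arrive`, and `price` keys. Those field names, the function, and the sample flights are all invented for illustration, and this is not Hipmunk's actual implementation.

```python
def hide_pricier_duplicates(flights):
    """Keep only the cheapest flight for each (depart, arrive) time pair."""
    cheapest = {}
    for flight in flights:
        key = (flight["depart"], flight["arrive"])
        # First flight seen for this time pair, or a cheaper one than before.
        if key not in cheapest or flight["price"] < cheapest[key]["price"]:
            cheapest[key] = flight
    return list(cheapest.values())


# Invented sample data: two flights share the same times.
flights = [
    {"airline": "A", "depart": "08:00", "arrive": "11:15", "price": 320},
    {"airline": "B", "depart": "08:00", "arrive": "11:15", "price": 280},
    {"airline": "C", "depart": "09:30", "arrive": "13:00", "price": 250},
]

visible = hide_pricier_duplicates(flights)
print([f["airline"] for f in visible])
```

With the sample data, airline A's flight drops out because B flies at the same times for less.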

    Hipmunk is still in the early stages, but a quick search shows a lot of promise.

    [Hipmunk via Matt]

  • How K-12 schools in your area measure up

    October 13, 2010  |  Mapping, Online Applications

    Education scorecard - how does this district compare

    In collaboration with NBC News and The Gates Foundation, Ben Fry-headed Fathom Design shows you how K-12 schools measure up in your area. If you're a parent or soon-to-be parent considering a move, this will be especially interesting to you. The Education Nation Scorecard lets you search for your location or a specific school to see how they perform and how they compare to the rest of the country.

  • The state of mapping APIs

    September 15, 2010  |  Mapping, Software

    O'Reilly Radar surveys the state of mapping APIs from old sources (like Google) and new ones (like CloudMade). Spoiler alert: there's a lot of opportunity out there.

    Maps took over the web in mid-2005, shortly after the first Where 2.0 conference. They quickly moved from fancy feature to necessary element of any site that contained even a trace of geographic content. Today we're amidst another location and mapping revolution, with mobile making its impact on the web. And with it, we're seeing even more geo services provided by both the old guard and innovative new mapping platforms.

    [O'Reilly Radar]

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.