• June 10, 2013

    Topic

    News  / 

    With all the stuff going on with surveillance and data privacy — especially the past week — it’s worthwhile to revisit this essay by Daniel J. Solove, a professor of law at George Washington University, on why privacy matters even if you “have nothing to hide.”

    “My life’s an open book,” people might say. “I’ve got nothing to hide.” But now the government has large dossiers of everyone’s activities, interests, reading habits, finances, and health. What if the government leaks the information to the public? What if the government mistakenly determines that based on your pattern of activities, you’re likely to engage in a criminal act? What if it denies you the right to fly? What if the government thinks your financial transactions look odd—even if you’ve done nothing wrong—and freezes your accounts? What if the government doesn’t protect your information with adequate security, and an identity thief obtains it and uses it to defraud you? Even if you have nothing to hide, the government can cause you a lot of harm.

    “But the government doesn’t want to hurt me,” some might argue. In many cases, that’s true, but the government can also harm people inadvertently, due to errors or carelessness.

    You might not have anything to hide right now, but maybe a random string of choices that was completely harmless looks a lot like something else a few years from now, to someone sniffing around the archives. The patterns when there are no patterns sort of thing. Personal data without the person. [via @hmason]

  • The Brewers Association just released data for 2012 on craft beer production and growth. The New Yorker mapped the data in a straightforward interactive.

    As of March, the United States was home to nearly two thousand four hundred craft breweries, the small producers best known for India pale ales and other decidedly non-Budweiser-esque beers. What’s more, they are rapidly colonizing what one might call the craft-beer frontier: the South, the Southwest, and, really, almost any part of the country that isn’t the West or the Northeast.

    Most articles and lists on craft beer tend to focus on total production and breweries, so California, a big state with a lot of people, always ends up on top. And as a Californian, I’m more than happy with my access to all the fine brews around here, but clearly, there are many more states to visit. RV trip anyone? [via @kennethfield]

  • Because every day is a good day to listen to Hans Rosling talk numbers. In this short video, Rosling uses Lego bricks to explain population growth and the gaps in wealth and carbon footprint.

  • When you talk to different people across the United States, you notice small differences in how people pronounce words and phrases. Sometimes different terms are used to describe the same thing. Bert Vaux’s dialect survey tried to capture these differences, and NC State statistics graduate student Joshua Katz mapped the data.

  • Josh Orter takes back-of-the-napkin math to the next level with Stupid Calculations, which promises to turn practical facts into utterly useless ones. Stupid calculation number one is the size of a giant iPhone screen if you combined all the iPhone screens ever sold into one.

    The eye-glazing calculations are laid out below for those who appreciate the dirty work but, skipping ahead, the Kubrick-inspired monophone would stretch 5,059 feet into the sky and have a base measuring 2,846 feet across (Central Park is 2,640 feet wide). Its surface area would take in 2.07 billion square inches. That’s 14.39 million square feet or 330.54 acres. The new World Trade Center, by comparison, will have a surface area of 23 glass-clad acres, giving us enough screenage to watch Game of Thrones on all four sides of fourteen WTCs.

    See also how long it would take to drink the water in an Olympic-sized pool through a straw.
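The giant-iPhone figures quoted above can be roughly checked with a few unit conversions. This is only a sketch: it assumes the combined screen is a flat rectangle with the quoted height and base, and because those dimensions are already rounded, the results match the quoted numbers only to within rounding.

```python
# Unit-conversion check for the giant iPhone screen figures.
# Height, base, and the 23-acre WTC figure come from the quoted passage;
# treating the screen as a simple rectangle is an assumption.
SQ_IN_PER_SQ_FT = 144
SQ_FT_PER_ACRE = 43_560

height_ft = 5_059
base_ft = 2_846
wtc_acres = 23

area_sq_ft = height_ft * base_ft
area_sq_in = area_sq_ft * SQ_IN_PER_SQ_FT
area_acres = area_sq_ft / SQ_FT_PER_ACRE
wtc_count = int(area_acres // wtc_acres)

print(f"{area_sq_in / 1e9:.2f} billion square inches")
print(f"{area_sq_ft / 1e6:.2f} million square feet")
print(f"{area_acres:.1f} acres")
print(f"{wtc_count} WTC-sized surfaces")
```

The whole-number division at the end is what turns roughly 330 acres into "fourteen WTCs".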

  • Using data from the London Fire Brigade, James Cheshire mapped 144,000 incidents in London.

    This map shows the geography of fire engine callouts across London between January and September 2011. Each of the 144,000 or so lines represents a fire engine (pump) attending an incident (rounded to the nearest 100m) and they have been coloured according to the broad type of incident attended. These incident types have been further broken down in the bar chart on the bottom right. False alarms (in blue), for example, can be malicious (fortunately these are fairly rare), genuine or triggered by an automatic fire alarm (AFA). As the map shows, false alarms – thanks I guess to AFAs in office buildings – seem most common in central London.

    It looks a lot like a sky of fireworks in this view. I bet a map for each category might help flesh out different patterns.

  • Microsoft researcher Kate Crawford describes several myths of big data. Myth #4: It makes cities smarter.

    “It’s only as good as the people using it,” Ms. Crawford said. Many of the sensors that track people as they manage their urban lives come from high-end smartphones, or cars with the latest GPS systems. “Devices are becoming the proxies for public needs,” she said, “but there won’t be a moment where everyone has access to the same technology.” In addition, moving cities toward digital initiatives like predictive policing, or creating systems where people are seen, whether they like it or not, can promote lots of tension between individuals and their governments.

    Yep. I hear those people things can introduce a lot of challenges.

  • It seems like ages since we ran one of these.

    It’s hard to believe Data Points hit the shelves two months ago. (Thank you to everyone who got a copy!) It still feels brand new in my head. I kind of thought that time would slow down after I finished the book (and dissertation), but it seems to be moving even faster now.

    Anyways, if you’d like a chance to win a copy of Data Points blemished by my signature, leave a comment below by Wednesday, June 5, 2013 11:59pm PST. Tell us what your favorite number is and why. One entry per person please. I’ll pick a winner at random via sample() in R. Good luck.

    And of course, if you can’t wait, have never been lucky at cards, or want a blemish-free version of the book, you can get it at online and physical bookstores everywhere.
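For the random draw mentioned above, here is a minimal sketch of the idea in Python. It is a stand-in for illustration, not the R `sample()` call the post refers to; the `pick_winner` helper and the commenter names are hypothetical.

```python
import random

def pick_winner(entries, seed=None):
    """Return one entry chosen uniformly at random, counting each person once."""
    unique = sorted(set(entries))  # enforce one entry per person
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.choice(unique)

commenters = ["ana", "bo", "cy", "ana"]  # hypothetical names; "ana" entered twice
winner = pick_winner(commenters)
print(winner)
```

Deduplicating before drawing keeps a repeat commenter from getting better odds.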

  • Members Only

    Slopegraphs are also known as specialized or custom line charts. Figure out how to draw lines with the right spacing and pointed in the right direction, and you’ve got your slopegraphs.

  • The central limit theorem:

    In probability theory, the central limit theorem (CLT) states that, given certain conditions, the mean of a sufficiently large number of independent random variables, each with a well-defined mean and well-defined variance, will be approximately normally distributed.

    Victor Powell animated said random variables falling into a normal distribution (which should look familiar to those who have seen that ping pong ball exhibit in exploratoriums and science museums). Play around with the number of bins and delay time and watch it go.
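The theorem is easy to try numerically. The sketch below is not Powell's animation: it assumes Uniform(0, 1) variables and averages 30 of them at a time, and both choices are arbitrary. If the theorem holds, the averages should center on 0.5, have a standard deviation near sqrt(1/12/30), and put roughly 68 percent of values within one standard deviation of the mean.

```python
import random
import statistics

def sample_means(n_vars, n_trials, seed=0):
    """Mean of n_vars independent Uniform(0, 1) draws, repeated n_trials times."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(n_vars)])
        for _ in range(n_trials)
    ]

means = sample_means(n_vars=30, n_trials=20_000)

mu = statistics.fmean(means)
sigma = statistics.stdev(means)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the mean of 30 draws
# should have a standard deviation close to sqrt(1/12/30).
expected_sigma = (1 / 12 / 30) ** 0.5

# Share of sample means within one standard deviation of their average.
within_one_sigma = sum(abs(m - mu) <= sigma for m in means) / len(means)

print(f"mean of means:  {mu:.3f} (theory 0.500)")
print(f"std of means:   {sigma:.4f} (theory {expected_sigma:.4f})")
print(f"within 1 sigma: {within_one_sigma:.1%} (normal: about 68.3%)")
```

Changing `n_vars` shows the effect of "sufficiently large": with only one or two variables per average, the within-one-sigma share drifts away from the normal value.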

  • June 2, 2013

    Topic

    Maps  / 

    Twitter mapped all the geotagged tweets since 2009. There are billions of them, so as you might expect, roads, city centers, and pathways emerge. And it only took 20 lines of R code to make the maps.

  • In celebration of Arrested Development’s return via Netflix, NPR combed through the jokes — obvious and obscure — and set them in a handy interactive guide.

    Arrested Development is back! Because we’re obsessed we care about your watching enjoyment, we wrote down all the recurring gags in every episode — including the new season 4 episodes — with special attention to jokes hidden in the background (like Cloudmir vodka) or being foreshadowed (like when Buster lost his hand).

    The three categories of joke are color-coded, where each row represents a joke and a tick represents an occurrence of that joke over four seasons.

    I’ve only watched a handful of episodes, but I’m tempted to turn on Netflix with this guide in front of me. [Thanks, @onyxfish]

  • In a distributed denial-of-service attack, a bunch of machines make a bunch of requests to a server to make it buckle under the pressure. There was recently an attack on VideoLAN’s download infrastructure. Here’s what it looked like.

  • The Centers for Medicare and Medicaid Services released billing data for more than 3,000 U.S. hospitals, showing high variance in the cost of health care across the country and even between nearby hospitals.

    As part of the Obama administration’s work to make our health care system more affordable and accountable, data are being released that show significant variation across the country and within communities in what hospitals charge for common inpatient services.

    The data provided here include hospital-specific charges for the more than 3,000 U.S. hospitals that receive Medicare Inpatient Prospective Payment System (IPPS) payments for the top 100 most frequently billed discharges, paid under Medicare based on a rate per discharge using the Medicare Severity Diagnosis Related Group (MS-DRG) for Fiscal Year (FY) 2011. These DRGs represent almost 7 million discharges or 60 percent of total Medicare IPPS discharges.

    The data is downloadable as CSV or Excel files and is surprisingly usable and worth a look.

    The New York Times has a useful per-hospital browser and The Washington Post provides quick comparisons by state.

  • NYU ITP graduate student Federico Zannier collected data about himself — online browsing, location, and keystrokes — for his thesis. As he dug into personal data more and looked closer at company privacy policies, he wondered what it might be like if individuals profited from their own data. That is, companies make money using the data we passively generate while we browse and use applications and visit sites. What if individuals owned that data and were able to sell it?

    Enter Zannier’s Kickstarter campaign to sell his own data for $2 per day of activity.

    I started looking at the terms of service for the websites I often use. In their privacy policies, I have found sentences like this: “You grant a worldwide, non-exclusive, royalty-free license to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such content in any and all media or distribution methods (now known or later developed).” I’ve basically agreed to give away a lifelong, international, sub-licensable right to use my personal data.

    Somebody told me that we live in the data age, that the silicon age is already over. “In this new economy,” they said, “data is the oil.”

    Well, this is me trying to do something about it.

    Clearly this is more of a statement and conversation starter, but what if?

    There’s about a week left in the campaign, and it’s well past the goal.

  • PBS Off Book’s recent episode is on “the art of data visualization.” It feels like a TED talk — kind of fluffy and warm — with several names and visualization examples that you’ll recognize. No clue who the first guy is though.

  • We’ve seen plenty of augmented reality where you put on some digitally enabled glasses or point your camera phone at something and visuals are overlaid on reality. The augmentation is typically a layer on top.

    Eidos is a student project that tries taking this in a different direction. One piece applies an effect similar to long-exposure photography, and the other sends audio to your inner ear to focus on a subject and drown out ambient noise. See the devices in action in the video below.

    [via FastCo]

  • About 35,000 meteorites have been recorded since 2500 BC, and a little over 1,000 of them were seen while they fell, based on data from the Nomenclature Committee of the Meteoritical Society. Carlo Zapponi, a data visualization designer, visualized the latter in Bolides.

    We saw a mapped version of this data a while back, but Bolides takes a time-based approach. A bar chart shows the number and volume of meteorites that have been seen over time, and on the initial load, you get to watch the meteorites fall, one bright orange fireball at a time.

  • May 21, 2013

    Topic

    Maps  / 

    In collaboration with USGS, NASA, and TIME, Google released a quarter century of satellite imagery that shows how the world has changed over time.

    The images were collected as part of an ongoing joint mission between the USGS and NASA called Landsat. Their satellites have been observing earth from space since the 1970s—with all of the images sent back to Earth and archived on USGS tape drives that look something like this example (courtesy of the USGS).

    We started working with the USGS in 2009 to make this historic archive of earth imagery available online. Using Google Earth Engine technology, we sifted through 2,068,467 images—a total of 909 terabytes of data—to find the highest-quality pixels (e.g., those without clouds), for every year since 1984 and for every spot on Earth. We then compiled these into enormous planetary images, 1.78 terapixels each, one for each year.

    Be sure to check out the Timelapse feature on Time.