Statistics
More than mean and standard deviation.

Open data board game
Datopolis is a board game by Ellen Broad and Jeni Tennison from the Open Data Institute, and as you might expect, it promotes the use of open data. Datopolis is…

Why all the swimming ties in the Olympics
As the Olympics are all about reaching peak physical potential, it shouldn’t surprise that a lot of races are close, but there’s been a good number of ties this year.…

DrawMyData lets you plot points manually and then download the data
When you have graphs to draw or statistical concepts to teach, you need your data and you need it now. You can look for a suitable dataset, or you can…

Trump tweets from Android are angrier than from iPhone
Sometimes I check Donald Trump’s Twitter feed, as many find themselves doing and quickly regretting. There’s definitely a certain style to some of the tweets. But there are also tweets…

Building a generator for stuff
There comes a time in every data scientist’s life when an idea for a weird Twitter bot pops into your head. Or some one-off game based on a dataset you played…

Searchable campaign finance data from the FEC
Every four years, campaign finance data from the Federal Election Commission peeks its head out into the light of importance. Committees and officials must report significant contributions to campaigns, which…

Guide to spotting data BS
As we delve deeper into election season, politicians will spit out more and more statistics to lend some factitude to their talking points. Some are real, and others will be…

Charting all the Pokemon
Pokemon is everywhere these days. I think it’s just something the world really needs right now. I know very little about the universe, but I do like it when people…

Sketchy summary statistics
Ben Orlin of Math With Bad Drawings explains the pitfalls of using summary statistics — mean, median, and mode — to make decisions in life. Aggregates like these are meant…

Graphing all the music
Glenn McDonald attempts to graph the musical space in its entirety on a two-dimensional scale. He calls it Every Noise at Once. This is an ongoing attempt at an algorithmically-generated,…

Election forecast tracker
FiveThirtyEight published their election forecast tracker this week, and it’s a beaut. It starts with the standard state map and most importantly the probability of each candidate winning the presidency.…

Gender equality in the movies, a screenplay analysis
Hollywood has been talking gender equality in the movies more than usual lately, so Hanah Andersen and Matt Daniels for Polygraph looked into the matter from a data perspective. We…

Emoji semantic space
Dango is an Android app that predicts relevant emojis as you type. Xavier Snelgrove, the CTO for the group, explains how they use neural networks to make that happen. Recently,…

Nearly impossible to predict mass shootings with current data
Even if there were a statistical model that predicted a mass shooter with 99 percent accuracy, that still leaves a lot of false positives. And when you’re dealing with individuals…

Automatic versus manual data analysis
Hilary and Roger touch on some interesting topics in the most recent Not So Standard Deviations, specifically on scalable and automated data analysis. At the surface, it can seem like…

Bias built in to crime prediction
Predictive policing seems to be playing a bigger role in court decisions these days. People charged with crimes can be given a risk score based on priors and their background,…

Track what your government representatives are doing for you
Taking over an old New York Times project, ProPublica relaunches Represent, which offers an app and an API to see what your local lawmakers have been doing on your behalf.…

The Guardian analyzes 70m comments, unearthing online abuse
Online comments are an odd entity that can get out of hand quickly, and it only takes one or two sour comments to sully an entire thread. To shed some…

Treating visualization as a process
Many people think of visualization as a plug-in tool that spits out something to look at. Microsoft Excel comes to mind. Some think of visualization as just that final chart…

Stephen Curry statistical dominance
Robert O’Connell for the Atlantic ponders basketball analytics and the rise of Stephen Curry. Like every sport, basketball has recently undergone a statistical overhaul. A new generation of analysts has…

Moving to the “worst” place in America
In 1999, the Department of Agriculture published a Natural Amenities Scale that took into account “six measures of climate, topography, and water area” to help identify desirable places to live…

Data scientists mostly just do arithmetic
Noah Lorang, a data scientist at Basecamp, explains the key for most companies isn’t finding a way to use the most advanced methods. Instead, it’s about asking the right questions.…

Emergency room data in R
For my graphic on emergency room visits over time and the other on things that get stuck, I used data from the National Electronic Injury Surveillance System, which is maintained…

Math of crime and terrorism
Numberphile, from the Mathematical Sciences Research Institute, is one of my new favorite YouTube channels. In this episode, Hannah Fry talks crime, data, and the Poisson distribution. [Thanks, Mike]…

Predictive policing
Crime and data have an old history together, but because there are new methods of collection and analysis these days, there are new decisions to make. The Marshall Project, in…

Campaign Finance API moves to ProPublica
Back in 2008, the New York Times rolled out a campaign finance API so that you could easily access data based on Federal Election Commission filings. (If you’ve tried grabbing…

Catalog of criminal justice data
There’s a lot of data on criminal justice — prison populations, crime rates, police policies, etc — but it can be hard to find, because it’s scattered across and deep…

Game: Guess the correlation
Guess the Correlation is a straightforward game where you do just that, and it’s surprisingly fun. You get a scatterplot and you guess the correlation coefficient. That’s it. If you’re…

Playing with fonts using neural networks
Erik Bernhardsson downloaded 50,000 fonts and then threw them to the neural networks to see what sort of letters a model might come up with. These are all characters drawn…

Missing 11th of the month
David Hagan looked closer at why the 11th of the month appeared to be missing in books. As with many modern curiosities, it began with an xkcd comic. First I…