Numberphile, from the Mathematical Sciences Research Institute, is one of my new favorite YouTube channels. In this episode, Hannah Fry talks crime, data, and the Poisson distribution.
[Thanks, Mike]
Between 2009 and 2014, there were an estimated 17,968 visits to the emergency room for things stuck in a rectum. Here are those things’ stories.
Crime and data have an old history together, but because there are new methods of collection and analysis these days, there are new decisions to make. The Marshall Project, in collaboration with the Verge, looks at the current state of predictive policing and the social issues that surround it.
As predictive policing has spread, researchers and police officers have begun exploring how it might contribute to a version of policing that downplays patrolling — as well as stopping, questioning, and frisking — and focuses more on root causes of particular crimes. Rutgers University researchers specializing in “risk terrain modeling” have been using analysis similar to HunchLab to work with police on “intervention strategies.” In one Northeast city, they have enlisted city officials to board up vacant properties linked to higher rates of violent crime, and to advertise after-school programming to kids who tend to gather near bodegas in high-risk areas.
Of course, then there’s the whole action-reaction stuff. More time required.
People with certain professions tend to marry others with a given profession. Adam Pearce and Dorothy Gambrell for Bloomberg Business were curious.
When it comes to falling in love, it’s not just fate that brings people together—sometimes it’s their jobs. We scanned data from the U.S. Census Bureau’s 2014 American Community Survey—which covers 3.5 million households—to find out how people are pairing up.
You get a matrix of professions organized by more male to more female, left to right. Mouse over any profession or use the search box, and lines project out to the five professions that people in the selected one most commonly marry into. The pink and blue color gradients indicate the sexes of the two spouses.
So for each profession, you get a quick view of who people marry, whether it be outside their own or within. I like how when you mouse over the far left or the far right, you see lines jut across to the opposite side. I wonder what the tendencies are in total for male-dominant to marry female-dominant professions and vice versa.
China’s economic slowdown means a major decline in imports from other countries, which leads to significant effects in these areas. The Guardian takes a look. The vertical axis represents lost export income as a percentage of GDP, the size of the outer red circle represents GDP, and the inner white circle represents exports to China. Dollar units are in billions of dollars. Billions.
Stacked area charts let you see categorical data over time. Interaction allows you to focus on specific categories without losing sight of the big picture.
Data can be intimidating and confusing for beginners, and as a result they stay away from the spreadsheets and delimited files altogether. DataBasic, a suite of tools built as an introduction to poking at data, injects a bit of fun into the onboarding process.
These are the top 250 products that people injure themselves on or with in a year.
A fun one from Interactive Things that shows cover songs with a galaxy metaphor:
The panorama view shows the 50 top songs as individual planetary systems with the original work as the sun. Each planet represents a version of the song and its appearance indicates characteristics including genre, popularity, tempo, valence, energy, and speechiness. The radius of its orbit around the sun shows the years between the publication dates. This view allows you to compare the structure and density of the constellation of different songs from a high-level perspective.
A high percentage of Americans are glued to the television or party sample platter during the Super Bowl each year, which is especially obvious if you go anywhere without a television during this time. Todd Schneider for the Upshot looks at this phenomenon through the lens of New York taxi rides per minute.
Taxi activity’s lowest level in New York coincided with the climactic moment of the game, just as Malcolm Butler intercepted Russell Wilson at 9:59 p.m. to secure the 28-24 victory for the Patriots. New England called a timeout after Butler’s interception, but many Super Bowl party guests apparently didn’t wait around to watch Tom Brady take a knee before they hailed cabs.
Fun. Although nothing beats the Canadian toilet flushing symphony during the Olympic gold medal hockey game of 2010. [Thanks, Todd]
As we use up current energy resources, it grows more important to look to alternative energy sources. Wind is one potential area, but the problem is that one has to know where it’s windy enough — now and in the future — to justify the cost of building the structures to harness the energy. It’s freakin’ wind, and variability is all over the place.
Project Ukko is an effort to make wind research predictions accessible to those who need such information. The visualization component by Moritz Stefaner, in collaboration with Future Everything and BSC, shows a number of wind factors around the world.
Back in 2008, the New York Times rolled out a campaign finance API so that you could easily access data based on Federal Election Commission filings. (If you’ve tried grabbing data direct from the source, you know this is a pain.) ProPublica took the reins a few days ago as we lead up to this year’s elections.
Like millions around the world, you’re probably like, “What the what? I thought the FEC released their own API recently!” They did. But:
One big difference is timeliness: the FEC API is updated nightly, while ours will be updated throughout each day. For many users of campaign finance data, that distinction may not be a big deal, but on filing days, when thousands of filings are submitted to the FEC, timeliness can matter a lot. Another is the source data: the FEC considers electronic filings to be “unofficial” in the sense that data from them is then brought into agency databases before being published as bulk data. The FEC API publishes data only from those official tables, while the ProPublica API has data from both the official tables and the raw electronic filings.
I’d trust the ProPublica one more for now.
There’s a lot of data on criminal justice (prison populations, crime rates, police policies, etc.), but it can be hard to find, because it’s scattered across and deep within thousands of local sites. Hall of Justice from the Sunlight Foundation is an effort to catalog a significant portion of reports and datasets.
While not comprehensive, Hall of Justice contains nearly 10,000 datasets and research documents from all 50 states, the District of Columbia, U.S. territories and the federal government. The data was collected between September 2014 and October 2015. We have tagged datasets so that users can search across the inventory for broad topics, ranging from death in custody to domestic violence to prison population. The inventory incorporates government as well as academic data.
Dealing with those pesky government PDF files is up to you. At least there’s an app for that.
On the PolicyViz podcast, Kim Rees of Periscopic and Mushon Zer-Aviv of Shual Design Studio discuss whether or not empathy plays a role in visualization. Stuff on this topic tends to be annoyingly dismissive or hand-wavy, but this is a good chat worth listening to.
There’s a little bit of swearing, so maybe put on headphones if you’re in an area where that is frowned upon.
Adam Pearce charted minute-by-minute point differentials for NBA games during the 2014-15 season.
To squeeze distribution in, I had to make a couple of trade-offs. Instead of being able to encode point differentials with vertical position like I did with my Golden State’s win streak chart, I used color for point difference and saved vertical position for distribution. Since there have been more score differences (GSW was beating MEM by 52 at one point) than can be usefully encoded as unique colors, I bucketed the score differences into 7 colors.
Nifty.
And go Warriors.
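If you're curious what bucketing score differences into 7 colors might look like in practice, here's a rough Python sketch. The bucket edges and labels are my own guesses for illustration, not the ones used in the chart, and `bucket_index` is a made-up helper name.

```python
from bisect import bisect_right

# Hypothetical cut points in points of score difference.
# Six edges split the number line into seven buckets.
EDGES = [-20, -10, -3, 3, 10, 20]

# Hypothetical labels, one per bucket, from big deficit to big lead.
LABELS = [
    "down 20+", "down 10-19", "down 3-9", "close",
    "up 3-9", "up 10-19", "up 20+",
]

def bucket_index(diff):
    """Return the bucket (0 through 6) for a point differential."""
    # bisect_right counts how many edges are <= diff.
    return bisect_right(EDGES, diff)

# A 52-point lead lands in the last bucket, same as a 20-point lead.
for diff in (-25, 0, 7, 52):
    print(diff, LABELS[bucket_index(diff)])
```

The point is that extreme values like a 52-point lead share a color with anything past the last edge, so the palette stays readable.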
Guess the Correlation is a straightforward game where you do just that, and it’s surprisingly fun. You get a scatterplot and you guess the correlation coefficient. That’s it. If you’re off by too much, you lose a life, and if you’re almost spot on, you gain a life. If you’re somewhat right, you get a coin. Bonus points for streaks of correct guesses.
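For reference, the number you're guessing is presumably the Pearson correlation coefficient, which you can compute for any scatterplot's points. Here's a minimal Python sketch using only the standard library. The function names are mine, and I don't know how the game itself scores guesses, so `guess_error` just reports the absolute difference.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired samples xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

def guess_error(guess, xs, ys):
    """Absolute difference between a guess and the actual coefficient."""
    return abs(guess - pearson_r(xs, ys))

# Made-up example points, not data from the game.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(round(pearson_r(xs, ys), 2))
```

Values near 1 or -1 mean the points fall close to a straight line, and values near 0 mean there's little linear relationship.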
What if you relived life’s activities in big clumps? Thirty years of sleeping in one go. Five months sitting on the toilet. Based on David Eagleman’s Sum: Forty Tales from the Afterlives, this short film by Temujin Doran imagines such a life. Watch to the end.
[via Brain Pickings]