The Straits Times visualized the Marvel Cinematic Universe with a 3-D browsable network. Link colors represent type of relationship, and proximity naturally represents commonalities between characters. Click on individual characters for information on each. Turn on the sound for extra dramatics.
Kevin Quealy and Josh Katz for The Upshot analyzed shoe and running data to see if Nike’s Vaporfly running shoes really helped marathoners achieve faster times. Accounting for a number of confounding factors, the results appear to point to yes.
We found that the difference was not explained by faster runners choosing to wear the shoes, by runners choosing to wear them in easier races or by runners switching to Vaporflys after running more training miles. Instead, the analysis suggests that, in a race between two marathoners of the same ability, a runner wearing Vaporflys would have a real advantage over a competitor not wearing them.
Very statistics-y, even for The Upshot. I like it.
It takes me back to my fourth grade science fair project where I asked: Do Nikes really make you jump higher? Our results pointed to yes too, although our sample size of five, with no control or statistical rigor, might not stand up to more technical standards. My Excel charts were dope though.
Birth control is one of those topics often saved for private conversations, so people’s views are often anecdotal. Someone knows what their friend, family member, etc. used, but not much else. Amber Thomas for The Pudding provides a wider view of birth control using data from the CDC’s ongoing National Survey of Family Growth.
You see what other people use, how the method changes with age, and side effects. There’s a Clippy-like character for added information on the different methods. So there’s a good amount of information there to make the choice that’s right for you.
After seeing polar charts of street orientation in major cities, Vladimir Agafonkin, an engineer at Mapbox, implemented an interactive version that lets you see directions for everywhere:
Extracting and processing the road data for every place of interest to generate a polar chart seemed like too much work. Could I do it on an interactive map? It turns out that this is a perfect use case for Mapbox vector maps — since the map data is there on the client, we can analyze and visualize it instantly for any place in the world.
So someone’s going to take the next step to rank and rate griddyness around the world, right?
Sapna Maheshwari for The New York Times on Samba TV software running on smart televisions:
Once enabled, Samba TV can track nearly everything that appears on the TV on a second-by-second basis, essentially reading pixels to identify network shows and ads, as well as programs on HBO and even video games played on the TV. Samba TV has even offered advertisers the ability to base their targeting on whether people watch conservative or liberal media outlets and which party’s presidential debate they watched.
I feel like this is something most people don’t want.
Many have found Amazon’s Alexa devices to be helpful in their homes, but if you can’t physically speak, it’s a challenge to communicate with these things. So, Abhishek Singh used TensorFlow to train a program to recognize sign language and communicate with Alexa without voice.
The Rush Hour puzzle game was invented by Nob Yoshigahara in the 1970s and made its way to the United States in the 1990s. There are vehicles of varying length in a parking lot, and you have to figure out how to get one of the cars out by shifting all the others inside a six-by-six grid. Michael Fogleman wrote a solver and generator for the game, resulting in a database of 1.5 million puzzles.
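For a sense of what a solver for this kind of puzzle involves, here is a minimal breadth-first search sketch. It is not Fogleman's implementation. The board encoding is an assumption made for illustration: a 36-character string read row by row, with "." for empty cells, one letter per vehicle, and "A" as the car that has to reach the right edge of its row. A move here means sliding one vehicle by one cell.

```python
from collections import deque

SIZE = 6  # six-by-six grid


def vehicles(board):
    """Map each vehicle letter to the ascending list of cells it occupies."""
    cells = {}
    for i, ch in enumerate(board):
        if ch != ".":
            cells.setdefault(ch, []).append(i)
    return cells


def neighbors(board):
    """Yield every board reachable by sliding one vehicle by one cell."""
    for name, idx in vehicles(board).items():
        horizontal = idx[1] - idx[0] == 1  # assumes every vehicle has length >= 2
        step = 1 if horizontal else SIZE
        head, tail = idx[0], idx[-1]
        if horizontal:
            can_back = head % SIZE > 0
            can_fwd = tail % SIZE < SIZE - 1
        else:
            can_back = head >= SIZE
            can_fwd = tail < SIZE * (SIZE - 1)
        # Slide left or up: occupy the cell before the head, free the tail.
        if can_back and board[head - step] == ".":
            new = list(board)
            new[head - step], new[tail] = name, "."
            yield "".join(new)
        # Slide right or down: occupy the cell after the tail, free the head.
        if can_fwd and board[tail + step] == ".":
            new = list(board)
            new[tail + step], new[head] = name, "."
            yield "".join(new)


def solve(board, target="A"):
    """Return the fewest one-cell slides that free the target car, or None."""
    seen = {board}
    queue = deque([(board, 0)])
    while queue:
        state, moves = queue.popleft()
        # Solved once the target's rightmost cell sits in the last column.
        if state.rindex(target) % SIZE == SIZE - 1:
            return moves
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, moves + 1))
    return None


# Hypothetical puzzle: car B blocks A's row and has to slide down first.
puzzle = "......" "......" "AA.B.." "...B.." "......" "......"
print(solve(puzzle))  # 5
```

Breadth-first search visits boards in order of move count, so the first solved board it reaches uses the fewest slides. A generator could, in principle, run the same search over many candidate boards and keep the ones that are hard to solve.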
Earlier this year, The New York Times investigated fake followers on Twitter, showing very clearly that it was a problem. It’s hard to believe that Twitter didn’t already know about the scale of the issue, but after the story, the social service finally started to work on the problem.
An investigation by The New York Times in January demonstrated that just one small Florida company sold fake followers and other social media engagement to hundreds of thousands of users around the world, including politicians, models, actors and authors. The revelations prompted investigations in at least two states and calls in Congress for intervention by the Federal Trade Commission. In interviews this week, Twitter executives said that The Times’s reporting pushed them to look more closely at steps the company could take to clamp down on the market for fakes, which is fueled in part by the growing political and commercial value of a widely followed Twitter account.
This is statistics driving positive change instead of just advertising. I’m ready for more of this.
Using OpenStreetMap data, Geoff Boeing charted the orientation distributions of major cities:
Each of the cities above is represented by a polar histogram (aka rose diagram) depicting how its streets orient. Each bar’s direction represents the compass bearings of the streets (in that histogram bin) and its length represents the relative frequency of streets with those bearings.
So you can easily spot the gridded street networks, and then there’s Boston and Charlotte that are a bit nutty. Check out Boeing’s other chart for orientation of major non-US cities.
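The polar histogram described in the excerpt comes down to binning compass bearings and normalizing the counts. Below is a rough sketch of that step only, not Boeing's code. The function name, the 36-bin default, and the choice to count each street in both directions are assumptions made for illustration.

```python
def bearing_histogram(bearings_deg, n_bins=36):
    """Return the relative frequency of street bearings per compass bin."""
    width = 360 / n_bins
    counts = [0] * n_bins
    for b in bearings_deg:
        # Assumption: streets are two-way, so count the reverse bearing too.
        for direction in (b, b + 180):
            # Shift by half a bin so bin 0 is centered on north (0 degrees).
            idx = int(((direction + width / 2) % 360) // width) % n_bins
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


# Made-up bearings for a mostly north-south set of streets, four coarse bins.
print(bearing_histogram([0, 90, 0, 2], n_bins=4))
# [0.375, 0.125, 0.375, 0.125]
```

Each returned value is the bar length for one direction. A gridded city piles most of its weight into four bins at right angles, while a city like Boston spreads it around the compass.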
The trade war started in January of this year when the administration imposed tariffs on 18 solar panel and washing machine products. Then the United States imposed more, and countries returned the favor on U.S. products, which ballooned the product count to 10,000. Keith Collins and Jasmine C. Lee for The New York Times chronicled the shifts with force-directed bubbles.
So many bubbles. Maybe we should just get it over with and impose tariffs on all the things now.
Mike Loukides, Hilary Mason, and DJ Patil published a first post in a series on data ethics on O’Reilly.
We particularly need to think about the unintended consequences of our use of data. It will never be possible to predict all the unintended consequences; we’re only human, and our ability to foresee the future is limited. But plenty of unintended consequences could easily have been foreseen: for example, Facebook’s “Year in Review” that reminded people of deaths and other painful events. Moving fast and breaking things is unacceptable if we don’t think about the things we are likely to break. And we need the space to do that thinking: space in project schedules, and space to tell management that a product needs to be rethought.
Because data might just be computer output — cold and mechanical — but what data represents and the things it leads to are not.
On July 24-25 from 10am-5pm ET, Metis will host its free Demystifying Data Science live online conference for aspiring data scientists and data-curious business professionals. Attendees will experience a total of 28 interactive data science talks from industry-leading speakers.
Day 1 (July 24): For Aspiring Data Scientists
Hear talks on the training, tools, and career paths to the best job in the United States, featuring a keynote by Lillian Pierson, CEO of Data-Mania LLC.
Day 2 (July 25): For Data-Curious Business Leaders
Speakers explain how to integrate data science into your organization and how it all applies to you. The day includes a keynote from Beth Comstock, author and former Vice Chair of General Electric.
Each talk is an 18-minute live presentation followed by a Q&A session with questions from the audience. All registrants will have access to recorded versions of the presentations post-conference. Register for free here!
The eighth Thai boy was rescued from the flooded cave recently. Great news. The South China Morning Post has a series of graphics to explain the rescue path and strategy.
Things have a way of repeating themselves, and it can be useful to highlight these patterns in data.
Benjamin Pavard from France made a low-probability goal the other day. Seth Blanchard and Reuben Fischer-Baum for The Washington Post explain the rarity and use it as a segue into expected versus actual goals to gauge how teams have played.
This statistic can also tell us which teams are over and under-producing given their level of play so far, by comparing their expected goals and actual results. Surprise quarterfinalist Russia is the biggest overproducer, with an actual goal differential of +4 compared with an expected goal differential of -1.7. This can mean a lot of things. The team could be getting a bit lucky, or just playing extremely well in such a way that they finish more hard challenges than you would normally expect.
Seems right, I think. I mean, I have to take it at face value, as the sports world is essentially dead to me until basketball season starts again.
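The over- and under-production comparison in the excerpt is a subtraction: actual goal differential minus expected goal differential. Here is a tiny sketch using the Russia figures quoted above. The function name is made up for illustration, and how expected goals themselves are estimated is outside its scope.

```python
def production_gap(actual_goal_diff, expected_goal_diff):
    """Positive means a team has outscored its expected goal differential."""
    return actual_goal_diff - expected_goal_diff


# Figures from the Washington Post excerpt: actual +4, expected -1.7.
russia_gap = production_gap(4, -1.7)
print(f"Russia: {russia_gap:+.1f}")  # Russia: +5.7
```

A gap of +5.7 goals is what makes Russia the biggest overproducer in the piece. As the excerpt notes, the number alone cannot say whether that is luck or finishing.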
In the early 1990s, the CIA published internal survey results for how people within the organization interpreted probabilistic words such as “probable” and “little chance”. Participants were asked to attach a probability percentage to the words. Andrew Mauboussin and Michael J. Mauboussin ran a public survey more recently to see how people interpret the words now.
The main point, as in the CIA poll, was that words matter. Some words like “usually” and “probably” are vague, whereas “always” and “never” are more certain.
I wonder what results would look like if instead of showing a word and asking probability, you flipped it around. Show probability and then ask people for a word to describe. I’d like to see that spectrum.
Pedro M. Cruz, John Wihbey, Avni Ghael and Felipe Shibuya from Northeastern University used a tree metaphor to represent a couple centuries of immigration in the United States:
Like countries, trees can be hundreds, even thousands, of years old. Cells grow slowly, and the pattern of growth influences the shape of the trunk. Just as these cells leave an informational mark in the tree, so too do incoming immigrants contribute to the country’s shape.