Paul Hebert was curious about the colors used on the web’s larger sites, so he scraped the top ten ranked by Alexa. Then he plotted the colors in the stylesheets.
-
Compare incomes for young people from the Millennial generation and the baby boomer generation.
-
Summary statistics such as mean, median, and mode can only tell you so much about a dataset. Their scope is limited because for them to be useful, you have to assume things like distribution and dependencies. Visualization helps you see what else there is.
Justin Matejka and George Fitzmaurice demonstrate this in their paper for the ACM SIGCHI Conference, in which they developed a method to generate datasets that “are identical over a range of statistical properties, yet produce dissimilar graphics.”
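To get a feel for the idea, here is a tiny sketch in Python. It is not the method from the paper. The two datasets are made up and hand-picked so that the mean and standard deviation match, and the names `spread_out` and `two_clumps` are just for illustration.

```python
from collections import Counter
from statistics import mean, pstdev

# Two made-up datasets (not from the paper), hand-picked to share
# the same mean and population standard deviation.
spread_out = [2, 4, 4, 4, 5, 5, 7, 9]
two_clumps = [3, 3, 3, 3, 7, 7, 7, 7]

def summarize(values):
    """Return the (mean, population standard deviation) pair."""
    return mean(values), pstdev(values)

def text_histogram(values):
    """Return one line per integer value, with a '#' for each occurrence."""
    counts = Counter(values)
    return "\n".join(
        f"{v:2d} {'#' * counts[v]}" for v in range(min(values), max(values) + 1)
    )

for name, data in [("spread_out", spread_out), ("two_clumps", two_clumps)]:
    print(name, summarize(data))
    print(text_histogram(data))
```

The summary lines come out the same for both, but the rough histograms do not: one has a single hump with a tail, and the other is two clumps with nothing in between.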
-
By @matttomic, this chart speaks to me.
-
Yeah, but what if you combine and overlay all these datasets? [xkcd]
-
It’s that time of year again. Turn up the volume and blast It’s Gonna Be Me by NSYNC. No one will judge you.
-
Here’s a fun what-if simulation that imagines a world where all natural causes of death were gone. People only die of things like car crashes and homicide. The result: people who live to thousands of years old.
Of course, this assumes that the likelihood of dying from external causes stays the same. With such a long life expectancy, do people start to take more risks? Or do we become more sloth-like because we have all the time in the world? Ah, that’s a thinker.
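If you want to poke at that assumption yourself, here is a minimal sketch in Python. It is not the linked simulation. It assumes a constant yearly chance of dying from an external cause, and the value of `ANNUAL_EXTERNAL_RISK` is a made-up placeholder rather than a real mortality rate.

```python
import random

# Hypothetical, made-up yearly probability of dying from an external cause.
ANNUAL_EXTERNAL_RISK = 0.0005
PEOPLE = 1000

def simulate_lifespan(annual_risk, rng):
    """Count the years lived until an external cause of death strikes."""
    years = 0
    while True:
        years += 1
        if rng.random() < annual_risk:
            return years

rng = random.Random(2017)  # fixed seed so reruns give the same numbers
lifespans = [simulate_lifespan(ANNUAL_EXTERNAL_RISK, rng) for _ in range(PEOPLE)]
average = sum(lifespans) / len(lifespans)

print(f"average simulated lifespan: {average:.0f} years")
print(f"expected with constant risk (1 / risk): {1 / ANNUAL_EXTERNAL_RISK:.0f} years")
```

With a constant risk, the average lifespan works out to roughly one divided by the yearly risk. Raise the risk to mimic bolder people, or lower it for the sloth-like scenario, and the average moves accordingly.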
Want a simulation closer to reality? Here you go.
-
Statistician John Tukey, who coined Exploratory Data Analysis, talked a lot about using visualization to find meaning in your data. You don’t always know what you’re looking for, so you explore it visually. Eytan Adar, who teaches information visualization at the University of Michigan, makes a good case for banning the phrase in his students’ project proposals.
For all the clever names he created for things (software, bit, cepstrum, quefrency) what’s up with EDA? The name is fundamentally problematic because it’s ambiguous. “Explore” can be both transitive (to seek something) and intransitive (to wander, seeking nothing in particular). Tukey’s book seems to emphasize the former; it’s full of unique graphical tools to find certain patterns in the data: distribution types, differences between distributions, outliers, and many other useful statistical patterns. The problem is that students think he meant the latter.
I see this sort of thing in my suggestion box too. Data exploration with visualization is good, but when someone describes their project as an exploration tool, it often means it lacks focus or direction. Instead it looks like generic graphs that don’t answer anything particular and leave all interpretation to the reader.
-
These are the traffic crashes that resulted in deaths in 2015, categorized by month, time of day, and factors involved.
-
Nationwide mortality data relies on death certificates, and when cause of death is unknown, sometimes “garbage codes” are used to fill the space on the form. This leads to unwanted noise, because garbage in, garbage out, as the saying goes. The Institute for Health Metrics and Evaluation tried to soften the noise and strengthen the signal. Ella Koeze for FiveThirtyEight mapped the results. Flip through causes and animate over time.
-
Some states have high rates. Some have low. But whether a state is lower or higher for you depends on more than just the high brackets.
-
Disinformation is kind of a problem these days, yeah? Fatih Erikli uses a simulation that works like a disaster spread model applied to social networks to give an idea of how disinformation spreads.
I tried to visualize how a disinformation becomes a post-truth by the people who subscribed in a network. We can think this network as a social media such as Facebook or Twitter. The nodes (points) in the map represent individuals and the edges (lines) shows the relationships between them in the community. The disinformation will be forwarded to their audience by the unconscious internet (community) members.
Set the “consciousness” parameter and select a node to run.
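For a rough idea of how a model like this can work, here is a minimal sketch in Python. It is not Erikli's code. The ring-plus-random-ties network and the forwarding rule (a person passes the post along with probability one minus `consciousness`) are my assumptions for illustration.

```python
import random
from collections import deque

def make_ring_network(n, extra_links, rng):
    """Build an undirected network: a ring of n people plus random extra ties."""
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        neighbors[i].add(j)
        neighbors[j].add(i)
    for _ in range(extra_links):
        a, b = rng.sample(range(n), 2)
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def spread(neighbors, start, consciousness, rng):
    """Return the set of people who received the post.

    Assumption: each person who receives it forwards it to all of their
    neighbors with probability (1 - consciousness); otherwise it stops there.
    """
    reached = {start}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        # The starting person always posts; everyone else may decline to forward.
        if person != start and rng.random() < consciousness:
            continue
        for other in neighbors[person]:
            if other not in reached:
                reached.add(other)
                queue.append(other)
    return reached

rng = random.Random(42)  # fixed seed so reruns give the same network
network = make_ring_network(200, 100, rng)
for consciousness in (0.0, 0.5, 0.9, 1.0):
    reached = spread(network, 0, consciousness, rng)
    print(f"consciousness={consciousness}: reached {len(reached)} of {len(network)}")
```

At zero consciousness everyone forwards and the post reaches the whole network. At full consciousness it stops with the starting person's direct connections.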
-
Reuben Fischer-Baum for The Washington Post compares how professional football teams were expected to do, given their draft picks, against how they actually performed.
-
The choice for Most Valuable Player in the NBA is only minimally about the numbers, but it’s fun to look anyways. FiveThirtyEight makes the case for Stephen Curry. I particularly like the chart that shows how other players on a team fare when an MVP candidate doesn’t play.
Not only do virtually all of his teammates (10 of 11 players with at least 30 shots, representing over 1,700 shots taken without him) shoot worse without Curry on the court to draw attention, they shoot dramatically worse. Overall, Curry’s teammates shoot 7.3 percentage points worse with Curry off the court, with his average teammate shooting 8.3 points worse. Among our MVP candidates, LeBron has the next-highest impact on average teammate shooting (3.9 points), followed by Westbrook (2.5 points). When it comes to opening up a team’s offense, Curry has no equal.
-
From Little Planet Factory, a Solar System in a bottle made to scale:
A small bottle attempting to maintain the correct scale between the 8 planets of the solar system at a scale of 1:5,000,000,000. Much as in reality the entire bottle is almost entirely dominated by the volume (and mass) of the four gas giants while the four solid planets settle almost dust like in comparison at the bottom of it.
Cute. [via @alykat]
-
Government data isn’t always the easiest to use with computers. Maybe it’s in PDF format. Maybe you have to go through a roundabout interface. Maybe you have to manually request files through an email address that may or may not work. However, this file that OpenElections received might take the cake.
It’s a spreadsheet, but the numbers are clipart.
City of Detroit produced a lookup tables for its absentee precincts in 2016. It's in Excel. But wait for it: the values are CLIP ART. pic.twitter.com/pzsPbjvc6j
— OpenElections (@openelex) April 17, 2017
Did someone enter clipart manually? Why is it clipart instead of numbers in Excel? Who made this file? So many questions, so little data.
-
In case you didn’t hear, California had a bit of a drought problem for the past few years. We complained about not enough rain constantly, and we finally got a lot of it this year. Now we complain that there’s too much rain (because you know, we have to restore balance). On the upside, the state looks a lot greener and less barren these days. David Yanofsky for Quartz has got your satellite imagery right here.
-
Gabriel Goh models momentum over at the new machine learning journal Distill. The visualization is not the focus, but it’s a nice supplement to help explain more complex concepts to a wider audience. More generally, if you haven’t checked out Distill yet, it’s worth your time.
-
NASA recently released composite images of the Earth at night based on 2016 data, which was a follow-up to similar images for 2012. John Nelson compared the two, specifically looking for new lights that came on (blue) and lights that went off (pink). The former suggests growth and the latter suggests decline.
-
I know, it’s only April 2017, but some senators and representatives have some extra planning to do as they figure out how to persuade midterm voters to re-elect them when the voters went a different direction for the presidential election. Kevin Schaul and Kevin Uhrmacher for The Washington Post use a scatterplot and scrollytelling to explain.