Ian Johnson provides some good direction for those looking to get their feet wet with d3.js.
-
Gerrymandering is the practice of manipulating boundaries in such a way that favors a political party. If you slice and group in various ways, you can end up with different election results.
How many different ways can you draw boundaries though? And can results really change that much, depending on how you draw the boundaries? District, by Christopher Walker, is a puzzle game that shows you how it works. The goal: Group circles in such a way that favors your color.
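To see how much the grouping matters, here is a minimal sketch in Python. It is not how District works internally; the voter counts, the two district plans, and the `district_winners` helper are all made up for illustration, and it ignores the requirement that real districts be contiguous.

```python
# Hypothetical electorate: 6 orange ("O") voters and 9 purple ("P") voters.
voters = list("OOOOOOPPPPPPPPP")

def district_winners(voters, districts):
    """Return the majority color in each district (a list of voter indices)."""
    winners = []
    for members in districts:
        orange = sum(1 for i in members if voters[i] == "O")
        purple = len(members) - orange
        winners.append("O" if orange > purple else "P")
    return winners

# Plan 1: spread orange voters evenly, 2 per district of 5.
plan_even = [[0, 1, 6, 7, 8], [2, 3, 9, 10, 11], [4, 5, 12, 13, 14]]

# Plan 2: concentrate orange voters so they hold 3 of 5 seats in two districts.
plan_packed = [[0, 1, 2, 6, 7], [3, 4, 5, 8, 9], [10, 11, 12, 13, 14]]

print(district_winners(voters, plan_even))
print(district_winners(voters, plan_packed))
```

Same 15 voters both times. Under the first plan purple takes every district, and under the second orange takes two of three despite being outnumbered.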
-
The unexpected resignation of Census director John H. Thompson was likely related to the administration providing only half the requested budget for 2017. Mona Chalabi for The Guardian on why this matters:
Budget cuts at the Census Bureau mean counting fewer things. Fair enough right? Except that the subjects that are being targeted for cuts seem conspicuous. So far, the Trump administration has deleted questions on sexual orientation from the 2020 Census and at least two other government surveys. Meanwhile, two Republican-sponsored bills introduced in January say that government money can’t be used to collect data on “racial disparities”.
Worrisome.
-
The U.S. Census director John H. Thompson resigned and will leave June 30. Seems not good. And a horrible time for government data in general.
The news, which surprised census experts, follows an April congressional budget allocation for the census that critics say is woefully inadequate. And it comes less than a week after a prickly hearing at which Thompson told lawmakers that cost estimates for a new electronic data collection system had ballooned by nearly 50 percent.
Isn’t this supposed to be the age of big data or something? I thought data was the new oil. Measure things to improve them. Etc. Census data (not just the decennial stuff) is core to so much of the policy-making that makes sure people are properly represented. The direction government data seems to be headed confuses and frustrates me to no end, especially since just a few years ago I was happy about where it was going.
-
This is what you get when you add up all the days the average American adult spends sleeping, eating, commuting, and doing other activities.
-
FiveThirtyEight continues their look at mortality by geography. This graphic by Anna Maria Barry-Jester compares life expectancy over time for each state. Purple means below average and orange means above.
The good news is that all the lines trend upward. The bad news is that some states are trending upwards much more slowly than the rest.
-
Using satellite data and spatial models, researchers estimate human influence in the ocean. Darker means more impact.
Two-thirds of the ocean shows increased strain from human-related factors, such as fishing and climate change. And more than three-quarters of coastal waters suffer from climate change and increases in the effects of harmful land-based activities, including pollution. In all, the researchers classified more than 40 percent of the ocean as “heavily impacted” by human activity.
It looks like you can find much of the data used here, under the Cumulative Impact Mapping section.
-
Video: https://www.youtube.com/watch?v=jYaIMyEoqic
Filmmaker Kevin Smith talks about making things versus critiquing them. He’s talking about movies, but you can so easily plug in visualization. I just kept nodding yes. [via swissmiss]
-
Paul Hebert was curious about the colors used on the web’s larger sites, so he scraped the top ten ranked by Alexa. Then he plotted the colors in the stylesheets.
-
Compare incomes for young people from the Millennial generation and the baby boomer generation.
-
Summary statistics such as mean, median, and mode can only tell you so much about a dataset. Their scope is limited because for them to be useful, you have to assume things like distribution and dependencies. Visualization helps you see what else there is.
Justin Matejka and George Fitzmaurice demonstrate this in their paper for the ACM SIGCHI Conference, in which they developed a method to generate datasets that “are identical over a range of statistical properties, yet produce dissimilar graphics.”
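For a tiny taste of the idea, here is a rough Python sketch. This is not the method from the paper. It just rescales two made-up datasets so they share the same mean and standard deviation, and the `standardize` helper and the numbers are assumptions for illustration.

```python
import statistics

def standardize(values):
    """Shift and scale values so the mean is 0 and the population std is 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Made-up data: two tight clumps versus a pile of zeros with one big outlier.
clustered = standardize([1, 1, 1, 1, 9, 9, 9, 9])
skewed = standardize([0, 0, 0, 0, 0, 0, 1, 20])

for name, data in [("clustered", clustered), ("skewed", skewed)]:
    print(name,
          "mean:", round(statistics.mean(data), 6),
          "std:", round(statistics.pstdev(data), 6),
          "min:", round(min(data), 2),
          "max:", round(max(data), 2))
```

The mean and standard deviation match, but the minimum and maximum already hint that the shapes differ. A plot of each would make that obvious right away.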
-
By @matttomic, this chart speaks to me.
-
Yeah, but what if you combine and overlay all these datasets? [xkcd]
-
It’s that time of year again. Turn up the volume and blast It’s Gonna Be Me by NSYNC. No one will judge you.
-
Here’s a fun what-if simulation that imagines a world where all natural causes of death were gone. People only die of things like car crashes and homicide. The result: people who live to thousands of years old.
Of course, this assumes that the likelihood of dying from external causes stays the same. With such a long life expectancy, do people start to take more risks? Or do we become more sloth-like because we have all the time in the world? Ah, that’s a thinker.
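If the yearly chance of dying from an external cause really does stay constant, the arithmetic is simple: the expected lifespan is roughly one divided by that chance. A minimal Python sketch, where the 1-in-2,000 yearly risk is a made-up number for illustration and not a figure from the simulation:

```python
import random

def simulate_lifespan(annual_risk, rng):
    """Count the full years survived when each year carries the same risk of death."""
    age = 0
    while rng.random() >= annual_risk:
        age += 1
    return age

# Hypothetical constant risk: a 1-in-2,000 chance of dying in any given year.
annual_risk = 0.0005
rng = random.Random(1)

lifespans = [simulate_lifespan(annual_risk, rng) for _ in range(500)]
print("expected, roughly:", 1 / annual_risk)
print("simulated average:", sum(lifespans) / len(lifespans))
```

Change `annual_risk` to play with the questions above. More risk-taking means a higher number and shorter lives, and the sloth scenario means the opposite.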
Want a simulation closer to reality? Here you go.
-
Statistician John Tukey, who coined the term Exploratory Data Analysis, talked a lot about using visualization to find meaning in your data. You don’t always know what you’re looking for, so you explore it visually. Eytan Adar, who teaches information visualization at the University of Michigan, makes a good case for banning the phrase in his students’ project proposals.
For all the clever names he created for things (software, bit, cepstrum, quefrency) what’s up with EDA? The name is fundamentally problematic because it’s ambiguous. “Explore” can be both transitive (to seek something) and intransitive (to wander, seeking nothing in particular). Tukey’s book seems emphasize the former — it’s full of unique graphical tools to find certain patterns in the data: distribution types, differences between distributions, outliers, and many other useful statistical patterns. The problem is that students think he meant the latter.
I see this sort of thing in my suggestion box too. Data exploration with visualization is good, but when someone describes their project as an exploration tool, it often means it lacks focus or direction. Instead it looks like generic graphs that don’t answer anything particular and leave all interpretation to the reader.
-
These are the traffic crashes that resulted in deaths in 2015, categorized by month, time of day, and factors involved.
-
Nationwide mortality data relies on death certificates, and when cause of death is unknown, sometimes “garbage codes” are used to fill the space on the form. This leads to unwanted noise because, as the saying goes, garbage in, garbage out. The Institute for Health Metrics and Evaluation tried to soften the noise and strengthen the signal. Ella Koeze for FiveThirtyEight mapped the results. Flip through causes and animate over time.
-
Some states have high rates. Some have low. But whether a state is lower or higher for you depends on more than just the high brackets.
-
Disinformation is kind of a problem these days, yeah? Fatih Erikli uses a simulation that works like a disaster spread model applied to social networks to give an idea of how disinformation spreads.
I tried to visualize how a disinformation becomes a post-truth by the people who subscribed in a network. We can think this network as a social media such as Facebook or Twitter. The nodes (points) in the map represent individuals and the edges (lines) shows the relationships between them in the community. The disinformation will be forwarded to their audience by the unconscious internet (community) members.
Set the “consciousness” parameter and select a node to run.
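A rough sketch of how a spread model like this can work, in Python. This is not Erikli's code. The ring-shaped network, the `simulate_spread` function, and the rule that each person forwards with probability one minus "consciousness" are all assumptions for illustration.

```python
import random

def simulate_spread(neighbors, start, consciousness, rng):
    """Return the set of people reached when the rumor starts at `start`.

    Assumed rule: each not-yet-reached neighbor of a reached person picks up
    and forwards the rumor with probability 1 - consciousness.
    """
    reached = {start}
    frontier = [start]
    while frontier:
        person = frontier.pop()
        for other in neighbors[person]:
            if other not in reached and rng.random() >= consciousness:
                reached.add(other)
                frontier.append(other)
    return reached

# Hypothetical network: 10 people in a ring, each connected to two neighbors.
size = 10
ring = {i: [(i - 1) % size, (i + 1) % size] for i in range(size)}

rng = random.Random(7)
for consciousness in (0.0, 0.5, 1.0):
    reached = simulate_spread(ring, 0, consciousness, rng)
    print(consciousness, len(reached))
```

With consciousness at 0, everyone connected to the starting person gets reached. At 1, the rumor never leaves the person who started it. Anything in between depends on the luck of the draw and the shape of the network.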