When this all started, Covid-19 was impacting large cities at a much higher rate than everywhere else. This straightforward chart from NPR shows how the share of deaths in small and medium cities has made its way up to over half of all weekly Covid-19 deaths.
-
Companies are tracking what you do online. You know this. But it can be a challenge to know the extent, because the methods are hidden on purpose. So The Markup built Blacklight:
To investigate the pervasiveness of online tracking, The Markup spent 18 months building a one-of-a-kind free public tool that can be used to inspect websites for potential privacy violations in real time. Blacklight reveals the trackers loading on any site—including methods created to thwart privacy-protection tools or watch your every scroll and click.
We scanned more than 80,000 of the world’s most popular websites with Blacklight and found more than 5,000 were “fingerprinting” users, identifying them even if they block third-party cookies.
We also found more than 12,000 websites loaded scripts that watch and record all user interactions on a page—including scrolls and mouse movements. It’s called “session recording” and we found a higher prevalence of it than researchers had documented before.
Try it out here. Just enter a URL, and you’ll see a real-time count of the ad trackers, third-party cookies, cookie evaders, and keystroke recorders on any given site.
This is why I got rid of Google Analytics, social media widgets, and ad-serving snippets on FD years ago.
-
For Google’s PAIR, Adam Pearce and Ellen Jiang explain how granular data can lead to easy identification of individuals and how randomization can help:
Aggregate statistics about private information are valuable, but can be risky to collect. We want researchers to be able to study things like the connection between demographics and health outcomes without revealing our entire medical history to our neighbors. The coin flipping technique in this article, called randomized response, makes it possible to safely study private information.
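For a rough feel of how the coin flips work, here is a minimal Python sketch of one common two-coin variant of randomized response. The exact rules in the article may differ, and the 30 percent true rate and the function names are made up for illustration.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Report an answer with plausible deniability.

    First coin: heads means answer truthfully.
    Tails means a second coin decides the reported answer.
    """
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(responses: list[bool]) -> float:
    """Undo the noise in aggregate.

    Under the scheme above, P(reported yes) = 0.5 * true_rate + 0.25,
    so true_rate = 2 * (observed_rate - 0.25).
    """
    observed_rate = sum(responses) / len(responses)
    return 2 * (observed_rate - 0.25)


# Hypothetical population: 30% would truthfully answer "yes".
rng = random.Random(42)
true_rate = 0.30
population = [rng.random() < true_rate for _ in range(100_000)]

# Each person reports a noisy answer; no single response reveals the truth.
responses = [randomized_response(person, rng) for person in population]
estimate = estimate_true_rate(responses)
print(f"estimated rate: {estimate:.3f}")
```

No individual answer can be taken at face value, but with enough responses the aggregate estimate lands close to the true rate.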
-
For NYT Opinion, Stuart A. Thompson and Yaryna Serkez mapped the predominant “climate threat” in each county:
This picture of climate threats uses data from Four Twenty Seven, a company that assesses climate risk for financial markets. The index measures future risks based on climate models and historical data. We selected the highest risk for each county to build our map and combined it with separate data from Four Twenty Seven on wildfire risks.
Got me thinking about Tim Meko’s maps of natural disasters.
-
Smoke from the wildfires made its way to the other side of the country and over the ocean. Using data from NOAA, Reuters animated the smoke clouds over time:
With climate change expected to exacerbate fires in the future, by worsening droughts and warming surface ocean temperatures, wildfire research is becoming especially important. Over the last year, the world has seen record fires in Australia, Brazil, Argentina, Siberia and now the U.S. West.
“I’m concerned that we are starting to see these phenomena more often … everywhere in the world,” Gassó said. “If it’s one year like this, it’s fine, as long as it doesn’t keep repeating itself like this.”
Uh oh.
-
For The Washington Post, Ashlyn Still and Kevin Schaul charted how long it took for primary ballots to be counted in each state. The times might give a hint of what we’re in for on election night:
Before the pandemic struck, mail-in states like California were already counting slowly. Then the coronavirus forced dozens of states to quickly expand absentee voting, and the slowdowns got more dramatic. These two trends — more absentee voting, not much time to prepare for it — could lead to some snail’s-paced race calls in November.
There are some nice details to note in this piece.
The inverted vertical axis and area fills focus on ballots left to count over time instead of ballots already in. The limited contrast keeps attention away from the white space under the lines.
The states move up to the top, and as the lines roll out (in the scrollytelling format), the speed is fixed, so that states that took more time to count finish moving later.
And finally, the scrollytelling format helps highlight individual states at a time, and the small multiples at the end probably help satiate those who want to just see it all at once.
It’s a relatively straightforward dataset with multiple time series lines, but the choices make the patterns obvious.
-
An often painful yet necessary step in visualization is to get your data in the right format. Arquero, from the University of Washington Interactive Data Lab, aims to make this part of the process easier:
Arquero is a JavaScript library for query processing and transformation of array-backed data tables. Following the relational algebra and inspired by the design of dplyr, Arquero provides a fluent API for manipulating column-oriented data frames. Arquero supports a range of data transformation tasks, including filter, sample, aggregation, window, join, and reshaping operations.
Before working with JavaScript, I almost always end up in R or Python to get the data where it needs to be. I’m curious if this’ll help streamline the process, if just by a bit.
-
For your analytical perusal, Emil Hvitfeldt provides ten seasons’ worth of scripts from the Friends sitcom in an easy-to-use R package:
The goal of friends is to provide the complete script transcription of the Friends sitcom. The data originates from the Character Mining repository which includes references to scientific explorations using this data. This package simply provides the data in tibble format instead of json files.
The ten seasons ran from 1994 to 2004. I’m suddenly feeling my age.
-
North Drinkware molded Half Dome in the bottom of a hand-blown pint glass using elevation data from the United States Geological Survey. Wow. [via @blprnt]
-
Bloomberg mapped tree loss between 2000 and 2019 in Brazil:
“What we have seen in Brazil is that rainforest protection is a highly political issue,” says Gerlein-Safdi of the University of Michigan. “With every change in government, laws can change very quickly, both for better or for worse.”
In some areas, the damage has been done. Efforts to build roads through the forest have opened up large swaths to exploitation. Satellite images of a new highway through the Amazon show how fast the land use changes from primary forest to agricultural land once logging companies and farmers gain access.
The maps are based on an analysis by University of Maryland geographers. The researchers compared satellite imagery over time to track forest changes on a global scale, and you can download the data here.
-
With mail-in ballots looking to be more common than ever this year, NYT’s The Upshot is tracking the mail:
The data here, covering more than 28 million pieces of first-class letters tracked by SnailWorks, shows how on-time delivery declined noticeably in July after the arrival of Louis DeJoy, the Trump-aligned postmaster general, and the start of policies to trim transportation costs. That drop in national performance was more abrupt than during the chaotic period when the coronavirus pandemic began spreading across the country.
“We had a wave of our members, hundreds and hundreds of locals, telling us there were service problems a month ago,” said Jim Sauber, the chief of staff for the National Association for Letter Carriers.
Hm.
I wonder what the distributions for each time frame look like. Even during non-pandemic times, it looks like a quarter of the mail is counted as late. And it’s at least a little bit comforting that we’re talking in units of days late rather than weeks or months.
-
The Washington Post provides another straightforward voting guide, based on where you live and how you plan to vote.
Election season is always interesting graphics-wise, because all of the news outlets are starting with the same data and information. But they all show the data a little differently, asking various questions or using different visual approaches.
Things are just getting started, but contrast this Post piece with FiveThirtyEight’s voting guide. The former zeros in on your voting scenario, whereas the latter still gives some space for the overall national view.
-
Reddit user WhiteCheeks used dot density to show population counts of various animals. Each dot represents an animal. So animals with lower counts show less obviously.
This is similar to the use of pixelation to show endangered species, which I think works better since the size of the dots above doesn’t encode anything.
-
The wind was blowing smoke and ash from wildfires further up north from where I live. The sky turned an eerie orange. I wondered about past fires and made the chart below.
-
The math behind wearing a mask can seem unintuitive at times. Minute Physics and Aatish Bhatia break it down in this illustrated video to show why wearing masks works:
The premise is that there’s a two-way effect with breathing in and breathing out. There are some assumptions here, but there’s an interactive component that lets you adjust the variables. They’ve also made the code available.
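The two-way effect can be sketched in a few lines of Python. This is a simplified model of my own, not necessarily the one in the video: it assumes people wear masks independently of each other and that filtering on the way out and on the way in multiply. The function name and the 50 percent values are illustrative.

```python
def transmission_factor(wear_rate: float,
                        exhale_filter: float,
                        inhale_filter: float) -> float:
    """Average relative transmission between two random people.

    wear_rate:      share of people wearing a mask (0 to 1)
    exhale_filter:  share of particles a mask blocks on the way out
    inhale_filter:  share of particles a mask blocks on the way in

    Assumption: mask wearing is independent between the two people,
    and the two filtering steps multiply.
    """
    # Expected share that gets past the (possibly masked) source.
    source_passes = 1 - wear_rate * exhale_filter
    # Expected share that gets past the (possibly masked) receiver.
    receiver_passes = 1 - wear_rate * inhale_filter
    return source_passes * receiver_passes


# Illustrative values: half of people wear masks that block half of particles.
factor = transmission_factor(0.5, 0.5, 0.5)
print(f"relative transmission: {factor:.4f}")
print(f"reduction: {1 - factor:.1%}")
```

Because the reduction applies once on the way out and again on the way in, the combined drop is bigger than either step alone.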
-
For The Pudding, Ilia Blinderman rounds out his three-part series on creating visual, data-driven essays. This last part is on the fuzziest task of telling stories:
Storytelling, however, is much more abstract — it’s not merely a technical matter of creating an image of a map, or designing the right chart; rather, it refers to the broader universe of considerations that impact nearly every decision you make in the way you frame and present a project. The focus is much less on the technical “how,” like in the first two installments of these guides, but on the “why” of designing the narrative. It certainly doesn’t help that technical tools are inherently more concrete: they’re ways of solving specific problems (e.g., “how do I show the locations where people are concentrated on a map?” or “how do get this visual element to move through this specific path?”), while storytelling is much more of a nebulous concept. Thus, in this guide, I’ll be focusing on the relevant questions and considerations that we, at The Pudding, tend to consider when creating data-driven projects.
-
Picking colors for your charts can be tricky, especially when you’re starting a palette from scratch. For Datawrapper, Lisa Charlotte Rost has been writing guides on color as it pertains to political parties, gender, and more recently, colorblindness. Rost put the pieces together for a single, more comprehensive guide on the subject.
Be sure to check out Rost’s other guides on making better charts. She has a knack for explaining visualization methods in a practical and concrete way.
-
As we have seen, small shifts in voting behavior of various demographic groups can swing an election. The Washington Post provides an interactive that lets you shift these groups by both turnout and vote margin to see what might happen (based on a simplified model).
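To see why small shifts matter, here is a toy Python sketch of the turnout-and-margin idea. It is not the Post's model. The two groups, their sizes, and their rates are entirely hypothetical, and it assumes a two-party race where margin means one party's share minus the other's among those who vote.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    eligible: int    # eligible voters in the group (hypothetical)
    turnout: float   # share of eligible voters who vote
    margin: float    # party A share minus party B share, from -1 to 1


def net_votes(groups: list[Group]) -> float:
    """Net votes for party A: sum of eligible * turnout * margin."""
    return sum(g.eligible * g.turnout * g.margin for g in groups)


def shifted(group: Group, turnout_shift: float = 0.0,
            margin_shift: float = 0.0) -> Group:
    """Return a copy of the group with turnout and margin nudged."""
    return Group(group.name, group.eligible,
                 group.turnout + turnout_shift,
                 group.margin + margin_shift)


# Hypothetical electorate, not real data.
group_x = Group("Group X", 1_000_000, turnout=0.60, margin=0.10)
group_y = Group("Group Y", 1_000_000, turnout=0.50, margin=-0.10)

baseline = net_votes([group_x, group_y])
# Raise Group Y turnout by 10 points and party A's lead disappears.
scenario = net_votes([group_x, shifted(group_y, turnout_shift=0.10)])

print(f"baseline net votes for A: {baseline:,.0f}")
print(f"after turnout shift:      {scenario:,.0f}")
```

A ten-point turnout change in one group is enough to erase the lead in this made-up example, which is the kind of sensitivity the interactive lets you explore.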