Mail-in ballots can be rejected if they’re not filled out or mailed correctly. A small percentage of them always are. This year, when we’re talking millions of mail-in ballots, even a small percentage means a lot of tossed ballots. For NYT’s The Upshot, Larry Buchanan and Alicia Parlapiano show how some states modified the design of their ballots to reduce the rejections.
-
Members Only
-
Google released a search trends dataset earlier this month. Using this dataset, Adam Pearce made an explorer to compare search volume over time:
The COVID-19 Search Trends symptoms dataset shows aggregated, anonymized trends in Google searches for more than 400 health symptoms, signs, and conditions, such as cough, fever and difficulty breathing. The dataset provides a time series for each region showing the relative volume of searches for each symptom.
Even if you’re not keen on analyzing Covid-19 data, this is likely a good time series source to at least bookmark for later.
-
The Marshall Project contrasted ad spending on Facebook by Trump’s campaign against Joe Biden’s:
Our analysis found that of the $82 million Trump’s reelection campaign has spent on Facebook ads this year, $6.6 million paid for ads about crime and policing—a top focus of his Facebook campaign. Almost all of it came since George Floyd was killed by police in Minneapolis in May. More than one-third of those ad buys were aimed at key battleground states and many sought to persuade specific undecided voters, and married women in particular. The Biden campaign? It didn’t spend a cent on criminal justice ads on Facebook until late August, choosing instead to focus on the COVID-19 pandemic and economic recovery. Yet Biden had, during the Democratic primaries, articulated a more progressive criminal justice platform than any of his party’s recent nominees.
-
For NYT Opinion, Gus Wezerek and Andrew Whitby look at what might happen if the census count ends on September 30:
Times Opinion predicted how many people would remain uncounted on Sept. 30, based on each state’s current response rate. Our analysis shows that those undercounts will cheat some states — mostly Republican — out of federal funding and one state out of a congressional seat.
-
When this all started, Covid-19 was impacting large cities at a much higher rate than everywhere else. This straightforward chart from NPR shows how the share of deaths in small and medium cities has made its way up to over half of all weekly Covid-19 deaths.
-
Companies are tracking what you do online. You know this. But it can be a challenge to know the extent, because the methods are hidden on purpose. So The Markup built Blacklight:
To investigate the pervasiveness of online tracking, The Markup spent 18 months building a one-of-a-kind free public tool that can be used to inspect websites for potential privacy violations in real time. Blacklight reveals the trackers loading on any site—including methods created to thwart privacy-protection tools or watch your every scroll and click.
We scanned more than 80,000 of the world’s most popular websites with Blacklight and found more than 5,000 were “fingerprinting” users, identifying them even if they block third-party cookies.
We also found more than 12,000 websites loaded scripts that watch and record all user interactions on a page—including scrolls and mouse movements. It’s called “session recording” and we found a higher prevalence of it than researchers had documented before.
Try it out here. Just enter a URL, and you’ll see a real-time count of the ad trackers, third-party cookies, cookie evaders, and keystroke recorders on any given site.
This is why I got rid of Google Analytics, social media widgets, and ad-serving snippets on FD years ago.
-
Adam Pearce and Ellen Jiang, for Google’s PAIR, explain how granular data can lead to easy identification of individuals and how randomization can help:
Aggregate statistics about private information are valuable, but can be risky to collect. We want researchers to be able to study things like the connection between demographics and health outcomes without revealing our entire medical history to our neighbors. The coin flipping technique in this article, called randomized response, makes it possible to safely study private information.
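To get a feel for the coin flipping idea, here is a minimal Python sketch of one common randomized response variant. This is an assumption on my part and not necessarily the exact scheme in the PAIR piece: each person flips a coin, answers truthfully on heads, and on tails reports the result of a second flip instead. The function names, the 30% true rate, and the population size are all made up for illustration.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Report an answer that hides the individual's true value."""
    # First coin: on heads, answer truthfully.
    if rng.random() < 0.5:
        return truth
    # On tails, ignore the truth and report a second coin flip.
    return rng.random() < 0.5


def estimate_true_rate(reports: list) -> float:
    """Recover the population rate from the noisy reports.

    Under this scheme, observed_yes = 0.5 * true_rate + 0.25,
    so true_rate = 2 * (observed_yes - 0.25).
    """
    observed = sum(reports) / len(reports)
    return 2 * (observed - 0.25)


# Hypothetical population where about 30% have the sensitive attribute.
rng = random.Random(0)
true_rate = 0.3
population = [rng.random() < true_rate for _ in range(100_000)]

# Each person reports through the coin-flip mechanism.
reports = [randomized_response(t, rng) for t in population]
estimate = estimate_true_rate(reports)
print(f"estimated rate: {estimate:.3f}")
```

Any single "yes" could have come from the second coin, so no individual answer reveals much, but with enough people the aggregate estimate should land close to the true rate.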
-
For NYT Opinion, Stuart A. Thompson and Yaryna Serkez mapped the predominant “climate threat” in each county:
This picture of climate threats uses data from Four Twenty Seven, a company that assesses climate risk for financial markets. The index measures future risks based on climate models and historical data. We selected the highest risk for each county to build our map and combined it with separate data from Four Twenty Seven on wildfire risks.
Got me thinking about Tim Meko’s maps of natural disasters.
-
Smoke from the wildfires made its way to the other side of the country and over the ocean. Using data from NOAA, Reuters animated the smoke clouds over time:
With climate change expected to exacerbate fires in the future, by worsening droughts and warming surface ocean temperatures, wildfire research is becoming especially important. Over the last year, the world has seen record fires in Australia, Brazil, Argentina, Siberia and now the U.S. West.
“I’m concerned that we are starting to see these phenomena more often … everywhere in the world,” Gassó said. “If it’s one year like this, it’s fine, as long as it doesn’t keep repeating itself like this.”
Uh oh.
-
For The Washington Post, Ashlyn Still and Kevin Schaul charted how long it took for primary ballots to be counted in each state. The times might give a hint of what we’re in for on election night:
Before the pandemic struck, mail-in states like California were already counting slowly. Then the coronavirus forced dozens of states to quickly expand absentee voting, and the slowdowns got more dramatic. These two trends — more absentee voting, not much time to prepare for it — could lead to some snail’s-paced race calls in November.
There are some nice details to note in this piece.
The inverted vertical axis and area fills focus on ballots left to count over time instead of ballots already in. The limited contrast keeps attention away from the white space under the lines.
The states move up to the top, and as the lines roll out (in the scrollytelling format), the speed is fixed, so states that took more time to count finish moving later.
And finally, the scrollytelling format helps highlight individual states at a time, and the small multiples at the end probably help satiate those who want to just see it all at once.
It’s a relatively straightforward dataset with multiple time series lines, but the choices make the patterns obvious.
-
An often painful yet necessary step in visualization is to get your data in the right format. Arquero, from the University of Washington Interactive Data Lab, aims to make this part of the process easier:
Arquero is a JavaScript library for query processing and transformation of array-backed data tables. Following the relational algebra and inspired by the design of dplyr, Arquero provides a fluent API for manipulating column-oriented data frames. Arquero supports a range of data transformation tasks, including filter, sample, aggregation, window, join, and reshaping operations.
Before working with JavaScript, I almost always end up in R or Python to get the data where it needs to be. I’m curious if this’ll help streamline the process, if just by a bit.
-
For your analytical perusal, Emil Hvitfeldt provides ten seasons’ worth of scripts from the Friends sitcom in an easy-to-use R package:
The goal of friends to provide the complete script transcription of the Friends sitcom. The data originates from the Character Mining repository which includes references to scientific explorations using this data. This package simply provides the data in tibble format instead of json files.
The ten seasons ran from 1994 to 2004. I’m suddenly feeling my age.
-
North Drinkware molded Half Dome in the bottom of a hand-blown pint glass using elevation data from the United States Geological Survey. Wow. [via @blprnt]
-
Bloomberg mapped tree loss between 2000 and 2019 in Brazil:
“What we have seen in Brazil is that rainforest protection is a highly political issue,” says Gerlein-Safdi of the University of Michigan. “With every change in government, laws can change very quickly, both for better or for worse.”
In some areas, the damage has been done. Efforts to build roads through the forest have opened up large swaths to exploitation. Satellite images of a new highway through the Amazon show how fast the land use changes from primary forest to agricultural land once logging companies and farmers gain access.
The maps are based on an analysis by University of Maryland geographers. The researchers compared satellite imagery over time to measure forest changes on a global scale, and you can download the data here.
-
With mail-in ballots looking to be more common than ever this year, NYT’s The Upshot is tracking the mail:
The data here, covering more than 28 million pieces of first-class letters tracked by SnailWorks, shows how on-time delivery declined noticeably in July after the arrival of Louis DeJoy, the Trump-aligned postmaster general, and the start of policies to trim transportation costs. That drop in national performance was more abrupt than during the chaotic period when the coronavirus pandemic began spreading across the country.
“We had a wave of our members, hundreds and hundreds of locals, telling us there were service problems a month ago,” said Jim Sauber, the chief of staff for the National Association for Letter Carriers.
Hm.
I wonder what the distributions for each time frame look like. Even during non-pandemic times, it looks like a quarter of the mail is counted as late. And it’s at least a little bit comforting that we’re talking in units of days late rather than weeks or months.
-
The Washington Post provides another straightforward voting guide, based on where you live and how you plan to vote.
Election season is always interesting graphics-wise, because all of the news outlets are starting with the same data and information. But they all show the data a little differently, asking various questions or using different visual approaches.
Things are just getting started, but contrast this Post piece with FiveThirtyEight’s voting guide. The former zeros in on your voting scenario, whereas the latter still gives some space for the overall national view.
-
Reddit user WhiteCheeks used dot density to show population counts of various animals. Each dot represents an animal, so animals with lower counts are less visible.
This is similar to the use of pixelation to show endangered species, which I think works better since the size of the dots above doesn’t encode anything.