Shot chart for Aug 26 2020 NBA playoffs

August 26, 2020

Topic

Chart Everything / Bucks, NBA

FDA commissioner corrects his misinterpretation of reduced mortality

August 26, 2020

Topic

Mistaken Data / absolute, coronavirus, FDA, relative

Talking about a possible plasma treatment for Covid-19, the Food and Drug Administration Commissioner Stephen Hahn misinterpreted results from the study. The study from the Mayo Clinic notes a possible 35% reduction in mortality rate, and Hahn said that if 100 people were sick with Covid-19, 35 lives would be saved.

For The Washington Post, Aaron Blake discusses why the interpretation is incorrect:

The vast majority of people who get the virus will recover with or without plasma. The 35 percent figure comes into play among those who die — a much smaller group. That would still be a huge development if borne out. But strictly speaking, the treatment would have saved about 3 out of 100 coronavirus patients, not 35. And given the smaller numbers we’re talking about, the finding is much closer to the margin of error — even as the preliminary study finds the effect to be statistically significant.

And even then, the claim doesn’t make sense. The data that he and Trump were referring to compared those receiving plasma treatments not to a control group, but between higher and lower levels of plasma treatments. The group with lower levels died at a rate of 11.9 people out of 100, while 8.7 percent died with higher levels.

Hahn later corrected himself.

See also Christopher Ingraham’s quick explanation of relative versus absolute risk. And this visual explainer from 2015 by NYT’s The Upshot should also be helpful in understanding the difference.
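
To make the relative-versus-absolute distinction concrete, here is a minimal Python sketch using the two mortality figures quoted above (11.9 and 8.7 deaths per 100). The function name is my own, for illustration only, and the comparison is between lower and higher plasma levels, not against a control group.

```python
def risk_reduction(baseline_rate, treated_rate):
    """Return (absolute, relative) risk reduction for two mortality rates.

    Both rates are fractions between 0 and 1.
    """
    absolute = baseline_rate - treated_rate
    relative = absolute / baseline_rate
    return absolute, relative


# Figures quoted above: 11.9% mortality with lower plasma levels,
# 8.7% mortality with higher levels.
absolute, relative = risk_reduction(0.119, 0.087)

print(f"Absolute reduction: {absolute * 100:.1f} deaths per 100 patients")
print(f"Relative reduction: {relative * 100:.0f}%")
```

With these inputs the absolute reduction is about 3 deaths per 100 patients, while the relative reduction is roughly 27 percent. The relative figure sounds much bigger because it is measured against the small group who die, not against everyone who gets sick.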

Data Underload / age

Redefining Old Age

What is old? When it comes to subjects like health care and retirement, we often think of old in fixed terms. But as people live longer, it’s worth changing the definition.

Optimizing a peanut butter and banana sandwich

August 25, 2020

Topic

Statistics / deep learning, optimization, sandwich

How do you assemble a banana and peanut butter sandwich that maximizes the number of bites with the perfect ratio of bread, peanut butter, and banana? Ethan Rosenthal, in a quest to work on something truly meaningless, solved the problem over several months with a truly roundabout solution:

So, how do we make optimal peanut butter and banana sandwiches? It’s really quite simple. You take a picture of your banana and bread, pass the image through a deep learning model to locate said items, do some nonlinear curve fitting to the banana, transform to polar coordinates and “slice” the banana along the fitted curve, turn those slices into elliptical polygons, and feed the polygons and bread “box” into a 2D nesting algorithm.

Best.

How long before there is gender equality in the U.S. House and Senate

August 25, 2020

Topic

Statistical Visualization / gender equality, government, Sergio Peçanha, Washington Post

For The Washington Post, Sergio Peçanha asks, “What will it take to achieve gender equality in American politics?”

It will take some more time and a lot more effort to reach equal representation. I asked my colleague David Byler, a statistics expert, to estimate how long it would take for women to reach equal numbers in Congress at the current pace. His estimate: about 60 years.

Racist housing policy from 1930s and present-day temperature highs

August 24, 2020

Topic

Maps / climate, housing, New York Times, racism, redlining

Brad Plumer and Nadja Popovich for The New York Times show how policies that marked black neighborhoods as “hazardous” for real estate investment led to a present day with fewer trees and higher temperatures. The maps, which shift back and forth between past districting and how things are now, show the picture clearly.

This goes hand-in-hand with how tree cover and neighborhood incomes are also tightly coupled.

Visits to businesses compared year-over-year in each state

August 24, 2020

Topic

Statistical Visualization / business, coronavirus, New York Times

Businesses are still seeing visits mostly down compared to last year, which shouldn’t be much of a surprise. But there is a lot of variation across the states. The New York Times shows the comparison over time, based on mobile location data (which I still feel uneasy about). NYT went with the scrollytelling state-by-state approach to work their way through the spaghetti plot.

Excess deaths, by race

August 21, 2020

Topic

Statistical Visualization / coronavirus, Marshall Project, race

It’s clear that Covid-19 has affected groups differently across the United States. By geography. By education level. By income. The Marshall Project breaks down excess deaths by race:

Earlier data on cases, hospitalizations and deaths revealed the especially heavy toll on Black, Hispanic and Native Americans, a disparity attributed to unequal access to health care and economic opportunities. But the increases in total deaths by race were not reported until now; nor was the disproportionate burden of the disease on Asian Americans.

With this new data, Asian Americans join Blacks and Hispanics among the hardest-hit communities, with deaths in each group up at least 30 percent this year compared with the average over the last five years, the analysis found. Deaths among Native Americans rose more than 20 percent, though that is probably a severe undercount because of a lack of data. Deaths among Whites were up 9 percent.

Difference charts are used to show deaths above (red) or below (turquoise) normal counts, but of course, it’s mostly red.

See the piece for an additional categorization by state.
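
For a sense of how the percentages and the chart colors relate, here is a rough Python sketch. It assumes the baseline is the average of the previous five years, as the quoted passage describes. The counts are hypothetical, not from the Marshall Project analysis, and the function name is made up for illustration.

```python
from statistics import mean


def excess_deaths(observed, prior_years):
    """Return (excess count, percent above baseline).

    The baseline is the mean of the counts in `prior_years`.
    """
    baseline = mean(prior_years)
    excess = observed - baseline
    return excess, 100 * excess / baseline


# Hypothetical counts for one group over the same period in each year.
prior_years = [980, 1010, 1000, 990, 1020]
observed = 1300

excess, pct = excess_deaths(observed, prior_years)

# Difference-chart coloring as described in the post:
# red above normal, turquoise at or below normal.
color = "red" if excess > 0 else "turquoise"
print(f"{excess:+.0f} deaths ({pct:+.0f}% vs. baseline), drawn in {color}")
```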

Analyzing the topics of cable TV news

August 21, 2020

Topic

Statistical Visualization / cable news, deep learning, Stanford, video

From the Computer Graphics Lab at Stanford University, the results from an analysis of a decade of cable news:

The Stanford TV News Analyzer has applied deep-learning-based image and audio analysis processing techniques to nearly a decade of 24–7 broadcasts from Fox News, CNN, and MSNBC going back to January 1, 2010. That’s over 270,000 hours of video updated daily. Computer vision is used to detect faces, identify public figures, and estimate characteristics such as gender to examine news coverage patterns. To facilitate topic analysis the transcripts are time-aligned with video content, and compared across dates, times of day and programs.

You can search for topics or people, combine queries, and set time ranges. Then you get a time series for how much someone’s face showed up or the number of times a word was used.

Give it a go.

Vote-by-mail volume compared against years past

August 20, 2020

Topic

Statistical Visualization / mail, Upshot, USPS

The volume of mail-in ballots will likely be higher than usual this year, but relative to the Postal Service’s typical volumes in years past, the bump doesn’t seem unfathomable. The chart above, from Quoctrung Bui and Margot Sanger-Katz for NYT’s The Upshot, plots average weekly volume over the years to show the scale.

Of course, if certain administrations continue to hamper USPS operations, that’s a different story.

Members Only

The Process 103 – End Result

August 20, 2020

Topic

The Process / iteration, questions, tools

Last month I did a short Q&A about FD and my workflow. I thought I’d elaborate on one of my answers.

Inference of key shape from the sound it makes in the lock

August 20, 2020

Topic

Statistics / keys, security, unlock

Researchers from the National University of Singapore found a way to infer key shape based on the sound the lock makes when you insert the key.

First they capture a sound recording with a standard microphone. Then they run the audio file through software to filter out the metallic clicks. This provides a time series from which they can infer likely keys.

Soundarya Ramesh presented the work at HotMobile 2020 in the talk below:

Oh to be back in graduate school again. [via kottke]

California wildfires map

August 19, 2020

Topic

Maps / California, Los Angeles Times, wildfire

The Los Angeles Times provides a California-specific map of the current wildfires so you can stay updated on what’s happening right now.

In the zoomed-out view, hexagons bin the individual fires and are colored by number of hotspots. Wavy hatching indicates levels of air pollution. In the zoomed-in view, you can see the individual fires and click for current status.

Fire and smoke map

August 19, 2020

Topic

Maps / smoke, wildfire

With the rush of wildfires in California, Governor Gavin Newsom declared (another) state of emergency. The Fire and Smoke Map from the U.S. Forest Service and Environmental Protection Agency provides a picture of where we’re currently at. The map incorporates data from a variety of sensors across the country:

The sensor data comes from PurpleAir, which crowdsources data from that company’s particle pollution sensors and shows the data on a map. Before the sensor data appear on the AirNow Fire and Smoke Map, EPA and USFS apply both a scientific correction equation to mitigate bias in the sensor data, and the NowCast, the algorithm to show the data in the context of the Air Quality Index.

Tracking who’s wearing masks correctly

August 19, 2020

Topic

Social Data Analysis / coronavirus, Los Angeles Times, mask

For The Los Angeles Times, Casey Miller went hyperlocal to track mask wearing in three locations in Los Angeles and Orange counties. Over a week, a group of reporters counted people who passed by and tallied whether each person wore a mask correctly, incorrectly, or not at all.

The above is the breakdown for a spot on Main Street in Huntington Beach.

Maybe the best part is that there’s a simple tool at the end so that you can count in your own spot:

If it weren’t so smoky outside, I’d give this a go.

Scale of the explosion in Beirut

August 19, 2020

Topic

Infographics / Beirut, explosion, Reuters, scale

There was an explosion in Beirut. It was big. How big? Marco Hernandez and Simon Scarr for Reuters provide a sense of scale:

George William Herbert, an adjunct professor at the Middlebury Institute of International Studies Center for Nonproliferation Studies and a missile and effects consultant, used two methods to estimate the yield of the explosion. One used visual evidence of the blast itself along with damage assessments. The other calculation was based on the amount of ammonium nitrate reportedly at the source of the explosion.

Both techniques estimate the yield as a few hundred tons of TNT equivalent, with the overlap being 200 to 300, Herbert told Reuters.

It starts with a Hellfire missile, which is 0.01 tons. Then it just keeps going.

Data-information-conspiracy

August 18, 2020

Topic

Infographics / conspiracy, humor

Seems about right. (Who made it?)

Census counting during the pandemic

August 18, 2020

Topic

Statistics / census, coronavirus, imputation, New York Times

Reporting for The New York Times, Giovanni Russonello looks at the decennial census during these times:

If households can’t be reached, even by enumerators, then census takers rely on a process known as imputation — that is, they use data from demographically similar respondents to take a best guess at what the missing data ought to say.

“This year I can imagine imputation being much higher, and that will itself be a source of controversy — because imputation involves assumptions,” Dr. Miller said. “No matter what you do at that point, you’re going to have a bunch of places around the country that are unhappy with the numbers, and are going to sue. So there’s going to be a lot of controversy around this.”

Where more imputation is needed, Dr. Miller said, the door opens a bit wider for statistical wrangling — and, potentially, more political influence.

In 2010, 74 percent of households responded. This year, with only about a month left, 63 percent have responded.

In a time when data is ubiquitous and affects so many things that we do, the census count grows more uncertain. Strange.
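
If imputation sounds abstract, here is a toy Python sketch of the general idea: fill a missing value by borrowing from a respondent who looks similar. It uses a simple "hot-deck" scheme, which is an assumption on my part and not a description of the Census Bureau's actual procedure. The field names and households are hypothetical.

```python
import random


def hot_deck_impute(households, key_fields, target, seed=0):
    """Fill missing `target` values from a random donor that matches on
    every field in `key_fields`. Returns new rows; inputs are not modified."""
    rng = random.Random(seed)

    # Group the known values by their matching characteristics.
    donors = {}
    for h in households:
        if h[target] is not None:
            key = tuple(h[k] for k in key_fields)
            donors.setdefault(key, []).append(h[target])

    result = []
    for h in households:
        row = dict(h)
        if row[target] is None:
            pool = donors.get(tuple(row[k] for k in key_fields))
            if pool:  # leave the gap if no similar respondent exists
                row[target] = rng.choice(pool)
                row["imputed"] = True
        result.append(row)
    return result


# Hypothetical households; `size` is the number of residents.
households = [
    {"id": 1, "tract": "A", "housing": "apartment", "size": 2},
    {"id": 2, "tract": "A", "housing": "apartment", "size": None},
    {"id": 3, "tract": "B", "housing": "house", "size": 4},
    {"id": 4, "tract": "B", "housing": "apartment", "size": None},
]

for row in hot_deck_impute(households, ["tract", "housing"], "size"):
    print(row)
```

Household 2 picks up a size from its only match, household 1. Household 4 has no match and stays empty. What counts as "similar" is exactly the kind of assumption Dr. Miller is talking about.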

Where schools are ready to reopen

August 18, 2020

Topic

Maps / coronavirus, education, New York Times, reopening, school

For NYT Opinion, Yaryna Serkez and Stuart A. Thompson estimated where we’re ready:

Our analysis considers two main things: the rate of new infections in a county and the county’s testing capabilities. We used guidelines from the Harvard Global Health Institute, which proposed a variety of ways to open schools as long as the county has fewer than 25 cases of Covid-19 per 100,000 people. We also used the World Health Organization’s proposal to open only if fewer than 5 percent of all those who are tested for the virus over a two-week period actually have it.

The second part matters because if a higher proportion of people are testing positive, it could mean that not enough tests are being conducted to adequately measure the spread.

As you might expect, based on these guidelines, reopening in some places and not others exposes disparities once you start breaking the results down by demographics.
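
The two-part check described in the quote can be sketched in a few lines of Python. The thresholds come straight from the quoted passage. The function name, the inputs, and the choice to treat a county with no tests as not ready are my own assumptions, and the quote does not say what time window the case count covers, so that is left to the caller. The county numbers are hypothetical.

```python
CASES_PER_100K_LIMIT = 25  # Harvard Global Health Institute guideline, as quoted
POSITIVITY_LIMIT = 0.05    # WHO proposal, as quoted (two-week testing window)


def ready_to_reopen(new_cases, population, positive_tests, total_tests):
    """Return True only if both the case-rate and positivity checks pass."""
    if total_tests == 0:
        # Assumption: with no testing data, spread cannot be measured.
        return False
    cases_per_100k = new_cases * 100_000 / population
    positivity = positive_tests / total_tests
    return cases_per_100k < CASES_PER_100K_LIMIT and positivity < POSITIVITY_LIMIT


# Hypothetical counties.
print(ready_to_reopen(40, 200_000, 300, 10_000))   # 20 per 100k, 3% positive
print(ready_to_reopen(40, 200_000, 800, 10_000))   # 20 per 100k, 8% positive
print(ready_to_reopen(100, 200_000, 300, 10_000))  # 50 per 100k, 3% positive
```

Only the first county passes. The second has a low case rate but fails on positivity, which is the situation the quote flags: a low count might just mean too few tests.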

Stock market mountains

August 17, 2020

Topic

Data Art / Michael Najjar, mountains, photoshop, stock market

After seeing stoxart, I was reminded of Michael Najjar’s project High Altitude from 2010-ish. He used photos he captured while climbing Mount Aconcagua, the highest mountain in the Americas, as the backdrop for stock data:

The series visualizes the development of the leading global stock market indices over the past 20-30 years. The virtual data mountains of the stock market charts are resublimated in the craggy materiality of the Argentinean mountainscape. Just like the indices, mountains too have their timeline, their own biography. The rock formations soaring skywards like so many layered folds of a palimpsest bear witness to the life history of the mountain – stone storehouses of deep time unmeasureable on any human scale. The immediate reality of nature thus becomes a virtual experience. Such experience of virtuality is strikingly exemplified by the global economic and financial system. If the focus used to be on the exchange of goods and commodities, it is now securely on the exchange of immaterial information.

The above is the price for Lehman Brothers from 1992 to 2008.
