-
Members Only
-
NPR put together a set of stories, videos, and interactives about bringing more joy into your life, which of course is always welcome.
-
FlowingData turned 14 years old last week. Is that old? It feels old.
The site started as a sandbox for class projects. Flat HTML files. JPEG files. Google Maps placemarkers. Flash. Vanilla JavaScript.
As I studied from across the country, it turned into a place to share links with classmates. Did you see that project on Infosthetics? How did Stamen make that map? These big infographics are getting out of hand.
I experimented. To my surprise and delight, stuff I made traversed the internets. Some work landed in my friends’ and family’s feeds through roundabout routes. I learned how visualization could reach a lot of people and get them excited about data.
Statistics grew out of that required course that everyone hated. Data also grew. It got big. It became a science.
Visualization grew with the data. Once thought of as just an analytical tool (to some), it developed into a medium for communication, expression, and storytelling.
As I finished my PhD, thinking about my future, I took job interviews. I think as the interviewee, you’re supposed to try to impress the interviewers. But deep down, it was the other way around for me. I was looking for someone to convince me that what they had to offer was better than running FlowingData. I didn’t find anything.
So, here I am, 6,243 posts, guides, tutorials, links, and projects later. Sheesh.
Thanks for reading. Thank you to supporting members. If you’re not a member yet and you’d like to keep the data flowing, I’d of course appreciate your support. I’m hoping to do this for many more years.
-
Introduction to Modern Statistics by Mine Cetinkaya-Rundel and Johanna Hardin is a free-to-download book:
Introduction to Modern Statistics is a re-imagining of a previous title, Introduction to Statistics with Randomization and Simulation book. The new book puts a heavy emphasis on exploratory data analysis (specifically exploring multivariate relationships using visualization, summarization, and descriptive models) and provides a thorough discussion of simulation-based inference using randomization and bootstrapping, followed by a presentation of the related Central Limit Theorem based approaches.
Read it in the browser or buy a print version. A good deal either way.
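For a rough feel of the bootstrapping the book description mentions, here is a minimal sketch in Python. It is not taken from the book: the function name, the made-up sample, and the choice of a percentile interval for the mean are all assumptions for illustration.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the mean (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample with replacement, same size as the original sample,
    # and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = (1 - confidence) / 2
    lo = means[int(alpha * n_resamples)]
    hi = means[int((1 - alpha) * n_resamples) - 1]
    return lo, hi


# Made-up measurements, only to have something to resample.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.3]
lo, hi = bootstrap_mean_ci(sample)
print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"95% bootstrap interval: ({lo:.2f}, {hi:.2f})")
```

The idea is that the spread of the resampled means stands in for the sampling variability you would otherwise get from a formula based on the Central Limit Theorem.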
-
It’s been hot in the Pacific Northwest the past few days. NYT’s The Upshot plotted the temperatures against previous max temperatures since 1979. Hot.
-
Based on satellite imagery, Erin Davis found the average color of places around the world. One map shows the United States by county, and Davis also made maps by country, which are a mix of greens, browns, and yellows.
See also the NYT piece from 2020, which framed color by political leaning.
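As a rough illustration of what "average color" can mean, here is a small Python sketch that takes the per-channel mean of a handful of RGB pixels. This is an assumption about the general idea, not Davis's actual method, and the pixel values and function names are made up.

```python
def average_color(pixels):
    """Return the per-channel mean of (r, g, b) pixels as an (r, g, b) tuple."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("need at least one pixel")
    n = len(pixels)
    # Average each channel separately, then round to the nearest integer.
    return tuple(round(sum(p[c] for p in pixels) / n) for c in range(3))


def to_hex(rgb):
    """Format an (r, g, b) tuple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical pixels standing in for the imagery inside one county.
county_pixels = [(34, 85, 40), (60, 110, 52), (120, 100, 70), (90, 96, 62)]
print(to_hex(average_color(county_pixels)))
```

With real imagery you would first mask the pixels that fall inside each boundary, then average only those.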
-
Postmaster General Louis DeJoy proposed new standards for first-class mail, which would lengthen the time it takes for you to receive a letter. The Washington Post made an interactive (paywall) to see how the plan would change delivery times from your ZIP code.
-
To see all the matches from the group stage of Euro 2020 in one chart, Krist Wongsuphasawat used a streamgraph showing aggregate scorelines from kickoff to finish. All matches start at 0-0, and the team that scores first, colored in blue, almost always wins.
The percentage of comeback wins surprises me, as someone who knows almost nothing about soccer.
-
ProPublica continues their analysis of an anonymous IRS tax records dump. In their most recent piece, they look at how Peter Thiel uses a Roth IRA to avoid taxes on billions.
In the second half of the piece, a time series chart shows the growth of Thiel’s account versus a standard maxed-out account. The data progresses as you scroll, which moves the article forward, until it fills the whole window. Nice.
-
Using data collected by Johns Hopkins University, Michelle McGhee and Will Chase for Axios provide a visual reference for the billing practices of for-profit hospitals:
Rising deductibles and out-of-pocket costs are increasingly leaving patients responsible for bloated medical bills. A new analysis by Johns Hopkins University reveals that many of the top 100 hospitals by revenue in the U.S. use predatory tactics to pursue patients with unpaid bills.
-
The New York Times mapped birth rates, which are down almost everywhere, especially among women in their 20s:
The result has been the slowest growth of the American population since the 1930s, and a profound change in American motherhood. Women under 30 have become much less likely to have children. Since 2007, the birthrate for women in their 20s has fallen by 28 percent, and the biggest recent declines have been among unmarried women. The only age groups in which birthrates rose over that period were women in their 30s and 40s — but even those began to decline over the past three years.
-
Bloomberg used a Sankey diagram to show the path of over a thousand voting bills, classifying them as restrictive, mixed effect, or expansive:
Across the country, Republican state lawmakers proposed more than 300 bills this year to restrict voting and dozens more that would restrict in some ways and expand in others. But the broadest measures either stalled or were scaled back.
-
Prasanta Kumar Dutta and Manas Mishra report for Reuters on the slow rollout of Covid-19 vaccinations in India:
Compared to many Western countries, India was late in procuring vaccines. Modi’s government placed the first advance order for an unapproved vaccine only this month, after being criticised for being slow. Countries including the United States and Britain signed orders last year.
-
It’s hot here in the western United States, and it’s only mid-June. From The Washington Post, we’re stuck in a heat dome:
Hot air masses expand vertically into the atmosphere, creating a dome of high pressure that diverts weather systems around them. One way to gauge the magnitude of a heat wave is to measure the height of the typical halfway point of the atmosphere — at the 500 millibar pressure level. For this pressure level to stretch to heights of 600 dekameters, or 19,685 feet, is quite rare, but that marker was forecast for this week, and it was indeed reached in Flagstaff, Ariz., on Tuesday.
Splendid.
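If you want to check the unit conversion in that excerpt, a quick Python sketch does it. The constant and function names are mine; the only facts used are that a dekameter is 10 meters and an international foot is 0.3048 meters.

```python
METERS_PER_DEKAMETER = 10
METERS_PER_FOOT = 0.3048  # exact by definition of the international foot


def dekameters_to_feet(dam):
    """Convert a height in dekameters to feet."""
    return dam * METERS_PER_DEKAMETER / METERS_PER_FOOT


print(round(dekameters_to_feet(600)))  # 19685
```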
-
To measure drought in the present day, we use data from sensors that constantly record environmental conditions, such as soil moisture, precipitation, and snow water content. But to measure drought thousands of years ago, researchers can use tree rings. Alvin Chang for The Guardian shows how the researchers line up old rings to gather historical data and then do that across a region.
-
Michael Friendly and Howard Wainer have a new book out: A History of Data Visualization and Graphic Communication. They rewind 400 years and discuss the beginnings of visualization, when nobody knew what a chart was. Putting this in my queue and hoping it’s back in stock soon.
Visualization still seems like a relatively new thing. It’s old.
-
While it is often easy, and tempting, to write a scraper as a dirty one-off script, spatula makes an attempt to provide an easy framework that most scrapers fit within without additional overhead.
This reflects the reality that many scraper projects start small but grow quickly, so reaching for a heavyweight tool from the start often does not seem practical.
The initial overhead imposed by the framework should be as light as possible, providing benefits even for authors that do not wish to use every feature available to them.
Although, without my dirty one-off scripts, what will I put in my tmp data folder?
-
How to Make Alluvial Diagrams
Here’s how to do it in R from start to finish, plus editing in illustration software. Make design choices and trade-offs for more readable charts.