-
Members Only
-
For Datawrapper, Lisa Charlotte Rost outlines the steps to prepare and clean your data in Excel or Google Spreadsheets. From the beginning:
When you download an Excel file, it often has multiple sheets. Our data set has three of them, as seen on the bottom: “Data”, “Metadata – Countries” and “Metadata – Indicators”. Look through all of your sheets and make sure you understand what you’re seeing there. Do the headers, file name and/or data itself indicates that you downloaded the right file? Are there footnotes? What do they tell you? Maybe that you’re dealing with lots of estimates? (Does that maybe mean that you need to look for other data?) If you don’t find notes in the data, make sure you look for them on the website of your source.
The guide is in the context of prepping your data to load into the Datawrapper tool, but the advice easily applies more generally.
-
Overview is an ongoing project that uses a zoomed-out view for a new perspective on the world:
Seeing the Earth from a great distance has been proven to stimulate awe, increase desire to collaborate, and foster long-term thinking. We aim to inspire these feelings — commonly referred to as the Overview Effect — through our imagery, products, and collaborations. By embracing the perspective that comes from this vantage point, we believe we can stimulate a new awareness that will lead to a better future for our one and only home.
Far away enough to see patterns. Close enough to stay connected to the parts.
-
For commuters, the farther away you live from the workplace, the earlier you have to leave your house to get to work on time. How much does that start time change the farther out you get?
-
On Multiple Views, the Interactions Lab talks about their experience as a design studio and how quickly implementations can change when you introduce real data into the system:
It’s easy to assume that the tools and approaches used for general software design apply equally to data visualization design. But data visualization design and interface design are often deeply and fundamentally distinct from one another. We learned this the hard way when we turned our research lab into a collaborative data visualization design studio for a few years. Data permeates visualization interfaces in ways that pose challenges at every stage of the design process. These challenges are even greater within large visualization teams. By reflecting on and articulating these challenges, we hope to inspire new, powerful data visualization design tools and communication processes.
Always start with real data. You’re wasting your time otherwise.
-
For the Tampa Bay Times, Tracey McManus and Eli Murray delve into the Church of Scientology's purchases of properties in Clearwater, Florida:
The Church of Scientology and companies run by its members spent $103 million over the past three years buying up vast sections of downtown Clearwater.
They now own most commercial property on every block within walking distance of the waterfront, putting the secretive church firmly in control of the area’s future.
Most of the sales have not previously been reported. The Tampa Bay Times discovered them by reviewing more than 1,000 deeds and business records, then interviewed more than 90 people to reconstruct the circumstances surrounding the transactions.
The lead-in scrollytelling through Clearwater is quite effective in laying the foundations of the story.
-
Microsoft just open sourced their data exploration tool known as SandDance:
For those unfamiliar with SandDance, it was introduced nearly four years ago as a system for exploring and presenting data using “unit visualizations.” Instead of aggregating data and showing the resulting sums as bar charts, SandDance shows every single row of a dataset (for datasets up to ~500K rows). It represents each of these rows as a mark that can be colored and organized into different areas on the screen. Thus, bar charts are made of their constituent units, stacked, or sorted.
Nice. I hadn’t heard about SandDance until now, but I’m saving it for later. You can grab the source on GitHub.
-
When it comes to meaningful visualization, context is everything. Richard Brath, at the 2018 Information+ Conference, looks back on historical visualization approaches and how they might be applied today to make data graphics easier to read and use.
-
One person’s long commute is another’s dream. Another person’s normal might be someone else’s nightmare. What counts as a long commute depends on where you live.
-
A study found that a hospital program significantly reduced the number of hospitalizations and emergency department visits. Great. But then the researchers realized that the data was recoded incorrectly, and the program actually increased hospitalizations and emergency department visits. Not so great.
The identified programming error was in a file used for preparation of the analytic data sets for statistical analysis and occurred while the variable referring to the study “arm” (ie, group) assignment was recoded. The purpose of the recoding was to change the randomization assignment variable format of “1, 2” to a binary format of “0, 1.” However, the assignment was made incorrectly and resulted in a reversed coding of the study groups. Even though the data analyst created and conducted some test analysis programs, they were of the type that did not show any labeling of the arm categories, only the “arm” variable in a regression, for example.
Here’s the original, now-retracted study. And here’s the revised one.
Data can be tricky and could lead to unintended consequences if you don’t handle it correctly. Be careful out there.
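As a rough illustration of the recoding described above, here is a minimal Python sketch of one way to recode a "1, 2" arm variable to "0, 1" while checking that the labels survive. Which original code stood for which arm is not stated in the excerpt, so the codebook below is an assumption, and every name here is hypothetical.

```python
from collections import Counter

# Assumed codebook (not from the study): 1 = intervention, 2 = control.
ORIGINAL_LABELS = {1: "intervention", 2: "control"}
# Intended binary coding: 1 = intervention, 0 = control.
RECODE = {1: 1, 2: 0}
BINARY_LABELS = {1: "intervention", 0: "control"}


def recode_arm(values):
    """Recode arm values and verify each row keeps the same label."""
    recoded = [RECODE[v] for v in values]  # raises KeyError on an unexpected code
    for old, new in zip(values, recoded):
        # A reversed mapping fails here instead of passing silently.
        if ORIGINAL_LABELS[old] != BINARY_LABELS[new]:
            raise ValueError(f"label changed for code {old} -> {new}")
    return recoded


arm = [1, 2, 2, 1, 1]
arm_binary = recode_arm(arm)

# Cross-tabulate old against new codes so the mapping is visible, labels included.
for (old, new), n in sorted(Counter(zip(arm, arm_binary)).items()):
    print(ORIGINAL_LABELS[old], old, "->", new, BINARY_LABELS[new], n)
```

Under this assumed codebook, a shortcut like subtracting 1 from every value would turn intervention into 0 and control into 1, which is the kind of reversal a label check is meant to catch.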
-
FiveThirtyEight has been predicting NBA games for a few years now, based on a variant of Elo ratings, which in turn have roots in ranking chess players. But for this season, they have a new metric to predict with called RAPTOR, or Robust Algorithm (using) Player Tracking (and) On/Off Ratings:
NBA teams highly value floor spacing, defense and shot creation, and they place relatively little value on traditional big-man skills. RAPTOR likewise values these things — not because we made any deliberate attempt to design the system that way but because the importance of those skills emerges naturally from the data. RAPTOR thinks ball-dominant players such as James Harden and Steph Curry are phenomenally good. It highly values two-way wings such as Kawhi Leonard and Paul George. It can have a love-hate relationship with centers, who are sometimes overvalued in other statistical systems. But it appreciates modern centers such as Nikola Jokić and Joel Embiid, as well as defensive stalwarts like Rudy Gobert.
I’ve mostly ignored sports-related predictions ever since the Golden State Warriors lost in the 2016 finals. There was a high probability that they would win it all, but they did not. That’s when I realized the predictions would only lead to a neutral confirmation or severe disappointment, but never happiness.
I’m sure this new metric will be different.
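For anyone curious about the Elo ratings mentioned above, here is a minimal Python sketch of the standard chess-style update. It is not FiveThirtyEight's NBA variant, which the excerpt does not describe, and the K-factor of 20 is an arbitrary assumption for illustration.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expected score for A against B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=20.0):
    """Return new (A, B) ratings; score_a is 1 for an A win, 0 for a loss."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating stays constant.
    return rating_a + delta, rating_b - delta


new_a, new_b = update_ratings(1500.0, 1500.0, score_a=1)
print(new_a, new_b)
```

An upset moves ratings more than an expected result, because the gap between the actual score and the expected score is larger.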
-
For The Washington Post, Lauren Tierney and Joe Fox mapped fall foliage colors across the United States:
Forested areas in the United States host a variety of tree species. The evergreens shed leaves gradually, as promised in their name. The leaves of deciduous varieties change from green to yellow, orange or red before letting go entirely. Using USDA forest species data, we mapped the thickets of fall colors you may encounter in the densely wooded parts of the country.
Nice. Be sure to click through to the full story to see leaf profiles and an animation of the changing colors as fall arrives.
-
Here in northern California, PG&E is shutting off power to thousands of households in an effort to prevent wildfires. Luckily, the area I live in is just outside of the shutoff areas, but for others, a map of what’s up would be useful, right?
However, instead of a map, which is “temporarily unavailable” at the time of this writing, PG&E is providing shapefiles. I mean, that’s kind of nice for people who like to make maps, but it’s not so great for the rest. There’s a metaphor in there somewhere.
At least you can keep track with the San Francisco Chronicle:
-
A quick annotation by Jonnie Hallman on Twitter: “GitHub is really good at visualizing burnout.”
-
As discussed previously, the “impeach this” map has some issues. Mainly, it equates land area to votes, which makes for a lot of visual attention to counties that are big even though not many people live in them. So, Karim Douïeb used a clever transition to change the bivariate map to a cartogram. Now you can have a dual view.
-
Members Only
I already covered how to make animated heatmaps in R, but in this tutorial you learn more about customizing the animation itself. Notice that in the animated GIF above, there is a pause in the middle to indicate a changing point and a pause at the end to show the most recent difference.
If you used the animation package in R, you could string together the images to make an animation, but you wouldn’t be able to create those pauses in between.
So instead, I created the images in R and then used the ImageMagick command line to string the images together, with the pause in between. For animated GIFs, the animation package uses ImageMagick. So going direct isn’t that big of a jump, and it gets you more flexibility.
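To give a sense of what stringing frames together with pauses can look like, here is a minimal Python sketch that only builds an ImageMagick command. It is not the tutorial's actual command. The frame file names, pause positions, and delay values are made up, and it assumes the legacy `convert` entry point (ImageMagick 7 installs typically use `magick` instead). The `-delay` setting is in hundredths of a second and applies to the images read after it.

```python
def build_gif_command(frames, pause_at, delay=10, pause_delay=200,
                      output="animation.gif"):
    """Return an ImageMagick argument list that holds selected frames longer."""
    args = ["convert", "-loop", "0"]  # -loop 0 repeats the animation forever
    for i, frame in enumerate(frames):
        # Set the delay right before each frame so pauses apply only where wanted.
        ticks = pause_delay if i in pause_at else delay
        args += ["-delay", str(ticks), frame]
    args.append(output)
    return args


# Hypothetical frames exported from R, with a pause mid-way and at the end.
frames = [f"frame{i:02d}.png" for i in range(5)]
cmd = build_gif_command(frames, pause_at={2, 4})
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run it, provided ImageMagick is installed and the frame files exist.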
-
Kelly Martin died of cancer on September 30. She was able to enjoy her final days at home, and as she knew the end was near, she kept track of her drug doses in a dashboard:
Brain tumors are unpredictable. I don’t want my last days with a personality that isn’t mine. I wanted to laugh, to enjoy the days, and fart around in the garden as much as possible. We added in a variety of medications to use as needed to manage symptoms and tracked what worked and what didn’t in a Tableau dashboard. It was the only way to see the patterns and to get more good days.
From Bridget Cogley, Martin’s friend who took over the writing as Martin grew too ill:
63% of Canadians with a terminal illness want to die at home. Only about 15% do. Kelly Martin died on September 30, 2019 in her home with her son and me (Bridget) at her side and her mother on the phone. A true honor she gifted us knowingly. We used this dashboard to provide care and communicate with providers. It was crafted in a couple of hours, edited with Kelly’s feedback, and used to provide a better death. Seeing the data can truly be life-changing.
I… just. Wow.
-
David Leonhardt, for The New York Times, discusses the relatively low tax rates for the country’s 400 wealthiest households. The accompanying animated line chart by Stuart A. Thompson shows how the rates have been dropping over the years. They are now “below the rates for almost everyone else.” Oh.