  • June 29, 2022

    Topic: Site News

    This past weekend marked 15 years since I first posted on FlowingData. What started as a placeholder for class projects became a hobby, which eventually turned into a career choice.

    With each year that passes, running an independent site, on data visualization of all things, seems less common. Many of my favorite data and visualization sites from years past are dead links now or are frozen in time.

    It’s for a good reason though: There are a lot of opportunities these days for people who know how to visualize data. The field is more established than it was 15 years ago.

    So, sometimes it feels weird out here in my little corner of the internet. But I’m glad that I’ve been able to do this for this long and still get to do it, all while enjoying the process. I get to see things develop and be a part of the growth, in my own quiet, introverted way.

    Thank you for reading. Thank you to past and present members who support FlowingData. Not a member? Check out the perks for keeping this fully member-supported site flowing.

    Alright, back to the data. Some fun things are headed down the pipeline.

  • June 29, 2022

    Visualising Knowledge is an open book from PBL Netherlands Environmental Assessment Agency, based on 25 years of making charts:

    PBL data visualisation is about visualising research results, using graphs, maps, diagrams and infographics. Over the years, the variety in types of visualisation formats has greatly increased. In addition, visualisations have to be presented in an increasing number of different media: from figures in reports to interactive visualisations that are easy to read on smartphones and tablets.

    The book ‘Visualising knowledge’ provides insight into how visualisations at PBL are created in a process of close collaboration between visualisation experts, researchers and communication experts, always keeping in mind both the medium and the target audience.

    The original version is in Dutch, and they just published an English version. Download either version here.

  • June 28, 2022

    Topic: Statistics

    An Introduction to Statistical Learning, by Gareth James, Daniela Witten, Trevor Hastie, and Rob Tibshirani:

    As the scale and scope of data collection continue to increase across virtually all fields, statistical learning has become a critical toolkit for anyone who wishes to understand data. An Introduction to Statistical Learning provides a broad and less technical treatment of key topics in statistical learning. Each chapter includes an R lab. This book is appropriate for anyone who wishes to use contemporary tools for data analysis.

    The PDF version of the book is free to download. There’s also a free online course companion.

  • June 28, 2022

    Felix Krause tracks many metrics of his life, both manually and passively, and stores the data in one database. He put up a subset of the data on an updating site that shows where he is, what he’s eaten, how he’s feeling, the time he spent on the computer, and plenty more. After three years, he concluded it was not worth the time:

    Overall, having spent a significant amount of time building this project, scaling it up to the size it’s at now, as well as analysing the data, the main conclusion is that it is not worth building your own solution, and investing this much time. When I first started building this project 3 years ago, I expected to learn way more surprising and interesting facts. There were some, and it’s super interesting to look through those graphs, however retrospectively, it did not justify the hundreds of hours I invested in this project.

    It’s interesting to see people independently come to this conclusion over the years. With the quantified self stuff, people often expect that collecting data about their activities and behaviors will result in rich, unexpected insights. But unless you’re actively trying to answer a question or working towards a milestone, usually you won’t get much out of the collection process.

    The same goes for “let the data speak” in visualization. The data doesn’t speak on its own. You have to actively look at and translate it.

    But personal data collection as a form of reflection or journaling? That’s a different story.

  • June 27, 2022

    For The New York Times, Larry Buchanan and Lauren Leatherby used Sankey diagrams to show how active shootings in the United States ended:

    Most attacks captured in the data were already over before law enforcement arrived. People at the scene did intervene, sometimes shooting the attackers, but typically physically subduing them. But in about half of all cases, the attackers committed suicide or simply stopped shooting and fled.

  • June 24, 2022

    Topic: Maps

    With Roe vs. Wade in place, there were areas in the United States where a woman had to travel farther than others to get to the nearest clinic. With Roe vs. Wade overturned, the geography will change as states enforce bans. For NYT’s The Upshot, Quoctrung Bui, Claire Cain Miller and Margot Sanger-Katz mapped what will likely happen.

  • When Americans Had Intercourse with Opposite Sex for the First Time

    The National Survey of Family Growth, run by the Centers for Disease Control and Prevention, asks participants about their birth and relationship history.

  • Members Only
    June 23, 2022

    Topic: The Process

    Trading optimized visual efficiency in charts for joy and interest.

  • June 23, 2022

    In 1692, artist A. Boogert published a guide to watercolors, showing the thousands of possibilities of mixing 31 shades. Nicholas Rougeux, as per his specialty, modernized the work into an interactive diagram.

  • June 22, 2022

    While geographic boundaries can often seem like a semi-static thing, they’ve changed a lot when you look at them on the scale of centuries. Point in History, by Hans Hack, presents a map of what boundaries used to be. Click anywhere to see the history.

    For example, select the United States, and you see the country’s past boundaries, but then it keeps going back in time to the BC years of hunter-gatherers.

    The map is based on the historical basemaps project, which you can access here.

  • June 21, 2022

    John Rich made pie charts of dog body proportions. This is very important.

  • Members Only

    How to Make an Animated Donut Chart in R

    There are “better” ways to show proportions over time, but sometimes you just want an animated donut.

  • June 20, 2022

    For The New York Times, Pablo Robles, Anton Troianovski, and Agnes Chang mapped the change in destinations for Russian private jets, before and after sanctions. Before, it was more about Paris, Milan, and Geneva. After, Dubai became a top destination.

    I like the charts after the map. A slope chart with a white fill provides contrast and a flight departures board gives a little something extra.

  • June 17, 2022

    Topic: Maps

    Thousands of smaller airplanes are still allowed to use leaded fuel, which can lead to unwanted emissions around airports. For Quartz, David Yanofsky and Michael J. Coren mapped flight activity for such planes against schools, parks, and playgrounds:

    These maps illustrate where initial emissions are likely to be highest. Because lead pollution disperses with the wind, anyone within a 1.5 km radius of the runways may be exposed over the long term. But essentially three factors dictate the amount of lead exposure: the volume of air traffic (and thus lead emissions), one’s proximity to the airport, and the prevailing winds. The worst-case scenario for residents? Living alongside a busy airport, downwind of the runway. Often it’s lower-income families living in these areas. To determine individual lead risks, more detailed studies, such as the one at Reid-Hillview, would be needed.

    Use the search to look up activity for airports in your area.

  • Members Only
    June 16, 2022

    Topic: The Process

    They serve as a point of reference in some charts and guide the eyes in others, coming in different styles and layouts.

  • June 16, 2022

    Tonight is game six of the NBA Finals. If the Golden State Warriors beat the Boston Celtics, the Warriors win it all and the season is done. So we almost went an entire playoffs without a cumulative multi-line chart that shows current and notable players. Luckily, NYT’s The Upshot got it done with cumulative three-pointers in career playoff games. That was close.

  • June 16, 2022

    Wayne Oldford, a statistics professor at the University of Waterloo, explains risk in the context of daily life at the individual level, because “one in a million” is not especially intuitive:

    A few years ago, I was the “go to guy” at the University of Waterloo, asked to speak to local media, whenever a lottery jackpot got stupendously large (and the news cycle got exceedingly slow). My purpose was to relate to their audience the size of the chance of winning in a way that was quick yet comprehensible, which I did with some success on local radio and television stations.

    Inevitably, though, the next day I would hear back of listener disappointment – that some of the fun of purchasing a ticket had been removed. Joy came from anticipating winning the prize and my exposition killed that for many, by them having gained an appreciation of the chance of actually winning.

    I felt a little bit bad about this. I wanted people to understand the probabilities but I didn’t want to be a kill joy.

    Important reading if you’re trying to understand the odds of things these days.

    My favorite explanation of risk in the day-to-day is still the one from David Spiegelhalter.

  • June 15, 2022

    Hands-On Data Visualization, by Jack Dougherty and Ilya Ilyankou, is an open-access book geared toward beginners. The book starts with spreadsheets, and then walks you through some of the more high-level JavaScript libraries to put things online relatively quickly. If you don’t have programming experience but want to kick the tires, it’s probably worth saving this for later.

    You can also grab a physical copy.

  • June 15, 2022

    The Marshall Project and Axios report that the FBI changed its reporting system last year, and nearly 40 percent of law enforcement agencies didn’t submit any data:

    In 2021, the FBI retired its nearly century-old national crime data collection program, the Summary Reporting System used by the Uniform Crime Reporting (UCR) program. The agency switched to a new system, the National Incident-Based Reporting System (NIBRS), which gathers more specific information on each incident. Even though the FBI announced the transition years ago and the federal government spent hundreds of millions of dollars to help local police make the switch, about 7,000 of the nation’s 18,000 law enforcement agencies did not successfully send crime data to the voluntary program last year.

    I am sure policymakers will definitely be very responsible and cite data appropriately and not cherrypick from incomplete data to push an agenda.

  • June 14, 2022

    Christophe Coupé and company analyzed speech rate (on the left) across different languages, and then compared it to information rate (on the right) in bits per second. While speech rate and information rate are still coupled, there’s less variation in information rate across languages. More syllables don’t necessarily mean more information.
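
    A rough sketch of the arithmetic, assuming information rate is speech rate (syllables per second) multiplied by the average information carried per syllable (bits per syllable). The function name and both sets of numbers are made up for illustration. They are not figures from the study.

```python
def information_rate(syllables_per_second: float, bits_per_syllable: float) -> float:
    """Return information rate in bits per second.

    Assumes rate = speech rate * average information per syllable.
    """
    return syllables_per_second * bits_per_syllable


# Two hypothetical languages with invented values (not from the study):
# one is spoken quickly but packs less into each syllable,
# the other is spoken slowly but packs more into each syllable.
fast_sparse = information_rate(syllables_per_second=8.0, bits_per_syllable=5.0)
slow_dense = information_rate(syllables_per_second=5.0, bits_per_syllable=8.0)

print(fast_sparse, slow_dense)
```

    In this toy case the speech rates differ a lot, but the information rates come out the same, which is the general pattern the post describes.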