  • Flight map shows firefighting efforts

    February 7, 2025

    To contain the fires in Los Angeles, aircraft flew back and forth to drop retardant and survey the area for several days. Peter Atwood used an animated map to show 24 hours of activity, totaling over 15,000 flight miles.

    Atwood used wildfire data from NASA, the ArcGIS Living Atlas for terrain, and FlightAware data for the flights. The neon aesthetic highlights the patterns and urgency of each aircraft’s travels.

  • Federal worker resignations

    February 7, 2025

    According to the U.S. Office of Personnel Management, about 65,000 federal workers have taken the resignation offer. The New York Times puts that number into context, given the size of the federal workforce.

    In other words, the federal government is an enormous work force that already experiences sizable turnover every year. In addition to workers who leave the government to retire or simply to quit, about another 50,000 to 60,000 are terminated every year for disciplinary or performance reasons, or because their appointments or funds expired. A small number — around 3,400 — die each year while employed by the government. All these departures are typically replaced by about 240,000 hires each year.

    While the resignation count might seem large, the denominator is a lot bigger.
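
    To make the comparison concrete, here is a small sketch that uses only the figures quoted above. The post does not give the total size of the federal workforce, so that denominator is left out rather than guessed; the `share` helper is just a name I picked for illustration.

```python
# Figures quoted in the post above.
resignation_offers_taken = 65_000
typical_annual_hires = 240_000
annual_terminations_range = (50_000, 60_000)
annual_deaths = 3_400

def share(part: float, whole: float) -> float:
    """Return part as a fraction of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole

# Resignations so far, relative to a typical year of hiring.
vs_hires = share(resignation_offers_taken, typical_annual_hires)
print(f"Resignations vs. typical annual hires: {vs_hires:.1%}")

# Resignations relative to the usual number of terminations per year.
low, high = annual_terminations_range
print(f"Resignations vs. terminations: {share(resignation_offers_taken, high):.2f}x "
      f"to {share(resignation_offers_taken, low):.2f}x")
```

    Swap in a total workforce count as the `whole` argument to get the share the post is really pointing at.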

  • Archiving effort to preserve Data.gov

    February 6, 2025

    The Harvard Law School Library Innovation Lab is archiving Data.gov and making the data easy to download. So far, they have a collection of 311,000 datasets:

    This is the first release in our new data vault project to preserve and authenticate vital public datasets for academic research, policymaking, and public use.

    We’ve built this project on our long-standing commitment to preserving government records and making public information available to everyone. Libraries play an essential role in safeguarding the integrity of digital information. By preserving detailed metadata and establishing digital signatures for authenticity and provenance, we make it easier for researchers and the public to cite and access the information they need over time.

    You can download the daily archive here.

    They also open-sourced the software so that others can build similar collections. Great.

  • Partial Data Reflections

    February 6, 2025

    This week, I have a new tutorial for you and then we get into using data with baggage.

  • Common four-digit PINs of others

    February 6, 2025

    About 1 in 10 people use the same four-digit PIN, based on an analysis of Have I Been Pwned? data by Julian Fell and Teresa Tan for ABC News:

    Even though there are 10,000 possible combinations, when humans get involved that equation changes dramatically.

    If someone wants to unlock a stolen phone – or retrieve money from an ATM – and only have five guesses, this data suggests they still have a one-in-eight chance of guessing correctly.

    Scrolling through the heatmap of PINs, which shows the first two digits on the vertical axis and the last two digits on the horizontal, drives the point home. Maybe stay away from the diagonal and horizontal lines.
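
    As a rough sketch of how a grid like this can be laid out, each PIN splits into a row (first two digits) and a column (last two digits). The counts below are made up for illustration and are not from the ABC News analysis, and the function names are my own.

```python
from collections import Counter

def pin_to_cell(pin: str) -> tuple[int, int]:
    """Map a four-digit PIN to (row, col): first two digits, last two digits."""
    if len(pin) != 4 or not pin.isdigit():
        raise ValueError(f"not a four-digit PIN: {pin!r}")
    return int(pin[:2]), int(pin[2:])

def build_grid(counts: Counter) -> list[list[int]]:
    """Fill a 100x100 grid of counts, one cell per possible PIN."""
    grid = [[0] * 100 for _ in range(100)]
    for pin, n in counts.items():
        row, col = pin_to_cell(pin)
        grid[row][col] += n
    return grid

def top_guess_share(counts: Counter, guesses: int = 5) -> float:
    """Share of all observed PINs covered by the most common `guesses` PINs."""
    total = sum(counts.values())
    return sum(n for _, n in counts.most_common(guesses)) / total

# Hypothetical counts, for illustration only.
sample = Counter({"1234": 90, "1111": 40, "1212": 20,
                  "1986": 15, "1987": 14, "7304": 1})
grid = build_grid(sample)
print(pin_to_cell("1212"), pin_to_cell("1986"))
```

    Repeated pairs such as 1212 have the same row and column, so they land on the diagonal. PINs that look like birth years (19xx) all share row 19, which is one way a horizontal line can show up.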

  • Tracking daily federal expenditures

    February 5, 2025

    The Hamilton Project is tracking federal expenditures and updating daily:

    This data interactive shows actual daily and weekly processed outlays to key programs and departments, as well as to states, Congress, and the Judiciary. This tool only reports outlays of federal funds, meaning the actual transmission of funds from the federal government to another entity. This tool, therefore, allows users to track federal government spending in real time.

    The data comes from the Daily Treasury Statement from the U.S. Department of the Treasury, so it’s anyone’s guess how long that will last. But for now, you can see where money is going in near real-time.

  • Download CDC data through Internet Archive

    February 5, 2025

    The data portal for the U.S. Centers for Disease Control and Prevention was taken down last Friday. For now, it seems data.cdc.gov is up in a modified form, but just in case, the Internet Archive has all the data that was available prior to January 28, 2025.

    The compressed data file is only 95 gigabytes, so maybe just download it now.

  • Geographic boundary data and microdata from Census Bureau is offline

    February 4, 2025

    As of this evening on February 4, 2025, the TIGER/Line shapefiles, which provide legal boundaries at various geographic levels, are currently unavailable on the Census website. The site is there, but when you try to download something via the menus, you get a box of nothing.

    Actually, poking around more, it seems that any Census web interface that relies on downloads via FTP gets you a 403 error. Data.census.gov is still up.

    In the meantime, IPUMS, which has worked with national agencies over the past couple of decades, still has microdata. They sent this email earlier today:

    As you may already be aware, on Friday, January 31, federal agencies removed public data and documentation previously made available via public-facing federal government websites in response to administration directives. The types of data removed include large-scale population data sources that provide vital insight into the health and wellbeing of all communities.

    We are writing to reassure you that IPUMS data remain available, and that IPUMS remains committed to preserving and democratizing access to the world’s population data.

    We are monitoring this evolving situation closely. As part of our standard procedures, we download and preserve original data from U.S. statistical agencies that serve as the source data for IPUMS. Since last Friday, several organizations (and individuals) have downloaded many other public federal datasets. There are efforts underway to catalog and make these data available. We will share resources and guidance when we have it about how to locate or share missing data.

  • Data bead bracelets

    February 4, 2025

    Data Beads, by Eszter Katona and Mihály Minkó, is a fun initiative that encourages people to make and wear bracelets based on data:

    This is a grassroots initiative that’s all about bringing data visualization into a whole new space, off the screen and into wearable, everyday objects. We turn data into simple, easy-to-make bracelets, making data more approachable and fun.

    These bracelets aren’t just accessories: they’re conversation starters that help break the ice around different topics, data and graphs, which can be difficult for many people to engage with. At the same time, we hope they spark curiosity and improve data literacy in a casual, creative way.

    I suddenly wish the short-lived Shirt Project was still going.

    Ridgeline charts are nice to look at, and that is enough reason to make them. Use a gradient fill for extra sauce.

  • Book chart showing Barnes & Noble opening new stores

    February 3, 2025

    I assumed that Barnes & Noble was on its way out, but I guess not. Danielle Alberti and Lindsey Bailey for Axios have this charming chart showing 57 new locations in 2024 and 60 planned for this year. Each book spine represents a location.

    I’m taking this as a cue that people are weaning off the internet, which is getting worse, and it’s not just my imagination.

  • Health and climate data purge

    February 3, 2025

    For The Verge, Justine Calma reports on the recent takedowns. Some groups have been preparing for this:

    The End of Term Web Archive project has saved content on federal government websites during every presidential transition since 2008. The Environmental Data and Governance Initiative (EDGI) that formed after Trump was first elected also documents changes to government websites and works to make archived datasets available elsewhere. It has backed up data from the CDC’s Social Vulnerability Index and Environmental Justice Index and shared it on a webpage for The Public Environmental Data Project.

    Yet even if these datasets have been archived, they aren’t as helpful when they aren’t updated. “Any dataset has a lifespan of utility,” says Dan Pisut, senior principal engineer at GIS software company Esri.

    Of course, this is just the beginning. Remember: marathon, not sprint.

  • About 8,000 U.S. government pages taken down

    February 2, 2025

    The New York Times used a programmatic approach to estimate the number of pages taken down so far since Friday. Ethan Singer reporting:

    On Friday, The Times downloaded the list of the most visited government domains in the U.S. and began compiling the complete list of pages available on each one using each site’s sitemap, a file that outlines the structure of a website and is typically used by search engines to keep track of what’s on the internet. (Some sites, including state.gov and weather.gov, were not included in our analysis because we were unable to identify a complete list of web pages on their sites, or for other technical reasons.) In all, we were able to identify more than seven million pages across more than 150 sites.

    We then repeated this process several times Friday night and on Saturday, and compared our new list of websites with those we originally found.

    About 3,000 pages from the Centers for Disease Control and Prevention, 3,000 from the Census Bureau, and 1,000 from the Office of Justice Programs make up the bulk of the takedowns.
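
    The general idea of comparing sitemap snapshots can be sketched in a few lines. This is not The Times’s code, just a minimal illustration under my own assumptions: the sitemaps follow the standard sitemaps.org format, `example.gov` is a placeholder domain, and the two snapshots are inline strings instead of downloaded files.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def urls_in_sitemap(xml_text: str) -> set[str]:
    """Return every <loc> URL listed in one sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }

def removed_pages(before: set[str], after: set[str]) -> set[str]:
    """Pages listed in the earlier snapshot but missing from the later one."""
    return before - after

# Two hypothetical snapshots of the same sitemap.
friday = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.gov/a</loc></url>
  <url><loc>https://example.gov/b</loc></url>
</urlset>"""
saturday = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.gov/a</loc></url>
</urlset>"""

gone = removed_pages(urls_in_sitemap(friday), urls_in_sitemap(saturday))
print(sorted(gone))
```

    A page that drops out of a sitemap is not necessarily gone from the site, so a fuller check would also request each missing URL and look at the response.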

  • Checklist for federal data backups

    January 31, 2025

    In preparation for days like this, MIT Libraries has a guide for making usable backups:

    The United States (US) federal government collects, aggregates, and disseminates a large volume of information and data. This content is used by researchers, policymakers, and many others for various purposes.

    Protecting access to US federal government data between and during presidential administrations is important. Data can potentially disappear because of government shutdowns, broken links, and policy shifts.

    This checklist provides steps you can take to ensure the government data you use in your research remains accessible to you and others.

    Identify the data, confirm, back up with documentation, and maintain re-usability.
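
    One way to handle the “with documentation” part is to store a small manifest next to each file you save. The sketch below is my own illustration, not something from the MIT checklist: the field names, the `example.gov` URL, and the sample file contents are all assumptions.

```python
import hashlib
import json
from datetime import date

def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest, used later to confirm the copy is unchanged."""
    return hashlib.sha256(data).hexdigest()

def manifest_entry(filename: str, source_url: str, data: bytes, retrieved: date) -> dict:
    """Record where a file came from, when it was saved, and its checksum."""
    return {
        "filename": filename,
        "source_url": source_url,
        "retrieved": retrieved.isoformat(),
        "bytes": len(data),
        "sha256": sha256_of(data),
    }

def verify(entry: dict, data: bytes) -> bool:
    """Check a stored copy against its manifest entry."""
    return entry["bytes"] == len(data) and entry["sha256"] == sha256_of(data)

# Hypothetical file contents and source URL, for illustration only.
content = b"state,value\nCA,1\n"
entry = manifest_entry(
    "example.csv",
    "https://example.gov/data/example.csv",
    content,
    date(2025, 1, 31),
)
print(json.dumps(entry, indent=2))
print(verify(entry, content))
```

    Keeping the source URL and retrieval date with the checksum means someone else can tell what the file is and confirm it has not changed since you saved it.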

  • Preserving government data before it disappears completely

    January 31, 2025

    Groups at universities and research labs are forming to preserve data. Naseem Miller reports:

    The ad hoc group that organized Friday’s data marathon at Chan School calls itself “The Preserving Public Health Data Collective” and it’s part of a growing effort among researchers and academic institutions across the U.S. to save federal health websites and databases.

    Researchers are using different tools, including downloading datasets, scraping websites and archiving them with the Wayback Machine, which is an initiative of the Internet Archive, a nonprofit digital library of Internet sites. It enables users to see how websites looked in the past.

    The changes to government websites are happening faster than researchers can keep up with.

    There are some tips on how you can preserve websites, including saving them to the Wayback Machine and suggesting databases to The Data Liberation Project.
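
    If you have a list of pages to save, you can build the Wayback Machine links in bulk. This sketch only constructs URLs and makes no network requests. The two endpoints are the save and availability addresses as I understand them, so check the Internet Archive’s documentation before relying on them. The helper names are mine.

```python
from urllib.parse import urlencode

def wayback_save_url(page_url: str) -> str:
    """Link that asks the Wayback Machine to capture a page when visited."""
    return "https://web.archive.org/save/" + page_url

def wayback_availability_url(page_url: str) -> str:
    """Link to the availability lookup for the closest archived snapshot."""
    return "https://archive.org/wayback/available?" + urlencode({"url": page_url})

# Pages mentioned in recent posts, used here as examples.
pages = ["https://data.cdc.gov/", "https://www.census.gov/"]

for page in pages:
    print(wayback_save_url(page))
    print(wayback_availability_url(page))
```

    Open the save links in a browser, or request them from a script with polite pauses between requests.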

  • Data.cdc.gov goes offline

    January 31, 2025

    As of this evening on January 31, 2025, the data portal for the Centers for Disease Control and Prevention is offline. You get the following text:

    Data.CDC.gov is temporarily offline in order to comply with Executive Order 14168 Defending Women From Gender Ideology Extremism and Restoring Biological Truth to the Federal Government and the OPM notice dated January 29, 2025, “Initial Guidance Regarding President Trump’s Executive Order Defending Women from Gender Ideology Extremism and Restoring Biological Truth to the Federal Government (Defending Women).” The website will resume operations once in compliance.

    The takedown is part of a directive to halt research and cut funding. From Roni Rabin and Apoorva Mandavilli for The New York Times:

    On Friday, hundreds of scientists gathered for a “datathon,” in an attempt to preserve websites related to health equity.

    “There’s been a history in this country recently of trying to make data disappear, as if that makes problems disappear,” said Nancy Krieger, a social epidemiologist at Harvard University and a co-leader of the effort.

  • Census.gov is down

    January 31, 2025

    As of Friday, January 31, 2025 at 3:26pm PST, the U.S. Census Bureau homepage is blank.

    I did not see that coming.

    FAA.gov also appears to be down. Any others?

    Update, 1/31/2025 6:50pm PST: The Census.gov address is loading again, but the index.html link still points to the blank page with the input field. The FAA site is offline again, after earlier working only with www and then coming back online.

  • Mapping the American Airlines and Army helicopter collision

    January 31, 2025

    The Washington Post mapped the flight paths leading up to the collision over the Potomac River.

    The helicopter and the American Eagle plane were in “standard” flight patterns, Transportation Secretary Sean P. Duffy said at a morning news conference. The patterns, Duffy said, were not unusual for D.C. airspace. “Something went wrong here,” he said.

    That “something” is still unknown. Flight data is going to get a close examination over the next few weeks. I hope the data is accurate.

  • Census Bureau director resigns, 2025 edition

    January 31, 2025

    The Census Bureau director Robert Santos announced his resignation on Thursday:

    Santos — a nationally recognized statistician who is the first Latino to head the bureau — joined the federal government’s largest statistical agency as a Biden appointee after years of interference at the bureau by the first Trump administration.

    Before becoming the agency’s director, Santos was a vocal opponent of how Trump officials handled the 2020 census — including a last-minute decision to end counting early during the COVID-19 pandemic and a failed push to add a question about U.S. citizenship status that was likely to deter many Latino and Asian American residents from participating in the official population tally.

    This is not totally unexpected, given there were resignations in 2017 and 2021, but still. You can see how data can be influenced and why we need to think about data carefully.

  • See wind data on Mars through tele-present wind

    January 31, 2025

    With the art installation tele-present wind, David Bowen displays data collected by NASA’s Perseverance rover mission. Grass stalks are attached to mechanical devices that shift the stalks back and forth in unison. The abstract Martian data becomes tangible and a physical space you can walk through.
