• With absolute certainty, you will die. When will it happen? That is a trickier question. But we can run simulations to explore the possibilities.

  • For Letterform Archive, designer Angie Wang examines a collection of chopstick sleeves and what it shows about Japan:

    Paper chopstick sleeves emerged at the turn of the 20th century when disposable chopsticks and packaged meals gained popularity with the advent of train travel. In addition to ensuring cleanliness, printed paper chopstick sleeves became vernacular advertisements for shops and restaurants.

    The latest addition to the Archive’s holdings of Asian ephemera is the hashibukuro collection of Mr. Susumu Kitagawa of Fuji City, Japan. While individually modest in their design and messaging, when considered as a whole the sleeves that comprise this collection map a singular history of Japanese ideology and aesthetics.

  • The purpose of onomatopoeia is to imitate sounds with words, so you might expect the words for animal sounds to be similar across languages. For the Pudding, Vivian Li shows that this is not always the case.

    Onomatopoeia offers a fascinating glimpse into the interaction between sound and language. The way humans mimic animal sounds reflects not only shared biological instincts but also distinct cultural filters. Although onomatopoeia intends to imitate faithfully, its differences are ultimately far from arbitrary. In trying to capture the same auditory essence, English interprets a pig’s sound as [ojŋk], yet Hungarian hears [røf], and Vietnamese hears [ʔut it]. Even among the three animals discussed, cats are more consistent in their sound interpretation, while pigs are more variable — whether because pigs’ vocalizations are innately more complex, or because they call upon different phonotactic rules.

    All the words are clickable so that you can hear pronunciations for different languages. Colors indicate phone groups, such as nasal consonants and mid central vowels.

  • I like this chart set from Bloomberg that shows the top brands, ranked by market share in 2024. Faded lines show true estimates, and thicker lines in the foreground provide the trends. Tick labels are limited to the first column on the left to avoid busyness. Straightforward but effective.

    In the U.S., we usually see BYD, an electric vehicle brand, mentioned as Tesla’s competition. But it doesn’t look like much of a competition. BYD has rapidly gained market share in China over the last five years.

  • FiveThirtyEight is gone, and along with its visualization-centric projects, so is the poll tracking that it and others used to analyze public sentiment. The New York Times is picking up the baton:

    As one half of the Times/Siena College poll, which has been recognized as one of the country’s premier pollsters, we believe there’s value in an individual poll. But we also think aggregating polls and providing analysis of them collectively, as we did during last year’s election, is a service worth preserving — one that may be needed even more today with the profusion of polling, contradictory findings and loud partisan voices.

    We’re building on the work of the politics website 538, which for several years released this data as a public service until it was shuttered by ABC News this month, and which itself followed in the path of Pollster.com at The Huffington Post. Our goal is to ensure that this resource, which is a foundational tool for many journalists and researchers, remains updated long-term. The data will be made available free for anybody to use as they wish, so long as they provide attribution to The Times. (If you’re still using data collected by 538, you may still need to give it attribution as well.)

    They’re starting with presidential approval ratings.

  • Leading up to the NCAA Men’s basketball tournament, the Athletic has a bracket with projections expressed as win probabilities in each round. Surprise, Duke is heavily favored to win, which can only mean everyone’s brackets will be ruined early.

    On methodology:

    We create an offensive and defensive projection for every college basketball team using various box score metrics. These projections estimate how many points a team would score and allow in a game against an average opponent on a neutral court. We then assign a probability of how likely a team is to win a given game by adjusting for opponent, location and team health. Taking into account the bracket, we use the projections to simulate the tournament 200,000 times.

    After those 200,000 simulations, we calculate how often a team is to make each round of the tournament and win the championship. For example, if a team has a 10 percent chance of making the Final Four, that means that they’ve made the Final Four in 10 percent of the simulations run.
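
    The simulate-and-count approach described in the excerpt can be sketched in a few lines. This is a minimal illustration, not the Athletic's model: the eight teams, their ratings, the logistic win-probability function, and the names `win_probability`, `simulate_bracket`, and `round_frequencies` are all made up for the example, and it skips adjustments for location and team health.

```python
import random
from collections import Counter

def win_probability(rating_a, rating_b, scale=10.0):
    """Chance that team A beats team B, from the rating gap.

    The logistic form and the scale are assumptions for illustration.
    """
    return 1.0 / (1.0 + 10 ** (-(rating_a - rating_b) / scale))

def simulate_bracket(teams, ratings, rng):
    """Play one single-elimination bracket.

    Returns a list of lists: the teams still alive at the start of each
    round, ending with the one-team list holding the champion.
    """
    alive = list(teams)
    rounds = [alive]
    while len(alive) > 1:
        winners = []
        # Adjacent teams in the list play each other.
        for a, b in zip(alive[0::2], alive[1::2]):
            p = win_probability(ratings[a], ratings[b])
            winners.append(a if rng.random() < p else b)
        alive = winners
        rounds.append(alive)
    return rounds

def round_frequencies(teams, ratings, n_sims=20000, seed=1):
    """Share of simulations in which each team reached each round."""
    rng = random.Random(seed)
    n_rounds = len(teams).bit_length()  # assumes a power-of-two field
    counts = [Counter() for _ in range(n_rounds)]
    for _ in range(n_sims):
        for i, alive in enumerate(simulate_bracket(teams, ratings, rng)):
            counts[i].update(alive)
    return [{t: c[t] / n_sims for t in teams} for c in counts]

# Hypothetical field in bracket order, with made-up ratings.
teams = ["A", "B", "C", "D", "E", "F", "G", "H"]
ratings = {"A": 20, "B": 8, "C": 12, "D": 10, "E": 15, "F": 5, "G": 9, "H": 0}

freqs = round_frequencies(teams, ratings)
for team in teams:
    print(team, [round(f[team], 3) for f in freqs])
```

    Each printed row is one team's share of simulations reaching each round, with the last number being the share of simulated championships. That is the same reading as the 10 percent Final Four example in the excerpt.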

  • When you enter a query in a traditional search engine, you get a list of results. They are possible answers to your question, and you decide which sources you want to trust. On the other hand, when you query an AI chatbot, you get one or a few answers, written as sentences, that sound confident.

    For Columbia Journalism Review, Klaudia Jaźwińska and Aisvarya Chandrasekar tested this accuracy and confidence by using several chatbots to cite articles:

    Overall, the chatbots often failed to retrieve the correct articles. Collectively, they provided incorrect answers to more than 60 percent of queries. Across different platforms, the level of inaccuracy varied, with Perplexity answering 37 percent of the queries incorrectly, while Grok 3 had a much higher error rate, answering 94 percent of the queries incorrectly.

    So not great.

    I am sure someone is working on improving that accuracy, but we’ll have to develop our own skills in separating truth from junk, just like we have with past online things. Going forward, maybe keep an eye out for the younger and older generations who tend to accept online things as automatic truth. Things could get dicey.

  • The “Department of Government Efficiency” keeps a “Wall of Receipts” to signal transparency in how they are “saving” money. However, it’s difficult to take it seriously when the data keeps changing, disappearing, and reappearing. Ethan Singer and Emily Badger, for NYT’s the Upshot, go with the clustered bubbles to show the edits since February 16, 2025.

    I’ll be the first one to tell you that working with data is tricky and that there are bound to be mistakes. But it’s in everyone’s best interest to find the mistakes first instead of making life-changing decisions and then finding out what breaks after.

  • Pam Johnson got an email from her bank about her husband’s death. The Social Security Administration deducted funds from their account. The problem: her husband, Ned Johnson, is still alive. From Danny Westneat for the Seattle Times:

    “We recently received notification of LEONARD A. JOHNSON’s passing,” it began. “We offer our sincerest condolences …”

    At first she figured it was a scam — her husband, after all, was sitting right there. But then the bank got to the point.

    “We know this is a difficult time, and we’re here to help,” the bank wrote. “We received a request from Social Security Administration to return benefits paid to LEONARD A. JOHNSON’s account after their passing.”

    “There’s nothing you need to do — we’ve deducted the funds from LEONARD A. JOHNSON’s account.”

    Uh oh. It itemized how $5,201 had been stricken from their bank account, on the grounds that Ned wasn’t justified to get those benefits — because he was dead. That was for payments he’d received in December and January.

    After several weeks, they were able to get Johnson revived in the SSA database, but they still don’t know why he was marked dead to begin with. Whatever the reason, it should be obvious why it’s important to measure twice and cut once.

  • In efforts to reduce repeat offenses in Spain thirty years ago, researchers developed a formula that assigned a risk score to individuals. The score was used to decide if prisons should grant a prisoner temporary release, and the formula still factors into decision-making today. Civio describes the current downsides of using the scores, which are based on a relatively small sample of prisoners from the 1990s.

    An interactive graphic, shown above, illustrates the system and how a score goes up and down as you change variables in the drop-down menus. Foreign status increases the risk score the most, even more so than if a prisoner tried to escape.

  • For Axios, Marc Caputo reports:

    Secretary of State Marco Rubio is launching an AI-fueled “Catch and Revoke” effort to cancel the visas of foreign nationals who appear to support Hamas or other designated terror groups, senior State Department officials tell Axios.

    Why it matters: The effort — which includes AI-assisted reviews of tens of thousands of student visa holders’ social media accounts — marks a dramatic escalation in the U.S. government’s policing of foreign nationals’ conduct and speech.

    Something tells me that the view into the system’s usage, classification process, and underlying data will be quite fuzzy.

  • Members Only

    This week is about highlighting changes in data visually to make them glaringly obvious.

  • Amanda Shendruk and Catherine Rampell, for Washington Post Opinion, highlight the current strategies of removing data from public view so there’s no baseline to compare against.

    Curating reality is an age-old political game. Politicians spin facts, cherry-pick and create “truth” through repetition. Statistical sleight of hand has long been part of that tool kit, as has burying inconvenient numbers. (In 1994, for instance, U.S. lawmakers blocked federal data collection on “green” gross domestic product.) But Trump’s statistical purges have been faster and more sweeping — picking off not just select factoids but entire troves of public information.

    The deletions self-contradict when the same groups are also saying that “data does not lie” in reference to spending cuts and takedowns. Why delete all the truth about how the United States functions, how we live, and where we are headed?

  • According to data from ActivTrak, people are working shorter days while productivity has gone up. For Bloomberg, Nibras Suliman reports that work days were 36 minutes shorter at the end of 2024 than in 2022.

    I don’t know anything about ActivTrak, so I wonder what kind of work they track and how they measure productivity. Either way, it’s good to see minutes going down. I think we could stand to work less, myself included.

  • This might come as a surprise to some, but since congestion pricing in Manhattan began, the number of complaints about honking declined. For The City, Jose Martinez and Mia Hollie looked at the 311 service data:

    “One more reason to love congestion relief — less honking,” Juliette Michaelson, the MTA’s deputy chief of policy and external relations, said in a statement to THE CITY. “Turns out it is, in fact, possible to make Manhattan a little more peaceful.”

    In addition, between Jan. 5 and March 4, the two Department of Environmental Protection noise cameras south of 60th Street didn’t issue a single horn-honking summons, according to numbers provided by the city agency. In contrast, those two cameras issued 27 summonses for excessive horn blowing during the same time period last year.

    311 service data can be found here.

  • In almost every dataset about life and people that stretches back past March 2020, you can find the blip when Covid changed how we live. Aatish Bhatia and Irineo Cabreros, for NYT’s the Upshot, used a stack of 30 charts to show the shifts.

    Each chart shows a pre-Covid gray on the left and a post-Covid red-orange on the right. The lines (or bars) on the post-Covid side extend the past when you scroll. Usually charts that show an empty space to start and then animate the rest are gimmicks, but the extensions highlight the sudden changes in this series.

    The scroll style and dimensions are very mobile-first, as the stack plays out in a more familiar way on a phone. The style also makes the 30 charts feel like not too much.

  • From Pew Research, this political typology quiz is from four years ago but is as relevant now as it was then. Answer a handful of questions and see where you fall in the spectrum of nine groups. As the split between Democrat and Republican in the U.S. grows wider, maybe that means it gets easier to see the differences and similarities in the space between.

    On the methodology to define the groups:

    The typology groups are created with a statistical procedure that uses respondents’ scores on all 27 items to sort them into relatively homogeneous groups. The specific statistical technique used to calculate group membership is weighted clustering around medoids (using the WeightedCluster package version 1.4-1 in R version 4.1.1). The items selected for inclusion in the clustering were chosen based on extensive testing to find the model that fit the data best and produced groups that were substantively meaningful. Most prior Pew Research Center typologies used a closely related method, cluster analysis via the k-means algorithm, to identify groups.
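
    To make "weighted clustering around medoids" more concrete, here is a rough sketch of the idea. It is not Pew's procedure or the WeightedCluster package: it uses a simple alternating assign-and-update loop instead of the full partitioning-around-medoids search, Manhattan distance is an assumption, and the six respondents, their three item scores, and their survey weights are invented.

```python
def manhattan(a, b):
    """Sum of absolute differences between two response vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def assign(dist, medoids, n):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(n)]

def weighted_k_medoids(points, weights, k, max_iter=100):
    """Alternating k-medoids with case weights.

    A medoid is an actual respondent: the cluster member with the
    smallest weighted total distance to the other members.
    Returns (medoid_indices, labels).
    """
    n = len(points)
    dist = [[manhattan(points[i], points[j]) for j in range(n)]
            for i in range(n)]
    medoids = list(range(k))  # naive start: the first k points
    for _ in range(max_iter):
        labels = assign(dist, medoids, n)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            best = min(
                members,
                key=lambda m: sum(weights[i] * dist[i][m] for i in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids, n)

# Invented data: six respondents, three survey items scored 1 to 5.
points = [(1, 1, 2), (1, 2, 1), (2, 1, 1), (5, 4, 5), (4, 5, 5), (5, 5, 4)]
weights = [1.0, 2.0, 1.0, 1.0, 1.0, 3.0]  # made-up survey weights

medoids, labels = weighted_k_medoids(points, weights, k=2)
print("medoid respondents:", medoids)
print("group labels:", labels)
```

    Unlike k-means, where a group's center is an average that may not match any real person, each medoid here is one of the respondents. The weights pull the choice of medoid toward respondents who count for more in the survey.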

  • Alvin Chang, for the Pudding, highlights education research on the awkwardness of middle school (or junior high as they used to call it (or intermediate where I’m from)).

    What they found across the country was that 6th, 7th, and 8th graders who attend middle schools learn less, while feeling lower levels of belonging and self esteem, when compared to kids who attend K-8 schools. One 2010 study of New York City students found that, when kids transition to middle school, their parents feel like the quality of education and safety of the schools is worse compared to the parents of students who still attend K-8 schools.

    I would have assumed that the transition year from being the oldest in elementary school to the youngest at a new school was when the sense of belonging was lowest. Instead, in the study, belonging also decreased in 7th and 8th grades and leveled out in high school.

    Although, as we exit elementary school, I can see how a transition from collective to distinct identities would shift the sentiment.

    Chang frames the story as a conversation between an adult and students. He shows individual data points as the top half of student faces (with blinking eyes). They organize as you scroll and highlight the aggregates with a background color change to yellow.

    There is also a video version with music and narration by Chang. I think it nudges past the interactive.

  • Last week, Disney laid off FiveThirtyEight employees and announced the site would cease operations. However, I did not realize that the end for the site and archives was also coming quickly. Links to past FiveThirtyEight projects used to load a usable and readable archive. Now you just get redirected to an outdated ABC News politics page.

    That’s a lot of solid work that just poofed out of existence.

    For now, the FiveThirtyEight GitHub page is still up. While it hasn’t been updated in a few years since the restructuring, the datasets in particular might be useful to download for teaching purposes (or a dumb chart about candy), while they’re still online.

  • For Our World in Data, Saloni Dattani and Lucas Rodés-Guirao analyzed the various factors that led to the baby boom, typically marked by the period following World War II. As usual, it’s not that simple.

    The baby boom is typically defined as the time period between 1946 and 1964. As an example, Britannica’s entry on the baby boom states that it describes “the increase in the birth rate between 1946 and 1964”. Similarly, the US Census Bureau defines baby boomers as “those born between 1946 and 1964”, with the common belief that the baby boom started immediately after World War II.

    But as the chart below shows, the rise began earlier.

    Birth rates in the United States had been falling in the early twentieth century, and the decline began to slow down at the end of the 1920s. Then, in the late 1930s, they turned around and began to rise, and this continued during parts of World War II. At the end of the war, they surged, but this was part of a multi-decadal increase.