In celebration of Chinese New Year, Julia Janicki, Daisy Chung, and Joyce Chou rotate through the traditional foods served with an illustrated Lazy Susan.
-
There’s been a lot of rain in California, which has helped relieve some of the pressure from drought, at least in the short term. For The New York Times, Elena Shao, Mira Rojanasakul, and Nadja Popovich show the sudden bump in water supply.
Using areas in the background to show historical averages was a good choice. Very reservoir-ish.
-
AI training data comes from the internet, and as we know but sometimes forget, parts of the internet are harmful to people. For Time, Billy Perrigo reports on how OpenAI outsourced the labeling of such data to a firm, which required people to read disturbing text:
To build that safety system, OpenAI took a leaf out of the playbook of social media companies like Facebook, who had already shown it was possible to build AIs that could detect toxic language like hate speech to help remove it from their platforms. The premise was simple: feed an AI with labeled examples of violence, hate speech, and sexual abuse, and that tool could learn to detect those forms of toxicity in the wild. That detector would be built into ChatGPT to check whether it was echoing the toxicity of its training data, and filter it out before it ever reached the user. It could also help scrub toxic text from the training datasets of future AI models.
To get those labels, OpenAI sent tens of thousands of snippets of text to an outsourcing firm in Kenya, beginning in November 2021. Much of that text appeared to have been pulled from the darkest recesses of the internet.
-
Barely Maps is an ongoing project by Peter Gorman that shows geographic data as barely a map. Gorman strips away almost all context, stopping just short of the point where the map becomes too abstract to comprehend.
The above is for the western coast of the United States. There are many more of the same flavor available in print.
-
ScrollyVideo.js is a JavaScript library that makes it easier to incorporate videos in a scrollytelling layout. The examples look really straightforward, which means I’m saving this for later.
-
To show snow cover across the United States, Althea Archer for the USGS used a hexbin-style map, but in place of hexagons, she drew snowflakes. Archer provided her R code and outlined her process in a blog post, which is something I’m not used to seeing from a government agency. I like it.
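For anyone unfamiliar with hexbins, here is a minimal sketch of the binning step in Python. This is not Archer's method (her code is in R and linked from her post). It only illustrates the general idea, and the function names `hex_cell` and `bin_points` are made up for this example. Each point is assigned to the nearest center on a staggered grid, which is the hexagon that contains it. Once you have a count per cell, you can draw any glyph at the cell center, such as a snowflake.

```python
import math
from collections import Counter


def hex_cell(x, y, radius):
    """Return the (col, row) index of the pointy-top hexagon containing (x, y).

    Hypothetical helper for illustration. Odd rows are shifted right by
    half a column, so hexagon centers form a staggered grid.
    """
    dx = math.sqrt(3) * radius  # horizontal distance between centers
    dy = 1.5 * radius           # vertical distance between rows
    base_row = math.floor(y / dy)
    best = None
    # The containing hexagon is always in one of the two nearest rows.
    for row in (base_row, base_row + 1):
        offset = 0.5 * (row % 2)
        col = round(x / dx - offset)
        cx, cy = (col + offset) * dx, row * dy
        dist2 = (x - cx) ** 2 + (y - cy) ** 2
        if best is None or dist2 < best[0]:
            best = (dist2, (col, row))
    return best[1]


def bin_points(points, radius):
    """Count how many points fall in each hexagonal cell."""
    return Counter(hex_cell(x, y, radius) for x, y in points)
```

A call like `bin_points(points, radius=1.0)` returns a `Counter` keyed by cell index. The value for each cell is what you would map to color or glyph size.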
-
For eight years, Liam Quigley tracked every slice of pizza he ate in New York City, which added up to 454 slices. Quigley did not rate the slices to “avoid controversy and bribes”, but I kind of wish he had rated all those slices. Instead he logged the location, the price, and the type of pizza.
Also I want pizza now.
-
We looked at what makes people happy. We looked at activities that people rate as meaningful. Now let’s put them together and see what people rate as both meaningful and joy-inducing.
-
Mapping the entire planet is not exactly a straightforward thing to do, especially during a time when there weren’t any flying objects to take photographs from above. Jeremy Shuback rewinds all the way back to this time and asks how the first world map came to be.
-
Tom Brady, the quarterback for the Tampa Bay Buccaneers, is 45 years old, which makes him the oldest player in the National Football League. Francesca Paris, for NYT’s The Upshot, puts Brady’s age in the perspective of other occupations. For example, Lilian Thomas Burwell, who is an artist at 95 years old, is well into the upper percentiles for those in her field (and the general population).
See also: the distributions of age and occupation.
-
Lensa is an app that lets you retouch photos, and it recently added a feature that uses Stable Diffusion to generate AI-assisted portraits. While fun for some, the feature reveals biases in the underlying dataset. Melissa Heikkilä, for MIT Technology Review, describes problematic biases towards sexualized images for some groups:
Lensa generates its avatars using Stable Diffusion, an open-source AI model that generates images based on text prompts. Stable Diffusion is built using LAION-5B, a massive open-source data set that has been compiled by scraping images off the internet.
And because the internet is overflowing with images of naked or barely dressed women, and pictures reflecting sexist, racist stereotypes, the data set is also skewed toward these kinds of images.
This leads to AI models that sexualize women regardless of whether they want to be depicted that way, Caliskan says—especially women with identities that have been historically disadvantaged.
-
People have been having fun with generative AI lately. Enter a prompt and get a believable body of text, or enter descriptive text and get a photorealistic image. But as with all things that are fun on the internet, there are those who are looking to exploit the popularity. Maggie Appleton discusses the trade-offs:
There’s a swirl of optimism around how these models will save us from a suite of boring busywork: writing formal emails, internal memos, technical documentation, marketing copy, product announcement, advertisements, cover letters, and even negotiating with medical insurance companies.
But we’ll also need to reckon with the trade-offs of making insta-paragraphs and 1-click cover images. These new models are poised to flood the web with generic, generated content.
You thought the first page of Google was bunk before? You haven’t seen Google where SEO optimizer bros pump out billions of perfectly coherent but predictably dull informational articles for every longtail keyword combination under the sun.
-
In the department of tedious and thorough, Reddit user _tsweezy_ tracked every hour of his life for five years. It’s like a personal American Time Use Survey diary for slightly longer than a single day. I’m sure there’s some estimation or fill-ins after-the-fact, but still, that’s a lot of days and hours.
-
Animals are going extinct at a faster rate. Reuters shows a developing pattern across species:
Losing hundreds of species over 500 or so years may not seem significant when there are millions more still living on the planet. But in fact, the speed at which species are now vanishing is unprecedented in the last 10 million years.
“We are losing species now faster than they can evolve,” O’Brien said.
-
There appears to be a trend of using human names for pets. Alyssa Fowers and Chris Alcantara, for WP’s Department of Data, asked the natural questions that come after: “How human is your dog’s name? How doggy is your name?” Enter your own name or a dog’s name to see where it falls on the dog to human scale.
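If you want to try something similar with your own data, here is one rough way to place a name on a dog-to-human scale. This is an assumption on my part, not the Post's methodology, and the function name `dogginess` is made up. It compares how common a name is among dogs to how common it is among humans and returns a score between 0 (all human) and 1 (all dog).

```python
def dogginess(name_dogs, total_dogs, name_humans, total_humans):
    """Score a name from 0 (only humans) to 1 (only dogs).

    Hypothetical scoring for illustration: the name's share among dogs
    divided by the sum of its shares among dogs and among humans.
    """
    dog_rate = name_dogs / total_dogs
    human_rate = name_humans / total_humans
    if dog_rate + human_rate == 0:
        raise ValueError("name does not appear in either dataset")
    return dog_rate / (dog_rate + human_rate)
```

Using rates instead of raw counts keeps the score from being swamped by whichever dataset happens to be larger.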
-
Jon Keegan on how USGS researchers collected data for 125 square miles of sea floor:
In 2004 and 2005, two research vessels, Ocean Explorer and Connecticut set off into the waters off Cape Ann, Massachusetts on a U.S. Geological Survey mission to map a section of the bottom of the sea. Equipped with cameras, advanced sonar and bathymetric scanners, these ships mapped 125 square miles of the sea floor capturing a detailed dataset that allowed U.S. Geological Survey scientists to characterize the makeup of the sediment and bedrock in waters up to 92 meters deep.