Twitter released a small JavaScript library to make density plots — for when you have a lot of overlapping points and could use some granular binning. Feed a method an array of thousands of x-y coordinates, and the library takes care of the rest.
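To make the binning idea concrete, here is a minimal sketch in Python. It is not the Twitter library's API; the function name `bin_points`, its parameters, and the sample points are all made up for illustration. It counts how many points land in each cell of a regular grid, which is the kind of summary a density plot then maps to color.

```python
from collections import Counter


def bin_points(points, x_range, y_range, nx, ny):
    """Count how many (x, y) points fall into each cell of an nx-by-ny grid."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    counts = Counter()
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # skip points outside the plotting area
        # Scale to a cell index; clamp so points on the upper edge
        # land in the last cell instead of one past it.
        col = min(int((x - x_min) / (x_max - x_min) * nx), nx - 1)
        row = min(int((y - y_min) / (y_max - y_min) * ny), ny - 1)
        counts[(col, row)] += 1
    return counts


# Made-up sample data: two points near the origin, two near (1, 1), one in the middle.
points = [(0.1, 0.1), (0.12, 0.15), (0.9, 0.9), (1.0, 1.0), (0.5, 0.5)]
counts = bin_points(points, (0.0, 1.0), (0.0, 1.0), nx=4, ny=4)
```

With thousands of points, the grid would be much finer, and the counts would drive the color of each cell rather than being read directly.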
-
With vaccines, we might be tempted to jump back into “normal” life before it’s really safe. The New York Times reports on why waiting until March instead of February might be the way to go. This is based on estimates from Columbia University researchers, and you can read the preprint here (pdf) by Jeffrey Shaman et al.
We’ve come this far already…
-
Maybe you remember the SimCity-like views through satellite imagery from a few years ago. Robert Simmon from Planet Labs returns to the topic, discussing practical use cases and advantages over a top-down view:
Satellite imagery surrounds us — from Google Maps and daily weather forecasts to the graphics illustrating news stories — but almost all of it is from a map-like, top-down perspective. This view allows satellite data to be analyzed over time and compared with other sources of data. Unfortunately, it’s also a distorted perspective. Lacking many of the cues we use to interpret the world around us, top-down satellite imagery (often called nadir imagery in remote sensing jargon) appears unnaturally flat. It’s a view that is disconnected from our everyday experience.
-
Thomas Mock explains how to extract and parse data tables in image files via ImageMagick and R:
There are many times where someone shares data as an image, whether intentionally due to software constraints (ie Twitter) or as a result of not understanding the implications (image inside a PDF or in a Word Doc). xkcd.com jokingly refers to this as .norm or as the Normal File Format. While it’s far from ideal or a real file format, it’s all too common to see data as images in the “wild”. I’ll be using some examples from Twitter images and extracting the raw data from these. There are multiple levels of difficulty, namely that screenshots on Twitter are not uniform, often of relatively low quality (ie DPI), and contain additional “decoration” like colors or grid-lines. We’ll do our best to make it work!
You can never have too many tools to grab data from various, inconvenient file formats.
-
Video: https://www.youtube.com/watch?v=gFFj22kjlZk&feature=emb_title
Jon Schwabish has a new book coming out: Better Data Visualizations. To kick things off, he’s running a video series on the many different chart types. There will be 50 videos, released daily, each with an invited practitioner who briefly talks about what the chart is and how it’s used. The series is already 10 videos in.
Should be informative.
-
The New York Times labeled all of the people sitting behind Joe Biden during the inauguration. It’s a straightforward but slick interactive that lets you pan and zoom the photograph. Click on a name for more details or use the list of names in a sidebar.
-
Based on estimates from the MIT Trancik Lab, The New York Times plotted average carbon dioxide emissions against average cost per month for electric, hybrid, and gas vehicles. Each dot represents a vehicle type. While electric vehicles cost more upfront, the lower maintenance and electric costs make up the difference in the long run.
The chart above only shows vehicles that retail for $55,000 or less, but you can see more vehicles in the original version.
-
For Bloomberg, Jeremy C.F. Lin and Rachael Dottle show what Joe Biden’s inauguration will look like, given all of the recent events and 2020. No public access and 25,000 National Guard personnel.
-
For NYT’s The Upshot, Kevin Quealy has been cataloging all of the insults Trump tweeted over the past five years. The project is complete:
As a political figure, Donald J. Trump used Twitter to praise, to cajole, to entertain, to lobby, to establish his version of events — and, perhaps most notably, to amplify his scorn. This list documents the verbal attacks Mr. Trump posted on Twitter, from when he declared his candidacy in June 2015 to Jan. 8, when Twitter permanently barred him.
-
As you probably know, there was a big Parler data scrape before the app and site went down. ProPublica spliced Parler video posts, sorting them by time and location. The result is basically a TikTok-style video feed of what happened.
-
Video: https://www.youtube.com/watch?v=1cUUfMeOijg
Tom Scott explains how Cloudflare uses a wall of lava lamps to generate random numbers. A video camera is pointed at the wall, and the movement in the lamps plus noise from the video provides randomness, which is used to secure websites.
Even though computers can do many things on their own, they still need help from the physical world for true unpredictability. The robot overlords aren’t here yet. [via kottke]
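As a rough illustration of the idea, and not Cloudflare's actual pipeline, here is a minimal Python sketch that hashes the bytes of a camera frame together with the operating system's own randomness to produce a seed. The function name `seed_from_frame` is hypothetical, and the "frame" is a stand-in, since there is no camera here.

```python
import hashlib
import os


def seed_from_frame(frame_bytes: bytes) -> bytes:
    """Mix a camera frame with OS randomness into a 32-byte seed."""
    h = hashlib.sha256()
    h.update(frame_bytes)     # unpredictable pixels (lamp motion plus sensor noise)
    h.update(os.urandom(32))  # the operating system's own entropy
    return h.digest()


# Stand-in for real camera data: random bytes the size of a tiny 64x64 RGB frame.
fake_frame = os.urandom(64 * 64 * 3)
seed = seed_from_frame(fake_frame)
```

Mixing sources this way means the seed stays unpredictable as long as at least one of the inputs is.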
-
In an effort to preserve part of her family’s culture, Jane Zhang designed recipe cards illustrating foods from her mother and grandmother. They provide ingredients and steps, but they also provide illustrations and diagrams that represent cuisine style, cooking method, texture, and taste.
My grandma spoke little English and I speak little Cantonese, so we often communicated through the language of food. So this project really speaks to me. I wish I had this for my own family.
-
Just before the social network Parler went down, a researcher who goes by the Twitter username @donk_enby scraped 56.7 terabytes of data from the site via a less-than-secure API. Motherboard reports on what some researchers are doing with the data:
One technologist took the scraped Parler data, took every file that had GPS coordinates included within it, formatted that information into JSON, and plotted those onto a map. The technologist then shared screenshots of their map with Motherboard, showing Parler posts originating from various countries, and then the United States, and finally in or around the Capitol itself. In other words, they were able to show that Parler users were posting material from the Capitol on the day of the rioting, and can now go back into the rest of the Parler data to retrieve specific material from that time.
I’ve only seen some quick maps so far, but I imagine there’s much more to come in terms of closer analysis and visualization.
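The step described in the quote, going from files with GPS coordinates to JSON that a map can plot, might look something like the Python sketch below. The record layout (`id`, `lat`, `lon`) and the sample values are assumptions for illustration, not the structure of the actual scraped data. The output is GeoJSON, which most mapping tools accept and which lists longitude before latitude.

```python
import json


def to_geojson(records):
    """Convert records with lat/lon fields into a GeoJSON FeatureCollection."""
    features = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        if lat is None or lon is None:
            continue  # keep only records that carry coordinates
        features.append({
            "type": "Feature",
            # GeoJSON order is [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"id": rec["id"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample records; the second has no coordinates and is dropped.
records = [
    {"id": "a", "lat": 38.8899, "lon": -77.0091},
    {"id": "b"},
    {"id": "c", "lat": 40.7128, "lon": -74.0060},
]
geojson = to_geojson(records)
geojson_text = json.dumps(geojson)
```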
-
The New York Times outlined, minute by minute, what happened between the speech and the mob at the Capitol. By now you’ve probably seen the videos and pictures and have an idea of what happened. But the timeline of events both inside and outside of the building really underscores how much worse it could’ve been.
-
For The Atlantic, Dani Alexis Ryskamp compares the financials of The Simpsons against present-day medians, arguing that the fictional family’s lifestyle is no longer attainable:
The purchasing power of Homer’s paycheck, moreover, has shrunk dramatically. The median house costs 2.4 times what it did in the mid-’90s. Health-care expenses for one person are three times what they were 25 years ago. The median tuition for a four-year college is 1.8 times what it was then. In today’s world, Marge would have to get a job too. But even then, they would struggle. Inflation and stagnant wages have led to a rise in two-income households, but to an erosion of economic stability for the people who occupy them.
Someone should take this a step further and look at distributions and time series to show the shift, with The Simpsons as baseline.
-
Last year, around the time when people were baking a lot of things, Sarah Robinson used machine learning to find a recipe for a “cakie”:
Like many people, I’ve been entertaining myself at home by baking a ton and talking about my sourdough starter as if it were a real person. I’m pretty good at following recipes, but I decided I wanted to take things one step further and understand the science behind what differentiates a cake from a bread or a cookie. I also like machine learning so I thought: what if I could combine it with baking??!
Robinson provides the final recipe at the end, so first, I need to try this recipe. Second, what other foods and beverages can this apply to?
-
Natalie Wolchover for Quanta Magazine asked several physicists what a particle is. She came away with several points of view. For example, the particle as an “irreducible representation of a group”:
It’s the standard deep answer of people in the know: Particles are “representations” of “symmetry groups,” which are sets of transformations that can be done to objects.
Take, for example, an equilateral triangle. Rotating it by 120 or 240 degrees, or reflecting it across the line from each corner to the midpoint of the opposite side, or doing nothing, all leave the triangle looking the same as before. These six symmetries form a group. The group can be expressed as a set of mathematical matrices — arrays of numbers that, when multiplied by coordinates of an equilateral triangle, return the same coordinates. Such a set of matrices is a “representation” of the symmetry group.
Oh boy. A lot of this was over my head, as I nearly failed physics in college, but the various explanations with basic diagrams taught me a few new things.
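The triangle example in the quote can be checked numerically. Here is a small Python sketch, with my own choice of coordinates (triangle centered at the origin, one corner pointing up) and my own helper names. It builds the six 2x2 matrices, confirms that each one sends the set of corners back to the same set of corners, and confirms that multiplying any two of them gives another one of the six, which is what makes them a group.

```python
import math
from itertools import product


def rotation(deg):
    """Counterclockwise rotation about the origin."""
    t = math.radians(deg)
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))


def reflection(deg):
    """Reflection across the line through the origin at the given angle."""
    t = math.radians(2 * deg)
    return ((math.cos(t), math.sin(t)), (math.sin(t), -math.cos(t)))


def apply(m, p):
    """Multiply a 2x2 matrix by a point."""
    return (m[0][0] * p[0] + m[0][1] * p[1], m[1][0] * p[0] + m[1][1] * p[1])


def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def rounded(values, digits=9):
    """Round away floating-point noise so results can be compared."""
    return tuple(round(v, digits) for v in values)


# Corners of an equilateral triangle at 90, 210, and 330 degrees on the unit circle.
corner_angles = (90, 210, 330)
vertices = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in corner_angles]

# Three rotations (including "do nothing") and three reflections, one through each corner.
symmetries = [rotation(d) for d in (0, 120, 240)] + [reflection(a) for a in corner_angles]

vertex_set = {rounded(v) for v in vertices}
preserves = all(
    {rounded(apply(m, v)) for v in vertices} == vertex_set for m in symmetries
)

matrix_keys = {rounded(m[0] + m[1]) for m in symmetries}
closed = all(
    rounded(matmul(a, b)[0] + matmul(a, b)[1]) in matrix_keys
    for a, b in product(symmetries, repeat=2)
)
```

Both `preserves` and `closed` come out `True`. Note that individual corners can swap places; it is the set of corner coordinates that stays the same.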