• November 16, 2018

    Topic

    Maps

    When you go skiing or snowboarding, you get a map of the mountain that shows the terrain and where you can go. James Niehues is the man behind many of these hand-painted ski maps around the world, and he has a Kickstarter to catalog his life’s work.

    This is kind of amazing. I went skiing a lot as a kid, and I have distinct memories of these maps. I would stand at the top of the mountain, rip off one of my gloves with my teeth, and then pull out a folded map from a zipped pocket. I never knew they were by the same man, but in retrospect, it makes sense.

  • November 16, 2018

    Newsy, Reveal and ProPublica look into rape cases in the U.S. and law enforcement’s use of exceptional clearance.

    The designation allows police to clear cases when they have enough evidence to make an arrest and know who and where the suspect is, but can’t make an arrest for reasons outside their control. Experts say it’s supposed to be used sparingly.

    Culled data from various police departments shows the designation is used more often than one would expect.

  • November 16, 2018

    The Camp Fire death toll rose to 63, with 631 people missing, as of yesterday. The Los Angeles Times provides some graphics showing scale and the buildings that burned.

    Ugh. I live a few hundred miles away and the smoke is bad enough that my son’s school is closed today. It has not been a good year for California in terms of wildfires.

  • Members Only
    November 15, 2018

    Topic

    The Process

    Important question: Is animation in visualization even worthwhile? Well, it depends. Surprise, surprise. In this issue, I look at animation in data visualization, its uses, and how I like to think about it when I implement moving data.

  • November 15, 2018

    I’m behind on my podcast listening (well, behind in everything tbh), but Reply All covered the flaws of CompStat, a data system originally employed by the NYPD to track crime and hold officers accountable:

    But some of these chiefs started to figure out, wait a minute, the person who’s in charge of actually keeping track of the crime in my neighborhood is me. And so if they couldn’t make crime go down, they just would stop reporting crime. And they found all these different ways to do it. You could refuse to take crime reports from victims, you could write down different things than what had actually happened. You could literally just throw paperwork away. And so that guy would survive that CompStat meeting, he’d get his promotion, and then when the next guy showed up, the number that he had to beat was the number that a cheater had set. And so he had to cheat a little bit more.

    I sat in on a CompStat meeting years ago in Los Angeles. I went into it excited to see the data system that helped decrease crime, but I left skeptical after hearing the discussions over such small absolute numbers, which in turn made for a lot of fluctuations percentage-wise. Maybe things are different now a decade later, but I’m not surprised that some intentionally and unintentionally gamed the system.
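
    To make the small-numbers point concrete, here is a minimal sketch with made-up counts (not CompStat data). The function name `percent_change` and the figures are assumptions for illustration only. The same absolute change of two incidents looks dramatic on a small base and negligible on a large one.

```python
def percent_change(before: int, after: int) -> float:
    """Return the percentage change from `before` to `after`."""
    if before == 0:
        raise ValueError("percent change is undefined when the baseline is zero")
    return (after - before) / before * 100.0


# Hypothetical counts: both cases add exactly two incidents.
small_base = percent_change(4, 6)      # e.g. a single precinct in one week
large_base = percent_change(400, 402)  # e.g. a citywide total

print(f"4 -> 6:     {small_base:+.1f}%")
print(f"400 -> 402: {large_base:+.1f}%")
```

    With these numbers, the first case is a 50 percent jump and the second is half a percent, even though the absolute difference is identical.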

    See also: FiveThirtyEight’s CompStat story from 2015.

  • November 14, 2018

    Atma Mani, a geospatial engineer for ESRI, imagined shopping for a house with data, maps, and analysis. Basically, a personalized recommendation system:

    The type of recommendation engine built in this study is called ‘content based filtering’ as it uses just the intrinsic and spatial features engineered for prediction. For this type of recommendation to work, we need a really large training set. In reality nobody can generate such a large set manually. In practice however, another type of recommendation called ‘community based filtering’ is used. This type of recommendation engine uses the features engineered for the properties, combined with favorite / blacklist data to find similarity between a large number of buyers. It then pools the training set from similar buyers to create a really large training set and learns on that.

    I love going all nerd on these sorts of things. The most interesting part for me though is that it always seems to come down to a gut feeling. You have to see the house and get a feel for the area, which is much harder to get through data. So then, how do you couple the information you get from the data with more fuzzy emotions?
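
    For a rough feel of what content-based filtering means, here is a minimal sketch. It is not Mani's model: it swaps in a simple cosine-similarity ranking, and the feature columns, house names, and values are all hypothetical. The idea is to average the features of houses a buyer favorited into a profile, then rank candidates by how closely their features match it.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profile(favorites):
    """Average the feature vectors of the houses a buyer favorited."""
    count = len(favorites)
    return [sum(column) / count for column in zip(*favorites)]


def recommend(profile, candidates):
    """Rank candidate houses by similarity to the buyer profile, best first."""
    scored = [(name, cosine_similarity(profile, features))
              for name, features in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical features scaled to [0, 1]: [size, school score, transit access]
favorites = [[0.8, 0.9, 0.2], [0.7, 0.8, 0.3]]
candidates = {
    "house_a": [0.75, 0.85, 0.25],
    "house_b": [0.2, 0.3, 0.9],
    "house_c": [0.5, 0.5, 0.5],
}

ranking = recommend(build_profile(favorites), candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

    Only the houses' own features enter the ranking here. Pooling favorites across similar buyers, as the quote describes, would be a separate step on top of this.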

  • November 13, 2018

    Topic

    Maps

    From Streetscapes by Zeit:

    Street names are stories of life. They tell us something about how the people in a given place work and live, what they believe in and their dreams. There are more than a million streets and squares in Germany. ZEIT ONLINE has compiled a database of the roughly 450,000 different names used. Some street names are used hundreds of times and others only once. But none of the names were chosen at random.

    It’s for street names in Germany, so the meaning might be lost for many of you, but much of the data comes from OpenStreetMap, which should mean something like this is doable for other cities and countries.

    See also the San Francisco history of street names mapped by Noah Veltman a few years ago. [via @maartenzam]

  • November 12, 2018

    Reading visualization research papers can often feel like a slog. As a necessity, there’s usually a lot of jargon, references to William Cleveland and Robert McGill, and sometimes perception studies that lack a bit of rigor. So for practitioners or people generally interested in data communication, worthwhile research falls into a “read later” folder never to be seen again.

    Multiple Views, started by visualization researchers Jessica Hullman, Danielle Szafir, Robert Kosara, and Enrico Bertini, aims to explain the findings and the studies to a more general audience. (The UW Interactive Data Lab’s feed comes to mind.) Maybe the “read later” becomes read.

    I’m looking forward to learning more. These projects have a tendency to start with a lot of energy and then fizzle out, so I’m hoping we can nudge this a bit to urge them on. Follow along here.

  • Members Only

    How I Made That: Animated Difference Charts in R

    A combination of a bivariate area chart, animation, and a population pyramid, with a sprinkling of detail and annotation.

  • November 9, 2018

    Charles-Joseph Minard, best known for a graphic he made (during retirement, one year before his death) showing Napoleon’s March, made many statistical graphics over his career. The Minard System from Sandra Rendgen is a collection of these works. The first section is background on Minard, his famed graphic, and his process, but really, you get it for the collection of vintage graphic goodness. [Amazon link]

  • November 8, 2018

    The Earth Puzzle by generative design studio Nervous System has no defined borders. You put it together how you want.

    Start anywhere and see where your journey takes you. This puzzle is based on an icosahedral map projection and has the topology of a sphere. This means it has no edges, no North and South, and no fixed shape. Try to get the landmasses together or see how the oceans are connected. Make your own maps of the earth!

    Get it here. There’s also one for the moon.

  • Members Only
    November 8, 2018

    Topic

    The Process

    Election night has become quite the event for newsrooms and graphics departments over the years, and the visualization production cycle has started to feel more familiar each time.

  • November 8, 2018

    Ben Schmidt uses deep scatterplots to visualize millions of data points. It’s a combination of algorithm-based display and hiding of points as you zoom in and out, like you might with an interactive map. Schmidt describes the process and made the code available on GitHub.
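
    One way to picture zoom-dependent hiding is sketched below. This is not Schmidt's implementation, just an assumed scheme for illustration: each point gets a random "level," a small sample is drawn at the widest view, and each deeper zoom level reveals more points inside the current viewport. The names `assign_levels` and `visible_points`, and the parameters `base` and `factor`, are hypothetical.

```python
import random


def assign_levels(n_points, base=100, factor=4, seed=0):
    """Assign each point the lowest zoom level at which it is drawn.

    Level 0 holds `base` randomly chosen points; each following level
    holds `factor` times as many as the one before it.
    """
    rng = random.Random(seed)
    order = list(range(n_points))
    rng.shuffle(order)

    levels = [0] * n_points
    level, capacity, start = 0, base, 0
    while start < n_points:
        for index in order[start:start + capacity]:
            levels[index] = level
        start += capacity
        level += 1
        capacity *= factor
    return levels


def visible_points(points, levels, zoom, viewport):
    """Return points inside the viewport whose level is at most `zoom`."""
    x0, y0, x1, y1 = viewport
    return [p for p, lvl in zip(points, levels)
            if lvl <= zoom and x0 <= p[0] <= x1 and y0 <= p[1] <= y1]


rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(10_000)]
levels = assign_levels(len(points))

# Zoomed out: the whole extent, but only the level-0 sample is drawn.
overview = visible_points(points, levels, zoom=0, viewport=(0, 0, 1, 1))
# Zoomed in: a quarter of the extent, with more levels revealed.
detail = visible_points(points, levels, zoom=3, viewport=(0, 0, 0.5, 0.5))
print(len(overview), len(detail))
```

    The number of points on screen stays manageable at every zoom level because a smaller viewport offsets the extra levels being revealed.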

  • November 7, 2018

    The Guardian goes with scaled, angled arrows to show the Republican and Democrat swings in these midterms for the House compared against those of 2016.

    It reminds me of the classic wind-like map by The New York Times from 2012, but the angles seem to give the differences a bit more room to breathe.

    Update: Also, see a similar map by NYT from 2016, except the arrows point the other direction.

  • November 7, 2018

    Topic

    Statistics

    Artificial intelligence, given its name, sounds like a computer learns everything on its own. However, a set of algorithms can only become useful if there’s something to learn from: data. Dave Lee for BBC reports on a company in Kenya that supplies training data for self-driving cars:

    Brenda loads up an image, and then uses the mouse to trace around just about everything. People, cars, road signs, lane markings – even the sky, specifying whether it’s cloudy or bright. Ingesting millions of these images into an artificial intelligence system means a self-driving car, to use one example, can begin to “recognise” those objects in the real world. The more data, the supposedly smarter the machine.

    On the one hand it sounds like tedious work on the cheap, but on the other it provides people with more opportunities that were previously unavailable.

  • November 6, 2018

    Data grows more intertwined with the everyday and more involved in important decisions. However, data is biased in many ways, from collection to analysis to conclusions, which is a problem when it is often intended to provide an objective point of view. In their recently released manuscript for Data Feminism, Catherine D’Ignazio and Lauren Klein discuss the importance of varied points of view:

    The double-edged sword of data shows just how important it is to understand how structures of power and privilege operate in the world. The questions we might ask about these structures can relate to issues of gender in the workplace, as in the case of Christine Darden and her wrongly delayed promotion. Or they can relate to issues of broader social inequality, as in the case of predictive policing described just above. So one thing you will notice throughout this book is that not all of our examples are about women–and deliberately so. This is because data feminism is about more than women. It’s about more than gender. Put simply: Data Feminism is a book about power in data science. Because feminism, ultimately, is about power too. It is about who has power and who doesn’t, about the consequences of those power differentials, and how those power differentials can be challenged and changed.

    In the interest of making the published work as complete as possible, D’Ignazio and Klein made the manuscript public and are ready for feedback.

  • November 6, 2018

    Topic

    News

    xkcd referenced the ever-so-loved forecasting needle. I’m so not gonna look at it this year. Maybe.

  • November 5, 2018

    A meme that cried “jobs not mobs” began modestly, but a couple of weeks later it found its way into a slogan used by the President of the United States. Keith Collins and Kevin Roose for The New York Times traced the spread of the meme through social media using a beeswarm chart. Blue represents activity on Twitter, yellow represents Facebook, and orange represents Reddit. Circles are sized by retweets, likes, and upvotes. The notes for key activities move the story forward.

  • November 5, 2018

    The Economist built an election model that treats demographic variables like blocks that output a probability of voting Republican or Democrat:

    Our model adds up the impact of each variable, like a set of building blocks. As a result, a group of weak predictors that point in the same direction can cancel out a single strong one. In theory, the model could identify a black voter as a Republican leaner, or a white evangelical as a probable Democrat—though it would require quite an unusual profile.

    Remember when most people paid little attention to midterm elections and result forecasting was not really a thing? Yeah, me neither.

    Be sure to check out the small interactive on the same page that lets you “build a voter” and get the model’s probability output. I’m a fan of the demographic-field-dropdowns-in-a-sentence format.
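
    The building-block idea in the quote can be sketched as an additive model. This is only an illustration, not The Economist's actual model: it assumes a logistic form, and the trait names and weights are invented. Each trait contributes a fixed amount to the log-odds of voting Republican, and the sum is converted to a probability.

```python
import math

# Hypothetical log-odds contributions toward voting Republican.
# Negative values lean Democrat, positive values lean Republican.
WEIGHTS = {
    "strong_dem_trait": -1.5,
    "weak_rep_trait_1": 0.6,
    "weak_rep_trait_2": 0.6,
    "weak_rep_trait_3": 0.6,
}


def probability_republican(traits, weights=WEIGHTS, intercept=0.0):
    """Add up each trait's contribution and map the total to a probability."""
    log_odds = intercept + sum(weights[trait] for trait in traits)
    return 1.0 / (1.0 + math.exp(-log_odds))


strong_only = probability_republican(["strong_dem_trait"])
with_weak = probability_republican([
    "strong_dem_trait",
    "weak_rep_trait_1",
    "weak_rep_trait_2",
    "weak_rep_trait_3",
])

print(f"strong predictor alone:    {strong_only:.2f}")
print(f"plus three weak opposites: {with_weak:.2f}")
```

    With these made-up weights, three weak predictors pointing the same way outweigh the single strong one and push the probability past 50 percent, which is the "unusual profile" scenario the quote mentions.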

  • November 5, 2018

    As the midterm elections loom, the ads focusing on key issues are running in full force. Using data from Nielsen, Bloomberg mapped the issues talked about across the country.

    Bloomberg News analyzed more than 3 million election ads for 2018 congressional and gubernatorial races to get a sense of the most commonly discussed issue in 210 local television markets, as defined by the Nielsen Company. Across the U.S., 16 different topics are mentioned more than anything else during midterm TV ads.

    The map above shows the most common per Nielsen market, but read the full article for the national breakdowns of the major issues.

    Health care has been huge in my area. For the past few weeks, every YouTube video I watch is preceded by an ad, and my mailbox keeps getting filled with ads for and against a certain proposition, often on the same day.