• My first post on FlowingData was in June 2007, and since September of that year, a post has gone up every weekday, save holidays and special occasions. I think that makes me a grandpa in internet time.

    2015 was the second full year of FlowingData as my actual job. I spent a lot of time learning new things by working with as much data as time allowed, and then tried to help others understand their data through guides and tutorials. Then there’s the four-week visualization course in R, which seems to be something many were looking for.

    The best way to learn and really understand something is to do it yourself. Not only does it help you with your own data, but it also provides another level of appreciation for the work of others, because you know the challenges and limitations.

    This is your life.

    Like last year, the most popular things on FlowingData in 2015 were my own projects, by a lot. The Data Stories folks spoke briefly about visualization blogs falling out of style this year, and it’s true to an extent. The sharing and linking aspect is on Facebook and Twitter more these days. But I can’t imagine putting your own work anywhere else besides a place that you own.

    Here are the ten most popular FlowingData projects from 2015:

    1. Years You Have Left to Live, Probably — This is when you will die.
    2. A Day in the Life of Americans — This is how America runs.
    3. Top Brewery Road Trip, Routed Algorithmically — This is going to happen.
    4. Work Counts — There are others just like you.
    5. How Americans Get to Work — Working from home is best.
    6. When Do Americans Leave For Work? — Yay, rush hour.
    7. How We Spend Our Money, a Breakdown — It depends.
    8. Reviving the Statistical Atlas of the United States with New Data — Old becomes new again.
    9. Who Earned a Higher Salary Than You? — This is all anyone really cares about.
    10. Real Chart Rules to Follow — Some rules aren’t meant to be broken.

    I made an extra effort this year, especially in the last few months, to work on interactive graphics. I used to think of interaction as a way for readers to look at data more deeply. As they say, provide an overview first and then let the reader dig into the details.

    But lately, I’ve been thinking of interaction the other way around. Let people start with specifics to “find themselves” in the data, and then, if they want, they can take a step back for the wideout view. It seems like data at the individual level provides a good mode of comparison and personal context that can sometimes be missed. I’ll have to explore more in 2016.

    Next year brings with it more interaction, animation, and simulation I am sure. Plus data. Of course. Probably more beer, too. Definitely more beer.

    I think next year might also be one where I step out of my comfort zone. Or at least I’ll try. I’m starting to feel a bit too comfortable and that leads to boredom, which leads to the dark side. I think this is where the beer comes into play.

    In any case, 2015. Another year in the bag. Thanks for reading and your support. I couldn’t do this without you.

    See you in 2016.

  • Martin Grandjean looked at the structure of Shakespeare tragedies through character interactions. Each circle (node) represents a character, and each connecting line (edge) represents two characters who appeared in the same scene.

    [T]he longest tragedy (Hamlet) is not the most structurally complex and is less dense than King Lear, Titus Andronicus or Othello. Some plays reveal clearly the groups that shape the drama: Montague and Capulets in Romeo and Juliet, Trojans and Greeks in Troilus and Cressida, the triumvirs parties and Egyptians in Antony and Cleopatra, the Volscians and the Romans in Coriolanus or the conspirators in Julius Caesar.

    At first glance, the eleven charts in total look hairball-ish. The above have similar network densities, which suggests similar story structures, but look at the ones that are more separated (lower network density) for contrast and then go from there.

    Look a bit deeper? See also Understanding Shakespeare from a few years back, which visualized word usage and structure.
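
    To make “network density” concrete, here is a minimal sketch in Python. It assumes the simple undirected definition (edges present divided by edges possible) and builds edges the way the charts are described: two characters are linked if they share a scene. The scene lists and function names are invented for illustration and are not Grandjean’s data or code.

```python
from itertools import combinations


def build_edges(scenes):
    """Return the set of undirected edges: pairs of characters sharing a scene."""
    edges = set()
    for cast in scenes:
        # Sorting makes (a, b) and (b, a) collapse to the same edge.
        for a, b in combinations(sorted(set(cast)), 2):
            edges.add((a, b))
    return edges


def network_density(scenes):
    """Density = actual edges / possible edges, for an undirected graph."""
    nodes = {name for cast in scenes for name in cast}
    n = len(nodes)
    if n < 2:
        return 0.0
    possible = n * (n - 1) / 2
    return len(build_edges(scenes)) / possible


# Hypothetical scenes, made up for this example only.
toy_scenes = [
    ["A", "B", "C"],
    ["C", "D"],
    ["D", "E"],
]
print(network_density(toy_scenes))
```

    A play where everyone eventually shares a scene with everyone else approaches a density of 1, while a play split into factions that rarely meet scores lower.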

  • In a collaboration between the Digital Scholarship Lab at the University of Richmond and Stamen Design, American Panorama combines United States history, geographic mapping, and individual narratives to create a visual atlas of history.

    They currently cover four topics — forced migration of enslaved people, overland trails, foreign-born population, and canals — with one map and chart interface for each. Each also uses a time component that lets you see changes by year. The first two are the most interesting though. They couple geographic data with personal stories that lend an important context, which tends to get lost with time.

  • There isn’t a complete government record for people killed by police, which is why efforts such as the Guardian’s The Counted project exist. Mapping Police Violence is another source to look at, and they have a dataset available for download that covers shootings from 2013 to present.

    We believe the data represented on this site is the most comprehensive accounting of people killed by police since 2013. The most liberal estimates project the total number of people killed by police in the U.S. to be about 1,200-1,300 per year. And while there are undoubtedly police killings that are not included in our database, these estimates suggest that our database captures at least 90-98 percent of all police killings that have occurred since 2013. We hope these data will be used to provide greater transparency and accountability for police departments as part of the ongoing campaign to end police violence in our communities.

    The data includes each victim’s name, location, race, agency responsible, news source, and more.

  • December 23, 2015

    Topic: Coding

    Joshua Kunst did a quick analysis of tag usage on StackOverflow, the question and answer site for programming. The R tag isn’t at the top, of course, but its usage is growing the quickest, based on slope.

    This is good news for two reasons. The first of course is that R usage is growing. The second is that we can spend less time in archived R email threads, which tend to be a bit salty. Bonus: fewer people post questions to the threads, so fewer people will feel the need to respond. Win-win. [via Revolutions]
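
    For a sense of what “based on slope” can mean, here is a minimal sketch in Python that fits a least-squares line to each tag’s usage over time and ranks tags by the fitted slope. This is an assumption about the general approach, not Kunst’s actual code, and the tag names and numbers are made up.

```python
def slope(ys):
    """Least-squares slope of ys against x = 0, 1, 2, ... (one value per period)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def rank_by_growth(series):
    """Return tag names ordered from fastest-growing to fastest-shrinking."""
    return sorted(series, key=lambda tag: slope(series[tag]), reverse=True)


# Hypothetical usage shares per period, invented for illustration.
toy_series = {
    "tag_a": [5.0, 5.0, 5.0, 5.0],  # popular but flat
    "tag_b": [1.0, 2.0, 3.0, 4.0],  # smaller but growing
    "tag_c": [4.0, 3.0, 2.0, 1.0],  # shrinking
}
print(rank_by_growth(toy_series))
```

    Note that a tag can have the steepest slope without being the most used, which is the situation described for R.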

  • These are my picks for the best of 2015. As usual, they could easily appear in a different order on a different day, and there are projects not on the list that were also excellent.

  • Jan Willem Tulp, in collaboration with Visualized, shows all known exoplanets (currently 1,942 of them) and then filters down to the ones that are habitable.

    Astrobiology uses the Goldilocks principle in the argument that a planet must neither be too far away from, nor too close to a star to support life; either extreme would result in a planet incapable of supporting life. Such a planet is colloquially called a “Goldilocks Planet”.

    Sweet wiggles and chart transitions.

  • The Online Star Register takes you through a delightful view of a million stars. You can browse and gaze at the sky, but be sure to “take a tour” via the button on the top. It starts on the ground at Earth, beams you far out, and then brings you back again.

  • Remember when xkcd charted character interactions for fictional stories? Inspired by that and the upcoming Star Wars movie, Katie Franklin, Simon Elvery and Ben Spraggon made interaction charts for every episode of the galactic space opera.

    The one above is for Return of the Jedi. The horizontal axis represents time, and each line represents a character. The vertical bars show when the corresponding characters appear together.

  • Mathematician Katie Steckles shows logical solutions to wrapping variously shaped presents.

    I could’ve used this a few days ago. At some point in the wrapping process I wondered if it’d be better if I just haphazardly threw a bunch of paper scraps onto the gift and covered it in tape.

  • I wanted to see how daily patterns emerge at the individual level and how a person’s entire day plays out. So I simulated 1,000 of them.

  • Take all the guesswork out of finding “love” on Tinder, and let the True Love Tinder Robot by Nicole He swipe for you. Sensors measure your palm sweat and the robotic hand acts accordingly.

    The True Love Tinder Robot will find you love, guaranteed. With Tinder open, you put your phone down front of the robot hand. Then you place your own human hands on the sensors. As you are looking at each Tinder profile, the robot will read your true heart’s desire through the sensors and decide whether or not you are a good match with that person based on how your body reacts. If it determines that you’re attracted to that person, it will swipe right. If not, it will swipe left. Throughout the process, it will make commentary on your involuntary decisions.

  • Enter the real world of data and statistics, and you find that files aren’t always neatly wrapped with a bow and delimited fields. Christopher Groskopf, who recently joined Quartz, provides an “exhaustive reference” to deal with the real stuff.

    Most of these problems can be solved. Some of them can’t be solved and that means you should not use the data. Others can’t be solved, but with precautions you can continue using the data. In order to allow for these ambiguities, this guide is organized by who is best equipped to solve the problem: you, your source, an expert, etc. In the description of each problem you may also find suggestions for what to do if that person can’t help you.

    The guide is aimed at journalists but easily applies to general data meanderings. I think we can all easily relate to problems such as missing data (“Where did the rest go?”), sample bias (“The population is who?”), and data in a difficult-to-manage format (“They gave you how many PDF files?”).

    Bookmark it, read it, and keep it in your digital pocket.

  • As the saying goes, “All roads lead to Rome.” Folks at the moovel lab were curious about how true this statement is, so they tested it out. They laid a grid on top of Europe, and then algorithmically found a route from each cell in the grid to Rome, resulting in about half a million routes total. Yep, there seems to be a way to Rome from every point.

    Above is the map of these routes. Road segments used more frequently were drawn thicker, and as you might expect you get what looks like a root system through the continent. I’m guessing thicker lines are highways and freeways.

    Moovel did the same with cities named Rome in the United States and the state capitals. Pretty sweet.
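
    The idea of routing every grid cell to one destination and thickening the segments used most can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the road network is a tiny made-up undirected graph, routing is plain Dijkstra by edge weight, and the function names are hypothetical. The source does not say which routing engine or data moovel used.

```python
import heapq
from collections import Counter


def shortest_path_tree(graph, target):
    """Run Dijkstra from the target; return each node's next hop toward it."""
    dist = {target: 0.0}
    next_hop = {}
    heap = [(0.0, target)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                next_hop[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    return next_hop


def segment_usage(graph, target, starts):
    """Count how many routes pass over each road segment."""
    next_hop = shortest_path_tree(graph, target)
    usage = Counter()
    for start in starts:
        node = start
        # Unreachable starts have no next hop and contribute nothing.
        while node != target and node in next_hop:
            step = next_hop[node]
            usage[frozenset((node, step))] += 1
            node = step
    return usage


# Hypothetical undirected road graph: node -> {neighbor: travel cost}.
toy_roads = {
    "rome": {"a": 1.0},
    "a": {"rome": 1.0, "b": 1.0, "c": 1.0},
    "b": {"a": 1.0},
    "c": {"a": 1.0},
}
usage = segment_usage(toy_roads, "rome", ["a", "b", "c"])
for segment, count in usage.items():
    print(sorted(segment), count)
```

    Because every route funnels toward the same destination, segments near it accumulate the highest counts, which is what produces the root-like look when counts are mapped to line width.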

  • Getting to 100 percent renewable energy seems like such a far away goal at this point in time – which is why Mark Jacobson has a plan.

    Mark Jacobson, a Stanford engineering professor, believes the world can eliminate fossil fuels and rely on 100 percent renewable energy. Following up on his state-by-state road map for the United States, he has now released data on plans for how 139 countries could wean themselves from coal, oil, natural gas, and nuclear power.

    The plan provides an energy breakdown for each country, and the National Geographic graphic shows how that compares to other countries incorporated in the plan.

    See also the state-by-state plan for the United States, which shows breakdowns in the same fashion.

  • NASA mapped the annual cycle of all plant life on the planet in this animated map.

    Satellite instruments reveal the yearly cycle of plant life on the land and in the water. On land, the images represent the density of plant growth, while in the oceans they show the chlorophyll concentration from tiny, plant-like organisms called phytoplankton. From December to February, during the northern hemisphere winter, plant life in the higher latitudes is minimal and receives little sunlight.

    See also John Nelson’s breathing earth that used satellite imagery.

  • With recent events, you’ve likely seen the articles and graphics that get into the number of mass shootings this year and further into the past. You might have noticed that the numbers seem to vary depending on where you look, and the difference likely stems from how “mass shooting” is defined by the author.

    Kevin Schaul for the Washington Post provides a straightforward interactive that uses a shootings dataset from Reddit, and shows how the count quickly changes depending on your definition.
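
    A small Python sketch shows why the totals move so much. It assumes a definition has two knobs, a minimum number of victims and whether the injured count as victims. Both knobs and all of the incident records are hypothetical, chosen for illustration and not taken from the Reddit dataset or the Post’s interactive.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    killed: int
    injured: int


def count_mass_shootings(incidents, min_victims=4, count_injured=True):
    """Count incidents that meet a given (assumed) definition of mass shooting."""
    total = 0
    for inc in incidents:
        victims = inc.killed + (inc.injured if count_injured else 0)
        if victims >= min_victims:
            total += 1
    return total


# Made-up incidents, for illustration only.
toy_incidents = [
    Incident(killed=0, injured=4),
    Incident(killed=1, injured=3),
    Incident(killed=4, injured=0),
    Incident(killed=2, injured=1),
    Incident(killed=5, injured=6),
]

print(count_mass_shootings(toy_incidents, min_victims=4, count_injured=True))
print(count_mass_shootings(toy_incidents, min_victims=4, count_injured=False))
```

    The same five records yield different totals under each definition, so it helps to check which definition a chart uses before comparing its numbers with another’s.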