There are various connections between Stephen King novels. Gillian James puts them in a flowchart.
Network diagrams are notoriously messy. Even a small number of nodes can be overwhelmed by their chaotic placement and relationships. Cody Dunne of HCIL showed off his new work in simplifying these complex structures. In essence, he aggregates leaf nodes into a fan glyph that describes the underlying data in its size, arc, and color. Span nodes are similarly captured into crescent glyphs. The result is an easy-to-read, high-level look at the network. You can easily compare different sections of the network, understand areas that may have been occluded by the lines in a traditional diagram, and see relationships far more quickly.
I love the elegance and simplicity of Cody's work. He details every step of the new layout in his paper, and it's definitely worth a read. The code for it will be pushed to NodeXL, an open-source tool for Excel, in the coming weeks.
Wow, Manuel Lima, Senior UX Designer at Bing, got through a world of information in this 11-minute RSA Animate video. He spoke about the topic for which we all know him: networks. Beginning with the tree as a symbol of relationships (e.g., Aristotle's Tree of Knowledge), Manuel quickly sweeps through many concepts through the centuries to finally land on a modern-day approach to relational information, the web or network. As trees are no longer capable of representing the complexities of the modern world, we have to find new ways to visualize these structures or perhaps even find a universal structure. His talk is loaded with beautiful examples of trees and networks.
If this fast-paced animation is above your processing capacity, you can view the more austere real-world video of Manuel instead. It has the bonus of an interesting interview with him in the last 6 minutes.
PhD student Adrien Friggeri demonstrates a new clustering algorithm with a visualization of the agreement groups within the United States Senate over time.
As you might imagine, there are two obvious groupings, Republican and Democrat. It gets interesting though when you look at Democrats classified as Republicans and vice versa. For example, the 11 Republicans placed in the Democratic group of the 110th Congress:
Most of whom are either moderates or closer to the Democrats than to their own party. Charles Hagel was critic of the Bush Administration which he described as "the lowest in capacity, in capability, in policy, in consensus — almost every area" of any presidency in the last forty years. George Voinovich has been known to oppose lowering taxes and frequently joined the Democrats on tax issues. John Warner is a moderate Republican and has centrist stances on many issues, to the point that he once faced opposition of other members of his own party when he decided to run for re-election.
Be sure to click on the gray boxes to follow the trajectories of different cohorts.
Many have been detailing the vast sums being raised by the presidential candidates and the super PACs supporting them. But where are all those millions being spent? Among other things, the answers can provide hints on potential improper coordination between campaigns and super PACs. Here are the 200 biggest recipients of spending by the major campaigns and most of the major super PACs.
It's a Sankey diagram with campaigns and super PACs on one side and payees on the other. (I rotated the image above clockwise.) Select a campaign to see what they've spent their money on, or select a payee to see who's paying them. As I browsed through payees, my next question was what these companies, organizations, and people do, since $377,222 from Obama for America to a company called PDR II DBA Share Share doesn't mean much to me. I haven't looked at FEC data in a while, but I vaguely remember a way to categorize spending.
Find more information on the making of this graphic here.
For the MIT Sloan Sports Analytics Conference a few weeks ago, Stanford biomechanical engineering student and Ayasdi analyst Muthu Alagappan presented his work on redefining basketball positions.
After studying players like LeBron James and Blake Griffin, many analysts are now suggesting that there are new positions, which are simply hybrids of the one's we already had. For example, some players are now labeled "point-forwards" or "combo-guards." But what if we were wrong about our initial five positions. Maybe a "Center" is just a label for people over a certain height, and there are actually three different types of big men in the NBA.
An analysis, done with data exploration tool Ayasdi Iris, provided 13 possible positions, as shown above. Nodes and edges are colored by points per minute on a blue (low) to red (high) scale.
So for example, those typically classified as centers or power forwards are classified as scoring rebounders, paint protectors, and scoring paint protectors. Dirk Nowitzki might be considered a scoring rebounder, whereas Joakim Noah is a paint protector.
The point? Hopefully teams can use this information to make better decisions about who to trade and draft. Of course, I'm sure scouts know about these fuzzy positions already, so I think the next step is to look at what positions the best teams have and had, and more importantly, how a "one-of-a-kind" player can change everything.
With NCAA March Madness in full swing, the basketball graphics are out in full force. This one by Angi Chau shows the probabilities of teams winning each game, and eventually the championship, based on simulated bracket rankings. Done with D3, each node represents a game and teams are circled on the outside. Roll over a team to get all the probabilities for that team going to the end, or roll over a game to see the probability of teams winning that game. Sorry, Colorado. You have a 0% chance of winning it all. You, too, Vermont.
Hopefully, Chau keeps updating throughout the tournament. And maybe some color-coding to indicate probabilities would be useful here. Now excuse me while I go place some educated bets. (One million on Colorado.)
The Iliad is an epic poem by Homer with a lot of characters and story lines going on at once. I vaguely remember reading bits and pieces in high school and getting totally lost. Santiago Ortiz explores these relationships in his latest work, which draws on the connections between characters, i.e., how often two characters appear in the same sentence.
When you use the same password for every online account, there could be trouble down the line if one of those sites is breached. You gotta mix it up these days. As part of their Watchdog initiative, Mozilla released an add-on to help you see how you're reusing passwords, and to hopefully keep your personal information secure.
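To make sentence co-occurrence concrete, here's a rough sketch in Python. It is not Ortiz's code. The sentence splitter, the plain substring matching, and the three sample sentences are all my own assumptions, just enough to show how pairs of characters get counted.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrences(text, characters):
    """Count how often each pair of characters appears in the same sentence."""
    pair_counts = Counter()
    # Naive sentence split on ., !, or ? (an assumption; real text needs more care).
    for sentence in re.split(r"[.!?]+", text):
        # Plain substring match; sorting keeps each pair in one canonical order.
        present = sorted(name for name in characters if name in sentence)
        pair_counts.update(combinations(present, 2))
    return pair_counts


# Made-up sample sentences, for illustration only.
sample = "Achilles quarrels with Agamemnon. Hector fights Achilles. Agamemnon leads the army."
pairs = cooccurrences(sample, ["Achilles", "Agamemnon", "Hector"])
print(pairs.most_common())
```

Each pair with a nonzero count becomes an edge in the network, and the count is a natural choice for edge weight.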
Ever been told not to reuse the same password across different websites? With this add-on, you can visualize your passwords and the sites you use them on. By looking at this visualization, you can get a quick idea of which passwords you've been using the most, and the kinds of sites you're using them on. As you continue to change your passwords and update your password manager, the picture will improve!
Personally, I don't save any of my passwords. The risk of my computer getting stolen and some random person gaining access to my online accounts is too much for me to handle. Of course as a result, I have to put up with the craptastic experience of trying to remember passwords with a variable number of capital letters, symbols, and digits.
While SOPA and PIPA are no laughing matter (join the strike), the reaction from those on Twitter who don't know what's going on is great entertainment. Do a search on 'wtf wikipedia' for tweets from confused individuals who are trying to find information on stuff. I'm just going to leave Twitter trackers Revisit and Spot, by Moritz Stefaner and Jeff Clark, respectively, open all day. "OMG I'm doing homework and Wikipedia is blacked out wtf !!!!!!!!!!!!!!!!!!!!!"
Twitter is an organic online location, full of retweets, conversations, and link sharing. Jeff Clark tries to show these inner workings with his newest interactive, Spot. Enter a query in the field on the bottom left, and Spot retrieves the most recent 200 tweets. You then can choose among five views: group, words, timeline, users, and source.
While we're on the topic of academic papers and how they're linked, Johan Bollen et al. used clickstream data to draw detailed maps of science, from the point of view of those actually reading the papers. That is, instead of relying on citations, they used log data on how readers request papers, in the form of a billion user interactions on various web portals.
Maps of science derived from citation data visualize the relationships among scholarly publications or disciplines. They are valuable instruments for exploring the structure and evolution of scholarly activity. Much like early world charts, these maps of science provide an overall visual perspective of science as well as a reference system that stimulates further exploration. However, these maps are also significantly biased due to the nature of the citation data from which they are derived: existing citation databases overrepresent the natural sciences; substantial delays typical of journal publication yield insights in science past, not present; and connections between scientific disciplines are tracked in a manner that ignores informal cross-fertilization.
Each circle represents a journal and edges represent connections between journals, according to Johan Bollen et al.'s clickstream model. Circles are color-coded by journal classifications from the Getty Research Institute's Art and Architecture Thesaurus.
So you have most of the engineering and physical sciences on the perimeter, medical-related areas to the left, and liberal arts in that middle cluster. Statistics is towards the top left, mixed in with demographics, philosophy, and sociology. There aren't many surprises in the clusters, but there are interesting, albeit weaker, links in the open spaces, such as religion and chemistry or music and ecology.
From Autodesk Research, Citeology is an interactive that visualizes connections in academic research via paper citations:
The names of each of the 3,502 papers published at the CHI and UIST Human Computer Interaction (HCI) conferences between 1982 and 2010 are listed by year and sorted with the most cited papers in the middle. In total, 11,699 citations were made from one article to another within this collection. These citations are represented by the curved lines in the graphic, linking each paper to those that it referenced.
The interactive responds slowly to clicks and only works in Firefox for me, but it's interesting to play around with even if you aren't familiar with CHI and HCI papers. It works better if you select one to three generations instead of all. Click on a specific paper and you get citations for that paper on the right (brown) and the papers that the selected paper cited on the left (blue).
Color-coding for categories, authors, or subject could add another level of meaning to this. For example, do we see the subject evolve? Do papers that focus on a certain subject cite outside of the main topic?
Food flavors across cultures and geography vary a lot. Some cuisines use a lot of scallion and ginger, whereas another might use a lot of onion and butter. Then again, everyone seems to use garlic. Yong-Yeol Ahn, et al. took a closer look at what makes food taste different, breaking ingredients into flavor compounds and examining what the ingredients had in common. A flavor network was the result:
Each node denotes an ingredient, the node color indicates food category, and node size reflects the ingredient prevalence in recipes. Two ingredients are connected if they share a significant number of flavor compounds, link thickness representing the number of shared compounds between the two ingredients. Adjacent links are bundled to reduce the clutter.
Mushrooms and liver are on the edges, out on their lonesome.
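The construction in that caption is easy to sketch in Python. This is not the authors' method. In particular, I've swapped their notion of a "significant number" of shared compounds for a simple threshold called `min_shared`, and the ingredient and compound names are placeholders rather than real flavor data.

```python
from itertools import combinations


def flavor_network(compounds_by_ingredient, min_shared=1):
    """Return {(ingredient_a, ingredient_b): number of shared compounds}."""
    edges = {}
    for a, b in combinations(sorted(compounds_by_ingredient), 2):
        # Set intersection gives the compounds the two ingredients have in common.
        shared = len(compounds_by_ingredient[a] & compounds_by_ingredient[b])
        if shared >= min_shared:  # stand-in for the paper's significance test
            edges[(a, b)] = shared
    return edges


# Placeholder data: "A", "B", "C" and "c1".."c9" are hypothetical labels.
toy = {
    "A": {"c1", "c2", "c3"},
    "B": {"c2", "c3", "c4"},
    "C": {"c9"},
}
print(flavor_network(toy, min_shared=2))
```

The edge value maps to link thickness in the graphic. An ingredient like "C", which shares nothing with the others, ends up with no links at all.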
[Nature | Thanks, Elise]
During the riots in London this past summer, a lot of information spread quickly about what was going on. Some of that information was true and some was not so true. The Guardian explores this spread of information on Twitter, and how fact and fiction seem to reveal themselves on their own:
A period of unrest can provoke many untruths, an analysis of 2.6 million tweets suggests. But Twitter is adept at correcting misinformation - particularly if the claim is that a tiger is on the loose in Primrose Hill.
Other rumors include when rioters cooked their own food at McDonald's (false), London Eye was set on fire (false), and Miss Selfridge was set on fire (true).
Each bubble represents a tweet and is sized by the number of followers the tweeter has. The big one is usually the original tweet and the small ones that cluster around it are retweets. The colors represent tweets that support, oppose, question, or comment. So when you play the animation for each rumor, bubbles swiftly pop up at the rumor peaks and then settle at true or false.
You can also use the scroll to move to a certain point in time, and roll over bubbles to see the tweets.
Really nice graphic and worth a look.
Hilary Mason, chief scientist at bitly, examined links to 600 science pages and the pages that people who clicked those links visited next:
The results revealed which subjects were strongly and weakly associated. Chemistry was linked to almost no other science. Biology was linked to almost all of them. Health was tied more to business than to food. But why did fashion connect strongly to physics? And why was astronomy linked to genetics?
The interactive lets you poke around the data, looking at connections sorted from weakest (fewer links) to strongest (more links), and nodes are organized such that topics with more links between each other are closer together.
Natural next step: let me click on the nodes.
As the Eurozone crisis develops, the BBC News has a look at what country owes what to whom:
Europe is struggling to find a way out of the eurozone crisis amid mounting debts, stalling growth and widespread market jitters. After Greece, Ireland, and Portugal were forced to seek bail-outs, Italy - approaching an unaffordable cost of borrowing - has been the latest focus of concern.
But, with global financial systems so interconnected, this is not just a eurozone problem and the repercussions extend beyond its borders.
Simply click on a country, whose arc length represents how much they owe, and arrows show debt.
[BBC News | Thanks, Eugene]
If you don't watch the candidate debates — and let's face it, that's just about everyone — you pretty much miss everything, except for stuff like Rick Perry forgetting agency names. Politilines, by Periscopic, lets you see what the candidates talked about each night.
The left column lists top issues, the middle shows words used, and the right column shows candidates. Roll over any word or name to see who talked about what or what was talked about by whom.
We collected transcripts from the American Presidency Project at UCSB, categorized them by hand, then ranked lemmatized word-phrases (or n-grams) by their frequency of use. Word-phrases can be made of up to five words. Our ranking algorithm accounts for things such as exclusive word-phrases - meaning, it won't count "United States" twice if it's used in a higher n-gram such as "President of the United States."
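Here's one way the exclusive counting could work, sketched in Python. To be clear, this is my guess at the idea and not Periscopic's algorithm. It skips lemmatization, takes pre-split tokens, and uses a made-up `min_count` threshold to decide which longer phrases are frequent enough to claim their words.

```python
from collections import Counter


def exclusive_ngram_counts(tokens, max_n=5, min_count=2):
    """Count phrases longest-first; shorter phrases inside a counted longer one are skipped."""
    counts = Counter()
    claimed = []  # (start, end) token spans already counted as part of a longer phrase
    for n in range(max_n, 0, -1):
        # Keep only occurrences that are not fully inside a claimed span.
        free = [
            (i, tuple(tokens[i:i + n]))
            for i in range(len(tokens) - n + 1)
            if not any(start <= i and i + n <= end for start, end in claimed)
        ]
        freq = Counter(gram for _, gram in free)
        for i, gram in free:
            if freq[gram] >= min_count:
                counts[gram] = freq[gram]
                claimed.append((i, i + n))
    return counts


# Made-up token sequence, for illustration only.
tokens = (
    "president of the united states spoke "
    "president of the united states left "
    "united states trade united states law"
).split()
for phrase, count in exclusive_ngram_counts(tokens).most_common():
    print(" ".join(phrase), count)
```

In this toy input, "united states" appears four times, but two of those sit inside "president of the united states", so it is counted only twice on its own.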
While still in beta, the mini-app is responsive and easy to use. The next challenge, I think, is to really show what everyone talked about. For example, click on education and you see Newt Gingrich, Ron Paul, and Rick Perry brought those up. Then roll over the names to see the words each candidate used related to that topic. You get some sense of content, but it's still hard to decipher what each actually said about education.
Posts and links get shared over and over again, but we usually don't know how. We get counts, but who shares what, and how far does a link reach? Google+ Ripples gives you a peek into the process. A link or status is posted, and like when a pebble is dropped in a pond, a pattern forms outwards.
In 1937, mathematician Lothar Collatz proposed that given the following algorithm, you will always end at the number 1:
- Take any natural number, n.
- If n is even, divide it by 2.
- Otherwise, n is odd. Multiply it by 3 and add 1.
- Repeat indefinitely.
Developer Jason Davies puts it into reverse and shows all the numbers that fall within an orbit length of 18 or less. Press play, and watch the graph grow. Mostly a fun animation for nerds like me.
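If you want to play along at home, here's a small Python sketch of both directions. It isn't Davies's code, and I'm assuming "orbit length" means the number of steps a number takes to reach 1, which may not match his exact definition.

```python
def collatz_orbit_length(n):
    """Number of steps for n to reach 1 under the halve-or-triple-plus-one rule."""
    if n < 1:
        raise ValueError("n must be a natural number")
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def reverse_collatz(max_length):
    """Grow the graph backward from 1; map each number found to its orbit length."""
    lengths = {1: 0}
    frontier = [1]
    for depth in range(1, max_length + 1):
        next_frontier = []
        for m in frontier:
            # 2*m always steps to m by halving.
            candidates = [2 * m]
            # (m - 1) / 3 steps to m only when it is an odd natural number.
            if m % 3 == 1 and ((m - 1) // 3) % 2 == 1:
                candidates.append((m - 1) // 3)
            for c in candidates:
                if c not in lengths:
                    lengths[c] = depth
                    next_frontier.append(c)
        frontier = next_frontier
    return lengths


graph = reverse_collatz(18)
print(len(graph), "numbers reach 1 in 18 steps or fewer")
```

Each pass of the outer loop adds one more ring of numbers, which is roughly what you watch happen when you press play.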