<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>FlowingData &#187; Data Design Tips</title>
	<atom:link href="http://flowingdata.com/category/design/feed/" rel="self" type="application/rss+xml" />
	<link>http://flowingdata.com</link>
	<description>Strength in Numbers</description>
	<lastBuildDate>Sun, 21 Mar 2010 04:35:19 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<atom:link rel="next" href="http://flowingdata.com/category/design/feed/?page=2" />

		<item>
		<title>Graphical perception &#8211; learn the fundamentals first</title>
		<link>http://flowingdata.com/2010/03/20/graphical-perception-learn-the-fundamentals-first/</link>
		<comments>http://flowingdata.com/2010/03/20/graphical-perception-learn-the-fundamentals-first/#comments</comments>
		<pubDate>Sun, 21 Mar 2010 03:40:59 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=6319</guid>
		<description><![CDATA[Before you dive into the advanced stuff - like just about everything in your life - you have to learn the fundamentals before you know when you can break the rules. <p><p>---------<br />
<a href="http://flowingprints.com/print4.php">World Progress Report</a> - 4 days left to order</p></p>
]]></description>
			<content:encoded><![CDATA[<a href="http://flowingdata.com/2010/03/20/graphical-perception-learn-the-fundamentals-first/" title="Graphical perception &#8211; learn the fundamentals first"><img src="http://flowingdata.com/wp-content/uploads/yapb_cache/perception.1krjbgaaam74k04ckkgkwg0k0.22qwr5zijcckg48go4wowg88o.th.png" width="545" height="337" alt="Graphical perception &#8211; learn the fundamentals first" ></a><p>When it comes to visualization, especially on the Web, you have to be open-minded, and you should be willing to try new things. There’s no advancing otherwise. However, before you dive into the advanced stuff - like just about everything in your life - you have to learn the fundamentals before you know when you can break the rules. </p>
<p>You have to know what flavors work together and against each other before you cook a feast fit for a king. You have to learn grammar and spelling before you can write a book that others will actually enjoy.</p>
<p>So when you’re learning to visualize data, do yourself a favor and learn the basic rules first. Then you can spend the rest of your days breaking them.</p>
<h2>What Works</h2>
<p>Luckily, researchers have already done lots of studies on what visual cues work and what sucks, so you don’t have to start from scratch. Most notable is perhaps William S. Cleveland and Robert McGill’s paper <a href="https://secure.cs.uvic.ca/twiki/pub/Research/Chisel/ComputationalAestheticsProject/cleveland.pdf">Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods</a> [pdf] from the September 1984 edition of the <em>Journal of the American Statistical Association</em>. I won’t rehash the whole paper, but the finding of most interest here is a ranked list of how well people decode visual cues. </p>
<p>When we (the designers) visualize data, we encode the quantitative information in shapes, color, position, etc. The viewers then have to decode that information. Cleveland and McGill studied what people are able to decode most accurately and ranked them in the following list.</p>
<ol>
<li>Position along a common scale e.g. <a href="http://www.noc.soton.ac.uk/animate/data/cis/cis8_oxygen_scatter.png">scatter plot</a></li>
<li>Position on identical but nonaligned scales e.g. <a href="http://images.businessweek.com/ss/08/12/1211_numbers/2.htm">stacked bars</a></li>
<li>Length e.g. <a href="http://flowingdata.com/2009/07/02/whos-going-to-win-nathans-hot-dog-eating-contest/">bar chart</a></li>
<li>Angle & Slope (tie) e.g. <a href="http://flowingdata.com/2008/06/09/what-do-you-use-to-analyze-andor-visualize-data-poll-results/">pie chart</a></li>
<li>Area e.g. <a href="http://flowingdata.com/2007/10/02/americans-prefer-watered-down-beer/">bubbles</a></li>
<li>Volume, density, and color saturation (tie) e.g. <a href="http://flowingdata.com/2010/01/21/how-to-make-a-heatmap-a-quick-and-easy-solution/">heatmap</a></li>
<li>Color hue e.g. <a href="http://newsmap.jp/">newsmap</a></li>
</ol>
<p>I’d say that’s what we’d expect nowadays, right? However, that angle and slope ranking might be a bit of a shock to some of you, given all the pie chart hate we see.</p>
<p>In fact, the decoding error for all encoding types isn’t wildly bad:</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/03/errors.png" alt="" title="errors" width="476" height="450" class="alignnone size-full wp-image-6320" /></p>
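<p>If you'd rather see that ranking as data, here's a minimal Python sketch. The cue names and the <code>rank_cues</code> helper are made up for illustration (they're not from the paper), and tied cues simply share a rank.</p>

```python
# Ranks follow the ordered list above (1 = decoded most accurately).
# Tied cues share a rank. The key names are illustrative, not from the paper.
CUE_RANK = {
    "position_common_scale": 1,
    "position_nonaligned_scales": 2,
    "length": 3,
    "angle": 4,
    "slope": 4,
    "area": 5,
    "volume": 6,
    "density": 6,
    "color_saturation": 6,
    "color_hue": 7,
}


def rank_cues(candidates):
    """Sort candidate cues from most to least accurately decoded."""
    unknown = [cue for cue in candidates if cue not in CUE_RANK]
    if unknown:
        raise ValueError(f"unknown visual cue(s): {unknown}")
    # sorted() is stable, so tied cues keep the order they were given in.
    return sorted(candidates, key=CUE_RANK.get)
```

<p>Calling <code>rank_cues(["color_hue", "area", "length"])</code> gives back length, then area, then color hue. Treat the result as a starting point, not a rule.</p>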
<h2>A Framework</h2>
<p>Now before you go shunning everything that isn't in the top three, keep in mind this list isn’t meant to be a definitive answer on what to use and what not to in your data graphics. Cleveland and McGill note, “The ordering does not result in a precise prescription for displaying data but rather is a framework within which to work.” </p>
<p>That sounds like an invitation to break some “rules” if you ask me. We might even be able to do certain things that push those error bars further left. That's for another post though. The keyword is <em>framework</em>. Start with the visual fundamentals along with other important stuff, like context, the audience, and what you're trying to accomplish, and you'll be in good shape.</p>
<p>From there, you'll learn from experience how to get fancy with your visualizations - just like how sentence fragments can be effective sometimes or how sometimes salt and fruit go well together. Sometimes area charts are a good choice.</p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2010/03/20/graphical-perception-learn-the-fundamentals-first/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Think like a statistician &#8211; without the math</title>
		<link>http://flowingdata.com/2010/03/04/think-like-a-statistician-without-the-math/</link>
		<comments>http://flowingdata.com/2010/03/04/think-like-a-statistician-without-the-math/#comments</comments>
		<pubDate>Thu, 04 Mar 2010 08:39:48 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Featured]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=5645</guid>
		<description><![CDATA[I call myself a statistician, because, well, I'm a statistics graduate student. However, the most important things I've learned are less formal, but have proven extremely useful when working/playing with data.
]]></description>
			<content:encoded><![CDATA[<a href="http://flowingdata.com/2010/03/04/think-like-a-statistician-without-the-math/" title="Think like a statistician &#8211; without the math"><img src="http://flowingdata.com/wp-content/uploads/yapb_cache/201844037_7dbd27025f_o.6t6zk8iv54g8sk048k8k0wgk4.22qwr5zijcckg48go4wowg88o.th.png" width="545" height="378" alt="Think like a statistician &#8211; without the math" ></a><p>I call myself a statistician, because, well, I'm a statistics graduate student. However, ask me specific questions about hypothesis tests or required sampling size, and my answer probably won't be very good. </p>
<p>The other day I was trying to think of the last time I did an actual hypothesis test or formal analysis. I couldn't remember. I actually had to dig up old course listings to figure out when it was. It was four years ago during my first year of graduate school. I did well in those courses, and I'm confident I could do that stuff with a quick refresher, but it's a no go off the cuff. It's just not something I do regularly.</p>
<p>Instead, the most important things I've learned are less formal, but have proven extremely useful when working/playing with data. Here they are in no particular order.</p>
<h2>Attention to Detail</h2>
<p>Oftentimes it's the little things that end up being the most important. There was this one time in class when my professor put up a graph on the projector. It was a bunch of data points with a smooth fitted line. He asked what we saw. Well, there was an increase in the beginning, a leveling off in the middle, and then another increase. However, what I missed was the little blip in the curve in the first increase. That was what we were after.</p>
<p>The point is that trends and patterns are important, but so are outliers, missing data points, and inconsistencies. </p>
<h2>See the Big Picture</h2>
<p>With that said, it's important not to get too caught up with individual data points or a tiny section in a really big dataset. We saw this in the recent <a href="http://flowingdata.com/2010/02/17/road-to-recovery-is-the-recovery-act-working/">recovery graph</a>. As some pointed out, if we take a step back and look at a larger time frame, the Obama/Bush contrast doesn't look so shocking.</p>
<h2>No Agendas</h2>
<p>This should go without saying, but approach data as objectively as possible. I'm not saying you shouldn't have a hunch about what you're looking for, but don't let your preconceived ideas influence the results. Because if you go to great lengths looking for some specific pattern, you're probably going to find it. It'll just come at the expense of accurate results.</p>
<h2>Look Outside the Data</h2>
<p>Context, context, context. Sometimes this will come in the form of metadata. Other times it'll come from more data.</p>
<p>The more you know about how the data was collected, where it came from, when it happened, and what was going on at the time, the more informative your results and the more confident you can be about your findings.</p>
<h2>Ask Why</h2>
<p>Finally, and this is the most important thing I've learned, always ask why. When you see a blip in a graph, you should wonder why it's there. If you find some correlation, you should think about whether or not it makes any sense. If it does make sense, then cool, but if not, dig deeper. Numbers are great, but you have to remember that when humans are involved, errors are always a possibility.</p>
<p><small><em>*Photo by <a href="http://www.flickr.com/photos/maisonbisson/201844037/">misterbisson</a></em></small></p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2010/03/04/think-like-a-statistician-without-the-math/feed/</wfw:commentRss>
		<slash:comments>52</slash:comments>
		</item>
		<item>
		<title>11 Ways to Visualize Changes Over Time &#8211; A Guide</title>
		<link>http://flowingdata.com/2010/01/07/11-ways-to-visualize-changes-over-time-a-guide/</link>
		<comments>http://flowingdata.com/2010/01/07/11-ways-to-visualize-changes-over-time-a-guide/#comments</comments>
		<pubDate>Thu, 07 Jan 2010 08:44:24 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>
		<category><![CDATA[Featured]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=4627</guid>
		<description><![CDATA[Deal with data? No doubt you've come across the time-based variety. This is a guide to help you figure out what type of visualization to use to see that stuff.
]]></description>
			<content:encoded><![CDATA[<p>Deal with data? No doubt you've come across the time-based variety. The visualization you use to explore and display that data changes depending on what you're after and data types. Maybe you're looking for increases and decreases, or maybe seasonal patterns. </p>
<p>This is a guide to help you figure out what type of visualization to use to see that stuff.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/line.png" alt="" title="line graph" width="545" height="137" class="alignnone size-full wp-image-4657" /></p>
<p>Let's start with the basics: the line graph. This will work for most of your time series data. Use it when you have a lot of points or just a few. Place multiple time series on one graph or place one. Mark the data points with squares, circles, or none at all. Basically, if you're not sure what to use, the line graph will usually do the trick.</p>
<p><strong>An example:</strong> <a href="http://flowingdata.com/2008/02/11/comparing-roger-clemens-to-hall-of-fame-pitchers/">Comparing Roger Clemens to Hall of Fame Pitchers</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/scatter.png" alt="" title="scatter" width="545" height="137" class="alignnone size-full wp-image-4658" /></p>
<p>Scatterplots work well if you have a lot of data points. Because the dots are small, they don't work well if you only have a few points. Scatterplots also work well when your measurements aren't nicely structured. For example, if your measurements aren't equally spaced, a line graph probably wouldn't work.</p>
<p><strong>An example:</strong> <a href="http://www.noc.soton.ac.uk/animate/data/cis/cis8_oxygen_scatter.png">Oxygen Concentration Over Time</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/bar.png" alt="" title="bar" width="545" height="137" class="alignnone size-full wp-image-4670" /></p>
<p>Bar charts work best for time series when you're dealing with distinct points in time (as opposed to more continuous data). They tend to work better when you have data points that are evenly spaced in time.</p>
<p><strong>An example:</strong> <a href="http://flowingdata.com/2009/07/02/whos-going-to-win-nathans-hot-dog-eating-contest/">Who’s Going to Win Nathan’s Hot Dog Eating Contest?</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/stacked-bar.png" alt="" title="stacked bar" width="545" height="137" class="alignnone size-full wp-image-4659" /></p>
<p>Use this the same way you would a bar chart when you have multiple categories (hence the stacking). Stacking implies that the sum of the parts means something. Don't stack if the parts don't go together though.</p>
<p><strong>An example:</strong> <a href="http://images.businessweek.com/ss/08/12/1211_numbers/2.htm">Bad Housing Loans in Foreclosure</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/11/stacked-series.png" alt="" title="stacked-series" width="545" height="137" class="alignnone size-full wp-image-3951" /></p>
<p>The stacked area is the stacked bar's more versatile sibling. Use this if you've got a lot of data points in time and there isn't enough room for a bunch of bars. </p>
<p><strong>An example:</strong> <a href="http://flowingdata.com/2009/12/02/past-15-years-of-consumer-spending/">Past 25 Years of Consumer Spending</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/bubble.png" alt="" title="bubble" width="545" height="137" class="alignnone size-full wp-image-4654" /></p>
<p>The bubble plot is like a scatterplot, but instead of small dots, you size circles by some other metric. This way you can show two measurements at once over time. Hans Rosling's <a href="http://flowingdata.com/2007/07/06/hans-rosling-providing-data-inspiring-change/">TED talks</a> made this visualization method especially popular in the past couple of years.</p>
<p><strong>An example:</strong> <a href="http://graphs.gapminder.org/world/#$majorMode=chart$is;shi=t;ly=2003;lb=f;il=t;fs=11;al=30;stl=t;st=t;nsl=t;se=t$wst;tts=C$ts;sp=6;ti=2007$zpv;v=0$inc_x;mmid=XCOORDS;iid=phAwcNAVuyj1jiMAkmq1iMg;by=ind$inc_y;mmid=YCOORDS;iid=phAwcNAVuyj2tPLxKvvnNPA;by=ind$inc_s;uniValue=8.21;iid=phAwcNAVuyj0XOoBL_n5tAQ;by=ind$inc_c;uniValue=255;gid=CATID0;by=grp$map_x;scale=log;dataMin=194;dataMax=96846$map_y;scale=lin;dataMin=23;dataMax=86$map_s;sma=49;smi=2.65$cd;bd=0$inds=">Income per Person and GDP by Gapminder</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/color.png" alt="" title="color" width="545" height="137" class="alignnone size-full wp-image-4655" /></p>
<p>Color as a way to show changes tends to be underutilized. It's easier to see differences in height than it is to see differences in shades of gray, but if you're limited by space or need to show a lot at once, color can be a good solution. The main challenges with color, which should play a role in the design process, are choosing a color scale and accommodating the small portion of the population that is colorblind.</p>
<p><strong>An example:</strong> <a href="http://flowingdata.com/2009/09/10/3-in-depth-views-of-flight-delays-and-cancellations/">Congestion in the Sky</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/timeline.png" alt="" title="timeline" width="545" height="137" class="alignnone size-full wp-image-4660" /></p>
<p>Timelines work for events, i.e. when you're most interested in time of occurrence. While they don't work well if you have a lot of data, you can combine the timeline with any of the above to pretty good effect. </p>
<p><strong>An example:</strong> <a href="http://flowingdata.com/2008/03/14/10-largest-data-breaches-since-2000-millions-affected/">10 Largest Data Breaches Since 2000 – Millions Affected</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/everything.png" alt="" title="everything" width="545" height="137" class="alignnone size-full wp-image-4656" /></p>
<p>Again, like the <a href="http://flowingdata.com/2009/11/25/9-ways-to-visualize-proportions-a-guide/">guide to proportions</a>, showing every single data point can work well when you're interested in the details of every event. This obviously takes up a lot of space, but is sometimes effective when you need to <a href="http://flowingdata.com/2008/10/10/great-data-visualization-tells-a-great-story/">humanize</a> the data.</p>
<p><strong>An example:</strong> <a href="http://flowingdata.com/2009/11/11/the-pitching-dominance-of-mariano-rivera/">The Pitching Dominance of Mariano Rivera</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2010/01/animation.png" alt="" title="animation" width="545" height="137" class="alignnone size-full wp-image-4653" /></p>
<p>Animation opens up a whole other can of worms, and it can be tricky if you don't know what you're doing. It can, however, work really well if you do know what you're doing. With animation, you can basically take any static graphic, create one for every point in time, and then string them together like a video.</p>
<p><strong>An example:</strong> <a href="http://flowingdata.com/2009/09/22/watch-the-giants-of-finance-shrink-then-grow/">Watch the Giants of Finance Shrink… Then Grow</a></p>
<p>Finally, if all else fails, you can always show your data in a basic table. If there aren't that many data points, a table usually works just fine. Many of the above options will also fit together nicely.</p>
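<p>To pull the rules of thumb above together, here's a rough Python sketch of how you might pick a starting chart. Everything in it is an assumption for illustration: the <code>suggest_time_chart</code> function, its parameters, and especially the cutoff for what counts as "a lot" of points. It only covers a handful of the options, so take it as a nudge, not an answer.</p>

```python
# Hypothetical cutoff for "a lot of data points"; tune it for your own charts.
MANY_POINTS = 50


def suggest_time_chart(n_points, evenly_spaced=True, discrete=False,
                       n_categories=1, parts_sum_to_whole=False):
    """Suggest a starting chart type for time-based data."""
    many = n_points >= MANY_POINTS
    # Only stack when the categories add up to something meaningful.
    if n_categories > 1 and parts_sum_to_whole:
        return "stacked area" if many else "stacked bar"
    # Unevenly spaced measurements are a better fit for dots than a line.
    if not evenly_spaced:
        return "scatterplot"
    # Distinct, evenly spaced points in time suit bars.
    if discrete:
        return "bar chart"
    # When in doubt, the line graph usually does the trick.
    return "line graph"
```

<p>For example, <code>suggest_time_chart(12, discrete=True)</code> suggests a bar chart, while <code>suggest_time_chart(500, evenly_spaced=False)</code> suggests a scatterplot.</p>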
<p>So what visualization methods did I miss? Help us be more smarterer in the comments below.</p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2010/01/07/11-ways-to-visualize-changes-over-time-a-guide/feed/</wfw:commentRss>
		<slash:comments>32</slash:comments>
		</item>
		<item>
		<title>9 Ways to Visualize Proportions &#8211; A Guide</title>
		<link>http://flowingdata.com/2009/11/25/9-ways-to-visualize-proportions-a-guide/</link>
		<comments>http://flowingdata.com/2009/11/25/9-ways-to-visualize-proportions-a-guide/#comments</comments>
		<pubDate>Wed, 25 Nov 2009 08:57:30 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>
		<category><![CDATA[Featured]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=3935</guid>
		<description><![CDATA[With all the visualization options out there, it can be hard to figure out what graph or chart suits your data best. This is a guide to make your decision easier for one particular type of data: proportions. 
Maybe you want to show poll results or the types of crime over time, or maybe you're [...]
]]></description>
			<content:encoded><![CDATA[<p>With all the visualization options out there, it can be hard to figure out what graph or chart suits your data best. This is a guide to make your decision easier for one particular type of data: <em>proportions</em>. </p>
<p>Maybe you want to show poll results or the types of crime over time, or maybe you're interested in a single percentage. Here's how you can show it. </p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/11/pie.png" alt="pie" title="pie" width="545" height="137" class="alignnone size-full wp-image-3974" /></p>
<p>We all know about the pie chart. The circle represents the whole, and the size of each wedge represents a percentage of that whole. Together, those values add up to 100 percent. Use this only if you're comparing a few values (like three or fewer) or, if you're like me, use it for a ton of categories to annoy the BI people every now and then. </p>
<p><strong>See the pie in action:</strong> <a href="http://flowingdata.com/2008/06/09/what-do-you-use-to-analyze-andor-visualize-data-poll-results/">What Do You Use to Analyze and/or Visualize Data?</a></p>
<p><img class="alignnone size-full wp-image-3947" title="donut" src="http://flowingdata.com/wp-content/uploads/2009/11/donut.png" alt="donut" width="545" height="137" /></p>
<p>Oh yes, it's pie's lesser-used cousin, the donut. It's the same idea as the pie, but with a hole cut out in the middle. The same arguments of angles and human perception still apply (probably more so). I personally don't remember using the donut ever, and can't think of why I ever would. But I suppose it has its uses somewhere.</p>
<p><strong>See the donut in action:</strong> <a href="http://flowingdata.com/2009/08/06/what-britain-has-eaten-the-past-three-decades/">What Britain Has Eaten the Past Three Decades</a></p>
<p><img class="alignnone size-full wp-image-3951" title="stacked-series" src="http://flowingdata.com/wp-content/uploads/2009/11/stacked-series.png" alt="stacked-series" width="545" height="137" /></p>
<p>Use the stacked area chart if you want to show changes over time for several variables. You can use it for percentages, where the vertical always adds up to 100 percent, or you can use raw counts if you're more interested in the peaks and valleys.</p>
<p><strong>See the stacked area in action:</strong> <a href="http://www.babynamewizard.com/voyager">(Baby) NameVoyager</a></p>
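<p>If you go the percentage route, the arithmetic is just each count divided by the total at that point in time. Here's a minimal Python sketch; the function names and the dictionary layout are assumptions for illustration, so adapt them to however your data is stored.</p>

```python
def to_percentages(counts):
    """Convert raw counts for one point in time into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute proportions when the total is zero")
    return {name: 100.0 * value / total for name, value in counts.items()}


def normalize_series(series):
    """Apply to_percentages to every point in time, so each stack sums to 100."""
    return [to_percentages(counts) for counts in series]
```

<p>So <code>to_percentages({"pie": 1, "cake": 3})</code> gives 25.0 for pie and 75.0 for cake. Skip the normalizing if you care more about the peaks and valleys in the raw counts.</p>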
<p><img class="alignnone size-full wp-image-3950" title="stacked-bar" src="http://flowingdata.com/wp-content/uploads/2009/11/stacked-bar.png" alt="stacked-bar" width="545" height="137" /></p>
<p>If you have only a few distinct points in time, you can use the stacked bar chart in the same way you use the stacked area (just set the bars vertical). You can also use it as you would a pie chart, and it's usually a better option because it avoids the angle perception problem.</p>
<p><strong>See the stacked bar in action:</strong> <a href="http://www.nytimes.com/ref/us/polls_index.html">New York Times Poll Watch</a></p>
<p><img class="alignnone size-full wp-image-3952" title="tree" src="http://flowingdata.com/wp-content/uploads/2009/11/tree.png" alt="tree" width="545" height="137" /></p>
<p>Now we're getting into more advanced stuff. The treemap uses the areas of rectangles to show relative proportions. It works especially well if your data has a hierarchical structure with parent nodes, children, etc.</p>
<p><strong>See the treemap in action:</strong> <a href="http://newsmap.jp/">The Google Newsmap</a></p>
<p><img class="alignnone size-full wp-image-3953" title="voronoi" src="http://flowingdata.com/wp-content/uploads/2009/11/voronoi.png" alt="voronoi" width="545" height="137" /></p>
<p>Again we're using area to visualize magnitude, except instead of rectangles or wedges, a Voronoi diagram uses polygons. The argument for the Voronoi is that its algorithm is more robust and sidesteps some of the problems that come with being restricted to rectangles.</p>
<p><strong>See the Voronoi in action:</strong> <a href="http://flowingdata.com/2008/05/05/american-consumers-spend-more-money-on-cheese-than-on-computers/">American Consumers Spend More Money On Cheese than On Computers</a></p>
<p><img class="alignnone size-full wp-image-3963" title="nightingale" src="http://flowingdata.com/wp-content/uploads/2009/11/nightingale1.png" alt="nightingale" width="545" height="137" /></p>
<p>The Nightingale rose graph (or the polar area diagram if you like), named after its creator, Florence Nightingale, is like a combination of the stacked bar and pie chart. The length of the radius is used to indicate one thing, usually a count, and polar area represents a portion of the whole.</p>
<p><strong>See the Nightingale in action:</strong> <a href="http://en.wikipedia.org/wiki/File:Nightingale-mortality.jpg">The original mortality chart from Florence Nightingale</a></p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/11/itembyitem.png" alt="itembyitem" title="itembyitem" width="545" height="137" class="alignnone size-full wp-image-3948" /></p>
<p>Designers like this one a lot when they want to focus on a single data point. Statisticians not so much. It takes up a lot of space, but sometimes puts things in better perspective. Basically instead of showing each data point, you're showing every individual count within a data point.</p>
<p><strong>See the everything in action:</strong> <a href="http://flowingprints.com/print2.php">The Cost of Higher Education</a></p>
<p>There are plenty of other methods to visualize proportions, but all others that come to mind are variants of the above. Of course, if none of these do it for you, you can always turn to your standard table. Forget the pictures or visualization, and just let the numbers do the talking.</p>
<p>Did I miss anything? Do you know of any other good examples that visualize proportions? Let us know in the comments below.</p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2009/11/25/9-ways-to-visualize-proportions-a-guide/feed/</wfw:commentRss>
		<slash:comments>35</slash:comments>
		</item>
		<item>
		<title>Chart Junk vs. Eye Candy: What&#8217;s the Difference?</title>
		<link>http://flowingdata.com/2009/09/25/chart-junk-vs-eye-candy-whats-the-difference/</link>
		<comments>http://flowingdata.com/2009/09/25/chart-junk-vs-eye-candy-whats-the-difference/#comments</comments>
		<pubDate>Fri, 25 Sep 2009 07:19:58 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>
		<category><![CDATA[Featured]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=3023</guid>
		<description><![CDATA[
Photo by horizontal.integration
There's this one phrase that really bothers me when it comes to data graphics. No doubt you've heard it or read it, and maybe it even popped into your head once or twice.
The phrase I'm talking about is: "Edward Tufte is crying."
People like to say this when they see a graphic that doesn't [...]
]]></description>
			<content:encoded><![CDATA[<div class="img-right"><img src="http://flowingdata.com/wp-content/uploads/2009/09/lollipop.jpg" alt="lollipop" title="lollipop" width="240" height="160" class="alignnone size-full wp-image-3024" /><br />
<small><em>Photo by <a href="http://www.flickr.com/photos/ebolasmallpox/">horizontal.integration</a></em></small></div>
<p>There's this one phrase that really bothers me when it comes to data graphics. No doubt you've heard it or read it, and maybe it even popped into your head once or twice.</p>
<p>The phrase I'm talking about is: "Edward Tufte is crying."</p>
<p>People like to say this when they see a graphic that doesn't fit the ET law of high <a href="http://www.infovis-wiki.net/index.php/Data-Ink_Ratio">data/ink ratio</a>. Then after the commenter has declared that ET is in fact a very emotional man, the graphic is classified "chart junk."</p>
<p>First off, I'm pretty sure ET isn't that melodramatic. He doesn't cry over a bad graph, nor does he die a little inside, and he wouldn't roll over in his grave if he were dead. I don't think an angel gets its wings every time he rings a bell either. Although I could be wrong about the latter.</p>
<p>Second, not everything that fails to fit the mold of a traditional graph, visualization, or whatever you want to call it, is chart junk. One person's chart junk is another person's eye candy. What you see just depends on what angle you're looking at it from.</p>
<h2>Eye Candy</h2>
<p>Generally speaking, eye candy is a visual treat. It doesn't have to do with data, but for the purpose of this post, let's pretend it does.</p>
<p>Now for me, eye candy can be a well-designed traditional chart or infographic, or it can be more abstract. It's essentially anything that stimulates my brain in a positive way. It might be because of some really impressive design or it could be about careful analysis. Or it could be both. Maybe it's data art or maybe it's visualization.</p>
<p>For example, a lot of <a href="http://flowingdata.com/2009/06/03/good-magazines-infographics-now-archived-on-flickr/">transparencies</a> from GOOD magazine are misclassified as chart junk, but a lot of the graphics are not meant to be read as traditional charts. They're some blend between visualization and data art, or I guess <em>information aesthetics</em> if you want to give it a name.</p>
<p>Sure, a lot of the data that GOOD makes visual could quickly be visualized as a bar graph or time series plot, but they're going for something less mechanical. They're trying to (artistically) express a story in the data.</p>
<p>I'm not saying that every transparency is spot on, but I think it's a lot more than some give the designers behind the graphics credit for.</p>
<p>At the other end of the eye candy spectrum are graphics from The New York Times, but I don't have to argue much for them since they're universally loved, right?</p>
<p>In the end, both GOOD and NYT are showing truth. It's just that one is more opinionated while the other is about getting just the facts out.</p>
<h2>Chart Junk</h2>
<p>With that said, there is plenty of chart junk out there. I point it out sometimes, but I usually leave that job to <a href="http://junkcharts.typepad.com/">Kaiser</a>.</p>
<p>I haven't picked up a Tufte book in a while, but I think of chart junk as the stuff on graphs that are supposed to be just the facts. It's the stuff that obscures the data, misleads readers, or makes graphs hard to read, e.g. mislabeled axes, retina-burning colors, or gratuitous use of the third dimension. It comes from poor design or a misunderstanding of the data.</p>
<p>Chart junk also usually finds its way onto graphs that are trying to be "visually stimulating." You'll find these in a lot of PowerPoint presentations where someone graphed some data using the program's defaults and then smacked on some weird clip art to make it um, cool, I guess?</p>
<p>Background images on graphs are also pretty ridiculous. Rarely do they work, so if you're unsure, it's best to just leave those out.</p>
<h2>The Difference</h2>
<p>Alright, so I've provided some examples, but what's the difference? Well, to be honest, there's no clear-cut line between chart junk and eye candy. It has something to do with beauty in the eye of the beholder.</p>
<p>But if I were to take a stab, I'd say the main difference between the two is that chart junk comes out of carelessness or perhaps simply a lack of experience, while eye candy comes out of careful thought and an abundance of experience.</p>
<p>What do you think – is there a difference between the two?</p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2009/09/25/chart-junk-vs-eye-candy-whats-the-difference/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>What Visualization Tool/Software Should You Use? &#8211; Getting Started</title>
		<link>http://flowingdata.com/2009/09/03/what-visualization-toolsoftware-should-you-use-getting-started/</link>
		<comments>http://flowingdata.com/2009/09/03/what-visualization-toolsoftware-should-you-use-getting-started/#comments</comments>
		<pubDate>Thu, 03 Sep 2009 07:07:27 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>
		<category><![CDATA[Featured]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=2551</guid>
		<description><![CDATA[Are you looking to get into data visualization, but don't quite know where to begin? 
With all of the available tools to help you visualize data, it can be confusing where to start. The good news is, well, that there are a lot of (free) available tools out there to help you get started. It's [...]
]]></description>
			<content:encoded><![CDATA[<p><img src="http://flowingdata.com/wp-content/uploads/2009/08/tool-250x375.png" alt="tool" title="tool" width="250" height="375" class="alignnone size-thumbnail wp-image-2612 img-right" />Are you looking to get into data visualization, but don't quite know where to begin? </p>
<p>With all of the <a href="http://flowingdata.com/2008/10/20/40-essential-tools-and-resources-to-visualize-data/">available tools</a> to help you visualize data, it can be confusing where to start. The good news is, well, that there are a lot of (free) available tools out there to help you get started. It's just a matter of deciding which one suits you best. This is a guide to help you figure that out.</p>
<p>But before we get into what you should use, a couple of questions.</p>
<h2>What data are you looking at?</h2>
<p>Hopefully you already have a dataset that you're interested in. If not, go find one. It's important to have actual data when you're learning, because the visualization tool that you use will depend on it.</p>
<p>There are lots of places on the Web to find data. Here are a few worth checking out:</p>
<ul>
<li><a href="http://infochimps.org/">Infochimps</a></li>
<li><a href="http://www.freebase.com/">Freebase</a></li>
<li><a href="http://many-eyes.com">Many Eyes</a></li>
</ul>
<p>The above is a very small subset of what's available. Oh, and let's not forget all the government organizations that have departments dedicated to putting together datasets. Just pick one you're interested in.</p>
<p>Got your data? Ok, good, on to the next step.</p>
<h2>What's the purpose of your visualization?</h2>
<p>The next step is to figure out what you're trying to do with your visualization. Are you working on a Web application that has some graphs? Is it an interactive tool? Do you want to use better-looking graphs in your slide presentation? Is the visualization for a publication? Do you just need it for analysis?</p>
<p>Again, what you decide here will affect what tool you should use.</p>
<h2>What Visualization Software to Use</h2>
<p>Now that you have the answers to those two questions in mind, we can make a decision on what will work best for you.</p>
<h3>For Publication</h3>
<p>This means graphics like what you see in the newspaper. Most people use <a href="http://www.amazon.com/gp/product/B001EUDJWQ?ie=UTF8&tag=flowingdata-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=B001EUDJWQ">Adobe Illustrator</a>. It gives you control over all the elements in your graphic - color, stroke, font, orientation, etc. </p>
<p>If you want to do something more complicated than your traditional graphs, you can design it by hand in Illustrator or you can do it in <a href="http://www.r-project.org/">R</a> (either programmatically or with one of the add-on libraries), which is a software environment for statistical computing and graphics. From R, you can save your graphic as a PDF and open it in Illustrator. That's usually what I do.</p>
<p>Illustrator is kind of pricey, however. Some have suggested using the open-source alternative <a href="http://www.inkscape.org/">Inkscape</a>. I've never tried it though.</p>
<p><strong>Example:</strong> <a href="http://nytimes.com">The New York Times</a></p>
<h3>For Presentations</h3>
<p>Many want to add some spice to their presentations. You can use the same software as the above, but there's also not much harm in using Microsoft Excel despite the stigma. The key here is not to use the default settings. You can actually do a lot in Microsoft Excel and make it look good. Plus, you don't need to include many details in a graphic made for presentation slides, because people can't see them from far away.</p>
<p>Personally, I don't use it much for graphics since I'm comfortable with R and Illustrator.</p>
<h3>For Analysis</h3>
<p>There are a lot of analysis tools, and the preferred one will change depending on who you ask. I use R, which requires some programming skills. Most people use Excel. I've also heard a lot of good things about <a href="http://tableausoftware.com/flowingdata">Tableau Software</a>.</p>
<h3>For Web Applications</h3>
<p>I'm going to assume you have a programming background if you're looking to do visualization for a Web application. If you don't know anything about computer code, you can try Many Eyes or Fusion Charts. You'll be limited to their offerings though.</p>
<p>Now, if you're developing for the Web, there are two main options here. The first is <a href="http://processing.org">Processing</a>, which was designed to make coding easier and to give you more bang for the buck. Check out the site and Processing forums for plenty of tutorials and tips. The end result is a Java applet.</p>
<p>The second, more popular option is Flash. You can either do stuff in the actual Flash program, or you can use ActionScript for a pure coding solution. Either way, the end result is something that runs in the Flash environment. The <a href="http://flare.prefuse.org/">Flare visualization toolkit</a> is a good place to start.</p>
<p>The upside of Flash is that it tends to load faster than Java, and more people have Flash than Java installed on their computer. You might also be able to get away with just a little bit of code if you use just Flash, although, if you really want to get serious with visualization, you'll need to learn ActionScript.</p>
<p>To that end, Processing is a lot easier to learn coding-wise. Plus it's free and open source.</p>
<p><strong>Examples:</strong> <a href="http://many-eyes.com">Many Eyes</a>, <a href="http://rescuetime.com">Rescue Time</a></p>
<h3>For Art</h3>
<p>Processing definitely seems to be the software of choice for artists and designers. Again, it goes back to how easy it is to learn and how much you can do with it. Illustrator is the most common choice for non-interactive graphics since it gives you drag-and-drop control over all the elements.</p>
<p><strong>Example:</strong> <a href="http://processing.org/exhibition/">Processing Gallery</a></p>
<h2>What Software Do You Use?</h2>
<p>This is obviously a small subset of what's available. Ultimately, visualization is not just about using one piece of software, but having a full toolbox at your disposal. </p>
<p>Here's <a href="http://flowingdata.com/2008/10/20/40-essential-tools-and-resources-to-visualize-data/">a list</a> of all the programs, tools, and resources I frequently use. What do you use?</p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2009/09/03/what-visualization-toolsoftware-should-you-use-getting-started/feed/</wfw:commentRss>
		<slash:comments>55</slash:comments>
		</item>
		<item>
		<title>Important Data &#8211; Please Act Responsibly</title>
		<link>http://flowingdata.com/2009/07/20/important-data-please-act-responsibily/</link>
		<comments>http://flowingdata.com/2009/07/20/important-data-please-act-responsibily/#comments</comments>
		<pubDate>Mon, 20 Jul 2009 07:41:14 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=2223</guid>
		<description><![CDATA[
Photo by nyki_m

Data visualization and infographics come in many forms. Some are comical and purely made for entertainment. Others are made for decisions, and important decisions at that. Let's focus on the latter right now. 
To make educated decisions based on graphics, you need accurate ones, and to make accurate graphics, you need a full [...]
]]></description>
			<content:encoded><![CDATA[<div class="img-right"><img src="http://flowingdata.com/wp-content/uploads/2009/07/drunk.jpg" alt="drunk" title="drunk" width="240" height="165" class="alignnone size-full wp-image-2250" /><br />
<em><small>Photo by <a href="http://www.flickr.com/photos/nyki_m/">nyki_m</a></small></em>
</div>
<p>Data visualization and infographics come in many forms. Some are comical and purely made for entertainment. Others are made for decisions, and important decisions at that. Let's focus on the latter right now. </p>
<p>To make educated decisions based on graphics, you need accurate ones, and to make accurate graphics, you need a full understanding of the data.</p>
<p>If you don't know about the data - the context of where it came from or how it was collected - your visualization or infographic is simply a data comic that could potentially misinform its readers.</p>
<h3>An Example</h3>
<p>You've probably seen Al Gore's documentary on global warming, <a href="http://www.amazon.com/gp/product/B000ICL3KG?ie=UTF8&tag=flowingdata-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=B000ICL3KG">An Inconvenient Truth</a><img src="http://www.assoc-amazon.com/e/ir?t=flowingdata-20&l=as2&o=1&a=B000ICL3KG" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />. Do carbon emissions from human actions have an effect on global temperature? There's a lot of scientific evidence that points to yes, but there's serious debate in Australia right now, led by Australian senator Steven Fielding, against that argument.</p>
<p>Fielding <a href="http://www.thepunch.com.au/articles/the-real-reason-ill-fight-in-the-senate-on-climate-change/">argues</a> that the data show no evidence that human-made carbon dioxide emissions have an effect on global temperature, and he's flaunting this graph as his case in point:</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/07/The_global_temperature_chart-545x409.jpg" alt="Global Temperature Chart" title="Global Temperature Chart" width="545" height="409" class="alignnone size-medium wp-image-2240" /></p>
<p>We see surface air temperature and carbon dioxide concentration from 1995 to present. Carbon dioxide doesn't seem to be matching up with temperature. What's going on? Fielding has met with several major climate organizations asking that very question.</p>
<p>Graham Dawson does a good job of <a href="http://www.gpdawson.com/blog/2009/07/14/fielding-offside-on-climate-change/">summarizing</a> the governmental responses. In short, global temperature is only one measurement of climate change. The environmental models for global climate are of course far more complex and account for, e.g., the ocean and atmosphere. I mean, we're talking about an entire planet here.</p>
<p>However, despite the responses from high-up organizations, Fielding plays off complexity as ambiguity and denial, and he continues to dwell on the single graph as the tell-all.</p>
<p><a href="http://www.perceptualedge.com/blog/">Stephen Few</a> interprets Fielding's stance on global warming:</p>
<blockquote><p>This is a case of someone who listens only to what he wants to hear (the arguments of a few fringe organizations with agendas) and either ignores or is incapable of understanding the overwhelming weight of scientific evidence. He selected a tiny piece of data (a short period of time, with only one of many measures of temperature), misinterpreted it, and ignored the vast collection of data that contradicts his position. This fellow is either incredibly stupid or a very bad man.</p></blockquote>
<p>Now, I'm not going to pretend that I know all there is to know about global warming, but simply by reading Fielding's <a href="http://www.thepunch.com.au/articles/the-real-reason-ill-fight-in-the-senate-on-climate-change/">statement</a>, you certainly do get the feeling that someone is a bit deluded.</p>
<p>The problem is that many people believe Fielding whole-heartedly and are basing their decision on a single graph that tells an incomplete story. Where's the responsibility?</p>
<h3>Know Thy Data</h3>
<p>The lesson here isn't about global warming. It's that you shouldn't take data lightly. When you're dealing with data, you have to look past the numbers. </p>
<p>We've been taught that numbers mean hard facts. Numbers don't lie. But they can. People do it every day, unintentionally and oftentimes on purpose to serve an agenda. Don't be one of those people or let one of them fool you. </p>
<p>As Steve Duenes, graphics director of The New York Times, puts it:</p>
<blockquote><p>The graphic's mission is determined by the data in the same way that story is written based on information the reporter has gathered... If you don't find interesting or complete information, no amount of design virtuosity will make up for that.
</p></blockquote>
<p>Always question the data. Design around the data instead of shaping data to a design. Your visualization will be the better for it and so will your readers.</p>
<p><strong>Sources:</strong> <a href="http://www.gpdawson.com/blog/2009/07/14/fielding-offside-on-climate-change/">Graham Dawson</a>, <a href="http://www.thepunch.com.au/articles/the-real-reason-ill-fight-in-the-senate-on-climate-change/">The Punch</a>, <a href="http://www.roughstockstudios.com/howmagazine.html">Information Overload</a></p>
<p>[Thanks <a href="http://data.timgraham.net">Tim</a> & <a href="http://www.perceptualedge.com/blog/">Stephen</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2009/07/20/important-data-please-act-responsibily/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>6 Easy Steps to Make Your Graph (Really) Ugly</title>
		<link>http://flowingdata.com/2009/06/15/6-easy-steps-to-make-your-graph-really-ugly/</link>
		<comments>http://flowingdata.com/2009/06/15/6-easy-steps-to-make-your-graph-really-ugly/#comments</comments>
		<pubDate>Mon, 15 Jun 2009 07:27:55 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=1706</guid>
		<description><![CDATA[We spend so much time trying to make our graphs accurate, simple, understandable, etc that we forget the lost art of making graphs that are inaccurate, unreadable, make absolutely no sense, and make your eyes want to vomit. I'm so tired of understanding data. I want to experience it, and I know you want to [...]
]]></description>
			<content:encoded><![CDATA[<p>We spend so much time trying to make our graphs accurate, simple, understandable, etc. that we forget the lost art of making graphs that are inaccurate, unreadable, make absolutely no sense, and make your eyes want to vomit. I'm so tired of understanding data. I want to experience it, and I know you want to also. </p>
<p>So this one's for you, crappy graph. </p>
<p>We'll start with the graph below from a <a href="http://flowingdata.com/2009/05/22/poll-results-what-data-related-area-are-you-most-interested-in/">poll</a> a few weeks ago:</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/05/poll-results-fields1.gif" alt="What Data-related Area Are You Most Interested In?" title="What Data-related Area Are You Most Interested In?" width="545" height="369" class="alignleft size-full wp-image-1569" /></p>
<p>It's perfectly fine, but there's just one problem: you can read it, and when you're trying to do ugly, readability is a no-no. With ultimate confusion in mind, let's move on to our six easy steps to ultimate ugly. </p>
<h3>1. Use Lots of Pretty Colors</h3>
<p>The most obvious problem with the original graph is that it's not nearly colorful enough. All the bars are the same color with black labels. Seriously. What's that about? We need to make this graph sing with bright colors, and lots of them. In fact, the more your graph looks like the technicolor dreamcoat, the better.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/06/poll-results-uglified-1.gif" alt="poll-results-uglified-1" title="poll-results-uglified-1" width="545" height="369" class="alignleft size-full wp-image-1711" /></p>
<p>Much better. It's looking good already.</p>
<p>Please note the insignificance of color choices. Give as little thought to what colors you use as possible. When you try too hard, the reader will feel like he has to try hard to understand what's going on and therefore ignore the graph completely.</p>
<h3>2. Make it Unitless</h3>
<p>I specified percentage and votes in the original graph, which is just plain stupid. Units clutter and are utterly useless. Readers are smart enough to figure that stuff out by themselves. Remember you want to make the reader think for a very long time. It should take hours before someone knows what your graph means.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/06/poll-results-uglified-2.gif" alt="poll-results-uglified-2" title="poll-results-uglified-2" width="545" height="369" class="alignleft size-full wp-image-1712" /></p>
<p>Ah, bright and ambiguous, just the way I like it. I am sure you agree.</p>
<h3>3. Forget About Sorting</h3>
<p>Forget about sorting values numerically or alphabetically. This goes back to the previous lesson. You want to make the reader think. It's all about time consumption, confusion, and deep pondering.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/06/poll-results-uglified-3.gif" alt="poll-results-uglified-3" title="poll-results-uglified-3" width="545" height="369" class="alignleft size-full wp-image-1713" /></p>
<p>That's better. Now it looks more human, i.e. random. Graphs that look like a computer made them are dry, sterile, and boooorrring.</p>
<h3>4. Spell Incorrectly</h3>
<p>Typos are a perfect opportunity to make sure your readers are paying attention. Try using numbers in place of letters or removing words completely to keep things interesting.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/06/poll-results-uglified-4.gif" alt="poll-results-uglified-4" title="poll-results-uglified-4" width="545" height="350" class="alignleft size-full wp-image-1714" /></p>
<p>There's a reason why YouTube is so successful - the lively and insightful, almost illegible discussion of all the videos. U WNT ur c0pee to l00K just LIKE C0mm3ntz 0N U2be vidyos.</p>
<h3>5. Use Decorative Pictures</h3>
<p>So far this graph looks a lot like, well, a graph. Graphs bore people, and if you want to get your point across, you've got to make it look like art or a comic even. Add decorative shapes and pictures to really make your "graph" pop.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/06/poll-results-uglified-5.gif" alt="poll-results-uglified-5" title="poll-results-uglified-5" width="545" height="355" class="alignleft size-full wp-image-1715" /></p>
<p>That's more like it. Now we're getting somewhere. Is it a graph? Is it a comic? Is it art? I don't know. Maybe all the above. One thing's for sure though. It's darn ugly, which is exactly what we want.</p>
<p>There's just one more step to make your graph pure ugly!</p>
<h3>6. Arrange Everything Uniquely</h3>
<p>Now is your chance to catch people off guard. Arrange everything so that your graph looks like none of the other graphs out there. Make your readers look for the title, and don't make it too easy to glance over.</p>
<p>Oh, and let's not forget Comic Sans font. Don't you just love it? Smart and sophisticated with a touch of humor. Graphs are fun, so make your graph look the part.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/06/poll-results-uglified-6.gif" alt="poll-results-uglified-6" title="poll-results-uglified-6" width="545" height="439" class="alignleft size-full wp-image-1716" /></p>
<p>There you go, ladies and gentlemen. There are your six easy steps to making your very own ugly graphs. Go out and amaze your co-workers and wow the rest of the world who don't know what they're doing.</p>
<h3>One more thing...</h3>
<p>In case you're oblivious to sarcasm, the above six steps are a guide for what not to do, so for all that is good and pure please don't actually follow them. If you want a proper guide, just follow this in reverse and do the opposite of each step.</p>
<p>There is a takeaway here though. When learning to visualize data, it's just as important to learn what not to do as it is to learn what is right. You can read about "the rules" in a book all you want, but if you really want to learn, like all things, you have to practice.</p>
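<p>If you want somewhere to start practicing, here is a minimal Python sketch of doing the opposite of steps 1 through 3: one symbol for every bar, explicit units, and values sorted from largest to smallest. The poll numbers are made up for illustration (they are not the real poll results), and the function name <code>render_bar_chart</code> is just a hypothetical helper, so treat this as a rough starting point and not the way the original graph was made.</p>

```python
# Hypothetical poll results (made-up numbers, not the real FlowingData poll).
poll_votes = {
    "Statistics": 45,
    "Databases": 15,
    "Visualization": 120,
    "Data mining": 30,
}


def render_bar_chart(votes, width=40):
    """Return a plain-text bar chart that is sorted, labeled with units,
    and drawn with a single symbol for every bar."""
    total = sum(votes.values())
    largest = max(votes.values())
    label_width = max(len(label) for label in votes)
    lines = ["Poll results (votes and share of total)"]
    # Sort from most to least votes so the ranking is readable at a glance.
    ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
    for label, count in ranked:
        # Scale each bar relative to the largest value.
        bar = "#" * round(width * count / largest)
        percent = 100 * count / total
        # State the units (votes and percent) instead of leaving bare numbers.
        lines.append(f"{label.ljust(label_width)} | {bar} {count} votes ({percent:.0f}%)")
    return "\n".join(lines)


print(render_bar_chart(poll_votes))
```

<p>The chart is plain text on purpose. There is nothing to decorate, so the only decisions left are the ones that affect readability: order, units, and a title that says what you are looking at.</p>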
<p>So what did I miss? What other graphing pitfalls are out there? How can you make this graph even uglier and more illegible?</p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2009/06/15/6-easy-steps-to-make-your-graph-really-ugly/feed/</wfw:commentRss>
		<slash:comments>49</slash:comments>
		</item>
		<item>
		<title>Rise of the Data Scientist</title>
		<link>http://flowingdata.com/2009/06/04/rise-of-the-data-scientist/</link>
		<comments>http://flowingdata.com/2009/06/04/rise-of-the-data-scientist/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 07:03:39 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=1585</guid>
		<description><![CDATA[
Photo by majamarko

As we've all read by now, Google's chief economist Hal Varian commented in January that the next sexy job in the next 10 years would be statisticians. Obviously, I whole-heartedly agree. Heck, I'd go a step further and say they're sexy now - mentally and physically. 
However, if you went on to read [...]
]]></description>
			<content:encoded><![CDATA[<div class='img-right'><img src="http://flowingdata.com/wp-content/uploads/2009/06/butterfly_m.jpg" alt="" title="butterfly_m" width="240" height="160" class="alignnone size-full wp-image-1591" /><br />
<em><small>Photo by <a href="http://www.flickr.com/photos/majamarko/">majamarko</a></small></em>
</div>
<p>As we've all read by now, Google's chief economist Hal Varian <a href="http://flowingdata.com/2009/02/25/googles-chief-economist-hal-varian-on-statistics-and-data/">commented</a> in January that the next sexy job in the next 10 years would be statisticians. Obviously, I whole-heartedly <a href="http://www.stat.ucla.edu/~nyau/">agree</a>. Heck, I'd go a step further and say they're sexy now - mentally <em>and</em> physically. </p>
<p>However, if you went on to read the rest of Varian's interview, you'd know that by <em>statisticians</em>, he actually meant it as a general title for someone who is able to extract information from large datasets and then present something of use to non-data experts.</p>
<h3>Sexy Skills of Data Geeks</h3>
<p>As a follow-up to Varian's now-popular quote among data fans, Michael Driscoll of Dataspora discusses the <a href="http://dataspora.com/blog/sexy-data-geeks/">three sexy skills of data geeks</a>. I won't rehash the post, but here are the three skills that Michael highlights:</p>
<ol>
<li>Statistics - traditional analysis you're used to thinking about</li>
<li>Data Munging - parsing, scraping, and formatting data</li>
<li>Visualization - graphs, tools, etc.</li>
</ol>
<h3>Oh, but there's more...</h3>
<p>These skills actually fit tightly with Ben Fry's dissertation on <a href="http://benfry.com/phd/">Computational Information Design</a> (2004). However, Fry takes it a step further and argues for an entirely new field that combines the skills and talents from often disjoint areas of expertise:</p>
<div style="border-top:1px dotted #CCC; border-bottom:1px dotted #CCC; margin-bottom: 20px;padding:20px 0;"><img src="http://flowingdata.com/wp-content/uploads/2009/06/all-fields.png" alt="" title="all-fields" class="alignnone size-full wp-image-1589" /></div>
<ol>
<li><strong>Computer Science</strong> - acquire and parse data</li>
<li><strong>Mathematics, Statistics, & Data Mining</strong> - filter and mine</li>
<li><strong>Graphic Design</strong> - represent and refine</li>
<li><strong>Infovis and Human-Computer Interaction (HCI)</strong> - interaction</li>
</ol>
<p>And after two years of highlighting visualization on FlowingData, it seems collaborations between the fields are growing more common, but more importantly, computational information design edges closer to reality. We're seeing <em>data scientists</em> - people who can do it all - emerge from the rest of the pack.</p>
<h3>Advantages of the Data Scientist</h3>
<p>Think about all the visualization stuff you've been most impressed with or the groups that always seem to put out the best work. Martin Wattenberg. Stamen Design. Jonathan Harris. Golan Levin. Sep Kamvar. Why is their work always of such high quality? Because they're not just students of computer science, math, statistics, or graphic design. </p>
<p>They have a combination of skills that not only makes independent work easier and quicker but also makes collaboration more exciting and opens up possibilities in what can be done. Oftentimes, visualization projects are disjoint processes and involve a lot of waiting. Maybe a statistician is waiting for data from a computer scientist; or a graphic designer is waiting for results from an analyst; or an HCI specialist is waiting for layouts from a graphic designer.</p>
<p>Let's say you have several data scientists working together though. There's going to be less waiting, and the communication gaps between the fields are narrowed.</p>
<p>How often have we seen a visualization tool that held an excellent concept and looked great on paper but lacked the touch of HCI, which made it hard to use and in turn no one gave it a chance? How many important (and interesting) analyses have we missed because certain ideas could not be communicated clearly? The data scientist can solve your troubles.</p>
<h3>An Application</h3>
<p>This need for data scientists is quite evident in business applications where educated decisions need to be made swiftly. A delayed decision could mean lost opportunity and profit. Terabytes of data are coming in whether it be from websites or from sales across the country, but in an area where Excel is the tool of choice (or force), there are limitations, hence all the tools, applications, and consultancies to help out. This of course applies to areas outside of business as well.</p>
<h3>Learn and Prosper</h3>
<p>Even if you're not into visualization, you're going to need at least a subset of the skills that Fry highlights if you want to seriously mess with data. Statisticians should know APIs, databases, and how to scrape data; designers should learn to do things programmatically; and computer scientists should know how to analyze and find meaning in data. </p>
<p>Basically, the more you learn, the more you can do, and the higher in demand you will be as the amount of data grows and the more people want to make use of it.</p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2009/06/04/rise-of-the-data-scientist/feed/</wfw:commentRss>
		<slash:comments>44</slash:comments>
		</item>
		<item>
		<title>Visual Representation of Tabular Information &#8211; How to Fix the Uncommunicative Table</title>
		<link>http://flowingdata.com/2009/04/21/visual-representation-of-tabular-information-how-to-fix-the-uncommunicative-table/</link>
		<comments>http://flowingdata.com/2009/04/21/visual-representation-of-tabular-information-how-to-fix-the-uncommunicative-table/#comments</comments>
		<pubDate>Tue, 21 Apr 2009 07:47:46 +0000</pubDate>
		<dc:creator>Nathan</dc:creator>
				<category><![CDATA[Data Design Tips]]></category>
		<category><![CDATA[Network Visualization]]></category>

		<guid isPermaLink="false">http://flowingdata.com/?p=1497</guid>
		<description><![CDATA[<a href="http://flowingdata.com/2009/04/21/visual-representation-of-tabular-information-how-to-fix-the-uncommunicative-table/" title="Visual Representation of Tabular Information &#8211; How to Fix the Uncommunicative Table"><img src="http://flowingdata.com/wp-content/uploads/yapb_cache/picture_12.a6y85ntf8s0sosok0wk4s4k0.22qwr5zijcckg48go4wowg88o.th.png" width="545" height="278" alt="Visual Representation of Tabular Information &#8211; How to Fix the Uncommunicative Table" ></a>This is a guest post by Martin Krzywinski who develops Circos, a GPL-licensed (free) visualization tool that can help you show relationships in data. This article is based on a longer writeup which you can find here.
Suppose that you are reading an article and the text refers you to a table on the next page. [...]
]]></description>
			<content:encoded><![CDATA[<a href="http://flowingdata.com/2009/04/21/visual-representation-of-tabular-information-how-to-fix-the-uncommunicative-table/" title="Visual Representation of Tabular Information &#8211; How to Fix the Uncommunicative Table"><img src="http://flowingdata.com/wp-content/uploads/yapb_cache/picture_12.a6y85ntf8s0sosok0wk4s4k0.22qwr5zijcckg48go4wowg88o.th.png" width="545" height="278" alt="Visual Representation of Tabular Information &#8211; How to Fix the Uncommunicative Table" ></a><p><em>This is a guest post by Martin Krzywinski who develops <a href="http://mkweb.bcgsc.ca/circos">Circos</a>, a GPL-licensed (free) visualization tool that can help you show relationships in data. This article is based on a longer writeup which you can find <a href="http://mkweb.bcgsc.ca/circos/?Visualizing_Tabular_Data">here</a>.</em></p>
<p>Suppose that you are reading an article and the text refers you to a table on the next page. Before you turn the page, what are your expectations of the table? Chances are, you would like it to communicate trends and patterns. Chances are, too, that it will fail and simply deliver numerical minutiae. You are left hunting around the numbers for a while, only to return to the text in hopes that the table's data trends will be communicated elsewhere.</p>
<p>Imagine if, instead, the table were replaced by a visual representation that was agnostic to the data domain, sufficiently quantitative to identify patterns and descriptive statistics, and made no assumptions about the kind of patterns that might exist. In this article, I outline one such representation.</p>
<h2>Tables are Visual Obstacles</h2>
<p>As the saying goes - it's not the table, it's you. We are notoriously bad at evaluating quantitative information when it is presented in its raw numerical form. We reach our limit in the ability to glean trends from a table very quickly. Consider the five tables below - the 1x1 table is trivial to interpret and the 5x5 table impossible. Somewhere in between is where you reach numerical overload.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-01.png" alt="" title="fig-01"  class="alignnone size-full wp-image-1498" /></p>
<p>Unfortunately, most published tables are larger than these examples. Due to their size, many fail to effectively communicate their information. They provide the numerical minutiae from which visual representations can be generated, but on their own they make opaque any patterns that might arise in such representations.</p>
<h2>An Uninterpretable Table</h2>
<p>Even prestigious journals are not exempt from poorly communicated data. Frequently it is not an issue of poor communication, as much as no communication. The reader is left frustrated, without a sense of what is important in the data and which differences are meaningful.</p>
<p>Consider the table below (Horvath, J. E. et al. Development and application of a phylogenomic toolkit: resolving the evolutionary history of Madagascar's lemurs. Genome Res 18, 489-99 (2008)), which suffers from two extremes of the same problem: an inappropriate amount of information.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-02.png" alt="" title="fig-02"  class="alignnone size-full wp-image-1499" /></p>
<p>On the left half of the first table there is nearly no information - almost all values are 1.0. On the other hand, the right half of the table is packed so tightly with numbers as to make them visually unparsable. The second table is even worse, suffering not only from information overload but also from poor layout and inconsistent precision (e.g. 7 (4.74-9.24)).</p>
<p>Poorly designed tables can suffer from visual noise (lots of ink, but no information), obscured statistics (descriptive statistics are hidden in numbers), unparsable content (too much information), misguided sightlines (poor row and column spacing), and burden of significance (reported precision is much higher than required for visual inspection). Such tables do not help the reader understand the scale and tolerance inherent in the data; they leave the reader to fend for themselves against a deluge of numbers.</p>
<h2>Visualization of Tabular Data</h2>
<p>The method presented here provides an alternative to mitigate the problems outlined above. It is a visual approach that uses <a href="http://mkweb.bcgsc.ca/circos">Circos</a> to represent rows and columns in a circular fashion, and ribbons to represent cell values. Does it solve every table's problems? No. It does provide, however, a way to capture the essence of the table and present it quantitatively and attractively.</p>
<p>In this approach, relationships between data elements (e.g. a row and a column) are encoded by ribbons that join segments that correspond to these elements.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-03.png" alt="" title="fig-03"  class="alignnone size-full wp-image-1500" /></p>
<p>The ribbons can have different end thicknesses to represent a ratio between the elements. By coloring the ribbons (and/or adding transparency), such as shown below, the representation can focus on the flow of information in a particular direction (e.g. from A (left), or to A (right)).</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-04.png" alt="" title="fig-04"  class="alignnone size-full wp-image-1501" /></p>
<p>In practice, a visualization of a table based on this scheme might look like the figure below. Whether to normalize the segments to equal size depends on whether absolute or relative relationships are more important.</p>
<p><a href='http://flowingdata.com/wp-content/uploads/2009/04/fig-05-large.png'><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-05-large-545x384.png" alt="" title="fig-05" width="545" height="384" class="alignnone size-medium wp-image-1502" /></a></p>
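<p>The choice between absolute and normalized segments can be sketched in a few lines of Python. This is only an illustration, not how Circos itself is implemented: it assumes a square table whose rows and columns share the same labels, and it assumes that a segment's absolute size is its row total plus its column total. The function names and the small table are made up for the example.</p>

```python
def segment_totals(labels, table):
    """Return {label: (row_total, column_total)} for a square table."""
    totals = {}
    for i, label in enumerate(labels):
        row_total = sum(table[i])
        column_total = sum(row[i] for row in table)
        totals[label] = (row_total, column_total)
    return totals


def segment_fractions(labels, table, normalize=False):
    """Return the fraction of the circle given to each segment.

    normalize=False: size follows row total + column total (absolute).
    normalize=True:  every segment gets the same share (relative).
    """
    if normalize:
        return {label: 1 / len(labels) for label in labels}
    totals = segment_totals(labels, table)
    grand_total = sum(r + c for r, c in totals.values())
    return {label: (r + c) / grand_total for label, (r, c) in totals.items()}


# Made-up 3x3 table, used only to exercise the functions.
labels = ["A", "B", "C"]
table = [
    [10, 20, 30],
    [5, 5, 10],
    [0, 10, 10],
]

print(segment_fractions(labels, table))
print(segment_fractions(labels, table, normalize=True))
```

<p>With absolute sizes, a segment with large totals dominates the circle; with normalization, every segment is equally wide and only the relative composition of its ribbons remains visible.</p>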
<h2>Practical Example - Preference for Hair Color in Relationships</h2>
<p>To illustrate this visual approach with a small data set, consider how one could visualize dating preference for hair color. You might have information about the relationship history of a large number of individuals and want to visualize the probabilities of transitions between hair colors in successive relationships.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-06.png" alt="" title="fig-06"  class="alignnone size-full wp-image-1503" /></p>
<p>The data might look like this, where each cell represents the number of cases in which someone moved from a partner with one hair color (row) to another (column). For example, 2,868 individuals dated someone with red hair right after someone with black hair.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/04/table.png" alt="" title="table"  class="alignnone size-full wp-image-1504" /></p>
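<p>A table of this kind could be built from raw relationship histories roughly as follows. This is a minimal sketch: the three histories are invented for illustration and are not the data behind the table above, and the function names are hypothetical.</p>

```python
from collections import Counter


def transition_counts(histories):
    """Count moves from one partner's hair color (row) to the next (column)."""
    counts = Counter()
    for history in histories:
        # Pair each partner with the one that followed.
        for previous, current in zip(history, history[1:]):
            counts[(previous, current)] += 1
    return counts


def transition_probabilities(counts):
    """Divide each cell by its row total to get transition probabilities."""
    row_totals = Counter()
    for (previous, _), n in counts.items():
        row_totals[previous] += n
    return {pair: n / row_totals[pair[0]] for pair, n in counts.items()}


# Invented histories: each list is one person's successive partners.
histories = [
    ["black", "red", "blonde"],
    ["black", "red"],
    ["black", "brown"],
]

counts = transition_counts(histories)
print(counts[("black", "red")])
print(transition_probabilities(counts))
```

<p>The counts correspond to the cells of the table, and dividing by row totals gives the probability of moving to each hair color given the previous one.</p>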
<p>These data are synthetic (drawn from my own stereotypes) and visually represented in the image below.</p>
<p><a href='http://flowingdata.com/wp-content/uploads/2009/04/fig-07-large.png'><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-07-large-545x387.png" alt="" title="fig-07" width="545" height="387" class="alignnone size-medium wp-image-1505" /></a></p>
<p>Several trends, not immediately discernible from the table, are made clear in the figure. Moreover, given that we can simultaneously process more visual details than numerical ones, this image can communicate many patterns at the same time and therefore enhance both interpretation and retention of information.</p>
<h2>Practical Example - Reactivity of Chemical Elements in Minerals</h2>
<p>The hair color data set was both small and synthetic. Let's turn to something much more complicated to see how a visual representation can help avoid visual burden.</p>
<p>For this example, I used a <a href="http://un2sg4.unige.ch/athena/mineral/minppmi.html">database of mineral formulae</a> to extract all pairwise element ratios from each mineral. For example, Zabuyelite is Li2CO3 and would therefore contribute +2 (Li,C), +2 (Li,O), +1 (C,Li), +1 (C,O), +3 (O,Li), +3 (O,C). The resulting table was a <a href="http://mkweb.bcgsc.ca/circos/export/mineral-element-ratio-table.txt">77 x 77 matrix</a> of ratios of elements.</p>
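<p>The counting rule in the Zabuyelite example can be written as a short Python sketch. It is an assumption about how the extraction might be done, not the script actually used for the article: the parser handles only flat formulae such as Li2CO3 (no parentheses, hydrates or charges), and the function names are hypothetical.</p>

```python
import re
from collections import Counter

# An element symbol is an uppercase letter plus an optional lowercase
# letter, followed by an optional count (a missing count means 1).
ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")


def atom_counts(formula):
    """Return the number of atoms of each element in a flat formula."""
    counts = Counter()
    for symbol, digits in ELEMENT.findall(formula):
        counts[symbol] += int(digits) if digits else 1
    return counts


def pair_contributions(formula):
    """For each ordered pair (a, b), contribute the atom count of a."""
    counts = atom_counts(formula)
    return {(a, b): counts[a] for a in counts for b in counts if a != b}


def ratio_table(formulas):
    """Sum the pair contributions over many minerals."""
    table = Counter()
    for formula in formulas:
        table.update(pair_contributions(formula))
    return table


print(pair_contributions("Li2CO3"))
print(ratio_table(["Li2CO3", "CaCO3"])[("O", "C")])
```

<p>For Li2CO3 this reproduces the six contributions listed above. Summing over every mineral in the database would give one cell per ordered pair of elements.</p>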
<p>To start, I condensed the table by combining elements of the same classification (e.g. alkali metal, transition, etc). In the table below, the counts are in units of 1,000.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-08.png" alt="" title="fig-08"  class="alignnone size-full wp-image-1506" /></p>
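<p>Condensing an element-level table into classifications amounts to summing the cells whose row and column fall into the same pair of groups. A minimal sketch, assuming the table is stored as a dictionary keyed by (row, column) and using made-up values and a hypothetical <code>group_of</code> lookup:</p>

```python
from collections import defaultdict


def condense(table, group_of):
    """Sum cells of an element-level table into classification-level cells."""
    condensed = defaultdict(int)
    for (row, column), value in table.items():
        condensed[(group_of[row], group_of[column])] += value
    return dict(condensed)


# Made-up cell values, used only to show the aggregation.
table = {
    ("Li", "O"): 2,
    ("Na", "O"): 4,
    ("O", "Li"): 3,
    ("Fe", "O"): 5,
}
group_of = {
    "Li": "alkali metal",
    "Na": "alkali metal",
    "Fe": "transition metal",
    "O": "nonmetal",
}

print(condense(table, group_of))
```

<p>Here the Li and Na rows collapse into a single "alkali metal" row, so their values against oxygen are added together.</p>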
<p>The image of the table below presents the trends in the data well. By keeping the segment size for each classification in absolute units, the representation also communicates information about abundance. Relative tick marks (every 10%) on each segment, however, make it possible to quickly evaluate the extent of each ribbon's contribution to its segments.</p>
<p><a href='http://flowingdata.com/wp-content/uploads/2009/04/fig-09-large.png'><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-09-large-545x449.png" alt="" title="fig-09" width="545" height="449" class="alignnone size-medium wp-image-1507" /></a></p>
<p>By greying out ribbons that provide minor contribution, and varying the amount of opacity as a function of percentile rank for the remaining ribbons, major patterns can be accentuated (image below, left). Alternatively, ribbons' percentile rank can be mapped onto a rainbow color palette (image below, right).</p>
<p><a href='http://flowingdata.com/wp-content/uploads/2009/04/fig-10-large.png'><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-10-large-545x290.png" alt="" title="fig-10" width="545" height="290" class="alignnone size-medium wp-image-1508" /></a></p>
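<p>One way to turn percentile rank into a ribbon style is sketched below. The cutoff of 50 and the minimum opacity of 0.2 are arbitrary assumptions chosen for illustration, not the values used for the figures, and the function names are hypothetical.</p>

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Percent of values that are less than or equal to each value."""
    ordered = sorted(values)
    return [100 * bisect_right(ordered, v) / len(ordered) for v in values]


def ribbon_styles(values, cutoff=50, min_opacity=0.2):
    """Return (is_grey, opacity) for each ribbon value.

    Ribbons ranked below the cutoff are greyed out; the rest get an
    opacity that rises linearly from min_opacity at the cutoff to 1.0
    at the 100th percentile. The cutoff must be below 100.
    """
    styles = []
    for rank in percentile_ranks(values):
        if rank >= cutoff:
            scale = (rank - cutoff) / (100 - cutoff)
            styles.append((False, min_opacity + (1 - min_opacity) * scale))
        else:
            styles.append((True, min_opacity))
    return styles


# Four made-up ribbon values with percentile ranks 25, 50, 75 and 100.
print(ribbon_styles([10, 20, 30, 40]))
```

<p>The same ranks could instead index into a color palette, which is the alternative treatment described above.</p>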
<p>Now what happens when the data for individual elements are drawn? It is no surprise that the result is a very complicated image.</p>
<p><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-11-large-545x327.png" alt="" title="fig-11" width="545" height="327" class="alignnone size-medium wp-image-1509" /></p>
<p>However, even at this level of detail, the image is visually parsable. First, relative sizes of ribbons quickly indicate which segments provide the majority of contribution to the table. The thin ribbons, which correspond to small values in the table, do not distract the eye to the same extent as a sea of small numbers in a table.</p>
<p>Oxygen's abundance in minerals is reflected in the fact that its segment occupies half of the figure. To explore how oxygen combines with elements as a function of their abundance, the image below shows all segments normalized to equal size (except oxygen, which is shown at 20x) and uses color to focus on pairings between oxygen and other elements.</p>
<p><a href='http://flowingdata.com/wp-content/uploads/2009/04/fig-12-large.png'><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-12-large-545x560.png" alt="" title="fig-12" width="545" height="560" class="alignnone size-medium wp-image-1510" /></a></p>
<p>The manner in which the ribbons transit across the figure, and in places cross, indicates a difference between the order of reactivity and the order of abundance for the elements. For example, look at the ribbon between sulphur (S) and oxygen, indicated by the black arrow. Sulphur is 4th most abundant, but 12th in terms of number of O atoms that combine with it. Similarly, calcium (Ca) is 7th most abundant but 3rd in terms of reactivity with oxygen (red arrow).</p>
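<p>The comparison of the two orderings can also be made numerically. The sketch below ranks elements by two scores and reports how far each element moves between the rankings; the scores are invented placeholders, not values from the mineral table, and the function names are hypothetical.</p>

```python
def rank_order(scores):
    """Rank names from highest score (1) to lowest; ties keep input order."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_shifts(abundance, oxygen_pairing):
    """Abundance rank minus oxygen-pairing rank for each element.

    A positive shift means the element ranks higher by pairing with
    oxygen than by abundance; a negative shift means the opposite.
    """
    by_abundance = rank_order(abundance)
    by_pairing = rank_order(oxygen_pairing)
    return {name: by_abundance[name] - by_pairing[name] for name in abundance}


# Invented scores for four elements, used only to show the calculation.
abundance = {"Si": 90, "Al": 70, "S": 60, "Ca": 40}
oxygen_pairing = {"Si": 95, "Ca": 80, "Al": 50, "S": 10}

print(rank_shifts(abundance, oxygen_pairing))
```

<p>Elements with a non-zero shift are the ones whose ribbons would cross others in the figure, since their position in one ordering differs from their position in the other.</p>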
<p>Another treatment of the figure is shown below, with the oxygen segment removed, and the ribbons that correspond to element pairs that have the highest relative affinity (strong preference) for one another shown in color.</p>
<p><a href='http://flowingdata.com/wp-content/uploads/2009/04/fig-13-large.png'><img src="http://flowingdata.com/wp-content/uploads/2009/04/fig-13-large-545x476.png" alt="" title="fig-13" width="545" height="476" class="alignnone size-medium wp-image-1511" /></a></p>
<h2>Conclusion</h2>
<p>While it is possible to apply information design principles to a table to ensure that it communicates its content clearly, sometimes tables are not the best way to present data.</p>
<p>I hope that in this short writeup I have given you ideas that will be useful in your quest to articulate your own data sets.</p>
<p><em>Martin is a scientist who specializes in bioinformatics at the Genome Sciences Centre in Vancouver. Visit his site for more on <a href="http://mkweb.bcgsc.ca/circos/">Circos</a> and some of Martin's other data musings.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://flowingdata.com/2009/04/21/visual-representation-of-tabular-information-how-to-fix-the-uncommunicative-table/feed/</wfw:commentRss>
		<slash:comments>34</slash:comments>
		</item>
	</channel>
</rss>
