Where Pictographs Beat Bar Charts: Proportional Data

Pictographs are exceptionally good for some types of data. In this post, I show how useful they are for displaying proportions (e.g. rates, percentages, fractions).

Look at the pictograph example on the right. It shows the case fatality rate using colored stick figure icons. These quantities could be just as appropriately shown using pie or bar charts (see above). However, the pictorial representation makes this statistic intuitive: out of every 100 individuals infected with SARS, you can expect 11 to die.

Pictographs have an intrinsic scale

The icons give the pictograph an intrinsic scale. Compare the pictograph (right) to the bar chart (below). Both charts show that SARS is 3 times more deadly than pertussis, but the advantage of using a pictograph can be seen when we compare the other diseases. The pictograph clearly shows that the fatality rate for SARS is an order of magnitude bigger than that for smallpox. By contrast, on the bar chart, all we can see in the absence of any labels is that SARS is much bigger than smallpox.

The finer resolution provided by the icons is especially useful for the smaller values. In the bar chart, the much larger fatality rate of SARS makes the variation between the other diseases hard to see. But in the pictograph, it is clear that the smallpox fatality rate is at least double that of malaria.

Pictographs show quantities visually

A well-designed pictograph makes quantities easy to read. In the example on the right, the small scale and the large number of icons can potentially cause problems. I avoid this by arranging the icons into 10 by 10 squares. Even without explicitly counting each icon, quantities can be evaluated by comparing the area of each square that is red.
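
The 10 by 10 arrangement can be sketched in a few lines of Python. This is only an illustration, assuming a rate expressed as a whole number of cases per 100. The function name `pictograph_grid` and the `#` and `.` characters standing in for red and grey icons are placeholders of my own, not part of any charting library.

```python
def pictograph_grid(per_hundred, filled="#", empty=".", side=10):
    """Return the rows of a side x side icon grid with `per_hundred` icons filled.

    Icons are filled row by row from the top-left, so the filled region
    reads as a solid block rather than scattered marks.
    """
    total = side * side
    if not 0 <= per_hundred <= total:
        raise ValueError("per_hundred must be between 0 and side * side")
    icons = [filled] * per_hundred + [empty] * (total - per_hundred)
    return ["".join(icons[r * side:(r + 1) * side]) for r in range(side)]


# 11 deaths per 100 infections, the SARS figure quoted earlier in the post.
sars_rows = pictograph_grid(11)
print("\n".join(sars_rows))
```

Filling row by row keeps the coloured icons together, which is what lets the reader judge the quantity by area instead of counting.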

The example on the right shows data labels in order to provide a greater level of detail. However, the main message of the chart – the enormous difference between the severity of different diseases – is effectively conveyed by the icons alone.

Acknowledgments

Author: Carmen Chan

Carmen is a member of the Data Science team at Displayr. She enjoys looking for better ways to manipulate and visualize data. Carmen studied statistics and bioinformatics at the University of New South Wales.

Fourth Grade Math Class Called. They Want Their Pie Chart Back.

Dashboards. Infographics. Vizzes. Everyone is talking about the cool buzzwords in Business Intelligence. But, with statistics like “70-80% of business intelligence projects fail” (Gartner analyst Patrick Meehan) floating around, folks are tripping over jargon and falling on their faces. Even in a perfect world where requirements are clear, business questions are documented, and key metrics are defined, communicating effectively with data visualization is still just plain hard. Here are five tips to keep you on track and to help make your vizzes both cool AND effective.

1.  Choose the Right Visualization

Talk with whoever will consume the report (have a conversation, in real life if possible – I know it’s scary, but you can do it). Make sure the visualization you choose caters to their specific needs. Do they just need that one number so they can copy and paste it into an email every Friday? Great. Then just report the number. Don’t hide that number behind colors, bars, lines, or added drama. Do they need historical context to explain that number? Fantastic. Then add a line chart with that metric over time. Do they need to compare that number to a benchmark? Wonderful. Then build a Gantt chart with the number and the benchmark. In data visualization, less is more.

2.  Be Deliberate

Fancy Business Intelligence software and even good ol’ Excel love to add gridlines, borders, shading, legends, and labels to every piece of every visualization. Resist the urge! Fight back! Embrace minimalism and let your data be the center of attention instead. Use descriptive labels and legends to help communicate your message rather than clutter it. Remember, white space is your friend.

3.  Color Matters

Green means go. Red means stop. Deep saturated red means you cut yourself shaving. Whether we like it or not, color carries inherent meaning. Use color to call attention to some of your data or to emphasize a point. But, be deliberate! And consistent. Does blue mean profit in one chart and loss in another? Perhaps a bad idea. Remember, colors are fun, pretty, and cool. But if they don’t add meaning to your visualization, consider simplifying and going with a classy shade of grey.

4.  Cool Or Confused? Intercepts and Axes.

Nobody loves a hacked off axis more than news networks and politicians. The drama! The intrigue! Does it inspire shock and awe? Yes. Is it misleading? Very. Avoid confusion and err toward accuracy by setting your axis intercept to 0.
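
To see how much a hacked off axis distorts a comparison, compare the ratio of the drawn bar heights with the ratio of the underlying values. The sketch below is an illustration only: the function name `drawn_ratio` and the sample values are made up for the example.

```python
def drawn_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn heights of two bars when the axis starts at `axis_start`."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (a - axis_start) / (b - axis_start)


# Hypothetical values: 52 versus 50 is a 4% difference.
honest = drawn_ratio(52, 50)         # axis starts at 0
dramatic = drawn_ratio(52, 50, 48)   # axis starts at 48

print(f"axis at 0:  first bar is {honest:.2f}x the height of the second")
print(f"axis at 48: first bar is {dramatic:.2f}x the height of the second")
```

With the axis at 0 the bars differ by 4%, matching the data. With the axis cut at 48, the first bar is drawn twice as tall as the second.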

Another example of “cool” that quickly turns into “confusing” is the dual axis chart. Consider how your audience will interpret your visualization. Did orange outperform blue? Are those metrics even comparable? When in doubt, break those bars and lines into separate graphs. And if you insist on keeping them in one graph, for goodness sake, clearly label those lines and axes.

5.  Oh Those Pie Charts… And Maps…

Pie charts are incredibly fun to create and terribly difficult to interpret. Slices often look similar in size when they are quite different. And too many slices can set your end user up for a Where’s Waldo? game to find the right slice. Be kind to your end user. Recommend a better visualization.

Maps, like pie charts, are a ton of fun to create. The colors! I can even see the ocean! However, think twice before you slap that data on a map. Is this really a geospatial analysis? If you want to identify the state bringing in the highest profit, looking through a bunch of numbers on top of a map is just another Where’s Waldo? game. Once again, be kind to your end user. Choose a better visualization.

Now Ignore Everything I Just Said. Sort of.

Rules are made to be broken, but it’s important to learn and understand the rules before you break them. Famous painters like Pablo Picasso and Henri Matisse studied and experimented with principles like color theory and perspective before breaking those rules with their more modern paintings. So, once you have a handle on some data visualization basics, don’t be afraid to experiment, push back on the hundredth pie chart request, and have some fun. Happy Vizzing.

The Risks and Limitations of Visualization

Guest blog post by Radhika Subramanian

Today’s need to leverage unprecedented amounts of available information has resulted in a flood of tools, services and models claiming to surface insights from Big Data. One model in particular, visualization, has received a lot of attention lately because of its abilities to organize and present information. However, visualization is actually one of the biggest barriers to insight because it places the burden of discovery on the user, and any tool that places the burden on the analyst is a game-stopper.

Data visualization is the study of the visual representation of data, meaning information that has been abstracted in some schematic form, including attributes or variables for the units of information. Humans are better equipped to consume visual data than text. As we know, a picture is worth a thousand words.

While visualization tools are interesting, they rely on human evaluation to extract insight and knowledge. The problem with this is that people often see what they are looking for and miss the breakthrough evidence they are actually seeking. It’s human nature: we see what we are conditioned to see and miss the fact that a gorilla just danced through the living room. But that’s just the beginning. The more severe limitation of visualization is that it can only represent two or three dimensions before the amount of information becomes overwhelming. Visualizing a network of 10-100 friends is fine, but what happens when the data approaches one billion? Thus, while it is certainly a good test for small samples, it is not a sustainable method to gain insight into large volumes of shifting data.

In a previous blog, I wrote that given today’s explosion of “Big Data,” companies need more advanced methods for leveraging their data – methods that don’t rely solely on tribal knowledge, personal experience or best guesses. Like data mining, visualization is limited to manual endeavors. Why limit company success to antiquated methods that by design fail to leverage the data for all it’s worth? It’s time to usher in new methods and new technologies for transforming the enterprise from reactive (based on guesstimates, hunches, and flawed insight) to proactive (based on data-driven, actionable insight).

Datascape - Immersive 3D Data Visualisation

Guest blog post by David Burden

With the launch last week of Datascape I thought it would be worth putting an MD’s perspective on the product – how we got here, what the philosophy is that lies behind it, and where we hope to go with it. For a more formal view of the academic and commercial background see our Immersive Data Visualisation white paper.

Datascape has undoubtedly grown out of Daden’s virtual world heritage – and my own interest in data and data visualisation. Over the years we’ve used virtual world platforms such as VRML, Active Worlds, Second Life and OpenSim to create a variety of data visualisations, probably culminating in our original Datascape virtual command centre (which won a prize at the US Government’s Federal Virtual World Challenge), and the visualisation of Twitter data we did in OpenSim for the Royal Wedding in 2011. These examples and experiments, and those of others, together with an MOD funded research project we did in 2011 with Aston University, a quantitative comparison of immersive and non-immersive 3D visualisation spaces, convinced us that there was definitely something in immersive data visualisation.

In then moving from ideas and demonstrators to a full blown product I think that there are 4 key ideas that have informed our journey.

IMMERSION

Datascape is about immersion. It is about putting you inside your data, allowing you to move around and through your data and view it from any angle, from inside or out. When in navigation mode there is no user interface – there is just your data (possibly the ultimate expression of Edward Tufte’s Data Ink idea). This sense of immersion appears to help the brain see the patterns and anomalies in the data, because the data behaves like the real world – it stays still whilst your eye travels through it.

FLEXIBILITY

Datascape does not constrain you. If you want to map latitude to colour and longitude to shape you can do it. The heart of Datascape is the mapping screen, where you assign the fields in the data to the features of a plot point – its position, rotation, shape, size, colour, image and labels. With a full set of spreadsheet like functions at your call, and self-populating look-up tables, the plots you can produce probably really are only limited by your imagination. That flexibility does mean that initially there might be a bit more to learn, but we’ll be posting “recipes” and “how-to’s” on our web site to help you create the more common visualisations, and as we release successive versions of Datascape we may well start including wizards and templates that get you more directly to those common views.

ACCESS

Given that we needed good graphics and processing capability we took the decision early on that this would initially be a PC application, not something for the web or your tablet. However, by basing Datascape on Unity we have a path available to develop web and/or tablet versions of Datascape if the demand is there. We have also been keeping a watching brief on HTML5 and WebGL and one feature under serious consideration is being able to export your completed workspace as a standalone HTML5 virtual world to share more easily with friends and colleagues.

LANGUAGE

One thing we have found as we begin to look at more and more data in Datascape is that we may need a new visual language to describe what we are doing with data in a 3D space. In 2D we are all used to line graphs and bar charts, pie charts and scatter plots. Whilst we can do these in 3D as well, they do not (except for the last) typically take fullest advantage of the medium.

For instance, one problem we’ve found in 3D is that whilst the virtual space lets us plot a long line of data stretching off into the distance, looking at the whole line is hard: you have to scroll as you do in 2D, unless we compress it (but then we lose the detail that the spread out 3D display brings). One solution that we have found is to plot the data as a cylinder, or even as a spiral, with the viewer in the centre. You can then take in a lot of data in one go, and just fly up and down the cylinder to other data – which is typically an easier action to control than horizontal flight. What other standard forms will we find, and how will we determine which form suits which type of data, and which type of enquiry?
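
As a rough illustration of the cylinder and spiral idea (not Datascape's actual implementation, which this post does not describe), the sketch below places a sequence of points on a helix around a viewer standing on the vertical axis. The function name `spiral_positions` and its parameters are hypothetical.

```python
import math


def spiral_positions(n_points, radius=10.0, points_per_turn=36, rise_per_turn=2.0):
    """Place n_points on a helix around the vertical (Y) axis.

    The viewer is assumed to sit on the axis, so every point is at the same
    horizontal distance `radius`, and successive turns stack upward.
    Returns a list of (x, y, z) tuples.
    """
    positions = []
    for i in range(n_points):
        angle = 2 * math.pi * i / points_per_turn
        x = radius * math.cos(angle)
        z = radius * math.sin(angle)
        y = rise_per_turn * i / points_per_turn  # continuous climb
        positions.append((x, y, z))
    return positions


points = spiral_positions(100)
print(points[0], points[36])
```

Replacing the continuous climb with `rise_per_turn * (i // points_per_turn)` would give stacked rings, that is, a cylinder rather than a spiral.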

Another difference is axes. In 2D the axes form a frame in which your data sits – and the same for non-immersive 3D cubes. But in an immersive space you are usually inside the data and the axes are nowhere in sight. So how do we maintain orientation within the data, and understand where the data points sit on the axes (that is, if we actually need enumerated axes)? There are no doubt a wide number of solutions to explore. Within Datascape we have distant XYZ markers so you can easily tell which direction you are looking in, whenever you hover over a point it can tell you its X, Y and Z values, and you can also have the point drop reference lines down to the axis or reference planes. One other thing we have tried, but not perfected enough to release, is a 3D compass, and another that we are looking at for future releases is the use of mini-maps, not just as a top-down (XZ plane) view but also as YZ and YX views. But can you cope with seeing your data in four directions at once?

A VIRTUAL WORLD FOR YOUR DATA

We thought long and hard about this tag line, just as we did about whether or not to have avatars. We didn’t put avatars in the single user version since we felt that a) you got enough of a sense of immersion from the navigation alone and b) for most corporate users we spoke to avatars are still a turn-off and too closely associated with gaming environments. However “virtual world” (most emphatically in lower case) did seem by far the most appropriate way to describe what you can create with Datascape, a virtual world populated solely by you and your data.

In multi-user mode we do provide you with a very basic humanoid avatar – but it is very much a place-holder, a glyph, for where you are in the world and what direction you are looking. We deliberately kept clear of an avatar that was human enough for you to start worrying about what gender, or race or age it was, and what clothes it should wear! The resulting avatar is enough to let you know where your colleagues are and what you are looking at, no more, but even so it’s not long before you’re playing hide and seek amongst the data.

Going forward we may well increase the virtual world sense – for those who want it – with better avatars, more 3D scenery in which to place your visualisations (closer to the original Datascape), and persistency and controlled sharing of your data and workspaces. But let’s start simply, and with something that everyone can hopefully relate to.

So hopefully that gives you some insight into our thinking as we developed Datascape, and some clues as to where we might take it in the future. Please download it (there is a free community version with a 6,000 point limit and a paid pro version with a 65,000 limit - although we have had it running up to 250k points) and give it a try, and hopefully it will open up a whole new world of data visualisation for you.

Big data and Data Visualization

Like many people of a certain age, my first exposure to the term dashboard was when I developed one for monitoring corrective and preventive actions!

I have realised that dashboard design itself is now the essence of simplicity and cutting edge technology, and stylish with it too, arousing passions about what makes a great interface for analysis. When it comes to software applications and websites, dashboards are all around us too!

The era of Big Data has arrived, but most organizations are still unprepared. Enterprises erroneously believe and act like big data is a passing fad, and nothing has really changed. But big data is not a temporary thing. By acting as if it is, companies are missing out on tremendous opportunities by not focusing on such a great technology.

So what is it?

As many of us know, an enterprise application dashboard is a one-stop shop of information. It’s a page made up of portlets or regions, grouping related information into displays of graphs, charts, and graphics of different kinds. Dashboards visualize a breadth of information that spreads over a large range of activities in an application or functional area.

Numerous case studies explain how organizations use visual representations to locate and leverage valuable insights from large sets of structured or unstructured data (i.e., big data), ask better questions, and make better decisions.

Does it serve the purpose?

Yes! When designed well, dashboards aggregate structured and unstructured data into meaningful visual displays and representations, using analytical formulas over the available datasets at the back end to do the analysis and derivation work that users used to do with notepads, calculators or spreadsheets to find out what’s changed or what is in need of attention.

Dashboards over a large amount of data enable users to prioritize work and to manage exceptions by taking light-weight actions immediately from the page, or to drill down to explore and do more in a transactional or analytics work area, if necessary.

The design of dashboards on a very large amount of data, on the other hand, is much more open to interpretation. Most of these Big Data dashboards are simply a series of graphs, charts, gauges, or other visual indicators that a user has chosen to monitor, some of which may be strategically important, but others of which may not. Even if a strategic link exists, it may not be clear to the person monitoring the dashboard, since the objective statements, which explain what achievement is desired, are typically not present on dashboards.

Why this?

I found it interesting that there are separate infographics and data visualization categories. My interpretation is that the entries in the infographics section are static and illustrated, while those in the data visualization section are generated and data-driven.

Nowadays, better insight can be gained from Big Data through data visualization, using superior tools and techniques to present or analyze the available data.

On the other hand, it is economical in terms of space and would probably work in almost every case which are two things that dashboards should be good at. So while I wouldn’t have used it myself I can understand why this decision has been made. What makes a dashboard, or any other information-based design successful, is neither the design execution nor the clever information analysis and visualization technique.

These kinds of dashboards, on the other hand, are ultimately meant to be useful and to solve a specific problem. Dashboards for business users are a powerful means of communication nowadays, when companies accumulate large amounts of data. These visually compressed representations of only the most important data are used for tracking.

DataViz in my view!

Data visualizations can unintentionally bias the viewer as a result of the choices made in visual method. Sometimes a visualization fails because the designer did not understand the viewers' assumptions (cultural ones, for instance: is RED a good or a bad color?).

One interesting thing I always think about is creating visualizations that let the human eye discover something that can't be discovered by a program. But there will be a challenge in showing enough data to give a sense of context while providing enough detail to enable understanding.

What then?

Whenever a visualization is built on Big Data, once a data visualization designer is aware of the simple principles of presenting data on a screen, they can apply them to any report or graph, data analysis or information dashboard without changing its context or meaning. Only then will it provide a powerful means to make sense of data. When done properly, data visualization will make us think, compare data, read stories out of our data, will put data in the right context and ultimately help decision-makers to make the right decisions regardless of the available type or amount of data.

Do you have any thoughts on this? I am waiting to hear from you!

Same dataset - Two different kinds of visualizations

Guest Blog post by Nilesh Jethwa

Data is bits and bytes and visualization has the power to tell the story in multiple forms. Today I wish to share two different visualizations for the same dataset.

Here is the link to the dataset and the dashboard as shown below

And here is the second dashboard visual, using a choropleth

You can analyze the pros and cons of both kinds of visuals. Both of the visuals are pretty interesting and the key point is what story each one is trying to tell.

Interactive Data Visualization for the Web, Free Online Version

Interactive Data Visualization for the Web, by Scott Murray, O’Reilly (2013), has a Free online version.

An introduction to D3 for people new to programming and web development, published by O’Reilly. “Explaining tricky technical topics with aplomb (and a little cheeky humor) is Scott Murray’s forte. If you want to dive into the world of dynamic visualization using web standards, even if you are new to programming, this book is the place to start.” - Mike Bostock, creator of D3.

From O’Reilly website: "This step-by-step guide is ideal whether you’re a designer or visual artist with no programming experience, a reporter exploring the new frontier of data journalism, or anyone who wants to visualize and share data. Create and publish your own interactive data visualization projects on the Web—even if you have little or no experience with data visualization or web development. It’s easy and fun with this practical, hands-on introduction. Author Scott Murray teaches you the fundamental concepts and methods of D3, a JavaScript library that lets you express data visually in a web browser. Along the way, you’ll expand your web programming skills, using tools such as HTML and JavaScript"

This online version of Interactive Data Visualization for the Web includes 44 examples that will show you how to best represent your interactive data. For instance, you'll learn how to create this simple force layout with 10 nodes and 12 edges. Click and drag the nodes below to see the diagram react.