A few weeks back Jon Udell blogged about a Many Eyes visualization he had created that charted monthly mean temperatures from 1871 to 2006 in Concord, New Hampshire. To create the visualization he pulled data from a NOAA FTP site, expressed in tenths of a degree Celsius, and charted it as a Block Histogram.
As an American, I’m used to thinking of temperatures in whole numbers and in degrees Fahrenheit. While Jon’s values were whole numbers, they represented fractions of a degree and were in a temperature scale that is not how I intuitively think of the weather. Fortunately, converting Jon’s data set to Fahrenheit was a simple two-step process, easily done using formulas in Excel.
1) Take the original data and divide by 10 to convert it from tenths of a degree Celsius to degrees Celsius.
2) Convert the values from Step #1 (represented by the variable C) to Fahrenheit (F) using the following formula: F = (C × 9/5) + 32
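For anyone who would rather script the two steps above than use Excel, here is a minimal sketch in Python. The function name and the sample readings are my own illustrative assumptions, not values taken from the NOAA file.

```python
def tenths_celsius_to_fahrenheit(tenths_c):
    """Convert a reading in tenths of a degree Celsius to degrees Fahrenheit."""
    celsius = tenths_c / 10        # Step 1: tenths of a degree C -> degrees C
    return celsius * 9 / 5 + 32    # Step 2: degrees C -> degrees F


# Hypothetical monthly means in tenths of a degree Celsius (not real NOAA values).
sample_readings = [-52, 0, 215, 300]

for raw in sample_readings:
    fahrenheit = tenths_celsius_to_fahrenheit(raw)
    print(f"{raw:>5} tenths C = {fahrenheit:.1f} F")
```

Rounding is left to the display step so the converted values keep their full precision until they are charted.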
I took the resulting data set and re-uploaded it to Many Eyes. You can find it here. For comparison, I recreated Jon’s Block Histogram using my transformed data here. You can compare it to Jon’s by clicking here.
Now you might say that I’m being nitpicky and that Jon’s visualization was the most accurate representation, as he literally graphed the raw data as it came down from NOAA. Yet I think it’s worth belaboring the point that we should take formats into account when presenting data to the public. We live in a world where people measure scientific phenomena in different units, and our intuitive grasp of information is based on the unit system that we have internalized. Ask an American if 30 degrees Centigrade is hot and he’ll probably scratch his head and say “probably” (or, more likely, he’ll ask you what the Fahrenheit equivalent is). When you tell him that 30 degrees Celsius is equal to 86 degrees Fahrenheit, he’ll tell you that 30 degrees Centigrade is burning hot!
The point here is that as we design visualizations for the public, it’s important to express data in units that are familiar to different audiences. Visualization tools should allow those creating visualizations to specify not just that a column of values is numerical but what kind of numbers they are: temperatures in Celsius, temperatures in Fahrenheit, weight in pounds, weight in kilograms, etc. Once units are explicitly defined, the tools should be smart enough to let end users of visualizations customize data displays by toggling between unit systems (e.g. Celsius vs. Fahrenheit).
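To make the idea concrete, here is a rough sketch in Python of what a unit-aware column might look like. Everything in it (the UnitColumn class, the CONVERSIONS table, the unit names) is hypothetical and invented for illustration; it is not how Many Eyes or any other tool actually works.

```python
from dataclasses import dataclass

# Hypothetical lookup of conversion functions keyed by (from_unit, to_unit).
# The weight factor is approximate (1 kg is about 2.20462 lb).
CONVERSIONS = {
    ("celsius", "fahrenheit"): lambda v: v * 9 / 5 + 32,
    ("fahrenheit", "celsius"): lambda v: (v - 32) * 5 / 9,
    ("kilograms", "pounds"): lambda v: v * 2.20462,
    ("pounds", "kilograms"): lambda v: v / 2.20462,
}


@dataclass
class UnitColumn:
    """A numeric column that also records which unit its values are in."""
    name: str
    unit: str
    values: list

    def to(self, target_unit):
        """Return a new column with the same data expressed in target_unit."""
        if target_unit == self.unit:
            return self
        key = (self.unit, target_unit)
        if key not in CONVERSIONS:
            raise ValueError(f"no conversion from {self.unit} to {target_unit}")
        convert = CONVERSIONS[key]
        return UnitColumn(self.name, target_unit, [convert(v) for v in self.values])


# The creator declares the unit once; the viewer toggles the display unit.
monthly_mean = UnitColumn("monthly mean", "celsius", [0.0, 30.0, 100.0])
for display_unit in ("celsius", "fahrenheit"):
    shown = monthly_mean.to(display_unit)
    print(shown.unit, shown.values)
```

The design choice that matters is that the unit travels with the data, so the conversion happens at display time and the raw values stay untouched.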
Dabble DB, a hosted database solution, is starting to do this with geographical data by allowing users to specify that a column is not just a text column but that its values represent state names. Identifying values as state names allows their corresponding values to be literally plotted on a map of the United States. To be fair, Many Eyes does this without requiring the user to explicitly declare that a column is of type state name, but neither site does anything comparable for numerical data.
Translating data visualizations into units tailored to different users may require more up-front programming from those building visualization tools like IBM’s, but it will be a prerequisite if public data visualizations are intended to be accessible to more than just “data geeks”.