A Guide to Data Visualization

Source: UX Motel, @FlavienP

One of the most interesting parts of having an active Twitter feed is the immediate feedback you get on posts through retweets, quotes, and favorites. I found this chart on Twitter and retweeted it, and it has been one of my most popular tweets, so I thought it would be valuable to feature it on the blog.

This chart gives some simple guidance on how to display your data visually, based on what you'd like to highlight. It breaks things into four useful categories:

Composition: If you'd like to highlight how the makeup of a key variable is changing over time, bar charts and pie charts provide a way to show compositional changes. Our friends at FlowingData.com provide this doozy of a pie chart for your entertainment. All kidding aside, a well-made pie chart can be a tremendously powerful way to illustrate composition, and I find them intuitive even for people who are not statistically oriented.

Distribution: Illustrating a distribution is about showing the full spread of the data. I suspect most people think of distributions by hearkening back to exam results in high school: the teacher explains that 3 people in the class got an A, 15 got a B, 3 got a C, and 1 got an F. That is a distribution. Most people (if they think about distributions at all) are familiar with the bell curve. Here is an interesting Forbes article, however, on research suggesting that in the workplace, most value is created by a small percentage of hyper performers at the very top of the distribution.
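To make the classroom example concrete, here is a minimal Python sketch that tallies those same grades and prints a rough text histogram. The function names (grade_distribution, text_histogram) are my own, chosen for illustration, and the "D" row appears only because I assumed a standard A-through-F scale.

```python
from collections import Counter

# Grades from the classroom example: 3 A's, 15 B's, 3 C's, 1 F.
grades = ["A"] * 3 + ["B"] * 15 + ["C"] * 3 + ["F"] * 1


def grade_distribution(scores):
    """Count how many times each grade occurs."""
    return Counter(scores)


def text_histogram(counts, order="ABCDF"):
    """Render one row of '#' marks per grade, in the given order."""
    rows = []
    for grade in order:
        n = counts.get(grade, 0)  # grades nobody earned show up as 0
        rows.append(f"{grade} | {'#' * n} ({n})")
    return "\n".join(rows)


counts = grade_distribution(grades)
print(text_histogram(counts))
```

The tallest row is the B's, which is exactly what a histogram or bar chart of this distribution would show at a glance.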

Comparison:  How do the experiences of one group differ from the experiences of another?  In comparison charts, we simply want to draw out similarities or differences between the outcomes or experiences of different sets of people.   In this line chart from a recent article on income inequality in The Huffington Post, we see a comparison of real average after-tax income for different wage earners over time.

Relationships:  In the real world, we think all the time about how two things relate.  If I spend more time exercising, how much weight will I lose?  If I save an extra dollar today, how much more will I have for retirement in 20 years?  Charts designed to show the relationships between two variables abound--and can be some of the most misleading or informative depending on presentation and content.  More on this later--but for now, enjoy this article on the relationship between margarine and divorce rates.
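The retirement question is a relationship you can actually compute. Below is a minimal sketch using the standard compound-growth formula, future value = amount x (1 + rate)^years. The 5% annual return is purely an assumption on my part for illustration, not a forecast, and future_value is a name I made up.

```python
def future_value(amount, annual_rate, years):
    """Value of a one-time deposit after compounding once per year."""
    return amount * (1 + annual_rate) ** years


# ASSUMPTION: a constant 5% annual return, chosen only for illustration.
assumed_rate = 0.05

# How a single extra dollar grows over time under that assumption.
for years in (5, 10, 20):
    print(f"{years:>2} years: ${future_value(1.0, assumed_rate, years):.2f}")
```

Plotting the output of a function like this against the number of years is a simple example of a relationship chart: one variable on each axis, with the curve showing how they move together.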

Can Your Voice Get You a Job?

From the advertisement for jobaline.com's Voice Analyzer tool.

When potential employers recruit new talent, they often use many means to evaluate candidates: prior work experience, grades and educational background, college attendance, writing samples, and the like. Recently, one company began offering a different tool for assessing and recruiting talent: voice evaluation.

In a recent story on Fast Company, the Voice Analyzer (TM) tool was highlighted as a new means to recruit and evaluate talent based on the tone of a candidate's voice and the emotions it might evoke in potential customers. As explained in the article, Jobaline has developed an algorithm which is used to "assess paralinguistic elements of speech, such as tone and inflection, and predict which emotions a specific voice will elicit--excitement, for instance, or calmness." A recent story on NPR's All Things Considered, Now Algorithms Are Deciding Whom to Hire, Based on Voice, features Jobaline CEO Luis Salazar explaining the process.

Such algorithms are based on a large amount of data about voice qualities and the emotions they evoke. One quote from the Fast Company article particularly caught my attention, however: "There are so many sources of bias when you're dealing with humans...The beauty of math is that it's blind.  It helps give everybody a fair chance." This quote raises an interesting question about data analysis: does the fact that an algorithm is based on reams of data mean it is unbiased? It is true that the same algorithm can be applied in a consistent fashion over and over, so its application and the factors it assesses are the same for every voice analyzed. But that is not the same as saying that any particular algorithm (and I am not offering any opinion about the Jobaline Voice Analyzer specifically) cannot contain any underlying bias in what it predicts. Data is not necessarily a panacea for all inherent biases.
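The difference between "applied consistently" and "unbiased" can be seen in a toy Python sketch. Everything in it is hypothetical: the groups, the feature values, and the threshold are invented for illustration and have nothing to do with how any real product works. The rule is identical for every applicant, yet the two made-up groups end up with very different selection rates because the feature it keys on is distributed differently between them.

```python
def recommend(feature_value, threshold=165.0):
    """Hypothetical rule, applied identically to everyone:
    recommend any applicant whose feature value is below the threshold."""
    return feature_value < threshold


# HYPOTHETICAL data: (group label, feature value). Invented for illustration.
applicants = [
    ("group_1", 110.0), ("group_1", 125.0), ("group_1", 140.0), ("group_1", 170.0),
    ("group_2", 160.0), ("group_2", 190.0), ("group_2", 210.0), ("group_2", 225.0),
]


def selection_rates(people):
    """Share of each group that the rule recommends."""
    totals, selected = {}, {}
    for group, value in people:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + (1 if recommend(value) else 0)
    return {group: selected[group] / totals[group] for group in totals}


rates = selection_rates(applicants)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} recommended")
```

Nothing about the rule changes from one applicant to the next, which is the sense in which the math is "blind." Whether the outcome is fair depends on whether the feature and the threshold are actually related to job performance, and that is a question the consistency of the algorithm cannot answer on its own.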