The Everydata Interview: Amy Bucher, PhD

Today we offer the first in a new online series of interviews here at Everydata with interesting and smart data scientists from a wide range of fields. Our first feature is Amy Bucher, PhD. Amy holds a PhD in Organizational Psychology from the University of Michigan at Ann Arbor. Her current work involves creating digital solutions that help people get and use the medications they need easily and effectively so they can be their healthiest and happiest. A transcript of our interview is below.

How do you use data in your work?

I use data both post-hoc and proactively. In the former case, I look through data that already exists to identify patterns. These patterns may help me understand gaps in what we offer users so we can design new tools or processes to improve the experience. They also may help me figure out what’s working well so we can emphasize and adapt it. In the latter case, I plan to measure the effects of anything we design or offer to our users in terms of key outcomes, whether it’s a health outcome like pounds lost, a medical outcome like prescriptions filled, or a financial metric like dollars earned.

The data available to me is both qualitative and quantitative. A typical data encounter for me is that I might notice a certain theme in the qualitative data; a couple of users might all mention the same issue, for example. Then we can pull the quantitative data to see how widespread the issue is and what effects it is having.

What is the most common mistake you see in terms of people misrepresenting or misinterpreting data?

There are several common data interpretation errors, but one that really bothers me is when people selectively use data to support their opinions and ignore it when it doesn’t. I remember in grad school a professor I highly respected telling me “if you live by the p-value, you die by the p-value.” (I was probably trying to argue that some results “trending toward significance” were worth discussion.) I took that lesson to heart. It means that sometimes I have to abandon ideas that have caught my fancy if the data doesn’t support them, or that I have to pursue ideas that didn’t initially appeal to me because the data supports their value.

I find that people with less formal training in the use of data are more reluctant to step away from ideas not supported by data. We have this natural confirmation bias where we look for the evidence that bolsters our beliefs and discount the evidence that doesn’t. Statistical training helps break that bias, not that it’s ever easy for any of us.

A common symptom of a person who is having a hard time being persuaded by data contrary to a beloved idea is grasping at anecdotes. “Sure, the numbers say that this approach is costing us money, but that one customer told that great story . . .”

How do you think the media affects the ways in which people consume data?

In general, I think it’s easier for people today to encounter data than at any time in history. Not only does digital media open up so many more outlets than traditional print and television, but it also adds an immediacy that wasn’t there 20 years ago. 

We now expect to see data alongside our news and if it’s not there, we can click and easily find some ourselves.

This is good and bad. On the good side, I think the media helps people become more attuned to the idea that there is so much data that can be shaped to tell a story. I’m a big believer that exposing people to ideas and communication styles helps them absorb some of it over time. On the bad side, I think the proliferation of data makes it harder for people to discriminate between high- and low-quality information. People who grew up using the internet don’t necessarily perceive that Wikipedia doesn’t have the same gravitas and accuracy as a peer-reviewed article pulled from JSTOR. And it’s much easier for poor-quality data to see the light of day since anyone can make a website.

One area where I think the media has done a nice job exposing people to sophisticated data and analysis is sports reporting. I’m always amazed at the bizarre stats they report during games, such as the percentage of times a batter has struck out against a right-handed pitcher with two outs. Not that all of these stats have real-world significance, but the fact that sports media is constantly revealing different data slices to their audiences tends to give sports fans a stronger intuitive sense of what one might accomplish with data.

Why should people care about understanding data? What are the consequences?

When I used to teach undergraduate research methods, the selling point I offered students was that we were building their bullshit detectors. Understanding data helps people make better decisions and avoid being manipulated. And there are always people trying to manipulate us with data, from politicians to marketers. Even small things, like knowing to check the tags on grocery shelves to see the per-unit price of items and select the most cost-effective one, can benefit people over time.

My career has been in the health area, and I’ve come to believe that understanding data also helps people be both physically and mentally healthier. Physically, because a better understanding of data helps people make choices about their behaviors that are meaningful. Mentally, because it helps people worry and stress less about making the wrong choice or factors they can’t control. I will say that I don’t think many people know enough about data to have enjoyed its full benefit health-wise, nor do I think the industry has done enough to communicate data in a clear and meaningful way.

An example is the changes in health guidelines over the last few years, which have reduced the recommended frequencies of several types of preventative screenings. This has been hugely emotional for people, and while I don’t think understanding data completely neutralizes the fear associated with skipping a screening, it does help level-set patients for a better discussion with providers. If you understand data, you understand that these recommendations were made from a population health perspective. In the aggregate, frequent administration of these (or any other) screening measures does two things: It finds false positives, which cost money and cause anxiety, and it finds true positives, which trigger treatment that in some cases may not be necessary. From a systems perspective, more frequent screening costs more money, and statistically speaking, may not save a lot of lives. If you are the individual person who does have a potentially deadly disease, more frequent screening could save a very important life: Yours. Understanding how data plays into these recommendations can help people have a productive conversation with their providers about how to handle their own personal testing. It might also help people who are at a low baseline level of risk feel less stressed if they do receive fewer screenings.

What is one thing people could do to become a better consumer of data in their everyday lives?

This suggestion might be cheating because you probably need some minimum data training to implement it, but the number one thing that helped me to understand data was to work with some of my own. Doing my own research studies as a student and going through the process of formulating my research questions, selecting the data to collect, and actually analyzing it did more to help me understand data than any class ever did.

You don’t have to be a student to work with your own data. I’ve done this at work, for example, when I’ve wanted to pitch an idea and pulled and analyzed web usage data to support it. I’ve done it in my volunteer work with the Junior League of Boston, when I’ve developed relatively straightforward member surveys to assess the success of a project at the midway point and make recommendations for improvements. I’ve even done it in my personal life, when I pulled sales prices of properties comparable to my condo to demonstrate to the city government that my taxes per square foot should be lower. (That last one is another example of how understanding data can benefit you. In this case, it saved me a lot of money.)


Please note that guest interviews are informational only, and do not necessarily represent the views of John H. Johnson, PhD.