You are a savvy consumer of data. You eat, breathe, and sleep numbers all day and night! As a result, we think you will really enjoy our new book EVERYDATA.
People Who Wear Glasses Are Smarter, Study Claims
MISLEADING – From a statistical perspective, we can find lots of apparent connections between two factors, such as wearing glasses and having a high IQ. These types of connections—when two measures move together in the data—are called correlations. But the mere existence of a statistical relationship between two factors does not imply that there is actually a meaningful link between them, as you'll learn in Chapter 4 of EVERYDATA. In this case, for example, if you look at the claim that people who wear glasses are smarter, what the original study actually describes is an association between having more years of schooling and having a form of myopia (nearsightedness).
High Chair Injuries Increase 22 Percent
MISLEADING – The key words here are "selected hospitals." It's not always possible to collect or analyze data from the full population, which is why researchers often use a sample. But when you conduct a statistical analysis on a sample of the available data, you can introduce what is known in statistics as a sample selection problem. Running an analysis on less than the entire data set is not always a problem, but it can lead to mistaken conclusions depending on the question you are trying to answer. This high chair example is based on a study that used data from the National Electronic Injury Surveillance System (NEISS), which included data from approximately 100 hospitals, "a national probability sample of hospitals in the U.S. and its territories." Find out more about sampling issues in Chapter 2 of EVERYDATA.
Sox Win 12th in a Row Behind Buchholz Gem
MISLEADING – We see this in sports all the time, when writers and broadcasters cherry pick data in order to make a point. Yes, the Red Sox may be 45-1 when holding opponents scoreless after the 6th – but what about when they hold opponents scoreless after the 5th inning (or 4th, or 3rd)? What is their record then? When you see stats like this, ask yourself if the data will still tell the same story with a different sample set. Many times, you'll find that the answer is no. You can read more about cherry picking (and see other examples) in Chapter 7 of EVERYDATA.
Jury Finds $1 Billion Damages Verdict
NOT MISLEADING – The largest verdict in the history of antitrust law, $1.05 billion, came when Conwood Company—a tobacco manufacturer—sued another tobacco manufacturer (U.S. Tobacco Company) for hindering Conwood's growth. (Conwood Company was purchased by Reynolds American, Inc., and changed its name to American Snuff Company, LLC, effective January 1, 2010.) What's interesting about this case (covered in Chapter 3 of EVERYDATA) is that the verdict hinged on outlier data. If that outlier data had been excluded—as it arguably should have been—then the results would have shown a clear increase in market share for Conwood. Instead, the conclusion—driven by an extreme observation—showed a decrease.
Stop Eating Out in Minnesota!
MISLEADING – The facts are true, but the conclusion—that you shouldn't eat out in Minnesota—is misguided. Why? Because in this real-life case, the reason for these food recalls in Minnesota (as Yahoo! Health reported) was simply because Minnesota is better at identifying cases of foodborne illnesses than other states. If anything, because of the ongoing diligence of Minnesota's Departments of Health and Agriculture, it's quite possible you're better off there than in other areas of the country. This story is used in the preface of EVERYDATA to show how easy it is to be misled by data.
Decision to launch doomed Space Shuttle Challenger ignored two-thirds of data
NOT MISLEADING – The night before the disaster, scientists and engineers were concerned that cold weather would affect the performance of the shuttle's O-rings, which sealed the joints of the solid rocket boosters. As part of the team's discussions, they studied data from 7 of the 24 previous flights—specifically, the 7 flights in which there had been O-ring incidents. Temperatures at launch for these 7 flights ranged from 53° F to 75° F, which seemed to indicate that colder temperatures didn't cause O-ring incidents. But by focusing only on the flights with O-ring incidents, they were truncating the data set—a fancy way of saying that they weren't looking at all of the data. If they had looked at all of the data, they likely would have seen that only 3 out of 20 flights above 65 degrees had incidents, yet all 4 flights below 65 degrees had incidents. As you'll discover in Chapter 2 of EVERYDATA (which examines the Challenger disaster in depth), the scientists and engineers did not study the right data for the question they needed to answer.
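The truncation problem is easy to see with numbers. Here is a minimal sketch; the individual flight temperatures below are illustrative stand-ins consistent with the counts in the text (24 flights, 7 with O-ring incidents), not the official NASA records:

```python
# Flights WITH O-ring incidents (illustrative temperatures, °F)
incidents = [53, 57, 58, 63, 70, 70, 75]
# Flights WITHOUT incidents — all above 65 °F in this sketch
clean = [66, 67, 67, 67, 68, 69, 70, 70, 72,
         73, 76, 76, 78, 79, 80, 81, 81]

flights = [(t, 1) for t in incidents] + [(t, 0) for t in clean]

def incident_rate(data, below_65):
    """Fraction of flights with incidents, split at 65 °F."""
    subset = [hit for temp, hit in data if (temp < 65) == below_65]
    return sum(subset) / len(subset)

# Truncated view (incident flights only): temperatures span 53–75 °F,
# so cold looks harmless.
truncated_range = (min(incidents), max(incidents))   # (53, 75)

# Full data tells a different story:
cold_rate = incident_rate(flights, below_65=True)    # 4 of 4  -> 1.00
warm_rate = incident_rate(flights, below_65=False)   # 3 of 20 -> 0.15
```

Looking only at the incident flights hides the fact that every single cold-weather flight had a problem.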
9 Out of 10 Pediatricians* Recommend Enriched Baby Food
MISLEADING – Did you catch the asterisk in the headline? If so, did you read the disclaimer? This ad isn't telling you that 9 out of 10 pediatricians recommend enriched baby food. It's telling you that, of the limited (and undisclosed) number of pediatricians who recommend baby food in the first place, 9 out of 10 of that group recommend enriched baby food. Imagine, for example, that you have a group of 100 pediatricians. Only 10 of them recommend any baby food, and of those 10 who do recommend it, 9 of them recommend enriched baby food. That scenario could give you the results you need for this ad, even though (if you don't cherry pick the data) 91 out of 100 pediatricians did not recommend enriched baby food. You can read more about this type of cherry picking in advertising (including a real-life example) in Chapter 7 of EVERYDATA.
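The hypothetical above can be checked in a few lines. This sketch uses the text's invented 100-pediatrician scenario, not real survey data:

```python
total_pediatricians = 100
recommend_any = 10        # pediatricians who recommend baby food at all
recommend_enriched = 9    # of those, the ones who recommend enriched

# The ad's framing: 9 out of the 10 recommenders
ad_claim = recommend_enriched / recommend_any        # 0.9 -> "90%!"

# The full picture: 9 out of all 100 surveyed
overall = recommend_enriched / total_pediatricians   # 0.09 -> just 9%
```

Same underlying data, and the choice of denominator turns 9% into 90%.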
Beware of Polling: What Dewey Defeats Truman Means for the 2016 Presidential Race
NOT MISLEADING – As you'll read in Chapter 8 of EVERYDATA regarding forecasts, back in 1948 the Chicago Daily Tribune printed nearly 150,000 papers with the erroneous headline, "Dewey Defeats Truman." In 2016, Bernie Sanders beat Hillary Clinton in a "stunning Michigan primary upset" – stunning given that Sanders was down 21 points in recent polls. But the polls aren't always right. There's the margin of error, which accounts for the fact that polls survey only a sample of the full population. But you also have to consider that a poll is often used to make a prediction. And predictions can be wrong due to a number of factors, including inaccurate past data, a poor prediction model, or simply prediction error, which accounts for uncertainty in the future.
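To make the margin-of-error idea concrete, here is the standard 95% formula for a proportion estimated from a simple random sample. The numbers are purely illustrative, not from any particular poll:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for an observed proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# A 1,000-person poll at 50/50 gives the familiar "plus or minus 3 points"
moe = margin_of_error(0.5, 1000)   # ~0.031
```

Note that this only captures sampling uncertainty; it says nothing about a flawed prediction model or voters who change their minds.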
Less than 1% of General Population Requires Gluten-Free Foods for Health
NOT MISLEADING – You may have noticed more and more gluten-free foods on the shelves and on menus these days. In fact, more than 29% of Americans are trying to avoid gluten in their diets, according to one study. But the number of people who actually suffer from celiac disease—the underlying condition in which ingesting gluten damages the small intestine and impairs its ability to absorb nutrients—is less than 1%, according to the National Foundation for Celiac Awareness. So, while the market for gluten-free foods may be quite large, the actual need for them is much smaller. When it comes to sampling, it's important to pay attention to these types of details—a lesson you'll learn in Chapter 2 of EVERYDATA.
SWEET! We wanted to increase the sweetness by 400%
MISLEADING – An increase from 1 to 4 is a 300% increase, not a 400% increase. The number of candy bars added was 3, and 3/1 = 300%. This example, while obvious to some, is similar to an error we found in an actual advertisement for a well-known brand! We talk about many examples of when the data is wrong in Chapter 6 of EVERYDATA.
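The percentage math is worth spelling out, since "from 1 to 4" tempts people to read the final multiple (4×) as the increase. A quick sketch:

```python
def percent_increase(old, new):
    """Percentage change relative to the starting value."""
    return (new - old) / old * 100

# 1 candy bar -> 4 candy bars: 3 were added, relative to 1 original
increase = percent_increase(1, 4)   # 300.0, not 400

# A true 400% increase would mean going from 1 to 5
five_bars = percent_increase(1, 5)  # 400.0
```

The base of the comparison is always the *original* amount, which is where the ad went wrong.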
Should a Third-Party Candidate Run
MISLEADING – How can you have a pie chart whose slices add up to more than 100%? Sometimes articles get basic math wrong. We've seen surveys whose results total more than 100%. Occasionally, this occurs due to rounding. In this case, suppose 47.8% of the people said no, 44.6% said yes, and 7.6% were undecided. That's exactly 100%. But if you round each figure to the nearest whole number, you get 48, 45, and 8—which add up to 101. However, in this case, we think this is more misleading than not, unless there was a disclaimer about rounding.
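The rounding scenario described above is easy to reproduce:

```python
# The hypothetical split from the text: exactly 100% before rounding
shares = [47.8, 44.6, 7.6]

exact_total = sum(shares)              # 100.0 (within floating-point noise)
rounded = [round(s) for s in shares]   # [48, 45, 8]
rounded_total = sum(rounded)           # 101
```

Each individual number moved by less than half a point, but the rounded slices still overshoot 100% together — which is why careful publications add a "totals may not sum to 100% due to rounding" note.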
New Study Suggests That Reduced Street Lighting at Night Does Not Cause Traffic Collisions
MISLEADING – The headline implies that there was a causal relationship between street lighting and traffic collisions. But all we're seeing here is a correlation (or, rather, a lack thereof) in the data.
Hermosa Beach Fire Dept. Touts Exceptional Response Times
NOT MISLEADING – This headline doesn't appear to be hiding anything. And it's true—in the city of Hermosa Beach, California, the average estimated response time for the fire department was just over five minutes, as you can read in Chapter 1 of EVERYDATA. That said, one question you might want to ask is, "Is that a good response time or not?" In order to interpret the data, you may want to compare it to the city's response times in the past, response times from similar communities, and other data.
17,000 Pregnant Men Found in Great Britain Due to Insurance Miscoding
NOT MISLEADING – It's a funny story—in a letter to the British Medical Journal, three physicians cited statistics showing that more than 17,000 men received inpatient obstetric services through England's NHS (National Health Service). But there's nothing misleading about this headline. It simply reflects the physicians' findings: that these men were most likely "pregnant" due to a medical coding error. You can learn more about misrepresentation and misinterpretation in Chapter 6 of EVERYDATA.
New Report Suggests Fukushima Could Have Been Mitigated with Better Forecasts
NOT MISLEADING – The Fukushima disaster—in which radioactive material was released into the environment—occurred when a massive tsunami reached Japan's Fukushima Daiichi Nuclear Power Plant. The plant was designed to withstand a tsunami of only 3.1 meters, a figure based on historical records from just a few years earlier. The actual tsunami that hit Fukushima Daiichi was estimated to be 14 to 15 meters tall. (One report listed half a dozen tsunamis in and around Japan over the past 500 years with maximum amplitudes of more than 20 meters.) As you'll see in Chapter 8 of EVERYDATA, the accuracy of your predictions often depends largely on the quality—and quantity—of past data.
Historian Argues 44% of Presidents Are Outliers Based on Length of Term
NOT MISLEADING – How can 44% of U.S. presidents be outliers? If you look at how many days each U.S. president served in office, you'll see that most of them served either 1,460 days or 2,921 days (plus or minus a day), which correspond to four-year and eight-year terms, respectively. But 44 percent of our presidents served for shorter or longer periods, making them outliers according to this analysis. Every time a president died in office (and therefore did not complete his term), he became an outlier—as did the person who replaced him. Find out more about outliers (and averages) in Chapter 3 of EVERYDATA.