How to Accurately Interpret Statistics – Part I
By NikkiJade • Jun 17th, 2008 • Category: Articles, Spotlight
Numbers are power. Usually presented in the form of statistics, numbers can add credibility to any set of words or steer an argument in a specific direction. For this reason, learning how to properly interpret statistics is crucial to understanding an argument or composing one.
So here are a few things to keep in mind when reading and interpreting statistics. Let’s start with the basics—some important, non-technical questions/points to keep in mind in your initial glance at the data.
What is the background of the data?
Most of the time, experiments are conducted because a certain issue prompted the need to further develop or prove something. Check out the background of the issue:
- What prompted the experiment that gathered the data?
- Who is funding it?
- Does anyone stand to gain from the outcome/interpretation of the data?
- What are the external factors? I will go into more detail about this point in the discussion of causality, but for now think of this example: if pollution data is sampled from a nation with carbon-intensive industries, the data is obviously going to show high pollution levels. But look beyond the data: why might that nation’s environmental policies be so lax? What are its other options?
Who conducted the research?
All data is created through some type of experiment, and, as with any time you are forming an opinion from someone’s research, it is a good idea to check who your source is. In the case of statistics, reliable sources are less likely to use deceptive practices when presenting the results.
Another note here is to make sure you are looking at the original data and not someone’s repost and own interpretation of it. Like a game of telephone around a campfire, the message (data) can change the more people (interpretations) it passes through. If you are assessing the original source data, you will find it easier to create your own interpretation of the information. If the original data is inaccessible or too raw to interpret, then ask “who interpreted the data?” as well as “who conducted the research?” when you consider a secondary source.
How was the data collected?
How data is collected and recorded can make a huge difference to its accuracy. One popular method of collecting data is surveying or polling; that is, asking specific questions of a sampled group of individuals. It is important to consider how the questions were asked, as the survey could contain leading or ineffective questions. A leading question is one that nudges the respondent towards a certain answer. An ineffective question might be something like asking about the years of education a person has rather than the degree obtained. Asking solely about years in school does not take into account that an individual may have attended part time, skipped a grade or failed a grade, so it may not be an accurate measure of that individual’s level of education.
Hypothetical questions can be difficult to interpret, so it is always important to see how they are worded. An example of a hypothetical question can be found in one of the ways that environmental economists measure how “valuable” the environment is in dollar terms, called contingent valuation. This might be a survey passed around a neighbourhood beside a proposed airport development site, and the hypothetical question posed would be “how much would you be willing to pay to maintain the low level of noise pollution that currently exists?” Without having to make an actual transaction, a person living in that neighbourhood is more likely to overstate what their peace and quiet is worth to them.
The main point here is to analyse the questions yourself and consider how they might affect the response or the data.
Are the data presented in context?
This point goes with the idea of “what were the questions asked”. Anything can sound more exciting than it is when it is taken out of context, so it is always important to think of the bigger picture. Say you’re told on the day of an ice storm that there were 25 accidents throughout the day. Only 25? (This is fewer than on a typical day.) No problem, the roads must not be that bad then. But thinking beyond the statistic, there are likely fewer people on the road, so the actual accident rate (the number of accidents divided by the number of cars on the road) may be higher than on a regular day, implying that the roads are indeed dangerous.
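To make the ice storm point concrete, here is a small Python sketch that compares raw accident counts with accidents per 10,000 cars on the road. Only the 25 ice-storm accidents come from the example above; the typical-day accident count and both traffic volumes are made-up numbers, and the function name is just an illustration.

```python
def accidents_per_10k_cars(accidents: int, cars_on_road: int) -> float:
    """Return the number of accidents per 10,000 cars on the road."""
    if cars_on_road <= 0:
        raise ValueError("cars_on_road must be positive")
    return accidents / cars_on_road * 10_000


# Hypothetical figures: only the 25 ice-storm accidents come from the article.
typical_day = accidents_per_10k_cars(accidents=40, cars_on_road=200_000)
ice_storm_day = accidents_per_10k_cars(accidents=25, cars_on_road=50_000)

print(f"Typical day:   {typical_day:.1f} accidents per 10,000 cars")
print(f"Ice storm day: {ice_storm_day:.1f} accidents per 10,000 cars")
print(f"Per-car risk on the storm day is {ice_storm_day / typical_day:.1f}x the typical day")
```

With these assumed numbers, the storm day has fewer accidents in total but a higher accident rate per car, which is exactly the kind of context a raw count hides.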
Also, when given a percentage, always make sure the base of that percentage is given to place it in context. I see this all the time in advertising: “This product is 75% better!” Okay… 75% better than what? Your last product? Your competitor’s product? The recipe your mom uses? Than nothing?
In Michael Moore’s “Bowling for Columbine”, during the scene where he demonstrates that trusting Canadians leave their doors unlocked in Toronto, I wondered how many doors he tried that were locked before he found the ones that were not. (Everyone I know who lives in Toronto locks their doors…)
What is the relationship of the data?
Here we address the issue of correlation versus causality. Correlation means that the variables studied have a relationship of some sort (they are co-related). For example, the numbers may rise and fall together. However, just because the data has a relationship does not mean that one variable causes the other. It could be coincidence, or a third factor could be causing both variables to rise. This third factor is usually called a confounding (or lurking) variable.
One illustration of this can be seen in analysing an experiment on the effectiveness of firefighters. Say it is found that the more firefighters are sent in to combat a blaze, the more damage the fire does. Does this mean that firefighters are less effective in larger numbers? Not necessarily. The confounding variable here is the size of the fire: more firefighters are going to be sent to a larger fire, and a larger fire is going to cause more damage.
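The firefighter example can be simulated with a toy model. The Python sketch below is not real fire data: it assumes that fire size drives both the number of firefighters sent and the damage done, and that each extra firefighter slightly reduces damage. All of the coefficients, the seed and the helper name `pearson` are invented for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so the sketch is repeatable

fires = []
for _ in range(2000):
    size = rng.randint(1, 10)  # third factor: size of the fire
    # Assumption: bigger fires get more firefighters, give or take a few.
    firefighters = 5 * size + rng.randint(-3, 3)
    # Assumption: damage grows with fire size, and each firefighter reduces it.
    damage = 30 * size - 2 * firefighters + rng.gauss(0, 3)
    fires.append((size, firefighters, damage))

# Correlation across all fires, ignoring fire size.
overall = pearson([f[1] for f in fires], [f[2] for f in fires])

# Correlation among fires of the same size (size 5 only).
same_size = [f for f in fires if f[0] == 5]
within_same_size = pearson([f[1] for f in same_size], [f[2] for f in same_size])

print(f"All fires:        correlation = {overall:+.2f}")
print(f"Fires of size 5:  correlation = {within_same_size:+.2f}")
```

Across all fires, firefighters and damage rise together, even though the model is built so that firefighters reduce damage. Once fires of the same size are compared, the relationship flips to negative. The strong positive correlation comes entirely from the third factor.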
On an environmental note, causality has been at the root of much of the debate surrounding climate change. Basically, if variables are presented as causally linked, check that the causality is explained logically, that enough evidence of it is given, and that third factors are considered.
Visual Aids
Be wary of visual aids: graphs of any type can be quickly skewed by changing the scale or range of the axis used to present the data. Consider the two images I created as an example:


Believe it or not, they represent the exact same set of data. With a larger range (0 to 25), the data for each month looks more evenly distributed, whereas when I made the range smaller, the month of May looks massive compared to the other months.
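The effect of the axis range can be put into numbers. The Python sketch below uses made-up monthly values (the data behind the original charts is not given) and compares how tall each bar appears on a 0 to 25 axis versus a narrower axis. The 19 to 25 range is my own assumption for the "smaller range" chart.

```python
def bar_height(value, axis_min, axis_max):
    """Fraction of the plot height a bar fills for a given axis range."""
    return (value - axis_min) / (axis_max - axis_min)


def draw(data, axis_min, axis_max, width=40):
    """Print a rough text bar chart for the given axis range."""
    print(f"Axis range {axis_min} to {axis_max}")
    for month, value in data.items():
        bar = "#" * round(bar_height(value, axis_min, axis_max) * width)
        print(f"  {month} {bar} {value}")


def visual_ratio(data, a, b, axis_min, axis_max):
    """How many times taller bar `a` looks than bar `b` on this axis."""
    return bar_height(data[a], axis_min, axis_max) / bar_height(data[b], axis_min, axis_max)


# Hypothetical monthly values, chosen only for illustration.
monthly = {"Jan": 20, "Feb": 21, "Mar": 20, "Apr": 21, "May": 24}

draw(monthly, 0, 25)   # wide range: bars look similar
draw(monthly, 19, 25)  # narrow range: May towers over the rest

wide = visual_ratio(monthly, "May", "Jan", 0, 25)
narrow = visual_ratio(monthly, "May", "Jan", 19, 25)
print(f"May looks {wide:.1f}x as tall as Jan on the 0 to 25 axis")
print(f"May looks {narrow:.1f}x as tall as Jan on the 19 to 25 axis")
```

The values never change, only the axis does. In this sketch May is 20% larger than January, yet on the narrow axis its bar is drawn five times as tall.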
By asking these types of questions, you can begin to critically analyse and understand what the data is proving, if anything. The Walking vs. Driving case study by the Pacific Institute that was summarized and compared recently on TheGreenRocket.com is a great example of this. The researchers at the Pacific Institute did their homework when they were presented with data they believed was biased. By looking at the bigger picture and including more factors, they reached results very different from those of environmentalist Chris Goodall.
For an even deeper understanding of statistics, watch out for Thursday’s article on some of the more technical elements of examining data that are important to understand as a reader.
Creative Commons Attribution: “Actual is not normal”, Flickr, kevindooley
NikkiJade is Co-Founder of TheGreenRocket.com, an indoor cycling instructor and Honours Economics and Global Studies student at Wilfrid Laurier University with a focus in econometrics, environmental and development economics, and ecotourism. Nicole is passionate about everything green, as she believes nature’s services can be used more efficiently to generate sustainable development in all areas of the world.
Twitter: @NikkiJade



Are there any reports you know of concerning green topics that you feel or know have been misrepresented to satisfy alternative motives?
You ask.
I believe it is important to question the validity and reliability of each report. That’s why this is such a great article. Brilliant : )
Pat
Great article, reminds me of one of my favourite books – Darrell Huff’s “How to Lie with Statistics”. I have it on the shelf beside my desk!