in any of a number of ways.
In the way the sample is selected: For example, if you want to get an estimate of how much Christmas shopping people in your community plan to do this year, and you take your clipboard and head out to the mall on the day after Thanksgiving to ask customers about their shopping plans, you have bias in your sampling process. Your sample tends to favor those die-hard shoppers at that particular mall who were braving the massive crowds that day.
In the way data are collected: Poll questions are a major source of bias. Because researchers are often looking for a particular result, the questions they ask can often reflect that expected result. For example, the issue of a tax levy to help support local schools is something every voter faces at one time or another. A poll question asking, "Don't you think it would be a great investment in our future to support the local schools?" does have a bit of bias. On the other hand, so does the question, "Aren't you tired of paying money out of your pocket to educate other people's children besides your own?" Question wording can have a huge impact on the results. See Chapter 16 for more on designing polls and surveys.
Tip
When examining polling results that are important to you or that you're particularly interested in, find out what questions were asked and exactly how the questions were worded before drawing your conclusions about the results.
Data
Data are the actual measurements that you get through your study. (Remember that "data" is plural — the singular is datum — so sentences that use that word always sound a little funny, but they are grammatically correct.) Most data fall into one of two groups: numerical data or categorical data (see Chapter 5 for additional information).
Numerical data are data that have meaning as a measurement, such as a person's height, weight, IQ, or blood pressure; the number of stocks a person owns; the number of teeth a person's dog has; or anything else that can be counted. (Statisticians also refer to numerical data as quantitative data or measurement data.)
Categorical data represent characteristics, such as a person's gender, opinion, race, or even bellybutton orientation (innie versus outie; is nothing sacred anymore?). While these characteristics can take on numerical values (such as a "1" indicating male and "2" indicating female), those numbers don't have any specific meaning. You couldn't add them together, for example. (Note that statisticians also call these qualitative data.)
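To see why category codes can't be treated like measurements, here is a minimal Python sketch. The five coded responses are made up for illustration; only the coding scheme (1 for male, 2 for female) comes from the example above.

```python
from collections import Counter

# Hypothetical survey responses, coded 1 = male and 2 = female.
gender_codes = [1, 2, 2, 1, 2]
labels = {1: "male", 2: "female"}

# Counting how often each category occurs is a meaningful summary.
counts = Counter(labels[code] for code in gender_codes)
print(counts)

# Averaging the codes runs without error, but the result (1.6) describes
# nothing about the people surveyed; the codes are labels, not amounts.
mean_code = sum(gender_codes) / len(gender_codes)
print(mean_code)
```

The computer will happily do arithmetic on the codes, so it's up to you to know which kind of data you have before you summarize it.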
HEADS UP
Not all data are created equal. Finding out how the data were collected can go a long way toward determining how you weigh the results and what conclusions you draw from them.
Data set
A data set is the collection of all the data taken from your sample. For example, if you measured the weights of five packages, and those weights were 12 lbs, 15 lbs, 22 lbs, 68 lbs, and 3 lbs, those five numbers (12, 15, 22, 68, 3) constitute your data set. Most data sets are quite a bit larger than this one, however.
Statistic
A statistic is a number that summarizes the data collected from a sample. People use many different statistics to summarize data. For example, data can be summarized as a percentage (60% of the households sampled from the United States own more than two cars), an average (the average price of a home in this sample is … ), a median (the median salary for the 1,000 computer scientists in this sample was … ), or a percentile (your baby's weight is at the 90th percentile this month, based on data collected from over 10,000 babies … ).
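As a quick illustration, the short Python sketch below computes a few statistics from the five package weights used in the data set example (12, 15, 22, 68, and 3 lbs). The 20 lb cutoff for a "heavy" package is an assumption made up for this example, and the variable names are illustrative only.

```python
import statistics

# The five package weights (in pounds) from the data set example.
weights_lbs = [12, 15, 22, 68, 3]

# Average (mean): the sum of the weights divided by how many there are.
mean_weight = statistics.mean(weights_lbs)

# Median: the middle value once the weights are sorted.
median_weight = statistics.median(weights_lbs)

# Percentage: share of packages heavier than a cutoff.
# The 20 lb cutoff is a hypothetical choice for this sketch.
heavy_cutoff_lbs = 20
pct_heavy = 100 * sum(w > heavy_cutoff_lbs for w in weights_lbs) / len(weights_lbs)

print(mean_weight, median_weight, pct_heavy)
```

Here the average works out to 24 lbs and the median to 15 lbs. The gap comes from the single 68 lb package, which pulls the average up but leaves the median alone. Each of these numbers is a statistic because it summarizes the sample of five packages.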
HEADS UP
Not all statistics are correct or fair, of course. Just because someone gives you a statistic, nothing guarantees that the statistic is scientific or legitimate! You may have heard the saying, "Figures don't lie, but liars figure."
TECHNICAL STUFF
Statistics are based on sample data, not on population data. If you collect data from the entire population, this process