How Collected Data Tells The Truth

Help Questions

Questions 1 - 10
1

A city health department investigates the research question: “Is the mean number of hours of sleep for adults in the city at least 7 hours?” Interviewers stood outside a downtown gym on weekday mornings and surveyed 85 adults leaving the gym; the sample mean was 7.4 hours. The department concluded that adults in the city average at least 7 hours of sleep. Which statement explains whether the conclusion is valid?

The conclusion is not valid because the sample is a convenience sample of gym-goers at a particular time and may not represent all adults in the city.

The conclusion is valid because exercising in the morning causes people to sleep more, which supports the claim.

The conclusion is valid because the sample mean is greater than 7, so the population mean must be greater than 7.

The conclusion is valid because 85 is a sufficiently large sample size for estimating a mean.

The conclusion is not valid because the interviewers should have measured sleep in minutes, not hours.

Explanation

This question examines how convenience sampling undermines the validity of generalizations to a larger population. The health department surveyed only people leaving a downtown gym on weekday mornings, creating a convenience sample that likely overrepresents health-conscious individuals who exercise regularly and may have different sleep patterns than the general adult population. People who go to the gym in the morning might prioritize sleep more or have schedules that allow for adequate rest. The sample mean of 7.4 hours cannot reliably estimate the mean for all city adults because the sampling method systematically excludes many groups (non-exercisers, people with different work schedules, etc.). A valid conclusion would require random sampling from all adults in the city, not just gym-goers.

2

A county election office investigates the research question: “What proportion of registered voters in the county approve of the new voting center location?” The office used a random-digit dialing method that called only landline phone numbers in the county and completed interviews with 520 registered voters; 61% approved. The office concluded that about 61% of all registered voters in the county approve. Which statement explains whether the conclusion is valid?

The conclusion is not valid because limiting calls to landlines may underrepresent voters who use only cell phones, so the sample may be biased.

The conclusion is valid because random-digit dialing always produces a simple random sample of voters.

The conclusion is valid because approval of the location causes people to keep landlines, making the estimate more accurate.

The conclusion is valid because the survey asked about approval, which is an objective fact.

The conclusion is not valid because 520 is too small to estimate a proportion for an entire county.

Explanation

This question illustrates coverage bias, where the sampling method systematically excludes certain population segments. By using random-digit dialing limited to landline phones, the election office excludes voters who rely exclusively on cell phones, a group that tends to be younger and may have different political views. This creates undercoverage of an important demographic, potentially biasing the approval estimate. While random-digit dialing of landlines was once considered good practice, changing communication patterns have made it problematic for representing entire populations. The 61% approval estimate may not accurately reflect all registered voters' opinions if cell-phone-only users have different views about the voting center location. Modern polling must include cell phones or use other methods to ensure all voter segments have a chance of selection.

3

A school district investigates the research question: “What proportion of all district parents support requiring school uniforms?” The district randomly selected 8 schools from the 32 schools in the district, then surveyed all parents who attended a PTA meeting at each selected school; 64% of those surveyed supported uniforms. The district concluded that about 64% of all district parents support requiring uniforms. Which statement explains whether the conclusion is valid?

The conclusion is valid because cluster sampling always produces unbiased results, regardless of who responds within clusters.

The conclusion is not valid because surveying only PTA meeting attendees may overrepresent more involved parents, so the sample may not represent all parents in the district.

The conclusion is not valid because the sample percentage would be valid only if exactly 32 schools were selected.

The conclusion is valid because selecting 8 schools at random guarantees that the parents surveyed at those schools form a random sample of all district parents.

The conclusion is valid because requiring uniforms would cause parents to support uniforms, so the estimate must be correct.

Explanation

This question illustrates how cluster sampling can introduce bias when combined with convenience sampling within clusters. While the district appropriately randomly selected 8 schools (clusters) from 32, they then surveyed only parents attending PTA meetings at those schools. PTA meeting attendees likely overrepresent parents who are more involved in school activities, have strong opinions about school policies, and have schedules permitting meeting attendance. These parents may have different views on uniforms than less-involved parents or those unable to attend meetings. The 64% support estimate probably doesn't represent all district parents accurately. For valid conclusions, the district should have randomly sampled parents within each selected school rather than relying on PTA meeting attendance, which creates a biased sub-sample within each cluster.

4

A wildlife agency investigates the research question: “What is the mean weight of adult trout in Lake Orion?” Biologists used nets at two easily accessible shoreline locations on a single afternoon and weighed 60 adult trout; the sample mean weight was 1.8 kg. They concluded that the mean weight of adult trout in Lake Orion is about 1.8 kg. Which statement explains whether the conclusion is valid?

The conclusion is valid because weighing 60 fish is enough to estimate a mean for the entire lake.

The conclusion is not valid because fish weight is categorical, so a mean cannot be computed.

The conclusion is valid because catching trout causes the remaining trout to be heavier on average, confirming the estimate.

The conclusion is valid because using nets prevents any selection bias in which fish are caught.

The conclusion is not valid because the fish were caught at only two accessible shoreline locations and one time, so the sample may not represent all adult trout in the lake.

Explanation

This question demonstrates how convenience sampling based on accessibility can create biased samples in ecological research. The biologists sampled trout from only two easily accessible shoreline locations on a single afternoon, which likely doesn't represent the full population of adult trout in Lake Orion. Fish at different depths, in different parts of the lake, or active at different times may have different weights due to varying food availability, water temperature, or habitat quality. Shoreline fish might be smaller or larger than those in deeper waters. Additionally, sampling on just one afternoon doesn't account for temporal variation. The 1.8 kg mean weight estimate is questionable because the sampling method systematically excludes trout from most of the lake. Valid conclusions would require sampling from multiple locations throughout the lake across different times.

5

A university investigates the research question: “Do first-year students at the university spend more time studying on weekends than on weekdays?” Researchers recruited 40 volunteers from an honors study-skills workshop and had them self-report hours studied on a typical weekday and a typical weekend day; the volunteers reported more hours on weekends. The researchers concluded that first-year students at the university generally study more on weekends than weekdays. Which statement explains whether the conclusion is valid?

The conclusion is valid because 40 students is enough to generalize to all first-year students.

The conclusion is valid because comparing weekend and weekday study time demonstrates that weekends cause students to study more.

The conclusion is not valid because the participants were volunteers from an honors workshop, so the sample may not represent all first-year students.

The conclusion is not valid because self-reported study time can never be used in statistics.

The conclusion is valid because the study measured the same students twice, eliminating all bias.

Explanation

This question highlights how volunteer samples from specific subgroups cannot support generalizations to larger populations. The researchers recruited volunteers from an honors study-skills workshop, creating a sample that likely overrepresents academically motivated students who actively seek study improvement resources. These students probably have different study habits than typical first-year students, making any conclusions about weekend versus weekday study time invalid for the general first-year population. Additionally, self-selection bias occurs when students choose to participate, further skewing the sample toward those interested in the research topic. To validly answer their research question, researchers would need to randomly sample from all first-year students, not just workshop attendees, and address potential nonresponse issues.

6

A state transportation agency investigates the research question: “What proportion of all registered drivers in the state have received a speeding ticket in the past year?” The agency mailed surveys to 2000 randomly selected registered drivers; 620 returned the survey, and 18% of respondents reported a speeding ticket in the past year. The agency concluded that about 18% of all registered drivers received a speeding ticket in the past year. Which statement explains whether the conclusion is valid?

The conclusion is not valid because 620 responses is less than half of 2000, so the sample size is automatically too small.

The conclusion is valid because the agency used random selection, so nonresponse can be ignored.

The conclusion is valid because receiving a ticket causes drivers to avoid surveys, proving the estimate is accurate.

The conclusion is valid because the surveys were mailed, which prevents any kind of bias.

The conclusion is not valid because drivers who received tickets might be less likely to respond, so nonresponse bias could make 18% an underestimate.

Explanation

This question demonstrates how nonresponse bias can specifically undermine conclusions when the characteristic being measured might influence response rates. The transportation agency randomly selected drivers, which is appropriate, but only 31% (620/2000) responded to the mailed survey. Critically, drivers who received speeding tickets might be less likely to respond due to embarrassment, legal concerns, or negative associations with government agencies. This differential nonresponse would cause the 18% estimate to underestimate the true proportion of drivers with speeding tickets. The low response rate combined with the sensitive nature of the question makes the conclusion questionable. A more valid approach might use official records rather than self-reported data, or employ methods to increase response rates and assess nonresponse bias.
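
The direction of this bias can be checked with a little arithmetic. The sketch below is illustrative only: the true ticket rate (25%) and the two response rates (20% for ticketed drivers, 35% for everyone else) are assumed values, not figures from the question, and `observed_proportion` is a hypothetical helper name.

```python
def observed_proportion(true_p, resp_rate_yes, resp_rate_no):
    """Share of respondents reporting a ticket, given the true proportion
    and separate response rates for ticketed and non-ticketed drivers."""
    responding_yes = true_p * resp_rate_yes
    responding_no = (1 - true_p) * resp_rate_no
    return responding_yes / (responding_yes + responding_no)


# Assumed values for illustration only (not taken from the question).
true_p = 0.25          # hypothetical true proportion with a ticket
rate_ticketed = 0.20   # hypothetical response rate among ticketed drivers
rate_other = 0.35      # hypothetical response rate among other drivers

estimate = observed_proportion(true_p, rate_ticketed, rate_other)
overall_response = true_p * rate_ticketed + (1 - true_p) * rate_other

print(f"overall response rate: {overall_response:.0%}")  # 31%
print(f"survey estimate:       {estimate:.0%}")          # 16%
print(f"true proportion:       {true_p:.0%}")            # 25%
```

Under these assumed rates the survey would report about 16% even though 25% of drivers were ticketed. If both groups responded at the same rate, the estimate would equal the true proportion, which is why the concern is differential nonresponse rather than the low response rate alone.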

7

A city transportation office wants to investigate the research question: “What proportion of all adult residents in the city support adding protected bike lanes on major streets?” The population is all adult residents of the city. The office posted a link to an online poll on the city’s official social media accounts and received 2,450 responses; 71% of respondents said they support adding protected bike lanes. The office concludes that about 71% of all adult residents support adding protected bike lanes. Which statement explains whether the conclusion is valid?

The conclusion is not valid because 71% support in the sample proves bike lanes will reduce traffic congestion.

The conclusion is valid because the sample size is large, so the estimate must be close to the true proportion.

The conclusion is not valid because the sample size should be at least 10% of the city’s adults.

The conclusion is valid because posting on the city’s accounts gives every adult an equal chance to respond.

The conclusion is not valid because the sample is a voluntary response sample and may overrepresent people with strong opinions.

Explanation

This question tests understanding of voluntary response bias in data collection. The city posted a poll on social media, which creates a voluntary response sample in which people self-select to participate. This sampling method typically overrepresents people with strong opinions about the topic (bike lanes) and underrepresents those who are neutral or less engaged. The correct answer identifies this fundamental flaw: the sample is a voluntary response sample, so it may not be representative of the population. When data collection allows self-selection rather than using random sampling, the resulting estimates cannot be trusted to reflect the true population parameter, regardless of how large the sample size is.
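
To see why the large sample size does not rescue the conclusion, it helps to compute what the sample size does control. The sketch below uses the standard large-sample margin of error for a proportion, which assumes a simple random sample; `margin_of_error` is a hypothetical helper name, and applying the formula to this poll is done only to illustrate what the number does and does not measure.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Valid only when the n responses come from a random sample.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Figures from the question: 71% support among 2,450 responses.
moe = margin_of_error(0.71, 2450)
print(f"{moe:.3f}")  # 0.018
```

A margin of roughly 1.8 percentage points looks precise, but it describes only chance variation between random samples. It contains no term for who chose to respond, so a voluntary response poll can be off by far more than this figure suggests.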

8

A county wants to estimate the proportion of households that have access to high-speed internet. The county divided the county into 10 geographic regions and then randomly selected 2 regions to survey. Within each selected region, the county surveyed every household and found that 58% of surveyed households had high-speed internet. The county concluded that “about 58% of all households in the county have access to high-speed internet.” Which statement explains whether the conclusion is valid?

The conclusion is not valid because selecting only 2 regions is cluster sampling, and if the chosen regions differ from others, the result may not represent the entire county.

The conclusion is not valid because 58% is not a majority, so it cannot be generalized to the population.

The conclusion is valid because cluster sampling always produces a representative sample as long as clusters are geographic.

The conclusion is not valid because having high-speed internet causes households to live in the selected regions.

The conclusion is valid because surveying every household in the selected regions eliminates sampling error.

Explanation

This question tests whether you recognize the limits of cluster sampling with few clusters and how those limits affect inferences about the population. The county randomly selected only 2 of its 10 regions and surveyed every household in them. With so few clusters, the sample may not capture county-wide variability: if the chosen regions differ from the others, the 58% figure could be far from the county-wide proportion. Choice C distracts by claiming cluster sampling is always representative for geographic clusters, but representativeness depends on random selection and on having enough clusters. Trustworthy conclusions from cluster sampling require enough randomly selected clusters to approximate the population's diversity. The conclusion is not valid here because two regions are too few: random selection keeps the method fair on average, but a single draw of 2 regions can misrepresent the county almost as badly as a non-random sample.
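
The effect of choosing only 2 of 10 regions can be made concrete by listing every possible pair. The sketch below is a hypothetical illustration: the ten regional percentages are invented, and the regions are assumed to contain equal numbers of households so that a simple average of two regions is the cluster-sample estimate.

```python
from itertools import combinations

# Hypothetical percent of households with high-speed internet in each of
# the 10 regions (assumed values; regions assumed to be equal in size).
region_pct = [90, 85, 80, 70, 65, 60, 55, 45, 40, 30]

county_pct = sum(region_pct) / len(region_pct)

# Every possible choice of 2 regions, each surveyed completely.
estimates = [(a + b) / 2 for a, b in combinations(region_pct, 2)]

print(county_pct)                        # 62.0
print(len(estimates))                    # 45
print(min(estimates), max(estimates))    # 35.0 87.5
print(sum(estimates) / len(estimates))   # 62.0
```

With these assumed values, the 45 possible estimates average out to the county figure of 62%, yet an individual draw can land anywhere from 35% to 87.5%. Surveying every household inside the chosen regions does nothing to narrow that range, because the variation comes from which regions were picked.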

9

A company has 4 departments with different numbers of employees. To estimate the proportion of all employees who prefer working from home at least 3 days per week, an analyst randomly selected 50 employees from each department (200 total) and found that 55% of those sampled preferred working from home at least 3 days per week. The analyst concluded that “about 55% of all employees prefer working from home at least 3 days per week.” Which statement explains whether the conclusion is valid?

The conclusion may not be valid because sampling equal counts from unequal-sized departments overrepresents smaller departments unless results are weighted or a proportional stratified sample is used.

The conclusion is valid because stratifying always makes a sample representative even if strata sizes are ignored.

The conclusion is not valid because the survey proves that working from home increases employee productivity.

The conclusion is not valid because the sample size is only 200; it must be at least 500 to generalize.

The conclusion is valid because selecting the same number from each department guarantees no bias, regardless of department sizes.

Explanation

This question asks you to evaluate stratified sampling when the strata differ in size. The analyst sampled 50 employees from each of 4 departments of different sizes, so employees in smaller departments are overrepresented in the pooled sample of 200, and the overall 55% can be skewed unless the results are weighted. The distractor claiming that selecting the same number from each department guarantees no bias ignores the need for proportional allocation or weighting in stratified samples. Trustworthy conclusions require that a stratified sample reflect the population's composition, either by sampling each stratum in proportion to its size or by weighting each stratum's result by its share of the population. Without such an adjustment the conclusion may not be valid: the sampling design and the analysis must match the structure of the population for the generalization to be accurate.
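
A small calculation shows how much weighting can matter. The sketch below is hypothetical: the department sizes and the per-department counts are invented for illustration, chosen only so that the pooled sample matches the 55% reported in the question.

```python
# Hypothetical department sizes and sample results (assumed for illustration).
dept_sizes = [600, 100, 200, 100]   # employees in each department
sample_yes = [20, 35, 25, 30]       # "prefer 3+ days" out of 50 sampled per department
n_per_dept = 50

# Unweighted estimate: pool all 200 responses as if departments were equal in size.
unweighted = sum(sample_yes) / (n_per_dept * len(dept_sizes))

# Weighted estimate: each department's sample proportion counts in
# proportion to that department's share of all employees.
total_employees = sum(dept_sizes)
weighted = sum(
    size * yes / n_per_dept for size, yes in zip(dept_sizes, sample_yes)
) / total_employees

print(f"{unweighted:.2f}")  # 0.55
print(f"{weighted:.2f}")    # 0.47
```

In this made-up company the largest department has the lowest preference for working from home, so the pooled 55% overstates the company-wide figure of 47%. If every department had the same proportion, or if all departments were the same size, the two estimates would agree.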

10

A health clinic wants to estimate the proportion of adults in its county who received a flu shot this season. A nurse sampled 300 patients who came to the clinic during a two-day period and found that 64% reported receiving a flu shot. The clinic concluded that “about 64% of adults in the county received a flu shot this season.” Which statement explains whether the conclusion is valid?

The conclusion is not valid because the nurse should have used a smaller sample to reduce variability.

The conclusion is not valid because the sample consists of clinic visitors and may not represent all adults in the county (selection bias).

The conclusion is valid because 300 is a standard sample size used in many surveys.

The conclusion is not valid because getting a flu shot causes people to visit clinics more often.

The conclusion is valid because patients at the clinic are a random subset of all adults in the county.

Explanation

This question focuses on identifying selection bias in convenience samples. The clinic sampled only its own visitors, who may be more health-conscious and more likely to get flu shots than other adults, so the group does not represent all adults in the county. The distractor claiming that clinic patients are a random subset of county adults is wrong because patients choose to visit the clinic and may differ systematically from the population. Trustworthy conclusions require a sampling method that can reach adults who do not visit the clinic, such as random-digit dialing or address-based sampling. Here the conclusion is not valid because of selection bias: samples drawn at a specific venue often fail to generalize unless the people at that venue mirror the population.
