Random Sampling and Data Collection
A school district wants to estimate the average number of hours of sleep per night for all high school students in the district (grades 9–12). A student researcher stands in the cafeteria for two days during lunch and asks every 5th student who walks past a certain hallway entrance to report their typical sleep on school nights, collecting 180 responses. No random number generator was used, and students who did not pass that entrance during those lunches could not be selected. Which statement best describes the sample representativeness?
The sample is representative because every student had an equal chance of being selected during the two lunches.
The sample is likely not representative because it is a convenience/systematic sample from a specific location and time, which may exclude some groups of students.
The sample is not representative only if the researcher randomly assigned students to sleep more or less.
The sample is likely representative because selecting every 5th student guarantees randomness.
The sample is representative because the sample size (180) is large enough to eliminate bias.
Explanation
This question assesses the skill of evaluating sample representativeness in AP Statistics, focusing on random sampling and data collection methods. The sampling method combines convenience and systematic sampling: the researcher surveys only students who pass a specific location during limited times and uses no randomization, which may exclude students with different lunch schedules or routes. A common distractor is choice D, which incorrectly claims that selecting every 5th student guarantees randomness; a systematic sample is random only if it has a random starting point and is drawn from a list or stream that covers the whole population. In a mini-lesson on random sampling, remember that a sample is most likely to be representative when every individual in the population has a known, nonzero chance of selection, ideally an equal chance through simple random sampling with a random number generator. This approach minimizes selection bias and allows results to generalize to the population. Here, the limited access and lack of randomization introduce selection bias, so the sample is likely unrepresentative.
A restaurant chain wants to estimate the mean satisfaction rating (1–10) among all customers who ate at its locations last month. The population is all such customers. Each receipt includes a QR code to an online survey, and customers choose whether to respond; 4,800 surveys are completed. Which statement best describes the sample representativeness?
The sample is likely unrepresentative because it is a voluntary response sample that may overrepresent very satisfied or very dissatisfied customers.
The sample is representative because the survey was the same for all respondents, so there is no confounding.
The sample is representative because 4,800 is a very large sample size.
The sample is representative because every customer had an equal chance to respond by scanning the QR code.
The sample is unrepresentative because customers were not randomly assigned to restaurant locations.
Explanation
This question exemplifies voluntary response bias, a major threat to representativeness. When people self-select into a sample (choosing whether to scan the QR code and complete the survey), those with strong opinions are more likely to respond. Very satisfied customers might want to praise the restaurant and very dissatisfied customers might want to complain, while moderately satisfied customers may not bother. This creates a biased sample that does not represent the typical customer experience. The large sample size (4,800) cannot fix this fundamental bias: a larger voluntary response sample repeats the same self-selection on a bigger scale. True random sampling requires the researcher, not the subjects, to control who is selected.
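The point that sample size cannot repair self-selection can be illustrated with a small calculation. The sketch below is only an illustration under assumed numbers: the rating counts and the response rates are hypothetical, not data from the restaurant chain. It compares the true population mean with the mean we would expect among people who choose to respond.

```python
# Hypothetical population of ratings (1-10) and response rates; none of these
# numbers come from the restaurant chain in the question.
population_counts = {1: 200, 2: 300, 3: 500, 4: 1000, 5: 2000,
                     6: 3000, 7: 4000, 8: 5000, 9: 3000, 10: 1000}

def response_rate(rating):
    """Assumed chance that a customer with this rating completes the survey."""
    if rating <= 2:
        return 0.60   # very dissatisfied customers often want to complain
    if rating >= 9:
        return 0.30   # very satisfied customers often want to praise
    return 0.05       # moderately satisfied customers rarely bother

def population_mean(counts):
    total = sum(counts.values())
    return sum(r * n for r, n in counts.items()) / total

def expected_respondent_mean(counts):
    # Expected number of respondents at each rating = count * response rate.
    weights = {r: n * response_rate(r) for r, n in counts.items()}
    return sum(r * w for r, w in weights.items()) / sum(weights.values())

true_mean = population_mean(population_counts)
biased_mean = expected_respondent_mean(population_counts)

# Ten times as many customers (and so ten times as many responses)
# leaves the expected respondent mean unchanged.
bigger = {r: 10 * n for r, n in population_counts.items()}
biased_mean_bigger = expected_respondent_mean(bigger)

print(f"population mean:       {true_mean:.3f}")
print(f"respondent mean:       {biased_mean:.3f}")
print(f"respondent mean (x10): {biased_mean_bigger:.3f}")
```

Under these assumed numbers the population mean is 6.965, while the expected respondent mean is about 7.32 whether the customer base is the original size or ten times larger. The gap comes from who responds, not from how many respond.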
A company wants to estimate the proportion of its 4,800 employees who prefer working remotely at least 3 days per week. Employees are divided into departments (Engineering, Sales, Customer Support, HR, Finance). The company randomly selects 60 employees from each department (300 total) using a random number generator and surveys them. Which statement best describes the sample representativeness?
The sample is likely representative of each department, but it may not represent the overall company well if departments differ in size and equal numbers were taken from each.
The sample is representative because 300 employees is large enough to be representative of 4,800 employees.
The sample is likely representative because random assignment within departments ensures the results will generalize to all employees.
The sample is likely not representative because stratified sampling always creates bias by forcing equal numbers from each department.
The sample is representative only if the company selected the 300 employees by asking for volunteers in each department.
Explanation
This AP Statistics question focuses on stratified random sampling and its impact on representativeness in data collection. The sample is stratified by department with equal sample sizes, which represents each stratum well but may distort the company-wide estimate if departments vary in size. Choice D distracts by claiming that stratified sampling always creates bias, but stratification is actually useful for ensuring subgroup representation, especially when sample sizes are proportional to stratum sizes or the results are weighted accordingly. Mini-lesson: Random sampling within strata helps capture population diversity, but for overall estimates, each stratum should count in proportion to its size so that larger groups are not underrepresented. Here, pooling equal samples from each department could bias the company-wide proportion if larger departments have different preferences.
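A short worked example may make the size issue concrete. The department sizes and survey counts below are hypothetical (the question gives only the 4,800 total and the 60 sampled per department); the sketch compares a pooled estimate with one that weights each department by its size.

```python
# Hypothetical department sizes (they sum to the 4,800 employees in the question).
department_sizes = {"Engineering": 2400, "Sales": 1200,
                    "Customer Support": 800, "HR": 200, "Finance": 200}

# Hypothetical survey results: of the 60 sampled per department,
# how many prefer remote work at least 3 days per week.
prefer_remote = {"Engineering": 48, "Sales": 24,
                 "Customer Support": 30, "HR": 18, "Finance": 30}
SAMPLED_PER_DEPARTMENT = 60

# Naive estimate: pool all 300 responses as if departments were equal in size.
pooled = sum(prefer_remote.values()) / (SAMPLED_PER_DEPARTMENT * len(prefer_remote))

# Weighted estimate: each department's sample proportion counts in
# proportion to that department's share of the company.
total_employees = sum(department_sizes.values())
weighted = sum(
    department_sizes[d] * prefer_remote[d] / SAMPLED_PER_DEPARTMENT
    for d in department_sizes
) / total_employees

print(f"pooled estimate:   {pooled:.3f}")
print(f"weighted estimate: {weighted:.3f}")
```

With these assumed numbers the pooled estimate is 0.500 and the weighted estimate is about 0.617, because the large Engineering department, where remote work is popular in this made-up data, is underrepresented in the pooled figure.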
A university wants to estimate the mean amount of money spent on textbooks per semester by all undergraduate students. The registrar provides a roster of all undergraduates, and the university uses a random number generator to select 250 students from the roster and emails them a required survey that must be completed to register for next semester. All 250 selected students respond. Which statement best describes the sample representativeness?
The sample is not representative because using a roster means some students had zero chance of selection.
The sample is representative because 250 is large enough to guarantee it matches the population.
The sample is not representative because emailing students is a form of convenience sampling.
The sample is likely representative because a simple random sample of all undergraduates was taken and nonresponse was essentially eliminated.
The sample is representative only if the 250 students were randomly assigned to buy textbooks or not.
Explanation
In AP Statistics, this question evaluates understanding of sample representativeness through proper random sampling techniques. The method used is simple random sampling from the full roster, and the mandatory survey essentially eliminates nonresponse bias. A distractor like choice A wrongly implies that using a roster excludes some students, but a complete roster combined with random selection gives every undergraduate an equal chance of being chosen. Mini-lesson on random sampling: It requires a complete list (sampling frame) and random selection so that the sample mirrors the population and avoids bias. Here, the SRS and full participation make the sample highly likely to be representative for estimating textbook spending.
A principal wants to estimate the mean number of hours of homework per week for all students at a high school. The principal takes a simple random sample of 80 students from the school roster using a random number generator. Which statement best describes the sample representativeness?
The sample is likely to be representative because it is a simple random sample from the full roster, giving each student an equal chance of selection.
The sample is biased because 80 students is too small to be representative.
The sample is representative because the principal selected students without looking at their grades, which makes it random assignment.
The sample is biased because random number generators do not create truly random samples.
The sample is representative only if students are randomly assigned to different homework levels.
Explanation
This question tests understanding of simple random sampling, the gold standard for representative sampling. The principal uses a random number generator to select 80 students from the complete school roster, giving every student an equal chance of selection. This is a textbook example of simple random sampling (SRS), which, when properly implemented, produces representative samples. Option A correctly identifies this as a representative sampling method. The distractors contain common misconceptions: sample size of 80 can be adequate depending on the population size and variability, random number generators are valid tools for creating random samples, and this is random selection (not random assignment, which is an experimental design concept). The key principle is that every student had an equal probability of being selected, which SRS achieves. This method should produce unbiased estimates of the mean homework hours for all students in the school.
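As a minimal sketch of what "using a random number generator on the roster" could look like, the example below draws 80 students without replacement. The roster of 1,200 ID numbers and the seed are assumptions made for illustration; the question does not state the school's enrollment.

```python
import random

# Hypothetical roster: student ID numbers 1 through 1,200 (the school's
# actual enrollment is not given in the question).
roster = list(range(1, 1201))

rng = random.Random(2024)            # fixed seed only so the sketch is repeatable
sample_ids = rng.sample(roster, 80)  # 80 distinct students, drawn without replacement

print(sorted(sample_ids)[:10])       # first few selected IDs
```

Because `random.sample` draws without replacement from the whole list, every student on the roster has the same chance of selection and every group of 80 is equally likely, which is what defines an SRS.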
A company wants to estimate the mean number of hours per week its 3,000 employees work from home. The population is all employees. The company divides employees into departments (Engineering, Sales, HR, Support) and then randomly selects 50 employees from each department using a random number generator, for a total of 200 employees surveyed. Which statement best describes the sample representativeness?
The sample is likely representative because it is a stratified random sample with random selection within each department.
The sample is not representative because employees were not randomly assigned to departments.
The sample is representative because 200 employees is a large sample regardless of how they were chosen.
The sample is representative only if each employee had exactly the same chance of being selected.
The sample is not representative because stratified sampling is not random sampling.
Explanation
This question tests understanding of stratified random sampling, a valid probability sampling method. The company divided employees into strata (departments) and then randomly selected 50 employees from each stratum. This ensures representation from all departments and can actually improve precision compared to simple random sampling when groups differ. The key is that within each stratum, selection was random using a random number generator. Employees in smaller departments have a higher chance of selection than those in larger ones, but every selection probability is known, so the company-wide mean can still be estimated without bias by weighting each department's mean by its share of the 3,000 employees. The distractors incorrectly suggest that only simple random sampling is valid or confuse random sampling with random assignment.
A gym wants to estimate the proportion of its members who would pay extra for childcare services. The gym prints a list of all members and then selects every 20th name starting from the first name on the list, surveying those selected members (about 150 people). The membership list is ordered by the date members joined the gym. Which statement best describes the sample representativeness?
The sample is likely representative because selecting every 20th name is always equivalent to a simple random sample.
The sample is not representative because it used a random number generator to pick every 20th name.
The sample is representative because 150 is large enough to ensure representativeness.
The sample is representative only if the gym randomly assigned members to join earlier or later.
The sample may not be representative because systematic sampling from an ordered list can be biased if the ordering is related to the variable of interest.
Explanation
In AP Statistics, this question tests the potential biases of systematic sampling in data collection. Selecting every 20th name from a list ordered by join date, always starting from the first name, may introduce bias if the ordering is related to interest in childcare, for example if newer members tend to have different family situations. Distractor A wrongly equates systematic sampling with simple random sampling; without a random starting point, and with a list whose order may relate to the variable, patterns in the list can bias the result. Mini-lesson: Simple random sampling avoids the pitfalls of an ordered list because selection does not depend on list position; systematic sampling works well when it uses a random start and the list has no pattern or periodicity related to the variable. Here, the ordering could make the sample unrepresentative for the proportion interested in childcare.
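For comparison, a systematic sample with a random starting point can be sketched as follows. The list of 3,000 member IDs is hypothetical (chosen so that every 20th name gives about 150 people, as in the question), and `systematic_sample` is an illustrative helper, not something the gym used.

```python
import random

def systematic_sample(ordered_list, step, rng):
    """Take every `step`-th item, beginning at a randomly chosen position."""
    start = rng.randrange(step)   # random start among the first `step` items
    return ordered_list[start::step]

# Hypothetical membership list of 3,000 IDs in join-date order.
members = list(range(1, 3001))
rng = random.Random(7)            # fixed seed only for repeatability
selected = systematic_sample(members, 20, rng)

print(len(selected), selected[:5])
```

A random start gives each member a 1-in-20 chance of selection, but it does not remove the concern about ordering: if the list has a pattern related to the variable of interest, the sample can still be affected.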
A restaurant chain wants to estimate the mean customer satisfaction rating (1–5) for all customers nationwide. The chain randomly selects 50 of its 500 locations using a random number generator. At each selected location, managers survey the first 20 customers who dine in on a Monday morning, for a total of about 1,000 surveys. Which statement best describes the sample representativeness?
The sample is likely representative because the chain used random assignment to choose which customers ate on Monday morning.
The sample is representative because cluster sampling always produces unbiased results regardless of how individuals are chosen within clusters.
The sample is representative because the overall sample size is about 1,000, which guarantees representativeness.
The sample is representative because selecting 50 locations at random ensures every customer nationwide had an equal chance of being surveyed.
The sample is likely not representative because, although locations were randomly selected, the customers within each location were a convenience sample limited to Monday morning dine-in customers.
Explanation
This question assesses how well cluster sampling supports representativeness in AP Statistics data collection. The locations (clusters) are randomly selected, but within each location the customers are a convenience sample (the first 20 dine-in customers on a Monday morning), which biases the sample toward one specific type of customer. Choice B distracts by claiming that cluster sampling is always unbiased, but selection within clusters must also be random, or else everyone in each chosen cluster must be surveyed. Mini-lesson: When only some individuals in each selected cluster are surveyed, they should be chosen at random so that the sample mirrors the population. The convenience approach here limits representativeness, potentially skewing the nationwide satisfaction ratings.
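A two-stage design with randomness at both stages could be sketched like this. The customer lists are hypothetical placeholders (500 locations with 300 made-up customer IDs each); in practice the chain would need some record of the customers at each selected location to sample from.

```python
import random

rng = random.Random(11)  # fixed seed only for repeatability

# Hypothetical data: 500 locations, each with a list of 300 customer IDs.
locations = {
    f"location_{i}": [f"loc{i}_cust{j}" for j in range(300)]
    for i in range(500)
}

# Stage 1: randomly select 50 of the 500 locations (the clusters).
chosen_locations = rng.sample(sorted(locations), 50)

# Stage 2: randomly select 20 customers within each chosen location,
# rather than taking the first 20 who arrive on a Monday morning.
surveyed = {
    loc: rng.sample(locations[loc], 20)
    for loc in chosen_locations
}

total_surveyed = sum(len(customers) for customers in surveyed.values())
print(total_surveyed)
```

This keeps the same total of about 1,000 surveys while giving customers at each chosen location an equal chance of selection regardless of when they visit.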
A city wants to estimate the proportion of households that support a new recycling fee. The city has a list of all residential addresses and uses a random number generator to select 600 addresses to mail a survey. Only 210 households return the survey, and the city reports results using only those 210 responses. Which statement best describes the sample representativeness?
Because the original 600 addresses were randomly selected, the 210 responses are automatically representative of the city.
The sample is not representative because simple random sampling can never produce representative samples.
The sample is representative because random assignment was used to decide who returned the survey.
The 210 respondents form a voluntary response sample, so the results may be biased and not representative of all households.
The sample is representative because 210 is a large sample size for a city survey.
Explanation
This question tests the ability to identify biases in sampling for AP Statistics, particularly in random sampling and data collection. The initial selection is a simple random sample of all addresses, but because only 210 of the 600 selected households chose to return the survey, the final sample behaves like a voluntary response sample and can be biased toward those motivated to respond. Choice A is a distractor, suggesting that the initial randomness carries over despite nonresponse, but nonresponse bias occurs when nonrespondents differ from respondents. A mini-lesson on random sampling: It involves giving each unit an equal chance of selection to support representativeness and reduce bias, but that protection holds only if the selected units actually respond. Voluntary response samples often overrepresent strong opinions, leading to unreliable estimates. Thus, the 210 responses may not represent the city's households.
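The effect of nonresponse can be shown with a small arithmetic sketch. All of the numbers below except the 600 mailed and 210 returned are assumptions: the split between supporters and opponents and their return rates are invented to illustrate how respondents can differ from the full random sample.

```python
# Hypothetical breakdown of the 600 randomly selected households.
sampled_support = 240   # would answer "support" if they all responded
sampled_oppose = 360    # would answer "oppose" if they all responded

# Assumed return rates (in percent): opponents of the fee are more motivated to reply.
return_rate_support = 20
return_rate_oppose = 45

returned_support = sampled_support * return_rate_support // 100   # 48 households
returned_oppose = sampled_oppose * return_rate_oppose // 100      # 162 households
returned_total = returned_support + returned_oppose               # 210 households

full_sample_proportion = sampled_support / (sampled_support + sampled_oppose)
respondent_proportion = returned_support / returned_total

print(f"all 600 sampled households: {full_sample_proportion:.3f}")
print(f"210 respondents only:       {respondent_proportion:.3f}")
```

In this made-up scenario 40% of the randomly selected households support the fee, yet only about 23% of the 210 respondents do, so reporting the respondents alone would understate support.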
A state park wants to estimate the average distance hiked by all visitors on Saturdays in October. Park staff survey visitors as they exit the main trailhead parking lot between 2 p.m. and 4 p.m. on two Saturdays, collecting 120 surveys. Visitors who use other trailheads or leave earlier/later are not surveyed. Which statement best describes the sample representativeness?
The sample is representative because systematic sampling was used by surveying everyone who exited.
The sample is representative because the staff randomly assigned visitors to exit between 2 p.m. and 4 p.m.
The sample is likely not representative because it undercovers visitors who do not exit at that location and time, so it may be biased.
The sample is representative because 120 is a sufficiently large sample size for a park.
The sample is likely representative because surveying at the main trailhead captures the typical visitor.
Explanation
Assessing sample representativeness is key in AP Statistics for random sampling and data collection scenarios. This is a convenience sample limited to a specific exit, time window, and pair of days, so it misses visitors who use other trailheads or leave at other times. Distractor D claims that a large sample size ensures representativeness, but size alone does not overcome selection bias. Mini-lesson on random sampling: For a sample to be representative, every member of the population should have a chance of being selected, ideally an equal one; convenience sampling often biases the sample toward easily accessible subgroups. In this case, the method undercovers certain hikers, potentially skewing the estimate of average distance hiked.