AP Statistics
Practice Test 42 for AP Statistics: real questions and explanations from the Varsity Tutors practice-test pool.
A teacher flips a coin 40 times and records the results. The sequence includes many short runs and alternations, but also has one stretch of 8 heads in a row. Is the pattern consistent with random behavior?
Explanation: This question examines whether long runs can occur in random coin flips. A run of 8 heads in a row might seem unlikely, and at any one specific starting position it is: the probability is (1/2)^8 = 1/256. Across 40 flips, however, there are many positions where such a run could begin, so seeing one somewhere in the sequence is not strong evidence against randomness. Random sequences often contain surprising patterns, including long runs that seem non-random to our pattern-seeking brains. The incorrect answers reflect common misconceptions: that random outcomes must be perfectly balanced, that long runs are impossible, or that runs prove fairness. Understanding that randomness can produce seemingly non-random patterns is crucial for statistical thinking.
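The claim about runs can be checked with a short calculation. The sketch below is one possible approach, not part of the original test: it computes the exact chance of at least one run of 8 or more heads in 40 fair flips by tracking the current streak length. The function name `prob_run_at_least` is a hypothetical helper introduced for illustration.

```python
def prob_run_at_least(n_flips: int, run_len: int, p_heads: float = 0.5) -> float:
    """Exact probability of at least one run of `run_len` heads in `n_flips` flips."""
    # state[k] = probability that no qualifying run has occurred so far
    # and the sequence currently ends in exactly k consecutive heads.
    state = [0.0] * run_len
    state[0] = 1.0
    for _ in range(n_flips):
        new_state = [0.0] * run_len
        for k, prob in enumerate(state):
            new_state[0] += prob * (1 - p_heads)    # a tail resets the streak
            if k + 1 < run_len:
                new_state[k + 1] += prob * p_heads  # a head extends the streak
            # otherwise the streak reaches run_len and leaves the "no run" states
        state = new_state
    return 1.0 - sum(state)


p_fixed_start = 0.5 ** 8             # 8 heads starting at one specific flip
p_run8 = prob_run_at_least(40, 8)    # 8 heads starting anywhere in 40 flips
print(f"specific position: {p_fixed_start:.4f}, anywhere in 40 flips: {p_run8:.4f}")
```

Comparing the two printed values shows how much more likely a long run becomes when it is allowed to start at any position in the sequence.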
When creating a bar chart for the single categorical variable 'Favorite Fruit' with categories Apple, Banana, and Orange, what is a standard convention regarding the bars?
Explanation: For a bar chart representing a categorical variable, the bars are separated by spaces. This is because the categories are distinct and there is no inherent numerical order or continuity between them. This is a key visual difference from a histogram, where bars for adjacent bins of a quantitative variable touch.
A student runs a simulation of 100 trials where each trial results in one of three outcomes: win, lose, or tie. The student defines success as “win,” and records the number of wins in 100 trials. Does this meet binomial conditions?
Explanation: The skill here is identifying binomial settings, requiring fixed n, binary outcomes per trial, constant p, and independence. Although each trial has three outcomes (win, lose, tie), defining success as 'win' collapses it to binary (win or not), satisfying the condition if p is constant and trials independent, as assumed in the simulation. Distractor B insists on exactly two outcomes without redefinition, missing that we can group non-successes as failure. Choice E incorrectly adds a rarity requirement, which isn't needed for binomial. Mini-lesson: Binomial counts successes in n trials; it's flexible for redefined binaries, like 'win' vs. 'not,' but ensure constant p across trials. This meets the conditions.
A city compares mean commute times for residents who use a new express bus route versus residents who use the regular route. A random sample of n = 52 express riders had mean x̄ = 28.4 minutes, and a random sample of n = 49 regular riders had mean x̄ = 31.0 minutes. The claim is that the express route reduces the population mean commute time. Which hypotheses are appropriate?
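Explanation: This question assesses how to set up hypotheses for comparing two means in AP Statistics, here for commute times on the express and regular bus routes. The claim is that the express route reduces the mean time (μ_E < μ_R), so H0: μ_E - μ_R = 0 and Ha: μ_E - μ_R < 0. Distractor choice B reverses the inequality to > 0, which would test whether express commutes are longer, and choice C states the hypotheses in terms of sample means. Choice A is two-sided, ignoring the directional claim. Mini-lesson: Use group subscripts; the null is zero difference, and 'reduces' corresponds to < 0 when the difference is written as express minus regular. Keep the order of subtraction consistent with the direction of the inequality. The sample means (28.4 vs. 31.0) are statistics used in the calculation; hypotheses are always statements about population parameters.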
A company that ships packages claims that the mean weight of its “standard” shipment is 12.0 pounds. A random sample of 49 standard shipments is weighed, and the sample mean is x̄ = 11.7 pounds. Which hypotheses are appropriate for testing whether the population mean shipment weight μ is different from 12.0 pounds?
Explanation: This question involves setting up a two-sided hypothesis test for a population mean. The company claims the mean weight is 12.0 pounds, and we want to test if it's "different from" this value, indicating a two-sided test. The null hypothesis states the claimed value (H₀: μ = 12.0), and the alternative uses ≠ to test for any difference (Hₐ: μ ≠ 12.0). Option B incorrectly reverses the null and alternative hypotheses. Option C incorrectly uses the sample mean x̄ instead of the population parameter μ. Option D incorrectly uses p (proportion) instead of μ for weight data. Option E incorrectly uses the sample value 11.7 in the null hypothesis instead of the claimed value. The correct answer A properly sets up a two-sided test to determine if the population mean shipment weight differs from the claimed 12.0 pounds.
A school counselor wants to know whether students’ preferred study location is related to grade level. A random sample of 180 students is selected from the school, and each student reports their grade level (9th, 10th, 11th) and preferred study location (Home, Library). The results are shown in the table. Which chi-square test is appropriate, and what are the correct hypotheses?
(One random sample; two categorical variables measured on each individual.)
Explanation: This question tests the ability to distinguish between chi-square tests of independence and homogeneity based on sampling design. Since there is ONE random sample of 180 students from the school, and TWO categorical variables (grade level and study location) are measured on each individual, this calls for a chi-square test of independence. The test of homogeneity would require separate random samples from each grade level. The null hypothesis for independence states that the two variables are independent in the population, while the alternative states they are associated. Choice B correctly identifies both the test type and hypotheses, while choice E incorrectly suggests homogeneity despite the single-sample design.
A teacher recorded quiz scores (out of 20 points) for 160 students. The summaries were: mean = 15.2, median = 16, SD = 3.1, five-number summary (2, 14, 16, 18, 20), and IQR = 4. Which interpretation is correct?
Explanation: This question tests recognition of left-skewed distributions. The mean (15.2) is less than the median (16), indicating left skewness, which is plausible for quiz scores that pile up near the maximum of 20 (a ceiling effect) while a few students score very low. The minimum (2) is extremely far below Q1 (14), a distance of 12 points, while the maximum (20) is only 2 points above Q3 (18). This asymmetry confirms left skewness with low outliers: by the 1.5 × IQR rule, any score below 14 - 1.5(4) = 8 is an outlier, and the minimum of 2 is well below that. Choice B confuses IQR with range. Choice C misinterprets SD as a typical score. Choice D reverses the quartile interpretation. Choice E incorrectly links the maximum value to distribution shape.
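The outlier claim can be verified with the 1.5 × IQR rule, using only the five-number summary given in the question. The following is a minimal sketch; the variable names are illustrative.

```python
# Five-number summary from the question: (2, 14, 16, 18, 20)
minimum, q1, median, q3, maximum = 2, 14, 16, 18, 20

iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # values below this are flagged as low outliers
upper_fence = q3 + 1.5 * iqr     # values above this are flagged as high outliers

min_is_outlier = minimum < lower_fence
max_is_outlier = maximum > upper_fence

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print(f"minimum is an outlier: {min_is_outlier}; maximum is an outlier: {max_is_outlier}")
```

Only the low end is flagged, which matches the long left tail described in the explanation.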
To test whether background music affects reading comprehension, a researcher randomly selected 120 students from all first-year students at a university. Each selected student was randomly assigned to read a passage either in silence or while listening to instrumental music, then took the same comprehension quiz. The music group scored higher on average. Which conclusion is justified based on the design of this study?
Explanation: This question tests understanding of experiments that combine random selection with random assignment. The key skill is recognizing that this design allows both causal inference and generalization to the sampled population. The researcher used random selection to choose 120 students from all first-year students, then randomly assigned them to music or silence conditions. This is a true experiment because of the random assignment, which balances potential confounding variables across the two groups on average and so allows us to conclude that the music caused higher comprehension scores. Additionally, because the students were randomly selected from all first-year students at the university, we can generalize this causal conclusion to that entire population, not just the 120 students in the study. The combination of random selection (for generalization) and random assignment (for causation) makes this a powerful design.
A meteorologist recorded daily rainfall amounts (in inches) for 45 days in a spring season at one weather station using a standard rain gauge. The dotplot below shows the distribution.
Dotplot (rainfall in inches):
0.0 | •••••••••••••••••••
0.1 | •••••••••
0.2 | •••••
0.3 | •••
0.4 | ••
0.5 | •
0.6 | •
0.7 |
0.8 |
0.9 |
1.0 | •
Which feature of the distribution is most evident?
Explanation: This question involves identifying features in dotplots of quantitative data like rainfall amounts. The dotplot shows a high stack at 0.0 inches (19 days), decreasing frequencies toward higher amounts, with a tail extending to 1.0 inches, indicating strong right skew. Most days have little to no rain, with infrequent higher rainfall days. Choice B distracts by suggesting left skew, but the long tail is to the right, toward higher values. Key lesson: right-skewed distributions often occur with non-negative variables like rainfall; describe shape, center (low due to skew), spread, and peaks to highlight common versus rare events.
At a large university, the distribution of time (in minutes) it takes students to walk from the main library to the student center is approximately Normal for the population of all such walks. The mean is μ=12 minutes with standard deviation σ=3 minutes. A particular walk time of 18 minutes is marked on the normal curve. Which statement about the marked value is correct?
Explanation: This question tests understanding of interpreting values on a normal distribution. Given μ = 12 minutes and σ = 3 minutes, we need to determine where 18 minutes falls. To find this, calculate how many standard deviations 18 is from the mean: z = (18 - 12)/3 = 6/3 = 2. So 18 minutes is exactly 2 standard deviations above the mean. By the empirical rule, about 95% of values in a normal distribution lie within 2 standard deviations of the mean, leaving only about 2.5% in each tail. The key insight is recognizing that being 2σ above the mean places this walk time in the upper tail of the distribution, making it an unusually long walk time compared to most.
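As a quick check of the arithmetic, the z-score and upper-tail area can be computed with Python's standard library. This is a small sketch under the question's stated Normal model (μ = 12, σ = 3); the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 12, 3          # population mean and SD from the question
walk_time = 18             # the marked value

walks = NormalDist(mu=mu, sigma=sigma)

z = (walk_time - mu) / sigma              # standard deviations above the mean
upper_tail = 1 - walks.cdf(walk_time)     # proportion of walks longer than 18 min

print(f"z = {z:.1f}, proportion of walks longer than 18 minutes = {upper_tail:.4f}")
```

The exact tail area is slightly smaller than the 2.5% given by the empirical rule, which rounds 1.96 standard deviations up to 2.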
A school district randomly samples 30 classrooms and estimates the mean number of students per class. A 95% confidence interval for the population mean class size is (26.4,29.9). A school board member claims the true mean class size is 30 students. Is the claim supported by the confidence interval?
Explanation: This question evaluates whether a claim about mean class size is supported by a confidence interval. The 95% confidence interval is (26.4, 29.9) students, and the claimed value is 30 students. Since 30 is greater than the upper bound of 29.9, it falls outside the interval and the claim is not supported. Choice A incorrectly suggests rounding makes values outside the interval acceptable, but confidence intervals have precise boundaries. Choice E wrongly implies that the population mean could be any value regardless of the interval. The fundamental principle is that a confidence interval contains the plausible values for the population parameter - if a claimed value is outside the interval, even by a small amount, it is not supported by the data at the given confidence level.
A website A/B test assumes that if two page designs are equally effective, each visitor is equally likely to click either design’s button, so the probability a visitor clicks Design A is 0.50. In a random sample of 40 visitors shown both designs, 30 click Design A. The team expected about 20 clicks for A but observed 30. Is the result unexpected, given random variation with 40 visitors?
Explanation: The skill here is deciding whether a result is unexpected given binomial variability. With n = 40 visitors and p = 0.50, the expected count is np = 20 clicks and the standard deviation is √(np(1 - p)) = √10 ≈ 3.16, so counts within about 2 SD of the mean (roughly 14 to 26) are typical. The observed 30 is (30 - 20)/3.16 ≈ 3.2 SD above the mean, and the chance of 30 or more clicks is only about 0.1%, so the result is unexpected. Distractor A claims 30 is close enough, but it is far outside the typical range. Mini-lesson: Calculate the mean and SD; if the observed count is more than 3 SD from the mean, it is surprising because chance alone rarely produces it.
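The mean, SD, and tail probability for this binomial setting can be computed directly. The sketch below assumes the question's model (n = 40, p = 0.50) and uses a hypothetical helper, `binom_upper_tail`, written for illustration.

```python
from math import comb, sqrt


def binom_upper_tail(k_min: int, n: int, p: float) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


n, p, observed = 40, 0.50, 30

mean = n * p                          # expected number of clicks for Design A
sd = sqrt(n * p * (1 - p))            # binomial standard deviation
z = (observed - mean) / sd            # how many SDs above the mean 30 is
tail = binom_upper_tail(observed, n, p)

print(f"mean = {mean}, SD = {sd:.2f}, z = {z:.2f}, P(X >= 30) = {tail:.5f}")
```

A tail probability this small supports calling the result unexpected under the equal-effectiveness assumption.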
A city health department is estimating the mean number of minutes adults in the city exercise per day. Two independent random samples of 80 adults are taken from the same city during the same week. Sample A has a mean of 31.2 minutes; Sample B has a mean of 27.9 minutes. Why might the sample results differ?
Explanation: This question addresses sampling variability in sample means. The difference between the two sample means (31.2 vs 27.9 minutes) is a natural result of random sampling variability. When we randomly select different groups of 80 adults from the same population, each sample will likely contain different individuals with varying exercise habits. Some samples might randomly include more active adults, while others might include less active ones, leading to different sample means. This variation is expected and does not indicate bias, error, or a changing population. The concept of sampling variability in means is fundamental to understanding why we need confidence intervals and hypothesis tests. Students should recognize that even well-conducted random samples from the same population will produce different statistics due to the randomness inherent in the sampling process.
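Sampling variability is easy to see in a simulation. The sketch below draws many random samples of 80 from a single made-up population and looks at how much the sample means vary. The population here (exponential shape, mean of about 30 minutes, 100,000 adults) is purely an assumption for illustration and is not taken from the question.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of daily exercise minutes (assumed shape and mean).
population = [rng.expovariate(1 / 30) for _ in range(100_000)]
pop_sd = statistics.pstdev(population)

n = 80
# Draw 1,000 independent random samples of size 80 and record each sample mean.
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(1000)]

spread_of_means = statistics.stdev(sample_means)   # observed variability
theoretical_se = pop_sd / n ** 0.5                 # sigma / sqrt(n)

print(f"smallest and largest sample mean: {min(sample_means):.1f}, {max(sample_means):.1f}")
print(f"SD of sample means: {spread_of_means:.2f} (theory: {theoretical_se:.2f})")
```

Every sample comes from the same unchanged population, yet the sample means differ from one another, which is the same behavior seen in Samples A and B.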
A city health department investigates the research question: “Is the mean number of hours of sleep for adults in the city at least 7 hours?” Interviewers stood outside a downtown gym on weekday mornings and surveyed 85 adults leaving the gym; the sample mean was 7.4 hours. The department concluded that adults in the city average at least 7 hours of sleep. Which statement explains whether the conclusion is valid?
Explanation: This question examines how convenience sampling undermines the validity of generalizations to a larger population. The health department surveyed only people leaving a downtown gym on weekday mornings, creating a convenience sample that likely overrepresents health-conscious individuals who exercise regularly and may have different sleep patterns than the general adult population. People who go to the gym in the morning might prioritize sleep more or have schedules that allow for adequate rest. The sample mean of 7.4 hours cannot reliably estimate the mean for all city adults because the sampling method systematically excludes many groups (non-exercisers, people with different work schedules, etc.). A valid conclusion would require random sampling from all adults in the city, not just gym-goers.
A biologist tags 9 turtles and later checks whether each tagged turtle is recaptured during a follow-up survey. Assume each turtle’s recapture status is independent of the others and the probability a tagged turtle is recaptured is 0.30. Let X be the number of tagged turtles recaptured. Which values correctly model this situation with a binomial distribution (identify n and p)?
Explanation: This question involves identifying parameters when counting a specific outcome. The biologist tags n = 9 turtles (number of trials), and X counts recaptured turtles with probability p = 0.30 each. Since we're counting recaptured turtles and that probability is 0.30, we use n = 9 and p = 0.30. A common error would be using p = 0.70 (not recaptured), but p must match the outcome X is counting. Choices B and D incorrectly swap n and p positions - remember that n must be a positive integer. The key to binomial parameter identification is matching p to exactly what your random variable X is counting, not its complement.
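To see why p must match the outcome being counted, the sketch below builds the binomial distribution for the number of recaptured turtles (n = 9, p = 0.30) and compares its mean with the mean obtained from the complement p = 0.70. The helper `binom_pmf` is a hypothetical name introduced for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 9, 0.30   # 9 tagged turtles, each recaptured with probability 0.30

pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
expected_recaptured = n * p              # mean of X, the number recaptured
expected_not_recaptured = n * (1 - p)    # what p = 0.70 would describe instead

for k, prob in enumerate(pmf):
    print(f"P(X = {k}) = {prob:.4f}")
print(f"expected recaptured: {expected_recaptured:.1f}, "
      f"expected not recaptured: {expected_not_recaptured:.1f}")
```

Using p = 0.70 would model the turtles that are not recaptured, which is a different random variable from X.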
A hospital screens 35 patients for a particular infection using a rapid test. A success is defined as a positive test result. For this group, the probability a patient tests positive is 0.08, and patient results are treated as independent. If X is the number of positive test results, which values correctly model this situation (identify n and p)?
Explanation: This medical screening scenario tests understanding of binomial parameters when success is a positive test result. With 35 patients screened (n = 35) and success defined as testing positive with probability 0.08 (p = 0.08), the correct answer is B. Students might be tempted to use 0.92 (probability of negative result) if they misunderstand what's being counted. The distractors include various misplacements of n and p values. In medical contexts, carefully identify whether you're counting positive or negative outcomes—here, success explicitly means a positive test, so p = 0.08. The binomial model applies because we have independent patient results with constant probability.
A student rolls two dice 30 times and records whether the sum is 7. A success is “sum equals 7,” and exactly 30 rolls are made. Does this meet binomial conditions for the number of successes?
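Explanation: This question applies the binomial conditions to a game of chance. The number of trials is fixed at n = 30 rolls, each roll has a binary outcome (sum is 7 or not), the probability of success is constant at p = 6/36 = 1/6, and the rolls are independent, so all the conditions are met. Distractor B counts the 11 possible sums as separate outcomes, but success is defined as a binary event. Choice C imagines p changing from roll to roll, but the rolls are identical. Mini-lesson: Many games of chance fit the binomial setting once success is defined as a single yes/no event, which makes the probabilities straightforward to compute.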
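The value p = 1/6 can be confirmed by listing all equally likely outcomes for two fair dice. This is a small sketch that assumes fair, six-sided dice, as the question implies.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die1, die2) outcomes for two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))
sevens = [roll for roll in outcomes if sum(roll) == 7]

p_seven = Fraction(len(sevens), len(outcomes))   # probability of success on one roll
expected_sevens = 30 * p_seven                   # mean number of sevens in 30 rolls

print(f"ways to roll a 7: {sevens}")
print(f"p = {p_seven}, expected sevens in 30 rolls = {expected_sevens}")
```

Because the same 36 outcomes apply on every roll, p stays constant across all 30 trials.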
A real estate analyst wants to know whether larger houses tend to sell for more. From a random sample of 50 recent sales in a city, the analyst records living area in square feet (x) and sale price in thousands of dollars (y) and fits a regression line predicting price from area. The research claim is that the relationship is linear with a positive slope. Which hypotheses are appropriate for testing this claim about the population regression slope?
Explanation: This question involves setting up hypotheses for a positive slope claim. The research states that larger houses tend to sell for more, indicating a positive linear relationship between area and price. For regression slope inference, we test the population parameter β, not the sample statistic b or correlation r. The claim of a positive relationship requires a one-sided alternative hypothesis Ha: β > 0. The null hypothesis is that there is no linear relationship (β = 0). Choice A correctly presents H₀: β = 0 vs. Ha: β > 0, properly reflecting the directional claim that increased area predicts increased price.
A consumer group compared the proportion of defective items produced by two factories. In a random sample of 500 items from Factory A, 34 were defective; in a random sample of 450 items from Factory B, 18 were defective. A two-proportion z test was conducted for H₀: p_A = p_B versus Hₐ: p_A > p_B, and the p-value was 0.004. Using α = 0.05, what conclusion is appropriate?
Explanation: This problem involves a one-tailed test where H_a: p_A > p_B tests if Factory A has a higher defect proportion. With p-value = 0.004 < α = 0.05, we reject H₀. The correct conclusion is that there's sufficient evidence that the population defect proportion is higher for Factory A than for Factory B (choice A). Choice B would be correct if we failed to reject H₀. Choice C reverses the direction of the inequality. Choice D incorrectly implies causation about the production process. Choice E refers to sample proportions without specifying direction. The very small p-value (0.004) provides strong evidence that Factory A has a higher population defect rate, though this observational comparison cannot establish causation about production processes.
A researcher completed an experiment to test whether a brief mindfulness exercise reduces test anxiety. At one high school, 10 math classes were available. The researcher randomly assigned 5 entire classes to do a 5-minute mindfulness exercise before a unit test and the other 5 classes to follow the usual pre-test routine. All students then took the same unit test and completed an anxiety scale immediately beforehand; the mindfulness classes reported lower average anxiety. Which conclusion is justified based on the design of this study?
Explanation: This question evaluates understanding of cluster randomization in experiments. While individual students weren't randomly assigned, entire classes were randomly assigned to conditions, which still permits causal conclusions - the mindfulness exercise caused lower anxiety for students in these 10 classes. However, classes came from one school (not randomly sampled), so results cannot be generalized beyond this context. The key distractor is choice B, which incorrectly denies causation due to lack of individual randomization. Cluster randomization (randomizing groups rather than individuals) is a valid experimental design that permits causal inference, though it may be less precise than individual randomization. The study can conclude causation for these classes but cannot generalize to all high school students.
A school nurse records the resting heart rates (beats per minute) of a simple random sample of n=60 students during homeroom to estimate the typical resting heart rate at the school. The nurse plans to use a normal model to describe the distribution of resting heart rates in the student population. Why is a normal model reasonable in this situation?
Explanation: This question tests understanding of when normal models are appropriate for describing data distributions. The correct answer recognizes that biological measurements like heart rates are influenced by many small, independent factors (genetics, fitness level, stress, etc.), which often combine to produce approximately symmetric, unimodal distributions. With a sample size of n=60, we have enough data to assess whether the distribution appears roughly normal. The key distractor (A) incorrectly claims that n=60 guarantees normality of individual data values, confusing sample size requirements for sampling distributions with the shape of the population distribution. Normal models are useful when data arise from many additive effects, not because they eliminate variability or guarantee specific shapes.
A real estate analyst randomly samples 35 homes in a region and records home size (x, square feet) and selling price (y, thousands of dollars). A least-squares regression line predicts price from size. A 98% confidence interval for the true slope is (0.08, 0.14) thousand dollars per square foot. Which interpretation is correct?
Explanation: This question tests understanding of slope units in a real estate context. The interval (0.08, 0.14) with units of thousand dollars per square foot means we're 98% confident the true slope is between 0.08 and 0.14, which translates to $80 to $140 per square foot. Option A correctly interprets both the confidence level and the unit conversion. Option B incorrectly refers to sample slopes in repeated samples, Option C confuses slope with correlation, Option D misinterprets the interval as applying to individual homes' selling prices above listing, and Option E reverses what confidence intervals contain. Remember: always pay attention to units when interpreting regression slopes. Here, the response is in thousands of dollars, so the slope must be multiplied by 1,000 to express it in dollars.
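The unit conversion can be written out explicitly. The sketch below converts the interval endpoints from thousands of dollars to dollars and, as an added illustration not in the original question, shows what the interval implies for a 100-square-foot difference in size.

```python
# 98% confidence interval for the slope, in thousands of dollars per square foot.
slope_low_k, slope_high_k = 0.08, 0.14

# Convert to dollars per square foot (1 thousand dollars = 1,000 dollars).
slope_low_dollars = slope_low_k * 1000
slope_high_dollars = slope_high_k * 1000

# Illustrative extra: change in predicted mean price for 100 more square feet.
extra_sqft = 100
change_low_k = slope_low_k * extra_sqft     # in thousands of dollars
change_high_k = slope_high_k * extra_sqft

print(f"slope: ${slope_low_dollars:.0f} to ${slope_high_dollars:.0f} per square foot")
print(f"100 more sq ft: {change_low_k:.0f} to {change_high_k:.0f} thousand dollars")
```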
A basketball player shoots free throws until making 10 shots. A success is a made free throw and a failure is a missed free throw. The coach wants to use a binomial model for the number of made shots. Does this meet binomial conditions?
Explanation: This scenario violates a key binomial condition: the number of trials must be fixed in advance. The player shoots "until making 10 shots," which means the total number of shots taken is variable - it could be 10 shots (if all are made) or many more. In a binomial setting, we need to know exactly how many trials will occur before we start. This is different from counting successes in a fixed number of trials. The other conditions (two outcomes per shot, constant probability, independence) may be met, but without a fixed number of trials, this cannot be modeled with a binomial distribution. This situation would be better modeled with a negative binomial distribution.
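A short simulation makes the problem visible: when the player stops at 10 makes, the number of shots taken changes from one attempt to the next. The sketch below assumes a 70% free-throw probability and independent shots; both are assumptions for illustration, since the question gives no success probability, and `shots_until_makes` is a hypothetical helper.

```python
import random
from collections import Counter


def shots_until_makes(target_makes: int, p_make: float, rng: random.Random) -> int:
    """Simulate shooting until `target_makes` shots are made; return total shots."""
    shots = makes = 0
    while makes < target_makes:
        shots += 1
        if rng.random() < p_make:   # this shot is made
            makes += 1
    return shots


rng = random.Random(7)              # fixed seed so the sketch is repeatable
p_make = 0.70                       # assumed free-throw probability

totals = [shots_until_makes(10, p_make, rng) for _ in range(1000)]
counts = Counter(totals)

for shots in sorted(counts):
    print(f"{shots} shots needed: {counts[shots]} of 1000 simulations")
```

The number of made shots is always 10, while the number of trials is the quantity that varies, which is the reverse of a binomial setting.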
A researcher compares mean resting heart rate (beats per minute) between runners and non-runners. Independent random samples were taken from each group. A two-sample t test was conducted for H₀: μ_R − μ_N = 0 versus Hₐ: μ_R − μ_N < 0. The p-value was 0.001. At α = 0.05, what conclusion is appropriate?
Explanation: This question tests a one-tailed hypothesis where H_a: μ_R - μ_N < 0, meaning we're testing if runners have a lower mean resting heart rate than non-runners. With a p-value of 0.001, which is much less than α = 0.05, we reject the null hypothesis. This provides convincing evidence that runners have a lower mean resting heart rate than non-runners. Choice D incorrectly implies causation from an observational study, while Choice E makes an incorrect claim about the exact difference between populations. When conducting hypothesis tests, we can only conclude about the direction of the difference (higher or lower), not establish causation or determine exact values. Statistical significance indicates evidence of a difference, not the magnitude of that difference.
A public health researcher tests whether the proportion of adults in a county who have received a flu shot is greater than 50%. In a random sample of 300 adults, 171 reported receiving a flu shot. A one-proportion z test was conducted for H₀: p = 0.50 versus Hₐ: p > 0.50 at α = 0.05. The p-value was 0.006, so the researcher rejected H₀. Which conclusion is appropriate?
Explanation: This problem tests understanding of right-tailed tests for population proportions. Since the p-value (0.006) is less than α (0.05), we reject H₀ and conclude there is sufficient evidence that the population proportion exceeds 0.50. Option A correctly states this conclusion about all adults in the county. Option B makes an incorrect definitive claim. Option C introduces irrelevant causation. Option D misinterprets the p-value as relating to H_a rather than H₀. Option E only addresses the sample, not the population. Remember: hypothesis test conclusions always refer to population parameters, and rejecting H₀ supports the alternative hypothesis.