AP Statistics

AP Statistics Practice Test: Practice Test 15

Practice Test 15 for AP Statistics: real questions and explanations from the Varsity Tutors practice-test pool.

Question 1

A business analyst tests whether advertising spending ($x$, thousands of dollars) predicts weekly sales ($y$, thousands of dollars) for 16 weeks. A regression of sales on ad spending is fit, and a slope test is performed with $H_0: \beta_1=0$ versus $H_a: \beta_1>0$. The reported one-sided p-value is 0.008, and at $\alpha=0.05$ the analyst rejects $H_0$. What conclusion is appropriate?

There is convincing evidence of a positive linear association between advertising spending and weekly sales in the population of weeks like these. (correct answer)
There is convincing evidence that increasing advertising spending will cause weekly sales to increase.
There is not convincing evidence of a positive linear association because 0.008 is less than 0.05.
Because $p=0.008$ , there is a 0.8% chance that the slope in the sample is positive.
There is convincing evidence that weekly sales predict advertising spending, so sales should be the explanatory variable.

Explanation: This question involves a one-sided test with Hₐ: β₁ > 0. The p-value (0.008) is less than α (0.05), so we reject H₀. This provides convincing evidence of a positive linear association between advertising spending and weekly sales in the population of weeks like these. Choice B incorrectly implies causation from observational data. Choice C misunderstands the decision rule. Choice D misinterprets what the p-value represents. Choice E incorrectly suggests switching variables based on the test result.

Question 2

A public health team is comparing whether the distribution of vaccination status (Up to date, Not up to date) is the same across four clinics (A, B, C, D). They take separate random samples of patients from each clinic during the same week and record vaccination status. The two-way table shows the results. Which chi-square test is appropriate, and what are the correct hypotheses?

Note: Multiple groups (clinics) with one categorical response.

Chi-square test of independence; $H_0$ : clinic and vaccination status are independent; $H_a$ : they are not independent
Chi-square goodness-of-fit; $H_0$ : vaccination status follows a specified distribution; $H_a$ : it does not
Chi-square test for homogeneity; $H_0$ : the distribution of vaccination status is the same for all clinics; $H_a$ : at least one clinic differs (correct answer)
Matched-pairs test; $H_0$ : no change in vaccination status; $H_a$ : change
Two-proportion $z$ test; $H_0$ : $p_A=p_B$ ; $H_a$ : $p_A\ne p_B$

Explanation: The skill tested is choosing between chi-square tests for homogeneity and independence. Separate random samples from each clinic, with one categorical response (vaccination status), call for a chi-square test for homogeneity to compare distributions across clinics. Choice C states the hypotheses correctly: the null is that the distribution of vaccination status is the same for all clinics, and the alternative is that at least one clinic differs. Choice A, the test of independence, is a tempting distractor, but it applies when a single sample is classified on two variables, not when separate samples are drawn from each group. Mini-lesson on chi-square tests: goodness-of-fit compares one variable with a specified distribution; independence checks for an association between two variables in one sample; homogeneity checks whether several groups share the same distribution of one variable. Choice E compares only two clinics' proportions and ignores the other two clinics. Thus, C is right.

Question 3

Daily low temperatures in a city during January are approximately Normal with mean $\mu=20^\circ\!F$ and standard deviation $\sigma=6^\circ\!F$ . Without calculating probabilities, which comparison is supported by the fact that the Normal curve is highest at the mean and decreases as you move away from the mean?

The area between $14^\circ\!F$ and $20^\circ\!F$ is greater than the area between $20^\circ\!F$ and $26^\circ\!F$ because the curve is higher on the left side.
The area between $18^\circ\!F$ and $22^\circ\!F$ is greater than the area between $8^\circ\!F$ and $12^\circ\!F$ because the first interval is closer to the mean. (correct answer)
The area below $14^\circ\!F$ is greater than the area above $26^\circ\!F$ because $14$ is closer to the mean.
The area between $8^\circ\!F$ and $12^\circ\!F$ is greater than the area between $18^\circ\!F$ and $22^\circ\!F$ because both intervals have width 4.
The area between $18^\circ\!F$ and $22^\circ\!F$ equals the area between $14^\circ\!F$ and $26^\circ\!F$ because both are centered at the mean.

Explanation: This question tests qualitative understanding of the Normal curve, which peaks at the mean and falls off symmetrically on either side. Here temperatures are modeled as approximately Normal with mean 20 and standard deviation 6. Because the curve is highest at μ = 20, an interval closer to the mean has more area than an interval of the same width farther away. Choice B correctly states that the area between 18 and 22 is greater than the area between 8 and 12: both intervals have width 4, but 18 to 22 is centered at the mean. Choice A wrongly suggests the left side is higher; the curve is symmetric, so the areas from 14 to 20 and from 20 to 26 are equal. Mini-lesson on qualitative reasoning: the Normal density decreases as you move away from the mean, so among intervals of a fixed width, the one centered at μ has the largest area, and the area shrinks as the interval moves outward. No shaded figure is given, but sketching the curve and shading each interval makes the comparison of heights visible.
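
The question asks for reasoning without calculation, but the comparison can be checked numerically afterward. The sketch below uses Python's standard-library `NormalDist`; the helper name `area_between` is chosen here for illustration and is not part of the original question.

```python
from statistics import NormalDist

# Model from the question: Normal with mean 20 and standard deviation 6.
temps = NormalDist(mu=20, sigma=6)

def area_between(dist, low, high):
    """Area under the density curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

area_near_mean = area_between(temps, 18, 22)  # width 4, centered at the mean
area_in_tail = area_between(temps, 8, 12)     # width 4, well below the mean

print(f"P(18 < T < 22) = {area_near_mean:.4f}")
print(f"P(8 < T < 12)  = {area_in_tail:.4f}")
print("Interval near the mean has more area:", area_near_mean > area_in_tail)

# Symmetry check for choice A: the two one-SD intervals beside the mean match.
left = area_between(temps, 14, 20)
right = area_between(temps, 20, 26)
print("Left and right areas equal:", abs(left - right) < 1e-12)
```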

Question 4

A factory records the weight of cereal in a box as $W$ ounces and the weight of the empty box as $B$ ounces. Assume $W$ and $B$ are independent with $\mu_W=18.2$ , $\sigma_W=0.4$ , $\mu_B=1.1$ , and $\sigma_B=0.1$ . Let $F=W+B$ be the total shipping weight. Which statement about $F$ is correct?

$\mu_F=19.3$ and $\sigma_F=\sqrt{0.4^2+0.1^2}$ (correct answer)
$\mu_F=17.1$ and $\sigma_F=\sqrt{0.4^2+0.1^2}$
$\mu_F=19.3$ and $\sigma_F=0.4+0.1$
$\mu_F=19.3$ and $\sigma_F=0.4-0.1$
$\mu_F=19.3$ and $\sigma_F=\sqrt{0.4^2-0.1^2}$

Explanation: This problem asks about adding two independent weights to find the total shipping weight F = W + B. The mean of the sum is straightforward: μ_F = μ_W + μ_B = 18.2 + 1.1 = 19.3 ounces. For the standard deviation, we use the rule that variances add for independent variables: σ_F² = σ_W² + σ_B² = (0.4)² + (0.1)² = 0.16 + 0.01 = 0.17, so σ_F = √0.17 = √(0.4² + 0.1²) ≈ 0.41 ounces. A common error is to add or subtract the standard deviations directly. Standard deviations of independent variables combine like the legs of a right triangle: square them, add, and take the square root. This shows how even small sources of variation contribute to overall uncertainty.
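
As a quick check of the arithmetic, the following minimal sketch computes the mean and standard deviation of F from the values given in the question. The variable names are illustrative, and the calculation relies on the stated independence of W and B.

```python
import math

# Values from the question; W and B are assumed independent, as stated.
mu_w, sigma_w = 18.2, 0.4
mu_b, sigma_b = 1.1, 0.1

mu_f = mu_w + mu_b                   # means add
var_f = sigma_w**2 + sigma_b**2      # variances add under independence
sigma_f = math.sqrt(var_f)           # take the square root last

print(f"mean of F = {mu_f:.1f} oz")
print(f"variance of F = {var_f:.2f}")
print(f"SD of F = {sigma_f:.4f} oz")
print("SD is smaller than 0.4 + 0.1:", sigma_f < sigma_w + sigma_b)
```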

Question 5

A car dealership models the number of cars, X, sold per day. The model gives the following probability distribution:

X	0	1	2	3
P(X=x)	0.1	0.3	0.4	0.2

Based on this probability distribution, which conclusion is most appropriate?

The dealership will sell an average of 2 cars per day over the next 10 days because that is the most likely outcome.
Over a long period, the dealership is expected to sell exactly 2 cars on approximately 40 percent of the days. (correct answer)
The dealership is guaranteed to sell between 0 and 3 cars each day, as these are the only outcomes with non-zero probabilities.
The probability of selling 4 cars is 0 because the sum of the other probabilities is 1.0, making it impossible.

Explanation: The value $P(X=2)=0.4$ is the theoretical probability of selling 2 cars on a given day. The law of large numbers states that over many repetitions (a long period), the observed relative frequency of an event will approach its theoretical probability. Therefore, choice B is the best conclusion. Choice A is incorrect because the average (expected value) is 1.7, not 2, and short-term results are not guaranteed. Choice C uses 'guaranteed', which is too strong for a model of a random process. Choice D correctly states that P(X=4)=0 in this model, but B is a more useful conclusion drawn from the distribution.
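
The expected value of 1.7 quoted above can be reproduced directly from the table. This is a small sketch using only the probabilities given in the question; the variable names are illustrative.

```python
# Probability distribution from the question.
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

# A valid distribution sums to 1 (allowing for floating-point rounding).
total_prob = sum(probs)

# Expected value: sum of x * P(X = x).
expected = sum(x * p for x, p in zip(values, probs))

# Most likely single outcome (the mode), which is not the same as the mean.
mode = max(zip(values, probs), key=lambda pair: pair[1])[0]

print(f"total probability = {total_prob:.1f}")
print(f"E(X) = {expected:.1f} cars per day")
print(f"most likely outcome = {mode} cars")
```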

Question 6

A water utility claims the mean household water use is $\mu=320$ gallons per day. A random sample of households is taken and a 98% confidence interval for $\mu$ is $(300,\ 315)$ gallons per day. Is the claim supported by the confidence interval?

Yes, because 98% confidence means 98% of households use between 300 and 315 gallons per day.
No, because 320 is not in the interval, so the claim is not supported. (correct answer)
Yes, because 320 is greater than both endpoints, which indicates higher use.
Yes, because a higher confidence level always supports the company’s claim.
No, because 98% confidence means there is a 2% chance that $\mu=320$ .

Explanation: This question asks whether a claim that μ = 320 gallons per day is supported by a 98% confidence interval of (300, 315) gallons. The claimed value of 320 is NOT in the interval - it exceeds the upper bound of 315. This provides strong evidence against the claim at the 98% confidence level. Choice A misinterprets the interval as describing individual households. Choice C incorrectly suggests that being above the interval somehow supports the claim. Choice E misunderstands confidence levels. The key principle: when a claimed value falls outside a confidence interval, we have statistical evidence to reject that claim at the given confidence level.

Question 7

A teacher models final course grade (percent) as a linear function of number of classes missed for a random sample of students. A 95% confidence interval for the slope is $(-3.5,\ -1.2)$ . Which interpretation is correct?

We are 95% confident that for each additional class missed, the mean final grade decreases by between 1.2 and 3.5 percentage points for the population of students like those sampled. (correct answer)
There is a 95% probability that missing one more class will lower a particular student’s final grade by between 1.2 and 3.5 points.
We are 95% confident that the correlation between absences and final grade is between −3.5 and −1.2.
Because the interval is entirely negative, the relationship is perfectly linear with no scatter.
Since 0 is not in the interval, we can conclude that absences cause lower grades for all students.

Explanation: This question asks for the interpretation of a confidence interval for a slope when the whole interval is negative. The interval (-3.5, -1.2) indicates that mean final grade decreases as absences increase. Choice A correctly interprets this as being 95% confident that mean final grade decreases by 1.2 to 3.5 percentage points per additional class missed, for the population. Choice B incorrectly applies this to a particular student. Choice C confuses slope with correlation. Choice D makes an incorrect claim about perfect linearity. Choice E incorrectly infers causation for all students. The confidence interval describes the average relationship in the population, not individual effects or causal claims.

Question 8

A bank analyzed the amounts of 70 mobile check deposits (in dollars) made in one week. The summaries were: mean $=410$ , median $=250$ , SD $=520$ , five-number summary $(20,\ 120,\ 250,\ 480,\ 3200)$ , and IQR $=360$ . Which interpretation is correct?

The IQR of 360 means the deposit amounts range from $20 to $380.
The middle 50% of deposits are between $120 and $480, and the distribution is likely right-skewed because the mean is much larger than the median and the maximum is far above $Q_3$ . (correct answer)
Because the SD is 520, most deposits are about $520.
Since the median is $250, half of the deposits are between $250 and $480.
Because $Q_3=480$, about 75% of deposits are greater than $480.

Explanation: This question examines interpretation of highly skewed data. The five-number summary (20, 120, 250, 480, 3200) shows Q1 = 120 and Q3 = 480, placing the middle 50% of deposits between $120 and $480. The distribution is strongly right-skewed because the mean ($410) far exceeds the median ($250), and the maximum (3200) is extremely far above Q3 (480). Choice A incorrectly calculates range from IQR. Choice C misinterprets SD as a typical deposit amount. Choice D incorrectly claims half the data falls between median and Q3. Choice E reverses the meaning of Q3, which has 75% of data below it, not above.
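
To see how far the maximum sits above Q3, one can apply the common 1.5 × IQR outlier rule. That rule is not mentioned in the question and is brought in here only as an illustrative check; the numbers are the five-number summary given above.

```python
# Five-number summary from the question (dollars).
minimum, q1, median, q3, maximum = 20, 120, 250, 480, 3200
mean = 410

iqr = q3 - q1                      # spread of the middle 50%
upper_fence = q3 + 1.5 * iqr       # 1.5 * IQR rule (an added assumption)
lower_fence = q1 - 1.5 * iqr

print(f"IQR = {iqr}")
print(f"upper fence = {upper_fence}, lower fence = {lower_fence}")
print("maximum is a high outlier:", maximum > upper_fence)
print("minimum is a low outlier:", minimum < lower_fence)
print("mean exceeds median:", mean > median)
```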

Question 9

A fitness researcher randomly samples 12 adults and measures their resting heart rates (beats per minute). A one-sample $t$ interval is used to estimate the population mean resting heart rate, producing a 95% confidence interval of $(66.4,\ 74.9)$ . Which interpretation is correct?

About 95% of adults have resting heart rates between 66.4 and 74.9 beats per minute.
There is a 95% chance that the sample mean is between 66.4 and 74.9 beats per minute.
We are 95% confident that the interval from 66.4 to 74.9 beats per minute contains the true population mean resting heart rate. (correct answer)
If we sample 12 adults many times, 95% of the time the true population mean will fall between 66.4 and 74.9 because the mean changes from sample to sample.
If we sample 12 adults many times, 95% of the sample means will fall between 66.4 and 74.9 beats per minute.

Explanation: This question tests proper confidence interval interpretation for a population mean. The correct answer (C) appropriately states we are 95% confident the interval contains the true population mean resting heart rate. Choice A incorrectly applies the interval to individual adult heart rates rather than the mean. Choice B wrongly suggests uncertainty about the sample mean, which is known. Choice D incorrectly implies the population mean changes between samples. Choice E describes the sampling distribution of sample means, not what a confidence interval represents. Key insight: confidence intervals estimate fixed population parameters, not ranges for individuals or sample statistics.

Question 10

A shipping company claims that the mean delivery time for a certain route is at least 2.5 days. A competitor believes the mean delivery time is actually less than 2.5 days. A random sample of $n=45$ deliveries on that route is selected, and the sample mean delivery time is $\bar{x}=2.3$ days. Which hypotheses are appropriate for a test of the population mean delivery time, $\mu$ ?

$H_0: \mu=2.5$ days vs. $H_a: \mu>2.5$ days
$H_0: \mu\le2.5$ days vs. $H_a: \mu>2.5$ days
$H_0: \mu=2.5$ days vs. $H_a: \mu<2.5$ days (correct answer)
$H_0: \bar{x}=2.5$ days vs. $H_a: \bar{x}<2.5$ days
$H_0: p=2.5$ vs. $H_a: p<2.5$

Explanation: This question requires careful interpretation of "at least 2.5 days" to set up proper hypotheses for a population mean test. The shipping company claims the mean is "at least" (≥) 2.5 days, but in hypothesis testing, we always use equality in the null hypothesis at the boundary value. Since the competitor believes the mean is less than 2.5 days, we test H₀: μ = 2.5 days vs. Hₐ: μ < 2.5 days. Option C correctly states these hypotheses. Common mistakes include writing H₀: μ ≥ 2.5 or using sample statistics instead of population parameters. When a claim involves "at least" or "at most," convert it to an equality at the boundary for H₀, then set up Hₐ to test in the opposite direction of the original claim.

Question 11

A store manager tracks the last digit of each of 40 customer receipts (0–9). If the last digit is generated by a random process, you would expect the digits to be fairly even overall with some natural variation and no obvious restriction. The manager notices the last digit is always even (0,2,4,6,8), never odd. Is the pattern consistent with random behavior?

Yes, because random samples often exclude half the possible outcomes.
No, because never seeing any odd digit suggests the process is constrained or biased. (correct answer)
Yes, because even digits are more common than odd digits in most random processes.
No, because a random process would have exactly 4 of each digit in 40 receipts.
Yes, because the digits are still varying, so the process must be random.

Explanation: This question is about recognizing a constraint in supposedly random digits. If last digits (0-9) are random, we expect a fairly even spread with some natural variation, including both even and odd digits, and no systematic exclusion. Never seeing an odd digit in 40 receipts suggests a non-random restriction, like rounding or bias, not chance variation. Distractor A claims that excluding half the outcomes is common, but such perfect avoidance is extremely unlikely if each digit is equally likely. Mini-lesson: a random process can produce clumps and uneven counts, but the complete absence of half the possible outcomes over 40 independent trials is not consistent with independent, equally likely digits. Thus, the pattern is inconsistent with random behavior.
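
How unlikely is "never odd" under randomness? A minimal sketch, assuming each of the 40 last digits is independent and equally likely to be 0 through 9 (so each is even with probability 5/10):

```python
# Assumption: digits are independent and uniform on 0-9.
n_receipts = 40
p_even = 5 / 10                      # five even digits out of ten

# Probability that all 40 last digits are even.
p_all_even = p_even ** n_receipts

print(f"P(all {n_receipts} digits even) = {p_all_even:.3e}")
print("Less than one in a billion:", p_all_even < 1e-9)
```

A probability this small is why the observed pattern points to a constrained process and not to chance variation.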

Question 12

A school district wants to estimate the mean number of minutes per day that all middle school students in the district spend on homework. The district selects 4 of its 12 middle schools based on which principals respond first to an email request to participate, then surveys all students in those 4 schools. Which issue most threatens the validity of the estimate for the entire district?

Convenience sampling because the participating schools were chosen based on quick principal response (correct answer)
Simple random sampling because only 4 schools were selected
Response bias because students might not know how many minutes are in an hour
Voluntary response bias because every student was forced to respond
Measurement error because homework time is categorical, not quantitative

Explanation: This question tests your understanding of convenience sampling. The district wants to estimate homework time for ALL middle school students but selects schools based on which principals respond fastest to an email - a classic convenience sample. Schools with quick-responding principals may differ systematically from others (perhaps more organized schools assign more homework, or less bureaucratic schools assign less). The correct answer is A because convenience sampling selects units based on ease of access rather than random selection, introducing bias. When identifying sampling problems, look for selection methods based on availability, proximity, or ease rather than randomization.

Question 13

A machine fills bottles, and a technician inspects bottles one at a time until finding the first bottle that is underfilled. Each inspection results in either “underfilled” or “not underfilled.” Assume the probability a bottle is underfilled is constant over time and inspections are independent. The technician wants to model the number of bottles inspected until the first underfilled bottle. Why is a geometric model appropriate?

Because it models the number of underfilled bottles in a fixed set of $n$ inspected bottles
Because it models the number of inspections until the first success with independent trials and constant probability of success (correct answer)
Because it requires a fixed number of inspections and a varying number of underfilled bottles
Because the probability of an underfilled bottle changes deterministically after each inspection
Because the outcomes are measured on a continuous scale rather than success/failure

Explanation: This question addresses the geometric distribution, which models the number of independent Bernoulli trials, each with the same success probability p, needed to get the first success. Inspecting each bottle is a Bernoulli trial with 'success' defined as underfilled, and inspections continue until the first such bottle. The 'until first success' structure fits because inspection stops at the first underfilled bottle, so the number of inspections is the random variable. Distractor A describes a binomial setting, which counts successes in a fixed number of trials and does not apply here. Mini-lesson: the geometric model uses P(X=k) = (1-p)^{k-1} p for k = 1, 2, 3, ..., assuming independent trials and a constant success probability, both of which are stated in the problem.
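
The formula in the mini-lesson can be sketched in a few lines. The question does not give a value for p, so `p_underfilled = 0.1` below is a hypothetical value chosen purely for illustration, as is the function name `geometric_pmf`.

```python
def geometric_pmf(k, p):
    """P(X = k): first success occurs on trial k (k = 1, 2, 3, ...)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (1 - p) ** (k - 1) * p

# Hypothetical underfill probability; not given in the question.
p_underfilled = 0.1

for k in range(1, 6):
    print(f"P(first underfilled bottle is #{k}) = {geometric_pmf(k, p_underfilled):.4f}")

# The probabilities over all k should sum to 1 (checked over many terms).
partial_sum = sum(geometric_pmf(k, p_underfilled) for k in range(1, 501))
print(f"sum of first 500 terms = {partial_sum:.6f}")
```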

Question 14

An environmental scientist measures mean nitrate concentration (mg/L) in water samples from two rivers. A random sample of $n=18$ samples from River A had mean $\bar{x}=6.4$ mg/L, and a random sample of $n=20$ samples from River B had mean $\bar{x}=5.7$ mg/L. The claim is that River A has a higher population mean nitrate concentration than River B. Which hypotheses are appropriate?

$H_0: \mu_A-\mu_B=0$ ; $H_a: \mu_A-\mu_B>0$ (correct answer)
$H_0: \mu_A-\mu_B=0$ ; $H_a: \mu_A-\mu_B\ne 0$
$H_0: \mu_B-\mu_A=0$ ; $H_a: \mu_B-\mu_A>0$
$H_0: \bar{x}_A-\bar{x}_B=0$ ; $H_a: \bar{x}_A-\bar{x}_B>0$
$H_0: \mu_A=6.4$ ; $H_a: \mu_A>6.4$

Explanation: This question examines setting up hypotheses for a difference of two means, here the mean nitrate concentrations in Rivers A and B. The claim is that River A has a higher mean than River B (μ_A > μ_B), so H0: μ_A - μ_B = 0 and Ha: μ_A - μ_B > 0. Distractor choice C reverses the order and tests whether B is higher, and choice D uses sample means instead of population means. Choice B has a two-sided alternative, but the claim specifies a direction ('higher'). Mini-lesson: use clear subscripts for each population; the null states equality, and a 'higher' claim gives a one-sided alternative with difference > 0. Make sure the subtraction order matches the direction of the inequality. The sample means (6.4 and 5.7) are the data used to carry out the test; they do not appear in the hypotheses.

Question 15

A researcher wants to compare whether three different neighborhoods (North, Central, South) have the same distribution of primary commuting method (Car, Public transit, Bike/Walk). She takes separate random samples of adults from each neighborhood and records each person’s commuting method. The results are shown in the two-way table. Which chi-square test is appropriate, and what are the correct hypotheses?

Note: This study uses multiple groups (neighborhoods) and compares a single categorical response.

Chi-square test for homogeneity; $H_0$ : the distribution of commuting method is the same in all neighborhoods; $H_a$ : at least one neighborhood has a different distribution (correct answer)
Chi-square test of independence; $H_0$ : neighborhood and commuting method are independent in the population; $H_a$ : they are not independent
Chi-square goodness-of-fit; $H_0$ : commuting methods follow a specified distribution; $H_a$ : they do not
Two-proportion $z$ test; $H_0$ : $p_{\text{Car,North}}=p_{\text{Car,South}}$ ; $H_a$ : not equal
Chi-square goodness-of-fit; $H_0$ : each neighborhood has equal counts in each commuting category; $H_a$ : not all counts are equal

Explanation: The skill here is identifying the correct chi-square test setup for homogeneity or independence based on study design in AP Statistics. This scenario involves separate random samples from each neighborhood, with one categorical response (commuting method) compared across groups, so a chi-square test for homogeneity is suitable to check if distributions are the same. The hypotheses in choice A correctly state that the distribution is the same for all neighborhoods under the null, with the alternative that at least one differs. A distractor like choice B, the test of independence, is wrong because independence requires one sample with two variables per unit, not multiple samples as here. For a mini-lesson: goodness-of-fit compares one variable to expected proportions; independence examines association in one sample with two variables; homogeneity tests if multiple populations have the same distribution for one variable. Choice E misapplies goodness-of-fit by assuming equal counts rather than proportions. Therefore, A is appropriate.

Question 16

A candy company claims that the colors in its assorted bag occur in the following distribution: 30% red, 25% blue, 20% green, 15% yellow, and 10% orange. A student randomly samples 200 candies from recent bags and records the observed counts below.

Which hypotheses are appropriate for a chi-square goodness-of-fit test of the company’s claim?

$H_0$ : The sample proportions are 0.30 red, 0.25 blue, 0.20 green, 0.15 yellow, 0.10 orange; $H_a$ : The sample proportions are different from these values.
$H_0$ : The distribution of candy colors in the population is 0.30 red, 0.25 blue, 0.20 green, 0.15 yellow, 0.10 orange; $H_a$ : The distribution of candy colors in the population is not this distribution. (correct answer)
$H_0$ : Candy color is independent of the bag it came from; $H_a$ : Candy color is not independent of the bag it came from.
$H_0$ : $p_{red}=p_{blue}=p_{green}=p_{yellow}=p_{orange}=0.20$ ; $H_a$ : At least one color proportion differs from 0.20.
$H_0$ : The observed counts match the claimed counts exactly; $H_a$ : At least one observed count differs from the claimed count.

Explanation: This question tests understanding of proper hypothesis setup for a chi-square goodness-of-fit test. The null hypothesis should state the claimed population distribution (30% red, 25% blue, 20% green, 15% yellow, 10% orange), while the alternative states that the true population distribution differs from this claim. Option A incorrectly refers to sample proportions rather than population parameters - we never test hypotheses about sample statistics. Option B correctly states hypotheses about the population distribution, which is what we're testing. In goodness-of-fit tests, we compare observed sample data to a claimed population distribution, not to independence or equal proportions.

Question 17

A fitness app company sampled 30 users and recorded average daily steps ( $x$ ) and resting heart rate ( $y$ ). A least-squares regression of $y$ on $x$ was computed. The slope test used $H_0: \beta_1=0$ vs. $H_a: \beta_1\ne 0$ and gave a p-value of 0.078 with an estimated negative slope. At the $\alpha=0.05$ level, the company failed to reject $H_0$ . What conclusion is appropriate?

Reject $H_0$ ; because the estimated slope is negative, there is a negative linear association in the population.
Fail to reject $H_0$ ; there is not convincing evidence at the 0.05 level that the population slope differs from 0. (correct answer)
Fail to reject $H_0$ ; this proves the true slope is exactly 0.
Reject $H_0$ ; the p-value 0.078 means there is a 7.8% chance $H_0$ is true.
Fail to reject $H_0$ ; therefore, increasing steps causes resting heart rate to decrease.

Explanation: This question assesses interpreting a slope hypothesis test in linear regression, particularly when the p-value exceeds the significance level. The p-value of 0.078 is greater than α=0.05, so we fail to reject H0: β1=0, meaning there is not convincing evidence of a linear association between daily steps and resting heart rate. The negative slope implies that more steps might relate to lower heart rates, but the evidence is not strong enough at this alpha level. A frequent distractor is choice A, which wrongly suggests rejecting H0 despite the p-value being above 0.05. Mini-lesson on conclusions: Rejecting H0 supports a nonzero slope and linear association, but failing to reject means insufficient evidence—it does not prove the slope is exactly zero. Consider sample size and power, as a larger sample might detect smaller effects.

Question 18

A game uses a fair six-sided die. A player wins if the sum of 3 rolls is at least 14. To estimate the probability of winning, a student simulates 1000 trials: each trial consists of generating 3 random integers 1–6, summing them, and recording whether the sum is at least 14. The student gets 216 wins. Which interpretation of the simulation results is correct?

The probability of winning is exactly $0.216$ because 216 wins occurred.
The estimated probability of winning is about $216/1000$ , and it would likely get closer to the true probability with more trials. (correct answer)
The die is not fair because the simulation did not produce exactly the theoretical probability.
The simulation shows that the sum of 3 rolls is at least 14 on exactly 216 of the next 1000 real games.
The simulation result means that the average sum of 3 rolls is 216.

Explanation: This question applies the AP Statistics skill of estimating a probability by simulation. Three rolls were summed and checked against 14 in each of 1000 trials, and 216 wins gives an estimated probability of 216/1000. More trials reduce the variability of the estimate, so it tends to get closer to the true probability of winning with a fair die. Distractor A claims exactness from the 216 wins, but a simulation gives an empirical estimate, not the theoretical value. Mini-lesson on simulation: model each die roll by generating a random integer from 1 to 6, compute the outcome for each trial, and use the proportion of successes over many repetitions to estimate the probability. This is useful when exact enumeration is impractical. The correct choice is B, which reports the estimate and notes that more trials would refine it.
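
The simulation procedure described above can be sketched as follows. The function name `simulate_win_rate` and the seed are illustrative choices, and results from any one run will vary with the seed. Because three dice have only 216 outcomes, the sketch also enumerates them for comparison.

```python
import random
from itertools import product

def simulate_win_rate(trials, seed=0):
    """Estimate P(sum of 3 fair dice >= 14) from `trials` simulated games."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        total = sum(rng.randint(1, 6) for _ in range(3))  # three rolls, 1-6
        if total >= 14:
            wins += 1
    return wins / trials

# Exact value by listing all 6**3 equally likely outcomes.
winning_outcomes = sum(
    1 for rolls in product(range(1, 7), repeat=3) if sum(rolls) >= 14
)
exact = winning_outcomes / 6**3

for trials in (1_000, 10_000, 100_000):
    print(f"{trials:>7} trials: estimate = {simulate_win_rate(trials, seed=1):.4f}")
print(f"exact by enumeration: {winning_outcomes}/216")
```

Running with more trials illustrates the point of choice B: the estimate settles closer to the enumerated value as the number of trials grows.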

Question 19

A wildlife researcher is estimating the mean wingspan (in cm) of adult birds of a certain species in a large preserve. Two independent random samples of 35 adult birds are captured, measured, and released during the same month. Sample X has a mean wingspan of 18.4 cm; Sample Y has a mean wingspan of 19.1 cm. Why might the sample results differ?

One sample must have measured juveniles, because adult samples would have the same mean
The difference is best explained by random sampling variability in sample means (correct answer)
The difference proves the measuring tool was faulty, because random sampling removes measurement error
The preserve cannot be considered a single population, because random samples always agree exactly
The difference shows that sampling variability only affects proportions, not means

Explanation: This question tests understanding of sampling variability in biological measurements. The difference in mean wingspan (18.4 vs 19.1 cm) between the two samples reflects natural sampling variability. When randomly capturing 35 birds from a large preserve, each sample will include different individual birds with naturally varying wingspans. Some samples might randomly include more birds with larger wingspans, while others might include more with smaller wingspans. This 0.7 cm difference is entirely consistent with random variation and doesn't indicate measurement error, multiple populations, or sampling bias. The question helps students understand that sampling variability affects all types of measurements, not just proportions, and occurs even in carefully controlled scientific studies. This concept is fundamental to understanding why researchers report confidence intervals and conduct statistical tests rather than treating sample statistics as exact population values.

Question 20

To estimate the mean amount of time (in seconds) it takes a website to load for all users, a random sample of 60 load times was collected. A 99% confidence interval for the population mean load time was $(2.10, 2.55)$ seconds. Which interpretation is correct?

There is a 99% probability that the next user’s load time will be between 2.10 and 2.55 seconds.
If 100 different random samples of 60 load times were taken, about 99 of the resulting intervals would contain the sample mean from their own sample.
We are 99% confident that the true population mean load time for all users is between 2.10 and 2.55 seconds. (correct answer)
About 99% of all users have load times between 2.10 and 2.55 seconds.
The true population mean load time is between 2.10 and 2.55 seconds in 99% of all possible populations.

Explanation: This question evaluates the skill of correctly interpreting a confidence interval for a population mean in AP Statistics. Choice C is correct, stating that we are 99% confident the true population mean load time is between 2.10 and 2.55 seconds, which is a standard way to express the interval's meaning without implying probability on the fixed mean. A common distractor is choice D, which misapplies the interval to individual users rather than the mean, confusing it with a prediction interval. In a mini-lesson on confidence intervals for means, these intervals estimate the population mean with a specified level of confidence based on sample data and variability. The 'we are X% confident' phrasing captures the reliability of the method, meaning that if we repeat the process, X% of intervals will include the true mean. Always distinguish this from probabilities about future observations or the parameter itself to interpret accurately.
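
The statement that "X% of intervals will include the true mean" can be illustrated with a small coverage simulation. Everything specific in this sketch is an assumption made for illustration: the population is taken to be Normal with a hypothetical mean of 2.3 seconds and a known standard deviation of 0.7 seconds, and a z-interval with known sigma is used for simplicity. None of these values come from the question.

```python
import random
from statistics import NormalDist, mean

def coverage(reps=2000, n=60, mu=2.3, sigma=0.7, level=0.99, seed=0):
    """Fraction of simulated z-intervals (known sigma) that contain mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + level / 2)   # critical value
    margin = z_star * sigma / n ** 0.5               # margin of error
    hits = 0
    for _ in range(reps):
        # Draw one sample of n load times and build its interval.
        xbar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

rate = coverage(seed=1)
print(f"fraction of intervals containing the true mean: {rate:.3f}")
```

The true mean stays fixed in every repetition; only the intervals move from sample to sample, which is why the confidence level describes the method and not any single interval.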

Question 21

A bookstore compares spending by two independent customers. Let $X$ be the amount (in dollars) spent by Customer A with $\mu_X=35$ and $\sigma_X=12$ . Let $Y$ be the amount (in dollars) spent by Customer B with $\mu_Y=28$ and $\sigma_Y=9$ . Define $W=X-Y$ , the difference in spending (A minus B). Which statement about the combined variable is correct?

$\mu_W=7$ and $\sigma_W=3$
$\mu_W=63$ and $\sigma_W=21$
$\mu_W=7$ and $\sigma_W=21$
$\mu_W=-7$ and $\sigma_W=\sqrt{225}$
$\mu_W=7$ and $\sigma_W=15$ (correct answer)

Explanation: This problem tests combining independent random variables for spending difference W = X - Y in AP Statistics. The mean is μ_W = 35 - 28 = 7 dollars. The variance is σ_W² = 12² + 9² = 144 + 81 = 225, so σ_W = √225 = 15. Choice D is a distractor that reverses the subtraction for a negative mean while using √225. For a mini-lesson, differences of independent variables subtract means but add variances; the standard deviation comes from the square root afterward. This helps compare expenditures without assuming distributions, given independence.
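The arithmetic above can be reproduced in a few lines. This is a minimal sketch; the helper name `difference_of_independent` is hypothetical and is used only for illustration.

```python
import math

def difference_of_independent(mu_x, sigma_x, mu_y, sigma_y):
    """Return (mean, sd) of W = X - Y, assuming X and Y are independent."""
    mean_w = mu_x - mu_y               # means subtract
    var_w = sigma_x**2 + sigma_y**2    # variances add, even for a difference
    return mean_w, math.sqrt(var_w)

mu_w, sigma_w = difference_of_independent(35, 12, 28, 9)
print(mu_w, sigma_w)  # prints: 7 15.0
```

Subtracting the standard deviations (12 - 9 = 3) or adding them (12 + 9 = 21) gives the incorrect values in choices A, B, and C.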

Question 22

Two different machines, A and B, produce widgets. Let X be the number of flaws in a widget from machine A, and Y be the number of flaws from machine B. Their probability distributions are given below.

Machine A: X | 0 | 1 | 2; P(X=x) | 0.8 | 0.1 | 0.1

Machine B: Y | 0 | 1 | 2; P(Y=y) | 0.6 | 0.3 | 0.1

Which statement correctly compares the two distributions?

The distribution for X is more skewed to the right than the distribution for Y.
Machine A is more likely than Machine B to produce a widget with no flaws. (correct answer)
The probability of a widget having at least one flaw is the same for both machines.
The probability of a widget having exactly one flaw is the same for both machines.

Explanation: We compare the probabilities for each statement. For choice B, the probability of a widget from Machine A having no flaws is $P(X=0) = 0.8$ . For Machine B, it is $P(Y=0) = 0.6$ . Since $0.8 > 0.6$ , Machine A is more likely to produce a flawless widget. Choice C is false: $P(X \geq 1) = 0.2$ while $P(Y \geq 1) = 0.4$ . Choice D is false: $P(X=1) = 0.1$ while $P(Y=1) = 0.3$ .
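The comparisons in the explanation can be verified directly from the two tables. The sketch below stores each distribution as a dictionary; the names `dist_x`, `dist_y`, and `prob_at_least` are illustrative choices.

```python
import math

# Probability distributions from the problem: value -> probability.
dist_x = {0: 0.8, 1: 0.1, 2: 0.1}  # Machine A
dist_y = {0: 0.6, 1: 0.3, 2: 0.1}  # Machine B

def prob_at_least(dist, k):
    """P(variable >= k) for a discrete distribution given as a dict."""
    return sum(p for value, p in dist.items() if value >= k)

# Each distribution should sum to 1.
assert math.isclose(sum(dist_x.values()), 1.0)
assert math.isclose(sum(dist_y.values()), 1.0)

no_flaw_a_more_likely = dist_x[0] > dist_y[0]               # choice B
same_at_least_one = math.isclose(prob_at_least(dist_x, 1),
                                 prob_at_least(dist_y, 1))  # choice C
same_exactly_one = math.isclose(dist_x[1], dist_y[1])       # choice D

print(no_flaw_a_more_likely, same_at_least_one, same_exactly_one)
# prints: True False False
```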

Question 23

Researchers compare mean resting heart rate (beats per minute) for people who drink coffee daily versus those who do not. Independent random samples were taken, and a 95% confidence interval for $(\mu_{\text{coffee}}-\mu_{\text{no coffee}})$ is $(1.2,\,5.6)$ . Which interpretation is correct?

We are 95% confident that people who drink coffee daily have a mean resting heart rate between 1.2 and 5.6 bpm higher than those who do not. (correct answer)
There is a 95% chance that the interval $(1.2,\,5.6)$ contains the true difference in means.
Because 0 is not in the interval, 95% of coffee drinkers have resting heart rates 1.2 to 5.6 bpm higher than non-coffee drinkers.
We are 95% confident that $\mu_{\text{no coffee}}-\mu_{\text{coffee}}$ is between 1.2 and 5.6 bpm.
In 95% of repeated samples, the sample mean difference will fall between 1.2 and 5.6 bpm.

Explanation: This question tests understanding of a confidence interval for μ_coffee - μ_no coffee that lies entirely above 0. Because every value in the interval (1.2, 5.6) is positive, the data suggest that coffee drinkers have the higher mean resting heart rate. Choice A correctly interprets this as being 95% confident that the mean resting heart rate of coffee drinkers is 1.2 to 5.6 bpm higher. Choice B incorrectly assigns a probability to this particular interval containing the parameter. Choice C misapplies the interval to individuals. Choice D reverses the subtraction order. Choice E describes where sample mean differences would fall rather than estimating the parameter. Key insight: an interval entirely above 0 indicates that the first group in the subtraction has the higher mean.

Question 24

A company measures the time (in minutes) to complete a task using two different software interfaces. In each repetition, an independent random sample of $n_1=25$ workers uses Interface 1 and an independent random sample of $n_2=100$ workers uses Interface 2. The population standard deviations are $\sigma_1=8$ and $\sigma_2=8$ . The statistic is $\bar{x}_1-\bar{x}_2$ . Which statement is correct about how the sampling distribution of $\bar{x}_1-\bar{x}_2$ would change if the company instead used $n_1=100$ and $n_2=25$ (swapping the sample sizes)?

The mean would change because the larger sample size would pull the mean toward Interface 1.
The standard deviation would be larger because unequal sample sizes always increase variability.
The mean would stay $\mu_1-\mu_2$ , and the standard deviation would stay the same because $\sigma_1=\sigma_2$ and the formula is symmetric in $n_1,n_2$ . (correct answer)
The sampling distribution would become uniform because one sample size is small.
The standard deviation would become $\sigma_1/n_1-\sigma_2/n_2$ after swapping.

Explanation: This question examines how swapping sample sizes affects the sampling distribution of $\bar{x}_1 - \bar{x}_2$ . The mean of the sampling distribution is always $\mu_1 - \mu_2$ regardless of sample sizes. For the standard deviation, we use $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ . Since $\sigma_1 = \sigma_2 = 8$ , this becomes $8\sqrt{1/n_1 + 1/n_2}$ . With the original sizes (25, 100), we get $8\sqrt{1/25 + 1/100} = 8\sqrt{0.05} \approx 1.79$ . With swapped sizes (100, 25), we get $8\sqrt{1/100 + 1/25} = 8\sqrt{0.05} \approx 1.79$ . The standard deviation remains the same because the formula is symmetric when the population standard deviations are equal. Choice B incorrectly claims unequal sizes increase variability, and Choice E shows a fundamental misunderstanding of how standard deviations combine.
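The symmetry can be confirmed numerically. This is a minimal sketch of the standard deviation formula for a difference of independent sample means; the function name `sd_diff_means` is hypothetical.

```python
import math

def sd_diff_means(sigma1, n1, sigma2, n2):
    """SD of the sampling distribution of x̄1 - x̄2 for independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

original = sd_diff_means(8, 25, 8, 100)  # n1 = 25, n2 = 100
swapped = sd_diff_means(8, 100, 8, 25)   # n1 = 100, n2 = 25

print(round(original, 2), round(swapped, 2))  # prints: 1.79 1.79
```

The two results match only because the population standard deviations are equal. With unequal values, for example `sd_diff_means(8, 25, 4, 100)` versus `sd_diff_means(8, 100, 4, 25)`, swapping the sample sizes would change the result.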

Question 25

A university wants to estimate the proportion of all undergraduates who have taken at least one online course. A simple random sample of 200 undergraduates finds that 78 have taken at least one online course. Which inference procedure is most appropriate to estimate the population proportion?

Two-sample $z$ test for a difference in proportions
One-sample $z$ interval for a population proportion (correct answer)
Chi-square test for goodness of fit
One-sample $t$ interval for a population mean
Chi-square test for independence

Explanation: This problem asks for an estimate of a single population proportion (the proportion of all undergraduates who have taken at least one online course) using sample data. The one-sample z interval for a population proportion is the appropriate procedure for constructing a confidence interval to estimate this parameter. The two-sample z test (option A) compares proportions from two groups and tests a claim rather than estimating a single proportion. The chi-square tests (options C and E) address the distribution of a categorical variable or the relationship between two categorical variables, not the estimation of one proportion. The t interval (option D) is for estimating a population mean with quantitative data. When estimating a single population proportion from sample data, use the one-sample z interval.
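As an illustration of the chosen procedure, the sketch below computes a one-sample z interval from the sample in the question (78 of 200). The 95% confidence level is an assumption, since the question does not specify one, and the function name `one_sample_z_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z_interval(successes, n, level=0.95):
    """One-sample z interval for a population proportion."""
    p_hat = successes / n
    # Large Counts check: expect at least 10 successes and 10 failures.
    if successes < 10 or n - successes < 10:
        raise ValueError("Large Counts condition not met")
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# level=0.95 is an assumed confidence level, not given in the question.
low, high = one_sample_z_interval(78, 200, level=0.95)
print(round(low, 3), round(high, 3))
```

With 78 successes and 122 failures, the Large Counts condition is satisfied, and the interval is centered at the sample proportion 78/200 = 0.39.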