AP Statistics

AP Statistics Practice Test: Practice Test 27

Practice Test 27 for AP Statistics: real questions and explanations from the Varsity Tutors practice-test pool.

Question 1

A company compares two website designs to see which leads to a higher purchase rate. Of 200 randomly selected visitors shown Design 1, 34 made a purchase. Of 180 randomly selected visitors shown Design 2, 18 made a purchase. The company’s research claim is that Design 1 has a higher purchase proportion than Design 2. Which hypotheses are appropriate for a test of the company’s claim about the population proportions?

$H_0: p_1-p_2=0\quad\text{vs.}\quad H_a: p_1-p_2>0$ (correct answer)
$H_0: \hat{p}_1-\hat{p}_2=0\quad\text{vs.}\quad H_a: \hat{p}_1-\hat{p}_2>0$
$H_0: p_2-p_1=0\quad\text{vs.}\quad H_a: p_2-p_1>0$
$H_0: p_1=0.17\quad\text{vs.}\quad H_a: p_1>0.17$
$H_0: p_1-p_2=0\quad\text{vs.}\quad H_a: p_1-p_2\ne 0$

Explanation: This problem requires setting up hypotheses to test if Design 1 has a higher purchase rate than Design 2. The research claim is directional (Design 1 > Design 2), so we need a one-sided alternative hypothesis with p_1 - p_2 > 0. Option B incorrectly uses sample statistics (p̂) instead of population parameters. Option C reverses the order, testing if Design 2 > Design 1, which contradicts the claim. Option D tests only one proportion against a fixed value rather than comparing two populations. Option E uses a two-sided alternative when the claim specifies a direction. Remember that hypothesis tests always use population parameters (p) not sample statistics (p̂), and the alternative hypothesis must match the direction of the research claim.
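As a rough illustration of how the hypotheses in Question 1 would be carried through, the sketch below computes a two-proportion z statistic and its one-sided p-value from the counts in the question. The function name `two_prop_z_test` is hypothetical, and the pooled standard error is an assumption based on the usual convention for this test; the explanation above does not state it.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """One-sided test of H0: p1 - p2 = 0 vs. Ha: p1 - p2 > 0 (pooled SE)."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # combined proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p_hat1 - p_hat2) / se
    p_value = 1 - NormalDist().cdf(z)       # right-tail area, matching Ha
    return z, p_value

# Design 1: 34 of 200 purchased; Design 2: 18 of 180 purchased.
z, p_value = two_prop_z_test(34, 200, 18, 180)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Note that the sample proportions enter only the calculation; the hypotheses themselves are still statements about $p_1$ and $p_2$.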

Question 2

A hospital states that the mean wait time in its emergency department is $\mu=30$ minutes. A random sample of 35 patients is selected, and a one-sample $t$ test is performed with $H_0:\mu=30$ and $H_a:\mu>30$ at $\alpha=0.01$ . The p-value is 0.012. What conclusion is appropriate?

Reject $H_0$ ; there is convincing evidence that the population mean wait time is greater than 30 minutes.
Fail to reject $H_0$ ; there is not convincing evidence that the population mean wait time is greater than 30 minutes. (correct answer)
Because p-value = 0.012, there is a 1.2% chance that $\mu$ is greater than 30 minutes.
Fail to reject $H_0$ ; therefore the mean wait time is 30 minutes for every patient.
Reject $H_0$ ; this proves the hospital’s staffing changes caused longer wait times.

Explanation: This problem tests interpretation when p-value slightly exceeds α in a one-tailed test. With p-value = 0.012 and α = 0.01, we fail to reject H₀ because 0.012 > 0.01, meaning there is not convincing evidence at the 0.01 level that the population mean wait time is greater than 30 minutes. Choice A would be correct if α were 0.05, but the stricter 0.01 level requires stronger evidence. Choice C misinterprets what the p-value represents. Choice E makes an unsupported causal claim about staffing changes. When using α = 0.01, we require very strong evidence (p < 0.01) to reject H₀.

Question 3

A marketing analyst takes a single random sample of 300 customers and records two categorical variables: membership status (Member/Non-member) and purchase outcome on a visit (Purchased/Did not purchase). The results are shown in the table. Which chi-square test is appropriate, and what are the correct hypotheses?

State clearly whether this is one sample or multiple groups.

Chi-square test of homogeneity; $H_0$ : the distribution of membership status is the same for purchasers and non-purchasers; $H_a$ : it differs
Chi-square goodness-of-fit; $H_0$ : Member and Non-member occur in a 50/50 split; $H_a$ : not 50/50
Chi-square test of independence; $H_0$ : membership status and purchase outcome are independent in the population; $H_a$ : they are associated (correct answer)
Two-proportion $z$ test; $H_0$ : the proportion who purchased is the same for members and non-members; $H_a$ : it differs
Chi-square goodness-of-fit; $H_0$ : all four cell probabilities are equal; $H_a$ : at least one differs

Explanation: The problem explicitly states "a single random sample of 300 customers" with two categorical variables recorded for each customer. This is a one-sample design where we examine the relationship between membership status and purchase outcome. A chi-square test of independence is appropriate here, testing whether these two variables are independent in the population. The null hypothesis states that membership status and purchase outcome are independent, while the alternative states they are associated. Choice C correctly identifies this setup. The distinction from homogeneity is crucial: we have one sample with two variables measured on each unit, not separate samples from different populations.

Question 4

A school district wants to know if students at a particular high school sleep less than the recommended 8 hours per night on average. A random sample of 50 students reports their sleep, and the sample mean is $\bar{x}=7.45$ hours. Which hypotheses are appropriate for a test about the population mean sleep time $\mu$ ?

$H_0: \mu=8\text{ hours}\quad\text{vs.}\quad H_a: \mu>8\text{ hours}$
$H_0: \mu\le 8\text{ hours}\quad\text{vs.}\quad H_a: \mu>8\text{ hours}$
$H_0: \bar{x}=8\text{ hours}\quad\text{vs.}\quad H_a: \bar{x}<8\text{ hours}$
$H_0: \mu=8\text{ hours}\quad\text{vs.}\quad H_a: \mu<8\text{ hours}$ (correct answer)
$H_0: p=8\quad\text{vs.}\quad H_a: p<8$

Explanation: This question tests setting up a one-sided hypothesis test when looking for evidence that students sleep "less than" 8 hours. When testing if a mean is less than a specific value, the null hypothesis states equality (H₀: μ = 8) and the alternative states the inequality in the direction of interest (Hₐ: μ < 8). Option A points the alternative in the wrong direction (μ > 8). Option B has the same wrong direction and also writes the null hypothesis as an inequality (μ ≤ 8), which is sometimes seen but is not the standard form. Option C incorrectly uses the sample mean x̄ instead of the population parameter μ. Option E incorrectly uses p (proportion) instead of μ for a quantitative variable. The correct answer D properly states H₀: μ = 8 hours vs. Hₐ: μ < 8 hours, which tests whether students sleep less than the recommended amount. Remember that in standard hypothesis testing, the null hypothesis contains the equality, even for one-sided tests.

Question 5

A company surveyed 250 employees about whether they work remotely at least 3 days per week and whether they report being satisfied with work-life balance. The results are shown. Which comparison is appropriate for assessing whether satisfaction is associated with remote-work status using conditional distributions?

	Satisfied	Not satisfied
Remote $\ge 3$ days	90	35
Remote $<3$ days	72	53

Compare the overall proportion satisfied to the overall proportion not satisfied.
Compare the counts of satisfied employees in the two remote-work categories.
Compare the proportion satisfied among employees remote $\ge 3$ days to the proportion satisfied among employees remote $<3$ days. (correct answer)
Compare the proportion remote $\ge 3$ days among satisfied employees to the proportion remote $\ge 3$ days among not-satisfied employees.
Compare the overall proportion remote $\ge 3$ days to the overall proportion remote $<3$ days.

Explanation: This question tests the skill of using conditional distributions for two categorical variables to assess association between remote-work status and satisfaction. The proper comparison conditions on remote status and compares satisfaction proportions: 90/125 = 0.72 for >=3 days versus 72/125 = 0.576 for <3 days, indicating association. This conditions on the row variable. Choice D distracts by conditioning on satisfaction and comparing remote status across those groups. Choice A uses overall satisfaction without conditioning, a common mistake. Mini-lesson: Calculate conditional distributions as proportions within each row or column to examine the conditional behavior of one variable given the other, where varying distributions suggest the variables are associated.
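The row-conditional proportions quoted in the explanation can be reproduced with a short sketch. The dictionary layout and the helper name `conditional_distribution` are illustrative choices, not part of the original question.

```python
# Counts from the two-way table in Question 5.
table = {
    "Remote >= 3 days": {"Satisfied": 90, "Not satisfied": 35},
    "Remote < 3 days": {"Satisfied": 72, "Not satisfied": 53},
}

def conditional_distribution(row):
    """Return each cell as a proportion of its row total."""
    total = sum(row.values())
    return {outcome: count / total for outcome, count in row.items()}

# Condition on remote-work status, then compare satisfaction across rows.
conditional = {group: conditional_distribution(row) for group, row in table.items()}
for group, dist in conditional.items():
    print(group, {k: round(v, 3) for k, v in dist.items()})
```

Conditioning on the rows keeps remote-work status as the explanatory variable, which is the comparison Choice C describes.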

Question 6

A sports analyst randomly sampled 300 fans attending games and found that 198 favor adding instant replay reviews. A 96% confidence interval for the true proportion $p$ of all fans who favor adding instant replay reviews is $(0.60,\ 0.72)$ . Which interpretation is correct?

There is a 96% probability that the true proportion $p$ is between 0.60 and 0.72.
If the analyst repeated the sampling method many times, 96% of the resulting intervals would contain the true proportion $p$ . (correct answer)
96% of all fans favor adding instant replay reviews.
If another sample of 300 fans were taken, the sample proportion would be between 0.60 and 0.72 with probability 0.96.
The true proportion $p$ will fall between 0.60 and 0.72 in 96% of future years.

Explanation: This question tests the repeated sampling interpretation of confidence intervals. The correct answer (B) properly describes that if the sampling method were repeated many times, 96% of the resulting intervals would contain p. Choice A incorrectly treats 96% as a probability about the specific parameter p. Choice C misinterprets the confidence level as the percentage favoring instant replay. Choice D confuses the confidence interval with a prediction interval for future sample proportions. Choice E absurdly suggests p changes over time. This interpretation emphasizes the long-run performance of the confidence interval procedure across many hypothetical repetitions of the sampling process.
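The repeated-sampling interpretation can be checked with a small simulation. This is only a sketch under stated assumptions: the true proportion is set to 0.66 (the sample proportion from the question, used as a stand-in because the real $p$ is unknown), and the interval is the standard large-sample one-proportion z interval, which the question does not name explicitly. The function name `coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(true_p=0.66, n=300, level=0.96, reps=2000, seed=0):
    """Fraction of simulated intervals that contain true_p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)   # critical value for the level
    hits = 0
    for _ in range(reps):
        # Draw one sample of n fans and build its interval.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / reps

rate = coverage()
print(f"Share of intervals containing p: {rate:.3f}")
```

The share should land near 0.96: the confidence level describes the method's long-run success rate, not the chance that one particular interval is right.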

Question 7

A car magazine compared mean fuel efficiency (mpg) for cars using regular gasoline versus premium gasoline. From independent random samples, a 95% confidence interval for $(\mu_{\text{regular}}-\mu_{\text{premium}})$ was $( -2.7,\ 1.1)$ . Which interpretation is correct?

We are 95% confident that cars using regular gasoline get between 2.7 and 1.1 mpg more than cars using premium gasoline.
There is a 95% chance that $(\mu_{\text{regular}}-\mu_{\text{premium}})$ is between $-2.7$ and $1.1$ mpg.
We are 95% confident that the true difference in population mean fuel efficiency $(\mu_{\text{regular}}-\mu_{\text{premium}})$ is between $-2.7$ and $1.1$ mpg. (correct answer)
Because the interval includes 0, premium gasoline definitely increases mean mpg.
We are 95% confident that the mean mpg for regular gasoline cars is between $-2.7$ and $1.1$ .

Explanation: This question tests understanding of the interval (-2.7, 1.1) for μ_regular - μ_premium. The interval includes both negative and positive values, including 0, so we cannot conclude either fuel type is definitively better. Choice C correctly interprets this as our confidence about the true population mean difference. Choice A incorrectly states regular gets better mileage (the interval doesn't support this), Choice B uses incorrect probability language, Choice D makes the opposite claim of what including 0 means, and Choice E only mentions one fuel type's mean. When 0 is in the interval, the data do not show a statistically significant difference between the groups at the corresponding level (here, a two-sided test at α = 0.05).

Question 8

A city compares mean commute time (minutes) for residents who use public transit versus those who drive. From independent random samples, a 90% confidence interval for $(\mu_{\text{transit}}-\mu_{\text{drive}})$ is $(-3,\,12)$ . Which interpretation is correct?

Since 0 is in the interval, public transit commuters definitely have the same mean commute time as drivers.
We are 90% confident that the true difference in mean commute time, $\mu_{\text{transit}}-\mu_{\text{drive}}$ , is between $-3$ and $12$ minutes. (correct answer)
There is a 90% probability that $\mu_{\text{transit}}-\mu_{\text{drive}}$ equals 0 minutes.
We are 90% confident that transit commuters have mean commute times between 3 and 12 minutes shorter than drivers.
About 90% of all residents have commute times between $-3$ and $12$ minutes if they switch from driving to transit.

Explanation: This question involves an interval containing zero: (-3, 12) for μ_transit - μ_drive. Choice B correctly states we're 90% confident the true difference is between -3 and 12 minutes. Choice A incorrectly concludes equality from 0 being in the interval; we can only say we're not confident about the direction of difference. Choice C wrongly assigns probability to the parameter equaling 0. Choice D misinterprets the negative part of the interval. Choice E incorrectly applies to individuals. Remember: when 0 is in the interval, we cannot conclude there's a significant difference at that confidence level.

Question 9

A coach compared mean improvement (seconds) in a 100-meter sprint after two different training plans, using independent groups of athletes: Plan P and Plan Q. A 90% confidence interval for $\mu_P-\mu_Q$ is $(0.02,\ 0.15)$ seconds. Which interpretation is correct?

There is a 90% probability that Plan P improves each athlete’s time by between 0.02 and 0.15 seconds more than Plan Q.
We are 90% confident that the population mean improvement under Plan P is between 0.02 and 0.15 seconds.
We are 90% confident that, in the population, the mean improvement for Plan P exceeds that for Plan Q by between 0.02 and 0.15 seconds. (correct answer)
Because 0 is not in the interval, there is a 90% chance that $\mu_P=\mu_Q$ .
We are 90% confident that $\mu_Q-\mu_P$ is between 0.02 and 0.15 seconds.

Explanation: This question focuses on interpreting a 90% confidence interval for μ_P - μ_Q from 0.02 to 0.15 seconds. Choice C accurately conveys 90% confidence that Plan P's mean improvement exceeds Plan Q's by 0.02 to 0.15 seconds in the population. Both endpoints are positive, so the interval excludes zero, suggesting a difference. Choice D is a distractor, wrongly implying a chance of equality despite the interval excluding zero. Mini-lesson: For two independent groups, the CI estimates the difference in population means with a range reflecting sampling error. Excluding zero aligns with rejecting the null of no difference in a two-sided test at alpha = 0.10, emphasizing CIs as tools for inference on means rather than individual outcomes.

Question 10

An environmental scientist models ozone level ( $y$ , in ppb) from traffic volume ( $x$ , in thousands of cars per day) using data from days with traffic between 10 and 60 (thousand cars). The regression line is $\hat{y}=18+1.1x$ . The purpose of the linear model is to describe the association and predict typical ozone levels for traffic volumes in the observed range. Which interpretation of the model is correct?

For each additional 1,000 cars per day (within 10–60 thousand), the predicted ozone level increases by about 1.1 ppb, on average. (correct answer)
If traffic volume is 0, then the ozone level will be 18 ppb.
An increase of 1.1 ppb in ozone causes traffic volume to increase by 1,000 cars per day.
Because the slope is positive, increasing traffic causes ozone to increase by 1.1 ppb for every additional 1,000 cars.
At 100 thousand cars per day, the model predicts 128 ppb, so it is appropriate to use the model at 100 thousand cars per day.

Explanation: This question tests understanding of slope interpretation in an environmental science context. The regression equation $\hat{y}=18+1.1x$ models predicted ozone levels from traffic volume (in thousands of cars), where the slope 1.1 represents the average change in predicted ozone per thousand cars. Choice A correctly states "for each additional 1,000 cars per day, the predicted ozone level increases by about 1.1 ppb, on average." Choice B incorrectly treats the intercept as an actual value rather than a prediction outside the data range. Choice C reverses causation, suggesting ozone causes traffic changes. Choice D claims direct causation from an observational study. Choice E extrapolates to 100 thousand cars, well beyond the observed range of 10-60 thousand. Regression models from observational data describe associations, not causal relationships, and should not be extrapolated beyond their data range.
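One way to make the extrapolation warning concrete is to wrap the fitted line in a function that refuses inputs outside the observed range. This is a minimal sketch; the function name `predict_ozone` and the choice to raise an error are assumptions made for illustration.

```python
X_MIN, X_MAX = 10, 60   # observed traffic range, in thousands of cars per day

def predict_ozone(traffic_thousands):
    """Predicted ozone (ppb) from y-hat = 18 + 1.1x, inside the observed range only."""
    if not X_MIN <= traffic_thousands <= X_MAX:
        raise ValueError("traffic volume is outside the observed 10-60 range (extrapolation)")
    return 18 + 1.1 * traffic_thousands

print(predict_ozone(30))
# The slope is the change in the prediction per additional 1,000 cars.
slope_check = predict_ozone(31) - predict_ozone(30)
print(round(slope_check, 2))
```

Calling `predict_ozone(100)` raises an error instead of returning 128, which mirrors why Choice E is inappropriate.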

Question 11

An environmental agency uses a sample of water tests from a river to estimate whether the mean pollutant level exceeds a legal limit. Declaring the river unsafe could trigger costly restrictions; failing to act when levels truly exceed the limit could harm public health. Why is it important to consider potential error in the inference before making a policy decision?

Because if the sample size is large, the sample mean cannot be above the legal limit unless the population mean is above it
Because statistical inference involves uncertainty, and either kind of wrong decision has serious health or economic consequences (correct answer)
Because measurement units, not sampling, are the only source of error in water testing
Because the legal limit is a constant, so no error is possible in conclusions about it
Because a confidence interval always contains the true mean, so decisions cannot be wrong

Explanation: This question examines understanding of inference uncertainty in environmental policy contexts. The agency must decide whether pollutant levels exceed legal limits based on sample water tests. The correct answer (B) properly identifies that statistical inference involves uncertainty - the sample mean might differ from the true population mean due to sampling variability. Type I error (declaring the river unsafe when it's actually safe) triggers unnecessary costly restrictions, while Type II error (failing to act when levels are truly dangerous) risks public health. Choice A wrongly suggests large samples eliminate sampling error, while E incorrectly claims confidence intervals always contain the true mean. When using samples to make high-stakes decisions, we must acknowledge that our inference could be wrong and consider the consequences of both types of errors.

Question 12

A coffee shop owner wants to know if there is a difference in coffee preference (Espresso, Latte, Drip) between customers who visit during the morning shift and those who visit during the evening shift. From a sample of 500 customers, 300 visited in the morning. Overall, 100 preferred Espresso.

For a chi-square test of homogeneity, what would be the expected frequency of morning customers who prefer Espresso, assuming coffee preference is the same for both shifts?

$40$
$50$
$60$ (correct answer)
$100$

Explanation: The expected frequency (or count) is (row total × column total) / grand total. The row total for morning customers is 300. The column total for Espresso preference is 100. The grand total is 500. The expected frequency is $(300 \times 100) / 500 = 30000 / 500 = 60$ .

Question 13

A fitness app company asks: “Does enabling a daily step-goal notification increase average daily steps?” The population is all current app users. The proposed plan is to compare average steps for users who turn on notifications in the settings to users who do not, using one month of app data. Which aspect is most important to address before collecting data?

Increase the number of users included so the sample size is extremely large
Randomly assign notification status (or use a design that addresses confounding) because users who opt in may already be more motivated (correct answer)
Switch from average steps to total steps so the response variable is larger
Plan to use a histogram of steps for each group in the final report
Collect data for more months so the analysis includes more observations per user

Explanation: This question focuses on self-selection bias and confounding. Users who choose to enable step-goal notifications are likely already more motivated about fitness than those who don't enable them. This self-selection creates confounding - any observed differences in steps could be due to pre-existing motivation levels, not the notifications themselves. Random assignment of notification status (B) would eliminate this confounding by ensuring groups are comparable except for notification status. Simply increasing sample size (A) or data collection time (E) won't address this fundamental design issue. Without randomization or other methods to address confounding, the study cannot determine if notifications actually increase steps or if motivated users both enable notifications and walk more.

Question 14

A cafeteria tracks two independent random variables: $X$ , the number of apples sold in a day with $\mu_X=120$ and $\sigma_X=15$ , and $Y$ , the number of bananas sold in a day with $\mu_Y=90$ and $\sigma_Y=12$ . Let $C=X-Y$ be the difference in sales (apples minus bananas). Which statement about $C$ is correct?

$\mu_C=210$ and $\sigma_C=\sqrt{15^2+12^2}$
$\mu_C=30$ and $\sigma_C=\sqrt{15^2+12^2}$ (correct answer)
$\mu_C=30$ and $\sigma_C=15-12$
$\mu_C=-30$ and $\sigma_C=\sqrt{15^2+12^2}$
$\mu_C=30$ and $\sigma_C=15+12$

Explanation: This AP Statistics problem tests combining independent random variables for the difference C = X - Y. The mean is μ_C = 120 - 90 = 30. For independent variables, Var_C = 15² + 12² = 225 + 144 = 369, so σ_C = √369 = √(15² + 12²). Choice A uses the sum of means (210), a distractor for those not noting it's a difference. Choice D negates the mean, possibly reversing subtraction order. Mini-lesson: μ_{X-Y} = μ_X - μ_Y, but Var_{X-Y} = Var_X + Var_Y due to independence; this addition of variances applies equally to sums and differences.
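The arithmetic in the mini-lesson can be written out directly. The variable names below are illustrative.

```python
from math import sqrt

mu_x, sigma_x = 120, 15   # apples sold per day
mu_y, sigma_y = 90, 12    # bananas sold per day

mu_c = mu_x - mu_y                 # means subtract for a difference
var_c = sigma_x**2 + sigma_y**2    # variances add because X and Y are independent
sigma_c = sqrt(var_c)

print(f"mu_C = {mu_c}, Var(C) = {var_c}, sigma_C = {sigma_c:.2f}")
```

The standard deviation comes out a little above 19, which is neither 15 - 12 nor 15 + 12, so Choices C and E can be ruled out numerically as well.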

Question 15

A gym owner believes that more than 35% of members attend at least 3 times per week. To check, a random sample of 160 members is selected, and 49 report attending at least 3 times per week (so $\hat{p}=49/160$ ). Which hypotheses are appropriate for a one-proportion $z$ test of the owner's belief?

$H_0: p=0.35$ ; $H_a: p<0.35$
$H_0: p=0.306$ ; $H_a: p>0.306$
$H_0: \hat{p}=0.35$ ; $H_a: \hat{p}>0.35$
$H_0: p=0.35$ ; $H_a: p>0.35$ (correct answer)
$H_0: p>0.35$ ; $H_a: p=0.35$

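Explanation: This question assesses setting up hypotheses for the owner's belief that more than 35% of members attend at least 3 times per week. Choice D is correct: H0: p = 0.35 and Ha: p > 0.35, because "more than" calls for a right-tailed alternative. Choice A points the alternative in the wrong direction. Choice B uses the sample proportion p-hat ≈ 0.306 as the hypothesized value, and Choice C states the hypotheses about p-hat instead of p. Choice E swaps the roles of the null and alternative hypotheses. Mini-lesson: "More than" puts > in Ha, while H0 states equality for the population proportion p. The sample data are used to test H0 but do not appear in the hypothesis statements.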
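For readers who want to see where the sample data enter, the sketch below carries the Question 15 hypotheses through to a test statistic. It assumes the standard one-proportion z test, with the standard error computed from the null value 0.35; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.35            # value of p stated in H0
x, n = 49, 160       # sample counts from the question
p_hat = x / n

se = sqrt(p0 * (1 - p0) / n)            # standard error uses p0, not p-hat
z = (p_hat - p0) / se
p_value = 1 - NormalDist().cdf(z)       # right tail, because Ha: p > 0.35

print(f"p-hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.3f}")
```

Because the sample proportion is below 0.35, z is negative and the right-tail p-value is large, so this sample would not give evidence for the owner's belief. The hypotheses are still written with 0.35 and >, since they are set by the claim rather than by the data.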

Question 16

A nonprofit is comparing two fundraising email subject lines to see if one leads to a higher donation rate. From a random sample of 500 recipients who received Subject Line 1, 65 donated ( $\hat{p}_1=0.13$ ). From an independent random sample of 480 recipients who received Subject Line 2, 48 donated ( $\hat{p}_2=0.10$ ). The research claim is that Subject Line 1 leads to a higher donation rate. Which hypotheses are appropriate?

$H_0: p_1-p_2=0$ ; $H_a: p_1-p_2>0$ (correct answer)
$H_0: p_1-p_2\neq 0$ ; $H_a: p_1-p_2=0$
$H_0: p_2-p_1=0$ ; $H_a: p_2-p_1>0$
$H_0: \hat{p}_1-\hat{p}_2=0$ ; $H_a: \hat{p}_1-\hat{p}_2>0$
$H_0: p_1=0.13$ ; $H_a: p_1>0.13$

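Explanation: This question tests hypothesis formulation for comparing two population proportions when the claim is that one is higher. The null hypothesis is H0: p_1 - p_2 = 0 (equal donation rates), and the alternative is Ha: p_1 - p_2 > 0 (Subject Line 1 has the higher rate), as in choice A. Choice B swaps the null and alternative. Choice C reverses the subtraction, which would test whether Subject Line 2 is higher. Choice D states the hypotheses about sample proportions instead of population proportions. Choice E tests a single proportion against the observed value 0.13 rather than comparing the two groups. Mini-lesson: let p_1 be the proportion for the group claimed to be better, set the null difference to 0, use > for "higher," write hypotheses about population parameters only, and avoid swapping the null and alternative.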

Question 17

A basketball player takes 30 free throws. The sequence of makes (M) and misses (X) contains a repeating pattern: M, X, M, X, M, X, repeating exactly for all 30 shots. Under random behavior, you might see short alternations, but exact repeating patterns over many trials are not expected. Is the pattern consistent with random behavior?

Yes, because randomness means makes and misses should alternate to look “mixed.”
No, because an exact repeating pattern over all 30 shots suggests a non-random mechanism. (correct answer)
Yes, because any specific sequence (including this one) could occur by chance, so it must be considered random.
No, because a random process must have the same number of makes and misses in 30 shots.
Yes, because repeating patterns are expected whenever the probability of a make is close to 0.5.

Explanation: This question tests recognition of non-random patterns. A perfect alternating pattern of make-miss-make-miss for all 30 shots is extremely unlikely under random behavior and strongly suggests a non-random mechanism. If each shot were an independent 50/50 event, the probability of this exact sequence would be (1/2)^30, or roughly one in a billion. The distractors incorrectly suggest that alternation is what randomness "looks like" or that any sequence that could occur by chance must be treated as random. The key lesson is that while any specific sequence has the same probability under that model, patterns with obvious structure (like perfect alternation) are vastly outnumbered by sequences without such structure. Perfect repetition over many trials is a red flag for non-randomness.
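The size of (1/2)^30 is easy to check numerically. The sketch assumes, as the explanation does, that each shot is an independent event with a 0.5 chance of a make; the variable names are illustrative.

```python
n_shots = 30

total_sequences = 2 ** n_shots     # every possible make/miss sequence of length 30
alternating = 2                    # perfect alternation, starting with M or with X
prob_exact = 0.5 ** n_shots        # probability of one specific sequence

print(total_sequences)
print(prob_exact)
print(alternating / total_sequences)
```

Only two of the more than one billion possible sequences alternate perfectly, which is why seeing one points toward a non-random mechanism.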

Question 18

A city wants to estimate the mean number of minutes commuters wait for a bus during weekday mornings. Two independent simple random samples of 50 weekday commuters are taken from the same set of routes during the same month. Sample 1 has a mean wait time of 7.8 minutes; Sample 2 has a mean wait time of 9.1 minutes. Why might the sample results differ?

The difference is expected because random samples from the same population can yield different sample means due to chance variation. (correct answer)
Because the means differ, the sampling method must have been biased for at least one sample.
One of the samples must be wrong, since two samples of the same size from the same population should have the same mean.
Sampling variability occurs only when samples are taken without replacement; with random samples it does not occur.
The difference proves that the population mean wait time changed between the two samples.

Explanation: This question addresses sampling variability in sample means. The two samples produced mean wait times of 7.8 and 9.1 minutes - a difference of 1.3 minutes that is entirely expected from random sampling. Each sample of 50 commuters represents a different random subset of all commuters, and by chance alone, one sample might include more people who experienced longer waits. Choice B incorrectly assumes different results indicate bias, while choice C wrongly expects identical means from same-sized samples. Choice D confuses sampling with and without replacement, and choice E incorrectly assumes the population changed. Understanding that sample means naturally vary from sample to sample is fundamental to statistical inference.
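Sampling variability can be seen by drawing two samples from one fixed population. Everything about the population below is hypothetical: the question gives no distribution, so the sketch invents 10,000 right-skewed wait times with a mean near 8.5 minutes purely for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)
# Hypothetical population of wait times (minutes); not taken from the question.
population = [rng.expovariate(1 / 8.5) for _ in range(10_000)]

# Two independent simple random samples of 50 from the same population.
sample_1 = rng.sample(population, 50)
sample_2 = rng.sample(population, 50)

print(f"Sample 1 mean: {mean(sample_1):.1f}")
print(f"Sample 2 mean: {mean(sample_2):.1f}")
```

The population never changes between the two draws, yet the two sample means differ, which is the behavior Choice A describes.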

Question 19

For the population of diameters of ball bearings produced by a factory, diameters are approximately Normal with mean $\mu=10.00$ mm and standard deviation $\sigma=0.04$ mm. The marked interval is from $9.92$ mm to $10.08$ mm. Which statement about the marked interval is correct?

The interval $9.92$ to $10.08$ is within about 2 standard deviations of the mean, so it should include most (but not all) bearings. (correct answer)
The interval $9.92$ to $10.08$ is within about 1 standard deviation of the mean, so it should include nearly all bearings.
The interval $9.92$ to $10.08$ is within about 4 standard deviations of the mean, so it should include only about two-thirds of bearings.
The interval $9.92$ to $10.08$ is within about 2 standard errors of the mean, so it should include most bearings.
The interval $9.92$ to $10.08$ is centered below the mean, so it captures mostly undersized bearings.

Explanation: This problem involves interpreting an interval in a normal distribution of bearing diameters. With μ = 10.00 mm and σ = 0.04 mm, the interval 9.92 to 10.08 mm extends from (9.92-10.00)/0.04 = -2 to (10.08-10.00)/0.04 = +2 standard deviations from the mean. According to the empirical rule, approximately 95% of values fall within 2 standard deviations of the mean, which means most (but not all) bearings. The distractors incorrectly calculate the number of standard deviations, confuse standard deviation with standard error, or misinterpret the interval's position. The interval is symmetric around the mean, not shifted below it.
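The z-scores and the share of bearings inside the interval can be computed from the Normal model given in the question. The sketch uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

diameters = NormalDist(mu=10.00, sigma=0.04)   # model stated in the question
low, high = 9.92, 10.08

z_low = (low - diameters.mean) / diameters.stdev
z_high = (high - diameters.mean) / diameters.stdev
share = diameters.cdf(high) - diameters.cdf(low)   # area between the endpoints

print(f"z-scores: {z_low:.1f} to {z_high:.1f}; share inside: {share:.4f}")
```

The exact Normal area is slightly above the empirical rule's 95%, consistent with "most (but not all)" bearings.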

Question 20

Two brands of batteries (Brand X and Brand Y) were tested for lifetime (hours) until failure. A side-by-side boxplot summary is given by five-number summaries below. Which comparison is supported by the display?

Brand X: min 6, $Q_1$ 10, median 14, $Q_3$ 18, max 22; no outliers
Brand Y: min 4, $Q_1$ 9, median 13, $Q_3$ 17, max 19; no outliers

Brand Y has a higher median lifetime than Brand X, and Brand Y has a larger IQR.
Brand X has a slightly higher median lifetime than Brand Y, and the spreads (IQRs) are the same. (correct answer)
Brand X is more variable because it has a smaller range.
Brand X and Brand Y have the same median because their IQRs overlap.
Brand Y has a larger range and therefore a higher center.

Explanation: This problem focuses on comparing battery lifetime distributions for Brand X and Brand Y using boxplots and five-number summaries. Brand X has a median of 14 hours, slightly higher than Y's 13, showing a marginally higher center, and both have identical IQRs of 8 hours (X: 18-10, Y: 17-9), indicating similar middle 50% spreads. The ranges are close (X: 16, Y: 15), with no outliers, supporting comparable variability. Choice A tempts by reversing the medians and claiming Y has larger IQR, possibly from miscalculating or swapping groups. Remember, in comparing quantitative distributions, prioritize center (median for skewed data), spread (IQR resists outliers), and note any unusual features like outliers; here, the slight difference in medians and equal IQRs are key.
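The IQR and range comparisons follow directly from the five-number summaries. The dictionary layout below is an illustrative choice.

```python
# Five-number summaries from Question 20 (hours).
summaries = {
    "Brand X": {"min": 6, "q1": 10, "median": 14, "q3": 18, "max": 22},
    "Brand Y": {"min": 4, "q1": 9, "median": 13, "q3": 17, "max": 19},
}

spread = {
    brand: {"iqr": s["q3"] - s["q1"], "range": s["max"] - s["min"]}
    for brand, s in summaries.items()
}
for brand, s in summaries.items():
    print(brand, "median:", s["median"], spread[brand])
```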

Question 21

A research study produced the following two-way table of counts for two categorical variables, A and B. A total of 200 subjects were studied. For variable A, the counts are 80 for level 1 and 120 for level 2. For variable B, the counts are 100 for level X and 100 for level Y.

For a chi-square test of independence, which of the following comparisons between expected counts is correct?

The expected count for (Level 1, Level X) is equal to the expected count for (Level 2, Level X).
The expected count for (Level 1, Level X) is less than the expected count for (Level 1, Level Y).
The expected count for (Level 2, Level X) is greater than the expected count for (Level 1, Level X). (correct answer)
The expected count for (Level 1, Level Y) is greater than the expected count for (Level 2, Level Y).

Explanation: First, calculate the four expected counts using the formula (row total × column total) / grand total. The expected count for (Level 1, Level X) is $(80 \times 100) / 200 = 40$ . The expected count for (Level 1, Level Y) is $(80 \times 100) / 200 = 40$ . The expected count for (Level 2, Level X) is $(120 \times 100) / 200 = 60$ . The expected count for (Level 2, Level Y) is $(120 \times 100) / 200 = 60$ . Comparing the values as per the choices, only C is correct because the expected count for (Level 2, Level X), which is 60, is greater than the expected count for (Level 1, Level X), which is 40.
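The expected-count formula can be applied to the whole table at once. The helper name `expected_counts` is hypothetical; it takes only the row and column totals, which is all the formula needs.

```python
def expected_counts(row_totals, col_totals):
    """Expected count for each cell: (row total * column total) / grand total."""
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Variable A: Level 1 = 80, Level 2 = 120; Variable B: Level X = 100, Level Y = 100.
expected = expected_counts([80, 120], [100, 100])
print(expected)
```

The same helper covers Question 12: with row totals 300 and 200 (morning, evening) and column totals 100 and 400 (Espresso, other), the morning/Espresso cell comes out to 60.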

Question 22

A researcher interviews households until finding 12 households that agree to participate in a study. For each household contacted, a success is “agrees to participate” and a failure is “does not agree.” The researcher records the number of households contacted to reach 12 successes. Does this meet binomial conditions for modeling the number of successes?

Yes, because each contact results in success or failure and the probability of agreement is constant.
No, because the number of trials is not fixed in advance. (correct answer)
Yes, because the stopping rule guarantees independence between contacts.
No, because there are more than two outcomes (agree, refuse, or not home).
Yes, because binomial models apply whenever you stop after a fixed number of successes.

Explanation: This is a classic negative binomial situation (the geometric setting is the special case of stopping at the first success), not a binomial setting. The key violation is that the number of trials is not fixed in advance: the researcher continues until achieving 12 successes, so the total number of households contacted varies. In a binomial setting, we must know n (the number of trials) before starting. Choice A incorrectly focuses on constant probability without addressing the variable trial count, C incorrectly suggests the stopping rule creates independence, D introduces an irrelevant third outcome not mentioned in the problem, and E incorrectly claims this type of stopping rule works for binomial models. The binomial distribution models the number of successes in a fixed number of trials, not the number of trials needed to achieve a fixed number of successes.
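
A small simulation can make the point concrete: when the researcher stops at the 12th success, the number of households contacted changes from run to run. This is only a sketch. The agreement probability of 0.3 is a hypothetical value (the question does not give one), and the function name is made up for illustration.

```python
import random

def households_contacted(target_successes, p_agree, rng):
    """Contact households until `target_successes` agree; return how many were contacted."""
    contacted = 0
    agreed = 0
    while agreed < target_successes:
        contacted += 1
        if rng.random() < p_agree:
            agreed += 1
    return contacted

rng = random.Random(0)  # fixed seed so the run is repeatable
P_AGREE = 0.3           # hypothetical value; not given in the question

# Repeat the whole study 1000 times and record the number of contacts each time.
counts = [households_contacted(12, P_AGREE, rng) for _ in range(1000)]

print("smallest number contacted:", min(counts))
print("largest number contacted:", max(counts))
```

The number of successes is always 12, while the number of trials is the random quantity, which is the reverse of the binomial setup.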

Question 23

A delivery company randomly samples 45 packages and estimates the mean delivery time (in days). A 95% confidence interval for the population mean delivery time is $(2.9, 3.4)$ days. The company advertises that the true mean delivery time is 3.0 days. Is the claim supported by the confidence interval?

Yes, because 3.0 is inside the 95% confidence interval. (correct answer)
No, because 95% confidence means the true mean is 3.0 for 95% of customers.
No, because the true mean must be the midpoint of $(2.9, 3.4)$, not 3.0.
Yes, because 95% confidence guarantees the true mean is exactly 3.0 days.
No, because 3.0 is in the interval, so it cannot be the true mean.

Explanation: This question evaluates whether a claim about mean delivery time is supported by a confidence interval. The 95% confidence interval is (2.9, 3.4) days, and the company claims 3.0 days. Since 3.0 falls within this interval, it is a plausible value for the true mean, so the claim is supported by the data. Choices B and D misread the confidence level: it describes how often the method captures the true mean, not a share of customers or a guarantee of an exact value. Choice C incorrectly requires the true mean to be at the midpoint, while choice E absurdly suggests values in the interval cannot be the true mean. A confidence interval represents a range of plausible values for the population parameter based on sample data. Any value within the interval, not just the midpoint or sample mean, is a plausible value for the true population mean at the given confidence level.
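
The check itself is a one-line comparison, sketched below in Python. The variable names are illustrative, and treating the midpoint as the sample mean assumes a symmetric interval of the form estimate ± margin of error, which is the usual case for a mean.

```python
lower, upper = 2.9, 3.4   # 95% confidence interval from the question, in days
claimed_mean = 3.0        # advertised mean delivery time, in days

midpoint = (lower + upper) / 2         # sample mean, assuming a symmetric interval
margin_of_error = (upper - lower) / 2  # half the width of the interval

# The claim is plausible if the claimed value lies anywhere inside the interval.
claim_is_plausible = lower <= claimed_mean <= upper

print(f"midpoint = {midpoint:.2f}, margin of error = {margin_of_error:.2f}")
print("claimed mean inside interval:", claim_is_plausible)
```

The claimed value does not need to equal the midpoint; it only needs to lie between the two endpoints.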

Question 24

A wildlife camera records an animal sighting each night as one of {Deer, Raccoon, No animal}. Let N be the event that no animal is recorded. The park ranger states $P(N)=0.40$. Which statement correctly describes the probability?

In the next 5 nights, there will be exactly 2 nights with no animal.
Because $P(N)=0.40$ , “No animal” is impossible on two consecutive nights.
In the long run, about 40% of nights will have no animal recorded. (correct answer)
The odds of no animal are 40 to 60, meaning no animal will occur more often than animals every week.
If a deer is recorded tonight, then $P(N)$ becomes 0.60 tomorrow.

Explanation: This question asks for the correct interpretation of a probability. The sample space is {Deer, Raccoon, No animal}, and the event N is {No animal} with P(N) = 0.40. Choice C is correct: a probability describes long-run relative frequency, so over many nights the proportion with no animal recorded settles near 0.40, even though any short run of nights is unpredictable. Choice A assumes an exact count in a small set of nights, which probability does not guarantee. Choices B and E wrongly assume that the outcome on one night changes what can happen on another. Choice D misreads the odds: 40 to 60 means no animal is the less likely outcome, and nothing is guaranteed in any single week.
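
The long-run interpretation can be illustrated with a simulation. This is a sketch that assumes nights are independent and that P(N) stays at 0.40 every night; the function name and the seed are arbitrary choices.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
P_NO_ANIMAL = 0.40      # P(N) stated by the park ranger

def proportion_no_animal(nights):
    """Simulate `nights` independent nights; return the share with no animal."""
    no_animal = sum(rng.random() < P_NO_ANIMAL for _ in range(nights))
    return no_animal / nights

# Short runs bounce around; long runs settle near 0.40.
for nights in (5, 50, 5000, 100000):
    print(nights, proportion_no_animal(nights))
```

With only 5 nights the simulated proportion can land well away from 0.40, which is why choice A cannot be right.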

Question 25

A city planner expects that about 60% of residents support building a new bike lane. A random sample of 50 residents is surveyed, and 24 say they support it (48%). The planner expected the sample proportion to be near 60%, but the observed outcome is lower. Is the result unexpected, given the variability in random sampling with a sample size of 50?

Yes; because 48% is not close to 60%, it is impossible if the true support rate is 60%.
No; with 50 people, some sampling variability is expected, and 48% could occur by chance even if the true proportion is about 60%. (correct answer)
Yes; because the sample is random, the sample proportion should be exactly 60%.
No; because the observed proportion is below 60%, it confirms the true proportion is below 60%.
Yes; because a sample of 50 is large enough that the sample proportion should almost never differ from 60% by more than a tiny amount.

Explanation: This question examines understanding of sampling variability for proportions with moderate samples. Finding 48% support (24 out of 50) when expecting 60% represents a 12 percentage point difference. If the true proportion is 0.60, the standard deviation of the sample proportion for n = 50 is $\sqrt{(0.60)(0.40)/50} \approx 0.069$, so 48% is about 1.7 standard deviations below 60%. With 50 people, this level of variation is plausible due to random sampling. While the result differs from expectation, it is not so extreme as to be impossible under the assumed model. The incorrect answers either claim this variation cannot occur or misunderstand how sample proportions relate to population proportions. When assessing whether results are unexpected, we must consider that even with 50 observations, sample proportions can deviate noticeably from population values due to chance, though extreme deviations become increasingly unlikely.
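
To put a number on "plausible", the sketch below computes the standard deviation of the sample proportion, the standardized distance of 48% from 60%, and the exact binomial probability of seeing 24 or fewer supporters. It assumes the true proportion really is 0.60 and that the 50 responses are independent; the variable names are illustrative.

```python
from math import comb, sqrt

n = 50          # sample size from the question
p = 0.60        # planner's assumed true proportion
observed = 24   # supporters in the sample

# Standard deviation of the sample proportion under the assumed p.
standard_error = sqrt(p * (1 - p) / n)

# How many standard deviations the observed 48% sits from 60%.
z = (observed / n - p) / standard_error

# Exact binomial probability of 24 or fewer supporters if p = 0.60.
prob_at_most_observed = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(observed + 1)
)

print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.2f}")
print(f"P(X <= {observed}) = {prob_at_most_observed:.4f}")
```

A result this far below 60% is uncommon but far from impossible, which matches the correct answer.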