Sampling for Differences in Sample Means


Questions 1 - 10
1

A manufacturer compares the mean lifetime of batteries from two production lines. In repeated sampling, an independent random sample of $n_1=16$ batteries from Line 1 and $n_2=16$ batteries from Line 2 is tested, and $\bar{x}_1-\bar{x}_2$ is recorded. Which statement is correct about how the standard deviation of the sampling distribution changes if both sample sizes are quadrupled (to $n_1=n_2=64$)?

It becomes 4 times smaller because the sample sizes are 4 times as large.

It becomes 2 times larger because there are two groups instead of one.

It becomes about half as large because each standard error term involves $\sqrt{n}$.

It becomes 4 times as large because the sample sizes are 4 times as large.

It stays the same because the population standard deviations do not change.

Explanation

This question examines how sample size affects the standard deviation of the sampling distribution. When both sample sizes are quadrupled from 16 to 64, each term under the square root in $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ is divided by 4, so the entire expression becomes half as large (since $\sqrt{1/4} = 1/2$). The options claiming the standard deviation becomes 2 or 4 times as large are incorrect because larger samples reduce spread. The option claiming it stays the same is wrong because sample size does affect spread. The option claiming it becomes 4 times smaller uses the wrong factor: the standard deviation scales with $1/\sqrt{n}$, not $1/n$.
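The halving can be checked numerically. The sketch below is only an illustration: the population standard deviations are hypothetical values (the question does not give any), and `sd_diff_means` is a helper name introduced here.

```python
import math

def sd_diff_means(sigma1, sigma2, n1, n2):
    """SD of the sampling distribution of x̄1 - x̄2 for independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Hypothetical population standard deviations (hours); not given in the question.
sigma1, sigma2 = 5.0, 8.0

sd_16 = sd_diff_means(sigma1, sigma2, 16, 16)
sd_64 = sd_diff_means(sigma1, sigma2, 64, 64)
ratio = sd_64 / sd_16  # 0.5 whatever sigmas are chosen

print(f"n = 16: {sd_16:.4f}")
print(f"n = 64: {sd_64:.4f}")
print(f"ratio:  {ratio:.2f}")
```

Changing `sigma1` and `sigma2` changes both standard deviations but leaves the ratio at one half.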

2

A hospital compares patient wait times in two departments. Repeatedly, it takes an independent random sample of $n_1=100$ patients from the ER and $n_2=20$ patients from Urgent Care and computes $\bar{x}_{ER}-\bar{x}_{UC}$. Which statement is correct about the sampling distribution of $\bar{x}_{ER}-\bar{x}_{UC}$?

It is the distribution of individual wait-time differences between one ER patient and one Urgent Care patient.

It has no variability because $\mu_{ER}$ and $\mu_{UC}$ are fixed values.

It is centered at $0$ because each sample mean is an unbiased estimator.

It is centered at $\mu_{ER}-\mu_{UC}$, and its variability depends on both $n_1$ and $n_2$.

Its standard deviation depends only on the smaller sample size, $n_2=20$.

Explanation

This question tests understanding of how the sampling distribution behaves with different sample sizes. The sampling distribution of $\bar{x}_{ER}-\bar{x}_{UC}$ is centered at $\mu_{ER}-\mu_{UC}$, and its standard deviation is $\sqrt{\frac{\sigma_{ER}^2}{100} + \frac{\sigma_{UC}^2}{20}}$, which depends on both sample sizes. The option stating the distribution is centered at $0$ is incorrect: the center is the difference in population means, not zero. The option stating that only the smaller sample size matters is wrong because both sample sizes affect variability. The option claiming no variability is incorrect because the sample means vary from sample to sample even though $\mu_{ER}$ and $\mu_{UC}$ are fixed. The remaining option confuses the sampling distribution with the distribution of individual patient differences.
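To see that both samples contribute, the two variance terms can be computed separately. This is a minimal sketch with hypothetical population standard deviations chosen for round numbers; only the sample sizes come from the question.

```python
import math

# Hypothetical population standard deviations (minutes); not given in the question.
sigma_er, sigma_uc = 30.0, 15.0
n_er, n_uc = 100, 20

# Each sample contributes its own variance term, sigma^2 / n.
var_er = sigma_er**2 / n_er   # 900 / 100 = 9.0
var_uc = sigma_uc**2 / n_uc   # 225 / 20  = 11.25

sd_diff = math.sqrt(var_er + var_uc)  # sqrt(20.25) = 4.5

# What the SD would be if only the smaller sample mattered (it does not).
sd_uc_only = math.sqrt(var_uc)

print(var_er, var_uc, sd_diff, sd_uc_only)
```

With these assumed values the ER term is nearly as large as the Urgent Care term, so ignoring either sample would understate the spread.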

3

A company compares delivery times from two warehouses. Many times, it takes an independent random sample of $n_1=25$ deliveries from Warehouse 1 and $n_2=64$ deliveries from Warehouse 2, then computes $\bar{x}_1-\bar{x}_2$ (in minutes). Which statement is correct about the sampling distribution of $\bar{x}_1-\bar{x}_2$?

It has no variability if the same number of deliveries is sampled from each warehouse.

It is approximately normal only if the population distributions are exactly normal.

It has less variability when $n_1$ increases but is unaffected by $n_2$.

Its standard deviation is $\sigma_1/\sqrt{n_1}-\sigma_2/\sqrt{n_2}$ because the statistic subtracts the means.

Its mean is $\mu_1-\mu_2$, regardless of the sample sizes.

Explanation

This question asks about properties of the sampling distribution for the difference in sample means. The mean of the sampling distribution of $\bar{x}_1-\bar{x}_2$ is always $\mu_1-\mu_2$, regardless of sample sizes, because each sample mean is an unbiased estimator of its population mean. The option requiring exactly normal populations is incorrect because the Central Limit Theorem allows for approximate normality with large samples even if the populations aren't normal. The option giving $\sigma_1/\sqrt{n_1}-\sigma_2/\sqrt{n_2}$ uses an incorrect formula: variances add, so the standard deviation is $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$. The option claiming only $n_1$ matters is wrong because both sample sizes affect variability. The option claiming no variability is incorrect because sampling variability always exists when taking samples.
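A short simulation can illustrate that the center does not depend on the sample sizes. The sketch below assumes normal populations with hypothetical means and standard deviations (none are given in the question); `sample_mean` and `mean_of_differences` are names introduced for the example.

```python
import random
import statistics

random.seed(1)

# Hypothetical population parameters (minutes); not given in the question.
MU_1, SIGMA_1 = 42.0, 10.0
MU_2, SIGMA_2 = 38.0, 10.0

def sample_mean(mu, sigma, n):
    """Mean of one random sample of size n from a normal population."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))

def mean_of_differences(n1, n2, reps=3000):
    """Average of x̄1 - x̄2 over many repeated pairs of independent samples."""
    diffs = [sample_mean(MU_1, SIGMA_1, n1) - sample_mean(MU_2, SIGMA_2, n2)
             for _ in range(reps)]
    return statistics.fmean(diffs)

center_25_64 = mean_of_differences(25, 64)   # sample sizes from the question
center_5_200 = mean_of_differences(5, 200)   # very different sample sizes

print(f"mu_1 - mu_2 = {MU_1 - MU_2}")
print(f"n = (25, 64): {center_25_64:.2f}")
print(f"n = (5, 200): {center_5_200:.2f}")
```

Both simulated centers should land close to the assumed $\mu_1-\mu_2 = 4$; only the spread around that center changes with the sample sizes.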

4

A school compares two study programs by repeatedly taking random samples of students from each program. Each time, a random sample of $n_1=40$ students from Program A and an independent random sample of $n_2=40$ students from Program B are selected, and the mean exam score is computed for each group. The statistic of interest is $\bar{x}_A-\bar{x}_B$. Which statement is correct about the sampling distribution of $\bar{x}_A-\bar{x}_B$?

It is centered at $0$ whenever the two sample sizes are equal.

It describes the distribution of individual score differences between one student from A and one student from B.

Its standard deviation is $\sigma_A-\sigma_B$ because the statistic is a difference.

It has no variability because the same two programs are being compared each time.

It is centered at $\mu_A-\mu_B$, and its standard deviation decreases when either $n_1$ or $n_2$ increases.

Explanation

This question tests understanding of the sampling distribution of differences in sample means. The sampling distribution of $\bar{x}_A-\bar{x}_B$ is centered at the difference in population means, $\mu_A-\mu_B$, because each sample mean is an unbiased estimator of its population mean. The standard deviation of this distribution is $\sqrt{\frac{\sigma_A^2}{n_1} + \frac{\sigma_B^2}{n_2}}$, which decreases as either sample size increases. The option centering the distribution at $0$ is incorrect because the center depends on the population means, not the sample sizes. The option claiming no variability is wrong because sampling variability exists even when the same programs are compared repeatedly. The option giving $\sigma_A-\sigma_B$ states the standard deviation formula incorrectly. The remaining option confuses the sampling distribution with the distribution of individual score differences.
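The standard deviation formula can be compared against a simulation. This is a sketch under assumed conditions: the populations are taken to be normal with hypothetical means and standard deviations, since the question does not specify them.

```python
import math
import random
import statistics

random.seed(2)

# Hypothetical population parameters (exam points); not given in the question.
MU_A, SIGMA_A = 75.0, 10.0
MU_B, SIGMA_B = 70.0, 12.0
N_A = N_B = 40  # sample sizes from the question

def sample_mean(mu, sigma, n):
    """Mean of one random sample of size n from a normal population."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))

diffs = [sample_mean(MU_A, SIGMA_A, N_A) - sample_mean(MU_B, SIGMA_B, N_B)
         for _ in range(4000)]

simulated_sd = statistics.stdev(diffs)
formula_sd = math.sqrt(SIGMA_A**2 / N_A + SIGMA_B**2 / N_B)  # variances add
naive_sd = SIGMA_A - SIGMA_B  # incorrect "subtract the SDs" rule; negative here

print(f"simulated: {simulated_sd:.3f}")
print(f"formula:   {formula_sd:.3f}")
print(f"naive:     {naive_sd:.3f}")
```

With these assumed values the naive rule gives a negative number, which cannot be a standard deviation, while the simulated spread should sit close to the formula.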

5

A teacher compares average time (seconds) to complete a puzzle under two conditions: quiet room (Q) and music playing (M). In repeated sampling, independent random samples of $n_Q=36$ and $n_M=64$ are taken and $\bar{x}_Q-\bar{x}_M$ is computed. Which statement is correct about the variability of $\bar{x}_Q-\bar{x}_M$?

If both sample sizes were doubled, the sampling distribution of $\bar{x}_Q-\bar{x}_M$ would typically have less spread.

The variability depends only on $n_Q$ because $\bar{x}_Q$ is listed first in the difference.

There is no spread because each sample mean equals the corresponding population mean.

The spread increases as sample sizes increase because more data create more possible values of $\bar{x}_Q-\bar{x}_M$.

The sampling distribution has the same spread as the distribution of puzzle times for individuals in condition Q.

Explanation

This question tests understanding of how sample size affects the variability of $\bar{x}_Q-\bar{x}_M$. The standard deviation of this sampling distribution is $\sqrt{\frac{\sigma_Q^2}{n_Q} + \frac{\sigma_M^2}{n_M}}$, which decreases as sample sizes increase. Doubling both sample sizes would cut each variance term in half, thereby reducing the overall standard deviation and spread of the sampling distribution. The option claiming the variability depends only on $n_Q$ ignores the contribution of the second sample. The option comparing the spread to that of individual puzzle times confuses the sampling distribution with the population distribution. The option claiming no spread is wrong because sample means vary from sample to sample. The option claiming the spread increases reverses the relationship: larger samples reduce spread in the sampling distribution.
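As a worked step, replacing $n_Q$ and $n_M$ with $2n_Q$ and $2n_M$ in the formula shows the size of the reduction, whatever the (unstated) values of $\sigma_Q$ and $\sigma_M$ are:

$$\sqrt{\frac{\sigma_Q^2}{2n_Q}+\frac{\sigma_M^2}{2n_M}} = \sqrt{\frac{1}{2}\left(\frac{\sigma_Q^2}{n_Q}+\frac{\sigma_M^2}{n_M}\right)} = \frac{1}{\sqrt{2}}\sqrt{\frac{\sigma_Q^2}{n_Q}+\frac{\sigma_M^2}{n_M}}$$

So the standard deviation shrinks to about 71% of its original value.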

6

A nutritionist compares mean sodium intake (mg) for two independent populations: people who eat breakfast daily (B) and people who do not (N). In repeated sampling, she takes an SRS of $n_B=10$ and an SRS of $n_N=10$ and calculates $\bar{x}_B-\bar{x}_N$. Which statement is correct about when the sampling distribution of $\bar{x}_B-\bar{x}_N$ will be approximately normal?

It will be approximately normal whenever the two population distributions are approximately normal.

It will be exactly normal for any population distributions because the sample sizes are equal.

It will be approximately normal only if the population means are equal.

It cannot be approximately normal unless $n_B$ and $n_N$ are both at least 100.

It will be approximately normal because a difference of two sample means always looks normal, regardless of sample size and population shape.

Explanation

This question asks about conditions for approximate normality of $\bar{x}_B-\bar{x}_N$ with small samples ($n_B=n_N=10$). Samples this small are not enough to rely on the Central Limit Theorem, so the populations themselves must be approximately normal for the sampling distribution of the difference to be approximately normal. That makes the option requiring approximately normal populations correct. The option claiming equal sample sizes guarantee exact normality is wrong. The option requiring both sample sizes to be at least 100 sets an unnecessarily high bar, since samples of 30 or more often suffice for the CLT. The option tying normality to equal population means is incorrect because the center of the distribution has nothing to do with its shape. The final option falsely claims that a difference of sample means always looks normal.
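One rough way to check this is to simulate the difference for normal populations with $n=10$ and compare it with the 68% and 95% benchmarks of a normal distribution. The sketch below uses hypothetical population parameters that are not from the question.

```python
import math
import random
import statistics

random.seed(3)

# Hypothetical normal populations (mg of sodium); parameters are assumptions.
MU_B, SIGMA_B = 3400.0, 600.0
MU_N, SIGMA_N = 3600.0, 800.0
N = 10  # both sample sizes, as in the question

def sample_mean(mu, sigma, n):
    """Mean of one random sample of size n from a normal population."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))

diffs = [sample_mean(MU_B, SIGMA_B, N) - sample_mean(MU_N, SIGMA_N, N)
         for _ in range(5000)]

center = MU_B - MU_N
sd = math.sqrt(SIGMA_B**2 / N + SIGMA_N**2 / N)

# A normal distribution has about 68% of values within 1 SD of its center
# and about 95% within 2 SDs.
within_1 = sum(abs(d - center) <= sd for d in diffs) / len(diffs)
within_2 = sum(abs(d - center) <= 2 * sd for d in diffs) / len(diffs)

print(f"within 1 SD: {within_1:.3f}")
print(f"within 2 SD: {within_2:.3f}")
```

Matching these two benchmarks is consistent with normality but is not a full test of it; a histogram or normal probability plot of `diffs` would give a fuller picture.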

7

A city compares mean monthly water use (gallons) for households with low-flow fixtures (L) versus standard fixtures (S). In repeated sampling, an SRS of $n_L=80$ and an SRS of $n_S=80$ are taken independently and $\bar{x}_L-\bar{x}_S$ is recorded. Which statement is correct about the center of the sampling distribution of $\bar{x}_L-\bar{x}_S$?

It is centered at the median of the combined population because means are sensitive to skew.

It is centered at $\mu_L+\mu_S$ because both groups contribute to the total.

It is centered at $\mu_L-\mu_S$.

It is centered at $0$ because sampling makes the two groups equal on average.

It is centered at $\bar{x}_L-\bar{x}_S$ from the most recent sample.

Explanation

This question asks about the center of the sampling distribution for $\bar{x}_L-\bar{x}_S$. A fundamental property of sampling distributions is that the expected value (mean) of $\bar{x}_L-\bar{x}_S$ equals $\mu_L-\mu_S$, the difference in population means. This holds true regardless of sample sizes, population shapes, or variability. The option centering the distribution at the most recent $\bar{x}_L-\bar{x}_S$ uses a single observed difference instead of the theoretical center. The option centering it at $0$ wrongly assumes sampling equalizes the groups. The option using $\mu_L+\mu_S$ adds the means instead of subtracting. The option using the median of the combined population confuses the mean with the median and incorrectly pools the two populations.
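The reasoning behind the center can be written in one line, using linearity of expectation and the fact that each sample mean is unbiased for its own population mean:

$$E(\bar{x}_L-\bar{x}_S) = E(\bar{x}_L) - E(\bar{x}_S) = \mu_L-\mu_S$$

Neither $n_L$ nor $n_S$ appears anywhere in this step, which is why the center is unaffected by the sample sizes.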

8

A manufacturer compares the mean lifetime of batteries from two production lines. In repeated sampling, an independent random sample of $n_1=16$ batteries from Line 1 and $n_2=16$ batteries from Line 2 is tested, and $\bar{x}_1-\bar{x}_2$ is recorded. Which statement is correct about how the standard deviation of the sampling distribution changes if both sample sizes are quadrupled (to $n_1=n_2=64$)?

It stays the same because the population standard deviations do not change.

It becomes 4 times as large because the sample sizes are 4 times as large.

It becomes about half as large because each standard error term involves $\sqrt{n}$.

It becomes 4 times smaller because the sample sizes are 4 times as large.

It becomes 2 times larger because there are two groups instead of one.

Explanation

This question examines how sample size affects the standard deviation of the sampling distribution. When both sample sizes are quadrupled from 16 to 64, each term under the square root in $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ is divided by 4, so the entire expression becomes half as large (since $\sqrt{1/4} = 1/2$). The options claiming the standard deviation becomes 2 or 4 times as large are incorrect because larger samples reduce spread. The option claiming it stays the same is wrong because sample size does affect spread. The option claiming it becomes 4 times smaller uses the wrong factor: the standard deviation scales with $1/\sqrt{n}$, not $1/n$.
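Written out with the sample sizes from the question, and with $\sigma_1$ and $\sigma_2$ left as unknowns:

$$\sqrt{\frac{\sigma_1^2}{64}+\frac{\sigma_2^2}{64}} = \sqrt{\frac{1}{4}\left(\frac{\sigma_1^2}{16}+\frac{\sigma_2^2}{16}\right)} = \frac{1}{2}\sqrt{\frac{\sigma_1^2}{16}+\frac{\sigma_2^2}{16}}$$

The factor of $\frac{1}{2}$ does not depend on the population standard deviations.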

9

Two independent random samples are repeatedly taken to compare average commute times. Each repetition selects $n_1=12$ commuters from City A and $n_2=48$ commuters from City B and computes $\bar{x}_A-\bar{x}_B$. Assume both population distributions are roughly symmetric with similar spread. Which statement is correct about how changing sample sizes affects the sampling distribution of $\bar{x}_A-\bar{x}_B$?

Increasing $n_2$ will shift the center of the sampling distribution to 0.

Increasing $n_2$ will eliminate sampling variability entirely because City B’s mean becomes known.

Changing either sample size affects only the center, not the spread, of the sampling distribution.

Increasing $n_2$ will not change the spread because the smaller sample size $n_1=12$ controls all variability.

Increasing $n_2$ (from 48 to a larger value) will reduce the spread of $\bar{x}_A-\bar{x}_B$, even if $n_1$ stays 12.

Explanation

This question examines how unequal sample sizes affect the sampling distribution. The standard deviation is $\sqrt{\frac{\sigma_A^2}{12} + \frac{\sigma_B^2}{48}}$. Increasing $n_2$ from 48 will reduce the second term under the square root, thereby reducing the overall standard deviation of $\bar{x}_A-\bar{x}_B$. The option claiming the smaller sample size controls all variability is incorrect because both sample sizes matter. The option claiming the center shifts to $0$ is wrong because sample size doesn't affect the center. The option claiming variability is eliminated is incorrect because the City A term remains no matter how large $n_2$ becomes. The option claiming sample size affects only the center is false: sample sizes affect spread, not center.
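The effect of raising $n_2$ while $n_1$ stays at 12 can be tabulated. This sketch assumes a single hypothetical standard deviation for both cities, since the question says only that the spreads are similar; `sd_diff` is a helper introduced for the example.

```python
import math

# Hypothetical common population standard deviation (minutes); an assumption.
SIGMA = 12.0
N1 = 12  # held fixed, as in the question

def sd_diff(n2):
    """SD of x̄_A - x̄_B with n1 fixed at N1 and n2 varying."""
    return math.sqrt(SIGMA**2 / N1 + SIGMA**2 / n2)

n2_values = [48, 96, 192, 768]
sds = [sd_diff(n2) for n2 in n2_values]

# The City A term alone sets a floor that increasing n2 cannot get below.
floor = SIGMA / math.sqrt(N1)

for n2, sd in zip(n2_values, sds):
    print(f"n2 = {n2:4d}: SD = {sd:.3f}")
print(f"floor (n2 -> infinity): {floor:.3f}")
```

The spread falls each time $n_2$ grows, but it levels off toward the floor set by the small City A sample, so variability is reduced without ever being eliminated.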

10

To compare two fertilizers, a researcher repeatedly takes an independent random sample of $n_1=10$ plants grown with Fertilizer A and $n_2=10$ plants grown with Fertilizer B, then records the mean height for each sample and computes $\bar{x}_A-\bar{x}_B$. Suppose the population of heights for each fertilizer is strongly right-skewed, with no extreme outliers. Which statement is correct about the sampling distribution of $\bar{x}_A-\bar{x}_B$?

Its center is $\bar{x}_A-\bar{x}_B$ because the statistic determines the mean of its own sampling distribution.

It describes the distribution of differences in height for pairs of individual plants, one from each fertilizer group.

It has no variability because the populations are fixed and known to be skewed.

It is likely to be closer to normal if both sample sizes were increased.

It must be exactly normal because it is a difference of two means.

Explanation

This question examines the effect of small sample sizes and skewed populations on the sampling distribution. With small samples ($n=10$ each) from strongly skewed populations, the sampling distribution of $\bar{x}_A-\bar{x}_B$ may not be approximately normal. However, increasing both sample sizes would bring the distribution closer to normal because of the Central Limit Theorem. The option claiming exact normality is too strong: the Central Limit Theorem gives only approximate normality, and only as sample sizes grow. The option centering the distribution at $\bar{x}_A-\bar{x}_B$ identifies the center incorrectly, since the center is $\mu_A-\mu_B$. The option describing pairs of individual plants misinterprets what the sampling distribution represents. The option claiming no variability is wrong because sampling variability exists regardless of population shape.
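A simulation can show the skew of the sampling distribution shrinking as the sample sizes grow. The sketch below assumes exponential populations with hypothetical means, which is only one possible right-skewed shape. The two populations are given different scales on purpose: with identical populations and equal sample sizes, the difference would be symmetric by construction and the effect would not be visible.

```python
import random
import statistics

random.seed(4)

# Hypothetical right-skewed populations: exponential heights with different
# means (cm). These are illustrative choices, not values from the question.
MEAN_A, MEAN_B = 20.0, 5.0

def sample_mean(pop_mean, n):
    # random.expovariate takes the rate, which is 1 / mean.
    return statistics.fmean(random.expovariate(1 / pop_mean) for _ in range(n))

def skewness(values):
    """Sample skewness: average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def skew_of_differences(n, reps=4000):
    """Skewness of x̄_A - x̄_B over many repeated pairs of samples of size n."""
    diffs = [sample_mean(MEAN_A, n) - sample_mean(MEAN_B, n) for _ in range(reps)]
    return skewness(diffs)

skew_n10 = skew_of_differences(10)
skew_n160 = skew_of_differences(160)

print(f"n = 10:  skewness = {skew_n10:.2f}")
print(f"n = 160: skewness = {skew_n160:.2f}")
```

A skewness near 0 is what a normal distribution would show, so the value for the larger samples should be noticeably closer to 0 than the value for $n=10$.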
