Difference of Two Means (Test)

1

A school district wants to know whether a new reading program changes mean reading-comprehension scores. A random sample of 40 students used the new program (Group N) and a separate random sample of 38 students used the old program (Group O). A two-sample $t$ test for $\mu_N-\mu_O$ was conducted with hypotheses $H_0: \mu_N-\mu_O=0$ and $H_a: \mu_N-\mu_O>0$. The test produced a p-value of 0.018. Using $\alpha=0.05$, what conclusion is appropriate?

Fail to reject $H_0$; the samples prove the two sample means are equal.

Reject $H_0$; there is convincing evidence that the new program increases the population mean score compared with the old program.

Fail to reject $H_0$; there is not convincing evidence that the new program increases the mean score.

Reject $H_0$; the new program causes higher scores for every student in the district.

Reject $H_0$; there is convincing evidence that the old program increases the population mean score compared with the new program.

Explanation

This question tests how to state the conclusion of a hypothesis test for a difference of two means. Since the p-value (0.018) is less than $\alpha$ (0.05), we reject the null hypothesis. The alternative hypothesis states $\mu_N-\mu_O>0$, so we are testing whether the new program has a higher population mean score than the old program. Rejecting $H_0$ gives convincing evidence for the alternative: the new program increases the population mean score. The choice that says the samples "prove" equality is wrong on two counts: the correct decision here is to reject, and a hypothesis test never proves that means are equal. The choice claiming higher scores for "every student" overstates the conclusion, because the test only supports claims about population means. The choice favoring the old program reverses the direction of $H_a$.
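The decision rule in this explanation can be written as a short Python sketch. The function name `decide` and its return strings are illustrative assumptions, not part of the original question; the inputs are the p-value and significance level given in the question.

```python
def decide(p_value: float, alpha: float) -> str:
    """Return the test decision for a given p-value and significance level."""
    # Hypothetical helper: reject H0 only when the p-value is below alpha.
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Values from the question: p-value = 0.018, alpha = 0.05.
print(decide(0.018, 0.05))  # prints "reject H0"
```

The decision alone is not the full answer: the conclusion must still be worded in terms of the population means and the direction stated in $H_a$.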

2

A city compares mean response time (minutes) for two ambulance dispatch systems. Independent random samples of calls are taken from System 1 (S1) and System 2 (S2). A two-sample $t$ test for $\mu_{S1}-\mu_{S2}$ is performed with $H_0: \mu_{S1}-\mu_{S2}=0$ and $H_a: \mu_{S1}-\mu_{S2}>0$. The p-value is 0.11. Using $\alpha=0.10$, what conclusion is appropriate?

Fail to reject $H_0$; therefore the two systems have exactly the same mean response time in the population.

Fail to reject $H_0$; there is not convincing evidence that System 1 has a larger population mean response time than System 2.

Reject $H_0$; using System 1 causes slower response times for all calls.

Reject $H_0$; there is convincing evidence that System 1 has a larger population mean response time than System 2.

Reject $H_0$; there is convincing evidence that System 1 has a smaller population mean response time than System 2.

Explanation

This question compares ambulance response times with a right-tailed test. The p-value (0.11) is greater than $\alpha$ (0.10), so we fail to reject the null hypothesis. The alternative $H_a: \mu_{S1}-\mu_{S2}>0$ states that System 1 has a larger (slower) mean response time than System 2. Because we fail to reject $H_0$, we do not have convincing evidence that System 1 has a larger population mean response time. The choice concluding that the two systems have exactly the same mean treats failing to reject as proof of equality; we simply lack evidence of a difference. The choice claiming System 1 causes slower response times for all calls both rejects $H_0$ and makes an inappropriate causal claim. Whenever the p-value is greater than $\alpha$, we fail to reject $H_0$ and conclude there is insufficient evidence for the alternative hypothesis.

3

A school district wants to know whether a new online homework system changes mean weekly math quiz scores. A random sample of students used the new system (group N) and another random sample used the old system (group O). A two-sample $t$ test was performed for $H_0: \mu_N-\mu_O=0$ versus $H_a: \mu_N-\mu_O>0$. The test produced a p-value of 0.03. Using $\alpha=0.05$, what conclusion is appropriate?

Reject $H_0$; the new system caused individual students’ quiz scores to increase.

Fail to reject $H_0$; there is not convincing evidence that the new system increases the mean quiz score.

Reject $H_0$; there is convincing evidence that the old system increases the mean quiz score.

Reject $H_0$; there is convincing evidence that the new system increases the mean quiz score.

Fail to reject $H_0$; the two samples have the same mean quiz score, so the population means are equal.

Explanation

This question tests your ability to interpret a two-sample $t$ test for a difference of means. Since the p-value (0.03) is less than the significance level $\alpha=0.05$, we reject the null hypothesis. The alternative hypothesis states $\mu_N-\mu_O>0$, so we are testing whether the new system has a higher population mean than the old system. By rejecting $H_0$, we conclude there is convincing evidence that the new system increases the mean quiz score. The choice that infers equal population means from the sample means is incorrect: it fails to reject when the p-value calls for rejection, and sample data cannot establish equality. The choice claiming the new system raised individual students' scores is also incorrect, because the test supports a conclusion about the population mean, not about individuals. In a test for a difference of means, we compare the p-value to $\alpha$ and draw conclusions about population means, not individual values.

4

A psychologist studies whether a mindfulness app affects mean stress score (higher = more stress). Participants were randomly assigned to use the app (group A) or a placebo app (group P). A two-sample $t$ test was conducted for $H_0: \mu_A-\mu_P=0$ versus $H_a: \mu_A-\mu_P<0$. The p-value was 0.018. At $\alpha=0.05$, what conclusion is appropriate?

Reject $H_0$; the two sample means are different, so the app group mean must be exactly 0.018 lower.

Reject $H_0$; the mindfulness app is proven to reduce stress for every individual.

Fail to reject $H_0$; there is not convincing evidence that the mindfulness app lowers mean stress score.

Reject $H_0$; there is convincing evidence that the mindfulness app lowers mean stress score.

Reject $H_0$; there is convincing evidence that the mindfulness app raises mean stress score.

Explanation

This randomized experiment tests whether a mindfulness app lowers mean stress score, with $H_a: \mu_A-\mu_P<0$. The p-value of 0.018 is less than $\alpha=0.05$, so we reject the null hypothesis. This provides convincing evidence that the mindfulness app lowers mean stress score. The choice claiming the app is proven to reduce stress for every individual overstates the result, and the choice saying the app group mean must be exactly 0.018 lower confuses the p-value with the size of the difference. Even in randomized experiments, hypothesis tests provide evidence about population parameters (means), not guarantees about individual outcomes. The p-value measures the strength of the evidence against $H_0$, not the size of the effect.

5

A school compares mean time (in minutes) to complete a standardized math test for students using a paper booklet versus an online version. Two independent random samples were taken (paper: $n=40$; online: $n=38$). A two-sample $t$ test for $\mu_{paper}-\mu_{online}$ used $H_0:\mu_{paper}-\mu_{online}=0$ and $H_a:\mu_{paper}-\mu_{online}<0$. The p-value was 0.041. Using $\alpha=0.05$, what conclusion is appropriate?

Reject $H_0$; there is sufficient evidence that the population mean completion time is higher for paper than for online.

Reject $H_0$; there is sufficient evidence that the population mean completion time is lower for paper than for online.

Fail to reject $H_0$; there is sufficient evidence that the population mean completion time is lower for paper than for online.

Reject $H_0$; using paper causes students to finish faster.

Fail to reject $H_0$; the sample data show paper was faster, so paper is faster in the population.

Explanation

This question applies a two-sample $t$ test to the difference in mean completion times between paper and online math tests. The one-sided alternative $H_a:\mu_{paper}-\mu_{online}<0$ states that the population mean completion time is lower for paper. Since the p-value of 0.041 is less than $\alpha=0.05$, we reject $H_0$, which supports the conclusion that the population mean time is lower for paper. A common distractor is the choice that says "fail to reject" while claiming sufficient evidence, which confuses the decision rule. When a one-sided test rejects $H_0$, state the direction given by $H_a$, but do not extend the conclusion to causation, as the choice "using paper causes students to finish faster" does. Keep the focus on population inference: a sample difference alone, as in the choice that fails to reject yet declares paper faster in the population, does not establish a population difference without statistical significance.
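For readers who want to see where a p-value like this comes from, here is a minimal Python sketch of an unpooled (Welch) two-sample $t$ procedure computed from summary statistics. The question does not report sample means or standard deviations, so those values are hypothetical; only the sample sizes (40 and 38) and the left-tailed alternative come from the question. Whether the original test used the unpooled procedure is also an assumption. The function names are illustrative, and the p-value is obtained by integrating the $t$ density numerically with Simpson's rule rather than by calling a statistics library.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def p_value(t, df, alternative):
    """p-value for alternative 'less', 'greater', or 'two-sided'."""
    if alternative == "less":
        return t_cdf(t, df)
    if alternative == "greater":
        return 1 - t_cdf(t, df)
    return 2 * (1 - t_cdf(abs(t), df))


# Hypothetical summary statistics in minutes (mean, SD);
# only n = 40 (paper) and n = 38 (online) come from the question.
t, df = welch_t(52.0, 8.0, 40, 55.2, 8.5, 38)
p = p_value(t, df, "less")  # H_a: mu_paper - mu_online < 0
print(f"t = {t:.3f}, df = {df:.1f}, p-value = {p:.4f}")
```

Because the alternative is "less than", the p-value is the area to the left of the observed $t$ statistic; a right-tailed or two-tailed alternative would use the other branches of `p_value`.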

6

An environmental scientist tests whether the mean nitrate concentration (mg/L) differs between two nearby lakes, Lake 1 and Lake 2. Independent random water samples were collected from each lake. A two-sample $t$ test was performed for $H_0: \mu_1-\mu_2=0$ versus $H_a: \mu_1-\mu_2\ne 0$. The p-value was 0.049. Using $\alpha=0.05$, what conclusion is appropriate?

Fail to reject $H_0$; because 0.049 is close to 0.05, the result is inconclusive and no decision can be made.

Reject $H_0$; there is convincing evidence that Lake 1 has a higher mean nitrate concentration than Lake 2.

Reject $H_0$; the sampling shows the lakes’ nitrate means are different, so one lake caused the other to change.

Reject $H_0$; there is convincing evidence that the mean nitrate concentration differs between the two lakes.

Fail to reject $H_0$; there is not convincing evidence of a difference in mean nitrate concentration between the lakes.

Explanation

This two-sample $t$ test uses a two-tailed alternative hypothesis ($\mu_1-\mu_2\ne 0$) to test for any difference in mean nitrate concentration between the lakes. The p-value of 0.049 is just below $\alpha=0.05$, so we reject the null hypothesis. This provides convincing evidence that the mean nitrate concentration differs between the two lakes. The choice that names a direction (Lake 1 higher) is incorrect because the two-tailed alternative does not specify which lake has the higher mean. The choice that calls the result inconclusive because 0.049 is close to 0.05 is also incorrect. Hypothesis testing uses a clear decision rule: if the p-value is less than $\alpha$, we reject $H_0$, regardless of how close the two values are.

7

A sports scientist compares mean resting heart rate for two populations: endurance athletes (A) and nonathletes (N). Independent random samples are taken, and a two-sample $t$ test for $\mu_A-\mu_N$ is performed with $H_0: \mu_A-\mu_N=0$ and $H_a: \mu_A-\mu_N<0$. The p-value is 0.27. At the $\alpha=0.05$ level, what conclusion is appropriate?

Fail to reject $H_0$; there is not convincing evidence that athletes have a lower population mean resting heart rate than nonathletes.

Fail to reject $H_0$; therefore the two population means are equal.

Reject $H_0$; being an endurance athlete causes a lower resting heart rate in the population.

Reject $H_0$; there is convincing evidence that athletes have a higher population mean resting heart rate than nonathletes.

Reject $H_0$; there is convincing evidence that athletes have a lower population mean resting heart rate than nonathletes.

Explanation

This problem involves a left-tailed test for the difference between athlete and nonathlete mean resting heart rates. The p-value (0.27) is much larger than $\alpha$ (0.05), so we fail to reject the null hypothesis. The alternative hypothesis $H_a: \mu_A-\mu_N<0$ states that athletes have a lower mean resting heart rate than nonathletes. Since we fail to reject $H_0$, we do not have convincing evidence that athletes have a lower population mean resting heart rate. The choice concluding that the two population means are equal is incorrect; we simply lack evidence of a difference. Failing to reject $H_0$ never proves the null hypothesis is true; it only means we lack sufficient evidence against it.

8

A teacher compares mean quiz scores for students using two different study apps. Two independent random samples are taken: App A and App B. A two-sample $t$ test for $\mu_A-\mu_B$ is conducted with $H_0: \mu_A-\mu_B=0$ and $H_a: \mu_A-\mu_B\neq 0$. The p-value is 0.52. At $\alpha=0.05$, what conclusion is appropriate?

Fail to reject $H_0$; this proves both apps produce the same mean score for all students.

Fail to reject $H_0$; there is not convincing evidence of a difference in population mean quiz scores between the two apps.

Reject $H_0$; using App A causes higher quiz scores than using App B.

Reject $H_0$; there is convincing evidence that App A has a higher population mean quiz score than App B.

Reject $H_0$; there is convincing evidence that the population mean quiz scores differ between the two apps.

Explanation

This problem presents a two-tailed test comparing study apps. The p-value (0.52) is much larger than $\alpha$ (0.05), so we fail to reject the null hypothesis. With a two-tailed alternative ($H_a: \mu_A-\mu_B\neq 0$), failing to reject means we do not have convincing evidence of any difference between the population mean quiz scores. The choices that reject $H_0$ are incorrect because the large p-value clearly calls for failing to reject. The choice claiming the result "proves" both apps produce the same mean score is also wrong, since hypothesis tests never prove the null hypothesis. A large p-value simply means the sample data are consistent with the null hypothesis of no difference between population means.

9

A school district compares mean math test scores for students taught with Method 1 versus Method 2. Two independent random samples of students were selected (Method 1: $n=28$; Method 2: $n=30$). A two-sample $t$ test for $\mu_1-\mu_2$ was performed with $H_0:\mu_1=\mu_2$ and $H_a:\mu_1>\mu_2$. The p-value was 0.004. Using $\alpha=0.01$, what conclusion is appropriate?

Fail to reject $H_0$; there is not convincing evidence that Method 1 leads to a higher mean score than Method 2.

Reject $H_0$; there is convincing evidence that Method 1 has a higher population mean score than Method 2.

Reject $H_0$; there is convincing evidence that Method 2 has a higher population mean score than Method 1.

Reject $H_0$; Method 1 causes higher scores for every student.

Fail to reject $H_0$; the sample means are equal, so the population means are equal.

Explanation

This question uses a one-tailed test of whether Method 1 has a higher population mean score than Method 2 ($H_a: \mu_1>\mu_2$). The p-value (0.004) is less than $\alpha=0.01$, so we reject the null hypothesis. This provides convincing evidence that Method 1 has a higher population mean score than Method 2. The choice claiming higher scores for "every student" overstates the result, and the choice favoring Method 2 reverses the direction of the conclusion. Hypothesis tests support inferences about population parameters based on sample data, not claims about every individual in the population.

10

A public health researcher compares mean systolic blood pressure for adults who exercise regularly versus those who do not. Two independent random samples were taken (Exercise: $n=50$; No exercise: $n=48$). A two-sample $t$ test for $\mu_{\text{ex}}-\mu_{\text{no}}$ was conducted with $H_0:\mu_{\text{ex}}=\mu_{\text{no}}$ and $H_a:\mu_{\text{ex}}<\mu_{\text{no}}$. The p-value was 0.031. Using $\alpha=0.05$, what conclusion is appropriate?

Fail to reject $H_0$; there is not convincing evidence that exercisers have lower mean systolic blood pressure.

Reject $H_0$; there is convincing evidence that exercisers have higher mean systolic blood pressure than non-exercisers.

Reject $H_0$; exercising causes lower systolic blood pressure for every adult.

Fail to reject $H_0$; the sample means show the two groups have the same mean blood pressure.

Reject $H_0$; there is convincing evidence that the population mean systolic blood pressure is lower for adults who exercise regularly than for those who do not.

Explanation

This question uses a one-tailed test of whether exercisers have lower mean systolic blood pressure ($H_a: \mu_{\text{ex}}<\mu_{\text{no}}$). The p-value (0.031) is less than $\alpha=0.05$, so we reject the null hypothesis. This provides convincing evidence that the population mean systolic blood pressure is lower for adults who exercise regularly than for those who do not. The choice claiming that exercise causes lower blood pressure for "every adult" overstates the conclusion. The choice that fails to reject because the samples supposedly show equal means gets the decision wrong and misreads what failing to reject would mean. When we reject $H_0$ in a one-tailed test with a "less than" alternative, we conclude there is evidence for the specific directional claim in the alternative hypothesis.
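As a recap, the same rule applies to every question in this set. The Python sketch below pairs each question's p-value, significance level, and alternative (all taken from the question statements) with the resulting decision and the kind of conclusion it supports. The function name `conclude` and the wording templates are illustrative assumptions, not official answer text.

```python
def conclude(p_value, alpha, alternative):
    """Return (decision, conclusion template) for a test of mu1 - mu2.

    alternative is '>', '<', or '!=' and stands for H_a: mu1 - mu2 (alternative) 0.
    """
    claim = {
        ">": "population mean 1 is greater than population mean 2",
        "<": "population mean 1 is less than population mean 2",
        "!=": "the two population means differ",
    }[alternative]
    if p_value < alpha:
        return "reject H0", f"convincing evidence that {claim}"
    return "fail to reject H0", f"not convincing evidence that {claim}"


# (question number, p-value, alpha, alternative) as stated in the ten questions.
QUESTIONS = [
    (1, 0.018, 0.05, ">"), (2, 0.11, 0.10, ">"), (3, 0.03, 0.05, ">"),
    (4, 0.018, 0.05, "<"), (5, 0.041, 0.05, "<"), (6, 0.049, 0.05, "!="),
    (7, 0.27, 0.05, "<"), (8, 0.52, 0.05, "!="), (9, 0.004, 0.01, ">"),
    (10, 0.031, 0.05, "<"),
]

for number, p, alpha, alt in QUESTIONS:
    decision, wording = conclude(p, alpha, alt)
    print(f"Q{number}: {decision}; {wording}")
```

In every case the conclusion is about population means and follows the direction of $H_a$; none of the templates claims proof, equality, or an effect on every individual.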
