Chi-Square Homogeneity or Independence (Setup)
A restaurant chain wants to determine whether the distribution of satisfaction ratings (Poor, Fair, Good, Excellent) is the same across three locations. Managers randomly sample customers at each location and record one rating per customer. The results are shown below.
Which test is appropriate, and what are the correct hypotheses?
One-way ANOVA; $H_0$: the mean rating is the same at all locations; $H_a$: at least one mean differs
Chi-square test of independence; $H_0$: location and satisfaction rating are independent for this one random sample; $H_a$: they are associated
Chi-square goodness-of-fit test; $H_0$: ratings are equally likely at each location; $H_a$: ratings are not equally likely at each location
Chi-square test of homogeneity; $H_0$: the distribution of satisfaction ratings is the same for all locations; $H_a$: at least one location has a different distribution
Two-proportion $z$ test; $H_0$: the proportion Excellent is the same at all locations; $H_a$: at least one differs
Explanation
This question compares the distribution of satisfaction ratings across three restaurant locations, with a separate random sample of customers drawn at each location. Because the samples come from distinct populations (the three locations) and the goal is to test whether one categorical variable has the same distribution in each population, the appropriate procedure is a chi-square test of homogeneity. The null hypothesis states that the distribution of satisfaction ratings is the same for all locations; the alternative states that at least one location has a different distribution. The test of independence is incorrect because it applies to a single random sample with two variables recorded on each individual. One-way ANOVA is inappropriate because the ratings are categorical, not numerical. The goodness-of-fit choice tests whether ratings are equally likely rather than comparing locations, and the two-proportion $z$ test examines only one rating category rather than the full distribution.
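Once the test is chosen, the statistic and degrees of freedom follow from the two-way table. The sketch below is a minimal illustration, assuming made-up counts (the question's actual table is not reproduced here); the function name `chi_square_statistic` is hypothetical.

```python
# Hypothetical counts: rows are the three locations, columns are
# Poor, Fair, Good, Excellent. These are NOT the counts from the question.
observed = [
    [10, 20, 40, 30],
    [15, 25, 35, 25],
    [5, 15, 45, 35],
]


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under H0: (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # Degrees of freedom: (rows - 1) * (columns - 1).
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


statistic, df, expected = chi_square_statistic(observed)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

With three locations and four rating categories, the degrees of freedom are $(3-1)(4-1)=6$ regardless of the counts.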
A teacher wants to know whether participation in an optional review session is related to exam outcome. From a single random sample of 180 students enrolled in a course, the teacher records whether each student attended the review session (Yes/No) and whether they passed the exam (Pass/Fail). The results are shown in the two-way table. Which chi-square test is appropriate, and what are the correct hypotheses?
Note: One sample; two categorical variables.
Two-proportion $z$ test; $H_0$: $p_{\text{Pass,Attend}}=p_{\text{Pass,NoAttend}}$; $H_a$: not equal
Chi-square test for homogeneity; $H_0$: the distribution of attendance is the same among passers and failers; $H_a$: different distributions
Chi-square goodness-of-fit; $H_0$: pass and fail occur in a 50/50 split; $H_a$: not a 50/50 split
Chi-square test of independence; $H_0$: review-session attendance and exam outcome are independent; $H_a$: they are associated
Matched-pairs test; $H_0$: no difference before vs after review; $H_a$: a difference
Explanation
This design uses one random sample of 180 students with two categorical variables (attendance and exam outcome) recorded for each student, so the chi-square test of independence is appropriate. Its null hypothesis is that review-session attendance and exam outcome are independent; the alternative is that they are associated. The homogeneity choice is a distractor: that test requires separate random samples from each group, not the single sample used here. As a reminder, chi-square procedures include goodness-of-fit (one categorical variable compared with a claimed distribution), independence (two categorical variables recorded in one sample), and homogeneity (one categorical variable compared across several independently sampled groups). The goodness-of-fit choice tests a 50/50 split for exam outcome alone and ignores attendance, so it does not address the question.
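The three-way distinction in this explanation can be written as a small decision rule. This is a simplified sketch, not a complete classifier: the helper `choose_chi_square_test` and its parameters are hypothetical names introduced for illustration.

```python
def choose_chi_square_test(num_samples: int, num_variables: int) -> str:
    """Hypothetical helper: map a study design to a chi-square procedure.

    num_samples: number of independent random samples (groups) drawn.
    num_variables: number of categorical variables recorded on each individual.
    """
    if num_samples < 1 or num_variables < 1:
        raise ValueError("need at least one sample and one variable")
    if num_samples == 1 and num_variables == 1:
        return "goodness-of-fit"
    if num_samples == 1 and num_variables == 2:
        return "independence"
    if num_samples >= 2 and num_variables == 1:
        return "homogeneity"
    raise ValueError("design not covered by this simplified rule")


# The review-session study: one sample of 180 students, two variables.
review_session_test = choose_chi_square_test(num_samples=1, num_variables=2)
print(review_session_test)  # prints "independence"
```

The rule keys on how the data were collected, not on the shape of the table: the same two-way table could arise from either an independence or a homogeneity design.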
A researcher takes one random sample of 200 adults and records each person’s gender (Male, Female) and whether they prefer streaming movies or watching cable TV. The counts are shown below.
Which test is appropriate, and what are the correct hypotheses?
Chi-square goodness-of-fit test; $H_0$: the four cell counts match a uniform distribution; $H_a$: at least one cell differs from uniform
Chi-square test of independence; $H_0$: gender and viewing preference are independent in the population; $H_a$: gender and viewing preference are associated
Two-proportion $z$ test; $H_0$: the proportion who prefer streaming is the same for males and females; $H_a$: the proportions differ
Chi-square goodness-of-fit test; $H_0$: streaming and cable are equally preferred overall; $H_a$: they are not equally preferred
Chi-square test of homogeneity; $H_0$: the distribution of gender is the same for streaming and cable groups; $H_a$: the distributions differ
Explanation
This scenario involves one random sample of 200 adults in which two categorical variables (gender and viewing preference) are recorded for each person. Because there is a single sample and the goal is to test whether the two variables are related, the appropriate procedure is a chi-square test of independence. The null hypothesis states that gender and viewing preference are independent in the population, meaning that knowing someone's gender does not help predict their viewing preference. The homogeneity choice incorrectly treats the data as separate samples from different groups. The goodness-of-fit choice about streaming and cable being equally preferred addresses overall preference rather than the relationship between the variables. The two-proportion $z$ test is not the chi-square procedure that matches this one-sample, two-variable design, and the goodness-of-fit choice about the four cell counts misreads the goal as testing for a uniform distribution.
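Stated in symbols, independence means each joint probability factors into the product of its marginal probabilities, which is where the usual expected-count formula comes from. The notation below ($p_{ij}$, $p_{i\cdot}$, $p_{\cdot j}$, $n$) is introduced here for illustration and is not part of the original question.

$$H_0:\ p_{ij} = p_{i\cdot}\,p_{\cdot j}\ \text{for every cell } (i,j), \qquad \text{expected count}_{ij} = \frac{(\text{row } i \text{ total})(\text{column } j \text{ total})}{n}$$

For this $2 \times 2$ table with $n = 200$, the test has $(2-1)(2-1) = 1$ degree of freedom.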
A campus office takes one random sample of 240 students and records whether each student lives on campus (On/Off) and their primary mode of transportation to class (Walk, Bike, Bus). The results are shown below.
Which test is appropriate, and what are the correct hypotheses?
Chi-square goodness-of-fit test; $H_0$: Walk, Bike, and Bus are equally likely overall; $H_a$: they are not equally likely overall
Chi-square test of homogeneity; $H_0$: the distribution of transportation mode is the same for on-campus and off-campus students; $H_a$: the distributions differ
Two-proportion $z$ test; $H_0$: the proportion who walk is the same for on-campus and off-campus students; $H_a$: the proportions differ
Chi-square test of independence; $H_0$: residence status and transportation mode are independent in the population; $H_a$: they are associated
Chi-square goodness-of-fit test; $H_0$: the six cell counts follow a uniform distribution; $H_a$: they do not
Explanation
This scenario describes one random sample of 240 students in which two categorical variables (residence status and transportation mode) are recorded for each student. Because there is a single sample and the goal is to test whether the two variables are associated, the appropriate procedure is a chi-square test of independence. The null hypothesis states that residence status and transportation mode are independent in the population. The homogeneity choice wrongly treats the data as separate samples of on-campus and off-campus students. The goodness-of-fit choice about Walk, Bike, and Bus tests whether the modes are equally likely overall rather than their relationship with residence. The two-proportion $z$ test examines only walking rather than the full association, and the goodness-of-fit choice about the six cell counts misreads the goal as testing for a uniform distribution.
A health department wants to compare vaccination status across two counties. In County X, a random sample of 140 residents is surveyed; in County Y, a separate random sample of 160 residents is surveyed. Each resident is classified as Vaccinated or Not vaccinated. Results are shown below. Which chi-square test is appropriate, and what are the correct hypotheses?
(There are multiple groups: County X vs County Y.)
Chi-square test of independence; $H_0$: county and vaccination status are independent within each county sample; $H_a$: they are associated.
Chi-square test of homogeneity; $H_0$: the two counties have the same sample size; $H_a$: the sample sizes differ.
Chi-square test of homogeneity; $H_0$: the distribution of vaccination status is the same in Counties X and Y; $H_a$: the distributions differ.
Chi-square goodness-of-fit; $H_0$: vaccinated and not vaccinated are equally likely; $H_a$: they are not equally likely.
Two-proportion $z$ test; $H_0$: $p_X=p_Y$; $H_a$: $p_X\ne p_Y$.
Explanation
This question requires a chi-square test of homogeneity. The health department takes two separate random samples (140 residents from County X and 160 from County Y) and compares the distribution of vaccination status across the counties. The correct choice is the homogeneity test whose null hypothesis states that the distribution of vaccination status is the same in Counties X and Y. The other homogeneity choice is wrong because its hypotheses concern sample sizes, which are fixed by the design and are not what is being tested. Because vaccination status has only two categories, a two-sided two-proportion $z$ test would give an equivalent result, but it is not a chi-square test, and the chi-square approach is the more general one that extends to more than two categories. The independence choice is incorrect because the data come from separate samples from different populations, not one sample with two variables. The goodness-of-fit choice tests whether the two statuses are equally likely rather than comparing counties.
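The equivalence between the two-proportion $z$ test and the chi-square test for a $2 \times 2$ table can be checked numerically: the squared $z$ statistic equals the chi-square statistic. The sketch below uses the sample sizes from the question, but the vaccinated counts are hypothetical, since the question's table is not reproduced here.

```python
from math import sqrt

# County sample sizes come from the question; the vaccinated counts are
# hypothetical, chosen only to illustrate the calculation.
n_x, n_y = 140, 160
vaccinated_x, vaccinated_y = 98, 96  # hypothetical

# Two-proportion z statistic using the pooled proportion under H0: p_X = p_Y.
p_x, p_y = vaccinated_x / n_x, vaccinated_y / n_y
p_pooled = (vaccinated_x + vaccinated_y) / (n_x + n_y)
z = (p_x - p_y) / sqrt(p_pooled * (1 - p_pooled) * (1 / n_x + 1 / n_y))

# Chi-square statistic for the same data arranged as a 2x2 table.
observed = [
    [vaccinated_x, n_x - vaccinated_x],
    [vaccinated_y, n_y - vaccinated_y],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
chi_square = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / grand_total) ** 2
    / (row_totals[i] * col_totals[j] / grand_total)
    for i in range(2)
    for j in range(2)
)

print(f"z^2 = {z**2:.4f}, chi-square = {chi_square:.4f}")
```

The two printed values agree, which is why the two-sided $z$ test and the chi-square test (with 1 degree of freedom) lead to the same P-value for two groups and two categories.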
A zoologist wants to compare whether diet type (Herbivore, Omnivore, Carnivore) has the same distribution among animals in two different habitats (Forest vs. Grassland). The zoologist takes separate random samples of animals from each habitat and classifies each animal by diet type. The results are shown below.
Which test is appropriate, and what are the correct hypotheses?
Two-proportion $z$ test; $H_0$: the proportion carnivore is the same in both habitats; $H_a$: it differs
Chi-square goodness-of-fit test; $H_0$: the overall distribution of diet types is the same as a specified model; $H_a$: it differs
Chi-square test of homogeneity; $H_0$: the distribution of diet type is the same in Forest and Grassland habitats; $H_a$: the distributions differ
Chi-square goodness-of-fit test; $H_0$: each habitat has equal sample sizes in each diet category; $H_a$: at least one category differs
Chi-square test of independence; $H_0$: habitat and diet type are independent for one random sample; $H_a$: they are associated
Explanation
This question compares diet type distributions between Forest and Grassland habitats, with a separate random sample of animals taken from each habitat. Because the samples come from distinct populations (the two habitats) and the goal is to test whether the categorical distribution is the same across them, the appropriate procedure is a chi-square test of homogeneity. The null hypothesis states that the distribution of diet type is the same in both habitats. The independence choice is incorrect because it describes one random sample with two variables. The goodness-of-fit choice about a specified model tests against that model rather than comparing habitats. The two-proportion $z$ test examines only carnivores rather than the full distribution, and the goodness-of-fit choice about equal sample sizes focuses on sample sizes rather than diet distributions.
A public health team takes a single random sample of 180 adults and records two categorical variables: smoking status (Smoker/Non-smoker) and exercise frequency (0–1 days/week, 2–4 days/week, 5–7 days/week). The results are shown in the table. Which chi-square test is appropriate, and what are the correct hypotheses?
State clearly whether this is one sample or multiple groups.
Chi-square test of independence; $H_0$: smoking status and exercise frequency are independent in the population; $H_a$: they are associated
Two-proportion $z$ test; $H_0$: the proportion of smokers is the same in the 0–1 and 5–7 categories; $H_a$: it differs
ANOVA; $H_0$: mean exercise frequency is the same for smokers and non-smokers; $H_a$: it differs
Chi-square goodness-of-fit; $H_0$: exercise frequency is equally likely across the three categories; $H_a$: not equally likely
Chi-square test of homogeneity; $H_0$: the distribution of smoking status is the same across exercise categories; $H_a$: it differs
Explanation
The problem states "a single random sample of 180 adults" with two categorical variables recorded for each person: smoking status and exercise frequency. This is a one-sample design, so the question is whether the two variables are associated in the population that the sample represents. A chi-square test of independence is appropriate. The null hypothesis states that smoking status and exercise frequency are independent in the population, while the alternative states they are associated. The choice naming the chi-square test of independence is therefore correct. Recording both variables on each individual in a single sample distinguishes this from a homogeneity test, which would require separate samples from different populations.
A company wants to compare whether customers from two different regions (Region 1 and Region 2) have the same distribution of preferred customer-service contact method (Phone, Email, Chat, In-person). Independent random samples of customers are taken from each region. The results are shown in the table. Which chi-square test is appropriate, and what are the correct hypotheses?
State clearly whether this is one sample or multiple groups.
Chi-square test of independence; $H_0$: region and contact preference are independent in a single random sample; $H_a$: they are associated
Chi-square goodness-of-fit; $H_0$: each method is preferred by 25% of customers; $H_a$: not 25% each
Chi-square test of homogeneity; $H_0$: the distribution of preferred contact method is the same in Region 1 and Region 2; $H_a$: the distributions differ
Chi-square test of homogeneity; $H_0$: the column totals are equal; $H_a$: at least one column total differs
Two-proportion $z$ test; $H_0$: the proportion preferring Phone is the same in both regions; $H_a$: it differs
Explanation
The problem states "Independent random samples of customers are taken from each region," which indicates two separate samples (one from Region 1, one from Region 2). This is a multiple-groups design comparing distributions across independent samples, so it requires a chi-square test of homogeneity. The test asks whether the distribution of preferred contact method is the same in the two regions. The null hypothesis states that both regions have the same distribution of preferences, while the alternative states the distributions differ. The correct choice is the homogeneity test with these hypotheses, not the one whose hypotheses concern column totals. Sampling independently from each region creates separate groups for comparison, which distinguishes this from a test of independence.
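The homogeneity hypotheses can also be written in symbols. The notation $p_{1,m}$ and $p_{2,m}$ (the proportion of customers in Region 1 and Region 2 who prefer method $m$) is an assumption introduced here for illustration.

$$H_0:\ p_{1,m} = p_{2,m}\ \text{for each method } m \in \{\text{Phone, Email, Chat, In-person}\}, \qquad H_a:\ p_{1,m} \neq p_{2,m}\ \text{for at least one } m$$

With two regions and four contact methods, the test has $(2-1)(4-1) = 3$ degrees of freedom.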
A university takes a single random sample of 250 students and records two categorical variables: class year (First-year, Sophomore, Junior, Senior) and whether the student lives on campus (Yes/No). The results are shown in the table. Which chi-square test is appropriate, and what are the correct hypotheses?
State clearly whether this is one sample or multiple groups.
Two-proportion $z$ test; $H_0$: the proportion living on campus is the same for first-years and seniors; $H_a$: it differs
Chi-square goodness-of-fit; $H_0$: the two columns (Yes/No) have equal totals; $H_a$: they do not
Chi-square test of independence; $H_0$: class year and living on campus are independent in the population; $H_a$: they are associated
Chi-square test of homogeneity; $H_0$: the distribution of class year is the same for on-campus and off-campus students; $H_a$: it differs
Chi-square goodness-of-fit; $H_0$: class years occur equally often; $H_a$: not equally often
Explanation
The problem states "a single random sample of 250 students" with two categorical variables recorded for each student: class year and campus residence status. This is a one-sample design, so the question is whether the two variables are associated in the population that the sample represents. A chi-square test of independence is appropriate. The null hypothesis states that class year and living on campus are independent in the population, while the alternative states they are associated. The choice naming the chi-square test of independence is therefore correct. Recording both variables on each student in a single sample distinguishes this from a homogeneity test, which would require separate samples from different populations.
A sociologist takes a single random sample of 220 adults and records two categorical variables: employment status (Employed/Unemployed) and highest education level (High school or less, Some college, Bachelor’s or higher). The results are shown in the table. Which chi-square test is appropriate, and what are the correct hypotheses?
State clearly whether this is one sample or multiple groups.
Chi-square goodness-of-fit; $H_0$: education levels occur equally often; $H_a$: not equally often
Two-proportion $z$ test; $H_0$: the proportion employed is the same for “High school or less” and “Bachelor’s or higher”; $H_a$: it differs
Chi-square goodness-of-fit; $H_0$: the six cell probabilities are all equal; $H_a$: at least one differs
Chi-square test of independence; $H_0$: employment status and education level are independent in the population; $H_a$: they are associated
Chi-square test of homogeneity; $H_0$: the distribution of education level is the same for employed and unemployed adults; $H_a$: the distributions differ
Explanation
The problem explicitly states "a single random sample of 220 adults" with two categorical variables recorded for each adult: employment status and education level. This is a one-sample design, so the question is whether the two variables are associated in the population that the sample represents. A chi-square test of independence is appropriate. The null hypothesis states that employment status and education level are independent in the population, while the alternative states they are associated. The choice naming the chi-square test of independence is therefore correct. A single sample with two variables recorded on each adult is the key indicator for a test of independence; a test of homogeneity requires multiple independent samples.