Confidence Intervals: Slope of Regression Models

1

A real estate analyst selects 30 houses in a region and records $x$ = size (hundreds of square feet) and $y$ = selling price (thousands of dollars). The regression of price on size yields a 95% confidence interval for the population slope $\beta$ of $(8,\ 14)$. Which interpretation is correct?

We are 95% confident that for each additional 100 square feet, the mean selling price in the population increases by between $\$8{,}000$ and $\$14{,}000$.

There is a 95% probability that the selling price of a randomly selected house is between $\$8{,}000$ and $\$14{,}000$.

Because the interval does not include 0, the correlation between size and price is between 8 and 14.

We are 95% confident that the slope of the sample regression line will be between 8 and 14 for these same 30 houses if we recompute it.

We are 95% confident that 95% of houses increase in price by between $\$8{,}000$ and $\$14{,}000$ when size increases by 100 square feet.

Explanation

This question assesses interpreting a 95% confidence interval for the slope β in a regression of house price on size. The interval (8, 14) means we are 95% confident that β, the change in mean selling price per additional 100 square feet, is between $8,000 and $14,000, so choice A is correct. Choice C distracts by equating the interval to correlation, but correlation is unitless and always lies between -1 and 1. Choice E wrongly reads the interval as a statement about a percentage of houses rather than about the population mean. Mini-lesson: A slope confidence interval gives a range of plausible values for the true population rate of change, allowing for sampling error; an interval with both endpoints positive indicates an upward trend. Correct units matter because the slope depends on the scales of the variables, and the interval neither establishes causation nor predicts individual outcomes.
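The point about units and correlation can be checked numerically. The sketch below uses made-up data (the house sizes, the assumed true slope of 11, and the noise level are illustrative assumptions, not values from the question) and a hypothetical helper `slope_and_r`. It suggests that re-expressing size in square feet instead of hundreds of square feet rescales the slope by a factor of 100 while leaving the correlation unchanged.

```python
import random

def slope_and_r(x, y):
    """Return the least-squares slope of y on x and the correlation r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

random.seed(0)
# Hypothetical sample: 30 houses, size in hundreds of square feet,
# price in thousands of dollars (true slope 11 is an assumed value).
size_hundreds = [random.uniform(10, 35) for _ in range(30)]
price = [50 + 11 * s + random.gauss(0, 40) for s in size_hundreds]

# Same houses, with size re-expressed in square feet.
size_sqft = [100 * s for s in size_hundreds]

b_hundreds, r_hundreds = slope_and_r(size_hundreds, price)
b_sqft, r_sqft = slope_and_r(size_sqft, price)

print(f"slope per 100 sq ft: {b_hundreds:.3f}, r = {r_hundreds:.3f}")
print(f"slope per 1 sq ft:   {b_sqft:.5f}, r = {r_sqft:.3f}")
```

Because the slope changes with the choice of units and the correlation does not, an interval such as (8, 14) cannot double as an interval for a correlation.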

2

A restaurant manager samples 18 days, recording $x$ = number of online ads purchased that day and $y$ = total sales (dollars). The regression of sales on ads gives a 95% confidence interval for the population slope $\beta$ of $(15,\ 60)$. Which interpretation is correct?

We are 95% confident that for each additional online ad purchased, the mean daily sales in the population increase by between $\$15$ and $\$60$.

Because 0 is not in the interval, 95% of days will have sales between $\$15$ and $\$60$.

We are 95% confident that increasing ads by 1 will cause sales to increase by between $\$15$ and $\$60$.

We are 95% confident that the correlation between ads and sales is between 15 and 60.

There is a 95% probability that the true slope equals the midpoint of the interval.

Explanation

This question tests interpreting a 95% confidence interval for the slope β in a regression of sales on online ads. The interval (15, 60) indicates we are 95% confident that β, the change in mean daily sales per additional ad, lies between $15 and $60, so choice A is correct. Choice C distracts by implying causation from the interval, but confidence intervals do not confirm causal links. Choice D confuses the slope with correlation, which is unitless and lies between -1 and 1. Mini-lesson: A slope confidence interval expresses the uncertainty in estimating the population parameter β, the mean change in the response per unit of the predictor; an interval with both endpoints positive suggests an increasing relationship. A level such as 95% means that, over repeated sampling, about 95% of intervals built this way would capture β, not that 95% of data points fall within the interval.
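The repeated-sampling meaning of the confidence level can be illustrated with a small simulation. This is a sketch under stated assumptions: the population slope `TRUE_BETA`, the intercept, the noise level, and the ad counts are invented for illustration, and `T_STAR` is the t-table critical value for 95% confidence with 18 - 2 = 16 degrees of freedom. It draws many samples of 18 days, builds the usual interval $b \pm t^* \cdot SE_b$ from each, and counts how often the interval contains the assumed true slope.

```python
import random

T_STAR = 2.120  # t critical value, 95% confidence, df = 16 (from a t table)

def slope_interval(x, y, t_star):
    """Return (low, high) for the slope interval b +/- t* * SE_b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = (sse / (n - 2)) ** 0.5 / sxx ** 0.5
    return b - t_star * se_b, b + t_star * se_b

random.seed(1)
TRUE_BETA = 40.0                  # assumed population slope (dollars per ad)
ads = [i % 6 for i in range(18)]  # hypothetical ad counts for 18 days
trials = 5000
hits = 0
for _ in range(trials):
    # Hypothetical daily sales: linear in ads plus normal noise.
    sales = [900 + TRUE_BETA * x + random.gauss(0, 60) for x in ads]
    low, high = slope_interval(ads, sales, T_STAR)
    if low <= TRUE_BETA <= high:
        hits += 1

coverage = hits / trials
print(f"share of intervals that captured beta: {coverage:.3f}")
```

Under these assumptions the printed share should land near 0.95. Any single interval either contains β or does not; the 95% describes the procedure.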

3

A biologist uses linear regression to predict plant height (cm) from hours of sunlight per day for a random sample of 22 plants. A 98% confidence interval for the slope is $(-0.3,\ 1.9)$ cm per hour. Which interpretation is correct?

We are 98% confident that for each additional hour of sunlight, the population mean plant height changes by between -0.3 and 1.9 cm, on average.

About 98% of plants will grow between -0.3 and 1.9 cm for each extra hour of sunlight.

Because 0 is in the interval, the correlation between sunlight and height is 0.

There is a 98% chance that the true slope is between -0.3 and 1.9 for this sample.

We are 98% confident that the correlation is between -0.3 and 1.9.

Explanation

This question examines interpretation when a confidence interval includes both positive and negative values. The interval (-0.3, 1.9) contains 0, so at the 98% confidence level we cannot determine whether the relationship is positive or negative. Option A correctly interprets this: we are 98% confident that for each additional hour of sunlight, the population mean plant height changes by between -0.3 and 1.9 cm. Option B misapplies the interval to individual plants. Option C incorrectly concludes the correlation is exactly 0. Option D wrongly treats the confidence level as a probability about the true slope for this one sample. Option E confuses slope with correlation values. Key insight: when an interval contains 0, the relationship could be positive, negative, or zero in the population.

4

A marketing team samples 22 weeks, recording $x$ = number of promotional emails sent (in thousands) and $y$ = weekly revenue (in thousands of dollars). The regression of revenue on emails gives a 95% confidence interval for the population slope $\beta$ of $(0.0,\ 2.5)$ (with the lower endpoint rounded to 0.0). Which interpretation is correct?

We are 95% confident that the correlation between emails and revenue is between 0.0 and 2.5.

There is a 95% probability that weekly revenue will increase by between $\$0$ and $\$2{,}500$ when 1,000 more emails are sent.

Because the interval includes 0.0, the slope is exactly 0, so revenue and emails are uncorrelated.

Since the interval's lower endpoint is 0.0, it proves sending more emails cannot decrease revenue.

We are 95% confident that for each additional 1,000 emails sent, the mean weekly revenue in the population increases by between 0.0 and 2.5 thousand dollars.

Explanation

This question examines interpreting a 95% confidence interval for the slope β in a regression of revenue on emails sent. The interval (0.0, 2.5) means we are 95% confident that β, the change in mean weekly revenue per 1,000 additional emails, is between $0 and $2,500, as choice E states. Choice C is a distractor that incorrectly asserts that including zero means the slope is exactly zero and the variables are uncorrelated; zero is just one plausible value. Choice B misapplies probability to individual weekly revenue changes rather than to the mean. Mini-lesson: Slope confidence intervals provide a range of plausible values for the population's mean change per unit; a lower endpoint that rounds to 0.0 means a slope near zero remains plausible. They do not prove a direction and do not apply to correlations, and the confidence level describes the method's reliability over many samples, not a single instance.

5

An environmental scientist models ozone level (ppb) as a function of daily high temperature ($^\circ$F) using data from 25 randomly selected days. A 90% confidence interval for the regression slope is $(-1.8,\ -0.4)$ ppb per $^\circ$F. Which interpretation is correct?

There is a 90% probability that the true slope is negative.

If we repeated the study many times, 90% of the time the sample slope would equal a value between -1.8 and -0.4 exactly.

About 90% of individual days will have ozone levels that drop by between 0.4 and 1.8 ppb for each 1$^\circ$F increase.

Because the interval is negative, the correlation must be between -1.8 and -0.4.

We are 90% confident that for each 1$^\circ$F increase in temperature, the population mean ozone level decreases by between 0.4 and 1.8 ppb, on average.

Explanation

This question involves interpreting a confidence interval for slope when predicting ozone from temperature. The interval (-1.8, -0.4) is entirely negative, indicating an inverse relationship. Option E correctly states that we are 90% confident the population mean ozone level decreases by between 0.4 and 1.8 ppb for each 1°F increase in temperature. Option A incorrectly assigns probability to the parameter. Option B misunderstands what repeated sampling would show. Option C wrongly applies the interval to individual days rather than the population mean. Option D confuses slope values with correlation values (correlation must be between -1 and 1). Key insight: negative slopes indicate inverse relationships, and confidence intervals describe population parameters, not individual observations.

6

A biologist measures 20 plants of the same species, recording $x$ = hours of sunlight per day and $y$ = weekly growth (cm). The regression of growth on sunlight yields a 99% confidence interval for the population slope $\beta$ of $(-0.4,\ 1.6)$. Which interpretation is correct?

We are 99% confident that the correlation between sunlight and growth is between $-0.4$ and $1.6$.

There is a 99% chance that plants will grow between $-0.4$ and $1.6$ cm more each week for every extra hour of sunlight.

We are 99% confident that for each additional hour of sunlight, the mean weekly growth in the population changes by between $-0.4$ and $1.6$ cm.

Because 0 is in the interval, the slope is 0, so there is no relationship between sunlight and growth.

Since the interval includes both negative and positive values, sunlight has no effect on growth for any plant.

Explanation

This question tests understanding a 99% confidence interval for the slope β in a regression of plant growth on sunlight hours. The interval (-0.4, 1.6) means we are 99% confident that β, the change in mean weekly growth per extra hour of sunlight, is between -0.4 and 1.6 cm, as choice C states. A frequent distractor is choice D, which erroneously concludes that including zero means the slope is exactly zero and no relationship exists; it only means zero is plausible. Choice E overgeneralizes the interval's inclusion of negatives and positives to claim no effect for any plant, ignoring variability. Mini-lesson: Confidence intervals for slopes estimate the plausible range for the population's mean change in the response per unit increase in the predictor. For the same data, a higher confidence level produces a wider interval, so greater confidence comes at the cost of precision. An interval that contains zero indicates the data are consistent with no linear association without proving it.
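How the confidence level relates to interval width can be sketched numerically. The code below assumes the interval in the question is the usual t interval $b \pm t^* \cdot SE_b$ with 20 - 2 = 18 degrees of freedom, recovers the implied point estimate and standard error from the 99% interval, and then rebuilds the interval at other levels. The critical values in `T_STAR_DF18` are copied from a standard t table; the 90% and 95% intervals are derived for illustration and do not appear in the question.

```python
# t critical values for df = 18, copied from a standard t table.
T_STAR_DF18 = {0.90: 1.734, 0.95: 2.101, 0.99: 2.878}

low_99, high_99 = -0.4, 1.6           # interval given in the question
b = (low_99 + high_99) / 2            # point estimate: midpoint of the interval
margin_99 = (high_99 - low_99) / 2    # margin of error at 99%
se_b = margin_99 / T_STAR_DF18[0.99]  # implied standard error of the slope

intervals = {}
for level, t_star in T_STAR_DF18.items():
    margin = t_star * se_b
    intervals[level] = (b - margin, b + margin)
    print(f"{level:.0%}: ({b - margin:.2f}, {b + margin:.2f})")
```

The same estimate and standard error give a narrower interval at 90% than at 99%. The extra width is the price of the higher confidence level, not a sign of more information about β.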

7

A meteorologist uses data from a random sample of 20 days to relate humidity ($x$, percent) to the maximum temperature ($y$, degrees F). A least-squares regression line predicts $y$ from $x$. A 90% confidence interval for the true slope is $(-0.30,\ 0.05)$ degrees F per percent humidity. Which interpretation is correct?

We are 90% confident that for each 1% increase in humidity, the mean maximum temperature in the population changes by between $-0.30$ and $0.05$ degrees F, on average.

We are 90% confident that the correlation between humidity and maximum temperature is between $-0.30$ and $0.05$.

Because 0 is in the interval, the slope in the population must be 0.

There is a 90% chance that the true slope is between $-0.30$ and $0.05$ after seeing the data.

90% of individual days will have maximum temperature changes between $-0.30$ and $0.05$ degrees F for each 1% increase in humidity.

Explanation

This question involves a confidence interval (-0.30, 0.05) that contains zero. Option A correctly interprets this as being 90% confident that for each 1% increase in humidity, the mean maximum temperature changes by between -0.30 and 0.05 degrees F. Option B confuses slope with correlation, Option C incorrectly concludes the slope must be 0, Option D misinterprets confidence as posterior probability, and Option E applies the interval to individual days rather than the population mean. Key insight: when 0 is in the confidence interval, we cannot determine the direction of the relationship at that confidence level; the true slope could be positive, negative, or zero.
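The key insight about intervals that contain 0 amounts to a simple check on the endpoints. As a minimal sketch (the helper name `slope_direction` is hypothetical), applied to intervals quoted in questions 1, 3, 5, and 7:

```python
def slope_direction(low, high):
    """Describe what a slope interval (low, high) says about direction."""
    if low > 0:
        return "positive"
    if high < 0:
        return "negative"
    return "undetermined"  # 0 is a plausible value of the slope

# Intervals quoted in questions 1, 3, 5, and 7 of this set.
examples = {
    "price vs size (95%)": (8, 14),
    "height vs sunlight (98%)": (-0.3, 1.9),
    "ozone vs temperature (90%)": (-1.8, -0.4),
    "temperature vs humidity (90%)": (-0.30, 0.05),
}
for label, (low, high) in examples.items():
    print(f"{label}: {slope_direction(low, high)}")
```

"Undetermined" here means only that the data do not settle the sign at the stated confidence level. It does not mean the slope is 0.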

8

An economist uses data from a random sample of 25 cities to study the relationship between median rent ($y$, dollars) and distance from the city center ($x$, miles). A least-squares regression line predicts rent from distance. A 95% confidence interval for the true slope is $(-85,\ -20)$ dollars per mile. Which interpretation is correct?

We are 95% confident that the mean rent in the population decreases by between $\$20$ and $\$85$ for each additional mile from the city center, on average.

Because the interval does not include 0, the rent must decrease by exactly $\$85$ per mile in the population.

We are 95% confident that $r$ is between $-85$ and $-20$.

There is a 95% chance that the true slope is between $-85$ and $-20$ because this interval was computed from the sample.

95% of individual city rents will decrease by between $\$20$ and $\$85$ when distance increases by 1 mile.

Explanation

This question tests interpretation of a negative confidence interval for slope in an economics context. The interval (-85, -20) means we're 95% confident the true slope is between -85 and -20 dollars per mile. Option A correctly interprets this as the mean rent decreasing by between $20 and $85 for each additional mile from the city center (note the positive phrasing of a negative relationship). Option B makes an unfounded claim about the exact value, Option C confuses slope with correlation, Option D incorrectly assigns probability after seeing the data, and Option E misapplies the interval to individual cities. Remember: confidence intervals describe our uncertainty about population parameters, not variability in individual observations.
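The move from a negative slope to a positive decrease is a sign flip applied to both endpoints, which also swaps their order. As a short worked step, writing $\beta$ for the true slope (a symbol this question does not itself use):

$$-85 < \beta < -20 \quad\Longleftrightarrow\quad 20 < -\beta < 85$$

Here $-\beta$ is the drop in mean rent, in dollars, per additional mile from the city center.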

9

A nutrition scientist samples 22 adults and measures daily fiber intake ($x$, grams) and LDL cholesterol ($y$, mg/dL). A least-squares regression line predicts LDL from fiber intake. A 95% confidence interval for the true slope is $(-1.9,\ -0.2)$ mg/dL per gram. Which interpretation is correct?

We are 95% confident that the correlation between fiber and LDL is between $-1.9$ and $-0.2$.

Because the interval is negative, fiber intake causes LDL to decrease for every individual.

If fiber intake increases by 1 gram, then 95% of individuals will reduce LDL by between 0.2 and 1.9 mg/dL.

There is a 95% chance that the slope is between $-1.9$ and $-0.2$ mg/dL per gram.

We are 95% confident that for each additional gram of fiber, the mean LDL cholesterol in the population decreases by between 0.2 and 1.9 mg/dL, on average.

Explanation

This question presents a negative confidence interval (-1.9, -0.2) in a health context. Option E correctly states we're 95% confident that for each additional gram of fiber, the mean LDL cholesterol decreases by between 0.2 and 1.9 mg/dL in the population. Option A confuses slope with correlation (correlation has no units and lies between -1 and 1), Option B makes an unfounded causal claim about every individual, Option C incorrectly applies the interval to individual people, and Option D misinterprets confidence as probability. Important distinction: regression describes associations on average, not deterministic relationships for every individual, and confidence intervals quantify our uncertainty about population parameters.
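The distinction between an average association and individual outcomes can be illustrated with a simulation. Everything numeric here is an assumption made for illustration: a population slope `BETA` of -1 mg/dL per gram (a value inside the interval), an intercept of 140, and a person-to-person spread `NOISE_SD` of 25 mg/dL. The sketch compares pairs of different, randomly chosen adults whose fiber intake differs by 1 gram.

```python
import random

random.seed(2)
BETA = -1.0      # assumed population slope (mg/dL per gram of fiber)
NOISE_SD = 25.0  # assumed person-to-person spread around the line

def ldl(fiber):
    """Hypothetical LDL for one randomly chosen adult with this fiber intake."""
    return 140 + BETA * fiber + random.gauss(0, NOISE_SD)

pairs = 20000
lower_with_more_fiber = 0
total_diff = 0.0
for _ in range(pairs):
    fiber = random.uniform(10, 40)
    diff = ldl(fiber + 1) - ldl(fiber)  # two different adults, 1 gram apart
    total_diff += diff
    if diff < 0:
        lower_with_more_fiber += 1

mean_diff = total_diff / pairs
share_lower = lower_with_more_fiber / pairs
print(f"mean difference: {mean_diff:.2f} mg/dL")
print(f"share of pairs where the higher-fiber adult has lower LDL: {share_lower:.2f}")
```

Under these assumptions the mean difference sits near the slope, yet the higher-fiber adult has the lower LDL in only a little over half of the pairs. A slope interval describes the first quantity and says nothing direct about the second.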

10

A city planner records data from 15 neighborhoods on $x$ = distance (miles) from downtown and $y$ = average monthly rent (dollars). A least-squares regression of rent on distance gives a 90% confidence interval for the slope $\beta$ of $(-220,\ -40)$. Which interpretation is correct?

We are 90% confident that for each additional mile from downtown, the mean rent in the population decreases by between $\$40$ and $\$220$ per month.

Because the interval contains negative values, the correlation must be between $-220$ and $-40$.

We are 90% confident that each additional mile causes rent to drop by between $\$40$ and $\$220$ per month.

There is a 90% probability that the slope for this fitted line is between $-220$ and $-40$.

Since 0 is not in the interval, exactly 90% of neighborhoods farther from downtown have lower rent.

Explanation

This question evaluates the interpretation of a 90% confidence interval for the slope β in a regression of rent on distance from downtown. The interval (-220, -40) means we are 90% confident that the true β, the change in mean monthly rent per additional mile, is between -220 and -40 dollars, that is, a decrease of 40 to 220 dollars, so choice A is correct. Choice C is a distractor because it reads causation into the interval, but confidence intervals do not establish cause-and-effect relationships. Choice B mistakenly equates the slope interval with the correlation coefficient, which is bounded between -1 and 1. Mini-lesson: A confidence interval for the regression slope provides a range of plausible values for the true population rate of change, accounting for sampling error; an interval with both endpoints negative indicates a negative association, and the confidence level reflects the long-run success rate of the interval method in capturing β.
