Residuals

Questions 1 - 10
1

Which of the following correctly describes the variables plotted on the axes of a standard residual plot?

The explanatory variable is plotted on the vertical axis, and the response variable is plotted on the horizontal axis.

The residuals are plotted on the vertical axis, and the explanatory variable is plotted on the horizontal axis.

The response variable is plotted on the vertical axis, and the residuals are plotted on the horizontal axis.

The residuals are plotted on the vertical axis, and the response variable is plotted on the horizontal axis.

Explanation

A standard residual plot graphs the residuals ($$y - \hat{y}$$) on the vertical (y) axis against the explanatory variable (x) on the horizontal axis. Sometimes, predicted values ($$\hat{y}$$) are used on the horizontal axis instead.
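
As a minimal sketch of how the points of a residual plot are built, the Python snippet below pairs each x-value with its residual. The data and the fitted line ($$\hat{y} = 1 + 2x$$) are hypothetical, chosen only for illustration, and `residual_plot_points` is an illustrative helper name.

```python
# Hypothetical data and a hypothetical fitted line (not taken from the question).
xs = [1, 2, 3]               # explanatory variable, horizontal axis
ys = [3.5, 4.5, 7.5]         # observed response values
intercept, slope = 1.0, 2.0  # assumed line: y_hat = 1 + 2x


def residual_plot_points(xs, ys, intercept, slope):
    """Return (x, residual) pairs, where residual = y - y_hat."""
    return [(x, y - (intercept + slope * x)) for x, y in zip(xs, ys)]


points = residual_plot_points(xs, ys, intercept, slope)
print(points)  # [(1, 0.5), (2, -0.5), (3, 0.5)]
```

Each pair gives the horizontal coordinate (x) and the vertical coordinate (the residual) of one point on the plot.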

2

Which of the following is the best conclusion based on this information?

Model B provides a better fit to the data than Model A because its sum of squared residuals is smaller.

Model A provides a better fit to the data than Model B because its sum of squared residuals is larger.

Neither model is a good fit, as the sum of squared residuals should be close to zero for any good fit.

Both models fit the data equally well; the choice depends on which model is simpler to interpret.

Explanation

The sum of squared residuals measures the total squared prediction error of a model. When two models are fit to the same data, the one with the smaller sum of squared residuals provides the better fit because the data points are, on average, closer to that model's predictions.
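
The comparison can be sketched in Python as follows. The observations and both sets of predictions are hypothetical (the question's own values are not reproduced here), and `sum_squared_residuals` is an illustrative helper name.

```python
def sum_squared_residuals(actual, predicted):
    """Sum of (y - y_hat)^2 over all observations."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))


# Hypothetical observations and predictions from two competing models.
actual = [3, 5, 7, 9]
pred_model_a = [2, 5, 8, 9]
pred_model_b = [3, 5, 6.5, 9]

ssr_a = sum_squared_residuals(actual, pred_model_a)  # 1 + 0 + 1 + 0 = 2
ssr_b = sum_squared_residuals(actual, pred_model_b)  # 0 + 0 + 0.25 + 0 = 0.25

# The model with the smaller sum of squared residuals fits these data better.
better = "B" if ssr_b < ssr_a else "A"
print(ssr_a, ssr_b, better)  # 2 0.25 B
```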

3

Which of the following conclusions is supported by the residual plot?

The variance of the test scores is the same for all students, regardless of the hours they studied.

The relationship between hours studied and test scores is a strong positive, but nonlinear, association.

The linear model is an appropriate way to describe the relationship between hours studied and test scores.

There is no statistical association between hours studied and test scores for these students.

Explanation

A residual plot that shows a random scatter of points around the zero line with no obvious pattern indicates that a linear model is appropriate for the data.

4

Which of the following is the most accurate statement based on this residual?

The house has $$\$12,000$$ less square footage than predicted for its price.

The model overestimated the selling price of this house by $$\$12,000$$.

The model underestimated the selling price of this house by $$\$12,000$$.

The actual selling price of the house was exactly $$\$12,000$$.

Explanation

A negative residual means the actual value is less than the predicted value ($$y < \hat{y}$$). This implies that the model's prediction was higher than the actual selling price, so the model overestimated the price.
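
The sign rule can be written as a small Python helper. This is only a sketch: `describe_residual` is a hypothetical name, and the value -12,000 is inferred from the answer choices and the explanation above.

```python
def describe_residual(residual):
    """Say how the model's prediction compares with the actual value.

    residual = actual - predicted, so a negative residual means the
    prediction was too high.
    """
    if residual < 0:
        return "overestimated"
    if residual > 0:
        return "underestimated"
    return "exact"


print(describe_residual(-12_000))  # overestimated
```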

5

In what sense is this line considered the 'best fit'?

It passes through the maximum number of data points.

It minimizes the sum of the squared residuals.

It minimizes the sum of the residuals, which must be a non-negative value.

It minimizes the number of points that are outliers.

Explanation

The least-squares regression line is defined as the line that minimizes the sum of the squared vertical distances (residuals) from the data points to the line.
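
To illustrate the minimizing property, the sketch below fits a line with the standard closed-form least-squares formulas and checks that nudging the intercept or slope always increases the sum of squared residuals. The data are hypothetical, and the helper names are illustrative.

```python
def fit_least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def ssr(xs, ys, intercept, slope):
    """Sum of squared residuals for the line y_hat = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_least_squares(xs, ys)
best = ssr(xs, ys, b0, b1)

# Try nearby lines: every one of them should have a larger SSR.
nearby = [
    ssr(xs, ys, b0 + d0, b1 + d1)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.2, 0.0, 0.2)
    if (d0, d1) != (0.0, 0.0)
]
print(all(best < other for other in nearby))  # True
```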

6

What does this pattern in the residual plot suggest?

The variability of the prediction errors is not constant across all levels of advertising spending.

The linear model is generally appropriate because the residuals appear to be centered around zero.

The data contains an influential point at a high level of spending that increases the spread.

The relationship between profits and advertising spending is curved rather than linear.

Explanation

A fan-shaped or cone-shaped pattern in a residual plot indicates that the variance of the residuals is not constant across values of the explanatory variable (a condition known as heteroscedasticity). This violates the equal-variance condition required for regression inference.
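
One informal way to see a fan shape numerically is to compare the spread of the residuals at low and high x-values. The sketch below does this with hypothetical residuals. It is a rough diagnostic, not a formal test for heteroscedasticity, and `spread_by_half` is an illustrative name.

```python
from statistics import pstdev


def spread_by_half(xs, residuals):
    """Standard deviation of residuals in the lower and upper half of x."""
    ordered = [r for _, r in sorted(zip(xs, residuals))]
    mid = len(ordered) // 2
    return pstdev(ordered[:mid]), pstdev(ordered[mid:])


# Hypothetical residuals that fan out as x increases.
spending = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.2, 0.3, -0.4, 1.5, -2.0, 2.5, -3.0]

low_spread, high_spread = spread_by_half(spending, residuals)
print(low_spread < high_spread)  # True: the spread grows with x
```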

7

How is the residual for this car calculated?

The residual is calculated as $$16,200 - 50,000$$.

The residual is calculated as $$15,000 - 16,200$$.

The residual is calculated as $$16,200 - 15,000$$.

The residual is calculated as $$50,000 - 15,000$$.

Explanation

The residual is defined as the difference between the actual observed value (y) and the value predicted by the regression model ($$\hat{y}$$). In this case, the calculation is $$y - \hat{y} = 16,200 - 15,000$$.

8

What is the residual for this plant?

5.8 cm

-0.8 cm

0.8 cm

6.2 cm

Explanation

First, calculate the predicted height using the regression equation: $$\hat{y} = 1.2 + 0.5(10) = 1.2 + 5 = 6.2$$ cm. The residual is the actual height minus the predicted height: $$\text{residual} = y - \hat{y} = 7.0 - 6.2 = 0.8$$ cm.
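
The same arithmetic can be checked in Python. The coefficients (1.2 and 0.5), the x-value (10), and the actual height (7.0 cm) come from the explanation above. The function name `predict_height` is an illustrative assumption.

```python
import math


def predict_height(x, intercept=1.2, slope=0.5):
    """Predicted height in cm from the regression equation y_hat = 1.2 + 0.5x."""
    return intercept + slope * x


actual_height = 7.0
predicted = predict_height(10)
residual = actual_height - predicted  # actual minus predicted

# Round for display, since floating-point subtraction is not exact.
print(round(predicted, 1), round(residual, 1))  # 6.2 0.8
```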

9

Which of the following is the best interpretation of this residual?

The student's midterm exam score was 5 points higher than their predicted final exam score.

The student's actual final exam score was 5 points higher than the score predicted by the model.

The student's actual final exam score was 5 points lower than the score predicted by the model.

The model's prediction was off by an average of 5 points for the students in the sample.

Explanation

A positive residual indicates that the actual value (y) is greater than the predicted value ($$\hat{y}$$). Therefore, the student's actual final exam score was 5 points above what the model predicted based on their midterm score.

10

Which of the following statements about the residuals from this least-squares regression line must be true?

The sum of the residuals is a positive value if the correlation is positive.

The sum of the residuals is equal to zero.

The sum of the absolute values of the residuals is equal to zero.

All residuals must be smaller in magnitude than the standard deviation of the y-values.

Explanation

A fundamental mathematical property of the least-squares regression line is that the sum of the residuals ($$\sum(y_i - \hat{y}_i)$$) is always zero.
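
A short derivation shows why. Writing the fitted line as $$\hat{y} = a + bx$$ (the symbols $$a$$ and $$b$$ are notation assumed here, not taken from the question), the least-squares intercept satisfies $$a = \bar{y} - b\bar{x}$$, so for $$n$$ data points:

$$\sum (y_i - \hat{y}_i) = \sum (y_i - a - b x_i) = n\bar{y} - na - nb\bar{x} = n(\bar{y} - a - b\bar{x}) = 0$$

In practice, a sum computed from rounded coefficients may differ slightly from zero.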
