Scatter Plots

Questions 1 - 10
1

A scientist recorded the concentration of a chemical solution (x, in mg/L) and the light absorbance (y, in arbitrary units) for 13 samples. The scatterplot includes a dashed line of best fit and one point that appears unusual. If the outlier point were removed, which change would most likely occur to the correlation between x and y?

The positive correlation would likely become stronger (points would fit the trend more closely).

The correlation would likely disappear completely and become zero.

The correlation would not change because outliers never affect correlation.

The positive correlation would likely change to a negative correlation.

Explanation

The question asks how the correlation between chemical concentration and light absorbance would likely change if the outlier were removed. The scatterplot shows a positive trend with most points close to the dashed line, but one point deviates significantly, which weakens the correlation. Removing the outlier would likely strengthen the positive correlation because the remaining points fit the trend more closely, supporting choice A. An outlier like this one reduces correlation strength by increasing the spread around the line; without it, the line fits better. Choice D wrongly suggests a sign change, ignoring the overall positive pattern; choice C incorrectly states that outliers never affect correlation; and choice B is wrong because the remaining points still show a clear positive trend. Note an outlier's influence on a trend without assuming the point is an error. A useful strategy is to mentally redraw the line without the outlier to visualize the impact before choosing.
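
To see the effect numerically, here is a minimal Python sketch. The 13 data values are made up for illustration (the plot's actual values are not given), and `pearson_r` is a helper defined here rather than a library function.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: 12 points near y = 0.1x, plus one outlier as the last entry.
conc = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 6]
absorb = [0.12, 0.19, 0.31, 0.42, 0.48, 0.61, 0.69, 0.82, 0.88, 1.01, 1.12, 1.19, 0.10]

r_with = pearson_r(conc, absorb)                 # all 13 samples
r_without = pearson_r(conc[:-1], absorb[:-1])    # outlier dropped

print(f"r with outlier:    {r_with:.3f}")
print(f"r without outlier: {r_without:.3f}")
```

With these assumed values, the correlation stays positive in both cases and moves closer to 1 once the outlier is dropped, which is the pattern choice A describes.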

2

A fitness coach recorded the number of minutes each of 12 clients exercised in a week and the number of calories each client burned during those workouts. The scatterplot shows minutes exercised (in minutes) on the x-axis and calories burned (in calories) on the y-axis, with a dashed line of best fit. Based on the scatterplot, which statement best describes the relationship between minutes exercised and calories burned?

The scatterplot proves exercising longer causes a specific calorie increase for every person.

There is a positive association; more minutes generally corresponds to more calories burned.

There is little to no association; calories burned does not change with minutes exercised.

There is a strong negative association; more minutes generally means fewer calories burned.

Explanation

The question asks which statement best describes the relationship between minutes exercised and calories burned based on the scatterplot. The scatterplot displays points that generally increase from left to right, showing a positive trend with moderate spread around the dashed line of best fit and no obvious nonlinear pattern. This upward trend indicates that as minutes exercised increase, calories burned tend to increase, which is the positive association described in choice B. To confirm, observe that the line of best fit has a positive slope, meaning higher x-values correspond to higher predicted y-values. Choice A confuses association with causation: scatterplots describe patterns but do not prove cause and effect. Choice D incorrectly identifies a negative trend, and choice C overlooks the evident association. Remember, data literacy involves describing observed trends accurately without inferring unproven causality. A useful test-taking strategy is to mentally trace the overall direction of the points before reading the choices to avoid bias.

3

A teacher compared the number of practice problems completed (x, in problems) to quiz score (y, in percent) for 10 students. The scatterplot includes a dashed line of best fit. Based on the line of best fit, what is the predicted quiz score when a student completes $30$ practice problems?

65%

75%

85%

95%

Explanation

The question requires using the line of best fit to predict the quiz score for a student who completes 30 practice problems. The scatterplot shows points with a positive trend, clustered moderately around the dashed line, suggesting a reliable linear relationship without extreme outliers. To find the prediction, locate x=30 on the horizontal axis, trace vertically to the line of best fit, and read the corresponding y-value, which is approximately 85%, matching choice C. This is equivalent to substituting x=30 into the linear equation implied by the line. Common errors include misreading the scale or reading the y-value of a nearby data point instead of the line; a choice like 75% might result from underestimating the slope. Predictions from lines of best fit are estimates based on trends, not exact values. For such questions, always verify the axis labels and scale to ensure an accurate reading.
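
As a rough illustration of substituting into the line's equation, the Python sketch below fits a least-squares line to ten made-up (problems, score) pairs and evaluates it at x = 30. The data and the helper `fit_line` are assumptions for this example; the values were chosen so the result lands near 85%, and they are not taken from the actual plot.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data for 10 students (not read from the actual scatterplot).
problems = [4, 8, 12, 16, 20, 24, 28, 32, 36, 40]
scores = [48, 50, 60, 62, 72, 74, 84, 86, 96, 98]

slope, intercept = fit_line(problems, scores)

# Prediction: substitute x = 30 into the fitted equation.
prediction = slope * 30 + intercept
print(f"predicted score at 30 problems: {prediction:.1f}%")
```

The prediction is a point on the fitted line, not one of the observed scores, which is why it is an estimate rather than an exact value.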

4

A small bookstore tracked the number of hours a promotional sign was displayed outside (x, in hours) and the number of customers who entered during that time (y, in customers) for 9 different days. The scatterplot includes a dashed line of best fit. Which data point appears farthest from the line of best fit?

$(2, 18)$

$(8, 44)$

$(6, 20)$

$(4, 28)$

Explanation

The question seeks to identify the data point farthest from the line of best fit in the scatterplot of display hours versus customers entered. The scatterplot reveals a positive trend with points generally following the dashed line, but varying distances indicate residuals, with no clear clustering or nonlinearity. To determine the farthest point, visually compare vertical distances from each point to the line: (6,20) shows the largest deviation below the line compared to others like (2,18) or (8,44), supporting choice C. Quantify by estimating residuals, such as for (6,20) being about 10-15 units below the predicted value, larger than others. A key error is mistaking horizontal distance for vertical residual, which might lead to choosing (8,44) if not careful; also, avoid assuming the highest or lowest point is automatically the outlier. Data literacy stresses evaluating fit through residuals rather than point positions alone. In tests, systematically check each choice against the line to avoid overlooking subtle deviations.
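
The comparison of vertical distances can be sketched in Python. The four points are the answer choices; the line y = 4x + 10 is an assumed stand-in for the dashed line, chosen only so the numbers are roughly consistent with the explanation above, since the plot's actual equation is not given.

```python
# Assumed line of best fit (hypothetical; the real equation is not given).
SLOPE = 4.0
INTERCEPT = 10.0

def predicted(x):
    """y-value on the assumed line of best fit at x."""
    return SLOPE * x + INTERCEPT

def residual(point):
    """Vertical distance from the line: actual y minus predicted y."""
    x, y = point
    return y - predicted(x)

choices = [(2, 18), (8, 44), (6, 20), (4, 28)]
residuals = {p: residual(p) for p in choices}

# The farthest point is the one with the largest absolute residual.
farthest = max(choices, key=lambda p: abs(residuals[p]))

for p in choices:
    print(p, residuals[p])
print("farthest from line:", farthest)
```

Using the absolute value of the residual treats points above and below the line the same way, so a large negative residual such as the one at (6, 20) still counts as far from the line.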

5

A researcher recorded the outside temperature (x, in °C) and the number of hot chocolates sold (y, in cups) at a café on 12 different afternoons. The scatterplot shows temperature on the x-axis and cups sold on the y-axis, with a dashed line of best fit. Which statement is best supported by the scatterplot?

As temperature increases, hot chocolate sales tend to increase.

Higher temperature directly causes lower sales on every afternoon.

As temperature increases, hot chocolate sales tend to decrease.

Temperature and hot chocolate sales have no apparent relationship.

Explanation

The question asks which statement is best supported by the scatterplot of temperature versus hot chocolate sales. The scatterplot exhibits a downward trend from left to right, with points showing moderate spread around the dashed line of best fit, indicating a negative association without apparent curvature. This pattern supports the statement that as temperature increases, sales tend to decrease (choice C), consistent with the negative slope of the line. In the data, points at lower temperatures sit higher on the y-axis, and points at higher temperatures sit lower. Choice B errs by claiming direct causation on every afternoon, ignoring variability and the fact that association does not imply causation; choice A describes a positive trend that contradicts the plot; and choice D denies any relationship. Describe observed patterns factually and avoid causal inferences unless they are experimentally supported. A helpful strategy is to cover the choices and first describe the trend in your own words.

6

A school nurse tracked the number of hours of sleep (x, in hours) and the reaction time on a computer test (y, in milliseconds) for 10 students. The scatterplot includes a dashed line of best fit. For the student with $x=6$ hours of sleep, the actual reaction time is about how many milliseconds greater than the value predicted by the line of best fit?

About 40 ms

About 80 ms

About 0 ms

About 15 ms

Explanation

The question requires calculating how much greater the actual reaction time is than the predicted value for the student with 6 hours of sleep. The scatterplot shows a negative trend, with points spread around the dashed line, suggesting that less sleep is associated with longer reaction times, and the point at x=6 appears above the line. For x=6, trace to the line for a predicted y of around 50 ms, then compare it to the actual point at about 90 ms, yielding a difference of about 40 ms, as in choice A. This residual is the vertical distance: actual - predicted ≈ 40 ms. Errors might include misreading the vertical gap (leading to choices like 15 ms), reading a different point, or measuring a horizontal distance. Residuals highlight individual deviations from a trend and help in assessing model fit. When facing such questions, always confirm whether the question asks for how much greater or how much less to avoid sign errors.

7

An engineer measured the speed of a conveyor belt (x, in meters per minute) and the number of items sorted per minute (y, in items) for 11 trials. The scatterplot includes a dashed line of best fit. What does the slope of the line of best fit represent in this context?

The probability that increasing belt speed causes sorting errors.

The change in belt speed for each additional item sorted per minute.

The starting number of items sorted when belt speed is 0 meters per minute.

The approximate increase in items sorted per minute for each 1 meter/min increase in belt speed.

Explanation

The question asks what the slope of the line of best fit represents in the context of conveyor belt speed and items sorted. The scatterplot displays a positive linear trend with points moderately clustered around the dashed line, indicating that higher speeds are associated with more items sorted, without notable outliers. The slope represents the approximate increase in items sorted per minute for each 1 meter/min increase in belt speed, as in choice D; this is the rise over run stated in context. For example, if the line rises 2 items for every 1 m/min, the slope is 2. Choice A wrongly introduces causation and probability, neither of which is shown in the plot; choice C describes the y-intercept, not the slope; and choice B reverses the variables. Interpret slope as a descriptive rate, not a causal guarantee. A useful strategy is to recall slope as 'change in y per unit change in x' and attach the units for contextual meaning.
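
The rise-over-run reading can be sketched in Python. The two points below are hypothetical values on a line that rises 2 items for every 1 m/min, matching the example in the explanation; they are not taken from the actual plot.

```python
def slope_between(p1, p2):
    """Rise over run between two points on a line."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

# Hypothetical points on the line of best fit:
# (belt speed in meters per minute, items sorted per minute).
point_a = (10, 25)
point_b = (20, 45)

slope = slope_between(point_a, point_b)
print(f"slope: {slope} items per minute for each 1 m/min of belt speed")
```

Keeping the units in the printed message mirrors the strategy above: the slope's units are the y-units divided by the x-units.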

8

A city planner compared the distance from downtown (x, in miles) to the monthly rent of an apartment (y, in dollars) for 10 apartments. The scatterplot includes a dashed line of best fit. Based on the line of best fit, what is the predicted monthly rent for an apartment 12 miles from downtown?

$1,350

$1,200

$900

$1,050

Explanation

The question seeks the predicted monthly rent for an apartment 12 miles from downtown using the line of best fit. The scatterplot shows a negative trend, with points spread around the dashed line, suggesting rents decrease with distance, and 12 miles falls within or near the data range. To predict, locate x=12 and trace to the line, reading y≈$1,050, matching choice D. This is equivalent to substituting x=12 into the linear equation implied by the line. Errors include misreading the scale, ignoring the negative trend, or overestimating the steepness of the line: for example, a line with slope≈-50 and intercept=1,500 would give y=1,500-50*12=900, which is choice C rather than the roughly $1,050 shown on the plot. Data literacy involves using lines for predictions while noting they are approximations from observed patterns. For accuracy, double-check the intersection point visually before selecting.

9

A student collected data on the number of pages read (x, in pages) and the time spent reading (y, in minutes) for 9 reading sessions. The scatterplot includes a dashed line of best fit. Which statement about using the line of best fit is most reasonable?

Predicting time for 200 pages is interpolation and is generally reasonable here.

Any prediction from the line is exact because the points form a perfect line.

Predicting time for 35 pages is interpolation and is generally reasonable here.

Predicting time for 35 pages is extrapolation and is generally unreliable here.

Explanation

The question asks which statement about using the line of best fit for predictions is most reasonable. The scatterplot shows a positive trend with points closely following the dashed line over a range of pages, say from 10 to 50, with minimal spread, supporting linear interpolation within the data. Predicting for 35 pages, likely within the range, is interpolation and reasonable, as in choice C, whereas predicting for 200 pages would be extrapolation beyond the data and potentially unreliable. The distinction is made by checking whether the x-value falls inside the observed x-range. Choice B errs by claiming exactness despite scatter; choice A mislabels the 200-page prediction as interpolation, and choice D mislabels the 35-page prediction as extrapolation. Distinguishing interpolation (within the data) from extrapolation (beyond it) is how to assess prediction reliability. A test-taking tip is to estimate the data range first to classify predictions accurately.
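
The range check described above can be written as a short Python sketch. The page counts are hypothetical values spanning the 10-to-50 range the explanation assumes, and `classify_prediction` is a helper name introduced only for this example.

```python
def classify_prediction(x, observed_xs):
    """Label a prediction at x as interpolation or extrapolation."""
    if min(observed_xs) <= x <= max(observed_xs):
        return "interpolation"
    return "extrapolation"

# Hypothetical page counts for the 9 reading sessions (assumed range: 10 to 50).
pages = [10, 15, 20, 25, 30, 35, 40, 45, 50]

print(35, classify_prediction(35, pages))
print(200, classify_prediction(200, pages))
```

Only the x-values matter for this classification; the y-values and the quality of the fit affect how trustworthy the prediction is, not whether it counts as interpolation.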

10

A school counselor recorded the number of hours 12 students studied for a math test and their resulting scores. The scatterplot shows hours studied (hours) on the x-axis and test score (points) on the y-axis, along with a dashed line of best fit. Based on the line of best fit, what is the predicted test score when a student studies for 6 hours, and is this prediction an interpolation or extrapolation?

About 78 points; interpolation

About 90 points; interpolation

About 84 points; extrapolation

About 70 points; extrapolation

Explanation

This question asks for the predicted test score based on the line of best fit for a student who studies 6 hours and whether this prediction is an interpolation or extrapolation. The scatterplot displays a positive linear trend, with data points generally increasing from lower hours and scores to higher ones, clustered moderately around the dashed line of best fit. To find the prediction, locate x=6 on the hours axis and trace up to the line of best fit, which intersects at approximately y=78 points on the score axis; since 6 hours falls within the range of observed data points (assuming from about 0 to 8 hours), this is an interpolation. A common error is misreading the graph scale or confusing interpolation (within data range) with extrapolation (beyond data range), leading to incorrect choices like 70 or 84 points with extrapolation. Another mistake might involve eyeballing the line incorrectly, resulting in overestimations like 90 points. When dealing with scatterplots, always verify if the prediction point is inside or outside the observed x-range to distinguish between interpolation and extrapolation, promoting careful visual interpretation over hasty assumptions.
