# Line of best fit

A linear line of best fit can be defined as a straight line providing the best approximation of a given set of data. It is used to study the relationship between two variables. The line of best fit is studied at two different levels.

At the middle and high school levels, students are asked to determine a rough line of best fit by eyeballing a graph on the coordinate plane, generally a scatter plot. The trick is to draw a straight line such that roughly the same number of points appear above and below it while passing as close as possible to as many individual points as it can. Students should also avoid making egregious errors when it comes to things such as identifying the x-intercept and y-intercept, but everything is a rough estimation.

Students of advanced statistics and probability transition to using the least squares method to write an equation for the line of best fit. The rest of this article will focus on how to use the least squares method for a much more precise equation.

## Finding the line of best fit through the least squares method

You can find the equation for the line of best fit using the least squares method in four steps. First, look at your ordered pairs and find the mean of all of the x values and all of the y values. Our example will use the following ordered pairs:

$\left(8,3\right),\left(2,10\right),\left(11,3\right),\left(6,6\right),\left(5,8\right),\left(4,12\right),\left(12,1\right),\left(9,4\right),\left(6,9\right),\left(1,14\right)$

It may help to see these ordered pairs plotted:

$\bar{x}$ is the symbol used to specify the mean of the x values, and you can calculate it by adding all 10 x values above and then dividing by 10. In the example above, $\bar{x}=6.4$.

$\bar{y}$ is the symbol used to specify the mean of the y values. $\bar{y}=7$ in the example above.
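As a quick check of the two means, here is a minimal Python sketch. The variable names (`points`, `xs`, `ys`, `x_bar`, `y_bar`) are chosen for illustration and are not part of the original lesson.

```python
# Ordered pairs from the example above
points = [(8, 3), (2, 10), (11, 3), (6, 6), (5, 8),
          (4, 12), (12, 1), (9, 4), (6, 9), (1, 14)]

# Separate the x values and the y values
xs = [x for x, _ in points]
ys = [y for _, y in points]

# Mean = sum of the values divided by how many values there are
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

print(x_bar, y_bar)  # 6.4 7.0
```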

Next, we need to use both means to find numbers we'll eventually plug into formulas. Specifically, we need $x-\bar{x}$ and $y-\bar{y}$ for every value of x and y above. We are going to build a table to make the data easier to read:

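| x-coordinate | $x-\bar{x}$ | y-coordinate | $y-\bar{y}$ |
|---|---|---|---|
| 8 | 1.6 | 3 | -4 |
| 2 | -4.4 | 10 | 3 |
| 11 | 4.6 | 3 | -4 |
| 6 | -0.4 | 6 | -1 |
| 5 | -1.4 | 8 | 1 |
| 4 | -2.4 | 12 | 5 |
| 9 | 2.6 | 4 | -3 |
| 6 | -0.4 | 9 | 2 |
| 1 | -5.4 | 14 | 7 |
| 12 | 5.6 | 1 | -6 |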
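The same table can be generated rather than filled in by hand. The sketch below is one possible way to do it; the name `rows` and the rounding to one decimal place (used only to hide floating-point noise) are assumptions for this illustration. A handy sanity check is that the deviations in each column add up to zero.

```python
# Ordered pairs, listed in the same order as the table rows
points = [(8, 3), (2, 10), (11, 3), (6, 6), (5, 8),
          (4, 12), (9, 4), (6, 9), (1, 14), (12, 1)]

x_bar = sum(x for x, _ in points) / len(points)
y_bar = sum(y for _, y in points) / len(points)

# One row per point: (x, x - x_bar, y, y - y_bar)
rows = [(x, round(x - x_bar, 1), y, round(y - y_bar, 1)) for x, y in points]

for x, dx, y, dy in rows:
    print(f"{x:>2} {dx:>5} {y:>2} {dy:>5}")

# Deviations from the mean always sum to zero (up to rounding error)
print(abs(sum(dx for _, dx, _, _ in rows)) < 1e-9)
print(abs(sum(dy for _, _, _, dy in rows)) < 1e-9)
```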

We're finally done with the first step! Now, we need to calculate the slope of the line of best fit using the following formula:

$m=\frac{\sum \left({x}_{i}-\bar{x}\right)\left({y}_{i}-\bar{y}\right)}{\sum {\left({x}_{i}-\bar{x}\right)}^{2}}$

That looks like a lot to handle, but it isn't that bad. The subscript $i$ simply labels each point on the scatter plot above, and we have values for both $x-\bar{x}$ and $y-\bar{y}$ in the table. The $\sum$ symbol is the capital Greek letter sigma, the summation symbol, and tells us that we need to add all of the values. Essentially, we need to multiply each ordered pair's $x-\bar{x}$ by its $y-\bar{y}$, and then add all 10 products together for the numerator. For the denominator, we need to square all of the $x-\bar{x}$ values and add them together. That gives us:

$m=-\frac{131}{118.4}\approx -1.1$
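The numerator and denominator of the slope formula can be checked with a few lines of Python. This is only a sketch of the arithmetic; the names `numerator`, `denominator`, and `m` are chosen here for illustration.

```python
points = [(8, 3), (2, 10), (11, 3), (6, 6), (5, 8),
          (4, 12), (12, 1), (9, 4), (6, 9), (1, 14)]

x_bar = sum(x for x, _ in points) / len(points)
y_bar = sum(y for _, y in points) / len(points)

# Numerator: sum of the products (x - x_bar) * (y - y_bar)
numerator = sum((x - x_bar) * (y - y_bar) for x, y in points)

# Denominator: sum of the squared x deviations
denominator = sum((x - x_bar) ** 2 for x, _ in points)

m = numerator / denominator
print(round(numerator, 1), round(denominator, 1), round(m, 1))  # -131.0 118.4 -1.1
```

The negative slope matches the scatter of the data: larger x values tend to go with smaller y values.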

We have our slope! Now, we can plug the slope into the following formula for the y-intercept:

$b=\bar{y}-m×\bar{x}$

$b=7-\left(-1.1×6.4\right)$

$b\approx 14$

We now have a slope of -1.1 and a y-intercept of 14, enabling us to write the equation $y=-1.1x+14$ . Note that we rounded both numbers to make them easier to work with. The final step is to draw the line on our scatter plot:

It's a lot of work for a single problem, but you can do it so long as you keep your wits about you and take it one step at a time.
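The whole procedure can also be wrapped in a small function. The sketch below is one possible implementation, not an official one; the function name `best_fit_line` and the guard against identical x values are assumptions added for this example.

```python
def best_fit_line(points):
    """Return (slope, intercept) of the least squares line through points."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n

    # Sums used in the slope formula
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are equal; the slope is undefined")

    m = sxy / sxx
    b = y_bar - m * x_bar  # intercept from the two means and the slope
    return m, b


points = [(8, 3), (2, 10), (11, 3), (6, 6), (5, 8),
          (4, 12), (12, 1), (9, 4), (6, 9), (1, 14)]

m, b = best_fit_line(points)
print(f"y = {m:.2f}x + {b:.2f}")  # y = -1.11x + 14.08
```

Without rounding the slope first, the intercept comes out near 14.08, which agrees with the rounded value of 14 used above.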

## Line of best fit practice question

a. Write an equation for the line of best fit for the following ordered pairs:

$\left(2,4\right),\left(3,5\right),\left(5,7\right),\left(7,10\right),\left(9,15\right)$

To find the line of best fit, we can use the same least squares method as above, rounding intermediate values along the way to obtain an approximate line.

First, find the means of x and y values:

$\text{Mean of x =}\frac{2+3+5+7+9}{5}=\frac{26}{5}=5.2$

$\text{Mean of y =}\frac{4+5+7+10+15}{5}=\frac{41}{5}=8.2$

Then, find the slope (m):

$m\approx \frac{\sum \left(x-\bar{x}\right)\left(y-\bar{y}\right)}{\sum {\left(x-\bar{x}\right)}^{2}}$

For each point $\left(x,y\right)$ , calculate $\left(x-\bar{x}\right)\left(y-\bar{y}\right)$ and ${\left(x-\bar{x}\right)}^{2}$ , then sum the results.

$m\approx \frac{\left(2-\bar{x}\right)\left(4-\bar{y}\right)+\left(3-\bar{x}\right)\left(5-\bar{y}\right)+\cdots}{{\left(2-\bar{x}\right)}^{2}+{\left(3-\bar{x}\right)}^{2}+\cdots}$

$m\approx \frac{\left(-3.2\right)×\left(-4.2\right)+\left(-2.2\right)×\left(-3.2\right)+\cdots}{{\left(-3.2\right)}^{2}+{\left(-2.2\right)}^{2}+\cdots}$

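$m\approx \frac{49.8}{32.8}\approx 1.52$

Rounded to one decimal place, the slope is 1.5.

Now find the y-intercept (b), using the less-rounded slope of 1.52 so that the rounding error does not carry over:

$b=\bar{y}-m×\bar{x}$

$b\approx 8.2-1.52×5.2$

$b\approx 0.3$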

The approximate line of best fit is:

$y=1.5x+0.3$
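A short Python sketch can confirm the sums for these five points and show how much the intercept depends on when the slope is rounded. The names `sxy`, `sxx`, and `b_from_rounded_m` are chosen here for illustration.

```python
points = [(2, 4), (3, 5), (5, 7), (7, 10), (9, 15)]

n = len(points)
x_bar = sum(x for x, _ in points) / n  # 5.2
y_bar = sum(y for _, y in points) / n  # 8.2

# Numerator and denominator of the slope formula
sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
sxx = sum((x - x_bar) ** 2 for x, _ in points)

m = sxy / sxx
b = y_bar - m * x_bar                           # intercept from the unrounded slope
b_from_rounded_m = y_bar - round(m, 1) * x_bar  # intercept if the slope is rounded first

print(round(sxy, 1), round(sxx, 1))             # 49.8 32.8
print(round(m, 2))                              # 1.52
print(round(b, 1), round(b_from_rounded_m, 1))  # 0.3 0.4
```

Rounding the slope to 1.5 before computing the intercept gives 0.4, while carrying the unrounded slope through gives 0.3, so it is safer to round only at the end.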

b. Write an equation for the line of best fit for the following ordered pairs:

$\left(8,4\right),\left(3,12\right),\left(2,1\right),\left(10,12\right),\left(11,9\right),\left(3,4\right),\left(6,9\right),\left(5,6\right),\left(6,1\right),\left(8,14\right)$

Using the same simplified method, we first find the means of x and y values:

$\text{Mean of x}=\frac{8+3+2+10+11+3+6+5+6+8}{10}=\frac{62}{10}=6.2$

$\text{Mean of y}=\frac{4+12+1+12+9+4+9+6+1+14}{10}=\frac{72}{10}=7.2$

Then, find the slope (m):

$m\approx \frac{\sum \left(x-\bar{x}\right)\left(y-\bar{y}\right)}{\sum {\left(x-\bar{x}\right)}^{2}}$

For each point $\left(x,y\right)$ , calculate $\left(x-\bar{x}\right)\left(y-\bar{y}\right)$ and ${\left(x-\bar{x}\right)}^{2}$ , then sum the results.

$m\approx \frac{56.6}{83.6}\approx 0.68$

Now find the y-intercept (b):

$b=\bar{y}-m×\bar{x}$

$b=7.2-0.68×6.2$

$b\approx 3$

The approximate line of best fit is:

$y=0.68x+3$

## Varsity Tutors can help with the line of best fit

Whether your student is struggling to eyeball a line of best fit for a scatter plot or you need help in an advanced statistics course, a 1-on-1 professional math tutor can identify why you're struggling and address the issue at its root cause. That could mean completing more practice problems, using mnemonics to help you memorize the formulas, or just learning how to take your time on each problem. Reach out to the Educational Directors at Varsity Tutors for more details today!
