# Line of Best Fit (Least Squares Method)

A line of best fit is a straight line that is the best approximation of the given set of data.

It is used to study the nature of the relationship between two variables. (We're only considering the two-dimensional case here.)

A line of best fit can be roughly determined using an eyeball method by drawing a straight line on a scatter plot so that the number of points above the line and below the line is about equal (and the line passes through as many points as possible).

A more accurate way of finding the line of best fit is the least squares method.
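
The name reflects what is being minimized. As a brief sketch (the symbol $S$ is introduced here for illustration and does not appear in the original text), the method chooses the slope $m$ and $y$-intercept $b$ of the line $y=mx+b$ so that the sum of the squared vertical distances from the data points to the line is as small as possible:

$S\left(m,b\right)=\sum_{i=1}^{n}{\left({y}_{i}-\left(m{x}_{i}+b\right)\right)}^{2}$

The formulas in the steps below give the values of $m$ and $b$ that minimize this sum, provided the $x$-values are not all equal.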

Use the following steps to find the equation of the line of best fit for a set of ordered pairs $\left({x}_{1},{y}_{1}\right),\left({x}_{2},{y}_{2}\right),\ldots ,\left({x}_{n},{y}_{n}\right)$.

Step 1: Calculate the mean of the $x$-values and the mean of the $y$-values.

$\begin{array}{l}\bar{X}=\frac{\sum_{i=1}^{n}{x}_{i}}{n}\\ \bar{Y}=\frac{\sum_{i=1}^{n}{y}_{i}}{n}\end{array}$

Step 2: The following formula gives the slope of the line of best fit:

$m=\frac{\sum_{i=1}^{n}\left({x}_{i}-\bar{X}\right)\left({y}_{i}-\bar{Y}\right)}{\sum_{i=1}^{n}{\left({x}_{i}-\bar{X}\right)}^{2}}$

Step 3: Compute the $y$-intercept of the line by using the formula:

$b=\bar{Y}-m\bar{X}$

Step 4: Use the slope $m$ and the $y$-intercept $b$ to form the equation of the line, $y=mx+b$.
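
The four steps can be sketched as a short Python function. This is a minimal illustration rather than a definitive implementation: the name `fit_line` and the variable names are assumptions made here, and the input checks are an addition not discussed in the steps.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least squares line y = m*x + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)

    # Step 1: means of the x-values and the y-values.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Step 2: slope = sum of products of deviations / sum of squared x-deviations.
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    if sum_xx == 0:
        raise ValueError("all x-values are equal, so the slope is undefined")
    m = sum_xy / sum_xx

    # Step 3: y-intercept.
    b = y_bar - m * x_bar

    # Step 4: the pair (m, b) describes the line y = m*x + b.
    return m, b


# Points that lie exactly on y = 2x + 1 should recover that line.
print(fit_line([0, 1, 2], [1, 3, 5]))  # (2.0, 1.0)
```

The function returns unrounded values; rounding is left to the caller so that the intercept is computed from the full-precision slope.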

Example:

Use the least squares method to determine the equation of the line of best fit for the data. Then plot the line.

| $x$ | $8$ | $2$ | $11$ | $6$ | $5$ | $4$ | $12$ | $9$ | $6$ | $1$ |
|---|---|---|---|---|---|---|---|---|---|---|
| $y$ | $3$ | $10$ | $3$ | $6$ | $8$ | $12$ | $1$ | $4$ | $9$ | $14$ |

Solution:

Plot the points on a coordinate plane.

Calculate the means of the $x$-values and the $y$-values.

$\begin{array}{l}\bar{X}=\frac{8+2+11+6+5+4+12+9+6+1}{10}=6.4\\ \bar{Y}=\frac{3+10+3+6+8+12+1+4+9+14}{10}=7\end{array}$

Now calculate ${x}_{i}-\bar{X}$, ${y}_{i}-\bar{Y}$, $\left({x}_{i}-\bar{X}\right)\left({y}_{i}-\bar{Y}\right)$, and ${\left({x}_{i}-\bar{X}\right)}^{2}$ for each $i$.

| $i$ | ${x}_{i}$ | ${y}_{i}$ | ${x}_{i}-\bar{X}$ | ${y}_{i}-\bar{Y}$ | $\left({x}_{i}-\bar{X}\right)\left({y}_{i}-\bar{Y}\right)$ | ${\left({x}_{i}-\bar{X}\right)}^{2}$ |
|---|---|---|---|---|---|---|
| $1$ | $8$ | $3$ | $1.6$ | $-4$ | $-6.4$ | $2.56$ |
| $2$ | $2$ | $10$ | $-4.4$ | $3$ | $-13.2$ | $19.36$ |
| $3$ | $11$ | $3$ | $4.6$ | $-4$ | $-18.4$ | $21.16$ |
| $4$ | $6$ | $6$ | $-0.4$ | $-1$ | $0.4$ | $0.16$ |
| $5$ | $5$ | $8$ | $-1.4$ | $1$ | $-1.4$ | $1.96$ |
| $6$ | $4$ | $12$ | $-2.4$ | $5$ | $-12$ | $5.76$ |
| $7$ | $12$ | $1$ | $5.6$ | $-6$ | $-33.6$ | $31.36$ |
| $8$ | $9$ | $4$ | $2.6$ | $-3$ | $-7.8$ | $6.76$ |
| $9$ | $6$ | $9$ | $-0.4$ | $2$ | $-0.8$ | $0.16$ |
| $10$ | $1$ | $14$ | $-5.4$ | $7$ | $-37.8$ | $29.16$ |
| | | | | | $\sum_{i=1}^{n}\left({x}_{i}-\bar{X}\right)\left({y}_{i}-\bar{Y}\right)=-131$ | $\sum_{i=1}^{n}{\left({x}_{i}-\bar{X}\right)}^{2}=118.4$ |
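
The table above can be reproduced with a few lines of Python, which is a convenient way to check the arithmetic. The variable names are chosen here for illustration; only the data values come from the example.

```python
xs = [8, 2, 11, 6, 5, 4, 12, 9, 6, 1]
ys = [3, 10, 3, 6, 8, 12, 1, 4, 9, 14]

x_bar = sum(xs) / len(xs)  # mean of the x-values
y_bar = sum(ys) / len(ys)  # mean of the y-values

rows = []
for i, (x, y) in enumerate(zip(xs, ys), start=1):
    dx = x - x_bar  # deviation of x from its mean
    dy = y - y_bar  # deviation of y from its mean
    rows.append((i, x, y, dx, dy, dx * dy, dx ** 2))

# Column totals used in the slope formula.
sum_xy = sum(row[5] for row in rows)
sum_xx = sum(row[6] for row in rows)

for i, x, y, dx, dy, prod, sq in rows:
    print(f"{i:2d} {x:3d} {y:3d} {dx:6.1f} {dy:6.1f} {prod:7.1f} {sq:7.2f}")
print(f"sum of products = {sum_xy:.1f}, sum of squares = {sum_xx:.1f}")
```

The last line should report the same two totals as the bottom row of the table.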

Calculate the slope.

$m=\frac{\sum_{i=1}^{n}\left({x}_{i}-\bar{X}\right)\left({y}_{i}-\bar{Y}\right)}{\sum_{i=1}^{n}{\left({x}_{i}-\bar{X}\right)}^{2}}=\frac{-131}{118.4}\approx -1.1$

Calculate the $y$-intercept using the formula from Step 3.

$\begin{aligned}b&=\bar{Y}-m\bar{X}\\ &=7-\left(-1.1\times 6.4\right)\\ &=7+7.04\\ &\approx 14.0\end{aligned}$
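
Note that this calculation uses the slope already rounded to $-1.1$. A quick check, sketched in Python with variable names chosen here for illustration, suggests that carrying the unrounded slope through changes the intercept slightly:

```python
x_bar, y_bar = 6.4, 7.0

m_full = -131 / 118.4          # unrounded slope from the table totals
m_rounded = round(m_full, 1)   # the value -1.1 used in the worked example

b_rounded = y_bar - m_rounded * x_bar  # intercept from the rounded slope
b_full = y_bar - m_full * x_bar        # intercept from the unrounded slope

print(f"{b_rounded:.2f}")  # 14.04
print(f"{b_full:.2f}")     # 14.08
```

To one decimal place the unrounded intercept is about $14.1$ rather than $14.0$. The example continues with the rounded values, which is adequate for sketching the line by hand.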

Use the slope and $y$-intercept to form the equation of the line of best fit.

The slope of the line is $-1.1$ and the $y$-intercept is $14.0$.

Therefore, the equation is $y=-1.1x+14.0$.

Draw the line on the scatter plot.
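
Since two points determine a line, one way to draw it by hand is to evaluate the equation at two convenient $x$-values and join the resulting points. As an illustration, taking the smallest and largest $x$-values in the data ($1$ and $12$):

$\begin{aligned}x=1:\quad y&=-1.1\left(1\right)+14.0=12.9\\ x=12:\quad y&=-1.1\left(12\right)+14.0=0.8\end{aligned}$

With these rounded coefficients, the line passes through approximately $\left(1,12.9\right)$ and $\left(12,0.8\right)$.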