# Variance

The world is filled with varied data. Although there are certain constants and predictable patterns, things don't always happen in the same way every single time. Volcanoes erupt, rivers overflow, and some people like pineapple on their pizza. These unpredictable or abnormal events are all examples of "variance" in our data. Let's find out more about variance and discover its relation to statistics.

## The meaning of variance

Variance represents how "spread out" our data is. We can expect a certain degree of variance. For example, it would be strange if there was zero variance in height among our classmates. If we ran a survey and discovered that everyone in our class was the same height, we probably made a mistake. On the other hand, a high degree of variance may also be abnormal. For example, there are some ice cream flavors that are more popular than others -- such as chocolate, vanilla, and strawberry. If we ran a survey among our 30 classmates and each person chose a different ice cream flavor as their favorite, we would be left with very abnormal data that may not represent the general population as a whole.

Statisticians use variance to study the relationship between one result and all other results in a data set. Is one result normal? Was it expected? Is it a so-called "outlier" that is completely different compared to the other results? Finding the variance can help us answer these questions. If a measure is very far away from the other results, we might need to investigate it further. Did we make a mistake in our experiment? Why is this value so different compared to the others?

A data set with no variance may not provide us with very interesting results. If we get the same result every single time, what did we learn about our variables? On the other hand, too much variance provides scattered results, making it difficult to gain insights and draw valuable conclusions.

## How do we find variance?

In the world of statistics, we need to be as precise as possible. We can't simply look at our data and guess how spread out it is. We need to give our variance a specific value. To do this, we'll use a formula that gives us the average of the squared distances from the mean.

To find the variance, we need to follow a number of steps:

1. Find the mean of our data set.

2. Subtract the mean from each of our data points and square the result. This is known as the "squared difference."

3. Find the average of these squared differences.

Here's what that formula looks like when we write it out:

$${\sigma }^{2}=\frac{\sum {\left({x}_{i}-\bar{x}\right)}^{2}}{n}$$

In this formula, ${\sigma }^{2}$ is the variance, $\sum$ means to sum over all data points, ${x}_{i}$ denotes a single data point, $\bar{x}$ denotes the mean, and $n$ is the number of data points.
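The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name `population_variance` and the height values are made up for the example, and the sketch divides by $n$ exactly as the formula here does.

```python
def population_variance(data):
    """Return the average of the squared differences from the mean."""
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one value")

    # Step 1: find the mean of the data set.
    mean = sum(data) / n

    # Step 2: subtract the mean from each data point and square the result.
    squared_differences = [(x - mean) ** 2 for x in data]

    # Step 3: find the average of the squared differences.
    return sum(squared_differences) / n


# Hypothetical heights (in cm) of five classmates.
heights_cm = [150, 160, 170, 180, 190]
print(population_variance(heights_cm))  # 200.0
```

Here the mean is 170, the squared differences are 400, 100, 0, 100, and 400, and their average is 200.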

You can also use this result to find the "standard deviation." This is another measure of data variability, and it represents the typical distance of the values from the mean. Because it is measured in the same units as the data, it helps us judge how "normal" or "abnormal" a result might be: we compare the result's distance from the average value to the standard deviation. A data point that sits several standard deviations away from the mean can alert us to possible errors or variables that we might not have considered.

Finding the standard deviation is easy. All we need to do is find the square root of the variance. In other words:

$$\sigma =\sqrt{{\sigma }^{2}}$$
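As a quick illustration, the square-root step can be checked with Python's standard `math` and `statistics` modules. The height values are made up for the example, and `pvariance` and `pstdev` are the population versions (dividing by $n$).

```python
import math
import statistics

# Hypothetical heights (in cm) of five classmates.
heights_cm = [150, 160, 170, 180, 190]

# Population variance: the average squared difference from the mean.
variance = statistics.pvariance(heights_cm)

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print(variance)           # 200
print(round(std_dev, 2))  # 14.14

# The library's own population standard deviation agrees with the square root.
print(math.isclose(std_dev, statistics.pstdev(heights_cm)))  # True
```

Notice that the variance (200) is in squared centimeters, while the standard deviation (about 14.14) is back in plain centimeters, which makes it easier to compare against individual heights.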

## When should we use standard deviation?

Variance helps us understand the bigger picture. It tells us how spread apart our data is. On the other hand, standard deviation is expressed in the same units as our data, so it lets us focus on one specific data point and determine whether it falls within a "normal range" around the mean. Both variance and standard deviation can provide us with valuable insights depending on whether we need to consider the whole data set or a single piece.
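One way to picture the "normal range" idea is to measure how many standard deviations each data point sits from the mean. The sketch below is only an illustration: the test scores are made up, the helper name `distance_in_std_devs` is hypothetical, and the cutoff of 2 standard deviations is an assumption chosen for the example, not a fixed rule.

```python
import statistics


def distance_in_std_devs(value, data):
    """Return how many standard deviations `value` lies from the mean of `data`."""
    mean = statistics.fmean(data)
    std_dev = statistics.pstdev(data)  # population standard deviation
    if std_dev == 0:
        raise ValueError("all values are identical, so there is no spread")
    return (value - mean) / std_dev


# Hypothetical test scores; one of them looks unusual.
scores = [70, 72, 68, 71, 69, 70, 100]

# Assumed cutoff for this example: more than 2 standard deviations from the mean.
CUTOFF = 2
unusual = [s for s in scores if abs(distance_in_std_devs(s, scores)) > CUTOFF]

print(unusual)  # [100]
```

The score of 100 is about 2.4 standard deviations above the mean, while every other score is well within one standard deviation, so it is the only value worth a second look.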

## Variance vs. interquartile range

Another way of measuring variability is the interquartile range. This approach involves dividing our sorted data into four equal parts using three cut points called quartiles. The second quartile is the median of our data (which is not necessarily the same as the mean), and this "cut" goes right in the middle of our data points. The interquartile range is the distance between the first and third quartiles, so it covers the middle half of our values and leaves out the values on the polar extremes. This gives us similar insights compared to the standard deviation and variance, although it is not affected by outliers.

The standard deviation and variance include outliers, which may be problematic in some situations. For example, if you have numerous outliers and a high degree of dispersion, standard deviation and variance may provide skewed insights because outliers pull both values upward. On the other hand, if our data has few or no outliers, then standard deviation is an excellent way to measure variability. Again, we need to choose the best strategies for interpreting our data based on our specific experiment and results.
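To see the difference, we can compare both measures on a small data set with and without an outlier. This is a rough sketch with made-up numbers and a hypothetical helper, `interquartile_range`. It relies on `statistics.quantiles` from Python's standard library (Python 3.8 or later) with its default method; other quartile conventions can give slightly different cut points.

```python
import statistics


def interquartile_range(data):
    """Return Q3 - Q1, the width of the middle half of the data."""
    q1, _median, q3 = statistics.quantiles(data, n=4)
    return q3 - q1


# Hypothetical measurements with no outliers.
typical = [10, 11, 12, 13, 14, 15, 16, 17]

# The same data, but with the largest value replaced by an extreme outlier.
with_outlier = typical[:-1] + [170]

print(statistics.pstdev(typical), interquartile_range(typical))
print(statistics.pstdev(with_outlier), interquartile_range(with_outlier))
```

In this example the single outlier makes the standard deviation many times larger, while the interquartile range stays exactly the same because the extreme value lies outside the middle half of the data.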
