The variance of a data set is the arithmetic mean of the squared deviations from the mean. It is denoted by $$\sigma ^2$$ and is calculated with the formula $$$\sigma^2=\displaystyle \frac{\displaystyle\sum_{i=1}^N (x_i-\overline{x})^2}{N}=\frac{(x_1-\overline{x})^2+(x_2-\overline{x})^2+\ldots+(x_N-\overline{x})^2}{N}$$$ which can be simplified to: $$$\sigma^2=\displaystyle \frac{\displaystyle \sum_{i=1}^N x_i^2}{N}-\overline{x}^2=\frac{x_1^2+x_2^2+\ldots+x_N^2}{N}-\overline{x}^2$$$ As with the mean, it is not always possible to compute the variance, and it is a parameter that is very sensitive to extreme values. Because the deviations are squared, the variance is expressed in the square of the units of the data, not in the units of the data themselves. For data of the same type, a high variance means that the values are more dispersed, and a low variance indicates that the values are in general closer to the mean. A variance of zero means that all the values are equal, and therefore they are also equal to the arithmetic mean.
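As a quick numerical check that the two formulas agree, here is a minimal Python sketch. It assumes the population variance (division by $$N$$) and uses the basketball scores from the worked example in this section; the function names are illustrative and not part of the original text.

```python
def variance_definition(data):
    """Population variance: mean of the squared deviations from the mean."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


def variance_shortcut(data):
    """Population variance via the simplified form: mean of squares minus squared mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(x * x for x in data) / n - mean ** 2


scores = [0, 2, 4, 5, 8, 10, 10, 15, 38]  # basketball scores from the example
v_def = variance_definition(scores)
v_short = variance_shortcut(scores)
print(round(v_def, 2), round(v_short, 2))  # 115.28 115.28
```

The simplified form needs only one pass over the data, but in floating-point arithmetic it can lose precision when the values are large and the spread is small, so the definition is often the safer choice in code.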
In a basketball match, the players of a team scored the following points: $$0, 2, 4, 5, 8, 10, 10, 15, 38$$. Calculate the variance of the players' scores. First, the mean is obtained: $$\overline{x}=\displaystyle \frac{0+2+4+5+8+10+10+15+38}{9}=\frac{92}{9}\approx 10.22$$. Next we apply the formula of the variance: $$$\sigma^2=\displaystyle \frac{(0-10.22)^2+(2-10.22)^2+(4-10.22)^2+(5-10.22)^2+(8-10.22)^2+(10-10.22)^2+(10-10.22)^2+(15-10.22)^2+(38-10.22)^2}{9}=\\=\displaystyle \frac{10.22^2+8.22^2+6.22^2+5.22^2+2.22^2+0.22^2+0.22^2+4.78^2+27.78^2}{9}=\\=\displaystyle\frac{104.4484+67.5684+38.6884+27.2484+4.9284+0.0484+0.0484+22.8484+771.7284}{9}=\\=\displaystyle \frac{1037.5556}{9}=115.28$$$
Calculation of the variance for grouped information
For $$N$$ observations grouped into $$n$$ classes, where $$x_i$$ is the value of class $$i$$ and $$f_i$$ is its frequency, the formula is: $$$\sigma^2=\displaystyle \frac{\displaystyle \sum_{i=1}^n (x_i-\overline{x})^2 f_i}{N}=\frac{(x_1-\overline{x})^2f_1+(x_2-\overline{x})^2f_2+\ldots+(x_n-\overline{x})^2f_n}{N}$$$ which simplifies to: $$$\displaystyle \sigma^2=\frac{\displaystyle \sum_{i=1}^n x_i^2f_i}{N}-\overline{x}^2=\frac{x_1^2f_1+x_2^2f_2+\ldots+x_n^2f_n}{N}-\overline{x}^2$$$ The result is interpreted in the same way as for ungrouped data.
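The grouped formula can be sketched in Python as follows. The class values and frequencies below are hypothetical, chosen only for illustration (they are not the table from this section), and `grouped_variance` is an illustrative name.

```python
def grouped_variance(values, freqs):
    """Population variance for data grouped into classes.

    values: class values (or class marks) x_i
    freqs:  frequency f_i of each class
    """
    n_total = sum(freqs)  # N is the sum of the frequencies
    mean = sum(x * f for x, f in zip(values, freqs)) / n_total
    mean_of_squares = sum(x * x * f for x, f in zip(values, freqs)) / n_total
    return mean_of_squares - mean ** 2  # simplified form of the formula


# Hypothetical heights (cm) and frequencies, for illustration only
heights = [175, 185, 195]
counts = [2, 5, 3]
print(grouped_variance(heights, counts))  # 49.0
```

Here $$N=10$$, the mean is $$186$$, and the weighted mean of the squares is $$34645$$, so the variance is $$34645-186^2=49$$.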
The height in cm of the players of a basketball team is in the following table. Calculate the variance.
First of all, fill the following table:
The mean is needed in order to apply the formula: $$$\displaystyle \overline{x}=\frac{2250}{12}=187.5$$$ The variance is then $$$\displaystyle \sigma^2=\frac{423500}{12}-187.5^2=135.42$$$
Properties of the variance
In an exam, all the students got a ten. Find the variance of the marks. Since all the values are the same, the mean equals that value, $$\overline{x}=10$$, and the variance is zero, $$\sigma^2=0$$.
Standard deviation
The standard deviation is the square root of the variance and is denoted by $$\sigma$$. To calculate it, compute the variance first and then take its square root. The interpretation of the standard deviation is, therefore, similar to that of the variance. For data of the same type, a high standard deviation means that the values are dispersed, while a low value indicates that the values are close together and, therefore, close to the mean.
Properties of standard deviation
Standard deviation is used in statistics to tell us how “spread out” the data points are. Having one or more data points far away from the mean indicates a large spread – but there are other factors to consider. So, what affects standard deviation? Sample size, mean, and data values affect standard deviation, since they are used to calculate standard deviation. Removing outliers changes sample size and may change the mean and affect standard deviation. Multiplication and changing units will also affect standard deviation, but addition will not. Of course, it is possible by chance that removing an outlier will leave the standard deviation unchanged. It is important to go through the calculations to see exactly what will happen with the data. In this article, we’ll talk about the factors that affect standard deviation (and which ones don’t). We’ll also look at some examples to make things clear. Let’s get started.
What Affects Standard Deviation?
The sample standard deviation has the formula $$$s=\sqrt{\frac{\displaystyle\sum_{i=1}^N (x_i-\overline{x})^2}{N-1}}$$$ This is the formula for the standard deviation of a sample data set from a population, based on the unbiased sample variance (for the standard deviation of the entire population, use N instead of N - 1 in the denominator of the fraction under the radical).
Standard deviation is used in fields from business and finance to medicine and manufacturing. Some of the things that affect standard deviation include sample size, outliers, multiplication by a constant, changes of units, and the mean.
Let’s take a look at each of these factors, along with some examples, to see how they affect standard deviation.
Does Sample Size Affect Standard Deviation?
Sample size does affect the sample standard deviation. However, it does not affect the population standard deviation. The sample size, N, appears in the denominator under the radical in the formula for standard deviation, so changing the value of N (the number of data points) affects the sample standard deviation. Changing the sample size N also affects the sample mean (but not the population mean).
Example 1: Changing N Changes Standard Deviation
For the data set S = {1, 3, 5}, we have the following: N = 3, mean = 3, and sample standard deviation = 2.
If we change the sample size by removing the third data point (5), we have: N = 2, mean = 2, and sample standard deviation = √2 ≈ 1.414.
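The statistics for Example 1, before and after removing the data point, can be computed with a short Python sketch of the sample standard deviation (N - 1 in the denominator). The helper name `sample_sd` is illustrative.

```python
from math import sqrt


def sample_sd(data):
    """Sample standard deviation with N - 1 in the denominator."""
    n = len(data)
    mean = sum(data) / n
    return sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))


full = [1, 3, 5]
reduced = full[:-1]  # drop the third data point (5)

print(sum(full) / len(full), sample_sd(full))           # 3.0 2.0
print(sum(reduced) / len(reduced), sample_sd(reduced))  # 2.0 1.4142135623730951
```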
So, changing N changed both the mean and standard deviation. Of course, it is possible by chance that changing the sample size will leave the standard deviation unchanged.
Example 2: Changing N Leaves Standard Deviation Unchanged
For the data set S = {1, 2, 2.36604}, we have the following: N = 3, mean ≈ 1.78868, and sample standard deviation ≈ 0.7071.
If we change the sample size by removing the third data point (2.36604), we have: N = 2, mean = 1.5, and sample standard deviation = √0.5 ≈ 0.7071.
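The third value in Example 2 is not arbitrary. For the set {1, 2, x} to have the same sample standard deviation as {1, 2}, the sum of squared deviations must equal 1, which reduces to 2x² - 6x + 3 = 0, so x = (3 + √3)/2 ≈ 2.36603 (the example uses a rounded value). A minimal Python check, using the standard library's `statistics.stdev`:

```python
from math import sqrt
from statistics import stdev

pair = [1, 2]
# Third value that keeps the sample SD unchanged: a root of 2x^2 - 6x + 3 = 0
x = (3 + sqrt(3)) / 2

print(round(x, 5))                       # 2.36603
print(round(stdev(pair), 4))             # 0.7071
print(round(stdev(pair + [x]), 4))       # 0.7071
print(round(stdev([1, 2, 2.36604]), 4))  # 0.7071 (rounded value from the example)
```

The other root, (3 - √3)/2 ≈ 0.63397, would work equally well.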
So, changing N led to a change in the mean, but left the standard deviation the same.
Does Removing An Outlier Affect Standard Deviation?
Removing an outlier generally affects standard deviation. In removing an outlier, we are changing the sample size N, the mean, and thus the standard deviation. An outlier is a data point that is far outside of the expected range of values (far higher or lower than other data points).
Example: Removing An Outlier Changes Standard Deviation
For the data set S = {1, 3, 98}, we have the following: N = 3, mean = 34, and sample standard deviation ≈ 55.43.
If we change the sample size by removing the third data point (98), we have: N = 2, mean = 2, and sample standard deviation = √2 ≈ 1.414.
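The effect of the outlier can be reproduced with Python's standard `statistics` module, where `stdev` is the sample standard deviation. This is a sketch for this one data set, not a general outlier-detection method.

```python
from statistics import mean, stdev

data = [1, 3, 98]
without_outlier = [x for x in data if x != 98]  # drop the outlier

print(f"mean={mean(data):.2f}, sd={stdev(data):.2f}")
# mean=34.00, sd=55.43
print(f"mean={mean(without_outlier):.2f}, sd={stdev(without_outlier):.2f}")
# mean=2.00, sd=1.41
```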
So, removing the outlier changed both the mean (from 34 to 2) and the standard deviation (from about 55.43 to about 1.41), both substantially.
Does Addition Affect Standard Deviation?
Addition of the same value to every data point does not affect standard deviation. However, it does affect the mean. This is because standard deviation measures how spread out the data points are. Adding the same value to every data point may give us larger values, but they are still spread out in exactly the same way (in other words, the distance between data points has not changed at all).
Example: Addition Does Not Change Standard Deviation
For the data set S = {1, 2, 3}, we have the following: N = 3, mean = 2, and standard deviation = 1.
If we add the same value of 5 to each data point, we have the data set {6, 7, 8}, with N = 3, mean = 7, and standard deviation = 1.
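The reason can be written out in one line. As a sketch, let c be the constant added to every data point; the symbols y_i, c, s_x and s_y are notation introduced here for illustration and do not appear in the original text.

```latex
y_i = x_i + c
\;\Rightarrow\;
\overline{y} = \overline{x} + c
\;\Rightarrow\;
y_i - \overline{y} = x_i - \overline{x}
\;\Rightarrow\;
s_y = \sqrt{\frac{\sum_{i=1}^N (y_i-\overline{y})^2}{N-1}}
    = \sqrt{\frac{\sum_{i=1}^N (x_i-\overline{x})^2}{N-1}} = s_x
```

The constant cancels in every deviation from the mean, so the standard deviation cannot change.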
So, adding 5 to all data points changed the mean (an increase of 5), but left the standard deviation unchanged (it is still 1).
Does Multiplication Affect Standard Deviation?
Multiplication affects standard deviation by a scaling factor. If we multiply every data point by a constant K, then the standard deviation is multiplied by |K| (by K itself when K is positive). The mean is scaled by the factor K.
Example: Multiplication Scales Standard Deviation By A Factor Of K
For the data set S = {1, 2, 3}, we have the following: N = 3, mean = 2, and standard deviation = 1.
If we multiply every point in the data set by a factor of K = 4, we have the data set {4, 8, 12}, with N = 3, mean = 8, and standard deviation = 4.
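The scaling rule follows directly from the formula. As a sketch, with y_i, s_x and s_y as notation introduced here for illustration:

```latex
y_i = K x_i
\;\Rightarrow\;
\overline{y} = K\,\overline{x}
\;\Rightarrow\;
y_i - \overline{y} = K\,(x_i - \overline{x})
\;\Rightarrow\;
s_y = \sqrt{\frac{K^2 \sum_{i=1}^N (x_i-\overline{x})^2}{N-1}} = |K|\, s_x
```

The absolute value appears because the factor K is squared inside the radical; for a positive K such as 4, the standard deviation is simply multiplied by K.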
So, multiplying by K = 4 also multiplied the mean by 4 (it went from 2 to 8) and multiplied standard deviation by 4 (it went from 1 to 4).
Does Changing Units Affect Standard Deviation?
Changing units affects standard deviation. A change of units typically involves multiplication by a constant K, so the standard deviation (and the mean) will also be scaled by K. Changing units of measurement (like inches instead of feet for height) will change standard deviation accordingly.
Example: Changing Units Changes Standard Deviation
For the data set S = {1, 2, 3} (units in feet), we have the following: N = 3, mean = 2 feet, and standard deviation = 1 foot.
To convert units from feet to inches, we multiply every point in the data set by a factor of K = 12, which gives the data set {12, 24, 36}, with N = 3, mean = 24 inches, and standard deviation = 12 inches.
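The unit conversion can be checked with a minimal Python sketch using the standard `statistics` module. The constant name `INCHES_PER_FOOT` and the variable names are illustrative.

```python
from statistics import mean, stdev

INCHES_PER_FOOT = 12

heights_ft = [1, 2, 3]
heights_in = [x * INCHES_PER_FOOT for x in heights_ft]

print(heights_in)  # [12, 24, 36]
print(f"{mean(heights_ft):.1f} ft -> {mean(heights_in):.1f} in")    # 2.0 ft -> 24.0 in
print(f"{stdev(heights_ft):.1f} ft -> {stdev(heights_in):.1f} in")  # 1.0 ft -> 12.0 in
```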
So, multiplying by K = 12 also multiplied the mean by 12 (it went from 2 to 24) and multiplied standard deviation by 12 (it went from 1 to 12).
Does Mean Affect Standard Deviation?
The mean enters the calculation of standard deviation: we add up the squared differences between every data point and the mean. However, a different mean does not necessarily give a different standard deviation, because standard deviation measures how far each data point is from the mean, not where the mean itself lies. For example, the data set {1, 3, 5} has the same standard deviation as the set {2, 4, 6} (all we did was add 1 to each data point in the first set to get the second set). See the example from earlier (adding 5 to every data point in the set {1, 2, 3}): the mean changes, but the standard deviation does not.
Conclusion
Now you know what affects standard deviation and what to consider about outliers and sample size. Texas A&M University has a resource on standard deviation calculations, and Penn State University has an article on how standard deviation can be used to measure the risk of a stock portfolio, based on variability of returns.
~Jonathon