The standard deviation measures the typical distance between each value and the mean (strictly, it is the square root of the average squared distance). This value tells you whether the data are clustered around the mean or scattered, and is therefore a key value for **assessing the reliability of the mean** as a precise or vague representation of the entire sample. Standard deviation can also be used to compare different samples which, although they have similar means, have values that are clustered or dispersed in different ways.

The formula for the standard deviation uses the following terms:

- $x_i$ is each value in the sample
- $\bar{x}$ is the mean
- $n$ is the number of values in the sample
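
For reference, one common form consistent with these definitions is the sample standard deviation, written here as $s$ (a symbol assumed for this sketch). It divides by $n - 1$; the population form divides by $n$ instead, and the text does not state which of the two its calculation uses.

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}{n - 1}}
```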

The significance of the standard deviation is assessed by comparing it to the mean:

- Low SD = the values are tightly clustered (the distribution curve is steep), and the mean is a reliable representation of the entire sample
- High SD = the values are widely scattered (the distribution curve is relatively flat), and the mean is NOT a reliable representation of the entire sample
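
To make the contrast concrete, here is a minimal Python sketch that compares two samples with the same mean but very different spread. The two lists are hypothetical values chosen for illustration and do not come from the text; `statistics.stdev` computes the sample standard deviation (dividing by n - 1).

```python
import statistics

# Hypothetical samples (not from the text): both have a mean of 50.
tight = [48, 49, 50, 51, 52]
spread = [10, 30, 50, 70, 90]

def describe(values):
    """Return (mean, sample SD, SD as a fraction of the mean)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, divides by n - 1
    return mean, sd, sd / mean

for name, values in (("tight", tight), ("spread", spread)):
    mean, sd, ratio = describe(values)
    print(f"{name}: mean = {mean}, SD = {sd:.2f}, SD/mean = {ratio:.2f}")
```

The two means are identical, but the SD-to-mean ratio is small for the first sample and large for the second, which is the comparison the two bullets describe.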

In a normally distributed sample (bell-curve):

- about 68% of all individuals lie within +/- 1 standard deviation of the mean
- about 95% of all individuals lie within +/- 2 standard deviations of the mean
- about 99.7% of all individuals lie within +/- 3 standard deviations of the mean
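
These percentages can be checked numerically. The sketch below assumes an ideal normal distribution and uses the standard identity that the fraction within k standard deviations of the mean is erf(k / sqrt(2)); the function name `fraction_within` is just an illustrative choice.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {fraction_within(k) * 100:.1f}%")
# within +/- 1 SD: 68.3%
# within +/- 2 SD: 95.4%
# within +/- 3 SD: 99.7%
```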

The z-score, *z*, represents the distance between a raw score *x* and the population mean in units of the standard deviation: *z* is negative when the raw score is below the mean and positive when it is above.

Example using the ages of a sample of men and women:

Sample | Ages Sampled | Mean | Standard Deviation
---|---|---|---
Men | 39, 45, 54, 66, 66, 66, 74 | 41.7 yrs | 18.98 yrs
Women | 57, 69, 70, 72, 75, 82, 83 | 51.6 yrs | 23.74 yrs

Conclusion:

- In both cases, the SD is very high (approximately half the value of the mean), which means that the values in both samples are widely scattered around their means. In both cases, the mean is therefore not a very reliable representation of the sample.
- If you pick a *random* man from the sample, there is a 68% probability (from the normal-distribution percentages above) that his age lies within 1 standard deviation of the mean, that is, within Mean +/- 1 SD = 41.7 +/- 18.98. Therefore, there is a 68% *probability* that he is at least 22.7 years old but no more than 60.7 years old.
- If you pick a *random* woman from the sample, there is a 95% probability (from the normal-distribution percentages above) that her age lies within 2 standard deviations of the mean, that is, within Mean +/- 2 SD = 51.6 +/- (2 x 23.74). Therefore, there is a 95% *probability* that she is at least 4.1 years old but no more than 99.1 years old.
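
The interval arithmetic in these conclusions, together with the z-score described earlier, can be sketched in a few lines of Python. The means and standard deviations are the values quoted in the text; the helper names `interval` and `z_score`, and the age of 66 used for the z-score, are assumptions made for illustration.

```python
def interval(mean: float, sd: float, k: float) -> tuple[float, float]:
    """Return the range mean - k*SD to mean + k*SD."""
    return mean - k * sd, mean + k * sd

def z_score(x: float, mean: float, sd: float) -> float:
    """Distance of x from the mean, in units of the standard deviation."""
    return (x - mean) / sd

# Means and SDs as quoted in the text (years).
men_mean, men_sd = 41.7, 18.98
women_mean, women_sd = 51.6, 23.74

low, high = interval(men_mean, men_sd, 1)
print(f"men, +/- 1 SD: {low:.1f} to {high:.1f}")    # 22.7 to 60.7

low, high = interval(women_mean, women_sd, 2)
print(f"women, +/- 2 SD: {low:.1f} to {high:.1f}")  # 4.1 to 99.1

# Illustrative only: z-score of a 66-year-old relative to the men's values.
print(f"z = {z_score(66, men_mean, men_sd):.2f}")   # z = 1.28
```

A positive z of about 1.28 means that age sits a little more than one standard deviation above the men's mean. The 68% and 95% figures attached to these ranges hold only to the extent that the ages are roughly normally distributed.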