Measures of Variability/Dispersion
The measures of central tendency (mean, median and mode) give us an idea of how the observations concentrate around the central part of a distribution. But these averages do not describe how the observations are spread out around that centre.
Ex: The marks obtained by two students in a test (six subjects each) are as follows:
A = 41, 46, 54, 40, 50, 51 . . . Average = 282/6 = 47
B = 30, 33, 32, 48, 69, 70 . . . Average = 282/6 = 47
Here the average marks of A and B are the same, but we cannot say that their performance is the same, because the average does not show how each student performed in each subject. Student 'A' scored 40 or above in every subject, but 'B' did not. Therefore, the performance of student 'A' is better (more consistent) than that of 'B'.
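As a quick check of this example, the following Python sketch computes the mean and the spread (highest minus lowest mark) for both students. The variable names are chosen only for illustration.

```python
from statistics import mean

# Marks of the two students, taken from the example above
marks_a = [41, 46, 54, 40, 50, 51]
marks_b = [30, 33, 32, 48, 69, 70]

for name, marks in (("A", marks_a), ("B", marks_b)):
    spread = max(marks) - min(marks)  # highest mark minus lowest mark
    print(name, "mean =", mean(marks), "spread =", spread)
```

Both students have the same mean, but the marks of B are spread over a much wider interval than those of A.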
Measures of Variability
There are 4 measures of variability, namely:
1. Range (R)
2. Quartile Deviation (Q.D)
3. Mean/Average Deviation (M.D)
4. Standard Deviation (S.D)
1. RANGE (R)
It is defined as the difference between the highest and lowest scores.
Range = Highest score – Lowest score
R = H – L
It is one of the least reliable measures of variability, because it is affected by fluctuations in the extreme scores. Its only merit is that it can be easily calculated and readily understood.
Co-efficient of Range
It is defined as the ratio of the difference between the highest and lowest scores to the sum of the highest and lowest scores.
i.e. Co-efficient of Range = (H – L) / (H + L)
For non-negative scores, the co-efficient of range is a number between 0 and 1. The scores are more consistent if the co-efficient of range is very near to 0, and not consistent if it is near to 1.
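The definition can be sketched in Python as follows. The function name `coefficient_of_range` is an assumption made for illustration, and the marks are those of students A and B from the opening example.

```python
def coefficient_of_range(scores):
    """Return (H - L) / (H + L) for a list of non-negative scores."""
    highest, lowest = max(scores), min(scores)
    return (highest - lowest) / (highest + lowest)

marks_a = [41, 46, 54, 40, 50, 51]
marks_b = [30, 33, 32, 48, 69, 70]

coeff_a = coefficient_of_range(marks_a)  # 14 / 94
coeff_b = coefficient_of_range(marks_b)  # 40 / 100
print(round(coeff_a, 3), round(coeff_b, 3))
```

The value for A is nearer to 0 than the value for B, which agrees with the observation that A's marks are more consistent.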
Merits: It is simple to understand and easy to calculate.
Limitations
1. It helps us to make only a rough comparison of two or more groups for variability.
2. It takes account of only the two extreme scores of a series and is unreliable when N is small or when there are large gaps (i.e., zero f's) in the frequency distribution.
3. It is affected greatly by fluctuations in sampling. Its value is never stable. In a class where the height of students normally ranges from 150 cm to 180 cm, if a student whose height is 90 cm is admitted, the range would shoot up from 30 cm (180 – 150) to 90 cm (180 – 90).
4. The range does not take into account the composition of a series or the distribution of the items within the extremes. The range of a symmetrical and an asymmetrical distribution can be identical.
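The third limitation can be illustrated with a short Python sketch. Only the extremes (150 cm, 180 cm) and the added height of 90 cm come from the text; the intermediate heights are hypothetical values made up for the illustration.

```python
def value_range(scores):
    """Range R = H - L."""
    return max(scores) - min(scores)

# 150 and 180 are the extremes from the text; the other heights are hypothetical
heights = [150, 158, 165, 172, 180]
range_before = value_range(heights)

# Admit one student whose height is 90 cm
range_after = value_range(heights + [90])

print(range_before, range_after)
```

A single extreme score changes the range, even though all the other scores are unchanged.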
Use of Range
1. When a knowledge of extreme scores is all that is wanted.
2. When the data are too scant or too scattered to justify the computation of a more precise measure of variability.
3. Quality control.
4. In studying the fluctuation in prices.
5. Weather forecast.
6. Day-to-day activities like sales in a shop, earnings of a family in a week, etc.
QUARTILE DEVIATION (Q.D) or Q (Semi-interquartile Range)
Range tells us only the difference between the highest and lowest scores in the distribution. The inter-quartile range measures approximately how far from the median we must go on either side to include one half (50%) of the scores in the given set of data. To compute this range, we divide the given data into four equal parts, each of which contains 25% of the items in the distribution. The quartiles are thus the highest values in each of these 4 parts.
[Figure: the distribution divided into four equal parts by Q1, Q2 and Q3 (with Q4 at the upper end); the inter-quartile range is Q3 – Q1.]
First Quartile: Q1 = value of the ((N + 1)/4)th item
Second Quartile (median): Q2 = value of the (2 × (N + 1)/4)th item
Third Quartile: Q3 = value of the (3 × (N + 1)/4)th item
The inter-quartile range is the difference between Q3 and Q1, i.e., (Q3 – Q1).
One half of the inter-quartile range is a measure called the Quartile Deviation.
Q.D = (Q3 – Q1)/2
Quartile Deviation (Q.D): worked problems
1. Find the Q.D for the following ungrouped data
25, 29, 36, 42, 48, 56, 62, 65, 67, 70, 72

| Sl. No. | X |
|---|---|
| 01 | 25 |
| 02 | 29 |
| 03 | 36 |
| 04 | 42 |
| 05 | 48 |
| 06 | 56 |
| 07 | 62 |
| 08 | 65 |
| 09 | 67 |
| 10 | 70 |
| 11 | 72 |

N = 11
Q1 = (N + 1)/4 th item = (11 + 1)/4 = 12/4 = 3rd item = 36
Q2 = 2 × (N + 1)/4 th item = 2 × 3 = 6th item = 56
Q3 = 3 × (N + 1)/4 th item = 3 × 3 = 9th item = 67
Q.D = (Q3 – Q1)/2 = (67 – 36)/2 = 31/2 = 15.5
Q.D = 15.5
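The positional rule used in this problem can be sketched in Python as below. The function name `quartile_item` is hypothetical, and the linear interpolation for fractional positions is an assumption: the worked problem only covers the case where (N + 1)/4 is a whole number.

```python
def quartile_item(sorted_scores, k):
    """Return the k-th quartile (k = 1, 2, 3) using the k(N + 1)/4 position rule."""
    position = k * (len(sorted_scores) + 1) / 4  # 1-based position in the sorted list
    whole = int(position)
    fraction = position - whole
    value = sorted_scores[whole - 1]
    if fraction:
        # Assumption: interpolate linearly toward the next item for fractional positions
        value += fraction * (sorted_scores[whole] - value)
    return value

scores = sorted([25, 29, 36, 42, 48, 56, 62, 65, 67, 70, 72])
q1 = quartile_item(scores, 1)  # 3rd item
q3 = quartile_item(scores, 3)  # 9th item
qd = (q3 - q1) / 2
print(q1, q3, qd)
```

Other quartile conventions exist (statistical libraries offer several), so results for small data sets can differ slightly from this rule.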
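2. Find the Q.D for the following data (discrete frequency distribution)

| X | f | F |
|---|---|---|
| 10 | 4 | 4 |
| 20 | 7 | 11 |
| 30 | 15 | 26 |
| 40 | 8 | 34 |
| 50 | 7 | 41 |
| 80 | 2 | 43 |

N = 43
Q1 = (N + 1)/4 th item = (43 + 1)/4 = 44/4 = 11th item
The 11th item falls at the cumulative frequency F = 11, i.e., Q1 = 20
Q3 = 3 × (N + 1)/4 th item = 3 × 11 = 33rd item
The 33rd item falls within the cumulative frequency F = 34, i.e., Q3 = 40
Q.D = (Q3 – Q1)/2 = (40 – 20)/2 = 20/2 = 10
Q.D = 10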
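For a frequency table like this one, the quartile item can be located by walking down the cumulative frequencies. The sketch below is one way to do that in Python; the helper name `value_at_position` is an assumption made for illustration.

```python
from itertools import accumulate

def value_at_position(values, freqs, position):
    """Return the value whose cumulative frequency first reaches `position`."""
    for value, cumulative in zip(values, accumulate(freqs)):
        if cumulative >= position:
            return value
    raise ValueError("position exceeds the total frequency")

values = [10, 20, 30, 40, 50, 80]
freqs = [4, 7, 15, 8, 7, 2]
n = sum(freqs)

q1 = value_at_position(values, freqs, (n + 1) / 4)      # 11th item
q3 = value_at_position(values, freqs, 3 * (n + 1) / 4)  # 33rd item
qd = (q3 - q1) / 2
print(n, q1, q3, qd)
```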
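3. Find the Q.D for the following grouped data

| C.I | f | F |
|---|---|---|
| 70 – 79 | 14 | 150 |
| 60 – 69 | 16 | 136 |
| 50 – 59 | 40 | 120 |
| 40 – 49 | 10 | 80 |
| 30 – 39 | 0 | 70 |
| 20 – 29 | 20 | 70 |
| 10 – 19 | 40 | 50 |
| 0 – 9 | 10 | 10 |

N = 150
Position of Q1 = (N + 1)/4 = (150 + 1)/4 = 151/4 = 37.75, which falls in the class 10 – 19.
Position of Q3 = 3 × (N + 1)/4 = 3 × 37.75 = 113.25, which falls in the class 50 – 59.
The quartiles are then found by interpolation within these classes:
Q1 = L + ((N/4 – F)/fm) × i
Q3 = L + ((3N/4 – F)/fm) × i
where L is the exact lower limit of the quartile class, F is the cumulative frequency below that class, fm is the frequency of that class and i is the class width.
Q1 = 9.5 + ((37.5 – 10)/40) × 10
= 9.5 + 27.5/4
= 9.5 + 6.875
Q1 = 16.375
Q3 = 49.5 + ((112.5 – 80)/40) × 10
= 49.5 + 32.5/4
= 49.5 + 8.125
Q3 = 57.625
Q.D = (Q3 – Q1)/2 = (57.625 – 16.375)/2 = 41.25/2 = 20.625
Q.D = 20.625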
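The interpolation formula Q = L + ((kN/4 – F)/fm) × i used in this problem can be sketched in Python as follows. The function name `grouped_quartile` and the choice to list the classes from lowest to highest are assumptions made for the illustration; the exact lower limits (–0.5, 9.5, ...) follow from the class intervals in the table.

```python
def grouped_quartile(lower_limits, freqs, width, k):
    """k-th quartile of grouped data: L + ((k*N/4 - F) / fm) * i."""
    n = sum(freqs)
    target = k * n / 4
    below = 0  # cumulative frequency F below the current class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and below + f >= target:
            return lower + (target - below) / f * width
        below += f
    raise ValueError("quartile class not found")

# Classes 0-9, 10-19, ..., 70-79 listed from lowest to highest
lower_limits = [-0.5, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
freqs = [10, 40, 20, 0, 10, 40, 16, 14]

q1 = grouped_quartile(lower_limits, freqs, 10, 1)
q3 = grouped_quartile(lower_limits, freqs, 10, 3)
qd = (q3 - q1) / 2
print(q1, q3, qd)
```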
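Exercise problems: find the Q.D for the following grouped data.

4)

| C.I | f |
|---|---|
| 90 – 99 | 3 |
| 80 – 89 | 9 |
| 70 – 79 | 18 |
| 60 – 69 | 10 |
| 50 – 59 | 6 |
| 40 – 49 | 4 |

N = 50
Ans: Q.D = 8.47

5)

| C.I | f |
|---|---|
| 55 – 59 | 2 |
| 50 – 54 | 4 |
| 45 – 49 | 8 |
| 40 – 44 | 10 |
| 35 – 39 | 15 |
| 30 – 34 | 25 |
| 25 – 29 | 30 |
| 20 – 24 | 24 |
| 15 – 19 | 16 |
| 10 – 14 | 8 |
| 5 – 9 | 5 |
| 0 – 4 | 3 |

N = 150
Ans: Q.D = 7.18

6)

| C.I | f |
|---|---|
| 91 – 100 | 9 |
| 81 – 90 | 10 |
| 71 – 80 | 15 |
| 61 – 70 | 20 |
| 51 – 60 | 12 |
| 41 – 50 | 8 |
| 31 – 40 | 4 |

N = 78
Ans: Q.D = 11.63

7)

| C.I | f |
|---|---|
| 55 – 61 | 8 |
| 48 – 54 | 10 |
| 41 – 47 | 15 |
| 34 – 40 | 30 |
| 27 – 33 | 12 |
| 20 – 26 | 9 |
| 13 – 19 | 6 |

N = 90
Ans: Q.D = 7.26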
Merits of Quartile Deviation
1. It is a more representative and trustworthy measure of variability than the overall range.
2. It is a good index of score density at the middle of the distribution.
3. Quartiles are useful in indicating the skewness of a distribution:
Q3 – Q2 > Q2 – Q1 → indicates +ve skewness.
Q3 – Q2 < Q2 – Q1 → indicates –ve skewness.
Q3 – Q2 = Q2 – Q1 → indicates zero skewness.
4. Like the median, Q.D is applicable to open-end distributions.
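The quartile comparison in merit 3 can be sketched as a small Python function. The function name is hypothetical; the quartiles used in the example are those of the 11-score problem solved earlier (Q1 = 36, Q2 = 56, Q3 = 67).

```python
def quartile_skewness(q1, q2, q3):
    """Classify skewness by comparing (Q3 - Q2) with (Q2 - Q1)."""
    upper, lower = q3 - q2, q2 - q1
    if upper > lower:
        return "positive"
    if upper < lower:
        return "negative"
    return "zero"

# Q3 - Q2 = 11 and Q2 - Q1 = 20, so the upper gap is the smaller one
direction = quartile_skewness(36, 56, 67)
print(direction)
```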
Limitations of Q.D
1. It is not capable of further algebraic treatment.
2. It is possible for two distributions to have equal Q.D but quite dissimilar variability in the lower and upper 25% of scores.
3. It is affected to a considerable extent by fluctuations in sampling. A change in the value of a single item may, in certain cases, affect its value considerably.
Use the Quartile Deviation
1. When the median is the measure of central tendency.
2. When the distribution is incomplete at either end.
3. When there are scattered or extreme scores which would disproportionately influence the S.D.
4. When the concentration around the median (the middle 50% of scores) is of primary interest.
STANDARD DEVIATION (S.D)
Standard deviation is the "square root of the mean of the squares of individual deviations from the mean in a series." (James Drever)
Variance (σ²) = ∑fd²/N
S.D (σ) = √(∑fd²/N)
Short-cut method:
S.D = √(∑d²/N – (∑d/N)²) or
S.D = i × √(∑fd²/N – (∑fd/N)²)
Steps to find S.D (Long method)
1. Find the mean using the formula M = ∑fX/N.
2. Calculate the value of 'd', i.e., d = X – M, for all the values of X.
3. Square the value of 'd', i.e., d².
4. Find ∑fd².
5. Calculate σ² = ∑fd²/N; this is the variance of the distribution.
6. Take the positive square root of the variance to get the standard deviation of the distribution, i.e., S.D = √(∑fd²/N).
Problems:
1. Find the S.D for the following ungrouped data:
6, 8, 10, 12, 14

| Scores (X) | 6 | 8 | 10 | 12 | 14 |
|---|---|---|---|---|---|
| Deviations (d = X – M) | –4 | –2 | 0 | 2 | 4 |
| d² | 16 | 4 | 0 | 4 | 16 |

Mean: M = 50/5 = 10
∑d² = 40
S.D = √(∑d²/N) = √(40/5) = √8 = 2.83
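The long method for ungrouped data can be sketched in Python as below. The function name `standard_deviation` is an assumption made for illustration; it divides by N, as in the steps above, which matches `statistics.pstdev` from the Python standard library.

```python
from math import sqrt
from statistics import pstdev

def standard_deviation(scores):
    """Long method: square root of the mean of squared deviations from the mean."""
    n = len(scores)
    m = sum(scores) / n                                # step 1: mean
    sum_d2 = sum((x - m) ** 2 for x in scores)         # steps 2-4: d, d^2 and their sum
    return sqrt(sum_d2 / n)                            # steps 5-6: variance, then its root

scores = [6, 8, 10, 12, 14]
sd = standard_deviation(scores)
print(round(sd, 2))
```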
2. Find the S.D for the following distribution

| X | f | fX | d = X – M | d² | fd² |
|---|---|---|---|---|---|
| 5 | 1 | 5 | –9.7 | 94.09 | 94.09 |
| 10 | 2 | 20 | –4.7 | 22.09 | 44.18 |
| 12 | 3 | 36 | –2.7 | 7.29 | 21.87 |
| 14 | 12 | 168 | –0.7 | 0.49 | 5.88 |
| 15 | 4 | 60 | 0.3 | 0.09 | 0.36 |
| 17 | 5 | 85 | 2.3 | 5.29 | 26.45 |
| 22 | 3 | 66 | 7.3 | 53.29 | 159.87 |

N = 30, ∑fX = 440, ∑fd² = 352.7
M = ∑fX/N = 440/30 = 14.7
S.D = √(∑fd²/N) = √(352.7/30) = √11.757 = 3.43
S.D = 3.43
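The same long method applied to a frequency distribution can be sketched in Python as follows. The function name `frequency_sd` is hypothetical. The sketch keeps the mean unrounded (440/30), whereas the hand calculation rounds it to 14.7; to two decimal places the result is about 3.43.

```python
from math import sqrt

def frequency_sd(values, freqs):
    """S.D = sqrt(sum(f * d^2) / N) with d = X - M, for a frequency distribution."""
    n = sum(freqs)
    m = sum(f * x for x, f in zip(values, freqs)) / n
    sum_fd2 = sum(f * (x - m) ** 2 for x, f in zip(values, freqs))
    return sqrt(sum_fd2 / n)

values = [5, 10, 12, 14, 15, 17, 22]
freqs = [1, 2, 3, 12, 4, 5, 3]
sd = frequency_sd(values, freqs)
print(round(sd, 2))
```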
Finding S.D by short-cut method
Procedure:
1. Assume a value for the mean (AM).
2. Lay off the deviation (d) of each class from the AM in units of the class interval.
3. Find fd by multiplying the frequencies by the deviations for each C.I. Add the products.
4. Find fd² by multiplying d by fd. Add the products.
5. Use the formula S.D = i × √(∑fd²/N – (∑fd/N)²), where i is the width of the class interval.
1. Find S.D from the following grouped data

| C.I | x | f | d | fd | fd² |
|---|---|---|---|---|---|
| 10 – 14 | 12 | 3 | –3 | –9 | 27 |
| 15 – 19 | 17 | 5 | –2 | –10 | 20 |
| 20 – 24 | 22 | 9 | –1 | –9 | 9 |
| 25 – 29 | 27 | 18 | 0 | 0 | 0 |
| 30 – 34 | 32 | 11 | 1 | 11 | 11 |
| 35 – 39 | 37 | 5 | 2 | 10 | 20 |
| 40 – 44 | 42 | 6 | 3 | 18 | 54 |
| 45 – 49 | 47 | 2 | 4 | 8 | 32 |
| 50 – 54 | 52 | 1 | 5 | 5 | 25 |

The assumed mean is 27 (the mid-point of the class 25 – 29) and i = 5.
Sum of the negative fd = –28; sum of the positive fd = 52.
N = 60, ∑fd = 24, ∑fd² = 198
S.D = i × √(∑fd²/N – (∑fd/N)²)
= 5 × √(198/60 – (24/60)²)
= 5 × √(3.3 – (0.4)²)
= 5 × √(3.3 – 0.16)
= 5 × √3.14
= 5 × 1.772
= 8.86
S.D = 8.86
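The short-cut procedure can be sketched in Python as below. The function name `shortcut_sd` and its parameter names are assumptions made for illustration; the assumed mean 27 is the mid-point of the class 25 – 29, where d = 0 in the table. Without intermediate rounding the result is approximately 8.86.

```python
from math import sqrt

def shortcut_sd(midpoints, freqs, assumed_mean, width):
    """S.D = i * sqrt(sum(fd^2)/N - (sum(fd)/N)^2), with d in class-interval units."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints]   # step deviations
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    return width * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

midpoints = [12, 17, 22, 27, 32, 37, 42, 47, 52]
freqs = [3, 5, 9, 18, 11, 5, 6, 2, 1]
sd = shortcut_sd(midpoints, freqs, assumed_mean=27, width=5)
print(round(sd, 2))
```

The choice of assumed mean does not affect the result: the correction term (∑fd/N)² compensates for the difference between the assumed mean and the true mean.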
Exercise problems
Find the S.D for the following

1.

| X | 14 | 20 | 30 | 40 | 50 |
|---|---|---|---|---|---|
| f | 5 | 6 | 12 | 8 | 9 |

Ans: S.D = 12.2

2.

| C.I | f |
|---|---|
| 16 – 20 | 3 |
| 21 – 25 | 2 |
| 26 – 30 | 4 |
| 31 – 35 | 5 |
| 36 – 40 | 7 |
| 41 – 45 | 2 |
| 46 – 50 | 2 |

Ans: 8.485

3.

| C.I | f |
|---|---|
| 1 – 5 | 2 |
| 6 – 10 | 3 |
| 11 – 15 | 4 |
| 16 – 20 | 1 |

Ans: 4.6

4.
X: 26, 27, 32, 35, 40, 45, 49, 54, 60, 72
Ans: 14.2
Merits of S.D
1. S.D is rigidly defined and its value is always definite.
2. It is based on all the observations of the data.
3. It is amenable to algebraic treatment and possesses many mathematical properties. This is why it is used in many advanced studies.
4. It is less affected by fluctuations in sampling than most other measures of variability.
Limitations of S.D
1. It is difficult to understand and interpret.
2. It gives more weight to extreme items and less to those which are near the mean, because the squares of large deviations are proportionately greater than the squares of small ones.
Use the S.D
1. When a measure having the greatest stability and reliability is sought.
2. When extreme deviations should exercise a proportionately greater effect upon variability.
3. When the coefficient of correlation and other statistics are subsequently to be computed.
4. When interpretations related to the normal probability curve are desired.