askvity

What is the standard deviation formula for grouped data using assumed mean?

Published in Statistics Formula

The standard deviation of grouped data measures how far the observations spread around the mean, weighting each class by its frequency. The assumed mean method computes it from deviations taken about a convenient reference value (the assumed mean) instead of the true mean.

Understanding the Assumed Mean Method

The assumed mean method simplifies calculations by selecting an arbitrary reference value ("A") within the range of the data. Deviations are measured from A rather than from the true mean, which keeps the deviations and their squares small. This makes the mean and the standard deviation less cumbersome to compute by hand, particularly when the data values are large.
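Why measuring from A is harmless can be sketched in a few lines. The notation here is an assumption made for illustration: x_i are the class marks, f_i the frequencies, N = Σf_i, and d_i = x_i - A. Under that notation the mean is recovered from the deviations, and the spread does not depend on A:

```latex
\begin{align*}
\bar{d}  &= \frac{\sum f_i d_i}{N}, \qquad \bar{x} = A + \bar{d}, \\
\sigma^2 &= \frac{\sum f_i (x_i - \bar{x})^2}{N}
          = \frac{\sum f_i (d_i - \bar{d})^2}{N}
          = \frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2 .
\end{align*}
```

The middle step uses x_i - x̄ = d_i - d̄, since A cancels.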

Formula Breakdown

The standard deviation formula for grouped data using the assumed mean method involves the following steps and components:

  1. Assumed Mean (A): Select a convenient value near the middle of the data, usually one of the class marks. This keeps the deviations from it small and easy to work with.

  2. Deviations (d): The deviation of each class mark (mid-point of each class interval) from the assumed mean is computed using the formula:

    • d = x - A
    • Where:
      • 'x' is the mid-point (class mark) of each group/class.
      • 'A' is the assumed mean.

  3. Frequency (f): The frequency of each class, denoted 'f', is the number of data values that fall in that class.

  4. Calculations:

    • fd: For each group, multiply the frequency (f) by the corresponding deviation (d).
    • fd²: For each group, multiply the frequency (f) by the square of the deviation (d²).
    • Σfd: Calculate the sum of all 'fd' values.
    • Σfd²: Calculate the sum of all 'fd²' values.
    • Σf: Calculate the sum of all frequencies, which gives the total number of observations.

  5. Formula:

The standard deviation (σ) for grouped data with assumed mean can be written as:

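σ = √[ (Σfd²/Σf) - (Σfd/Σf)² ]

  • Where:
    • σ represents the standard deviation.
    • Σfd² is the sum of the products of each frequency and the square of its deviation.
    • Σfd is the sum of the products of each frequency and its deviation.
    • Σf is the sum of the frequencies (the total number of data points).
    • The class width h does not appear in this formula, because d = x - A is measured in the original units. A factor of h is needed only in the step deviation variant, where each deviation is first divided by h.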
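The calculation can be sketched in Python as follows. The function name `grouped_std_assumed_mean` and the three-class data set are hypothetical, chosen only for illustration. The sketch takes d = x - A in the original units, so no class-width factor is applied.

```python
from math import sqrt

def grouped_std_assumed_mean(midpoints, freqs, assumed_mean):
    """Population standard deviation of grouped data via the assumed mean method."""
    # Deviation of each class mark from the assumed mean: d = x - A
    devs = [x - assumed_mean for x in midpoints]
    n = sum(freqs)                                          # Σf
    sum_fd = sum(f * d for f, d in zip(freqs, devs))        # Σfd
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, devs))   # Σfd²
    return sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

# Hypothetical data: class marks 5, 15, 25 with frequencies 2, 6, 2
sigma = grouped_std_assumed_mean([5, 15, 25], [2, 6, 2], assumed_mean=15)
print(round(sigma, 2))  # 6.32
```

Passing a different `assumed_mean` (for example 5) returns the same value, since the choice of A only shifts the deviations.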

Step-by-Step Example:

Let's illustrate with an example (simplified for clarity).

| Class Interval | Midpoint (x) | Frequency (f) | Deviation (d = x - A, A = 35) | fd | fd² |
|---|---|---|---|---|---|
| 10 - 20 | 15 | 5 | -20 | -100 | 2000 |
| 20 - 30 | 25 | 10 | -10 | -100 | 1000 |
| 30 - 40 | 35 | 15 | 0 | 0 | 0 |
| 40 - 50 | 45 | 8 | 10 | 80 | 800 |
| 50 - 60 | 55 | 2 | 20 | 40 | 800 |
| Totals | | 40 | | -80 | 4600 |
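The columns and totals of this table can be reproduced with a short script. This is a minimal sketch using the table's own midpoints, frequencies, and A = 35. The variable names are illustrative.

```python
midpoints = [15, 25, 35, 45, 55]   # class marks x
freqs = [5, 10, 15, 8, 2]          # frequencies f
A = 35                             # assumed mean

rows = []
for x, f in zip(midpoints, freqs):
    d = x - A                      # deviation from the assumed mean
    rows.append((x, f, d, f * d, f * d * d))

total_f = sum(r[1] for r in rows)    # Σf
total_fd = sum(r[3] for r in rows)   # Σfd
total_fd2 = sum(r[4] for r in rows)  # Σfd²

for r in rows:
    print(*r, sep="\t")
print("Totals", total_f, total_fd, total_fd2)  # Totals 40 -80 4600
```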

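Here the assumed mean is A = 35 and the class width is h = 10. Because the deviations d = x - A are already in the original units, h is not used as a multiplier. It would be needed only if each deviation had first been divided by h (the step deviation method).

The standard deviation is then:

σ = √[ (4600/40) - (-80/40)² ]
σ = √[ 115 - 4 ]
σ = √111
σ ≈ 10.54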
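As a sanity check, the example can be compared with the direct definition of the standard deviation, computed about the actual mean. The sketch below uses the table's data. With d = x - A in original units, both routes give about 10.54, and no factor of h is involved.

```python
from math import sqrt

midpoints = [15, 25, 35, 45, 55]
freqs = [5, 10, 15, 8, 2]
A = 35
n = sum(freqs)

# Assumed mean route: σ = √[Σfd²/Σf - (Σfd/Σf)²]
sum_fd = sum(f * (x - A) for x, f in zip(midpoints, freqs))
sum_fd2 = sum(f * (x - A) ** 2 for x, f in zip(midpoints, freqs))
sigma_assumed = sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

# Direct route: recover the actual mean, then average the squared deviations from it
mean = A + sum_fd / n
sigma_direct = sqrt(sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n)

print(mean, round(sigma_assumed, 2), round(sigma_direct, 2))  # 33.0 10.54 10.54
```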

Key Considerations

  • The choice of assumed mean affects only the ease of calculation, not the result. The closer A is to the actual mean, the smaller the deviations and the easier the manual arithmetic.
  • The step deviation method is an extension in which each deviation is divided by the class width, d' = (x - A)/h, and the result is multiplied back by h: σ = h × √[ (Σfd'²/Σf) - (Σfd'/Σf)² ]. It further simplifies manual calculation when all classes have the same width.
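Both points can be illustrated with a small sketch. The function name `grouped_std_step_deviation` is hypothetical, and equal class widths are assumed. It divides each deviation by h, applies the same formula, and multiplies by h at the end. The data are the example's midpoints and frequencies.

```python
from math import sqrt

def grouped_std_step_deviation(midpoints, freqs, assumed_mean, width):
    """Population standard deviation via step deviations d' = (x - A) / h."""
    steps = [(x - assumed_mean) / width for x in midpoints]
    n = sum(freqs)
    mean_step = sum(f * u for f, u in zip(freqs, steps)) / n         # Σfd'/Σf
    mean_step_sq = sum(f * u * u for f, u in zip(freqs, steps)) / n  # Σfd'²/Σf
    # Multiply by h to return to the original units
    return width * sqrt(mean_step_sq - mean_step ** 2)

midpoints = [15, 25, 35, 45, 55]
freqs = [5, 10, 15, 8, 2]
for a in (25, 35, 45):
    # Each choice of assumed mean prints the same value, 10.54
    print(a, round(grouped_std_step_deviation(midpoints, freqs, a, 10), 2))
```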
