The Kish formula estimates the effective sample size of a survey sample whose observations carry unequal weights. It quantifies the loss of statistical efficiency caused by weighting.
The formula, typically attributed to statistician Leslie Kish, is:
$$
n_{\text{eff}} = \frac{\left(\sum_{i=1}^{n} w_i\right)^2}{\sum_{i=1}^{n} w_i^2}
$$
Where:
- $n$ is the total sample size.
- $w_i$ is the weight assigned to the $i$-th observation.
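The formula translates almost directly into code. The following is a minimal sketch in Python; the function name `kish_effective_sample_size` and its input checks are illustrative choices made here, not part of any particular library.

```python
from typing import Iterable


def kish_effective_sample_size(weights: Iterable[float]) -> float:
    """Return (sum of weights)^2 / (sum of squared weights).

    Hypothetical helper written for illustration only.
    """
    w = [float(x) for x in weights]
    if not w:
        raise ValueError("weights must not be empty")
    if any(x < 0 for x in w):
        raise ValueError("weights must be non-negative")

    sum_w = sum(w)                      # numerator term: sum of w_i
    sum_w_sq = sum(x * x for x in w)    # denominator: sum of w_i^2
    if sum_w_sq == 0:
        raise ValueError("at least one weight must be positive")

    return sum_w ** 2 / sum_w_sq
```

Because the weights appear squared in both the numerator and the denominator, multiplying every weight by the same positive constant leaves the result unchanged, so the weights do not need to be normalized first.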
Explanation:
This formula calculates the effective sample size ($n_{\text{eff}}$). If all weights are equal (i.e., no weighting is applied), the effective sample size equals the actual sample size. If the weights differ at all, the effective sample size is smaller than the actual sample size, reflecting the reduced precision of the estimates. The more the weights vary, the smaller the effective sample size becomes.
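One way to see why more variable weights shrink the effective sample size is to rewrite the formula in terms of the mean and variance of the weights. The symbols $\bar{w}$, $s_w^2$ and $\mathrm{CV}_w$ are introduced here for illustration and do not appear in the original formula; $s_w^2$ is taken to be the population variance (divisor $n$).

$$
\bar{w} = \frac{1}{n}\sum_{i=1}^{n} w_i,
\qquad
s_w^2 = \frac{1}{n}\sum_{i=1}^{n} w_i^2 - \bar{w}^2
$$

$$
n_{\text{eff}}
= \frac{\left(\sum_{i=1}^{n} w_i\right)^2}{\sum_{i=1}^{n} w_i^2}
= \frac{n^2\,\bar{w}^2}{n\left(s_w^2 + \bar{w}^2\right)}
= \frac{n}{1 + s_w^2 / \bar{w}^2}
= \frac{n}{1 + \mathrm{CV}_w^2}
$$

With equal weights, $s_w^2 = 0$ and $n_{\text{eff}} = n$. Any spread in the weights makes $\mathrm{CV}_w^2$ positive and pushes $n_{\text{eff}}$ below $n$.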
Why is the Kish Formula Important?
- Assessing the Impact of Weighting: It helps researchers understand how much the variability in weights affects the statistical power of their analyses.
- Sample Size Planning: It can inform sample size planning by accounting for anticipated variations in weights.
- Data Quality Evaluation: A significantly reduced effective sample size compared to the actual sample size can signal potential issues with the weighting scheme or data collection process.
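For the sample size planning point, one possible approach uses the equivalent form of the Kish formula, $n_{\text{eff}} = n / (1 + \mathrm{CV}_w^2)$, where $\mathrm{CV}_w$ is the coefficient of variation of the weights. The sketch below inflates a target effective sample size by that factor. The function name `required_sample_size` and the example numbers are hypothetical, and the anticipated coefficient of variation would have to come from prior surveys or from the planned design.

```python
import math


def required_sample_size(target_n_eff: float, weight_cv: float) -> int:
    """Nominal sample size needed to reach a target effective sample size.

    Hypothetical helper: assumes the only efficiency loss comes from
    unequal weights, with an anticipated coefficient of variation
    `weight_cv`.
    """
    if target_n_eff <= 0:
        raise ValueError("target_n_eff must be positive")
    if weight_cv < 0:
        raise ValueError("weight_cv must be non-negative")

    # Inflation factor implied by the Kish formula: 1 + CV^2.
    inflation = 1.0 + weight_cv ** 2
    # Round up so the target is met rather than narrowly missed.
    return math.ceil(target_n_eff * inflation)


# Illustrative numbers only: a target of 400 with an anticipated CV of 0.5.
print(required_sample_size(400, 0.5))
```

This accounts only for weight variability. Other design features, such as clustering, would need to be handled separately.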
Example
Imagine a survey with 100 respondents. Let's say some respondents have a weight of 1, and others have a weight of 2.
If all respondents had a weight of 1:
- The sum of the weights would be 100.
- The sum of the squares of the weights would also be 100.
- The Kish formula would yield $100^2 / 100 = 100$, so the effective sample size would be 100 (equal to the actual sample size).
Now, suppose 50 respondents have a weight of 1, and 50 have a weight of 2.
- The sum of the weights would be $(50 \times 1) + (50 \times 2) = 150$.
- The sum of the squares of the weights would be $(50 \times 1^2) + (50 \times 2^2) = 50 + 200 = 250$.
- The Kish formula would yield $150^2 / 250 = 22500 / 250 = 90$. The effective sample size would only be 90, indicating a reduction in precision due to unequal weighting.
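The arithmetic in this example can be checked with a few lines of Python. The sketch below computes the result two ways: directly from the formula, and through the algebraically equivalent form $n / (1 + \mathrm{CV}_w^2)$, where $\mathrm{CV}_w$ is the coefficient of variation of the weights (population variance). The variable names are illustrative.

```python
import math
import statistics

# Weights from the example: 50 respondents at weight 1, 50 at weight 2.
weights = [1.0] * 50 + [2.0] * 50
n = len(weights)

# Direct form: (sum of weights)^2 / (sum of squared weights).
sum_w = sum(weights)
sum_w_sq = sum(w * w for w in weights)
n_eff_direct = sum_w ** 2 / sum_w_sq

# Equivalent form: n / (1 + CV^2), using the population variance.
mean_w = statistics.fmean(weights)
cv_sq = statistics.pvariance(weights) / mean_w ** 2
n_eff_cv = n / (1.0 + cv_sq)

# Both routes should agree up to floating-point rounding.
assert math.isclose(n_eff_direct, n_eff_cv)
print(sum_w, sum_w_sq, n_eff_direct)
```

Here the squared coefficient of variation is $0.25 / 1.5^2 = 1/9$, so the second route gives $100 / (1 + 1/9) = 90$, matching the direct calculation.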
Considerations
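- The Kish formula provides only an approximation of the effective sample size. It reflects variability in the weights alone and does not capture other design features such as clustering or stratification, nor any relationship between the weights and the variable being estimated.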
- It's particularly useful when dealing with survey weights designed to correct for unequal probabilities of selection or non-response.
- A low effective sample size can necessitate adjustments in statistical analysis or further investigation of the weighting scheme.