askvity

What is Cross Deviation?

Published in Statistical Measures 3 mins read

Cross deviation, also known as the cross-product of deviations, is a statistical quantity that measures how two variables vary together. It is equal to the sum of the products of the mean-corrected values of the two variables.

Understanding Cross Deviation

To grasp cross deviation, we first need to understand "mean-corrected variables."

What are Mean-Corrected Variables?

A mean-corrected variable is simply a variable where the mean of its data points has been subtracted from each individual data point. This process centers the data around zero.

  • For a variable X, the mean-corrected value for a data point \(x_i\) is \(x_i - \bar{X}\), where \(\bar{X}\) is the mean of X.
  • Similarly, for a variable Y, the mean-corrected value for a data point \(y_i\) is \(y_i - \bar{Y}\), where \(\bar{Y}\) is the mean of Y.
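
Mean correction can be sketched in a few lines of Python. The helper name `mean_correct` and the sample values are assumptions made for illustration; they do not come from the source.

```python
def mean_correct(values):
    """Return each value minus the mean of the list (hypothetical helper)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Arbitrary illustrative data: the mean is 5, so the deviations are -3, -1, 4.
x = [2, 4, 9]
centered = mean_correct(x)
print(centered)       # [-3.0, -1.0, 4.0]
print(sum(centered))  # 0.0, because centered data always sums to zero
```

The centered values sum to zero, which is what "centering the data around zero" means in practice.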

Calculating Cross Deviation

Cross deviation is then calculated by:

  1. Taking a data point for variable X, \(x_i\), and subtracting its mean \(\bar{X}\) to get its deviation from the mean.
  2. Taking the corresponding data point for variable Y, \(y_i\), and subtracting its mean \(\bar{Y}\) to get its deviation from the mean.
  3. Multiplying these two deviations together: \((x_i - \bar{X}) \times (y_i - \bar{Y})\).
  4. Repeating this for all pairs of data points across both variables.
  5. Summing up all these products.

So, the formula for cross deviation is \(\sum_{i=1}^{N} (x_i - \bar{X})(y_i - \bar{Y})\), where N is the number of paired data points.

Let's look at a simple example with two data points:

Here Mean(X) = (2 + 4) / 2 = 3 and Mean(Y) = (3 + 5) / 2 = 4.

Data Point (i)   X   Y   X - Mean(X)    Y - Mean(Y)    (X - Mean(X)) * (Y - Mean(Y))
1                2   3   2 - 3 = -1     3 - 4 = -1     (-1) * (-1) = 1
2                4   5   4 - 3 = 1      5 - 4 = 1      (1) * (1) = 1
Sum                                                    1 + 1 = 2

In this miniature example, the cross deviation (sum of the products of mean-corrected variables) is 2.
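
The same calculation can be checked with a short Python sketch. The function name `cross_deviation` is a hypothetical name chosen for this illustration; the data are the two points from the example above.

```python
def cross_deviation(xs, ys):
    """Sum of products of mean-corrected values (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Multiply paired deviations, then add the products up.
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


result = cross_deviation([2, 4], [3, 5])
print(result)  # 2.0
```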

Significance in Statistics

Cross deviation plays a crucial role in measuring the linear relationship between two variables:

  • Pearson Correlation Coefficient: The cross-product of deviations is the numerator of the Pearson correlation coefficient; the denominator is the square root of the product of the two sums of squared deviations. The coefficient is a standardized measure of the strength and direction (positive or negative) of the linear relationship between two variables. A positive cross deviation means that as one variable increases, the other tends to increase, while a negative cross deviation means that as one increases, the other tends to decrease.
  • Covariance: The covariance is an unstandardized measure of the relationship between two variables. The sample covariance equals the cross-product of deviations divided by N - 1, where N is the number of paired data points. Covariance shows how much two variables change together; a large positive covariance indicates that X and Y tend to be high or low at the same time, while a large negative covariance indicates that when one is high the other tends to be low. Because it is unstandardized, its magnitude depends on the units of X and Y.
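
As a rough sketch of how both measures build on the cross deviation, the Python below derives the sample covariance and the Pearson coefficient from one helper. The function names and the four data pairs are assumptions made for illustration, not taken from the source.

```python
import math


def cross_deviation(xs, ys):
    """Sum of products of mean-corrected values (hypothetical helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def sample_covariance(xs, ys):
    # Cross deviation divided by N - 1.
    return cross_deviation(xs, ys) / (len(xs) - 1)


def pearson_r(xs, ys):
    # Cross deviation over the square root of the product of the two
    # sums of squared deviations (a variable's cross deviation with itself).
    denom = math.sqrt(cross_deviation(xs, xs) * cross_deviation(ys, ys))
    return cross_deviation(xs, ys) / denom


# Arbitrary illustrative data.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 9]
cov = sample_covariance(xs, ys)  # cross deviation is 11, so 11 / 3
r = pearson_r(xs, ys)            # 11 / sqrt(5 * 26)
print(round(cov, 4), round(r, 4))
```

Note that `pearson_r` divides by zero if either variable is constant; a fuller implementation would need to handle that case.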

In essence, cross deviation provides the core component needed to calculate these key measures of association between variables.
