askvity

How do you find the linear correlation coefficient from a table?

Published in Statistics

To find the linear correlation coefficient (r) from a table, compute five sums from the data in the table and substitute them into a formula. Here's a step-by-step guide:

1. Understanding the Formula

The linear correlation coefficient, denoted by 'r', measures the strength and direction of a linear relationship between two variables. The formula to calculate 'r' is:

r = [n(∑XY) - (∑X)(∑Y)] / √{[n∑X² - (∑X)²] [n∑Y² - (∑Y)²]}

Where:

  • r = linear correlation coefficient
  • n = number of data points (pairs of x and y values)
  • ∑XY = sum of the product of each x and y value
  • ∑X = sum of all x values
  • ∑Y = sum of all y values
  • ∑X² = sum of the squares of all x values
  • ∑Y² = sum of the squares of all y values

2. Creating a Table for Calculations

Start with the table of your data, which has columns for 'x' and 'y'. Add three columns, 'xy', 'x²', and 'y²', to hold the intermediate values:

x y xy x² y²
x₁ y₁ x₁y₁ x₁² y₁²
x₂ y₂ x₂y₂ x₂² y₂²
... ... ... ... ...
xₙ yₙ xₙyₙ xₙ² yₙ²

3. Performing the Calculations

  • Column 'xy': For each row, multiply the x value by the y value and record the result.
  • Column 'x²': For each row, square the x value and record the result.
  • Column 'y²': For each row, square the y value and record the result.

4. Summing the Columns

Add up all the values in each column:

  • Calculate ∑X (sum of x values).
  • Calculate ∑Y (sum of y values).
  • Calculate ∑XY (sum of xy values).
  • Calculate ∑X² (sum of x² values).
  • Calculate ∑Y² (sum of y² values).
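
Steps 2 through 4 can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `augment` and the three sample data pairs are assumptions made up for the example and do not come from the article.

```python
def augment(xs, ys):
    """Build the expanded rows (x, y, xy, x², y²) and the five column sums."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of values")
    if not xs:
        raise ValueError("at least one data pair is required")
    # One row per data pair: x, y, xy, x², y²
    rows = [(x, y, x * y, x * x, y * y) for x, y in zip(xs, ys)]
    # zip(*rows) turns the rows into columns, so each column can be summed
    sums = tuple(sum(column) for column in zip(*rows))
    return rows, sums


# Hypothetical sample data, chosen only to illustrate the layout
rows, (sum_x, sum_y, sum_xy, sum_x2, sum_y2) = augment([1, 2, 3], [2, 4, 7])
for row in rows:
    print(row)
print("sums:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)
```

Each printed row corresponds to one line of the expanded table, and the final line holds ∑X, ∑Y, ∑XY, ∑X², and ∑Y² in that order.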

5. Plugging the Values into the Formula

Substitute the calculated sums and the number of data points (n) into the correlation coefficient formula:

r = [n(∑XY) - (∑X)(∑Y)] / √{[n∑X² - (∑X)²] [n∑Y² - (∑Y)²]}
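
The substitution can also be written as a small Python function. This is a sketch under stated assumptions: the name `r_from_sums` and its parameter names are hypothetical, and the check for a zero denominator is an addition that the formula itself does not spell out (r is undefined when every x value, or every y value, is the same).

```python
import math


def r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Apply the correlation formula to precomputed column sums."""
    numerator = n * sum_xy - sum_x * sum_y
    x_term = n * sum_x2 - sum_x ** 2  # n∑X² - (∑X)²
    y_term = n * sum_y2 - sum_y ** 2  # n∑Y² - (∑Y)²
    if x_term == 0 or y_term == 0:
        # All x values (or all y values) are identical, so r is undefined
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / math.sqrt(x_term * y_term)


# Hypothetical check: the points (1, 2), (2, 4), (3, 6) lie exactly on a line,
# giving n = 3, ∑X = 6, ∑Y = 12, ∑XY = 28, ∑X² = 14, ∑Y² = 56
print(r_from_sums(3, 6, 12, 28, 14, 56))  # 1.0
```

Because the three sample points fall on a straight line with positive slope, the function returns exactly 1.0.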

6. Interpreting the Result

The value of 'r' always lies between -1 and +1, inclusive:

  • r = +1: Perfect positive linear correlation.
  • r = -1: Perfect negative linear correlation.
  • r = 0: No linear correlation.
  • Values close to +1 indicate a strong positive correlation.
  • Values close to -1 indicate a strong negative correlation.
  • Values close to 0 indicate a weak linear correlation or none at all.

Example

Let's say you have the following data in a table:

x y
1 2
2 4
3 5
4 7
5 9

First, create the augmented table:

x y xy x² y²
1 2 2 1 4
2 4 8 4 16
3 5 15 9 25
4 7 28 16 49
5 9 45 25 81

Then, calculate the sums:

  • ∑X = 1 + 2 + 3 + 4 + 5 = 15
  • ∑Y = 2 + 4 + 5 + 7 + 9 = 27
  • ∑XY = 2 + 8 + 15 + 28 + 45 = 98
  • ∑X² = 1 + 4 + 9 + 16 + 25 = 55
  • ∑Y² = 4 + 16 + 25 + 49 + 81 = 175
  • n = 5 (number of data points)

Now, plug these values into the formula:

r = [5(98) - (15)(27)] / √{[5(55) - (15)²] [5(175) - (27)²]}
r = [490 - 405] / √{[275 - 225] [875 - 729]}
r = 85 / √(50 * 146)
r = 85 / √7300
r ≈ 85 / 85.44
r ≈ 0.995

This indicates a very strong positive linear correlation between x and y.
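
As a cross-check, the example can be reproduced in a few lines of Python. The sketch below computes r twice: once with the sums formula used above, and once with the algebraically equivalent deviations-from-the-mean form, which the article does not use and is included here only as an independent check. All variable names are illustrative.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 7, 9]
n = len(xs)

# Sums form, as in the worked example
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
r_sums = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Deviations form: r = Σ(x - x̄)(y - ȳ) / √[Σ(x - x̄)² · Σ(y - ȳ)²]
mean_x, mean_y = sum_x / n, sum_y / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
s_yy = sum((y - mean_y) ** 2 for y in ys)
r_dev = s_xy / math.sqrt(s_xx * s_yy)

print(round(r_sums, 3), round(r_dev, 3))  # 0.995 0.995
```

Both forms agree to three decimal places with the hand calculation.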
