
What is the Variance of the Sample Space?


The term "variance of the sample space" is not standard terminology in probability and statistics. Variance is a statistical measure applied to a set of data or, more commonly in probability, to a random variable, not directly to the sample space itself.

The sample space (S) is simply the set of all possible outcomes of a random experiment. Variance, on the other hand, quantifies the spread or dispersion of the values a random variable can take around its mean (expected value).

The reference defines variance for a discrete random variable X over a sample space S, so the question is really about how variance is calculated for a random variable whose values are determined by the outcomes in the sample space.

Understanding Variance in Relation to a Sample Space

Variance measures how far the values of a random variable are from the mean (expected value). A high variance indicates that the values are spread out, while a low variance indicates that they are clustered closer to the mean.

When we talk about variance in the context of a sample space, we are actually referring to the variance of a random variable (X) that is defined on that sample space (S). A random variable assigns a numerical value to each outcome in the sample space.

For a discrete random variable X defined over a sample space S, where X takes each value x with probability P[X=x], the variance of X, denoted Var[X], can be calculated with either of the two formulas given in the next section.
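To make the distinction between outcomes and values concrete, here is a minimal Python sketch. It assumes a hypothetical experiment of two fair coin flips, with X counting the heads; the experiment and the names `sample_space`, `X`, and `pmf` are illustrative and do not come from the reference. The outcomes in S are not numbers, and P[X=x] is obtained by grouping the outcomes that X maps to the same value x.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical experiment (an assumption for illustration): two fair coin flips.
# The sample space holds the raw outcomes, which are not numbers.
sample_space = list(product("HT", repeat=2))

def X(outcome):
    """Random variable: the number of heads in an outcome."""
    return outcome.count("H")

# All outcomes are equally likely, so each contributes 1/|S| to P[X = x].
pmf = defaultdict(Fraction)
for outcome in sample_space:
    pmf[X(outcome)] += Fraction(1, len(sample_space))

for x in sorted(pmf):
    print(f"P[X={x}] = {pmf[x]}")
```

Variance is then computed from the values and probabilities in `pmf`, not from the raw outcomes in `sample_space`.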

Formulas for the Variance of a Discrete Random Variable

The reference provides two equivalent formulas for calculating the variance of a discrete random variable X over a sample space S:

  1. Definition Formula: The sum of the squared deviations of each value from the mean, weighted by its probability.
    Var[X] = ∑_{x ∈ S} P[X=x] ⋅ (x − μ)²

  2. Computational Formula: The expected value of X² minus the square of the expected value of X.
    Var[X] = ∑_{x ∈ S} P[X=x] ⋅ x² − μ²

Where:

  • X: The discrete random variable.
  • S: The sample space. (Note: In the context of these formulas, 'x ∈ S' typically refers to summing over the possible values 'x' that the random variable X can take, where each value 'x' corresponds to one or more outcomes in the original sample space. It's more precise to say '∑_x P[X=x](x - μ)²' where the sum is over all possible values 'x' that X can take.)
  • x: A specific value that the random variable X can take.
  • P[X=x]: The probability that the random variable X takes on the value x.
  • μ (mu): The expected value (mean) of the random variable X. It is calculated as μ = E[X] = ∑_x x ⋅ P[X=x].
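The two formulas can be sketched in Python as follows. This is one possible rendering, assuming the distribution is given as a dictionary that maps each value x to P[X=x]; the function names and the sample distribution are hypothetical, chosen only for illustration. Exact fractions are used so the two results can be compared without rounding error.

```python
from fractions import Fraction

def mean(pmf):
    """Expected value: sum of x * P[X=x] over the values x of X."""
    return sum(x * p for x, p in pmf.items())

def variance_definition(pmf):
    """Definition formula: sum of P[X=x] * (x - mu)^2."""
    mu = mean(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

def variance_computational(pmf):
    """Computational formula: sum of P[X=x] * x^2, minus mu^2."""
    mu = mean(pmf)
    return sum(p * x ** 2 for x, p in pmf.items()) - mu ** 2

# Hypothetical distribution used only to exercise the functions.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
print(variance_definition(pmf), variance_computational(pmf))  # prints: 1/2 1/2
```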

Both formulas yield the same result and measure the average squared distance of the variable's values from its mean, weighted by their probabilities.
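The equivalence can be checked by expanding the square in the definition formula. The derivation below uses only the definitions already given, namely μ = ∑_x x ⋅ P[X=x] and the fact that the probabilities sum to 1:

```latex
\begin{align*}
\operatorname{Var}[X]
  &= \sum_x P[X=x]\,(x-\mu)^2 \\
  &= \sum_x P[X=x]\,x^2 \;-\; 2\mu \sum_x x\,P[X=x] \;+\; \mu^2 \sum_x P[X=x] \\
  &= \sum_x P[X=x]\,x^2 \;-\; 2\mu^2 \;+\; \mu^2 \\
  &= \sum_x P[X=x]\,x^2 \;-\; \mu^2 .
\end{align*}
```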

Example: Variance of a Single Die Roll

Let's consider the experiment of rolling a fair six-sided die.

  • Sample Space (S): {1, 2, 3, 4, 5, 6}
  • Random Variable (X): The outcome of the roll. So, X can take values {1, 2, 3, 4, 5, 6}.
  • Probabilities (P[X=x]): For a fair die, the probability of rolling any specific number is 1/6. P[X=1] = P[X=2] = ... = P[X=6] = 1/6.

First, calculate the mean (μ):
μ = E[X] = ∑ x ⋅ P[X=x]
μ = (1 ⋅ 1/6) + (2 ⋅ 1/6) + (3 ⋅ 1/6) + (4 ⋅ 1/6) + (5 ⋅ 1/6) + (6 ⋅ 1/6)
μ = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 21 / 6 = 3.5

Now, calculate the variance using the definition formula:
Var[X] = ∑ P[X=x](x − μ)²
Var[X] = (1/6)(1 − 3.5)² + (1/6)(2 − 3.5)² + (1/6)(3 − 3.5)² + (1/6)(4 − 3.5)² + (1/6)(5 − 3.5)² + (1/6)(6 − 3.5)²
Var[X] = (1/6)(−2.5)² + (1/6)(−1.5)² + (1/6)(−0.5)² + (1/6)(0.5)² + (1/6)(1.5)² + (1/6)(2.5)²
Var[X] = (1/6)(6.25 + 2.25 + 0.25 + 0.25 + 2.25 + 6.25)
Var[X] = (1/6)(17.5)
Var[X] ≈ 2.9167

Alternatively, using the computational formula:
First, calculate ∑ x² ⋅ P[X=x]:
∑ x² ⋅ P[X=x] = (1² ⋅ 1/6) + (2² ⋅ 1/6) + (3² ⋅ 1/6) + (4² ⋅ 1/6) + (5² ⋅ 1/6) + (6² ⋅ 1/6)
∑ x² ⋅ P[X=x] = (1 ⋅ 1/6) + (4 ⋅ 1/6) + (9 ⋅ 1/6) + (16 ⋅ 1/6) + (25 ⋅ 1/6) + (36 ⋅ 1/6)
∑ x² ⋅ P[X=x] = (1 + 4 + 9 + 16 + 25 + 36) / 6 = 91 / 6 ≈ 15.1667

Now, Var[X] = ∑ x² ⋅ P[X=x] − μ²
Var[X] = (91/6) − (3.5)²
Var[X] = (91/6) − 12.25
Var[X] ≈ 15.1667 − 12.25
Var[X] ≈ 2.9167

Both formulas give the variance of the random variable X defined on the sample space, which quantifies the spread of the possible outcomes (the numbers 1 through 6) around their mean (3.5).
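The arithmetic of the die example can be double-checked with a short Python sketch. It uses exact fractions, so both formulas return the same exact value, 35/12, which rounds to the 2.9167 obtained by hand. The variable names are illustrative.

```python
from fractions import Fraction

values = range(1, 7)   # faces of the die
p = Fraction(1, 6)     # probability of each face on a fair die

mu = sum(x * p for x in values)                         # mean
var_def = sum(p * (x - mu) ** 2 for x in values)        # definition formula
var_comp = sum(p * x ** 2 for x in values) - mu ** 2    # computational formula

print(mu)                          # prints: 7/2
print(var_def, var_comp)           # prints: 35/12 35/12
print(round(float(var_def), 4))    # prints: 2.9167
```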

Summary Table

| Term | Description | Applies to |
|---|---|---|
| Sample Space (S) | The set of all possible outcomes of an experiment. | The experiment itself |
| Random Variable (X) | A function that assigns a numerical value to each outcome in the sample space. | The numerical representation of outcomes |
| Variance (Var[X]) | A measure of the spread or dispersion of the values of a random variable. | The random variable X, defined over the sample space S |

In conclusion, while you don't calculate the "variance of the sample space" directly, you calculate the variance of a random variable that maps the outcomes of the sample space to numerical values. The formulas provided in the reference are the standard methods for doing this for a discrete random variable.
