What is Normalised Importance Sampling?

Normalised Importance Sampling (also called self-normalised importance sampling) is a modification of the standard Importance Sampling technique that addresses the problem of an unknown normalising constant in the target distribution. In essence, it estimates expectations by weighting samples from a proposal distribution, with the weights rescaled to sum to one.

Here's a breakdown:

  • Importance Sampling (IS) Recap: IS approximates the expectation of a function f(x) with respect to a target distribution p(x) by drawing samples from a proposal distribution q(x) and weighting each sample by the importance weight w(x) = p(x) / q(x). The core idea is to use q(x) when directly sampling from p(x) is difficult or impossible.
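
    For example, here is a minimal NumPy sketch of standard IS with a fully known target; the Gaussian target, proposal, and test function are invented purely for illustration:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Fully normalised target p(x): the N(3, 1) density.
    def p_pdf(x):
        return np.exp(-0.5 * (x - 3.0) ** 2) / np.sqrt(2 * np.pi)

    # Proposal q(x): the N(2, 2) density, easy to sample from and
    # wide enough to cover the target's tails.
    def q_pdf(x):
        return np.exp(-0.5 * ((x - 2.0) / 2.0) ** 2) / (2.0 * np.sqrt(2 * np.pi))

    x = rng.normal(2.0, 2.0, size=100_000)   # samples x_i ~ q
    w = p_pdf(x) / q_pdf(x)                  # importance weights w(x_i)
    print(np.mean(w * x))                    # unbiased estimate of E_p[x], ~3.0
    ```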

  • The Problem with Unknown Normalising Constants: Often, we only know the target distribution p(x) up to a normalising constant. In other words, we have access to an unnormalised version, p*(x), where p(x) = p*(x) / Z, and Z is the normalising constant. This constant is usually intractable to compute directly.

  • Normalised Importance Sampling Solution: Normalised IS tackles this issue by estimating the normalising constant alongside the expectation. Instead of using the raw importance weights w(x) = p(x) / q(x), which require knowing p(x), it uses unnormalised weights w*(x) = p*(x) / q(x). The normalising constant Z is then estimated as the average of these unnormalised weights: Z ≈ (1/N) Σ w*(x_i).

    • Expectation Estimate: The expectation of f(x) is then approximated as:

      E_p[f(x)] ≈ Σ [w*(x_i) * f(x_i)] / Σ [w*(x_i)]

      where x_i are samples drawn from q(x), and the summations are over all N samples.
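
      A minimal NumPy sketch of this estimator, reusing the toy Gaussian setup from the earlier sketch but pretending that only the unnormalised shape p*(x) is available; all specific distributions are invented for illustration:

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      # Unnormalised target p*(x): the shape of N(3, 1) with its
      # 1/sqrt(2*pi) factor deliberately dropped (an "unknown" Z).
      def p_star(x):
          return np.exp(-0.5 * (x - 3.0) ** 2)

      # Proposal q(x): the N(2, 2) density.
      def q_pdf(x):
          return np.exp(-0.5 * ((x - 2.0) / 2.0) ** 2) / (2.0 * np.sqrt(2 * np.pi))

      N = 100_000
      x = rng.normal(2.0, 2.0, size=N)   # samples x_i ~ q
      w_star = p_star(x) / q_pdf(x)      # unnormalised weights w*(x_i)

      estimate = np.sum(w_star * x) / np.sum(w_star)   # Σ w*·f / Σ w*, with f(x) = x
      z_hat = np.mean(w_star)                          # estimate of Z
      print(estimate, z_hat)   # ~3.0 and ~2.507 (true Z = sqrt(2*pi))
      ```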

  • Key Advantages of Normalised Importance Sampling:

    • Handles unnormalised target distributions: It's applicable even when the normalising constant of the target distribution is unknown or intractable.
    • Generally lower variance: Normalised IS often exhibits lower variance than standard IS, especially when the proposal q(x) is a good approximation of the target p(x). The trade-off is a small bias: the ratio estimator is biased for finite N, but it is consistent, so the bias vanishes as the sample size grows.
  • Example:

    Imagine we want to estimate the average height of people in a city, but we only have access to a biased sample (e.g., people who visit a specific website). We can use Normalised Importance Sampling:

    1. p*(x): Represents the unnormalised distribution of heights in the entire city (which we don't know precisely).
    2. q(x): Represents the distribution of heights in our biased sample (the people who visit the website).
    3. We weight each person's height in the biased sample by w*(x) = p*(x) / q(x), where p*(x) is the relative probability of that height in the city (known only up to a constant factor), and q(x) is the probability of observing that height in our biased sample.
    4. We normalise these weights to sum to 1.
    5. The weighted average height becomes our estimate of the average height in the entire city.
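
    A toy version of these steps in NumPy; the specific height distributions below are invented purely to make the example concrete:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical city heights: shaped like N(170, 10) cm, but assumed
    # known only up to scale (p*). Website visitors skew taller, so the
    # biased sample behaves like draws from N(178, 8) (q).
    def p_star(h):
        return np.exp(-0.5 * ((h - 170.0) / 10.0) ** 2)

    def q_pdf(h):
        return np.exp(-0.5 * ((h - 178.0) / 8.0) ** 2) / (8.0 * np.sqrt(2 * np.pi))

    heights = rng.normal(178.0, 8.0, size=50_000)   # the biased sample
    w_star = p_star(heights) / q_pdf(heights)       # step 3: unnormalised weights
    w = w_star / w_star.sum()                       # step 4: normalise to sum to 1
    print(np.sum(w * heights))                      # step 5: ~170 cm, the city mean
    ```
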
  • Potential Issues:

    • Sensitivity to Tail Behavior: Normalised IS can be sensitive if the tails of q(x) are much lighter than the tails of p(x). In such cases, a few large weights can dominate the estimate, leading to high variance.
    • Pointwise evaluation of p*(x) needed: While it avoids computing the normalisation constant itself, Normalised IS still requires that the unnormalised density p*(x) can be evaluated at every sample point.
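
    A standard diagnostic for the tail problem above is the Kish effective sample size, which collapses towards 1 when a few large weights dominate; a minimal sketch:

    ```python
    import numpy as np

    def effective_sample_size(w_star):
        """Kish effective sample size: (sum of w)^2 / sum of w^2.

        Ranges from 1 (a single weight dominates) up to len(w_star)
        (all weights equal). Values far below len(w_star) warn that
        the normalised estimate effectively rests on few samples.
        """
        w_star = np.asarray(w_star, dtype=float)
        return w_star.sum() ** 2 / np.sum(w_star ** 2)
    ```

    Calling this on the w_star arrays from the sketches above shows how much of the nominal sample size survives the reweighting.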

In summary, Normalised Importance Sampling is a powerful technique for approximating expectations when the target distribution is only known up to a normalising constant, by using weighted samples from a proposal distribution with scaled weights.
