askvity

What is Dimensional Data Modelling?

Published in Data Warehousing Modelling

Dimensional modelling is a data modelling technique, used primarily for data warehouses and analytics, that organizes data into "facts" (the measurements produced by core business processes) and "dimensions" (the descriptive attributes that give those measurements context). Together, facts and dimensions organize and describe the entities within your data warehouse.

Understanding Facts and Dimensions

The foundation of dimensional modelling lies in separating quantitative measures (facts) from the descriptive context (dimensions).

  • Facts: These are the measurements recorded for business events or transactions, such as sales amounts, quantities sold, or profit. Fact tables typically contain numerical values that can be aggregated, plus foreign keys linking to dimension tables.
  • Dimensions: These provide context to the facts. They describe who, what, where, when, how, and why a fact occurred. Examples include customer information, product details, time periods, locations, and sales channels. Dimension tables contain descriptive attributes (text or numbers) and a primary key.
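The split between the two kinds of table can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed design: the table names (dim_product, fact_sales), the column names, and the sample rows are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,  -- primary key of the dimension
        product_name TEXT,                 -- descriptive attributes
        category     TEXT
    );
    CREATE TABLE fact_sales (
        product_key   INTEGER REFERENCES dim_product(product_key),  -- foreign key
        quantity_sold INTEGER,             -- numeric measures that can be aggregated
        sales_amount  REAL
    );
""")

# Made-up sample data.
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Espresso Beans", "Coffee"), (2, "Green Tea", "Tea")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 2, 20.0), (1, 1, 10.0), (2, 3, 15.0)],
)


def sales_by_category(connection):
    """Aggregate the fact measure, grouped by a dimension attribute."""
    return connection.execute("""
        SELECT p.category, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


print(sales_by_category(conn))
```

The fact table holds only keys and numbers, while everything a user would filter or group by lives in the dimension table.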

Why Use Dimensional Modelling?

This approach is highly effective for business intelligence (BI) and analytics because it:

  • Simplifies Queries: The structure is intuitive, making it easier for BI tools and users to write queries and generate reports.
  • Optimizes Performance: Data is organized for fast retrieval of summarized and detailed information, crucial for analytical workloads.
  • Facilitates Understanding: The separation of facts and dimensions aligns closely with how business users think about performance measures and their influencing factors.
  • Supports Change: New dimensions or facts can often be added without significantly restructuring existing tables.

Common Dimensional Models

The most common structures using dimensional modelling are the Star Schema and the Snowflake Schema:

  • Star Schema: A central fact table is directly connected to multiple dimension tables. It's simpler and often preferred for performance.
  • Snowflake Schema: An extension of the star schema in which dimensions are normalized into multiple related tables. This reduces redundancy but adds complexity, and the extra joins can slow queries.
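The difference between the two structures can be sketched by modelling the same product dimension both ways. The code below is an illustration under stated assumptions: the star_ and snow_ prefixed table names, the columns, and the data are hypothetical, and both variants share one fact table only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star: one denormalized product dimension; the category name repeats per row.
    CREATE TABLE star_dim_product (
        product_key   INTEGER PRIMARY KEY,
        product_name  TEXT,
        category_name TEXT
    );
    -- Snowflake: the category is normalized into its own table.
    CREATE TABLE snow_dim_category (
        category_key  INTEGER PRIMARY KEY,
        category_name TEXT
    );
    CREATE TABLE snow_dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category_key INTEGER REFERENCES snow_dim_category(category_key)
    );
    CREATE TABLE fact_sales (
        product_key  INTEGER,
        sales_amount REAL
    );
    INSERT INTO star_dim_product VALUES
        (1, 'Espresso Beans', 'Coffee'), (2, 'Decaf Beans', 'Coffee'), (3, 'Green Tea', 'Tea');
    INSERT INTO snow_dim_category VALUES (10, 'Coffee'), (20, 'Tea');
    INSERT INTO snow_dim_product VALUES
        (1, 'Espresso Beans', 10), (2, 'Decaf Beans', 10), (3, 'Green Tea', 20);
    INSERT INTO fact_sales VALUES (1, 20.0), (2, 5.0), (3, 15.0);
""")

# Star schema: a single join from the fact table reaches the category.
STAR_QUERY = """
    SELECT p.category_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN star_dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category_name
    ORDER BY p.category_name
"""

# Snowflake schema: the same question needs a second join.
SNOWFLAKE_QUERY = """
    SELECT c.category_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN snow_dim_product AS p ON p.product_key = f.product_key
    JOIN snow_dim_category AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
"""

star_totals = conn.execute(STAR_QUERY).fetchall()
snowflake_totals = conn.execute(SNOWFLAKE_QUERY).fetchall()
print(star_totals == snowflake_totals)
```

Both queries return the same totals. The snowflake variant stores each category name once, at the cost of one more join per query that needs it.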

Practical Examples

Consider a retail sales scenario:

| Component | Description                        | Example Attributes                  |
|-----------|------------------------------------|-------------------------------------|
| Fact      | A sales transaction                | Sales Amount, Quantity Sold, Profit |
| Dimension | Time (When did the sale happen?)   | Date, Month, Year, Day of Week      |
| Dimension | Product (What was sold?)           | Product Name, Category, Brand       |
| Dimension | Customer (Who bought it?)          | Customer Name, City, Loyalty Status |
| Dimension | Store (Where was it sold?)         | Store Name, Region, Store Type      |
| Dimension | Promotion (Was there a promotion?) | Promotion Name, Discount Percentage |

In this model, a fact table record for a single sale would link to specific records in the Time, Product, Customer, Store, and Promotion dimension tables, allowing analysts to easily answer questions like "What was the total sales amount for product 'X' in region 'Y' during month 'Z'?"
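The question quoted above can be sketched as a single query over a small star schema. This is a simplified illustration: it keeps only the Time, Product, and Store dimensions, and every table name, column name, key format, and data row is an assumption invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical star schema; Customer and Promotion are omitted for brevity.
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        year     INTEGER,
        month    INTEGER
    );
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT
    );
    CREATE TABLE dim_store (
        store_key  INTEGER PRIMARY KEY,
        store_name TEXT,
        region     TEXT
    );
    CREATE TABLE fact_sales (
        date_key     INTEGER REFERENCES dim_date(date_key),
        product_key  INTEGER REFERENCES dim_product(product_key),
        store_key    INTEGER REFERENCES dim_store(store_key),
        sales_amount REAL
    );
    INSERT INTO dim_date VALUES (20240105, 2024, 1), (20240210, 2024, 2);
    INSERT INTO dim_product VALUES (1, 'X'), (2, 'W');
    INSERT INTO dim_store VALUES (1, 'Store A', 'Y'), (2, 'Store B', 'North');
    INSERT INTO fact_sales VALUES
        (20240105, 1, 1, 100.0),  -- product X, region Y, January
        (20240105, 1, 1, 50.0),   -- product X, region Y, January
        (20240105, 1, 2, 70.0),   -- product X, other region
        (20240210, 1, 1, 30.0),   -- product X, region Y, February
        (20240105, 2, 1, 25.0);   -- other product
""")


def total_sales(connection, product_name, region, year, month):
    """Sum sales_amount for one product, region, and month."""
    row = connection.execute(
        """
        SELECT COALESCE(SUM(f.sales_amount), 0.0)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        JOIN dim_store   AS s ON s.store_key   = f.store_key
        JOIN dim_date    AS d ON d.date_key    = f.date_key
        WHERE p.product_name = ? AND s.region = ? AND d.year = ? AND d.month = ?
        """,
        (product_name, region, year, month),
    ).fetchone()
    return row[0]


print(total_sales(conn, "X", "Y", 2024, 1))
```

Each filter in the WHERE clause touches a dimension attribute, and the only value aggregated comes from the fact table, which is the query shape the star layout is designed for.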

Dimensional modelling provides a robust and flexible framework specifically designed to meet the analytical needs of a data warehouse environment.
