askvity

What is natural join formal?

Published in Database Joins 3 mins read

A natural join is a specific type of equi-join that implicitly links tables based on columns sharing the same name. Let's formalize this concept.

Understanding Natural Join

A natural join combines rows from two tables, let's call them Table A and Table B, based on equality between columns that have the same name in both tables. The resulting table includes only one column for each pair of identically named columns. According to provided reference: A natural join is a type of equi-join where the join predicate arises implicitly by comparing all columns in both tables that have the same column-names in the joined tables. The resulting joined table contains only one column for each pair of equally named columns.

Formal Definition

While a truly formal, symbolic representation requires database theory notation, we can provide a structured explanation:

  1. Identify Common Columns: Find all columns that exist in both Table A and Table B, having the same name. Let's call the set of these columns CommonColumns.

  2. Join Predicate: The implicit join predicate is the conjunction (AND) of equalities between the common columns. For example, if CommonColumns = {Column1, Column2}, the join predicate would be:

    A.Column1 = B.Column1 AND A.Column2 = B.Column2

  3. Resulting Table: The resulting table contains:

    • All columns from Table A.
    • All columns from Table B except the common columns (because they are already included from Table A). Only one copy of each common column is included.
    • Only rows that satisfy the join predicate (i.e., the rows where the values in the common columns are equal).

Example

Let's say we have two tables:

Table A: Employees

EmployeeID Name DepartmentID
1 John 10
2 Jane 20
3 Mike 10

Table B: Departments

DepartmentID DepartmentName Location
10 Sales New York
20 Marketing London

A natural join between Employees and Departments will automatically join on the DepartmentID column, because that is the only column name that is common to both tables. The result would be:

Resulting Table: Employees NATURAL JOIN Departments

EmployeeID Name DepartmentID DepartmentName Location
1 John 10 Sales New York
2 Jane 20 Marketing London
3 Mike 10 Sales New York

Key Characteristics

  • Implicit Join Condition: No explicit ON clause is needed. The database system infers the join condition based on common column names.
  • Potential Ambiguity: If tables have multiple columns with the same name, the natural join will join on all of them. This can sometimes lead to unexpected or incorrect results if that's not the intention.
  • Readability: While convenient for simple joins, natural joins can be less readable than explicit joins (using JOIN ... ON) because the join condition is not immediately obvious.

Related Articles