Back/Module F-5 Connecting Tables — Joins Demystified
Module F-5·23 min read

INNER JOIN, LEFT JOIN, FULL OUTER JOIN, self-joins, and multi-table queries — the most feared concept made simple.

F-5 — Connecting Tables — Joins Demystified

Who this module is for: You completed F-4 and understand aggregation. Now you need to combine data from multiple tables — the foundational skill that makes relational databases genuinely powerful. Joins confuse almost every beginner because they require a new mental model. This module builds that model step by step.


Why Data Lives in Multiple Tables

Imagine a simple online store. You could store everything in one table:

orders_messy:
  order_id | customer_name | customer_email | product_name | product_price | quantity
  1        | Alice         | alice@...      | Keyboard     | 129.99        | 1
  2        | Alice         | alice@...      | Mouse        | 49.99         | 2
  3        | Bob           | bob@...        | Keyboard     | 129.99        | 1

This has problems:

  • Alice's email is stored twice — if she changes it, you must update multiple rows
  • The Keyboard's price is stored twice — if it changes, you must update multiple rows
  • Adding a new customer who has not ordered yet is impossible without a fake order
  • Removing all orders for a customer accidentally removes their account

The solution: split the data into separate tables, each representing one concept, and connect them with foreign keys.

customers:                    products:
  id | name  | email          id | name     | price
  1  | Alice | alice@...      1  | Keyboard | 129.99
  2  | Bob   | bob@...        2  | Mouse    | 49.99

orders:
  id | customer_id | product_id | quantity
  1  | 1           | 1          | 1        ← Alice bought Keyboard
  2  | 1           | 2          | 2        ← Alice bought Mouse
  3  | 2           | 1          | 1        ← Bob bought Keyboard

Now Alice's email lives in one place. The Keyboard's price lives in one place. Joins let you reconstruct the combined view when you need it.


Setting Up the Example Schema

Create these tables to follow along:

sql
CREATE DATABASE store; \c store CREATE TABLE customers ( id BIGSERIAL PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL UNIQUE, city TEXT ); CREATE TABLE products ( id BIGSERIAL PRIMARY KEY, name TEXT NOT NULL, price NUMERIC(10,2) NOT NULL, category TEXT NOT NULL ); CREATE TABLE orders ( id BIGSERIAL PRIMARY KEY, customer_id BIGINT REFERENCES customers(id), product_id BIGINT REFERENCES products(id), quantity INTEGER NOT NULL, ordered_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); INSERT INTO customers (name, email, city) VALUES ('Alice', 'alice@example.com', 'New York'), ('Bob', 'bob@example.com', 'London'), ('Carol', 'carol@example.com', 'Berlin'), ('David', 'david@example.com', NULL); -- David has no city yet INSERT INTO products (name, price, category) VALUES ('Mechanical Keyboard', 129.99, 'peripherals'), ('Wireless Mouse', 49.99, 'peripherals'), ('Monitor Stand', 39.99, 'accessories'), ('USB-C Hub', 64.99, 'accessories'), ('Laptop Sleeve', 24.99, 'accessories'); INSERT INTO orders (customer_id, product_id, quantity) VALUES (1, 1, 1), -- Alice bought Keyboard (1, 2, 2), -- Alice bought Mouse (2x) (2, 1, 1), -- Bob bought Keyboard (3, 3, 1), -- Carol bought Monitor Stand (3, 4, 1); -- Carol bought USB-C Hub -- Note: David (id=4) has no orders -- Note: Laptop Sleeve (id=5) has no orders

INNER JOIN — Only Matching Rows

An INNER JOIN returns rows that have a match in both tables. If a customer has no orders, they do not appear. If an order references a customer that doesn't exist, it doesn't appear.

sql
SELECT customers.name, orders.quantity, products.name AS product FROM orders INNER JOIN customers ON orders.customer_id = customers.id INNER JOIN products ON orders.product_id = products.id; -- name | quantity | product -- -------+----------+--------------------- -- Alice | 1 | Mechanical Keyboard -- Alice | 2 | Wireless Mouse -- Bob | 1 | Mechanical Keyboard -- Carol | 1 | Monitor Stand -- Carol | 1 | USB-C Hub

Notice:

  • David does not appear — he has no orders
  • Laptop Sleeve does not appear — nobody ordered it

The syntax:

sql
SELECT [columns] FROM table_a INNER JOIN table_b ON table_a.column = table_b.column; -- 'INNER' is optional — bare JOIN means INNER JOIN SELECT [columns] FROM table_a JOIN table_b ON table_a.column = table_b.column;

Using aliases to shorten table names:

sql
SELECT c.name, o.quantity, p.name AS product FROM orders o JOIN customers c ON o.customer_id = c.id JOIN products p ON o.product_id = p.id;

LEFT JOIN — Keep All Left Rows

A LEFT JOIN returns all rows from the left table, plus matching rows from the right. If there is no match in the right table, the right side columns are NULL.

sql
-- List ALL customers, and their orders if they have any SELECT c.name, o.quantity, p.name AS product FROM customers c LEFT JOIN orders o ON c.id = o.customer_id LEFT JOIN products p ON o.product_id = p.id ORDER BY c.name; -- name | quantity | product -- -------+----------+--------------------- -- Alice | 1 | Mechanical Keyboard -- Alice | 2 | Wireless Mouse -- Bob | 1 | Mechanical Keyboard -- Carol | 1 | Monitor Stand -- Carol | 1 | USB-C Hub -- David | (null) | (null) ← David has no orders

David appears with NULL values for the order columns because he has no orders — but he is not excluded. This is the key difference from INNER JOIN.

The most common use of LEFT JOIN: finding rows that have NO match.

sql
-- Find customers who have never placed an order SELECT c.name, c.email FROM customers c LEFT JOIN orders o ON c.id = o.customer_id WHERE o.id IS NULL; -- NULL means no matching order was found -- name | email -- -------+------------------ -- David | david@example.com
sql
-- Find products that have never been ordered SELECT p.name, p.price FROM products p LEFT JOIN orders o ON p.id = o.product_id WHERE o.id IS NULL; -- name | price -- ---------------+------- -- Laptop Sleeve | 24.99

RIGHT JOIN — Keep All Right Rows

RIGHT JOIN is the mirror of LEFT JOIN — it keeps all rows from the right table. It is used rarely in practice because you can always rewrite a RIGHT JOIN as a LEFT JOIN by swapping the table order.

sql
-- These two queries produce identical results: SELECT c.name, o.quantity FROM orders o RIGHT JOIN customers c ON o.customer_id = c.id; -- ← equivalent to → SELECT c.name, o.quantity FROM customers c LEFT JOIN orders o ON c.id = o.customer_id;

Use LEFT JOIN consistently and avoid RIGHT JOIN — it makes queries easier to read.


FULL OUTER JOIN — Keep All Rows from Both Sides

FULL OUTER JOIN returns all rows from both tables. Where there is no match, the missing side has NULLs.

sql
-- All customers AND all products, showing orders where they connect SELECT c.name AS customer, p.name AS product, o.quantity FROM customers c FULL OUTER JOIN orders o ON c.id = o.customer_id FULL OUTER JOIN products p ON o.product_id = p.id ORDER BY c.name; -- Result includes: -- - Customers with orders (matched) -- - David (customer with no orders) -- - Laptop Sleeve (product with no orders)

Used less frequently than INNER and LEFT joins, but useful for finding unmatched rows on either side.


CROSS JOIN — Every Combination

A CROSS JOIN produces the Cartesian product — every row in the left table combined with every row in the right table.

sql
-- 4 customers × 5 products = 20 rows SELECT c.name, p.name AS product FROM customers c CROSS JOIN products p;

This is rarely intentional. If you write a JOIN without an ON clause, you accidentally get a cross join — a common beginner mistake.

sql
-- ❌ ACCIDENTAL cross join (missing ON clause) SELECT c.name, o.quantity FROM customers c JOIN orders o; -- Missing: ON c.id = o.customer_id -- Returns 4 customers × 5 orders = 20 rows (wrong!) -- ✅ CORRECT SELECT c.name, o.quantity FROM customers c JOIN orders o ON c.id = o.customer_id;

Self-Joins — A Table Joining Itself

Sometimes you need to join a table to itself. The classic example: an employee table where each employee has a manager_id that references another employee's id.

sql
CREATE TABLE employees ( id BIGSERIAL PRIMARY KEY, name TEXT NOT NULL, manager_id BIGINT REFERENCES employees(id) -- references same table ); INSERT INTO employees (name, manager_id) VALUES ('CEO', NULL), -- id=1, no manager ('VP Sales', 1), -- id=2, reports to CEO ('VP Eng', 1), -- id=3, reports to CEO ('Sales Rep', 2), -- id=4, reports to VP Sales ('Engineer', 3); -- id=5, reports to VP Eng -- Show each employee with their manager's name SELECT e.name AS employee, m.name AS manager FROM employees e LEFT JOIN employees m ON e.manager_id = m.id ORDER BY e.id; -- employee | manager -- -----------+---------- -- CEO | (null) ← CEO has no manager -- VP Sales | CEO -- VP Eng | CEO -- Sales Rep | VP Sales -- Engineer | VP Eng

The key: you use aliases (e and m) to distinguish the two "instances" of the same table.


Joining Three or More Tables

Joins chain naturally — each new JOIN adds another table:

sql
-- Full order details: customer name, product name, price, quantity, total SELECT c.name AS customer, c.city AS city, p.name AS product, p.price AS unit_price, o.quantity, p.price * o.quantity AS line_total, o.ordered_at FROM orders o JOIN customers c ON o.customer_id = c.id JOIN products p ON o.product_id = p.id ORDER BY o.ordered_at;

Common Join Mistakes

Mistake 1: Duplicate rows from one-to-many joins

If a customer has 5 orders, joining customers to orders gives 5 rows for that customer. This is correct and expected — but surprises beginners who then aggregate incorrectly.

sql
-- ❌ WRONG: counting total orders per customer using GROUP BY without noticing duplicates SELECT COUNT(*) FROM customers JOIN orders ON customers.id = orders.customer_id; -- Returns 5 (number of orders), not 4 (number of customers) -- ✅ CORRECT: that's exactly right — there are 5 orders -- If you want unique customers who have ordered, use DISTINCT: SELECT COUNT(DISTINCT c.id) FROM customers c JOIN orders o ON c.id = o.customer_id; -- Returns 3 (Alice, Bob, Carol — David has no orders)

Mistake 2: Ambiguous column names

When two joined tables have a column with the same name, you must qualify which table's column you want:

sql
-- ❌ ERROR: 'id' is ambiguous — which table? SELECT id, name FROM customers JOIN orders ON customers.id = orders.customer_id; -- ✅ CORRECT: qualify with the table name SELECT customers.id, customers.name FROM customers JOIN orders ON customers.id = orders.customer_id; -- ✅ ALSO CORRECT: use aliases SELECT c.id, c.name FROM customers c JOIN orders o ON c.id = o.customer_id;

Mistake 3: Missing the join condition

sql
-- ❌ This is a CROSS JOIN producing a huge result SELECT * FROM customers, orders; -- old-style implicit join syntax -- ✅ CORRECT SELECT * FROM customers JOIN orders ON customers.id = orders.customer_id;

Practical Exercise: Complete Store Queries

Using the store database with customers, products, and orders:

sql
-- Q1: Show every order with customer name, product name, and total cost SELECT c.name AS customer, p.name AS product, o.quantity, ROUND(p.price * o.quantity, 2) AS total FROM orders o JOIN customers c ON o.customer_id = c.id JOIN products p ON o.product_id = p.id ORDER BY c.name, p.name; -- Q2: Total amount spent by each customer SELECT c.name, SUM(p.price * o.quantity) AS total_spent FROM customers c JOIN orders o ON c.id = o.customer_id JOIN products p ON o.product_id = p.id GROUP BY c.name ORDER BY total_spent DESC; -- Q3: Customers who have never ordered anything SELECT c.name, c.email FROM customers c LEFT JOIN orders o ON c.id = o.customer_id WHERE o.id IS NULL; -- Q4: Products that have been ordered at least once, and how many times SELECT p.name, COUNT(o.id) AS order_count FROM products p JOIN orders o ON p.id = o.product_id GROUP BY p.name ORDER BY order_count DESC; -- Q5: Products that have never been ordered SELECT p.name, p.price FROM products p LEFT JOIN orders o ON p.id = o.product_id WHERE o.id IS NULL; -- Q6: Customers and the number of unique products they ordered SELECT c.name, COUNT(DISTINCT o.product_id) AS unique_products FROM customers c JOIN orders o ON c.id = o.customer_id GROUP BY c.name ORDER BY unique_products DESC; -- Q7: All customers and their total orders (including those with 0 orders) SELECT c.name, COUNT(o.id) AS order_count, COALESCE(SUM(p.price * o.quantity), 0) AS total_spent FROM customers c LEFT JOIN orders o ON c.id = o.customer_id LEFT JOIN products p ON o.product_id = p.id GROUP BY c.name ORDER BY total_spent DESC;

Join Type Summary

Join TypeReturns
INNER JOINOnly rows that match in both tables
LEFT JOINAll rows from left table + matches from right (NULLs where no match)
RIGHT JOINAll rows from right table + matches from left (use LEFT JOIN instead)
FULL OUTER JOINAll rows from both tables (NULLs where no match on either side)
CROSS JOINEvery combination of left × right rows

When to use which:

  • Finding related data → INNER JOIN
  • Keeping all records even without a match → LEFT JOIN
  • Finding records with no match → LEFT JOIN ... WHERE right.id IS NULL
  • Comparing two complete sets → FULL OUTER JOIN

Module F-6 covers constraints — the rules that make your database trustworthy: PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL, and CHECK.

Next: F-6 — Constraints and Data Integrity →

Discussion