Skip to content

Exercise 6: CTEs (Common Table Expressions) 📝

🎯 Objective

Master CTEs for writing readable, maintainable, and powerful SQL queries.

📋 Problem Statement

CTEs (WITH clause) create temporary named result sets that exist only during query execution. They improve readability and enable recursive queries.

Syntax:

sql
WITH cte_name AS (
  SELECT ...
)
SELECT * FROM cte_name;

📥 Tasks

Task 1: Basic CTE

Rewrite this subquery using a CTE:

sql
SELECT * FROM (
  SELECT category, AVG(price) AS avg_price
  FROM products
  GROUP BY category
) AS category_stats
WHERE avg_price > 100;

Task 2: Multiple CTEs

Find products that are:

  1. Above average price in their category
  2. From suppliers with rating >= 4.5

Use two CTEs: one for category averages, one for high-rated suppliers.

Task 3: Recursive CTE - Number Series

Generate a series of numbers from 1 to 10 using a recursive CTE.

Task 4: Recursive CTE - Date Series

Generate all dates in January 2024 using a recursive CTE.

Task 5: CTE with Window Functions

Find products whose price is in the top 25% of their category. Use CTE with NTILE().

Task 6: Hierarchical Data (Bonus)

Given an employees table with employee_id and manager_id, find all employees in a management chain.

💡 Hints

Hint 1: CTE Benefits - Improves readability (named subqueries) - Can be referenced multiple times - Enables recursive queries - Better than temp tables (no cleanup needed)
Hint 2: Recursive CTE Structure ```sql WITH RECURSIVE cte_name AS ( -- Anchor member (base case) SELECT initial_value UNION ALL -- Recursive member (references cte_name) SELECT next_value FROM cte_name WHERE condition ) SELECT * FROM cte_name; ```
Hint 3: Multiple CTEs ```sql WITH cte1 AS (SELECT ...), cte2 AS (SELECT ...), cte3 AS (SELECT ...) SELECT * FROM cte1 JOIN cte2 ...; ```

Solutions

Click to reveal solutions

Solution 1: Basic CTE

sql
WITH category_stats AS (
  SELECT 
    category, 
    AVG(price) AS avg_price
  FROM products
  GROUP BY category
)
SELECT *
FROM category_stats
WHERE avg_price > 100;

Result:

category     | avg_price
-------------|----------
Electronics  | 356.59
Furniture    | 248.66

Benefits over subquery:

  • Named result set (self-documenting)
  • Can reference multiple times
  • Easier to debug (can run CTE separately)

Solution 2: Multiple CTEs

sql
WITH 
  category_avg AS (
    SELECT 
      category,
      AVG(price) AS avg_price
    FROM products
    GROUP BY category
  ),
  high_rated_suppliers AS (
    SELECT supplier_id
    FROM suppliers
    WHERE rating >= 4.5
  )
SELECT 
  p.product_name,
  p.price,
  p.category,
  ca.avg_price,
  s.supplier_name,
  s.rating
FROM products p
JOIN category_avg ca ON p.category = ca.category
JOIN high_rated_suppliers hrs ON p.supplier_id = hrs.supplier_id
JOIN suppliers s ON p.supplier_id = s.supplier_id
WHERE p.price > ca.avg_price
ORDER BY p.price DESC;

Result:

product_name          | price   | category    | avg_price | supplier_name  | rating
----------------------|---------|-------------|-----------|----------------|-------
Laptop Pro 15         | 1299.99 | Electronics | 356.59    | TechCorp       | 4.50
Standing Desk         | 499.99  | Furniture   | 248.66    | FurniturePro   | 4.80

Explanation:

  1. First CTE calculates category averages
  2. Second CTE filters high-rated suppliers
  3. Main query joins everything together

Readability: Much clearer than nested subqueries!


Solution 3: Recursive CTE - Number Series

sql
WITH RECURSIVE number_series AS (
  -- Anchor: Start with 1
  SELECT 1 AS n
  
  UNION ALL
  
  -- Recursive: Add 1 until we reach 10
  SELECT n + 1
  FROM number_series
  WHERE n < 10
)
SELECT n
FROM number_series;

Result:

n
--
1
2
3
4
5
6
7
8
9
10

How it works:

  1. Anchor member returns initial value (1)
  2. Recursive member adds 1 to previous value
  3. Stops when condition (n < 10) is false
  4. UNION ALL combines all iterations

Use Cases:

  • Generate test data
  • Fill gaps in sequences
  • Create calendar tables

Solution 4: Recursive CTE - Date Series

sql
WITH RECURSIVE date_series AS (
  -- Anchor: Start with first day of January
  SELECT DATE('2024-01-01') AS date
  
  UNION ALL
  
  -- Recursive: Add 1 day until end of January
  SELECT DATE(date, '+1 day')
  FROM date_series
  WHERE date < '2024-01-31'
)
SELECT 
  date,
  CASE strftime('%w', date)
    WHEN '0' THEN 'Sunday'
    WHEN '1' THEN 'Monday'
    WHEN '2' THEN 'Tuesday'
    WHEN '3' THEN 'Wednesday'
    WHEN '4' THEN 'Thursday'
    WHEN '5' THEN 'Friday'
    WHEN '6' THEN 'Saturday'
  END AS day_of_week
FROM date_series;

Result: 31 rows with all dates in January 2024

Use Cases:

  • Generate calendar tables
  • Fill missing dates in time series
  • Calculate business days

Note: Date functions vary by database (this is SQLite syntax).


Solution 5: CTE with Window Functions

sql
WITH product_quartiles AS (
  SELECT 
    product_name,
    category,
    price,
    NTILE(4) OVER (PARTITION BY category ORDER BY price DESC) AS price_quartile
  FROM products
)
SELECT 
  product_name,
  category,
  price,
  price_quartile
FROM product_quartiles
WHERE price_quartile = 1  -- Top 25%
ORDER BY category, price DESC;

Result:

product_name          | category    | price   | price_quartile
----------------------|-------------|---------|----------------
Laptop Pro 15         | Electronics | 1299.99 | 1
Monitor 27"           | Electronics | 349.99  | 1
Standing Desk         | Furniture   | 499.99  | 1

Explanation:

  • NTILE(4) divides products into 4 quartiles within each category
  • Quartile 1 = top 25%
  • CTE makes it easy to filter on window function result

Alternative (without CTE - less readable):

sql
SELECT * FROM (
  SELECT 
    product_name, category, price,
    NTILE(4) OVER (PARTITION BY category ORDER BY price DESC) AS price_quartile
  FROM products
) WHERE price_quartile = 1;

Solution 6: Hierarchical Data (Bonus)

sql
-- First, create sample employees table
CREATE TABLE employees (
  employee_id INT PRIMARY KEY,
  employee_name VARCHAR(100),
  manager_id INT
);

INSERT INTO employees VALUES
(1, 'CEO Alice', NULL),
(2, 'VP Bob', 1),
(3, 'VP Charlie', 1),
(4, 'Manager David', 2),
(5, 'Manager Eve', 2),
(6, 'Engineer Frank', 4),
(7, 'Engineer Grace', 4),
(8, 'Engineer Henry', 5);

-- Recursive CTE to find all subordinates of employee 2 (VP Bob)
WITH RECURSIVE subordinates AS (
  -- Anchor: Start with VP Bob
  SELECT 
    employee_id,
    employee_name,
    manager_id,
    0 AS level
  FROM employees
  WHERE employee_id = 2
  
  UNION ALL
  
  -- Recursive: Find direct reports
  SELECT 
    e.employee_id,
    e.employee_name,
    e.manager_id,
    s.level + 1
  FROM employees e
  JOIN subordinates s ON e.manager_id = s.employee_id
)
SELECT 
  employee_id,
  REPEAT('  ', level) || employee_name AS employee_name,
  level
FROM subordinates
ORDER BY level, employee_id;

Result:

employee_id | employee_name        | level
------------|----------------------|------
2           | VP Bob               | 0
4           |   Manager David      | 1
5           |   Manager Eve        | 1
6           |     Engineer Frank   | 2
7           |     Engineer Grace   | 2
8           |     Engineer Henry   | 2

Explanation:

  • Anchor finds starting employee (VP Bob)
  • Recursive member finds all direct reports
  • Level tracks depth in hierarchy
  • REPEAT adds indentation for visualization

Use Cases:

  • Organization charts
  • Bill of materials (manufacturing)
  • Category trees
  • File system hierarchies

🚀 Follow-up Challenges

  1. CTE Reuse: Reference the same CTE multiple times in one query
  2. Recursive Limit: Add LIMIT to prevent infinite recursion
  3. Graph Traversal: Find shortest path between two nodes
  4. Running Calculations: Use CTE to calculate month-over-month growth

🔑 Key Takeaways

CTE vs Subquery vs Temp Table

FeatureCTESubqueryTemp Table
Readability✅ Excellent❌ Poor (nested)✅ Good
Reusability✅ Yes (in same query)❌ No✅ Yes (across queries)
Recursion✅ Yes❌ No❌ No
PerformanceSimilarSimilarCan be faster (indexed)
Cleanup✅ Automatic✅ Automatic❌ Manual

When to Use CTEs

Use CTEs when:

  • Query has multiple subqueries
  • Need to reference result set multiple times
  • Recursive query needed
  • Improving readability is priority

Don't use CTEs when:

  • Simple one-time subquery
  • Need to reuse across multiple queries (use temp table)
  • Performance is critical and temp table with indexes would be faster

Recursive CTE Structure

sql
WITH RECURSIVE cte_name AS (
  -- 1. ANCHOR MEMBER (base case)
  SELECT initial_values
  WHERE base_condition
  
  UNION ALL
  
  -- 2. RECURSIVE MEMBER (references cte_name)
  SELECT derived_values
  FROM cte_name
  JOIN other_tables
  WHERE termination_condition
)
SELECT * FROM cte_name;

Important: Always include termination condition to prevent infinite recursion!

Best Practices

  • Name CTEs descriptively - they're self-documenting
  • One CTE per logical step - don't cram everything into one
  • Comment complex CTEs - explain the business logic
  • Test CTEs independently - run each CTE separately during development
  • Limit recursion depth - use LIMIT or counter to prevent runaway queries
  • Consider materialization - for very large result sets, temp tables might be faster

📚 Advanced Patterns

Pattern 1: CTE Chain (Data Pipeline)

sql
WITH 
  -- Step 1: Filter raw data
  filtered_orders AS (
    SELECT * FROM orders WHERE order_date >= '2024-01-01'
  ),
  -- Step 2: Join with products
  enriched_orders AS (
    SELECT o.*, p.product_name, p.price
    FROM filtered_orders o
    JOIN products p ON o.product_id = p.product_id
  ),
  -- Step 3: Calculate metrics
  order_metrics AS (
    SELECT 
      order_id,
      product_name,
      quantity * price AS order_value
    FROM enriched_orders
  )
SELECT * FROM order_metrics WHERE order_value > 100;

Pattern 2: Recursive Fibonacci

sql
WITH RECURSIVE fibonacci AS (
  SELECT 0 AS n, 0 AS fib, 1 AS next_fib
  UNION ALL
  SELECT n + 1, next_fib, fib + next_fib
  FROM fibonacci
  WHERE n < 10
)
SELECT n, fib FROM fibonacci;

Pattern 3: Gap and Island Detection

sql
WITH RECURSIVE date_gaps AS (
  SELECT 
    order_date,
    LEAD(order_date) OVER (ORDER BY order_date) AS next_date,
    JULIANDAY(LEAD(order_date) OVER (ORDER BY order_date)) - JULIANDAY(order_date) AS gap_days
  FROM (SELECT DISTINCT order_date FROM orders)
)
SELECT * FROM date_gaps WHERE gap_days > 1;

⚠️ Common Pitfalls

⚠️ Cạm bẫy

Pitfall 1: Infinite Recursion

sql
-- WRONG: No termination condition
WITH RECURSIVE infinite AS (
  SELECT 1 AS n
  UNION ALL
  SELECT n + 1 FROM infinite  -- Never stops!
)
SELECT * FROM infinite;

-- CORRECT: Add termination
WITH RECURSIVE finite AS (
  SELECT 1 AS n
  UNION ALL
  SELECT n + 1 FROM finite WHERE n < 100
)
SELECT * FROM finite;

⚠️ Cạm bẫy

Pitfall 2: Forgetting RECURSIVE Keyword

sql
-- WRONG: Missing RECURSIVE
WITH number_series AS (
  SELECT 1 AS n
  UNION ALL
  SELECT n + 1 FROM number_series WHERE n < 10
)
SELECT * FROM number_series;  -- Error!

-- CORRECT: Add RECURSIVE
WITH RECURSIVE number_series AS (...)

⚠️ Cạm bẫy

Pitfall 3: CTE Scope

sql
-- WRONG: CTE only exists in one query
WITH temp AS (SELECT * FROM products)
SELECT * FROM temp;

SELECT * FROM temp;  -- Error: temp doesn't exist here

-- CORRECT: Use temp table for multiple queries
CREATE TEMP TABLE temp AS SELECT * FROM products;
SELECT * FROM temp;
SELECT * FROM temp;  -- Works