Giao diện
Exercise 4: Subqueries 🎭
🎯 Objective
Master subqueries (nested queries) for complex data retrieval and filtering.
📋 Problem Statement
Using the same tables (products, orders, suppliers), learn to write queries within queries.
📥 Tasks
Task 1: Products Above Average Price
Find all products with price higher than the average price of all products.
Task 2: Suppliers with High-Value Products
Find suppliers who have at least one product priced above $200. Use a subquery with IN.
Task 3: Products Ordered by Specific Customer
Find all products that customer 501 has ordered. Use EXISTS.
Task 4: Most Expensive Product per Category
For each category, find the product(s) with the highest price. Use a correlated subquery.
Task 5: Customers Who Ordered Electronics
Find customer_ids who have ordered at least one Electronics product.
Task 6: Suppliers Without Recent Orders
Find suppliers whose products haven't been ordered in 2024. Use NOT EXISTS.
💡 Hints
Hint 1: Subquery Types
- **Scalar subquery**: Returns single value (used with =, <, >) - **Row subquery**: Returns single row - **Column subquery**: Returns single column (used with IN, ANY, ALL) - **Table subquery**: Returns multiple rows and columns (used in FROM)Hint 2: IN vs EXISTS
- `IN`: Good for small result sets, checks if value is in list - `EXISTS`: Better for large result sets, checks if subquery returns any rows - `EXISTS` stops at first match (more efficient)Hint 3: Correlated Subquery
References column from outer query. Executes once per outer row. ```sql SELECT * FROM table1 t1 WHERE column > (SELECT AVG(column) FROM table1 t2 WHERE t2.category = t1.category); ```✅ Solutions
Click to reveal solutions
Solution 1: Products Above Average Price
sql
SELECT product_name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products)
ORDER BY price DESC;Result:
product_name | price
----------------------|--------
Laptop Pro 15 | 1299.99
Standing Desk | 499.99
Monitor 27" | 349.99
Office Chair | 199.99Explanation:
- Inner query calculates average price: ~$316
- Outer query filters products above this average
- Subquery executes once (scalar subquery)
Performance: Efficient because subquery returns single value.
Solution 2: Suppliers with High-Value Products
sql
SELECT supplier_name, country
FROM suppliers
WHERE supplier_id IN (
SELECT DISTINCT supplier_id
FROM products
WHERE price > 200
)
ORDER BY supplier_name;Result:
supplier_name | country
-----------------|--------
FurniturePro | Germany
TechCorp | USAExplanation: Subquery returns list of supplier_ids with expensive products, outer query filters suppliers.
Alternative (using EXISTS - often faster):
sql
SELECT s.supplier_name, s.country
FROM suppliers s
WHERE EXISTS (
SELECT 1
FROM products p
WHERE p.supplier_id = s.supplier_id AND p.price > 200
);Performance Comparison:
IN: Subquery executes once, returns listEXISTS: Stops at first match per supplier (better for large datasets)
Solution 3: Products Ordered by Specific Customer
sql
SELECT DISTINCT p.product_name, p.price
FROM products p
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.product_id = p.product_id AND o.customer_id = 501
)
ORDER BY p.product_name;Result:
product_name | price
----------------------|--------
Laptop Pro 15 | 1299.99
Office Chair | 199.99
Standing Desk | 499.99Explanation: EXISTS checks if there's at least one order by customer 501 for each product.
Alternative (using IN):
sql
SELECT DISTINCT product_name, price
FROM products
WHERE product_id IN (
SELECT product_id FROM orders WHERE customer_id = 501
);Why EXISTS here? More semantic - we're checking existence, not matching values.
Solution 4: Most Expensive Product per Category
sql
SELECT p1.category, p1.product_name, p1.price
FROM products p1
WHERE p1.price = (
SELECT MAX(p2.price)
FROM products p2
WHERE p2.category = p1.category
)
ORDER BY p1.category, p1.product_name;Result:
category | product_name | price
-------------|-----------------------|--------
Electronics | Laptop Pro 15 | 1299.99
Furniture | Standing Desk | 499.99Explanation:
- Correlated subquery: Inner query references outer query's category
- For each product, finds max price in its category
- Keeps products matching that max price
Performance Warning: Correlated subqueries can be slow on large datasets (executes once per row).
Alternative (using window functions - more efficient):
sql
WITH ranked AS (
SELECT
category,
product_name,
price,
RANK() OVER (PARTITION BY category ORDER BY price DESC) as rank
FROM products
)
SELECT category, product_name, price
FROM ranked
WHERE rank = 1;Solution 5: Customers Who Ordered Electronics
sql
SELECT DISTINCT o.customer_id
FROM orders o
WHERE o.product_id IN (
SELECT product_id
FROM products
WHERE category = 'Electronics'
)
ORDER BY o.customer_id;Result:
customer_id
-----------
501
502
503
504
506Explanation: Subquery finds all Electronics product_ids, outer query finds customers who ordered them.
Alternative (using JOIN - often clearer):
sql
SELECT DISTINCT o.customer_id
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE p.category = 'Electronics'
ORDER BY o.customer_id;When to use subquery vs JOIN?
- Subquery: When you only need to filter, not retrieve data from inner table
- JOIN: When you need columns from both tables
Solution 6: Suppliers Without Recent Orders
sql
SELECT s.supplier_name, s.country
FROM suppliers s
WHERE NOT EXISTS (
SELECT 1
FROM products p
JOIN orders o ON p.product_id = o.product_id
WHERE p.supplier_id = s.supplier_id
AND o.order_date >= '2024-01-01'
)
ORDER BY s.supplier_name;Result:
supplier_name | country
-----------------|--------
GlobalParts | JapanExplanation:
- NOT EXISTS checks for absence of orders in 2024
- Subquery joins products and orders for each supplier
- GlobalParts has no products, so no orders
Alternative (using LEFT JOIN + NULL check):
sql
SELECT DISTINCT s.supplier_name, s.country
FROM suppliers s
LEFT JOIN products p ON s.supplier_id = p.supplier_id
LEFT JOIN orders o ON p.product_id = o.product_id AND o.order_date >= '2024-01-01'
WHERE o.order_id IS NULL;🚀 Follow-up Challenges
- Nested Subqueries: Find products more expensive than the average price of products from the same supplier
- Multiple Subqueries: Find customers who ordered both Electronics AND Furniture
- Subquery in SELECT: Show each product with its price and the category average price
- ANY/ALL Operators: Find products more expensive than ALL products in Furniture category
🔑 Key Takeaways
Subquery Types
| Type | Returns | Example Use |
|---|---|---|
| Scalar | Single value | WHERE price > (SELECT AVG(price)...) |
| Column | Single column | WHERE id IN (SELECT id...) |
| Row | Single row | WHERE (col1, col2) = (SELECT col1, col2...) |
| Table | Multiple rows/cols | FROM (SELECT...) AS subquery |
IN vs EXISTS
sql
-- IN: Subquery returns list of values
WHERE column IN (SELECT column FROM table WHERE condition)
-- EXISTS: Checks if subquery returns any rows
WHERE EXISTS (SELECT 1 FROM table WHERE condition)Performance:
IN: Better for small result setsEXISTS: Better for large result sets (stops at first match)NOT EXISTS: Often faster thanNOT IN(handles NULLs better)
Correlated vs Non-Correlated
Non-Correlated (Independent):
sql
-- Executes once
SELECT * FROM products
WHERE price > (SELECT AVG(price) FROM products);Correlated (Dependent):
sql
-- Executes once per outer row
SELECT * FROM products p1
WHERE price > (SELECT AVG(price) FROM products p2 WHERE p2.category = p1.category);Best Practices
- Prefer JOINs when you need data from multiple tables
- Use EXISTS for existence checks (more semantic)
- Avoid correlated subqueries on large datasets (use window functions instead)
- Use CTEs for complex subqueries (better readability)
- Test performance - subqueries vs JOINs can vary by database
📚 Advanced Patterns
Pattern 1: Subquery in SELECT (Scalar Subquery)
sql
SELECT
product_name,
price,
(SELECT AVG(price) FROM products) AS avg_price,
price - (SELECT AVG(price) FROM products) AS price_diff
FROM products;Pattern 2: Subquery in FROM (Derived Table)
sql
SELECT category, avg_price
FROM (
SELECT category, AVG(price) AS avg_price
FROM products
GROUP BY category
) AS category_stats
WHERE avg_price > 100;Pattern 3: Multiple Subqueries
sql
SELECT product_name
FROM products
WHERE price > (SELECT AVG(price) FROM products)
AND supplier_id IN (SELECT supplier_id FROM suppliers WHERE rating > 4.5);Pattern 4: ANY/ALL Operators
sql
-- More expensive than ANY Furniture product
SELECT product_name, price
FROM products
WHERE price > ANY (SELECT price FROM products WHERE category = 'Furniture');
-- More expensive than ALL Furniture products
SELECT product_name, price
FROM products
WHERE price > ALL (SELECT price FROM products WHERE category = 'Furniture');⚠️ Common Pitfalls
⚠️ Cạm bẫy
Pitfall 1: NULL in NOT IN
sql
-- WRONG: Returns no rows if subquery contains NULL
SELECT * FROM products
WHERE supplier_id NOT IN (SELECT supplier_id FROM suppliers WHERE rating < 4.0);
-- CORRECT: Use NOT EXISTS or filter NULLs
SELECT * FROM products p
WHERE NOT EXISTS (
SELECT 1 FROM suppliers s
WHERE s.supplier_id = p.supplier_id AND s.rating < 4.0
);⚠️ Cạm bẫy
Pitfall 2: Correlated Subquery Performance
sql
-- SLOW: Executes subquery for each row
SELECT product_name,
(SELECT supplier_name FROM suppliers s WHERE s.supplier_id = p.supplier_id)
FROM products p;
-- FAST: Use JOIN instead
SELECT p.product_name, s.supplier_name
FROM products p
JOIN suppliers s ON p.supplier_id = s.supplier_id;⚠️ Cạm bẫy
Pitfall 3: Forgetting DISTINCT
sql
-- May return duplicates
SELECT customer_id FROM orders WHERE product_id IN (...);
-- Better
SELECT DISTINCT customer_id FROM orders WHERE product_id IN (...);