Skip to content

Exercise 2: Aggregations & GROUP BY 📊

🎯 Objective

Master SQL aggregation functions and GROUP BY clause for data summarization.

📋 Problem Statement

Using the same products table from Exercise 1, plus an orders table:

sql
CREATE TABLE orders (
  order_id INT PRIMARY KEY,
  product_id INT,
  quantity INT,
  order_date DATE,
  customer_id INT,
  FOREIGN KEY (product_id) REFERENCES products(product_id)
);

Sample Data:

sql
INSERT INTO orders VALUES
(1, 1, 2, '2024-01-15', 501),
(2, 2, 5, '2024-01-16', 502),
(3, 1, 1, '2024-01-17', 503),
(4, 3, 3, '2024-01-18', 501),
(5, 5, 10, '2024-01-19', 504),
(6, 2, 3, '2024-01-20', 502),
(7, 6, 1, '2024-01-21', 505),
(8, 4, 2, '2024-01-22', 501),
(9, 8, 4, '2024-01-23', 503),
(10, 1, 1, '2024-01-24', 506);

📥 Tasks

Task 1: Total Products

Count the total number of products in each category.

Task 2: Average Price by Category

Calculate the average price for each category, rounded to 2 decimal places.

Task 3: Inventory Value

Calculate the total inventory value (price × stock_quantity) for each category.

Task 4: Order Statistics

Find the total quantity ordered for each product (show product_name and total_quantity).

Task 5: High-Value Categories

Find categories where the average price is greater than $100.

Task 6: Customer Order Count

Count how many orders each customer has placed (show customer_id and order_count).

💡 Hints

Hint 1: Aggregation Functions - `COUNT(*)` counts all rows - `SUM(column)` adds up values - `AVG(column)` calculates average - `MIN(column)` finds minimum - `MAX(column)` finds maximum
Hint 2: GROUP BY When using aggregation functions, you must GROUP BY any non-aggregated columns in SELECT.
Hint 3: HAVING vs WHERE - `WHERE` filters rows BEFORE aggregation - `HAVING` filters groups AFTER aggregation
Hint 4: JOIN for Product Names To show product names with order statistics, you'll need to JOIN products and orders tables.

Solutions

Click to reveal solutions

Solution 1: Total Products

sql
SELECT category, COUNT(*) AS product_count
FROM products
GROUP BY category
ORDER BY product_count DESC;

Result:

category     | product_count
-------------|---------------
Electronics  | 5
Furniture    | 3

Explanation: GROUP BY groups rows by category, COUNT(*) counts rows in each group.


Solution 2: Average Price by Category

sql
SELECT 
  category, 
  ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category
ORDER BY avg_price DESC;

Result:

category     | avg_price
-------------|----------
Electronics  | 356.59
Furniture    | 248.66

Explanation: AVG() calculates mean, ROUND() formats to 2 decimals.


Solution 3: Inventory Value

sql
SELECT 
  category,
  SUM(price * stock_quantity) AS total_inventory_value
FROM products
GROUP BY category
ORDER BY total_inventory_value DESC;

Result:

category     | total_inventory_value
-------------|----------------------
Electronics  | 56,994.25
Furniture    | 13,559.25

Explanation: Calculate value per product, then SUM by category.

Pro Tip: Format currency in application layer, not SQL (for better performance).


Solution 4: Order Statistics

sql
SELECT 
  p.product_name,
  SUM(o.quantity) AS total_quantity_ordered
FROM orders o
JOIN products p ON o.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_quantity_ordered DESC;

Result:

product_name          | total_quantity_ordered
----------------------|-----------------------
USB-C Cable           | 10
Wireless Mouse        | 8
Laptop Pro 15         | 4
Keyboard Mechanical   | 4
Office Chair          | 3
Standing Desk         | 2
Monitor 27"           | 1

Explanation: JOIN to get product names, GROUP BY product, SUM quantities.

Important: Include product_id in GROUP BY for uniqueness (in case of duplicate names).


Solution 5: High-Value Categories

sql
SELECT 
  category,
  AVG(price) AS avg_price
FROM products
GROUP BY category
HAVING AVG(price) > 100
ORDER BY avg_price DESC;

Result:

category     | avg_price
-------------|----------
Electronics  | 356.59
Furniture    | 248.66

Explanation: HAVING filters groups after aggregation. WHERE would filter individual products before grouping.

Common Mistake: Using WHERE instead of HAVING for aggregate conditions.


Solution 6: Customer Order Count

sql
SELECT 
  customer_id,
  COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
ORDER BY order_count DESC;

Result:

customer_id | order_count
------------|------------
501         | 3
502         | 2
503         | 2
504         | 1
505         | 1
506         | 1

Explanation: Simple GROUP BY with COUNT to find order frequency per customer.

🚀 Follow-up Challenges

  1. Multiple Aggregations: Show category, product count, average price, and total inventory value in one query
  2. Date Aggregations: Count orders per day/month (Hint: use DATE functions)
  3. Conditional Aggregation: Count how many products in each category have price > $50 (Hint: use CASE)
  4. Top N per Group: Find the most expensive product in each category (Hint: window functions or subquery)

🔑 Key Takeaways

  • Aggregation Functions: COUNT, SUM, AVG, MIN, MAX summarize data
  • GROUP BY: Groups rows with same values for aggregation
  • HAVING: Filters groups after aggregation (use with GROUP BY)
  • WHERE vs HAVING: WHERE filters rows, HAVING filters groups
  • NULL Handling: Most aggregate functions ignore NULL values (except COUNT(*))
  • Performance: Aggregations can be expensive on large datasets - consider indexes

📚 Common Patterns

Pattern 1: Count with Condition

sql
-- Count products with price > 100
SELECT COUNT(*) 
FROM products 
WHERE price > 100;

Pattern 2: Multiple Aggregations

sql
SELECT 
  category,
  COUNT(*) AS count,
  AVG(price) AS avg_price,
  MIN(price) AS min_price,
  MAX(price) AS max_price
FROM products
GROUP BY category;

Pattern 3: Conditional Aggregation

sql
SELECT 
  category,
  SUM(CASE WHEN price > 100 THEN 1 ELSE 0 END) AS expensive_count,
  SUM(CASE WHEN price <= 100 THEN 1 ELSE 0 END) AS affordable_count
FROM products
GROUP BY category;