Skip to content

Exercise 10: Real-World Scenarios 🌍

🎯 Objective

Apply all SQL skills to solve realistic business problems with messy data, complex requirements, and performance constraints.

📋 Problem Statement

Real-world data is messy. Requirements are ambiguous. Performance matters. This exercise simulates actual production scenarios.

📥 Scenarios

Scenario 1: E-commerce Sales Report

Business Request: "I need a report showing our best-selling products, but only count orders that were actually delivered (not cancelled), and show me the profit margin for each product."

Complications:

  • Orders table has status column (pending, shipped, delivered, cancelled)
  • Products table has cost and price columns
  • Some products have NULL cost (data quality issue)
  • Report needed for last 90 days only

Scenario 2: Data Quality Audit

Business Request: "Our data is a mess. Find all the problems: duplicate products, orders without products, negative prices, future order dates, etc."

Requirements:

  • Identify all data quality issues
  • Quantify impact (how many rows affected)
  • Prioritize by severity
  • Suggest fixes

Scenario 3: Customer Lifetime Value (CLV)

Business Request: "Calculate customer lifetime value and identify our top 10% customers. Also show me which customers haven't ordered in 90 days but were previously active."

Requirements:

  • CLV = Total revenue from customer
  • Active = Ordered in last 90 days
  • Churned = No order in 90+ days but ordered before
  • Segment customers by value

Scenario 4: Inventory Optimization

Business Request: "Which products should we reorder? Consider sales velocity, current stock, lead time, and safety stock. Also identify slow-moving inventory."

Complications:

  • Suppliers have different lead times
  • Need to maintain 14 days of safety stock
  • Slow-moving = < 1 sale per week for 8 weeks

Scenario 5: Schema Design Challenge

Business Request: "We're adding a promotions feature. Customers can use discount codes. Design the schema and write queries for: applying discounts, tracking usage, preventing abuse."

Requirements:

  • Support percentage and fixed-amount discounts
  • Limit uses per customer
  • Expiration dates
  • Minimum order value
  • Track redemptions

💡 Hints

Hint 1: Data Quality Checks Common issues to check: - NULL values in required fields - Duplicates (same name, different ID) - Referential integrity (orphaned records) - Invalid values (negative prices, future dates) - Outliers (price = $999,999)
Hint 2: CLV Calculation ```sql WITH customer_revenue AS ( SELECT customer_id, SUM(order_value) AS total_revenue FROM orders GROUP BY customer_id ) SELECT *, NTILE(10) OVER (ORDER BY total_revenue DESC) AS decile FROM customer_revenue; ```
Hint 3: Schema Design Think about: - What entities exist? (Promotions, Redemptions) - What relationships? (1-to-many, many-to-many) - What constraints? (CHECK, UNIQUE, FOREIGN KEY) - What indexes? (For query performance)

Solutions

Click to reveal solutions

Solution 1: E-commerce Sales Report

sql
WITH delivered_orders AS (
  -- Filter to delivered orders in last 90 days
  SELECT 
    o.order_id,
    o.product_id,
    o.quantity,
    o.order_date
  FROM orders o
  WHERE o.status = 'delivered'
    AND o.order_date >= DATE('now', '-90 days')
),
product_sales AS (
  -- Calculate sales and profit per product
  SELECT 
    p.product_id,
    p.product_name,
    p.category,
    p.price,
    COALESCE(p.cost, p.price * 0.6) AS cost,  -- Assume 40% margin if cost is NULL
    SUM(do.quantity) AS units_sold,
    SUM(do.quantity * p.price) AS total_revenue,
    SUM(do.quantity * COALESCE(p.cost, p.price * 0.6)) AS total_cost
  FROM delivered_orders do
  JOIN products p ON do.product_id = p.product_id
  GROUP BY p.product_id, p.product_name, p.category, p.price, p.cost
)
SELECT 
  product_name,
  category,
  units_sold,
  ROUND(total_revenue, 2) AS revenue,
  ROUND(total_cost, 2) AS cost,
  ROUND(total_revenue - total_cost, 2) AS profit,
  ROUND(100.0 * (total_revenue - total_cost) / total_revenue, 2) AS profit_margin_pct,
  RANK() OVER (ORDER BY total_revenue DESC) AS revenue_rank,
  RANK() OVER (PARTITION BY category ORDER BY total_revenue DESC) AS category_rank
FROM product_sales
WHERE units_sold > 0
ORDER BY revenue DESC
LIMIT 20;

Result:

product_name          | category    | units_sold | revenue  | cost    | profit  | profit_margin_pct | revenue_rank | category_rank
----------------------|-------------|------------|----------|---------|---------|-------------------|--------------|---------------
Laptop Pro 15         | Electronics | 4          | 5199.96  | 3119.98 | 2079.98 | 40.00%            | 1            | 1
Standing Desk         | Furniture   | 2          | 999.98   | 599.99  | 399.99  | 40.00%            | 2            | 1

Key Decisions:

  • ✅ Filter by status = 'delivered' (business requirement)
  • ✅ Handle NULL cost with COALESCE (data quality issue)
  • ✅ Calculate both absolute profit and margin percentage
  • ✅ Rank overall and within category

Business Insight: Laptop Pro 15 is top revenue generator with healthy 40% margin.


Solution 2: Data Quality Audit

sql
-- Create comprehensive data quality report
WITH quality_checks AS (
  -- Check 1: Products with NULL or invalid prices
  SELECT 
    'Invalid Price' AS issue_type,
    'CRITICAL' AS severity,
    COUNT(*) AS affected_rows,
    'Products with NULL, negative, or zero price' AS description
  FROM products
  WHERE price IS NULL OR price <= 0
  
  UNION ALL
  
  -- Check 2: Duplicate product names
  SELECT 
    'Duplicate Products' AS issue_type,
    'HIGH' AS severity,
    COUNT(*) - COUNT(DISTINCT product_name) AS affected_rows,
    'Products with same name but different IDs' AS description
  FROM products
  HAVING COUNT(*) > COUNT(DISTINCT product_name)
  
  UNION ALL
  
  -- Check 3: Orders without valid products (orphaned)
  SELECT 
    'Orphaned Orders' AS issue_type,
    'CRITICAL' AS severity,
    COUNT(*) AS affected_rows,
    'Orders referencing non-existent products' AS description
  FROM orders o
  LEFT JOIN products p ON o.product_id = p.product_id
  WHERE p.product_id IS NULL
  
  UNION ALL
  
  -- Check 4: Future order dates
  SELECT 
    'Future Orders' AS issue_type,
    'HIGH' AS severity,
    COUNT(*) AS affected_rows,
    'Orders with dates in the future' AS description
  FROM orders
  WHERE order_date > DATE('now')
  
  UNION ALL
  
  -- Check 5: Negative quantities
  SELECT 
    'Negative Quantity' AS issue_type,
    'CRITICAL' AS severity,
    COUNT(*) AS affected_rows,
    'Orders with negative or zero quantity' AS description
  FROM orders
  WHERE quantity <= 0
  
  UNION ALL
  
  -- Check 6: Products with NULL supplier
  SELECT 
    'Missing Supplier' AS issue_type,
    'MEDIUM' AS severity,
    COUNT(*) AS affected_rows,
    'Products without assigned supplier' AS description
  FROM products
  WHERE supplier_id IS NULL
  
  UNION ALL
  
  -- Check 7: Suppliers without products
  SELECT 
    'Unused Suppliers' AS issue_type,
    'LOW' AS severity,
    COUNT(*) AS affected_rows,
    'Suppliers with no products' AS description
  FROM suppliers s
  LEFT JOIN products p ON s.supplier_id = p.supplier_id
  WHERE p.product_id IS NULL
  
  UNION ALL
  
  -- Check 8: Price outliers (> 3 standard deviations)
  SELECT 
    'Price Outliers' AS issue_type,
    'MEDIUM' AS severity,
    COUNT(*) AS affected_rows,
    'Products with prices > 3 std dev from mean' AS description
  FROM products
  WHERE price > (SELECT AVG(price) + 3 * STDEV(price) FROM products WHERE price IS NOT NULL)
)
SELECT 
  issue_type,
  severity,
  affected_rows,
  description,
  CASE severity
    WHEN 'CRITICAL' THEN 1
    WHEN 'HIGH' THEN 2
    WHEN 'MEDIUM' THEN 3
    WHEN 'LOW' THEN 4
  END AS priority_order
FROM quality_checks
WHERE affected_rows > 0
ORDER BY priority_order, affected_rows DESC;

Result:

issue_type          | severity | affected_rows | description                              | priority_order
--------------------|----------|---------------|------------------------------------------|----------------
Orphaned Orders     | CRITICAL | 5             | Orders referencing non-existent products | 1
Invalid Price       | CRITICAL | 2             | Products with NULL, negative, or zero... | 1
Future Orders       | HIGH     | 3             | Orders with dates in the future          | 2
Missing Supplier    | MEDIUM   | 8             | Products without assigned supplier       | 3
Unused Suppliers    | LOW      | 1             | Suppliers with no products               | 4

Recommended Fixes:

sql
-- Fix 1: Delete orphaned orders (or set to 'error' status)
DELETE FROM orders WHERE product_id NOT IN (SELECT product_id FROM products);

-- Fix 2: Set default price for NULL prices (or flag for review)
UPDATE products SET price = 0.01 WHERE price IS NULL;

-- Fix 3: Correct future dates (set to today)
UPDATE orders SET order_date = DATE('now') WHERE order_date > DATE('now');

-- Fix 4: Delete or correct negative quantities
DELETE FROM orders WHERE quantity <= 0;

Solution 3: Customer Lifetime Value (CLV)

sql
WITH customer_orders AS (
  SELECT 
    o.customer_id,
    COUNT(*) AS total_orders,
    SUM(o.quantity * p.price) AS lifetime_value,
    MIN(o.order_date) AS first_order_date,
    MAX(o.order_date) AS last_order_date,
    julianday('now') - julianday(MAX(o.order_date)) AS days_since_last_order
  FROM orders o
  JOIN products p ON o.product_id = p.product_id
  WHERE o.status = 'delivered'
  GROUP BY o.customer_id
),
customer_segments AS (
  SELECT 
    customer_id,
    total_orders,
    ROUND(lifetime_value, 2) AS lifetime_value,
    first_order_date,
    last_order_date,
    days_since_last_order,
    NTILE(10) OVER (ORDER BY lifetime_value DESC) AS value_decile,
    CASE 
      WHEN days_since_last_order <= 90 THEN 'Active'
      WHEN days_since_last_order > 90 AND total_orders > 1 THEN 'Churned'
      ELSE 'One-time'
    END AS customer_status
  FROM customer_orders
)
SELECT 
  customer_status,
  COUNT(*) AS customer_count,
  ROUND(AVG(lifetime_value), 2) AS avg_clv,
  ROUND(SUM(lifetime_value), 2) AS total_value,
  ROUND(100.0 * SUM(lifetime_value) / (SELECT SUM(lifetime_value) FROM customer_segments), 2) AS pct_of_total_value
FROM customer_segments
GROUP BY customer_status
ORDER BY total_value DESC;

-- Top 10% customers (value decile = 1)
SELECT 
  customer_id,
  total_orders,
  lifetime_value,
  customer_status,
  ROUND(lifetime_value / total_orders, 2) AS avg_order_value
FROM customer_segments
WHERE value_decile = 1
ORDER BY lifetime_value DESC;

-- Churned customers (for win-back campaign)
SELECT 
  customer_id,
  total_orders,
  lifetime_value,
  last_order_date,
  days_since_last_order
FROM customer_segments
WHERE customer_status = 'Churned'
  AND lifetime_value > 500  -- High-value churned customers
ORDER BY lifetime_value DESC;

Result - Segment Summary:

customer_status | customer_count | avg_clv  | total_value | pct_of_total_value
----------------|----------------|----------|-------------|--------------------
Active          | 45             | 1250.50  | 56272.50    | 65.00%
Churned         | 12             | 850.25   | 10203.00    | 12.00%
One-time        | 33             | 320.75   | 10584.75    | 23.00%

Business Insights:

  • 65% of revenue from active customers (healthy)
  • 12 churned customers worth $10K (win-back opportunity)
  • 33 one-time customers (improve retention)

Action Items:

  1. Win-back campaign for high-value churned customers
  2. Loyalty program for active customers
  3. Second-purchase incentive for one-time customers

Solution 4: Inventory Optimization

sql
WITH sales_velocity AS (
  -- Calculate average daily sales over last 56 days (8 weeks)
  SELECT 
    p.product_id,
    p.product_name,
    p.stock_quantity,
    p.supplier_id,
    COALESCE(AVG(o.quantity), 0) AS avg_daily_sales,
    COUNT(DISTINCT DATE(o.order_date)) AS days_with_sales
  FROM products p
  LEFT JOIN orders o ON p.product_id = o.product_id
    AND o.order_date >= DATE('now', '-56 days')
    AND o.status IN ('delivered', 'shipped')
  GROUP BY p.product_id, p.product_name, p.stock_quantity, p.supplier_id
),
reorder_analysis AS (
  SELECT 
    sv.product_id,
    sv.product_name,
    sv.stock_quantity,
    ROUND(sv.avg_daily_sales, 2) AS avg_daily_sales,
    s.supplier_name,
    COALESCE(s.lead_time_days, 7) AS lead_time_days,
    -- Calculate days of stock remaining
    ROUND(sv.stock_quantity / NULLIF(sv.avg_daily_sales, 0), 1) AS days_of_stock,
    -- Calculate reorder point (lead time + safety stock)
    ROUND(sv.avg_daily_sales * (COALESCE(s.lead_time_days, 7) + 14), 0) AS reorder_point,
    -- Classify inventory status
    CASE 
      WHEN sv.avg_daily_sales = 0 THEN 'Slow-Moving'
      WHEN sv.stock_quantity <= sv.avg_daily_sales * (COALESCE(s.lead_time_days, 7) + 14) THEN 'Reorder Now'
      WHEN sv.stock_quantity <= sv.avg_daily_sales * (COALESCE(s.lead_time_days, 7) + 21) THEN 'Reorder Soon'
      ELSE 'Adequate Stock'
    END AS inventory_status,
    -- Calculate suggested order quantity (30 days of stock)
    ROUND(GREATEST(0, (sv.avg_daily_sales * 30) - sv.stock_quantity), 0) AS suggested_order_qty
  FROM sales_velocity sv
  LEFT JOIN suppliers s ON sv.supplier_id = s.supplier_id
)
SELECT 
  inventory_status,
  COUNT(*) AS product_count,
  SUM(suggested_order_qty) AS total_units_to_order
FROM reorder_analysis
GROUP BY inventory_status
ORDER BY 
  CASE inventory_status
    WHEN 'Reorder Now' THEN 1
    WHEN 'Reorder Soon' THEN 2
    WHEN 'Adequate Stock' THEN 3
    WHEN 'Slow-Moving' THEN 4
  END;

-- Detailed reorder list
SELECT 
  product_name,
  stock_quantity,
  avg_daily_sales,
  days_of_stock,
  reorder_point,
  suggested_order_qty,
  supplier_name,
  lead_time_days
FROM reorder_analysis
WHERE inventory_status IN ('Reorder Now', 'Reorder Soon')
ORDER BY days_of_stock;

Result:

inventory_status | product_count | total_units_to_order
-----------------|---------------|---------------------
Reorder Now      | 3             | 450
Reorder Soon     | 5             | 320
Adequate Stock   | 12            | 0
Slow-Moving      | 2             | 0

Reorder List:

product_name          | stock_quantity | avg_daily_sales | days_of_stock | reorder_point | suggested_order_qty | supplier_name | lead_time_days
----------------------|----------------|-----------------|---------------|---------------|---------------------|---------------|----------------
Wireless Mouse        | 15             | 2.5             | 6.0           | 52            | 60                  | ElectroSupply | 7
USB-C Cable           | 20             | 1.8             | 11.1          | 38            | 34                  | ElectroSupply | 7
Laptop Pro 15         | 5              | 0.5             | 10.0          | 11            | 10                  | TechCorp      | 7

Business Actions:

  1. Urgent: Order 60 Wireless Mouse units (only 6 days of stock)
  2. Soon: Order USB-C Cables and Laptops
  3. Review: 2 slow-moving products (consider clearance sale)

Solution 5: Schema Design - Promotions Feature

sql
-- Schema Design
CREATE TABLE promotions (
  promotion_id INTEGER PRIMARY KEY AUTOINCREMENT,
  code VARCHAR(50) UNIQUE NOT NULL,
  description TEXT,
  discount_type VARCHAR(20) NOT NULL CHECK (discount_type IN ('percentage', 'fixed')),
  discount_value DECIMAL(10, 2) NOT NULL CHECK (discount_value > 0),
  min_order_value DECIMAL(10, 2) DEFAULT 0,
  max_uses_per_customer INT DEFAULT NULL,  -- NULL = unlimited
  max_total_uses INT DEFAULT NULL,  -- NULL = unlimited
  start_date DATE NOT NULL,
  end_date DATE NOT NULL,
  is_active BOOLEAN DEFAULT TRUE,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  CHECK (end_date >= start_date)
);

CREATE TABLE promotion_redemptions (
  redemption_id INTEGER PRIMARY KEY AUTOINCREMENT,
  promotion_id INT NOT NULL,
  order_id INT NOT NULL,
  customer_id INT NOT NULL,
  discount_amount DECIMAL(10, 2) NOT NULL,
  redeemed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (promotion_id) REFERENCES promotions(promotion_id),
  FOREIGN KEY (order_id) REFERENCES orders(order_id),
  FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

-- Indexes for performance
CREATE INDEX idx_promotions_code ON promotions(code);
CREATE INDEX idx_promotions_active_dates ON promotions(is_active, start_date, end_date);
CREATE INDEX idx_redemptions_promotion ON promotion_redemptions(promotion_id);
CREATE INDEX idx_redemptions_customer ON promotion_redemptions(customer_id);

-- Sample promotions
INSERT INTO promotions (code,
  description,
  discount_type,
  discount_value,
  min_order_value,
  max_uses_per_customer,
  start_date,
  end_date)
VALUES 
('WELCOME10', '10% off first order', 'percentage', 10, 0, 1, '2024-01-01', '2024-12-31'),
('SAVE20', '$20 off orders over $100', 'fixed', 20, 100, NULL, '2024-01-01', '2024-12-31'),
('VIP50', '50% off for VIP customers', 'percentage', 50, 200, 5, '2024-01-01', '2024-12-31');

-- Query 1: Validate promotion code
CREATE VIEW valid_promotions AS
SELECT 
  p.*,
  COALESCE(COUNT(pr.redemption_id), 0) AS total_uses
FROM promotions p
LEFT JOIN promotion_redemptions pr ON p.promotion_id = pr.promotion_id
WHERE p.is_active = TRUE
  AND DATE('now') BETWEEN p.start_date AND p.end_date
GROUP BY p.promotion_id;

-- Query 2: Check if customer can use promotion
WITH customer_usage AS (
  SELECT 
    promotion_id,
    customer_id,
    COUNT(*) AS customer_uses
  FROM promotion_redemptions
  GROUP BY promotion_id, customer_id
)
SELECT 
  vp.*,
  COALESCE(cu.customer_uses, 0) AS customer_uses,
  CASE 
    WHEN vp.max_total_uses IS NOT NULL AND vp.total_uses >= vp.max_total_uses THEN 'Promotion fully redeemed'
    WHEN vp.max_uses_per_customer IS NOT NULL AND COALESCE(cu.customer_uses,
      0) >= vp.max_uses_per_customer THEN 'Customer limit reached'
    ELSE 'Valid'
  END AS validation_status
FROM valid_promotions vp
LEFT JOIN customer_usage cu ON vp.promotion_id = cu.promotion_id AND cu.customer_id = ?
WHERE vp.code = ?;

-- Query 3: Calculate discount amount
CREATE FUNCTION calculate_discount(
  order_value DECIMAL(10, 2),
  discount_type VARCHAR(20),
  discount_value DECIMAL(10, 2),
  min_order_value DECIMAL(10, 2)
) RETURNS DECIMAL(10, 2)
BEGIN
  IF order_value < min_order_value THEN
    RETURN 0;
  END IF;
  
  IF discount_type = 'percentage' THEN
    RETURN ROUND(order_value * discount_value / 100, 2);
  ELSE
    RETURN discount_value;
  END IF;
END;

-- Query 4: Apply promotion to order
BEGIN TRANSACTION;

-- Calculate order value
SELECT @order_value := SUM(quantity * price)
FROM order_items oi
JOIN products p ON oi.product_id = p.product_id
WHERE oi.order_id = ?;

-- Validate and calculate discount
SELECT 
  @discount_amount := calculate_discount(@order_value, discount_type, discount_value, min_order_value),
  @promotion_id := promotion_id
FROM valid_promotions
WHERE code = ?
  AND validation_status = 'Valid';

-- Record redemption
INSERT INTO promotion_redemptions (promotion_id, order_id, customer_id, discount_amount)
VALUES (@promotion_id, ?, ?, @discount_amount);

-- Update order total
UPDATE orders
SET discount_amount = @discount_amount,
    final_amount = @order_value - @discount_amount
WHERE order_id = ?;

COMMIT;

-- Query 5: Promotion performance report
SELECT 
  p.code,
  p.description,
  COUNT(pr.redemption_id) AS total_redemptions,
  COUNT(DISTINCT pr.customer_id) AS unique_customers,
  SUM(pr.discount_amount) AS total_discount_given,
  ROUND(AVG(pr.discount_amount), 2) AS avg_discount,
  MIN(pr.redeemed_at) AS first_use,
  MAX(pr.redeemed_at) AS last_use
FROM promotions p
LEFT JOIN promotion_redemptions pr ON p.promotion_id = pr.promotion_id
WHERE p.start_date >= DATE('now', '-90 days')
GROUP BY p.promotion_id, p.code, p.description
ORDER BY total_redemptions DESC;

Schema Design Decisions:

  • ✅ Separate tables for promotions and redemptions (normalized)
  • ✅ CHECK constraints for data validation
  • ✅ Indexes on frequently queried columns
  • ✅ Audit trail (created_at, redeemed_at)
  • ✅ Flexible discount types (percentage, fixed)
  • ✅ Usage limits (per customer, total)

Anti-Abuse Measures:

  • Max uses per customer
  • Max total uses
  • Expiration dates
  • Minimum order value
  • Audit trail of all redemptions

🚀 Follow-up Challenges

  1. A/B Testing: Design schema and queries for tracking experiment variants and conversions
  2. Fraud Detection: Identify suspicious patterns (multiple accounts, unusual order patterns)
  3. Recommendation Engine: "Customers who bought X also bought Y"
  4. Multi-Currency Support: Handle orders in different currencies with exchange rates

🔑 Key Takeaways

Real-World SQL Skills

1. Data Quality

  • Always validate data before analysis
  • Handle NULLs gracefully (COALESCE, NULLIF)
  • Check for outliers and anomalies
  • Document assumptions

2. Business Logic

  • Understand requirements deeply
  • Ask clarifying questions
  • Handle edge cases
  • Provide actionable insights

3. Performance

  • Index appropriately
  • Avoid N+1 queries
  • Use CTEs for readability
  • Test with realistic data volumes

4. Schema Design

  • Normalize to reduce redundancy
  • Add constraints for data integrity
  • Index foreign keys
  • Plan for scalability

Production SQL Checklist

Before Deployment:

  • [ ] Test with production-like data volume
  • [ ] Handle NULL values
  • [ ] Add appropriate indexes
  • [ ] Use transactions for multi-step operations
  • [ ] Add error handling
  • [ ] Document assumptions
  • [ ] Test edge cases
  • [ ] Review execution plan (EXPLAIN)

After Deployment:

  • [ ] Monitor query performance
  • [ ] Track slow queries
  • [ ] Review index usage
  • [ ] Optimize based on actual usage patterns

Common Business Metrics

MetricSQL Pattern
RevenueSUM(quantity * price)
Average Order ValueSUM(revenue) / COUNT(DISTINCT order_id)
Customer Lifetime ValueSUM(revenue) GROUP BY customer_id
Retention RateCOUNT(returning) / COUNT(total) * 100
Churn Rate100 - retention_rate
Conversion RateCOUNT(converted) / COUNT(visitors) * 100
Days of Inventorystock / avg_daily_sales
Profit Margin(revenue - cost) / revenue * 100

📚 Production Patterns

Pattern 1: Idempotent Operations

sql
-- Can run multiple times safely
INSERT INTO products (product_id, product_name, price)
VALUES (1, 'Laptop', 999.99)
ON CONFLICT (product_id) DO UPDATE SET price = EXCLUDED.price;

Pattern 2: Soft Deletes

sql
-- Don't delete, mark as deleted
UPDATE products SET deleted_at = CURRENT_TIMESTAMP WHERE product_id = ?;

-- Query only active records
SELECT * FROM products WHERE deleted_at IS NULL;

Pattern 3: Audit Logging

sql
-- Track all changes
CREATE TABLE audit_log (
  log_id INT PRIMARY KEY,
  table_name VARCHAR(50),
  record_id INT,
  action VARCHAR(20),
  old_values JSON,
  new_values JSON,
  changed_by INT,
  changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Pattern 4: Optimistic Locking

sql
-- Prevent concurrent updates
UPDATE products
SET stock_quantity = stock_quantity - 1,
    version = version + 1
WHERE product_id = ? AND version = ?;

-- Check affected rows (should be 1)
-- If 0, someone else updated it (retry)

⚠️ Production Pitfalls

⚠️ Cạm bẫy

Pitfall 1: Not Using Transactions

sql
-- WRONG: Partial updates if error occurs
UPDATE inventory SET quantity = quantity - 10 WHERE product_id = 1;
INSERT INTO orders (product_id, quantity) VALUES (1, 10);

-- CORRECT: All or nothing
BEGIN;
UPDATE inventory SET quantity = quantity - 10 WHERE product_id = 1;
INSERT INTO orders (product_id, quantity) VALUES (1, 10);
COMMIT;

⚠️ Cạm bẫy

Pitfall 2: Ignoring Time Zones

sql
-- WRONG: Assumes server timezone
SELECT * FROM orders WHERE order_date = CURRENT_DATE;

-- CORRECT: Use UTC and convert
SELECT * FROM orders WHERE order_date AT TIME ZONE 'UTC' = CURRENT_DATE;

⚠️ Cạm bẫy

Pitfall 3: Not Handling Concurrent Access

sql
-- WRONG: Race condition (two users can buy last item)
SELECT stock FROM products WHERE id = 1;  -- Returns 1
-- Another user does same
UPDATE products SET stock = stock - 1 WHERE id = 1;  -- Both succeed!

-- CORRECT: Atomic operation
UPDATE products SET stock = stock - 1 
WHERE id = 1 AND stock > 0;
-- Check affected rows (only one succeeds)

🎓 Congratulations!

You've completed all 10 SQL exercises! You now have the skills to:

  • Write complex analytical queries
  • Optimize query performance
  • Design robust schemas
  • Handle real-world data quality issues
  • Solve business problems with SQL

Next Steps:

  • Practice with your own datasets
  • Contribute to open-source projects
  • Build a portfolio of SQL projects
  • Learn database-specific features (PostgreSQL, MySQL, SQL Server)

Keep Learning:

  • Advanced topics: Query optimization, indexing strategies
  • Database administration: Backup, replication, monitoring
  • Data engineering: ETL pipelines, data warehousing
  • Analytics: BI tools, data visualization

Remember: SQL is a skill that improves with practice. Keep querying! 🚀