ToolPopToolPop
SQL · Lesson 5 of 8

GROUP BY and aggregates, counting like a manager

11 min readUpdated 24 Jun 2026

So far our queries have returned raw rows. Useful, but rarely what your boss actually asks for. They want totals. Averages. Counts per city. This is what aggregates are for.

GROUP BY is where SQL starts to feel like a real analysis tool. This is the lesson where you stop being a "data fetcher" and start being someone who answers questions.

Diagram
rendering diagram...
GROUP BY collapses many rows into one row per group

The five aggregates

These are functions that collapse many rows into one number.

  • COUNT(*) how many rows. COUNT(column) ignores NULLs.
  • SUM(column) total of a numeric column.
  • AVG(column) average. Watch the precision, sometimes returns decimals you don't expect.
  • MIN(column) smallest value. Works on numbers, dates, even strings (alphabetical).
  • MAX(column) largest value.
sql
SELECT COUNT(*), SUM(total), AVG(total)
FROM orders;

That gives you one row with three numbers. The total order count, the total revenue, and the average order value, for the entire table.

GROUP BY: one row per bucket

What if you want those numbers per city? Add GROUP BY.

sql
SELECT city, COUNT(*), SUM(total)
FROM orders
GROUP BY city;

Now you get one row per city. Read GROUP BY as "make one bucket per unique value of this column, then run the aggregates inside each bucket."

  • Every non-aggregate column in SELECT must appear in GROUP BY. This is a hard rule, not a suggestion.
  • You can GROUP BY multiple columns: GROUP BY city, status gives one row per city-status combo.

The mental model: GROUP BY creates buckets. Aggregates summarise each bucket. If a column is not bucketed and not aggregated, SQL does not know what to do with it, and it will yell at you.

HAVING vs WHERE

Both filter rows. The difference is when.

  • WHERE filters rows before grouping.
  • HAVING filters groups after aggregating.
sql
SELECT city, COUNT(*) AS order_count
FROM orders
WHERE status = 'delivered'
GROUP BY city
HAVING COUNT(*) > 100;

Read it top to bottom. Take delivered orders only (WHERE). Bucket them by city (GROUP BY). Show only cities with more than 100 (HAVING).

If you tried to put COUNT(*) > 100 in WHERE, SQL would refuse. The count does not exist yet at WHERE time.

A real query, end to end

"Top 5 cities by revenue from delivered orders."

sql
SELECT city, SUM(total) AS revenue
FROM orders
WHERE status = 'delivered'
GROUP BY city
ORDER BY revenue DESC
LIMIT 5;

Every clause is doing one job. WHERE filters. GROUP BY buckets. SUM aggregates. ORDER BY sorts. LIMIT cuts. This is the shape of nine out of ten analytics queries.

Try it on the right

Three GROUP BY assignments. The third one combines GROUP BY with a JOIN, which is where it gets fun. Next: indexes, and why your slow query is probably missing one.

Free tools you can use while you learn

Chai0/3 done

Watching quietly. Tap me if you want a tip.

SQL Playground

Try this (0 of 3 done)

  1. 1

    Revenue earned by each restaurant from delivered orders.

    hint

    GROUP BY restaurant, SUM(total), filter delivered first.

    show answer
    SELECT r.name, SUM(o.total) AS revenue
    FROM orders o
    JOIN restaurants r ON r.id = o.restaurant_id
    WHERE o.status = 'delivered'
    GROUP BY r.name
    ORDER BY revenue DESC;
  2. 2

    Customers who placed more than 2 orders. (HAVING clause.)

    hint

    GROUP BY customer then HAVING COUNT(*) > 2

    show answer
    SELECT c.name, COUNT(*) AS order_count
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    GROUP BY c.name
    HAVING COUNT(*) > 2;
  3. 3

    Average order value per city.

    hint

    AVG(total) grouped by city. JOIN customers.

    show answer
    SELECT c.city, ROUND(AVG(o.total), 2) AS avg_order
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    GROUP BY c.city;