Indexes, the IRCTC Tatkal moment
It is 10:00 AM. Tatkal opens. Lakhs of people slam IRCTC at the same time. Why does it sometimes return in two seconds, and sometimes spin forever? A lot of it comes down to one word: indexes.
If your query is slow, ninety percent of the time the answer is "you are missing an index." Let us learn what they are and why they matter.
A database is a book without a table of contents
Imagine a thousand-page recipe book. You want every recipe that uses paneer. With no index at the back, you flip every page. With an index, you jump straight to the "P" section and find "paneer, pages 47, 112, 389."
That is exactly what a database index is. A separate, sorted, lookup-friendly copy of one or more columns, with pointers back to the full rows.
- Without an index, SQL scans every row. This is called a full table scan.
- With an index, SQL walks a tree to find matching rows in milliseconds.
B-tree, in one sentence
Most indexes are B-trees, balanced trees where each lookup takes roughly log(n) steps. A million rows? About 20 hops. A billion rows? About 30 hops. That is why indexes feel like magic. The cost barely grows as data grows.
CREATE INDEX idx_orders_customer ON orders(customer_id);That one line could turn a 10-second query into a 10-millisecond one. Truly.
When indexes help
- WHERE on indexed columns.
WHERE customer_id = 42flies. - JOIN keys. Always, always index your foreign keys.
- ORDER BY on indexed columns can skip the sort entirely.
- Range scans.
BETWEEN,<,>all use B-tree indexes happily.
When indexes do nothing
SELECT *on a small table. Below a few thousand rows, the full scan is faster than the index lookup. The planner knows this and ignores your index.- Functions on the column.
WHERE LOWER(email) = '[email protected]'cannot use a regular index onemail. The function hides the column. LIKE '%chai'with a leading wildcard. B-trees sort left to right, a leading%defeats them.
Adding an index is not free. Every INSERT, UPDATE, DELETE has to update the index too. Index the columns you query a lot. Do not index every column "just in case." I have seen tables with 15 indexes where writes became molasses.
EXPLAIN: ask the database what it is doing
Stick EXPLAIN in front of any query and the database tells you its plan.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;What to look for:
Seq Scanfull table scan. Bad on big tables.Index ScanorIndex Only Scanusing the index. Good.- High
costnumbers relative to row count. Investigate.
Reading EXPLAIN output is a skill that pays for itself within a week of starting your first backend job.
Try it on the right
Three assignments. One creates an index, one breaks one, one reads an EXPLAIN. Next: transactions, where SQL stops losing your money.
Free tools you can use while you learn
Common questions
Q.Should I just index every column?›
Q.Why is my query still slow after adding an index?›
Watching quietly. Tap me if you want a tip.
Try this (0 of 3 done)
- 1
Create an index on the customer_id column of the orders table.
hint
CREATE INDEX idx_name ON table(column);
show answer
CREATE INDEX idx_orders_customer ON orders(customer_id); - 2
Show the query plan for finding all orders from customer 1.
hint
EXPLAIN QUERY PLAN SELECT ... WHERE customer_id = 1
show answer
EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 1; - 3
List all indexes that currently exist on the orders table.
hint
SQLite stores them in sqlite_master.
show answer
SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'orders';