
Query Life Cycle
Query
The user enters a query and passes it to the parser.
Parse
The parser checks the query for:
- ▪ syntax errors (e.g. a comma out of place)
- ▪ semantic errors (e.g. a table doesn’t exist)
If the checks fail, the error is returned. If the checks pass, the query is turned into an algebraic expression and passed on.
Optimize
The engine:
- 1. Performs simple optimizations (e.g. simplifying 5*10 into 50).
- 2. Considers different ‘query plans’, estimating CPU usage and duration for each plan based on number of rows in the query’s specified table(s).
- 3. Picks an optimal plan and passes it on.
Execute
The engine turns the plan into operations for the database.
Result
The result is returned to the user.

Query Plan
When you start dealing with bigger data sets, you'll start to care more about how quickly your queries execute. SQL is a declarative language - each query declares what we want the SQL engine to do, not how. The how, known as the query plan, is 1 thing that affects the efficiency of a query. Imagine we had the following data:
/schema.sql
result
CREATE TABLE IF NOT EXISTS fleet (
id serial PRIMARY KEY,
name VARCHAR (50) NOT NULL,
class VARCHAR (50) NOT NULL,
length INTEGER NOT NULL
);
INSERT INTO fleet(name, class, length)
VALUES ('a-wing', 'fighter', 7),
('cruiser', 'corvette', 127),
('x-wing', 'fighter', 13),
('y-wing', 'bomber', 23);
SELECT *
FROM fleet;
We want to find all the starships who class = 'fighter'
. SQL could find the result using 2 different plans; Full table scan or Create an index.
Full Table Scan
SQL will look at every single row in the fleet table and return the rows where class = "fighter". This would be fast if the table only had 4 rows. If the table has 1 million rows however, then it would require looking at 1 million rows, which could be slow.
/schema.sql
result
CREATE TABLE IF NOT EXISTS fleet (
id serial PRIMARY KEY,
name VARCHAR (50) NOT NULL,
class VARCHAR (50) NOT NULL,
length INTEGER NOT NULL
);
INSERT INTO fleet(name, class, length)
VALUES ('a-wing', 'fighter', 7),
('cruiser', 'corvette', 127),
('x-wing', 'fighter', 13),
('y-wing', 'bomber', 23);
SELECT *
FROM fleet
WHERE class = 'fighter';
Create an Index
Creating an index of a table involves picking a column(s) that you want to sort your table by. SQL will then generate a new sorted table, made up of the column(s) you selected and the id column (to link each row back to the original table).
/schema.sql
result
CREATE TABLE IF NOT EXISTS fleet (
id serial PRIMARY KEY,
name VARCHAR (50) NOT NULL,
class VARCHAR (50) NOT NULL,
length INTEGER NOT NULL
);
INSERT INTO fleet(name, class, length)
VALUES ('a-wing', 'fighter', 7),
('cruiser', 'corvette', 127),
('x-wing', 'fighter', 13),
('y-wing', 'bomber', 23);
CREATE INDEX fleet_class_index
ON fleet(class);
Now when you do the query:
result
SELECT *
FROM fleet
WHERE class = "fighter";
SQL will:
- 1. do a binary search on the index table to find the rows with class = "fighter",
- 2. get the IDs of these rows &
- 3. for each ID, do another binary search on the original table, returning the row that matches.
If the table contains 1 millions rows, creating the index would require ~60 million operations (depending on the engine) which could table some time.
Each proceeding binary search on the original table however, would only require log(1 000 000) = 6
lookups to find a value.

Query Tuning
The process of optimizing queries.
- 1. Identify what query(s) you want to tune by looking at which of your database calls are taking the longest / using the most resources using tools like SQL Profiler.
- 2. Understand how the SQL engine is executing the query. All SQL systems come with a way to ask the engine (e.g. In SQLite, put
EXPLAIN QUERY PLAN
in front of a query to see the query plan). - 3. Change your query or data structures based on the SQL engine you're using and your data.
Example
Imagine our fleet
table had a significant number of rows, users were regularly querying that table to find all rows with a specific class and it was taking a significant amount of time.
We could create an index based on the class
column in our schema.
This would take some time to be created but now all the queries would be faster.