Left Arrow

Notes

Performance

3 rectangles stacked on top of 1 another behind a tachometer.

SQL - Performance

Tended

Status: decay

The different stages of a xenomoprh life cycle

Query Life Cycle

A flowchart

Query

The user enters a query and passes it to the parser.

Parse

The parser checks the query for:

  • syntax errors (e.g. a comma out of place)
  • semantic errors (e.g. a table doesn’t exist)

If the checks fail, the error is returned. If the checks pass, the query is turned into an algebraic expression and passed on.

Optimize

The engine:

  1. 1. Performs simple optimizations (e.g. simplifying 5*10 into 50).
  2. 2. Considers different ‘query plans’, estimating CPU usage and duration for each plan based on number of rows in the query’s specified table(s).
  3. 3. Picks an optimal plan and passes it on.

Execute

The engine turns the plan into operations for the database.

Result

The result is returned to the user.

Blueprint

Query Plan

When you start dealing with bigger data sets, you'll start to care more about how quickly your queries execute. SQL is a declarative language - each query declares what we want the SQL engine to do, not how. The how, known as the query plan, is 1 thing that affects the efficiency of a query. Imagine we had the following data:

/schema.sql

result

CREATE TABLE IF NOT EXISTS fleet (
id serial PRIMARY KEY,
name VARCHAR (50) NOT NULL,
class VARCHAR (50) NOT NULL,
length INTEGER NOT NULL
);
INSERT INTO fleet(name, class, length)
VALUES ('a-wing', 'fighter', 7),
('cruiser', 'corvette', 127),
('x-wing', 'fighter', 13),
('y-wing', 'bomber', 23);
SELECT *
FROM fleet;

We want to find all the starships who class = 'fighter'. SQL could find the result using 2 different plans; Full table scan or Create an index.

Full Table Scan

SQL will look at every single row in the fleet table and return the rows where class = "fighter". This would be fast if the table only had 4 rows. If the table has 1 million rows however, then it would require looking at 1 million rows, which could be slow.

/schema.sql

result

CREATE TABLE IF NOT EXISTS fleet (
id serial PRIMARY KEY,
name VARCHAR (50) NOT NULL,
class VARCHAR (50) NOT NULL,
length INTEGER NOT NULL
);
INSERT INTO fleet(name, class, length)
VALUES ('a-wing', 'fighter', 7),
('cruiser', 'corvette', 127),
('x-wing', 'fighter', 13),
('y-wing', 'bomber', 23);
SELECT *
FROM fleet
WHERE class = 'fighter';

Create an Index

Creating an index of a table involves picking a column(s) that you want to sort your table by. SQL will then generate a new sorted table, made up of the column(s) you selected and the id column (to link each row back to the original table).

/schema.sql

result

CREATE TABLE IF NOT EXISTS fleet (
id serial PRIMARY KEY,
name VARCHAR (50) NOT NULL,
class VARCHAR (50) NOT NULL,
length INTEGER NOT NULL
);
INSERT INTO fleet(name, class, length)
VALUES ('a-wing', 'fighter', 7),
('cruiser', 'corvette', 127),
('x-wing', 'fighter', 13),
('y-wing', 'bomber', 23);
CREATE INDEX fleet_class_index
ON fleet(class);

Now when you do the query:

result

SELECT *
FROM fleet
WHERE class = "fighter";

SQL will:

  1. 1. do a binary search on the index table to find the rows with class = "fighter",
  2. 2. get the IDs of these rows &
  3. 3. for each ID, do another binary search on the original table, returning the row that matches.

If the table contains 1 millions rows, creating the index would require ~60 million operations (depending on the engine) which could table some time. Each proceeding binary search on the original table however, would only require log(1 000 000) = 6 lookups to find a value.

Science tubes with alien specimens

Query Tuning

The process of optimizing queries.

  1. 1. Identify what query(s) you want to tune by looking at which of your database calls are taking the longest / using the most resources using tools like SQL Profiler.
  2. 2. Understand how the SQL engine is executing the query. All SQL systems come with a way to ask the engine (e.g. In SQLite, put EXPLAIN QUERY PLAN in front of a query to see the query plan).
  3. 3. Change your query or data structures based on the SQL engine you're using and your data.

Example

Imagine our fleet table had a significant number of rows, users were regularly querying that table to find all rows with a specific class and it was taking a significant amount of time. We could create an index based on the class column in our schema. This would take some time to be created but now all the queries would be faster.

References

Where to Next?

Arrow pointing downYOU ARE HERE
Performance
A sci-fi robot taxi driver with no lower body