Explain Plan

Explain plan (a.k.a. Execution Plan / EP) is a very complex topic and a book could be easily written.Ā  I will try to explain the basic usage and principals of how does it work and how you can read it. There are two types of optimizers generating either “rule-based” or “cost-based” optimizer. The former one is obsolete and not used as far as I know so I will focus on the cost-based optimizer that generates the plan.

What is the execution plan?

The execution plan represents steps chosen by Oracle in the same order as the query is executed. Each query can have from one to multiple steps. Each step in the execution plan is described in detail (joins, access methods, filtering, sorting, whether partitions are involved, …) plus each step is estimated with a cost that represents how expensive is each step.

What is very important here is, that the execution plan ESTIMATES the cost of the step and does not represent the real cost!! To find out the real value visit my article which will show you how to Check the real cost of a query.

How does the execution plan work?

As I said, explain plan cost represents only an estimate and there are many cases where cost might be higher for an “optimized” query compared to the one for a non-optimized query; and even with the higher cost the query will run faster and consume fewer resources. Why would I use it then, you may ask? Well, that’s simple šŸ™‚ As mentioned above, explain plan will show you how the data are accessed and this will allow you to modify the plan according to your needs. You can identify incorrectly used indexes, cross partition reading, DRY (=Do NOT Repeat Yourself) principle violation, and many more. You can also use hints to bent the plan.

Understanding the execution plan and being able to deduce further actions requires a certain level of experience. I will not delve into every detail here but instead, I will try to give you a good understanding of the basics.

Before we start you ought to know one important fact – execution plan is based on statisticsĀ and if your statistics are obsolete your plan will most probably be invalid.

How to read the execution plan?

The execution plan is read from the rightmost row and from top to bottom. Usually, subqueries and inline views are processed before the main query.

Let’s have this as an example:

I will explain how this plan is being executed, what to look for, and what to be careful about it.

As I mentioned above, the plan is read from the rightmost row and from top to bottom. Thus the order of the execution will be as follows:

  1. Since we have two sibling operations (PARTITION RANGE ITERATOR and PARTITION RANGE AND) we will apply the TOP TO BOTTOM approach and start with the 6th row and perform a FULL TABLE SCAN on the table with orders (this might be a potential performance issue).
  2. The second step will be the sibling on the 8th row and Oracle will perform INDEX SKIP SCAN (for TABLE ACCESS BY LOCAL INDEX ROWID)
  3. These two steps will create the first nested loop (5th row) and since we again apply the TOP TO BOTTOM approach the next step will be INDEX RANGE SCAN (11th row)
  4. Index Range Scan will again create another (second) nested loop (4th row)
  5. The next step will be TABLE ACCESS BY GLOBAL INDEX ROWID (12th row) – which creates the last NESTED LOOP (3rd row)
  6. And since we have this HASH GROUP BY (1st row) it indicates we have some GROUP BY clause (most likely an aggregation) we need to apply the filter there (2nd row)

That’s how Oracle executes this explain plan. The total cost is on row 0.

You can see there are other parameters in the explain plan such as Filter predicates, some costs, etc. Let me share now what are those and how to use them šŸ™‚

  • ID – steps for processing (the order is different)
  • Description – Access method / How a table is accessed
  • Object owner – the owner of the accessed object
  • Object name – the name of the accessed object
  • Filter predicates – what filters are applied during the execution for each step
  • Cardinality – how many records are retrieved by each step (based on statistics!)
  • Cost – the overall cost of the query (per step and total) – it is an ESTIMATE!!! You cannot say the higher the worse (even tho it’s generally true) – there are cases where a higher cost in Explain Plan results in a faster query! Always Check the real cost of a query.
  • CPU Cost – how much of a CPU will be used to get the result (per step and total)Ā  – again it’s an estimate only so you have to check the real cost.
  • Bytes – number of bytes access by the operation (estimate again šŸ™‚ )
  • IO Cost – how many bytes are transferred during IO operations (estimate šŸ™‚ )
  • Partition Start – if partition pruning is activated, you will see where does it start
    • if it starts at the first partition (1) and ends at the last partition (x) it seems like an invalid use of partition pruning (plus you will see in the description PARTITION RANGE ALL)
    • if it has the KEY value – it’s a great indicator of the correct application of partition pruning
  • Partition Stop – if partition pruning is activated, you will see where does it stop
  • Partition ID – id of a partition being used
  • Plan ID – ID of your explain plan

There are other values you can choose to see (select) in your explain plan – feel free to scan the official documentation for more details. These are what I use usually to decide on whether or not the explain plan looks suspicious šŸ™‚

Access methods

Access methods represent how you access data. This plays a critical role in the case of performance tuning; the wrong access method may dramatically change the plan! To choose the correct one, you have to understand how each method operates and when it is better to use or replace it with another.

Full Table Scan

A Full Table Scan (a.k.a. Table access full; FTS) means that all rows in a table will be checked for your condition and those NOT matching will be filtered out. This is very ineffective when you are selecting only a few records from the table. Optimizer usually chooses the FTS when a large amount (40-50% or more) of records will be accessed and there is no index or cannot be used or the cost is simply lower than any other solution. It generates a lot of IO.

Table Access by ROWID

ROWID is a pseudo column that stores a unique identifier for each row in a database to locate the row directly. By using this access method Oracle will locate each selected row directly and return it on a row-by-row access basis. It is very effective and quick.

Index Range Scan

Oracle uses ROWID access for all adjacent index entries to retrieve rows. In practice, it means that several values need to be read based on the condition used. It is either a non-unique index or non-equality condition or range predicate on a unique index. It is generally very fast and effective.

Index Unique Scan

Only one record is returned in this case based on the condition used. This is very efficient and extremely fast.

Index Skip Scan

This is usually used for partially defined predicates (conditions) when using a composite index. That means whenever you use only the second, third, etc ..Ā  part of a composite index while omitting the first part, it will use the index skip scan. Oracle claims it is not as fast as an index range scan, on the other hand, it is supposed to be faster than a full table scan. By selecting the index skip scan as an access method, Oracle generates multiple sub-queries on the background to compose the final result (dataset).

An important part to remember is, that the cardinality of the leading column has a direct impact on the speed of this index. The general rule of thumb is, whenever you have an index skip scan – it is bad and you should avoid it (=change the plan) by modifying the query, adding hints, or adjusting the table structure.

Index Range Scan Descending

It is the same as the Index Range Scan. The only difference is in this case ORDER BY + DESC is used.

Full Index Scan

A Full Index Scan reads only a single block at a time – meaning only one block is needed to be scanned and non-matching data are filtered out. It only uses a single block IO.

Fast Full Index Scan

This access method reads the index as a whole, instead of reading the table itself. It reads the leaves, branches, and the parent node and processes the data in leaves (while ignoring the parent and branches nodes). It uses a multi-block IO.

Index Join

An Index join joins multiple indexes within one table (performing a hash join). The optimizer chooses this method when all columns selected are indexed (by multiple indexes – see the example below).

SELECT name_full, id_department
FROM employee
WHERE id_department IN (1010,2040,8001);

-- Index emp_name   : name_full;
-- Index emp_id_dpt : id_department;

It is usually way slower than a full table scan, however, when the indexed columns are wide (multiple columns, big string data types, ..) it might be faster.

Bitmap Index

A bitmap index is composed of bits (1,0) which represent each record in a table and a mapping function that converts them to a ROWID. Oracle then evaluates (using boolean operations) whether the condition is true or false.

Join methods

Join methods define how two different data sources will be joined together.

Nested Loops

Nested loops are suitable joins for a small subset of data. For each record in the driving table (the first one) Oracle scans all records in the joined table.Ā  This generates lots of IO and therefore buffer gets. If you see this method when joining big tables, force hash joins instead.

Hash Join

Hash joins are designed to join large data sets. The optimizer will use the smaller table to generate hash keys (in memory -> very fast) based on the join conditions. Then, it scans the second table generating hash keys the same way, and compare it with the original hash map. Matched records are returned.

Sort-Merge Join

A Sort Merge Join is used when there is an inequality in the join condition used (<,>,<=,>=). It is a two-step process where the first operation is SORT (both data sources are sorted by the join key) and the second one is MERGE (sorted data sources are merged). This operation might perform faster on larger data sets than Nested Loops.

Cartesian join

This is practically a Cartesian product. It matches each row from the driving table with each row from the joined table, thus at the end, you will get N (number of rows from the first table) x M (number of rows from the second table) result.

Cardinality

The cardinality represents the (estimated) number of rows returned by each operation. This estimation is based on statistics or dynamic sampling. The general formula is:

 Cardinality = Σ total number of rows / Σ distinct values in the column

Partition pruning

The Partition pruning means that you restrict the partitions reading – thus you do not perform a cross partition reading (reading all partitions; a.k.a. full table scan if you will). The more you restrict the partitions, the faster your query will run. For every partition table included in your query, you should activate the partition pruning (limit the partitions) by proper use of the partition key while filtering the data. This is one of the cases where the execution plan might show you a higher estimated cost but the query will run significantly faster and cheaper when it comes to resources.

How to display the explain plan?

Now, when you understand all those code-words that the execution plan will flood you with šŸ™‚ It is time to show you have you can display a plan for a particular query. There are several ways how you can check the plan and I will show you the most common ways -> SQL-like ways. Almost every tool has its own GUI alternative where you can only click on a button and it will generate the explain plan for you.

/****
 * DBMS_XPLAN + DISPLAY
 ****/
EXPLAIN PLAN FOR
SELECT *
FROM employee e
LEFT JOIN department d
ON e.id_department = d.id_department
WHERE 1=1
AND d.name_department IN ('IT','Finance');

SELECT * FROM table(DBMS_XPLAN.DISPLAY);

/****
 * DBMS_XPLAN + CURSOR
 *    will show the last executed query
 ****/
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR);

/****
 * DBMS_XPLAN + CURSOR + specify the query by sql_id
 *   will show the specific plan for given sql_id
 ****/
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(sql_id,0));

--example
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('gwp663cqh5qbf',0));

-- you can be more creative / dynamic
SELECT t.*
FROM v$sql s, table(DBMS_XPLAN.DISPLAY_CURSOR(s.sql_id, s.child_number)) t
WHERE 1=1
AND lower(sql_text) LIKE '%id_department = 10%'
AND lower(parsing_schema_name) LIKE '%john.green%';

You can also format the output of the table by using filters

The general syntax is:

DBMS_XPLAN.DISPLAY(
table_name IN VARCHAR2 DEFAULT 'PLAN_TABLE',
statement_id IN VARCHAR2 DEFAULT NULL, 
format IN VARCHAR2 DEFAULT 'TYPICAL',
filter_preds IN VARCHAR2 DEFAULT NULL
);

Where the format might have the following values:

  • BASIC: minimum information only — the operation ID, the operation name, and its option.
  • TYPICAL: default; displays the most relevant information in the plan (operation id, name, and option, #rows, #bytes, and optimizer cost).
  • SERIAL: Same likeĀ TYPICALĀ but ignores parallel (even though the plan executes in parallel.
  • ALL: Maximum information; includes information displayed with theĀ TYPICALĀ level with additional information (PROJECTION,Ā ALIAS, and information aboutĀ REMOTEĀ SQLĀ if the operation is distributed).

There is an option to adjust the output even more by adding more options to the base four options mentioned above. Each of these options must be separated either by comma or space.

  • ROWSĀ – if relevant, shows the number of rows estimated by the optimizer
  • BYTESĀ – if relevant, shows the number of bytes estimated by the optimizer
  • COSTĀ – if relevant, shows optimizer cost information
  • PARTITIONĀ – if relevant, shows partition pruning information
  • PARALLELĀ – if relevant, shows PX information (distribution method and table queue information)
  • PREDICATEĀ – if relevant, shows the predicate section
  • PROJECTIONĀ -if relevant, shows the projection section
  • ALIASĀ – if relevant, shows the “Query Block Name / Object Alias” section
  • REMOTEĀ – if relevant, shows the information for distributed query (for example, remote from serial distribution and remote SQL)
  • NOTEĀ – if relevant, shows the note section of the explain plan

Usage is as follows:

-- first parameter is the plan_table (default one; can be omitted)
-- second parameter is null - that means last executed query will be used
-- third parameter is a filter specification
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('plan_table',null,'basic +predicate +cost'));

-- or 

-- first parameter is omitted (will be used the default value - see above; -> plan_table)
-- second parameter is a sql_id (will be used for that particular sql_id)
-- thirds is a filter specification

SELECTĀ *Ā FROMĀ TABLE(DBMS_XPLAN.DISPLAY('gwp663cqh5qbf','allĀ -predicateĀ -partition'));

Lastly, let me remind you again:

The execution plan ESTIMATES the cost of the step and does not represent the real cost!!