Indexes

Indexes are one of the most powerful (but at the same time the most overlooked/misunderstood) aspects of a database. Tables might grow tremendously big and having them properly optimized is crucial. I saw many people using them incorrectly and therefore worsening the overall database performance.

What are those indexes and how do they work?

Indexes organize all records in a table and make it faster for Oracle to retrieve a row (or a set of rows). However, there is one condition you have to consider before you create an index. It is called “low selectivity“. What is it? Low selectivity means that you select only a small number of records out of the whole table. There is no definition for the “small number” expression but it varies from approximately 10-20% of a total number of records, depending on a table size (huge tables = 20%; smaller tables = 10%). It works best if you select only a few records (in case of returning one record – consider using the UNIQUE constraint).

What makes it faster then no-index scan?

When you select only a few records from a table with no index, Oracle has to perform FULL TABLE SCAN (FTS); meaning Oracle has to read every single record and validate it to your condition whether it is true or not. In case of tables with billions of records, it is very inefficient when you are after 10 rows only (you have to read all 10bn to extract only those ten). Nonetheless, if you create an index on the filtered column it will scan much fewer records. Let me demonstrate, how such index works (I will be talking about B-tree index; B stands for Balanced tree).

Let’s picture a table with IDs (say 100 for the sake of simplicity) and I need to select only ID equal to 14, 20 and 22. The ID column is indexed (non-unique) and Oracle will read only those “blocks” of data marked red on the picture below to get the desired data set.

B-tree Index structure
B-tree index structure

For each block of data Oracle will evaluate whether the wanted result is in its range or not and move to its child node. This process is repeated until the final leaf is reached. As you can see we read only 4 out of 15 blocks. Image the same setup for a much bigger table; works flawlessly 🙂

There are two main types of indexes – bitmap and b-tree. By default, all indexes are created as balanced (b-tree) indexes. These are the most commonly used.

Bitmap, on the other hand, transform values to the set of zeros and ones and this value is used as a key for searching. The major difference between bitmap and b-tree index is that NULL values are not included in b-tree indexes but are in a bitmap. Thus, you cannot use a B-tree index for IS NULL / IS NOT NULL conditions as opposed to a bitmap that might be configured this way. But there is also a drawback for bitmap indexes and it is called “writing concurrency locks”. Whenever you insert a value to a bitmap index it has been recalculated and that means all other inserts with the same value have to wait until the first one is committed. This creates huge locks and waiting times compared to b-tree where the value is just inserted in it is placed in a tree. You can check how the b-tree works here at the University of San Francisco’s website.

Now, since you understand how indexes work, let me talk about “types” of indexes and their characteristics.

Regular index

The regular index is just an index with no special functionality. It organizes records in a table and allows users/application to look through faster. NULL values are not considered here (in case of b-tree) and are not organized at all. If you use any kind of function or a different (even though similar column name) the index won’t be triggered!

General syntax:

CREATE INDEX schema_name.my_index_name ON schema_name.table_name(column_name);

Examples:

-- there is a table of contracts with millions of records and index is on DATE_SIGNED
-- CREATE INDEX idx_date_signed ON contract(date_signed);

-- index is triggered
SELECT *
FROM contract
WHERE 1=1
AND date_signed = DATE '2018-12-12';

-- index is triggered
SELECT *
FROM contract
WHERE 1=1
AND date_signed BETWEEN trunc(sysdate) - 3 AND sysdate;

-- index is NOT triggered
-- the result set will be the same as above but Full Table Scan will be applied to retrieve the data set ... function is used
-- to have an index triggered in this case, it would have to be function based index
SELECT *
FROM contract
WHERE 1=1
AND trunc(date_signed) >= trunc(sysdate) - 3;

Unique index

A unique index is a regular index with a unique constraint. That means you cannot insert two same values into the indexed column. NULL values are again ignored and thus if you have invalid data you might experience duplicates in a table (NULL values) – even though your column is tagged as a unique index.

General syntax:

CREATE UNIQUE INDEX schema_name.my_index_name ON schema_name.table_name(column_name);

Composite index

A composite index is (as its name suggests) composed of at least two columns (up to 32 columns). It is very handy when you are usually using a combination of more columns in your query for filtering out the result set.

General syntax:

CREATE INDEX schema_name.my_index_name ON schema_name.table_name(column_name, another_column);

Function-based index

Function-based indexes are used when you have to use a function on a column for filtering (i.e. UPPER, LOWER, TRUNC, ROUND, …). Though it is not recommended to use function-based indexes (except a few specific cases) because they are usually set for one specific scenario.

CREATE INDEX schema_name.my_index_name ON schema_name.table_name(trunc(my_date_column));

-- or

CREATE INDEX schema_name.my_index_name ON schema_name.table_name(upper(my_string_column));

Then, the condition in your query must be the same as how the index was defined!!

-- Correct usage 
SELECT *
FROM my_table
WHERE trunc(my_date_column) >= trunc(sysdate) - 3

-- Incorrect usage 
SELECT *
FROM my_table
WHERE trunc(my_date_column, 'DD') >= trunc(sysdate) - 3

Invisible index

Invisible indexes are indexes that are disabled and therefore not used in an explain plan (unless OPTIMIZER_USE_INVISIBLE_INDEXES parameter is set to TRUE). However, they are still maintained !! It is useful to make it invisible in case of testing whether the index is needed or not.

-- make the index invisible
ALTER INDEX schema_name.index_name INVISIBLE;

-- make it visible again
ALTER INDEX schema_name.index_name VISIBLE;

Local / Global index

Local indexes are used for partitioned tables and only for non-unique indexes. Unique indexes (even if used on partitioned tables must be global). Local partitioned indexes allow you to take individual partitions of a table and indexes offline for maintenance (or reorganization) without affecting the other partitions and indexes in the table unlike with global indexes where if you update the index you have to reorganize it every time because it gets invalidated. By default, all indexes are created as global. If you want to create a local index you have to specify it by using a keyword LOCAL.

CREATE INDEX schema_name.my_index_name ON schema_name.table_name(column_name) LOCAL;

As you can see there are many ways how to organize a table and it is not restricted to only those listed above – you can also combine them (i.e. you can create a “unique function-based composite local index”). It is only up to you to properly create an index to make your queries run faster.