Performance tuning is one of the most complicated and complex areas of Oracle and only true experts can say they understand how to fix any kind of issue. It takes years of practice to become such an expert; and as it is with everything in life – one step at a time (you have to climb a ladder from the ground). In other words, start with the beginnings and master those skills. Once you are confident, add something more you are interested in and keep progressing this way until you wake up one day being an Oracle expert. It is a lot of effort, nerves, failures, time. and also money but it is worth it; trust me 🙂
There are two levels of tuning; server-side and front-end side (which means on the side of a user). I will dedicate this article to the latter one because this page is focused on developers, not DBAs (even though there is some DBA stuff here). I don’t consider myself a DBA expert a thus I won’t mess with DBA stuff here 🙂 Maybe in the future, I will add more articles here related to DBA topics, once I get super confident in these shoes.
I will give you a manual in the form of steps to take to improve the performance. The order of the steps is the same how you should proceed:
Check the structures/objects
The core of database performance is proper design. First of all, look at all tables and objects in your database and try to create indexes, partitions, subpartitions, enforce data integrity, or remove invalid indexes. Look for duplicates as well. Duplicates might worsen your performance a lot. Another problem might be an inappropriate database design where you have one table in 2nd NF instead of more tables in 3rd NF (more about NF [Normal Forms] in the article Database design: ERD and NFs. Once you are sure that there is nothing else you can do about your design, proceed to the next step.
Check the query whether
The second step for you in order to improve performance is to check your query. Have a closer look on all filters, whether you correctly use indexes or partitions. Is there partition pruning activated? Always minimize the returned result set as soon as possible.
-- WRONG; it reads the whole table first and then filters data out at the second step SELECT * FROM ( SELECT id , date FROM my_table ) WHERE date >= sysdate-7 ; -- CORRECT; it reads the table with filtered data during the first step SELECT * FROM ( SELECT id , date FROM my_table WHERE date >= sysdate-7 ) WHERE 1=1 ;
I have seen it many times that users use initial load instead of incremental. I know it might be comfortable but it is not manageable forever. I know it might be some extra work and effort on your side, but trust me, it’s worth it. I wrote an article on this topic called Initial load VS Incremental load.
Another thing you should look for is the application of DRY (stands for Do NOT Repeat Yourself) principle. That means you should use (=read) one table only once in your query and not multiple times, especially when speaking about big tables. See the example below:
-- WRONG; the table will be read three times SELECT c.id_contract , i1.amt_total as i1_amt_total , i2.amt_total as i2_amt_total , i3.amt_total as i3_amt_total FROM contract c -- LEFT JOIN installment i1 ON i1.id_contract = c.id_contract AND i1.id_num_inst = 1 -- LEFT JOIN installment i2 ON i2.id_contract = c.id_contract AND i2.id_num_inst = 2 -- LEFT JOIN installment i3 ON i3.id_contract = c.id_contract AND i3.id_num_inst = 3 -- WHERE 1=1 -- CORRECT; one reading only using WITH clause (you can use GTT as well); this time the table will be read once only WITH w_inst_3 AS ( SELECT --+ materialize id_contract , id_num_inst , amt_inst FROM installment WHERE id_num_inst IN (1,2,3) ) SELECT c.id_contract , i1.amt_total as i1_amt_total , i2.amt_total as i2_amt_total , i3.amt_total as i3_amt_total FROM contract c -- LEFT JOIN w_inst_3 i1 ON i1.id_contract = c.id_contract AND i1.id_num_inst = 1 -- LEFT JOIN w_inst_3 i2 ON i2.id_contract = c.id_contract AND i2.id_num_inst = 2 -- LEFT JOIN w_inst_3 i3 ON i3.id_contract = c.id_contract AND i3.id_num_inst = 3 -- WHERE 1=1 -- the better performance might be achieved with GTT and especially if the original table is not partitioned by id_num_ist column and you will partition the GTT (considering the equal distribution of rows) + of course id_contract should be indexed CREATE GLOBAL TEMPORARY TABLE gtt_inst_3 PARTITION BY LIST(id_num_inst) ( PARTITION p_1 VALUES (1), PARTITION p_2 VALUES (2), PARTITION p_3 VALUES (3), PARTITION p_null VALUES (NULL), PARTITION p_dflt VALUES (default) ) AS SELECT SELECT id_contract , id_num_inst , amt_inst FROM installment WHERE id_num_inst IN (1,2,3) ; SELECT c.id_contract , i1.amt_total as i1_amt_total , i2.amt_total as i2_amt_total , i3.amt_total as i3_amt_total FROM contract c -- LEFT JOIN gtt_inst_3 i1 ON i1.id_contract = c.id_contract AND i1.id_num_inst = 1 -- activation of partition pruning -- LEFT JOIN gtt_inst_3 i2 ON i2.id_contract = c.id_contract AND i2.id_num_inst = 2 -- activation of partition pruning -- LEFT JOIN gtt_inst_3 i3 ON i3.id_contract = c.id_contract AND i3.id_num_inst = 3 -- activation of partition pruning -- WHERE 1=1
Do not try to get the result with one big query
Another best practice you should learn is to try not to achieve your goal by writing an extremely long query with multiple subqueries, unions, etc… This is an overkill for the database server and as a consequence, you will have to wait x-times longer for the result. Splitting a query into smaller chunks will multiply the efficiency and help not only you but also other users (by not wasting shared resources) and the server itself. I saw it many times that one poor query and even poorer design killed the server 🙁 It is very sad but (sh)it happens.
--WRONG; too complicated query SELECT * FROM (SELECT DISTINCT id_contract FROM contract_action) ca LEFT JOIN ( SELECT c.id_contract , c.contract_type , i.amt_total FROM contract c LEFT JOIN installment i ON i.id_contract = c.id_contract ) ci ON ca.id_contract = ci.id_contract WHERE 1=1 -- CORRECT; queries are separated into GTTs and they are used INSERT INTO gtt_cnt_action SELECT distinct id_contract FROM contract_action ; INSERT INTO gtt_cnt_inst SELECT c.id_contract , c.contract_type , i.amt_total FROM contract c LEFT JOIN installment i ON i.id_contract = c.id_contract ; -- final query SELECT * FROM gtt_cnt_action ca LEFT JOIN gtt_cnt_inst ci ON ca.id_contract = ci.id_contract WHERE 1=1
Remember, this is just an example and of course, the query might be tuned much more – it all depends on the amount of data there, design, etc … Not only the second query will be faster and resource-friendly it has better manageability and readability 🙂
In case of UNION ALL to split it into multiple inserts instead of one big query. In case of UNION, you should use MERGE INTO to prevent duplicates.
-- WRONG; too slow and too much of a memory waste. INSERT INTO final_table SELECT ... some query A ... UNION ALL SELECT ... some query B ... UNION ALL SELECT ... some query C ... ; -- CORRECT; optimized INSERT INTO final_table SELECT ... some query A ... ; INSERT INTO final_table SELECT ... some query B ... ; INSERT INTO final_table SELECT ... some query C ... ; -- note that can still apply splitting it into GTTs or using WITH clauses inside those queries ... the more "abstract" you make, the better the performance will be.
In case of UNION (which removes the duplicates), it is a bit more complicated but uses MERGE INTO instead of the INSERT to avoid duplicates. You just have to define the key to how the data should be paired.
Check the explain plan
Once you exhaust all your possibilities, it is time to check the explain plan. It might discover and alert you on the incorrect use of indexes, cross partition reading or simply tells you that the plan chose was not correct. You can modify it with hints to achieve the best performance.
Test the real cost of your query
In the end, you might end up with multiple options on how to retrieve your data and achieve your desired/required result set. But which one is the best? How can you determine that this query is better than that one? Well, here is the toughest part. During the steps before you can get some palpable results but here, is it only about waiting how long does the query take? And pick the fastest one? Yes and no. Yes, you have to run each version of your query. No, the fastest one must not necessarily be the best one (even though it usually is 🙂 ).
Run each query and check its real cost to decide which one is the best. Check the article called Check the real cost of a query to find out how the cost is measured and what matters.