Subqueries come in handy when it comes to a complex logic. There are many cases where it is easier to tackle the problem using subqueries.
Let me start with an example right away.
/* Task: Display all products with price lower than average price of all products */ SELECT id , product_name FROM products WHERE product_price < (SELECT avg(product_price) FROM products)
You can use a subquery in SELECT, FROM, WHERE clause and even as a JOINed result set. See some examples below
-- subquery in SELECT SELECT product_name , (SELECT count(*) FROM product) as total_products FROM product -- subquery in FROM SELECT product_name , product_price FROM (SELECT * FROM product WHERE product_type = 'TV') -- subquery as a JOIN result set SELECT p.product_name , p.product_price , o.orders# FROM product p JOIN (SELECT id_product , count(*) as orders# FROM order WHERE status = 'finsihed' GROUP BY id_product ) o ON p.id_product = o.id_product -- subquery in WHERE SELECT product_name , product_price FROM product WHERE product_name IN (SELECT product_name FROM promo_products) -- you can even use multiple-level subqueries if needed SELECT product_name , (SELECT count(*) FROM product WHERE product_price < (SELECT avg(product_price) FROM product) ) as under_avg_prod_cnt FROM product WHERE 1=1 AND id_product IN (SELECT id_product FROM ( SELECT id_product , count(*) as cnt FROM order WHERE date_order >= trunc(sysdate, 'MM') ) )
Subqueries are executed first before their parent queries. It is very important to realize which type of subquery we use:
- Single-row – will return only one record and thus we can easily use any operator we want
- Multi-row – will return two or more records and therefore we have to use set operators (IN, ALL, ANY, NOT IN)
- Correlated – these are very inefficient and should be avoided at all costs because they will be executed for every single row and therefore will generate a high cost
Not only they are handy in terms of shorter code but also with regard to the performance. With subqueries, you can restrict selected data and extract only those you need instead of performing Full Table Scan (reading all records in a table).
SELECT e.employee_name , e.employee_number , a.date_attendace , a.clock_in , a.clock_out FROM employee e JOIN (SELECT id_employee , date_attedance , clock_in , clock_out FROM attendance WHERE date_attendance >= trunc(sysdate)-7 ) a ON a.id_employee = e.id_employee
Another great feature for subqueries is called WITH clause. This is especially useful when you have multiple long subqueries and the code is becoming very messy.
--- instead of this SELECT * FROM table_1 t1 LEFT JOIN (SELECT FROM table_Y /* multiple lines here */ WHERE 1=1 AND ... ) sub_A ON sub_A.id = t1.id LEFT JOIN (SELECT FROM table_X /* multiple lines here */ WHERE 1=1 ) sub_B ON sub_B.id = t1.id WHERE 1=1 AND t1.col_name IN (SELECT col_other_name FROM table_2 WHERE parameter = 'XNA') -- with the WITH clause it's way more readable WITH w_sub_A AS (SELECT FROM table_Y /* multiple lines here */ WHERE 1=1 AND ... ) , w_sub_B AS (SELECT FROM table_X /* multiple lines here */ WHERE 1=1 ) , w_t2 (SELECT t2.col_div / t3.col_data as col_other_name FROM table_2 t2 LEFT JOIN table_3 t3 ON t2.id = t3.id AND t2.date_time = t3.date_time WHERE t2.parameter = 'XNA') --- instead of this SELECT * FROM table_1 t1 LEFT JOIN w_sub_A ON sub_A.id = t1.id LEFT JOIN w_sub_B ON sub_B.id = t1.id WHERE 1=1 AND t1.col_name IN (SELECT col_other_name FROM w_t2)
The final query looks here very friendly and even the rest is very well organized. We can work with that as with separate data subsets. Another advantage of using WITH clauses is that you can easily reuse the same data set as many times as you want and in case of any change in your code, you will need to change it only once at one place and thus you will not most likely make a mistake.
WITH w_reuse AS ( SELECT * FROM multi_src ) SELECT * FROM w_reuse a LEFT JOIN w_reuse b ON a.id = b.id LEFT JOIN w_reuse c ON c.id2 = b.id2 WHERE 1=1
There is one thing you have to be cautious about and it is when using big tables (big result sets) in a WITH clause and you are going to use it multiple times, you have to use MATERIALIZE hint. You can find more in my article about hints.
Subqueries are very powerful and may help you with your tasks but this door swings both ways. It means it could be a killer for your performance when incorrectly written.