Anytime you need to aggregate some data you have to define two things:
- what are the dimensions
- what are the facts
First of all, what is a dimension and what is a fact? Let me explain that
Dimension table
Dimension tables group dimensions facts together based on their nature. They contain basic business entities (such as customers, products, salesrooms, employees, etc.). They are mostly represented by a string value.
Fact table
Fact tables are represented mostly by numbers and are related to a particular business process. Fact tables contain some measurement that might be easily manipulated (sum, count, …). To give you a few examples, fact tables might store information about calls, sales, payments, purchases, etc.
Now, when you know what is the difference between the dimension and fact, let me show you how would a general query look like.
SELECT dimension_1 , dimension_2 , dimension_3 , count(fact_1) , sum(fact_2) FROM sample_table GROUP BY dimension_1 , dimension_2 , dimension_3
As you can see in the example, we only mostly aggregate facts. We can aggregate also dimensions in order to get the number of customers we would need to count them
-- to get unique number of customers who purchased a TV SELECT count(distinct customers) FROM purchase WHERE 1=1 AND id_product = 10 -- TV ;
You can execute these single row functions against a single report or grouped data (multiple dimensions). As long as you do not have any dimension involved, there is no need to use GROUP BY (as you can see in the example above).
However, as long as you select a dimension (non-aggregated field) you have to list it down to the GROUP BY section.
Unlike in ORDER BY, you can use neither aliases or number references to a column position in GROUP BY. On the other side, you do not have to put your GROUP BY columns in a SELECT area necessarily (which does not work the other way around – if you have a certain column in your selection (SELECT part) and it is not aggregated, it must be listed in GROUP BY as well).
-- this query will get the count of employees in each department SELECT count(id_employee) as emp_nbr FROM employee GROUP BY id_department emb_nbr ----------- 10 13 5 8 -- this is the exact query except you can see the departments SELECT id_department , count(id_employee) as emp_nbr FROM employee GROUP BY id_department id_department | emb_nbr ----------------------- 1001 | 10 1010 | 13 1055 | 5 1068 | 8
To be able to filter aggregated results, Oracle prepared a feature that goes hand in hand with GROUP BY. It is called HAVING and you can easily filter aggregated records with this directive. Find more details about it in the article SELECT statement (decomposed).
GROUP BY clause does not have to be used for aggregates only but it is also capable of giving you the unique records from your selection.