Introduction to SQL SELECT DISTINCT
DISTINCT keyword in SQL is used to fetch only unique records from a database table. It is usually used in conjunction with the SELECT statement. For uninitiated, the SELECT statement is used to fetch desired records from the datatable. Distinct keyword removes all duplicate records and fetches only unique ones. It can further be used with aggregate functions like AVG, MAX, MIN, SUM, COUNT, etc.
SELECT DISTINCT keyword is very helpful when a table contains many duplicate values and we have to fetch unique records only, for example in business scenarios where we want to know the distinct locations from where orders are being placed, how many distinct sets of customers do we have, etc.
Syntax and Parameters
The basic syntax for writing a SELECT DISTINCT statement in SQL is as follows:
SELECT DISTINCT column_name1, column_name2,...
FROM table_name;
The parameters used in the above-mentioned syntax are as follows:
- column_name1, column_name2,..: These are the columns or fields which you want to fetch in your final result set.
- table_name: This is the name of the table from which said columns or fields will be fetched from.
How Does SELECT Statements with DISTINCT Keyword Work in SQL?
DISTINCT keyword in SQL filters out duplicate values from our returned results. For more simplification of its working, we may look at it in the following way:
- A SELECT DISTINCT statement first builds our overall result set with all records, i.e including duplicate values based on FROM, JOIN, WHERE, HAVING, etc statements.
- Next, it sorts the result set based on the column_name or field with which DISTINCT has been used.
- Then it performs de-duplication (i.e. removes any duplicate values) on the overall result set which was prepared in the first step.
The SELECT DISTINCT statement can be considered semantically equivalent to a GROUP BY with all returned fields mentioned in the GROUP BY clause.
Having discussed the syntax and working of SELECT DISTINCT statements, let us go ahead and try some examples to develop a great understanding of this concept.
Examples of SQL SELECT DISTINCT
In order to illustrate the working of SQL DISTINCT keyword, what could be better than creating a dummy database table. Let us create a table called “customers”.
We can use the code snippet given below for performing this task.
4.5 (5,258 ratings)
View Course
Query:
CREATE TABLE public.customers
(
customer_id integer NOT NULL,
sale_date date NOT NULL,
sale_amount numeric NOT NULL,
salesperson character varying(255),
store_state character varying(255),
order_id character varying(255)
);
We have successfully created the table. Now let us insert some records in it to work with.
Query:
INSERT INTO public.customers(
customer_id, sale_date, sale_amount, salesperson, store_state, order_id)
VALUES (1001,'2020-05-23',1200,'Raj K','KA','1001'),
(1001,'2020-05-22',1200,'M K','NULL','1002'),
(1002,'2020-05-23',1200,'Malika Rakesh','MH','1003'),
(1003,'2020-05-22',1500,'Malika Rakesh','MH','1004'),
(1001,'2020-05-23',1320,'Dave Peter','MH','1005'),
(1002,'2020-05-21',1200,'Molly Samberg','NY','1001'),
(1004,'2020-05-22',1210,'M K','NULL','1003');
Output:
We have successfully inserted 7 records in the table. The customers table after performing the above insertions looks something as follows:
Query:
SELECT * FROM customers;
Output:
Now let us try some examples using the DISTINCT keyword in conjunction with the SELECT statement.
Example #1
Find the customer ids of all the unique customers who have bought or ordered something from the departmental store.
Query:
SELECT DISTINCT customer_id
FROM customers;
Output:
In this example, we can see that the distinct keyword has fetched only the unique customer ids.
Example #2
Find all the unique dates on which sales were made at the departmental store.
Query:
SELECT DISTINCT sale_date
FROM customers;
Output:
Example #3
Find all the distinct store locations where the departmental store is located.
Query:
SELECT DISTINCT store_state
FROM customers
ORDER BY store_state ASC;
Output:
Explanation: The thing with NULL values and the DISTINCT keyword is that DISTINCT lets the first NULL in the final result set and removes all other subsequent NULL values. This can be observed in this example also. We had two NULL values in the store_state column, but DISTINCT has made room for only one NULL in the final result set.
Example #4
Find all the unique orders that were made on a particular date in the departmental store.
Query:
SELECT DISTINCT sale_date,
salesperson,
order_id,
store_state
FROM customers
ORDER BY sale_date;
Output:
Example #5
Find the sum of revenue collected for all the unique orders that were made on a particular date at a particular store of the departmental store.
Query:
SELECT DISTINCT sale_date, store_state,
sum(sale_amount) as total_sales
FROM customers
GROUP BY store_state, sale_date
ORDER BY sale_date ASC;
Output:
Example #6
Find all the unique customers and the sum of total money spent by them at the departmental store.
Query:
SELECT DISTINCT customer_id,
sum(sale_amount) as total_sales
FROM customers
GROUP BY customer_id
ORDER BY customer_id ASC;
Output:
Example #7
Find all the unique salespeople working in the departmental store.
Query:
SELECT DISTINCT salesperson
FROM customers;
Output:
Conclusion
SQL DISTINCT keyword is used mostly in the conjugation of SELECT statements to fetch only unique records from the specified table by removing duplicate values from the final result set.
Recommended Articles
This is a guide to SQL SELECT DISTINCT. Here we discuss the Introduction of SELECT DISTINCT in SQL and the practical examples and different subquery expressions. You can also go through our suggested articles to learn more –