SQL Tip – Working With the UNION Clause

I have been working with the SQL language for very many years. I take for granted that most developers are not very experienced in using commands outside of simple select, insert, update and delete statements. Certainly, this can be a disadvantage when working. I would like to explain a few tricks when using the UNION clause.

Union basically joins the result of two separate results sets with the condition that the result sets have the same number of columns and the same or very similar data type. Naturally, the columns need to be in the same order. This seems very intuitive if you can imagine that you have two tables that are similar, and you want to produce a result set that has records from both. But you can also use records from the same table or two more complex queries to produce the result you need. Let me walk you through the details and some examples.

First the details…

There is UNION and UNION ALL

UNION joins the results of two queries UNION ALL also joins the result of two queries. UNION does not include duplicates; however, UNION ALL does include duplicates. Let us walk through the logic quickly. In the example below imagine A,B,C… as records.

Select 1 returns A,B,C

and

Select 2 returns C, D, E

UNION of the two -> A,B,C,D,E

UNION ALL of the two -> A,B,C,C,D,E

Higher cost of UNION vs. UNION ALL

Since the UNION clause does not include duplicates the database engine needs to sort the two result sets and eliminate the duplicates. If you are selecting records from two different tables and you know it is not logically possible to have a duplicate, you should use the UNION ALL. For example, you are querying a SALES_2020 table and SALES_2021 table and one of the columns is the date of the sale. Logically there should not be any duplicate records between the 2020 and 2021 tables. UNION also has the undesirable surprise that a GROUP BY is applied without you knowing. UNION will eliminate any duplicate not only between Select 1 and Select 2 but also UNION will eliminate any duplicates within Select 1 or Select 2.

Now some examples…

We can see in table SalesQ1 that John appears three times. Twice John purchase $100 on the same day. A normal select of the of three columns SalesDate, Customer and SalesAmount would return 5 records.

SalesQ1
SalesQ2

It would be intuitive to simply union the same query on the second table and have a result that combines the two sets of data together. Surprise the result will not be what you want.

SELECT SalesDate, Customer, SalesAmount
  FROM SalesQ1
UNION
SELECT SalesDate, Customer, SalesAmount
  FROM SalesQ2;
Result of UNION

John does not appear 3 times anymore in Q1’s dataset. The two records that had the same SalesDate, Customer and SalesAmount were grouped into one. Like I explained to make the query do less you need to use the UNION ALL.

SELECT SalesDate, Customer, SalesAmount
  FROM SalesQ1
UNION
 ALL
SELECT SalesDate, Customer, SalesAmount
  FROM SalesQ2;
Results of UNION ALL

To illustrate how wrong things could look sometimes we can do it again without the date.

SELECT Customer, SalesAmount
  FROM SalesQ1
UNION
SELECT Customer, SalesAmount
  FROM SalesQ2;
UNION Only 4 rows???

We see that the UNION clause essentially grouped the records from the first table but also the records that have the same values in either table are grouped together. If we use the UNION ALL the result looks like all the records from both tables.

SELECT Customer, SalesAmount
  FROM SalesQ1
UNION
 ALL
SELECT Customer, SalesAmount
  FROM SalesQ2;
Big difference ehh?

If the selection in each table was unique, the UNION and UNION ALL clauses would return the same results. Often things look that way in the beginning but if you are the person working with reports and ETL you often have to work with set of data that you are not an expert with. I hope this example helps with your understanding.

Have a nice day.

Published by Marek

Just a Database Administrator from Montreal.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

<span>%d</span> bloggers like this: