Optimizing SQL queries using UNION

0 / 349
SQL query optimization
Overview   While working on a data management solution you might have encountered a situation where you had to put multiple OR conditions in the where clause. Sometimes end user may ask for a wide kind of report having multiple OR conditions on multiple columns of a database table. These kinds of reports are prone to show bad query performance. Most of the performance bugs are reported for these kinds of reports. Database table Indexing also does not help in this case.  
In this article, we will see how to improve the performance of such queries, with the help of an example. We will optimize SQL queries using UNION operator instead of OR   Let’s say we have a table to store customer information. That table has five columns viz ( id, Name, Country, Age, Total_Spending ).   Following is the SQL to create customer info table.  
CREATE TABLE CustomerInfo( cust_id NUMBER NOT NULL PRIMARY KEY, 
cust_name CHAR(50), country CHAR(4), age NUMBER(3), total_spending NUMBER );
  Now, load some data into it. Following is the SQL for the same. I have loaded near about hundred records in CustometInfo Table.  
INSERT INTO CustomerInfo VALUES ( 1, 'James Smith', 'USA', 23, 5800);
  Next step is to create indices for all the columns of CustomerInfo table.  
CREATE INDEX cust_name_i1 ON CustomerInfo (cust_name);

CREATE INDEX cust_country_i2 ON CustomerInfo (country);

CREATE INDEX cust_age_i3 ON CustomerInfo (age);

CREATE INDEX cust_spending_i4 ON CustomerInfo (total_spending);
  Also read: https://www.golibrary.co/optimizing-hierarchical-sql-queries/  
Now, you have been asked to find out the potential high spending customers. The management has given you following criteria to identify high spending customers.  
  1. If the customer is young i.e below age 40.
  2. If the customer has total spending more than 8000 dollars.
  3. If the customer is from USA.
  If either of the above-mentioned conditions is satisfied, then that means the customer is a potential high spending customer.   Following is the conventional way of solving this problem. Here, we are using OR condition in the where clause to satisfy the search criteria.  
SELECT * FROM CustomerInfo WHERE country = 'USA' OR age < 40 OR 
total_spending > 8000;
  What is wrong with the above query??  On most of the databases the Oracle does not use any index to execute this type of query. We can verify this from the explain plan.   Following is the SQL to fetch the explain plan.  
EXPLAIN PLAN FOR  (SELECT * FROM CustomerInfo WHERE country = 'USA' 
OR age < 40 OR total_spending > 8000);
SELECT * FROM TABLE( dbms_xplan.display);
  On my machine I got following explain plan. On line number 7 of the explain plan, you can see that the customerInfo table was accessed using a full table scan. Full table scan was preferred in spite of having indices for all the fields mentioned in the predicate i.e. ( age, country, and total_spending).  
Explain plan for OR
The reason is that using OR in a query will often cause the Query Optimizer to abandon use of index seeks and revert to scans. Oracle gets confused when it sees multiple OR clauses. It could not decide which index to use to fetch the records. Should Age index be used here? Or should country or total_spending index be used? There is no straight answer to this question. Hence Oracle does full table scan and that slows down the execution of the query.   There is a solution to this problem. Do not use OR, use UNION instead. Following is the same query written using UNION clause.  
SELECT * FROM CustomerInfo
WHERE country = 'USA'
UNION
SELECT * FROM CustomerInfo
WHERE age < 40
UNION 
SELECT * FROM CustomerInfo
WHERE total_spending > 8000
  Now, let’s see the explain plan for this query. Is Oracle using any index to fetch the data now? Following is the SQL to fetch the explain plan.  
EXPLAIN PLAN FOR (
SELECT * FROM CustomerInfo
WHERE country = 'USA'
UNION
SELECT * FROM CustomerInfo
WHERE age < 40
UNION 
SELECT * FROM CustomerInfo
WHERE total_spending > 8000
);
SELECT * FROM TABLE( dbms_xplan.display);
Explain plan for Union
  In the above explain plan you can see now the custometInfo table is getting accessed using the created indices. There is no full table scan now. On small table like the one we are experimenting with, the benefits may now be visible. However, for a real time database where the tables have millions of records this solution shows significant benefits in query execution time.  

Comments

comments


14+ years of professional experience in developing products like Siemens Teamcenter, Siemens Product Master Manager, Intuit Quicken and Kosmix using Python, C++, JAVA, Oracle SQL. Pythonist | Python Aficionado | Machine Learning Enthusiast | Freelancer

Related Posts