Get the unique values from multiple CTEs – Impala

Often, some requests for reports seem simple and straightforward. But it is challenging to find a better solution to be applied. Cloudera’s Impala provides several features, and in order to accomplish the tasks in the best possible way, an analyst needs to be aware of them. 

Assume that, after filtering out specific values, there are two separate result sets that need to be combined into a single output with distinct values. How would you do that?

Let’s see with an example.

Below are two sample datasets:

products_sample_datasets

Now the requirement is –

i) Get the products that cost over 500 AED from “DXB_Products” table  
ii) Get the products from SHJ_Products that cost less than 500 AED.
iii) Now combine both the resultsets and filter-out unique products.
 
Common table expression is the best way to get the desired output. Here’s the query –

WITH CTE1 AS (
           SELECT * FROM DXB_Products WHERE Price > 500
),
           CTE2 AS (
SELECT * FROM SHJ_Products WHERE Price > 500
)
SELECT * FROM CTE1 UNION SELECT * FROM CTE2;


union_CTE

 
Hope you find this article useful.
 

One comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s