The UNION operator is one of the most often used SQL operators. This function combines the results of two or more SELECT queries and returns unique values. The essential condition is that the result sets for each SELECT operation in the UNION must have the same number of columns with identical data types. Though this feature is not available in Apache Hive versions before to 1.2.0, there is a workaround.
Most of the beginners do use Cloudera VM either 5.10 or 5.12 for learning purposes in which Hive version is 1.1.0. When tried to use UNION it returns the following error message.
FAILED: ParseException line 1:16 missing EOF at ‘select’ near ‘union’.
Since Apache Hive allows us to use “UNION ALL” we can make a use of it by writing the query in the following format that returns the desired output.
SELECT DISTINCT * FROM (
SELECT 10 UNION ALL
SELECT 20 UNION ALL
SELECT 10) t;
If you look the below screenshot, it returned the distinct values of the combined select statements.
Hope you find this article helpful.