Regular Expressions – RLIKE in Hive

Either SQL Developers or Data Analysts often use arithmetic operators in their queries such as =, >, <, !=

Apart from this, LIKE, EXISTS, IN, NOT IN, NOT EXISTS, etc. will also be used very frequently. 

These are all will help in fetching only the required data.

There is a relational operator (RLIKE) which might not be used the way the above referred to used. But it is helpful in searching the string with similar text the way LIKE operator does in SQL. Both MySQL and Hive provides RLIKE operator that can be used for searching Advanced Regular Expressions.

REGEXP_LIKE in Oracle is equivalent to RLIKE.

Let us see what RLIKE does in Hive.

The below is my “test” table data.

Id

somecol

101

123xyz

102

1234

103

3456

104

abcde

105

Vxyz

106

Bbc


select * from
test where somecol RLIKE ‘[0-9]+’;

This will return all the numbers from the column. If the column value has “123xyz” then it will also be returned. In case if we want to eliminate text contained values and return only numbers then use the following statement.

select * from test where somecol RLIKE ‘[0-9]+$’;

Or you can use-

select * from test where somecol RLIKE ‘^[0-9]+$’;

If you want to retrieve the rows that has no numbers:

SELECT * FROM test WHERE somecol RLIKE ‘([a-z]|[A-Z])+’ 

If you want to retrieve the rows that has numbers with decimal points:

SELECT * FROM test WHERE somecol RLIKE ‘[0-9]+[.][0-9]+’

Please leave some comments if you need more clarification. 



2 comments

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s