Regular expression functions identify precise patterns of characters in the given string and returns the matching text item from the string/data.
Syntax: REGEXP_EXTRACT(STRING subject, STRING pattern, INT index)
Let’s see practically how it works.
Below is the sample records from the dataset:
ReleaseYear MovieTitle
1969 Downhill Racer
1970 M*A*S*H
1970 The Party at Kitty and Stud’s
1970 Lovers and Other Strangers
1970 The Sidelong Glances of a Pigeon Kicker
1970 Hercules in New York
1971 Bananas
1971 Klute
1972 What’s Up, Doc?
1973 No Place to Hide
If you look at the data, there are two columns however it is difficult to convert these into structured format due to the delimitation. It can only be done using regular expressions. Let’s convert this data into columns using REGEXP_EXTRACT function.
–Creating a table to hold the entire data in a single column.
–Let’s create dummy table to hold the entire data.
CREATE TABLE movies(data STRING);
— Loading the data
LOAD DATA LOCAL INPATH ‘Desktop/movies.csv’ INTO TABLE movies;
–Converting the data into meaningful columns.
SELECT REGEXP_EXTRACT(data, ‘^\\d+’, 0),
REGEXP_EXTRACT(data, ‘\\s+(\\S.*)’, 1)
FROM movies;
Let me explain the meaning of regular expression provided for regexp_extract function:
^\\d – matches the digit at the start of the string (^). + symbol represents one or more number of characters. Thus, it captures the first string from the beginning that has any number of digits.
\\s – Helps in skipping the space character. + symbol helps in skipping more than one space character.
\\S – Captures any non-whitespace char. And the dot and asterisk (.*) matches any zero or more chars other than line break chars. Index argument is 1 so that REGEXP_EXTRACT returns the captured group group without the initial whitespace.
With this function, let’s see what else we can do and what else we can add.
SORTING:
SELECT REGEXP_EXTRACT(data,’\\s+(\\S.*)’,1)x FROM movies ORDER BY x LIMIT 20;
SEARCHING:
Find out the names of the employees whose last-name starts with ‘S’
Data:
Will Smith
Black Smith
Scott James
James Miller
Allen Jackson
Some Name
SELECT REGEXP_EXTRACT(name,'(.*) (S.*)’,0) from someNames;
Result:
Will Smith
Black Smith
Find out the names whose first-name or last-name starts with ‘S’
SELECT REGEXP_EXTRACT(name,'(.*)(S.*)’,0) from someNames;
Result:
Some Name
Will Smith
Black Smith
Scott James
Hope you find this article. Please do follow for more interesting updates.
I’m truly enjoying the design and layout of your
site. It’s a very easy on the eyes which makes it much more pleasant for me to come here and
visit more often. Did you hire out a developer to create your theme?
Excellent work!
LikeLike
I have to thank you for the efforts you have
put in writing this website. I really hope to view the same high-grade content by you in the future as well.
In truth, your creative writing abilities has motivated
me to get my own, personal site now 😉
LikeLike
Thank you fοr sharing your info. I truly apprеciate
your efforts and I will be waiting for your further post thank you
once again.
LikeLike
Its like you read my mind! You seem to know a lot about this, like you wrote
the book in it or something. I think that you can do with a few pics to drive the
message home a bit, but other than that, this is
fantastic blog. An excellent read. I will certainly be back.
Feel free to visit my website; clasificados.ligueselsalvador.com
LikeLike
Excellent post. I was checking continuously this blog and I’m impressed!
Very useful info specially the last part 🙂 I care for
such info much. I was looking for this particular information for
a long time. Thank you and good luck.
my blog post: Natural Burn Keto Review
LikeLike
Howdy this is kinda of off topic but I was wanting to know if blogs use WYSIWYG editors or if you have to
manually code with HTML. I’m starting a blog
soon but have no coding skills so I wanted to get guidance from someone with
experience. Any help would be enormously appreciated!
LikeLike
Your style is really unique in comparison to other people I’ve
read stuff from. Thank you for posting when you have the opportunity, Guess I will just bookmark
this web site.
Here is my web site Chandra
LikeLike
I don’t even know how I ended up here, but I assumed this publish was great.
I don’t know who you are but certainly you’re going to
a famous blogger should you are not already. Cheers!
Here is my blog – Extreme Muscle XXL Review
LikeLike
Everything is very open with a very clear clarification of the challenges.
It was truly informative. Your website is very useful.
Many thanks for sharing!
LikeLike
I’m really impressed with your writing talents as well as with the format on your blog.
Is that this a paid subject matter or did you customize it your self?
Anyway keep up the nice high quality writing, it’s rare to peer a great weblog like this
one these days..
LikeLike