Introduction to MySQL REGEXP
This article provides an outline of MySQL REGEXP. A regular expression is used in SELECT queries to search for patterns, generally in string data, in the database. REGEXP can be thought of as a search tool. It is similar to the LIKE operator with its ‘%’ wildcard, which also does pattern matching, but it is far more expressive. A REGEXP pattern can combine many special characters (metacharacters), so regular expressions can be considered a small language of their own.
The syntax of the REGEXP operator is as follows:
expression REGEXP pattern
Here, ‘expression’ stands for the column name or expression whose value is to be examined. The term ‘pattern’ stands for the regular expression to be searched for in that value.
In a real query, REGEXP is used as follows:
SELECT * from table_name where expression REGEXP pattern;
The pattern is written in single quotes.
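As a quick check of the syntax, the operator can be tried on string literals without any table. The sketch below assumes a reasonably recent MySQL: RLIKE and NOT REGEXP are long-standing forms, while REGEXP_LIKE() is only available from MySQL 8.0 onwards. The literals are illustrative and are not taken from the sample data used later.

```sql
-- Each expression returns 1 for a match and 0 for no match.
SELECT 'MySQL' REGEXP 'SQL';        -- operator form
SELECT 'MySQL' RLIKE 'SQL';         -- RLIKE is a synonym for REGEXP
SELECT 'MySQL' NOT REGEXP 'SQL';    -- negated form
SELECT REGEXP_LIKE('MySQL', 'SQL'); -- function form (assumes MySQL 8.0+)
```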
How REGEXP works in MySQL?
Consider the table emp as below:
select * from emp;
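To follow along, a table along these lines could be created first. This is only a hypothetical reconstruction: the column names E_Name and Location and most of the values appear in the queries and explanations of this article, but the E_ID column, the names ‘Ben’ and ‘Emily’, the location ‘San Francisco’ and the pairing of names with locations are assumptions chosen to be consistent with the results described.

```sql
-- Hypothetical sample data; not the article's original table.
CREATE TABLE emp (
    E_ID     INT PRIMARY KEY,       -- assumed key column
    E_Name   VARCHAR(50) NOT NULL,
    Location VARCHAR(50) NOT NULL
);

INSERT INTO emp (E_ID, E_Name, Location) VALUES
    (1, 'Alan',  'New York'),
    (2, 'Ben',   'California'),     -- 'Ben' is an assumed name
    (3, 'Carl',  'New Jersey'),
    (4, 'Dave',  'Alabama'),
    (5, 'Emily', 'San Francisco');  -- assumed name and location
```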
Here we need to get the details of employees who have the character ‘a’ in their names.
select * from emp where E_Name REGEXP 'a';
To analyse the query: it selects all details from the table ‘emp’ for rows that have the character ‘a’ in the E_Name field. Looking at the output, out of 5 employees, 3 have the character ‘a’ in their names: Alan, Carl and Dave.
So this query selects rows whose value contains a given character. The specified character can be anywhere in the word.
Next we see how to select the details of employees whose names start with a specific character.
select * from emp where E_Name REGEXP '^a';
Here, we used the operator ‘^’, which anchors the pattern to the start of the value, and thus got the details of employees whose names start with the character ‘a’.
So far, the patterns were matched case-insensitively, which is how REGEXP behaves on nonbinary strings with a case-insensitive collation.
Here we make it more specific by making the match case-sensitive.
select * from emp where E_Name REGEXP BINARY '^a';
Since we specified the lower-case character ‘a’ in the query, no rows were retrieved.
Now let’s use an upper-case character and retrieve output.
select * from emp where E_Name REGEXP BINARY '^E';
Here we got the details of employees whose names have the upper-case character ‘E’ at the beginning.
We can search for part of a word as well.
select * from emp where Location REGEXP BINARY '^New';
The query selects the entries for which the Location field starts with the word ‘New’. From the sample table, we get two rows satisfying the condition: New York and New Jersey. Also, since the BINARY operator is used in the query, the match is case-sensitive.
Next we see how to search for an ending character, word or part of a word.
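Case-sensitive matching can also be requested without BINARY. The sketch below assumes MySQL 8.0 or later, where REGEXP_LIKE() accepts an optional match_type argument: ‘c’ asks for case-sensitive matching and ‘i’ for case-insensitive matching. This form may be worth trying if a server complains about the BINARY variant, since some 8.0 releases are reported to reject a comparison that mixes a binary pattern with a nonbinary column.

```sql
-- Assumes MySQL 8.0+; 'c' = case-sensitive, 'i' = case-insensitive.
SELECT * FROM emp WHERE REGEXP_LIKE(E_Name, '^E', 'c');
SELECT * FROM emp WHERE REGEXP_LIKE(Location, '^New', 'c');
SELECT * FROM emp WHERE REGEXP_LIKE(Location, '^new', 'i');
```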
select * from emp where Location REGEXP 'a$';
The query selects the entries from the employee table for which the Location field ends with the character ‘a’, since ‘$’ anchors the pattern to the end of the value. The output picks two entries from the table whose location has ‘a’ as its final character: California and Alabama. So far we have discussed searching for a single character, a word or a portion of a word.
Next we see how to search for a set of characters and a range of characters.
For example, let’s assume we need to identify the rows from the employee table with any of the characters ‘a’, ‘b’ and ‘c’ in the E_Name column. We can express this requirement in two ways: first by listing all three characters as a set, and second by giving the range of characters from ‘a’ to ‘c’.
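The same anchor works for a whole word, or part of a word, at the end of the value. As a small sketch using location names mentioned in this article:

```sql
-- Locations ending with the word 'York'
select * from emp where Location REGEXP 'York$';
-- Locations ending with the fragment 'sey' (as in 'New Jersey')
select * from emp where Location REGEXP 'sey$';
```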
select * from emp where E_Name REGEXP '[abc]';
select * from emp where E_Name REGEXP '[a-c]';
Both queries give the same output. The first picks the rows whose E_Name contains any of the characters listed within the ‘[..]’. We have listed the characters ‘a’, ‘b’ and ‘c’ within the square brackets, so four rows are selected for display. Note that the characters are listed without commas inside the square brackets; a comma there would itself be treated as a character to match.
The second picks the rows whose E_Name contains a character in the range given within the square brackets. So the employee names which include any character from ‘a’ to ‘c’ are picked and displayed.
Now let’s write a query with the alternation (‘OR’) operator, which is ‘|’ in a regular expression.
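Character sets combine with the anchors introduced earlier, and a ‘^’ placed directly after the opening bracket negates the set. The queries below are an illustrative sketch against the same emp table, not queries from the original walkthrough:

```sql
-- Names whose first character is in the range a to c
select * from emp where E_Name REGEXP '^[a-c]';
-- Names whose first character is NOT in the range a to c
select * from emp where E_Name REGEXP '^[^a-c]';
```

Inside the brackets ‘^’ means "none of these characters", whereas outside the brackets it still means "start of the value".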
select * from emp where Location REGEXP 'new|san';
The query searches for locations containing either ‘new’ or ‘san’. We get three rows, each containing one of the alternatives given within the quotes.
Another metacharacter often used with REGEXP is ‘.’, which matches any single character and so lets us constrain the number of characters in a value. Suppose we have to search for employee names with exactly four characters; then we need to specify four instances of ‘.’ between the start and end anchors, which are ‘^’ and ‘$’ respectively.
Here is the query to identify names with four characters.
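Alternation can be grouped with parentheses so that an anchor applies to every alternative. A minimal sketch, again against the emp table:

```sql
-- Locations that START with 'new' or 'san'
select * from emp where Location REGEXP '^(new|san)';
-- Locations that END with 'york' or 'jersey'
select * from emp where Location REGEXP '(york|jersey)$';
```

Without the parentheses, a pattern such as '^new|san' would anchor only the first alternative, matching values that start with ‘new’ or contain ‘san’ anywhere.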
select * from emp where E_Name REGEXP '^....$';
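Repeating ‘.’ becomes hard to read for longer lengths. As a sketch of an alternative, the interval quantifier ‘{n}’ repeats the preceding element n times, and ‘{m,n}’ accepts between m and n repetitions:

```sql
-- Equivalent to '^....$': names with exactly four characters
select * from emp where E_Name REGEXP '^.{4}$';
-- Names with three to five characters
select * from emp where E_Name REGEXP '^.{3,5}$';
```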
Conclusion – MySQL REGEXP
In this article, we have seen the REGEXP operator, which is used to search for characters or patterns in a table. We are now familiar with several metacharacters that are generally used in REGEXP patterns (‘^’, ‘$’, ‘[..]’, ‘|’ and ‘.’) and with the syntax for each of them. Most common regular-expression metacharacters can be used in REGEXP patterns in MySQL.