Introduction to PostgreSQL Full Text Search
PostgreSQL full-text search finds records and documents by their content. A document here is the text you want to search, typically one or more text columns of a row together with metadata such as a title or URL. Search speed and search accuracy are the main concerns. The LIKE expression is the basic search tool, but it only matches a literal pattern. Trigrams are another option: they tolerate spelling mistakes and inexact matches by comparing word similarity, but they make multi-word searches difficult. Full-text search avoids these limitations by searching large documents using natural-language rules. In this topic, we are going to learn about PostgreSQL Full Text Search.
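As a minimal sketch of the difference between LIKE and full-text search, the query below runs both against the same sentence. The sample sentence and the column aliases are made up for illustration, and the english configuration is assumed to be available (it ships with a standard PostgreSQL installation).

```sql
-- LIKE looks for the literal substring 'dog jump', which does not occur
-- in 'dogs jumped'. Full-text search compares lexemes instead, so
-- 'dogs' and 'jumped' match the query terms 'dog' and 'jump'.
SELECT 'The white dogs jumped' LIKE '%dog jump%' AS like_match,
       to_tsvector('english', 'The white dogs jumped')
           @@ to_tsquery('english', 'dog & jump') AS fts_match;
-- like_match is false, fts_match is true
```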
Full-Text Search Methodologies
In PostgreSQL, full-text search is built on the tsvector data type, which stores a document as a sorted list of lexemes (normalized words). Two functions do most of the work: to_tsvector and to_tsquery.
1. to_tsvector
to_tsvector converts a document into a tsvector, a list of lexemes with their positions; ts stands for text search. To optimize searching, we can add a tsvector column to a table, store the result there, and index it. This makes searches fast, but remember one thing: the stored tsvector must be kept up to date with the text.
Syntax:
SELECT to_tsvector('configuration', 'document');
In this syntax, to_tsvector is a function that returns a tsvector. The first argument is the text search configuration, usually a language name such as english, and the second is the document text to process. The result is a list of lexemes.
Examples
Select to_tsvector('English', 'This is my old company, and this is company very good' );
In the result, company occurs twice in the text but appears only once as a lexeme (stemmed to compani), with both of its positions, 5 and 9, listed. Stop words such as this, is, and and are dropped.
select to_tsvector('The white dog jumped over the lazy cat ');
The result is a vector in which every token is a lexeme with its position in the document. Stop words such as the article the are removed. Because no configuration is named, the default_text_search_config setting decides which rules apply.
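The one-argument form of to_tsvector depends on a server setting, so results can differ between installations. A small sketch of how to check that setting and how to name the configuration explicitly, assuming the english configuration is the one intended:

```sql
-- Show which configuration the one-argument to_tsvector will use.
SHOW default_text_search_config;

-- Naming the configuration makes the result independent of that setting.
SELECT to_tsvector('english', 'The white dog jumped over the lazy cat');
-- 'cat':8 'dog':3 'jump':4 'lazi':7 'white':2
```

Positions count every word in the original text, including the stop words that were dropped, which is why cat is at position 8.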
2. to_tsquery
to_tsquery converts a search term into a tsquery, the data type that represents a search condition. The @@ operator then checks whether a tsvector produced by to_tsvector matches that tsquery.
Syntax:
SELECT to_tsvector('document') @@ to_tsquery('word');
In this syntax, document is the text to search and word is the term to look for. Multiple words passed to to_tsquery must be joined by an operator. The expression returns true or false.
Examples
select to_tsvector('The white dog jumped over the lazy cat ') @@ to_tsquery('cat');
The result is true because the word cat is present in the document.
In the same example, we now search for cats:
select to_tsvector(' The white dog jumped over the lazy cat ') @@ to_tsquery('cats');
The result is also true: cats is the plural of cat, and both are reduced to the same lexeme.
Now we write another query, for cated:
select to_tsvector(' The white dog jumped over the lazy cat ') @@ to_tsquery('cated');
The result is false because cated is not reduced to the lexeme cat, so it is treated as a different word.
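To see why these three searches behave differently, it can help to look at the tsquery values themselves. The sketch below assumes the english configuration; the column aliases are made up for illustration. It also shows plainto_tsquery, a related built-in function that accepts plain text without operators.

```sql
-- Inspect the lexeme each search term is reduced to.
-- q_cat and q_cats both show 'cat'; q_cated shows a different lexeme.
SELECT to_tsquery('english', 'cat')   AS q_cat,
       to_tsquery('english', 'cats')  AS q_cats,
       to_tsquery('english', 'cated') AS q_cated;

-- to_tsquery needs an operator between words; plainto_tsquery takes
-- free text and joins the remaining words with & (AND).
SELECT plainto_tsquery('english', 'the lazy cats') AS q_plain;
-- 'lazi' & 'cat'
```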
- Operators and Uses
tsquery provides several operators that make searches on a document more flexible and save the user time and effort. PostgreSQL provides the following operators:
- AND Operator (&)
This operator matches only when both words are present in the document.
Example
Select to_tsvector(' The white dog jumped over the lazy cat ') @@ to_tsquery('cat & dog');
The result is true because both cat and dog are present in the document.
- OR Operator (|)
This operator matches when at least one of the words is present in the document.
Example
SELECT to_tsvector('The white dog jumped over the lazy cat') @@ to_tsquery('cat |monkey');
The result is true because cat is present, even though monkey is not.
- NEGATION Operator (!)
This operator matches when the given word is absent from the document.
Example
SELECT to_tsvector('The white dog jumped over the lazy cat') @@ to_tsquery('!monkey');
The result is true because monkey does not appear in the document.
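The three operators can also be combined in one query, with parentheses for grouping. The sketch below assumes the english configuration and made-up column aliases. It additionally uses the <-> (followed by) operator, which is not covered above and is available from PostgreSQL 9.6 onward, to require that two words sit next to each other.

```sql
SELECT
    -- (cat OR monkey) AND NOT bird: cat is present and bird is absent.
    to_tsvector('english', 'The white dog jumped over the lazy cat')
        @@ to_tsquery('english', '(cat | monkey) & !bird') AS combined,
    -- lazy immediately followed by cat (positions 7 and 8).
    to_tsvector('english', 'The white dog jumped over the lazy cat')
        @@ to_tsquery('english', 'lazy <-> cat') AS adjacent,
    -- dog and cat are both present but not next to each other.
    to_tsvector('english', 'The white dog jumped over the lazy cat')
        @@ to_tsquery('english', 'dog <-> cat') AS not_adjacent;
-- combined is true, adjacent is true, not_adjacent is false
```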
3. Stop Word
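Stop words are very common words, such as is, and, and are, that a language configuration leaves out of the tsvector because they are of little use in a search. If we need to keep them, we can use the simple configuration, which does not remove stop words.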
Example
SELECT to_tsvector('pg_catalog.simple','Sky is blue and roses are red');
With the simple configuration, every word is kept and lowercased, including is, and, and are, and no stemming is applied (roses stays roses).
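Running the same sentence through two configurations side by side makes the effect of stop words visible. This is a sketch assuming the built-in english and simple configurations; the column aliases are made up for illustration.

```sql
-- english_cfg drops stop words (is, and, are) and stems the rest;
-- simple_cfg keeps every word, only lowercasing it.
SELECT to_tsvector('english', 'Sky is blue and roses are red') AS english_cfg,
       to_tsvector('simple',  'Sky is blue and roses are red') AS simple_cfg;
```

Note that the word positions are the same in both columns; the english result simply has gaps where the stop words were.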
4. Normalization
Search dictionaries handle the complexity of natural language. The same word can appear in different forms, such as singular and plural or different verb tenses, so normalization reduces these forms to a single lexeme. A search for one form then matches all of them.
Example
SELECT to_tsvector('pg_catalog.english', 'Jon is very brilliant student from his class, he got first class from last semester');
In the result, stop words are removed and the remaining words are reduced to lexemes; class, which occurs twice in the text, is stored once with two positions.
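Normalization can also be tested one word at a time with the built-in ts_lexize function, which asks a single dictionary what it would do with a token. The sketch below assumes the english_stem dictionary that ships with PostgreSQL; the column aliases are made up for illustration.

```sql
-- ts_lexize returns the lexemes a dictionary produces for one token.
-- An empty array means the dictionary treats the token as a stop word.
SELECT ts_lexize('english_stem', 'playing') AS playing,    -- {play}
       ts_lexize('english_stem', 'friends') AS friends,    -- {friend}
       ts_lexize('english_stem', 'the')     AS stop_word;  -- {}
```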
5. Create Document/ Record
Here we create a simple table named record by using the CREATE TABLE statement.
Example
CREATE TABLE Record ( record_id SERIAL, record_text TEXT, record_tokens TSVECTOR, CONSTRAINT record_pkey PRIMARY KEY (record_id) );
The statement creates the table with a record_tokens column of type tsvector to hold the lexemes.
- Then insert some rows into the table.
INSERT INTO record (record_text) VALUES
('Ram is playing cricket with his friends.'),
('I want to go abroad for master studies'),
('PostgreSQL is popular technology.'),
('Full text search gives fast result');
Select * from Record;
The rows are stored, but the record_tokens column is still empty (NULL) because nothing has filled it yet.
- Now update each row with the tsvector of its text
UPDATE record SET record_tokens = to_tsvector(record_text);
Select * from record;
Each row now has its lexemes stored in the record_tokens column.
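A manual UPDATE has to be rerun whenever record_text changes. One common way to avoid that is sketched below, assuming PostgreSQL 12 or later. The table record_auto and the index record_auto_tokens_idx are hypothetical names chosen for this example so that it does not interfere with the record table above.

```sql
-- Hypothetical variant of the record table: record_tokens is a generated
-- column, so PostgreSQL recomputes it on every INSERT or UPDATE.
CREATE TABLE record_auto (
    record_id     SERIAL PRIMARY KEY,
    record_text   TEXT,
    record_tokens TSVECTOR
        GENERATED ALWAYS AS (to_tsvector('english', coalesce(record_text, ''))) STORED
);

-- A GIN index lets @@ searches avoid scanning every row.
CREATE INDEX record_auto_tokens_idx ON record_auto USING GIN (record_tokens);

INSERT INTO record_auto (record_text) VALUES
    ('Ram is playing cricket with his friends.'),
    ('PostgreSQL is popular technology.');

-- No UPDATE step is needed before searching.
SELECT record_id, record_text
FROM record_auto
WHERE record_tokens @@ to_tsquery('english', 'play & friend');
```

The configuration is named explicitly in the generated column because the expression must not depend on a session setting such as default_text_search_config.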
- Now search the record table for rows that contain both words
SELECT record_id, record_text FROM record WHERE record_tokens @@ to_tsquery('play & friend');
Assuming the default configuration is english, the query returns the first row, Ram is playing cricket with his friends, because playing and friends are reduced to the lexemes play and friend.
Conclusion
We hope this article has helped you understand what full-text search in PostgreSQL is and how it is used. We covered the two main functions, to_tsvector and to_tsquery, with examples, and saw how to use the different operators in a tsquery. Through normalization, full-text search treats different forms of a word as one lexeme and stores a repeated word only once. With a stored and indexed tsvector, it is a fast and capable search method in PostgreSQL.