How to use full-text search in MySQL
Full-text search is a database technology that retrieves records even if they do not exactly match our search criteria. This enables rich, natural language searches that feel more familiar.
Using full text search to search for "database engine" will return results containing the phrase "database", "engine", or "database engine". This resolves possible ambiguity in the search criteria, so a line like "I have a database and an engine" will still appear.
Simple searches can be done in MySQL using LIKE
the operator. This is inefficient and has limited functionality. MySQL often needs to perform a full table scan to identify records that match our query.
Full-text queries use specially created indexes to improve performance. This also enables MySQL to keep track of words in a data set, facilitating natural language searches.
Setting up full text search
We can use full-text search only on columns that have a full-text index. This is set using the -s clause CREATE TABLE
in the -destination statement
when seeding the database schema . We can also use it with -destination to add a search index to an existing column.FULLTEXT
ALTER TABLE
CREATE TABLE articles(content TEXT, FULLTEXT (content));
ALTER TABLE articles ADD FULLTEXT (content);
With the index in place, we can start querying the database.
Use full text search
A full text search starts with WHERE
a SELECT clause. We use SELECT MATCH AGAINST
instead of SELECT LIKE
. We need to indicate the index columns to match, and the query to search for.
SELECT * FROM articles WHERE MATCH (content) AGAINST ('database engine' IN NATURAL LANGUAGE MODE);
This will perform a full text search of articles using the query database engine. By specifying IN NATURAL LANGUAGE MODE
, MySQL is instructed to interpret the query literally and not process it in any way. If not specified, this is the default search mode.
Sort by results relevance
One of the advantages of full-text search is that it allows us to sort the returned records by relevance. This LIKE
is not possible with a regular SELECT query. We can use MATCH ... AGAINST
the SELECT clause as SQL SELECT
part of SELECT . The virtual column returned will contain a relevance score from 0 to 1, indicating how well the record matches the search query.
SELECT content, MATCH (content) AGAINST ('database engine') AS relevance FROM articles ORDER BY relevance DESC
This query will return the most relevant results first. This helps our application meet user expectations of how a natural language search system should perform.
MySQL calculates the search relevance score by evaluating several different factors. These include the number of records that match the query, and the number of times the query occurs in each record. Results that have multiple exact matches to the query will be ranked higher than results that contain only a partial query.
SELECT
When using in a statement
, MATCH ... AGAINST
there is no need WHERE
to repeat it in the clause. We can manually filter the results to include only records with non-zero relevance scores.
SELECT content, MATCH (content) AGAINST ('database engine') AS relevance FROM articles WHERE relevance > 0 ORDER BY relevance DESC
Query Expansion Mode
Natural language is not the only supported full-text search mode. Query expansion mode is an alternative that helps expand the scope of search results. It automatically weights search queries based on the most relevant terms.
An expanded search first finds records that contain matches to the query. These records are then examined to identify the most relevant terms. The database then runs another search, this time based on the relevant terms instead of the original query. This typically results in more records being returned while maintaining an acceptable level of relevance.
Here's how to enable query expansion mode:
SELECT * FROM articles WHERE MATCH (content) AGAINST ('database engine' WITH QUERY EXPANSION)
Boolean Mode
The last full-text search mode is the Boolean mode. This allows us to include Boolean modifiers in our queries. This mode can be used when we need advanced control over the matching logic.
We can +
require a word to appear in every result by prefixing it with . Use -
to exclude results containing the word. Other operators can be used to match parts of words, create sub-expressions and bring words down in relevance scores. The latter is useful when shielding "noise" terms. If we do not specify an operator, it is implied that the word will be quoted in the query using OR
.
SELECT * FROM articles WHERE MATCH (content) AGAINST ('+data* engine -sqlite' IN BOOLEAN MODE)
This query will show you sqlite
articles that have at least one word that starts with a word that is not mentioned in your data. Each result may or may not contain the word engine.
Boolean mode allows us to build powerful searches using our own logic. One caveat is that it does not support search relevance scoring. This is the tradeoff for enabling users to do Boolean searches. The ranking of results may not match human expectations.
Configuring full-text search
MySQL comes with several full-text configuration options that allow us to fine-tune how searches are performed. Here are some of the most important ones.
-
innodb_ft_min_token_size – Sets the minimum word length for indexed terms. Words with fewer characters than this value will not be added to the index, so we will not be able to search for them. The default value is 3, which excludes extremely common words such as a, an, and I. This setting applies to InnoDB tables; use for MyISAM
ft_min_word_len
. -
innodb_ft_max_token_size - Similar to
innodb_ft_min_token_size
, this sets the maximum length of an indexed word. Longer words will not be searchable. Use with MyISAM tablesft_max_word_len
. - innodb_ft_enable_stopword – By default, this setting allows us to control whether MySQL should filter out “stop words”. Stop words are very common words that can overly affect search results. The default stop word list contains 36 common phrases.
-
innodb_ft_user_stopword_table – We can set this value to the name of the database table from which MySQL should get the stopword list. The table must have a
VARCHAR
column called value. Add our stopwords to exclude as records in this table.
These settings are usually set in our MySQL server configuration file. The location varies by platform; it can usually be found in /etc/mysql/my.cnf . You will need to restart the MySQL service after changing the settings.
Once the server is back up, rebuild the full-text indexes for the table. We must do this so that our data is re-indexed using the current configuration. Otherwise, the previously indexed data will continue to be used.
To reindex an InnoDB table, run OPTIMIZE TABLE my_table
. For a MyISAM table, use REPAIR TABLE my_table QUICK
. The search index will then be rebuilt so that our configuration changes take effect.
Summarize
We enable MySQL full-text search by adding a full-text index to a searchable field. We then use MATCH ... AGAINST
and one of three search modes to get results. Natural language queries return a search relevance score that we can use to rank the results.
Full-text search provides LIKE
more powerful search behavior than the SELECT statement. It is also more performant, especially on large data sets, because all the text is pre-indexed.
For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.
Related Articles
Two ways to install mysql-5.5.47 on Linux system and manage mysql
Publish Date:2025/04/26 Views:140 Category:MySQL
-
We know that there are generally two ways to install software on the Linux system. One is to use rpm or yum to install, which is convenient and fast; the other is to use the compiled source package. Although this method is more troublesome
Mysql master-slave replication simple configuration
Publish Date:2025/04/26 Views:120 Category:MySQL
-
I'm learning about MySQL master-slave replication recently and want to apply it to a project at work. Since I'm new to it, I don't understand it very well, so I can only share how to make a simple configuration. At the beginning, we configu
Implementation details of MySQL master-slave replication (I) Exploration of the m
Publish Date:2025/04/26 Views:56 Category:MySQL
-
This article mainly discusses the implementation mechanism of master-slave replication, which may not be directly helpful for our actual application, but understanding its principle can achieve twice the result with half the effort for futu
Implementation details of MySQL master-slave replication (II) Exploration from th
Publish Date:2025/04/26 Views:74 Category:MySQL
-
Previously we explored the master server in master-slave replication. Now let's look at how the slave server works in the entire system. In the article Master Server Exploration, we mentioned that three threads are needed in a master-slave
Mysql master-slave replication - what to do if the slave server stops
Publish Date:2025/04/26 Views:97 Category:MySQL
-
You may find this topic a little ridiculous. What can we do if the server stops? Of course, we should restart the service. Restarting the service is no problem. The problem is that a lot of data has been written to the master database durin
MySQL stored procedure details
Publish Date:2025/04/26 Views:163 Category:MySQL
-
A stored procedure can be thought of as encapsulating a SQL statement that we need to process specially into a function. When needed, we only need to call this function to achieve the desired operation. This process can be called a stored p
How many of these MySQL statement tags have you used?
Publish Date:2025/04/26 Views:122 Category:MySQL
-
In the article "A Peek into MySQL Stored Procedure Details" , we briefly introduced the use of stored procedures. The syntax for creating stored procedures includes BEGIN...END. In addition to BEGIN...END, the following statement tags can b
MySQL CURRENT_TIMESTAMP() Function Detailed Introduction (with Examples)
Publish Date:2025/04/26 Views:132 Category:MySQL
-
This article explains how to use MySQL functions with examples CURRENT_TIMESTAMP() . By using it, we can convert or display the current date and time. The output format is "YYYY-MM-DD HH:MM:SS" or "YYYYMMDDHHMMSS", depending on the context
Back up the MySQL database to a file
Publish Date:2025/04/26 Views:166 Category:MySQL
-
Backing up your database is a very important system administration task that should usually cron be run from a job at scheduled intervals. We will use mysqldump the dump utility that comes with mysql to dump the contents of the database to