JIYIK CN >

Current Location:Home > Learning > DATABASE > MySQL >

MySQL switch from utf8 to utf8mb4

Author:JIYIK Last Updated:2025/04/26 Views:

utf8mb4

Fortunately, MySQL 5.5.3 (released in early 2010) introduced a utf8mb4new encoding called which maps to proper UTF-8 and thus fully supports Unicode, including astral symbols.

Step 1: Create a backup

Create a backup of all databases on the server you are upgrading. Safety first!

Step 2: Upgrade MySQL Server

Upgrade your MySQL server to v5.5.3+, or ask your server administrator to do it for you.

Step 3: Modify database, table and column

Change the character set and collation properties of the database, tables, and columns to use utf8mb4instead of utf8.

# For each database:
ALTER DATABASE database_name CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
# For each table:
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
# For each column:
ALTER TABLE table_name CHANGE column_name column_name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
# (Don’t blindly copy-paste this! The exact statement depends on the column type, maximum length, and other properties. The above line is just an example for a `VARCHAR` column.)

Since is fully backwards compatible utf8mb4with utf8, no mojibake or other forms of data loss should occur. (But you have backups, right?)

Step 4: Check the maximum length of column and index keys

This is probably the most tedious part of the entire upgrade process.

When utf8converting from utf8mb4, the maximum length of a column or index key in bytes is unchanged. Therefore, it is smaller in terms of characters because the maximum length of a character is now four bytes instead of three.

For example, a TINYTEXTcolumn can hold up to 255 bytes, which corresponds to 85 three-byte characters or 63 four-byte characters. Suppose we have a column that uses utf8TINYTEXT but must be able to contain more than 63 characters. Given this requirement, we cannot convert this column to utf8mb4 unless we also change the data type to a longer type, such as TEXT_t — because if we try to fill it with four-byte characters, we will only be able to enter 63 characters, but no more.

The same is true for index keys. InnoDBThe maximum index length for a storage engine is 767 bytes, so we can index up to 255 or 191 characters for a utf8 or utf8mb4 column, respectively. If our current index on a utf8 column is longer than 191 characters, we will need to index fewer characters when using utf8mb4VARCHAR(255) . (I had to change some indexes on the columns to VARCHAR(191).)

Step 5: Modify the connection, client and server character sets

In our application code, set the connection character set to utf8mb4 . This can be done by simply SET NAMES utf8replacing any variation of with SET NAMES utf8mb4. If our old SET NAMESstatement specified a collation, make sure to change that as well, e.g. SET NAMES utf8 COLLATE utf8_unicode_cito SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci.

Make sure to set both the client and server character sets. I have the following in my MySQL configuration file ( /etc/my.cnf ):

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

We can easily confirm that these settings are working correctly:

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)

As we can see, all relevant options are set to utf8mb4 , except that should be binary unless the file system we are using supports multi-byte UTF-8 encoded characters character_set_filesystemin filenames , while is always utf8 and cannot be overridden.character_set_system

注意: The default character set and collation can also be configured at some other levels.

Step 6: Repair and optimize all tables

After upgrading your MySQL server and making the necessary changes above, make sure to repair and optimize all databases and tables. I did not do this immediately after the upgrade (I did not think it was necessary since everything seemed to be working fine at first glance) and ran into some strange errors where UPDATEstatements had no effect, even though no errors were thrown.

We can run the following MySQL query for each table to be repaired and optimized:

# For each table
REPAIR TABLE table_name;
OPTIMIZE TABLE table_name;

Fortunately, this can be mysqlcheckeasily done in one go using the command line utility:

$ mysqlcheck -u root -p --auto-repair --optimize --all-databases

This will prompt for the root user's password, after which all tables in all databases will be repaired and optimized.


Summarize

Never use utf8 in MySQL - always use utf8mb4 . It may take some time to update your database and your code, but it is definitely worth the effort. Why arbitrarily limit the set of symbols that can be used in a database? Why lose data every time a user enters an astral symbol as part of a comment or message or whatever we store in the database? There is no reason not to strive for full Unicodesupport everywhere. Do the right thing and use utf8mb4 . 🍻

For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.

Article URL:

Related Articles

Two ways to install mysql-5.5.47 on Linux system and manage mysql

Publish Date:2025/04/26 Views:140 Category:MySQL

We know that there are generally two ways to install software on the Linux system. One is to use rpm or yum to install, which is convenient and fast; the other is to use the compiled source package. Although this method is more troublesome

Mysql master-slave replication simple configuration

Publish Date:2025/04/26 Views:120 Category:MySQL

I'm learning about MySQL master-slave replication recently and want to apply it to a project at work. Since I'm new to it, I don't understand it very well, so I can only share how to make a simple configuration. At the beginning, we configu

MySQL stored procedure details

Publish Date:2025/04/26 Views:163 Category:MySQL

A stored procedure can be thought of as encapsulating a SQL statement that we need to process specially into a function. When needed, we only need to call this function to achieve the desired operation. This process can be called a stored p

How many of these MySQL statement tags have you used?

Publish Date:2025/04/26 Views:122 Category:MySQL

In the article "A Peek into MySQL Stored Procedure Details" , we briefly introduced the use of stored procedures. The syntax for creating stored procedures includes BEGIN...END. In addition to BEGIN...END, the following statement tags can b

Back up the MySQL database to a file

Publish Date:2025/04/26 Views:166 Category:MySQL

Backing up your database is a very important system administration task that should usually cron be run from a job at scheduled intervals. We will use mysqldump the dump utility that comes with mysql to dump the contents of the database to

Scan to Read All Tech Tutorials

Social Media
  • https://www.github.com/onmpw
  • qq:1244347461

Recommended

Tags

Scan the Code
Easier Access Tutorial