MySQL VARCHAR Lengths and UTF-8

In MySQL, if I create a new VARCHAR(32) field in a UTF-8 table, does it mean I can store 32 bytes of data in that field or 32 characters (multi-byte)?



Solution 1:[1]

It would let you store 32 multi-byte characters: the length in VARCHAR(32) is counted in characters, not bytes.

To save space with UTF-8, use VARCHAR instead of CHAR. Otherwise, MySQL must reserve three bytes for each character in a CHAR CHARACTER SET utf8 column because that is the maximum possible length. For example, MySQL must reserve 30 bytes for a CHAR(10) CHARACTER SET utf8 column.

http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html
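
To make the character/byte distinction concrete, here is a minimal Python sketch (not MySQL itself). The sample strings and the names `samples` and `char10_reserved` are my own assumptions for illustration; the 3-bytes-per-character maximum is the figure quoted above for the utf8 character set.

```python
# Each sample is 32 characters long, so each would fit in VARCHAR(32)
# in a utf8 column, even though the byte lengths differ.
samples = {
    "ascii": "a" * 32,          # 1 byte per character in UTF-8
    "accented": "\u00e9" * 32,  # e with acute accent, 2 bytes per character
    "cjk": "\u6f22" * 32,       # a CJK ideograph, 3 bytes per character
}

for name, text in samples.items():
    char_count = len(text)                  # characters (code points)
    byte_count = len(text.encode("utf-8"))  # bytes once encoded as UTF-8
    print(name, char_count, byte_count)

# Worst case a CHAR(10) CHARACTER SET utf8 column has to reserve,
# using the 3-byte maximum quoted from the manual.
MAX_BYTES_PER_CHAR_UTF8 = 3
char10_reserved = 10 * MAX_BYTES_PER_CHAR_UTF8
print(char10_reserved)
```

All three samples have the same character count of 32, while the byte counts come out as 32, 64, and 96.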

Solution 2:[2]

A varchar(32) column with collation utf8_unicode_ci holds 32 multi-byte characters; I just tested this with XAMPP.

????????????????????????????????????????

Gets truncated to:

????????????????????????????????

Keep in mind that these are not regular ASCII chars.
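
A rough Python imitation of that test, in case it helps. This only mimics truncation by character count; it does not talk to MySQL. The helper `truncate_like_varchar` and the 40-character input are hypothetical choices of mine. Also note that in strict SQL mode MySQL rejects an over-long value with an error instead of truncating it.

```python
def truncate_like_varchar(value: str, max_chars: int) -> str:
    """Hypothetical helper: keep at most max_chars characters, not bytes."""
    return value[:max_chars]


# 40 two-byte characters (e with acute accent), i.e. 80 bytes in UTF-8.
original = "\u00e9" * 40
stored = truncate_like_varchar(original, 32)

print(len(original), len(original.encode("utf-8")))
print(len(stored), len(stored.encode("utf-8")))
```

The stored value keeps 32 characters, which is 64 bytes here, so the limit is clearly not a 32-byte limit.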

Solution 3:[3]

It is better to use "char" for frequently updated tables because the total length of each row is then fixed, which makes access fast. Varchar columns make row sizes dynamic. That's not good for MyISAM, but I don't know about InnoDB and others. For example, if you have a very narrow "type" column, it may be better to use char(2) with the latin1 charset so that it claims only minimal space.

Solution 4:[4]

If you connect to the database using latin1 encoding (for example with PHP) to save a PHP UTF8 string in a MySQL UTF8 column, you will end up with a double UTF8 encoding.

If the UTF8 string $s is 32 characters long but 64 bytes long and the column is VARCHAR(32) UTF8, the double encoding converts $s into a 64-character UTF8 string. That string is truncated in the database to its first 32 characters, which correspond to the first 32 bytes of $s. You may end up thinking that MySQL 5 behaves like MySQL 4, but this is in fact a second cause for the same effect.
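
The effect can be reproduced outside of MySQL with a small Python sketch. It assumes Python's `latin-1` codec as a stand-in for a latin1 connection (MySQL's latin1 differs for a few byte values, which do not occur in this sample), and the function name `double_encode` is hypothetical.

```python
def double_encode(text: str) -> str:
    """Simulate UTF-8 bytes being misread as latin1: one character per byte."""
    return text.encode("utf-8").decode("latin-1")


# $s from the explanation above: 32 characters, 64 bytes in UTF-8.
s = "\u00e9" * 32
mangled = double_encode(s)  # now 64 characters long
stored = mangled[:32]       # what a VARCHAR(32) column would keep

# Reading the stored value back over the same latin1 connection yields
# the first 32 bytes of the original string, i.e. only 16 real characters.
recovered_bytes = stored.encode("latin-1")
recovered = recovered_bytes.decode("utf-8")

print(len(s), len(mangled), len(stored), len(recovered))
```

So the column still counts characters; it only looks like a 32-byte limit because every byte of $s was turned into a character of its own before it was stored.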

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Alix Axel
Solution 3 Nudge
Solution 4