Create a MySQL database with charset UTF-8
Update on 2019-10-29
As mentioned by @Manuel Jordan in the comments,
utf8mb4_0900_ai_ci is the new default collation in
MySQL 8.0, so the following is now the better practice:
CREATE DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
Answer before 2019-10-29
Note: The following is now considered a better practice (see bikeman868's answer):
CREATE DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Original answer:
CREATE DATABASE mydatabase CHARACTER SET utf8 COLLATE utf8_general_ci;
For more information, see Database Character Set and Collation in the MySQL Reference Manual.
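To confirm what a database actually ended up with, or to change an existing one, something like the following should work. This is a minimal sketch: `mydatabase` is the name used above, while `mytable` is a hypothetical table name used only for illustration.

```sql
-- Show the default character set and collation of the database.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'mydatabase';

-- Change the defaults of an existing database.
-- This only affects tables created afterwards.
ALTER DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Convert an existing table (hypothetical name) and its text columns.
ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```

Note that `ALTER DATABASE` does not rewrite existing tables, which is why the separate `ALTER TABLE ... CONVERT TO` step is shown. Back up the data before converting a table that is already in use.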
MySQL's `utf8mb4` is what the rest of us call `utf8`. So what is MySQL's `utf8`, you ask? It's a limited version of UTF-8 that stores at most three bytes per character, so it only works for a subset of characters and fails for things like emoji. Later they added `utf8mb4`, which is the correct implementation, but MySQL has to stay backwards compatible with its old mistakes, which is why they added a new encoding instead of fixing the old one. All new databases should use `utf8mb4`.
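The byte limit behind this can be checked on the server itself. As a rough sketch, the `MAXLEN` column of `information_schema.CHARACTER_SETS` reports the maximum number of bytes per character for each character set; depending on the server version, the older character set may be listed as `utf8`, `utf8mb3`, or both names may be accepted.

```sql
-- Compare the maximum bytes per character of the two character sets.
-- The three-byte variant is listed as utf8 or utf8mb3 depending on the version.
SELECT CHARACTER_SET_NAME, DEFAULT_COLLATE_NAME, MAXLEN
FROM information_schema.CHARACTER_SETS
WHERE CHARACTER_SET_NAME IN ('utf8', 'utf8mb3', 'utf8mb4');
```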
If you want to go down the rabbit hole: `COLLATE utf8mb4_unicode_520_ci` or `utf8mb4_0900_ai_ci` or even locale specific, for example: `utf8mb4_vi_0900_ai_ci`. For MariaDB 10.2.2+, you have "nopad" collations `utf8mb4_unicode_520_nopad_ci`. https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html
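Which of these collations are available depends on the server and its version, so it may be worth listing them before picking one. A minimal sketch:

```sql
-- List every collation the server offers for utf8mb4.
SHOW COLLATION WHERE Charset = 'utf8mb4';

-- Narrow the list to the Unicode 9.0 based collations.
SHOW COLLATION LIKE 'utf8mb4%0900%';
```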
@shellbye Consider updating your answer to mention `utf8mb4_0900_ai_ci`. It seems to be better than `utf8mb4_unicode_ci`.
You should use:
CREATE DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
utf8_general_ci is no longer the recommended best practice. See the related Q&A:
What's the difference between utf8_general_ci and utf8_unicode_ci on Stack Overflow.