Create a MySQL database with charset UTF-8

  • I'm new to MySQL and I would like to know:

    How can I create a database with the UTF-8 charset, like I did in Navicat?

    CREATE DATABASE mydatabase;
    

    ...seems to be using some kind of default charset.

  • shellbye (Correct answer)

    6 years ago

    Update on 2019-10-29
    As mentioned by @Manuel Jordan in the comments, utf8mb4_0900_ai_ci is the new default collation in MySQL 8.0, so the following is now the better practice:

    CREATE DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
    

    Answer before 2019-10-29
    Note: The following is now considered a better practice (see bikeman868's answer):

    CREATE DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    

    Original answer:

    Try this:

    CREATE DATABASE mydatabase CHARACTER SET utf8 COLLATE utf8_general_ci;
    

    For more information, see Database Character Set and Collation in the MySQL Reference Manual.
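
    To confirm which character set a database actually ended up with, something like the following should work. It is a minimal sketch that assumes the database is named mydatabase, as in the statements above; the system variables and the information_schema.SCHEMATA columns are standard MySQL.

    ```sql
    -- Server-wide defaults, used when CREATE DATABASE omits CHARACTER SET / COLLATE.
    SELECT @@character_set_server, @@collation_server;

    -- Character set and collation recorded for one specific database.
    SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
    FROM information_schema.SCHEMATA
    WHERE SCHEMA_NAME = 'mydatabase';
    ```

    The first query shows the defaults that apply when CREATE DATABASE is run without an explicit CHARACTER SET or COLLATE clause.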

    MySQL's `utf8mb4` is what the rest of us call `utf8`. So what is MySQL's `utf8`, you ask? It's a limited version of UTF-8 that stores at most three bytes per character, so it works for a subset of characters but fails for things like emoji. Later they added `utf8mb4`, which is the correct implementation, but MySQL has to stay backwards compatible with its old mistakes, which is why they added a new encoding instead of fixing the old one. All new databases should use `utf8mb4`.
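
    For an existing database that was created with `utf8`, a conversion might look roughly like this. This is a sketch, not a tested migration: `mytable` is a hypothetical table name, and the collation is simply the one used in the answer above.

    ```sql
    -- Change the database default; this only affects tables created afterwards.
    ALTER DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

    -- Convert an existing table and its character columns (mytable is hypothetical).
    ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    ```

    Indexed string columns may run into key-length limits on older versions, so it is worth trying this on a copy of the data first.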

    If you want to go down the rabbit hole: `COLLATE utf8mb4_unicode_520_ci` or `utf8mb4_0900_ai_ci`, or even a locale-specific collation such as `utf8mb4_vi_0900_ai_ci`. For MariaDB 10.2.2+, you have "nopad" collations such as `utf8mb4_unicode_520_nopad_ci`. See https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html
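
    To see which of these collations a given server actually offers, the SHOW COLLATION statement can be used. A small sketch; the result depends on the server and its version (the `0900` collations, for instance, were introduced in MySQL 8.0).

    ```sql
    -- List every collation available for the utf8mb4 character set.
    SHOW COLLATION WHERE Charset = 'utf8mb4';

    -- Narrow the list to the 0900 family, if the server has it.
    SHOW COLLATION LIKE 'utf8mb4%0900%';
    ```

    An empty result for the second statement suggests that `utf8mb4_0900_ai_ci` is not available and that `utf8mb4_unicode_ci` or `utf8mb4_unicode_520_ci` is the safer choice on that server.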

    @shellbye Consider updating your answer to mention `utf8mb4_0900_ai_ci`. It seems to be better than `utf8mb4_unicode_ci`.

    @ManuelJordan Updated as you suggested.

    Do you know whether this also applies to MariaDB?

    @Manngo I haven't tested it; you could try it and post your results here.

    I have found utf8mb4_general_ci to be better than utf8mb4_unicode_ci.

  • You should use:

    CREATE DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    

    Note that utf8_general_ci is no longer recommended as best practice. See the related Q&A on Stack Overflow:

    What's the difference between utf8_general_ci and utf8_unicode_ci?

    Consider updating your answer to mention `utf8mb4_0900_ai_ci`. It seems to be better than `utf8mb4_unicode_ci`.

Licensed under CC BY-SA with attribution


Content dated before 6/26/2020 9:53 AM