How to convert an entire MySQL database character set and collation to UTF-8?

How can I convert an entire MySQL database’s character set and collation to UTF-8?

Total Answers: 20

Popular Answers:

  1. Use the ALTER DATABASE and ALTER TABLE commands.

    ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

    Or, if you’re still on MySQL 5.5.2 or older, which doesn’t support 4-byte UTF-8, use utf8 instead of utf8mb4:

    ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
    ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
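
    With more than a handful of tables, it can help to generate these statements rather than type them. The following is a minimal sketch in Python, not part of the original answer: `quote_identifier` and `conversion_statements` are hypothetical helper names, and the sketch only builds SQL text, so you would still review the output and feed it to the mysql client yourself.

```python
def quote_identifier(name: str) -> str:
    """Wrap a MySQL identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(database, tables,
                          charset="utf8mb4", collation="utf8mb4_unicode_ci"):
    """Return the ALTER DATABASE statement followed by one ALTER TABLE per table."""
    statements = [
        f"ALTER DATABASE {quote_identifier(database)} "
        f"CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE {quote_identifier(table)} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # "databasename" and "tablename" are the placeholders used in the answer above
    for statement in conversion_statements("databasename", ["tablename"]):
        print(statement)
```

    For the older-server case, pass charset="utf8" and collation="utf8_unicode_ci" instead of the defaults.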
  2. On the command-line shell

    If you’re on the command-line shell, you can do this very quickly. Just fill in “dbname” 😀

    DB="dbname"
    (
        echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'
        mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names \
        | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;'
    ) \
    | mysql "$DB"

    One-liner for simple copy/paste

    DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB"
  3. Before proceeding, make sure you have completed a full database backup!

    Step 1: Database Level Changes

    • Identifying the Collation and Character set of your database

      SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
      FROM information_schema.SCHEMATA S
      WHERE schema_name = 'your_database_name'
        AND (DEFAULT_CHARACTER_SET_NAME != 'utf8'
             OR DEFAULT_COLLATION_NAME NOT LIKE 'utf8%');
    • Fixing the collation for the database

      ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci; 

    Step 2: Table Level Changes

    • Identifying Database Tables with the incorrect character set or collation

      SELECT CONCAT(
          'ALTER TABLE ', table_name, ' CHARACTER SET utf8 COLLATE utf8_general_ci; ',
          'ALTER TABLE ', table_name, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; ')
      FROM information_schema.TABLES AS T,
           information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C
      WHERE C.collation_name = T.table_collation
        AND T.table_schema = 'your_database_name'
        AND (C.CHARACTER_SET_NAME != 'utf8'
             OR C.COLLATION_NAME NOT LIKE 'utf8%');
    • Adjusting table columns’ collation and character set

    Capture the output of the query above and run it. For example:

    ALTER TABLE rma CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_history CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_products CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_products CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_report_period CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_report_period CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_reservation CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_reservation CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_supplier_return CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_supplier_return CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_supplier_return_history CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_supplier_return_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_supplier_return_product CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE rma_supplier_return_product CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

    Reference: https://confluence.atlassian.com/display/CONFKB/How+to+Fix+the+Collation+and+Character+Set+of+a+MySQL+Database

  4. Use HeidiSQL. It’s free and a very good database tool.

    From the Tools menu, open the Bulk table editor.

    Select the complete database or pick the tables to convert, then:

    • tick Change default collation: utf8mb4_general_ci
    • tick Convert to charset: utf8mb4

    Execute

    This converts the complete database from latin1 to UTF-8 in just a few seconds.

    Works like a charm 🙂

    HeidiSQL connects as utf8 by default, so any special characters should now appear as the characters themselves (æ ø å) rather than as encoded byte sequences when you inspect the table data.

    The real pitfall when moving from latin1 to utf8 is making sure PDO connects with the utf8 charset. If it doesn’t, you will get garbage data inserted into the utf8 table and question marks all over your web page, making you think the table data is not utf8…
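
    The two symptoms described above (garbage characters and question marks) can be reproduced outside MySQL. The sketch below is only an analogy in Python, under the assumption that one side of the connection speaks UTF-8 while the other assumes latin1; the function names are hypothetical, and what MySQL actually does depends on the connection settings.

```python
def read_with_wrong_charset(text, actual="utf-8", assumed="latin-1"):
    """Encode text in its real charset, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed)


def squeeze_into(text, charset="ascii"):
    """Force text into a charset that cannot represent it; misfits become '?'."""
    return text.encode(charset, errors="replace").decode(charset)


if __name__ == "__main__":
    sample = "æ ø å"
    # Each two-byte UTF-8 sequence is shown as two latin1 characters
    print(read_with_wrong_charset(sample))
    # Characters with no representation in the target charset turn into '?'
    print(squeeze_into(sample))
```

    If the page shows output like the first line, the stored bytes are usually fine and the connection charset is the thing to check.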

  5. Inspired by @sdfor’s comment, here is a bash script that does the job:

    #!/bin/bash

    printf "### Converting MySQL character set ###\n\n"

    # The value entered here is used as the COLLATE argument below;
    # the character set itself is fixed to utf8
    printf "Enter the collation you want to set (e.g. utf8_general_ci): "
    read -r COLLATION

    # Get the MySQL username
    printf "Enter mysql username: "
    read -r USERNAME

    # Get the MySQL password
    printf "Enter mysql password for user %s:" "$USERNAME"
    read -rs PASSWORD

    DBLIST=( mydatabase1 mydatabase2 )

    printf "\n"

    for DB in "${DBLIST[@]}"
    do
    (
        echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE `'"$COLLATION"'`;'
        mysql "$DB" -u"$USERNAME" -p"$PASSWORD" -e "SHOW TABLES" --batch --skip-column-names \
        | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE `'"$COLLATION"'`;'
    ) \
    | mysql "$DB" -u"$USERNAME" -p"$PASSWORD"

    echo "$DB database done..."
    done

    echo "### DONE ###"
    exit
  6. A stored procedure that converts every table in the database:

    DELIMITER $$

    CREATE PROCEDURE `databasename`.`update_char_set`()
    BEGIN
        DECLARE done INT DEFAULT 0;
        DECLARE t_sql VARCHAR(256);
        DECLARE tableName VARCHAR(128);
        -- Only base tables: CONVERT TO fails on views
        DECLARE lists CURSOR FOR
            SELECT table_name FROM `information_schema`.`TABLES`
            WHERE table_schema = 'databasename' AND table_type = 'BASE TABLE';
        DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done = 1;

        OPEN lists;
        FETCH lists INTO tableName;
        REPEAT
            -- Backticks protect table names that need quoting
            SET @t_sql = CONCAT('ALTER TABLE `', tableName, '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci');
            PREPARE stmt FROM @t_sql;
            EXECUTE stmt;
            DEALLOCATE PREPARE stmt;
            FETCH lists INTO tableName;
        UNTIL done END REPEAT;
        CLOSE lists;
    END$$

    DELIMITER ;

    CALL databasename.update_char_set();
  7. In case the data is not in the same character set you might consider this snippet from http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.html

    If the column has a nonbinary data type (CHAR, VARCHAR, TEXT), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.

    Here is an example:

    ALTER TABLE t1 CHANGE c1 c1 BLOB;
    ALTER TABLE t1 CHANGE c1 c1 VARCHAR(100) CHARACTER SET utf8;

    Make sure to choose the right collation, or you might get unique key conflicts. e.g. Éleanore and Eleanore might be considered the same in some collations.
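
    One way to spot such conflicts before converting is to export the values of a unique column and look for entries that fold to the same key. The sketch below is a rough approximation in Python, not MySQL’s actual comparison: it assumes an accent-insensitive, case-insensitive collation behaves roughly like stripping combining marks and case-folding, and `fold` and `find_collisions` are hypothetical names.

```python
import unicodedata
from collections import defaultdict


def fold(value):
    """Approximate an accent- and case-insensitive comparison key."""
    decomposed = unicodedata.normalize("NFD", value)
    # Drop combining marks (accents), then ignore case
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def find_collisions(values):
    """Group values that would compare equal under the approximated collation."""
    groups = defaultdict(list)
    for value in values:
        groups[fold(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    print(find_collisions(["Éleanore", "Eleanore", "Zoë"]))
```

    Any group this reports is worth checking against the real target collation, since actual collations define their own equivalence rules.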

    Aside:

    I had a situation where certain characters “broke” in emails even though they were stored as UTF-8 in the database. If you are sending emails using utf8 data, you might want to also convert your emails to send in UTF8.

    In PHPMailer, just update this line: public $CharSet = 'utf-8';

  8. For databases with a large number of tables, you can use a simple PHP script to update the charset of the database and all of its tables:

    $conn = mysqli_connect($host, $username, $password, $database);
    // mysqli_connect() returns false on failure, so check the return value
    if (!$conn) {
        die("Connection failed: " . mysqli_connect_error());
    }

    // Update the database default
    $alter_database_charset_sql = "ALTER DATABASE `" . $database . "` CHARACTER SET utf8 COLLATE utf8_unicode_ci";
    mysqli_query($conn, $alter_database_charset_sql);

    // Convert every table and dump the result of each ALTER
    $show_tables_result = mysqli_query($conn, "SHOW TABLES");
    $tables = mysqli_fetch_all($show_tables_result);
    foreach ($tables as $index => $table) {
        $alter_table_sql = "ALTER TABLE `" . $table[0] . "` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci";
        $alter_table_result = mysqli_query($conn, $alter_table_sql);
        echo "<pre>";
        var_dump($alter_table_result);
        echo "</pre>";
    }
  9. The safest way is to first modify the columns to a binary type and then modify them back to their original type using the desired charset.

    Each column type has its respective binary type, as follows:

    1. CHAR => BINARY
    2. TEXT => BLOB
    3. TINYTEXT => TINYBLOB
    4. MEDIUMTEXT => MEDIUMBLOB
    5. LONGTEXT => LONGBLOB
    6. VARCHAR => VARBINARY

    E.g. (VARBINARY needs an explicit length, matching the VARCHAR length):

    ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] MODIFY [COLUMN_NAME] VARBINARY(140);
    ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] MODIFY [COLUMN_NAME] VARCHAR(140) CHARACTER SET utf8mb4;

    I tried in several latin1 tables and it kept all the diacritics.

    You can generate these statements for all varchar columns with this query:

    SELECT
        -- VARBINARY needs an explicit length, so reuse the one from COLUMN_TYPE
        CONCAT('ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME, ' MODIFY ', COLUMN_NAME, ' ', REPLACE(COLUMN_TYPE, 'varchar', 'varbinary'), ';'),
        CONCAT('ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME, ' MODIFY ', COLUMN_NAME, ' ', COLUMN_TYPE, ' CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;')
    FROM information_schema.columns
    WHERE TABLE_SCHEMA IN ('[TABLE_SCHEMA]')
      AND COLUMN_TYPE LIKE 'varchar%'
      AND (COLLATION_NAME IS NOT NULL AND COLLATION_NAME NOT LIKE 'utf%');

    After doing this for all your columns, convert each table:

    ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

    To generate this statement for all your tables, use the following query:

    SELECT CONCAT('ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME, ' CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;')
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_COLLATION NOT LIKE 'utf8%'
      AND TABLE_SCHEMA IN ('[TABLE_SCHEMA]');

    And now that you modified all your columns and tables, do the same on the database:

    ALTER DATABASE [DATA_BASE_NAME] CHARSET = utf8mb4 COLLATE = utf8mb4_general_ci; 
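
    The text-to-binary type mapping listed in this answer can also be applied mechanically. What follows is a minimal Python sketch under stated assumptions: `binary_equivalent` and `two_step_statements` are hypothetical helpers, column types are assumed to look like `varchar(140)` or `text`, and the generated MODIFY statements restate only the type, so any NOT NULL or DEFAULT attributes would have to be added by hand.

```python
import re

# Mapping taken from the list in the answer above
BINARY_TYPE = {
    "char": "binary",
    "varchar": "varbinary",
    "tinytext": "tinyblob",
    "text": "blob",
    "mediumtext": "mediumblob",
    "longtext": "longblob",
}


def binary_equivalent(column_type):
    """Map e.g. 'varchar(140)' to 'varbinary(140)', keeping any length."""
    match = re.fullmatch(r"\s*([a-z]+)\s*(\(\d+\))?\s*", column_type.lower())
    if not match or match.group(1) not in BINARY_TYPE:
        raise ValueError(f"no binary equivalent known for {column_type!r}")
    return BINARY_TYPE[match.group(1)] + (match.group(2) or "")


def two_step_statements(table, column, column_type, charset="utf8mb4"):
    """Build the 'to binary, then back to text' pair of ALTER statements."""
    return [
        f"ALTER TABLE {table} MODIFY {column} {binary_equivalent(column_type)};",
        f"ALTER TABLE {table} MODIFY {column} {column_type} CHARACTER SET {charset};",
    ]


if __name__ == "__main__":
    for statement in two_step_statements("mydb.users", "name", "varchar(140)"):
        print(statement)
```

    The table and column names in the usage example are placeholders, not names from the original answer.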
  10. Dump the database, rewrite the character set declarations in the dump, and re-import it:

    mysqldump -uusername -ppassword -c -e --default-character-set=utf8 --single-transaction --skip-set-charset --add-drop-database -B dbname > dump.sql
    cp dump.sql dump-fixed.sql
    vim dump-fixed.sql
    :%s/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/
    :%s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/
    :wq
    mysql -uusername -ppassword < dump-fixed.sql
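
    If you would rather not edit a large dump interactively, the two substitutions from the dump-and-reload approach above can be scripted. This is a small Python sketch, not the original author’s method: `rewrite_dump` is a hypothetical name, the file names mirror the ones used above, and unlike the vim commands (which change the first match on each line) it replaces every match.

```python
import re


def rewrite_dump(sql, old="latin1", charset="utf8", collation="utf8_general_ci"):
    """Apply the two charset substitutions to the text of a SQL dump."""
    sql = re.sub(
        rf"DEFAULT CHARACTER SET {re.escape(old)}\b",
        f"DEFAULT CHARACTER SET {charset} COLLATE {collation}",
        sql,
    )
    sql = re.sub(
        rf"DEFAULT CHARSET={re.escape(old)}\b",
        f"DEFAULT CHARSET={charset}",
        sql,
    )
    return sql


if __name__ == "__main__":
    # The dump was written with --default-character-set=utf8, so read it as UTF-8
    with open("dump.sql", encoding="utf-8") as source:
        fixed = rewrite_dump(source.read())
    with open("dump-fixed.sql", "w", encoding="utf-8") as target:
        target.write(fixed)
```

    This reads the whole dump into memory, which is fine for small databases; for very large dumps a line-by-line variant would be more appropriate.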
  11. If you cannot get your tables to convert or your table is always set to some non-utf8 character set, but you want utf8, your best bet might be to wipe it out and start over again and explicitly specify:

    create database database_name character set utf8; 
  12. From utf8 to utf8mb4:

    1. Show the default character set of every database:

    SELECT SCHEMA_NAME 'YOUR_DATABASE_NAME',
           default_character_set_name 'charset',
           DEFAULT_COLLATION_NAME 'collation'
    FROM information_schema.SCHEMATA;

    2. Show the status (character set) of all tables, focusing on the ‘collation’ column:

    USE YOUR_DATABASE_NAME;
    SHOW TABLE STATUS;

    3. Generate the conversion SQL, which converts the database and all tables to utf8mb4 / utf8mb4_unicode_ci:

    USE information_schema;

    SELECT CONCAT("ALTER DATABASE `", table_schema, "` CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;") AS _sql
    FROM `TABLES`
    WHERE table_schema LIKE "YOUR_DATABASE_NAME" AND TABLE_TYPE = 'BASE TABLE'
    GROUP BY table_schema
    UNION
    SELECT CONCAT("ALTER TABLE `", table_schema, "`.`", table_name, "` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;") AS _sql
    FROM `TABLES`
    WHERE table_schema LIKE "YOUR_DATABASE_NAME" AND TABLE_TYPE = 'BASE TABLE'
    GROUP BY table_schema, TABLE_NAME
    /* include all columns; commonly you don't need this */
    /*
    UNION
    SELECT CONCAT("ALTER TABLE `", `COLUMNS`.table_schema, "`.`", `COLUMNS`.table_name, "` CHANGE `", column_name, "` `", column_name, "` ", data_type, "(", character_maximum_length, ") CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci", IF(is_nullable="YES", " NULL", " NOT NULL"), ";") AS _sql
    FROM `COLUMNS` INNER JOIN `TABLES` ON `TABLES`.table_name = `COLUMNS`.table_name
    WHERE `COLUMNS`.table_schema LIKE "YOUR_DATABASE_NAME"
      AND data_type IN ('varchar','char')
      AND TABLE_TYPE = 'BASE TABLE'
    UNION
    SELECT CONCAT("ALTER TABLE `", `COLUMNS`.table_schema, "`.`", `COLUMNS`.table_name, "` CHANGE `", column_name, "` `", column_name, "` ", data_type, " CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci", IF(is_nullable="YES", " NULL", " NOT NULL"), ";") AS _sql
    FROM `COLUMNS` INNER JOIN `TABLES` ON `TABLES`.table_name = `COLUMNS`.table_name
    WHERE `COLUMNS`.table_schema LIKE "YOUR_DATABASE_NAME"
      AND data_type IN ('text','tinytext','mediumtext','longtext')
      AND TABLE_TYPE = 'BASE TABLE'
    */
    ;

    4. Run the generated SQL.

    5. Refresh your database.

    6. Check the result:

    SHOW TABLE STATUS;
  13. The only solution that worked for me: http://docs.moodle.org/23/en/Converting_your_MySQL_database_to_UTF8

    Converting a database containing tables

    mysqldump -uusername -ppassword -c -e --default-character-set=utf8 --single-transaction --skip-set-charset --add-drop-database -B dbname > dump.sql
    cp dump.sql dump-fixed.sql
    vim dump-fixed.sql
    :%s/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/
    :%s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/
    :wq
    mysql -uusername -ppassword < dump-fixed.sql
  14. ALTER TABLE table_name CHARSET = 'utf8';

    This is a simple query I was able to use in my case; change table_name to suit your requirements.

  15. To change the character set encoding to UTF-8 for the database itself, type the following command at the mysql> prompt. Replace DBNAME with the database name:

    ALTER DATABASE DBNAME CHARACTER SET utf8 COLLATE utf8_general_ci; 
  16. Command Line Solution and Exclude Views

    I am simply completing @Jasny’s answer for others who, like @Brian and me, have views in their database.

    If you have an error like this:

    ERROR 1347 (HY000) at line 17: 'dbname.table_name' is not of type 'BASE TABLE' 

    It’s because you probably have views and you need to exclude them. But when trying to exclude them, MySQL returns 2 columns instead of 1.

    SHOW FULL TABLES WHERE Table_Type = 'BASE TABLE';
    -- table_name1 BASE TABLE
    -- table_name2 BASE TABLE

    So we have to adapt Jasny’s command with awk to extract only the 1st column which contains the table name.

    DB="dbname"
    (
        echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'
        mysql "$DB" -e "SHOW FULL TABLES WHERE Table_Type = 'BASE TABLE'" --batch --skip-column-names \
        | awk '{print $1 }' \
        | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;'
    ) \
    | mysql "$DB"

    One-liner for simple copy/paste

    DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW FULL TABLES WHERE Table_Type = 'BASE TABLE'" --batch --skip-column-names | awk '{print $1 }' | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB" 
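
    The awk step above splits on any whitespace, so it would cut a table name that contains a space. As a hedged alternative, the Python sketch below parses the tab-separated output that `mysql --batch --skip-column-names` prints for `SHOW FULL TABLES` and keeps only the base tables. `base_tables` is a hypothetical helper, and the sample input is made up for illustration.

```python
def base_tables(batch_output):
    """Return table names from tab-separated 'name<TAB>type' lines, skipping views."""
    tables = []
    for line in batch_output.splitlines():
        if not line.strip():
            continue
        # Split on the first tab only, so names containing spaces stay intact
        name, _, table_type = line.partition("\t")
        if table_type.strip() == "BASE TABLE":
            tables.append(name)
    return tables


if __name__ == "__main__":
    sample = "table_name1\tBASE TABLE\nv_report\tVIEW\ntable_name2\tBASE TABLE\n"
    for table in base_tables(sample):
        print(f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;")
```

    Because the filter runs in Python, this also works when `SHOW FULL TABLES` is run without the WHERE clause.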
  17. To change the character set encoding to UTF-8, follow the simple steps in phpMyAdmin.

  18. If it’s Java source code, copy it to Visual Studio and then copy it back to Word.

  19. Ok, this is weird, but to address the background color issue I paste in the text as normal, select the whole block, click on the highlighter tool to highlight (even if the highlight is set to “No Color”), and then I can style the text block without the background color of the text remaining white. I am using VS 11 and Word 2010, but the problem has been around for a long time (see http://www.visualstudiodev.com/visual-studio-setup-installation/copypaste-code-from-vs-1305.shtml)

  20. If you are using Android Studio, you can simply copy and paste, and the code aspect is going to be preserved and the colors as well. Simple enough!

  21. From Powershell ISE copy and paste to Word.
    Same with Visual Studio.

  22. Just paste your code into MS Word, select it, then right-click -> Numbering. MS Word will then treat your code as a bulleted/numbered list.

  23. If you’re using TextMate (on OS X), use the “copy as rtf” command. It will place pretty-printed text onto the clipboard.

    From there you can paste into word or anything else.

  24. A quick approach I use is the Snipping Tool (a built-in Microsoft tool) together with Stack Overflow’s preview.

    Once I have entered my code into an Ask Question box, I capture the preview and insert it into the MS Word document as a picture.

    The result is a picture (not SO code) that you can put into Word.

    No worries about formatting, grammar checks, or downloading new software or add-ins!

  25. If you already have the document created with plenty of code snippets in it and you are racing against time (as I unfortunately was), save the file as a .doc as opposed to .docx and voila! Worked for me. Phew!

    NOTE: Obviously your document can’t have fancy features from > word 2007.

    NOTE 2: File size becomes bigger if this is a concern to you.

  26. The simplest solution, for me at least, is to paste your code into the document, highlight it, then navigate to:

    home -> styles -> << click drop down arrow by styles >> -> code

    This has the advantage that the code is now searchable within the document (unlike gargamel’s solution), as well as being able to format code that is multiple pages.

  27. Simply right-click and paste using the “Keep Source Formatting” option. I do this almost every day to document my work. Further, you can set the ‘default paste’ for pasting from various sources in File/Options/Advanced/Cut, copy, and paste. Also useful: enable “Show paste options” in the same section of Word Options.

    Note that all of the text properties from your Code Editor’s theme (colors, fonts, etc.) will be added to the Stylesheet in your Word doc, so I would recommend that you not make any changes directly to the pasted text as that will add clutter to your stylesheet and subsequent pastes will not match. It will be to your great advantage to do a quick study on using ‘Styles’ in Word (which are actually CSS). They are very powerful. Using Word’s Stylesheet you can make global changes to the pasted text, but it will probably cause subsequent pasted text to add new styles.

  28. You can paste your code into LINQPad. Then copy from LINQPad into MS Word. LINQPad supports following programming languages: C#, VB, SQL, ESQL and F#

  29. Hilite doesn’t seem to be mentioned yet in the answers, so: Hilite supports lots of languages (20+), can be used online also via API, and is on Github (so you can clone, modify, and run it on your own if you don’t trust the online service). The online version can also be adjusted to one’s needs via CSS rules.

    I just found it some minutes ago since I needed a tool for copying xQuery into Word, but couldn’t find a proper tool for doing so. The source program is baseX and for some reason, its formatting could not be transmitted to Word (also not via Keep format etc. when pasting). Also, many of the given answers are now, i.e. 06/2019, not working anymore or do not support xQuery. Hilite, however, did the job quite well.

    Edit: a code block is not part of the result, unfortunately, just the highlighting. Nevertheless, it’s better than nothing, and adjusting the result by adding a block around it is still less work than formatting every single line by hand.

    • What I do is I use Google Backup and Sync and put the docx file in the folder that syncs with Google Drive.
    • Then Open the file in chrome as google drive has functionality to parse docx file.
    • Then run this plugin https://workspace.google.com/marketplace/app/code_blocks/100740430168 which formats the code in different languages with good theme.
    • Once done save it and open the docx file in the system once it is synced.

Tags: mysql, character-encoding