It has been a very very long working week-end for the technical
team at Believe...
From MariaDB 5.2 to 5.5.28 some few QP
regression still need to be fixed.
From latin1 to utf8 very few indexes as to be
modified, reaching the max index column size limit of 767 bytes
for InnoDB. Some Url like Referer as been change to varbinary as
the column content was already a mixed of encoding stored in
latin1.
Don' forget UTF8 is bad for in memory workload as strings
in …
When computers were still using large black text oriented screens
or no screens at all, a computer only knew how to store a limited
set of characters. Then it was normal to store a name with the
more complicated characters replaced by more basic characters.
The ASCII standard was used to make communication between
multiple systems (or applications) easier. Storing characters as
ASCII needs little space and is quite strait forward.
Then DOS used CP850 and CP437 and so on to make it possible to use
language /location specific characters.
Then ISO8859-1, ISO8859-15 and more of these character
sets were defined as standard.
And now there is Unicode: UTF-8, UTF-16, UCS2, etc. which allow
you to …
Every once in a while questions like the one in MySQL Bug #60843 or Bug #19567 come up:
What collation should i use if i want case insensitive behavior but also want all accented letter to be treated as distinct to their base letters?
or shorter, as the reporter of bug #60843 put it:
I need something like utf8_bin + ci
utf8_general_ci and utf8_unicode_ci unfortunately do not provide this behavior and utf8_bin is obviously not case insensitive.
Background Knowledge
Using the character set UTF-8 allows for the use of any language, can represent every character in the Unicode character set and is backward compatibility with ASCII. Not to mention is can handle any platform and be sent through many different systems without corruption. With such advantages this is why so many are making the switch.
The following instructions were done on Debian Squeeze v6.04 AMD64 operating system using MySQL v14.14 Distrib 5.1.61.
Solution – Server Configuration
At present MySQL is configured by default to use “latin1″ character set. Here’s how to change MySQL configuration to use UTF-8 character set and collation.
-
Check MySQL’s current configuration, run the following two SQL statements.
1 2
SHOW VARIABLES LIKE '%collation%'; …
Having covered the preparation and character set options of performing a latin1 to utf8 MySQL migration, just how do you perform the migration correctly.
Example Case
Just to recap, we have the following example table and data.
mysql> select c,length(c),char_length(c),charset(c), hex(c) from conv.test_latin1; +---------------+-----------+----------------+------------+----------------------------+ | c | length(c) | char_length(c) | charset(c) | hex(c) | +---------------+-----------+----------------+------------+----------------------------+ | a | 1 | 1 | latin1 | 61 | | abc | 3 | 3 …[Read more]
The headline is flame-bait, don’t take it. I just wanted to point something out about character sets and collations in MySQL.
To the uninitiated, it may seem overwhelming. Everything has a character set! Everything has a collation! And they act weirdly! The server has one. The database has one (oh, and it changes magically as I USE different databases.) Every table has one, and columns too. Is that all? NO! My connection has one! Kill me now!
Relax. In truth, only one kind of thing actually has a charset/collation. That is values. And values are stored in columns. The only thing that really has a charset/collation is a column.[1]
What about all the rest of those things — connection, database, server, table? Those are just defaults, which determine what charset/collation a value gets if it isn’t overridden. So if the table’s default charset is utf8, and you add a column without saying what …
[Read more]Continuing on from preparation in our MySQL latin1 to utf8 migration let us first understand where MySQL uses character sets. MySQL defines the character set at 4 different levels for the structure of data.
- Instance
- Schema
- Table
- Column
In MySQL 5.1, the default character set is latin1. If not specified, this is what you will get. For example.
mysql> create table test1(c1 varchar(10) not null); mysql> show create table test1\G Create Table: CREATE TABLE `test1` ( `c1` varchar(10) NOT NULL ) ENGINE=MyISAM DEFAULT CHARSET=latin1
If you want all tables in your instance to always be a default of
utf8, you can changed the server variable
character_set_server
. This can be set dynamically.
mysql> set global …[Read more]
Before undertaking such migration the first step is a lesson in understanding more about how latin1 and utf8 work and interact in MySQL. latin1 in a common and historical character set used in MySQL. utf8 (first available in MySQL Version 4.1) is an encoding supporting multiple bytes and is the system default in MySQL 5.0
- latin1 is a single byte character set.
- utf8 is a 1-3 byte character set depending on the size of the character. NOTE: MySQL utf8 does not support the RFC 3629 4 byte sequences
MySQL variables
MySQL has a number of different system variables to consider, the following is the default representation in MySQL 5.1
mysql> show global variables like '%char%'; +--------------------------+----------------------------------------------------------------+ | Variable_name | Value …[Read more]
My colleague Lenz might have forgotten to post before he
disappeared on a well-deserved vacation but we've enabled Russian
as a choice in PlanetMySQL. Feel free to start submitting your
Russian language blogs.
Russian Language PlanetMySQL: http://ru.planet.mysql.com
New feed submissions: http://ru.planet.mysql.com/new
We haven't completely translated all the strings yet (that's my
fault, I need to stringify the vote stuff) but we're getting
there!
(EDIT: LenZ is not on vacation... in fact he is at PHPDay2009 in
Verona, Italy... sorry LenZ)
The Oracle gateway for ODBC provides an almost seamless data integration between Oracle and other RDBMS. I won’t argue about its performance, limits, or relevance. It serves a few purposes; set it up and you’ll be able, for example, to create database links between Oracle and MySQL. After all, wouldn’t it be nice if you could run some of the following SQL statements?
-
select o.col1, m.col1 from oracle_tab
o, mysql_tab@mysql m where o.col1=m.col1; -
insert into oracle_tab (select * from mysql_tab@mysql);
This post is intended to share, the same way Karun did it for SQL Server …
[Read more]