Showing entries 1 to 30 of 31 Next 1 Older Entries

Displaying posts with tag: freesoftware

First steps with MariaDB Global Transaction ID

My previous writings were mostly theoretical, so I wanted to give a more practical example, showing the actual state of the current code. I also wanted to show how I have tried to make the feature fit well into the existing replication features, without requiring the user to enable lots of options or understand lots of restrictions before being able to use it.

So let us start! We will build the code from lp:~maria-captains/maria/10.0-mdev26, which at the time of writing is at revision knielsen@knielsen-hq.org-20130214134205-403yjqvzva6xk52j.

First, we start a master server on port 3310 and put a bit of data into it:

    server1> use test;

  [Read more...]
More on global transaction ID in MariaDB

I got some very good comments/questions on my previous post on MariaDB global transaction ID, from Giuseppe and Robert (of Tungsten fame). I thought a follow-up post would be appropriate to answer and further elaborate on the comments, as the points they raise are very important and interesting.

(It also gives me the opportunity to explain more deeply a lot of interesting design decisions that I left out in the first post for the sake of brevity and clarity.)

On crash-safe slave

One of the things I really wanted to improve with global transaction ID is to make the replication slaves more crash safe with respect to their current replication state. This state is mostly persistently stored information about which event(s) were last executed on the slave, so that after a server restart the slave will know from  [Read more...]
Global transaction ID in MariaDB

The main goal of global transaction ID is to make it easy to promote a new master and switch all slaves over to continue replication from the new master. This is currently harder than it could be, since the current replication position for a slave is specified in coordinates that are specific to the current master, and it is not trivial to translate them into the corresponding coordinates on the new master. Global transaction ID solves this by annotating each event with a global transaction ID that is unique and universal across the whole replication hierarchy.

In addition, there are at least two other main goals for MariaDB global transaction ID:

  • Make it easy to set up global transaction ID, and easy to provision a new slave into an existing replication hierarchy.
  • Fully support
  [Read more...]
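To make the idea of a master-independent position concrete, here is a minimal sketch in C. It assumes the domain-server-sequence form (for example `0-1-100`) that MariaDB uses to write a global transaction ID; the excerpt above does not show that form, and the struct and function names are hypothetical, not taken from the MariaDB source.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical representation of one global transaction ID. */
struct gtid {
  uint32_t domain_id;  /* replication domain */
  uint32_t server_id;  /* server that originally logged the event */
  uint64_t seq_no;     /* sequence number within the domain */
};

/* Format as "domain-server-sequence"; returns what snprintf() returns. */
static int gtid_format(const struct gtid *g, char *buf, size_t len)
{
  assert(g != NULL && buf != NULL);
  return snprintf(buf, len, "%" PRIu32 "-%" PRIu32 "-%" PRIu64,
                  g->domain_id, g->server_id, g->seq_no);
}

/* Simplifying assumption for this sketch: within one domain the sequence
   number only grows, so an event at or below the slave position has
   already been applied, whichever master the slave now connects to. */
static bool gtid_already_applied(const struct gtid *slave_pos,
                                 const struct gtid *event)
{
  return event->domain_id == slave_pos->domain_id &&
         event->seq_no <= slave_pos->seq_no;
}
```

The point of the sketch is that nothing in the position refers to a binlog file name or offset on a particular master, so the same value can be handed to any server in the hierarchy.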
    Integer overflow

    What do you think of this piece of C code?

      void foo(long v) {
        unsigned long u;
        unsigned sign;
        if (v < 0) {
          u = -v;
          sign = 1;
        } else {
          u = v;
          sign = 0;
        }
        ...
    
    Seems pretty simple, right? Then what do you think of this output from MySQL:
      mysql> create table t1 (a bigint) as select '-9223372036854775807.5' as a;
      mysql> select * from t1;
      +----------------------+
      | a                    |
      +----------------------+
      | -'..--).0-*(+,))+(0( | 
      +----------------------+
    
    Yes, that is authentic output from older versions of MySQL. Not just the wrong number, the output is complete garbage! This is my all-time favorite MySQL bug, #31799. It was caused by code like the above C snippet.

    So can you spot what is wrong with the code? Looks pretty simple, does it

      [Read more...]
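The excerpt stops before the answer, so what follows is one reading of the snippet rather than the author's own explanation. In C, `-v` is evaluated in the signed type, and when `v` is `LONG_MIN` the result is not representable in a `long` on two's complement machines, which makes the negation undefined behavior. A minimal sketch of a variant that avoids this, with the hypothetical name `magnitude_of`, converts to unsigned before negating:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Split a signed long into magnitude and sign without ever negating a
   signed value, so that LONG_MIN is handled without overflow. */
static unsigned long magnitude_of(long v, unsigned *sign)
{
  assert(sign != NULL);
  if (v < 0) {
    *sign = 1;
    /* Convert first, then negate: unsigned arithmetic wraps modulo 2^N,
       which gives the correct magnitude even for LONG_MIN. */
    return 0UL - (unsigned long)v;
  }
  *sign = 0;
  return (unsigned long)v;
}
```

For example, `magnitude_of(LONG_MIN, &sign)` yields `(unsigned long)LONG_MAX + 1` with `sign` set to 1, where the original `u = -v` gives the compiler licence to do anything.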
    Even faster group commit!

    I found time to continue my previous work on group commit for the binary log in MariaDB.

    In the current code, a (group) commit to InnoDB does no fewer than three fsync() calls:

  • Once during InnoDB prepare, to make sure we can recover the transaction in InnoDB if we crash after writing it to the binlog.
  • Once after binlog write, to make sure we have the transaction in the binlog before we irrevocably commit it in InnoDB.
  • Once during InnoDB commit, to make sure we no longer need to scan the binlog after a crash to recover the transaction. Of
  [Read more...]
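The three steps in the list above can be sketched as a plain sequence of write-then-fsync() calls. This is only an illustration of the ordering, not MariaDB or InnoDB code: the record contents, the file descriptors, and every function name here are hypothetical, and a counter stands in for the cost being discussed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Number of fsync() calls issued so far (for illustration only). */
static int fsync_calls = 0;

static int counted_fsync(int fd)
{
  fsync_calls++;
  return fsync(fd);
}

/* Hypothetical helper: append one record and force it to disk. */
static int write_durably(int fd, const char *rec, size_t len)
{
  if (write(fd, rec, len) != (ssize_t)len)
    return -1;
  return counted_fsync(fd);
}

/* One commit (or one group of commits) in the order the list describes:
   engine prepare, binlog write, engine commit, each followed by fsync(). */
static int commit_with_binlog(int engine_log_fd, int binlog_fd)
{
  assert(engine_log_fd >= 0 && binlog_fd >= 0);
  if (write_durably(engine_log_fd, "prepare\n", 8) != 0) return -1;
  if (write_durably(binlog_fd, "event\n", 6) != 0) return -1;
  if (write_durably(engine_log_fd, "commit\n", 7) != 0) return -1;
  return 0;
}

/* Hypothetical helper for trying the sketch: an unnamed scratch file. */
static int open_scratch_file(void)
{
  char path[] = "/tmp/commit-sketch-XXXXXX";
  int fd = mkstemp(path);
  if (fd >= 0)
    unlink(path);
  return fd;
}
```

Calling `commit_with_binlog()` once on two scratch files leaves `fsync_calls` at 3, which is the per-commit cost the post sets out to reduce.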
    Tale of a bug

    This is a tale of the bug lp:798213. The bug report has the initial report, and a summary of the real problem obtained after detailed analysis, but it does not describe the process of getting from the former to the latter. I thought it would be interesting to document this, as the analysis of this bug was rather tricky and contains several good lessons.

    Background

    The bug first manifested itself as a sporadic failure in one of our random query generator tests for replication. We run this test after all MariaDB pushes in our Buildbot setup. However, this failure had only occurred twice in several months, so it is clearly a very rare failure.

    The first task was to try to repeat the problem and get some more data in the form of binlog files and so on. Philip kindly

      [Read more...]
    The future of replication revealed in Istanbul

    A very good meeting in Istanbul is drawing to an end. People from Monty Program, Facebook, Galera, Percona, SkySQL, and other parts of the community are meeting with one foot on the European continent and another in Asia to discuss all things MariaDB and MySQL and experience the mystery of the Orient.

    At the meeting I had the opportunity to present my plans and visions for the future development of replication in MariaDB. My talk was very well received, and I had a lot of good discussions afterwards with many of the bright people here. Working from home in a virtual company, it means a lot to get this kind of inspiration and encouragement from others on occasion, and I am looking forward to continuing the work after an early flight to Copenhagen tomorrow.

    The new interface for transaction coordinator plugins is what

      [Read more...]
    Dynamic linking costs two cycles

    It turns out that the overhead of dynamic linking on Linux amd64 is 2 CPU cycles per cross-module call. I usually take forever to get to the point in my writing, so I thought I would change this for once :-)

    In MySQL, there has been a historical tendency to favour static linking, in part to avoid the overhead (in execution efficiency) associated with dynamic linking. However, on modern systems there are also very serious drawbacks when using static linking.

    The particular issue that inspired this article is that I was working on MWL#74, building a proper shared libmysqld.so library for the MariaDB embedded server. The lack of a proper libmysqld.so in MySQL and MariaDB has caused no end of grief for packaging Amarok for the various Linux distributions.

      [Read more...]
    Micro-benchmarking pthread_cond_broadcast()

    In my work on group commit for MariaDB, I have the following situation:

    A group of threads are going to participate in group commit. This means that one of the threads, called the group leader, will run an fsync() for all of them, while the other threads wait. Once the group leader is done, it needs to wake up all of the other threads.

    The obvious way to do this is to have the group leader call pthread_cond_broadcast() on a condition that the other threads are waiting for with pthread_cond_wait():

      bool wakeup= false;
      pthread_cond_t wakeup_cond;
      pthread_mutex_t wakeup_mutex;
    

    Waiter:

      pthread_mutex_lock(&wakeup_mutex);
      while (!wakeup)
        pthread_cond_wait(&wakeup_cond, &wakeup_mutex);
      pthread_mutex_unlock(&wakeup_mutex);

      [Read more...]
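The excerpt shows only the waiter side. A minimal, self-contained sketch of both sides follows, assuming the usual pattern of setting the flag under the mutex before broadcasting. The `woken` counter, `leader_wake_all()`, and `run_group()` are hypothetical names added so the example can be tried; the fsync() the leader would run first is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_WAITERS = 16 };

static bool wakeup = false;
static int woken = 0;  /* counts waiters that got past the wait loop */
static pthread_cond_t wakeup_cond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t wakeup_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Waiter: same loop as in the excerpt, plus the counter. */
static void *waiter(void *arg)
{
  (void)arg;
  pthread_mutex_lock(&wakeup_mutex);
  while (!wakeup)
    pthread_cond_wait(&wakeup_cond, &wakeup_mutex);
  woken++;
  pthread_mutex_unlock(&wakeup_mutex);
  return NULL;
}

/* Leader: set the flag under the mutex, then wake every waiter at once. */
static void leader_wake_all(void)
{
  pthread_mutex_lock(&wakeup_mutex);
  wakeup = true;
  pthread_cond_broadcast(&wakeup_cond);
  pthread_mutex_unlock(&wakeup_mutex);
}

/* Start n waiters, wake them all, and return how many woke up. */
static int run_group(int n)
{
  pthread_t threads[MAX_WAITERS];
  int started = 0;

  assert(n >= 0 && n <= MAX_WAITERS);
  wakeup = false;
  woken = 0;
  while (started < n &&
         pthread_create(&threads[started], NULL, waiter, NULL) == 0)
    started++;
  leader_wake_all();
  for (int i = 0; i < started; i++)
    pthread_join(threads[i], NULL);
  return woken;
}
```

Because each waiter rechecks `wakeup` while holding the mutex, it does not matter whether a waiter reaches pthread_cond_wait() before or after the broadcast; `run_group(4)` returns 4 either way.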
    MySQL/MariaDB replication: applying events on the slave side

    Working on a new set of replication APIs in MariaDB, I have given some thought to the generation of replication events on the master server.

    But there is another side of the equation: to apply the generated events on a slave server. This is something that most replication setups will need (unless they replicate to non-MySQL/MariaDB slaves). So it will be good to provide a generic interface for this; otherwise every binlog-like plugin implementation will have to reinvent it.

    A central idea in the current design for generating events is that we do not enforce a specific content of events. Instead, the API provides accessors for a lot of different information related to each event, allowing the plugin flexibility in choosing what to

      [Read more...]
    Dissecting the MySQL replication binlog events

    For the replication project that I am currently working on in MariaDB, I wanted to understand exactly what information is needed to do full replication of all MySQL/MariaDB statements on the level of completeness that existing replication does. So I went through the code, and this is what I found.

    What I am after here is a complete list of what the execution engine needs to provide so that a replication system can completely replicate all changes made on a master server. The list excludes anything specific to the particular implementation of replication used, like binlog positions or replication event disk formats.

    The basic information needed is of course the query (for statement-based replication), or the column values (for row-based replication). But there are lots of extra details

      [Read more...]
    Fixing MySQL group commit (part 4 of 3)

    (No three-part series is complete without a part 4, right?)

    Here is an analogy that describes well what group commit does. We have a bus driving back and forth transporting people from A to B (corresponding to fsync() "transporting" commits to durable storage on disk). The group commit optimisation is to have the bus pick up everyone that is waiting at A before driving to B, not drive people one by one. Makes sense, huh? :-)

    It is pretty obvious that this optimisation of having more than one person in the bus can dramatically improve throughput, and it is the same for the group commit optimisation. Here is a graph from a benchmark comparing stock MariaDB 5.1 vs. MariaDB patched

      [Read more...]
    Fixing MySQL group commit (part 3)

    This is the third and final article in a series about group commit in MySQL. The first article discussed the background: group commit in MySQL does not work when the binary log is enabled. The second article explained the part of the InnoDB code that is responsible for the problem.

    So how do we fix group commit in MySQL? As we saw in the second article of this series, we can just eliminate the prepare_commit_mutex from InnoDB, extend the binary logging to do group commit by itself, and that would solve the problem.

    However, we might be able to do even better. As explained in the first article, with binary logging

      [Read more...]
    Fixing MySQL group commit (part 2)

    This is the second in a series of three articles about ideas for implementing full support for group commit in MariaDB. The first article discussed the background: group commit in MySQL does not work when the binary log is enabled. See also the third article.

    Internally, InnoDB (and hence XtraDB) does support group commit. The way this works is seen in the innobase_commit() function. The work in this function is split into two parts. First, a "fast" part, which registers the commit in memory:

        trx->flush_log_later = TRUE;
        innobase_commit_low(trx);
        trx->flush_log_later = FALSE;
    
    Second, a "slow" part, which writes and fsync's the commit to disk to make it durable:
        trx_commit_complete_for_mysql(trx);
    

      [Read more...]
    Fixing MySQL group commit (part 1)

    This is the first in a series of three articles about ideas for implementing full support for group commit in MariaDB (for the other parts see the second and third articles). Group commit is an important optimisation for databases that helps mitigate the latency of physically writing data to permanent storage. Group commit can have a dramatic effect on performance, as the following graph shows:

    The rising blue and yellow lines show transactions per second when group commit is working, showing greatly improved throughput as the parallelism (number of concurrently running transactions) increases. The flat red and green lines show transactions per second with no group

      [Read more...]
    Debugging memory leaks in plugins with Valgrind

    I had an interesting IRC discussion the other day with Monty Taylor about what turned out to be a limitation in Valgrind with respect to debugging memory leaks in dynamically loaded plugins.

    Monty Taylor's original problem was with Drizzle, but as it turns out, it is common to all of the MySQL-derived code bases. When there is a memory leak from an allocation in a dynamically loaded plugin, Valgrind will detect the leak, but the part of the stack trace that is within the plugin shows up as an unhelpful three question marks "???":

    ==1287== 400 bytes in 4 blocks are definitely lost in loss record 5 of 8
    ==1287==    at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
    ==1287==    by 0x126A2186: ???
    ==1287==    by 0x7C8E01: ha_initialize_handlerton(st_plugin_int*) (handler.cc:429)
    ==1287==    by 0x88ADD6:

      [Read more...]
    MariaDB talk at the OpenSourceDays 2010 conference

    Earlier this month, I was at the OpenSourceDays 2010 conference, giving a talk on MariaDB (the slides from the talk are available).

    The talk went quite well, I think (though I probably talked way too fast, as I usually do; at least that meant I finished on time with plenty of room for questions).

    There was quite a bit of interest after the talk from many of the people who heard it. It was even reported on by the Danish IT media version2.dk (article in Danish).

    Especially interesting to me was to discuss with three people from Danish site komogvind.dk, who told me fascinating details about their work keeping a busy

      [Read more...]
    Conference time!

    It is conference time for me. I just came home from FOSDEM 2010, where we had a booth and I gave a talk. At the end of the month there will be a company meeting in Iceland for Monty Program, followed by Open Source Days 2010, where I will also be speaking. And then in April there is the MySQL User Conference. With two additional talks given at local user groups at the end of last year, I think I've about filled my quota for now. I feel quite fortunate that it turned out that I will not also be presenting at the UC! (I do not have a natural talent for speaking, and tend to need to spend quite a lot of time on preparations.)

    Having a booth at FOSDEM turned out really well I think, as I got to talk to

      [Read more...]
    Why I work on Free Software

    I happened upon this old LinuxJournal article about how the University of Zululand in South Africa used MySQL (http://www.mysql.com/) and other Free Software to make do with a 128 kbit (and later 768 kbit) internet connection for their staff and students.

    This made me remember the trip I made to another African country, Burkina Faso, 15 years ago: With the huge amount of work and numerous difficult obstacles facing my work on the MariaDB project, it can be

      [Read more...]
    RunVM, a tool for automated scripting inside virtual machines

    In the autumn, I wrote about some experiments I did using KVM and virtual machines to build and test MariaDB binary packages on a number of different platforms. Since then I have added some polish and refinements, and the system has now been running well for some time. We build and test packages for Debian (4 and 5), Ubuntu (8.04 to 10.04), CentOS 5, and generic Linux, on the amd64 and i386 architectures.

    To better control the startup and shutdown of the virtual machines, I created a small wrapper script around KVM called runvm. This wrapper encapsulates the steps needed to boot up a virtual machine, run a series of commands inside it, and shut it down gracefully afterwards. Some special care is taken in the script to ensure that

      [Read more...]
    Oracle speculations

    Planet MySQL has been abuzz with opinions for or against the acquisition of Sun (and in particular MySQL) by Oracle, but I do not have a strong opinion to chime in with in support of either group. The reason is that I do not know anything about antitrust law, which is the legal basis for the EC blocking or not blocking the deal; I also do not know what the alternative is to Oracle buying the MySQL part of Sun.

    However, that does not mean that I cannot join in the speculations about Oracle's reasons for wanting MySQL in the first place ;-)

    I think it is basically a matter of obtaining control over MySQL.

    The horror scenario for Oracle is that MySQL (or PostgreSQL or another Free Software program) does to the proprietary databases what Linux has done to the proprietary Unixes. Which is essentially to kill them,

      [Read more...]
    MariaDB Buildbot configuration file published

    I have now published the Buildbot configuration file that we use for our continuous integration tests in our Buildbot setup. Every push into main and development branches of MariaDB is built and tested on a range of platforms to catch and fix any problems early (and we also test MySQL releases before merging to easily see whether any new problems already existed in MySQL or were introduced by something specific to MariaDB).

    The configuration is included in the Tools for MariaDB Launchpad project.

    Now, the Buildbot configuration file is not something that most MariaDB users will

      [Read more...]
    Fixing a MariaDB package bug

    One of the things that I am really happy about in MariaDB is that we have our releases available as apt (and yum for Centos) repositories. This is largely thanks to being able to build this on the OurDelta package build infrastructure (which again builds on things like the Debian packaging scripts for MySQL).

    Something like the Debian apt-get package system (which is also used by Ubuntu) is one of the major innovations in the Free Software world in my opinion. Debian has spent many years refining this system to where it is today. Want to run the mysql client, but it isn't installed? Just try to run it on your local Ubuntu host:

        $ mysql
        The program 'mysql' can be found in the following packages:
         * mysql-client-5.0
         * mysql-client-5.1
        Try: sudo apt-get install <selected package>
        -bash: mysql:

      [Read more...]
    (Almost) one year of MariaDB

    Most of this year I have been working on the MariaDB project. So it is interesting to look back and see what has been achieved.

    For those that do not know, MariaDB is a project to create a community-oriented branch of the MySQL code base. We want MariaDB to be developed for the community, by the community, and driven by the needs of the community.

    Turns out that a lot has been achieved already:

    • We have had three releases (and a fourth is currently being prepared). The code is now getting close to a release candidate.
    • We have apt-able (and yum-able on CentOS/RHEL) repositories for the releases. These are based on the OurDelta infrastructure (scripts, build machines, etc). This means MariaDB installation and upgrade can be done the preferred way using the

      [Read more...]
    Lightning talks at Open Source Days 2008

    I am giving two lightning talks at Open Source Days on October 3-4. One on improving database I/O performance using clustered indexes with MySQL/InnoDB, and one on advanced profiling with OProfile.

    Hope to meet up with a lot of people there!

    Give it time!

    Our red Ananas apples look incredibly beautiful on this late-summer afternoon:

     

    I am very fond of trees. One of the reasons is that trees take time. To plant an apple tree is to plan many years into the future. And nobody expects a tree to grow up in a few months or years. Working with trees brings a sense of calm.

    Good software also takes time. Linux, Apache, Perl, Emacs, GCC: these are all projects that already existed many years ago, while I was still at university. Even KDE is almost 12 years old.

    I believe that an important factor in the many Open Source successes is that people have taken the time to do things properly. I have far too often seen how a lack of understanding of how long

      [Read more...]
    Back From Boston and the Red Hat Summit and FUDCON

    The second half of last week I attended the Red Hat Summit and FUDCon which Sun and MySQL were silver sponsors of.  The events were co-located at the Hynes convention center in Boston. 

    Although both events featured an impressive list of topics and tracks, other than the keynotes I spent the majority of my time meeting and talking to people.   One of my goals was to figure out how Sun can better work with Fedora to get more of our software into their distro. 


    A few key Fedorans: Max Spevak, Dennis Gilmore, Tom "Spot" Callaway, Jeremy Katz, Paul Frields, Jesse


      [Read more...]
    MySQL Conf08 - My Interview with Jennifer Venable of Red Hat

    Last Tuesday when I was walking the show floor at the MySQL conference, I ran into a familiar name,  Jennifer Venable.  I had never met Jennifer before but we had traded mails and had spoken on the phone.  This was about 18 months ago when we were negotiating the renewal of Sun's contract as an Authorized Distributor of Red Hat.  At that time, Jennifer was the Red Hat lawyer working on the contract.

    Well since that time Jennifer has escaped from the Red Hat legal ranks and has joined the business side where, as of a couple of months ago, she took over as head of the Red Hat Exchange. 

    Take a listen.

    My interview with Jennifer (12:25)


      [Read more...]

    Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

    Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.