Showing entries 1 to 24

Displaying posts with tag: Ideas

An interesting case in ORDER BY LIMIT optimization

Recently, I was asked about an interesting case in ORDER BY … LIMIT optimization. Consider a table

create table tbl (
  pk int not null primary key,   -- column types assumed for illustration
  col1 varchar(32),
  col2 int,
  KEY key1(col1, col2)
) engine=InnoDB;

Consider queries like:

  select * from tbl where col1='foo' and col2=123 order by pk limit 1;
  select * from tbl where col1='bar' and col2=123 order by pk limit 1;

These run nearly instantly. But, if one combines these two queries with col1='foo' and col1='bar' into one query with col1 IN ('foo','bar'):

  select * from tbl where col1 IN ('foo','bar') and col2=123 order by pk limit 1;

then the query is orders of magnitude slower than either of the queries with col1=const.
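A common workaround (a sketch, not from the post; it assumes pk is the table's primary key) is to split the IN back into the fast col1=const branches and merge their results:

```sql
-- Sketch: run each col1=const branch separately (each can read rows
-- in pk order directly from the index) and keep the smallest pk.
select * from (
  (select * from tbl where col1='foo' and col2=123 order by pk limit 1)
  union all
  (select * from tbl where col1='bar' and col2=123 order by pk limit 1)
) t
order by pk limit 1;
```

Each branch regains the cheap plan of the original single-const queries, and the outer sort only has to order at most two rows.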

The first thing to note when investigating is that the table uses the InnoDB engine,

  [Read more...]
Fun with Bugs #8 - what's wrong with Oracle's way of handling the public MySQL bugs database
Many people seem unhappy with the way Oracle develops MySQL. I am not one of them. I think very few really important things are missing, and in this post I'd like to concentrate on one of them: the internal and public bugs databases being out of sync in too many cases.

Let me quote myself to explain where the problem starts:

"Now the most important thing you should know about MySQL bug processing as it is done now in Oracle: when a bug is "Verified" and(!) considered serious enough, it is copied to the internal Oracle bugs database, and further processing, including status changes etc., is done there. All further comments on the public bug report are then copied to the internal bug report automatically, but no comments or

  [Read more...]
Here’s a quick way to Foresee if Replication Slave is ever going to catch up and When!

If you ever had a replication slave that is severely behind, you probably noticed that it's not catching up with a busy master at a steady pace. Instead, the "Seconds behind master" value goes up and down, so you can't really tell whether the replica is catching up or not by looking at just a few samples, unless those samples are spread well apart. And even then you can't tell at a glance when it is going to catch up.
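One way to make the trend visible is to log (timestamp, Seconds_Behind_Master) samples into a table and extrapolate linearly; a sketch, assuming a hypothetical lag_samples(sampled_at, seconds_behind) table you populate from SHOW SLAVE STATUS every few minutes:

```sql
-- Sketch: if lag shrank between the oldest and newest sample,
-- estimate when it reaches zero by linear extrapolation.
select t_last + interval
       s_last * timestampdiff(second, t_first, t_last)
              / (s_first - s_last) second as estimated_catchup_at
from (
  select min(sampled_at) as t_first,
         max(sampled_at) as t_last,
         (select seconds_behind from lag_samples
           order by sampled_at asc  limit 1) as s_first,
         (select seconds_behind from lag_samples
           order by sampled_at desc limit 1) as s_last
  from lag_samples
) s
where s_first > s_last;  -- only meaningful when the slave is gaining
```

The estimate is only as good as the assumption that the catch-up rate stays roughly constant, which is exactly why spread-apart samples matter.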

Normally, the "severely behind" situation should not happen, but we see it often in our consulting practice:

  • sometimes replication breaks and then needs to catch up after it is fixed,
  • other times a new replication slave is built from a backup, which is normally hours behind,
  • or, it could be that the replication slave became too slow to catch
  [Read more...]
help starting my sql ideas

The Brief
The Client’s mystery shopping programme consists of visits to each of the client’s ten locations, grouped into five different areas. Each location receives a single visit each month, but on occasion a visit may not take place.
On the completion of each visit, a questionnaire is completed by the mystery shopper that is used to rate the service they received during the visit. The completed questionnaire is assigned a score, which is calculated based on the number of points achieved over the number of points available, normally expressed as a percentage.
% Score = (Points Achieved × 100) / Points Available
PHP Development Test
The Client has requested that an end of year report is generated to summarise the data collected over the previous year. The report should be easy to understand and show as much useful information as possible.
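The score calculation above can be sketched in SQL (table and column names are assumptions, not from the brief):

```sql
-- Sketch: per-location monthly scores as a percentage.
select v.location_id,
       date_format(v.visit_date, '%Y-%m') as visit_month,
       round(sum(q.points_achieved) * 100.0
             / sum(q.points_available), 1) as score_pct
from visits v
join questionnaires q on q.visit_id = v.visit_id
group by v.location_id, visit_month;
```

Months with no visit simply produce no row, which matches the brief's note that a visit may occasionally not take place.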

  [Read more...]
Optimizing slow web pages with mk-query-digest

I don't use many tools in my consulting practice, but for the ones I do use, I try to know them as well as I can. I've been using mk-query-digest for almost as long as it has existed, yet it continues to surprise me in ways I couldn't imagine. This time I'd like to share a quick tip on how mk-query-digest lets you slice your data in a completely different way than it does by default.

Disclaimer: this only works when persistent connections or connection pools aren't used, and it is only accurate when a single MySQL connection is used during the execution of a request.

If you are seeking to reduce the load on the database server and [as a result] improve response times for some random user requests, you are usually

  [Read more...]
Tuning MySQL Server Settings
The default configuration file for MySQL is intended not to use many resources, because it is a general-purpose configuration. It does enough to have MySQL running happily with limited resources, catering to simple queries and small data sets. The configuration most definitely needs to be customized and tuned if you intend to run complex queries or have a good amount of data. Most of the tunings mentioned in this post apply to the MyISAM storage engine; I will soon be posting tunings applicable to the InnoDB storage engine. Getting started...
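As an illustration only (the numbers are placeholders, not recommendations; the right values depend entirely on your RAM and workload), a couple of MyISAM-relevant knobs can be changed at runtime:

```sql
-- Sketch: runtime tuning of two MyISAM-relevant settings.
set global key_buffer_size  = 256 * 1024 * 1024;  -- MyISAM index cache
set global table_open_cache = 512;                -- open table handles
show global status like 'Key_read%';              -- watch the cache hit ratio
```

Changes made with SET GLOBAL are lost on restart, so anything that proves useful should also go into the configuration file.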
Databases: Normalization or Denormalization. Which is the better technique?
This has long been a debate: which approach is more performance oriented, normalized databases or denormalized databases? This article is a step on my part to figure out the right strategy, because neither of these approaches can be rejected outright. I will start off by discussing the pros and cons of both approaches. Pros and cons of a normalized database design: normalized databases fare very well under conditions where the applications are write-intensive and the write load is higher than the read load. This is because of the following reasons: normalized tables are usually smaller and...
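To make the trade-off concrete, here is a minimal illustration (schemas invented for the example, not from the article):

```sql
-- Normalized: each fact stored once; writes touch small rows,
-- reads need a join.
create table customers (customer_id int primary key,
                        name varchar(64));
create table orders (order_id int primary key,
                     customer_id int,
                     amount decimal(10,2));

-- Denormalized: the name is copied into every order; reads skip
-- the join, but a name change must update many rows.
create table orders_wide (order_id int primary key,
                          customer_id int,
                          customer_name varchar(64),
                          amount decimal(10,2));
```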
The “Shadow Table” trick.
The need: often there is a requirement where data in a particular table has to be processed, and the processing might be slow, while the table is one that is used by your application extensively. For example, a logging table that logs page hits, or a table that needs an archiving operation performed on it. Archiving, processing, and aggregating records are all slow operations that can really bog down a website; combine that with the added overhead if the table that needs to have these operations performed is one that...
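The trick itself can be sketched in a few statements (table names are illustrative):

```sql
-- Sketch: swap an empty "shadow" copy in, then process the old
-- data at leisure. RENAME TABLE swaps both names atomically.
create table hits_shadow like hits;
rename table hits to hits_old, hits_shadow to hits;
-- ... archive / aggregate / process hits_old without blocking
--     the table the application keeps writing to ...
```

Because the multi-table RENAME is atomic, the application never sees a moment where the table is missing.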
Ideas for select all columns but one mysql stored procedure
Assume we have a table with 100 columns and we need to select all of them but one. The problem is the headache of actually typing out all 99 required column names! Solutions / ideas for the above problem are: Ignorance is bliss: select all (*) and ignore the column. Manually type the column names, or manage it with copy-paste! Create […]
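One further approach (a sketch; it assumes a database mydb, a table t, and an unwanted column named secret) is to build the column list from information_schema and run it as a prepared statement:

```sql
-- Sketch: generate "select <all columns but one> from mydb.t".
select concat('select ', group_concat(column_name), ' from mydb.t')
  into @sql
  from information_schema.columns
 where table_schema = 'mydb'
   and table_name   = 't'
   and column_name <> 'secret';

prepare stmt from @sql;
execute stmt;
deallocate prepare stmt;
```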
Active Cache for MySQL

One of the problems I have with Memcache is that the cache is passive: it only stores cached data. This means an application using Memcache has to have special logic to handle misses from the cache, and has to be careful updating the cache, since you may have multiple data modifications happening at the same time. Finally, you have to pay with increased latency when constructing items that have expired from the cache, while they could have been refreshed in the background. I think all of these problems could be solved with the concept of an active cache.

The idea behind Active Cache is very simple: for any data retrieval operation, the cache would actually know how to construct the object, so you will never get a miss from the cache unless there is an error. Of the existing tools, this probably maps best onto registering the jobs with Gearman.

The updates of the data in this case should go

  [Read more...]
Few more ideas for InnoDB features

As you can see, MySQL is doing great on InnoDB performance improvements, so we decided to concentrate more on additional InnoDB features which will make a difference.

Besides the ideas I put forward before, http://www.mysqlperformanceblog.com/2009/03/30/my-hot-list-for-next-innodb-features/ (and one of them, moving InnoDB tables between servers, is currently under development), we have a few more:

- Stick some InnoDB tables / indexes in the buffer pool, or set priorities for InnoDB tables. That means tables with higher priority will have a better chance of staying in the buffer pool than tables with lower priority. Link to blueprint

  [Read more...]
Adjusting Innodb for Memory resident workload

As larger and larger amounts of memory become common (512GB is something you can fit into a relatively commodity server these days), many customers choose to build their applications so that all or most of their database (frequently InnoDB) fits in memory.

If all tables fit in the InnoDB buffer pool, performance for reads will be quite good; however, writes will still suffer, because InnoDB does a lot of random IO during fuzzy checkpoint operations, which often becomes the bottleneck. This problem makes some customers who are not concerned with persistence run InnoDB off a RAM drive.

In fact, with relatively simple changes, InnoDB could be made to perform much better for memory-resident workloads, and we should consider fixing these issues in XtraDB.
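Even without engine changes, a crude warm-up can be done from SQL today; a sketch (table and index names are illustrative):

```sql
-- Sketch: touch the clustered index and each secondary index so
-- their pages are pulled into the buffer pool after a restart.
select count(*) from mytable force index (primary);   -- clustered index
select count(*) from mytable force index (idx_col1);  -- a secondary index
```

Without FORCE INDEX, the optimizer may satisfy count(*) from the smallest index alone, leaving the rest of the table cold.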

Preload: It is possible to preload all InnoDB tables (ibdata, .ibd files) on the

  [Read more...]
High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they're generally some kind of loggable event.

The next characteristic of a lot of these systems (real or planned) is the desire for "real-time" analysis.  Our customers often want their systems to provide the freshest data to their own clients, with no delays.

Finally, the analysis is usually multi-dimensional.  The typical user wants to be able to generate summaries and reports in many different ways on demand, often to support the functionality of the application as well as to provide reports to their clients.  Clicks by day, by customer, top ads by clicks, top ads by click-through ratio, and so on for dozens of different types
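One common building block for this kind of workload (a sketch; table and values are invented for illustration) is a pre-aggregated counter table maintained with INSERT ... ON DUPLICATE KEY UPDATE:

```sql
-- Sketch: one row per (day, ad), updated in place on every click.
create table clicks_daily (
  day    date not null,
  ad_id  int  not null,
  clicks int  not null default 0,
  primary key (day, ad_id)
) engine=InnoDB;

insert into clicks_daily (day, ad_id, clicks)
values (current_date, 42, 1)
on duplicate key update clicks = clicks + 1;

-- Reports then read the small summary table, not the raw event log:
select ad_id, sum(clicks) as total
  from clicks_daily
 where day >= current_date - interval 7 day
 group by ad_id
 order by total desc
 limit 10;
```

Each dimension you need to slice by ("by day, by customer, by ad") typically gets its own summary table keyed on that dimension.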

  [Read more...]
SHOW OPEN TABLES - what is in your table cache

One command few people realize exists is SHOW OPEN TABLES. It allows you to examine which tables you have open right now:

  mysql> SHOW OPEN TABLES FROM test;
  +----------+-------+--------+-------------+
  | Database | Table | In_use | Name_locked |
  +----------+-------+--------+-------------+
  | test     | a     |      3 |           0 |
  +----------+-------+--------+-------------+
  1 row in set (0.00 sec)

This command lists all non-temporary tables in the table cache, showing each of them only once (even if a table is opened more than

      [Read more...]
Thoughts on Innodb Incremental Backups

For normal InnoDB "hot" backups we use LVM or other snapshot-based technologies with pretty good success. However, incremental backups remain a problem.

First, why do you need incremental backups at all? Why not just take full backups daily? The answer is space: if you want to keep several generations to be able to restore to, keeping a huge number of full copies of a large database is not efficient, especially if it only changes by a couple of percent per day.

The solution MySQL offers, using the binary log, works in theory, but it is not overly useful in practice because it may take way too long to catch up using the binary log. Even if you have very light updates and can replay a full day of updates within an hour, it will take over 24 hours to cover a month's worth of binary logs... and quite typically you would have a much higher update

      [Read more...]
    Living with backups

Everyone does backups. Usually it's some nightly batch job that just dumps all MySQL tables into a text file or simply copies the binary files from the data directory to a safe location. Obviously both ways involve much more complex operations than my last sentence would suggest, but that is not important right now. Either way, the data is out and ready to save someone's life (or job, at least). Unfortunately, taking a backup does not come free. On the contrary, it's more like running very heavy queries against each table in the database when mysqldump is used, or reading a lot of data when copying physical files, so the price may actually be rather high. And the more effectively the server resources are utilized, the more of a problem that becomes.

    What happens when you try to get all the data?

    The most obvious answer is that it needs to be read,

      [Read more...]
    32-bit? Really?

    Is anyone out there actually still using 32-bit systems for new deployments? On purpose?

I know I occasionally see people who have 64-bit systems and have installed a 32-bit OS on them. They are one of two kinds: people who don't know what they are doing (or why their server is then having memory problems), or people who have 32-bit Linux installed on their laptops because there is no good 64-bit Flash Player plugin for Linux. (/me shoots Adobe in the face… it's called re-compile and release, please)

    The 32-bit laptop people I don’t care about - they are not yet hosting websites on their laptops while browsing YouTube. Yet.

    The others just need the learning.

    Which brings me back to… should we start to consider 32-bit a dinosaur sort of like AIX 4.1?

    (I should be clear here… I am honestly asking… not just

      [Read more...]
    A proposal for method of delivering optimizer bug fixes

Working on query optimizer bugs can be a rather frustrating experience. First, as soon as some query doesn't run as fast as it theoretically could, people will consider it a bug. On one hand that's great, you get a constant stream of user input; on the other hand you end up with a whole pile of "bugs" you can't hope to ever finish.

What's more frustrating is that even if you manage to create a fix for an optimizer bug, there are chances it won't be allowed into the next GA (currently 5.0.70) or approaching-GA (currently 5.1.30) release (GA is our term for "stable" or "release").

    The reason behind this is that most optimizer bugfixes cause the optimizer to pick different query plans, and there's no way to guarantee that the fix will be a change for the better for absolutely everyone. Experience shows that it is possible to have a query that hits two optimizer bugs/deficiencies at once in such a way that

      [Read more...]
    Pluggable storage engine interface needs to support table name resolution hooks

    I've started some attempts at coding ha_trace storage engine I've mentioned earlier. One of the first things that became apparent was that I needed a way to put a hook into table name resolution code so I can wrap tables into ha_trace objects.

The need to intercept table name resolution and do something other than looking at the .frm files is not unique to the call trace engine:

• Remote storage engines would also benefit:
  • NDB has a whole chunk of code that ships .frm files from one connected mysqld instance to another. It doesn't hook into name resolution; it ships table definitions proactively, which could be a nuisance if you just want a new mysqld node to connect and run a few queries
      [Read more...]
    erlang and MySQL Cluster

    Ok, in case you just showed up going “finally!”, I’m sorry to let you down - I haven’t yet ported NDB API to erlang.

    But I should - and I want to.

Brian was just talking about concurrent programming and mentioned erlang. It turns out that when I was starting off working on the NDB/Connectors, Elliot asked me if I'd considered erlang. Always up for learning a new language, I did a quick check, but there were no swig bindings, so I put it off until later.

    Then later came and I still hadn’t written any code, so I found a book online and started reading. I have to say erlang is very cool.

    There is no way on earth I can wrap the NDB API in any meaningful way using erlang. However, I might be able to reimplement the wire protocol in erlang and have the resulting thing be way more

      [Read more...]
    An idea: create ha_trace tracing storage engine

Our experience in solving query optimizer problems shows that a good chunk of optimization problems come down to an incorrect choice of join order. The most frequent causes of the problems are:

  • The table engine returns index statistics or records-in-range numbers that are very far from reality;
  • The WHERE clause has high-selectivity conditions for one table and low-selectivity conditions for the other. The optimizer is not aware of this and chooses a poor join order;
  • Bugs or shortcomings in the optimizer or the table engine.

At the moment, investigation of those kinds of problems is hard and time-consuming:

  • There is no easy way to find out what the storage engine has returned to the optimizer from records-in-range calls. One has to manually repeat the steps taken by equality propagation, constant table detection and other query rewrites, construct the table's condition and
      [Read more...]
    Nested-loops join speedup idea promoted to WL task

The idea for speeding up nested-loops joins a little, which I've mentioned earlier, has now been promoted into a worklog entry. It is now known as WL#3724 "Short-Cutting Join Execution: Speeding up star queries" and its text is publicly available at MySQLForge.

    At the moment there is only a short description, but hopefully Martin Hansson (the assigned developer) will add more content there.

    Backing up Data, Thoughts on an Alternate Ending
Let us recap for the audience how to back up MySQL, or at least my favorite methods :)

1) Use mysqldump
2) Shut down the database, or use an LVM snapshot method to back up the physical bits.
3) Use replication in combination with method 2.

    And the winner for me goes to number three. So why is that?

It's cheap, and in a lot of cases I see, it's very possible. This is the point where someone says, "but I have 20 terabytes!" I am sure you do, and in that case you can ignore the rest of what I am about to say.
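For method 3, the quiesce step on a dedicated backup slave can be sketched as follows (the statements are standard MySQL; the file copy itself happens outside the server):

```sql
-- Sketch: quiesce a backup slave so the data directory is consistent.
stop slave sql_thread;            -- stop applying replication events
flush tables with read lock;      -- flush everything, block writes
-- ... take the LVM snapshot / copy the data directory here ...
unlock tables;
start slave sql_thread;           -- resume catching up with the master
```

The master keeps serving traffic throughout; only the backup slave pauses, and it catches up again once released.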

One Linux box with a bunch of disks, with either one or multiple instances all acting as slaves, is a cheap solution. Once you have a replication copy, you can make copies of the database off to tape or just create vectors (aka Monday, Tuesday, Wednesday... Friday one week ago) from the physical database. With this method you get live

      [Read more...]
Microsoft pays $115 million for Patent Infringement
    A Texas jury has awarded $133 million in damages to David Colvin, after finding Microsoft and Autodesk guilty of infringing upon Colvin’s two software patents for software antipiracy protection. Colvin’s company, z4 Technologies Inc., filed patents for ‘passwords and codes assigned to individual software copies to prevent unauthorized copies.’ Microsoft was ordered to pay $115 [...]

    Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

    Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.