Over in this post, Brian Aker talks about a popular social
networking application (Foursquare), which recently moved off a
classical durable transactional store (Postgres) and onto a
new-fangled NoSQL system (10gen MongoDB), and how MongoDB's very
poor durability is negatively impacting the end user experience
(e.g. Foursquare keeps losing a day's activity stream
history).
I posted this comment:
One of the things I took away from NoSQL Live in Boston is that
the standard "MySQL master/slave with replication" configuration
is even worse than most of the NoSQL solutions out there already,
not even having an "eventual consistency" guarantee, instead
having, at best, "wishful consistency".
I sometimes quip that MySQL 3.11 became popular a decade ago
because it was a very good fit for quickly and poorly written PHP3 apps.
MongoDB may be the modern …
Change is in the air this summer in Drizzle-land. One of our GSoC students, Vijay Samuel, has been hard at work replacing the options processing system we inherited from MySQL, my_getopt, with one based on boost::program_options. We've been merging his work into trunk for a while now, and he's made really excellent progress, so it's probably about time to point out how the new system will be different from the MySQL one. There are three main changes afoot here, and I'm actually pretty pleased with all three of them:
- Plugin Option Prefixing
- Dashes v. Underscores
- Config Files
A few of these changes are still in the middle of their transition, so I'm just going to describe the finished system, but we've essentially got all of the client programs and most of the plugins done at this point. …
Today marks my last day at Pythian. I have been at Pythian for almost three years. In those three years, Pythian’s already thriving MySQL practice has grown even more. I have worked with big and small clients alike, across many industries, managed a team of up to 4 DBAs, and learned a lot not just about MySQL, but about what my goals are in general.
Though I am leaving, everything I said in the blog post I made when I announced I was coming to Pythian still holds true. Pythian is a challenging environment, and I would recommend that anyone who finds their current DBA environment boring come to Pythian and experience what it is like to work here. I had lunch with Paul Vallee yesterday and we even discussed possible future collaborations (hence the title, a joke that I am “forking” off of Pythian). …
When working with databases, it is often necessary to import data or schemas. In this article we describe the process of importing data from a text file into a MySQL database, discuss the problems that can come up during a MySQL import and ways of solving them, and give a detailed description of the Data Import tool of dbForge Studio for MySQL, its capabilities, and its usage. What problems can be experienced when importing data from a text file?
To understand the problems one can experience when importing data from a text file, it is necessary to remember how text data is stored:
- data in text files is always formatted, but the formatting is free-form;
- tabular data in text files can be presented together with its header, i.e. with column names and certain formatting;
- the type of the data in text files cannot be determined automatically, which is why there …
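For context, a minimal plain-SQL sketch of the same task (the table, file name, and column list here are just assumptions for illustration; the article itself is about the dbForge Studio GUI tool):

-- hypothetical target table and CSV file, for illustration only
CREATE TABLE customers (
  id   INT NOT NULL,
  name VARCHAR(100),
  city VARCHAR(100),
  PRIMARY KEY (id)
);

-- load a comma-separated file whose first line holds the column names
LOAD DATA LOCAL INFILE 'customers.csv'
INTO TABLE customers
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(id, name, city);

Every choice made explicit here (field and line terminators, skipping the header line, mapping columns to the table) is exactly what an import wizard has to infer or ask about, which is where the free-form formatting problems listed above come from.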
MySQL 5.1.46 has this change:
Performance: While looking for the shortest index for a covering index scan, the optimizer did not consider the full row length for a clustered primary key, as in InnoDB. Secondary covering indexes will now be preferred, making full table scans less likely.
In other words, if you have a covering secondary index on * (which is quite common on m:n mapping tables), the optimizer will now use it rather than the PK. As I have spent my time getting indexing right, with PKs based on the primary access pattern and secondary keys on the secondary access pattern, I hereby do not welcome the new change that suddenly reverses the behavior in a late GA version.
Not good, when mysqldump queries end up taking 6 days instead of the previous half an hour. Not good at all.
Update: Oh, MariaDB has reverted this, from their …
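To make the scenario concrete, here is a sketch of the kind of m:n mapping table involved (the table and index names are assumptions for illustration, not from the post), plus one way to steer the optimizer back to the clustered PK if the new preference slows a full scan:

-- hypothetical mapping table: the PK serves the primary access pattern,
-- the secondary key the reverse lookup; both cover every column of the table
CREATE TABLE user_group (
  user_id  INT NOT NULL,
  group_id INT NOT NULL,
  PRIMARY KEY (user_id, group_id),
  KEY group_user (group_id, user_id)
) ENGINE=InnoDB;

-- after 5.1.46 a covering scan prefers the secondary index; an index hint
-- is one way to push a full-table read back onto the clustered PK
SELECT user_id, group_id
FROM user_group FORCE INDEX (PRIMARY);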
The North Texas MySQL Users Group is now a special interest group within the Dallas Oracle Users Group. As such, we can meet in Oracle's offices in Plano or Irving. In the past there has been demand for meetings in the northern part of the Metroplex and demand for meetings nearer the DFW Airport. So we can meet in either office or alternate between the two. Please state your preference by voting on the North Texas MySQL Users Group website.
At Kscope this year, I attended a half-day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.
See part 1 for the introduction and the discussion of
power and hardware. This part will go over the 2nd “P”,
partitioning. Learning about Oracle’s partitioning has gotten me
more interested in how MySQL’s partitioning works, and I do hope
that MySQL partitioning will develop to the level that Oracle
partitioning does, because Oracle’s partitioning looks very nice
(then again, that’s why it costs so much I guess).
Partition – …
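For comparison on the MySQL side, a minimal sketch of MySQL 5.1 RANGE partitioning (the table, columns, and boundaries are assumptions for illustration, not from the post):

-- hypothetical sales table partitioned by year
CREATE TABLE sales (
  sale_id   INT NOT NULL,
  sale_date DATE NOT NULL,
  amount    DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(sale_date)) (
  PARTITION p2008 VALUES LESS THAN (2009),
  PARTITION p2009 VALUES LESS THAN (2010),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

Queries that filter on sale_date can then be pruned down to the matching partition; Oracle layers additional schemes (interval, reference, composite partitioning) on the same pruning idea.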
At Kscope this year, I attended a half-day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.
These are my notes from the session, which include comparisons of
how Oracle works (which Maria gave) and how MySQL works (which I
researched to figure out the differences; that is why this blog
post took a month after the conference to write). Note that I am
not an expert on data warehousing in either Oracle or MySQL, so
these are more concepts to think about than hard-and-fast advice.
In some places, I still have questions, and I am happy to have
folks comment and contribute what they know.
One interesting point brought up:
Maria quoted someone (she said the name but I did not grab it)
from …
Does it matter if the end user knows what the database is?
Recently I got a wonderful view of a database from the end user
perspective.
While I was traveling I found a restaurant, and I decided
to let friends who live locally know where I was. Part
way through my food I got a message from a local friend that said
"Don't eat there, their food always makes people sick!"
"Always" is a word that I would think would be a little too
strong when applied to a restaurant, right?
Nope, the next day I got to feel the full truth of the
word.
A couple of days later I was telling some friends about this and a
local asked me, "Where was this? I want to avoid them." I wasn't
asked this question once; I was asked it a dozen times.
I don't know where the place is. Why is that? Because the system
I was using lost the entire day's worth of my data. I don't know
how …
At Kscope this year, I attended a half-day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. In that session, there was a section on how to determine I/O throughput for a system, because in data warehousing I/O operations per second (IOPS) are less important than I/O throughput (how much actual data goes through, not just how many reads/writes).
The section contained an Oracle-specific in-database tool, and a
standalone tool that can be used on many operating systems,
regardless of whether or not a database exists:
If Oracle is installed, run
DBMS_RESOURCE_MANAGER.CALIBRATE_IO:
SET SERVEROUTPUT ON
DECLARE
  lat  INTEGER;
  iops INTEGER;
  mbps INTEGER;
BEGIN
  -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO(<DISKS>, <MAX_LATENCY>, iops, mbps, lat);
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO (2, 10, iops, mbps, lat);
  …