Earlier this week we all read GigaOM's article with the title "Why the days are numbered for Hadoop as we know it". I know GigaOM likes to stir up controversy sometimes, and we all remember some other unforgettable piece, but there is something behind it...
Hadoop today (after SOA not so long ago) is one of the worst cases of an abused buzzword ever known to man. It's everything, everywhere, can cure illnesses and do "big-data" at the same time! Wow! In reality, Hadoop is a software framework that supports data-intensive distributed applications, derived from Google's MapReduce and Google File System (GFS) papers.
My take from the article is this: Hadoop is a foundational, low-level platform. I used the word …
I want to propose something that is mere imagination at this point. Database vendors, whether closed or open source, make a lot of promises that they have created an RDBMS that can scale across nodes: a data storage service that can expand with user needs and also deal with hardware failures. We have the traditional players:
- Oracle has RAC
- MySQL has their cluster product
- Postgres has synchronous replication
- SQL Server has replication and fail-over clustering
The sad truth is this: none of these are fully there yet. Most of them trade the ability to transparently scale and fail over for availability. These solutions do not require major application refactoring. Oracle RAC uses a shared disk, which means no I/O scaling and a single point of failure (SPF). MySQL Cluster has heterogeneous nodes (although you can remove all SPFs), and it does not perform foreign key …
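To make the engine-level angle concrete, here is a minimal, hypothetical sketch (the table is invented, not from the post) of what opting into MySQL Cluster looks like at the schema level: clustering is chosen per table via the NDB storage engine, and the NDB versions current at the time parsed but did not enforce FOREIGN KEY constraints.

-- Hypothetical table; switching the ENGINE is all that marks it as clustered.
CREATE TABLE orders (
  id          INT UNSIGNED NOT NULL PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL
) ENGINE=NDBCLUSTER;

-- A FOREIGN KEY (customer_id) REFERENCES customers (id) clause would be
-- accepted by the parser here but not enforced by NDB of that era.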
[Read more]

At the recent South East LinuxFest in June 2012 I gave two MySQL presentations.
The first was on Explaining the MySQL Explain. This presentation details the MySQL Query Execution Plan (QEP) of an SQL statement and how to understand and interpret the information from the EXPLAIN command. Also discussed are additional commands and tools that provide supplementary information. These are essential skills that are used daily in production operations. Download Presentation (PDF)
More detailed information about EXPLAIN and associated commands is available …
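As a quick taste of what the presentation covers, here is a minimal, hypothetical example (the tables and queries are invented, not from the slides) of generating a QEP and then asking MySQL for the statement as the optimizer rewrote it:

-- Hypothetical query: the output columns (id, select_type, table, type, key,
-- rows, Extra) are what you learn to interpret.
EXPLAIN SELECT c.name, COUNT(*)
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.name;

-- One of the associated commands of that era: EXPLAIN EXTENDED followed by
-- SHOW WARNINGS displays the optimizer-rewritten statement.
EXPLAIN EXTENDED SELECT c.name FROM customers c WHERE c.id = 42;
SHOW WARNINGS;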
[Read more]

Sometimes you need to debug your Sphinx indexes to know what's inside them: is the index okay, is the document you are trying to find actually there? In such cases the indextool utility can be very handy, as it gathers information directly from the index files even when searchd is not started. Here are a few examples of indextool usage:
Checking index consistency
One of the most important functions of indextool is checking index consistency. You will need the sphinx config file and the index files.

/path/to/indextool -c sphinx.conf --check my_sphinx_index
This will check my_sphinx_index for consistency between the document list, hit list, positions and other internal sphinx index structures. Please note that indextool only checks disk indexes (starting from 2.0.2 it can also check the on-disk part of Real-Time indexes, but not the in-memory part). The usual output for a healthy index looks like …
Forge was intended to be a community wiki resource for sharing information with each other. However, over the last few years, we have seen Forge used less and less by the MySQL Community, and more and more by spammers. What happened?
MySQL Worklogs and MySQL Internals documentation will be moved to dev.mysql.com, with new anti-spam measures in place.
The MySQL Wiki, which was the primary focus of forge.mysql.com, has been migrated to https://wikis.oracle.com/display/mysql
MySQL Forge will reach EOL on August 1st, 2012.
Effective MySQL: Backup and Recovery by Ronald Bradford
Effective MySQL: Backup and Recovery by Ronald Bradford is hot off the press! This is the second book in the series and I hope to have a review shortly. Meanwhile, keep an eye out for it and for the first book in the series, Effective MySQL: Optimizing SQL Statements.
Ronald’s style is concise and fluff free.
…
OSCON 2012 is just a few days away! I will be in Portland for OSCON as well as the Community Leadership Summit. I look forward to meeting with everyone.
OSCON has another great lineup of database sessions; below are just a few:
1:30pm Monday, 07/16/2012 - MySQL Cluster and NoSQL …
I learned how to use a computer on DOS and Windows. My first programming projects were written in QBASIC and my first Web applications were written in VB using ASP on Windows 2000. The first job where I made decent money was developing a SQL Server-based application. I bought my first car, an engagement ring, and a honeymoon with money from making software on Windows. Needless to say, I found a lot of intellectual and financial fulfillment from Windows over the years.
That first real job also allowed me flexibility in what technology I could employ, and I helped implement a feature using Redis on top of Ubuntu. This was a fun time, because my company basically paid me to study a new technology and to gain experience using it. On my own, I began to use Linux and to embrace open-source ideas, one of which is that the consumer is also the producer. I changed my mindset about what it means to use software: for open-source projects, it often …
[Read more]

Can OLTP database workloads use Amazon S3 as primary storage? Now they can, thanks to the Cloud Storage Engine (ClouSE), but the question is: how fast?
- ClouSE on “across the street” vs. “across the continent” cloud storage
- ClouSE vs InnoDB
- ClouSE benefits
To answer the question about performance, we decided to run the db_STRESS benchmark on a MySQL database in Amazon EC2. We compared three configurations:
- “Across the street storage”: ClouSE with data stored in S3 in the same …
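For readers who have not seen ClouSE before, here is a minimal, hypothetical sketch of putting a table onto cloud storage, assuming ClouSE is installed as a pluggable storage engine and registers under the name CLOUSE (the engine name and the table are my assumptions, not taken from the post or the benchmark):

-- Check which storage engines are available (engine name assumed).
SHOW ENGINES;

-- Hypothetical OLTP-style table whose data would live in S3 via ClouSE.
CREATE TABLE accounts (
  id      INT UNSIGNED NOT NULL PRIMARY KEY,
  balance DECIMAL(12,2) NOT NULL
) ENGINE=CLOUSE;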
I love our community.
Not long after posting my update on ps_helper, I had a couple of comments about the formatting of values within the output. Daniël van Eeden suggested that I add a couple of stored functions for formatting byte- and time-based values.
Of course, this was a great idea – not least for myself, because I no longer have to worry about how to format certain columns in the output.
I’ve added the following:
format_bytes()
format_time()
format_path()
…
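As a hedged usage sketch (the function names come from the post; the example values, the exact output strings, the picosecond-based timer assumption, and the 5.6 summary table are mine), the new helpers can be called like any other stored function in the ps_helper schema:

-- Raw byte and timer values become human-readable strings; the actual
-- output formatting may differ from the comments below.
SELECT format_bytes(1073741824);        -- roughly '1.00 GiB'
SELECT format_time(1867001512000);      -- performance_schema timers are in
                                        -- picoseconds, so roughly '1.87 s'

-- Hypothetical use against a MySQL 5.6 performance_schema summary table:
SELECT event_name,
       format_time(sum_timer_wait)            AS total_latency,
       format_bytes(sum_number_of_bytes_read) AS bytes_read
  FROM performance_schema.file_summary_by_event_name
 ORDER BY sum_timer_wait DESC
 LIMIT 5;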