Showing entries 26851 to 26860 of 44123
« 10 Newer Entries | 10 Older Entries »
Swedish Pirate Party takes seat in Europarl!

"Rick Falkvinge: Today is a good day for epic winnage." (Facebook status of the chairman of the Swedish Pirate Party, posted 11 hours ago.)

The Swedish Pirate Party (the first of the many national Pirate Parties popping up) won its first seat in the European Parliament tonight with more than 7% of the vote. It is bordering on a second seat, with some votes still left to count. In percentage terms, the party drove right past three long-established parties from the Swedish national parliament.

This is a historic moment for the copyright and even the civil liberties movements. For years I have personally supported the EFFish approach (and been a member of the Finnish equivalent, EFFI) of lobbying all political parties with rational arguments about how good copyright, patent and civil liberties legislation will …

[Read more]
Soothsaying SQL Standardization Stuff

In an earlier blog posting “SQL Standards, ANSI committees, and Sun”, I (Peter Gulutzan) talked about our prospects for joining the American committee charged with database standards, which we typically call “ANSI” although that’s not the formal name (and by the way the formal name is about to change, but I’ll chat about organization some other time).

Well, I’m now Sun’s official voting delegate to the committee. There are also three “alternate” delegates from other parts of Sun; I’ll loosely categorize them as advocates from our PostgreSQL-ophile and Java / Java DB interest groupings. Mostly my concern is the MySQL side of things.

The committee holds frequent meetings by telephone conference, and infrequent ones in personal get-togethers. I’ve just finished attending one of the lengthier meetings. I …

[Read more]
Extended covering indexes

As you can probably guess, I’m catching up on reading my blogs. I’ve just read with interest about TokuDB’s multiple clustering indexes. It’s kind of an obvious thought, once someone has pointed it out to you. I’ve only been around products that insist there can be Only One clustered index (and then there’s ScaleDB, who say “think differently already”).

Anyway, we already know that quite a few database products use clustered indexes and, to avoid update overhead, require every non-clustered index to store the clustered key as the “pointer” for row lookups. Thus there are “hidden columns” which are present at the leaf nodes, but not the non-leaf nodes, of secondary indexes. Why not take that idea and run with it a little? Here’s what I mean:

[Read more]
The cache-oblivious algorithms inside Tokutek’s TokuDB

Tokutek have said they are working towards explaining their indexing algorithms. I spoke to some of the Tokutek people over the last 14 months or so about this, although I didn’t really start to pay attention until the beginning of the year. While Vadim, Peter and I were writing our blog post on TokuDB, I asked them to provide scholarly references, and they did, but warned me it would be dense reading, in part because it’s so academic. Mark Callaghan also told me he had gotten them to walk him through the math behind their indexing algorithm and found it hard.

Here’s a blog post with links to the research behind their work. I’m happy to say that after …

[Read more]
An ongoing thread of blogs on MySQL performance

In the last six months, things have gotten much busier in the world of MySQL performance. That is, making MySQL and InnoDB scale faster out of the box. This is a great trend and I hope it keeps going. At this point I’m fighting to find enough time to read about what people are doing; I can’t keep up fast enough to actually understand the improvements. That’s also good.

The blogs that are posting the most news and analysis are MySQL Performance Blog, Mikael Ronstrom’s blog, DimitriK’s blog, and Mark Callaghan’s blog.

WaffleGrid: Cream Benchmarks, stable and delivering a 3x boost

Let's get down to how the latest version of Waffle Grid performs.

Starting off simple, let's look at the difference between the Waffle Grid modes. As mentioned before, the LRU mode is the “classic” Waffle Grid setup. A page is put into memcached when it is removed from the buffer pool via the LRU process. When a page is retrieved from memcached, it is expired so it's no longer valid. In the new “Non-LRU” mode, when a page is read from disk, the page is placed in memcached. When a dirty page is flushed to disk, this page is overwritten in memcached. So how do the different modes perform?

4GB Memcached, Read Ahead Enabled    TPM         % Increase
No Waffle                            3245.79     Baseline
Waffle LRU                           10731.34
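
The two modes described above can be sketched as a toy model. This is only an illustration under stated assumptions, not Waffle Grid's actual code: the class name `ToyBufferPool`, its methods, and the plain dict standing in for memcached are all hypothetical.

```python
from collections import OrderedDict


class ToyBufferPool:
    """Toy buffer pool with a second-level cache (a dict standing in for memcached).

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self, capacity, disk, mode="lru"):
        self.capacity = capacity
        self.disk = disk            # page_id -> page contents
        self.mode = mode            # "lru" (classic) or "non_lru"
        self.pool = OrderedDict()   # least recently used page first
        self.remote = {}            # stand-in for memcached
        self.disk_reads = 0

    def read(self, page_id):
        if page_id in self.pool:
            self.pool.move_to_end(page_id)
            return self.pool[page_id]
        if page_id in self.remote:
            if self.mode == "lru":
                # LRU mode: a page retrieved from the cache is expired.
                page = self.remote.pop(page_id)
            else:
                page = self.remote[page_id]
        else:
            page = self.disk[page_id]
            self.disk_reads += 1
            if self.mode == "non_lru":
                # Non-LRU mode: a page read from disk is placed in the cache.
                self.remote[page_id] = page
        self._admit(page_id, page)
        return page

    def flush(self, page_id, new_contents):
        """Write a dirty page that is currently in the pool back to disk."""
        if page_id not in self.pool:
            raise KeyError(page_id)
        self.pool[page_id] = new_contents
        self.disk[page_id] = new_contents
        if self.mode == "non_lru":
            # Non-LRU mode: the flushed page overwrites the cached copy.
            self.remote[page_id] = new_contents

    def _admit(self, page_id, page):
        self.pool[page_id] = page
        if len(self.pool) > self.capacity:
            victim_id, victim = self.pool.popitem(last=False)
            if self.mode == "lru":
                # LRU mode: a page evicted from the pool goes to the cache.
                self.remote[victim_id] = victim
```

In this model the LRU mode only ever holds a page in one place at a time, while the Non-LRU mode keeps a cached copy of every page it has read and must keep that copy current on flush.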
[Read more]
The Argument For & Against Map/Reduce

The last 24 months have seen the introduction of Map/Reduce functionality into the data processing arena in various forms. Map/Reduce is a framework for developing scalable data processing functionality, and was popularized by Google (see this earlier post).
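
For readers new to the model, here is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the example. The function names are hypothetical and it assumes everything fits in memory; a real framework distributes these phases across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```

Because each `map_phase` call sees only one document and each `reduce_phase` call sees only one key, both steps can in principle run in parallel, which is where the scalability comes from.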

Pure players like Hadoop are starting to find their own niche, helped by organizations such as Cloudera. However, there have been a number of arguments for and against Map/Reduce functionality inside the database.

These arguments are now really moot. Customers have recognized value in Map/Reduce, prompting some (b)leading edge database vendors to introduce such …

[Read more]
Tool of the day: ack – better than grep

I’m decently familiar with grep, so I can usually make it do what I want. I frequently need to search, for instance, the MySQL source code for certain pattern strings, and grep makes that so much easier. But Akash pointed out ack to me, which has the specific tagline “better than grep” (it even has the domain), and I reckon it does live up to that. Win! It’s written in pure Perl and is very easy to install (it doesn’t even use CPAN if you don’t want it to).

It recurses into subdirs by default, while ignoring stuff like revision control and binary files. You can search specific types of files through a symbolic name rather than by specifying regexes. And it has colour highlighting, and simply uses the familiar Perl regexes for its pattern matching rather than funky subsets of which there are many distinct ones…
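
To make those defaults concrete, here is a rough Python sketch of the same behaviour: recursive search, skipping revision control directories and binary files, with file types selected by a symbolic name. It is not ack itself; the `search` function, the `FILE_TYPES` table and the null-byte binary check are assumptions for illustration, and Python's `re` syntax is close to but not identical to Perl's.

```python
import os
import re

# Hypothetical tables for illustration; ack's real lists are longer.
IGNORED_DIRS = {".git", ".svn", ".hg", "CVS"}
FILE_TYPES = {
    "cc": (".c", ".h", ".cc", ".cpp"),
    "perl": (".pl", ".pm"),
    "python": (".py",),
}


def search(root, pattern, file_type=None):
    """Yield (path, line_number, line) for each matching line under root."""
    regex = re.compile(pattern)
    extensions = FILE_TYPES[file_type] if file_type else None
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune revision control directories so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in sorted(filenames):
            if extensions and not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue
            if b"\0" in data:
                continue  # crude heuristic: treat files with null bytes as binary
            text = data.decode("utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    yield path, number, line
```

A call such as `search("mysql-src", r"\bhandler\b", file_type="cc")` (directory name hypothetical) would then walk only the C and C++ sources.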

Simulating indexes in Hadoop

You should not try to use Hadoop as a “drop-in” replacement for your current (R)DBMS. That said, it is still possible to utilize the power of cluster computing while circumventing its weaknesses when it comes to ad-hoc or real-time queries. We use Hadoop as an on-line system tightly integrated with our application, and use it for both long-running analytical queries and ad-hoc style queries.

In the mindset of a “traditional” database engineer, one of the biggest concerns about Hadoop, or MapReduce in conjunction with a distributed file system in general, is the lack of indexes. Setting aside that the “(R)DBMS vs MapReduce” debate is superfluous most of the time and sometimes becomes almost religious, the absence of anything like an index is one of the biggest hurdles you face when migrating data from a traditional DBMS.
Even …

[Read more]