There was once a big hoopla about the MySQL Storage Engine Architecture and how easy it was to just slot in some other method of storage instead of the provided ones. Over the years I’ve repeatedly mentioned how this wasn’t really …
I thought people may be interested to know what the PBMS patch
for MySQL actually patches, in case they should think this is a
major hack into the MySQL source code.
Almost all of the patch consists of the PBMS daemon
source code which is added to the "storage/pbms" folder in the
MySQL source code tree. Other than that here is a list of the
actual MySQL files touched and what the patch is for:
- sql/CMakeLists.txt:
  Added PBMS source directories to the header file search list.
  Lines added: 1.
- sql/handler.cc:
  Added PBMS server side API calls to check for longblob columns being modified or tables containing longblob columns being dropped or renamed. This is the guts of the PBMS patch.
  Lines added: 170.
- libmysql/CMakeLists.txt:
  Added PBMS API functions to the client API functions list and …
Version 2 of the PBMS daemon is now ready.
Here are the major changes introduced with this version:
- PBMS is fully integrated with MySQL 5.5:
  PBMS is now provided as a patch for MySQL 5.5 which simplifies installation and provides numerous benefits.
- All engines are "PBMS enabled":
  PBMS no longer requires that you have a "PBMS enabled" storage engine to be able to use PBMS.
- The MySQL client lib provides the PBMS client API:
  You no longer need to link your application to a separate PBMS lib to use the PBMS 'C' API.
- mysqldump understands PBMS BLOB URLs:
  When dumping tables or databases containing PBMS BLOB URLs mysqldump will dump the referenced BLOBs as binary data to a …
Just a reminder that I will be presenting a session on PBMS at
the MySQL Conference on Thursday, April 14 at 10:50.
The title is "BLOB Data And Thinking Out Side The Box" where I
will be talking about the new PBMS daemon with a focus on how it
handles replication and backup.
Hope to see you there!
Why use PBMS?
I have talked to people about why they should use PBMS to handle
BLOB data often enough, so I was surprised when someone asked me
where they could find this information and I discovered I had
never actually written it down anywhere. So here it
is.
If you are unfamiliar with PBMS, it stands for PrimeBase Media
Streaming. For details, please have a look at the home page for
BLOB Streaming.
Neither MySQL nor Drizzle is designed to handle BLOB
data efficiently. This is not a storage engine problem: most
storage engines can store BLOB data reasonably efficiently. The
fault lies in the server architecture itself. The problem is
that the BLOB data is transferred to and from the server as part
of the regular result set. To do this both the server and the
client must allocate a buffer large enough to hold the entire
BLOB. DBMSs …
A new version of PBMS for Drizzle has been pushed up to
Launchpad:
drizzle_pbmsV2
I have rewritten PBMS and changed the way that BLOBs are
referenced in order to make PBMS more flexible and to fix some of
its limitations. I have also removed some of the more confusing
parts of the code and reorganized it in an attempt to make it
easier for people to find their way around it.
So apart from some cosmetic changes, what is different?
Maybe the best answer would be to say what hasn't changed: the
user and engine API and the way in which the actual data is
stored on the disk remains pretty much unchanged, but everything
else has changed.
The best place to start is with the BLOB URL. The old URL looked
like this:
"~*1261157929~5-128-6147b252-0-0-37" the new URL looks …
A new release of the PrimeBase Media Streaming daemon is now
available for download at
http://www.blobstreaming.org .
This release doesn't contain any major new features, just some bug
fixes and a lot of housekeeping changes.
If you look at the download section on http://www.blobstreaming.org you will see that
there are now more packages that can be downloaded. I have
separated out different client side components from the PBMS
project and created separate Launchpad projects for each one. You
can see them listed in the "Related Links" side panel to the
right of this post.
- The "PBMS Client Library" facilitates communication with the PBMS daemon. This library is independent of the PBMS daemon's host server and can be used to communicate with a PBMS daemon hosted by …
If you haven't already heard, PBMS is now part of the Drizzle
tree.
Getting it there was a fair bit of work but not as much as I had
thought it would be. The process of getting it to work with
Drizzle and running it through Hudson has improved the code a
lot. It is amazing what some compilers will catch that others
will let slip by. I am now a firm believer in treating all compiler
warnings as errors.
I am just in the process of updating the PBMS plugin so that it
will build and install the PBMS client library (libpbmscl.so) as
well as the plugin. The PBMS client library is a standalone
library that can be used to access the PBMS daemon whether it is
running as part of MySQL or Drizzle. So a PBMS client library
built with Drizzle can be used to access a PBMS daemon running as
part of MySQL and vice-versa.
There is also a PHP extension for PBMS that is basically just a
wrapper for the library. Currently this is …
Some of you may have noticed that blob streaming has been merged into the main Drizzle tree recently. There are a few hooks inside the Drizzle kernel that PBMS uses, and everything else is just in the plugin.
For those not familiar with PBMS, it does two things: it provides a place (not in the table) for BLOBs to be stored (locally on disk or even out to S3), and it provides an HTTP interface to get and store BLOBs.
This means you can do really neat things such as have your BLOBs replicated, consistent and all those nice databasey things as well as easily access them in a scalable way (everybody knows how to cache HTTP).
This is a great addition to the AlsoSQL arsenal of Drizzle. I’m looking forward to it advancing and being adopted (now much easier since it’s in the main repository).
- …
Recently when talking to someone about PBMS it occurred to me
that I had been thinking about BLOBs in the traditional database
sense in that they were atomic blocks of data the content of
which the server knew nothing about. But with PBMS that need not
be the case.
The simplest enhancement would be to allow the client to send a
BLOB request to the PBMS daemon with an offset and size to just
return a chunk of the BLOB. Depending on the application and the
BLOB contents this may make perfectly good sense: why force the
client to retrieve the entire BLOB if it only wants part of
it?
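To make the offset-and-size idea concrete, here is a minimal sketch of what serving a partial BLOB request amounts to. It is only an illustration: the function name read_blob_chunk is hypothetical and not part of any PBMS API, and an in-memory stream stands in for a BLOB stored by the daemon.

```python
import io


def read_blob_chunk(blob, offset, size):
    """Return up to `size` bytes of `blob`, starting at byte `offset`.

    `blob` is any seekable binary stream. This is a hypothetical helper
    for illustration; it is not a PBMS function.
    """
    if offset < 0 or size < 0:
        raise ValueError("offset and size must be non-negative")
    blob.seek(offset)
    # read() returns fewer bytes than requested if the BLOB ends first.
    return blob.read(size)


# An in-memory stream stands in for a stored BLOB (an assumption).
blob = io.BytesIO(b"0123456789abcdef")
chunk = read_blob_chunk(blob, offset=4, size=6)    # middle of the BLOB
tail = read_blob_chunk(blob, offset=12, size=100)  # request runs past the end
```

The point of the sketch is that only the requested bytes ever need to be buffered, so neither side has to hold the entire BLOB in memory.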
A much more interesting idea would be to enable the user to
provide custom server side functions that they could run against
the BLOB.
So how would this work?
The PBMS daemon would provide its own "BLOB functions" plugin
API. The API would be quite simple where the plugin would
register the function names it …