Planet MySQL
Showing entries 1 to 10 of 12 2 Older Entries

Displaying posts with tag: streaming

A new big data structure for streaming counters - bit length encoding
One of the challenges of big data is that it is, well, big. Computers are optimized for math on 64 bits or less. Anything bigger requires extra steps to work with the data, which is very expensive. This is why a BIGINT is 64 bits. In MySQL, DECIMAL can store more than 64 bits of data using fixed precision. Large numbers can use FLOAT or DECIMAL, but FLOAT is lossy and DECIMAL has a fixed maximum precision.

DECIMAL is an expensive encoding. Fixed-precision math is expensive, and you eventually run out of precision, at which point you can't store any more data, right?

What happens when you want to store a counter that is bigger than the maximum DECIMAL?  FLOAT is lossy.  What if you need an /exact/ count of a very big number without using very much space?
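To make the limits above concrete, here is a minimal Python sketch. It is not the encoding the post goes on to describe; it only illustrates where a 64-bit double stops counting exactly and where a 65-digit decimal runs out. The helper names are hypothetical, and the 65-digit figure assumes MySQL's documented maximum DECIMAL precision.

```python
from decimal import Context, Inexact

# Largest signed 64-bit value, the ceiling of a signed BIGINT.
INT64_MAX = 2**63 - 1


def float_loses_increment(n: int) -> bool:
    """Return True if adding 1 to n is lost once n is held as a 64-bit double."""
    return float(n) + 1.0 == float(n)


def fits_in_65_digits(n: int) -> bool:
    """Return True if n is exactly representable with 65 significant digits.

    Assumption: 65 digits mirrors MySQL's maximum DECIMAL precision.
    """
    ctx = Context(prec=65, traps=[Inexact])
    try:
        ctx.create_decimal(n)  # raises Inexact if any nonzero digit is dropped
        return True
    except Inexact:
        return False


if __name__ == "__main__":
    # A double has a 53-bit significand, so increments vanish from 2**53 on.
    print(float_loses_increment(2**53 - 1), float_loses_increment(2**53))
    # A 65-digit decimal holds 10**65 - 1 but cannot hold 10**65 + 1 exactly.
    print(fits_in_65_digits(10**65 - 1), fits_in_65_digits(10**65 + 1))
    # Arbitrary-precision integers stay exact, at the cost of more storage.
    print((2**200 + 1) - 2**200)
```

The last line shows the trade-off the post is pointing at: an exact count is always possible if you are willing to spend the space, so the interesting question is how to keep it exact while keeping it small.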

I've developed an encoding method that allows you to store very large counters in a very small amount of space. It takes
  [Read more...]
Real-time streaming data aggregation

Dear Kettle users,

Most of you usually use a data integration engine to process data in a batch-oriented way.  Pentaho Data Integration (Kettle) is typically deployed to run monthly, nightly, or hourly workloads.  Sometimes folks run micro-batches of work every minute or so.  However, it’s less well known that our beloved transformation engine can also be used to stream data indefinitely (never ending) from a source to a target.  This sort of data integration is sometimes referred to as “streaming”, “real-time”, “near real-time”, “continuous” and so on.  Typical examples of situations where you have a never-ending supply of data that needs to be processed the instant it becomes available are JMS (Java Message Service), RDBMS log sniffing, on-line fraud

  [Read more...]
OpenSQL Camp Europe and FrOSCon: A summary

With OpenSQL Camp and FrOSCon being over for almost a week now, it's time to come up with a short summary. I traveled home on Monday morning and then took Tuesday off, so I had some catching up to do...

As in past years, FrOSCon rocked again! According to the closing keynote, they had around 1,500 (unique) visitors, and I had a great time there. I really enjoyed meeting all the old and new faces of the various Open Source communities. The lineup of speakers was excellent, and Jon "maddog" Hall's keynote about "Free and Open Source Software in the Developing World" was quite insightful and inspiring.

Most of the time I was busy with

  [Read more...]
Live video stream from OpenSQL Camp

Greetings from Sankt Augustin, Germany! I arrived by train today and just returned from the venue of FrOSCon, which starts tomorrow. The organizers are still busy with the preparations, but things already seem to be in good shape.

It was a mild and sunny evening today. Hopefully it will be the same tomorrow again, so we can enjoy a relaxed BBQ outside! The social event at FrOSCon is always a nice opportunity to meet and talk with fellow open source enthusiasts, users and developers.

And finally some good news for those of you who can't make it to FrOSCon this year: there will be live video streams from selected lecture rooms! So you will be able to attend the OpenSQL Camp sessions virtually - just head over to http://live.froscon.org/ and select room "HS6". It'll be interesting to see how this will work out.

PBMS in Drizzle

Some of you may have noticed that blob streaming has been merged into the main Drizzle tree recently. There are a few hooks inside the Drizzle kernel that PBMS uses, and everything else is just in the plugin.

For those not familiar with PBMS, it does two things: it provides a place (not in the table) for BLOBs to be stored (locally on disk or even out to S3), and it provides an HTTP interface to get and store BLOBs.

This means you can do really neat things such as have your BLOBs replicated, consistent and all those nice databasey things as well as easily access them in a scalable way (everybody knows how to cache HTTP).
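As a rough illustration of what "an HTTP interface to get and store BLOBs" means for a client, the Python sketch below builds (but does not send) a store request and a fetch request. The URL, port, and path layout are hypothetical placeholders, not the actual PBMS URL scheme; the point is only that ordinary HTTP verbs carry the BLOB bytes.

```python
from urllib.request import Request

# Hypothetical endpoint: host, port, and path are placeholders for illustration,
# not the real PBMS URL layout.
BLOB_URL = "http://localhost:8080/example_db/example_blob"


def build_store_request(data: bytes, url: str = BLOB_URL) -> Request:
    """Build an HTTP PUT request whose body is the raw BLOB bytes."""
    return Request(url, data=data, method="PUT")


def build_fetch_request(url: str = BLOB_URL) -> Request:
    """Build an HTTP GET request for the same BLOB URL."""
    return Request(url, method="GET")


put_req = build_store_request(b"This is a test BLOB")
get_req = build_fetch_request()

# Nothing is sent over the network here; we only inspect the requests.
print(put_req.get_method(), put_req.full_url, len(put_req.data))
print(get_req.get_method(), get_req.full_url)
```

Because the fetch side is a plain GET on a URL, any standard HTTP cache or proxy can sit in front of it, which is the scalability point made above.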

This is a great addition to the AlsoSQL arsenal of Drizzle. I’m looking forward to it advancing and being adopted (now much easier since it’s in the main repository).

  [Read more...]
A join I/O manipulator for IOStream
I started playing around with protobuf when doing some stuff in Drizzle (more about that later), and since the examples were using IOStream, the table reader and writer that Brian wrote also use IOStreams. Now, IOStreams is pretty powerful, but it can be a pain to use, so of course I started tossing together some utilities to make it easier to work with.

Having been a serious Perl addict for 20 years, I of course started missing a lot of nice functions for manipulating strings, and the most immediate one is join, so I wrote a C++ IOStream manipulator to join the elements of an arbitrary sequence and output them to an std::ostream.

In this case, since the I/O Manipulator takes arguments, it has to be written as a

  [Read more...]
MySQL Conference Liveblogging: Introduction To The BLOB Streaming Project (Wednesday 3:00PM)
  • Paul McCullagh presents
  • BLOB
    • invented by Jim Starkey
    • Basic Large OBject
    • Binary Large OBject
    • photos, films, mp4 files, pdfs, etc
  • how MySQL handles BLOBs
    • mysql client send buffer -> receive buffer on the server (max_allowed_packet)
    • streaming a BLOB
      • continuous data stream
      • stream BLOB data directly in and out of the database
      • store BLOBs of any size (>4GB) in the database
      • create a scalable back-end that can handle any throughput and storage requirements. Wouldn't need to know in advance how big the database will get
      • provide an open system that can be used by all engines
      • provide extensions for BLOB streaming to existing MySQL clients
  • why put BLOBs
  [Read more...]
New PBXT/MyBS release enables JDBC-based BLOB streaming!
This is quite a milestone for me! At last it is possible to actually do some practical work with the BLOB streaming engine (MyBS)!

For this release I have completed changes to the MySQL Connector/J 5.0.7, to allow BLOB data to be transparently stored and retrieved from the MyBS BLOB repository. The new version of the driver is called MySQL Connector/J SE (streaming enabled).

Uploading a BLOB is as simple as using setBinaryStream() or setBlob() on INSERT or UPDATE. By using getBinaryStream() or getBlob() after a SELECT you get direct access to the data stream coming from the repository. More information and some examples are provided in the documentation at: http://www.blobstreaming.org/documentation.

To try this out you need to install the latest versions of PBXT and MyBS. Both are
  [Read more...]
BLOB streaming engine (MyBS), version 0.5 Alpha released!
With some effort just before my holiday, I have managed to complete the release of the next version of MyBS, the BLOB streaming engine for MySQL.

This version includes all the basic functionality required to stream BLOB data in and out of MySQL tables.

The main features are:
  • Uploading of BLOB data directly into the database using HTTP PUT or GET methods.

  • Downloading of BLOB data directly from the database using HTTP GET.

  • BLOB size may exceed 4GB - theoretical BLOB size limit of 256 Terabytes.

  • BLOBs are stored in a repository which manages references from other storage engine tables.

  • BLOBs are referenced by a URL.

  • URLs referencing BLOBs in the repository have a unique access code, for security.

  • The theoretical maximum repository size is 4 Zettabytes (2^72
  [Read more...]
The MyBS Engine and the BLOB Repository
After some consideration I have decided to move the BLOB repository from PBXT to MyBS (§). This has the advantage that any engine that does not have its own BLOB repository (or is otherwise not suitable for storing large amounts of BLOB data) can reference BLOBs in the MyBS BLOB repository.

(§) MyBS stands for "BLOB Streaming for MySQL". The BLOB Streaming engine is a new storage engine for MySQL which allows you to stream media data directly in and out of the database. More info at www.blobstreaming.org.

Let's look at an example of this. Assume my standard example table:
CREATE TABLE notes_tab (
n_id int PRIMARY KEY,
n_text longblob
) ENGINE=PBXT;
And assume we have a file called blob_eg.txt with the contents "This is a BLOB Streaming upload test".

Firstly, I can
  [Read more...]

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.