Showing entries 1 to 10 of 21
10 Older Entries »
Displaying posts with tag: infinidb
Using InfiniDB MySQL server with Hadoop cluster for data analytics

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala.

This time I’ve tried InfiniDB for Hadoop (open-source version) on modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InfiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the MySQL server to execute queries. This allows for an easy “migration” of existing analytical tools. The results are quite interesting and promising.

Quick How-To

The InfiniDB documentation is not very clear on step-by-step instructions so I’ve created this quick guide:

  1. Install Hadoop cluster (minimum …
[Read more]
New MySQL & MariaDB Instructional Videos from SkySQL

Are you looking to expand your knowledge about MySQL and MariaDB database solutions?

Well, you’re in luck! SkySQL is introducing an exclusive collection of educational videos featuring some of the industry’s leading experts on the MySQL database and related technologies. View informative, technical talks on a variety of topics, from the experts at SkySQL, MariaDB, Calpont InfiniDB, Continuent, ScaleDB, Severalnines, Sphinx, Webyog, and others.

read more

Vote for MySQL[plus] awards 2011 !

First of all, I wish you a happy new year.
Many things happened last year, and it was really exciting to be involved in the MySQL ecosystem.
I hope this enthusiasm will grow this year. It's up to you!

To start the year, I propose the MySQL[plus] Awards 2011
It will only take 5 minutes to fill out these polls.
Answer with your heart first and then with your experience with some of these tools or services.

Polls will close on January 31, so vote now!
For “other” answers, please leave me a comment with details.

Don’t hesitate to submit proposals for tools or services in the comments.
And please share these polls!

 

Note: There is a poll embedded within this post, …

[Read more]
Muzing on NoSQL, damned! can't get rid of InfiniDB

NoSQL has frequently been used for building analytic solutions. The big picture is using scalable client code with map reduce to distribute a full data scan.

This approach brings nothing new to the RDBMS world and can be considered an extension of Kimball ROLAP normalization that simply allows more CPU power on a single query.

NoSQL or sharding takes advantage of:

  • Distributed processing

NoSQL or sharding loses advantages such as:

  • Fast memory communication
  • Per-column data type optimization and deserialization cost (NoSQL)
  • C processing, when the reduce step runs in a slower language (NoSQL)
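The "map reduce over a full data scan" idea above can be sketched in a few lines of Python. This is only an illustration: the shard layout, the `SHARDS` sample rows and the function names are assumptions made up for this example, and a thread pool stands in for what would be separate nodes in a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical sample data: each shard holds (key, value) rows.
SHARDS = [
    [("AA", 10), ("UA", 0), ("AA", 5)],
    [("UA", 20), ("DL", 0)],
    [("AA", 0), ("DL", 15), ("DL", 5)],
]


def map_shard(rows):
    """Map step: fully scan one shard and build partial sums per key."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial


def merge(left, right):
    """Reduce step: combine two partial results into one."""
    merged = Counter(left)
    merged.update(right)  # adds the values of matching keys
    return merged


def total_by_key(shards):
    """Scan all shards concurrently, then reduce the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_shard, shards))
    return dict(reduce(merge, partials, Counter()))
```

Calling `total_by_key(SHARDS)` sums the values per key across all shards. Note that the reduce step here runs in Python, which is exactly the "slower language" cost listed above.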

There are more non-technical advantages in classic ROLAP normalization, like using the same well-known OLTP tools and a bug-free storage engine, all coming with GPL licences for reducing the …

[Read more]
451 CAOS Links 2010.11.05

Oracle increases MySQL pricing. Jono Bacon wants some respect. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca, and daily at Paper.li/caostheory
“Tracking the open source news wires, so you don’t have to.”

# Oracle increased the prices for MySQL and rejigged its editions.

# A good overview of the resulting MySQL pricing hubbub from @tiensoon

# SkySQL named first customers in open letter to Oracle MySQL customers.

# Actuate reported over $5.1m in BIRT-related business for Q3, up …

[Read more]
Calpont InfiniDB 2.0 and BI QuickStarts

The 2.0 release of Calpont InfiniDB is ready for download. New features for the columnar database storage engine for MySQL include data compression, fully parallelized & scalable UDFs, and partition drop, which has been added to the automatic vertical & horizontal data partitioning.

Benefits of InfiniDB Enterprise 2.0:

  • 20-50% query performance improvement when reading from disk
  • Distributed in-database calculations provide greater flexibility to the data analyst, and enable faster performance for deep analytics
  • Removing obsolete data from the database quickly frees up disk storage and improves query response


And for those of you new to data warehousing and business intelligence, there are QuickStarts for data …

[Read more]
Open Source BI -- Pentaho and Jaspersoft Part I

Hey DBAs! Are you seeking more efficient ways of sifting through your data to aid your business operations? Two popular Business Intelligence products with community Open Source editions are Pentaho and JasperSoft. And both work with MySQL.

Both are easy to download and install. Both will use a JDBC connector to connect to MySQL. But how easy are the two to configure and run a simple report against a running instance of MySQL?


Setting up a JDBC connection with JasperSoft or Pentaho is pretty much like using any other JDBC connection.
The …

[Read more]
InfiniDB 1.5 Final is Now Available!

I am very excited to announce that the FINAL 1.5 version of the InfiniDB Community Edition is now available for use.  Thanks to everyone in the community for helping us through the alpha, beta, and RC cycles to the 1.5 release of InfiniDB.

We've put a lot of hard work into this release and we've come a long way since 1.0. Here's a reminder of a few of the features that have been added since 1.0:


High-speed subqueries. 
Support for running on...

Doing your own on-time flight time analysis Part III

In the last post, the data from the on-time flight database was loaded in a column-orientated storage engine. Now the numbers can be crunched.

The original goal of this exercise was to find the flight from Los Angeles International Airport, LAX, to Dallas Fort Worth International Airport, DFW, that was the most likely to arrive on time.

The data is 'opportunity rich' in that there is a lot of information in there. It is easy to start wondering about the various nuggets of information in there. Are there certain aircraft (tail numbers) that are routinely bad performers? Are some days of the week better than others? Do national holidays have an effect on the on-time performance? If you are delayed, is there a 'regular amount' of delay? Does early departure make for an early arrival? Can the flight crew make up for a late departure? How much time is usually spent on runways?

But to look for the flight from LAX …

[Read more]
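As a rough sketch of the kind of query behind the LAX to DFW question, the snippet below ranks flight numbers by their share of on-time arrivals. Everything here is an assumption for illustration: the `ontime` table, its column names, the sample rows and the `max_delay` threshold are hypothetical, and SQLite stands in for the column-oriented engine only so the example runs anywhere.

```python
import sqlite3

# Hypothetical sample rows: (flight_num, origin, dest, arr_delay in minutes).
SAMPLE_ROWS = [
    ("AA100", "LAX", "DFW", -5),
    ("AA100", "LAX", "DFW", 12),
    ("AA200", "LAX", "DFW", -2),
    ("AA200", "LAX", "DFW", 0),
    ("AA300", "LAX", "SFO", -10),
    ("AA100", "LAX", "DFW", 3),
]


def best_on_time_flights(rows, origin="LAX", dest="DFW", max_delay=0):
    """Rank flights on one route by the fraction that arrived on time.

    A flight counts as on time when arr_delay <= max_delay (an assumed
    definition; pick the threshold that matches your own analysis).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ontime ("
        "flight_num TEXT, origin TEXT, dest TEXT, arr_delay INTEGER)"
    )
    conn.executemany("INSERT INTO ontime VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT flight_num,
               COUNT(*) AS flights,
               AVG(CASE WHEN arr_delay <= ? THEN 1.0 ELSE 0.0 END) AS on_time_rate
        FROM ontime
        WHERE origin = ? AND dest = ?
        GROUP BY flight_num
        ORDER BY on_time_rate DESC, flights DESC, flight_num
    """
    result = conn.execute(query, (max_delay, origin, dest)).fetchall()
    conn.close()
    return result
```

`best_on_time_flights(SAMPLE_ROWS)` returns one `(flight_num, flights, on_time_rate)` tuple per flight number on the route, best first. The same `GROUP BY` shape extends to the other questions above, for example grouping by tail number or day of week instead of flight number.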
Anybody have a few million lines of Apache Log files they can share?

Anybody have a few million lines of Apache Log files they can share? I am working on a lab/demo on InfiniDB and need a few million lines of an Apache HTTPD log file. I no longer run any large websites and would rather use real data over creating something. I will sanitize your URL so you will be anonymous, so www.YourCompanyHere.com will end up as www.abc123.com or something similar.

The demo/lab will show how to load data into the columnar InfiniDB storage engine and run some analytics against the data. Please let me know if you can help.
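One possible way to do the URL sanitizing described above is sketched below. It is not the method used for the demo: the regular expression, the salt and the function names are assumptions for illustration, and only host names that follow `http://` or `https://` (for example in the referrer field) are rewritten.

```python
import hashlib
import re

# Matches the scheme and the host part of any URL embedded in a log line.
URL_HOST_RE = re.compile(r"(https?://)([^/\s\":]+)")


def pseudonym(host, salt="demo-salt"):
    """Map a host name to a stable, anonymous stand-in such as www.1a2b3c4d.com."""
    digest = hashlib.sha256((salt + host.lower()).encode("utf-8")).hexdigest()
    return "www." + digest[:8] + ".com"


def sanitize_line(line, salt="demo-salt"):
    """Replace every URL host in one log line, leaving paths and other fields intact."""
    return URL_HOST_RE.sub(
        lambda match: match.group(1) + pseudonym(match.group(2), salt), line
    )
```

Because the pseudonym is derived from a salted hash, the same site always maps to the same stand-in, so per-site analytics still work after sanitizing. Keep the salt private, since anyone who knows it can test guesses for the original host name.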
