Showing entries 21 to 30 of 43
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: data warehousing
On the need for an agile approach to data warehousing

I’d like to take a step back from technical issues to distill some of my thoughts on the challenges of data warehousing in the 21st century.

Having worked on a number of warehouse projects in different industries over the years, I’ve encountered many challenges, some failures, some successes. One thing is certain: all organizations that have a reasonable amount of data should be building a data warehouse if they don’t already have one. In 2009, given the economic atmosphere, no one wants to wait as long, or pay as much, as they did in 1999 to get one.

While this is a huge opportunity for open-source competitors like MySQL, it comes with big challenges for an organization that thinks it will get a $10MM warehouse (in 1999 dollars) for $300,000 (2009 dollars).

My contention is that in a web-connected, high-traffic and high-speed world, a monolithic approach with a rigid set of requirements, and a project team isolated …

[Read more]
Kickfire Ships to First Web 2.0 Customer

We just shipped and installed the Kickfire appliance in the data center of our first web 2.0 customer this week. We’re very excited about this new customer. With already over a million active members, this company continues to grow in spite of a challenging economic environment because it has a clearly defined audience and a business model which adds value to its members while adding money to its coffers. Part of the value add to their member base comes from well-targeted discount and coupon offers. In order to achieve this, the company runs complex analytics to understand members’ behaviors and responses and uses this data to help its advertising customers better target their offers.

As with many web 2.0 companies, this customer has built its application on MySQL. MySQL has helped them scale their web application well but was presenting performance and scalability challenges for their analytics. With their fact table in the …

[Read more]
Looking for an ETL engineer for our BI team

So, I mentioned earlier that I was looking at Infobright's Brighthouse technology as a storage backend for heaps and heaps of traffic and user data from Habbo. Turns out it works fine (now that it's in V3 and supports more of the SQL semantics), and we took it into use. Been pretty happy with that, and I expect to talk more about the challenge and our solution at the next MySQL Conference in April 2009.

However, our DWH team needs extra help. If you're interested in solving business analytics problems by processing lots of data and the idea of working in a company that leads the virtual worlds industry excites you, …

[Read more]
New, New, New … News at Kickfire

It’s been a crazy month here at Kickfire which is why I have fallen a bit behind on my postings – a new product, new customers, a new CEO, a new relationship with Sun/MySQL, a new website … and a new baby girl! Here’s a quick summary of all that has been going on:

New Product
We quietly came out of beta a month ago. After nearly two and a half years in development, this is a great achievement for the company. The team took on a hugely ambitious project: to re-design how SQL is processed today to be able to deliver an order of magnitude improvement in price/performance relative to any other data warehousing solution on the market. This project involved bringing together over 50 of the industry’s smartest database and hardware engineers to build a new type of database machine that includes the world’s first SQL chip, an ultra-modern database kernel, and advanced system features. Kickfire’s four data …

[Read more]
Infobright Review – Part 2

First, a retraction: it turns out that the performance problem with datetimes in the previous article wasn’t due to high cardinality (I speculated too much here), but due to a type conversion issue. According to a helpful comment from Victoria Eastwood of Infobright (a good sign for a startup), the Infobright engine considered ‘2001-01-01’ to be a date, not a datetime, and it couldn’t convert it to a datetime. Instead, it pushed the date filtering logic from the Infobright engine to MySQL. Effectively, the slow queries were table scans. The solution is to add 00:00:00 to the dates to make them datetimes.
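As a rough illustration of that fix, the padding can be done in application code before the literal reaches the query. This is only a sketch: the helper name `as_datetime_literal` is hypothetical and not part of Infobright or MySQL, and it assumes literals arrive either as `YYYY-MM-DD` or as `YYYY-MM-DD HH:MM:SS`.

```python
from datetime import datetime

DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"
DATE_FORMAT = "%Y-%m-%d"


def as_datetime_literal(value: str) -> str:
    """Return value as 'YYYY-MM-DD HH:MM:SS', padding bare dates with midnight.

    Hypothetical helper: it only normalizes the string, it does not talk to
    any database engine.
    """
    try:
        # Already a full datetime literal: re-emit it unchanged.
        parsed = datetime.strptime(value, DATETIME_FORMAT)
    except ValueError:
        # Bare date: strptime fills the time part with 00:00:00.
        parsed = datetime.strptime(value, DATE_FORMAT)
    return parsed.strftime(DATETIME_FORMAT)


# Bounds for a one-month range filter on a datetime column.
start = as_datetime_literal("2001-04-01")
end = as_datetime_literal("2001-05-01")
```

The resulting strings can then be passed as the bounds of a range filter on a datetime column, so the comparison is between values of the same type.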

With that in mind, here are some much better numbers for Infobright. This query took 0.05 seconds:

1) Select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-05-01 00:00:00'

This compares very …

[Read more]
More Good News for Data Warehousing on MySQL

Last week, Infobright announced it had open sourced its data warehousing software code. This is good news for the growing number of organizations looking to use MySQL as a data warehousing platform. According to IDC, MySQL is already the third-most deployed database for data warehousing and Infobright’s move will give users yet another reason to seriously consider MySQL for this application.

For those of you not familiar with the Infobright offering, it is essentially a column-oriented data store for data warehousing. While the column-oriented approach is not exclusive to Infobright (Kickfire’s MySQL storage engine is also column-oriented, as are some other non-MySQL data warehousing solutions on the market) Infobright does have some unique technology that Lou Agosta recently described as follows in his post on Trends in Data Warehousing for the Second Half of 2008: …

[Read more]
An Infobright Review

With open source software I can install reasonably complete software and try it with my data. This way I get to see how it works in a realistic setting without having to rely on benchmarks and hoping they are a good match for my environment. And I get to do this without having to deal with commercial software sales people.

So I was glad to hear that Infobright had gone open source, as I have been wanting to test a column-based database for a while. I was even happier that it was a MySQL-based engine, as I would already know many of the commands. I decided to run some of the same tests I had run when comparing InnoDB and MyISAM for reporting (http://dbscience.blogspot.com/2008/08/innodb-suitability-for-reporting.html). InnoDB performed better than MyISAM in my reporting tests, so I’m going to compare Infobright to InnoDB.

The …

[Read more]
A New Business Model for Open Source?

Kickfire was recently selected by Network World as one of 10 Open Source Companies to Watch. First of all, the disclaimer: we are not an open source company. As any of you reading this blog know, Kickfire is an appliance company. So, why then did we appear on the list? The link of course is MySQL.

The Kickfire appliance was built to run MySQL for high-performance business intelligence and data warehousing workloads. So, while we are not an open source company, we are very much what I would term an “open source-based business”. Now, for those who track the data warehousing market, it might seem that a lot of vendors could claim that mantle, as a large proportion have code that is derived from PostgreSQL. However, that’s not what I mean by an open source-based business. So, how would one …

[Read more]
InnoDB's Suitability for Reporting

I started using Oracle, an MVCC database, to develop reporting (data warehousing, BI, take your pick) systems years ago. I’ve come to appreciate the scalability improvements that MVCC provides, particularly for pseudo real-time reporting applications, the ones where loads are occurring at the same time as report generation. So when people say InnoDB, partly due to MVCC, isn’t as good as MyISAM for reporting, I had to look into this in more detail.

What I found is that InnoDB is a good engine for reporting. In some ways, such as performance, it is at times better than MyISAM, and one of its downsides, a larger disk requirement, can be mitigated. The trick is for the primary key to be the one predominant access path. In this example, the InnoDB clustered index is purchaseDate, with another column, such as orderId, added to make it unique. This has a number of advantages. In my experience, …
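The key layout described above can be sketched in a few lines. This is not InnoDB: it uses Python's built-in sqlite3 module, where a `WITHOUT ROWID` table is stored in primary-key order and serves as a rough stand-in for a clustered index. The table name `Sale`, the `unit` column, and the sample rows are assumptions made for illustration; only purchaseDate and orderId come from the excerpt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# WITHOUT ROWID makes SQLite store rows ordered by the primary key,
# a rough analogue of an InnoDB clustered index (assumption: SQLite
# is used here only so the sketch runs without a MySQL server).
conn.execute(
    """
    CREATE TABLE Sale (
        purchaseDate TEXT    NOT NULL,
        orderId      INTEGER NOT NULL,
        unit         INTEGER NOT NULL,
        PRIMARY KEY (purchaseDate, orderId)  -- orderId makes the key unique
    ) WITHOUT ROWID
    """
)

# Illustrative rows; two orders share the same purchaseDate.
rows = [
    ("2001-03-31 23:59:59", 1, 5),
    ("2001-04-01 00:00:00", 2, 3),
    ("2001-04-15 10:00:00", 3, 4),
    ("2001-04-15 10:00:00", 4, 2),
    ("2001-05-01 00:00:00", 5, 7),
]
conn.executemany("INSERT INTO Sale VALUES (?, ?, ?)", rows)


def units_sold(connection, start, end):
    """Sum units for purchases in the half-open range [start, end)."""
    cursor = connection.execute(
        "SELECT SUM(unit) FROM Sale"
        " WHERE purchaseDate >= ? AND purchaseDate < ?",
        (start, end),
    )
    return cursor.fetchone()[0]


april_units = units_sold(conn, "2001-04-01 00:00:00", "2001-05-01 00:00:00")
```

Because the date range is a prefix of the primary key, the reporting query reads one contiguous slice of the table, which is the point of making the primary key the predominant access path.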

[Read more]
When VLSI meets DBMS: The Story behind the World’s First SQL Chip

In April this year, Kickfire announced the first high-performance appliance for MySQL. As part of the announcement, the company released data warehouse benchmark results that broke prior records in terms of price/performance and performance in a non-clustered environment. While the creation of a new appliance built exclusively for MySQL along with the benchmark records was noteworthy, perhaps the bigger story lies in what we believe to be the beginning of a paradigm shift in the database world - one marked by the advent of the first SQL chip.

To give some context to this story I have included a graph below which depicts the evolution of VLSI (Very-Large-Scale Integration) semiconductor technology and its growing impact on a broadening range of industries.

[Read more]