| Showing entries 1 to 23 |
"Why the days are numbered for Hadoop as we know it"
I know GigaOM likes to provoke scandals sometimes (we all remember some other unforgettable piece), but there is something behind it...

If you have used MySQL for some time you know that mysqld can write binlogs. These are usually used for backup purposes and JITR, or for replication purposes, so that a slave can collect the changes made on the master and apply them locally.
Most of the time, apart from configuring how long you keep these binlogs, they are pretty much ignored.
Recently I came across an issue. I have a slave server which is NOT configured read only and which has an additional database used to collect statistics from the replicated database and provide aggregation and business information: the typical sales per country, per product, per day, week, month, year, per whatever, etc. This is the usual data warehouse type functionality. It’s done on a slave and not the
[Read more...]
At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.
See part 1 for the introduction and talking about power and hardware. This part will go over the 2nd “P”, partitioning. Learning about Oracle’s partitioning has gotten me more interested in how MySQL’s partitioning works, and
[Read more...]
At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.
These are my notes from the session, which include comparisons of how Oracle works (which Maria gave) and how MySQL works (which I researched to figure out the difference, which is why this blog post took a month after the conference to write). Note that I am not an expert on data warehousing in either Oracle or MySQL, so these are more concepts to think about than
[Read more...]
I’m taking part in two webinars this week that will likely be of interest to CAOS readers. On Wednesday I’m contributing to a webinar with EnterpriseDB on the subject of open source database adoption in the enterprise, while on Thursday I’ll be presenting a 451 Group webinar on data warehousing.
During the EnterpriseDB webinar we will provide recommendations for how organizations can effectively leverage open source software. Attendees will learn about open source software trends for 2010, top considerations when using open source databases, and best practices for successful deployments of open source software.
I’ll be providing some data points from our recent surveys on database adoption and
[Read more...]
Departmental or subject-specific data warehouses, known as “data marts” in the industry, seem to be gaining in popularity. Fueled partly by companies wanting to start small with focused projects in today’s economy, and partly by advances in data warehousing technology improving affordability and deployability, data marts seem to be popping up everywhere.
In most cases, data mart projects are driven by the head of a business unit or a functional group (like Sales) needing to analyze their own slice of data in order to run their department more efficiently and effectively. The data may come directly from an operational system or a combination of source systems resulting in what’s called an “independent data mart”, or it may come directly from a larger, enterprise data warehouse in a hub-and-spoke or “dependent data mart” configuration.
[Read more...]
Why was Teradata able to become the leader of data warehousing at the super high-end (e.g. greater than 25 TB)? Why was Netezza only the second pure-play data warehousing company to go public by focusing on the 10 – 25 TB range of opportunities? Why did Oracle, after so many years of denial, finally announce a joint hardware / software product for data warehousing with HP, the Exadata data warehouse server? Why did Microsoft acquire DATAllegro, one of the earlier data warehousing appliances? Why are there now dozens of data warehouse appliances available on the market today, and, more importantly, how should a customer choose which one to purchase?
In all these cases, the vendors have listened to the market and concluded that the best way to serve the customer is through a true data warehouse appliance. Given that there are so many flavors of appliances, though,
[Read more...]
The Kickfire MySQL Appliance is officially launched!
We just announced it today, along with a new customer and strategic partnerships with ten leading service companies, including Percona, the MySQL performance experts.
Look for more news next week from Kickfire as we head into the MySQL conference. Kickfire will also give a keynote on the first day of the conference and will make a surprise announcement! Stay tuned …
At the March Boston MySQL User Group meeting, Jacob Nikom of MIT’s Lincoln Laboratory presented “Optimizing Concurrent Storage and Retrieval Operations for Real-Time Surveillance Applications.” In the middle of the talk, Jacob said he sometimes calls what he did in this application “real-time data warehousing”, which was so accurate that I decided to give that title to this blog post.
The slides can be downloaded in PDF format (1.3 Mb) at http://www.technocation.org/files/doc/Concurrent_database_performance_02.pdf. The 54 minute video can be downloaded (644Mb) at http://technocation.org/node/693/download or streamed directly in your browser at
[Read more...]
We just shipped and installed the Kickfire appliance in the data center of our first web 2.0 customer this week. We’re very excited about this new customer. With already over a million active members, this company continues to grow in spite of a challenging economic environment because it has a clearly defined audience and a business model which adds value to its members while adding money to its coffers. Part of the value add to their member base comes from well-targeted discount and coupon offers. In order to achieve this, the company runs complex analytics to understand members’ behaviors and responses and uses this data to help its advertising customers better target their offers.
As with many web 2.0 companies, this customer has built its application on MySQL. MySQL has helped them scale their web application well but was presenting performance and scalability challenges for their
[Read more...]
If there is one thing that a DBA or data warehouse architect can count on, it is that data volumes will increase while budgets will decrease.
This is why MySQL 5.1 and its partitioning capabilities are so interesting. I’m going to demonstrate how you can build a small/medium-sized data warehouse or data mart (1-10 TB range) on a shoestring budget.
I decided to convert a relatively large statistics table (750m rows, 140GB in size in about 10 partitions) on a test machine from MyISAM to the Archive storage engine. After a long conversion process, my data, on disk, ended up being about 21GB, for an impressive compression ratio of 6.7:1.
Prior to MySQL 5.1, one of the drawbacks to the archive storage engine was that you could not index it; however, with partition pruning, you can get yourself a “free” index on a large archive table by splitting it into
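
The "free index" idea in the excerpt above can be illustrated with a toy model. The following Python sketch is not MySQL's implementation and is not taken from the original post; the partition bounds, the sample rows, and the helper names (`partition_for`, `build`, `query`) are all hypothetical. It only shows why splitting an unindexed table into date ranges lets a range query skip most of the data.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical upper bounds: partition i holds rows whose day is below
# BOUNDS[i] and not below BOUNDS[i - 1] (similar in spirit to range
# partitioning with "less than" bounds). All values here are made up.
BOUNDS = [date(2008, 2, 1), date(2008, 3, 1), date(2008, 4, 1)]

# Made-up sample rows: (day, value).
ROWS = [
    (date(2008, 1, 5), 10),
    (date(2008, 1, 20), 20),
    (date(2008, 2, 3), 30),
    (date(2008, 2, 25), 40),
    (date(2008, 3, 9), 50),
]


def partition_for(day):
    """Return the index of the partition whose range contains `day`."""
    idx = bisect_right(BOUNDS, day)
    if idx >= len(BOUNDS):
        raise ValueError("no partition for %s" % day)
    return idx


def build(rows):
    """Distribute rows into plain, unindexed lists, one per partition."""
    parts = [[] for _ in BOUNDS]
    for day, value in rows:
        parts[partition_for(day)].append((day, value))
    return parts


def query(parts, start, end):
    """Return rows with start <= day <= end and the number of rows scanned.

    Only partitions whose range can overlap [start, end] are read; each of
    those is still scanned in full, since there is no index inside it.
    """
    first, last = partition_for(start), partition_for(end)
    hits, scanned = [], 0
    for part in parts[first:last + 1]:
        scanned += len(part)
        hits.extend(row for row in part if start <= row[0] <= end)
    return hits, scanned


parts = build(ROWS)
hits, scanned = query(parts, date(2008, 2, 1), date(2008, 2, 28))
print(hits, scanned)
```

In this toy data set the February query reads only the two rows stored in the February partition instead of all five rows, which is the effect the excerpt describes as getting an index for free.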