Showing entries 26041 to 26050 of 44963
« 10 Newer Entries | 10 Older Entries »
Setting up Development and Production Pentaho PDI Repositories

I’ve been setting up a Pentaho Data Integration system with the goals of supporting collaboration with my team, allowing easy deployment to test or production, and enabling remote monitoring and troubleshooting of jobs and transformations.

I’ve finally figured out a way to achieve these goals, so I’ll try to pass this on now. I found the book "Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL", by Roland Bouman and Jos van Dongen to be a big help in figuring out how to export/import. It definitely helped me get up and running quickly.

My first decision was to bet the farm on the use of a repository. A file-based system would probably work, but I felt that it would require too much file distribution and usage of remote terminals. So I’ve set up two separate repositories hosted on MySQL databases: one for development (DEV), and one for production (PRD). Here are the steps I …

[Read more]
MySQL Database Analytics with InfiniDB from Calpont – Part 1

Let’s be honest: working with big databases is a lot of fun. There’s something cool about dealing with tables that have hundreds of millions or billions of rows in them, loading huge amounts of data, building star and snowflake schemas for data warehouses/marts, optimizing query performance, and all that jazz. Yes, working with big databases is a lot of fun.

On the other hand, let’s be honest: working with big databases is not a lot of fun. There’s a lot of pain in dealing with tables that have hundreds of millions or billions of rows in them, waiting for huge amounts of data to be loaded only to have the load job toss its cookies and fail when it’s 99% done, building special schemas that you wonder whether make any difference at all, and trying to figure out why just a simple two-way join query has been hanging for over an hour. Yes, working with big databases is not a lot of fun.

MySQL features timeline

I’ve begun a MySQL features timeline, a quick reference showing the version in which each MySQL feature was added, changed or removed. The manual tells us this, of course, but I wanted a quicker reference. The list is far from complete as there’s a huge number of features to cover. I’ll continue to improve it and help is appreciated. Send me a quick email saying “feature x added/removed/changed as of version y” and I’ll do the rest. If someone has already done this, please give me the url so I don’t reinvent the wheel.

Want to know how Yahoo does their capacity planning?

Ever wonder how Google plans for growth? Well, if you are in the Dallas / Fort Worth area on Monday, November 2nd you can find out exactly how they do it!

Strategic MySQL Planning for Complexity & Growth (i.e. MySQL Scaling for Dummies) will be presented by Tommy Falgout at the North Texas MySQL Users Group Meeting

Sometimes a data driven website is a simple matter. Sometimes it only starts out that way. Membership records can grow from dozens to hundreds to thousands (or more). Performance or historic logs can grow astronomically. The ongoing need to coordinate different sets of data can lead to outrageously complex schema and duplicate data. How can you avoid those troubles? There are ways. Tommy Falgout will talk about designing your database for …

[Read more]
OpenSQL Camp, SQL vs NoSQL

The upcoming OpenSQL Camp is almost full! We have space for 130 people to register, and as of this writing only 10 spots are free. If you want to attend, sign up before it’s too late!

We’re still looking for a few sponsors if anyone is interested in helping cover food and t-shirt costs.

I’m organizing the closing keynote panel, “SQL vs NoSQL”, which will include core community members and committers from a number of open source databases. Selena has offered to take the PostgreSQL position if we don’t find another worthy contender. So far, it will include:

  • Brian Aker – Drizzle
  • Eric Evans – Cassandra
  • Joydeep Sen Sarma – Hive/Hadoop
  • Mike Dirolf – …
[Read more]
MySQL and hardware information

People often ask “what’s the best hardware to run a database on?” And the answer, of course, is “it depends”. With MySQL, though, you can get good performance out of almost any hardware.

If you need *great* performance, and you have active databases with a large data set, here are some statistics on real life databases — feel free to add your own.

We define “large data set” as over 100 GB, mostly because smaller data sets have an easier time with the available memory on a machine (even if it’s only 8 GB) and backups are less intrusive. InnoDB Hot Backup and Xtrabackup are not really “hot” backups; they are “warm” backups, because copying the data files puts load on the machine, and on large, active servers we have found that this load impacts query performance. As for how active a database is, we’ve found that equates to a peak production load of over 3,000 queries per second on a transactional …

[Read more]
Air traffic queries in LucidDB

After my first post, Analyzing air traffic performance with InfoBright and MonetDB, where I was not able to finish the task with LucidDB, John Sichi contacted me and helped with the setup. You can see instructions on how to load the data on the LucidDB Wiki page.

You can find the description of the benchmark in the original post; here I will show the numbers I have for LucidDB vs the previous systems.

Load time
Loading the data into LucidDB in a single thread took 15273 sec, or 4.24h. Unlike the other systems, LucidDB supports multi-threaded load; with a concurrency of 2 (as I have only 2 cores on that box), the load time is 9955 sec, or 2.76h. For comparison, for InfoBright load time is …
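
As a rough aid for reading the two LucidDB load times quoted above, here is a minimal Python sketch that converts them to hours and derives the speedup from using two load threads. Only the two timings (15273 sec and 9955 sec) and the concurrency of 2 come from the post; the function and variable names are hypothetical and chosen for illustration.

```python
# Timings quoted in the post (seconds); everything else here is illustrative.
SINGLE_THREAD_SEC = 15273
TWO_THREAD_SEC = 9955
THREADS = 2  # concurrency used for the second run


def to_hours(seconds: float) -> float:
    """Convert a duration in seconds to hours."""
    return seconds / 3600.0


def speedup(baseline_sec: float, parallel_sec: float) -> float:
    """How many times faster the parallel run is than the baseline."""
    return baseline_sec / parallel_sec


def efficiency(baseline_sec: float, parallel_sec: float, threads: int) -> float:
    """Speedup divided by thread count (1.0 would be perfect scaling)."""
    return speedup(baseline_sec, parallel_sec) / threads


if __name__ == "__main__":
    print(f"single thread: {to_hours(SINGLE_THREAD_SEC):.2f} h")
    print(f"two threads:   {to_hours(TWO_THREAD_SEC):.2f} h")
    print(f"speedup:       {speedup(SINGLE_THREAD_SEC, TWO_THREAD_SEC):.2f}x")
    print(f"efficiency:    {efficiency(SINGLE_THREAD_SEC, TWO_THREAD_SEC, THREADS):.0%}")
```

The efficiency figure is just speedup divided by thread count, so it says nothing about where the remaining time goes (I/O, index building, and so on).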

[Read more]
A DTrace Quick Start Guide

Alta Elsta & I just completed the DTrace Quick Start Guide: Observing Native and Web Applications in Production mini-book. If you need a hardcopy, let me know, but if you are like me and would love to save a few trees, you can click on the picture below to download the PDF version.

We talk about why DTrace is so different from any observation tools you may have seen. We cover basic DTrace concepts and some details on writing d-scripts. We also have a lot of sample scripts. Sections on using DTrace to observe MySQL and Drupal are included.

The examples from the book have also been added to a wiki page for easier copy & paste.

[Read more]
My upcoming event schedule for this year

This time of the year is usually a very busy one, as there are plenty of events and conferences to attend. Just take a look at our calendar of OSS events on the MySQL Forge to see what I mean! Here's a quick summary of the ones that I will attend and speak at until the end of this year:

On November 14-15, I'll attend the openSQL Camp in Portland (OR), USA. I missed the first one that took place in Charlottesville (VA) in 2008, but had a lot of fun organizing the European Edition earlier this year. The upcoming one will be more like an unconference again - the list of …

[Read more]
Everything you always wanted to know about MySQL but were afraid to ask - part one

Since the European Commission announced it was opening an in-depth investigation into the proposed takeover of Sun Microsystems by Oracle with a focus on MySQL there has been no shortage of opinion written about Oracle’s impending ownership of MySQL and its impact on MySQL users and commercial partners, as well as MySQL’s business model, dual licensing and the GPL.

In order to try and bring some order to the conversation, we have brought together some of the most referenced blog posts and news stories in chronological order. Part one, below, takes us from the announcement of the EC’s in-depth investigation up to the eve of the communication of the EC’s Statement of Objections. We will continue to update part two until either the acquisition or the EC’s investigation …

[Read more]