Planet MySQL
Showing entries 1 to 10 of 41

Displaying posts with tag: data warehousing

The stealth success of PostgreSQL

One of the more notable success stories of the open source world is in the field of databases. A company with a strong commitment to open source has seen tremendous growth and success in the enterprise while contributing to a hugely respected open source code base. Who is that? Maybe your first thought was MySQL, now owned by Oracle. But unlike MySQL, this company is actually taking business away from Oracle, so effectively that it has seen 80 percent revenue growth in the last year.


Advantages of weighted lists in RDBMS processing
A list is simply a collection of items. The list has no structure, although in some cases its length may be known. The list may contain duplicate items. In the following example, the number 1 is included twice.

Example list:
1
2
3
1

A set is similar to a list, but has the following differences:
  • The size of the set is always known
  • A set may not contain duplicates

You can convert a list to a set by creating a 'weighted list'. The weighted list includes a count column so that you can determine when an item in the list appears more than once:
1,2
2,1
3,1

Notice that the value 1 has a count of 2 in the weighted list, because it appears twice in the original list. To make insertions into such a list scalable, consider using partitioning to avoid large indexes.
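As a minimal sketch of the conversion described above, the following Python snippet collapses the example list into (value, count) pairs. The names `to_weighted_list` and `add_item` are illustrative and do not come from the original post; in an RDBMS the same idea would be a table with a value column and a count column rather than an in-memory structure.

```python
from collections import Counter


def to_weighted_list(items):
    """Collapse a list with duplicates into sorted (value, count) pairs."""
    return sorted(Counter(items).items())


def add_item(weighted, value):
    """Record one more occurrence of value in a dict of value -> count."""
    weighted[value] = weighted.get(value, 0) + 1
    return weighted


pairs = to_weighted_list([1, 2, 3, 1])
print(pairs)  # [(1, 2), (2, 1), (3, 1)]

# Inserting another 2 only bumps a counter; no new entry is needed.
weighted = add_item(dict(pairs), 2)
print(sorted(weighted.items()))  # [(1, 2), (2, 2), (3, 1)]
```

Because duplicates only increment a counter, the weighted form never holds more entries than there are distinct values.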

      [Read more...]
    Shard-Query turbo charges Infobright community edition (ICE)

    Shard-Query is an open source toolkit that helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide-and-conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism that allows it to significantly improve query performance by spreading the work over all available compute resources. In this test, Shard-Query averages a nearly 6x (max over 10x) improvement over the baseline, as shown in the following graph:

    One



      [Read more...]
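The divide-and-conquer idea in the excerpt above can be illustrated with a small scatter-gather sketch. This is not Shard-Query's code or API: the `SHARDS` data, `partial_aggregate`, and `scatter_gather` are all hypothetical, and in-memory lists stand in for the rows held by separate MySQL servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each list stands in for the rows on one server.
SHARDS = [
    [3, 5, 8],
    [13, 21],
    [34, 55, 89, 144],
]


def partial_aggregate(rows):
    """Compute the per-shard pieces needed to build SUM, COUNT and AVG."""
    return sum(rows), len(rows)


def scatter_gather(shards):
    """Run the partial aggregate on every shard in parallel, then combine."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(partial_aggregate, shards))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    # AVG cannot be averaged across shards; rebuild it from SUM and COUNT.
    return {"sum": total, "count": count, "avg": total / count}


result = scatter_gather(SHARDS)
print(result["sum"], result["count"])  # 372 9
```

The design point is that each shard returns only a small partial result, so the combining step stays cheap no matter how many rows each shard scans.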
    Using Flexviews – part one, introduction to materialized views

    If you know me, then you have probably heard of Flexviews. If not, then it might not be familiar to you. I’m giving a talk on it at the MySQL 2011 CE, and I figured I should blog about it before then. For those unfamiliar, Flexviews enables you to create and maintain incrementally refreshable materialized views.

    You might be asking yourself, “What is an incrementally refreshable materialized view?” If so, then keep reading. This is the first in a multi-part series describing Flexviews.

    edit:
    You can find part 2 of the series here:

      [Read more...]
    YAPCEU 2010 – Day Two…

    After enjoying the excellent hospitality of our host (6 courses), we were ready for our second day at YAPCEU 2010 here in sunny Pisa.

    Larry’s new catch phrase, “My Language is a four letter word”, was the ‘buzz word’ for today. We settled down to some very interesting talks, the highlight for me being Tim Bunce’s talk on using Devel::NYTProf to optimize your code. Tim first gave us a quick and dirty overview of optimization, which covered the basics of where to start and what to look for. He followed up with real examples of optimizer output and then wrapped up with a few before-and-after results from an optimization effort.

    The rest of the day was dedicated, in my opinion, to the future of DBs, with Nelson Ferraz giving an excellent presentation of his concepts for using Perl as the glue for a Data Warehouse application. Next on my agenda,

      [Read more...]
    Data Warehousing Best Practices: Comparing Oracle to MySQL pt 2

    At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.

    See part 1 for the introduction and talking about power and hardware. This part will go over the 2nd “P”, partitioning. Learning about Oracle’s partitioning has gotten me more interested in how MySQL’s partitioning works, and

      [Read more...]
    Data Warehousing Best Practices: Comparing Oracle to MySQL pt 1

    At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.

    These are my notes from the session, which include comparisons of how Oracle works (which Maria gave) and how MySQL works (which I researched to figure out the difference, which is why this blog post took a month after the conference to write). Note that I am not an expert on data warehousing in either Oracle or MySQL, so these are more concepts to think about than

      [Read more...]
    CAOS Theory Podcast 2010.02.05

    Topics for this podcast:

    *Matt Asay moves from Alfresco to Canonical
    *GPL fade fuels heated discussion
    *Apple’s iPad and its enterprise and open source impact
    *Open source in data warehousing and storage
    *Our perspective on Oracle’s plans for Sun open source

    iTunes or direct download (32:50, 9.2 MB)

    Some scaling observations on Infobright

    A couple of days ago, Baron Schwartz posted some simple load and select benchmarking of MyISAM, Infobright and MonetDB, which Vadim Tkachenko followed up with a more realistic dataset and interesting figures where MonetDB beat Infobright in most queries.

    Being used to the parallel IEE loader, I was surprised by the apparently slow loading speed in Baron's benchmark and decided to try to replicate it. I installed Infobright 3.2 on my laptop (see, this is very unscientific) and wrote a simple Perl script to generate and load an arbitrarily large data set resembling Baron's description. I'm not going to post my exact numbers, because

      [Read more...]
    A peek under the hood in Infobright 3.2 storage engine

    I've been meaning to post some real-world data on the performance of the Infobright 3.2 release which happened a few weeks ago after an extended release candidate period. We're just preparing our upgrades now, so I don't have any performance notes over significant data sets or complicated queries to post quite yet.

    To make up for that, I decided to address a particular annoyance of mine in the community edition, first because it hadn't been addressed in the 3.2 release (and really, I'm hoping that doing this will get it included in 3.2.1), and second, simply because the engine being open source means I can. I feel being OSS is one of Infobright's biggest strengths, in addition to being a pretty amazing piece of performance for such a simple,

      [Read more...]

    Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

    Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.