Planet MySQL
Showing entries 1 to 7

Displaying posts with tag: General BI

NoSQL Now 2011: Review of AdHoc Analytic Architectures

For those who weren’t able to attend the fantastic NoSQL Now Conference in San Jose last week but are still interested in how people are doing ad hoc analytics on top of NoSQL data systems, here are the slides from my presentation:

No sql now2011_review_of_adhoc_architectures

View more presentations from ngoodman

We obviously continue to hear from our community that LucidDB is a great solution sitting in front of a Big Data/NoSQL [Read more...]
DynamoBI: website? bits?

Well, what a soft launch it has been.
Some people have asked:

When are you going to get a website? Errr…. Soon! We soft launched a bit early, due to some “leaking information,” but figured heck, it’s open source, so let’s let it all out. Soon enough, I swear!

Where can I download DynamoDB? Errr… you can’t yet, because we haven’t finished our build/QA/certification process.

However, since DynamoDB is the alter ego, business-suit-wearing brother of LucidDB, just download the 0.9.2 release if you want to get a sense of what DynamoDB is.

There are 3 built binaries (Linux 32, Linux 64, and Windows 32):


  [Read more...]
select stream REAL_TIME_BI_METRIC from DATA_IN_FLIGHT

SQL is great. There are millions of people who know it, its semantics are standardized (well, as much as anything ever is, I suppose), and it’s a great language for interacting with data. The folks at SQLstream have taken a core piece of the technology world and turbocharged it for a whole new application: querying streams of data. You can think of what SQLstream is doing as the ‘calculus’ of SQL: not just query results over static data in databases, but query results over a stream of records flowing by.

Let’s consider some of the possibilities. Remember, the results aren’t a one-time poll: the values will continue to be updated as the query continues.
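
To make the "continuously updated result" idea concrete, here is a minimal sketch in plain Python. It is not SQLstream syntax or its API; the function name `streaming_average`, the ticker symbols, and the window size are all hypothetical, chosen only for illustration. Each incoming record immediately produces a refreshed average for its stock, rather than a single answer computed once over stored data.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple


def streaming_average(
    ticks: Iterable[Tuple[str, float]], window: int = 3
) -> Iterator[Tuple[str, float]]:
    """Yield (symbol, average of the last `window` prices) after every tick.

    Hypothetical helper: a stand-in for a continuous query, not a real
    streaming-SQL engine.
    """
    # One bounded buffer per symbol; old prices fall off automatically.
    recent: Dict[str, Deque[float]] = defaultdict(lambda: deque(maxlen=window))
    for symbol, price in ticks:
        prices = recent[symbol]
        prices.append(price)
        # Emit an updated result for every record that flows by.
        yield symbol, sum(prices) / len(prices)


if __name__ == "__main__":
    # Made-up sample records standing in for a live feed.
    sample = [("ABC", 10.0), ("ABC", 12.0), ("XYZ", 5.0), ("ABC", 14.0), ("ABC", 16.0)]
    for symbol, avg in streaming_average(sample):
        print(symbol, avg)
```

Because the function is a generator, the caller sees a new value as soon as each record arrives, which is the behavioral difference from a one-time query against a table.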

// Select the stocks Average price over the

  [Read more...]
Kettle’s secret in-memory database

Kettle’s secret in-memory database is

  • Not actually secret
  • Not actually Kettle’s

There. I said it, and I feel much better.

In most circumstances, Kettle is used in conjunction with a database. You are typically doing something with a database: INSERTs, UPDATEs, DELETEs, UPSERTs, DIMENSION UPDATEs, etc. While I do know of some people who are using Kettle without a database (think log munching and summarization), a database is something that a Kettle developer almost always has at their disposal.

Sometimes there isn’t a database. Sometimes you don’t want the slowdown of persistence in a database. Sometimes you just want Kettle to have an in-memory blackboard across transformations. Sometimes you want to ship an


  [Read more...]
MySQL Archive Tablespace for FACTs

I’m visiting a Pentaho customer right now whose current “transaction” volume is 200 million rows per day. Relatively speaking, this puts their planned warehouse in the top quintile of size. They will face significant issues with load times, data storage, processing reliability, etc. Kettle is the tool they selected and it is working really well. Distributed record processing using Kettle and a FOSS database is a classic case study for Marten’s scale-out manifesto (http://www.mysql.com/why-mysql/scaleout.html).

This organization doesn’t have an unlimited budget. Specifically, they don’t have a telecom-type budget for their telecom-like volume of data. One of the issues that has come up with their implementation has

  [Read more...]
2gig + 2gig = 50gig

I was recently writing up a volume and performance specification for a customer project when a discussion arose with the current DBA staff about volume projections. The intuitive thinking was that the volume requirements for the BI/Data Warehouse would be the sum of the systems from which it sourced data. The group was thinking this would be a good way to approximate the required space for the system. I asserted this method is flawed, and had to explain why BI volume is related to source volume but not necessarily directly proportional to it.
BI systems require much greater storage than the sum of their sources because:

  • Data are denormalized. Denormalizing data to improve query performance increases storage by a factor of 1 to 10,000 (it depends on the data; perhaps more).
  • New data are created. There are many analytically significant

  [Read more...]
Database Operation Complexity Reference

I mentioned this previously, but I’ve been reading “Principles of Distributed Database Systems.” I’m enjoying it, and it’s helping me solidify many of the concepts that I apply daily in my capacity as a Principal BI Solutions consultant. Database theory, specifically as it relates to tuning, is part of any professional’s work with Oracle. We’ve all deciphered the performance implications and clues for improvements from EXPLAIN PLAN. I’ve always been told you want to be able to give Oracle the clues/configurations to enable filtering of results (selection) before joining. It always made sense but I had never understood fully

  [Read more...]

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.