| Showing entries 1 to 30 of 30 |
We’re proud to announce that Jim Tommaney, CTO of Calpont, has just signed on to speak at the MySQL & Cloud Database Solutions Day, hosted by SkySQL and MariaDB - taking place next Friday, April 26, directly after Percona Live: MySQL Conference & Expo.
Join me for a webinar where I discuss how the recent changes and trends in big data management affect the enterprise. This event is sponsored by Red Rock and RockSolid.
Overview:
It is an exciting and interesting time to be involved in data. More significant change has occurred in database management in the last 18 months than in the previous 18 years. New technologies such as NoSQL & Hadoop, and radical redesigns of existing technologies, like NewSQL, will dramatically change how we manage data moving forward.
These technologies bring with them possibilities both in terms of the scale of data
Often I think about challenges that organizations face with “Big Data”. While Big Data is a generic and overused term, what I am really referring to is an organization’s ability to disseminate, understand and ultimately benefit from increasing volumes of data. It is almost without question that in the future customers will be won/lost, competitive advantage will be gained/forfeited and businesses will succeed/fail based on their ability to leverage their data assets.
What I think the near-term challenges are may be surprising. Largely, I don’t think these are purely technical. There are enough wheels in motion now to almost guarantee that data accessibility will continue to improve at a pace in line with the increase in data volume. Sure, there will continue to be lots of interesting innovation with technology, but
I read yesterday that the NSA has submitted a proposal to Apache to incubate its Accumulo platform. According to the description, this is a key/value store built over Hadoop that appears to provide similar functionality to HBase, except that it provides “cell level access labels” to allow fine-grained access control. This is something you would expect as a requirement for many applications built at government agencies like the NSA. But it is also very important for organizations in health care, law enforcement and similar fields, where strict control over large volumes of privacy-sensitive data is required.
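To make the idea of cell-level access labels a little more concrete, here is a minimal Python sketch. It is not Accumulo's API: the `Cell` and `LabeledStore` names are hypothetical, and a label here is simply a set of authorizations a reader must hold, whereas Accumulo's real visibility labels are boolean expressions. The point is only that the access check happens per cell rather than per table or per row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One key/value cell plus the labels needed to read it (hypothetical)."""
    row: str
    column: str
    value: str
    required: frozenset  # authorizations a reader must hold to see this cell


class LabeledStore:
    """Toy in-memory store that filters every read by cell label."""

    def __init__(self):
        self._cells = []

    def put(self, row, column, value, required=()):
        # An empty label means the cell is visible to every reader.
        self._cells.append(Cell(row, column, value, frozenset(required)))

    def scan(self, authorizations):
        # Return only the cells whose required labels are all held by the reader.
        auths = frozenset(authorizations)
        return [
            (c.row, c.column, c.value)
            for c in self._cells
            if c.required <= auths
        ]


store = LabeledStore()
store.put("patient1", "name", "Alice")
store.put("patient1", "diagnosis", "flu", required={"medical"})

billing_view = store.scan({"billing"})   # sees the name only
medical_view = store.scan({"medical"})   # sees the name and the diagnosis
```

Two readers scanning the same row get different results depending on what they are authorized to see, which is the property that matters for privacy-sensitive data.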
An interesting part of this is how it highlights the acceptance of Hadoop.
Conor O'Mahony over at IBM wrote a good post on a favorite topic of mine, “The Future of the NoSQL, SQL, and RDBMS Markets”. If this is of interest to you, then I suggest you read his original post. I replied in the comments but thought I would also repost my reply here.
-----------------------------------------------------------------------------------------------
Hi Conor, I wish it was as simple as SQL & RDBMS is good for this and NoSQL is good for that. For me at least, the waters are much muddier than that.
The benefit of SQL & RDBMS is
My friends over at IA Ventures are looking for both an Analyst and an Associate to join their team. If Big Data, New York and start-ups are in your blood, then I can’t think of a better VC to be involved with.
From the IA blog:
"IA Ventures funds early-stage Big Data companies creating competitive advantage through data and we’re looking for two start-up junkies to join our team – one full-time associate / community manager and one full time analyst. Because there are only four of us (we’re a start-up ourselves, in fact), we’ll need you to help us investigate companies, learn about industries, develop investment theses, perform internal operations, organize
In life there are really two major types of data analytics. Firstly, we don’t know what we want to know – so we need analytics to tell us what is interesting. This is broadly called discovery. Secondly, we already know what we want to know – we just need analytics to tell us this information, often repeatedly and as quickly as possible. This is called anything from reporting or dashboarding through to more general data transformation and so on.
Typically we are using the same techniques to achieve this. We shove lots of data into a repository of some form (SQL, MPP SQL, NoSQL, HDFS etc.) then run queries/jobs/processes across that data to retrieve the information we care about.
Now this makes sense for data discovery. If we don’t know what we want to know, having lots of data in a big pile that we can slice and dice
It is a constant, yet interesting, debate in the world of big data. What scales best? OldSQL, NoSQL, NewSQL?
I have a longer post coming on this soon. But for now, let me make the following comments. Generally, most data technologies can be made to scale - somehow. Scaling up tends not to be too much of an issue; scaling out is where the difficulties begin. Yet, most data technologies can be scaled in one form or another to meet a data challenge even if the result isn’t pretty.
What is best? Well that comes down to the resulting complexity, cost, performance and other trade-offs. Trade-offs are key as there are almost always significant concessions to be made as you scale up.
As a recent example, I was looking at scalability aspects of MySQL. In particular, MySQL Cluster
Since this article was written, HPCC has undergone a number of significant changes and updates. These address some of the criticisms voiced in this blog post, such as the license (updated from AGPL to Apache 2.0) and integration with other tools. For more information, refer to the comments placed by Flavio Villanustre and Azana Baksh.
The original article can be read unaltered below:
Yesterday I noticed this tweet by Andrei Savu:
This prompted me to read the related GigaOM article and then check out the
Well, as predicted, with Aster Data recently being picked up by Teradata, most of the key new generation MPP distributed analytics vendors have been acquired (Aster Data, Vertica, Netezza & Greenplum). This had to happen and was expected to happen. The MPP Analytics startup “revolution” is over and these technologies will now be integrated into the mainstream.
So what’s next? As we know, if you are a massive multi-national software company it is a lot less risky to incrementally innovate and leave the development of “game changing” technologies to startups that can be acquired after
There are so, so many big data platforms in play at the moment that it can be confusing for developers to know where to start. For startups it used to be simple: MySQL. But dust clouds were created when all the NoSQL platforms started to crash the party 18 months or so ago. I do see the dust beginning to settle, though, and we are starting to see some market “leaders” appear. A very unscientific approach is to list the technologies I hear about in the “big data startup” world on a daily basis. These are, in no particular order:
“NoSQL”, for lack of a better name, is a generic term that describes any data management system that does not use SQL as a query interface. Generally this means any data management system that is non-relational, but the term has also been stretched to include systems at the boundaries of what constitutes a data management system at all (such as Hadoop).
Early on (a couple of years back in NoSQL time) when the term was coined I think the positioning was much more aggressive, but more recently this has been softened so now NoSQL is commonly quoted as meaning “Not only SQL” or “next generation databases” (whatever that means). The common message you get now is something along the lines of NoSQL systems are
With IBM intending to acquire Netezza, the predicted consolidation in the distributed analytics market is well underway. Recent deals include EMC/Greenplum, Teradata/Kickfire and now IBM/Netezza. A good breakdown of this deal is on Curt’s blog. There is still more to go of course, with one of the crown jewels, Vertica, still ripe for the picking.
What this indicates is that MPP analytics has moved from the innovative edge into the mainstream market and now the more risk
I will be at VLDB 2010 next week. If anyone on this blog is attending and wants to catch up to discuss start-ups and innovation in DB, NoSQL, Big Data etc., drop me a line and I will try to meet up.
This is the first of a series of posts about business intelligence tools, particularly OLAP (or online analytical processing) tools using MySQL and other free open source software. OLAP tools are a part of the larger topic of business intelligence, a topic that has not had a lot of coverage on MPB. Because of this, I am going to start out talking about these topics in general, rather than getting right to the gritty details of their performance.
I plan on covering the following topics:

f_easter() function. You can use this function in MySQL statements to calculate Easter Sunday for any given year:
mysql> select f_easter(year(now()));
+-----------------------+
| f_easter(year(now())) |
+-----------------------+
| 2010-04-04            |
+-----------------------+
1 row in set (0.00 sec)
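The body of f_easter() is not shown in this excerpt, so the following is only a sketch of how such a calculation can work, not the author's implementation. It uses the well-known anonymous Gregorian (Meeus/Jones/Butcher) algorithm, written in Python rather than as a MySQL stored function; the name `easter_sunday` is a hypothetical stand-in.

```python
from datetime import date


def easter_sunday(year: int) -> date:
    """Gregorian Easter Sunday via the anonymous (Meeus/Jones/Butcher) algorithm."""
    a = year % 19                      # position in the 19-year lunar cycle
    b, c = divmod(year, 100)           # century and year within the century
    d, e = divmod(b, 4)                # leap-century corrections
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30 # offset of the paschal full moon
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7  # days until the following Sunday
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


print(easter_sunday(2010))  # 2010-04-04, matching the query result above
```

The same integer arithmetic translates directly to SQL using DIV and MOD, which is presumably why a calculation like this fits comfortably in a stored function.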

Pentaho Solutions
Pentaho Solutions, Business Intelligence and Data Warehousing with Pentaho and MySQL. By Roland Bouman and Jos van Dongen, Wiley 2009. Page count: about 570 pages. (Here’s a link to the publisher’s site.)
The book is big in part because it’s about a GUI tool, so there is the requisite number of screenshots (but not too many). It is structured into four parts, each on a different topic.
The first part is 4
