|Showing entries 1 to 30 of 30|
We’re proud to announce that Jim Tommaney, CTO of Calpont, has just signed on to speak at the MySQL & Cloud Database Solutions Day, hosted by SkySQL and MariaDB - taking place next Friday, April 26, directly after Percona Live: MySQL Conference & Expo.
It is an exciting and interesting time to be involved in data. More influential change has occurred in database management in the last 18 months than in the previous 18 years. New technologies such as NoSQL and Hadoop, and radical redesigns of existing technologies such as NewSQL, will dramatically change how we manage data moving forward.
These technologies bring with them possibilities both in terms of the scale of data
Often I think about the challenges that organizations face with “Big Data”. While Big Data is a generic and overused term, what I am really referring to is an organization’s ability to disseminate, understand and ultimately benefit from increasing volumes of data. It is almost without question that in the future customers will be won/lost, competitive advantage will be gained/forfeited and businesses will succeed/fail based on their ability to leverage their data assets.
What I think the near-term challenges are may be surprising. Largely, I don’t think they are purely technical. There are enough wheels in motion now to almost guarantee that data accessibility will continue to improve at a pace in line with the increase in data volume. Sure, there will continue to be lots of interesting innovation with technology, but[Read more...]
I read yesterday that the NSA has submitted a proposal to Apache to incubate its Accumulo platform. This, according to the description, is a key/value store built over Hadoop which appears to provide similar functionality to HBase, except that it provides “cell level access labels” to allow fine-grained access control. This is something you would expect as a requirement for many applications built at government agencies like the NSA. But it is also very important for organizations in health care, law enforcement and similar fields, where strict control over access to large volumes of privacy-sensitive data is required.
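To make the idea of cell-level access labels concrete, here is a minimal Python sketch of a key/value store in which every cell carries its own set of labels and a scan returns only the cells the caller is authorized to see. The names `LabeledStore`, `put` and `scan` are hypothetical and chosen for illustration; this is not Accumulo's API. It also assumes a simplification: a reader must hold every label attached to a cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stored value plus the labels a reader must hold to see it."""
    row: str
    column: str
    value: str
    labels: frozenset = frozenset()


class LabeledStore:
    """Hypothetical in-memory store; illustrates the concept only."""

    def __init__(self):
        self._cells = []

    def put(self, row, column, value, labels=()):
        # Labels are attached per cell, not per table or per row.
        self._cells.append(Cell(row, column, value, frozenset(labels)))

    def scan(self, row, authorizations):
        # Assumption: a cell is visible only if the reader holds ALL of
        # its labels. Unlabeled cells are visible to everyone.
        auths = frozenset(authorizations)
        return {
            c.column: c.value
            for c in self._cells
            if c.row == row and c.labels <= auths
        }


store = LabeledStore()
store.put("patient:1", "name", "Alice", labels={"clinical"})
store.put("patient:1", "diagnosis", "asthma", labels={"clinical", "physician"})
store.put("patient:1", "room", "12B")

nurse_view = store.scan("patient:1", {"clinical"})
physician_view = store.scan("patient:1", {"clinical", "physician"})
public_view = store.scan("patient:1", set())
```

In this sketch the same row yields three different views depending on the reader's authorizations, which is the property that matters for privacy-sensitive data: the filtering happens inside the store, per cell, rather than in each application.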
An interesting part of this is how it highlights the acceptance of Hadoop.[Read more...]
Conor O'Mahony over at IBM wrote a good post on a favorite topic of mine, “The Future of the NoSQL, SQL, and RDBMS Markets”. If this is of interest to you then I suggest you read his original post. I replied in the comments but thought I would also repost my reply here.
Hi Conor, I wish it was as simple as SQL & RDBMS is good for this and NoSQL is good for that. For me at least, the waters are much muddier than that.
The benefit of SQL & RDBMS is[Read more...]
My friends over at IA Ventures are looking for both an Analyst and an Associate to join their team. If Big Data, New York and start-ups are in your blood then I can’t think of a better VC to be involved with.
From the IA blog:
"IA Ventures funds early-stage Big Data companies creating competitive advantage through data and we’re looking for two start-up junkies to join our team – one full-time associate / community manager and one full time analyst. Because there are only four of us (we’re a start-up ourselves, in fact), we’ll need you to help us investigate companies, learn about industries, develop investment theses, perform internal operations, organize
In life there are really two major types of data analytics. Firstly, we don’t know what we want to know, so we need analytics to tell us what is interesting. This is broadly called discovery. Secondly, we already know what we want to know, and we just need analytics to tell us this information, often repeatedly and as quickly as possible. This is called anything from reporting or dashboarding through to more general data transformation and so on.
Typically we use the same techniques to achieve both. We shove lots of data into a repository of some form (SQL, MPP SQL, NoSQL, HDFS etc.), then run queries/jobs/processes across that data to retrieve the information we care about.
Now this makes sense for data discovery. If we don’t know what we want to know, having lots of data in a big pile that we can slice and dice[Read more...]
It is a constant, yet interesting debate in the world of big data. What scales best? OldSQL, NoSQL, NewSQL?
I have a longer post coming on this soon. But for now, let me make the following comments. Generally, most data technologies can be made to scale, somehow. Scaling up tends not to be too much of an issue; scaling out is where the difficulties begin. Yet most data technologies can be scaled in one form or another to meet a data challenge, even if the result isn’t pretty.
What is best? Well that comes down to the resulting complexity, cost, performance and other trade-offs. Trade-offs are key as there are almost always significant concessions to be made as you scale up.
As a recent example, I was looking at the scalability aspects of MySQL. In particular, MySQL Cluster[Read more...]
Since this article was written, HPCC has undergone a number of significant changes and updates. These address some of the criticism voiced in this blog post, such as the license (updated from AGPL to Apache 2.0) and integration with other tools. For more information, refer to the comments placed by Flavio Villanustre and Azana Baksh.
The original article can be read unaltered below:
Yesterday I noticed a tweet by Andrei Savu. This prompted me to read the related GigaOM article and then check out the [Read more...]
Well, as predicted, with Aster Data recently being picked up by Teradata, most of the key new-generation MPP distributed analytics vendors have now been acquired (Aster Data, Vertica, Netezza & Greenplum). This had to happen and was expected to happen. The MPP Analytics startup “revolution” is over and these technologies will now be integrated into the mainstream.
So what’s next? As we know, if you are a massive multi-national software company it is a lot less risky to incrementally innovate and leave the development of “game changing” technologies to startups that can be acquired after[Read more...]
There are so, so many big data platforms in play at the moment that it can be confusing for developers to know where to start. For startups the answer used to be simple: MySQL. But dust clouds were created when all the NoSQL platforms started to crash the party 18 months or so ago. I do see the dust beginning to settle, though, and we are starting to see some market “leaders” appear. A very unscientific approach is to list the technologies I hear about in the “big data startup” world on a daily basis. These are, in no particular order:
“NoSQL”, for lack of a better name, is a generic term that describes any data management system that does not use SQL as a query interface. Generally this means any data management system that is non-relational, but the term has also been stretched to the boundaries of what constitutes a data management system at all (such as Hadoop).
Early on (a couple of years back in NoSQL time) when the term was coined I think the positioning was much more aggressive, but more recently this has been softened, so now NoSQL is commonly quoted as meaning “Not only SQL” or “next generation databases” (whatever that means). The common message you get now is something along the lines of NoSQL systems are[Read more...]
With IBM intending to acquire Netezza, the predicted consolidation in the distributed analytics market is well underway. Recent deals include EMC/Greenplum, Teradata/Kickfire and now IBM/Netezza. A good breakdown of this deal is on Curt’s blog. There is still more to go of course, with one of the crown jewels, Vertica, still ripe for the picking.
What this indicates is that MPP analytics has moved from the innovative edge into the mainstream market and now the more risk[Read more...]
This is the first of a series of posts about business intelligence tools, particularly OLAP (online analytical processing) tools using MySQL and other free open source software. OLAP tools are a part of the larger topic of business intelligence, a topic that has not had a lot of coverage on MPB. Because of this, I am going to start out talking about these topics in general, rather than getting right to the gritty details of their performance.
I plan on covering the following topics:
I work in all markets of the database industry, from web & startup through the largest and most established enterprises. And to be completely honest, the name Ingres has not come up in conversation very much at all. 10 years ago maybe more often, but recently not all that much. But Ingres has been quietly ticking away. Despite being largely off the radar, they still have a sizable and loyal customer base, global offices and a focused & dedicated management team. And importantly they have an open source business model which actually appears to be working.
One of my favorite terms at the moment is “Big Data”. While all terms are by nature subjective, in this post I will try and explain what Big Data means to me.
f_easter() function. You can use this function in MySQL statements to calculate Easter Sunday for any given year:
mysql> select f_easter(year(now()));
+-----------------------+
| f_easter(year(now())) |
+-----------------------+
| 2010-04-04            |
+-----------------------+
1 row in set (0.00 sec)
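The date in the result above can be cross-checked outside MySQL. The following is a minimal Python sketch using the well-known anonymous Gregorian (Meeus/Jones/Butcher) algorithm for Western Easter. It is not the implementation behind f_easter(), which is not shown in this excerpt, and the name `easter_sunday` is a hypothetical one chosen for illustration.

```python
from datetime import date


def easter_sunday(year: int) -> date:
    """Western (Gregorian) Easter Sunday via the anonymous Gregorian algorithm."""
    a = year % 19                      # position in the 19-year Metonic cycle
    b, c = divmod(year, 100)           # century and year within the century
    d, e = divmod(b, 4)                # leap-century corrections
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30  # offset to the paschal full moon
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7  # days until the following Sunday
    m = (a + 11 * h + 22 * l) // 451
    month, day_index = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day_index + 1)


print(easter_sunday(2010))  # 2010-04-04
```

For 2010 the sketch returns April 4, which agrees with the f_easter() result shown in the session above.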
Oracle has published their promises which have reportedly gone a long way to appeasing the EU, so the likely outcome is the takeover of Sun will be approved in January.
Pentaho Solutions, Business Intelligence and Data Warehousing with Pentaho and MySQL. By Roland Bouman and Jos van Dongen, Wiley 2009. Page count: about 570 pages. (Here’s a link to the publisher’s site.)
The book is big in part because it’s about a GUI tool, so there are the requisite number of screenshots (but not too many). It is structured into four parts, each on a different topic.
The first part is 4[Read more...]
Last week I spent some time speaking with Kevin Weil, head of analytics at Twitter. Twitter, from a technology perspective, has had a bit of a hard time due to stability issues in its early days. Kevin was keen to point out that he feels this was due to the incomparable growth Twitter was experiencing at the time and its constant struggle to keep up. Kevin was also keen to show that Twitter prides itself on striving for engineering excellence, on creating and contributing to new technologies, and on generally helping to push the boundaries forward. Our conversation naturally centered on analytics at Twitter.
FYI - the thoughts here have been gathered from conversations with several individuals, including an interesting conversation yesterday. As these conversations were off the record I won’t name names here but thanks to those people.