Showing entries 31 to 60 of 67

Displaying posts with tag: startups

Why generalists are better at scaling the web

Recently at Surge 2011, the annual conference on scalability and performance, Google's CIO Ben Fried gave an illuminating keynote address. His main insight was that generalists are the people who will lead engineering teams in successfully scaling the web.

In a world where the badge of Specialist or Expert is prized, this was a refreshing perspective from an industry bigwig. As tech professionals, or any professionals for that matter, we don't welcome the label of generalist. The word suggests a jack-of-all-trades and master of none. But the generalist is no less an expert than the specialist. Generalists can get their hands greasy with the tools to fix bugs in the

  [Read more...]
Scale Quickly Like Birchbox – Startup Scalability 101

One of the great things about the Internet is how it has made it easier to put great ideas into practice. Whether the ideas are about improving people’s lives or a new way to sell an old-fashioned product, there’s nothing like a good little startup tale of creative disruption to deliver us from something old and tired.

We work with a lot of startup firms and we love being part of the atmosphere of optimism and ingenuity, peppered with a bit of youthful zeal - there is something very indie-rock-and-roll about it. But whether they are just starting out or already picking up pace, every startup faces the same challenges in scaling a business. Recently, we were reminded of this

  [Read more...]
What is the biggest challenge for Big Data?

Often I think about the challenges that organizations face with “Big Data”. While Big Data is a generic and overused term, what I am really referring to is an organization's ability to disseminate, understand and ultimately benefit from increasing volumes of data. It is almost without question that in the future customers will be won/lost, competitive advantage will be gained/forfeited and businesses will succeed/fail based on their ability to leverage their data assets.

What I think the near-term challenges are may be surprising. Largely, I don’t think they are purely technical. There are enough wheels in motion now to almost guarantee that data accessibility will continue to improve at a pace in line with the increase in data volume. Sure, there will continue to be lots of interesting innovation with technology, but

  [Read more...]
NSA, Accumulo & Hadoop

I read yesterday that the NSA has submitted a proposal to Apache to incubate its Accumulo platform. This, according to the description, is a key/value store built over Hadoop which appears to provide similar functionality to HBase, except that it provides “cell level access labels” to allow fine-grained access control. This is something you would expect as a requirement for many applications built at government agencies like the NSA. But it is also very important for organizations in health care, law enforcement and similar fields, where strict control is required over large volumes of privacy-sensitive data.

An interesting part of this is how it highlights the acceptance of Hadoop.

  [Read more...]
Top 3 Questions From Clients

1. This page or area of the website is very slow, why?

There are a lot of components that make up modern websites, and a lot of places to get stuck in the mud. Website performance starts with the browser and what caching it is doing, then the user's bandwidth to your server, what the webserver is doing (caching or not, and how), whether the webserver has sufficient memory, what the application code is doing and, lastly, how it is interacting with the backend database.

With all this complexity, it's no wonder so many sites struggle. Typically this type of analysis starts with some load testing, to stress test your setup, so you can watch for leaks. Then

  [Read more...]
Specialty Technology Consultant – New York Scalability Consultant – MySQL & EC2 Scalability

Amazon EC2 and cloud computing offer great promise for startups to ramp up their online presence quickly. Navigate the challenges with a strong partner. We bring 20 years of experience to the table with each new client.

  • Scaling Web Applications
  • MySQL High Availability in Amazon EC2
  • Amazon Multi-AZ Deployments
  • Amazon RDS Deployments
  • Migrating to Amazon EC2
  • Migrating to MySQL
  • Managing Backups and Disaster Recovery in the Cloud
  • Horizontal Scalability of MySQL on EC2
  • Horizontal Scalability on Cloud Hosted Servers
  • Evaluating Cloud Providers
  • Evaluating MySQL
  [Read more...]
Reply to The Future of the NoSQL, SQL, and RDBMS Markets

Conor O'Mahony over at IBM wrote a good post on a favorite topic of mine, “The Future of the NoSQL, SQL, and RDBMS Markets”. If this is of interest to you then I suggest you read his original post. I replied in the comments but thought I would also repost my reply here.

-----------------------------------------------------------------------------------------------

Hi Conor, I wish it were as simple as SQL & RDBMS being good for this and NoSQL being good for that. For me at least, the waters are much muddier than that.

The benefit of SQL & RDBMS is

  [Read more...]
Building data startups: Fast, big, and focused

This is a written follow-up to a talk presented at a recent Strata online event.

A new breed of startup is emerging, built to take advantage of the rising tides of data across a variety of verticals and the maturing ecosystem of tools for its large-scale analysis.

These are data startups, and they are the sumo wrestlers on the startup stage. The weight of data is a source of their competitive advantage. But like their sumo mentors, size alone is not enough. The most successful of data startups must be fast (with data), big (with analytics), and focused (with services).

Setting the stage: The attack of the exponentials

The question of

  [Read more...]
IA Ventures - Jobs shout out

My friends over at IA Ventures are looking for both an Analyst and an Associate to join their team. If Big Data, New York and start-ups are in your blood then I can’t think of a better VC to be involved in.

From the IA blog:

"IA Ventures funds early-stage Big Data companies creating competitive advantage through data and we’re looking for two start-up junkies to join our team – one full-time associate / community manager and one full time analyst. Because there are only four of us (we’re a start-up ourselves, in fact), we’ll need you to help us investigate companies, learn about industries, develop investment theses, perform internal operations, organize

  [Read more...]
Realtime Data Pipelines

In life there are really two major types of data analytics. Firstly, we don’t know what we want to know – so we need analytics to tell us what is interesting. This is broadly called discovery. Secondly, we already know what we want to know – we just need analytics to tell us this information, often repeatedly and as quickly as possible. This is called anything from reporting or dashboarding to more general data transformation.

Typically we use the same techniques to achieve both. We shove lots of data into a repository of some form (SQL, MPP SQL, NoSQL, HDFS etc.) then run queries/jobs/processes across that data to retrieve the information we care about.

Now this makes sense for data discovery.  If we don’t know what we want to know, having lots of data in a big pile that we can slice and dice

  [Read more...]
What Scales Best?

It is a constant, yet interesting debate in the world of big data.  What scales best?  OldSQL, NoSQL, NewSQL?

I have a longer post coming on this soon. But for now, let me make the following comments. Generally, most data technologies can be made to scale - somehow. Scaling up tends not to be too much of an issue; scaling out is where the difficulties begin. Yet most data technologies can be scaled in one form or another to meet a data challenge, even if the result isn’t pretty.

What is best?  Well that comes down to the resulting complexity, cost, performance and other trade-offs.  Trade-offs are key as there are almost always significant concessions to be made as you scale up.

In a recent example of mine, I was looking at the scalability aspects of MySQL. In particular, MySQL Cluster

  [Read more...]
Who/What to acquire next

Well, as predicted, with Aster Data recently being picked up by Teradata, most of the key new-generation MPP distributed analytics vendors have been acquired (Aster Data, Vertica, Netezza & Greenplum). This had to happen and was expected to happen. The MPP Analytics startup “revolution” is over and these technologies will now be integrated into the mainstream.

So what’s next? As we know, if you are a massive multi-national software company it is a lot less risky to incrementally innovate and leave the development of “game changing” technologies to startups that can be acquired after

  [Read more...]
What’s hot in Big Data startups?

There are so, so many big data platforms in play at the moment that it can be confusing for developers to know where to start. For startups it used to be simple: MySQL. But dust clouds were created when all the NoSQL platforms started to crash the party 18 months or so ago. I do see the dust beginning to settle, and we are starting to see some market “leaders” appear. A very unscientific approach is to list the technologies I hear about in the “big data startup” world on a daily basis. These are, in no particular order:

  • MySQL - yes it is still very much hanging in there despite the Oracle acquisition.  MySQL has been helped by technologies such as AWS RDS and Xeround making it more digestible for big data startups who want
  [Read more...]
The problem with a full box of big data tools

“NoSQL”, for lack of a better name, is a generic term that describes any data management system that does not use SQL as a query interface. Generally this means any data management system that is non-relational, but the term has also been stretched so far as to test the boundaries of what constitutes a data management system at all (such as Hadoop).

Early on (a couple of years back in NoSQL time) when the term was coined I think the positioning was much more aggressive, but more recently this has been softened so that NoSQL is now commonly quoted as meaning “Not only SQL” or “next generation databases” (whatever that means). The common message you get now is something along the lines of NoSQL systems are

  [Read more...]
Big Data innovation marches on

With IBM intending to acquire Netezza, the predicted consolidation in the distributed analytics market is well underway. Recent deals include EMC/Greenplum, Teradata/Kickfire and now IBM/Netezza. A good breakdown of this deal is on Curt’s blog. There is still more to go of course, with one of the crown jewels, Vertica, still ripe for the picking.

What this indicates is that MPP analytics has moved from the innovative edge into the mainstream market and now the more risk

  [Read more...]
VLDB 2010

I will be at VLDB 2010 next week. If anyone reading this blog is attending and wants to catch up to discuss startups and innovation in DB, NoSQL, Big Data etc., drop me a line and I will try to meet up.

Why software startups decide to patent ... or not

Guest blogger Pamela Samuelson is the Richard M. Sherman Distinguished Professor of Law and Information at the University of California, Berkeley. She teaches courses on intellectual property, cyberlaw, and information privacy, and she has written and spoken extensively about the challenges that new information technologies pose for traditional legal regimes. A version of this material is scheduled to appear in the November 2010 issue of Communications of the ACM.

Two-thirds of the approximately 700 software entrepreneurs who participated in the 2008 Berkeley Patent Survey report that they neither have nor are seeking patents for innovations embodied in their products and services. These entrepreneurs rate patents as the least important mechanism among seven options for attaining

  [Read more...]
Four short links: 25 June 2010

  • Membase -- an open-source (Apache 2.0 license) distributed, key-value database management system optimized for storing data behind interactive web applications. These applications must service many concurrent users; creating, storing, retrieving, aggregating, manipulating and presenting data in real-time. Supporting these requirements, membase processes data operations with quasi-deterministic low latency and high sustained throughput. (via Hacker News)
  • Sergey's Search (Wired) -- Sergey Brin, one of the Google founders, learned he had a gene allele that gave him much higher odds of getting Parkinson's. His response has been to help medical research, both with money and through
  [Read more...]
    Riptano for Cassandra

    Cassandra is one of the most interesting NoSQL platforms at the moment. And by most interesting, what I really mean is the most clearly justifiable. Some NoSQL platforms offer new data models, improved query interfaces and/or good single node performance through relaxed consistency models. As a database guy, however, I find the justification for throwing out the RDBMS baby and bathwater still difficult at this point, as NoSQL platforms tend to be highly focused on one aspect of data management and very immature in all other areas. Cassandra is somewhat different as it is more mature in a number of key areas (albeit still immature in others). These are areas that can make Cassandra more justifiable for the right project, when compared with a more  [Read more...]
    NoSQL Buzz

    I have noticed a definite increase in NoSQL buzz over the last few months. This is partly confirmed by Google Trends, a service that shows data on how search topics rank:


    The last couple of months have seen a dramatic rise in both the number of searches and the number of news items relating to NoSQL.

    But the traditionalists need not fret yet: interest in NoSQL is still but a blip on the data management radar, as demonstrated by this comparison between NoSQL and MySQL search rankings:


      [Read more...]
    What is Big Data?

    One of my favorite terms at the moment is “Big Data”. While all terms are by nature subjective, in this post I will try to explain what Big Data means to me.

    So what is Big Data?

    Big Data is the “modern scale” at which we are defining our data usage challenges. Big Data begins at the point where we need to seriously start thinking about the technologies used to drive our information needs.

    While Big Data as a term seems to refer to volume, this isn’t the case. Many existing technologies have little problem physically handling large volumes (TB or PB) of data. Instead the Big Data



      [Read more...]
    Analytics at Twitter

    Last week I spent some time speaking with Kevin Weil, head of analytics at Twitter. Twitter, from a technology perspective, has had a bit of a hard time due to the stability issues of its early days. Kevin was keen to point out that he feels this was due to the incomparable growth Twitter was experiencing at the time and the constant struggle to keep up. Kevin was also keen to show that Twitter prides itself on striving for engineering excellence, creating and contributing to new technologies and generally helping to push the boundaries forward. Our conversation naturally centered on analytics at Twitter.

    Twitter, like many web 2.0 apps, started life as a MySQL-based RDBMS application. Today, Twitter is still

      [Read more...]
    Startup Weekend

    I attended the Bay Area Startup Weekend in Mountain View this past weekend. This was the first such event I have attended and it was an amazing experience – so I thought I’d share it.

    The idea behind the event was that a bunch of folks would show up, some of them would pitch ideas for new startups and the others would join them if they liked the idea and/or had the necessary skills to build it. The goal was to build a working prototype over the course of the weekend.

    This seemed like an impossible task to me – not the part where you build a prototype but the idea that random people could come together and actually form a startup. And on talking to one of the organizers, he confirmed that the goal was really to form a community, help people get to know each other – sometimes the team does

      [Read more...]
    Is the RDBMS doomed (yada yada yada) ?

    I was speaking with Michael Stonebraker this morning. I mentioned that lately many have been referencing comments he has made over the last couple of years, and that many had interpreted them as implying the RDBMS is “doomed”. Mike has been saying the same thing for years, but the current NoSQL movement seems to have picked up on this and is highlighting one of the RDBMS's own

      [Read more...]
    OLTP back into focus

    I haven’t blogged in over a month now. This is for a number of reasons. Firstly, I have been flat out with various activities, including a trip to VLDB in Lyon mid-month. Secondly, a lot of the companies I have spoken with this month aren’t ready to speak publicly, hence no blog posts resulting from those discussions.

    However, there has been a whiff of change in the air in terms of focus that is interesting and worth highlighting. After years of innovation around data analytics, OLTP is starting to make a comeback and reclaim some of the limelight. Much more on this between now and the end of the year, but a couple of things to watch:

      [Read more...]
    VectorWise


    I was fortunate enough to speak with Marcin Zukowski earlier about VectorWise. If you missed it, VectorWise came out of stealth mode a day or two ago. They have announced a partnership with Ingres and are essentially claiming impressive analytic RDBMS performance gains on conventional hardware.

    To start with, a key message that I think needs to be communicated here is that this is not a product announcement. Ingres and VectorWise have announced a partnership in which they of course plan to build products together, but today those products are still in the works.

    VectorWise is a spin out of


      [Read more...]
    The NoSQL community needs to engage the DBAs

    The NoSQL movement has been gaining some steam lately, with discussion forums and mailing lists popping up all around the web. Despite having a career that has been centered on the RDBMS, I have made no secret that I think we have gone too far with our RDBMS-for-everything mindset. I think we need to add a few more tools back into our data toolbox.

    Today, 99.5% of new data-centric developments will use an RDBMS by default. Maybe 0.5% will consider using something as obtuse as a NoSQL platform. From experience I know the majority of people discussing NoSQL platforms today are web developers. In

      [Read more...]
    HamsterDB

    This post was a bit of a test to see if I could write a serious post about a database platform called Hamster.  I think I just made it :)

    With all the noise over key/value stores recently, we should keep in mind that this technology isn’t exactly new. It is being applied to new problems, but many of the foundations have been around for decades. Probably the oldest of them all, Berkeley DB came into existence during the mid-’80s and now has over 200 million deployments (according to the Oracle web site).

    HamsterDB, while not having the same pedigree as Berkeley DB, has been steadily worked on by

      [Read more...]
    HadoopDB discussion with Daniel Abadi


    I spoke to Daniel Abadi this morning about his HadoopDB announcement that came out a couple of days back. I am sure this has been a busy time for Daniel and his team at Yale, as HadoopDB has been getting a lot of interest, which I am sure will continue to build.

    Some notes from our discussion:

    • HadoopDB is primarily focused on high scalability and the required availability at scale.  Daniel questions current MPP’s ability to truly scale past 100 nodes whereas Hadoop has real examples on 3000+ nodes.
    • HadoopDB like many MPP analytical database platforms

      [Read more...]
    Forrester's EDM Wave

    Forrester put out its Enterprise Data Management Q2 2009 report a few days ago. You can buy it from Forrester, but it also seems to now be available for free from Microsoft here. I don’t actively seek out these reports as they usually just reinforce common knowledge (this one was no exception). However, as it turned up, I managed to find some time on the weekend for a quick read through.

    Few surprises in this report, but some key mentions are:

    • DBMS market expected to grow 8% annually
    • IBM, Microsoft & Oracle own 88% of the DBMS market (by revenue)
    • Current market estimated at $27 billion, $32 billion by 2013
    • IBM,
      [Read more...]
    Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

    Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.