Showing entries 61 to 90 of 169

Displaying posts with tag: big data

Introducing Data Fabric Design for Commodity SQL Databases
Data management is undergoing a revolution. Many businesses now depend on data sets that vastly exceed the capacity of DBMS servers. Applications operate 24x7 in complex cloud environments using small and relatively unreliable VMs. Managers need to act on new information from those systems in real time. Users want constant and speedy access to their data in locations across the planet.

It is tempting to think popular SQL databases like MySQL and PostgreSQL have no place in this new world.  They manage small quantities of data, lack scalability features like parallel query, and have weak availability models.  One reaction is to discard them and adopt alternatives like Cassandra or MongoDB.  Yet open source SQL databases have tremendous strengths:  simplicity, robust transaction support, lightning fast operation, flexible

  [Read more...]
Tracking 5.3 Billion Mutations: Using MySQL for Genomic Big Data

University of Montreal Tracks Genomic Data With Tokutek’s TokuDB.

Faster insertion rates, improved scalability and agility support lab’s fast growing research database as it grows from 100s of GBs to 1 TB and beyond.

Issue addressed: MySQL database used for genomic research must be able to quickly ingest huge amounts of incoming data – hundreds of thousands of records every day. It also must be able to retrieve data quickly in response to a diverse set of research requests.

Enabling the Hunt for New Cures for Diseases by Seamlessly Processing Billions of Mutations  [Read more...]

Talking at the SkySQL Roadshow in Stockholm
The SkySQL Roadshow is coming to Stockholm on Feb 7; come by and meet us. I'll be ending the day with a talk on Big Data. It will be a fairly generic talk with some MySQL relevance, but with the focus on Big Data in general.

I haven't been blogging much recently, but there are reasons for that. Since Dec 1 I have been the proud father of twins, a little boy and a little girl. I have yet to teach them to write proper SQL (they have particular issues with subqueries), but we'll get there. In order to create the usual mess of things and to make sure things are at the brink of running out of control, we decided to renovate our flat in the middle of all this. But I'll get there, and once we have a new kitchen installed, I'll do some more blogging; I have some things piled up to write about.

/Karlsson
The Results Are In!

We wanted to take a moment to say thanks to all of our customers and to the wider MySQL and MariaDB community. Today we announced a doubling of our customer base for the year ending December 31, 2012. Significant milestones over the last year included new technology and service partnerships, several awards, rapid hiring, as well as three upgrades to TokuDB®. We even dabbled in some MongoDB benchmarks. And to fuel continued growth in 2013, we secured additional venture capital funding last November.

Did You Hear? NASA Uses TokuDB for Big Data with MySQL!

To read the full press release and learn more,

  [Read more...]
Webinar: Introduction to TokuDB v6.6

TokuDB® is a proven solution that scales MySQL® and MariaDB® from GBs to TBs with unmatched insert and query speed, compression, replication performance and online schema flexibility. Tokutek’s recently launched TokuDB v6.6 delivers all of these features and more, with additional improvements in multi-client, fast SQL updates, and in-memory performance.

Date: January 15th
Time: 2 PM EST / 11 AM PST
REGISTER TODAY

Topics will include:

  • Performance – With a 10x or more improvement in insertions and indexing, TokuDB delivers faster, more complex ad hoc queries in live production systems without rewriting or tuning applications. Offering high performance even when tables are too large for memory, TokuDB scales MySQL and MariaDB


  [Read more...]
The Data Day, Two days: January 7/8, 2013

SAP’s HANA – a floor wax *and* a dessert topping?

For 451 Research clients: SAP’s HANA database – a floor wax *and* a dessert topping? bit.ly/13dmDCH

— Matt Aslett (@maslett) January 8, 2013

Attivio has secured $8 million in new growth funding from General Electric Pension Trust | bit.ly/ZwXPFG

— Attivio (@attivio) January 7, 2013

Why We Need To Kill “Big Data” | TechCrunch tcrn.ch/ZpbnDl

— Mortar (@mortardata) January 5,

  [Read more...]
Marinating in 2013

What a flashback this week. Staring at a text terminal trying to establish a connection with a remote server, I began to fret whether I would get my homework assignment done on time. My mind raced back to college nights years ago in the Fishbowl, hunched over an Athena workstation. Would this be another late night fueled by Jolt cola in order to get my problem set done?

Thankfully, no!

Embarking on my first software class in quite a while was relatively painless, and I have Sheeri Cabral and her detailed guidance to thank. This week I started the

  [Read more...]
Move over Marcia: Top Ten for 2012

Well, it’s that time of the year again for top ten lists. There have been many versions showing up on the web the last few days, including Time Magazine’s “Top 10 Everything of 2012” list, with 55 wide-ranging lists!

Last year we started using Google Analytics to see what content for blogs was most popular on Tokutek.com and generated a 2011 top ten list, ending up with a few surprises.  This year saw spikes in some interesting areas as well, including flash performance, NASA and Big Data, and MongoDB.

Without further ado, here is the top ten list for 2012:

10. Announcing TokuDB v6.1

  [Read more...]
The “Big Data” buzzword finally gets a real definition

We’ve all heard the term “Big Data” thrown around a fair amount in the last several years, ever since the rise of Hadoop and other distributed storage methods. But “Big Data” has always been a subjective term that hinges on perspective; what one engineer considers big can be vastly different from what another does.

However, there’s finally a definite description that says Big Data no matter what perspective you operate from: “That facility by my calculations that I submitted to the court for the Electronic Frontiers Foundation against NSA would hold on the order of 5 zettabytes of data. Just that current storage capacity is being advertised on the web that you can buy. And that’s not talking about what they have in the near future.” You can read more about the facility and its purpose here:

  [Read more...]
On Big Data, Analytics and Hadoop. Interview with Daniel Abadi.
“Some people even think that “Hadoop” and “Big Data” are synonymous (though this is an over-characterization). Unfortunately, Hadoop was designed based on a paper by Google in 2004 which was focused on use cases involving unstructured data (e.g. extracting words and phrases from Webpages in order to create Google’s Web index). Since it was not [...]
Small Data

There is obviously much being written these days about Big Data. While the term has many different meanings to many different folks, our MySQL and MariaDB customers tend to find their data to be uncomfortably big when the tables become too large for memory. In this case, more storage has to be acquired, performance starts to lag, and making changes to the schema becomes a challenge.

TokuDB addresses these issues for big MySQL instances by delivering high compression rates, faster insertion and query performance, and agile

  [Read more...]
Two Cons against NoSQL. Part I.
Two cons against NoSQL data stores read like this: 1. It’s very hard to move data out from one NoSQL to some other system, even other NoSQL. There is a very hard lock-in when it comes to NoSQL. If you ever have to move to another database, you basically have to re-implement a lot [...]
Webinar: MongoDB and Fractal Tree Indexes

This webinar covers the basics of B-trees and Fractal Tree Indexes, the benchmarks we’ve run so far, and the development road map going forward.

Date: November 13th
Time: 2 PM EST / 11 AM PST
REGISTER TODAY

Topics will include:

  • What is a Fractal Tree Index?
  • How do Fractal Trees compare with B-Trees?
  • What can a Fractal Tree do for MongoDB performance?
  • Benchmarks + Gotchas
  • What’s next

We look forward to having you join the webinar. We also hope that by sharing these results with



  [Read more...]
On Eventual Consistency– Interview with Monty Widenius.
“For analytical things, eventual consistency is ok (as long as you can know after you have run them if they were consistent or not). For real world involving money or resources it’s not necessarily the case.” — Michael “Monty” Widenius. In a recent interview, I asked Justin Sheehy, Chief Technology Officer at Basho Technologies, maker [...]
Presenting “MongoDB and Fractal Tree Indexes” at MongoDB Boston 2012

I’ll be presenting “MongoDB and Fractal Tree Indexes” at MongoDB Boston 2012 on October 24th.  My presentation covers the basics of B-trees and Fractal Tree Indexes, the benchmarks we’ve run so far, and the development road map going forward.

I’ve been to this one-day conference twice now, and both times I came away with a better understanding of MongoDB’s capabilities and use cases, and with many questions answered via their deep technical dives. I highly recommend that current MongoDB users and anyone considering a MongoDB project attend; it appears that seats are still available.

First of New NSF Big Data Grants Go to Tokutek Founders

The core technology behind Tokutek is based on the academic research by our founders: Michael Bender, Bradley Kuszmaul and Martin Farach-Colton.  They are all still in academia, in addition to their work at Tokutek.

Back in March, the White House kicked off a new Initiative for Big Data.  Last week, the National Science Foundation announced the first interagency grants for this.  Eight awards were given, and our own Michael Bender and Martin Farach-Colton, along with Robert Johnson of Stony Brook University, received one of

  [Read more...]
Scaling MySQL and MariaDB to TBs: Interview with Martín Farach-Colton.
“While I believe that one size fits most, claims that RDBMS can no longer keep up with modern workloads come in from all directions. When people talk about performance of databases on large systems, the root cause of their concerns is often the performance of the underlying B-tree index”– Martín Farach-Colton. Scaling MySQL and MariaDB [...]
Log Buffer #289, A Carnival of the Vanities for DBAs
Oracle Open World 2012, this year, was all about Cloud, 12c, Exadata, Fusion, SuperClusters, social media, content management and much more. From operating systems to databases, and from applications to interactive media, professionals all around the world presented, attended, and networked in San Francisco. MySQL’s professionals also rocked massively. SQL Server bloggers also remained actively [...]
Forbes: “Tokutek Makes Big Data Dance”

Recently, our CEO, John Partridge, had a chance to talk about novel database technologies for “Big Data” with Peter Cohan of Forbes.

According to the article, “Fractal Tree indexing is helping organizations analyze big data more efficiently due to its ability to improve database efficiency thanks to faster ‘database insertion speed, quicker input/output performance, operational agility, and data compression.’” As a start-up based on “the first algorithm-based breakthrough in the database world in 40 years,” Tokutek is following in the footsteps of firms such as Google and RSA, which also relied on novel algorithm advances as core to their technology.

To read the full article, and

  [Read more...]
Tips for Leveraging Oracle OpenWorld 2012 From Pythian Marketing
With Oracle OpenWorld just around the corner & MySQL Connect already underway I can’t believe yet another year has passed.  This is my third OOW and I must have a following as folks are already reaching out to me on twitter @pythiansimmons (log buffer lady seems to be a handle I can’t seem to shake). [...]
Announcing TokuDB v6.5: Optimized for Flash

We are excited to announce TokuDB® v6.5, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.

This version offers optimization for Flash as well as more hot schema change operations for improved agility.

We’ll be posting more details about the new features and performance, so here’s an overview of what’s in store.

Flash

TokuDB v6.5 continues the great Toku-tradition of fast insertions. On flash drives, we show an order-of-magnitude (9x) faster insertion rate than InnoDB. TokuDB’s standard compression works just as well on flash and helps you get the most out of your  [Read more...]
Data Fabrics and Other Tales: Percona Live and MySQL Connect
The fall conference season is starting.  I will be doing a number of talks including a keynote on "future proofing" MySQL through the use of data fabrics.  Data fabrics allow you to build durable, long-lasting systems that take advantage of MySQL's strengths today but also evolve to solve future problems using fast-changing cloud and big data technologies.  The talk brings together ideas that Ed Archibald (our CTO) and I have been working on for over two decades.  I'm looking forward to rolling them out to a larger crowd.

Here are the talks in calendar order.  The first two are at MySQL Connect 2012 in San Francisco on September 30th:



  [Read more...]
Pythian at OOW12
Every time I have had the pleasure of attending Oracle Open World, I have discovered a plethora of technical heavy-weights from all over the world in attendance. I enjoy meeting and shmoozing with these people almost as much as absorbing the technical content of the show itself. Many of my Pythian colleagues are presenting at [...]
Oracle High Availability and More with Continuent Tungsten
Oracle is the most powerful database system in the world. However, Oracle's expensive and complex replication makes it difficult to build highly available applications or move data in real-time to data warehouses and popular databases like MySQL. In this video (recording of our 9/13/12 webinar) you will learn how Continuent Tungsten solves problems with Oracle replication at a fraction of the
XLDB Tutorial on Data Structures and Algorithms

Next week Michael and I (Bradley) will be travelling to Silicon Valley to present a tutorial on Data Structures and Algorithms for Big Databases at the 6th XLDB Conference.

The tutorial, which is 4 hours on Monday afternoon, aims to cover the following topics (but it’s looking like we’ll have to drop several items for lack of time.)

This tutorial will explore data structures and algorithms for big databases. The topics include:

  • Data structures including B-trees, Log Structured Merge Trees, and Streaming B-trees.
  • Approximate Query Membership data structures including Bloom filters and cascade filters.
  • Algorithms for join including hash joins and
  [Read more...]
Facebook makes big data look... big!
Oh I love these things: http://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/

Every day there are 2.5B content items shared, and 2.7B "Like"s. I care less about the GiGo content itself, but metadata, connections, and relations are kept transactionally in a relational database. The above 2 use-cases generate 5.2B transactions on the database, and since there are only 86400 seconds in a day, we get over 60000 write transactions per second on the database from these 2 use-cases alone, not to mention all other use-cases, such as new profiles, emails, queries...
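The back-of-the-envelope arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes, as the post does, that every share and every Like maps to exactly one write transaction, and it additionally assumes the load is spread evenly over the day (real traffic will peak well above the average). The function and variable names are illustrative, not taken from any Facebook or ScaleBase material.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def average_writes_per_second(daily_transactions: int) -> float:
    """Average write rate, assuming the load is spread evenly over a day."""
    return daily_transactions / SECONDS_PER_DAY


# Figures quoted in the post (one transaction per event is an assumption).
shares_per_day = 2_500_000_000
likes_per_day = 2_700_000_000

total_per_day = shares_per_day + likes_per_day
rate = average_writes_per_second(total_per_day)

print(f"{total_per_day:,} transactions/day is about {rate:,.0f} writes/second")
# prints: 5,200,000,000 transactions/day is about 60,185 writes/second
```

An average of roughly 60,185 writes per second is consistent with the "over 60000" figure quoted above.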

And what's the



  [Read more...]
Scale Up, Partitioning, Scale Out
On 8/16 I conducted a webinar titled "Scale Up vs. Scale Out" (http://www.slideshare.net/ScaleBase/scalebase-webinar-816-scaleup-vs-scaleout):


The webinar was successful: we had many attendees and great participation in questions and answers throughout the session and at the end. Only after the webinar did it occur to me that one specific graphic was missing from the webinar deck. It occurred to me after answering



  [Read more...]
Y Gatorz are Considering Moving Back to a Gator Farm Instead of MapReducing the World
NSFW (audio) “…pipe your data to /dev/null – it will be very fast.” “Does /dev/null support sharding?” NSFW (audio) “…the only thing constructive we could have used their source files for was as random keys for SSL certs.” NSFW (audio) “PHP reeks … Continue reading →
FROSCON and VLDB

Next week I (Bradley) will be traveling to FROSCON near Bonn, Germany, and then on to VLDB in Istanbul.

At FROSCON I’ll be talking about fast data structures for maintaining indexes. The talk will share some content with my upcoming MySQL Connect talk.

At VLDB, Dzejla Medjedovic will be presenting a talk on our paper on SSD-friendly Bloom-filter-like data structures. The paper is

Michael A. Bender, Martin Farach-Colton, Rob Johnson, Russell Kraner, Bradley C. Kuszmaul, Dzejla Medjedovic, Pablo Montes, Pradeep Shetty, Richard P. Spillane, and Erez Zadok.
Don’t Thrash: How to Cache Your Hash on Flash. PVLDB 5(11):1627-1637, 2012.

An earlier version of the paper appeared at


  [Read more...]
Dagstuhl Seminar on Database Workload Management

A few weeks ago Bradley Kuszmaul and I attended the Dagstuhl Seminar on Database Workload Management.

The Dagstuhl computer science research center is (remotely) located in the countryside in Saarland, Germany. The actual building is an 18th Century Manor House, first retooled as an old-age home, and then a computer science research center. Workshop participants typically spend the whole week talking and working together.

Dagstuhl Computer Science Center

Shivnath Babu (Duke University), Goetz Graefe (Hewlett Packard),

  [Read more...]

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.