Previous 30 Newer Entries Showing entries 91 to 120 of 157 Next 30 Older Entries

Displaying posts with tag: big data

Webinar: TokuDB v6 Replication Performance

TokuDB® is a proven solution that scales MySQL® and MariaDB® from GBs to TBs with unmatched insert and query speed, compression, and online schema flexibility.

Tokutek’s recently launched TokuDB v6 delivers all of these features and more, with the introduction of high performance replication for MySQL and MariaDB. TokuDB v6 eliminates the common and persistent problem of “slave lag” in which a replication server is unable to keep up with the query load borne by the master server. TokuDB v6 solves this by offering high ingestion rates at the slave.

Time: 2PM EDT / 11AM PDT

REGISTER TODAY

  [Read more...]
Challenges of Big Databases with MySQL – IOUG Presentation

 

 

Many database management tasks become difficult as you move from millions of rows and gigabytes of data to billions of rows and terabytes of data. Such tasks include ingesting data while maintaining indexes; changing schemas without downtime; and supporting connections, replication, and backup. For some scaling problems (connections and replication), MySQL® is better than most of the competition. For others, such as indexing, schema changes, and backup, MySQL has typically been harder to use. Fortunately, the tasks MySQL does well are in its core, whereas the tasks that are more difficult can be solved with storage engine plug-ins.

I recently gave a talk at

  [Read more...]
SwRI Chooses TokuDB to Tackle Machine Data for an 800M+ Record Database

Tackling machine data on the ground to ensure successful operations for NASA in space

Issues addressed:

  • Scaling MySQL to multi-terabytes
  • Insertion rates as InnoDB hit a performance wall
  • Schema flexibility to handle an evolving data model

The Company:  Southwest Research Institute (SwRI) is an independent, nonprofit applied research and development organization. The staff of more than 3,000 specializes in the creation and transfer of technology in engineering and the physical sciences. Currently, SwRI is part of an international team working on the NASA

  [Read more...]
Percona Live Slides and Video Available: The Right Read Optimization is Actually Write Optimization

In April, I got to give a talk at Percona Live, about why The Right Read Optimization is Actually Write Optimization. It was my first industry talk, so I was delighted when someone in the audience said “I feel like I just earned a college credit.”

Box offered to host everyone’s slides from the conference here (mine is here). A big thanks from me to Sheeri Cabral, for

  [Read more...]
Tokutek and PalominoDB Partner to Bring Scale, Performance to Database Deployments

MySQL storage engine provider joins forces with leading database consultants to deliver support for growing number of MySQL and MariaDB customers

Lexington, MA – (May 2, 2012) – Tokutek, the leader in high-performance and agile database storage engines, today announced a strategic partnership with PalominoDB, a premier database operations and engineering consultancy, to provide database services and support to joint customers. Tokutek’s storage engine will be complemented with PalominoDB’s operational excellence, 24×7 on-call support and access to the company’s skilled team of

  [Read more...]
TokuDB v6.0: Download Available

TokuDB v6.0 is full of great improvements, like getting rid of slave lag, better compression, improved checkpointing, and support for XA.

I’m happy to announce that TokuDB v6.0 is now generally available and can be downloaded here.

Sysbench Performance

I wanted to take this time to talk about one more under-the-hood goody we’ve added to v6.0. In

  [Read more...]
My Talk on Tuesday at IOUG COLLABORATE 12

 

 

Challenges of Big Databases with MySQL

Many database management tasks become difficult as you move from millions of rows and gigabytes of data to billions of rows and terabytes of data. Such tasks include ingesting data while maintaining indexes; changing schemas without downtime; and supporting connections, replication, and backup. For some scaling problems (connections and replication), MySQL is better than most of the competition. For others, such as indexing, schema changes, and backup, MySQL has typically been harder to use. Fortunately, the tasks MySQL does well are in its core, whereas the tasks that are more difficult can be solved with storage engine

  [Read more...]
TokuDB v6.0: Even Better Compression

A key feature of our new TokuDB v6.0 release, which I have been blogging about this week, is compression. Compression is always on in TokuDB, and the compression we’ve achieved in the past has been quite good. See a previous post on the 18x compression achieved by TokuDB v5.0 on one benchmark. In our latest release, we’ve updated the way compression works and achieved a 50% improvement in compression.

I decided to present numbers on the same set of data as the old post, so see that post for experimental details.

But first, what are the changes? TokuDB compresses large blocks

  [Read more...]
TokuDB v6.0: Getting Rid of Slave Lag

Master/slave replication is an important tool that gets used in many ways: distributing read loads among many slaves for performance, using a slave for backups so the master can handle live load, geographically distributed disaster recovery, etc. The Achilles’ heel of slave performance is that slave workloads are single-threaded. The master can have many clients inserting, updating, and querying, whereas the slave has only one insertion client: the master. InnoDB single-client performance is much slower than its multi-client performance, which means that the bottleneck in a master/slave system is often the rate at which a slave can keep up.

If the master has an average transactions per second (tps) that is higher than what the slave can handle, the slave will fall further and further behind. If the slaves are being used to distribute read workload, for example, the

  [Read more...]
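
The falling-behind arithmetic in the excerpt above can be made concrete with a small sketch. The rates used here are made-up illustrative numbers, not measurements from the post, and `replication_lag_seconds` is a hypothetical helper that assumes constant rates and a slave that starts fully caught up.

```python
def replication_lag_seconds(master_tps, slave_tps, elapsed_seconds):
    """Estimate slave lag after `elapsed_seconds` of sustained load.

    Assumes constant rates and a slave that starts with no backlog.
    The lag is expressed as seconds of master-side work that the
    slave has not yet applied.
    """
    if master_tps <= 0 or slave_tps <= 0:
        raise ValueError("rates must be positive")
    # Transactions committed on the master but not yet applied on the slave.
    backlog = max(0.0, (master_tps - slave_tps) * elapsed_seconds)
    # Convert the backlog into seconds of master work.
    return backlog / master_tps


# Hypothetical rates: master at 1200 tps, slave applying 1000 tps, for one hour.
print(replication_lag_seconds(1200, 1000, 3600))
```

As long as the master's average rate stays above the slave's, the backlog term grows linearly with time, which is the "further and further behind" behavior the post describes. When the slave is at least as fast as the master, the estimate is zero.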
Announcing TokuDB v6.0: Less Slave Lag and More Compression

We are excited to announce TokuDB® v6.0, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.

This version offers feature and performance enhancements over previous releases, support for XA (two-phase transactional commits), better compression, and reduced performance variability associated with checkpointing. This release also brings TokuDB support up to date on MySQL v5.1, MySQL v5.5 and MariaDB v5.2. There’s a lot of great technical stuff under the hood in this release and I’ll be reviewing the improvements one-by-one over the course of this week.

I’ll be posting more details about the new features and performance, so here’s an overview of what’s in store.

Replication Slave Lag

One of the things TokuDB does well is single-threaded insertions, which translates directly into

  [Read more...]
Looking for Global Collisions

On Monday, I took a break from planning for the upcoming Percona Live MySQL Conference (where we have a session, lightning talk, booth, and other misc activities planned) to go attend the UK-Massachusetts Innovation Economies Conference at the MIT Media Lab. The event featured Gov. Deval Patrick, MIT Media Lab Director Joi Ito, industry experts such as

  [Read more...]
SkySQL is Coming to a City Near You!

Now that the snow is melting and spring is in the air, the SkySQL Team is hitting the road and making the rounds of key industry events, trade shows, and meetups around the globe. Come meet the team, pick up a few tips and tricks for using the MySQL database, network with your peers, and learn more about SkySQL’s products and services. Here are some of the events we’ll be at this spring:

BIG Data, A New Horizon for Data Analysis
March 20 - 21, 2012
Cité Internationale Universitaire de Paris, Paris, France

POSSCON 2012
March 28-29, 2012
Columbia Metropolitan Convention Center, Columbia, South Carolina





  [Read more...]
O’Reilly Strata 2012: The Year of the Data Scientist

We had the privilege this past week to be invited to be part of the 2012 O’Reilly Strata “Making Data Work” Conference. Some of our photos from the event are here. At the event, we were excited to have Tokutek described in front of the approximately 2,500 attendees during the keynote sessions.

Overall, the diversity of topics discussed at the conference was impressive, spanning databases, developer tools, data visualization techniques, customer stories, and business implications. The full agenda is

  [Read more...]
Evidenzia Upgrades to TokuDB v5.2 to Address Storage Growth and Scale Performance

Ensuring sufficient disk I/O to catch copyright violations at network speed.

Evidenzia GmbH & Co. KG

Issues addressed:

  • Storage growth, including maxed-out disk I/O utilization
  • Performance issues and business impact due to slow selects
  • Inability to revise data schema on the fly

The Company: Evidenzia GmbH & Co. KG is one of the leading partners of the software, movie and music industry when it comes to tracing copyright infringements

  [Read more...]
A super-set of MySQL for Big Data. Interview with John Busch, Schooner.

“Legacy MySQL does not scale well on a single node, which forces granular sharding and explicit application code changes to make them sharding-aware and results in low utilization of servers” – Dr. John Busch, Schooner Information Technology

A super-set of MySQL suitable for Big Data? On this subject, I have interviewed Dr. John Busch, Founder, Chairman, [...]
Tokutek Selected as a Finalist for O’Reilly Strata Conference

We are excited to announce that we’ve been named as one of ten finalists selected for the startup showcase at the O’Reilly Strata “Making Data Work” Conference at the end of this month in Santa Clara, California. The startup showcase will be held on February 29th, starting at 6:30 pm.

The conference offers a great overview of the big data space, with tracks on Data Science, Business and Industry,

  [Read more...]
New England’s Victory (for Big Data)

While it might not have been New England’s weekend on the Big Gridiron, it was certainly New England’s day for Big Data at the New England Database Summit on Friday at MIT.

The summit was well attended, with 350 registrants and keynotes from prominent MySQL users such as Mark Callaghan. The coverage was quite broad, with presentations running the gamut from grad students (complete with bodyguards and intimidating academic

  [Read more...]
MySQL Conference and Expo Talk on Benchmarking

I’ll be speaking on April 11th at 4:30 pm in Room 4 at the Percona Conference and Expo. The topic will be “Creating a Benchmark Infrastructure That Just Works.”

Throughout my career I’ve been involved with maintaining the performance of database applications and therefore created many benchmark frameworks. At Tokutek, an important part of my role is measuring the performance of our storage engine over time and versus competing solutions. There is nothing proprietary about

  [Read more...]
Big Kettle News

Dear Kettle fans,

Today I’m really excited to be able to announce a few really important changes to the Pentaho Data Integration landscape. To me, the changes that are being announced today compare favorably to reaching Kettle version 1.0 some 9 years ago, or reaching version 2.0 with plugin support or even open sourcing Kettle itself…

First of all…

Pentaho is again open sourcing an important piece of software.  Today we’re bringing all big data related software to you as open source software.  This includes all currently available capabilities to access HDFS, MongoDB, Cassandra, HBase, the specific VFS drivers we created as well as the ability to execute work inside of Hadoop (MapReduce), Amazon EMR, Pig and so

  [Read more...]
1 Billion Insertions – The Wait is Over!

iiBench measures the rate at which a database can insert new rows while maintaining several secondary indexes. We ran this for 1 billion rows with TokuDB and InnoDB starting last week, right after we launched TokuDB v5.2. While TokuDB completed it in 15 hours, InnoDB took 7 days.

The results are shown below. At the end of the test, TokuDB’s insertion rate remained at 17,028 inserts/second whereas InnoDB had dropped to 1,050 inserts/second. That is a difference of over 16x. Our complete set of benchmarks for TokuDB v5.2 can be found here.

  [Read more...]
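
To give a feel for what "insert rate while maintaining several secondary indexes" means in practice, here is a minimal sketch using Python's built-in `sqlite3` module. It is not iiBench and says nothing about TokuDB or InnoDB performance. The table layout, index choices, and the `measure_insert_rate` helper are all assumptions made for illustration.

```python
import random
import sqlite3
import time


def measure_insert_rate(total_rows=20000, batch_size=2000, seed=0):
    """Insert random rows into an indexed table, one batch per transaction.

    Returns a list of (rows_inserted_so_far, rows_per_second) tuples,
    one entry per batch, so the rate can be tracked as the table grows.
    """
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purchases ("
        "id INTEGER PRIMARY KEY, customer INTEGER, product INTEGER, price REAL)"
    )
    # Secondary indexes that have to be updated on every insert.
    conn.execute("CREATE INDEX idx_customer ON purchases (customer, price)")
    conn.execute("CREATE INDEX idx_product ON purchases (product, customer)")
    conn.execute("CREATE INDEX idx_price ON purchases (price, product)")

    rates = []
    inserted = 0
    while inserted < total_rows:
        n = min(batch_size, total_rows - inserted)
        rows = [
            (rng.randrange(100000), rng.randrange(10000), rng.random() * 500)
            for _ in range(n)
        ]
        start = time.perf_counter()
        with conn:  # commits the batch as a single transaction
            conn.executemany(
                "INSERT INTO purchases (customer, product, price) VALUES (?, ?, ?)",
                rows,
            )
        elapsed = time.perf_counter() - start
        inserted += n
        rates.append((inserted, n / elapsed if elapsed > 0 else float("inf")))
    conn.close()
    return rates


for rows_so_far, rate in measure_insert_rate(total_rows=10000, batch_size=2500):
    print(rows_so_far, round(rate))
```

An in-memory table of this size will not show the slowdown reported in the post. The point is only the shape of the measurement: a fixed set of secondary indexes, random keys, and a per-batch rate recorded as the row count grows.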
CAOS Theory Podcast 2012.01.20

Topics for this podcast:

*Hadoop v1.0 and year ahead
*Oracle-Cloudera deal for more Hadoop
*Oracle’s ‘Sun spot’ with Solaris
*Open Source M&A outlook for 2012
*Our new MySQL/NoSQL/NewSQL survey

iTunes or direct download (28:49, 4.9MB)

Fractal Tree Indexes and Mead – MySQL Meetup

 
Thanks again to Sheeri Cabral for having me at the Boston MySQL Meetup on Monday for the talk on “Fractal Tree® Indexes – Theoretical Overview and Customer Use Cases.” The crowd was very interactive, and I appreciated that over 50 people signed up for the event and left some very positive comments and reviews.

In addition, the conversation spilled over late into the night as we made our way over to nearby Mead


  [Read more...]
Top Ten for 2011

 

It’s almost the end of the year – that means holiday cards, shopping, cooking, parties, and the inevitable year-end top lists (including gems like this one).

In the spirit of end of year list making, we fed our 60+ blogs this year through Google Analytics to find out what our own top ten blogs were (outside of product announcements). So if you missed an episode of the View (TokuView that is) we’ve got a Tokutek Top Ten for you (spoiler alert – they are mostly technical):

10. Cage Match: OldSQL, NoSQL and NewSQL – References to

  [Read more...]
VC funding for Hadoop and NoSQL tops $350m

451 Research has today published a report looking at the funding being invested in Apache Hadoop- and NoSQL database-related vendors. The full report is available to clients, but non-clients can find a snapshot of the report, along with a graphic representation of the recent up-tick in funding, over at our Too Much Information blog.

The Big Data Community at the MassTLC unConference

 

I had the pleasure of being invited to blog at the MassTLC unConference on Friday. The event was a full day of diverse topics and discussions ranging from the latest in recipe sharing sites, to entrepreneurial CEO war stories, to hot trends in venture investing. An excerpt covering Big Data from my MassTLC blog is below.


Big Data and Analytics in MA

Hosted by Steve O’Leary of Aeris Partners and Bob Zurek (@bzurek) of Oracle

First question – what is Big Data? While often debated, Steve had a working definition of “big” in terms of Volume, Velocity and Variety. Fritz Knabe of IBM noted that Big Data can come from even the most unexpected places, such

  [Read more...]
Webinar: NoSQL, NewSQL, Hadoop and the future of Big Data management

Join me for a webinar where I discuss how the recent changes and trends in big data management affect the enterprise. This event is sponsored by Red Rock and RockSolid.

Overview:

It is an exciting and interesting time to be involved in data. More influential change has occurred in database management in the last 18 months than in the last 18 years. New technologies such as NoSQL and Hadoop, and radical redesigns of existing technologies such as NewSQL, will dramatically change how we manage data going forward.

These technologies bring with them possibilities both in terms of the scale of data



  [Read more...]
Write Optimization: Myths, Comparison, Clarifications, Part 2

In my last post, we talked about the read/write tradeoff of indexing data structures, and some ways that people augment B-trees in order to get better write performance. We also talked about the significant drawbacks of each method, and I promised to show some more fundamental approaches.

We had two “workload-based” techniques (inserting in sequential order and using fewer indexes) and two “data structure-based” techniques (a write buffer and OLAP). Remember, the most common thing people do when faced with an insertion bottleneck is to use fewer indexes, and this kills query performance. So keep in mind that all our work on write-optimization is really work for read-optimization, in that write-optimized

  [Read more...]
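
The "write buffer" technique mentioned in the excerpt above can be illustrated with a toy sketch. This is not the author's method and not how TokuDB's Fractal Tree indexes work. `BufferedSortedIndex` is a hypothetical class in which a sorted Python list stands in for a B-tree and a small unsorted list stands in for the buffer.

```python
import bisect


class BufferedSortedIndex:
    """Sorted key list with a small unsorted write buffer in front of it."""

    def __init__(self, buffer_limit=4):
        self.sorted_keys = []  # stands in for the main index structure
        self.buffer = []       # recent inserts that are not merged yet
        self.buffer_limit = buffer_limit
        self.flushes = 0       # number of batch merges performed

    def insert(self, key):
        # Inserts are a cheap append until the buffer fills up.
        self.buffer.append(key)
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        # Merge the whole batch into the main index in one pass.
        self.sorted_keys = sorted(self.sorted_keys + self.buffer)
        self.buffer.clear()
        self.flushes += 1

    def contains(self, key):
        # Binary search the main index first.
        i = bisect.bisect_left(self.sorted_keys, key)
        if i < len(self.sorted_keys) and self.sorted_keys[i] == key:
            return True
        # Reads also have to scan the unmerged buffer.
        return key in self.buffer


index = BufferedSortedIndex(buffer_limit=3)
for key in [5, 1, 9, 3]:
    index.insert(key)
print(index.sorted_keys, index.buffer, index.flushes)
```

The sketch shows the tradeoff in miniature: inserts become cheap appends that are merged in batches, while every lookup has to check the buffer in addition to the main index.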
From Under the Desk to the Cloud

 

Review of the O’Reilly Strata Making Data Work Conference

(reprinted from my guest blog for the Cloud Council of 7)

Monica Rogati of LinkedIn told a story of the early days at the firm, when the reporting system consisted of a single server under someone’s desk. One day, someone needed an Ethernet cable and unplugged the machine from the data outlet in the wall. LinkedIn’s data reporting, its life blood, instantly came to a

  [Read more...]
What is the biggest challenge for Big Data?

Often I think about challenges that organizations face with “Big Data”. While Big Data is a generic and overused term, what I am really referring to is an organization’s ability to disseminate, understand and ultimately benefit from increasing volumes of data. It is almost without question that in the future customers will be won/lost, competitive advantage will be gained/forfeited and businesses will succeed/fail based on their ability to leverage their data assets.

It may be surprising what I think are the near-term challenges. Largely I don’t think these are purely technical. There are enough wheels in motion now to almost guarantee that data accessibility will continue to improve at a pace in line with the increase in data volume. Sure, there will continue to be lots of interesting innovation with technology, but

  [Read more...]
Online Advertiser Intent Media Selects TokuDB over InnoDB and NoSQL for Big Data Ad-Hoc Analysis

Intent Media

Issue addressed: Ad hoc analytics on clickstream data arriving too fast for InnoDB or NoSQL to handle.

TokuDB powers an online advertising application

The Company: Headquartered in New York, Intent Media is a fast-growing online advertising startup. The company helps some of the largest online retailers monetize their traffic more efficiently at scale by showing highly relevant and targeted advertising to the 97+% of e-commerce visitors who do not transact.

The Challenge: The Intent Media platform processes hundreds of

  [Read more...]

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.