Home |  MySQL Buzz |  FAQ |  Feeds |  Submit your blog feed |  Feedback |  Archive |  Aggregate feed RSS 2.0 English Deutsch Español Français Italiano 日本語 Русский Português 中文
Showing entries 1 to 30 of 1136 Next 30 Older Entries

Displaying posts with tag: Databases (reset)

Always use decimal for money
+0 Vote Up -0Vote Down

Data types make all the difference in the world when you’re designing your database. The choices you make now will affect the quality of your data, as well as application performance. I’m going to focus on one issue in this article: why you should always use decimals to represent money. Let’s jump in and see why that’s true.

An example of floats gone wrong

Let’s use a really, really simplified accounting ledger. It’s just three fields, an entry id, a customer id, and an amount, which has a float for a data type.

Create Table LedgerEntries
(
 LedgerEntryID Int Primary Key Auto_Increment Not Null
,CustomerID Int Not Null
,Amount Float Not Null
);

Now let’s load some data.

Insert Into LedgerEntries (CustomerID, Amount)
  [Read more...]
WebScaleSQL RPMs for CentOS 6
+0 Vote Up -0Vote Down

WebScaleSQL was announced last week. This looks like a good thing for MySQL as it provides a buildable version of MySQL which includes multiple patches from Facebook, Google, LinkedIn, and Twitter needed by large users of MySQL, patches which have not been incorporated into the upstream source tree.  Making this more visible will possibly encourage more of these patches to be brought into the code sooner.

The source is provided as a git repo at https://github.com/webscalesql/webscalesql-5.6 and as detailed at http://webscalesql.org/faq.html the documentation says there is currently no intention to provide binaries.

Instructions on building the binaries and the build

  [Read more...]
Continuent Replication to Hadoop – Now in Stereo!
+0 Vote Up -0Vote Down

Hopefully by now you have already seen that we are working on Hadoop replication. I’m happy to say that it is going really well. I’ve managed to push a few terabytes of data and different data sets through into Hadoop on Cloudera, HortonWorks, and Amazon’s Elastic MapReduce (EMR). For those who have been following my long association with the IBM InfoSphere BigInsights Hadoop product, and I’m pleased to say that it’s working there too. I’ve had to adapt Robert’s original script to work with the different versions of the underlying Hadoop tools and systems to make it compatible. The actual performance and process is unchanged; you just use a different JS-based batchloader script to work with different tools.

Robert has also been simplifying some of the core functionality, such as configuring some fixed pre-determined

  [Read more...]
MySQL 5.6 GTIDs: Evaluation and Online Migration
+0 Vote Up -0Vote Down

A colleague and I have been looking at GTID on MySQL recently and you may be interested in the blog post that results from that. You can see it here. http://blog.booking.com/mysql-5.6-gtids-evaluation-and-online-migration.html.

 

Interviewing for a Database Developer
+0 Vote Up -0Vote Down

I work for a firm that’s heavily invested in SQL – a team that needs to have developers who know their way around relational databases and MySQL in particular. I want to show you how I run interviews for our development positions.

Method

Everybody has their own methods and opinions on how to conduct technical interviews. I’ve found that I generally dislike interviews that focus either on whiteboard puzzles or obscure technical details, since they don’t really show how well the candidate is at what really matters: building functioning, quality apps. I really like running the interview like we’re talking about the design for a new product. I want to figure out the requirements, mull over the data model, and write some simple queries to make sure we can show the

  [Read more...]
Real-Time Data Loading from MySQL to Hadoop using Tungsten Replicator 3.0 Webinar
+0 Vote Up -0Vote Down

To follow-up and describe some of the methods and techniques behind replicating into Hadoop from MySQL in real-time, and how this can be combined into your data workflow, Continuent are running a webinar with me presenting that will go over the details and provide a demo of the data replication process.

Real-Time Data Loading from MySQL to Hadoop with New Tungsten Replicator 3.0

Hadoop is an increasingly popular means of analyzing transaction data from MySQL. Up until now mechanisms for moving data between MySQL and Hadoop have been rather limited. The new Continuent Tungsten Replicator 3.0 provides enterprise-quality replication from MySQL to Hadoop. Tungsten Replicator 3.0 is 100% open source, released under a GPL V2 license, and available for download at

  [Read more...]
Parallel Extractor for Provisioning
+0 Vote Up -0Vote Down

Coming up as a new feature in Tungsten Replicator (and written by our replicator expert Stephane Giron) is the ability to provision a new database by using data from an existing database. This new feature comes in the form of a tool called the Parallel Extractor.

The principles are very simple. On the master side:

  • Start the master replicator offline.
  • Switch the replicator to the online provision state.
  • The master replicator pulls the data out of the existing database and writes that information into the Transaction History Log (THL). At this point, the normal replicator thread is not extracting events from the source database.
  • Once the parallel replication has completed, the replicator switches over to normal extraction mode, and starts writing change data into the THL.

On the slave side, the THL events are read

  [Read more...]
MC at Percona Live San Francisco 2014
+0 Vote Up -0Vote Down

Now I’m back in the MySQL fold, I’ve got the opportunity to speak at Percona Live again. I’ve always enjoyed speaking at this conference (back when it was known by another name…), although I need to up my game and do the 6 talks I did back in 2009.

On the Tuesday afternoon, tutorials day, I’m running a half-day session with my replication colleague Linas Virbalas. This will be similar to the session I did at Percona Live London, and cover some of the more advanced content on replication, including, but not limited to:

  • Filters
  • JavaScript Filtering
  • Some fun and practical filters
  • Heterogeneous replication from MySQL out to MongoDB, Vertica, Oracle and Hadoop

I might even choose to demo

  [Read more...]
Inner vs. Outer Joins
+1 Vote Up -0Vote Down

I want to teach you the difference between an inner and an outer join. We first need to think about what a join is. Simply, it’s when you combine two tables to make a new one. You’re not physically creating a new table when you join them together, but for the purposes of the query, you are creating a new virtual table. Every row now has the columns from both tables. So if TableA has columns Col1 and Col2 and TableB has columns Col3 and Col4, when you join these two tables, you’ll get Col1, Col2, Col3, and Col4. Just as with any query, you have the option of including all columns or excluding some, as well as filtering out rows.

Inner join. A join is combining the rows from two tables. An inner join

  [Read more...]
Driverless cars and MySQL
+0 Vote Up -0Vote Down

Unless you’ve been living under a rock, you’ve probably heard about driverless cars. Google and the large car makers have been creating cars that doen’t need a human driver. This is tremendous news. Over 30,000 people die in car crashes every year in the US. While not all of those could be prevented by removing the meatbag driver, a lot of them could. Driverless cars are the right way forward and I can’t wait until they come. Every time I see a car wreck or hear about a drunk driver, I wish we had this

  [Read more...]
MySQL to Hadoop Step-By-Step
+0 Vote Up -0Vote Down

We had a great webinar on Thursday about replicating from MySQL to Hadoop (watch the whole thing). It was great, but one of the questions at the end was ‘is there an easy way to test’.

Sadly we can’t go giving out convenient ready-to-run downloads of these things because of licensing and and other complexities, so I want to try and make it as simple and straightforward as possible by giving you the directions to complete. I’m going to be point to the Continuent Documentation every now and then so this is not too crowded, but we should get through it pretty easily.

Major Decisions

For this to work: 

  • We’ll setup two VMs, one the master
  [Read more...]
MMUG6: Madrid MySQL Users Group meeting to take place on 20th March 2014
+1 Vote Up -0Vote Down

Madrid MySQL Users Group will have its next meeting on 20th March. Details can be found on the group’s Meetup page.

I will be giving a presentation on MySQL replication hopefully aimed at all levels, but covering some details relevant to larger setups. The meeting will be in Spanish.

Look forward to seeing you there.

La próxima reunión de Madrid MySQL Users Group tendrá lugar el jueves 20 de marzo. Se puede encontrar más detalles en la página del grupo.  Ofreceré una presentación sobre replicación de MySQL dirigido a gente de todos los niveles, pero incluirá información relevante a entornos más grandes.  La presentación será en español.

Espero veros allí.

Difference between strict_all_tables and strict_trans_tables
+0 Vote Up -0Vote Down

MySQL has default behavior that’s just plain wrong. I’ve covered some quirks with MySQL not null behavior and offered up using the SQL Mode strict_all_tables as a solution.

As a review, SQL Mode changes the way MySQL executes SQL statements and it’s often used to make MySQL behave. There are several switches that I would turn on by default, like only_full_group_by. You can change settings for yourself (just your connection) or for everybody (change it at the server level).

There’s another SQL Mode option, strict_trans_tables, that’s similar in intent but not in behavior to

  [Read more...]
Intro to MySQL Information Schema
+0 Vote Up -0Vote Down

Databases store information, right? Well, what if they could store information about your information so you could query it? Good news! Most database engines such as MySQL implement Information Schema, or a set of views that describe your tables and columns. If you’ve ever used a language like Java or C#, you might be familiar with reflection: the ability to read an object’s metadata. This enables you to do meta-programming, or writing logic about the program itself.

The idea of Information Schema is like reflection: they are views in your database that you can use in your programs or scripts when you need to know what the schema looks like or check the health of the instance. Unlike Show Processlist or Desc Table commands, the results are in tabular format.

I’m going

  [Read more...]
MySQL 5.6 Optimizer Trace
+5 Vote Up -0Vote Down

One of the new features of MySQL 5.6 is the optimizer trace. What does it do? Well, you know Explain, right? Just add Explain before a query and MySQL will print out how it would execute the query. The optimizer trace does that, except it gives you a little more background into to how it makes its decisions. Think of it as “It’s a Wonderful Life” for indexes – it shows you what the world would be like without that primary key or index.

If you ask, why did MySQL choose to run a full table scan vs. an index seek, the optimizer trace can help you get a better picture of the situation. The other big deal with the optimizer trace is that it’ll show you the relative costs of operations. Yeah, MySQL already tells you how many rows it’ll have to read, but now you know how much those cost

  [Read more...]
Getting Data into Hadoop in real-time
+0 Vote Up -0Vote Down

Moving data between databases is hard. Without ever intending it, I seem to have spent a lifetime working on solutions for getting data into and out of databases, but more frequently between. In fact, my first job out of university was migrating data from BRS/Text, a free-text database (probably what we would call a NoSQL) into a more structured Oracle.

Today I spend some of my time working in Big Data, more often than not, migrating information from existing data stores into Big Data so that they can be analysed, something I covered in more detail here:

http://www.ibm.com/developerworks/library/bd-sqltohadoop1/index.html
http://www.ibm.com/developerworks/library/bd-sqltohadoop2/index.html


  [Read more...]
Creating Users & Granting Permissions in MySQL
+0 Vote Up -0Vote Down

One of the first tasks when setting up a new MySQL server is creating users and granting them permissions, or giving them the ability to do stuff in MySQL. By default, a single user, the root user, is created when you setup MySQL. This user is granted all privileges on the entire system, which means you should create separate logins for administrators and applications. Let’s walk through all the basic steps you need to create a new user and give them permissions.

Create a User

The first command I’ll show you is pretty simple:

Create User bob@'%' Identified By 'Astr0ngPhr@$e';

Before we move on, let’s check that user’s permissions:

Show Grants for bob@'%';

Which should show something like:

  [Read more...]
Designing Multi-Tenanted Databases – Anti-patterns
+0 Vote Up -0Vote Down

In this post, I’m going to briefly cover the signs that you’re doing multi-tenancy wrong. Some of these practices are entrenched in software: there are gems in Ruby on Rails, for instance, use the first anti-pattern to achieve multi-tenancy. Listen, you can drive a car with a flat tire and you can eat yogurt with a fork. People have made these solutions work, but there’s a better way.

Creating tables or schemas per customer

If you find yourself running DDL (Create Table…) for each new company or user that you add to your system, most likely you’re committing a pretty big anti-pattern. Now every time you update the

  [Read more...]
Anonymizing Data During Replication
+0 Vote Up -0Vote Down

If you happen to work with personal data, chances are you are subject to SOX (Sarbanes-Oxley) whether you like it or not.

One of the worst aspects of this is that if you want to be able to analyse your data and you replicate out to another host, you have to find a way of anonymizing the information. There are of course lots of ways of doing this, but if you are replicating the data, why not anonymize it during the replication?

Of the many cool features in Tungsten Replicator, one of my favorites is filtering. This allows you to process the stream of changes that are coming from the data extracted from the master and perform operations on it. We use it a lot in the replicator for ignoring tables, schemas and columns, and for ensuring that we have the correct information within the THL.

Given this, let’s use it to anonymize the data as it is being

  [Read more...]
MySQL 5.6 GA one year – What is next?
+3 Vote Up -0Vote Down

MySQL 5.6 has been GA for just over a year now. See MySQL 5.6.10 Release Notes.  Congratulations on your birthday! That is quite a long time. I was using it earlier in production because it worked and could do things that 5.5 could not do, but earlier versions were to use at your own risk, and indeed if prodded incorrectly would fall on the floor. That is fair enough because they were work in progress, yet if you poked them the right way they did a very good job.  Those dev versions have been long since upgraded which is good so they do not need quite as much care and attention.

So from where I see 5.6 it works very well. One big change that has made a large difference but which I think a lot of people may not really understand or use is the

  [Read more...]
Designing Multi-Tenanted Databases – Many-to-Many
+0 Vote Up -0Vote Down

I want to give you tools that you can use for building databases that can handle complex relationships. In the previous article in this series, I looked at hierarchical data. The classic example of a hierarchy is an org chart. On most org charts I’ve seen, an employee has only one boss, which is a one-to-many relationship. The other kind of segmentation is many-to-many. A good instances of this is your social circle. Most people have many friends, and those friends have multiple friends themselves.

This is a common pattern. When developing a multi-tenanted application, users

  [Read more...]
Enhancing pt-kill to Better Protect your Servers
+2 Vote Up -0Vote Down
I believe in automation as much as possible, and I'm always working to make the day to day tasks of operations as smooth as possible.  Also I try not to be afraid to take good tools and make them better.

Here in Database Ops at Box, we use pt-kill running as a service to constantly monitor our servers and help protect against long running queries.  But our thresholds are pretty generous, and in some cases it's possible for unforeseen circumstances to cause enough queries to storm the database such that we can have problems before any of them hit the threshold for "busy time."  Ditto for idle connections.

The response is that someone has to be available to manually run another copy of pt-kill with much lower thresholds to clear out these thundering herds.  But what if we could let pt-kill handle both the "normal" mode and still protect



  [Read more...]
What is FTS_BEING_DELETED.ibd
+0 Vote Up -0Vote Down

I currently have on a MySQL 5.6 database using innodb_file_per_table the following individual tablespace file.

schema/FTS_00000000000001bb_BEING_DELETED.ibd

The schema is all InnoDB tables, and there ARE NO Full Text Indexes. I cannot comment on if a developer has tried to create one previously.
I am none the wiser in explaining the ongoing use of these files, or if it can be/should be deleted.

On closer inspection there are infact a number of FTS files.

$ ls -al FTS*
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_BEING_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001bb_BEING_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:26 FTS_00000000000001bb_CONFIG.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29

  [Read more...]
Good Test Data
+0 Vote Up -2Vote Down

Over the years you collect datasets you have created for various types of testing, seeding databases etc. I have always thought one needs to better manage this for future re-use. Recently I wanted to do some “Big Data” playing and again that question of what datasets can I use let me to review the past collated list at Seeking public data for benchmarks.

The types of things I was wanting to do lead me to realize a lot of content is “public domain” and Project Gutenberg is just one great source of text in multiple languages. This was just one aspect of my wish list but text based data is used from blogs, comments, articles, microblogs etc, and multiple languages was important

  [Read more...]
The GRANT/REVOKE dilemma
+0 Vote Up -0Vote Down

It is common practice to grant your application the privileges of “GRANT SELECT, INSERT, UPDATE, DELETE ON yourdb.* TO user@host”.

But what if you want to ensure you cannot DELETE data from just one table?

Ideally I want to be able to “REVOKE DELETE ON yourdb.important_table FROM user@host”. You cannot do currently this with the MySQL privilege system.

If your schema has 100 tables, and you want to remove DELETE from one, you have to define DELETE for the 99 others, and remember that for each new table, you need to remember to also modify user privileges.

MySQL Explain Explained
+0 Vote Up -0Vote Down

Whenever I have a problem query in MySQL, I say to myself, I know, I’ll use Explain Plan. Now I have two problems. I run the explain plan, but then I have to look up all the cryptic terms and read through the documentation to know what in the world is going on. Well, over the weekend, I decided to solve my own problem. I created a simple page that allows you to run an explain plan, and then each of those obscure, abbreviated terms is annotated with a tooltip that explains exactly what’s going on. I’m calling it MySQL Explain Explained.

I created a demo using the sample employees database (which you can download here). Here’s a sample query:

select gender, count(emp_no) as manager_cnt,
  [Read more...]
Designing Multi-Tenanted Databases
+0 Vote Up -0Vote Down

Designing database tables for many customers is a surprisingly foreign concept for some programmers. I’ve been in interviews where a programmer will sketch out a normal object model, and then proceed to suggest that for the database, each customer has their own set of tables, each prefixed with the customer name. What I’d like to do in this post is introduce the concept of multi-tenancy and then show methods you can use to do it right, instead of hacky ways like the one I just mentioned. Multi-tenancy is when several customers (tenants) share the same database and codebase but can only see their own data.

The basic idea

The core method of adding multi-tenancy to your database is adding a column to every table you’d like to segment that indicates the owner of the

  [Read more...]
Null Columns in MySQL – Part II
+0 Vote Up -0Vote Down

I’ve covered some strange default behavior around nulls in MySQL. There’s another nuance to this issue: you can still insert rows without specifying values for not null columns. MySQL will helpfully give you a default value (this is not really helpful – true help would be a quick failure with a descriptive error message). Let’s walk through this example.

mysql> Create Table NullTable (a Int Not Null, b Int Not Null);
Query OK, 0 rows affected (0.01 sec)

Here’s a table with two not null columns.

mysql> Insert Into NullTable (a) Values (1);
Query OK, 1 row affected, 1 warning (0.01 sec)

We can totally insert, even though we didn’t specify a value for b and b is not

  [Read more...]
MySQL Simplified
+0 Vote Up -0Vote Down

MySQL is the little engine that could. It powers sites like Facebook, YouTube, Twitter, and thousands of blogs, CMSes, and e-commerce sites. Its value to the world and to the development community could be measured in the hundreds of billions, and yet it’s free, and you can use it just by downloading it. Almost every programming language has drivers for it and it can run on so many operating systems and architectures, there’s really no limit on it.

Yet there’s a dark side. MySQL is full of gotchas and bugs, and it lacks features that sometimes call into question its status as a real database. The documentation is often open-ended and confusing, with gaps in key parts. If you want to run it, you have the option of using it on Linux, Mac, Solaris, or Windows and every hosting company or provider like Amazon AWS has

  [Read more...]
MongoDB in 2013 -- A Year in Review
+0 Vote Up -1Vote Down
It's again that time of the year. Analysts are spending oceans of words to predict the future, companies are making plans for the next year and people are resting and enjoying the break with their families. To me, this is the perfect time to reflect on my choices, the direction I'm headed to and consider if I still love what I do.

At the beginning of the year I decided to join MongoDB (formerly 10gen). The more I think about it, the more I realize I've been wrong. Yes, it's been the worst decision in my life not to join MongoDB when I was first offered the opportunity years ago. At that time an

  [Read more...]
Showing entries 1 to 30 of 1136 Next 30 Older Entries

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.