Showing entries 121 to 130 of 1248
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: Databases (reset)
MySQL 5.6 Optimizer Trace

One of the new features of MySQL 5.6 is the optimizer trace. What does it do? Well, you know Explain, right? Just add Explain before a query and MySQL will print out how it would execute the query. The optimizer trace does that, except it gives you a little more background into to how it makes its decisions. Think of it as “It’s a Wonderful Life” for indexes – it shows you what the world would be like without that primary key or index.

If you ask, why did MySQL choose to run a full table scan vs. an index seek, the optimizer trace can help you get a better picture of the situation. The other big deal with the optimizer trace is that it’ll show you the relative costs of operations. Yeah, MySQL already tells you how many rows it’ll have to read, but now you know how much those cost given the access path (index seek, scan, etc.)

Example

Let’s jump in to see what it looks like. First, enable the optimizer trace …

[Read more]
Getting Data into Hadoop in real-time

Moving data between databases is hard. Without ever intending it, I seem to have spent a lifetime working on solutions for getting data into and out of databases, but more frequently between. In fact, my first job out of university was migrating data from BRS/Text, a free-text database (probably what we would call a NoSQL) into a more structured Oracle.

Today I spend some of my time working in Big Data, more often than not, migrating information from existing data stores into Big Data so that they can be analysed, something I covered in more detail here:

http://www.ibm.com/developerworks/library/bd-sqltohadoop1/index.html
http://www.ibm.com/developerworks/library/bd-sqltohadoop2/index.html

[Read more]
Creating Users & Granting Permissions in MySQL

One of the first tasks when setting up a new MySQL server is creating users and granting them permissions, or giving them the ability to do stuff in MySQL. By default, a single user, the root user, is created when you setup MySQL. This user is granted all privileges on the entire system, which means you should create separate logins for administrators and applications. Let’s walk through all the basic steps you need to create a new user and give them permissions.

Create a User

The first command I’ll show you is pretty simple:

Create User bob@'%' Identified By 'Astr0ngPhr@$e';

Before we move on, let’s check that user’s permissions:

Show Grants for bob@'%';

Which should show something like:

+----------------------------------------------------------------------------------------------------+
| Grants for bob@% …
[Read more]
Designing Multi-Tenanted Databases – Anti-patterns

In this post, I’m going to briefly cover the signs that you’re doing multi-tenancy wrong. Some of these practices are entrenched in software: there are gems in Ruby on Rails, for instance, use the first anti-pattern to achieve multi-tenancy. Listen, you can drive a car with a flat tire and you can eat yogurt with a fork. People have made these solutions work, but there’s a better way.

Creating tables or schemas per customer

If you find yourself running DDL (Create Table…) for each new company or user that you add to your system, most likely you’re committing a pretty big anti-pattern. Now every time you update the table definition or need to update data across all tables, you’ll have to use a script to generate the SQL for you. Those updates will take longer and it’s much more prone to failure.

If you’re doing this for performance reasons, you have two options in most database systems to …

[Read more]
Anonymizing Data During Replication

If you happen to work with personal data, chances are you are subject to SOX (Sarbanes-Oxley) whether you like it or not.

One of the worst aspects of this is that if you want to be able to analyse your data and you replicate out to another host, you have to find a way of anonymizing the information. There are of course lots of ways of doing this, but if you are replicating the data, why not anonymize it during the replication?

Of the many cool features in Tungsten Replicator, one of my favorites is filtering. This allows you to process the stream of changes that are coming from the data extracted from the master and perform operations on it. We use it a lot in the replicator for ignoring tables, schemas and columns, and for ensuring that we have the correct information within the THL.

Given this, let’s use it to anonymize the data as it is being replicated so that we don’t need to post-process it for analysis, and …

[Read more]
MySQL 5.6 GA one year – What is next?

MySQL 5.6 has been GA for just over a year now. See MySQL 5.6.10 Release Notes.  Congratulations on your birthday! That is quite a long time. I was using it earlier in production because it worked and could do things that 5.5 could not do, but earlier versions were to use at your own risk, and indeed if prodded incorrectly would fall on the floor. That is fair enough because they were work in progress, yet if you poked them the right way they did a very good job.  Those dev versions have been long since upgraded which is good so they do not need quite as much care and attention.

So from where I see 5.6 it works very well. One big change that has made a large difference but which I think a lot of people may not really understand or use is the …

[Read more]
Designing Multi-Tenanted Databases – Many-to-Many

I want to give you tools that you can use for building databases that can handle complex relationships. In the previous article in this series, I looked at hierarchical data. The classic example of a hierarchy is an org chart. On most org charts I’ve seen, an employee has only one boss, which is a one-to-many relationship. The other kind of segmentation is many-to-many. A good instances of this is your social circle. Most people have many friends, and those friends have multiple friends themselves.

This is a common pattern. When developing a multi-tenanted application, users will often want to see and update information for multiple tenants. Imagine, if you will, a freelance Web developer. He’ll set up Google Analytics accounts for each client’s website, and he’ll add both the client and himself as users to the account. …

[Read more]
Enhancing pt-kill to Better Protect your Servers

I believe in automation as much as possible, and I'm always working to make the day to day tasks of operations as smooth as possible.  Also I try not to be afraid to take good tools and make them better.

Here in Database Ops at Box, we use pt-kill running as a service to constantly monitor our servers and help protect against long running queries.  But our thresholds are pretty generous, and in some cases it's possible for unforeseen circumstances to cause enough queries to storm the database such that we can have problems before any of them hit the threshold for "busy time."  Ditto for idle connections.

The response is that someone has to be available to manually run another copy of pt-kill with much lower thresholds to clear out these thundering herds.  But what if we could let pt-kill handle both the "normal" mode and still protect us from herds?

That's what we've done by adding a …

[Read more]
What is FTS_BEING_DELETED.ibd

I currently have on a MySQL 5.6 database using innodb_file_per_table the following individual tablespace file.

schema/FTS_00000000000001bb_BEING_DELETED.ibd

The schema is all InnoDB tables, and there ARE NO Full Text Indexes. I cannot comment on if a developer has tried to create one previously.
I am none the wiser in explaining the ongoing use of these files, or if it can be/should be deleted.

On closer inspection there are infact a number of FTS files.

$ ls -al FTS*
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_BEING_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001bb_BEING_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:26 FTS_00000000000001bb_CONFIG.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:00 FTS_00000000000001bb_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 …
[Read more]
Good Test Data

Over the years you collect datasets you have created for various types of testing, seeding databases etc. I have always thought one needs to better manage this for future re-use. Recently I wanted to do some “Big Data” playing and again that question of what datasets can I use let me to review the past collated list at Seeking public data for benchmarks.

The types of things I was wanting to do lead me to realize a lot of content is “public domain” and Project Gutenberg is just one great source of text in multiple languages. This was just one aspect of my wish list but text based data is used from blogs, comments, articles, microblogs etc, and multiple languages was important from some text analysis.

With a bit of thinking about the building blocks, I created …

[Read more]
Showing entries 121 to 130 of 1248
« 10 Newer Entries | 10 Older Entries »