Showing entries 61 to 70 of 1184
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: Databases (reset)
Anonymizing Data During Replication

If you happen to work with personal data, chances are you are subject to SOX (Sarbanes-Oxley) whether you like it or not.

One of the worst aspects of this is that if you want to be able to analyse your data and you replicate out to another host, you have to find a way of anonymizing the information. There are of course lots of ways of doing this, but if you are replicating the data, why not anonymize it during the replication?

Of the many cool features in Tungsten Replicator, one of my favorites is filtering. This allows you to process the stream of changes that are coming from the data extracted from the master and perform operations on it. We use it a lot in the replicator for ignoring tables, schemas and columns, and for ensuring that we have the correct information within the THL.

Given this, let’s use it to anonymize the data as it is being replicated so that we don’t need to post-process it for analysis, and …

[Read more]
MySQL 5.6 GA one year – What is next?

MySQL 5.6 has been GA for just over a year now. See MySQL 5.6.10 Release Notes.  Congratulations on your birthday! That is quite a long time. I was using it earlier in production because it worked and could do things that 5.5 could not do, but earlier versions were to use at your own risk, and indeed if prodded incorrectly would fall on the floor. That is fair enough because they were work in progress, yet if you poked them the right way they did a very good job.  Those dev versions have been long since upgraded which is good so they do not need quite as much care and attention.

So from where I see 5.6 it works very well. One big change that has made a large difference but which I think a lot of people may not really understand or use is the …

[Read more]
Designing Multi-Tenanted Databases – Many-to-Many

I want to give you tools that you can use for building databases that can handle complex relationships. In the previous article in this series, I looked at hierarchical data. The classic example of a hierarchy is an org chart. On most org charts I’ve seen, an employee has only one boss, which is a one-to-many relationship. The other kind of segmentation is many-to-many. A good instances of this is your social circle. Most people have many friends, and those friends have multiple friends themselves.

This is a common pattern. When developing a multi-tenanted application, users will often want to see and update information for multiple tenants. Imagine, if you will, a freelance Web developer. He’ll set up Google Analytics accounts for each client’s website, and he’ll add both the client and himself as users to the account. …

[Read more]
Enhancing pt-kill to Better Protect your Servers

I believe in automation as much as possible, and I'm always working to make the day to day tasks of operations as smooth as possible.  Also I try not to be afraid to take good tools and make them better.

Here in Database Ops at Box, we use pt-kill running as a service to constantly monitor our servers and help protect against long running queries.  But our thresholds are pretty generous, and in some cases it's possible for unforeseen circumstances to cause enough queries to storm the database such that we can have problems before any of them hit the threshold for "busy time."  Ditto for idle connections.

The response is that someone has to be available to manually run another copy of pt-kill with much lower thresholds to clear out these thundering herds.  But what if we could let pt-kill handle both the "normal" mode and still protect us from herds?

That's what we've done by adding a …

[Read more]

I currently have on a MySQL 5.6 database using innodb_file_per_table the following individual tablespace file.


The schema is all InnoDB tables, and there ARE NO Full Text Indexes. I cannot comment on if a developer has tried to create one previously.
I am none the wiser in explaining the ongoing use of these files, or if it can be/should be deleted.

On closer inspection there are infact a number of FTS files.

$ ls -al FTS*
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_BEING_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001bb_BEING_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:26 FTS_00000000000001bb_CONFIG.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:00 FTS_00000000000001bb_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 …
[Read more]
Good Test Data

Over the years you collect datasets you have created for various types of testing, seeding databases etc. I have always thought one needs to better manage this for future re-use. Recently I wanted to do some “Big Data” playing and again that question of what datasets can I use let me to review the past collated list at Seeking public data for benchmarks.

The types of things I was wanting to do lead me to realize a lot of content is “public domain” and Project Gutenberg is just one great source of text in multiple languages. This was just one aspect of my wish list but text based data is used from blogs, comments, articles, microblogs etc, and multiple languages was important from some text analysis.

With a bit of thinking about the building blocks, I created …

[Read more]
The GRANT/REVOKE dilemma

It is common practice to grant your application the privileges of “GRANT SELECT, INSERT, UPDATE, DELETE ON yourdb.* TO user@host”.

But what if you want to ensure you cannot DELETE data from just one table?

Ideally I want to be able to “REVOKE DELETE ON yourdb.important_table FROM user@host”. You cannot do currently this with the MySQL privilege system.

If your schema has 100 tables, and you want to remove DELETE from one, you have to define DELETE for the 99 others, and remember that for each new table, you need to remember to also modify user privileges.

MySQL Explain Explained

Whenever I have a problem query in MySQL, I say to myself, I know, I’ll use Explain Plan. Now I have two problems. I run the explain plan, but then I have to look up all the cryptic terms and read through the documentation to know what in the world is going on. Well, over the weekend, I decided to solve my own problem. I created a simple page that allows you to run an explain plan, and then each of those obscure, abbreviated terms is annotated with a tooltip that explains exactly what’s going on. I’m calling it MySQL Explain Explained.

I created a demo using the sample employees database (which you can download here). Here’s a sample query:

select gender, count(emp_no) as manager_cnt, sum(cnt) as sum, avg(cnt) as avg
    select m.emp_no, m.gender, count(de.emp_no) as cnt
    from …
[Read more]
Designing Multi-Tenanted Databases – Hierarchies

In the previous article, we learned how you can design your database schema so that many customers can share one database and one set of tables. The examples I used were fairly simple: they consisted of single tables or simple hierarchies like the order and line item tables. In those examples, you had a single tenant id that segmented the tables. Now I’d like to introduce the idea of hierarchical segmentation. This is a pretty big term for the idea of allowing customers to see data based on where they are in a hierarchy, like an org chart.

Let’s say you have three sales teams each with its own manager for three regions in the US: East, Central, and West. A salesman in the East only gets to see his own prospects, while the East sales manager sees the prospects of the entire East sales team. At the very top, the VP of US Sales can access the …

[Read more]
Designing Multi-Tenanted Databases

Designing database tables for many customers is a surprisingly foreign concept for some programmers. I’ve been in interviews where a programmer will sketch out a normal object model, and then proceed to suggest that for the database, each customer has their own set of tables, each prefixed with the customer name. What I’d like to do in this post is introduce the concept of multi-tenancy and then show methods you can use to do it right, instead of hacky ways like the one I just mentioned. Multi-tenancy is when several customers (tenants) share the same database and codebase but can only see their own data.

The basic idea

The core method of adding multi-tenancy to your database is adding a column to every table you’d like to segment that indicates the owner of the data. I’ll call this column a tenant id. We’ll start with a basic example: a todo app. Let’s say you write a program that you’re going to use on your own to …

[Read more]
Showing entries 61 to 70 of 1184
« 10 Newer Entries | 10 Older Entries »