Showing entries 1 to 7
Displaying posts with tag: schema design
Schema Design in MongoDB vs Schema Design in MySQL

For people used to relational databases, using NoSQL solutions such as MongoDB brings interesting challenges. One of them is schema design: while in the relational world, normalization is a good way to start, how should we design our collections when creating a new MongoDB application?

Let’s use a simple example to see how we would create a data structure for MySQL (or any relational database) and for MongoDB. We will assume in this post that we want to store information about people (their name) and the details from their passport (country and validity date).
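To make the two candidate shapes concrete before going further, here is a minimal Python sketch using plain dicts and lists, not a real MySQL or MongoDB client. The column and field names (`people_id`, `country`, `valid_until`), the country codes, and the dates are assumptions made up for illustration; only the names Stephane and John come from the example. Embedding the passport in the person document is one possible MongoDB design among several, not necessarily the one the post settles on.

```python
# Hypothetical sample data: field names, countries and dates are illustrative.

# Relational shape: two "tables" linked by passports.people_id -> people.id.
people = [
    {"id": 1, "name": "Stephane"},
    {"id": 2, "name": "John"},
]
passports = [
    {"id": 1, "people_id": 1, "country": "FR", "valid_until": "2020-01-01"},
    {"id": 2, "people_id": 2, "country": "US", "valid_until": "2021-06-30"},
]


def join_passports(people_rows, passport_rows):
    """Emulate a LEFT JOIN: pair each person with their passport row, or None."""
    by_person = {row["people_id"]: row for row in passport_rows}
    return [
        {"name": person["name"], "passport": by_person.get(person["id"])}
        for person in people_rows
    ]


def embed_passports(people_rows, passport_rows):
    """Build one document per person with the passport embedded as a sub-document."""
    by_person = {row["people_id"]: row for row in passport_rows}
    docs = []
    for person in people_rows:
        doc = {"_id": person["id"], "name": person["name"]}
        passport = by_person.get(person["id"])
        if passport is not None:
            doc["passport"] = {
                "country": passport["country"],
                "valid_until": passport["valid_until"],
            }
        docs.append(doc)
    return docs
```

With the relational shape, reading a person together with their passport needs a join at query time; with the embedded shape, a single document lookup returns both.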

Relational Design

In the relational world, the basic idea is to try to stick to the 3rd normal form and create two tables (I’ll omit indexes and foreign keys for clarity – MongoDB supports indexes but not foreign keys):

mysql> select * from people;
+----+------------+
| id | name       |
+----+------------+
|  1 | Stephane   |
|  2 | John       |
|  3 | …
[Read more]
Simple MySQL: using TRIGGERs to keep datetime columns updated without direct SQL calls

If you’ve ever used closed-source code, or applications that you don’t have complete control over, then you may have run into situations where you need to alter data on a per-row basis but have been unable to do so because you cannot change the application's SQL. The solution to this type of problem is to use a MySQL TRIGGER, which allows us to execute arbitrary SQL commands when defined events occur. Why is this useful and how does it work? Well…

For example, I have a freeRADIUS server that uses MySQL as a backend for user authentication, and one of my server applications (HostBill) provides a freeRADIUS plugin that allows my users to manage their RADIUS accounts. However, the default freeRADIUS schema lacks a DATETIME column on the user table, so when a user is created (INSERT) or has their password changed (UPDATE), I have no row data that tells me when these operations were issued. Typically this would be a trivial change: issue an ALTER TABLE …
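The general idea of letting the database stamp rows on INSERT and UPDATE, without touching the application's SQL, can be sketched as follows. This is only an illustration under loud assumptions: it uses Python's built-in `sqlite3` module as a stand-in, and SQLite's trigger syntax differs from MySQL's, so the statements below are not the ones you would run against the freeRADIUS database. The table `radius_users_demo` and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical user table with the two timestamp columns added.
    CREATE TABLE radius_users_demo (
        id         INTEGER PRIMARY KEY,
        username   TEXT NOT NULL,
        password   TEXT NOT NULL,
        created_at TEXT,
        updated_at TEXT
    );

    -- Stamp both columns when the application inserts a row.
    CREATE TRIGGER radius_users_demo_ai AFTER INSERT ON radius_users_demo
    BEGIN
        UPDATE radius_users_demo
           SET created_at = datetime('now'), updated_at = datetime('now')
         WHERE id = NEW.id;
    END;

    -- Refresh updated_at when the application changes the password.
    CREATE TRIGGER radius_users_demo_au AFTER UPDATE OF password ON radius_users_demo
    BEGIN
        UPDATE radius_users_demo
           SET updated_at = datetime('now')
         WHERE id = NEW.id;
    END;
    """
)

# The "application" only knows about username and password.
conn.execute(
    "INSERT INTO radius_users_demo (username, password) VALUES (?, ?)",
    ("alice", "secret"),
)
conn.execute(
    "UPDATE radius_users_demo SET password = ? WHERE username = ?",
    ("new-secret", "alice"),
)
row = conn.execute(
    "SELECT password, created_at, updated_at FROM radius_users_demo WHERE username = ?",
    ("alice",),
).fetchone()
```

The application's statements never mention the timestamp columns; the triggers fill them in as a side effect of the INSERT and UPDATE events.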

[Read more]
Dirty growth-over-time query

I’ve been messing around with the Kontrollbase schema for the last couple of days, writing various queries for the daily reporting scripts that will eventually produce an automated PDF report. I’ll give you examples of two of the queries, the first being overall environment stats, and the second being single-host growth over time.

Overall environment stats
select ((((MAX(os_mem_used)) / 1024 ) / 1024) / 1024) max_os_mem_used, ((((MIN(os_mem_used)) / 1024 ) / 1024) / 1024) min_os_mem_used, ((((AVG(os_mem_used)) / 1024 ) / 1024) / 1024) avg_os_mem_used, ((((STDDEV_POP(os_mem_used)) / 1024 ) / 1024) / 1024) stdev_os_mem_used, ((((MAX(length_data + length_index)) / 1024 ) / 1024) / 1024) max_size, ((((MIN(length_data + length_index)) / 1024 ) / 1024) / 1024) min_size, ((((AVG(length_data + length_index)) / 1024 ) / 1024) / 1024) avg_size, ((((STDDEV_POP(length_data + length_index)) / 1024 ) / 1024) / 1024) …
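The repeated `/ 1024 / 1024 / 1024` in the query amounts to dividing by 1024³. As a rough cross-check of what each aggregate computes, here is a small Python sketch, assuming the stored values are byte counts (the excerpt does not state the unit). The function name `summarize_gib` is hypothetical, and `statistics.pstdev` is used because it is the population standard deviation, the same quantity `STDDEV_POP` returns.

```python
import statistics

BYTES_PER_GIB = 1024 ** 3  # same as dividing by 1024 three times


def summarize_gib(byte_values):
    """Return max, min, mean and population stdev of byte counts, in GiB."""
    values = list(byte_values)
    if not values:
        raise ValueError("need at least one value")
    return {
        "max": max(values) / BYTES_PER_GIB,
        "min": min(values) / BYTES_PER_GIB,
        "avg": statistics.mean(values) / BYTES_PER_GIB,
        "stdev": statistics.pstdev(values) / BYTES_PER_GIB,
    }
```

For example, `summarize_gib([1024 ** 3, 3 * 1024 ** 3])` describes two hosts using 1 GiB and 3 GiB.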

[Read more]
MySQL Data Type Q&A

Question: “When I use procedure analyse() on my schema it suggests TINYINT for the columns which have the data type VARCHAR. Based on the performance and data requirements, which one is better?”

Answer: TINYTEXT, TINYINT, and VARCHAR are quite different. For reference, I would refer you to the MySQL manual page about data types.

However, procedure analyse() reads the values you have in your columns and, if they consistently fit a pattern that is better suited to another data type, suggests that type. For instance, if your column is VARCHAR(1) and your data looks like “1,4,7,5,2”, then TINYINT would be a better fit, since you are dealing with numbers and not variable-length characters. Similarly, if you have the same VARCHAR column but your data is “a,b,t,h,o”, then TINYTEXT or CHAR would be better …

[Read more]
Hot cache data, sharding

In the last several months at Grazr, we've been wrestling with a large database (running on MySQL) of feeds and feed items. The schema is essentially a feeds table with child tables items, items_text (text), and enclosures. We have this database so that users can merge feeds into a Stream: an aggregate feed containing the items of whichever feeds are in the merge list. It works great; the only problem is the volume of data, since more data means the query that produces the merge becomes slower. We want this merge to run on the fly, and if it's too slow, the user experience is unacceptable.

So, now I'm in the process of implementing a "Hot Cache" of feeds with an LRU (Least Recently Used) policy. The idea is that this cache provides a smaller data set to run the merge query against. We need to be able to handle storing much more data than we currently do …
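As an illustration of the LRU policy only, here is a minimal in-memory sketch in Python. It is not the Grazr implementation, which the excerpt does not show and which presumably lives in MySQL tables; the class name `HotFeedCache`, its methods, and the capacity parameter are all hypothetical.

```python
from collections import OrderedDict


class HotFeedCache:
    """Keep at most `capacity` feeds, evicting the least recently used one."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._feeds = OrderedDict()  # oldest access first, newest last

    def get(self, feed_id):
        """Return the cached items for a feed and mark it as recently used."""
        if feed_id not in self._feeds:
            return None
        self._feeds.move_to_end(feed_id)
        return self._feeds[feed_id]

    def put(self, feed_id, items):
        """Store a feed's items; return the ids of any feeds evicted to make room."""
        self._feeds[feed_id] = items
        self._feeds.move_to_end(feed_id)
        evicted = []
        while len(self._feeds) > self.capacity:
            old_id, _ = self._feeds.popitem(last=False)  # drop least recently used
            evicted.append(old_id)
        return evicted

    def __contains__(self, feed_id):
        return feed_id in self._feeds
```

Every read or write moves a feed to the "recent" end, so feeds nobody merges drift toward the other end and are the first to be dropped when the cache is full.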

[Read more]
FULLTEXT lesson/reminder of the day

I've been using MySQL fulltext indexes on a table where I keep a few varchar columns and one text column that are used for searches. I've had it defined as:

CREATE TABLE `items_text` (
  `item_id` bigint(20) NOT NULL,
  `fts` varchar(4) NOT NULL default 'grzr',
  `author` varchar(80) NOT NULL default '',
  `title` varchar(255) NOT NULL default '',
  `content` text NOT NULL,
  PRIMARY KEY  (`item_id`),
  FULLTEXT KEY `title` (`title`),
  FULLTEXT KEY `author` (`author`),
  FULLTEXT KEY `fts` (`fts`),
  FULLTEXT KEY `content` (`content`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8



One of my colleagues pointed out he was experiencing slow performance with this query:

select count(*) from items_text where (MATCH (title, author, content) AGAINST ('+iron +man' IN BOOLEAN MODE))



I ran EXPLAIN just to make sure that the index was being used:

mysql> explain …
[Read more]
High Performance MySQL, Second Edition: Schema Optimization and Indexing

I've been trying to circle back and clean up things I left for later in several chapters of High Performance MySQL, second edition. This includes a lot of material in chapter 4, Schema Optimization and Indexing. At some point I'll write more about the process of writing this book, and what we've done well and what we've learned to do better, but for right now I wanted to complete the picture of what material we have on schema, index, and query optimization. The last two chapters I've written about (Query Performance Optimization and Advanced MySQL Features) have generated lots of feedback along the lines …

[Read more]