Home |  MySQL Buzz |  FAQ |  Feeds |  Submit your blog feed |  Feedback |  Archive |  Aggregate feed RSS 2.0 English Deutsch Español Français Italiano 日本語 Русский Português 中文
Showing entries 1 to 30 of 39 Next 9 Older Entries

Displaying posts with tag: design (reset)

How to structure and design a relational database to support you data storage needs?
+0 Vote Up -0Vote Down
Well, every now and then, when we began to start a new project or app, which has some data storage requirement, we have a deep intriguing thought as to how best represent the data structure so as to support a variety of needs including but not limited to (ACID rules):

1. Normalization
2. Reliability
3. Consistency
4. And many others

Below, I provide a set of steps which you can follow to arrive at a data model that correctly suites your requirements.

Steps:

1. Identify the project or app requirements / specifications and business rules which tell you what your app will be able to do when it is ready.
2. From these business rules, identify possible objects for each business rule and mark them in a paper using rectangular sections like authors,











  [Read more...]
Reports exaggerated
Employee +1 Vote Up -0Vote Down
I've been letting the blog rest recently, and not so recently as well.  The problem is not a lack of subjects, but a lack of time to do them any justice.  However it is quite sad to see that my last entry was in September 2012, so it is time to post again.

Of late I have been pondering what I have to say about :
  • Distributed MVCC and write-scaling
  • Different approaches to eventual consistency with replicated RDBMS
  • Various MySQL Cluster related topics
  • Various general rambling and unstructured topics
However, these will take some time to percolate and calcify.

In the meantime here are some things I have found interesting recently :






  [Read more...]
Parallel replication: off by one
+2 Vote Up -0Vote Down

One of the most common errors in development is where a loop or a retrieval by index falls short or long by one unit, usually because of an oversight or a logic in coding.

Of the following snippets, which one will run 10 times?

/* #1 */    for (N = 0 ; N < 10; N++) printf("%d\n", N);

/* #2 */ for (N = 0 ; N <= 10; N++) printf("%d\n", N);

/* #3 */ for (N = 1 ; N <= 10; N++) printf("%d\n", N);

/* #4 */ for (N = 1 ; N < 10; N++) printf("%d\n", N);

The question is deceptive, as there are two snippets that will run 10 times (1 and 3). But they will print different numbers. If you ware aiming for numbers from 1 to 10, only #3 is good.

After many years of programming, off-by-one errors are rare in my code, and I have been able to spot








  [Read more...]
MySQL Workbench 6.0 - New Design and Many Enhancements
Employee_Team +5 Vote Up -0Vote Down

New GUI, 30+ New Features, and Major New Components

Oracle is excited to announce the immediate availability of the production-read, GA release of MySQL Workbench 6.0, available for download under the GPL, as well as part of the Commercial MySQL Standard, Enterprise, and Cluster Carrier Grade Editions with 24x7 global support.

The need by database professionals for management tools has increased with expanding data volumes, web, cloud and mobile computing growth. Improvement and additions in MySQL Workbench helps developers and administrators better manage these dynamic data environments. This latest GA release includes many new features and a modernized user interface that allows users to simplify MySQL database development, design and administration.

Overview
The goal of MySQL Workbench 6.0 is to simplify and improve the workflow in the

  [Read more...]
Implementing efficient Geo IP location system in MySQL
+0 Vote Up -0Vote Down

Often application needs to know where a user is physically located. The easiest way to figure that out is by looking up their IP address in a special database. It can all be implemented in MySQL, but I often see it done inefficiently. In my post I will show how to implement a complete solution that offers great performance.

Importing Geo IP data

First you will require a database mapping network addresses to real locations. There are various resources available, but I chose the one nginx web server uses with its geoip module. GeoLite City comes in CSV format and is available for download with no charge from MaxMind.

The archive contains two files. GeoLiteCity-Blocks.csv lists all IP

  [Read more...]
(My)SQL mistakes. Do you use GROUP BY correctly?
+2 Vote Up -0Vote Down

Often I see a SQL problem solved incorrectly and I do not mean inefficiently. Simply incorrectly. In many cases the developer remains unaware that they aren’t getting the results they were expecting or even if a result is correct, it is only by chance, for example because the database engine was smart enough to figure out some non-sense in a query. In a few posts I will try to disclose some of the more common problems.

Aggregate with GROUP BY

Unlike many other database systems, MySQL actually permits that an aggregate query returns columns not used in the aggregation (i.e. not listed in GROUP BY clause). It could be considered as flexibility, but in practice this can easily lead to mistakes if a person that designs queries does not understand how they will be executed. For example, what values an aggregate query returns for a

  [Read more...]
The CAP theorem and MySQL Cluster
Employee +8 Vote Up -0Vote Down
tldr; A single MySQL Cluster prioritises Consistency in Network partition events. Asynchronously replicating MySQL Clusters prioritise Availability in Network partition events.


I was recently asked about the relationship between MySQL Cluster and the CAP theorem. The CAP theorem is often described as a pick two out of three problem, such as choosing from good, cheap, fast. You can have any two, but you can't have all three. For CAP the three qualities are 'Consistency', 'Availability' and 'Partition tolerance'. CAP states that in a system with data replicated over a network only two of these three qualities can be maintained at once, so which two does MySQL Cluster provide?

Standard 'my interpretation of CAP' section

Everyone who discusses CAP like to rehash





  [Read more...]
One billion
Employee +5 Vote Up -0Vote Down
As always, I am a little late, but I want to jump on the bandwagon and mention the recent MySQL Cluster milestone of passing 1 billion queries per minute. Apart from echoing the arbitrarily large ransom demand of Dr Evil, what does this mean?

Obviously 1 billion is only of interest to us humans as we generally happen to have 10 fingers, and seem to name multiples in steps of 10^3 for some reason. Each processor involved in this benchmark is clocked at several billion cycles per second, so a single billion is not so vast or fast.

Measuring over a minute also feels unnatural for a computer performance benchmark - we are used to lots of things happening every second! A minute is a long time in silicon.

What's





  [Read more...]
Eventual Consistency in MySQL Cluster - implementation part 3
Employee +3 Vote Up -0Vote Down



As promised, this is the final post in a series looking at eventual consistency with MySQL Cluster asynchronous replication. This time I'll describe the transaction dependency tracking used with NDB$EPOCH_TRANS and review some of the implementation properties.

Transaction based conflict handling with NDB$EPOCH_TRANS

NDB$EPOCH_TRANS is almost exactly the same as NDB$EPOCH, except that when a conflict is detected on a row, the whole user transaction which made the conflicting row change is marked as conflicting, along with any dependent transactions. All of these rejected row operations are then handled using






  [Read more...]
Eventual consistency in MySQL Cluster - implementation part 2
Employee +5 Vote Up -0Vote Down



In previous posts I described how row conflicts are detected using epochs. In this post I describe how they are handled.

Row based conflict handling with NDB$EPOCH

Once a row conflict is detected, as well as rejecting the row change, row based conflict handling in the Slave will :
  • Increment conflict counters
  • Optionally insert a row into an exceptions table
For NDB$EPOCH, conflict detection and handling operates on one Cluster in an Active-Active pair designated as the Primary. When a Slave MySQLD attached to the Primary Cluster detects a conflict between data stored in the








  [Read more...]
Eventual consistency in MySQL Cluster - implementation part 1
Employee +4 Vote Up -0Vote Down



The last post described MySQL Cluster epochs and why they provide a good basis for conflict detection, with a few enhancements required. This post describes the enhancements.

The following four mechanisms are required to implement conflict detection via epochs :
  • Slaves should 'reflect' information about replicated epochs they have applied
    Applied epoch numbers should be included in the Slave Binlog events returning to the originating cluster, in a Binlog position corresponding to the commit time of the






  •   [Read more...]
    Eventual Consistency in MySQL Cluster - using epochs
    Employee +5 Vote Up -0Vote Down



    Before getting to the details of how eventual consistency is implemented, we need to look at epochs. Ndb Cluster maintains an internal distributed logical clock known as the epoch, represented as a 64 bit number. This epoch serves a number of internal functions, and is atomically advanced across all data nodes.

    Epochs and consistent distributed state

    Ndb is a parallel database, with multiple internal transaction coordinator components starting, executing and committing transactions against rows stored in different data nodes. Concurrent transactions only interact where they attempt to lock the same row. This






      [Read more...]
    Eventual Consistency - detecting conflicts
    Employee +3 Vote Up -0Vote Down



    In my previous posts I introduced two new conflict detection functions, NDB$EPOCH and NDB$EPOCH_TRANS without explaining how these functions actually detect conflicts? To simplify the explanation I'll initially consider two circularly replicating MySQL Servers, A and B, rather than two replicating Clusters, but the principles are the same.

    Commit ordering

    Avoiding conflicts requires that data is only modified on one Server at






      [Read more...]
    Some MySQL projects I think are cool - Shard-Query
    Employee +0 Vote Up -0Vote Down
    I've already described Justin Swanhart's Flexviews project as something I think is cool. Since then Justin appears to have been working more on Shard-Query which I also think is cool, perhaps even more so than Flexviews.

    On the page linked above, Shard-Query is described using the following statements :

    "Shard-Query is a distributed parallel query engine for MySQL"
    "ShardQuery is a PHP class which is intended to make working with a partitioned dataset easier"
    "ParallelPipelining - MPP distributed query engines runs fragments of queries in parallel, combining the results at the end. Like map/reduce except it speaks SQL





      [Read more...]
    Eventual consistency with transactions
    Employee +3 Vote Up -0Vote Down



    In my last post I described the motivation for the new NDB$EPOCH conflict detection function in MySQL Cluster. This function detects when a row has been concurrently updated on two asynchronously replicating MySQL Cluster databases, and takes steps to keep the databases in alignment.

    With NDB$EPOCH, conflicts are detected and handled on a row granularity, as opposed to column granularity, as this is the granularity of the epoch metadata used to detect conflicts. Dealing




      [Read more...]
    Eventual consistency with MySQL
    Employee +2 Vote Up -0Vote Down



    tl;dr : New 'automatic' optimistic conflict detection functions available giving the best of both optimistic and pessimistic replication on the same data

    MySQL replication supports a number of topologies, and one of the most interesting is an active-active, or master-master topology, where two or more Servers accept read and write traffic, with asynchronous replication between them.

    This topology has a number of attractions, including :
    • Potentially higher availability
    • Potentially low impact on read/write latency
    • Service availability insensitive to replication







      [Read more...]
    Some MySQL projects I think are cool - HandlerSocket Plugin
    Employee +2 Vote Up -0Vote Down
    The HandlerSocket project is described in Yoshinori Matsunobu's blog entry under the title 'Using MySQL as a NoSQL - A story for exceeding 750,000 qps on a commodity server'. It's a great headline and has generated a lot of buzz. Quite a few early commentators were a little confused about what it was - a new NoSQL system using InnoDB? A cache? In memory only? Where does Memcached come in? Does it support the Memcached protocol? If not, why not? Why is it called HandlerSocket?

    Inspirations from Memcache may include the focus on simplicity, performance and a simple human readable protocol. As Yoshinori says, Kazuho Oku has already implemented a MySQLD-embedded Memcached server, no

      [Read more...]
    Journey upriver to the dark heart of ha_ndbcluster
    Employee +3 Vote Up -0Vote Down
    Unlike most other MySQL storage engines, Ndb does not perform all of its work in the MySQLD process. The Ndb table handler maps Storage Engine Api calls onto NdbApi calls, which eventually result in communication with data nodes. In terms of layers, we have SQL -> Handler Api -> NdbApi -> Communication. At each of these layer boundaries, the mapping between operations at the upper layer to operations at the lower layer is non trivial, based on runtime state, statistics, optimisations etc.

    The MySQL status variables can be used to understand the behaviour of the MySQL Server in terms of user commands processed, and also how these map to some of the Storage Engine Handler Api calls.

    Status variables



      [Read more...]
    MySQL Cluster online scaling
    Employee +7 Vote Up -0Vote Down
    Most people looking at a diagram showing the Cluster architecture soon want to know if the system can scale online. Api nodes such as MySQLD processes can be added online, and the storage capacity of existing data nodes can be increased online, but it was not always possible to add new data nodes to the cluster without an initial system restart requiring a backup and restore.

    An online add node and data repartitioning feature was finally implemented in MySQL Cluster 7.0. It's not clear how often users actually do scale their Clusters online, but it certainly is a cool thing to be able to do.

    There are two parts to the feature :
  • Online add an empty data node to an existing cluster
  • Online rebalance existing data across the existing and new data nodes

  • Adding an empty data node to a cluster sounds trivial, but is actually fairly





      [Read more...]
    Data distribution in MySQL Cluster
    Employee +5 Vote Up -0Vote Down
    MySQL Cluster distributes rows amongst the data nodes in a cluster, and also provides data replication. How does this work? What are the trade offs?

    Table fragments

    Tables are 'horizontally fragmented' into table fragments each containing a disjoint subset of the rows of the table. The union of rows in all table fragments is the set of rows in the table. Rows are always identified by their primary key. Tables with no primary key are given a hidden primary key by MySQLD.

    By default, one table fragment is created for each data node in the cluster at the time the table is created.

    Node groups and Fragment replicas

    The data nodes in a cluster are logically divided into Node groups. The size of each Node group is controlled by the NoOfReplicas parameter. All data nodes in a Node group store the same data. In









      [Read more...]
    Low latency distributed parallel joins
    Employee +9 Vote Up -0Vote Down
    When MySQL AB bought Sun Microsystems in 2008 (or did Sun buy MySQL?), most of the MySQL team merged with the existing Database Technology Group (DBTG) within Sun. The DBTG group had been busy working on JavaDB, Postgres and other DB related projects as well as 'High Availability DB' (HADB), which was Sun's name for the database formerly known as Clustra.

    Clustra originated as a University research project which spun out into a startup company and was then acquired by Sun around the era of dot-com. A number of technical papers describing aspects of Clustra's design and history can be found online, and it is in many ways similar to Ndb Cluster, not just in their shared Scandinavian roots. Both are shared-nothing parallel databases originally aimed at the Telecoms market, supporting high availability

      [Read more...]
    Some MySQL projects I think are cool - OpenQuery Graph Engine (OQG)
    Employee +1 Vote Up -0Vote Down
    This project was announced a year or so ago by Antony Curtis who used to work for MySQL AB. Having met Antony a few times I was intrigued to see what he was up to. The quote on the OpenQuery website describes it well :
    The Open Query GRAPH engine (OQGRAPH) is a computation engine allowing hierarchies and more complex graph structures to be handled in a relational fashion. In a nutshell, tree structures and friend-of-a-friend style searches can now be done using standard SQL syntax, and results joined onto other tables.

    That sounds cool, and it's the first time I've heard of a MySQL 'Computation engine' plugin. Delving further into the manual gives some insight, and there's some unexpected twists there :
    • OQG is a storage



      [Read more...]
    Some MySQL projects I think are cool - Flexviews
    Employee +0 Vote Up -0Vote Down
    Most of the time we think of SQL queries as being executed at a point in time and generating a single definitive result, but huge efficiency gains are available when data changes are tracked and derived views are partially updated as needed rather than being fully recomputed periodically. MySQL has support for views on tables, but there is currently no support for materialized views. While thinking about this topic I decided to have another look at Justin Swanhart's Flexviews tool and it's definitely a cool MySQL based project.

    Flexviews is an open source set of non-intrusive addons to MySQL enabling materialized views to be defined and maintained as the underlying tables are changed. If you're not sure what a materialized view is or why they can be useful then I recommend reading the intro on the Flexviews

      [Read more...]
    Cool Web Designer is Looking for Work
    +0 Vote Up -0Vote Down

    My wife – a good web designer with 6 years of experience with web design, HTML and CSS is looking for a job. Here is some information about her:

    We’re physically located in Toronto, Canada, but she has a great experience of working remotely too. So, if you need a web designer or a junior web designer, feel free to contact Tanya.

      [Read more...]
    ACID tradeoffs, modularity, plugins, Drizzle
    Employee +1 Vote Up -1Vote Down
    Most software people are aware of the ACID acronym coined by Jim Gray. With the growth of the web and open source, the scaling and complexity constraints imposed on DBMS implementations supporting ACID are more visible, and new (or at least new terms for known) compromises and tradeoffs are being discussed widely. The better known NoSQL systems are giving insight by example into particular choices of tradeoffs.

    Working at MySQL, I have often been surprised at the variety of potential alternatives when implementing a DBMS, and the number of applications which don't need the full set of ACID letters in the strictest form. The original MySQL storage engine, MyISAM is one of the first and most successful examples of

      [Read more...]
    My favorite MySQL data type – DECIMAL(31,0)
    +1 Vote Up -0Vote Down

    It may seem hard to believe, but I have seen DECIMAL(31,0) in action on a production server. Not just in one column, but in 15 columns just in the largest 4 tables of one schema. The column was being used to represent a integer primary or foreign key column.

    In a representative production instance (one of a dozen plus distributed production database servers) the overall database footprint was decreased from ~10 GB to ~2 GB, a 78% saving. In total, 15 columns across just 4 tables were changed from DECIMAL(31,0) to INT UNSIGNED.

    One single table > 5GB was reduced to under 1GB (a 81% saving). This being my record for any GB+ tables in my time working with the MySQL database.

    Had this server for example had 4GB of RAM, and say 2.5GB allocated to the innodb_buffer_pool_size, this one change moved the system from requiring more consistent disk access (4x

      [Read more...]
    MySQL Cluster development
    Employee +1 Vote Up -0Vote Down
    MySQL Cluster (http://www.mysql.com/products/database/cluster/) is the name given to one or more MySQL Server processes, connected to an Ndb Cluster database. From the point of view of the MySQL Server processes, the Ndb Cluster is a Storage Engine, implementing transactional storage of tables containing rows. From the point of view of the Ndb Cluster database, the MySQL Server processes are API nodes, performing DDL and DML transactions on tables stored in the cluster. Both exist independently – Ndb Cluster can be used without attached MySQL Server processes, but almost all users of Ndb Cluster connect at least one MySQL Server for DDL and administration.

    Ndb stands for Network DataBase. This is a telecoms phrase where Network usually refers to a fixed or wireless telephone

      [Read more...]
    Workbench 5.2 Alpha
    Employee +0 Vote Up -1Vote Down

    In case you have not already noticed, Workbench 5.2 alpha / preview release of MySQL's premier development and design tool, has been announced.

    For an independent preview, you can also see here

    Workbench 5.2 Alpha
    Employee +0 Vote Up -0Vote Down

    In case you have not already noticed, Workbench 5.2 alpha / preview release of MySQL's premier development and design tool, has been announced.

    For an independent preview, you can also see here

    Workbench 5.2 Alpha
    Employee +0 Vote Up -0Vote Down

    In case you have not already noticed, Workbench 5.2 alpha / preview release of MySQL's premier development and design tool, has been announced.

    For an independent preview, you can also see here

    Showing entries 1 to 30 of 39 Next 9 Older Entries

    Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

    Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.