Everything is awesome
My kids watched the new Lego movie today and spent the rest of the day repeating "Everything is awesome". I spent a few hours reading MongoDB documentation to help a friend who uses it. Everything wasn't awesome for all of us. I try to be a pessimist when reading database documentation. If you spend any time near production, then you spend a lot of time debugging things that fail. Being less than optimistic is a good way to predict failure.

One source of pessimism is database limits. MongoDB has a great page that describes its limits. Index keys are limited to 1024 bytes. But that page is also a good example of the value of pessimism. The documentation states that values (MongoDB documents) are not added to the index when the index key is too large. An optimist might assume that an insert or update statement fails when the index key is too large, but that is not the specified behavior.

As far as I can tell, prior to MongoDB 2.5.5 the behavior was to not add the document to the index when the indexed field exceeded 1024 bytes. The insert or update would succeed but the index maintenance would be skipped. Queries that used the index after that could return incorrect results.
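Here is a minimal repro sketch of the failure described above, written in Python. It assumes a pre-2.5.5 mongod on localhost and the pymongo 2.x API from that era; the collection and field names are made up for illustration.

from pymongo import MongoClient

# Assumptions: a pre-2.5.5 mongod on localhost:27017 and pymongo 2.x.
client = MongoClient("localhost", 27017)
coll = client.test.big_keys
coll.drop()
coll.ensure_index("val")

big = "x" * 2000                       # indexed value is well over the 1024-byte key limit
coll.insert({"_id": 1, "val": big})    # succeeds; the server only logs "key too large"

print(coll.find_one({"_id": 1}))       # found via the _id index
print(coll.find_one({"val": big}))     # None on such a server: the query uses the index
                                       # on "val", which never saw the document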

A quick search of the interwebs shows that people were aware of the problem in 2011. I can reproduce the problem on my Ubuntu 12.10 VM. Why do we tolerate problems like this? Maybe this isn't a big deal, and the real problem is that new risks (from a system I don't know much about) feel worse than risks in software that I have been using for years. But corruption (either permanent or via incorrect query results) has been a stop-the-world bug for me -- as in, you do nothing else until the problem has been fixed. Why have MongoDB users tolerated this problem for years?

While it is great that the bug appears to have been fixed, database vendors should understand that FUD takes much more time to go away. See all of the stories about transactions and MySQL that lived on long after InnoDB became viable. And note that some of the MySQL FUD was self-inflicted -- see the section of the MySQL manual on Atomic Operations.

I found this code in 2.4.9 that explains how key-too-large is handled. A careful reader might figure out that index maintenance isn't done. Nice optimization. I spoke to one user who doesn't like the old behavior but doesn't want to break apps with the new behavior that fails inserts/updates with too-large keys. Indexes on a prefix of a key would help in that case (see the sketch after the code below).

template< class V >
void BtreeBucket<V>::twoStepInsert(DiskLoc thisLoc,
                                   IndexInsertionContinuationImpl<V> &c,
                                   bool dupsAllowed) const {
    // When the key exceeds KeyMax, log a line on the server and return
    // without doing any index maintenance. The write itself still succeeds.
    if ( c.key.dataSize() > this->KeyMax ) {
        problem() << "ERROR: key too large len:" << c.key.dataSize()
                  << " max:" << this->KeyMax << ' '
                  << c.key.dataSize() << ' '
                  << c.idx.indexNamespace() << endl;
        return; // op=Nothing
    }
    insertStepOne(thisLoc, c, dupsAllowed);
}

/** todo: meaning of return code unclear clean up */
template< class V >
int BtreeBucket<V>::bt_insert(const DiskLoc thisLoc, const DiskLoc recordLoc,
                              const BSONObj& _key, const Ordering &order,
                              bool dupsAllowed,
                              IndexDetails& idx, bool toplevel) const {
    guessIncreasing = _key.firstElementType() == jstOID && idx.isIdIndex();
    KeyOwned key(_key);

    dassert(toplevel);
    if ( toplevel ) {
        if ( key.dataSize() > this->KeyMax ) {
            // Same pattern: log "skipping" on the server and return early,
            // so the document is never added to this index.
            problem() << "Btree::insert: key too large to index, skipping "
                      << idx.indexNamespace() << ' ' << key.dataSize()
                      << ' ' << key.toString() << endl;
            return 3;
        }
    }

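MongoDB doesn't have prefix indexes the way MySQL does, but an application can approximate one by storing and indexing a truncated copy of the field. The sketch below is only an illustration, assuming pymongo and made-up collection and field names; the trade-off is that every writer has to keep the copy in sync.

from pymongo import MongoClient

PREFIX_LEN = 512   # keep the indexed value comfortably under the 1024-byte key limit

client = MongoClient("localhost", 27017)
coll = client.test.prefix_demo
coll.ensure_index("val_prefix")        # index only the truncated copy

def save(doc):
    # Maintain the truncated copy by hand on every write.
    doc["val_prefix"] = doc["val"][:PREFIX_LEN]
    coll.insert(doc)

def find_by_val(val):
    # The prefix narrows candidates via the index; the full value filters exactly.
    return coll.find_one({"val_prefix": val[:PREFIX_LEN], "val": val})

Lookups stay index-assisted as long as the first PREFIX_LEN bytes are selective, and writes no longer depend on whether the server skips or rejects too-large keys.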
