Introduction # I recently encountered an intriguing bug. A user reported that their query was causing vtgate to fetch a large amount of data, sometimes resulting in an Out Of Memory (OOM) error. For a deeper understanding of grouping and aggregations on Vitess, I recommend reading this prior blog post. The Query # The problematic query was: selectsum(user.type)fromuserjoinuser_extraonuser.team_id=user_extra.idgroupbyuser_extra.idorderbyuser_extra.id;The planner was unable to delegate aggregation to MySQL, leading to the fetching of a significant amount of data for aggregation.
Query planning is hard # Have you ever wondered what goes on behind the scenes when you execute a SQL query? What steps are taken to access your data? In this article, I'll talk about the history of Vitess's V3 query planner, why we created a new query planner, and the development of the new Gen4 query planner. Vitess is a horizontally scalable database solution which means that a single table can be spread out across multiple database instances.
Originally posted at Andres's blog. Traditional query optimizing is mostly about two things: first, in which order and from where to access data, and then how to then combine it. You have probably seen the tree shapes execution plans that are produced from query planning. I’ll use an example from the MySQL docs, using FORMAT=TREE which was introduced in MySQL 8.0: mysql>EXPLAINFORMAT=TREE->SELECT*->FROMt1->JOINt2->ON(t1.c1=t2.c1ANDt1.c2<t2.c2)->JOINt3->ON(t2.c1=t3.c1)\G***************************1.row***************************EXPLAIN:->Innerhashjoin(t3.c1=t1.c1)(cost=1.05rows=1)->Tablescanont3(cost=0.35rows=1)->Hash->Filter:(t1.c2<t2.c2)(cost=0.70rows=1)->Innerhashjoin(t2.c1=t1.c1)(cost=0.70rows=1)->Tablescanont2(cost=0.35rows=1)->Hash->Tablescanont1(cost=0.35rows=1)Here we can see that the MySQL optimizer thinks the best plan is to start reading from t1 using a table scan.
… or, “Making Event Attendance Count”
Late last year, I gave a keynote at paired Finnish conferences MindTrek and OpenMind. While the events were well worth attending, afterwards I spent a few bleak hours thinking about the actual costs of my attendance. If I had left Canada just for these events (which, thankfully, I didn’t) then a naive estimation of costs would have been something like this:
- ~20 000 km of air travel (Vancouver to Frankfurt, Frankfurt to Helsinki. Return.)
- ~7 000+ CAD of costs (flights, hotels, taxis, meals, time) (borne by a combination of eZ Systems, the Mozilla Foundation and the …
The organizers of the PHP Québec Conference were gracious enough to give me an unedited copy of my Copyright, Contracts and Licensing for PHP Developers session.
I cleaned the audio up last night, stripping out some of the more odious filler words (I seem to say “Umm” rather often), shortening pauses as I switched slides, removing redundant asides (like asking if there are any questions, when no questions then follow) and excising the introduction and applause.
After a good deal of consideration, I did chose to leave the errors and other flaws in the content presented. These flaws were presented to the audience and they should stay in the recording.
Of course, I don’t want people to be mislead by any of the flaws; to prevent this, I am transcribing the entire talk, …
[Read more]