Many of my recent engagements have revolved around strategies for implementing Real Time Big Data Analytics: estimating the hardware cost of scaling a single table collection with MariaDB and the Parallel Query feature of the Spider storage engine, compared to offloading to columnar MPP storage like InfiniDB or Vertica.
As of today, Parallel Query is only available in the MariaDB Spider releases supported by Spiral Arms. The most efficient way to use parallel query with Spider is on GROUP BY and COUNT queries that access a single Spider table. In that case the Spider engine executes query push down, AKA map reduce.
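As an illustration, a query of the following shape is the kind of statement that benefits; the table and column names here are hypothetical:

select region, count(*)
from sales_spd          -- hypothetical Spider-partitioned table
group by region;
-- each backend computes the GROUP BY on its own partition;
-- Spider only merges the partial aggregates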
Spider provides multiple levels of parallel execution for a single partitioned table.
The first level is per backend server: to tell Spider to scan the different backends concurrently, set spider_sts_bg_mode=1.
The second level is per partition: to tell Spider to scan the different partitions concurrently, set spider_casual_read=1.
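For example, assuming both variables are dynamic in your Spider build, they can be enabled globally (or placed in my.cnf):

set global spider_sts_bg_mode = 1;  -- scan the different backends concurrently
set global spider_casual_read = 1;  -- scan the different partitions concurrently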
Hey, but that's sometimes not enough! Per partition parallel query can conflict with the connection recycling feature of Spider. Behind the scenes, Spider looks for already open connections in a hash table keyed on every partition's connection parameters: Host, User, Password, Port or Socket, and reuses an existing connection if it finds one. To really enable concurrent scans inside the same backend, or on localhost, you need to create different users to defeat Spider's connection recycling, and the server definitions should use a TCP connection string rather than a socket or named pipe.
In a real life scenario, it is good practice to create a different user per partition table and grant it SELECT privileges only on the single table attached to the Spider partition. In the DDL you then map partitions by creating a unique server definition per partition, each one mapping to its own user.
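A minimal sketch of that recommendation, using the test schema from the test case below and a hypothetical user name, would be:

-- one dedicated user per partition, limited to the single backend table
create user 'user_p1'@'127.0.0.1' identified by 'secret';
grant select on test.v1 to 'user_p1'@'127.0.0.1';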
Here is a local test case for this scenario, on a single server, that uses multiple cores to produce an aggregate.
CREATE OR REPLACE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `a` varchar(1000) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

insert into test select *, 'test' from seq_1_to_4000000;

create or replace view v1 as select * from test where id >= 0       and id < 1000000;
create or replace view v2 as select * from test where id >= 1000000 and id < 2000000;
create or replace view v3 as select * from test where id >= 2000000 and id < 3000000;
create or replace view v4 as select * from test where id >= 3000000 and id < 4000000;
grant all on *.* to root1@'127.0.0.1' identified by 'mariadb';
grant all on *.* to root2@'127.0.0.1' identified by 'mariadb';
grant all on *.* to root3@'127.0.0.1' identified by 'mariadb';
grant all on *.* to root4@'127.0.0.1' identified by 'mariadb';
create or replace server l1 FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root1', PORT 3307, PASSWORD 'mariadb', HOST '127.0.0.1', DATABASE 'test');
create or replace server l2 FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root2', PORT 3307, PASSWORD 'mariadb', HOST '127.0.0.1', DATABASE 'test');
create or replace server l3 FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root3', PORT 3307, PASSWORD 'mariadb', HOST '127.0.0.1', DATABASE 'test');
create or replace server l4 FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root4', PORT 3307, PASSWORD 'mariadb', HOST '127.0.0.1', DATABASE 'test');
create or replace table test_spd (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `a` varchar(1000) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=SPIDER
PARTITION BY RANGE(id) (
  PARTITION P1 VALUES LESS THAN (1000000) COMMENT='table "v1", srv "l1"',
  PARTITION P2 VALUES LESS THAN (2000000) COMMENT='table "v2", srv "l2"',
  PARTITION P3 VALUES LESS THAN (3000000) COMMENT='table "v3", srv "l3"',
  PARTITION P4 VALUES LESS THAN (4000000) COMMENT='table "v4", srv "l4"'
);
In this scenario, queries on my 4 million record table will use up to 4 cores, i.e. the number of views that materialize the partitions of the Spider table.
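For example, the following aggregates on the test_spd table should be pushed down and executed in parallel, with one backend connection per view/partition:

select count(*) from test_spd;
select a, count(*) from test_spd group by a;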
Now for a real use case, let's ask David Chanial, devops on the Believe Digital platform, who demonstrates this on an 8 node, 64 core cluster.
SVA: How do you initialize the cluster?
DCH: To manage the farm we deploy using the Python Fabric library! We created deployment scripts that take as input the node list, the number of cores and the number of replicas.
SVA: What type of table definitions and data sizing are you using?
DCH: Let's have a look at the Spider 2.5 billion record table:
MariaDB [spider01_ro]> show table status LIKE 'Sales' \G
Name: Sales
Engine: SPIDER
Version: 10
Row_format: Dynamic
Rows: 2500657155
Avg_row_length: 91
Data_length: 228106478935
Index_length: 40839657478
Auto_increment: 2618963872
Collation: utf8_general_ci
Create_options: partitioned
SVA: What type of partitioning?
DCH: It's a mixture of technical partitioning by auto increment and partitioning by business segments. Indeed we are using sub-partitioning, with a double layer of Spider tables that point to TokuDB or InnoDB tables in this case, and we reduce that to a modulo of the number of cores in the cluster.
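DCH's exact schema is not shown here, but a minimal sketch of such a two-level layout could look like the following; the range boundaries and server names are hypothetical, while the backend table names follow the tsrc_pXX_cYY pattern visible in the process list extract below:

create or replace table Sales_spd (
  id bigint NOT NULL AUTO_INCREMENT,
  idGenre int,
  PRIMARY KEY (id)
) ENGINE=SPIDER
PARTITION BY RANGE (id)
SUBPARTITION BY HASH (id) (
  PARTITION p15 VALUES LESS THAN (1600000000) (
    SUBPARTITION p15_c02 COMMENT = 'table "tsrc_p15_c02", srv "node1"',
    SUBPARTITION p15_c10 COMMENT = 'table "tsrc_p15_c10", srv "node1"'
    -- ... one sub-partition per core, modulo the number of cores in the cluster
  ),
  PARTITION p16 VALUES LESS THAN (MAXVALUE) (
    SUBPARTITION p16_c02 COMMENT = 'table "tsrc_p16_c02", srv "node2"',
    SUBPARTITION p16_c10 COMMENT = 'table "tsrc_p16_c10", srv "node2"'
  )
);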
SVA: What performance can you get?
MariaDB [spider01_ro]> select count(*) from spider01_ro.Sales t;
select idGenre, count(*) from spider01_ro.Sales GROUP BY idGenre;
+------------+
| count(*) |
+------------+
| 2506437338 |
+------------+
1 row in set (8.87 sec)
+---------+-----------+
| idGenre | count(*) |
+---------+-----------+
|       0 |      8137 |
|       1 |  56044584 |
|       2 |  21179162 |
|       3 |  25446110 |
|       4 |  31829221 |
..........
|     293 |   1386236 |
|     294 |     47109 |
|     295 |     50776 |
|     296 |       988 |
|     297 |     47589 |
|     298 |      9610 |
|     299 |      5215 |
|     300 |       224 |
+---------+-----------+
295 rows in set (16.00 sec)
Indeed, that is about 149M records read per second across the 8 nodes, and inside a single node all the cores are working hard, as this extract of the process list on one backend shows:
| 2848 | tsrc_p15_c02 | 10.93.1.12:51388 | spider01 | Query | 1 | Queried about 730000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p15_c02` group by `idGenre` | 0.000 |
| 2849 | tsrc_p15_c10 | 10.93.1.12:51396 | spider01 | Query | 1 | Queried about 720000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p15_c10` group by `idGenre` | 0.000 |
| 2850 | tsrc_p15_c18 | 10.93.1.12:51404 | spider01 | Query | 1 | Queried about 950000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p15_c18` group by `idGenre` | 0.000 |
| 2851 | tsrc_p15_c26 | 10.93.1.12:51412 | spider01 | Query | 1 | Queried about 740000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p15_c26` group by `idGenre` | 0.000 |
| 2852 | tsrc_p15_c34 | 10.93.1.12:51420 | spider01 | Query | 1 | Queried about 1060000 rows | select count(0),`idGenre` from `spider01`.`tsrc_p15_c34` group by `idGenre` | 0.000 |
| 2853 | tsrc_p15_c42 | 10.93.1.12:51428 | spider01 | Query | 1 | Queried about 920000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p15_c42` group by `idGenre` | 0.000 |
| 2854 | tsrc_p15_c50 | 10.93.1.12:51436 | spider01 | Query | 1 | Queried about 530000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p15_c50` group by `idGenre` | 0.000 |
| 2855 | tsrc_p15_c58 | 10.93.1.12:51444 | spider01 | Query | 1 | Queried about 790000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p15_c58` group by `idGenre` | 0.000 |
| 2856 | tsrc_p16_c02 | 10.93.1.12:51452 | spider01 | Query | 1 | Queried about 760000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p16_c02` group by `idGenre` | 0.000 |
| 2857 | tsrc_p16_c10 | 10.93.1.12:51460 | spider01 | Query | 1 | Queried about 660000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p16_c10` group by `idGenre` | 0.000 |
| 2858 | tsrc_p16_c18 | 10.93.1.12:51468 | spider01 | Query | 1 | Queried about 940000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p16_c18` group by `idGenre` | 0.000 |
| 2859 | tsrc_p16_c26 | 10.93.1.12:51476 | spider01 | Query | 1 | Queried about 930000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p16_c26` group by `idGenre` | 0.000 |
| 2860 | tsrc_p16_c34 | 10.93.1.12:51484 | spider01 | Query | 1 | Queried about 910000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p16_c34` group by `idGenre` | 0.000 |
| 2861 | tsrc_p16_c42 | 10.93.1.12:51492 | spider01 | Query | 1 | Queried about 800000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p16_c42` group by `idGenre` | 0.000 |
| 2862 | tsrc_p16_c50 | 10.93.1.12:51500 | spider01 | Query | 1 | Queried about 770000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p16_c50` group by `idGenre` | 0.000 |
| 2863 | tsrc_p16_c58 | 10.93.1.12:51508 | spider01 | Query | 1 | Queried about 740000 rows  | select count(0),`idGenre` from `spider01`.`tsrc_p16_c58` group by `idGenre` | 0.000 |
SVA: Thanks David
Take away:
We conclude that the columnar model, in such a scenario with fewer than 20 columns, is only about 10 times more efficient for a given hardware cost, and the RDBMS takes the lead on small range indexed queries.
Parallel and distributed queries are never an easy task, but we can make them shine on top of a regular, good old, well proven OLTP storage engine.
Stay tuned: thanks to the open source spirit of Spiral Arms, a foundation sponsor that is special to us, and to more and more clients with Spider support inside MariaDB, we will get more of that Spider release into the official MariaDB 10.2 branch.
If you feel like helping to finance this move, ask your MariaDB sales rep about specific Spider support, or contact a Spiral Arms sales rep directly.
Nov 12, 2015