Home |  MySQL Buzz |  FAQ |  Feeds |  Submit your blog feed |  Feedback |  Archive |  Aggregate feed RSS 2.0 English Deutsch Español Français Italiano 日本語 Русский
Air traffic queries in LucidDB
+7 Vote Up -0 Vote Down

After my first post Analyzing air traffic performance with InfoBright and MonetDB where I was not able to finish task with LucidDB, John Sichi contacted me with help to setup. You can see instruction how to load data on LucidDB Wiki page

You can find the description of benchmark in original post, there I will show number I have for LucidDB vs previous systems.

Load time
To load data into LucidDB in single thread, it took for me 15273 sec or 4.24h. In difference with other systems LucidDB support multi-threaded load, with concurrency 2 (as I have only 2 cores on that box), the load time is 9955 sec or 2.76h. For comparison
for InforBright load time is 2.45h and for MonetDB it is 2.6h

DataSize
Another interesting metric is datasize after load. In LucidDB db file after load takes 9.3GB.
UPDATE 27-Oct-2009 From metadata table the actual size of data is 4.5GB, the 9.3GB is size of physical file db.dat, which probably was not truncated after several loads of data.

For InfoBright it is 1.6GB, and for MonetDB - 65GB. Obviously LucidDB uses some compression, but it is not so aggressive as in InfoBright case. As original dataset is 55GB, compression rate for LucidDB is somewhat 1:12

Queries time

Let me put list of queries and times for all systems.

- Lame query "count start"
LucidDB:
SELECT count(*) FROM otp."ontime";
1 row selected (55.165 seconds)

Both InfoBright and MonetDB returned result immediately.
It seems LucidDB has to scan whole table to get result.

- Q0:
select avg(c1) from (select "Year","Month",count(*) as c1 from otp."ontime" group by "Year","Month") t;
LucidDB: 103.205 seconds
InfoBright: 4.19 sec
MonetDB: 29.9 sec

- Q1:
SELECT "DayOfWeek", count(*) AS c FROM OTP."ontime" WHERE "Year" BETWEEN 2000 AND 2008 GROUP BY "DayOfWeek" ORDER BY c DESC;
LucidDB: 49.17 seconds
InfoBright: 12.13 sec
MonetDB: 7.9 sec

- Q2:
SELECT "DayOfWeek", count(*) AS c FROM otp."ontime" WHERE "DepDelay">10 AND "Year" BETWEEN 2000 AND 2008 GROUP BY "DayOfWeek" ORDER BY c DESC;
LucidDB: 27.131 seconds
InfoBright: 6.37 sec
MonetDB: 0.9 sec

- Q3:
!set rowlimit 10
SELECT "Origin", count(*) AS c FROM otp."ontime" WHERE "DepDelay">10 AND "Year" BETWEEN 2000 AND 2008 GROUP BY "Origin" ORDER BY c DESC;
LucidDB: 27.664 seconds
InfoBright: 7.29 sec
MonetDB: 1.7 sec

- Q4:
SELECT "Carrier", count(*) FROM otp."ontime" WHERE "DepDelay">10 AND "Year"=2007 GROUP BY "Carrier" ORDER BY 2 DESC;
LucidDB: 2.338 seconds
InfoBright: 0.99 sec
MonetDB: 0.27 sec

- Q5:
SELECT t."Carrier", c, c2, c*1000/c2 as c3 FROM (SELECT "Carrier", count(*) AS c FROM OTP."ontime" WHERE "DepDelay">10 AND "Year"=2007 GROUP BY "Carrier") t JOIN (SELECT "Carrier", count(*) AS c2 FROM OTP."ontime" WHERE "Year"=2007 GROUP BY "Carrier") t2 ON (t."Carrier"=t2."Carrier") ORDER BY c3 DESC;
LucidDB: 7.351 seconds
InfoBright: 2.92 sec
MonetDB: 0.5 sec

- Q6:
SELECT t."Carrier", c, c2, c*1000/c2 as c3 FROM (SELECT "Carrier", count(*) AS c FROM OTP."ontime" WHERE "DepDelay">10 AND "Year" BETWEEN 2000 AND 2008 GROUP BY "Carrier") t JOIN (SELECT "Carrier", count(*) AS c2 FROM OTP."ontime" WHERE "Year" BETWEEN 2000 AND 2008 GROUP BY "Carrier") t2 ON (t."Carrier"=t2."Carrier") ORDER BY c3 DESC;
LucidDB: 78.423 seconds
InfoBright: 21.83 sec
MonetDB: 12.5 sec

- Q7:
SELECT t."Year", c1/c2 FROM (select "Year", count(*)*1000 as c1 from OTP."ontime" WHERE "DepDelay">10 GROUP BY "Year") t JOIN (select "Year", count(*) as c2 from OTP."ontime" GROUP BY "Year") t2 ON (t."Year"=t2."Year");
LucidDB: 106.374 seconds
InfoBright: 8.59 sec
MonetDB: 27.9 sec

- Q8:
SELECT "DestCityName", COUNT( DISTINCT "OriginCityName") FROM "ontime" WHERE "Year" BETWEEN 2008 and 2008 GROUP BY "DestCityName" ORDER BY 2 DESC;

Years, LucidDB, InfoBright, MonetDB
1y, 6.76s, 1.74s, 0.55s
2y, 28.82s, 3.68s, 1.10s
3y, 35.37s, 5.44s, 1.69s
4y, 41.66s, 7.22s, 2.12s
10y, 72.67s, 17.42s, 29.14s

- Q9:
select "Year" ,count(*) as c1 from "ontime" group by "Year";
LucidDB: 76.121 seconds
InfoBright: 0.31 sec
MonetDB: 6.3 sec

As you see LucidDB is not showing best results. However on good side about LucidDB I can mention it is very reach featured, with full support of DML statement. ETL features is also very impressive, you can extract, filter, transform external data (there is even access to MySQL via JDBC driver) just in SQL queries (compare with single LOAD DATA statement in InfoBright ICE edition). Also I am not so much in Java, but as I understood LucidDB can be easily integrated with Java applications, which is important if your development is Java based.

Worth to mention that in LucidDB single query execution takes 100% of user time in single CPU, which may signal that there some low-hanging fruits for optimization. OProfile can show clear places to fix.


Entry posted by Vadim | 10 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Votes:

You must be logged in with a MySQL.com account to vote on Planet MySQL entries. More information on PlanetMySQL voting.

Planet MySQL © 1995-2008 MySQL AB, 2008-2009 Sun Microsystems, Inc.
Content reproduced on this site is the property of the respective copyright holders.
It is not reviewed in advance by Sun Microsystems, Inc. and does not
necessarily represent the opinion of Sun Microsystem, Inc. or any other party.