We have far more storage space available these days, and far
more data to work with, so Big Data and Big Analytics are becoming
much more mainstream. There are conclusions and insights
you can get from nearly any data, but Web data
in particular brings a new dimension when combined with more
traditional, domain-specific data. This data is mostly
plain text, though: blogs, tweets, news
articles and other web content. This in turn means that to
combine your 20 years of organized, structured sales data with
Web data, the Web data first needs to be analyzed.
Web data also brings a new difficulty: the data is big,
and it's not organized at its core, so you cannot easily
aggregate it to save space (and why would you
want to do that?). It's not until after you have analyzed it that
you know what data is interesting and what is not. And to be
frank (but …
Over the weekend I came across an extremely curious issue with MySQL. It seemed that no matter how many times I tried to set the wait_timeout, it would always show the value of interactive_timeout. I even tried restarting mysql, to no avail.
Eventually I figured it out. When I was in an *interactive session*, wait_timeout displayed the value of interactive_timeout. Otherwise, it showed the appropriate value. Here’s what I found when interactive_timeout was set to 600 and wait_timeout was set to 14400 (this is on an analytics server, so setting the value that high actually makes sense):
[root@mysql1 ~]# mysql -e "show variables like 'interactive_timeout'"
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| interactive_timeout | 600   |
+---------------------+-------+
[root@mysql1 ~]# mysql -e "show variables like 'wait_timeout'" …
Collaborate 2012 started on Sunday, but for me it began on Monday. I enjoyed the presentation on shell scripting for MySQL administration today by Bob Burgess of SalesForce. It preceded my presentation in the same room, which I thought was an interesting coincidence since we got our conference credentials together.
I presented on portable SQL between Oracle and MySQL. The presentation went well. Before I took questions, I got to ask them because I had three copies of my new Oracle Press book to give away: Oracle Database 11g and MySQL 5.6 Developer Handbook. Handing out the books served as a nice ice breaker for the audience to ask questions about the presentation.
My favorite …
I came across an interesting error reported on #mysql the other day. When I went through it with the reporter, it looked like we had uncovered up to two bugs in InnoDB (or rather XtraDB, as it was Percona Server). I thought it might be useful to go through the error message, including the stack trace, to show that you don't need to be a developer to track down some useful information.
The MySQL sessions at Collaborate started strong after an amazing keynote by former astronaut Mark Kelly about working to become a naval aviator, astronaut, and helping his wife — Congresswoman Gabrielle Giffords — after an assassination attempt on her life last year.
A rare moment when the Oracle demo pods were not wall-to-wall people.
First up was Set up MySQL in Five Minutes by Bob Burgess of Radian6. Most of the attendees at these sessions seem to be longtime Oracle DBAs looking to add more MySQL skills, or longtime Oracle AND MySQL DBAs. Bob then had a second session, Shell Scripting for MySQL Administration, where most of the crowd of twenty already had lots of shell programming experience.
BYU’s Dr. Mike McLaughlin …
This is a sneak peek of an exciting new data management
technology. linq4j (short for "Language-Integrated Query for
Java") is inspired by Microsoft's LINQ technology, previously only
available on the .NET platform, and adapted for Java. (It also
builds upon ideas I had in my earlier Saffron project.)
I launched the linq4j project less than a week
ago, but already you can do select, filter, join and groupBy
operations on in-memory and SQL data.
In this demo, I write and execute sample code against the working
system, and explain the differences between the key interfaces
…
LinkedIn has what they call "inDays", where employees may do something interesting that is not directly related to their day job. I spent my inDay porting my old WL820 project (External Language Stored Procedures) to MariaDB 5.3.
The code, as usual, is available on LaunchPad ... To get the branch, simply do:
bzr branch lp:~atcurtis/maria/5.3-wl820
The test cases pass... I haven't tested
Cleaning Data
Duplicate keys happen. I see it most when you
feed data into a database and the source data is dirty.
Source data is usually dirty, that's why they want it in a
database.
The UNIQUE constraint clause in SQL prevents duplicate keys from
ever getting into your pristine database--at least in
theory. Sometimes a plain old attribute just needs to be
turned into a key for some "practical" reason.
A common data cleaning need is finding and removing duplicate
keys. Don't forget to turn on unique constraints for your
newly clean keys when you are ready. You never know when
you might get hit by a drive-by data sludger.
Finding Duplicate Keys
SELECT my_key, count(*) FROM my_table GROUP BY my_key HAVING count(*) > 1;
Remove Duplicate Keys
DELETE
FROM tableA
WHERE uniquekey NOT IN
(SELECT …
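The find-then-remove-then-constrain workflow described above can be tried end to end with a small sketch. This one makes several assumptions: it uses Python's built-in sqlite3 module and an in-memory SQLite database rather than MySQL, the payload column and the index name idx_my_key are made up for illustration, and the removal keeps the lowest SQLite rowid per key. Because the DELETE statement in the excerpt is cut off, this is one common pattern and not necessarily the author's query. On MySQL you would need a real primary key in place of rowid.

```python
import sqlite3


def find_duplicate_keys(conn):
    """Return (my_key, count) pairs for keys that appear more than once."""
    return conn.execute(
        "SELECT my_key, COUNT(*) FROM my_table "
        "GROUP BY my_key HAVING COUNT(*) > 1 ORDER BY my_key"
    ).fetchall()


def remove_duplicate_keys(conn):
    """Keep the lowest-rowid row per key (an assumed policy); delete the rest."""
    cur = conn.execute(
        "DELETE FROM my_table WHERE rowid NOT IN "
        "(SELECT MIN(rowid) FROM my_table GROUP BY my_key)"
    )
    conn.commit()
    return cur.rowcount  # number of rows deleted


def add_unique_constraint(conn):
    """Once the keys are clean, enforce uniqueness (hypothetical index name)."""
    conn.execute("CREATE UNIQUE INDEX idx_my_key ON my_table (my_key)")


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (my_key TEXT, payload TEXT)")
    # Deliberately dirty sample data: 'a' appears three times, 'c' twice.
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?)",
        [("a", "first"), ("b", "only"), ("a", "second"),
         ("c", "x"), ("c", "y"), ("a", "third")],
    )
    before = find_duplicate_keys(conn)
    deleted = remove_duplicate_keys(conn)
    after = find_duplicate_keys(conn)
    add_unique_constraint(conn)  # would fail if duplicates were still present
    conn.close()
    return before, deleted, after


if __name__ == "__main__":
    print(demo())
```

Creating the unique index last doubles as a check: it only succeeds if the cleanup really removed every duplicate.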
I don’t think there is a single good-quality MySQL init script for a Unix-like operating system. On Windows, there is the service facility, and I used to write Windows services. Based on that, I believe Windows has a pretty good claim to better reliability for start/stop services with MySQL. What’s wrong with the init scripts? Well, let me count the reasons! Wait, I’m out of fingers and toes. I’ll just mention the two annoying ones that I’ve run into most recently.
A good day for MySQL at COLLABORATE 12. Most of the sessions had
good attendance. Dave and I also had the pleasure of
meeting Michael McLaughlin of BYU. He is also an
Oracle ACE in Database App Development.
Michael is a supporter of Oracle and MySQL and I have just recently
added his blog to the Planet site.