Showing entries 23603 to 23612 of 44108
« 10 Newer Entries | 10 Older Entries »
Generating unique integer IDs from strings in MySQL

I have an interesting problem, on a data migration project I'm currently working on. I'm importing a large amount of legacy data into Drupal, using the awesome Migrate module (and friends). Migrate is a great tool for the job, but one of its limitations is that it requires the legacy database tables to have non-composite integer primary keys. Unfortunately, most of the tables I'm working with have primary keys that are either composite (i.e. the key is a combination of two or more columns), or non-integer (i.e. strings), or both.

Table with composite primary key.

The simplest solution to this problem would be to add an auto-incrementing integer primary key column to the legacy tables. This would provide the primary key information that Migrate needs in order to do its mapping of legacy IDs to Drupal IDs. But this solution has a serious drawback. In my project, …

[Read more]
Generating unique integer IDs from strings in MySQL

I have an interesting problem, on a data migration project I'm currently working on. I'm importing a large amount of legacy data into Drupal, using the awesome Migrate module (and friends). Migrate is a great tool for the job, but one of its limitations is that it requires the legacy database tables to have non-composite integer primary keys. Unfortunately, most of the tables I'm working with have primary keys that are either composite (i.e. the key is a combination of two or more columns), or non-integer (i.e. strings), or both.

Table with composite primary key.

The simplest solution to this problem would be to add an auto-incrementing integer primary key column to the legacy tables. This would provide the primary key information that Migrate needs in order to do its mapping of legacy IDs to Drupal IDs. But this solution has a serious drawback. In my project, …

[Read more]
Greatest N per group: dealing with aggregates

Answering questions asked on the site.

Vlad Enache asks:

In MySQL I have a table called meanings with three columns:

person word meaning
1 1 4
1 2 19
1 2 7
1 3 5

word has 16 possible values, meaning has 26.

A person assigns one or more meanings to each word. In the sample above, person 1 assigned two meanings to word 2.

[Read more]
Ignite MySQL, "Enlighten us, but make it quick."

This year at the O'Reilly MySQL User's Conference we will be doing the first ever Ignite talk series!

What is Ignite?

It's a high-energy evening of 5-minute talks by people who have an idea—and the guts to get onstage and share it. All talks are exactly five minutes long, and Ignite presenters will be sharing their personal and professional passions, using 20 slides that auto-advance every 15 seconds.

Sound interesting?

So far we have Robin Schumacher talking about when you should be using InfiniDB instead of general MySQL server. Monty Widenius will be presenting the "The Full Monty, State of MariaDB".

[Read more]
Reviewed: Managing Software Development with SVN and Trac

I’ve recently been migrating my wiki/documentation for Kontrollbase to Trac. For those that are not aware, Trac is a web-based documentation/wiki/Subversion tool that is used by countless number of software projects. Subversion, of course, is a software collaboration and code management repository that manages branches/tags/trunk files with revision control. It’s one of the most heavily used open-source code repositories available. Given that I use SVN (subversion) for all of my software applications and am now using Trac, the book “Managing Software Development with Trac and Subversion” by David J Murphy comes as a useful and great resource for integrating these two useful tools. …

[Read more]
Getting SQL from a SPAN port

Recently I needed the query stream hitting a very busy master. Normally I would have been using the MySQL Proxy to collect queries, but on a very busy machine the Proxy is as much of a problem as it is part of the solution, so I chose a different approach.

I had a SPAN port configured for the master, which is Ciscospeak for a port on a switch which mirrors all traffic of one or more other ports. I had an otherwise idle machine listening to the SPAN port on a spare network card. That way it is possible to collect traffic to and from the master without any interference with the master.

On the listener box, I had tcpflow collecting data to my master (and only traffic to, not from the master): CODE:tcpflow -i eth1 dst master and port 3306 These tcpflow files now need to be processed into a …

[Read more]
When the subselect runs faster

A few weeks ago, we had a query optimization request from one of our customer.

The query was very simple like:

PLAIN TEXT CODE:

  1. SELECT * FROM `table` WHERE (col1='A'||col1='B') ORDER BY id DESC LIMIT 20 OFFSET 0

This column in the table is looks like this:

PLAIN TEXT CODE:

  1. `col1` enum('A','B','C','CD','DE','F','G','HI') default NULL

The table have 549252 rows and of course, there is an index on the col1. MySQL estimated the cardinality of that index as 87, though what was of course misleading as index cardinality in this case can't be over 9, as there is only 8(+ NULL) different possible values for this column.

PLAIN TEXT CODE:

[Read more]
MySQL ERROR 1137 (HY000): Can’t reopen table: ‘tmp_journals’

When setting up a query using a temporary lookup table, I got this error:

ERROR 1137 (HY000): Can't reopen table: 'tmp_journals'

It transpires that since 4.1 the way MySQL handles temporary tables has changed. This affects joins, unions and subqueries. There is an obvious fix:

mysql> CREATE TEMPORARY TABLE tmp_journals_2 LIKE tmp_journals;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO tmp_journals_2 SELECT * FROM tmp_journals;
Query OK, 3228659 rows affected (2.01 sec)
Records: 3228659  Duplicates: 0  Warnings: 0 

Then the query is easy:

SELECT COUNT(1) cnt, journal_invoice_ref
FROM tmp_journals
GROUP BY journal_date 
HAVING cnt > 10000</pre>

UNION

SELECT COUNT(1) cnt, journal_invoice_ref
FROM tmp_journals_2
GROUP BY journal_invoice_ref
HAVING cnt < 10
Percona-XtraDB-9.1: released and new coming features

Recently Alexandr announced new Percona-XtraDB-9.1 release, and now it is good time to summarize features we have and what is expected soon.

This release contains long waited features from 5.0:

  • extended slow.log
  • USER/TABLE/INDEX/CLIENT_STATISTICS + THREAD_STATISTICS ( coming in release-10)

Extended slow.log is now even more extended, there is additional information for each query:

PLAIN TEXT CODE:

  1. # Bytes_sent: 4973  Tmp_tables: 1  Tmp_disk_tables: 1  Tmp_table_sizes: 7808

That is you can see how many bytes was returned by query, was temporary table used,
was it disk table or in-memory, and how big was temporary table.

Also you can profile each individual …

[Read more]
The Data Scientist

Here's a very interesting article for those of us who work with big data: Data, data everywhere in the Economist.  One interesting quote predicts the emergence of a new database professional: the data scientist: "Chief information officers (CIOs) have become somewhat more prominent in the executive suite, and a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hiddenRead More...

Showing entries 23603 to 23612 of 44108
« 10 Newer Entries | 10 Older Entries »