We have an application that stores a massive number of URLs. To keep the indexes small, we index the CRC32 of the URL instead of the URL itself, which lets us find matching URLs quickly. There is a small chance of false positives, but these are filtered out after the data is read, so it all works pretty well.
Looking up URLs one at a time works great:
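To make the scheme concrete, here is a minimal sketch of it in Python. It is an illustration under stated assumptions, not our production setup: it uses an in-memory SQLite database in place of MySQL, and the `pages` table, the `idx_url_crc` index, and the helper names are all made up for the example. Only the 4-byte checksum column is indexed; comparing the full URL in the same query is what throws away CRC32 collisions.

```python
import sqlite3
import zlib


def url_crc(url: str) -> int:
    """Return the CRC32 of the UTF-8 encoded URL as an unsigned 32-bit integer."""
    return zlib.crc32(url.encode("utf-8")) & 0xFFFFFFFF


def build_demo_db(urls):
    """Create a hypothetical in-memory table that indexes only the checksum."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (url_crc INTEGER NOT NULL, url TEXT NOT NULL)"
    )
    # The index covers the small integer column, not the long URL string.
    conn.execute("CREATE INDEX idx_url_crc ON pages (url_crc)")
    conn.executemany(
        "INSERT INTO pages (url_crc, url) VALUES (?, ?)",
        [(url_crc(u), u) for u in urls],
    )
    return conn


def find_url(conn, url):
    """Look up by checksum, then compare the full URL to drop false positives."""
    rows = conn.execute(
        "SELECT url FROM pages WHERE url_crc = ? AND url = ?",
        (url_crc(url), url),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `find_url(conn, "http://www.dell.com/")` returns only rows whose URL matches exactly, even if another row happens to share the same checksum.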
mysql> EXPLAIN SELECT url FROM 124pages.124pages WHERE url_crc=484036220 AND url="http://www.dell.com/";
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
| 1 | …