Skip to content
Snippets Groups Projects
Commit 60fcf00c authored by Jared Hancock's avatar Jared Hancock
Browse files

Crazy performance penalty scanning blob tables

When scanning the file_chunk table for orphaned file chunks that can be
deleted, apparently, MySQL will read (at least part of) the blob data from
the disk. For databases with lots of large attachments, this can take
considerable time. Considering that it is triggered from the autocron and
will run everytime the cron is run, the database will spend considerable
time scanning for rows to be cleaned.

This patch changes the orphan cleanup into two phases. The first will search
just for the pk's of file chunks to be deleted. If any are found, then the
chunks are deleted by the file_id and chunk_id, which is the primary key of
the table.

The SELECT query seems to run at least 20 times faster than the delete
statement, and DELETEing against the primary key of the blob table should
be the fastest possible operation. Somehow, both queries required a full
table scan; however, because the SELECT statement is explictly only
interested in two fields, it is more clear to the query optimizer that the
blob data should not be scanned.

References:
http://stackoverflow.com/q/9511476
parent 5ea0a9dd
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment