It's easy to load multiple rows of data from a file into a MySQL
table with a single LOAD DATA INFILE
command, but for very large files you may want to load the data in multiple
chunks. When I have a file with millions of lines I typically use
pt-fifo-split
to separate the
file into multiple chunks in order to limit the size of each
transaction. Unlike most tools in Percona Toolkit,
pt-fifo-split
doesn't actually interact with your
database, but this is one MySQL use case where it's very helpful.
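Under the hood, pt-fifo-split writes the file to a fifo one chunk at a time: it opens the fifo, prints the number of lines given by --lines, and closes it, so each read from the fifo drains exactly one chunk; when the input is exhausted, the fifo is removed. A minimal sketch of that mechanic, using the tool's default fifo path of /tmp/pt-fifo-split (hugefile.txt is a placeholder):
```
# Feed a file through the fifo in 1,000-line chunks
pt-fifo-split --lines 1000 hugefile.txt &

# Give pt-fifo-split a moment to create the fifo
sleep 5

while [ -e /tmp/pt-fifo-split ]
do
  # Each read of the fifo drains exactly one chunk
  cat /tmp/pt-fifo-split > /dev/null
done
```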
Here's a quick example in bash that illustrates how I can take a large tab-delimited file, break it into multiple 100,000-line chunks, and load the chunks into a table:
```
FLAT_FILE="/tmp/big_file.txt"
FIFO_PATH="${FLAT_FILE}.fifo"
LOAD_FILE="${FLAT_FILE}.load"
CHUNK_SIZE=100000

# Split the file
pt-fifo-split --force --lines ${CHUNK_SIZE} ${FLAT_FILE} --fifo ${FIFO_PATH} &

# Sleep 10 seconds to assure ${FIFO_PATH} exists before entering loop
sleep 10

while [ -e ${FIFO_PATH} ]
do
  # Write chunk to disk
  cat ${FIFO_PATH} > ${LOAD_FILE}

  # Load chunk into table
  mysql --database=test \
    --show-warnings \
    -vve "load data infile '${LOAD_FILE}' into table my_table;"
done
```
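Once the loop finishes, a quick sanity check is to compare the line count of the source file with the row count that actually landed in the table (reusing the file and table names from the example above; the counts should match if no rows were skipped or duplicated):
```
# Rows in the source file vs. rows loaded into the table
wc -l < /tmp/big_file.txt
mysql --database=test -e "SELECT COUNT(*) FROM my_table;"
```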
YMMV, so you should do some testing to determine the optimal chunk size based on your schema, storage engine, replication setup, the complexity of your LOAD DATA INFILE statement, etc.
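One rough way to do that testing is to time the full chunked load at a few candidate chunk sizes. A minimal benchmarking sketch, reusing the names from the example above; the chunk sizes are arbitrary candidates, and the TRUNCATE assumes my_table holds disposable test data:
```
FLAT_FILE="/tmp/big_file.txt"
FIFO_PATH="${FLAT_FILE}.fifo"
LOAD_FILE="${FLAT_FILE}.load"

for CHUNK_SIZE in 10000 100000 1000000
do
  # Start from an empty table so each run measures the same work
  mysql --database=test -e "TRUNCATE TABLE my_table;"

  START=$(date +%s)
  pt-fifo-split --force --lines ${CHUNK_SIZE} ${FLAT_FILE} --fifo ${FIFO_PATH} &
  sleep 10
  while [ -e ${FIFO_PATH} ]
  do
    cat ${FIFO_PATH} > ${LOAD_FILE}
    mysql --database=test \
      -e "load data infile '${LOAD_FILE}' into table my_table;"
  done
  echo "chunk size ${CHUNK_SIZE}: $(( $(date +%s) - START )) seconds"
done
```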