Posts Tagged ‘MySQL’

Go Go Ghetto MySQL Parallel Dump and Import (or Hitting Things With Hammers Is Fun)!

Tuesday, October 18th, 2011

I am not a sophisticated man and my idea of elegance is hitting things with hammers. Bear that in mind when you look over the below code, which is here more for posterity than for any public edification.

One of the backup strategies we have is a complete table level dump and subsequent encryption of those files for storage in S3. Sure, it’s not a scalable solution as we discovered when the first iteration of the script did a complete dump of the db rather than by tables. This version runs them in parallel for the dump and works well enough for me not to sweat the timing at this point (~20 minutes from start to finish).

#!/bin/bash

# set variables
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)
RPTIME=$(date +%x-%H:%M)
ME=$(hostname)

# stop the slave
mysql -utears -plamentations -e 'slave stop;'

# create directories
mkdir -p /mnt/tmp/backup/$ME-$DAYNOW-$TIMENOW

function r {
for J in $(cat $1) ; do
mysqldump -utears -plamentations sorrowdb --tables $J | gzip > "/mnt/tmp/backup/$ME-$DAYNOW-$TIMENOW/$J.sql.gz"
done
exit
}

mysql -u tears -plamentations -e 'show tables from sorrowdb' -s --skip-column-names | sort -R | split -d -l 5

for I in $(find x*) ; do
(r $I &)
done

sleep 15s

while [ "$(pgrep -c -f mysqldump)" -gt "0" ]
do
sleep 1s
done

#start slave
mysql -utears -plamentations -e 'slave start;'

cd /mnt/tmp/backup
gpg-zip --encrypt -r yourmom --output $ME-$DAYNOW-$TIMENOW.gpg $ME-$DAYNOW-$TIMENOW

# copy to s3
BACKUP=$(s3cmd --acl-private put /mnt/tmp/backup/$ME-$DAYNOW-$TIMENOW.gpg s3://sorrowdb-db-backups/ | sed \
-e 's/File //g' \
-e 's/\/mnt\/tmp\/backup\///g' \
-e 's/s3:\/\/sorrowdb-db-backups\/'"$HOSTNAME"'-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9].tgz.gpg//g' \
-e 's/\[1 of 1\]//g' \
-e 's/(//g' \
-e 's/)//g' | awk '{ print $1 " " $2 " " $3 " " $5/1048576 "MB" }')

echo $BACKUP | mutt yourmom@sorrowdb.com -s "$ME - $DAYNOW $TIMENOW"

# clean up
rm -r /mnt/tmp/backup/

Now just like the World’s Most Interesting Man, I often only test things after it’s shipped. In this case I’ve tested that I can decrypt the package and see that all the files are there but I never took the time to actually do an import. Today is that day.

Version 0.01-ALPHA-WTFISTHISSHIT-b of the script just handles parallel imports of the tables from the dump file. As a bonus shoddiness, you need to run it in the directory where the files are sitting.

#!/bin/bash
function r {
for J in $(cat $1) ; do
gunzip < $J | mysql -utears -plamentations sorrowdb
done
exit
}

ls $1 | sort -R | split -d -l 18

for I in $(find x*) ; do
(r $I &)
done

Ugly but it works. Feel free to laugh, I always do when I look at my work.

Save your database (and your bacon) with Elastic Block Store and mysqlcheck

Monday, March 16th, 2009

Here’s the situation: early this afternoon I get a panicked IM from a client that they dropped a table on the production db but that they have a CVS copy that they want to load.  Sounds easy, right?  Should have been but the CSV file had some oddities where lines were terminated by \n but those also existed in some of the fields.  Now, I am by no means a MySQL guru and while there might be a solution to issue a LOAD DATA INFILE statement that accounts for it the process of working around that would keep the site down longer than necessary.

When we setup the database on EC2 we made a conscious decision to move the database into an Elastic Block Store (EBS) so that we could take regular snapshots of the volume as well as enjoy the durability that they offer.  The approach that we took to recovering the table was to re-purpose one of the QA instances as a recovery point, create an EBS volume from the latest snapshot, point MySQL at it, dump the table with –complete-insert, source it in, call it a day.

Things were working swimmingly up until the point where I needed to dump the table, ERROR 1033 (HY000): Incorrect information in file, was the last thing I wanted to read.  The table is InnoDB and it is possible that it was corrupted when I started the server back up with some missing variables in the my.cnf file–lesson here is remember to breathe, work quickly but methodically, and double check your work.  So here I sat with a seemingly good copy of the database but a mangled table.

Just a handful of keystrokes saved my ass: mysqlcheck mydb mytable.

If we had been doing just whole database dumps with mysqldump this process would have been frustrated by trying to chop up a 12GB file into the section that we needed (yes, it makes more sense to dump each table seperately but remember, I’m no guru).  In the end, having a volume that we could mount and access withing minutes was the biggest reason we were able to get the production site back online as fast as we did and for future reference I’ll check my config files more closely before I turn on services.

***Note: this really only applies if your database runs in EC2 ;-)

Evolving Services on EC2

Friday, May 2nd, 2008

One of the great things about EC2 is that it is essentially a giant sandbox where you can take risks experimenting with architecture and services in a rapid and cost effective manner, something that you cannot do really well at co-lo or even on other VPS services.  In the past year we have experimented with plenty of different configurations: some found their way into production, others filed for future reference, and still some to be avoid all costs.

May 2007

When I came on board as a contractor we had only 2 servers inside EC2 and the database hosted at Go Daddy. The company had just migrated the Apache/Application server along with a Harvest server into EC2 but had opted to leave the database hosted at Go Daddy due to fears of data loss.  The only trouble with this scheme was the latency between the application and the db which made things so glacially slow that the site nearly unusable.

August 2007

After starting full-time we brought the database into the cloud and started looking into how we might implement a MySQL cluster in EC2.  The challenge was to get a backup routine that was unobtrusive yet fast and easy to transfer into S3. I never got LVM snapshots working to my comfort level so we relied instead on MySQLdump, which, all and all worked fine while the db was small. Data loss was still a big concern for us so we began experimenting in earnest with MySQL clusters.

November 2007

When the MySQL cluster idea didn’t pan out, the theory is that the small instances just didn’t have what it takes to cluster. So we went with plain old replication which has proven to be stable and reliable. The slaves serve both as fail-over units but also perform periodic backups freeing the master from that task.  Feeling more comfortable with database integrity we turned our attention to getting our application to scale, a challenge with resource hungry Rails.

January 2008

Breaking apart Apache and rails was a snap with mod_proxy and it allowed us to dedicate hardware to each.  With things running even better we started thinking about how we can flip this into a more through horizontal scale.

May 2008

So one year later and we have brought some horizontal scale to the site adding stability and failover to the application. As the site grows, though, we are back to the how we can best scale the database but at least we have a sandbox to play in so we can figure it out.

EC2, MySQL, Replication, Recovery, and You! (Hammer Time!)

Saturday, January 12th, 2008

I finally cobbled together an incredibly ugly but functional script for recovering or setting up a slave. The pure hideousness stems from the brute force, lack of error checking, cram that data down the db’s throat method that I am leveraging. See, I know just enough to get the job done but not nearly enough to do it with any elegance, flair, or care and concern for stability. Running with scissors, at night, with a blindfold, through a roomful of children’s toys and cats is my style.

Anyways, here we go…

This script is executed on the slave instance and will fetch the most recent copy of the db from the master, stop the slave, drop the db, recreate the db, read in the backup, issue the change master command, start the slave, and then display the slave status after a minute.

#!/bin/bash
# Recover slave post crash

# run backup from master
# transfer it to the slave
echo "Getting backup, this may take a while."
ssh master "/scripts/slave_recovery.sh WHATSLAVE"

echo

# untar backup
echo "Expanding backup and getting ready to import."
cd /mnt/tmp/recovery
recover=$(ls | grep yourdb)
tar -xf $recover

# set variables
recodir=${recover:0:21}
mastfle=$(ls $recodir/ | grep master)
fullbin=$(cat $recodir/$mastfle | grep A.)
binlog=${fullbin:2}
fullpos=$(cat $recodir/$mastfle | grep B.)
positn=${fullpos:2}

echo "Here's what I have..."
echo $recodir
echo $recover
echo $mastfle
echo $binlog
echo $positn

# stop slave
echo "Stopping slave..."
mysql -e "slave stop;"

# drop database
echo "Dropping the database..."
mysql -e "drop database yourdb;"

# recreate database
echo "Recreating the database..."
mysql -e "create database yourdb;"

# source database from backup
echo "Importing the database..."
mysql yourdb < $recodir/$recodir.sql

# issue change master command
echo "Issuing the change master command..."
mysql -e "CHANGE MASTER TO MASTER_HOST='master', MASTER_USER='USERNAME', MASTER_PASSWORD='PASSWORD', MASTER_LOG_FILE='$binlog', MASTER_LOG_POS=$positn;"

# start slave
echo "I am starting the slave..."
mysql -e "slave start"

# clean up
rm -r *yourdb*

# check status
echo "I'm waiting one minute the checking the status of the slave..."
sleep 1m
mysql -e "show slave status \G;"

echo
echo "I am all done."

Now, you might have noticed that on the seventh line I call another script on the master and you might have noticed a variable trailing it. WHATSLAVE is whatever you called your slaves in the host file on the master in my unimaginative case it is slavea and slaveb but you could have Tom, Dick, and Harry, or the names of your favorite Hostess snackcake characters.

#! /bin/bash
# This script runs on the master and is built off the backup script

# set date variables
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)

# grab info about the binlog and position of the database

status1=$(mysql -e 'show master status \G' | grep mysql)
status2=$(mysql -e 'show master status \G' | grep Position)
sql=${status1:18}
posit=${status2:18}

mkdir /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW

echo A.$sql >> /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt
echo B.$posit >> /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt

# dump database
mysqldump yourdb > /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW/slave-yourdb-$DAYNOW-$TIMENOW.sql

# tar SQL dump
cd /mnt/tmp/backup

tar -chf - slave-yourdb-$DAYNOW-$TIMENOW | gzip - > slave-yourdb-$DAYNOW-$TIMENOW.tar.gz

rm -r /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW/

# copy tar to slaves
scp /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW.tar.gz root@$1:/mnt/tmp/recovery/slave-yourdb-$DAYNOW-$TIMENOW.tar.gz
#clean up
rm /mnt/tmp/backup/*.gz*
echo "I'm all done!"

This is just our basic backup script but rather than trying to pass all the variables through ssh I decided to be lazy and just execute the script remotely.

Some of the things I would like to add would be more flexibility in reading the backup name and error checking. Down the line I want to see if I can just backup the schema and import that into the slave db so that I don’t loose all that time reading the db back in (500MB+ can take awhile) and it would help with rapid recovery from data migrations. If you have any comments or suggestions, particularly if they trip your “WTF is wrong with this guy?” sensor I’m all ears.

EC2, MySQL, Backup Recovery, and You! (redux)

Thursday, December 27th, 2007

Here we go again…

On the heels of the replication monitor, I’ve gone back and fine-tuned the fetch script to let you look back two days in the archives. Now, it is a bit janky because I am setting the days for the first array rather than parsing the actual buckets in S3 but my sed/awk skills are less than none. However, I suppose that the next version could be set up to ask how many days you want to look back easily enough.

#!/bin/bash
# set the environment
export AWS_ACCESS_KEY_ID=xxxyyyzzz
export AWS_SECRET_ACCESS_KEY=xxxyyyzzz
export SSL_CERT_DIR=/opt/s3sync/certs

DAYLST[0]=$(date +%j --date='2 days ago')
DAYLST[1]=$(date +%j --date='1 days ago')
DAYLST[2]=$(date +%j)

DAYNUM=${#DAYLST[@]}

echo

echo "Here are the available days for backup recovery."

echo

# echo each element in array
# for loop
for (( i=0;i<$DAYNUM;i++)); do
echo $i -  ${DAYLST[${i}]}
done

echo

echo -e "What day did you want to parse? \c"
read selectday
listday=${DAYLST[$selectday]}

echo "Ok, I'm going to get the backups from $listday."
echo

echo -e "How many did you want? \c"
read count

echo

# Get the list of backups on the server using s3cmd
dbsets=$(ruby s3cmd.rb list your_db_backups:$listday | tail -n $count)
ARRAY=($dbsets)
# get number of elements in the array
ELEMENTS=${#ARRAY[@]}

# echo each element in array
# for loop
for (( i=0;i<$ELEMENTS;i++)); do
echo $i -  ${ARRAY[${i}]:4}
done

# Prompt user for which backup they want to recover
echo

echo -e "Which backup set would you like to recover? \c"

read numbackup
backup=${ARRAY[$numbackup]:4}

echo "I am fetching your backup $backup now..."
echo

ruby s3cmd.rb get your_db_backups/$listday:$backup /tmp/$backup
cd /tmp
tar -xf $backup
sqlset=${backup:0:14}
mv $sqlset /root

echo "Your backup can be found here /root/$sqlset"

Still on the agenda is getting a slave to recover unassisted after a failure is detected but as my shell scripting abilities improve the possibility of it being realized grows.

EC2, MySQL Replication, Monitoring, and You!

Monday, December 24th, 2007

So in a full turn of events I’ve gone back to replication as the the most cost effective solution for creating a high availability environment for MySQL in EC2. The problem of the development team issuing schema changes frequently and without notification hasn’t changed but I have gotten a little more sophisticated about how to deal with them kicking the slave in the teeth when they issue schema changes with impunity.

What I’ve done is build off the backup scripts I have written about prior–especially since they work so well and created the beginnings of a metascript to over see the slaves–it is aptly named slaver. This metascript checks the state of the slave and acts based on it is state: slave up, run backups, or slave down, issue notifications.

#!/bin/bash
### Slaver v0.0.1
### this script is intended to check on the status of the slave
### if the slave is down (IO or SQL) it will send an email out

### set the variables that we are checking for ###
slaver1=$(mysql -Bse ‘show slave status \G;’ | grep Slave_IO_Running)
slaver2=$(mysql -Bse ‘show slave status \G;’ | grep Slave_SQL_Running)
IO=${slaver1:29}
SQL=${slaver2:29}
COMBO=$IO-$SQL
count=$(mysql -Bse ‘show slave status \G;’| grep -c Yes)

### this is sanity check testing stuff ###

#count=1
#echo $IO
#echo $SQL
#echo $COMBO
#echo $count
### this is sanity check testing stuff ###

### run the exception check ###
if [[ "$count" == "2" ]] ; then
/opt/s3sync/db_backup.sh
exit
else
### create status file and mail it ###
date >> status.txt
mysql -Bse ‘show slave status \G;’ >> status.txt
mutt you@yourhome.com -s “Slave Status :: DOWN” < status.txt
rm status.txt
fi

The next pieces to build out will be freezing the automated deletion of the old backup sets and attempting recovery of the slave if it is down. To get started on the latter I made some changes to the backup routine on the master:

#! /bin/bash

# Hourly cron job to upload to current bucket
# This is built off what we are currently running

# set date variables
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)
# set the environment
export AWS_ACCESS_KEY_ID=xxxyyyzzz
export AWS_SECRET_ACCESS_KEY=xxxyyyzzz
export SSL_CERT_DIR=/opt/s3sync/certs

sleep 1m

# grab info about the binlog and position of the database

status1=$(mysql -e ‘show master status \G’ | grep mysql)
status2=$(mysql -e ‘show master status \G’ | grep Position)
sql=${status1:18}
posit=${status2:18}

mkdir /mnt/tmp/backup/you-$DAYNOW-$TIMENOW

echo A.$sql >> \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt
echo B.$posit >> \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt

# dump database
mysqldump geezeo > \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/you-$DAYNOW-$TIMENOW.sql

# tar SQL dump
cd /mnt/tmp/backup

tar -chf – you-$DAYNOW-$TIMENOW | gzip – > you-$DAYNOW-$TIMENOW.tar.gz

rm -r /mnt/tmp/backup/you-$DAYNOW-$TIMENOW/

# copy tar to S3
cd /opt/s3sync
ruby s3sync.rb -vr –ssl /mnt/tmp/backup/ you_db_backups:$DAYNOW

#clean up
rm /mnt/tmp/backup/*.gz*

The key piece here is the capturing of binlog number and position with those two pieces captured it becomes much easier to automate a recovery of the slave from the master’s backup.

More to follow…