Posts Tagged ‘replication’

EC2, MySQL, Replication, Recovery, and You! (Hammer Time!)

Saturday, January 12th, 2008

I finally cobbled together an incredibly ugly but functional script for recovering or setting up a slave. The pure hideousness stems from the brute force, lack of error checking, cram that data down the db’s throat method that I am leveraging. See, I know just enough to get the job done but not nearly enough to do it with any elegance, flair, or care and concern for stability. Running with scissors, at night, with a blindfold, through a roomful of children’s toys and cats is my style.

Anyways, here we go…

This script is executed on the slave instance and will fetch the most recent copy of the db from the master, stop the slave, drop the db, recreate the db, read in the backup, issue the change master command, start the slave, and then display the slave status after a minute.

#!/bin/bash
# Recover slave post crash

# run backup from master
# transfer it to the slave
echo "Getting backup, this may take a while."
ssh master "/scripts/slave_recovery.sh WHATSLAVE"

echo

# untar backup
echo "Expanding backup and getting ready to import."
cd /mnt/tmp/recovery
recover=$(ls | grep yourdb)
tar -xf $recover

# set variables
recodir=${recover:0:21}
mastfle=$(ls $recodir/ | grep master)
fullbin=$(cat $recodir/$mastfle | grep A.)
binlog=${fullbin:2}
fullpos=$(cat $recodir/$mastfle | grep B.)
positn=${fullpos:2}

echo "Here's what I have..."
echo $recodir
echo $recover
echo $mastfle
echo $binlog
echo $positn

# stop slave
echo "Stopping slave..."
mysql -e "slave stop;"

# drop database
echo "Dropping the database..."
mysql -e "drop database yourdb;"

# recreate database
echo "Recreating the database..."
mysql -e "create database yourdb;"

# source database from backup
echo "Importing the database..."
mysql yourdb < $recodir/$recodir.sql

# issue change master command
echo "Issuing the change master command..."
mysql -e "CHANGE MASTER TO MASTER_HOST='master', MASTER_USER='USERNAME', MASTER_PASSWORD='PASSWORD', MASTER_LOG_FILE='$binlog', MASTER_LOG_POS=$positn;"

# start slave
echo "I am starting the slave..."
mysql -e "slave start"

# clean up
rm -r *yourdb*

# check status
echo "I'm waiting one minute the checking the status of the slave..."
sleep 1m
mysql -e "show slave status \G;"

echo
echo "I am all done."

Now, you might have noticed that on the seventh line I call another script on the master and you might have noticed a variable trailing it. WHATSLAVE is whatever you called your slaves in the host file on the master in my unimaginative case it is slavea and slaveb but you could have Tom, Dick, and Harry, or the names of your favorite Hostess snackcake characters.

#! /bin/bash
# This script runs on the master and is built off the backup script

# set date variables
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)

# grab info about the binlog and position of the database

status1=$(mysql -e 'show master status \G' | grep mysql)
status2=$(mysql -e 'show master status \G' | grep Position)
sql=${status1:18}
posit=${status2:18}

mkdir /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW

echo A.$sql >> /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt
echo B.$posit >> /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt

# dump database
mysqldump yourdb > /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW/slave-yourdb-$DAYNOW-$TIMENOW.sql

# tar SQL dump
cd /mnt/tmp/backup

tar -chf - slave-yourdb-$DAYNOW-$TIMENOW | gzip - > slave-yourdb-$DAYNOW-$TIMENOW.tar.gz

rm -r /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW/

# copy tar to slaves
scp /mnt/tmp/backup/slave-yourdb-$DAYNOW-$TIMENOW.tar.gz root@$1:/mnt/tmp/recovery/slave-yourdb-$DAYNOW-$TIMENOW.tar.gz
#clean up
rm /mnt/tmp/backup/*.gz*
echo "I'm all done!"

This is just our basic backup script but rather than trying to pass all the variables through ssh I decided to be lazy and just execute the script remotely.

Some of the things I would like to add would be more flexibility in reading the backup name and error checking. Down the line I want to see if I can just backup the schema and import that into the slave db so that I don’t loose all that time reading the db back in (500MB+ can take awhile) and it would help with rapid recovery from data migrations. If you have any comments or suggestions, particularly if they trip your “WTF is wrong with this guy?” sensor I’m all ears.

EC2, MySQL Replication, Monitoring, and You!

Monday, December 24th, 2007

So in a full turn of events I’ve gone back to replication as the the most cost effective solution for creating a high availability environment for MySQL in EC2. The problem of the development team issuing schema changes frequently and without notification hasn’t changed but I have gotten a little more sophisticated about how to deal with them kicking the slave in the teeth when they issue schema changes with impunity.

What I’ve done is build off the backup scripts I have written about prior–especially since they work so well and created the beginnings of a metascript to over see the slaves–it is aptly named slaver. This metascript checks the state of the slave and acts based on it is state: slave up, run backups, or slave down, issue notifications.

#!/bin/bash
### Slaver v0.0.1
### this script is intended to check on the status of the slave
### if the slave is down (IO or SQL) it will send an email out

### set the variables that we are checking for ###
slaver1=$(mysql -Bse ‘show slave status \G;’ | grep Slave_IO_Running)
slaver2=$(mysql -Bse ‘show slave status \G;’ | grep Slave_SQL_Running)
IO=${slaver1:29}
SQL=${slaver2:29}
COMBO=$IO-$SQL
count=$(mysql -Bse ‘show slave status \G;’| grep -c Yes)

### this is sanity check testing stuff ###

#count=1
#echo $IO
#echo $SQL
#echo $COMBO
#echo $count
### this is sanity check testing stuff ###

### run the exception check ###
if [[ "$count" == "2" ]] ; then
/opt/s3sync/db_backup.sh
exit
else
### create status file and mail it ###
date >> status.txt
mysql -Bse ‘show slave status \G;’ >> status.txt
mutt you@yourhome.com -s “Slave Status :: DOWN” < status.txt
rm status.txt
fi

The next pieces to build out will be freezing the automated deletion of the old backup sets and attempting recovery of the slave if it is down. To get started on the latter I made some changes to the backup routine on the master:

#! /bin/bash

# Hourly cron job to upload to current bucket
# This is built off what we are currently running

# set date variables
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)
# set the environment
export AWS_ACCESS_KEY_ID=xxxyyyzzz
export AWS_SECRET_ACCESS_KEY=xxxyyyzzz
export SSL_CERT_DIR=/opt/s3sync/certs

sleep 1m

# grab info about the binlog and position of the database

status1=$(mysql -e ‘show master status \G’ | grep mysql)
status2=$(mysql -e ‘show master status \G’ | grep Position)
sql=${status1:18}
posit=${status2:18}

mkdir /mnt/tmp/backup/you-$DAYNOW-$TIMENOW

echo A.$sql >> \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt
echo B.$posit >> \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt

# dump database
mysqldump geezeo > \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/you-$DAYNOW-$TIMENOW.sql

# tar SQL dump
cd /mnt/tmp/backup

tar -chf – you-$DAYNOW-$TIMENOW | gzip – > you-$DAYNOW-$TIMENOW.tar.gz

rm -r /mnt/tmp/backup/you-$DAYNOW-$TIMENOW/

# copy tar to S3
cd /opt/s3sync
ruby s3sync.rb -vr –ssl /mnt/tmp/backup/ you_db_backups:$DAYNOW

#clean up
rm /mnt/tmp/backup/*.gz*

The key piece here is the capturing of binlog number and position with those two pieces captured it becomes much easier to automate a recovery of the slave from the master’s backup.

More to follow…

EC2, MySQL, Replication, and You!

Tuesday, August 7th, 2007

One week in and I’m just beginning to get my head around the EC2 way of doing things which is different than what I am used to doing in meatspace with a pile of hardware as well as being worlds away from my Microsoft background. So after tackling a couple of minor projects like Monit + Mongrel, and playing with Memcached, I decided tackle a simple replication setup to get my feet wet in preparation for building a cluster.

After hammering my head against the wall for a day or so things clicked after realizing that this exercise is so much easier if I use the private DNS for plumbing rather than laying ssh tunnels all over the place, which by the way sort-of-kind-of-not-really worked.

Prologue – Create Your World

Look, it is a well known fact that the 9 or so GB that Amazon gives you on the main partition isn’t going to cut it so you have to make some changes to your MySQL environment. The best place for bins and logs are on /mnt. So, since I am epically lazy, I created a mysql directory in /mnt, passed ownership to mysql:mysql, backed up /var/lib/mysql, created a symbolic link to mysql, and then cp -a the contents of the old /var/lib/mysql into /mnt/mysql. Why the backup? Well, if your MySQL instance ever goes down it’ll take the contents of /mnt with it and having a copy of it helps with a speedy recovery.

Step One – Be The Master

Add or edit the following line in /etc/mysql/my.cnf
log=/mnt/mysql/log/mysql.log
server-id=1
log-bin=/var/log/mysql/mysql-bin.log
binlog-do-db=babygotdb

log=/var/lib/mysql/mysql.log (this part is important for seeing whether or not data is replicating)

Restart MySQL then launch the client and issue the following command to add a user with replication privileges:

GRANT REPLICATION SLAVE ON *.* TO ‘replicant’@'%’
IDENTIFIED BY ‘replicantsarepeopletoo’;

FLUSH PRIVILEGES;

Now use the database so we can finish things off…

USE babygotdb;

We want to clear any read locks before we move on so,

FLUSH TABLES WITH READ LOCK;

Now we want to check if what we set up looks like right:

SHOW MASTER STATUS;

This will tell you about the database File, Position, Binlog_Do_DB, and Binlog_Ignore_DB. You will need to know both the file and the position. In this case mine has the following:

File – mysql-bin.000023
Position – 3617723
Binlog_Do_DB – babygotdb
Binlog_Ignore_DB -

You should see something similar to that though it will be in a different format.

quit;

Now is when we dump the database and get it ready to ship over to the slave server.

$ mysqldump -B babygotdb | gzip > babygotdb.sql.gz

Then just scp it to the slave.

Step Two – Get To Know Your Slave

Like on the master we need to make some changes to /etc/mysql/my.cnf

server-id=2
master-host=domU-12-34-56-78-90-1A.usma1.compute.amazonaws.com
master-user=replicant
master-password=replicantsarepeopletoo
master-connect-retry=60
replicate-do-db=babygotdb
log=/var/lib/mysql/mysql.log

Restart mySQL and then launch the client and we have only a little more work to do until we are done.

Gunzip your SQL dump then jump into the mysql client as we need to create our replicated database:

CREATE DATABASE babygotdb;

Then we need to suck in what we dumped on the master:

SOURCE babygotdb.sql;

Stop the slave from running:

SLAVE STOP;

And get our environment ready for replication (this is one long command with each component marked out with commas)

CHANGE MASTER TO
MASTER_HOST=’domU-12-34-56-78-90-1A.usma1.compute.amazonaws.com’,
MASTER_USER=’replicant’,
MASTER_PASSWORD=’replicantsarepeopletoo’,
MASTER_LOG_FILE=’mysql-bin.000023′, MASTER_LOG_POS=3617723;

Now start the slave back up…

SLAVE START;

Watch the replication happen in real time:

$ tail -f /var/lib/mysql/mysql.log

If all is working properly you should be seeing transactions flying by on both as changes are made in the master database.

Gratefully cribbed from Phillip Pearson over at Second p0st and the HowTo Forge write up by falko.

My long term goal is to create images of slaves, masters, data nodes, and management nodes so that deployment will require less hammering on the forge and more tying up loose ends.