I’m a sucker, especially for the intersection of technology and shopping. Last night, while catching up on my feeds on my iPhone, I caught io9′s article about The Call of the Cthulhu movie which was released on DVD last year. Now it is bad enough that I am reading the feed of a Sci-Fi blog on my iPhone and doubly worse about an indie H.P. Lovecraft film but the icing on the cake is that I purchased it right then and there from Amazon whose iPhone formated store plus one-click shopping makes it way to easy to purchase the things you think you need when you really ought to be sleeping because you know the baby will be getting up in the middle of the night because she’s teething but you cannot resist the simplicity of it tap-tap-ka-ching.
Posts Tagged ‘Amazon’
File Under: They make it too easy
Tuesday, January 29th, 2008EC2, MySQL, Backup Recovery, and You! (redux)
Thursday, December 27th, 2007Here we go again…
On the heels of the replication monitor, I’ve gone back and fine-tuned the fetch script to let you look back two days in the archives. Now, it is a bit janky because I am setting the days for the first array rather than parsing the actual buckets in S3 but my sed/awk skills are less than none. However, I suppose that the next version could be set up to ask how many days you want to look back easily enough.
#!/bin/bash
# set the environment
export AWS_ACCESS_KEY_ID=xxxyyyzzz
export AWS_SECRET_ACCESS_KEY=xxxyyyzzz
export SSL_CERT_DIR=/opt/s3sync/certs
DAYLST[0]=$(date +%j --date='2 days ago')
DAYLST[1]=$(date +%j --date='1 days ago')
DAYLST[2]=$(date +%j)
DAYNUM=${#DAYLST[@]}
echo
echo "Here are the available days for backup recovery."
echo
# echo each element in array
# for loop
for (( i=0;i<$DAYNUM;i++)); do
echo $i - ${DAYLST[${i}]}
done
echo
echo -e "What day did you want to parse? \c"
read selectday
listday=${DAYLST[$selectday]}
echo "Ok, I'm going to get the backups from $listday."
echo
echo -e "How many did you want? \c"
read count
echo
# Get the list of backups on the server using s3cmd
dbsets=$(ruby s3cmd.rb list your_db_backups:$listday | tail -n $count)
ARRAY=($dbsets)
# get number of elements in the array
ELEMENTS=${#ARRAY[@]}
# echo each element in array
# for loop
for (( i=0;i<$ELEMENTS;i++)); do
echo $i - ${ARRAY[${i}]:4}
done
# Prompt user for which backup they want to recover
echo
echo -e "Which backup set would you like to recover? \c"
read numbackup
backup=${ARRAY[$numbackup]:4}
echo "I am fetching your backup $backup now..."
echo
ruby s3cmd.rb get your_db_backups/$listday:$backup /tmp/$backup
cd /tmp
tar -xf $backup
sqlset=${backup:0:14}
mv $sqlset /root
echo "Your backup can be found here /root/$sqlset"
Still on the agenda is getting a slave to recover unassisted after a failure is detected but as my shell scripting abilities improve the possibility of it being realized grows.
EC2, MySQL Replication, Monitoring, and You!
Monday, December 24th, 2007So in a full turn of events I’ve gone back to replication as the the most cost effective solution for creating a high availability environment for MySQL in EC2. The problem of the development team issuing schema changes frequently and without notification hasn’t changed but I have gotten a little more sophisticated about how to deal with them kicking the slave in the teeth when they issue schema changes with impunity.
What I’ve done is build off the backup scripts I have written about prior–especially since they work so well and created the beginnings of a metascript to over see the slaves–it is aptly named slaver. This metascript checks the state of the slave and acts based on it is state: slave up, run backups, or slave down, issue notifications.
#!/bin/bash
### Slaver v0.0.1
### this script is intended to check on the status of the slave
### if the slave is down (IO or SQL) it will send an email out
### set the variables that we are checking for ###
slaver1=$(mysql -Bse ‘show slave status \G;’ | grep Slave_IO_Running)
slaver2=$(mysql -Bse ‘show slave status \G;’ | grep Slave_SQL_Running)
IO=${slaver1:29}
SQL=${slaver2:29}
COMBO=$IO-$SQL
count=$(mysql -Bse ‘show slave status \G;’| grep -c Yes)
### this is sanity check testing stuff ###
#count=1
#echo $IO
#echo $SQL
#echo $COMBO
#echo $count
### this is sanity check testing stuff ###
### run the exception check ###
if [[ "$count" == "2" ]] ; then
/opt/s3sync/db_backup.sh
exit
else
### create status file and mail it ###
date >> status.txt
mysql -Bse ‘show slave status \G;’ >> status.txt
mutt you@yourhome.com -s “Slave Status :: DOWN” < status.txt
rm status.txt
fi
The next pieces to build out will be freezing the automated deletion of the old backup sets and attempting recovery of the slave if it is down. To get started on the latter I made some changes to the backup routine on the master:
#! /bin/bash
# Hourly cron job to upload to current bucket
# This is built off what we are currently running
# set date variables
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)
# set the environment
export AWS_ACCESS_KEY_ID=xxxyyyzzz
export AWS_SECRET_ACCESS_KEY=xxxyyyzzz
export SSL_CERT_DIR=/opt/s3sync/certs
sleep 1m
# grab info about the binlog and position of the database
status1=$(mysql -e ‘show master status \G’ | grep mysql)
status2=$(mysql -e ‘show master status \G’ | grep Position)
sql=${status1:18}
posit=${status2:18}
mkdir /mnt/tmp/backup/you-$DAYNOW-$TIMENOW
echo A.$sql >> \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt
echo B.$posit >> \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/master-$DAYNOW-$TIMENOW.txt
# dump database
mysqldump geezeo > \
/mnt/tmp/backup/you-$DAYNOW-$TIMENOW/you-$DAYNOW-$TIMENOW.sql
# tar SQL dump
cd /mnt/tmp/backup
tar -chf – you-$DAYNOW-$TIMENOW | gzip – > you-$DAYNOW-$TIMENOW.tar.gz
rm -r /mnt/tmp/backup/you-$DAYNOW-$TIMENOW/
# copy tar to S3
cd /opt/s3sync
ruby s3sync.rb -vr –ssl /mnt/tmp/backup/ you_db_backups:$DAYNOW
#clean up
rm /mnt/tmp/backup/*.gz*
The key piece here is the capturing of binlog number and position with those two pieces captured it becomes much easier to automate a recovery of the slave from the master’s backup.
More to follow…
Recovering Encrypted MySQL Backups from S3
Thursday, November 1st, 2007So like I promised here’s the script I banged together to allow easy recovery of your MySQL backup sets on S3. At the moment, it only does the current day so if it is just after midnight, well, you won’t see any backups! I plan on updateing it to allow the user to choose today or yesterday and then build the list from that selection.
#/bin/bash
# This script will list the most recent backups based on a number prompted by the user
# decrypt and expand them into a temp directory.
# set date variables
cd /opt/s3sync
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)
# set the environment
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
export SSL_CERT_DIR=/opt/s3sync/certs
echo -e "How many backups would you like to list? \c"
read count
echo
# Get the list of backups on the server using s3cmd
dbsets=$(ruby s3cmd.rb list YOURDB_db_backups:$DAYNOW | tail -n $count)
ARRAY=($dbsets)
# get number of elements in the array
ELEMENTS=${#ARRAY[@]}
# echo each element in array
# for loop
for (( i=0;i<$ELEMENTS;i++)); do
echo $i - ${ARRAY[${i}]:4}
done
# Prompt user for which backup they want to recover
echo
echo -e "Which backup set would you like to recover? \c"
read numbackup
backup=${ARRAY[$numbackup]:4}
tarset=${backup:0:31}
sqlset=${tarset:0:19}
echo "I am fetching your backup $backup now..."
ruby s3cmd.rb get YOURDB_db_backups/$DAYNOW:$backup /mnt/tmp/recovery/$backup
echo
echo "I'm going to decrypt your backup..."
cd /mnt/tmp/recovery
gpg -d $tarset > $sqlset
echo
echo "Cleaning up after myself..."
rm *.gz*
echo
echo "Your backup can be found here /mnt/tmp/recovery/$sqlset"
Next up is a script that easily allows you to chuck files or directories up onto S3 from your EC2 instance or from your local machine.
EC2, S3, Encrypted MySQL Backups, and You!
Tuesday, October 30th, 2007With great trepidation I write this as my last attempt earlier in the day saw the utter meltdown of this blog…
The topic of what we are doing to secure user data is one that comes up often and it is completely understandable, so this past week I’ve decided to add an extra layer of security into our database backups by encrypting them. It is a fairly simple process that while still being a work in progress works pretty well.
To get things started I generated a key-pair both on the server and imported my personal key so that I can encrypt the backups so I can open them either on the server or on my laptop. Further down the road I’ll be collecting the keys of the development team and importing them so that they can decrypt locally as well.
Now, I’m a bit wet behind the ears when it comes to shell scripting and while I already had a backup script written I wasn’t really happy with how it performed. I’ve made some tweaks to this one that allowed me to drop the nightly “Create Bucket” procedure as well as gathered the backups into a more logical folder/sub-folder layout.
Here’s the backup script…
#! /bin/bash
# Hourly cron job to upload to current bucket
# This is built off what we are currently running
# set date variables
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)
# set the environment
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
export SSL_CERT_DIR=/opt/s3sync/certs
# dump database
mysqldump YOURDB > /mnt/tmp/backup/YOURDB-$DAYNOW-$TIMENOW.sql
# tar SQL dump
cd /mnt/tmp/backup
tar -chf – YOURDB-$DAYNOW-$TIMENOW.sql | gzip – | \
gpg -r [remote-key-holder] -r [local-key-holder] –encrypt \
> YOURDB-$DAYNOW-$TIMENOW.sql.tar.gz.gpg
rm /mnt/tmp/backup/*.sql
# copy tar to S3
cd /opt/s3sync
ruby s3sync.rb -vr –ssl /mnt/tmp/backup/ YOURDB_db_backups:$DAYNOW
#clean up
rm /mnt/tmp/backup/*.gz*
And the fetch script which will download the backup, decrypt it, and untar it. Now, this script I am working on listing the last X number of backups as determined by the user, dumping them into an array, and then prompting the user to choose which one they want. At the moment, the user need to know the number day of the year and the military time sans colon of the backup. But for the moment running the script is as simple as ./get_db_backup.sh 301 1530.
#! /bin/bash
# set the environment
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
export SSL_CERT_DIR=/opt/s3sync/certs
echo “Fetching your backup now…”
ruby s3cmd.rb get YOURDB_db_backups/$1:YOURDB-$1-$2.sql.tar.gz.gpg \
/mnt/tmp/recovery/YOURDB-$1-$2.sql.tar.gz.gpg
echo “I’m going to decrypt your backup but will need a passcode…”
gpg -d /mnt/tmp/recovery/YOURDB-$1-$2.sql.tar.gz.gpg \
> /mnt/tmp/recovery/YOURDB-$1-$2.sql.tar.gz
echo “Extracting your backup into /mnt/tmp/recovery…”
cd /mnt/tmp/recovery
tar -xf YOURDB-$1-$2.sql.tar.gz
echo “Cleaning up after myself…”
rm *.tar.gz*
echo “Your file is here: /mnt/tmp/recovery/YOURDB-$1-$2.sql”
Lastly, the “Delete Bucket” script which now thankfully works as advertised.
#! /bin/bash
# Daily cron job to delete old bucket
# set the environment
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
export SSL_CERT_DIR=/opt/s3sync/certs
DAYTHEN=$(date +%j –date=’2 days ago’)
cd /opt/s3sync
ruby s3cmd.rb -v deleteall YOURDB_db_backups:$DAYTHEN
Since all this is a work in progress I’d love to hear how other people are leveraging S3 for their database backups and if there is an easier way to accomplish what I’m attempting.
EC2: Pound + Apache, Mongrel Cluster, and MySQL Cluster
Thursday, October 11th, 2007Alternately, I should be titling this my 36 hour nightmare. Last week, high off the presentation, I built out and deployed the following configuration.
Everything was nice and tight and after loading QA data it ran like a champ but the problem was that QA data was pretty thin being only a fraction of the size of the production data. When we loaded production data into it, which by the way took nearly an hour to import,performance in the Cluster ground to a halt and we were faced with MySQL timing out the mongrels. Needless to say that after another 36 hours of work we abandoned this model and are looking at plain old replication for our data backed.
What could have given us all that grief? A couple of things spring to mind. The instances have 1.7GB of RAM and a single core process which for now works like a champ for a single MySQL server but for whatever reason it is not enough for a cluster under load. Also, running both SQL and Data Node services on the same box was likely less than inspired as the SQL service would spin up chewing into the remaining RAM and would often dominate the CPU. However, when we launch the cluster we were running some grossly inefficient queries with little or no indexing in the tables. A huge issue.
So we pulled back. At the moment we are still running the three legged system (one instance running Pound, Apache, Monit, and Mongrels, one Harvester, and one MySQL instance) but we made significant changes to the DB so that all the bloated joins that Ruby likes to make are hitting indexed tables as well as tweaking my.cnf to boost key buffer to 30% of RAM. Things seem better and we bought ourselves a little breathing room but we are still hitting the limit of the number of mongrels we can run on a single instance, 10 seems to be the upper threshold for stability, so we need to work out a method for building out a replicated set that will auto-recover after the countless data migrations that the dev team performs. That will be fun!


Comments
James, Dale
james, Mike
james, Mike, james [...]
james, Mike
james, Mike
james, Kyle Daigle