Tag Archive for 'EC2'

EC2, S3, Encrypted MySQL Backups, and You!

With great trepidation I write this as my last attempt earlier in the day saw the utter meltdown of this blog…

The topic of what we are doing to secure user data is one that comes up often and it is completely understandable, so this past week I’ve decided to add an extra layer of security into our database backups by encrypting them. It is a fairly simple process that while still being a work in progress works pretty well.

To get things started I generated a key-pair both on the server and imported my personal key so that I can encrypt the backups so I can open them either on the server or on my laptop. Further down the road I’ll be collecting the keys of the development team and importing them so that they can decrypt locally as well.

Now, I’m a bit wet behind the ears when it comes to shell scripting and while I already had a backup script written I wasn’t really happy with how it performed. I’ve made some tweaks to this one that allowed me to drop the nightly “Create Bucket” procedure as well as gathered the backups into a more logical folder/sub-folder layout.

Here’s the backup script…

#! /bin/bash

# Hourly cron job to upload to current bucket
# This is built off what we are currently running

# set date variables
DAYNOW=$(date +%j)
TIMENOW=$(date +%H%M)
# set the environment
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
export SSL_CERT_DIR=/opt/s3sync/certs

# dump database
mysqldump YOURDB > /mnt/tmp/backup/YOURDB-$DAYNOW-$TIMENOW.sql

# tar SQL dump
cd /mnt/tmp/backup

tar -chf - YOURDB-$DAYNOW-$TIMENOW.sql | gzip - | \
gpg -r [remote-key-holder] -r [local-key-holder] –encrypt \
> YOURDB-$DAYNOW-$TIMENOW.sql.tar.gz.gpg

rm /mnt/tmp/backup/*.sql

# copy tar to S3
cd /opt/s3sync
ruby s3sync.rb -vr –ssl /mnt/tmp/backup/ YOURDB_db_backups:$DAYNOW

#clean up
rm /mnt/tmp/backup/*.gz*

And the fetch script which will download the backup, decrypt it, and untar it. Now, this script I am working on listing the last X number of backups as determined by the user, dumping them into an array, and then prompting the user to choose which one they want. At the moment, the user need to know the number day of the year and the military time sans colon of the backup. But for the moment running the script is as simple as ./get_db_backup.sh 301 1530.

#! /bin/bash

# set the environment
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
export SSL_CERT_DIR=/opt/s3sync/certs

echo “Fetching your backup now…”

ruby s3cmd.rb get YOURDB_db_backups/$1:YOURDB-$1-$2.sql.tar.gz.gpg \
/mnt/tmp/recovery/YOURDB-$1-$2.sql.tar.gz.gpg

echo “I’m going to decrypt your backup but will need a passcode…”

gpg -d /mnt/tmp/recovery/YOURDB-$1-$2.sql.tar.gz.gpg \
> /mnt/tmp/recovery/YOURDB-$1-$2.sql.tar.gz

echo “Extracting your backup into /mnt/tmp/recovery…”

cd /mnt/tmp/recovery
tar -xf YOURDB-$1-$2.sql.tar.gz

echo “Cleaning up after myself…”
rm *.tar.gz*

echo “Your file is here: /mnt/tmp/recovery/YOURDB-$1-$2.sql”

Lastly, the “Delete Bucket” script which now thankfully works as advertised.

#! /bin/bash

# Daily cron job to delete old bucket
# set the environment
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
export SSL_CERT_DIR=/opt/s3sync/certs

DAYTHEN=$(date +%j –date=’2 days ago’)
cd /opt/s3sync
ruby s3cmd.rb -v deleteall YOURDB_db_backups:$DAYTHEN

Since all this is a work in progress I’d love to hear how other people are leveraging S3 for their database backups and if there is an easier way to accomplish what I’m attempting. :-D

EC2: Pound + Apache, Mongrel Cluster, and MySQL Cluster

Alternately, I should be titling this my 36 hour nightmare. Last week, high off the presentation, I built out and deployed the following configuration.

EC2 Cluster

Everything was nice and tight and after loading QA data it ran like a champ but the problem was that QA data was pretty thin being only a fraction of the size of the production data. When we loaded production data into it, which by the way took nearly an hour to import,performance in the Cluster ground to a halt and we were faced with MySQL timing out the mongrels. Needless to say that after another 36 hours of work we abandoned this model and are looking at plain old replication for our data backed.

What could have given us all that grief? A couple of things spring to mind. The instances have 1.7GB of RAM and a single core process which for now works like a champ for a single MySQL server but for whatever reason it is not enough for a cluster under load. Also, running both SQL and Data Node services on the same box was likely less than inspired as the SQL service would spin up chewing into the remaining RAM and would often dominate the CPU. However, when we launch the cluster we were running some grossly inefficient queries with little or no indexing in the tables. A huge issue.

So we pulled back. At the moment we are still running the three legged system (one instance running Pound, Apache, Monit, and Mongrels, one Harvester, and one MySQL instance) but we made significant changes to the DB so that all the bloated joins that Ruby likes to make are hitting indexed tables as well as tweaking my.cnf to boost key buffer to 30% of RAM. Things seem better and we bought ourselves a little breathing room but we are still hitting the limit of the number of mongrels we can run on a single instance, 10 seems to be the upper threshold for stability, so we need to work out a method for building out a replicated set that will auto-recover after the countless data migrations that the dev team performs.  That will be fun!

EC2, MySQL Cluster, and You!

The past week I’ve been pounding my head bloody going round and round with setting up a MySQL Cluster in EC2. First trying it with Ubuntu, then Fedora 6, and then finally I learned to trust the fine folks at Canonical and believe in that their distro was tight and damn is it ever tight. The beauty of using Ubuntu is that everything you need is installed by default and there is no mucking trying to get the right packages, dependencies, or source. Yes, this is probably not the optimal way of going about this but I need a workable solution and fast and while there are a whole pile of rpms ready to roll the nightmare of getting simple things like perl dependencies satisfied in Fedora were enough to send me screaming out of the cloud.

Anyways, I have a wicked basic cluster running using the following process:

On the Management Node I’m using this config.ini which is sort of cribbed together (/var/lib/mysql-cluster/config.ini)

# Options affecting ndbd processes on all data nodes:
[NDBD DEFAULT]
NoOfReplicas=2    # Number of replicas
DataMemory=256M    # How much memory to allocate for data storage
IndexMemory=256M   # How much memory to allocate for index storage
                  # For DataMemory and IndexMemory, we have used the
                  # default values. Since the "world" database takes up
                  # only about 500KB, this should be more than enough for
                  # this example Cluster setup.

# TCP/IP options:
[TCP DEFAULT]
portnumber=2202   # This the default; however, you can use any
                  # port that is free for all the hosts in cluster
                  # Note: It is recommended beginning with MySQL 5.0 that
                  # you do not specify the portnumber at all and simply allow
                  # the default value to be used instead

# Management process options:
[NDB_MGMD]
hostname=mgmn           # Hostname or IP address of MGM node
datadir=/var/lib/mysql-cluster  # Directory for MGM node log files

# Options for data node "A":
[NDBD]
                                # (one [NDBD] section per data node)
hostname=ndbda           # Hostname or IP address
datadir=/mnt/mysql/data   # Directory for this data node's data files

# Options for data node "B":
[NDBD]
hostname=ndbdb           # Hostname or IP address
datadir=/mnt/mysql/data   # Directory for this data node's data files

# SQL node options:
[MYSQLD]
hostname=sqln           # Hostname or IP address
                                # (additional mysqld connections can be
                                # specified for this node for various
                                # purposes such as running ndb_restore)

Now, in the end I moved this into /mnt (cp -ar /var/lib/mysql-cluster) so that I didn’t have the threat of running out of disk space on the primary partition.

On the SQL Node in mysql.cnf (/etc/mysql/my.cnf) I have nothing more than this:

# Options for mysqld process:
[MYSQLD]
ndbcluster                      # run NDB storage engine
ndb-connectstring=mgmn  # location of management server
log=/var/lib/mysql/mysql.log

I am experimenting with adding settings back in but I’m not too sure if they belong in the config.ini on the management node or in here. My gut tell me management node. Anyhow, with this I copied the contents of /var/lib/mysql into /mnt (cp -ar again) and renamed the old directory and created a symbolic link pointing to the new location. Kludgey, yes, but I am still learning my way around MySQL and its various settings. Likely, I will figure which config file gets the data directory settings and I’ll make the appropriate changes. And yes, you read that right I do have logging turned on because I am the kind of guy who needs to know.

On the Data Node in my.cnf (/etc/mysql/my.cnf) this plain vanilla setup:

# Options for ndbd process:
[MYSQL_CLUSTER]
ndb-connectstring=mgmn  # location of management server

Now to tie all this boxes together I ended up using a host file, recommended by Paul Moen and my boss and with an endorsement like that I just had to run with it! On all of the nodes in /etc/hosts I dropped the internal IP addresses of each box in the cloud (nslookup domU-12-34-56-78-9A-B1.z-2.compute-1.internal):

# Mysql Cluster data node
10.1.2.3	ndbda
10.4.5.6	ndbdb
# Mysql Cluster mgm node
10.7.8.9	mgmn
# MySQL Cluster sql node
10.10.11.12	sqln

Starting everything up begins with the management cluster:

ndb_mgmd -f /mnt/mysql-cluster/config.ini

Then the data nodes:

ndbd --initial

Note, you only need to do the inital part if it is the first time the node is coming up if you are restarting a cluster you can skip it.

Lastly, the SQL node:

/etc/init.d/mysql start

On the management node you can issue a SHOW to figure out if your bacon is frying:

root@mgmn:~# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @10.1.2.3  (Version: 5.0.38, Nodegroup: 0, Master)
id=3    @10.4.5.6  (Version: 5.0.38, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.7.8.9  (Version: 5.0.38)

[mysqld(API)]   1 node(s)
id=4    @10.10.11.12  (Version: 5.0.38)

Now, what about backups? Well, I am in the process of experimenting with issuing ndb_mgm -e “START BACKUP” on the cluster manager and that will dump a backup to each of the data nodes. Ideally, I would like to issue periodic backups to each individual node in a staggered fashion and have those gziped and sent up to S3. What I need to figure out is if I can issue a backup command for individual nodes like START BACKUP Node_2 or something there abouts. If that is the case I could then grow the data nodes out to the maximum four and take snapshots every 15 minutes which could give us decent coverage if our whole section of the cloud decided to pop.

If you have any questions, criticisms, or gripes feel free to slap me with them as I feel like I am still missing a huge chunk of the picture with all of this. :-D

EC2, MySQL, Replication, and You!

One week in and I’m just beginning to get my head around the EC2 way of doing things which is different than what I am used to doing in meatspace with a pile of hardware as well as being worlds away from my Microsoft background. So after tackling a couple of minor projects like Monit + Mongrel, and playing with Memcached, I decided tackle a simple replication setup to get my feet wet in preparation for building a cluster.

After hammering my head against the wall for a day or so things clicked after realizing that this exercise is so much easier if I use the private DNS for plumbing rather than laying ssh tunnels all over the place, which by the way sort-of-kind-of-not-really worked.

Prologue - Create Your World

Look, it is a well known fact that the 9 or so GB that Amazon gives you on the main partition isn’t going to cut it so you have to make some changes to your MySQL environment. The best place for bins and logs are on /mnt. So, since I am epically lazy, I created a mysql directory in /mnt, passed ownership to mysql:mysql, backed up /var/lib/mysql, created a symbolic link to mysql, and then cp -a the contents of the old /var/lib/mysql into /mnt/mysql. Why the backup? Well, if your MySQL instance ever goes down it’ll take the contents of /mnt with it and having a copy of it helps with a speedy recovery.

Step One - Be The Master

Add or edit the following line in /etc/mysql/my.cnf
log=/mnt/mysql/log/mysql.log
server-id=1
log-bin=/var/log/mysql/mysql-bin.log
binlog-do-db=babygotdb

log=/var/lib/mysql/mysql.log (this part is important for seeing whether or not data is replicating)

Restart MySQL then launch the client and issue the following command to add a user with replication privileges:

GRANT REPLICATION SLAVE ON *.* TO ‘replicant’@'%’
IDENTIFIED BY ‘replicantsarepeopletoo’;

FLUSH PRIVILEGES;

Now use the database so we can finish things off…

USE babygotdb;

We want to clear any read locks before we move on so,

FLUSH TABLES WITH READ LOCK;

Now we want to check if what we set up looks like right:

SHOW MASTER STATUS;

This will tell you about the database File, Position, Binlog_Do_DB, and Binlog_Ignore_DB. You will need to know both the file and the position. In this case mine has the following:

File - mysql-bin.000023
Position - 3617723
Binlog_Do_DB - babygotdb
Binlog_Ignore_DB -

You should see something similar to that though it will be in a different format.

quit;

Now is when we dump the database and get it ready to ship over to the slave server.

$ mysqldump -B babygotdb | gzip > babygotdb.sql.gz

Then just scp it to the slave.

Step Two - Get To Know Your Slave

Like on the master we need to make some changes to /etc/mysql/my.cnf

server-id=2
master-host=domU-12-34-56-78-90-1A.usma1.compute.amazonaws.com
master-user=replicant
master-password=replicantsarepeopletoo
master-connect-retry=60
replicate-do-db=babygotdb
log=/var/lib/mysql/mysql.log

Restart mySQL and then launch the client and we have only a little more work to do until we are done.

Gunzip your SQL dump then jump into the mysql client as we need to create our replicated database:

CREATE DATABASE babygotdb;

Then we need to suck in what we dumped on the master:

SOURCE babygotdb.sql;

Stop the slave from running:

SLAVE STOP;

And get our environment ready for replication (this is one long command with each component marked out with commas)

CHANGE MASTER TO
MASTER_HOST=’domU-12-34-56-78-90-1A.usma1.compute.amazonaws.com’,
MASTER_USER=’replicant’,
MASTER_PASSWORD=’replicantsarepeopletoo’,
MASTER_LOG_FILE=’mysql-bin.000023′, MASTER_LOG_POS=3617723;

Now start the slave back up…

SLAVE START;

Watch the replication happen in real time:

$ tail -f /var/lib/mysql/mysql.log

If all is working properly you should be seeing transactions flying by on both as changes are made in the master database.

Gratefully cribbed from Phillip Pearson over at Second p0st and the HowTo Forge write up by falko.

My long term goal is to create images of slaves, masters, data nodes, and management nodes so that deployment will require less hammering on the forge and more tying up loose ends.

Closing out the first week: Monit, Mongrel, and MySQL

One of the things I’ve realized with this new position is that I am my own worst taskmaster, driving myself to work longer hours in tightly focused stretches of time rarely punctuated by breaks. I suppose on some level that I feel like I need to be even more productive because of the absence of “face-time”, that there is no boss leaning over me making sure that I at least have the appearance of being busy. In contrast, though, I really am enjoying the work and the challenge that it presents, so I often feel that itch in the back of my brain to tray and solve the puzzle before I go to bed.

What have I been working on? Well, half the week was spent training Monit to play nice with Mongrel and I am decently confident that it works as advertised in the test environment. This afternoon we did a test deploy with Capistrano nesting it between Monit stop and start statements and everything appeared to work without a hitch. The challenge we faced with Monit in our environment was that we are unable to actually issue Mongrel starts and stops inside the config file. The solution was to take those statements and drop them into bash scripts, so at the moment I have an kludgey but operational method of fourteen scripts for seven mongrels (one start and one stop). When I get a moment, I plan on cleaning them up and making a single one that executes with variables, ie $ monit-mongrel stop 8001 but at the moment I am epically lazy. If I have the time I would like to figure out what exactly it is about the environment that doesn’t like mongrels being started or stopped inside Monit.

Half of yesterday and all of today I have been pounding my head against a nail studded board trying to get secure replication rolling inside EC2 for our MySQL boxen. The masters and slaves (yes, the developers on the crew with a more PC sensibility have chided me saying that the correct terms are primary and secondary. Fine we can meet in the middle with boss and underling) fire up fine and do what they are supposed to except actually perform replication of any shape, form, or fashion. To pipe them together I went the Stunnel route–could not for the life of me get SSL in MySQL to actually do anything–and I know that something is happening because the moment I issue a SLAVE START; command this shows up in the stunnel logs on the master: 2007.08.03 15:46:10 LOG5[13077:3083316112]: localhost.3306 connected from xx.xx.xx.xx:36769. I’m thinking that possibly it is how I set up the replication account permissions on the master, GRANT REPLICATION SLAVE ON *.* TO ‘replicantsarepeopletoo’@'%.mydomain.com’ IDENTIFIED BY ’s3KrEtpa5Sw0rd’;. Taking shot in the dark, since I am tunneling the traffic it likely should just be ‘replicantsarepeopletoo’@'localhost’ so when I’m feeling a little less punchy I’m going to take a look at that again but after twelve hours I pretty much hate MySQL and EC2 at the moment.

What I haven’t been doing is taking pictures, writing, reading something other than man pages and long-winded newsgroup threads, and really listening to some of the new albums I just picked up (Nicole Willis & The Soul Investigators - Keep Reachin’ Up and Red Bumb Ball: Rare and Unreleased Rocksteady (1966-1968) are fucking amazing albums though). Hopefully, I’ll find my stride soon and build a sort of groove where I’m not pushing myself so hard that I’m dreaming about how the company abandons it current market focus and I’m forced to look into re-architecting the mongrel cluster for their plans to launch a fried chicken franchise. Yeah, I do need more sleep.





Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States