Posts Tagged ‘Amazon’

Work In Progress: Top Albums from 2008

Thursday, February 5th, 2009

Last year I purchased 251 albums from eMusic and maybe another 50 or more from Amazon MP3, so by that count I was buying some 25 a month, or nearly 1 a day.  That is one hell of a habit, but what makes it harder is sifting through all of those and tallying the ones that hooked me because, while I’m an obsessive list maker, the list is always far in arrears.

My first step is to figure out which of those 300 or so purchases were released in 2008 and then which of that set are not re-releases; it is usually that latter part where I give up.  In my first pass I’m left with 1399 tracks and no real way to parse them into albums with Amarok except maybe counting, but I don’t have that many fingers and toes.  Thankfully, I set Amarok up to use MySQL as its engine, so with a quick query, and a little clean up for the freebies, I have 120 albums* to work through.
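The query itself is not reproduced here, but the idea is simply to collapse tracks into distinct artist/album pairs. A minimal Ruby sketch of that grouping, using made-up sample rows rather than Amarok’s real schema (the field names and the method name are assumptions for illustration):

```ruby
# Hypothetical track rows; Amarok's actual tables and columns differ.
SAMPLE_TRACKS = [
  { artist: "Portishead", album: "Third",               title: "track 1" },
  { artist: "Portishead", album: "Third",               title: "track 2" },
  { artist: "Ratatat",    album: "LP3",                 title: "track 1" },
  { artist: "Elbow",      album: "The Seldom Seen Kid", title: "track 1" }
].freeze

# Collapse a list of tracks into unique [artist, album] pairs, sorted by artist.
def distinct_albums(tracks)
  tracks.map { |t| [t[:artist], t[:album]] }.uniq.sort
end

distinct_albums(SAMPLE_TRACKS).each do |artist, album|
  puts "#{artist} - #{album}"
end
```

In SQL terms this is roughly a SELECT DISTINCT (or GROUP BY) over the artist and album columns, which is much easier than counting on fingers and toes.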

Now, my more purist readers and friends might exclaim, “Not all of these albums are actually 2008 releases! Cull! Cull!”  While I agree with that in principle, I really am an individual governed by sloth and am truly unmotivated to verify the true release date of each album.  Either way it is going to take me a long while to whittle things down to a Top 10.

This is my playlist…

  1. 2562–Aerial
  2. 3 Na Massa–3 Na Massa
  3. Al Kent Presents The Million Dollar Orchestra–Better Days
  4. Ananda Project–Night Blossom
  5. Aphex Twin–Classics
  6. Aziza Brahim–Mi Canto
  7. Baby Charles–Baby Charles
  8. Basia Bulat–Oh, My Darling
  9. Black Taj–Beyonder
  10. BLK JKS–Mystery EP
  11. Bombay Dub Orchestra–3 Cities
  12. Booka Shade–The Sun & The Neon Light
  13. Calexico–Carried To Dust
  14. Carl Craig–Sessions
  15. Cheb i Sabbah–Devotion
  16. Chin Chin–Chin Chin
  17. Coldplay–Viva La Vida Or Death And All
  18. Cordero–De Donde Eres
  19. Curtis Macomber–Asia: Sonata for Violin & Piano, Piano Trio
  20. Dan Zanes and Friends–¡Nueva York!
  21. Deastro–Keeper’s
  22. Debashish Bhattacharya–Calcutta Chronicles: Indian Slide-Guitar Odyssey
  23. Dengue Fever–Venus on Earth
  24. Derrick May–Innovator
  25. DJ /rupture–Uproot
  26. Dub Trio–Another Sound Is Dying
  27. Duffy–Rockferry
  28. Elbow–The Seldom Seen Kid
  29. El Guincho–Alegranza
  30. Eliot Lipp–The Outside
  31. Esperanza Spalding–Esperanza
  32. Etran Finatawa–Desert Crossroads
  33. Fall Out Boy–Folie à Deux
  34. Fanatix–This Thing of Ours
  35. Faraquet–Anthology 1997-98
  36. Fenin–Been Through
  37. Firewater–The Golden Hour
  38. Gang Gang Dance–Saint Dymphna
  39. Ghislain Poirier–No Ground Under
  40. Gnarls Barkley–The Odd Couple
  41. Grouper–Dragging A Dead Deer Up A Hill
  42. Grupo Fantasma–Sonidos Gold
  43. Guillermo Klein–Filtros
  44. Hauschka–Ferndorf
  45. Headlights–Some Racing, Some Stopping
  46. Health–Disco (V3)
  47. Hector Zazou & Swara–In The House Of Mirrors
  48. Hot Chip–Made In The Dark
  49. Huun-Huur-Tu–Mother Earth! Father Sky!
  50. Jack Peñate–Matinée
  51. James Blackshaw–The Wolf Also Shall Dwell with the Lamb
  52. James Blackshaw–White Goddess
  53. James Hardway–L.A. Instrumental
  54. J-Boogie’s Dubtronic Science–Soul Vibrations
  55. J*Davey–The Beauty In Distortion / The Land Of The Lost
  56. J-Live–Then What Happened
  57. Josh Martinez–World Famous Sex Buffet
  58. Joy Division–The Best Of
  59. Juno Reactor–Gods & Monsters
  60. Kasai Allstars–In The 7th Moon, The Chief Turned Into A Swimming Fish And A
  61. Kaya Project–…& So It Goes
  62. Kayhan Kalhor–Silent City
  63. Kraak & Smaak–Plastic People
  64. La Sonora de Lucho Macedo–Gozalo – Bugalu Tropical Volume 2
  65. La Sonora de Lucho Macedo–¡Gózalo! Vol. 1 – Bugalú Tropical
  66. Lau Nau–Nukkuu
  67. Les Voix Baroques–Canticum Canticorum
  68. Louie Vega–House Masters: Louie Vega
  69. Luomo–Convivial
  70. Lyrics Born–Everywhere At Once
  71. Marco Benevento–Invisible Baby
  72. Markus Schulz–Markus Schulz – Amsterdam 08
  73. Melody Gardot–Worrisome Heart
  74. Michael Nyman–8 Lust Songs: I Sonetti Lussuriosi
  75. Michael Nyman–Mozart 252
  76. Mike Ladd–Nostalgialator
  77. Minus The Bear–Acoustics
  78. Moby–Last Night
  79. Natacha Atlas–Ana Hina
  80. Natural Self feat. Andreya Triana–The Art Of Vibration
  81. N.E.R.D.–Seeing Sounds [Explicit]
  82. Niyaz–Nine Heavens
  83. Nomo–Ghost Rock
  84. Plantlife–Time Traveller
  85. Plants and Animals–Parc Avenue
  86. Portishead–Third
  87. Q-Tip–The Renaissance
  88. Quantic Presents…Flowering Inferno–Death Of The Revolution
  89. Quiet Village–Silent Movie
  90. Raashan Ahmad–The Push
  91. Rainbow Arabia–The Basta
  92. Ratatat–LP3
  93. Rebirth Brass Band–25th Anniversary
  94. Richard Swift–Ground Trouble Jaw
  95. Santogold–Santogold
  96. Scott Reynolds–Adventure Boy
  97. Seun Kuti & Fela’s Egypt 80–Seun Kuti & Fela’s Egypt 80
  98. Siah & Yeshua dapoED–The Visualz Anthology
  99. Stanton Moore–Emphasis! (On Parenthesis)
  100. Studio–Yearbook 2
  101. Thao–We Brave Bee Stings and All
  102. The Big Sleep–Sleep Forever
  103. The Black Ghosts–The Black Ghosts
  104. The Cat Empire–So Many Nights
  105. The Gaslight Anthem–The ’59 Sound
  106. The Herbaliser–Same As It Never Was
  107. The High Decibels–The High Decibels
  108. The Hold Steady–Stay Positive
  109. The Matthew Herbert Big Band–There’s Me And There’s You
  110. The Postmarks–By The Numbers
  111. The Saturday Knights–Mingle
  112. The Vandermark 5–Beat Reader
  113. Thievery Corporation–Radio Retaliation
  114. TM Juke And The Jack Baker Trio–Boto And The Second Liners
  115. Vampire Weekend–Vampire Weekend
  116. Vibesquad–Dawn Patrol
  117. Yusef Lateef–Yusef Lateef
  118. Zomby–Where Were U in ’92?
  119. Zuco 103–After The Carnival

Time to turn up the speakers for the next 107 hours or so…

*edit–Found a straggler and culled it.

**edit–Culled Coptic Light because it was released in 2005 and my copy had the date encoded wrong.

Caching Static Assets Made Simple with Nginx, Varnish, S3

Thursday, September 18th, 2008

We serve some of our assets directly out of S3, and while it is convenient it is not the speediest way to deliver content.  The crew over at Viximo worked out how to bolt Varnish onto the side of Apache so that they can cache their S3 content, and I was so smitten with the idea that I wanted to adapt it for our configuration, so I asked Chris Chiodo to reveal the secret sauce.  Below are the configuration files I munged from what he generously shared.

Nginx

This is pretty straightforward: I’ve made Varnish an upstream server and am intercepting any request for content in photos, avatars, kit, or caboodle and passing it along to Varnish.

upstream varnish {
server varnish01:7000 max_fails=3  fail_timeout=30s;
}

location ~ ^/(photos|avatars|kit|caboodle)/ {
proxy_pass http://varnish;
}
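To sanity check which requests that location block will hand to Varnish, the same pattern can be exercised outside nginx. A rough Ruby sketch (the sample paths are invented, and Ruby’s regex engine stands in for the one nginx uses, which should behave the same for a pattern this simple):

```ruby
# Same pattern as the nginx location block above.
CACHED_PREFIXES = %r{^/(photos|avatars|kit|caboodle)/}

# True when the request path would be proxied to the varnish upstream.
def routed_to_varnish?(path)
  !!(path =~ CACHED_PREFIXES)
end

# Sample paths are made up for illustration.
["/photos/1/thumb.jpg", "/avatars/42.png", "/photoshop/x.png", "/users/photos/1.jpg"].each do |path|
  puts "#{path} -> #{routed_to_varnish?(path) ? 'varnish' : 'not matched'}"
end
```

Note that the trailing slash in the pattern matters: /photoshop/x.png is not caught, and neither is a photos directory that sits deeper in the path.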

Varnish

This was my stumbling block until I talked to Viximo.  The problem was how I defined the backend: for whatever reason either Varnish or AWS did not like the request going to bucket-name.amazonaws.com, so the backend points at s3.amazonaws.com and the bucket name is prepended to the URL instead.

backend media {
.host = "s3.amazonaws.com";
.port = "80";
}

sub vcl_recv {
set req.url = regsub(req.url, "^", "/bucket-name");
set req.backend = media;
set req.http.host = "localhost";
remove req.http.X-Forwarded-For;
remove req.http.X-Forwarded-for;
remove req.http.X-Forwarded-Host;
remove req.http.X-Forwarded-Server;
set    req.http.X-Forwarded-for     = "127.0.0.1";
set req.grace = 30s;
lookup;
}

sub vcl_fetch {
set obj.http.X-Varnish-Url = req.url;
// set a 1 day ttl for avatars
set obj.ttl = 1d;
set obj.grace = 30s;

if (!obj.cacheable) {
pass;
}

set obj.prefetch =  -30s;
deliver;
}
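The one line in vcl_recv that does the real work is the regsub, which prepends the bucket name to every request path. A tiny Ruby sketch of the same substitution, just to show the effect (the method name is made up, and bucket-name is the same placeholder used in the VCL):

```ruby
BUCKET = "bucket-name" # placeholder, as in the VCL above

# Mirror regsub(req.url, "^", "/bucket-name"): prepend the bucket to the path.
def s3_path(url, bucket = BUCKET)
  url.sub(/\A/, "/#{bucket}")
end

puts s3_path("/avatars/42.png")
```

So a request for /avatars/42.png reaches the backend as /bucket-name/avatars/42.png.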

That’s it.  Simple and it works.

nginx + HAProxy + Thin + FastCGI + PHP5 = Load Balanced Rails with PHP Support

Tuesday, July 15th, 2008

This was probably one of the more radical switches in architecture that we’ve made in the recent past.  For the past 7 months we have been successfully running Apache + mod_proxy + mongrel with some limited PHP applications bolted on but the whole set up felt a tad bloated and a little more than unstable as we tested various scaling scenarios.  With the rails community chatting about the hotness that is thin, nginx, and HAProxy we decided to see what it would take to migrate.

The catch with our infrastructure, though, is that we have broken apart our static assets from rails, so the usual localhost simplicity isn’t there, which, unfortunately, is what most of the tutorials assume.  In our case, the application sits in a pool of servers, and one of the things that we wanted to do was leverage HAProxy to balance each nginx instance over a group of primary and secondary application servers, with the primary and secondary status staggered between each nginx instance. Igvita’s post was the inspiration for this, and our goal is to create a more fault tolerant environment built on shared services rather than our current setup of largely discrete stacks.
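One way to picture the staggering is to rotate the pool so that each front end treats a different slice of it as its primaries. The Ruby sketch below only illustrates that idea; the host names and the method are hypothetical, and it is not the tooling we actually use:

```ruby
# Hypothetical host names; the real pool is not listed in this post.
FRONT_ENDS  = %w[nginx01 nginx02].freeze
APP_SERVERS = %w[app01 app02 app03 app04].freeze

# Give each front end a different slice of the pool as primaries and
# treat the remaining servers as secondaries.
def staggered_pools(front_ends, app_servers)
  per_front = app_servers.size / front_ends.size
  front_ends.each_with_index.map do |front, i|
    rotated = app_servers.rotate(i * per_front)
    [front, { primary: rotated.first(per_front), secondary: rotated.drop(per_front) }]
  end.to_h
end

staggered_pools(FRONT_ENDS, APP_SERVERS).each do |front, pool|
  puts "#{front}: primary=#{pool[:primary].join(',')} secondary=#{pool[:secondary].join(',')}"
end
```

With two front ends and four application servers, the first front end gets app01 and app02 as primaries while the second gets app03 and app04, and each falls back to the other pair.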

The first thing I tackled was setting up nginx by breaking apart the rails application and any PHP applications into separate virtual hosts. First up is the rails config…

upstream thin {
server 127.0.0.1:8700;
}

server {
listen       80;
server_name  first.server.name;
rewrite ^/(.*) https://what.ever.you.want/$1 permanent;
}

server {
listen 443;
ssl on;
ssl_session_timeout  5m;
ssl_protocols  SSLv2 SSLv3 TLSv1;
ssl_ciphers  ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;
ssl_prefer_server_ciphers   on;

# path to your certificate
# if you have an intermediate cert then you need to add the contents to the end of the cert file
ssl_certificate /where/your/cert/is.pem;

# path to your ssl key
ssl_certificate_key /where/your/key/is.key;

# standard rails configuration goes here.
root /location/of/your/site/root;

#        rewrite_log on;

if (-f $document_root/system/maintenance.html) {
rewrite  ^(.*)$  /system/maintenance.html last;
break;
}

location ~ ^/$ {
if (-f $document_root/index.html) {
rewrite (.*) /index.html last;
}
proxy_pass  http://thin;
}

location / {
if (!-f $request_filename.html) {
proxy_pass  http://thin;
}
rewrite (.*) $1.html last;
}

location ~ \.html$ {
root /location/of/your/site/root;
}

location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|pdf|txt|js|mov)$ {
root  /location/of/your/site/root;
}

location / {
proxy_pass  http://thin;
proxy_redirect     off;
proxy_set_header   Host             $host;
proxy_set_header   X-Real-IP        $remote_addr;
proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
proxy_set_header X-FORWARDED_PROTO https;
}
}

And our PHP config…

server {
### PHP Support ###
listen       80;
server_name  second.server.name;
access_log  /location/of/your/site/root/logs/blog-access.log;
error_log  /location/of/your/site/root/logs/blog-error.log;

if (!-e $request_filename) {
rewrite ^([_0-9a-zA-Z-]+)?(/wp-.*) $2 last;
rewrite ^([_0-9a-zA-Z-]+)?(/.*\.php)$ $2 last;
rewrite ^ /index.php last;
}

location / {
root /location/of/your/site/root;
index index.html index.php index.htm;
}

location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|pdf|txt|js|mov)$ {
root /location/of/your/site/root;
}

# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000

location ~ \.php$ {
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
fastcgi_param QUERY_STRING $query_string;
fastcgi_param REQUEST_METHOD $request_method;
fastcgi_param CONTENT_TYPE $content_type;
fastcgi_param CONTENT_LENGTH $content_length;
fastcgi_param SCRIPT_FILENAME  /location/of/your/site/root/$fastcgi_script_name;
fastcgi_param REQUEST_URI $request_uri;
fastcgi_param DOCUMENT_URI $document_uri;
fastcgi_param DOCUMENT_ROOT $document_root;
fastcgi_param SERVER_PROTOCOL $server_protocol;
fastcgi_param GATEWAY_INTERFACE CGI/1.1;
fastcgi_param SERVER_SOFTWARE nginx;
fastcgi_param REMOTE_ADDR $remote_addr;
fastcgi_param REMOTE_PORT $remote_port;
fastcgi_param SERVER_ADDR $server_addr;
fastcgi_param SERVER_PORT $server_port;
fastcgi_param SERVER_NAME $server_name;
}
}

Next up is the HAProxy configuration…

global
	log 127.0.0.1	local0
	log 127.0.0.1	local1 notice
	nbproc		1
	pidfile		/var/run/haproxy.pid
	#debug
	#quiet
	user haproxy
	group haproxy

defaults
	log		global
	mode		http
	option		httplog
	option		dontlognull
	retries		15
	redispatch
	contimeout	60000
	clitimeout	150000
	srvtimeout	60000
	option          httpclose     # disable keepalive (HAProxy does not yet support the HTTP keep-alive mode)
	option          abortonclose  # enable early dropping of aborted requests from pending queue
	option          httpchk       # enable HTTP protocol to check on servers health

listen	thin *:8700
	option httpchk
	mode http
	option forwardfor except 127.0.0.1/8
	balance roundrobin
	server web01 hostname-of-server:8100 weight 1 minconn 1 maxconn 6 check inter 40000
	etc....

There are a couple of things to note here.  To get HAProxy to fetch content from servers other than localhost you’ll need to chuck in a wildcard (listen thin *:8700), and to get logging running you’ll need to edit /etc/syslog.conf, adding the following lines:

# Save HA-Proxy logs
	local0.*                                                /var/log/haproxy_0.log
	local1.*                                                /var/log/haproxy_1.log

As well as edit /etc/default/syslogd:

# For remote UDP logging use SYSLOGD="-r"
SYSLOGD="-r"

One last thing that drove me almost to the brink of madness is that HAProxy, at least in the build on Ubuntu 8.04, is finicky about how the configuration file is laid out. Each section (defaults, global, and listen) has to have its parameters defined with a tab preceding each, and while HAProxy would start and accept requests from nginx with anything else, it would not fetch from the thin server pool.
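Since this is easy to get wrong by eye, a small check can flag offending lines before a restart. A minimal Ruby sketch, assuming the rule exactly as described above (section keywords at column 0, every setting under them starting with a tab); the method name and the sample text are made up:

```ruby
SECTION_KEYWORDS = %w[global defaults listen].freeze

# Return the 1-based line numbers of settings that are not tab-indented.
def untabbed_lines(config_text)
  bad = []
  config_text.each_line.with_index(1) do |line, number|
    stripped = line.strip
    next if stripped.empty? || stripped.start_with?("#")
    # A section header starts at column 0 with one of the known keywords.
    section_header = line !~ /\A\s/ && SECTION_KEYWORDS.include?(stripped.split.first)
    next if section_header
    bad << number unless line.start_with?("\t")
  end
  bad
end

# Line 5 is indented with spaces instead of a tab.
SAMPLE_CONFIG = [
  "global",
  "\tnbproc\t1",
  "",
  "listen\tthin *:8700",
  "        mode http",
  "\tbalance roundrobin"
].join("\n")

puts untabbed_lines(SAMPLE_CONFIG).inspect
```

To check a real file, something like untabbed_lines(File.read("/etc/haproxy.cfg")) would do, with the path adjusted to wherever your configuration lives.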

So that is our front-end, what about the application pool? Turns out that Thin is just as easy to set up as a mongrel cluster and only took a minimum of effort on our part to get it dialed in with God and serving upstream. We edited the stock init script to reflect where we store the yamls and massaged God for the changes in clustering.

Here’s our init script:

#!/bin/sh
### BEGIN INIT INFO
# Provides:          thin
# Required-Start:    $local_fs $remote_fs
# Required-Stop:     $local_fs $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      S 0 1 6
# Short-Description: thin initscript
# Description:       thin
### END INIT INFO

# Original author: Forrest Robertson

# Do NOT "set -e"

DAEMON=/usr/bin/thin
SCRIPT_NAME=/etc/init.d/thin
CONFIG_PATH=/location/of/your/yamls

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

case "$1" in
  start)
	$DAEMON start --all $CONFIG_PATH
	;;
  stop)
	$DAEMON stop --all $CONFIG_PATH
	;;
  restart)
	$DAEMON restart --all $CONFIG_PATH
	;;
  *)
	echo "Usage: $SCRIPT_NAME {start|stop|restart}" >&2
	exit 3
	;;
esac

:

And here’s a sample yaml:

---
user: user-which-runs
group: group-which-runs
chdir: /location/of/your/app
log: log/thin.log
port: 8100
environment: staging
pid: /location/of/your/pids.pid
servers: 3
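With port set to 8100 and servers set to 3, Thin starts its processes on consecutive ports, which is why the God config that follows watches 8100 through 8102. A small Ruby sketch of that expansion (the method name is made up, and the YAML is trimmed to the two keys that matter here):

```ruby
require "yaml"

# Same keys as the sample YAML above, trimmed to the two that matter here.
SAMPLE_THIN_CONFIG = <<~YAML
  port: 8100
  servers: 3
YAML

# List the ports a cluster occupies: `servers` consecutive ports from `port`.
def thin_ports(config)
  first = config.fetch("port")
  count = config.fetch("servers", 1)
  (first...(first + count)).to_a
end

puts thin_ports(YAML.safe_load(SAMPLE_THIN_CONFIG)).inspect
```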

God is very similar to what we had been running with a mongrel cluster:

RAILS_ROOT = "/location/of/your/app"

%w{8100 8101 8102}.each do |port|
 God.watch do |w|
    w.group = 'pack_01'
    w.name = "thin-#{port}"
    w.interval = 30.seconds # default
    w.start = "thin start -C /location/of/your.yaml -o #{port}"
    w.stop = "thin stop -C /location/of/your.yaml -o #{port}"
    w.restart = "thin stop -C /location/of/your.yaml -o #{port} && thin start -C /location/of/your.yaml -o #{port}"
    w.start_grace = 15.seconds
    w.restart_grace = 15.seconds
    w.pid_file = "/location/of/your/pids.#{port}.pid"

    w.behavior(:clean_pid_file)

    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 5.seconds
        c.running = false
      end
    end

    w.restart_if do |restart|
      restart.condition(:memory_usage) do |c|
        c.above = 150.megabytes
        c.times = [3, 5] # 3 out of 5 intervals
      end

      restart.condition(:cpu_usage) do |c|
        c.above = 50.percent
        c.times = 5
      end
    end

    # lifecycle
    w.lifecycle do |on|
      on.condition(:flapping) do |c|
        c.to_state = [:start, :restart]
        c.times = 5
        c.within = 5.minute
        c.transition = :unmonitored
        c.retry_in = 10.minutes
        c.retry_times = 5
        c.retry_within = 2.hours
      end
    end
  end
end

There you have it, a completely rebuilt stack leveraging lean, fast, and stable services.

Gratefully cribbed from HowtoForge, John Yerhot, and Igvita.

Evolving Services on EC2

Friday, May 2nd, 2008

One of the great things about EC2 is that it is essentially a giant sandbox where you can take risks experimenting with architecture and services in a rapid and cost effective manner, something that you cannot really do well at a co-lo or even on other VPS services.  In the past year we have experimented with plenty of different configurations: some found their way into production, others were filed for future reference, and still others are to be avoided at all costs.

May 2007

When I came on board as a contractor we had only 2 servers inside EC2 and the database hosted at Go Daddy. The company had just migrated the Apache/Application server along with a Harvest server into EC2 but had opted to leave the database hosted at Go Daddy due to fears of data loss.  The only trouble with this scheme was the latency between the application and the db, which made things so glacially slow that the site was nearly unusable.

August 2007

After starting full-time we brought the database into the cloud and started looking into how we might implement a MySQL cluster in EC2.  The challenge was to get a backup routine that was unobtrusive yet fast and easy to transfer into S3. I never got LVM snapshots working to my comfort level, so we relied instead on mysqldump, which, all in all, worked fine while the db was small. Data loss was still a big concern for us, so we began experimenting in earnest with MySQL clusters.

November 2007

The MySQL cluster idea didn’t pan out; the theory is that the small instances just didn’t have what it takes to cluster. So we went with plain old replication, which has proven to be stable and reliable. The slaves serve as fail-over units and also perform periodic backups, freeing the master from that task.  Feeling more comfortable with database integrity, we turned our attention to getting our application to scale, a challenge with resource hungry Rails.

January 2008

Breaking apart Apache and rails was a snap with mod_proxy and it allowed us to dedicate hardware to each.  With things running even better we started thinking about how we can flip this into a more thorough horizontal scale.

May 2008

So one year later we have brought some horizontal scale to the site, adding stability and failover to the application. As the site grows, though, we are back to how we can best scale the database, but at least we have a sandbox to play in so we can figure it out.

Things to do when EC2 goes down (again) in the middle of the night.

Monday, April 7th, 2008

  1. Drink coffee by the gallon.
  2. Hit F5 repeatedly on the relevant thread hoping for some shred of a fix.
  3. Organize your photos (again).
  4. Throw Cap’n Crunch at your dog and marvel at how he catches it on his tongue without moving.
  5. Plan what sleeping position you’ll assume when you get back to bed.
  6. Start pricing Engine Yard.

Sold on Jungle Disk

Saturday, March 22nd, 2008

Admittedly, it was an easy sell: I’ve been using Amazon’s S3 service on the job for the last 8 months for storing db backups, so I’m familiar with the pros, cons, and costs. I had looked at Jungle Disk as a possible solution but disregarded it since it did not support Linux. Mike posted his thoughts about it and pointed out that they are giving some love to The Penguin.

After a quick spin on the free trial client I went ahead and signed up for the Plus service which allows you to browse your files online. Yes, you could use the S3 Firefox plugin but given the way that Jungle Disk writes folders as files it makes for some ugly viewing and the same goes for the S3Sync tools. Anyways, I look at the $12/year as a donation to keep the company afloat and developing.

To give people an idea of the cost, I’m going to start with backing up my photos (33GB) and the last three years of eMusic (50GB). The first transfer will cost about $30, and after that it will cost about $12.50 a month.  This data grows at about 2GB a month, which will tack on less than a dollar extra a month.  Not too bad of a proposition, though I do see the potential for climbing up to around $35/month; the cost is worth the peace of mind it brings.
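For anyone who wants to plug in their own numbers, the arithmetic can be sketched in a few lines of Ruby. The per-GB rate here is only what the figures in this post imply (monthly cost divided by total size), not an official price, and the method names are made up:

```ruby
PHOTOS_GB   = 33
MUSIC_GB    = 50
MONTHLY_USD = 12.50 # monthly figure quoted in the post
GROWTH_GB   = 2

# Per-GB monthly rate implied by the figures in the post (not an official price).
def implied_rate(total_gb, monthly_usd)
  monthly_usd / total_gb
end

# Extra monthly cost added by one month of growth at that implied rate.
def growth_cost(growth_gb, rate)
  growth_gb * rate
end

rate = implied_rate(PHOTOS_GB + MUSIC_GB, MONTHLY_USD)
puts format("implied rate: $%.2f per GB-month", rate)
puts format("monthly growth adds: $%.2f", growth_cost(GROWTH_GB, rate))
```

That works out to roughly 15 cents per GB per month, so 2GB of new data adds about 30 cents to the bill each month.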