Some of you might not know this about me, but I am a successful web developer outside of this blog. I currently host and run 9 sites on my Media Temple (dv) server. One of my biggest concerns was how to keep safe, up-to-date, and secure backups of my website files outside of my home. After turning to my good friend Google, I came across a couple of great articles from two local Atlanta bloggers on this topic. My friend Paul Stamatiou and Christina Warren have both written in-depth articles on how they use Amazon S3 to securely back up their websites daily, which is where I learned to do the following.

While Paul and Christina’s guides are great, I wanted to further explore S3sync and share my experience using it to back up my Media Temple (dv) server to Amazon S3. The steps I am going to walk you through are based on S3sync, an open source Ruby application that transfers files to Amazon S3 over secure SSL connections. Let me start by warning you that you need to know a bit about UNIX commands. You will need an application like Terminal on OS X or Linux, or PuTTY on Windows, to SSH into your web server. If you don’t know what SSH is, then this tutorial might not be for you. I will be using Terminal, which is built into Apple OS X.

**It is very important that you use absolute paths throughout this tutorial. If you are not sure what your absolute path is, enter “pwd” in the terminal window after you have logged into your server. For this tutorial, I am going to work directly from the root level.**

Step 1: Install Ruby

I could walk you through a step-by-step guide on how to install Ruby, but truth be told, I recommend you check with your web host for specific instructions. For Media Temple, I would use these guides: (dv) 2.0 server or (dv) 3.0 server. The Media Temple Grid server comes with Ruby pre-installed, so you can skip this step.
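
Whichever guide you follow, you can confirm Ruby is available before moving on; S3sync is just a Ruby script, so any working installation should do:

ruby -v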

Step 2: Install S3sync

To get started, we need to download the S3sync tarball, which its developer hosts on Amazon S3. Once we have it, we will decompress it and download the SSL certificates needed for secure transfers.

Start by getting the S3sync tar:

wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz

Decompress the tar file:

tar xvzf s3sync.tar.gz

Remove the S3sync tar file, get the SSL certificates, and decompress them:

rm s3sync.tar.gz
cd s3sync
mkdir certs
cd certs
wget http://mirbsd.mirsolutions.de/cvs.cgi/~checkout~/src/etc/ssl.certs.shar
sh ssl.certs.shar
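
If the shar ran correctly, a quick listing should show the unpacked certificate files (the exact file names vary with the version of the shar):

ls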

Step 3: Setup S3sync

Next, we are going to make the directory that will store the backups before they are transferred to your S3 account. This folder will be inside the S3sync folder.

cd ..
mkdir s3backup

Edit the s3config.rb file; this step only needs to be done with newer versions of S3sync.

vi s3config.rb

You need to replace the confpath with this:

confpath = ["./", "#{ENV['S3CONF']}", "#{ENV['HOME']}/.s3conf", "/etc/s3conf"]
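
With that confpath in place, s3sync looks for its configuration first in the current directory, then in the directory named by the S3CONF environment variable, then in ~/.s3conf, and finally in /etc/s3conf. If you ever move the config file, you can point s3sync at it through that environment variable (assuming our root-level layout):

export S3CONF=/s3sync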

Now, enter your Amazon S3 account information into the s3config.yml file which we will create in the S3sync directory:

vi s3config.yml

Now that you are in the VI editor, hit the “i” key to enter insert mode then enter the following:

aws_access_key_id: ***********************
aws_secret_access_key: ***************************
ssl_cert_dir: /s3sync/certs

If you are having problems with this step, look at the s3config.yml.sample file that comes with S3sync. After you have entered your Amazon S3 keys and the absolute path to your certs directory, hit the “escape” key, then type “:wq” to save and quit the VI editor. If you are not sure how to use the VI editor, check out this resource.
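
One optional but sensible precaution: since s3config.yml contains your secret key, you can restrict its permissions so only your user can read it:

chmod 600 /s3sync/s3config.yml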

Step 4: Write the Shell Backup Script

Now, change directories so you are in the S3sync directory, which is where we will create the backup shell script. I have said it before, but this is very important for the automation step at the end of this tutorial: use absolute paths in this script. For the script example below, I am assuming everything is at the root level.

vi backupscript

Hit the “i” key to enter insert mode.

#!/bin/bash

echo `date` ": Deleting Previous TAR Files..." > /s3sync/s3backup/backup.log

# Empty the backup folder of previous backups.
cd /s3sync/s3backup
rm -f *
cd /s3sync

echo `date` ": Beginning Backup Process..." >> /s3sync/s3backup/backup.log

# Get the date.
NOW=$(date +_%b_%d_%y_%H-%M)

echo `date` ": Generating File Backup TAR..." >> /s3sync/s3backup/backup.log

# Generate a tar-file of the server contents.
tar czvf websites_backup$NOW.tar.gz **********
mv websites_backup$NOW.tar.gz /s3sync/s3backup
cd /s3sync/s3backup

# Database Backup
DBNAME=**********
DBPWD=**********
DBUSER=**********

echo `date` ": Generating SQL Backup TAR..." >> /s3sync/s3backup/backup.log

# Generate a tar-file of the SQL database.
touch $DBNAME.backup$NOW.sql.gz
mysqldump -u $DBUSER -p$DBPWD $DBNAME | gzip -9 > $DBNAME.backup$NOW.sql.gz

echo `date` ": Compressing 2 TAR Files Into 1..." >> /s3sync/s3backup/backup.log

# Compress all tar-files in to 1
tar czvf server_backup$NOW.tar.gz $DBNAME.backup$NOW.sql.gz websites_backup$NOW.tar.gz

echo `date` ": Delete Individual TAR Files..." >> /s3sync/s3backup/backup.log

# Remove individual tar-files.
rm -f $DBNAME.backup$NOW.sql.gz websites_backup$NOW.tar.gz

echo `date` ": Running S3sync Ruby Script..." >> /s3sync/s3backup/backup.log

# Transfer tar-file to Amazon S3
BUCKET=**********
cd ~/
ruby /s3sync/s3sync.rb -r -v --ssl /s3sync/s3backup/ $BUCKET:

echo `date` ": Backup Complete..." >> /s3sync/s3backup/backup.log

This script is not very complex, but I will walk through it with you a little bit. All of the echo statements output the status of the script for debugging purposes. These statements are not required, but they might help you troubleshoot any problems that arise. Make sure you replace the directory you want to back up, the SQL database information, and the Amazon S3 bucket information where you see: **********

The script starts by deleting the contents of the backup folder, which holds the backups generated the last time the script was run. Next, we generate the date for labeling the tar-files. After that, it is time to compress the actual web server files into a tar-file. Make sure you replace the “**********” with the absolute path of the directory you would like to back up. Next, we generate a tar-file with the contents of a SQL database. Again, make sure you replace the “**********” with your specific information. Finally, we compress the 2 tar-files into one and transfer that tar-file to your Amazon S3 account. Make sure you edit the last occurrence of “**********” with the name of the Amazon S3 bucket you wish to save the backups in.
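
As a purely hypothetical example (every value below is a placeholder; substitute your own paths and credentials), the edited lines might end up looking something like this:

# Hypothetical values for illustration only.
tar czvf websites_backup$NOW.tar.gz /var/www/vhosts
DBNAME=exampledb
DBPWD=examplepassword
DBUSER=exampleuser
BUCKET=example-backup-bucket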

Step 5: Test the Script

Since we created the script with VI, it will not be executable yet. Make it executable, then run it:

chmod +x backupscript
./backupscript

When you start the script, you should see a fast-scrolling list of files as they are backed up. When it stops scrolling, the 2 tar-files are being combined and transferred to your Amazon S3 account. This can take a few minutes, so be patient. If successful, the prompt will reappear in the terminal window, and you should see the tar-file in your Amazon S3 bucket.
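
If you would rather verify from the command line, the S3sync package also ships with a companion s3cmd.rb script; assuming your copy includes it, you should be able to list the bucket contents directly (replace BUCKETNAME with your bucket):

ruby /s3sync/s3cmd.rb list BUCKETNAME: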

Step 6: Automate the Script

Now, I am going to walk you through how to edit your crontab file to run the script daily at a time of your choosing. If you don’t know how a crontab works, check out this great crontab resource. The basic format of a crontab entry is:

*    *    *    *    *    command to be executed
-    -    -    -    -
|    |    |    |    |
|    |    |    |    +----- day of week (0 - 6) (Sunday = 0)
|    |    |    +---------- month (1 - 12)
|    |    +--------------- day of month (1 - 31)
|    +-------------------- hour (0 - 23)
+------------------------- minute (0 - 59)

Start by entering your crontab file:

crontab -e
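
If you want to see what is already scheduled before opening the editor, you can print the current crontab first:

crontab -l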

There may already be some entries in this file, so make sure you do not edit any of them. Scroll down to the last entry and insert a new line. On the new line, enter the new crontab entry:

0    6    *    *    *    /s3sync/backupscript

This will run your backup script at 0 minutes past the 6th hour (6:00 AM) every day. To change the time, edit the 0 for minutes or the 6 for hours (use 24-hour time). This works off of your server’s clock, so if your host is in a different timezone, the backup might not occur when you expect it to. Finally, make sure the path to the script is the absolute path.
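
As an optional variation, you can capture cron’s own output in a log file, which makes automation problems much easier to diagnose. The log path below assumes the same root-level layout as the rest of this tutorial (I keep it outside s3backup so the script’s cleanup step does not delete it):

0    6    *    *    *    /s3sync/backupscript >> /s3sync/cron.log 2>&1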

Final Words

There are many local factors that can affect this script. If it is not working, I would walk back through the tutorial and make sure your file tree is laid out the way it is supposed to be. Everything we created and edited in this tutorial belongs inside the /s3sync/ directory.

One of the problems I struggled with was that the script ran fine manually but would not transfer the files to Amazon S3 when automated by the crontab. If this problem occurs, run the “set” command inside your script and directly from the terminal, and compare the environment variables (more on “set”). Any differences you find might need to be manually adjusted in your script. For me, it was the “PATH” variable.
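
For example, if you run into the same PATH issue I did, one fix is to set the PATH near the top of the script, copying whatever value “set” reports in your login shell (the value below is only an illustration):

# Hypothetical PATH; use the one "set" prints in your own login shell.
PATH=/usr/local/bin:/usr/bin:/bin
export PATH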

I am not a pro with the terminal, but I can try to help you troubleshoot any problems you might have when you set up your script. Just drop me a comment below and I will definitely try my best to help you out.

Helpful Links