ThinningS3Backups

Keeping daily backups on offsite, replicated, long-term, cheap storage is a great idea. That is why I use Amazon's Simple Storage Service (S3). However, the cost climbs as the number of backups grows. So instead of keeping every daily backup forever, I keep daily backups for a while, weekly backups for a little while longer, and monthly backups for the long term.

For example, if each backup is 1 GB, two years of daily backups would take up 730 GB at an annual cost of $1,314 at Amazon's current S3 rates. By thinning with the script below, only 7 daily, 5 weekly, and 24 monthly backups are kept, using 36 GB and costing only $65 per year.
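The arithmetic behind those figures can be checked with a few lines of Perl. This is only a sketch: the rate of $0.15 per GB per month is an assumption inferred from the totals above (1,314 / 730 / 12), not a quoted price, and annual_cost is a hypothetical helper. Substitute whatever rate applies to you.

```perl
use strict;
use warnings;

# Assumed rate in dollars per GB per month, inferred from the figures in the text.
my $RATE_PER_GB_MONTH = 0.15;

# Hypothetical helper: annual cost of storing $count backups of $size_gb GB each.
sub annual_cost {
    my ($count, $size_gb) = @_;
    return $count * $size_gb * $RATE_PER_GB_MONTH * 12;
}

my $unthinned = annual_cost(730, 1);          # two years of daily backups
my $thinned   = annual_cost(7 + 5 + 24, 1);   # 7 daily + 5 weekly + 24 monthly

printf "unthinned: \$%.2f per year\n", $unthinned;
printf "thinned:   \$%.2f per year\n", $thinned;
```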

The script is fully adjustable if you need more granularity, but in 30 years I've never needed more.
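Before changing the settings, it can help to try the retention rule on a few dates without touching S3. The following is a minimal sketch that mirrors the script's three checks as a pure function. The name keep_backup and its parameter names (days, weeks, months, wday, now) are hypothetical and not part of the script.

```perl
use strict;
use warnings;
use POSIX qw(mktime);

# Hypothetical helper: returns 1 if a backup dated ($y, $m, $d) would be kept.
# %p holds days, weeks, months and wday (same meaning as the script's settings)
# plus an optional now (epoch seconds; defaults to the current time).
sub keep_backup {
    my ($y, $m, $d, %p) = @_;
    my @now = localtime($p{now} // time());

    # Cutoffs sit one second before midnight, as in the script.
    my $thin_daily   = mktime(-1, 0, 0, $now[3] - $p{days},      $now[4], $now[5]);
    my $thin_weekly  = mktime(-1, 0, 0, $now[3] - 7 * $p{weeks}, $now[4], $now[5]);
    my $thin_monthly = mktime(-1, 0, 0, $now[3], $now[4] - $p{months}, $now[5]);

    my $filetime = mktime(0, 0, 0, $d, $m - 1, $y - 1900);
    my @t = localtime($filetime);

    # Recent enough to keep as a daily backup.
    return 1 if ($filetime > $thin_daily);

    # Inside the weekly window and on the chosen weekday.
    return 1 if ($filetime > $thin_weekly and $t[6] == $p{wday});

    # Day 0 of the next month is the last day of this month.
    my $lastday = (localtime(mktime(0, 0, 0, 0, $m, $y - 1900)))[3];
    return 1 if ((!$p{months} or $filetime > $thin_monthly) and $t[3] == $lastday);

    return 0;
}

# Example: evaluate a few dates as if today were 15 June 2010.
my %policy = (days => 7, weeks => 5, months => 24, wday => 0,
              now => mktime(0, 0, 12, 15, 5, 110));
for my $date ([2010, 6, 10], [2010, 6, 7], [2010, 6, 6], [2010, 4, 30]) {
    printf "%04d-%02d-%02d: %s\n", @$date,
        keep_backup(@$date, %policy) ? "keep" : "delete";
}
```

Passing a fixed now makes the result repeatable, so a changed policy can be checked against known dates before DRYRUN is ever turned off.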

The code could also be modified to thin hourly snapshots in Amazon's EC2/EBS down to a manageable number.

# WARNING: This program will delete files from your s3 bucket.
#     Furthermore, it's designed to delete backups!  If you are
#     careful enough to make backups, and you care enough to place
#     them into long term replicated storage, then be extra careful 
#     when deleting them.  Make sure you know what this script is 
#     doing before using it.  Don't just copy, paste, and run.
#     You take full responsibility.  Use at your own risk.  Caveat emptor ...
use POSIX;

$DRYRUN = 1;
	# Be careful about turning dryrun off.  It **WILL** delete your files.

$BUCKET = "your-bucket-name-here";

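$DAYSTOKEEP   = 7;
$WEEKSTOKEEP  = 5;
$MONTHSTOKEEP = 24;
	# 0 = keep monthly backups for all time

$KEEPWDAY = 0;
	# Which day of week to keep around (0 = Sunday, 4 = Thursday)
	# The value 0 here is an assumed default; choose your own.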

@NOW = localtime(time());
$THINDAILY   = mktime(-1,0,0,$NOW[3] - $DAYSTOKEEP,$NOW[4],$NOW[5]); 
$THINWEEKLY  = mktime(-1,0,0,$NOW[3] - 7*$WEEKSTOKEEP,$NOW[4],$NOW[5]); 
$THINMONTHLY = mktime(-1,0,0,$NOW[3],$NOW[4] - $MONTHSTOKEEP,$NOW[5]); 

open FILELIST, "s3cmd ls s3://$BUCKET |" or die "Cannot run s3cmd: $!";
while (<FILELIST>) {
	$_ = (split(/\s+/))[3]; # Filename is in 4th field

	next if ! m/backup_(\d{4})(\d{2})(\d{2})_(\d+)\.tgz$/;
		# Filename format must match "backup_YYYYMMDD_NNNNNN.tgz"
		# (date, then any run of digits).  Other filenames are
		# skipped.  To keep an arbitrary backup forever, rename it
		# to something that won't match,
		# e.g. "backup_20100516_KEEPME.tgz"

	$filetime = mktime(0,0,0,$3,$2-1,$1-1900);
	next if ($filetime > $THINDAILY);

	@T = localtime($filetime);
	next if ($filetime > $THINWEEKLY and $T[6] == $KEEPWDAY);

	$lastday = (localtime(mktime(0,0,0,0,$2,$1-1900)))[3]; # Day 0 of next month is last day of this month
	next if ((!$MONTHSTOKEEP or $filetime > $THINMONTHLY) and $T[3] == $lastday);

	$command = "s3cmd del $_";
	if ($DRYRUN) {
		print "$command\n";
	} else {
		system $command;
	}
}
close FILELIST;

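The filename filter can be tried in isolation before pointing the script at a real bucket. This is a minimal sketch: parse_backup is a hypothetical helper, not part of the script, and the sample listing lines are made up in the date / time / size / URL layout that the script assumes for "s3cmd ls" output.

```perl
use strict;
use warnings;

# Hypothetical helper: takes one line of "s3cmd ls" output and returns
# (url, year, month, day) for a thinnable backup, or an empty list otherwise.
sub parse_backup {
    my ($line) = @_;
    my $url = (split /\s+/, $line)[3];   # URL is the 4th whitespace-separated field
    return unless defined $url;
    return unless $url =~ m/backup_(\d{4})(\d{2})(\d{2})_(\d+)\.tgz$/;
    return ($url, $1, $2, $3);
}

# Made-up sample lines; the second uses the KEEPME rename trick.
my @sample = (
    "2010-05-16 03:00   1073741824   s3://your-bucket-name-here/backup_20100516_030000.tgz",
    "2010-05-16 03:00   1073741824   s3://your-bucket-name-here/backup_20100516_KEEPME.tgz",
);

for my $line (@sample) {
    my @hit = parse_backup($line);
    print @hit ? "thinnable: $hit[0]\n" : "skipped:   $line\n";
}
```

Only the first sample line is a candidate for thinning; the renamed file never reaches the date checks, so it can never be deleted by the script.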
