Hosting Yum Repository on S3

How to set up a yum repository on S3

Posted by Abdel Kamel on August 18, 2015

The problem

Say you would like to host a yum repository with 99.9% uptime due to an SLA requirement. One way to pull this off is to build some sort of HA solution with multiple instances behind a load balancer. Another is to just use S3.

The only catch is that you will need a tiny VM / EC2 instance with enough disk space to sync the bucket and update the repo metadata (repomd.xml). This VM is used only to update the repo.

Let's say you lose this VM because Amazon hates you. Your infrastructure wouldn't even notice: your clients will still be able to query the repository and function as expected. The only downside is that the repository will not be up to date until you get a new VM online.

So let's get to it. Hosting a yum repository on S3 is surprisingly easy. Here is how…

Setup IAM credentials

Create an EC2 instance with an IAM role whose policy looks similar to the following.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [ "s3:*" ],
      "Resource": "arn:aws:s3:::your-bucket-name",
      "Effect": "Allow"
    },
    {
      "Action": [ "s3:*" ],
      "Resource": "arn:aws:s3:::your-bucket-name/*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::*",
      "Effect": "Allow"
    }
  ]
}

Make sure to create your IAM role first, then attach it to your EC2 instance.
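
If you prefer doing this from the command line, the same role can be created with the awscli. This is only a sketch; the names yum-repo-role and yum-repo-profile are placeholders, and policy.json is assumed to hold the policy above.

$ aws iam create-role --role-name yum-repo-role \
    --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
$ aws iam put-role-policy --role-name yum-repo-role \
    --policy-name yum-repo-s3 --policy-document file://policy.json
$ aws iam create-instance-profile --instance-profile-name yum-repo-profile
$ aws iam add-role-to-instance-profile --instance-profile-name yum-repo-profile \
    --role-name yum-repo-role

You can then pass --iam-instance-profile Name=yum-repo-profile to aws ec2 run-instances when launching the instance.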

Setup Bucket Policy

Create an S3 bucket and add the following bucket policy. This bucket policy restricts access to specific IP addresses, which is convenient for network setups that have one egress IP address per VPC; otherwise it becomes a bit cumbersome to manage the whitelisting of IPs.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket-name/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": [ "123.123.123.123", "124.124.124.124" ]
        }
      }
    }
  ]
}
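
Assuming the policy above is saved as bucket-policy.json, creating the bucket and attaching the policy looks like this:

$ aws s3 mb s3://your-bucket-name
$ aws s3api put-bucket-policy --bucket your-bucket-name --policy file://bucket-policy.json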

The update script

This script requires the awscli tool, which can be installed via pip. Install awscli and the update script on the newly created server that is attached to the IAM role.

$ pip install awscli
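
To confirm the instance actually picked up the role credentials, a quick sanity check (assuming the bucket from the previous step):

$ aws sts get-caller-identity
$ aws s3 ls s3://your-bucket-name/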

/usr/bin/update_repo.sh

#!/usr/bin/env bash

YUMREPO_PATH="/mnt/yumrepo"
YUMREPO_CACHE="/mnt/yumrepo_cache"
S3_BUCKET="your-bucket-name"

# Create the repo layout and the createrepo checksum cache
mkdir -p "$YUMREPO_PATH"/{noarch,x86_64,SRPMS}
mkdir -p "$YUMREPO_CACHE"
cd "$YUMREPO_PATH" || exit 1

# Pull down the current state of the bucket
aws s3 sync --only-show-errors s3://"$S3_BUCKET"/ "$YUMREPO_PATH"/

# Regenerate the repo metadata for each architecture
for arch in x86_64 noarch SRPMS
do
    createrepo -c "$YUMREPO_CACHE" --update --deltas "$arch" > /dev/null
done

# Push the updated metadata (and any new packages) back up
aws s3 sync --delete --only-show-errors "$YUMREPO_PATH"/ s3://"$S3_BUCKET"/
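
Publishing a package then boils down to dropping the RPM into the right arch directory and running the script (the package name here is hypothetical):

$ cp my-package-1.0-1.x86_64.rpm /mnt/yumrepo/x86_64/
$ /usr/bin/update_repo.sh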

I would recommend running this script from a cron job.

* * * * * root /path/to/update_repo.sh
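
If a sync ever takes longer than a minute, runs will start to overlap. One way to guard against that, assuming flock(1) is available, is to serialize the runs with a lock file:

* * * * * root flock -n /var/lock/update_repo.lock /path/to/update_repo.sh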

The clients

To point your clients at the repo, create a my-s3yum.repo file in /etc/yum.repos.d that looks similar to the following example.

[s3yum-repo]
name=My S3 yum repository - $basearch
baseurl=https://s3-your-region.amazonaws.com/your-bucket-name/$basearch/
enabled=1
gpgcheck=0
metadata_expire=20
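
Once the file is in place, a quick way to verify that clients can reach the repository (the package name is hypothetical):

$ yum clean expire-cache
$ yum repolist
$ yum install my-package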

Next Steps

At this point you're done! One improvement you can make is to add lifecycle rules to your S3 bucket so that packages auto-expire. Depending on your environment, you can set up rules that move packages to Glacier or simply delete them after a specific amount of time.
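
As a sketch, here is what such a rule could look like via the awscli; the 90/365-day windows and the SRPMS/ prefix are arbitrary examples, not recommendations:

$ aws s3api put-bucket-lifecycle-configuration --bucket your-bucket-name \
    --lifecycle-configuration '{
      "Rules": [{
        "ID": "archive-old-srpms",
        "Status": "Enabled",
        "Filter": { "Prefix": "SRPMS/" },
        "Transitions": [{ "Days": 90, "StorageClass": "GLACIER" }],
        "Expiration": { "Days": 365 }
      }]
    }'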


