Using EC2 Instance Store (SSD) with Amazon Elastic Beanstalk and Docker

Did you know that you can use the local physical storage device attached to your EC2 instance with AWS Elastic Beanstalk, without having to create a custom AMI? I'm going to take a guess and say probably not. At the time of this post, Amazon doesn't document this very well, and it isn't explained well anywhere else either. I have nothing against AWS's documentation, which is generally excellent, but a few topics here and there just aren't covered properly. This is definitely one of them.

If you connect to an AWS Elastic Beanstalk EC2 instance directly via SSH and run the usual lsblk command to see what devices are available, the instance store (local physical storage) device will not be listed at all. By default it is not mapped, so you cannot format or partition it. Even though the EC2 instance you already pay for may have a real physical drive attached to it, you can't see it or do anything with it.

When you launch a traditional EC2 instance, you have the option to attach the instance store, and once you connect via SSH the drive(s) show up in the output of lsblk. Keep in mind that not all EC2 instance types have local storage available, but many of them do.

The solution

To use the instance store on AWS Elastic Beanstalk you just need a little bit of additional magic applied via a .config file in your .ebextensions directory. You will need to modify this to suit your exact needs, but it will probably look something like this:

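option_settings:
  aws:autoscaling:launchconfiguration:
    # First entry: EB's default 12 GB gp2 volume for Docker images and containers.
    # Second entry: maps the first instance store device to /dev/xvdb.
    BlockDeviceMappings: /dev/xvdcz=:12:true:gp2,/dev/xvdb=ephemeral0

commands:
  01_format_and_mount_ssd:
    command: |
      # Commands run on every deployment, so skip the setup if already mounted.
      if mount | grep ssd0 > /dev/null; then
        echo /media/ssd0 \(instance store\) is already mounted
      else
        # Formatting erases whatever is on the instance store device.
        mkfs -t ext4 /dev/xvdb
        mkdir -p /media/ssd0
        mount /dev/xvdb /media/ssd0
      fi
      # Restart Docker so it sees the mount, then restart the ECS agent container.
      service docker restart && docker restart ecs-agent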

EB's default Docker multi-disk configuration

First, one of the most important things to note: if you are using Multi-Container Docker with AWS EB, you absolutely need the entry for /dev/xvdcz=:12:true:gp2 in BlockDeviceMappings. This is the default setting that EB uses to create a separate EBS volume for Docker images and containers. It's just invisible to you when doing a deployment. You can see that it exists if you launch an environment and use the ebcli to view its configuration. EB Docker will still use your root device for volumes. The separate disk configuration exists because of a performance issue with Docker. The issue may now be resolved, but EB still uses this type of configuration. You can also increase the size of this volume if you wish, but your Docker images are most likely not larger than 12GB, and if they are, you probably have an entirely different problem to tackle.
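To make the BlockDeviceMappings value a little less cryptic, here is a small sketch that splits it into its entries. The function name list_block_device_mappings is my own invention, and the "instance store" versus "EBS volume" labels are simply how this sketch classifies entries (anything whose mapping does not start with ephemeral is treated as an EBS specification). As far as I understand the format, the empty field before the first colon is where a snapshot ID would go.

```shell
# Hypothetical helper: print one "device<TAB>kind<TAB>mapping" line per entry
# of a BlockDeviceMappings value. Entries are comma-separated, and each entry
# has the form "<device>=<mapping>".
list_block_device_mappings() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS='=' read -r device mapping; do
    # Skip empty entries (for example a trailing comma).
    [ -n "$device" ] || continue
    case "$mapping" in
      ephemeral*) kind="instance store" ;;
      *)          kind="EBS volume" ;;
    esac
    printf '%s\t%s\t%s\n' "$device" "$kind" "$mapping"
  done
}
```

Calling list_block_device_mappings '/dev/xvdcz=:12:true:gp2,/dev/xvdb=ephemeral0' prints two lines: the Docker EBS volume on /dev/xvdcz and the instance store device on /dev/xvdb.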

Find EC2 instance's device names

You will need to know the device name to use for each instance store device. This varies depending on what type of EC2 instance you are using. For c3 types, they should be /dev/sdb and /dev/sdc. For r3 types, you will use /dev/xvdb and /dev/xvdc. I don't have a list for the other EC2 instance types because I don't think one exists yet. Oddly enough, some instance stores come pre-formatted and some do not, so you'll need to experiment. To find out what your instance store device should be called, launch a regular (non-Beanstalk) EC2 instance and be sure to check off the option to make the instance store available. Then connect to it via SSH and look at the output of the lsblk command.
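If you'd rather not eyeball the lsblk output, a rough filter like the one below can narrow it down. This is only a sketch: unmounted_disks is a hypothetical name, it expects the output of lsblk -rno NAME,TYPE,MOUNTPOINT on stdin, and it decides that a disk is "in use" with a simple name-prefix match against partitions, which is an assumption that may not hold for every device naming scheme.

```shell
# Hypothetical helper: read `lsblk -rno NAME,TYPE,MOUNTPOINT` output on stdin
# and print the /dev path of every whole disk that has no mount point and no
# partitions. Those are the likely instance store candidates.
unmounted_disks() {
  awk '
    $2 == "disk" && $3 == "" { disks[++n] = $1 }   # whole disk, not mounted
    $2 == "part"             { parts[++m] = $1 }   # remember every partition
    END {
      for (i = 1; i <= n; i++) {
        used = 0
        # Assumption: a partition name starts with the name of its disk.
        for (j = 1; j <= m; j++)
          if (index(parts[j], disks[i]) == 1) used = 1
        if (!used) print "/dev/" disks[i]
      }
    }
  '
}
```

On the test instance you would run it as lsblk -rno NAME,TYPE,MOUNTPOINT | unmounted_disks and then double-check the result against the plain lsblk listing.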

The 'labels' to be used for the instance store will be ephemeral0 and ephemeral1 (if your EC2 has two physical block storage devices attached).
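The pairing of device names and labels is simple enough to sketch in a few lines of shell. The helper name ephemeral_mappings is hypothetical. It just numbers the devices you pass in, starting at ephemeral0, and joins them with commas so the result can be appended to BlockDeviceMappings.

```shell
# Hypothetical helper: build the instance store part of a BlockDeviceMappings
# value. Each device passed as an argument gets the next ephemeralN label.
ephemeral_mappings() {
  index=0
  result=""
  for device in "$@"; do
    # Add a comma separator only when result already holds an entry.
    result="${result:+$result,}${device}=ephemeral${index}"
    index=$((index + 1))
  done
  printf '%s\n' "$result"
}
```

For an r3 type with two drives, ephemeral_mappings /dev/xvdb /dev/xvdc prints /dev/xvdb=ephemeral0,/dev/xvdc=ephemeral1.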

Avoid failed re-deployments

The if check exists because this set of commands in your .ebextensions directory runs each and every time you do a deployment. Without it, the mount command would fail on a drive that is already mounted, and your deployment would fail with it. As an alternative, you could add the entry ignoreErrors: true to the command, but I feel like the if check is the 'cleaner' way to do this. To each their own though, I suppose.
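One thing to be aware of: grep ssd0 matches any line of mount output that contains that text, not just the exact mount point. If that bothers you, a stricter check could look something like the sketch below. The function name is_mounted and its optional second argument are my own additions. It compares the mount point column of /proc/mounts exactly.

```shell
# Hypothetical helper: succeed only if MOUNT_POINT appears as an exact mount
# point in the mounts table (the second column of /proc/mounts).
# Usage: is_mounted MOUNT_POINT [MOUNTS_FILE]
is_mounted() {
  mount_point=$1
  mounts_file=${2:-/proc/mounts}
  awk -v mp="$mount_point" '
    $2 == mp { found = 1 }
    END { exit !found }   # exit status 0 when found, 1 otherwise
  ' "$mounts_file"
}
```

In the .config file this would replace the grep line with if is_mounted /media/ssd0; then, with the function defined at the top of the same command block.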

Make disks available to Docker

After the drive is mounted, the docker service must be restarted with service docker restart. Otherwise, Docker will not see the drive or be able to access it. Lastly, and probably the most important piece of this configuration, you must then restart the ecs-agent container with docker restart ecs-agent. I could not find this part documented anywhere - not even in a random blog post somewhere on the internet. If you don't restart the ecs-agent after restarting the docker service, your deployment will absolutely fail, and your logs won't be very helpful in determining why, because ecs-agent is no longer reporting anything or doing its usual magic for you.
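The one-liner in the config has worked for me as-is, so treat the following as optional hardening rather than a required step. It is a sketch of a small retry helper (wait_for is a hypothetical name) that could be used to wait until the Docker daemon answers before restarting the agent. That the daemon needs a moment after a restart is my assumption, not something I have measured.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds or the
# attempts run out. Output of the command is discarded.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    # Only sleep when another attempt is still coming.
    if [ "$attempts" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

With this in place, the last line of the command block could become service docker restart && wait_for 10 1 docker info && docker restart ecs-agent.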

To use the mounted drive within your Docker containers, you will still need to map the volumes in your Dockerrun.aws.json file. However, that's a subject for another blog post some other day.

I hope this post helps you attain your goals with deployments on Amazon Elastic Beanstalk. Steem on!