AWS EFS & S3 Case Study

 

I recently helped a customer migrate from Azure to AWS. Their storage needs were unusual enough that I thought they would make good material for a blog post. What made this particular design interesting from a storage perspective was the customer’s syncing requirements.

Background

Customer X was having some issues with their Azure environment and wanted to run an MVC (Minimum Viable Cloud ;) in AWS to test and decide whether a full migration was in the cards. To this end we spun up a CFT (CloudFormation template) that created a multi-AZ, 3-tier environment with ELBs (Elastic Load Balancers), ASGs (Auto Scaling Groups) for web servers and bastion hosts, and multi-AZ RDS (Relational Database Service) with ElastiCache (Redis) fronting it.

fig 1. logical design

 

Best practices review:

Let me take a step back. When you have an ASG (Auto Scaling Group) that turns servers on and off based on load, you need a central repository of files that the servers can access independent of the state of any individual web server. You don’t want your ‘changeable’ files on the local web server, because that server can be shut down at any time (when utilization is low, or when the health of the system goes bad and the ASG forces a shutdown/rebuild).

On a web server, normally your codebase remains static and only the media files (files served up to the browser) are modified. This is a good use case for AWS EFS (Elastic File System), as this service can:

  1. Be accessed by ‘up to thousands’ of EC2 instances (the max we needed here was 16)
  2. Grow and shrink as you add and remove files
  3. Be accessed across multiple AZs (and even from on-premises if you are running a hybrid cloud environment)
  4. Deliver enough performance to serve web server media files
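To make the EFS piece concrete, here is a minimal sketch of how a web server's /etc/fstab entry for the media mount could be generated. The file system ID, region, and mount point in the usage note are placeholders I made up for illustration, not the customer's values; the NFS options are the ones AWS recommends for mounting EFS with a plain NFS client.

```python
def efs_fstab_entry(file_system_id: str, region: str, mount_point: str) -> str:
    """Build an /etc/fstab line that mounts an EFS file system over NFSv4.1.

    All three arguments are supplied by the caller; nothing here talks to AWS.
    """
    # EFS exposes a per-file-system DNS name inside the VPC.
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    # Mount options recommended by AWS for the standard Linux NFS client.
    options = ",".join([
        "nfsvers=4.1",
        "rsize=1048576",
        "wsize=1048576",
        "hard",
        "timeo=600",
        "retrans=2",
        "noresvport",
        "_netdev",  # wait for networking before mounting at boot
    ])
    return f"{dns_name}:/ {mount_point} nfs4 {options} 0 0"
```

Calling efs_fstab_entry("fs-12345678", "us-east-1", "/var/www/media") returns a single line that can be appended to /etc/fstab from the instance's launch configuration, so every ASG-spawned server mounts the same media share on boot.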

What we did:

I said ‘normally’ because this use case required rapid modifications of *both* the codebase and the media files. Initially we tested running both the codebase and the media from EFS, but that proved to be too slow across the wire. The final solution used both EFS and S3 to meet the customer’s requirements without jeopardizing TTFB (Time To First Byte) or the overall speed and responsiveness of the site:

  1. Mount the media via EFS, per normal AWS best practices
  2. Create an on-demand sync job that syncs the codebase from S3 to the ASG-spawned web servers
  3. Create a bash script on the bastion host that queries all existing web servers, then triggers the sync job on each of them on demand
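The steps above can be sketched roughly as follows. This is not the customer's script (that one is bash); it is a Python illustration of the same fan-out idea, and the ASG name, bucket, prefix, destination path, and the ec2-user login are all assumptions made for the example. It assumes the bastion has the AWS CLI and SSH access to the web servers, and that each web server has the AWS CLI and read access to the bucket.

```python
import shlex
import subprocess
from typing import Callable, Dict, Iterable, List, Sequence

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def list_hosts_command(asg_name: str) -> List[str]:
    """AWS CLI call that prints the private IPs of running instances in one ASG."""
    return [
        "aws", "ec2", "describe-instances",
        "--filters",
        # Auto Scaling tags every instance it launches with its group name.
        f"Name=tag:aws:autoscaling:groupName,Values={asg_name}",
        "Name=instance-state-name,Values=running",
        "--query", "Reservations[].Instances[].PrivateIpAddress",
        "--output", "text",
    ]


def parse_hosts(cli_output: str) -> List[str]:
    """Text output is whitespace-separated, so a plain split is enough."""
    return cli_output.split()


def sync_command(bucket: str, prefix: str, dest: str) -> List[str]:
    """Command each web server runs to pull the codebase down from S3."""
    return ["aws", "s3", "sync", f"s3://{bucket}/{prefix}", dest]


def ssh_command(host: str, remote_argv: Sequence[str], user: str = "ec2-user") -> List[str]:
    """Wrap a remote command in a non-interactive ssh invocation."""
    return [
        "ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10",
        f"{user}@{host}", shlex.join(remote_argv),
    ]


def run_local(argv: Sequence[str]) -> int:
    """Default runner: execute the command on the bastion and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def trigger_syncs(hosts: Iterable[str], remote_argv: Sequence[str],
                  run: Runner = run_local) -> Dict[str, int]:
    """Run the sync on each host in turn and report exit codes per host."""
    return {host: run(ssh_command(host, remote_argv)) for host in hosts}
```

On the bastion you would run list_hosts_command for your ASG, feed its output through parse_hosts, and pass the resulting list plus sync_command(...) to trigger_syncs. Running the hosts one after another keeps the sketch simple; with at most 16 web servers that was an acceptable trade-off for an illustration.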

Now, you might be asking ‘why didn’t you automate the S3 sync?’. We found that the automated sync jobs didn’t finish on time: they stacked on top of each other and bogged the system down with unfinished syncs.
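For readers who still want a scheduled sync, the stacking problem is the thing to guard against. The sketch below is not what we deployed (we went on demand instead); it only illustrates one common guard, a non-blocking file lock, so that a new run skips itself while an earlier one is still in progress. The lock path is a hypothetical name, and the code is Linux/Unix only because it relies on fcntl.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path: str):
    """Yield True if this run holds the lock, False if another run already does."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of queueing behind
            # the running job, which is what prevents syncs from piling up.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        try:
            yield True
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A scheduled job would wrap its sync in `with single_instance("/tmp/code-sync.lock") as acquired:` and simply exit when acquired is False, leaving the earlier run to finish.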

As a final note, I’ll add the (sanitized) script here that we run from the bastion host to trigger the syncs on the web servers. You can use this to update/refresh EBS volumes attached to your EC2 instances:

refresh EBS volumes snip