AWS re:Invent – Driving Innovation with Container Architecture – CON203
Driving Innovation with Container Architecture – CON203 – re:Invent 2017 – Live Notes
Walking into CON203 at re:Invent made me thankful that I had a reserved seat. The line for walk-ups was incredible. The session was held in the Aria complex, home to a good portion of the Container and DevOps tracks at re:Invent.
This session, hosted by Asif Khan (AWS), Chris Tava and Josh Davis (WeWork), and Radek Wierzbicki, was a combination of patterns for refactoring and deploying containers, wrapped around the re-platforming of a critical WeWork application.
Asif Khan opened with an overview of containers and schedulers: the usual Build/Ship/Run story combined with breaking down monolithic apps into microservices. Finally, Asif gave an overview of Amazon’s ECS platform to set the stage for the gents from WeWork.
Chris Tava – WeWork – gave an overview of the problem statement, which can be summarized as follows:
WeWork’s application for construction management was deployed in a monolithic manner on top of IaaS instances. Problems with the underlying hardware dictated that the app needed to be re-platformed. Unfortunately, the existing deployment methodology offered no good way to simply re-provision, because the standard for deployment was scattered among Chef, scripts, and manual effort. To power the application’s more than 80K users, WeWork needed a new model for deployment.
Josh Davis – WeWork – went over most of the details as the session gathered steam:
Long story short, WeWork moved to Amazon ECS for container orchestration, coupled with Docker for developers, Jenkins for deployment orchestration, CloudFormation for infrastructure orchestration (plus needed environment parameters), a Git repo, and JFrog Artifactory for Docker image hosting. Other AWS services, including KMS, ELB, and perhaps RDS, were also leveraged.
- Moving all services to containers and abstracting key environmental values helped reduce some of the environment promotion trouble that can normally occur when code is promoted from one environment to the next.
- Developers leverage docker-compose to run the environment on their laptops for development purposes, while deployments leverage CloudFormation (Cfn). That makes the development process Build, Build the Ship, Ship, Run. This still seems a bit kludgy but is likely the best solution at the moment for both developers and operations.
- Each service has a git repo with the Dockerfiles and Cfn needed to do deployments that aren’t laptop-based. The Cfn includes info about the ECS service resource, task definition, and any needed IAM roles.
- A master environment repo describes shared services common to the application: IAM, KMS, etc.
- Stack parameters are versioned with application code allowing for environment versioning.
- DB Schema Updates – It appears that WeWork is taking advantage of RDS or, if not, standard MySQL. Deploying schema changes is a two-step process: first, a bit of CloudFormation runs a single container-based execution of a process that updates the DB schema; then a second call to CloudFormation updates any needed infrastructure.
- Education/documentation was (and is) needed for the development teams around this new deployment methodology. This includes generic CloudFormation templates that can be leveraged for each service, as well as guidance on how to build and run containers and how to build effective container images.
- KMS is leveraged to encrypt secrets needed for the execution of the distributed app. This allows IAM to own which developers have rights to encrypt/decrypt appropriate secrets.
- All of this is rolled into Jenkins. Jenkins builds the container images and stores them in Artifactory while deploying the infrastructure and executing the needed tasks.
- Joyent ContainerPilot is used to communicate effectively with Consul for service discovery. Consul itself runs in ECS.
- Sidecar processes for logging, event management, and Datadog integration were key in enabling the appropriate aggregation and collection of data from the application’s many components.
- With any containerized platform, logging into boxes to review errors is generally an anti-pattern. Make sure that you have your bases covered when it comes to logging from your app, as well as capturing stdout and stderr. Build process around this.
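To make the point about versioning stack parameters alongside application code a little more concrete, here is a minimal Python sketch. The `params/<environment>.json` layout, the parameter names, and the `load_stack_parameters` helper are all hypothetical and were not shown in the session; the only thing borrowed from CloudFormation is the `ParameterKey`/`ParameterValue` list shape that its stack APIs accept.

```python
import json
import tempfile
from pathlib import Path


def load_stack_parameters(repo_root, environment):
    """Read params/<environment>.json and return CloudFormation-style parameters.

    The params/ layout is an assumption for illustration; the talk did not
    describe how WeWork lays out its parameter files.
    """
    path = Path(repo_root) / "params" / f"{environment}.json"
    values = json.loads(path.read_text())
    # CloudFormation takes a list of ParameterKey/ParameterValue pairs,
    # with every value passed as a string.
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(values.items())
    ]


def demo():
    """Build a throwaway repo with two environments and load one of them."""
    with tempfile.TemporaryDirectory() as repo:
        params_dir = Path(repo) / "params"
        params_dir.mkdir()
        (params_dir / "staging.json").write_text(
            json.dumps({"DesiredCount": 1, "ImageTag": "abc123"})
        )
        (params_dir / "production.json").write_text(
            json.dumps({"DesiredCount": 4, "ImageTag": "abc123"})
        )
        return load_stack_parameters(repo, "production")


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

Because the parameter files live in the same repo as the code, checking out a given commit recovers both the application and the environment values it was deployed with.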
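The two-step schema deployment from the notes above can be sketched as a simple ordering rule. This is only an illustration: `run_schema_migration` and `update_infrastructure` are hypothetical stand-ins for the two CloudFormation executions, and aborting when the migration fails is my assumption rather than something stated in the session.

```python
from typing import Callable, List


def deploy_release(
    run_schema_migration: Callable[[str], bool],
    update_infrastructure: Callable[[str], None],
    version: str,
) -> List[str]:
    """Run the schema migration first; update infrastructure only if it succeeds.

    Both callables are hypothetical stand-ins for the two CloudFormation
    executions mentioned in the notes. Returns the steps that actually ran.
    """
    steps = ["migrate"]
    if not run_schema_migration(version):
        # Assumed behavior: stop here rather than roll out services
        # against a schema that failed to update.
        steps.append("aborted")
        return steps
    update_infrastructure(version)
    steps.append("update")
    return steps


if __name__ == "__main__":
    updated = []
    print(deploy_release(lambda v: True, updated.append, "1.4.0"))
    print(deploy_release(lambda v: False, updated.append, "1.5.0"))
    print(updated)
```

Passing the two steps in as callables keeps the ordering logic separate from whatever actually talks to AWS, which also makes it easy to exercise from a Jenkins job or a unit test.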
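On the logging point, one common approach (not necessarily what WeWork does) is to have each service write structured lines to stdout so that a sidecar or the container runtime can collect them. The sketch below uses only Python's standard `logging` module; the `JsonFormatter` class, the `build_logger` helper, and the service name are hypothetical.

```python
import io
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line for log collectors."""

    def format(self, record):
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            }
        )


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to the stream (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records out of the root logger
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Hypothetical service name, used only for the demo.
    build_logger("construction-app").info("service started")
```

Writing to stdout rather than to files inside the container means nobody has to log into a box to find the output.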
Radek Wierzbicki from a consulting partner gave a bit of an overview of ECS scheduling and the session wrapped up. All in, very impressed with the folks from WeWork. Thoughtful content.