Continuous Integration / Continuous Delivery on AWS
An introduction to CodeBuild and CodePipeline
Continuous Integration and Continuous Delivery (CI/CD) are techniques that I've been passionate about for a long time. Back in the Dim And Distant Past of 2003 I even co-led an open source project that brought some at-the-time interesting innovations to this area.
Fast forward to today: CI/CD, and deployment automation in general, are key to how John and I work with clients at Symphonia. It doesn't matter if you're using all the fanciest new cloud technology if you're still waiting weeks to put new functionality into production. We even put CI/CD front and center in our training courses: we help people create a Continuous Deployment pipeline before they start getting into the nitty-gritty of Serverless Architecture.
Since we use so many AWS services in our work, it makes sense that we also consider AWS tools for CI/CD. In this article I explain briefly what these tools are, and where I think their strengths and weaknesses lie.
What are the AWS Continuous Integration / Continuous Delivery tools?
First up, some background and terminology. AWS have a bunch of different products in this area, grouped into their “CodeSuite” developer tools:
- CodeCommit is their managed Git service. Think “GitHub Light”, useful if you want to keep your entire SDLC (Software Development LifeCycle) infrastructure in one AWS account.
- CodeBuild is their CI tool / managed build service.
- CodeDeploy is their deployment automation tool — think EC2 services, staggered release, etc.
- CodePipeline is their deployment pipeline service.
- CodeStar wraps these four things up into a newbie-friendly-ish combination.
- CloudFormation is a “close cousin” of these services, and is the AWS Infrastructure-as-Code (IaC) tool and service.
I'm going to focus for the remainder of this article on CodeBuild and CodePipeline.
CodeBuild is what is typically referred to as a Continuous Integration (CI) tool. You tell it where to get some source code from, give it a bunch of scripts to run in a shell environment (build, package, test, etc.), and it will run them in a container for you. It's cheap, it's simple, and it does the job. You can think of it as the AWS version of Travis. You can choose between different sizes of build host (small, medium, large); you can use one of the predefined images as your build environment, or define your own Docker image; and you can, if you want, cache data across build runs.
CodePipeline is a CD pipeline orchestrator. You don't give it scripts to run — you give it a sequence of actions, which are links to other services. A typical pipeline would be:
Source → Build + Test → Deploy
The Source action can be GitHub, CodeCommit, etc. Build + Test can be … well, I'll get to that. For Deploy we typically use CloudFormation (which may partially use CodeDeploy under the covers, implicitly, but ignore that for now). CodePipeline can also have manual actions, e.g. for manual testing, approval, and/or to model manual steps in a deployment value stream map.
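To make the shape of such a pipeline concrete, here is a minimal sketch in Python that models a pipeline as an ordered list of stages, each holding one or more actions. The class names (`Action`, `Stage`, `Pipeline`) and the stage and action names are invented for illustration; this is not the CodePipeline API, just a picture of the structure described above, including a manual approval step.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    """One step in a stage; `provider` names the service that does the work."""
    name: str
    provider: str  # e.g. "GitHub", "CodeBuild", "CloudFormation", "Manual"


@dataclass
class Stage:
    """A named group of actions; stages run one after another."""
    name: str
    actions: List[Action] = field(default_factory=list)


@dataclass
class Pipeline:
    name: str
    stages: List[Stage] = field(default_factory=list)

    def describe(self) -> str:
        """Render the pipeline as 'Stage (Provider) -> Stage (Provider) ...'."""
        parts = []
        for stage in self.stages:
            providers = " + ".join(action.provider for action in stage.actions)
            parts.append(f"{stage.name} ({providers})")
        return " -> ".join(parts)


def example_pipeline() -> Pipeline:
    # Hypothetical names throughout; only the overall shape matters here.
    return Pipeline(
        name="example",
        stages=[
            Stage("Source", [Action("checkout", "GitHub")]),
            Stage("BuildAndTest", [Action("build", "CodeBuild")]),
            Stage("Approve", [Action("manual-approval", "Manual")]),
            Stage("Deploy", [Action("deploy-stack", "CloudFormation")]),
        ],
    )


print(example_pipeline().describe())
```

The point of the model is that the pipeline itself holds no build logic: each action only names the service that does the work.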
Both CodeBuild and CodePipeline can themselves be completely defined in their own CloudFormation templates, so you can use Infrastructure-as-Code (IaC) principles for these components too. Which means your CD definitions themselves are version controlled. Fab.
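As a rough illustration of what "defined in CloudFormation" can look like, the sketch below builds a minimal JSON template for a single CodeBuild project. The function name, project name, repository URL and role ARN are placeholders I have made up, and the image name is an assumption to check against the current AWS documentation. A real template would also need the IAM role itself and, most likely, more configuration than is shown here.

```python
import json


def codebuild_template(project_name: str, repo_url: str, role_arn: str,
                       compute_type: str = "BUILD_GENERAL1_SMALL") -> dict:
    """Return a minimal CloudFormation template (as a dict) for one CodeBuild project."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "BuildProject": {
                "Type": "AWS::CodeBuild::Project",
                "Properties": {
                    "Name": project_name,
                    # Placeholder: the role must already exist and allow CodeBuild to assume it.
                    "ServiceRole": role_arn,
                    "Source": {"Type": "GITHUB", "Location": repo_url},
                    "Artifacts": {"Type": "NO_ARTIFACTS"},
                    "Environment": {
                        "Type": "LINUX_CONTAINER",
                        # The build host size: SMALL, MEDIUM or LARGE.
                        "ComputeType": compute_type,
                        # Assumed image name; one of the predefined build images.
                        "Image": "aws/codebuild/standard:7.0",
                    },
                },
            }
        },
    }


# Hypothetical values, for illustration only.
template = codebuild_template(
    project_name="example-build",
    repo_url="https://github.com/example-org/example-repo",
    role_arn="arn:aws:iam::123456789012:role/example-codebuild-role",
)
print(json.dumps(template, indent=2))
```

Because the output is an ordinary text file, it can live in the same repository as the application and be reviewed like any other change.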
Some confusion can arise when looking at these two tools, since CodeBuild can either run entirely independently, defining its own source control configuration, or be an embedded action within a CodePipeline, where source control configuration is CodePipeline's responsibility. But since CodeBuild can run arbitrary scripts, why would you want to add the complexity of using CodePipeline as well?
1. Use standalone CodeBuild if you don't want pipeline orchestration, i.e. “doing CD in a CI tool”. As I said above, it's basically AWS Travis.
2. Use CodeBuild embedded in CodePipeline if you want to be able to retry pipelines at particular points, want to spin up multiple actions in parallel (e.g. for cross region build, test or deployment), etc. CodePipeline can also be more efficient than CodeBuild by itself since it can overlap concurrent executions.
The nice thing is that there's no inescapable decision here: you can start with (1) and migrate to (2) as needs arise (and back again). That being said, at Symphonia we typically start with a CodePipeline instance anyway, but technique (1) is fine in many cases.
Either way you'll very likely end up using CodeBuild to define how your application is built, packaged, and tested. CodeBuild can run any number of sequenced actions, but we recommend putting them all into a shell script so that you can more easily test your build and test flow.
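The recommendation above is for a shell script. To keep the examples in this article in one language, here is the same idea sketched in Python instead: a single entry point that runs the build steps in order and stops at the first failure. The function name `run_steps` and the step commands are placeholders of my own; substitute your real build, package and test commands.

```python
import subprocess
import sys
from typing import Sequence


def run_steps(steps: Sequence[Sequence[str]]) -> int:
    """Run each command in order; stop at the first failure and return its exit code."""
    for command in steps:
        # Echo the command first so the build log shows what ran.
        print("+ " + " ".join(command), flush=True)
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0


# Placeholder steps: each just prints a message via the current Python interpreter.
STEPS = [
    [sys.executable, "-c", "print('build')"],
    [sys.executable, "-c", "print('package')"],
    [sys.executable, "-c", "print('test')"],
]
```

With a script like this, the CodeBuild configuration only needs to invoke one command, and you can run exactly the same flow on your own machine by calling `run_steps(STEPS)` and passing the result to `sys.exit`.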
Pros and Cons Of AWS CI/CD Tools
This is not an exhaustive list, but here are some of my opinions on the strengths and weaknesses of using AWS for CI/CD tooling. First, the pros:
- It's service based — no resource management; it has effectively infinite scaling; no worrying about security patching; etc.
- It's also self service — no central ‘CD Ops team’ necessary.
- It's definable by ‘code’ (well, YAML, but that's another Twitter thread…). No pointing and clicking is necessary at all in the AWS Console (but of course you can do that if you want).
- It's all in AWS, which means that its performance is decent, you can integrate all the other AWS services (IAM, customization with Step Functions and Lambda, CloudTrail events for monitoring, etc.), and the more knowledge you have about AWS in general, the more that's going to help with how you use these particular services.
- Most of the basic and medium functionality is covered. And it being AWS it's highly customizable if you want to do something that isn't covered.
- It's pretty cheap. CodeBuild is 0.5 cents per minute for a small compute instance, CodePipeline is $1 / month / active pipeline.
- And since it's just another AWS service it comes into the same bill as all your other AWS services.
And the cons:
- The UI for these tools isn't awesome. But UI isn't typically AWS’ strong point, as long-time users will know! Two aspects I'd particularly like to see improved are: (a) more “human oriented” views, e.g. “tell me about executions due to my commits” and (b) a more consistent UI across CodePipeline and CodeBuild. There are open source and other 3rd party projects available that layer different UIs on top of these tools, e.g. this one from Stelligent.
- Getting your head around some of the concepts can take some time. This is partly because of the benefits and drawbacks of CloudFormation.
- Neither CodePipeline nor CodeBuild supports sub-directory source-filtering for triggering executions. So they don't play nicely with monorepos, or even mini-monorepos. Yet. In theory you could implement this yourself, but it's not simple.
- CodePipeline doesn't support arbitrarily complicated graphs out of the box. It just supports a linear sequence of stages, where each stage has one action, or multiple parallel actions. This is sufficient for the majority of use cases, but if you want to do some fork-joining you need to use Step Functions — there are examples of this online.
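On the monorepo point above, the core decision a home-grown filter has to make is small, even if wiring it into a trigger is not. The sketch below shows only that decision: given the list of files changed by a commit, does any of them live under a given sub-directory? The function name and paths are hypothetical, and how you get the changed-file list (for example from a webhook payload or a `git diff`) and how you then start the pipeline are left out entirely.

```python
from pathlib import PurePosixPath
from typing import Iterable


def touches_subdirectory(changed_files: Iterable[str], subdirectory: str) -> bool:
    """Return True if any changed file is the sub-directory itself or lives beneath it."""
    root = PurePosixPath(subdirectory)
    for name in changed_files:
        path = PurePosixPath(name)
        # Compare whole path components, so "services/api-v2" does not match "services/api".
        if path == root or root in path.parents:
            return True
    return False


# Hypothetical commit touching two files in a monorepo.
changed = ["services/api/handler.py", "docs/README.md"]
print(touches_subdirectory(changed, "services/api"))
print(touches_subdirectory(changed, "services/web"))
```

Comparing path components rather than string prefixes is a deliberate choice: a plain `startswith` check would wrongly treat `services/api-v2/...` as part of `services/api`.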
John and I think that the AWS CI/CD tools are very valuable, and a great choice, especially if you're already deep into using the AWS cloud for your application hosting. The top 4 ‘pros’ above alone make operating these tools much simpler than many other options. However, if you want to invest heavily in something that might support monorepos better, or you want a better UI experience, then consider looking elsewhere.
Do you want to know more?
If you want to dig into CodePipeline and CodeBuild a little more we have some open source repos that you might find interesting:
- Simple getting started GitHub + CodePipeline: https://github.com/symphoniacloud/github-codepipeline
- One CodePipeline deploying to multiple regions: https://github.com/symphoniacloud/multi-region-codepipeline
- “Meta” CodePipeline to CD your CD: https://github.com/symphoniacloud/meta-code-pipeline
- An example app that uses all of these: https://github.com/symphoniacloud/lambda-benchmarking