AWS Glue 2.0 - Realizing the Benefit of Managed Services

John Chapin
Aug 18, 2020 · 3 min read

Serverless isn't just Lambda

Here at Symphonia, we're proponents of all things Serverless. This means that in addition to technologies like AWS Lambda, we make heavy use of managed backends-as-a-service that address concerns like scaling, resource allocation, and process management, while giving us and our clients granular billing with minimal fixed costs. However, one of the most important benefits of managed services is that they improve without any effort from us.

The recent release of Glue 2.0 demonstrates this perfectly. In this case, the AWS teams responsible for Glue spend their time fixing bugs, improving performance, and reducing costs - and we benefit from that work without having to lift a finger.

Faster, Cheaper, Better: Pick 3

Glue 2.0 brings not only new features, but also faster startup and job execution times, more granular billing (per-second instead of per-minute) and an order-of-magnitude reduction in the minimum cost of a job (1 minute instead of 10 minutes). This means that not only will our jobs start sooner and finish faster, but our costs will also more closely track the actual runtime (especially for short-duration jobs).

We have a few internal Glue ETL jobs that run regularly at Symphonia. Upgrading those jobs was trivial: all we did was change the Glue version number from 1.0 to 2.0 in our CloudFormation template. The difference in performance, however, was far from trivial.
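To make the size of that change concrete, here is a minimal sketch in Python that builds a CloudFormation-style `AWS::Glue::Job` resource twice and shows that only `GlueVersion` differs. This is not our actual template: the logical ID `EtlJob`, the job name, the role ARN, and the script location are all hypothetical placeholders.

```python
import json


def glue_job_resource(glue_version: str) -> dict:
    """Build a minimal AWS::Glue::Job resource; only GlueVersion varies."""
    return {
        "Type": "AWS::Glue::Job",
        "Properties": {
            "Name": "example-etl-job",  # hypothetical job name
            "Role": "arn:aws:iam::123456789012:role/ExampleGlueRole",  # placeholder ARN
            "GlueVersion": glue_version,
            "Command": {
                "Name": "glueetl",
                "ScriptLocation": "s3://example-bucket/scripts/etl.py",  # placeholder path
                "PythonVersion": "3",
            },
        },
    }


before = glue_job_resource("1.0")
after = glue_job_resource("2.0")

# Collect the top-level properties whose values differ between the two versions.
changed = {
    key
    for key in after["Properties"]
    if after["Properties"][key] != before["Properties"][key]
}

# Wrap the upgraded resource in a template skeleton and print it as JSON.
template = {"Resources": {"EtlJob": after}}
print(json.dumps(template, indent=2))
print("Changed properties:", sorted(changed))
```

The set of changed properties contains only `GlueVersion`, which is the whole upgrade as far as the template is concerned.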

Glue job start-up and execution times, over the last 10 runs

Can you spot where I changed the Glue version? Not only did the start-up times plummet from an average of nearly 10 minutes to just under 10 seconds, but the job execution times dropped from an average of ~23 minutes to ~17 minutes. Those two performance improvements combined to give us a nearly 40% reduction in run time.

We recommended this change to clients, and soon heard reports back of similar performance improvements. Even more notable is that for one of our clients, the percentage cost savings exceeded the performance improvement. That particular client runs dozens of short-duration Glue jobs, each of which might only take a few minutes to execute. They're no longer paying the minimum 10-minute charge for each of those jobs - now they only pay for the few minutes when their jobs are actually doing work.

For that client, the trivial upgrade from Glue 1.0 to 2.0 not only has their jobs starting and running faster, but they're now paying 70% less for AWS Glue!
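The effect of the lower billing minimum can be sketched with a little arithmetic. The Python below is an illustration only: it assumes the same capacity and hourly rate under both versions (so they cancel out), ignores the faster execution times, and uses hypothetical job durations rather than the client's real ones.

```python
def billed_seconds(runtime_seconds: int, minimum_seconds: int) -> int:
    """A job is billed for its runtime or the minimum, whichever is larger."""
    return max(runtime_seconds, minimum_seconds)


def savings_fraction(
    runtime_seconds: int,
    old_minimum_seconds: int = 600,  # 10-minute minimum (Glue 1.0)
    new_minimum_seconds: int = 60,   # 1-minute minimum (Glue 2.0)
) -> float:
    """Fractional cost reduction from the lower minimum, at an equal rate."""
    old = billed_seconds(runtime_seconds, old_minimum_seconds)
    new = billed_seconds(runtime_seconds, new_minimum_seconds)
    return 1 - new / old


# Hypothetical job durations, in minutes.
for minutes in (1, 3, 5, 10, 20):
    fraction = savings_fraction(minutes * 60)
    print(f"{minutes:>2} min job: {fraction:.0%} lower billed time")
```

Under these assumptions a three-minute job is billed for 3 minutes instead of 10, a 70% reduction, while a job that already runs for 10 minutes or longer sees no change from the minimum alone.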

Trade-offs all the way down

Now of course there are trade-offs when picking managed services over those you host or operate yourself. New features, bug fixes, and operational improvements are no longer your responsibility, but they're also out of your control. Managed services typically also have higher costs - remember, you're paying for someone else to build and operate them.

At Symphonia, we help our clients sort through the laundry list of AWS services to make the right build-versus-buy decisions. We then work alongside management and engineering teams to implement scalable, cost-efficient, AWS-native solutions. And, while we're AWS specialists, we're not afraid to steer clients away from AWS solutions and services that create more problems than they solve.

If that sounds interesting, feel free to send us an email - we'd love to hear from you.