Auto scaling is an old selling point for AWS and cloud services in general, but a surprising number of production applications don’t auto-scale.
There are several legitimate reasons for this:
- Applications with a fixed input/output that have no reason to scale.
- Performance-insensitive applications (short-term slow-downs are no annoyance).
- Self-contained applications with no external connections.
There are also far too many applications lying around that should have been auto-scaled and never were (cough, cough, NZ Sky GO, cough). It is best to design your application to handle auto-scaling from the start, which gives us the option of using it when it becomes necessary. We don’t want to scare people away just because our web services are too slow, do we?
When we talk about building cloud applications we always mention auto-scaling and how all our applications should scale up and down as they need to. What we don’t really talk about are good practices to follow when we are building apps that need to scale. Here are some useful things to keep in mind when building applications for the cloud.
Keeping It Stateless
Keeping your applications stateless is a good practice to get into. It really helps when scaling out, because connections can be routed to different servers that may not have any local state stored on them. Storing the state outside the application (Memcached or Redis are good locations) allows important information to be loaded on the fly when an incoming connection gets routed to a new server.
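As a minimal sketch of what externalising state looks like, the snippet below saves and loads session state through anything with a Redis-style get/set interface (a dict-backed stand-in here, a redis-py client in production). The function and key names are illustrative assumptions, not from any specific library:

```python
import json
import uuid

class DictStore:
    """In-memory stand-in with the same get/set shape as a Redis client."""
    def __init__(self):
        self._data = {}
    def set(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data.get(key)

def save_session(store, session_id, state):
    # Serialise the state so any server in the fleet can read it back.
    store.set(f"session:{session_id}", json.dumps(state))

def load_session(store, session_id):
    raw = store.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else {}

# Any server handling the next request can recover the user's state:
store = DictStore()
sid = str(uuid.uuid4())
save_session(store, sid, {"user": "tim", "cart": ["coffee"]})
print(load_session(store, sid))  # same state, regardless of which server asks
```

Because the session lives in the shared store rather than in process memory, the load balancer is free to send the next request to any server in the fleet.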
Using Mature Clustered Backend Services
When we select data storage for our application we must make sure that it can scale when the need arises. The first step is to assess all the datastores available to us and select one that scales well. Once we have selected our datastore we need to set it up properly from the beginning. That means enabling clustering from day one and being ready to increase the size of the cluster when we need to. This helps to prevent the migration headache caused by needing to enable clustering on a database already in use.
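One small, concrete way to stay cluster-ready is to keep the full seed-node list in configuration, so growing the cluster is a settings change rather than a code change. The sketch below builds a multi-host connection URI in the common `scheme://host1,host2,…/db` shape used by clustered datastores; the helper and host names are illustrative assumptions:

```python
# Keep cluster topology in config, not code: adding a node means
# appending to this list, not rewriting connection logic.

def cluster_uri(scheme, seed_hosts, database):
    """Build a connection URI listing every seed node in the cluster."""
    hosts = ",".join(f"{host}:{port}" for host, port in seed_hosts)
    return f"{scheme}://{hosts}/{database}"

seeds = [("db-1.internal", 27017),
         ("db-2.internal", 27017),
         ("db-3.internal", 27017)]
print(cluster_uri("mongodb", seeds, "orders"))
# mongodb://db-1.internal:27017,db-2.internal:27017,db-3.internal:27017/orders
```

A client handed every seed node can discover the rest of the cluster and keep working as nodes are added, which is exactly the headache-free growth the paragraph above is after.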
Monitoring
One of the most underrated features required by scaling applications is monitoring. Proper monitoring of our systems lets developers see where services are performing the slowest and which resources those services are using the most. Building in monitoring, and alerting on failures, before the system is used in anger is very important, even if you don’t build nice graphs or tidy reports from day one.
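As a toy illustration of building monitoring in early, here is a timing decorator that records per-service latencies in memory; in a real system you would ship these numbers to CloudWatch, Prometheus, or similar. The service and function names are made up for the example:

```python
import time
from collections import defaultdict

# Latencies per logical service; a stand-in for a real metrics backend.
latencies = defaultdict(list)

def timed(service_name):
    """Decorator that records how long each call takes, keyed by service."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                latencies[service_name].append(time.perf_counter() - start)
        return inner
    return wrap

@timed("checkout")
def checkout(order):
    # Stand-in for real work.
    return f"processed {order}"

checkout("order-1")
print(max(latencies["checkout"]))  # slowest checkout call seen so far
```

Even this crude version answers the key question: which service is slowest, and is it getting worse? That data is what the scaling decisions below feed on.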
The Scaling Rollercoaster
There are a number of key metrics we can use to scale up and down on-demand. Here are a few of the good ones to use as input for managing the scaling.
Number of active connections
If you know your application performs poorly over a certain number of connections per server then the number of active connections is a good metric to start scaling up.
CPU usage
If your application is currently experiencing high CPU usage then it’s generally a good indicator that another server could help spread the load.
Disk I/O
This one is often skipped, but if your local disks are being taxed too hard, adding a few more disks to a cluster (or more provisioned IOPS) can help an application immensely.
Remember once we have scaled up we need to scale down again! Scaling up allows us to have better performance, but it comes with a monetary cost. We also need to scale down so that we only spend when we need to. An application that scales well can even offer a much lower running cost over its lifetime.
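To make the up-and-down idea concrete, here is a sketch of a scaling decision that combines the three metrics above and leaves a gap (hysteresis) between the scale-up and scale-down thresholds so the fleet does not flap between sizes. The thresholds are invented for illustration; on AWS this logic would normally live in CloudWatch alarms driving an Auto Scaling policy rather than hand-rolled code:

```python
def desired_change(active_connections, cpu_percent, disk_io_percent,
                   max_connections=500, cpu_high=75.0, cpu_low=25.0):
    """Return +1 to add a server, -1 to remove one, 0 to hold steady."""
    # Scale up as soon as any one metric is hot.
    if (active_connections > max_connections
            or cpu_percent > cpu_high
            or disk_io_percent > 90.0):
        return +1
    # Scale down only when everything is well below the scale-up thresholds;
    # the gap between cpu_low and cpu_high is the hysteresis that stops us
    # bouncing between sizes (and bills).
    if (active_connections < max_connections // 2
            and cpu_percent < cpu_low
            and disk_io_percent < 50.0):
        return -1
    return 0

print(desired_change(620, 80.0, 40.0))  # 1: overloaded, add capacity
print(desired_change(100, 10.0, 20.0))  # -1: idle, save money
print(desired_change(300, 50.0, 40.0))  # 0: in the comfortable middle
```

The asymmetry is deliberate: one hot metric is enough to scale up, but every metric must be cold before we scale down and stop paying for the extra server.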
Microservices
Microservices are commonplace now in any web-facing application. They allow developers to build small, single-responsibility services that are heavily abstracted from the other segments of the application. This lets us rapidly develop sections of the architecture without affecting the performance and maturity of the other layers.
The importance of this is that it allows us to scale individual sections of the application separately from the others. Is a DB cluster having a hard time? Scale up the number of DBs! Is Apache filling up its small pool of connections? Make more pools! When a microservice architecture is properly monitored it is easy to see where this can improve our application.
Serverless
Serverless is all the rage at the moment. One of the reasons is that serverless architectures scale by default. The code is executed on demand as requests come into the system, and in parallel, so the performance of one request has no (or little) effect on the performance of another. That is quite an elegant answer to the scaling problem. Serverless also has the added benefit of costing very little (or nothing) while the system is idle.
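A minimal sketch of that per-request model: each invocation handles one event in isolation, with no shared in-process state between requests. The handler signature follows the common AWS Lambda convention, but the event fields are made up for the example:

```python
import json

def handler(event, context=None):
    """Handle one request in isolation; scaling is the platform's problem."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Each call is independent; the platform may run thousands in parallel.
print(handler({"name": "Tim"}))
```

Because nothing survives between invocations, there is nothing to coordinate when the platform fans out to thousands of concurrent executions, which is why serverless scales "by default".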
That is just a small number of things you can consider when building scalable applications in AWS (or Azure/Google Cloud/…). Hopefully, this can help you avoid a development nightmare if your next great idea ends up more popular than first expected.
Coffee to Code – Timothy Gray (Code Conjurer)