If you are considering building an application of any kind, you want to ensure the application you build will meet your current needs and expected future demands. In short, you want the application to scale.
Scalability is an issue you must consider when building any application. Web, mobile, back office, customer-facing, it really doesn’t matter. Whatever you build must scale. But what does “scalability” really mean? It is not as simple as you might think.
Most software applications can scale up, no matter how they are built. The more pertinent questions are:
- How quickly do you need to be able to scale?
- How much do you want to invest upfront to ensure that you can scale quickly?
- In what way do you need to scale?
- What are your concerns in terms of scale?
In this post, I will attempt to address some of these questions. We will discuss how software can scale and what it takes to ensure that it can scale to meet your needs.
What Does It Mean to “Scale Up”?
Software and hardware solutions can scale in different ways. When we speak about scaling up from a software and hardware perspective, we are really talking about two factors:
- Concurrency – The number of users or requests on behalf of users that occur at the same time.
- Request/response latency – The amount of time a user must wait for the application to return a response. You must determine what is acceptable for your application. The quantity and complexity of the data that must be stored and queried will affect this response time.
For a software application used by customers, the number of concurrent users is the primary concern related to scale. A mobile application used by the public, such as a social platform, would be one such example. A back-office application that would be used by a large number of employees is another.
Let’s clear one thing up: concurrent users and site visitors are not always the same thing. A website can have a large number of visitors and not require large amounts of processing power. If the average user only visits the site to view a few pages and then possibly make a purchase or contact the company, the site may not be required to scale up.
An example of this would be a lead generation or marketing website. If you are simply putting up a website to promote your business, then you probably would not have thousands of users performing complex processing tasks all at once. You may have a few thousand people visit your site during a day, but they would probably be visiting at different times and requesting pages from the site sporadically.
On the other hand, if you have a website that could experience a dramatic increase in the number of concurrent users based on some unforeseen event, you should probably consider architecting the site so that it can scale quickly if the need arises. Consider what might happen to an eCommerce website that has an exclusive offering of a very popular product. If thousands of site visitors are trying to purchase the same product at the same time, things could get interesting if you can’t scale to meet this unexpected demand.
Similarly, if you are building a mobile application that lets users start and participate in group chats, you will need to consider how many users the application will draw. If you start off with a system designed to support a couple hundred concurrent users and your application suddenly becomes very successful and draws tens of thousands or even millions of users, it will need to scale up quickly to support them. This type of application requires more processing power on the server because its users are more likely to be active in the app at the same time.
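To make the difference between visitors and concurrent users concrete, here is a rough sketch using Little's Law (average requests in flight = arrival rate × average latency). The visitor counts, request rates, and the 200 ms latency are assumptions chosen for illustration, not measurements of any real site.

```python
# Little's Law: average requests in flight = arrival rate x average latency.
# All traffic figures below are illustrative assumptions, not measurements.

def average_concurrency(requests_per_second: float, latency_seconds: float) -> float:
    """Average number of requests being processed at any instant."""
    return requests_per_second * latency_seconds

SECONDS_PER_DAY = 24 * 60 * 60

# Marketing site: 5,000 visitors a day, 4 page views each, 200 ms per page.
marketing_rps = 5_000 * 4 / SECONDS_PER_DAY
marketing = average_concurrency(marketing_rps, 0.2)

# Group chat app: 50,000 users online, each sending a request every 5 seconds.
chat_rps = 50_000 / 5
chat = average_concurrency(chat_rps, 0.2)

print(f"marketing site: {marketing_rps:.2f} req/s, {marketing:.3f} requests in flight")
print(f"chat app: {chat_rps:.0f} req/s, {chat:.0f} requests in flight")
```

Under these assumptions the marketing site averages far less than one request in flight, while the chat app averages about two thousand. These are averages, so a sudden spike such as the exclusive product offering described above would push the real number much higher for a short time.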
How Quickly Do You Need to Scale?
When planning for scalability, you must know how quickly each resource (processing power and storage) needs to scale. If your dataset is expected to slowly grow over time and the number of users will be constant, scaling is a fairly simple undertaking. However, if you have a mobile application that could see a rapid increase in the number of concurrent users, you may need to architect your solution so that it can scale quickly to meet this demand. Of course, the latter is a good problem to have. But if you do not plan for it, you may get caught off guard. Your user base might become disgruntled and drop your application if response times are exceedingly slow.
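As a back-of-the-envelope illustration of planning for a rapid increase, the sketch below estimates how many servers a given number of concurrent users would require. The function name `servers_needed`, the figure of 500 users per server, and the 30% headroom are all hypothetical; in practice the per-server figure would come from your own load tests.

```python
import math

def servers_needed(peak_concurrent_users: int,
                   users_per_server: int,
                   headroom_percent: int = 30) -> int:
    """Servers required to carry a peak load while keeping spare capacity.

    users_per_server is the concurrency one server handled at acceptable
    latency in a load test (an assumed input here); headroom_percent is the
    share of each server deliberately left unused.
    """
    if users_per_server <= 0 or not 0 <= headroom_percent < 100:
        raise ValueError("need users_per_server > 0 and 0 <= headroom_percent < 100")
    usable_per_server = users_per_server * (100 - headroom_percent) / 100
    return max(1, math.ceil(peak_concurrent_users / usable_per_server))

# Illustrative growth path: a couple hundred users, then tens of thousands,
# then a million.
for users in (200, 20_000, 1_000_000):
    print(users, "concurrent users ->", servers_needed(users, users_per_server=500), "servers")
```

The point of the exercise is the shape of the growth rather than the exact numbers: a jump from hundreds to millions of concurrent users turns one server into thousands, which is only practical if the application was designed to have servers added alongside one another.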
Being able to scale up quickly takes some investment up front. To scale an application quickly, you should:
- Understand your application and the demands it will place on your infrastructure (processing power and data access, both disk and cache). You should know which resources could cause a performance bottleneck.
- Have a plan for how you will scale up your resources to relieve the bottleneck(s). There are two main options: you can scale horizontally (add more servers in parallel) or vertically (move to more powerful server equipment). Vertical scaling has limits, though, because a single machine can only be made so large.
- Stress-test your application with expected concurrency, so that you understand the expected latency under increasing loads. A tool such as Apache JMeter can be used to stress-test practically anything.
- Do not try to host the application on your own physical servers. Architect your application to run in the cloud and launch the application using cloud hosting from the start.
- Use autoscaling where you can (except for relational databases).
- Architect your application to use managed services in the cloud (Postgres, Redis, etc.) instead of maintaining your own services.
- Have someone entirely responsible for your infrastructure and metrics (a DevOps or SRE engineer).
- Have a properly designed DB layer that is architected to present less of a bottleneck for the rest of your system. If your database is the bottleneck, there is nothing you can do to the other layers of the application that will fix performance and scaling issues.
- Do not use a relational database if your data does not present a complex relational schema or if your application is not transactional.
- KISS – Keep it simple, stupid. Simple things are easier to scale. The more complex your engineering, the harder it may become to scale quickly when you are under the gun.
- Have a smart team like us here at Ewizz to make sure that all these things are considered when developing an application.
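The stress-testing advice in the list above can be illustrated with a toy, in-process simulation. This is not a substitute for a real tool such as JMeter pointed at a real deployment: the `handle_request` function, the pool of four "database connections", and the 10 ms query time are all hypothetical stand-ins, chosen so that the data layer is the bottleneck.

```python
import statistics
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a real endpoint: every request needs one of four
# "database connections" for 10 ms, so the pool is the bottleneck.
DB_POOL = threading.BoundedSemaphore(4)

def handle_request() -> float:
    """Serve one simulated request and return its latency in seconds."""
    start = time.perf_counter()
    with DB_POOL:
        time.sleep(0.01)  # simulated query time
    return time.perf_counter() - start

def run_load(concurrency: int, requests_per_user: int = 5) -> float:
    """Run `concurrency` simulated users in parallel; return mean latency."""
    def user(_: int) -> list[float]:
        return [handle_request() for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        per_user = list(pool.map(user, range(concurrency)))
    return statistics.mean(lat for lats in per_user for lat in lats)

# Measure mean latency at increasing levels of concurrency.
results = {level: run_load(level) for level in (1, 4, 16)}
for level, mean_latency in results.items():
    print(f"{level:>2} concurrent users: mean latency {mean_latency * 1000:.1f} ms")
```

In this simulation, latency stays close to the query time until the number of concurrent users exceeds the size of the connection pool, and then climbs as requests queue for a connection. Adding more application threads does nothing to help, which is the sense in which a database bottleneck cannot be fixed in the other layers.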
In short, it is a good idea to think through your requirements for scaling your system. You want to avoid overspending where it isn’t necessary or underspending where it might be required. Just saying “I want to make sure my system will scale” is not enough information. To be successful, you need to understand both how your system will need to scale and how quickly.