Tech Talk Series: Architecting for Scale in your Multi-tenant Cloud

At Jitterbit, our mission is to simplify even the most complex connectivity challenges and to allow anyone to get connected in today’s digital world. While we stress that you don’t need to be a developer to use Jitterbit’s products, behind every great software solution is a team of fantastic developers. In the Jitterbit Tech Talk blog series, members of the Jitterbit development team give us insight into building an enterprise-class cloud platform and the challenges they had to solve around multi-tenancy, scalability, and security.

This week, Pankaj Arora, our Sr. Director of Technology, looks at how you design and architect a highly scalable enterprise-class cloud platform.

The Cloud offers companies a powerful way to deliver products and services that are easy to access from anywhere and on any device. It also provides huge benefits to the vendor, including visibility into how their offering is used, rapid development and deployment, and overall reductions in IT and maintenance costs. As a result, almost every company today is looking to make their solutions available in the Cloud.

But there is a lot more to becoming a true cloud company than installing software on a hosted server and slapping “Cloud” on your marketing materials.

I was recently chatting with a technology strategist who noted that around 75% of companies would like to migrate their legacy enterprise offerings to the cloud. Surprisingly, 90% of them draft an initial plan in which their legacy applications are simply deployed on a publicly accessible cloud server. This approach negates all of the benefits the Cloud has to offer.

Building a true multi-tenant Cloud platform requires solving a number of different challenges, such as security, reliability, scalability, high availability, fault tolerance, logging, monitoring, notifications, continuous deployment and more.

In today’s post, we will take a deeper look at how we handled the ‘scalability’ challenge when developing our multi-tenant cloud integration platform.

How far an application can scale is tied to its ability to scale both vertically and horizontally. In a Cloud environment, this means designing the services that handle external requests so that they scale, and so that they can leverage internal backend systems such as databases, caches, mediator layers and batch processes (analytical engine, recommendation engine) to handle a practically unlimited number of requests in a reasonably short amount of time. It is not always possible to predict the number of users hitting your platform. Designing an architecture that accounts for this within the multi-tenancy model turns that uncertainty into a generic scalability issue.

Vertical Scaling

Vertical scalability involves adding resources such as additional CPUs or memory to a single node in a system, and this should be the first thing you think about when scaling your services. It is very important to design your applications so that they make optimal use of these resources. Take the common use case for our hybrid integration cloud platform, where integration operations are served by an on-premises or cloud server node. Here the services should take a request to run integrations, pass the request to the server nodes and continue serving more requests. The services layer should not be dependent on, or blocked by, the processes happening at the various layers of the system. As such, it is very important to have an asynchronous, event-driven architecture.

A simple way to understand asynchronous architecture is to think of your architecture as a restaurant. A waiter takes orders from a table and passes them to the chef. While the chef, food-runner and bus boy take care of activities like cooking, serving and cleaning, the waiter continues serving new customers and runs checks at the end of the service.

It is very important to stay asynchronous while the system works with components such as the database, cache and mediator layer. Passing load to clustered, highly available server nodes that run integrations frees up the service layer’s resources to take more requests and to respond when processes are completed.
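
To make the waiter analogy concrete, here is a minimal sketch in Python using asyncio. It is not our production code: run_integration and service_layer are hypothetical names, and the sleep simply stands in for a server node doing slow work.

```python
import asyncio


async def run_integration(job_id: str, seconds: float) -> str:
    """Hypothetical stand-in for a server node running an integration."""
    # asyncio.sleep yields control, so other requests can be handled meanwhile.
    await asyncio.sleep(seconds)
    return f"{job_id}: done"


async def service_layer(jobs: dict[str, float]) -> list[str]:
    """Accept every request right away, then collect results as they finish."""
    tasks = []
    for job_id, seconds in jobs.items():
        # create_task schedules the work and returns immediately, so the
        # "waiter" is free to take the next order.
        tasks.append(asyncio.create_task(run_integration(job_id, seconds)))

    finished = []
    for next_done in asyncio.as_completed(tasks):
        finished.append(await next_done)  # results arrive in completion order
    return finished
```

Calling asyncio.run(service_layer({"a": 0.2, "b": 0.05, "c": 0.1})) accepts all three jobs up front and returns the shorter jobs first, because the service layer never waits on one job before taking the next.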

Fans of the game “Diner Dash” would make great cloud architects.

Horizontal Scaling

The next key consideration is that services should be written so that it does not matter how many instances of them are running, or which shared system is performing transactions at any given time. This gives our cloud platform the flexibility to auto-scale the services layer. Each part of the backend is highly available and auto-scalable to keep up with the demand from a growing number of service containers.
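
One way to picture instance-independent services is a sketch like the following, where service instances keep no state of their own and everything lives in a shared backend. SharedStore and ServiceInstance are hypothetical names, and the in-memory dictionary is only a stand-in for a real database or cache.

```python
class SharedStore:
    """In-memory stand-in for a shared backend such as a database or cache."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class ServiceInstance:
    """A service container that holds no per-tenant state of its own."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def record_run(self, tenant_id):
        # All state is read from and written to the shared store, so any
        # instance can serve any request for any tenant.
        runs = self._store.get(tenant_id, 0) + 1
        self._store.put(tenant_id, runs)
        return runs
```

Two instances created with the same store see each other's updates, so a request can land on either one. A real shared store would also need an atomic increment or a transaction here, since this read-then-write is not safe under concurrent access.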

However, we cannot have infinite containers running service layers that use shared components. That’s not the optimal design.  Let’s go back to our restaurant example. With vertical scaling a single restaurant could become highly scalable by adding more staff, but at a certain point, the number of resources (cooks, runners, bus boys) will reach a saturation point. To scale further, this business would need to open additional locations that would help distribute customers and balance the load.

It is better to divide the whole platform into multiple zones, with each zone behaving as a replica of the initial zone. This allows us to use multiple zones to load balance future needs.

This model has multiple advantages:

1. Possible infinite horizontal scaling, as we can bring in any number of zones

2. Fault tolerance, in case an entire zone fails due to a natural disaster

3. Data separation, in the case of a SaaS platform that requires location-based metadata segregation, such as separate US and EMEA zones

The desired outcome here is that a user’s experience should be the same no matter where they are and what zone they hit. No manual intervention should be required from a user to get the work done.
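
As an illustration of routing users to zones without manual intervention, the sketch below pins a tenant to a zone inside its region. The zone names, the ZONES table and pick_zone are all hypothetical, and hashing is just one possible placement strategy, not necessarily the one we use.

```python
import hashlib

# Hypothetical zones; the names and regions are illustrative only.
ZONES = {
    "US": ["us-zone-1", "us-zone-2"],
    "EMEA": ["emea-zone-1"],
}


def pick_zone(tenant_id: str, region: str,
              zones: dict[str, list[str]] = ZONES) -> str:
    """Choose a zone for a tenant, staying inside the tenant's region."""
    candidates = zones[region]  # data separation: never leave the region
    # A stable hash sends the same tenant to the same zone on every request.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]
```

Because the choice depends only on the tenant ID and region, every service instance computes the same answer and the user never has to pick a zone. Note that simple modulo hashing reassigns some tenants when a zone is added to a region, so a real deployment would more likely keep an explicit tenant-to-zone mapping or use consistent hashing.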

In our hybrid integration cloud blog post, we discussed a challenge we ran into: the way we used the mediator layer supported only limited vertical scale. By writing our own customized non-blocking connection layer based on secure HTTPS transport, we were able to extend the architecture to 50x vertical scale. Of course, it is horizontally scalable as well. This combination of vertical and horizontal scaling gave us a less expensive and easily extensible layer to meet our growing needs.

Optimizing the use of shared components:

Shared components should themselves be scalable and highly available to support the Cloud platform. Because they are shared, however, they are not as scalable as the services layer. It is very important to take care of components that are sensitive to high load, such as databases. One model is to prioritize and queue the usage of shared components where multiple concurrent, high-volume loads can cause issues. Prioritization can lower the load and reduce errors on mission-critical requests. Back in our restaurant, the kitchen is a shared component: while the waiter may take a diner’s order for an appetizer, entrée and dessert, the kitchen will prioritize the cooking of these items and balance its load.
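
The prioritize-and-queue idea can be sketched with a small priority queue in front of the shared component. PriorityGate, its methods and the priority numbers are hypothetical; this shows the general technique rather than our actual implementation.

```python
import heapq
import itertools


class PriorityGate:
    """Queue requests for a shared component; a lower number is more urgent."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties, keeping first-in-first-out order
        # among requests that share a priority.
        self._order = itertools.count()

    def submit(self, priority: int, request: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._order), request))

    def drain(self, max_items: int) -> list[str]:
        """Release at most max_items requests, most urgent first."""
        batch = []
        while self._heap and len(batch) < max_items:
            _, _, request = heapq.heappop(self._heap)
            batch.append(request)
        return batch
```

Capping each drain at max_items is what protects the shared component: mission-critical requests go through first, and lower-priority bulk work waits for the next batch instead of hitting the database at the same time.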

Another way to address load is to cache data that seldom changes, which reduces usage of those components. If French fries are a popular item on the menu, the restaurant may decide to “cache” a continuous batch rather than frying them to order.
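
A minimal sketch of this kind of cache is a time-to-live (TTL) wrapper around an expensive lookup. TTLCache, the loader callback and the ttl_seconds parameter are hypothetical names chosen for illustration.

```python
import time


class TTLCache:
    """Cache values from a slow loader and refresh them after ttl_seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader    # function that hits the shared component
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no call to the loader
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

The TTL is the trade-off to tune: a longer value spares the shared component more load, while a shorter value limits how stale the “fries” can get.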

Monitoring of components:

Finally, it is very important to monitor each component and to know its limitations.  The system should be able to increase bandwidth automatically and at the right time to keep the systems up and running per the needs of the Cloud platform. Monitoring can be external or application-initiated, but in either case the final goal is to let the system auto-scale based on the load.
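
To illustrate how a monitored metric can drive auto-scaling, here is a sketch of a simple proportional rule. The function name, the load metric and the instance bounds are assumptions made for the example, not a description of our platform's scaling policy.

```python
import math


def desired_instances(current: int, load_per_instance: float,
                      target_load: float, min_instances: int = 1,
                      max_instances: int = 20) -> int:
    """Suggest an instance count that brings per-instance load near target."""
    if current < 1 or target_load <= 0:
        raise ValueError("current must be >= 1 and target_load must be > 0")
    # Scale in proportion to how far the observed load is from the target,
    # rounding up so the system is never left under-provisioned.
    wanted = math.ceil(current * load_per_instance / target_load)
    # Clamp to known limits of the component being scaled.
    return max(min_instances, min(max_instances, wanted))
```

For example, four instances running at 90% CPU against a 60% target yield a suggestion of six instances. The clamp reflects the point about knowing each component's limitations: the rule should never scale past what the shared backend can support.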

In the next blog post, we’ll talk about monitoring, logging and other elements of the architecture.