A technical primer on the cloud application architectures that are driving enterprise IT transformation

In a September 2013 blog post titled “Cloud IaaS market share and the developer-centric world”, Gartner analyst Lydia Leong wrote “it’s taken until this past year before most of the ‘enterprise class’ vendors acknowledge the legitimacy of the power that developers now hold”. More and more, developers are becoming the new enterprise Information Technology (IT) buyers. Why are application architectures changing? How are applications changing? What are some cloud application architecture patterns? This blog post will explore answers to these questions.

Cloud is changing the economics of IT, and application developers are entering the new frontier of cloud applications. Once upon a time, hardware was a scarce resource and maximizing hardware usage drove decisions. Today, compute resources are plentiful, as cloud platforms present a seemingly infinite amount of compute to developers. In this new frontier, the limit will be our imagination, not compute resources. The trendy designations for the new generation of applications are cloud-centric or cloud-native. What is different about this type of application? The difference is in the architecture. Just putting an application on Amazon Web Services, Microsoft Azure or IBM SoftLayer does not make an application cloud centric.

Cloud centric applications are architected for horizontal instead of vertical scaling. Applications are broken into stateless components. Mean-time-to-recovery (MTTR) becomes a more important goal than mean-time-between-failure (MTBF). Cloud centric application architecture has many benefits, including optimization for lower cost, elasticity and high scale. Not every application, however, needs to be 100% cloud centric.

Cloud centric architectures have drawbacks too. Examples include data consistency across nodes, noisy neighbors and managing operational data across transient nodes. Nodes, from a cloud development perspective, represent compute or data resources that are largely abstracted by a cloud platform. A node could be a virtual machine, physical server or cluster of servers. A few key foundational design points will help with understanding cloud centric application architecture. These include:

  1. Horizontal Scaling
  2. Eventual Data Consistency
  3. Reduced Network Latency

Horizontal Scaling
The horizontal scaling design point requires that instantiating a single compute node is as easy as instantiating a hundred. Such scaling should also be reversible. These nodes would be allocated for specific functions, e.g. web server and invoicing nodes. The key is that the nodes are stateless and autonomous. This is a change from many existing applications, where state is preserved on the nodes. Better efficiency is gained with homogeneous nodes, which make it easier to do round-robin load balancing, capacity planning and auto-scaling.
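To make the design point concrete, here is a minimal Python sketch (all names are illustrative, not any particular framework's API) of a stateless node: session state lives in a shared store rather than in node-local memory, so any node behind a round-robin load balancer can serve any request.

```python
# A minimal sketch of a stateless node (illustrative names only): session
# state lives in a shared store rather than in node-local memory, so any
# node behind a round-robin load balancer can serve any request.

class SharedSessionStore:
    """Stand-in for an external store shared by all nodes (e.g. a
    distributed cache or NoSQL database)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, state):
        self._data[session_id] = state


def handle_request(store, session_id, item):
    """Stateless handler: load state, update it, write it back.

    Nothing is kept in node memory between requests, so the next request
    for this session can land on any other node."""
    state = store.get(session_id)
    cart = state.get("cart", [])
    cart.append(item)
    store.put(session_id, {"cart": cart})
    return {"session": session_id, "items": cart}


if __name__ == "__main__":
    store = SharedSessionStore()
    # Two requests for the same session, as if served by two different nodes.
    print(handle_request(store, "sess-42", "book"))
    print(handle_request(store, "sess-42", "laptop"))
```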

Eventual Data Consistency
Eventual data consistency represents a choice in the way data is updated that can optimize the scalability, performance and cost one can achieve with a cloud platform. An example of eventual data consistency is DNS (Domain Name System): it can take hours for an update to propagate to all DNS servers. NoSQL databases like MongoDB or Couchbase leverage the eventual data consistency model. They can provide very high scale but do not offer the atomicity and data consistency guarantees of relational databases.
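The following toy simulation, written purely for illustration (it does not use MongoDB, Couchbase or any real database API), shows the trade-off: a write is acknowledged by the primary immediately, replicas catch up asynchronously, and a read from a replica can briefly return stale data.

```python
import time

# Toy simulation of eventual consistency: a write is acknowledged by the
# primary immediately and propagated to replicas asynchronously, so reads
# from a replica may briefly return stale data.  Names are illustrative.

class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class EventuallyConsistentStore:
    def __init__(self, replicas, propagation_delay=0.5):
        self.primary = {}
        self.replicas = replicas
        self.delay = propagation_delay
        self._pending = []            # (apply_at_time, key, value)

    def write(self, key, value):
        """Acknowledge the write once the primary has it."""
        self.primary[key] = value
        self._pending.append((time.time() + self.delay, key, value))

    def read_from_replica(self, replica):
        """Reads may be stale until propagation catches up."""
        self._propagate()
        return dict(replica.data)

    def _propagate(self):
        now = time.time()
        still_pending = []
        for apply_at, key, value in self._pending:
            if apply_at <= now:
                for r in self.replicas:
                    r.data[key] = value
            else:
                still_pending.append((apply_at, key, value))
        self._pending = still_pending


if __name__ == "__main__":
    replica = Replica("replica-1")
    store = EventuallyConsistentStore([replica], propagation_delay=0.2)
    store.write("profile:42", {"name": "Ada"})
    print("right after write:", store.read_from_replica(replica))   # stale: {}
    time.sleep(0.3)
    print("after propagation:", store.read_from_replica(replica))   # consistent
```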

This design point can be very powerful when combined with a data processing approach like MapReduce. Data that is a good fit for MapReduce includes things like text documents (e.g. Wikipedia articles), web server logs and user social graphs. LinkedIn uses MapReduce to suggest contacts. Facebook uses it to find friends. Amazon uses it to recommend books. It is an approach used heavily by travel and data-driven sites, and in risk analysis and data security. The eventual data consistency approach is a business decision. Some services, such as credit card payments, require strong data consistency.
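As a rough illustration of the MapReduce shape, here is a single-process Python sketch that counts page hits in web server log lines. The log format and function names are assumptions for the example; a real framework such as Hadoop runs the same map, shuffle and reduce steps across many nodes, next to the data.

```python
from collections import defaultdict

# Toy, single-process illustration of MapReduce: map emits (key, value)
# pairs, shuffle groups the pairs by key, and reduce aggregates each group.

def map_phase(log_line):
    """Emit (page, 1) for each request in an assumed '<ip> <path> <status>' log line."""
    parts = log_line.split()
    page = parts[1]
    yield (page, 1)

def shuffle(mapped_pairs):
    """Group emitted values by key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(page, counts):
    """Aggregate one group into a final (page, total) result."""
    return (page, sum(counts))


if __name__ == "__main__":
    logs = [
        "10.0.0.1 /home 200",
        "10.0.0.2 /pricing 200",
        "10.0.0.1 /home 200",
    ]
    mapped = [pair for line in logs for pair in map_phase(line)]
    grouped = shuffle(mapped)
    print([reduce_phase(page, counts) for page, counts in grouped.items()])
    # [('/home', 2), ('/pricing', 1)]
```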

Reduced Network Latency
Because cloud applications are divided into autonomous, stateless nodes, network latency becomes a bigger challenge and distance becomes a bigger factor. Cloud centric application architectures need to account for latency, for example by co-locating the nodes of an application (e.g. within the same data center or rack), moving application data closer to users, and moving the application itself closer to users.
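As a small sketch of one way to "move closer to the user", the snippet below measures connection time to each candidate region and routes to the fastest one. The region endpoints are hypothetical placeholders, so running it as-is will simply report that nothing is reachable.

```python
import socket
import time

# Sketch: pick the lowest-latency region by timing a TCP connect to each
# candidate endpoint.  The hostnames are hypothetical placeholders.

CANDIDATE_REGIONS = {
    "us-east": ("us-east.example.com", 443),
    "eu-west": ("eu-west.example.com", 443),
}

def connect_latency(host, port, timeout=2.0):
    """Return the TCP connect time in seconds, or None if unreachable."""
    start = time.time()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.time() - start
    except OSError:
        return None

def closest_region(regions):
    """Return the name of the reachable region with the lowest latency."""
    measured = {name: connect_latency(host, port)
                for name, (host, port) in regions.items()}
    reachable = {name: rtt for name, rtt in measured.items() if rtt is not None}
    return min(reachable, key=reachable.get) if reachable else None


if __name__ == "__main__":
    print("route traffic to:", closest_region(CANDIDATE_REGIONS))
```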

Cloud Architecture Patterns
Much of this blog post has been inspired by a book that I recently read, “Cloud Architecture Patterns” by Bill Wilder. Wilder’s book describes a set of cloud centric application architecture patterns which I will introduce here. They are:

1. Horizontal Scaling – A pattern for allocating resources based on stateless, autonomous nodes. There’s a discussion on dealing with state, e.g. using cookies and various storage options like NoSQL data stores, distributed caches, relational databases, and file stores.

2. Queue Centric Workflow – A pattern that leverages reliable cloud queue services to decouple application components. Key concepts include two-phase removal of messages and at-least-once processing (a toy sketch appears after this list). The pattern addresses issues such as partial processing and failed nodes.

3. Auto-scaling Pattern – A pattern to (1) optimize resources and cost and (2) minimize human intervention to save time and reduce errors. Reversible scaling makes this highly effective. Methods discussed include an “N+1 rule” to allocate spare resources in case one node fails, upper and lower resource boundaries, and throttling of features (versus instances).

4. MapReduce – A pattern for Big Data, i.e. processing large amounts of data efficiently. This Big Data concept brings compute to the data, since moving data is expensive and slow. A well-known implementation of MapReduce is Hadoop. A cloud platform that provides Hadoop as a service simplifies installation and administration. Savings can be substantial, since a Hadoop cluster can involve many nodes.

5. Data Sharding – In the traditional model, databases scale vertically by moving them to bigger and better hardware. This pattern divides a database into two or more “shards” so that they can be distributed on smaller nodes. This technique is complex but cloud platforms can mask the complexity. Sharding is efficient when a single shard at a time can satisfy most of the common database operations. Not all databases can be easily updated to support sharding. In many cases, it may be easier to provide a relational database that resides outside the cloud centric part of an application.

6. Busy Signal Pattern – A pattern that could be applied generally but makes more sense for cloud because there are more failures. This pattern deals with hardware failures and transient failures, e.g. retrying when getting a busy signal from a phone. Discussed are different retry strategies, techniques like increasing back-off delays, and guidance on when to throw exceptions (see the retry sketch after this list). Multiple tenants also make this pattern more relevant in the cloud, because while one client may not exceed maximum throughput individually, multiple clients can exceed it collectively.

7. Node Failure Pattern – A pattern that deals with node failure. For example, application state is persisted on reliable storage rather than the local disk of an individual node. A cloud platform might identify a domain or set of resources in a cloud as a single point of failure.
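To ground a couple of these patterns, here are two small Python sketches. Both are toy, self-contained illustrations with assumed names, not any cloud provider's or the book's API. The first mimics a queue-centric workflow with two-phase removal: a received message stays invisible while it is being processed and is deleted only after success, so a message held by a failed node becomes visible again and is processed at least once.

```python
import time
from collections import deque

# Toy queue with two-phase removal: receive hides a message, delete removes
# it after successful processing.  If a node fails before deleting, the
# message becomes visible again after a timeout (at-least-once processing).

class ToyQueue:
    def __init__(self, visibility_timeout=1.0):
        self._messages = deque()
        self._invisible = {}            # msg_id -> (body, visible_again_at)
        self._timeout = visibility_timeout
        self._next_id = 0

    def send(self, body):
        self._messages.append((self._next_id, body))
        self._next_id += 1

    def receive(self):
        """Phase one: hand out a message and hide it while it is processed."""
        self._requeue_expired()
        if not self._messages:
            return None
        msg_id, body = self._messages.popleft()
        self._invisible[msg_id] = (body, time.time() + self._timeout)
        return msg_id, body

    def delete(self, msg_id):
        """Phase two: remove the message for good after successful processing."""
        self._invisible.pop(msg_id, None)

    def _requeue_expired(self):
        now = time.time()
        for msg_id, (body, visible_at) in list(self._invisible.items()):
            if visible_at <= now:
                del self._invisible[msg_id]
                self._messages.append((msg_id, body))


if __name__ == "__main__":
    q = ToyQueue(visibility_timeout=0.1)
    q.send("generate invoice 123")
    msg_id, body = q.receive()
    # Simulate a node crash: the message is never deleted...
    time.sleep(0.2)
    print(q.receive())   # ...so it reappears and is processed again
```

The second sketches the Busy Signal pattern: retry a transient "busy" failure a limited number of times, increasing the back-off delay (with a little jitter) between attempts, and surface the error once the retries are exhausted. `call_service` is a placeholder for any cloud service call that can be throttled.

```python
import random
import time

# Sketch of the Busy Signal pattern: bounded retries with exponential
# back-off and jitter.  `call_service` is a placeholder that fails often
# to simulate a throttled, multi-tenant service.

class ServiceBusyError(Exception):
    pass

def call_service():
    if random.random() < 0.66:          # simulate a "busy" response
        raise ServiceBusyError("try again later")
    return "ok"

def call_with_backoff(max_attempts=5, base_delay=0.1):
    for attempt in range(max_attempts):
        try:
            return call_service()
        except ServiceBusyError:
            if attempt == max_attempts - 1:
                raise                   # retries exhausted: surface the error
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)           # back off longer after each attempt


if __name__ == "__main__":
    try:
        print(call_with_backoff())
    except ServiceBusyError:
        print("service stayed busy; giving up")
```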

Other patterns described include a colocation pattern, valet key pattern, CDN pattern and multisite deployment pattern. As cloud opens a new frontier for application development, IT vendors must change. The industry has undergone significant evolution. Hardware is abundant. There are also many choices in middleware and application development tools. Cloud makes compute, middleware and application building blocks readily available to developers. Cloud is the new frontier, and it is developers who are leading the way.

Note – One might say that cloud provides more power not only to developers but also directly to lines-of-business and consumers. Note that the Gartner blog title specifies IaaS (Infrastructure-as-a-Service) versus other cloud service models. In the SaaS (Software-as-a-Service) model, line-of-business buyers can bypass IT to source services. With IaaS, developers often front IT sourcing and make decisions on using various cloud IaaS providers.

Disclaimer: The postings on this site are my own and don’t necessarily represent the positions, strategies or opinions of my employer.
