Friday, September 9, 2011

Clustering Landscape for WSO2 Stack


This article discusses "End Goal", "Choice of the Front End" and "Shared State across nodes" as three deciders in choosing the right clustering strategy for a particular usecase. It discusses the different choices available under each decider and describes when each of those choices would be useful. Based on that foundation, the article describes the WSO2 Clustering Architecture, and revisits the above three deciders and the available choices while discussing their applicability within the WSO2 clustering implementation. Moreover, the article describes some implicit assumptions that go with those solutions, while pointing interested readers to detailed references.

Introduction

Clustering places a group of nodes together and tries to make them behave as a single cohesive unit, often providing the illusion of a single computer. Figure 1 depicts an outline of a cluster: it consists of a collection of typically identical nodes and a "Front End" that distributes requests among the nodes. Moreover, nodes either share state (e.g. a shared database or a shared file system) or update each other about state changes. The article WSO2 ESB / Apache Synapse Clustering Guide provides a more detailed treatment of the topic.
The article consists of two parts: the first discusses "End Goal", "Choice of the Front End" and "Shared State across nodes" as three deciders in choosing the right clustering strategy for a particular usecase, and the second extends the discussion to WSO2 solutions.

Clustering Landscape

Clustering differs based on the type of server or application used as the unit of clustering and the usecase the cluster is supposed to serve. Consequently, the difficulty and the associated trade-offs also vary. Overall, the following three parameters capture the different clustering requirements.
  1. Goal of Clustering - load balancing, high availability or both
  2. Choice of the Front End - Synapse, Apache, Hardware, DNS
  3. State shared across different nodes in the cluster

End Goals of Clustering

Users cluster their services to scale up their applications, to achieve high availability, or both. By scaling up, the application supports a larger number of user requests, and through high availability the application remains available even if a few servers are down. To support load balancing, the front end distributes requests among the nodes in the cluster. There is a wide variety of algorithms based on the policy used to distribute the load. For example, random or round-robin distribution of messages are among the simple approaches, while more sophisticated algorithms take runtime properties of the system, such as machine load or the number of pending requests, into consideration. Furthermore, the distribution could also be controlled by application-specific requirements like sticky sessions (see http://en.wikipedia.org/wiki/Load_balancing_(computing)). However, it is worth noting that with a reasonably diverse set of users, simple approaches tend to perform on par with complex approaches, and therefore they should be considered first.
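To make the simple policies concrete, the following is a minimal Java sketch of random and round-robin node selection; the node addresses are hypothetical placeholders for real cluster members.

import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of two simple load-distribution policies.
public class SimpleSelectors {
    private final List<String> nodes;          // backend node addresses
    private final Random random = new Random();
    private final AtomicInteger counter = new AtomicInteger();

    public SimpleSelectors(List<String> nodes) {
        this.nodes = nodes;
    }

    // Random distribution: pick any node with equal probability.
    public String pickRandom() {
        return nodes.get(random.nextInt(nodes.size()));
    }

    // Round-robin distribution: cycle through the nodes in order.
    public String pickRoundRobin() {
        return nodes.get(Math.floorMod(counter.getAndIncrement(), nodes.size()));
    }

    public static void main(String[] args) {
        // Hypothetical node addresses; replace with the real cluster members.
        SimpleSelectors selectors = new SimpleSelectors(
                Arrays.asList("node1:9763", "node2:9763", "node3:9763"));
        System.out.println(selectors.pickRandom());
        System.out.println(selectors.pickRoundRobin());
    }
}
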
A clustering setup for high availability includes multiple copies of the server, and user requests are redirected to an alternative server if the primary server has failed. This setup is also called passive replication (a.k.a. a hot-cold setup), where the primary serves the incoming requests and the system falls back to another server only if the first server has failed. The main difficulty in this scenario is that failure detection in a distributed environment is a very hard problem, and often it is impossible to differentiate between network failures and server failures. However, most deployments run in highly available networks, so typically we ignore network failures and detect server failures through failed requests. This yields fairly reliable results, yet it is worth noting that it is only an approximation.
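The following is a minimal Java sketch of this hot-cold pattern, assuming failure is detected through a failed or timed-out HTTP request; the server addresses and the request path are hypothetical placeholders.

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal hot-cold (passive replication) sketch: use the primary,
// fall back to the backup only when a request to the primary fails.
public class FailoverClient {
    private final String primary;
    private final String backup;

    public FailoverClient(String primary, String backup) {
        this.primary = primary;
        this.backup = backup;
    }

    public int send(String path) throws IOException {
        try {
            return request(primary + path);
        } catch (IOException primaryFailure) {
            // Failure detection is only an approximation: a network problem
            // is indistinguishable from a crashed server.
            return request(backup + path);
        }
    }

    private int request(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(2000);   // treat slow or unreachable servers as failed
        conn.setReadTimeout(2000);
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical server addresses.
        FailoverClient client = new FailoverClient("http://primary:9763", "http://backup:9763");
        System.out.println(client.send("/services/Version"));
    }
}
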
When both high availability and scalability are desired, the two techniques above are applied simultaneously. For instance, a typical deployment includes a front-end server that distributes messages across a group of servers, and if a server fails, the Front End removes it from the list of active servers among which it distributes the load. Such a setup degrades gracefully in the face of failures: after the failure of one among N servers, the system only degrades by 1/N.

Front End

We use the term "Front End" to denote the mechanism that distributes requests across the nodes in the cluster; it is often called the Load Balancer as well, and we use the two names interchangeably. Load Balancers come in wide varieties, among them hardware Load Balancers, DNS Load Balancers, transport-level Load Balancers (e.g. at the HTTP level, such as Apache or Tomcat), and application-level Load Balancers (WSO2 ESB, Synapse). Higher-level Load Balancers, such as application-level Load Balancers, operate with more information about the messages they route and hence provide more flexibility, but they also incur more overhead. Hence the choice of Load Balancer is a trade-off between performance and flexibility. Following are a few of the options for the Load Balancer.

  1. Enterprise Service Bus - provides application-level load balancing. For instance, it can take routing decisions based on the content of the SOAP message. This gives great control, but lower performance (see the sketch after this list).
  2. Apache mod_proxy - same as above, but operates at the HTTP level rather than at the SOAP level.
  3. DNS Load Balancing - the DNS lookup returns a different server each time it is queried, so clients end up talking to different servers in the group, thus distributing the load. With this approach, decisions based on message content are not possible.
  4. Hardware Load Balancers - the logic is implemented in hardware, so they can be very fast. However, the logic cannot be changed later, hence they are inflexible and usually provide only low-level load balancing.
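The sketch below, in plain Java, contrasts the information available to an application-level balancer with that available to a transport-level one; the node names and routing rules are purely illustrative, and a real ESB would evaluate the SOAP payload with XPath rather than string matching.

// Minimal sketch contrasting the routing information available at the
// application (SOAP) level with what a transport (HTTP) level balancer sees.
public class RoutingDecisions {

    // Application-level decision: route on message content, e.g. send
    // "premium" customers to a dedicated node. (Illustrative only; a real
    // ESB would use XPath over the SOAP payload.)
    static String routeOnContent(String soapBody) {
        return soapBody.contains("<customerType>premium</customerType>")
                ? "premium-node:9763" : "standard-node:9763";
    }

    // Transport-level decision: only headers and the request path are
    // visible, so routing can use the URL but not the payload.
    static String routeOnPath(String requestPath) {
        return requestPath.startsWith("/services/orders")
                ? "orders-node:9763" : "default-node:9763";
    }

    public static void main(String[] args) {
        System.out.println(routeOnContent("<order><customerType>premium</customerType></order>"));
        System.out.println(routeOnPath("/services/orders/OrderService"));
    }
}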

A common concern with the Front End (Load Balancer) is its reliability: in other words, what happens if the Front End itself fails. Typically the Front End needs to be protected through an external mechanism (e.g. a passive Load Balancer with Linux High Availability, or a system management framework that monitors the Front End and recovers it if it fails). Moreover, a DNS Load Balancer can handle the problem by setting up the DNS servers to return only active instances.
Moreover, it is possible to use two levels of Load Balancers, where one provides simple transport-level load balancing while the next level provides application-level load balancing. With this method, a low-level Load Balancer such as DNS distributes requests across multiple load-balancing nodes, which make decisions considering message-level information; as an added advantage, this approach avoids a single point of failure for the Load Balancer.

Shared State

The third parameter, the state shared across different cluster nodes, is one of the hardest aspects of clustering, and it usually defines the limits of a clustered system's scalability. As a rule of thumb, the less state that is shared and the less consistency that is required across nodes, the more scalable the cluster.
Figure 2 depicts four use cases based on how state is shared across cluster nodes.
In the simplest case, the cluster nodes do not share any state with each other; in this case, clustering involves placing a load-balancing node in front of the nodes and redirecting requests to them as necessary. This can be achieved with WSO2 ESB, an Apache mod_proxy Load Balancer, DNS-level, or hardware load balancing.
In the second scenario, all the persistent application state is stored in a database and the services themselves are stateless. For instance, most three-tier architectures fall under this method. As shown in the second part of Figure 2, these systems are clustered by connecting all the nodes of the cluster to the same database, and the system can scale up until the database is overwhelmed. With enterprise-class databases, this setup is known to scale to thousands of nodes, and it is one of the most common deployment setups. A common optimization is caching read data at the service level for performance, but unless nodes flush their caches immediately following a write, reads may return old data from the cache. Furthermore, all writes to the database must be transactional; otherwise, concurrent writes might leave the database in an inconsistent state.
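The following Java sketch illustrates this pattern under the stated assumption that each node flushes its own cache entry on write; the database access methods are placeholders for real JDBC/transaction code.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the shared-database pattern with a service-level read
// cache: reads are served from the cache when possible, and writes go to the
// database and flush the cached entry so this node does not serve stale data.
// (Other nodes' caches may still return old data until they also flush.)
public class CachedDataService {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String read(String key) {
        // Fall through to the database only on a cache miss.
        return cache.computeIfAbsent(key, this::readFromDatabase);
    }

    public void write(String key, String value) {
        writeToDatabaseTransactionally(key, value);  // must be transactional
        cache.remove(key);                           // flush the local cache entry
    }

    // Placeholders for the real JDBC/transaction logic.
    private String readFromDatabase(String key) { return "value-of-" + key; }
    private void writeToDatabaseTransactionally(String key, String value) { /* ... */ }
}
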
In the third scenario, requests are bound to a session, in which case requests in the same session share some state with each other. One of the simplest ways to support this scenario is using sticky sessions, in which case the Load Balancer always redirects messages belonging to the same session to a single node, so that subsequent requests can share state with previous requests. Such sessions can be implemented at the HTTP level or at the SOAP level. One downside of this approach is that if a node fails, the sessions associated with that node are lost and need to be restarted. In practice, it is common to couple the database-based setup described in scenario 2 with sticky sessions, where session data is kept in memory but persistent data is saved to a database.
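A minimal Java sketch of sticky routing is given below; it assumes the session identifier has already been extracted from the request (e.g. from an HTTP cookie or a SOAP header), and the node list is a hypothetical placeholder.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of sticky-session routing: the first request of a session is
// assigned to a node, and all later requests carrying the same session
// identifier are sent back to that node so they can share in-memory state.
public class StickySessionRouter {
    private final List<String> nodes;
    private final Map<String, String> sessionToNode = new ConcurrentHashMap<>();

    public StickySessionRouter(List<String> nodes) {
        this.nodes = nodes;
    }

    public String route(String sessionId) {
        return sessionToNode.computeIfAbsent(sessionId,
                id -> nodes.get(Math.floorMod(id.hashCode(), nodes.size())));
    }

    // When a node fails, its sessions are simply lost and must be restarted.
    public void nodeFailed(String node) {
        sessionToNode.values().removeIf(node::equals);
    }
}
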
Finally, if cluster nodes share state and that state is not stored in a shared persistent medium like a database, all changes made at each node have to be disseminated to all the other nodes in the cluster. Often, this is implemented using some kind of group communication method. Group Communication keeps track of the membership of user-defined groups of nodes and updates the group membership when nodes join or leave. When a message is sent using group communication, it is guaranteed that all nodes in the current group will receive it. In this scenario, the clustering implementation uses group communication to disseminate all changes to all other nodes in the cluster. For example, the Tribes group communication framework is used by both Tomcat and Axis2 clustering. It is worth noting that this approach has limits to its scalability: Group Communication typically gets overloaded at around 8-15 nodes. However, that is well within most user needs, hence sufficient.
Again, there are two choices depending on whether updates are synchronous or asynchronous: in the former case, the request does not return until all the replicas in the system are updated, whereas in the latter, the replicas are updated in the background. In the asynchronous case, there may be a delay before new values are reflected in the other nodes. Coupled with sticky sessions, this can be acceptable if strong consistency is not a concern. For example, if all write operations only perform appends and do not edit previous values, lazy propagation of changes is sufficient, and sticky sessions will ensure that the user who made a change continues to see it. Alternatively, if the read-write ratio is sufficiently high, all writes can be done at one node while the others serve reads. The article The Dangers of Replication and a Solution, written by Jim Gray, provides a very detailed description of the problem and some answers.
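The following Java sketch illustrates the difference between the synchronous and asynchronous modes; the broadcast callback is a stand-in for a real group communication framework such as Tribes, not its actual API.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Conceptual sketch of synchronous vs. asynchronous replication over a group
// communication channel. The 'broadcast' callback is a hypothetical stand-in
// for a real group communication framework.
public class ReplicatingStore {
    private final Consumer<String> broadcast;    // delivers the update to every group member
    private final ExecutorService background = Executors.newSingleThreadExecutor();

    public ReplicatingStore(Consumer<String> broadcast) {
        this.broadcast = broadcast;
    }

    // Synchronous mode: the write does not return until the update has been
    // handed to the group, so other replicas see it before the caller continues.
    public void writeSynchronously(String update) {
        applyLocally(update);
        broadcast.accept(update);
    }

    // Asynchronous mode: the update is propagated in the background, so other
    // replicas may briefly serve stale values (acceptable with sticky sessions
    // or append-only workloads, as discussed above).
    public void writeAsynchronously(String update) {
        applyLocally(update);
        background.submit(() -> broadcast.accept(update));
    }

    private void applyLocally(String update) { /* update this node's state */ }
}
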
Another interesting dimension is auto scaling: the ability to grow and shrink the size of the cluster based on the load it receives. This is particularly interesting with the advent of Cloud computing, which enables the "only pay for what you use" idea. Conceptually, this is implemented using a special Load Balancer that keeps track of the membership of nodes in the cluster and adds or removes nodes based on their utilization.
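Conceptually, such an auto scaler can be sketched in Java as below; the CloudProvider interface and its thresholds are hypothetical placeholders for a real infrastructure API.

import java.util.List;

// Conceptual auto-scaling sketch: a special load-balancer component tracks
// node utilisation and grows or shrinks the cluster. CloudProvider and its
// methods are hypothetical placeholders for a real IaaS API.
public class AutoScaler {
    interface CloudProvider {
        List<String> listNodes();
        void startNode();
        void stopNode(String node);
        double averageUtilisation();   // e.g. CPU load or pending requests
    }

    private final CloudProvider cloud;

    public AutoScaler(CloudProvider cloud) {
        this.cloud = cloud;
    }

    // Called periodically, e.g. from a scheduled task.
    public void adjust() {
        double load = cloud.averageUtilisation();
        List<String> nodes = cloud.listNodes();
        if (load > 0.80) {
            cloud.startNode();                            // scale out under high load
        } else if (load < 0.20 && nodes.size() > 1) {
            cloud.stopNode(nodes.get(nodes.size() - 1));  // scale in when idle
        }
    }
}
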

Clustering support in WSO2 Products

At the time of writing, WSO2 WSAS (Web Service Application Server), WSO2 ESB (Enterprise Service Bus), and WSO2 Governance Registry support clustering. WSO2 products support all three goals described in the earlier section: scaling up (load balancing), achieving high availability, or both. Also, any of the Front End options mentioned in the earlier section can be used as the Load Balancer.
As discussed in the earlier section, it is worth noting that different products require different levels of state replication based on the exact usecase they serve. For instance, the ESB often requires full state replication across the nodes in the cluster, whereas nodes in a Governance Registry cluster usually share the same database and hence do not need full state replication. The level of replication required for WSAS also depends on the particular usecase. For instance, most applications that adhere to the three-tier architecture, where the service tier is stateless, do not need full state replication across all nodes in the cluster, but if the services keep state in application memory, they need full state replication.

WSO2 Clustering Architecture

When cluster nodes are stateless, or when they share data through an external data source such as a database or a shared file system, replication of their state is not required, and hence such deployments are comparatively straightforward. However, if each node holds its own state, updates need to be disseminated to the other nodes, which requires full replication. Let us look at the full replication implementation of WSO2 products.
Each WSO2 product is built on Apache Axis2, and to disseminate updates across nodes, WSO2 products use the Apache Axis2 clustering implementation. The Axis2 clustering implementation uses Group Communication to replicate state across all nodes in the cluster: Group Communication keeps track of the current membership of the cluster and guarantees that each message multicast to the group is received by all the current members of the group. As explained in Gray et al., the use of Group Communication is the state of the art in implementing state replication. When the state in a cluster node changes, each change is sent to every node in the cluster and applied at each node in the same order. These changes can be applied in two modes: in the synchronous mode, a change operation only returns after the change has been applied everywhere, whereas in the asynchronous mode, the change operation returns immediately.
However, it is worth noting that this approach assumes that when a cluster node receives an update, applying that update to the local node's state does not fail. If that assumption is not realistic, then a Two-Phase Commit protocol needs to be used when applying changes. Axis2 does not support this at the time of writing, and hence WSO2 products do not support this mode either.
Another consequence of this limitation is that Axis2 clustering only replicates changes to the runtime state, not changes to the system configuration. The state held in Axis2 is categorized into runtime state and configuration, and clustering only replicates the context hierarchy, which is the runtime state of the services, not the configuration hierarchy. Therefore, if the service configuration is changed at runtime (for instance, a service is added), that change is not replicated to the other nodes. The primary challenge is that, unlike runtime state updates, configuration updates access files and complex data structures, and therefore we cannot assume that a configuration update does not fail after it has been delivered to a node. Hence, supporting configuration replication runs the risk of leaving some nodes in the cluster out of sync, leading to untold complications. As a result, not replicating the configuration is a conscious design decision, and therefore updating configurations requires a restart. We discuss the best practices for managing configurations in the next section.
Moreover, one obvious concern is that with the aforementioned setup, the Load Balancer becomes the single point of failure, and we recommend using Linux HA here to provide a cold backup. However, it is worth noting that Linux HA cannot be used to provide high availability for Web Service containers because it cannot take application-level decisions about failure.

WSO2 Cluster Management Support

Typically, all nodes in a cluster run an identical configuration, and for ease of management, most deployments configure all cluster nodes to load their configuration from the same place. To support this usecase, WSO2 products can use a registry or a URL repository to store configurations. In the former case, all configurations are placed in a registry and multiple WSO2 products can be pointed to the same registry. In the latter case, the URL of a remote web location can be given as the repository for the WSO2 products installed on the cluster nodes. If the Governance Registry is used to store configurations, it provides a nice user interface to manage and update configurations, along with add-on services like versioning and tagging.
However, as explained in the earlier section, changes to configurations are not replicated, and therefore updating the configuration of a cluster requires a restart. We recommend the following pattern for updating the configuration of a cluster. WSO2 products support a maintenance mode, which enables administrators to stop a node from accepting new requests while it continues to process in-progress requests. With this setting, updating a cluster involves taking half of its nodes into maintenance mode, applying the changes once they have processed all outstanding requests, restarting them, and then doing the same for the other half after bringing the updated nodes back online (see the sketch below). This ensures that the configuration is updated while the cluster continues to serve user requests.
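The following Java sketch outlines that rolling-update pattern; the Node interface is a hypothetical stand-in for whatever scripts or management APIs drive the individual cluster members.

import java.util.List;

// Sketch of the rolling-update pattern described above. The Node interface is
// a hypothetical stand-in for the mechanism used to drive individual nodes.
public class RollingUpdate {
    interface Node {
        void enterMaintenanceMode();   // stop accepting new requests
        boolean hasInFlightRequests(); // still processing accepted requests?
        void applyConfiguration();     // update this node's configuration
        void restart();
    }

    static void update(List<Node> cluster) throws InterruptedException {
        int half = cluster.size() / 2;
        updateBatch(cluster.subList(0, half));               // first half; the rest keep serving
        updateBatch(cluster.subList(half, cluster.size()));  // then the other half
    }

    private static void updateBatch(List<Node> batch) throws InterruptedException {
        for (Node node : batch) {
            node.enterMaintenanceMode();       // stop accepting new requests
        }
        for (Node node : batch) {
            while (node.hasInFlightRequests()) {
                Thread.sleep(1000);            // wait for in-progress requests to drain
            }
            node.applyConfiguration();
            node.restart();
        }
    }
}
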
Finally, manually monitoring and managing the nodes in a cluster can be tedious, and WSO2 products support JMX to address this problem. Therefore, cluster nodes can be monitored through any of the standard JMX clients or management consoles, as shown in the sketch below. A Cluster Management solution, which will provide a comprehensive solution for cluster management, is under development; the blog entry by Azeez Afkham provides an initial preview of the solution.
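For example, a minimal JMX client such as the following can poll memory and load figures from a node; the host and port in the service URL are assumptions that depend on how JMX is exposed on the particular node.

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Minimal JMX client sketch for watching a cluster node from a script or a
// monitoring job. The host/port in the service URL are assumptions.
public class NodeMonitor {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://node1:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();

            // Standard platform MXBeans are available on any JVM.
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    conn, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            OperatingSystemMXBean os = ManagementFactory.newPlatformMXBeanProxy(
                    conn, ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME,
                    OperatingSystemMXBean.class);

            System.out.println("Heap used: " + memory.getHeapMemoryUsage().getUsed());
            System.out.println("System load: " + os.getSystemLoadAverage());
        }
    }
}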

Getting Your Feet Wet

If you are interested in setting up a clustered deployment of WSO2 products, it primarily involves two tasks: setting up a front end, which provides high availability, load balancing, or both, and, if required, setting up state replication between the cluster nodes.
The recommended Front End (Load Balancer) for a cluster of WSO2 products is WSO2 ESB, and to set it up, users should configure a load-balancing endpoint or a failover endpoint. The ESB Sample Index provides examples of how to set up those endpoints. Another alternative is the Apache mod_proxy module; configuring mod_proxy here is the same as configuring Tomcat behind Apache with mod_proxy, and there are many resources on the topic (e.g. Load Balancing Tomcat with Apache).
In the articles Introduction to WSO2 Carbon Clustering and WSO2 Carbon Cluster Configuration Language, Azeez Afkham describes how to configure full state replication for WSO2 products by configuring Axis2 clustering appropriately. Moreover, the article WSO2 ESB / Apache Synapse Clustering Guide describes clustering as applied to WSO2 ESB. Furthermore, more details can be found in the clustering sections of the documentation of each WSO2 product.

Conclusion


This article discussed "End Goal", "Choice of the Front End" and "Shared State across nodes" as three deciders in choosing the right clustering strategy for a particular usecase. The following table summarizes the different choices available under each decider.
Goal of Clustering - Choices:
  1. Load Balancing
  2. High Availability
  3. Both
Choice of the Front End - Choices:
  1. Apache Synapse
  2. Apache HTTPD mod_proxy
  3. Hardware
  4. DNS
State shared - Choices:
  1. Nothing Shared
  2. Shared Database
  3. Sticky Sessions
  4. Full Replication
Moreover, the article provides guidelines on when to use the different choices available under each decider. The key message we want to get across is that high availability or load balancing does not always require full state replication. We believe the article will be useful for architects who need to choose the right clustering solution from among the WSO2 clustering options.

References

  1. Asanka Abeysinghe, Eric Hubert, Ruwan Linton, Azeez Afkham, WSO2 ESB / Apache Synapse Clustering Guide, WSO2 Oxygen Tank, 2008
  2. Gray, J. and Helland, P. and O'Neil, P. and Shasha, D., The dangers of replication and a solution, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, 1996
  3. Wiesmann, M. and Pedone, F. and Schiper, A. and Kemme, B. and Alonso, G., Understanding replication in databases and distributed systems, Proceedings of the 20th International Conference on Distributed Computing Systems (ICDCS 2000), 2000
  4. Guerraoui, R. and Schiper, A., Software-based replication for fault tolerance, IEEE Computer, 1997
  5. Chockler, G.V. and Keidar, I. and Vitenberg, R., Group communication specifications: a comprehensive study, ACM Computing Surveys (CSUR), 2001
  6. Azeez Afkham, Introduction to WSO2 Carbon Clustering, WSO2 Oxygen Tank, 2009
  7. Azeez Afkham, WSO2 Carbon Cluster Configuration Language, WSO2 Oxygen Tank, 2008
  8. WSO2 ESB Sample Index, http://wso2.org/project/esb/java/2.1.2/docs/samples_index.html
  9. Avneet Mangat, Load Balancing Tomcat with Apache, 2008
  10. Azeez Afkham, Cluster Management in WSO2 Carbon, blog Entry
