Fakultät für Informatik TU München - Fakultät für Informatik
Lehrstuhl III: Datenbanksysteme
Technische Universität München
Home  |  Personen  |  Forschung  |  Lehre  |  Sonstiges  | 

Load Balancing and Automatic Service Replication

Making a service available to a large user group can be problematic. First, if there are a lot of users at the same time, a single service host may not be sufficient to provide low response times. Second, if there are any problems with the service or the service host, the service will be completely unavailable. This will annoy a lot of customers, even if, e.g., a service host is only down for some minutes. Therefore, it is reasonable to run multiple instances of a service on a cluster of service hosts connected by a LAN and to route incoming requests within the cluster in a way that there is no load skew. Since it is very expensive and error-prone to integrate the functionality for the cooperation of the service instances directly into every new service, we propose a generic solution to this problem: We developed a modular dispatcher service which can act as a proxy for arbitrary services. Using this dispatcher service, it is possible to enhance many existing services or develop new services with load balancing and high availability features without having to consider these features during their development. All kinds of services are supported as long as concurrency control mechanisms are used, e.g., by using a database as back-end (as many real-world services do). The concurrency control mechanisms ensure a consistent view and consistent modifications of the data shared between all service instances. Of course, if there is no data sharing between different instances of a service, the dispatcher can be used as well. An additional feature of our dispatcher is called automatic service replication and enables the dispatcher to install new instances of static services on demand.

1. Architecture of the Dispatcher

Our dispatcher is a software-based layer-7 switch. This kind of switch is also used in the context of Web servers  [CC02]. Such switches perform load balancing (or load sharing) using several servers on the back-end with identically mirrored content and a dispatching strategy like round robin or more complex strategies using load information about these back-end servers. Our solution is a pure software solution and - in contrast to existing layer-7 switches - is realized as a regular service. Thus, our dispatcher is more flexible, extensible, and seamlessly integrated into the service platform.

 

Figure 1:
Survey of the Load Balancing System

Figure 1 shows our dispatcher for Service S monitoring three service hosts running two instances of Service S (both connected to the same DBMS). Using information from monitoring services available at the service hosts, the dispatcher generates the dispatcher's local view of the load situation of the service hosts. Upon receiving a message (in this case for Service S), the dispatcher looks for the service instance running on the least loaded service host and forwards the message to it. As already mentioned, our dispatcher is modular, as shown in Figure 2. There are four types of modules:

Operation Switch Module: This module controls the operation mode of the dispatcher on a per-service level. In our implementation, the standard operation mode is forward. If there are, e.g., no service hosts available at all or all service hosts are overloaded, the mode will be switched to buffer or reject and all incoming requests will be buffered (until the buffer is full) or rejected (sending a ``temporary unavailable'' message back to the service caller) instantly. This is done to prevent the more expensive execution of the dispatch module when there are no suitable service hosts. Also, this module provides an administration interface, so the dispatcher can be switched to, e.g., buffer-mode, during maintenance work on service hosts.

Dispatch Module: The dispatch module implements the actual dispatching strategy. It can access the load situation of service hosts and other resources for the assignment of requests to service instances. Possible results of a dispatch strategy are an assignment of a request to a service instance, a command to initiate a service replication (see below), a reject command, or a buffer command. We already implemented a strategy which takes the load (CPU) of the service hosts into account and always assigns requests to the service instance on the least loaded service host. Currently we are working on a more sophisticated strategy which is able to handle at least the load of CPU and main memory on different types of resources (e.g., service hosts and database management systems) needed for the execution of a service. This strategy prevents overload situations not only on service hosts but also on database hosts or other resources.

Advisor Modules: Advisor modules are used to collect the data for the dispatcher's view of the load situation of all relevant resources. We already implemented advisor modules to measure the average CPU load on service hosts and on hosts running database management systems. There are lots of reasonable different advisor modules. The simplest kind of advisor module only knows two conditions of a resource: available or unavailable. For service hosts, this could be done by a simple ping on the host running the ServiceGlobe system. More complex advisors can provide more detailed information like CPU or main memory load of a service host, or the load of a database management system depending on CPU, memory, disc I/O, and others.

Config Modules: The configuration modules are used to generate the configuration for new service instances. The modules can access the load situation archive which stores aggregated load information to find, e.g., the database host which was least loaded in the last few days. This is very beneficial if there are, e.g., several instances of a database system working on replicated data. Using historic load information, a new service instance can be advised to connect to the instance of the DBMS which had the lowest average load in the past.

 

Figure 2:
Architecture of the Dispatcher

To turn an existing service into a highly available and load balanced service, a properly configured dispatcher service must be started. Additionally, some new UDDI data has to be registered and some existing data has to be modified so that all service instances and all service hosts can be found by the dispatcher. After that, the service instances are no longer contacted directly, but via the dispatcher service controlling the forwarding of the messages. A cluster of service hosts can be easily supplemented with new service hosts. The administrators of these service hosts only have to install the ServiceGlobe system and register them at the UDDI repository using the appropriate tModel, e.g., ServiceHostClusterZ, indicating that these service hosts are members of cluster Z. The dispatcher will automatically use these service hosts as soon as it notices the changes to the UDDI repository.

2. Load Measurement

The dispatcher's view of the load situation is updated at intervals of several seconds to prevent overloading the network. Thus, this view of the load is constant between two updates. Therefore, a service host SH will still be considered having low load, even if several requests have been assigned to it after the last load update. Without precautions, the dispatcher might overload SH for this reason. To avoid these overload situations, the dispatcher adds ``penalties'' to its view of the load once a request is assigned. Figure 3 illustrates the load of SH, the load reported to the dispatcher (load without penalties), and the load with penalties.

 

Figure 3:
Different Views of Current Load Situations during Request Dispatching

The grey, thick line represents the load LSH(t) of the service shost SH. The dashed line represents the dispatcher's view D'SH(t) of the load of SH which is the average load of SH over the last update interval of length Iu. This average load is calculated by SH and sent to the dispatcher at regular intervals. The function interval(t) calculates the number of the interval containing a given time t:

The dispatcher's view D'SH(t) can now be written as follows:
   D'SH(t) := avg {LSH(t') | interval (t') = interval (t) - 1 }

The black, solid line shown in Figure 3 represents the dispatcher's view including penalties DSH(t). The initial (maximum) value of a penalty (represented by PmSH,S in the equations) depends on the service S and the performance of the service host SH and is configurable. This way, every assignment of a request Ri, i.e., every dispatch operation (represented by di, ; d7 in the Figure), has an effect on the dispatcher's view of the load situation, immediately. If there is a load update from SH shortly after an assignment of a request Ri, but before SH started to process Ri, the associated penalty would be lost if the dispatcher would replace its view with the reported load, because this load would not include load caused by Ri. Thus, the load reported by the load monitors and the dispatcher's view of the load situation are remerged using aging penalties: the penalties are decreasing over time and added to further load values reported by the service host until the penalties are zero. The time Ip until a penalty is zero is configurable and normally shorter than shown in the picture, e.g., twice the time a request Ri needs to arrive at SH plus the time SH needs to start processing Ri. After Ip, we assume that a request Ri arrived at SH and that the load caused by Ri is already included in the reported load, so that the dispatcher needs not to further add any penalties for Ri. Using our notation and defining time(di) to indicate the time of the assignment di, host(di) to indicate the destination host of the assignment di, and service(di) to indicate the destination service of the assignment di, the view with penalties DSH(t) can be calculated as follows: The penalty Pdi for the assignment di is zero before the assignment. After Ip, it is zero again. In between this interval the penalty is calculated using a linear function fdi(t) with the following constraints: fdi(0) = Pmhost(di), service(di) and fdi(Ip) = 0.

When receiving load updates from the service host SH, i.e., t = for a , the load including penalties is calculated by adding all aged penalties of assignments to this SH to the reported value:

Within an update interval, penalties of new assignments to SH, i.e., assignments already done within the current update interval, are added to this load as soon as the assignments occur:

3. Automatic Service Replication

If all available service instances of a static service\footnote{Dynamic services can be executed on arbitrary service hosts and need not be installed anyway.} are running on heavily loaded service hosts and there are service hosts available having a low workload, the dispatcher can decide to generate a new instance of the service using a feature called automatic service replication. Figure 4 demonstrates this feature: service hosts A and B are heavily loaded and host C currently has no instance of Service S running. Thus, the dispatcher sends a message to service host C to create a new instance of Service S. The configuration of the new Service S is generated using the appropriate configuration module. If no service hosts having low workload are available, the dispatcher can either buffer incoming messages (until the buffer is full) or reject them depending on the configuration of the dispatcher instance and the modules.
 

Figure 4:
Automatic Replication of Service S

4. High Availability/Single Point of Failure

Using several instances of a service greatly increases its availability and decreases the average response time. Just to get an impression about the high level of availability we want to sketch this very simple analytical investigation. Assuming that the server running the dispatcher itself and the database server (in our example the database server is needed for service S) are highly available, the availability of the entire system depends only on the availability of the service hosts. The availability of a pool of service hosts can be calculated as follows:

Equation 1 calculates the availability of a single service host based on its MTBF (mean time between failures) and MTTR (mean time to repair). The availability of a pool of N service hosts can be calculated using Equation 2. Even assuming very unreliable service hosts with MTBF = 48h and MTTR = 12h a pool with 8 members will only be unavailable about 1.5 minutes a year. Because database management systems are very often mission critical for companies, there are different approved solutions for highly available database management systems [Bre98, HD91]. Thus, the remaining single point of failure is the dispatcher service. There are several possibilities to reduce the risk of a failure of the dispatcher. A pure software solution is to run two identical dispatcher services on two different hosts. Only one of these dispatchers is registered at the UDDI server. The second dispatcher is the spare dispatcher and it monitors the other one (watchdog mechanism). If the first dispatcher fails, the spare dispatcher modifies the UDDI repository such that the it points to the spare dispatcher. If the clients of the dispatcher call services according to the UDDI service invocation pattern, any failed service invocation will lead to a check for service relocation. Thus, failures of the first dispatcher will lead to an additional UDDI query and an additional SOAP message to the second dispatcher. Of course, there are many other possible solutions which are adaptable for a highly available dispatcher service known from the fields of database systems [Bre98, HD91] and Web servers [CC02] including solutions based on redundant hardware, but this is out of the scope of this paper.
Lehrstuhl für Datenbanksysteme
Letzte Änderung: 25.05.2005 um 14:38:36