Figure 1 shows our dispatcher for Service S monitoring three service hosts running two instances of Service S (both connected to the same DBMS). Using information from monitoring services available at the service hosts, the dispatcher generates its local view of the load situation of the service hosts. Upon receiving a message (in this case for Service S), the dispatcher looks for the service instance running on the least loaded service host and forwards the message to it. As already mentioned, our dispatcher is modular, as shown in Figure 2. There are four types of modules:
Operation Switch Module: This module controls the operation mode of the dispatcher on a per-service level. In our implementation, the standard operation mode is forward. If, e.g., no service hosts are available at all or all service hosts are overloaded, the mode is switched to buffer or reject, and all incoming requests are either buffered (until the buffer is full) or rejected immediately (by sending a "temporarily unavailable" message back to the service caller). This avoids the more expensive execution of the dispatch module when there are no suitable service hosts. The module also provides an administration interface, so that the dispatcher can be switched to, e.g., buffer mode during maintenance work on service hosts.
Dispatch Module: The dispatch module implements the actual dispatching strategy. It can access the load situation of service hosts and other resources when assigning requests to service instances. Possible results of a dispatch strategy are an assignment of a request to a service instance, a command to initiate a service replication (see below), a reject command, or a buffer command. We have already implemented a strategy that takes the CPU load of the service hosts into account and always assigns requests to the service instance on the least loaded service host. Currently, we are working on a more sophisticated strategy that handles at least CPU and main memory load on the different types of resources (e.g., service hosts and database management systems) needed for the execution of a service. This strategy is intended to prevent overload situations not only on service hosts but also on database hosts and other resources.
Advisor Modules: Advisor modules are used to collect the data for the dispatcher's view of the load situation of all relevant resources. We have already implemented advisor modules that measure the average CPU load on service hosts and on hosts running database management systems. Many different kinds of advisor modules are conceivable. The simplest kind of advisor module only knows two conditions of a resource: available or unavailable. For service hosts, this could be determined by a simple ping on the host running the ServiceGlobe system. More complex advisors can provide more detailed information, such as the CPU or main memory load of a service host, or the load of a database management system as a function of CPU, memory, disk I/O, and other factors.
Config Modules: The configuration modules are used to generate
the configuration for new service instances. The modules can access the load
situation archive which stores aggregated load information to find, e.g., the
database host which was least loaded in the last few days. This is very
beneficial if there are, e.g., several instances of a database system working on
replicated data. Using historic load information, a new service instance can be
advised to connect to the instance of the DBMS which had the lowest average load
in the past.
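How the four module types could interact is sketched below in Python. This is only an illustration under stated assumptions, not the ServiceGlobe implementation: the names `Mode`, `Instance`, `choose_mode`, `dispatch`, and `pick_dbms`, as well as the overload threshold of 0.9, are hypothetical, and the load dictionaries stand in for whatever the advisor modules and the load situation archive would report.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Mode(Enum):
    FORWARD = "forward"
    BUFFER = "buffer"
    REJECT = "reject"


@dataclass
class Instance:
    name: str  # hypothetical identifier of a service instance
    host: str  # service host the instance runs on


def choose_mode(cpu_load: Dict[str, float],
                overload: float = 0.9,          # assumed threshold
                fallback: Mode = Mode.BUFFER) -> Mode:
    """Operation switch: stay in forward mode only if some host is usable."""
    usable = [h for h, load in cpu_load.items() if load < overload]
    return Mode.FORWARD if usable else fallback


def dispatch(instances: List[Instance],
             cpu_load: Dict[str, float]) -> Optional[Instance]:
    """Dispatch module: pick the instance on the least loaded known host."""
    candidates = [i for i in instances if i.host in cpu_load]
    if not candidates:
        return None
    return min(candidates, key=lambda i: cpu_load[i.host])


def pick_dbms(historic_avg_load: Dict[str, float]) -> str:
    """Config module: pick the DBMS instance with the lowest historic load."""
    return min(historic_avg_load, key=lambda name: historic_avg_load[name])


# Example: CPU loads as an advisor module might report them.
load = {"host1": 0.7, "host2": 0.2, "host3": 0.5}
instances = [Instance("S-1", "host1"), Instance("S-2", "host2")]
if choose_mode(load) is Mode.FORWARD:
    target = dispatch(instances, load)  # instance S-2 on host2
```

Checking the operation mode first mirrors the motivation given above: the cheaper mode check short-circuits the dispatch strategy when no suitable host exists.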
To turn an existing service into a highly available and load balanced service, a properly configured dispatcher service must be started. Additionally, some new UDDI data has to be registered and some existing data has to be modified so that all service instances and all service hosts can be found by the dispatcher. After that, the service instances are no longer contacted directly, but via the dispatcher service controlling the forwarding of the messages. A cluster of service hosts can be easily supplemented with new service hosts. The administrators of these service hosts only have to install the ServiceGlobe system and register them at the UDDI repository using the appropriate tModel, e.g., ServiceHostClusterZ, indicating that these service hosts are members of cluster Z. The dispatcher will automatically use these service hosts as soon as it notices the changes to the UDDI repository.
In Figure 3, the grey, thick line represents the load $L_{SH}(t)$ of the service
host SH. The dashed line represents the dispatcher's view $D'_{SH}(t)$
of the load of SH, which is the average load of SH over the last update interval
of length $I_u$. This average load is calculated by SH and sent to the
dispatcher at regular intervals. The function $\mathrm{interval}(t)$ calculates
the number of the interval containing a given time $t$:
$$\mathrm{interval}(t) := \lfloor t / I_u \rfloor$$
The dispatcher's view $D'_{SH}(t)$ can now be written as follows:
$$D'_{SH}(t) := \mathrm{avg}\,\{\, L_{SH}(t') \mid \mathrm{interval}(t') = \mathrm{interval}(t) - 1 \,\}$$
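As a small illustration of this definition, the following Python sketch computes the dispatcher's view from a list of load samples. It assumes that intervals are numbered from zero, i.e., that interval(t) is the floor of t divided by the update interval length; the function names and the `(time, load)` sample format are hypothetical. In the system described here the average is computed by the service host itself; the sketch only reproduces the arithmetic.

```python
import math
from typing import List, Optional, Tuple


def interval(t: float, i_u: float) -> int:
    """Number of the update interval (of length i_u) that contains time t."""
    return math.floor(t / i_u)


def dispatcher_view(samples: List[Tuple[float, float]],
                    t: float, i_u: float) -> Optional[float]:
    """Average load over the interval preceding the one that contains t.

    samples holds (time, load) pairs; returns None if the previous
    interval contains no samples.
    """
    previous = interval(t, i_u) - 1
    values = [load for ts, load in samples if interval(ts, i_u) == previous]
    if not values:
        return None
    return sum(values) / len(values)


# Example: update interval of 10 time units; at t = 12 the view is the
# average of the samples taken in [0, 10).
samples = [(0.0, 0.2), (5.0, 0.4), (10.0, 0.8), (15.0, 0.6)]
view = dispatcher_view(samples, t=12.0, i_u=10.0)
```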
The black, solid line shown in Figure 3 represents the dispatcher's view
including penalties, $D_{SH}(t)$. The initial (maximum) value of a penalty
(represented by $P^m_{SH,S}$ in the equations) depends on the service S and the
performance of the service host SH and is configurable. This way, every
assignment of a request $R_i$, i.e., every dispatch operation (represented by
$d_1, \ldots, d_7$ in the figure), has an immediate effect on the dispatcher's
view of the load situation. If a load update from SH arrives shortly after the
assignment of a request $R_i$, but before SH has started to process $R_i$, the
associated penalty would be lost if the dispatcher simply replaced its view with
the reported load, because this load does not yet include the load caused by
$R_i$. Thus, the load reported by the load monitors and the dispatcher's view of
the load situation are merged using aging penalties: the penalties decrease over
time and are added to subsequent load values reported by the service host until
they reach zero. The time $I_p$ until a penalty reaches zero is configurable and
normally shorter than shown in the figure, e.g., twice the time a request $R_i$
needs to arrive at SH plus the time SH needs to start processing $R_i$. After
$I_p$, we assume that the request $R_i$ has arrived at SH and that the load
caused by $R_i$ is already included in the reported load, so the dispatcher need
not add any further penalties for $R_i$. Using our notation and defining
$\mathrm{time}(d_i)$ as the time of the assignment $d_i$,
$\mathrm{host}(d_i)$ as the destination host of the assignment $d_i$,
and $\mathrm{service}(d_i)$ as the destination service of the
assignment $d_i$, the view with penalties $D_{SH}(t)$ can be calculated
as follows: The penalty $P_{d_i}$ for the assignment $d_i$ is zero before the
assignment. After $I_p$ has elapsed, it is zero again. Within this interval, the
penalty is calculated using a linear function $f_{d_i}(t)$ of the time elapsed
since the assignment, with the following constraints:
$f_{d_i}(0) = P^m_{\mathrm{host}(d_i),\,\mathrm{service}(d_i)}$ and
$f_{d_i}(I_p) = 0$.
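The two constraints determine the linear function uniquely. Written out, with $x$ denoting the time elapsed since the assignment (a helper symbol introduced here for illustration), one consistent reading of the penalty is:

```latex
\[
  f_{d_i}(x) = P^m_{\mathrm{host}(d_i),\,\mathrm{service}(d_i)}
               \cdot \left( 1 - \frac{x}{I_p} \right),
  \qquad 0 \le x \le I_p
\]
\[
  P_{d_i}(t) =
  \begin{cases}
    0 & \text{if } t < \mathrm{time}(d_i) \\
    f_{d_i}\bigl(t - \mathrm{time}(d_i)\bigr)
      & \text{if } \mathrm{time}(d_i) \le t \le \mathrm{time}(d_i) + I_p \\
    0 & \text{if } t > \mathrm{time}(d_i) + I_p
  \end{cases}
\]
```

For example, with an assumed maximum penalty of 0.2 and $I_p = 4$ time units, the penalty has aged to $0.2 \cdot (1 - 2/4) = 0.1$ two time units after the assignment.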
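When a load update is received from the service host SH, i.e., at
$t = n \cdot I_u$ for some $n \in \mathbb{N}$, the load including penalties is
calculated by adding all aged penalties of assignments to this SH to the
reported value:
$$D_{SH}(t) := D'_{SH}(t) + \sum_{d_i \,:\, \mathrm{host}(d_i) = SH} P_{d_i}(t)$$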
Within an update interval, penalties of new assignments to SH, i.e., assignments
made within the current update interval, are added to this load as soon
as the assignments occur:
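One possible reading of these two rules is sketched below in Python. It is a sketch under stated assumptions, not the authors' implementation: it assumes that every penalty, including those of assignments made within the current update interval, ages linearly from its maximum value to zero over the configured aging time, and that the view is the most recently reported load plus all penalties that are still active. The names `Assignment`, `aged_penalty`, and `view_with_penalties` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assignment:
    time: float   # time of the dispatch operation
    host: str     # destination service host
    p_max: float  # initial (maximum) penalty for this host and service


def aged_penalty(a: Assignment, t: float, i_p: float) -> float:
    """Linear penalty: p_max at assignment time, zero after i_p has elapsed."""
    elapsed = t - a.time
    if elapsed < 0 or elapsed >= i_p:
        return 0.0
    return a.p_max * (1.0 - elapsed / i_p)


def view_with_penalties(reported_load: float,
                        assignments: List[Assignment],
                        host: str, t: float, i_p: float) -> float:
    """Last reported load of `host` plus all still-active penalties at t."""
    return reported_load + sum(
        aged_penalty(a, t, i_p) for a in assignments if a.host == host
    )


# Example: a load update of 0.3 arrives at t = 10; one request was assigned
# to SH at t = 8, another one follows at t = 11.
history = [
    Assignment(time=8.0, host="SH", p_max=0.2),
    Assignment(time=11.0, host="SH", p_max=0.2),
    Assignment(time=9.0, host="other", p_max=0.5),
]
at_update = view_with_penalties(0.3, history, "SH", t=10.0, i_p=4.0)
after_new = view_with_penalties(0.3, history, "SH", t=11.0, i_p=4.0)
```

In this example the view at the update is 0.3 plus the half-aged penalty 0.1 of the first assignment; one time unit later the first penalty has aged to 0.05 and the new assignment contributes its full penalty of 0.2.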