
Lecture 30: Web Server Redirection

When a server receives a request from a client, it either serves the request itself or redirects it to another server. With ever-increasing traffic and content, web server redirection becomes a necessity. For very frequently visited websites, one server cannot process all the requests it receives, and a lone server is also a single point of failure. So the primary reasons for employing Web Server Redirection are load balancing, reduced service time and fault tolerance. All these servers have the same IP binding.

The main issues in web server redirection are where to redirect and how to redirect. As there may be many servers distributed across a LAN or WAN, the "where to redirect" problem concerns choosing one of these servers. The decision may depend on the distance of the client from a server, how heavily a server is loaded, etc. The "how to redirect" problem concerns the level at which redirection is implemented. It may be implemented at the application level, the IP level or the MAC level. Redirection can also be implemented in DNS itself. Below we take a closer look at all of them.

Ideally, Web Server Redirection should be transparent to the clients. It has been statistically observed that when clients are presented with a set of servers to choose from, they generally tend to click on the first choice presented, so a manual choice does little to spread the load.

Client Side Approaches:

Redirection happens when the browser itself chooses a number, that is, one of a set of available mirrors. Earlier, Netscape had a number of servers available as www1.netscape, www2, www3, and whenever a request to Netscape was made, the browser selected one of them to send the request to. The problem with this approach is that the client must know all the replicas beforehand, which is not practical. There is also the concept of a Smart Client, where the client gets a specific applet for a website. The applet contains the information about which server the request should be redirected to. The Smart Client is so named because the client becomes smart enough, by downloading the applet, to choose the replica itself. A proxy can also implement a redirection mechanism, so that it is shared by all the clients served by the proxy.
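
The browser-side selection described above can be sketched as follows. This is only an illustration: the mirror hostnames and the function name pick_mirror are hypothetical, and a uniform random choice is assumed since the notes do not say how the browser picked a number.

```python
import random

# Hypothetical mirror list that the client must know beforehand.
MIRRORS = ["www1.example.com", "www2.example.com", "www3.example.com"]


def pick_mirror(mirrors, rng=random):
    """Pick one replica uniformly at random (assumed selection policy)."""
    if not mirrors:
        raise ValueError("the client knows no mirrors")
    return rng.choice(mirrors)
```

Note that nothing here looks at server load, which is exactly the first disadvantage of client-side approaches.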

Disadvantages

Client-side approaches (CSA) do not take into account the load distribution on the servers.
More time overhead.
Redirection has to be hand-crafted for every website.

DNS Based Approaches

All the mirrors have different IP addresses. When the DNS server receives a request to resolve a name, it returns one of these addresses. The DNS server is therefore required to know about all the replicas.

Disadvantages: The client and all the intermediate DNS servers cache the returned IP address for some time. During that time all requests to the website go to the same server, which may therefore become overloaded. Setting a small TTL on the DNS response may not be viable, because an intermediate DNS server might refuse to accept the response if the TTL is below a certain threshold. Also, because DNS must now remember all the replicas of a website, the load on DNS may increase.

TTL can be introduced in two ways.

Constant TTL Approach

The simplest mechanism is Round Robin, which does not take into account the load distribution on the servers or their availability. Alternatively, DNS may keep track of the state of all the servers and choose one of them depending on load and availability. Client-state-based approaches may be implemented too; here the mirror is determined by the location of the client. As servers may be located across a WAN, the simplest thing is to choose the server nearest to the client. A combination of client-based and server-based mechanisms can also be implemented.
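
A minimal sketch of the two server-selection policies with a constant TTL is given below. The class and function names, the addresses and the TTL value of 300 seconds are all assumptions made for illustration; a real DNS server would be configured rather than coded this way.

```python
from itertools import cycle

CONSTANT_TTL = 300  # seconds; assumed value, the same for every answer


class RoundRobinResolver:
    """Hands out replica addresses in turn, ignoring load and availability."""

    def __init__(self, addresses, ttl=CONSTANT_TTL):
        if not addresses:
            raise ValueError("at least one replica address is required")
        self._ring = cycle(addresses)
        self.ttl = ttl

    def resolve(self):
        # Returns (address, ttl) for one name-resolution request.
        return next(self._ring), self.ttl


def least_loaded(load_by_address):
    """Server-state-based choice: the address with the smallest reported load."""
    return min(load_by_address, key=load_by_address.get)
```

For example, a resolver built over two addresses answers with the first, the second, then the first again, while least_loaded({"10.0.0.1": 5, "10.0.0.2": 2}) picks 10.0.0.2.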

Adaptive TTL
Here a judicious choice of both server and TTL is made. If a mirror is not receiving many requests, the TTL for its address can be set high.
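
One possible way to derive such a TTL is sketched below. The linear formula, the function name adaptive_ttl and the 30 to 600 second bounds are assumptions chosen for illustration, not something prescribed in the lecture.

```python
def adaptive_ttl(current_load, capacity, min_ttl=30, max_ttl=600):
    """Return a TTL in seconds that shrinks as the mirror gets busier.

    A lightly loaded mirror gets a TTL near max_ttl, a saturated one
    gets min_ttl. The linear interpolation is an assumed policy.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    utilisation = min(max(current_load / capacity, 0.0), 1.0)
    return round(max_ttl - utilisation * (max_ttl - min_ttl))
```

With these assumed bounds, an idle mirror gets 600 seconds and a fully loaded one gets 30, so cached answers pointing at a busy mirror expire sooner.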

Dispatcher Based Approaches:

Here a dedicated server called the dispatcher receives all the requests and then carries out the redirection. The dispatcher has a single virtual IP address and identifies the individual servers through some other address. The various ways in which this approach is implemented are summarized below.

Packet Single Rewriting: On receiving a packet, the dispatcher changes the destination IP address in the IP header to the IP address of the replica and sends the packet on. After serving the request, the replica replies with the source IP address set to the IP address of the dispatcher. The dispatcher keeps track of each connection and the replica used for it, and redirects all the packets of that connection to the same replica.
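
The dispatcher's side of single rewriting can be modelled roughly as below. This is a toy model, not a packet-processing implementation: the Packet class, the addresses, and the round-robin choice of replica for a new connection are all assumptions.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str


class SingleRewritingDispatcher:
    def __init__(self, virtual_ip, replica_ips):
        self.virtual_ip = virtual_ip
        self._next_replica = cycle(replica_ips)  # assumed selection policy
        self._connections = {}  # (client ip, client port) -> replica ip

    def forward(self, packet):
        """Rewrite only the destination IP; the source stays the client's."""
        key = (packet.src_ip, packet.src_port)
        if key not in self._connections:
            self._connections[key] = next(self._next_replica)
        return replace(packet, dst_ip=self._connections[key])
```

The connection table is what keeps all packets of one connection going to the same replica. The reply is not modelled, since in this scheme the replica itself sets the source IP to the dispatcher's address.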

Packet Double Rewriting: Here the replica replies with its own IP address instead of the dispatcher's IP, and the dispatcher changes the source IP address of the reply to its own IP. Since the dispatcher changes an IP address twice, once in each direction, the approach is called Packet Double Rewriting.
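
The two rewrites can be shown side by side in a small model. The Packet class, the function names and the addresses are hypothetical and only serve to make the two directions explicit.

```python
from dataclasses import dataclass, replace

VIRTUAL_IP = "192.0.2.1"  # assumed virtual IP of the dispatcher


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str


def rewrite_inbound(packet, replica_ip):
    """Client -> replica: the dispatcher replaces the destination IP."""
    return replace(packet, dst_ip=replica_ip)


def rewrite_outbound(packet, virtual_ip=VIRTUAL_IP):
    """Replica -> client: the dispatcher replaces the source IP with its own."""
    return replace(packet, src_ip=virtual_ip)
```

Compared with single rewriting, the replica needs no knowledge of the virtual IP, at the cost of the dispatcher handling the reply traffic as well.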

Packet Forwarding: No change is made to the IP packet. Replicas are identified by their MAC addresses. The dispatcher selects a replica for a connection and forwards all the incoming IP packets of that connection to the MAC address of that replica.
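
A rough model of MAC-level forwarding is given below. The Frame class, the connection identifier and the way a replica is chosen for a new connection are assumptions; the point is only that the IP payload passes through untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    ip_payload: bytes  # the IP packet, never modified by the dispatcher


class ForwardingDispatcher:
    def __init__(self, replica_macs):
        self._replica_macs = list(replica_macs)
        self._connections = {}  # connection id -> replica MAC

    def forward(self, connection_id, frame):
        """Re-address the frame to the replica's MAC, keeping the IP packet."""
        if connection_id not in self._connections:
            # Assumed policy: assign new connections to replicas in turn.
            index = len(self._connections) % len(self._replica_macs)
            self._connections[connection_id] = self._replica_macs[index]
        return Frame(self._connections[connection_id], frame.ip_payload)
```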

Server Based Approaches:

Redirection is done at the HTTP level. One of the mirrors, on getting a request, redirects the request to another server. The message we often see, "Please wait for a few seconds..", is because of this.
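
An HTTP-level redirect of this kind might look like the sketch below, using Python's standard http.server module. The mirror URL and the helper name redirect_target are hypothetical, and status 302 is one assumed choice among the HTTP redirect codes.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mirror that takes over the request.
MIRROR_BASE_URL = "http://mirror2.example.com"


def redirect_target(path, mirror_base_url=MIRROR_BASE_URL):
    """Build the URL of the same path on the other server."""
    return mirror_base_url.rstrip("/") + "/" + path.lstrip("/")


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET with a redirect instead of serving content.
        self.send_response(302)
        self.send_header("Location", redirect_target(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()
```

Passing RedirectHandler to http.server.HTTPServer and calling serve_forever() would make every GET bounce to the mirror. The client then has to issue a second request, which is the extra delay the user notices.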

Approaches Comparison: