Lecture 30: Web Server Redirection
When a server receives a request from a client, it either serves
the request itself or redirects it to another server. With
ever-increasing traffic and content, web server redirection becomes
a necessity. For websites visited very frequently, one server
cannot process all the requests received. If there is only one server, it
also runs the risk of being a single point of failure. So the primary
reasons for employing Web Server Redirection are load balancing,
reduced service time and fault tolerance. All the replica servers have
the same IP binding.
The main issues in web server redirection are where to redirect and how
to redirect. As there may be many servers, distributed across LANs or
WANs, the "where to redirect" problem concerns choosing one of these
servers. The decision may depend upon the distance of the client from a
server, how heavily a server is loaded, and so on. The "how to redirect"
problem concerns the level at which redirection is implemented. It may
be implemented at the application level, the IP level or the MAC level.
Redirection can also be implemented in DNS itself. Below we take a
closer look at all of them.
Ideally, Web Server Redirection should be transparent to the clients. It
has been statistically observed that when clients are presented with a
set of server choices for where to send their request, they
generally tend to click on the first choice presented, so leaving the
choice to the user does little to spread the load.
Client Side Approaches:
Here the redirection is done by the browser itself, which picks one of
a set of numbered mirrors available to it.
Earlier, Netscape had a number of servers available as www1.netscape,
www2, www3, and whenever a request to Netscape was made, the browser
selected one of them to send it to. The problem with this approach is
that the client must know all the replicas beforehand, which is
not practical. There is also the concept of a Smart Client, where the
client gets a specific applet for a website. The applet contains the
information about which server the request should be redirected to. The
Smart Client is so named because the client becomes smart enough, by
downloading the applet, to choose the replica itself. A proxy can also
implement the redirection mechanism, so that it is shared by all the
clients served by the proxy.
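
The browser-side choice described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
mirror names and the helper functions pick_mirror and build_url are
hypothetical, and a uniform random choice is assumed because the notes
do not say how the browser picked a number.

```python
import random

# Hypothetical mirror list. The client has to know these names in advance,
# which is exactly the weakness of the client side approach.
MIRRORS = ["www1.example.com", "www2.example.com", "www3.example.com"]


def pick_mirror(mirrors, rng=random):
    """Return one mirror chosen uniformly at random, ignoring server load."""
    if not mirrors:
        raise ValueError("no mirrors known to the client")
    return rng.choice(mirrors)


def build_url(path, mirrors=MIRRORS, rng=random):
    """Build the URL the browser would request for the given path."""
    return "http://" + pick_mirror(mirrors, rng) + path
```

Note that nothing in this sketch looks at how busy a mirror is; the
choice is blind to server state.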
Disadvantages
- Client side approaches do not take into account the load
distribution on the servers.
- More time overhead
- Redirection has to be hand crafted for each website
DNS Based Approaches
All the mirrors have different IP addresses. When the DNS server
receives a request to resolve a name, it returns one of these addresses.
The DNS server here is required to know about all the replicas.
Disadvantages
- The client and all the intermediate name servers cache the IP address
returned for some time. So for that time all their requests to that
website go to the same server. As a result this particular server might
become overloaded. Limiting the caching time with a small TTL on the
DNS response may not be viable, because an intermediate DNS server
might refuse to accept the response if its TTL is found to be below a
certain threshold.
- Because DNS must now remember all the replicas of a website, the load
on DNS may increase.
The TTL can be set in two ways.
Constant TTL Approach
Here every response carries the same TTL and only the choice of server
varies. The simplest mechanism for choosing the server is Round
Robin. This does not take into account the load distribution on the
servers or their availability. DNS may instead keep track of the
states of all the servers and choose one of them depending upon load
and availability. Client state
based approaches may be implemented too. Here the mirror is
determined from the location of the client.
As servers may be located across a WAN, the simplest thing is to choose
the server that is nearest to the client. Besides, a combination of
client and server based mechanisms can be implemented.
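
As a rough sketch of the constant TTL approach, the following Python
shows a round robin chooser next to a server state based chooser. The
class RoundRobinResolver, the function least_loaded, the addresses and
the load figures are all hypothetical and only illustrate the idea; a
real DNS server is far more involved.

```python
from itertools import cycle


class RoundRobinResolver:
    """Hand out replica addresses in turn, always with the same TTL."""

    def __init__(self, addresses, ttl=60):
        if not addresses:
            raise ValueError("at least one replica address is required")
        self._ring = cycle(list(addresses))
        self.ttl = ttl  # constant TTL in seconds (assumed value)

    def resolve(self):
        """Return (address, ttl) for the next replica in the ring."""
        return next(self._ring), self.ttl


def least_loaded(load_by_address):
    """Server state based choice: the address with the lowest reported load."""
    return min(load_by_address, key=load_by_address.get)
```

For example, a resolver built with two addresses answers with the
first, then the second, then the first again, whatever their load,
while least_loaded needs the load of every replica to be reported to
the DNS server.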
Adaptive TTL
Here a judicious choice of both server and TTL is made. If a mirror is
not receiving many requests, the TTL of responses pointing to it can be
set high; for a heavily loaded mirror it is kept low.
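
One possible way to picture adaptive TTL is a TTL that shrinks linearly
as the chosen mirror's load grows. The linear formula, the bounds
min_ttl and max_ttl, and the function names below are assumptions made
for illustration; the notes do not prescribe a particular rule.

```python
def adaptive_ttl(load, min_ttl=10, max_ttl=300):
    """Map a load in [0, 1] to a TTL in seconds: idle -> max_ttl, busy -> min_ttl."""
    load = min(max(load, 0.0), 1.0)  # clamp out-of-range load reports
    return round(max_ttl - (max_ttl - min_ttl) * load)


def resolve(load_by_address):
    """Pick the least loaded mirror and a TTL that matches its load."""
    address = min(load_by_address, key=load_by_address.get)
    return address, adaptive_ttl(load_by_address[address])
```

With these assumed bounds an idle mirror is cached for 300 seconds and
a saturated one for only 10, so clients return to the DNS server sooner
when the mirror they were given is busy.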
Dispatcher Based Approaches:
Here a dedicated server, called the dispatcher,
collects all the requests and then carries out the redirection.
The dispatcher has a single virtual IP address and identifies the
individual servers through some other address. The various ways in
which this approach is implemented are summarized below.
Packet Single Rewriting: The dispatcher,
on receiving a packet, changes the destination IP address in the
IP header to the IP address of the replica and sends the packet on. The
replica, after serving the request, replies with the source IP address
set to the IP address of the dispatcher. The dispatcher keeps track of
the connection and the replica used, and redirects all the packets of
that connection to that replica.
Packet Double Rewriting: Here
the replica replies with its own IP address instead of the dispatcher's
IP. The reply passes through the dispatcher, which changes the source IP
address to its own IP. As the dispatcher rewrites an IP address twice,
once in each direction, the scheme is named
Packet Double Rewriting.
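
The two rewriting schemes can be sketched with a toy packet model. The
Packet class, the RewritingDispatcher class and all addresses are
hypothetical, replicas are assigned round robin for simplicity, and a
connection is identified only by the client's address string; real
dispatchers work on actual IP and TCP headers.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: str  # source address, e.g. "198.51.100.7:40000" for a client
    dst: str  # destination address
    payload: str = ""


class RewritingDispatcher:
    def __init__(self, virtual_ip, replica_ips):
        self.virtual_ip = virtual_ip
        self._next_replica = cycle(list(replica_ips))
        self._connections = {}  # client address -> replica IP

    def inbound(self, packet):
        """Client -> replica (both schemes): rewrite the destination address."""
        replica = self._connections.get(packet.src)
        if replica is None:  # first packet of a new connection
            replica = next(self._next_replica)
            self._connections[packet.src] = replica
        return replace(packet, dst=replica)

    def outbound(self, packet):
        """Replica -> client (double rewriting only): rewrite the source address."""
        return replace(packet, src=self.virtual_ip)
```

In single rewriting only inbound is used, because the replica already
puts the virtual IP in the source field of its reply. In double
rewriting the reply also goes through outbound, which is the second
rewrite.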
Packet Forwarding: No
change is made to the IP packet. Replicas are identified through their
MAC addresses. The dispatcher selects a replica for a connection and
forwards all the incoming IP packets of that connection to the MAC
address of that replica.
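
A minimal sketch of packet forwarding, assuming a toy Frame type that
carries a destination MAC address alongside the untouched IP addresses.
Frame, ForwardingDispatcher and the MAC and IP values are hypothetical,
and the connection is keyed by the client IP alone to keep the example
short.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str  # link-level destination, the only field the dispatcher changes
    src_ip: str   # IP header fields, left untouched
    dst_ip: str


class ForwardingDispatcher:
    def __init__(self, replica_macs):
        self._replica_macs = list(replica_macs)
        if not self._replica_macs:
            raise ValueError("at least one replica MAC address is required")
        self._connections = {}  # client IP -> replica MAC

    def forward(self, frame):
        """Send the frame to the replica chosen for this client's connection."""
        mac = self._connections.get(frame.src_ip)
        if mac is None:  # new connection: assign replicas in turn
            index = len(self._connections) % len(self._replica_macs)
            mac = self._replica_macs[index]
            self._connections[frame.src_ip] = mac
        return replace(frame, dst_mac=mac)
```

Because the IP header is never modified, the replica sees the virtual
IP as the destination, unlike in the rewriting schemes.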
Server Based Approaches:
Redirection is done at the HTTP level. One of the mirrors, on getting a
request, redirects the request to
another server. The message we often see, "Please wait for a few
seconds..", is because of this.
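
HTTP level redirection can be illustrated with Python's standard
http.server module. This is only a sketch: the mirror URLs, the
redirect_target helper and the rule for picking a mirror (a simple
function of the client address) are assumptions, and status code 302 is
chosen here as one common way to redirect.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mirrors that this server may hand requests over to.
MIRRORS = ["http://mirror1.example.com", "http://mirror2.example.com"]


def redirect_target(path, client_ip, mirrors=MIRRORS):
    """Return the full URL on the mirror chosen for this client."""
    index = sum(client_ip.encode()) % len(mirrors)  # same client -> same mirror
    return mirrors[index] + path


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        """Answer every GET with a redirect to the chosen mirror."""
        self.send_response(302)
        self.send_header("Location", redirect_target(self.path, self.client_address[0]))
        self.end_headers()


# To try it locally (blocks until interrupted):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8080), RedirectHandler).serve_forever()
```

The client then has to issue a second request to the address in the
Location header, which is where the extra delay of this approach comes
from.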
Approaches Comparison: