Saturday, 26 November 2011

Tomcat Load Balancing with Apache Web Server


This post gives a brief overview of load balancing. It walks through setting up the Apache HTTP Server, configuring Apache as a load balancer, and configuring the Tomcat instances (the back-end servers) in a load-balanced environment.

About Server Load Balancing

Server load balancing is the process of distributing requests among servers.



Road Map

To set up the load balancing environment, follow these sections:

·         Installing Apache Web Server in Unix/Linux Environment
·         Apache Web Server Side Configuration for Load balancing
·         Tomcat Server Side Configuration for Load balancing
·         Re-starting Apache Web Server
·         Accessing Web Applications in Load Balanced Environment

To manage and monitor the load balancing environment, follow these sections:

·         Apache Web Server - Balancer Manager UI
·         Safe Removal of a Tomcat Server in Load balancing Environment

To improve the performance of the back-end servers, follow this section:

·         Caching Static Contents at Apache Web Server

Installing Apache Web Server in Unix/Linux Environment

✓  Prerequisite
1.      Apache Web Server: httpd 2.2.8 (or a higher version)
2.       A C compiler (‘cc’ or ‘gcc’) must be present to compile httpd 2.2.8.
3.       Ensure that the following commands are in PATH:
·         make
To add the location of the ‘make’ command to PATH, run:
PATH=$PATH:/usr/ccs/bin
·         ld
To add the location of the ‘ld’ command to PATH, run:
PATH=$PATH:/usr/ucb
4.       Ensure that the following directory is in LD_LIBRARY_PATH:
·         /usr/ucblib
To add this directory to LD_LIBRARY_PATH, run:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/ucblib
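
Before running ‘configure’, it can help to confirm that the tools listed above are actually reachable. The following is only a minimal Python sketch and not part of the original setup: the function names are made up for illustration, and it checks PATH only, not LD_LIBRARY_PATH.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


def has_c_compiler():
    """True if at least one of 'cc' or 'gcc' is on the PATH."""
    return len(missing_tools(["cc", "gcc"])) < 2


if __name__ == "__main__":
    print("Missing build tools:", missing_tools(["make", "ld"]) or "none")
    print("C compiler found:", has_c_compiler())
```

If ‘make’ or ‘ld’ is reported as missing, extend PATH as described above and run the check again.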

✓  Installation Instructions

Follow the steps below, in order, to install Apache Web Server on a Unix machine.
Download the Unix source: httpd-2.2.8.tar.gz

$ gzip -d httpd-2.2.8.tar.gz
$ tar xvf httpd-2.2.8.tar
$ cd httpd-2.2.8
$ ./configure --prefix=<apache-install-root> --enable-proxy --enable-proxy_ajp --enable-proxy_balancer --enable-headers --enable-cache --enable-disk_cache --enable-rewrite
NOTE:
1. The ‘--enable-...’ options enable the required modules.
2. Replace <apache-install-root> with the path where Apache is to be installed. Create this installation directory before running the ‘configure’ command.
$ make
$ make install
Edit file <apache-install-root>/conf/httpd.conf
·         Set the ‘Listen’ port (by default, the line is uncommented). The default port is ‘80’. Replace it with a value greater than 1100.
       Example: Listen 1380
·         Uncomment ‘ServerName’. Replace ‘www.example.com’ with the IP address of the box where Apache is being installed, and ‘80’ with the Listen port.
Example: ServerName 192.168.99.29:<Listen-Port>
Replace <Listen-Port> with the Listen port set above (e.g. 1380).
Change to the <apache-install-root>/bin/ directory and run the following command:
$ ./apachectl start
Access the following URL to check whether Apache is working. If the index page (showing ‘It works!’) is displayed, Apache has been set up properly.
URL: http://<apache-host>:<listen-port>/
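
The same check can be scripted instead of done in a browser. This is a rough Python sketch under the assumption that the default index page is still in place; the function names are illustrative and not part of Apache.

```python
from urllib.request import urlopen


def looks_like_default_page(body):
    """True if the response body contains Apache's default 'It works!' text."""
    return "It works!" in body


def apache_is_up(url, timeout=5):
    """Fetch the URL and check for the default index page.

    Returns False on a connection or HTTP error.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        return False
    return looks_like_default_page(body)
```

With the example values used in this post, the call would be apache_is_up("http://192.168.99.29:1380/").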

Apache Web Server Side Configuration for Load balancing

Load-balancing configuration is made in the httpd.conf file. Edit the file at the location below as instructed:

<apache-install-root>/conf/httpd.conf

Add the following ‘VirtualHost’ entry at the end of the httpd.conf file. This configures Apache for load balancing (using the mod_proxy, mod_proxy_balancer and mod_proxy_ajp modules).

<VirtualHost <Apache-Host-IP>:<listen-port>>

    ServerName <Apache-Host-IP>:<listen-port>

    <Proxy balancer://loadBalancer>
        BalancerMember ajp://<TomcatHostIP>:<AJPPort> route=<Tomcat_jvmRoute> loadfactor=50
        BalancerMember ajp://<TomcatHostIP>:<AJPPort> route=<Tomcat_jvmRoute> loadfactor=50
    </Proxy>

    <Location /webapp1>
        ProxyPass balancer://loadBalancer/webapp1 lbmethod=byrequests stickysession=JSESSIONID
    </Location>

    <Location /webapp2>
        ProxyPass balancer://loadBalancer/webapp2 lbmethod=byrequests stickysession=JSESSIONID
    </Location>

    # For solving the trailing slash problem
    RewriteEngine on
    RewriteRule ^/webapp1$ webapp1/ [R]
    RewriteRule ^/webapp2$ webapp2/ [R]
</VirtualHost>


·         <listen-port>: the Listen port configured above.
·         <Apache-Host-IP>: the same as apache-host.
·         <TomcatHostIP>: the host where Tomcat is running.
·         <AJPPort>: the AJP port set in the Tomcat server.xml. Instructions for setting the AJP port are provided in the next section.
·         <Tomcat_jvmRoute>: the jvmRoute value of that Tomcat, as configured in its <Engine jvmRoute="..."> entry. Instructions for setting the jvmRoute are provided in the next section.

Notes:

·         Each ‘BalancerMember’ entry in the Proxy element of the VirtualHost configuration above registers one Tomcat as a balancer member. Add one entry per Tomcat: one entry if only one Tomcat is used, two entries for two Tomcats, and so on, replacing <TomcatHostIP>, <AJPPort> and <Tomcat_jvmRoute> appropriately in each entry.

BalancerMember ajp://<TomcatHostIP>:<AJPPort> route=<Tomcat_jvmRoute>  loadfactor=50

·         The Location entries above are just samples for two web applications running on the Tomcat servers. If Apache is being configured for another web application, create a Location entry for that web application. The general format of the Location entry is shown below, where <webapp_XYZ> is the name of the web application being configured.

               
<Location /<webapp_XYZ>>
ProxyPass balancer://loadBalancer/<webapp_XYZ>  lbmethod=byrequests stickysession=JSESSIONID
</Location>

·         Similarly, modify the RewriteRule to reflect the web application name. The general format of the RewriteRule is shown below, where <webapp_XYZ> is the name of the web application being configured.

RewriteRule    ^/<webapp_XYZ>$ <webapp_XYZ>/ [R]
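
When there are many Tomcats or web applications, the three notes above amount to filling in the same template repeatedly. As a rough illustration only, the Python sketch below renders the VirtualHost entry from a list of back ends and application names. The function name and the back-end addresses in the usage note are hypothetical; the directives mirror the sample configuration above.

```python
def render_virtual_host(apache_host, listen_port, tomcats, webapps):
    """Render the VirtualHost entry as text.

    tomcats: list of (host, ajp_port, jvm_route) tuples, one per Tomcat.
    webapps: list of web application names (without slashes).
    """
    lines = [
        f"<VirtualHost {apache_host}:{listen_port}>",
        f"    ServerName {apache_host}:{listen_port}",
        "    <Proxy balancer://loadBalancer>",
    ]
    # One BalancerMember entry per Tomcat
    for host, ajp_port, jvm_route in tomcats:
        lines.append(
            f"        BalancerMember ajp://{host}:{ajp_port} "
            f"route={jvm_route} loadfactor=50"
        )
    lines.append("    </Proxy>")
    # One Location entry per web application
    for app in webapps:
        lines += [
            f"    <Location /{app}>",
            f"        ProxyPass balancer://loadBalancer/{app} "
            "lbmethod=byrequests stickysession=JSESSIONID",
            "    </Location>",
        ]
    # One trailing-slash RewriteRule per web application
    lines.append("    RewriteEngine on")
    for app in webapps:
        lines.append(f"    RewriteRule ^/{app}$ {app}/ [R]")
    lines.append("</VirtualHost>")
    return "\n".join(lines)
```

For example, render_virtual_host("192.168.99.29", 1380, [("10.0.0.1", 8009, "jvm1"), ("10.0.0.2", 8009, "jvm2")], ["webapp1", "webapp2"]) produces an entry with two BalancerMember lines and two Location blocks (the 10.0.0.x addresses are placeholders).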

Tomcat Server Side Configuration for Load balancing

The following configuration is required for every Tomcat that is to be load-balanced by the Apache server. Modify the server.xml file located at <CATALINA_HOME>/conf/server.xml for each Tomcat server.
1.       Uncomment the <Connector> entry for the AJP protocol. This is used for communication between the Apache server and Tomcat.
<Connector port="<port>" enableLookups="false" redirectPort="8443" protocol="AJP/1.3" />
2.       Add the jvmRoute attribute to the existing <Engine> entry to enable the sticky sessions used by Apache.
<Engine name="Standalone" defaultHost="localhost" jvmRoute="<value>">
·         <port>: the AJP port.
·         <value>: the value for jvmRoute (e.g. jvm1, jvm_wb10, gui_61_20).

Note:

·         The default AJP connector port is 8009. Tomcat instances running on the same host must each be configured with a different value.
·         The Engine jvmRoute value must be unique for each Tomcat instance.
·         Each Tomcat must be restarted after any configuration change in its server.xml file.
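
The reason jvmRoute must be unique is that Tomcat appends it to the session ID after a dot (for example ABC123.jvm1), and the balancer uses that suffix to find the member whose route matches. The Python sketch below models that lookup. It is a simplified illustration, not the mod_proxy_balancer code, and the function names and back-end addresses are hypothetical.

```python
def route_from_session_id(session_id):
    """Return the route suffix of a session ID, or None if there is none."""
    _, dot, route = session_id.partition(".")
    return route if dot and route else None


def choose_backend(session_id, members):
    """Return the sticky back end for a session, or None if there is none.

    members maps each jvmRoute value to a back-end URL. A result of None
    means the request is free to go to any member.
    """
    if not session_id:
        return None
    return members.get(route_from_session_id(session_id))
```

With members = {"jvm1": "ajp://10.0.0.1:8009", "jvm2": "ajp://10.0.0.2:8009"}, a request carrying JSESSIONID=ABC123.jvm2 maps to the second entry. If two Tomcats shared a jvmRoute, this lookup could not tell them apart.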

Re-starting Apache Web Server

Once the above configuration steps are completed, start the Apache server.

Running the following command from the <apache-install-root>/bin directory starts the Apache server:
$ ./apachectl start
Note: If any changes are made to ‘<apache-install-root>/conf/httpd.conf’, the Apache server must be restarted. First stop Apache and then start it. Run these commands from ‘<apache-install-root>/bin/’.

$ ./apachectl stop
$ ./apachectl start

Accessing Web Applications in Load Balanced Environment

Now, after struggling through so many configuration steps, the time has come to access your web applications in the load-balanced environment.
Access your Apache server URL with one of the configured web applications.

Like: http://Apache-IP:Apache-Port/web-application

The request first goes to the Apache server, and Apache forwards (proxies) it to one of the back-end servers in the load-balanced environment.

Apache Web Server - Balancer Manager UI
✓  Configuration Steps

Apache Web Server provides a UI to monitor and manage the load balancing environment. The ‘balancer-manager’ UI can be used to change balancing parameters, such as the load factor, dynamically. To configure the balancer manager UI, add the configuration below to the httpd.conf file:

<Location /balancer-manager>
    SetHandler balancer-manager
    Order Deny,Allow
    Deny from all
    Allow from <IP-address>
</Location>

<IP-address>
·         The IP address of the machine from which the balancer-manager UI can be accessed.


More than one address can be allowed, like this:

<Location /balancer-manager>
    SetHandler balancer-manager
    Order Deny,Allow
    Deny from all
    Allow from <IP1>  <IP2>  <IP3> [or ALL]
</Location>

In this example, the machines at <IP1>, <IP2> and <IP3> can access the balancer-manager UI. If the UI must be accessible from all machines, configure ‘ALL’ instead of individual IPs.
 
✓  How to Use?

The ‘balancer-manager’ UI can be accessed using the following URL:

http://<Apache-Host-IP>:<listen-port>/balancer-manager

The balancer manager can be used to check the status of the servers used for load balancing and to prepare the safe removal of a node.

The following status fields are displayed in the manager UI:

a.       Method: The balancer's load-balancing method (for example, byrequests). It selects the load-balancing scheduler to use: either byrequests, to perform weighted request counting, or bytraffic, to perform weighted traffic byte count balancing. The default is byrequests.

b.       FailoverAttempts: Maximum number of failover attempts before giving up.

c.        StickySession: Balancer sticky session name. The value is usually set to something like JSESSIONID or PHPSESSIONID, depending on the back-end application server that supports sessions.

d.       Timeout: Balancer timeout in seconds. If set, this is the maximum time to wait for a free worker. The default is not to wait (0).

Editable fields

e.        Load Factor: Worker load factor. It is a number between 1 and 100 and defines the normalized weighted load applied to the worker.

f.        LB Set: Sets the load balancer cluster set that the worker is a member of. The load balancer will try all members of a lower numbered lbset before trying higher numbered ones.

g.        Route: Route of the worker when used inside load balancer. This represents the jvmRoute in the tomcat server.xml file.

h.       Route Redirect: Redirection route of the worker. This value is usually set dynamically to enable safe removal of the node from the cluster. If set, all requests without a session ID are redirected to the BalancerMember whose route parameter is equal to this value.

i.         Status: The status of this worker: Enabled or Disabled. This can be set as required.
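
To get a feel for how the byrequests method and the Load Factor field interact, here is a small Python simulation of weighted request counting. It follows the general scheme described in the mod_proxy_balancer documentation (every worker earns its load factor on each request, and the chosen worker pays back the total), but it is a simplified sketch that ignores sticky sessions, LB sets and worker status. The names simulate and pick_worker are made up for this example.

```python
from collections import Counter


def pick_worker(workers):
    """Pick the next worker using weighted request counting.

    workers maps a route name to a dict with 'lbfactor' and 'lbstatus'.
    """
    total = 0
    best = None
    for name, worker in workers.items():
        # Every worker accumulates credit in proportion to its load factor
        worker["lbstatus"] += worker["lbfactor"]
        total += worker["lbfactor"]
        if best is None or worker["lbstatus"] > workers[best]["lbstatus"]:
            best = name
    # The chosen worker pays for the request it is about to receive
    workers[best]["lbstatus"] -= total
    return best


def simulate(load_factors, requests):
    """Count how many of the given number of requests each route receives."""
    workers = {
        name: {"lbfactor": factor, "lbstatus": 0}
        for name, factor in load_factors.items()
    }
    return Counter(pick_worker(workers) for _ in range(requests))
```

With the sample configuration (loadfactor=50 for both members), simulate({"jvm1": 50, "jvm2": 50}, 100) splits the requests evenly. Changing the factors to 75 and 25 sends three quarters of the requests to jvm1.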

✓  Known Issue with Balancer Manager UI
If one of the worker instances (a Tomcat instance) is bounced, the balancer-manager UI shows the status "Err"/"Dis Err" for that worker. The status is not updated to "Ok" even after the Tomcat is restarted.

The Apache balancer module does not check whether a Tomcat is operational until it has a new request to send to that Tomcat. Apache communicates with a Tomcat only when forwarding a request, so only then does it learn whether the Tomcat is running.
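
The behaviour can be pictured with a toy model in Python. This is not how mod_proxy_balancer is implemented; it only illustrates the idea that a worker's status is refreshed by traffic and by nothing else. The class and variable names are invented for the example.

```python
class Worker:
    """Toy model of a balancer member whose status is only refreshed on use."""

    def __init__(self, route):
        self.route = route
        self.status = "Ok"

    def handle_request(self, tomcat_is_up):
        """Forward one request; its outcome is the only thing that updates status."""
        self.status = "Ok" if tomcat_is_up else "Err"
        return self.status


worker = Worker("jvm1")
worker.handle_request(tomcat_is_up=False)  # Tomcat is down: status becomes "Err"
status_after_restart = worker.status       # Tomcat restarted, but no request sent yet
worker.handle_request(tomcat_is_up=True)   # the next request refreshes the status
status_after_request = worker.status
```

In this model the status still reads "Err" right after the restart and only returns to "Ok" once a request has been forwarded, which matches what the UI shows.
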

Caching Static Contents at Apache Web Server

✓  About Caching Concept

The CacheEnable configuration allows Apache to cache the configured static content at the Apache end. This content is fetched from the Tomcat server when the first request for a resource arrives, and is then stored in the ‘cache’ directory of the Apache installation. For later requests for that resource, Apache does not send the request to Tomcat. Instead, it serves the resource itself, reducing the number of requests that Tomcat needs to serve. See the diagram below for the flow of the caching concept. Apache caching is one of the key steps in improving the performance of the back-end servers.


The ‘Cache-Control max-age’ configuration controls how long data is cached. ‘Header add Cache-Control max-age’ adds an extra header to the responses Apache receives from Tomcat.

The value is the time, in seconds, after which cached content must be verified again and, if need be, fetched from the Tomcat server and cached again. If the value is set to ‘3600’, Apache sends a request to Tomcat at most once per hour to validate a static resource. If the resource has been modified at the Tomcat end, it is fetched and re-cached before being served again by Apache. Otherwise, the earlier cached content is served.

Browsers also use this header to cache the content at the browser end. The browser does not send requests for static resources to which Apache has added this header until the max-age has expired.
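
The effect of max-age on back-end traffic can be worked through with a short Python sketch. It assumes a single cached resource, treats a copy as fresh while its age is below max-age, and assumes each revalidation restarts the clock. Other caching rules are ignored, and the function name is made up for illustration.

```python
def backend_contacts(request_times, max_age):
    """Return the request times at which Apache would have to contact Tomcat.

    request_times: ascending request timestamps in seconds.
    max_age: freshness lifetime in seconds.
    """
    contacts = []
    stored_at = None
    for now in request_times:
        if stored_at is None or now - stored_at >= max_age:
            contacts.append(now)
            stored_at = now  # first fetch or revalidation restarts the clock
    return contacts
```

For requests arriving at 0, 600, 1800, 3599, 3600, 4000 and 7300 seconds with max-age=3600, only the requests at 0, 3600 and 7300 reach Tomcat. The other four are answered from the Apache cache.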

✓  How to configure?

1.       Create a ‘cache’ directory in the Apache installation to store the cached static content.

$ mkdir <apache-install-root>/cache


2.       Add the following mod_cache entry after the VirtualHost entry above. This enables caching of static content (using mod_cache and mod_disk_cache).

<IfModule mod_cache.c>
    <IfModule mod_disk_cache.c>

        # Enable the disk cache for static content
        CacheEnable disk /webapp1/css
        CacheEnable disk /webapp1/images
        CacheEnable disk /webapp1/scripts

        CacheEnable disk /webapp2/css
        CacheEnable disk /webapp2/images
        CacheEnable disk /webapp2/scripts

        # Specify content that should not be cached
        #CacheDisable /webapp1/pages

        # Cache root, where the cached content is stored
        CacheRoot <apache-install-root>/cache

        # Directory structure of the cache
        CacheDirLevels 2
        CacheDirLength 2

        # Note: A response without a 'Last-Modified' or 'Cache-Control'
        # header would not be cached.
        # Caching can still be forced with the following directives.
        CacheIgnoreNoLastMod On
        CacheIgnoreCacheControl On

    </IfModule>
</IfModule>

3.       Restart the Apache server and send some requests. You can verify the cached content in the cache directory and observe the difference in performance.

✓  Re-caching static content if modified at the back-end server (Tomcat)

Add the following configuration at the end of the httpd.conf file, after the balancer-manager entry:

Header add Cache-Control max-age=<time-in-seconds>
·         <time-in-seconds>: the time in seconds. ‘Header add Cache-Control max-age’ adds a header to the cached responses (stored in the <cache-dir> folder). It gives the time after which cached content must be re-cached.

Note: If <time-in-seconds> is 3600, Apache sends a request to Tomcat at most once per hour to re-cache a static resource. If that request finds a modified body, the body is re-cached. Otherwise, the earlier cached content is served.

Safe Removal of a Tomcat Server in Load balancing Environment 

When an application server (Tomcat) has to be taken offline for maintenance, first set a very low load factor (valid values are 1 to 100) for that server in the balancer manager. Few new users will then be served by it, while all existing users ‘sticking’ to this real worker (a real worker represents a Tomcat instance in Apache) continue to be processed by it. This reduces the number of users who have to log in again when the real worker is taken offline. After some time, disable the real worker and take it offline. Once it is ready to come back online, enable the worker again in the balancer manager.
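
How few is ‘few’? As a back-of-the-envelope Python sketch, assuming new (non-sticky) requests are spread in proportion to the load factors, the expected share per worker is its load factor divided by the sum of all load factors. The function name and the two route names are only examples.

```python
def new_session_share(load_factors):
    """Map each route to its expected fraction of new (non-sticky) requests."""
    total = sum(load_factors.values())
    return {route: factor / total for route, factor in load_factors.items()}


# jvm1 is about to be taken offline, jvm2 keeps serving
shares = new_session_share({"jvm1": 1, "jvm2": 100})
```

Here jvm1 is expected to receive 1 out of every 101 new requests, i.e. roughly 1%, so very few new sessions start on the server that is about to go offline.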

URL Mapping using Apache Re-Write Feature 

Apache can remap requested URLs to the URLs served by the back-end servers. The URLs can be defined with patterns.

For example: if the URL http://APACHE-IP:80/webapp1 is published to the client, while Tomcat serves the request at http://TOMCAT-IP:8080/updated-webapp/, then the configuration below is needed:

RewriteEngine  on
RewriteRule    ^/webapp1$  updated-webapp/ [R]
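
The pattern ^/webapp1$ is anchored at both ends, so it matches only the exact path /webapp1. The Python sketch below mimics just that matching step with the standard re module. It does not reproduce mod_rewrite itself; the function name is hypothetical, and the returned string stands for the path of the redirect that the [R] flag would issue.

```python
import re

# Same pattern as in the RewriteRule above
PATTERN = re.compile(r"^/webapp1$")


def rewrite(path):
    """Return the redirect path for a request path, or None if the rule does not apply."""
    if PATTERN.search(path):
        return "/updated-webapp/"
    return None
```

rewrite("/webapp1") gives "/updated-webapp/", while "/webapp1/", "/webapp1/index.jsp" and "/webapp10" are left alone because the anchors rule them out.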

References: