Sunday 27 October 2013

Ehcache: Cache Replication in Clustered Environment using JGroups

If you are using Ehcache and you want to replicate your cache across all the nodes in a clustered environment, you may find some useful information in this post. There are three different ways to replicate your cache across all the nodes in a cluster:
  • JGroups Replicated Caching
  • RMI Replicated Caching
  • JMS Replicated Caching
This post covers 'JGroups Replicated Caching'. JGroups is a toolkit for reliable group communication and clustering. Its integration with Ehcache makes it easy to replicate the cache across the nodes in a cluster.

How to configure?

Cache replication configuration with JGroups is not very complicated. With a very simple configuration you can achieve cache replication in your clustered environment.


You need to configure the following files for cache replication:
  • ApplicationContext.xml (Spring's application context file)
  • Ehcache.xml (Ehcache configuration file)
  • JGroupsCache.xml (JGroups configuration file for node communication)
ApplicationContext.xml: Configure 'EhCacheManagerFactoryBean' in the application context file to initialize the cache manager.

<bean id="ehCacheManager"
    class="org.springframework.cache.ehcache.EhCacheManagerFactoryBean">
    <property name="configLocation" value="classpath:Ehcache.xml"/>
    <property name="shared" value="true" />
</bean>

Ehcache.xml: To replicate a cache in a cluster you need to configure the following elements in the 'Ehcache.xml' file:
  • cacheManagerPeerProviderFactory: This tag is used to create a CacheManagerPeerProvider, which discovers other CacheManagers in the cluster.
  • cacheEventListenerFactory: Enables registration of listeners for cache events, such as put, remove, update, and expire.
  • bootstrapCacheLoaderFactory: Specifies a BootstrapCacheLoader, which is called by a cache on initialization to prepopulate itself.
Each cache that will be distributed needs to set a cache event listener which replicates messages to the other CacheManager peers. This can be done by adding a 'cacheEventListenerFactory' element of type 'JGroupsCacheReplicatorFactory' to each distributed cache's configuration as per the following example:

<?xml version="1.0" encoding="UTF-8"?>
<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="http://ehcache.org/ehcache.xsd"
    updateCheck="false">
    
    <cacheManagerPeerProviderFactory
        class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
        properties="file=JGroupsCache.xml" />  
   
    <!-- This cache is configured to be replicated -->
    <cache name="mycache" eternal="true" maxElementsInMemory="100"
        overflowToDisk="false" diskPersistent="false" timeToIdleSeconds="0"
        timeToLiveSeconds="60" memoryStoreEvictionPolicy="LRU">

      <cacheEventListenerFactory
            class="net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory"
            properties="replicateAsynchronously=true, replicatePuts=true,
            replicateUpdates=true, replicateUpdatesViaCopy=false,
            replicateRemovals=true" />   
       
      <bootstrapCacheLoaderFactory
            class="net.sf.ehcache.distribution.jgroups.JGroupsBootstrapCacheLoaderFactory"
            properties="bootstrapAsynchronously=false" />  
    </cache>

</ehcache>

JGroupsCache.xml: In this file you configure the cluster nodes and their listening ports for cache replication.

<?xml version="1.0" encoding="UTF-8"?>
<config>
   <TCP bind_addr="host1" bind_port="7831" />
   <!-- Two nodes are in the cluster -->
   <TCPPING timeout="3000"
       initial_hosts="host1[7831],host2[7832]"
       port_range="1"
       num_initial_members="2"/>
   <VERIFY_SUSPECT timeout="1500"  />
   <pbcast.NAKACK use_mcast_xmit="false" gc_lag="100"
      retransmit_timeout="300,600,1200,2400,4800"
      discard_delivered_msgs="false"/>
   <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="400000"/>
   <pbcast.GMS print_local_addr="true" join_timeout="5000" shun="false" view_bundling="true"/>
</config>
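Note that each node must bind to its own address and port, while the TCPPING initial_hosts list stays identical on both nodes. Assuming the two hosts above, host2's copy of 'JGroupsCache.xml' would differ only in the TCP element (a sketch, adjust to your environment):

```xml
<!-- JGroupsCache.xml on host2: bind to this node's own address and port;
     all other protocol settings remain the same as on host1 -->
<TCP bind_addr="host2" bind_port="7832" />
```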

Frequently Asked Questions

1: What if I get the below log message when one node tries to send a cache notification to the others?
'Dropped message from host1-64423 (not in xmit_table)'

Solution: Set the 'discard_delivered_msgs' property to false in the JGroups configuration file.

2: How to keep the JGroups configuration file out of the web application WAR file?

Solution: In 'Ehcache.xml', you need not hard-code the 'JGroupsCache.xml' location. You can specify it with the help of a system property.

Define the JVM argument like -Djsgroup-config-location=C:\jgroups-configuration\JGroupsCache.xml
and then reference this property in the 'Ehcache.xml' file.

<cacheManagerPeerProviderFactory
        class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
        properties="file=${jsgroup-config-location}" />
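Conceptually, the ${...} placeholder in the properties string is resolved against JVM system properties before the file is loaded. A minimal standalone sketch of that kind of substitution (the PropertySubstitution class and its expand method are hypothetical, for illustration only, not Ehcache's actual implementation):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropertySubstitution {

    // Replace each ${name} token with the matching JVM system property,
    // leaving the token untouched when no such property is set.
    static String expand(String value) {
        Pattern token = Pattern.compile("\\$\\{([^}]+)\\}");
        Matcher m = token.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = System.getProperty(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Simulates passing -Djsgroup-config-location=C:\jgroups-configuration\JGroupsCache.xml
        System.setProperty("jsgroup-config-location",
                "C:\\jgroups-configuration\\JGroupsCache.xml");
        System.out.println(expand("file=${jsgroup-config-location}"));
        // prints: file=C:\jgroups-configuration\JGroupsCache.xml
    }
}
```

This keeps the environment-specific path out of the WAR: each deployment sets its own JVM argument, and the same 'Ehcache.xml' works everywhere.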

3: How to verify that the nodes are communicating with each other?

Solution: You should see the logs below on your server, which confirm that your nodes are registered as per the JGroups configuration.

Logs:
-------------------------------------------------------------------
GMS: address=IP-ADDRESS-41447, cluster=EH_CACHE, physical address=2002:19a1:70a:0:0:0:19a1:70a:58603
-------------------------------------------------------------------
[10/25/13 17:57:38:451 IST] 0000001e JGroupsCacheM I net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProvider init JGroups Replication started for 'EH_CACHE'. JChannel: local_addr=IP-ADDRESS-41447
cluster_name=EH_CACHE
my_view=[IP-ADDRESS-41447|0] [IP-ADDRESS-41447][IP-ADDRESS-41448]
connected=true
closed=false
incoming queue size=0
receive_blocks=false
receive_local_msgs=false
state_transfer_supported=true

4: How to enable Ehcache logs?

Solution: In your log4j configuration file, add the entries below to view the Ehcache-related logs.

<category name="net.sf.ehcache"  additivity="false">
     <priority value="debug" />
     <appender-ref ref="console" />
  </category>
  <category name="net.sf.ehcache.config"  additivity="false">
     <priority value="debug" />
     <appender-ref ref="console" />
  </category>
  <category name="net.sf.ehcache.distribution"  additivity="false">
     <priority value="debug" />
     <appender-ref ref="console" />
  </category>

Others
  • For more details about cache replication methods in Ehcache, you can refer to this link.
  • For detailed information about the above configuration, you can refer to this link.
  • You can also refer to a very nicely written post here.

2 comments:

  1. I have the exact setting like above but it seems to be doing a UDP connection instead of TCP, not sure if my interpretation is correct, can you pls check..
    UDP(bind_addr=/fe80:0:0:0:70f4:980e:2d13:aee8%11;oob_
    -------------------------------------------------------------------
    GMS: address=NIGSA725774-57928, cluster=EH_CACHE, physical address=fe80:0:0:0:70f4:980e:2d13:aee8%11:52066
    -------------------------------------------------------------------
    [INFO |2015-06-17 18:09:28.546|net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProvider] JGroups Replication started for 'EH_CACHE'. JChannel: local_addr=NIGSA725774-57928
    cluster_name=EH_CACHE
    my_view=[NIGSA725771-31043|3] [NIGSA725771-31043, NIGSA725774-57928]
    connected=true
    closed=false
    incoming queue size=0
    receive_blocks=false
    receive_local_msgs=false
    state_transfer_supported=true
    props=UDP(bind_addr=/fe80:0:0:0:70f4:980e:2d13:aee8%11;oob_thread_pool_keep_alive_time=5000;oob_thread_pool_enabled=true;max_bundle_size=64000;receive_on_all_interfaces=false;mcast_port=45588;thread_pool_min_threads=2;thread_pool_keep_alive_time=5000;enable_diagnostics=true;thread_pool_max_threads=8;ucast_send_buf_size=640000;ip_ttl=2;oob_thread_pool_queue_max_size=100;enable_bundling=true;thread_pool_queue_enabled=true;diagnostics_port=7500;oob_thread_pool_max_threads=8;disable_loopback=false;logical_addr_cache_max_size=20;ip_mcast=true;logical_addr_cache_expiration=120000;thread_pool_rejection_policy=discard;oob_thread_pool_min_threads=1;port_range=50;stats=true;mcast_send_buf_size=640000;id=21;mcast_recv_buf_size=25000000;diagnostics_addr=/ff0e:0:0:0:0:0:75:75;bind_port=0;tos=8;oob_thread_pool_rejection_policy=Run;loopback=true;oob_thread_pool_queue_enabled=false;enable_unicast_bundling=false;name=UDP;thread_pool_enabled=true;thread_naming_pattern=cl;ucast_recv_buf_size=20000000;discard_incompatible_packets=true;bundler_capacity=20000;max_bundle_timeout=30;mcast_group_addr=/ff0e:0:0:0:0:8:8:8;bind_interface_str=;marshaller_pool_size=0;num_timer_threads=4;log_discard_msgs=true;thread_pool_queue_max_size=10000;bundler_type=new)
    :PING(id=6;return_entire_cache=false;num_initial_members=3;break_on_coord_rsp=true;stats=true;name=PING;num_ping_requests=2;discovery_timeout=0;timeout=2000;num_initial_srv_members=0)
    :MERGE2(id=0;stats=true;merge_fast=true;name=MERGE2;inconsistent_view_threshold=1;min_interval=10000;merge_fast_delay=1000;max_interval=30000)
    :FD_SOCK(id=3;get_cache_timeout=1000;bind_addr=/fe80:0:0:0:70f4:980e:2d13:aee8%11;sock_conn_timeout=1000;bind_interface_str=;stats=true;name=FD_SOCK;suspect_msg_interval=5000;keep_alive=true;start_port=0;num_tries=3)
    :FD_ALL(id=29;interval=3000;stats=true;name=FD_ALL;msg_counts_as_heartbeat=false;timeout=5000)
    :VERIFY_SUSPECT(id=13;bind_addr=/fe80:0:0:0:70f4:980e:2d13:aee8%11;bind_interface_str=;stats=true;name=VERIFY_SUSPECT;num_msgs=1;use_icmp=false;timeout=1500)
    :BARRIER(id=0;max_close_time=60000;stats=true;name=BARRIER)
    :pbcast.NAKACK(gc_lag=0;use_mcast_xmit_req=false;use_mcast_xmit=true;max_msg_batch_size=20000;xmit_from_random_member=false;stats=true;retransmit_timeouts=300,600,1200;exponential_backoff=0;log_not_found_msgs=true;enable_xmit_time_stats=false;discard_delivered_msgs=true;print_stability_history_on_failed_xmit=false;id=15;xmit_history_max_size=50;use_stats_for_retransmission=false;max_rebroadcast_timeout=2000;name=NAKACK;log_discard_msgs=true;max_xmit_buf_size=0;use_range_based_retransmitter=true)
    :UNICAST(id=12;max_retransmit_time=60000;max_msg_batch_size=50000;loopback=false;sta

  2. Hi, thank you for your post.

    Do you have sometimes blocking messages at startup where my_view add a new address in its list (an unknown member not declared for example : my_view=[web1-29955|68] [web1-29955, b321f703-bea7-528f-828c-d4377a69de6d, web2-19991] where b321f703-bea7-528f-828c-d4377a69de6d is not a declared member) with the message :
    WARNING: web1-29955: no physical address for b321f703-bea7-528f-828c-d4377a69de6d, dropping message

    Thanks you for your reply.
    Best regards,
    J.
