4 Nov 2009 12:18
Re: Node fails to join after restart
Dima Gutzeit <dima.gutzeit <at> mailvision.com>
2009-11-04 11:18:33 GMT
Some more information:
If I wait between node stop and start, it starts ok but
the log is filled with:
[WARN] [OOB-1,null] org.jgroups.protocols.pbcast.NAKACK - WebLynx02: discarded message from non-member OtherNode01, my view is MergeView::[WebLynx02|43] [WebLynx02, OtherNode01], subgroups=[[WebLynx02|42] [WebLynx02], [OtherNode01|42] [OtherNode01]]
My group has only two nodes...
Please ... help
Regards,
Dima Gutzeit.
From: Dima Gutzeit
Sent: Wednesday, November 04, 2009 12:26 PM
Subject: Node fails to join after restart
Latest version of 2.8.
If I restart a node without waiting ~10 seconds between
shutdown and startup, I get the following (two nodes in total, one of which is
being restarted):
[WARN] [EngineConfigurator] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
[WARN] [OOB-1,null] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message
[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying
And it never ends.
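To see how the two warnings interleave, the log can be tallied per logger. The following is only a minimal sketch, assuming the `[WARN] [thread] logger - message` layout shown above; the class name `WarnTally` and its `tally` method are hypothetical helpers for illustration and are not part of JGroups.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of JGroups): counts WARN lines per logger name.
public class WarnTally {
    // Assumed line layout: [WARN] [thread-name] fully.qualified.Logger - message
    private static final Pattern LINE =
        Pattern.compile("^\\[WARN\\] \\[[^\\]]*\\] (\\S+) - .*$");

    /** Returns the number of WARN lines per logger, in order of first appearance. */
    public static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) {
                // group(1) is the logger name, e.g. org.jgroups.protocols.TCP
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three lines copied from the log above.
        List<String> sample = List.of(
            "[WARN] [EngineConfigurator] org.jgroups.protocols.TCP - null: no physical address for e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6, dropping message",
            "[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying",
            "[WARN] [EngineConfigurator] org.jgroups.protocols.pbcast.GMS - join(WebLynx02) sent to e1ec4d1f-5cf7-1149-db6a-22e432f3c3c6 timed out (after 3000 ms), retrying");
        tally(sample).forEach((logger, n) -> System.out.println(logger + ": " + n));
    }
}
```

Lines that do not match the assumed layout are skipped, so the full log can be passed in without trimming it first.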
My config is:
<config>
  <TCP
    bind_addr="xx.xx.xx.xx"
    loopback="true"
    recv_buf_size="20000000"
    send_buf_size="640000"
    discard_incompatible_packets="true"
    max_bundle_size="64000"
    max_bundle_timeout="30"
    enable_bundling="true"
    use_send_queues="false"
    sock_conn_timeout="300"
    skip_suspected_members="true"
    thread_pool.enabled="true"
    thread_pool.min_threads="1"
    thread_pool.max_threads="50"
    thread_pool.keep_alive_time="5000"
    thread_pool.queue_enabled="false"
    thread_pool.queue_max_size="100"
    thread_pool.rejection_policy="Run"
    oob_thread_pool.enabled="true"
    oob_thread_pool.min_threads="1"
    oob_thread_pool.max_threads="15"
    oob_thread_pool.keep_alive_time="5000"
    oob_thread_pool.queue_enabled="true"
    oob_thread_pool.queue_max_size="1000"
    oob_thread_pool.rejection_policy="Run"
    singleton_name="my_channels"/>
  <MPING timeout="4000" receive_on_all_interfaces="true" send_on_all_interfaces="true" mcast_addr="228.8.8.11" mcast_port="60111" ip_ttl="8" num_initial_members="2" num_ping_requests="1"/>
  <MERGE2 max_interval="10000" min_interval="5000"/>
  <FD_SOCK/>
  <FD_ALL timeout="10000" interval="5000"/>
  <VERIFY_SUSPECT timeout="1500"/>
  <pbcast.NAKACK use_mcast_xmit="false" gc_lag="50" retransmit_timeout="600,1200,2400,4800" discard_delivered_msgs="true"/>
  <UNICAST timeout="1200,2400,3600"/>
  <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="400000"/>
  <VIEW_SYNC avg_send_interval="60000"/>
  <pbcast.GMS print_local_addr="true" join_timeout="3000" view_bundling="true"/>
  <FC max_credits="2000000" min_threshold="0.10" max_block_times="500:2,1500:5,5000:50,20000:200,100000:500,1000000:1000"/>
  <FRAG2 frag_size="60000"/>
  <pbcast.STATE_TRANSFER/>
  <pbcast.FLUSH timeout="5000"/>
</config>
I thought I would no longer have join-related
issues in 2.8.
Thanks in advance.
Regards,
Dima Gutzeit.