Cluster Formation between nodes in different data centers

visagan

Cluster Formation between nodes in different data centers

This post has NOT been accepted by the mailing list yet.
Hi,

I have two nodes in data center A and two nodes in data center B. I have opened the default Ignite ports, and I can reach all nodes and ports with the "nc" command.

But when I try to bring up the Ignite cluster, the four nodes join and form a cluster, and then they leave.

I can see the total heap memory increase from 24GB (when two nodes are in the cluster) to 36GB and then to 48GB, and then the other two nodes leave with a NODE_FAILED event.
The nodes within each data center form a cluster among themselves and stay up without problems.
When I enable cross-data-center clustering, the nodes discover each other but fail after joining.
Is there any known reason for this?

{"@timestamp":"2016-05-05T17:26:50.586-04:00","@version":1,"message":"Added new node to topology: TcpDiscoveryNode [id=5f0315fe-94c8-4b87-b839-7aa9847c94c1, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.16.116], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /192.168.16.116:47500], discPort=47500, order=969, intOrder=569, lastExchangeTime=1462483600560, loc=false, ver=1.5.0#20151229-sha1:f1f8cda2, isClient=false]","logger_name":"org.apache.ignite.internal.managers.discovery.GridDiscoveryManager","thread_name":"disco-event-worker-#48%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
[17:26:50] Topology snapshot [ver=969, servers=3, clients=0, CPUs=24, heap=36.0GB]
{"@timestamp":"2016-05-05T17:26:50.586-04:00","@version":1,"message":"Topology snapshot [ver=969, servers=3, clients=0, CPUs=24, heap=36.0GB]","logger_name":"org.apache.ignite.internal.managers.discovery.GridDiscoveryManager","thread_name":"disco-event-worker-#48%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
{"@timestamp":"2016-05-05T17:26:50.597-04:00","@version":1,"message":"Added new node to topology: TcpDiscoveryNode [id=0d9d891d-809c-4b72-9ba3-6b89804e280f, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.16.115], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /192.168.16.115:47500], discPort=47500, order=970, intOrder=570, lastExchangeTime=1462483605307, loc=false, ver=1.5.0#20151229-sha1:f1f8cda2, isClient=false]","logger_name":"org.apache.ignite.internal.managers.discovery.GridDiscoveryManager","thread_name":"disco-event-worker-#48%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
[17:26:50] Topology snapshot [ver=970, servers=4, clients=0, CPUs=32, heap=48.0GB]
{"@timestamp":"2016-05-05T17:26:50.598-04:00","@version":1,"message":"Topology snapshot [ver=970, servers=4, clients=0, CPUs=32, heap=48.0GB]","logger_name":"org.apache.ignite.internal.managers.discovery.GridDiscoveryManager","thread_name":"disco-event-worker-#48%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
{"@timestamp":"2016-05-05T17:26:50.625-04:00","@version":1,"message":"Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=966, minorTopVer=0], evt=NODE_FAILED, node=0d9d891d-809c-4b72-9ba3-6b89804e280f]","logger_name":"org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager","thread_name":"exchange-worker-#81%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
{"@timestamp":"2016-05-05T17:26:50.636-04:00","@version":1,"message":"Node FAILED: TcpDiscoveryNode [id=0d9d891d-809c-4b72-9ba3-6b89804e280f, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.16.115], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /192.168.16.115:47500], discPort=47500, order=970, intOrder=570, lastExchangeTime=1462483605307, loc=false, ver=1.5.0#20151229-sha1:f1f8cda2, isClient=false]","logger_name":"org.apache.ignite.internal.managers.discovery.GridDiscoveryManager","thread_name":"disco-event-worker-#48%null%","level":"WARN","level_value":30000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
[17:26:50] Topology snapshot [ver=971, servers=3, clients=0, CPUs=24, heap=36.0GB]
{"@timestamp":"2016-05-05T17:26:50.636-04:00","@version":1,"message":"Topology snapshot [ver=971, servers=3, clients=0, CPUs=24, heap=36.0GB]","logger_name":"org.apache.ignite.internal.managers.discovery.GridDiscoveryManager","thread_name":"disco-event-worker-#48%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
{"@timestamp":"2016-05-05T17:26:50.638-04:00","@version":1,"message":"Node FAILED: TcpDiscoveryNode [id=5f0315fe-94c8-4b87-b839-7aa9847c94c1, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.16.116], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /192.168.16.116:47500], discPort=47500, order=969, intOrder=569, lastExchangeTime=1462483600560, loc=false, ver=1.5.0#20151229-sha1:f1f8cda2, isClient=false]","logger_name":"org.apache.ignite.internal.managers.discovery.GridDiscoveryManager","thread_name":"disco-event-worker-#48%null%","level":"WARN","level_value":30000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
[17:26:50] Topology snapshot [ver=972, servers=2, clients=0, CPUs=16, heap=24.0GB]
{"@timestamp":"2016-05-05T17:26:50.638-04:00","@version":1,"message":"Topology snapshot [ver=972, servers=2, clients=0, CPUs=16, heap=24.0GB]","logger_name":"org.apache.ignite.internal.managers.discovery.GridDiscoveryManager","thread_name":"disco-event-worker-#48%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
{"@timestamp":"2016-05-05T17:26:50.657-04:00","@version":1,"message":"Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=967, minorTopVer=0], evt=NODE_JOINED, node=5f0315fe-94c8-4b87-b839-7aa9847c94c1]","logger_name":"org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager","thread_name":"exchange-worker-#81%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
{"@timestamp":"2016-05-05T17:26:50.687-04:00","@version":1,"message":"Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=968, minorTopVer=0], evt=NODE_FAILED, node=5f0315fe-94c8-4b87-b839-7aa9847c94c1]","logger_name":"org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager","thread_name":"exchange-worker-#81%null%","level":"INFO","level_value":20000,"HOSTNAME":"XXX35.datacenter2.stage.es.XXXXX.com"}
vdpyatkov

Re: Cluster Formation between nodes in different data centers

Hello,

Can you check the ping time between the data centers?
If the network latency is high, you can increase the failure detection timeout (use org.apache.ignite.configuration.IgniteConfiguration#setFailureDetectionTimeout).
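
For illustration, here is a minimal sketch of how that timeout could be raised in a Spring XML configuration. The failureDetectionTimeout property corresponds to the setter named above; the value 30000 ms is only an assumed example for a slow cross-data-center link, not a recommendation from this thread.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Value is in milliseconds. 30000 is an arbitrary illustrative value;
         pick one based on the latency measured between the data centers. -->
    <property name="failureDetectionTimeout" value="30000"/>
</bean>
```

The same value would normally be set on every node so that all of them judge failures on the same schedule.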

In addition, make sure that the communication ports are reachable on all nodes (the default is org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi#DFLT_PORT).
You can see all the ports that will be used by enabling debug-level logging.
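
As a rough sketch, the ports can also be pinned explicitly in the configuration so that it is clear which ones must be open between the data centers. The discovery port 47500 is the one that appears in the logs above, and 47100 is the value of TcpCommunicationSpi#DFLT_PORT; the port range of 10 is an assumption chosen for this example.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Discovery: the port the nodes use to find and join each other -->
    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <property name="localPort" value="47500"/>
            <!-- Assumed range: limits how far past 47500 a node may bind -->
            <property name="localPortRange" value="10"/>
        </bean>
    </property>
    <!-- Communication: the port used for node-to-node messages after joining -->
    <property name="communicationSpi">
        <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
            <property name="localPort" value="47100"/>
            <!-- Assumed range, same idea as above -->
            <property name="localPortRange" value="10"/>
        </bean>
    </property>
</bean>
```

With a configuration like this, both port ranges would need to be reachable in both directions between all four nodes.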

If none of this helps, please provide the log files from all nodes so I can look at the details.

Also, I see you are not subscribed to the users mailing list. Can you please subscribe?
You can do it from this page: https://ignite.apache.org/community/resources.html

visagan

Re: Cluster Formation between nodes in different data centers

Hi, 

I tried the option you suggested, but it didn't work. The same thing happened again.

So can I use the cloud-based configuration? My machines are in OpenStack, and I think it may be a public/private address issue.
https://apacheignite.readme.io/v1.6/docs/generic-cloud-configuration

If I replace
  <property name="provider" value="google-compute-engine"/>
with
  <property name="provider" value="openstack-nova"/>
and provide the rest of the information required by Ignite, will that pick up the OpenStack configuration? Or should I write my own custom implementation for OpenStack?

--
Regards
V.Visagan
Denis Magda

Re: Cluster Formation between nodes in different data centers

Hi,

You won’t be able to use JClouds [1] for OpenStack because the JClouds ComputeService [2] is not supported for this cloud provider.
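
One alternative that does not depend on a cloud provider is a static IP finder, where each node is given the addresses of the others explicitly. The following is only a sketch: the host names are hypothetical placeholders, and each would have to be an address that is actually routable from the other data center.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <property name="ipFinder">
                <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                    <property name="addresses">
                        <list>
                            <!-- Hypothetical hosts: replace with addresses reachable across data centers -->
                            <value>dc-a-node1.example.com:47500..47509</value>
                            <value>dc-a-node2.example.com:47500..47509</value>
                            <value>dc-b-node1.example.com:47500..47509</value>
                            <value>dc-b-node2.example.com:47500..47509</value>
                        </list>
                    </property>
                </bean>
            </property>
        </bean>
    </property>
</bean>
```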

Please provide the following:
- the configuration of all the nodes you use;
- full logs from all the servers, with DEBUG level enabled for the org.apache.ignite.spi.discovery.tcp.ServerImpl class. This can be done by adding this category to {ignite}/config/ignite-log4j.xml:

<category name="org.apache.ignite.spi.discovery.tcp.ServerImpl">
<level value="DEBUG"/>
</category>
To enable Log4j in Ignite, move the ignite-log4j folder from {ignite}/libs/optional to {ignite}/libs and activate it in the configuration:

<property name="gridLogger">
<bean class="org.apache.ignite.logger.log4j2.Log4J2Logger">
<constructor-arg type="java.lang.String" value="config/ignite-log4j2.xml"/>
</bean>
</property>


Sent from the Apache Ignite Users mailing list archive at Nabble.com.