Node can not join cluster

Lucky

Node can not join cluster

Hi,
    There is a cluster with 4 nodes. When I try to add another node to this cluster, the join fails.
    Here is the trace:
[09:06:11,090][INFO][main][IgniteKernal] Config URL: file:/home/ignite23/config/default-config.xml
[09:06:11,090][INFO][main][IgniteKernal] Daemon mode: off
[09:06:11,090][INFO][main][IgniteKernal] OS: Linux 3.10.0-123.el7.x86_64 amd64
[09:06:11,090][INFO][main][IgniteKernal] OS user: root
[09:06:11,091][INFO][main][IgniteKernal] PID: 7678
[09:06:11,091][INFO][main][IgniteKernal] Language runtime: Java Platform API Specification ver. 1.8
[09:06:11,091][INFO][main][IgniteKernal] VM information: Java(TM) SE Runtime Environment 1.8.0_102-b14 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.102-b14
[09:06:11,092][INFO][main][IgniteKernal] VM total memory: 30.0GB
[09:06:11,092][INFO][main][IgniteKernal] Remote Management [restart: on, REST: on, JMX (remote: on, port: 49114, auth: off, ssl: off)]
[09:06:11,094][INFO][main][IgniteKernal] IGNITE_HOME=/home/ignite23
[09:06:11,094][INFO][main][IgniteKernal] VM arguments: [-Xms30g, -Xmx30g, -XX:+AggressiveOpts, -XX:MaxMetaspaceSize=256m, -XX:MaxDirectMemorySize=8g, -XX:+AlwaysPreTouch, -XX:+UseG1GC, -XX:+ScavengeBeforeFullGC, -XX:+DisableExplicitGC, -Djava.net.preferIPv4Stack=true, -DIGNITE_SQL_MERGE_TABLE_MAX_SIZE=30000000, -DIGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK=true, -DIGNITE_QUIET=true, -DIGNITE_SUCCESS_FILE=/home/ignite23/work/ignite_success_01ffd588-c864-4d6d-aa45-ad16338acf93, -Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.port=49114, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.ssl=false, -DIGNITE_HOME=/home/ignite23, -DIGNITE_PROG_NAME=./ignite.sh]
[09:06:11,095][INFO][main][IgniteKernal] System cache's DataRegion size is configured to 40 MB. Use DataStorageConfiguration.systemCacheMemorySize property to change the setting.
[09:06:11,101][INFO][main][IgniteKernal] Configured caches [in 'sysMemPlc' dataRegion: ['ignite-sys-cache']]
[09:06:11,104][INFO][main][IgniteKernal] 3-rd party licenses can be found at: /home/ignite23/libs/licenses
[09:06:11,156][INFO][main][IgnitePluginProcessor] Configured plugins:
[09:06:11,156][INFO][main][IgnitePluginProcessor] ^-- None
[09:06:11,156][INFO][main][IgnitePluginProcessor]
[09:06:11,193][INFO][main][TcpCommunicationSpi] Successfully bound communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
[09:06:21,237][WARNING][main][TcpCommunicationSpi] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[09:06:21,257][WARNING][main][NoopCheckpointSpi] Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
[09:06:21,284][WARNING][main][GridCollisionManager] Collision resolution is disabled (all jobs will be activated upon arrival).
[09:06:21,285][INFO][main][IgniteKernal] Security status [authentication=off, tls/ssl=off]
[09:06:21,502][INFO][main][ClientListenerProcessor] Client connector processor has started on TCP port 10800
[09:06:21,542][INFO][main][GridTcpRestProtocol] Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211]
[09:06:21,575][INFO][main][IgniteKernal] Non-loopback local IPs: 10.1.50.0, 10.1.50.1, 192.168.63.60
[09:06:21,575][INFO][main][IgniteKernal] Enabled local MACs: 024294C8D41F, 6AB9618820B2
[09:06:21,613][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, locNodeId=8ad7a225-d6ad-4407-92fa-1035f43dfb4b]
[09:06:21,678][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/192.168.63.47, rmtPort=59550]
[09:06:21,689][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/192.168.63.47, rmtPort=59550]
[09:06:21,689][INFO][tcp-disco-sock-reader-#4][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/192.168.63.47:59550, rmtPort=59550]
[09:06:26,657][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[09:06:46,753][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/192.168.63.47, rmtPort=57627]
[09:06:46,753][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/192.168.63.47, rmtPort=57627]
[09:06:46,754][INFO][tcp-disco-sock-reader-#5][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/192.168.63.47:57627, rmtPort=57627]
[09:07:11,803][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/192.168.63.47, rmtPort=60817]
[09:07:11,803][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/192.168.63.47, rmtPort=60817]
[09:07:11,804][INFO][tcp-disco-sock-reader-#6][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/192.168.63.47:60817, rmtPort=60817]
[09:07:16,790][INFO][tcp-disco-sock-reader-#4][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/192.168.63.47:59550, rmtPort=59550
[09:07:36,837][INFO][tcp-disco-sock-reader-#5][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/192.168.63.47:57627, rmtPort=57627
[09:07:36,867][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/192.168.63.47, rmtPort=45884]
[09:07:36,868][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/192.168.63.47, rmtPort=45884]
[09:07:36,868][INFO][tcp-disco-sock-reader-#7][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/192.168.63.47:45884, rmtPort=45884]



The attachment is the trace from one of the nodes already in the cluster.

Everything is set up the same as on the other nodes.
What did I miss?
Thanks.
Lucky

Attachment: new 20.txt (18K)
afedotov

Re: Node can not join cluster

CONTENTS DELETED
The author has deleted this message.
Lucky

Re:Re: Node can not join cluster

Well, I run the Ignite cluster on a physical machine with 2 TB of RAM.
All the virtual machines have the same configuration.
I have also checked the port, and it is fine.

All the Ignite cluster nodes are also working properly.

Thanks.

At 2017-12-15 23:35:49, "afedotov" <[hidden email]> wrote:
>Hi,
>
>From logs, I can see that the node was not able to connect to the cluster
>due to a timeout issue.
>It could be caused by GC or network issue. Please check the logs of other
>nodes, especially of the coordinator
>node fcc47ef7-f080-4f88-93f5-2bc221dd1fcf.
>
>Kind regards,
>Alex
>
>--
>Sent from: http://apache-ignite-users.70518.x6.nabble.com/
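
Since the quoted reply names a network issue as one possible cause, a quick check from the joining host is whether the existing nodes accept TCP connections at all. Below is a minimal sketch using only the JDK. The class name PortProbe and the 2-second timeout are illustrative assumptions; 192.168.63.47 is the remote address seen in the trace, and 47500 and 47100 are the discovery and communication ports that the joining node bound in the trace (assumed here, not confirmed, to be the same on the other nodes).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Ignite: checks whether a TCP port accepts connections.
class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or the host could not be resolved or reached.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default host is the remote address from the trace; pass another host as the first argument.
        String host = args.length > 0 ? args[0] : "192.168.63.47";
        // Assumed ports: discovery (47500) and communication (47100), as bound in the trace.
        int[] ports = {47500, 47100};
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable=" + isReachable(host, port, 2000));
        }
    }
}
```

A successful connect only shows that the port is open. It does not prove that discovery messages are exchanged within the timeout, so the logs of the remote nodes still need to be checked.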


 

Lucky

Re:Re: Node can not join cluster

In reply to this post by afedotov
The logs of all the nodes in the Ignite cluster are the same.
afedotov

Re: Re:Re: Node can not join cluster

CONTENTS DELETED
The author has deleted this message.
Lucky

Re:Re: Node can not join cluster

In reply to this post by afedotov
Hi,
    It has been a few months now.
    Are there any suggestions?
    Thanks.
Stanislav Lukyanov

RE: Re: Node can not join cluster

Hi,

As Alex said before, it’s hard to say much from the log you’ve provided beyond what’s in this message:

===============
[09:06:26,657][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
===============

Can you provide the other log files and share more details about what specifically you’re doing?
If the issue is reproducible in different environments, it would be helpful if you could share a reproducer project on GitHub.

Thanks,
Stan
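
The warning quoted above advises checking the remote nodes' logs. As a starting point, here is a minimal sketch that filters a log file for non-INFO entries and reads the reported networkTimeout value. It assumes one entry per line in the `[time][LEVEL][thread][logger] message` layout seen in this thread; the class name LogScan and the set of levels treated as problems are illustrative assumptions, not anything provided by Ignite.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.OptionalLong;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ignite: scans a log for warnings and errors.
class LogScan {
    // Matches entries such as: [09:06:26,657][WARNING][main][TcpDiscoverySpi] message
    private static final Pattern ENTRY = Pattern.compile(
        "^\\[(\\d{2}:\\d{2}:\\d{2},\\d{3})\\]\\[(\\w+)\\]\\[([^\\]]*)\\]\\[([^\\]]*)\\] ?(.*)$");

    private static final Pattern NETWORK_TIMEOUT = Pattern.compile("networkTimeout=(\\d+)");

    // Assumption: level names vary with the logging backend, so several spellings are accepted.
    private static final Set<String> PROBLEM_LEVELS =
        new HashSet<>(Arrays.asList("WARNING", "WARN", "ERROR", "SEVERE"));

    /** Returns the entries whose level is one of PROBLEM_LEVELS, in their original order. */
    static List<String> problems(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (m.matches() && PROBLEM_LEVELS.contains(m.group(2))) {
                out.add(line);
            }
        }
        return out;
    }

    /** Extracts the value after "networkTimeout=" if the entry reports one. */
    static OptionalLong networkTimeout(String line) {
        Matcher m = NETWORK_TIMEOUT.matcher(line);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java LogScan <ignite-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String entry : problems(lines)) {
            System.out.println(entry);
            networkTimeout(entry).ifPresent(t -> System.out.println("  reported networkTimeout=" + t));
        }
    }
}
```

Running this over the log of each cluster node, and of the coordinator in particular, gives a short list of entries to compare around the time of the failed join.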

 

From: [hidden email]
Sent: March 6, 2018, 10:08
To: [hidden email]
Subject: Re:Re: Node can not join cluster

 

Hi,
    It has been a few months now.
    Are there any suggestions?
    Thanks.


Lucky

Re:RE: Re: Node can not join cluster

Hi,
    I load data from the database on the 192.168.63.36 node. The other nodes don't load data. You can see this in the default-config_60.xml and default-config.xml files.
    I have attached all the files related to this.
    Thank you.

At 2018-03-06 16:53:49, "Stanislav Lukyanov" <[hidden email]> wrote:
>Hi,
>
>As Alex said before, it’s hard to say much from the log you’ve provided beyond what’s in this message:
>
>===============
>[09:06:26,657][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
>===============
>
>Can you provide the other log files and share more details about what specifically you’re doing?
>If the issue is reproducible in different environments, it would be helpful if you could share a reproducer project on GitHub.
>
>Thanks,
>Stan

Attachment: ignite_Error_log.rar (132K)
Lucky

Re:Re:RE: Re: Node can not join cluster

Well, I've solved this problem.
Thanks a lot.
Vishalan

Re: Re:Re:RE: Re: Node can not join cluster

Hi,

If I may ask, what was the solution to your problem?

Thanks and Regards,
Vishalan
Vishalan

Re: Re:Re:RE: Re: Node can not join cluster

In reply to this post by Lucky
What was the solution to the above problem? I am facing the same issue.
Denis Mekhanikov

Re: Re:Re:RE: Re: Node can not join cluster

Vishalan,

Please create a new thread and provide information about your setup and logs.
The OP doesn't seem to be getting your questions.

Denis

Mon, May 20, 2019 at 12:44, Vishalan <[hidden email]> wrote:
>What was the solution to the above problem? I am facing the same issue.