Getting error Node is out of topology (probably, due to short-time network problems)

BEELA GAYATRI

Getting error Node is out of topology (probably, due to short-time network problems)

Dear Team,

 

We are running 16 Ignite worker nodes as data grid nodes, and the application initially works fine. After a few hours or days, we get the warning "Node is out of topology (probably, due to short-time network problems)", a few nodes go down with a System Critical error, and the cache is stopped on those nodes. The Ignite logs are attached.

Please suggest what the issue could be and how to resolve it.

 


 

=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you


error.txt (301K) Download Attachment
ibelyakov

Re: Getting error Node is out of topology (probably, due to short-time network problems)

Can you also provide the logs for the few minutes before the "Node is out of
topology" message?

Igor



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
BEELA GAYATRI

RE: Getting error Node is out of topology (probably, due to short-time network problems)

Hi Igor,

 

Please find attached the complete log of the node that reported "Node is out of topology". The 16 nodes in use appear in the log as XX.XX.XXX.node1 to XX.XX.XXX.node16.

 


 


(In reply to ibelyakov's message of Monday, November 16, 2020 8:00:33 PM.)



ignite-e86691f0.0.log (762K) Download Attachment
ibelyakov

RE: Getting error Node is out of topology (probably, due to short-time network problems)

Hi,

In the provided log I see the "Blocked system-critical thread has been
detected" message, and the node was segmented because it was unable to
respond to another node. This is most probably caused by JVM pauses, possibly
related to GC.

Do you collect GC logs for the nodes?

You can find information on how to enable GC logs here:
https://ignite.apache.org/docs/latest/perf-and-troubleshooting/troubleshooting#detailed-gc-logs
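
As a lightweight cross-check alongside GC logs, a JVM can also report cumulative collection counts and times through the standard `java.lang.management` API. The sketch below is only an illustration of that idea, not something taken from the Ignite documentation; the class name `GcStatsSketch` and its methods are hypothetical, and detailed GC logs remain the better source for diagnosing long pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper that reads cumulative GC statistics from the running JVM. */
final class GcStatsSketch {

    /** Sums collection time over all collectors; an undefined value (-1) counts as zero. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    /** Builds one line per collector with its cumulative count and time. */
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long before = totalGcTimeMillis();
        System.gc(); // only a request; the JVM may ignore it
        long after = totalGcTimeMillis();
        System.out.print(describe());
        System.out.println("GC time during sample: " + (after - before) + " ms");
    }
}
```

Sampling these totals periodically and comparing them with the timestamps of the "Node is out of topology" warning may show whether GC time jumps shortly before a node is dropped.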

Igor



BEELA GAYATRI

RE: Getting error Node is out of topology (probably, due to short-time network problems)

Hi Igor,

 

As per your suggestion, we added the following JVM property and ran all 16 nodes with it:

-DIGNITE_JVM_PAUSE_DETECTOR_THRESHOLD=10000

 

Even so, one of the nodes went out of topology and its cache was stopped. Please find attached the GC log and the Ignite log for that node. Please suggest what can be done further.
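
For context, a pause-detector threshold of this kind appears to control only when a pause gets reported in the log, not whether the pause happens, so raising it would not be expected to keep a node in the topology. The sketch below illustrates the general sleep-and-measure technique behind such detectors; it is an assumption-laden illustration, not Ignite's implementation, and `PauseDetectorSketch`, `SLEEP_MS`, and the method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative sleep-and-measure pause detector; not Ignite's code. */
final class PauseDetectorSketch {

    /** How long the thread asks to sleep on each iteration (hypothetical value). */
    static final long SLEEP_MS = 50;

    /** Returns the time, in ms, by which a sleep overran its request (never negative). */
    static long pauseMillis(long requestedSleepMs, long elapsedMs) {
        return Math.max(0, elapsedMs - requestedSleepMs);
    }

    /** True when the measured overrun reaches the reporting threshold. */
    static boolean isLongPause(long pauseMs, long thresholdMs) {
        return pauseMs >= thresholdMs;
    }

    /** Sleeps repeatedly, reports overruns at or above the threshold, returns the longest overrun. */
    static long run(int iterations, long thresholdMs) throws InterruptedException {
        long longest = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(SLEEP_MS);
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            long pause = pauseMillis(SLEEP_MS, elapsedMs);
            if (isLongPause(pause, thresholdMs)) {
                System.out.println("Possible long pause: " + pause + " ms");
            }
            longest = Math.max(longest, pause);
        }
        return longest;
    }

    public static void main(String[] args) throws InterruptedException {
        // 500 ms is an arbitrary threshold chosen for this demo.
        System.out.println("Longest overrun: " + run(20, 500) + " ms");
    }
}
```

With a threshold of 10000 ms, a detector like this stays silent for a 9-second stall, which could still be long enough for other nodes to consider this one unresponsive.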

 

 


 

(In reply to Igor's message of Monday, November 23, 2020 3:03 PM.)

 



ignite-e14afbe9.0.log (1M) Download Attachment
GClog.txt (28K) Download Attachment