Ignite Cluster node stopped

qiq

This post has NOT been accepted by the mailing list yet.
Hey,
I built an Ignite cluster with 9 nodes. An error occurred on one of the nodes.

This is the error log:

SEVERE: TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node in order to prevent cluster wide instability.
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)
        at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:519)
        at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:5784)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2161)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)

Thanks in advance for your help.
vkulichenko

Re: Ignite Cluster node stopped

Hi,

Can you please properly subscribe to the mailing list so that the community can receive email notifications? Here are the instructions: http://apache-ignite-users.70518.x6.nabble.com/mailing_list/MailingListOptions.jtp?forum=1

Several things can cause this exception. Can you attach the whole log file?

The first thing I would check is memory consumption. Is it possible that some of the nodes are running out of memory or stalling in long GC pauses? Do you have enough heap memory allocated?
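
As a rough illustration of that check, here is a minimal sketch using only the standard JMX beans in the JDK. The class and method names (HeapCheck, heapUsedFraction, totalGcMillis) are hypothetical and not part of Ignite. It reports on the JVM it runs in, so to inspect an Ignite node it would have to run inside that node's JVM or read the same beans over a remote JMX connection.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapCheck {
    /** Fraction of the maximum heap currently in use, or -1 if no maximum is defined. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the maximum is undefined
        return max <= 0 ? -1.0 : (double) heap.getUsed() / max;
    }

    /** Approximate time in milliseconds spent in GC since JVM start, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %.1f%% of max%n", heapUsedFraction() * 100);
        System.out.println("total GC time (ms): " + totalGcMillis());
    }
}
```

If the used fraction stays close to 100%, or the GC time grows quickly between two readings, the heap is probably too small for the load.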

-Val
Denis Magda

Re: Ignite Cluster node stopped

In reply to this post by qiq
Hi,

Most likely the shutdown happened due to a segmentation policy.

By default, the segmentation policy shuts down a node when it is dropped from the topology. Look for the "Local node SEGMENTED:" message in the log of the failed node. As Val noted, this can happen because of long GC pauses or network delays.
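
To make that log search concrete, here is a small sketch in plain Java that prints every line containing the marker. The class name SegmentationLogScan and the log path passed on the command line are hypothetical, and the sketch assumes the log file is UTF-8 text.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SegmentationLogScan {
    /** Marker text quoted in the thread; the rest of the log line is not assumed here. */
    static final String MARKER = "Local node SEGMENTED:";

    /** Returns all lines of the given log file that contain the marker. */
    static List<String> findSegmentationLines(Path log) throws IOException {
        try (Stream<String> lines = Files.lines(log)) { // reads the file as UTF-8
            return lines.filter(line -> line.contains(MARKER))
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SegmentationLogScan <path-to-node-log>");
            return;
        }
        List<String> hits = findSegmentationLines(Paths.get(args[0]));
        if (hits.isEmpty()) {
            System.out.println("No segmentation message found.");
        } else {
            hits.forEach(System.out::println);
        }
    }
}
```

A match in the failed node's log would support the segmentation explanation; no match suggests looking for another cause.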

The InterruptedException shouldn't have been printed in this case; this will be fixed soon:
https://issues.apache.org/jira/browse/IGNITE-2688
qiq

Re: Ignite Cluster node stopped

Thanks,
The problem is solved.
I didn't have enough JVM heap.