NPE during printing failure information

Andrey Davydov

NPE during printing failure information

Hello,

 

We started testing our system on Ignite 2.8.1 and got a very strange log. As I understand it, there was an error in the system (on 2.7.6 this test works without any problem; I will investigate that later), and then an NPE was thrown while this error was being handled:

 

[01:08:11] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]]

[01:08:11] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.

2020-06-11 01:08:22,677 [tcp-disco-msg-worker-[crd]-#1225%TestNode-0%] WARN   :119 - Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-3, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]]]

org.apache.ignite.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-3, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1810) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1805) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:234) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2858) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7759) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2946) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7697) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61) [ignite-core-2.8.1.jar:2.8.1]

[01:08:22] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-3, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]]]

2020-06-11 01:08:22,685 [tcp-disco-msg-worker-[crd]-#1225%TestNode-0%] WARN   :119 - Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]]]

org.apache.ignite.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1810) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1805) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:234) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2858) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7759) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2946) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7697) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61) [ignite-core-2.8.1.jar:2.8.1]

[01:08:22] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]]]

2020-06-11 01:08:22,685 [nio-acceptor-tcp-comm-#16311%TestNode-1%] WARN   :119 - Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TestNode-1, finished=false, heartbeatTs=1591837692022]]]

org.apache.ignite.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TestNode-1, finished=false, heartbeatTs=1591837692022]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1810) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1805) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:234) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:3024) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2963) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.8.1.jar:2.8.1]

             at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]

[01:08:22] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TestNode-1, finished=false, heartbeatTs=1591837692022]]]

2020-06-11 01:08:22,685 [tcp-disco-msg-worker-[crd]-#1225%TestNode-0%] WARN   :119 - Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-2, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]]]

org.apache.ignite.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-2, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1810) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1805) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:234) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2858) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7759) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2946) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7697) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61) [ignite-core-2.8.1.jar:2.8.1]

[01:08:22] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-2, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]]]

2020-06-11 01:08:22,686 [nio-acceptor-tcp-comm-#16311%TestNode-1%] ERROR  :135 - Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.NullPointerException]]

java.lang.NullPointerException: null

             at org.apache.ignite.internal.processors.diagnostic.DiagnosticProcessor.onFailure(DiagnosticProcessor.java:109) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.processors.failure.FailureProcessor.process(FailureProcessor.java:188) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.processors.failure.FailureProcessor.process(FailureProcessor.java:146) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1808) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1805) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:234) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:3024) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2963) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.8.1.jar:2.8.1]

             at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]

2020-06-11 01:08:22,687 [nio-acceptor-tcp-comm-#16311%TestNode-1%] ERROR  :135 - Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: GridWorker [name=nio-acceptor-tcp-comm, igniteInstanceName=TestNode-1, finished=true, heartbeatTs=1591837692022]]]

org.apache.ignite.IgniteException: GridWorker [name=nio-acceptor-tcp-comm, igniteInstanceName=TestNode-1, finished=true, heartbeatTs=1591837692022]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1810) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1805) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.worker.WorkersRegistry.onStopped(WorkersRegistry.java:169) [ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:153) [ignite-core-2.8.1.jar:2.8.1]

             at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
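
For reading the log above: the heartbeatTs values look like epoch milliseconds, so they can be compared with the timestamp of the line that reports the worker as blocked. The sketch below does that arithmetic with java.time only. The class and method names are made up for illustration, and treating the log timestamps as UTC is an assumption that the thread does not confirm.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class HeartbeatAge {
    // Pattern of the timestamps in the log, e.g. "2020-06-11 01:08:22,677".
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Time between a worker's last heartbeat and the log line reporting it. */
    static Duration heartbeatAge(long heartbeatTsMillis, String logTimestamp, ZoneOffset logZone) {
        Instant heartbeat = Instant.ofEpochMilli(heartbeatTsMillis);
        // ASSUMPTION: the log timestamp is local time in logZone (UTC in the usage below).
        Instant logged = LocalDateTime.parse(logTimestamp, LOG_TS).toInstant(logZone);
        return Duration.between(heartbeat, logged);
    }

    public static void main(String[] args) {
        // Values copied from the first WARN line in the log.
        Duration age = heartbeatAge(1591837692022L, "2020-06-11 01:08:22,677", ZoneOffset.UTC);
        System.out.println(Instant.ofEpochMilli(1591837692022L)); // 2020-06-11T01:08:12.022Z
        System.out.println(age.toMillis() + " ms");               // 10655 ms
    }
}
```

Under the UTC assumption the last heartbeat was about 10.7 seconds before the first warning, roughly one second after the node start messages at 01:08:11. If the result comes out negative or hours long for your own logs, the time zone assumption is probably wrong.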

 

 

Andrey.

 

akorensh

Re: NPE during printing failure information

Hi,
This looks like a communication problem:
    GridWorker [name=grid-nio-worker-tcp-comm-3

Make sure all nodes can see each other. Test with a minimal number of nodes.
If it still doesn't work, describe your test and send the full logs and your Ignite config.
Thanks, Alex
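
One way to check that nodes can see each other is to try opening a plain TCP connection from each host to the ports the other nodes listen on. The following is a minimal sketch using only the JDK. The class name and the host list are hypothetical, and the ports 47500 (discovery) and 47100 (communication) are assumed to be the defaults in use; adjust them if your configuration sets other values.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {
    /** Returns true if a TCP connection to host:port can be opened within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unreachable hosts and connect timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical host list: pass the addresses of your nodes as arguments.
        String[] hosts = args.length > 0 ? args : new String[] {"127.0.0.1"};
        // ASSUMPTION: default discovery and communication ports.
        int[] ports = {47500, 47100};
        for (String host : hosts) {
            for (int port : ports) {
                System.out.printf("%s:%d reachable=%b%n", host, port, canConnect(host, port, 2000));
            }
        }
    }
}
```

A successful connect only shows that the port is open from this host. It does not prove that the Ignite protocols work end to end, so it complements the full logs rather than replacing them.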
   




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Andrey Davydov

RE: NPE during printing failure information

Yes, I think the initial problem was with communication, and it needs more testing on my side. But while handling this error, Ignite threw an NPE. I think an NPE during the handling of a communication exception is a bug.

 

[01:08:22] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-2, igniteInstanceName=TestNode-0, finished=false, heartbeatTs=1591837692022]]]

2020-06-11 01:08:22,686 [nio-acceptor-tcp-comm-#16311%TestNode-1%] ERROR  :135 - Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.NullPointerException]]

java.lang.NullPointerException: null

             at org.apache.ignite.internal.processors.diagnostic.DiagnosticProcessor.onFailure(DiagnosticProcessor.java:109) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.processors.failure.FailureProcessor.process(FailureProcessor.java:188) ~[ignite-core-2.8.1.jar:2.8.1]

             at org.apache.ignite.internal.processors.failure.FailureProcessor.process(FailureProcessor.java:146) ~[ignite-core-2.8.1.jar:2.8.1]
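
To illustrate why this matters: in the log, the NPE is raised while a SYSTEM_WORKER_BLOCKED failure (which the handler is configured to ignore) is being processed, and it is then reported as SYSTEM_WORKER_TERMINATION for the nio-acceptor worker. The sketch below shows the general idea of guarding an optional diagnostics step so that its own failure cannot replace the original one. It is not Ignite's code: every name in it is hypothetical, and the thread does not say what is actually null at DiagnosticProcessor.java:109.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Consumer;

final class GuardedFailureProcessing {
    // Hypothetical stand-in for the failure types seen in the log.
    enum FailureType { SYSTEM_WORKER_BLOCKED, SYSTEM_WORKER_TERMINATION }

    /** Returns true if the failure should stop the node, false if it is suppressed. */
    static boolean process(FailureType type, Set<FailureType> ignored, Consumer<FailureType> diagnostics) {
        try {
            diagnostics.accept(type); // optional extra reporting
        } catch (RuntimeException e) {
            // Report the diagnostics problem, but keep handling the original failure.
            System.err.println("Diagnostics failed for " + type + ": " + e);
        }
        return !ignored.contains(type);
    }

    public static void main(String[] args) {
        Set<FailureType> ignored = EnumSet.of(FailureType.SYSTEM_WORKER_BLOCKED);
        // A diagnostics hook that fails the way an uninitialized field would.
        Consumer<FailureType> broken = t -> { throw new NullPointerException(); };
        // The ignored failure stays suppressed even though diagnostics threw.
        System.out.println(process(FailureType.SYSTEM_WORKER_BLOCKED, ignored, broken)); // false
    }
}
```

Without the try/catch, the exception from the diagnostics hook would propagate into the calling worker thread, which matches the escalation visible in the log.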

 

 

Andrey.

 

From: [hidden email]
Sent: June 11, 2020 at 18:31
To: [hidden email]
Subject: Re: NPE during printing failure information

 

Hi,
This looks like a communication problem:
    GridWorker [name=grid-nio-worker-tcp-comm-3

Make sure all nodes can see each other. Test with a minimal number of nodes.
If it still doesn't work, describe your test and send the full logs and your Ignite config.
Thanks, Alex

 

ilya.kasnacheev

Re: NPE during printing failure information

Hello!

I suggest filing an issue in the Apache Ignite JIRA.
The ticket that added this line is IGNITE-11750.

Regards,
--
Ilya Kasnacheev


On Sat, Jun 13, 2020 at 01:37, Andrey Davydov <[hidden email]> wrote:

Yes, I think the initial problem was with communication, and it needs more testing on my side. But while handling this error, Ignite threw an NPE. I think an NPE during the handling of a communication exception is a bug.

 
