Ignite hangs during shut down - 2.7

Loredana Radulescu Ivanoff

Hello,

I am using Ignite 2.7 embedded in a Tomcat application and have run into an issue where the application does not shut down because an Ignite thread is blocked. I think Ignite should avoid hanging in this situation. What do you think? Here are the details:

1. The Ignite node gets segmented due to CPU pressure (this part is on purpose).
2. The StopNodeFailureHandler invokes the stop procedure, which acquires a lock.
3. The application detects the segmentation via custom code and also starts shutting down. As part of that, it tells Ignite to stop by calling close() on the instances returned by Ignition.allGrids().
4. The close call triggered in step 3 waits for the lock acquired in step 2 and remains blocked forever, preventing the application from shutting down.
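To make the sequence above concrete, here is a minimal sketch that models it with plain JDK primitives and no Ignite classes at all. Everything in it is an assumption made for illustration: `instanceMonitor` stands in for the object both stop calls synchronize on, `gatewayLock` stands in for the lock the first stopper waits for, the read lock held by the calling thread plays the role of whatever is keeping that lock busy, and the retry loop is a guess at how the timed `tryLock` is used. It is not Ignite's actual code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StopHangModel {
    /**
     * Models two concurrent stop calls and returns the state of the second
     * one while the first is still waiting for the write lock.
     */
    public static Thread.State observeSecondStopperState() throws InterruptedException {
        final Object instanceMonitor = new Object();                               // hypothetical stand-in
        final ReentrantReadWriteLock gatewayLock = new ReentrantReadWriteLock();   // hypothetical stand-in
        final CountDownLatch monitorHeld = new CountDownLatch(1);

        // Assumption: something still holds the read side, so the write lock is unavailable.
        gatewayLock.readLock().lock();

        Thread nodeStopper = new Thread(() -> {
            synchronized (instanceMonitor) {
                monitorHeld.countDown();
                try {
                    // Timed attempts, retried while holding the monitor.
                    while (!gatewayLock.writeLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                        // keep retrying
                    }
                    gatewayLock.writeLock().unlock();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "node-stopper");

        Thread appShutdown = new Thread(() -> {
            synchronized (instanceMonitor) {
                // Reached only after node-stopper leaves the monitor.
            }
        }, "app-shutdown");

        nodeStopper.start();
        monitorHeld.await();   // make sure the first stopper owns the monitor
        appShutdown.start();

        // Wait (bounded) until the second stopper is stuck on the monitor.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (appShutdown.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }
        Thread.State observed = appShutdown.getState();

        // Release the read lock so the model terminates; in the reported hang this never happens.
        gatewayLock.readLock().unlock();
        nodeStopper.join();
        appShutdown.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Second stopper state: " + observeSecondStopperState());
    }
}
```

The only difference from the reported situation is the final `unlock()`, which is there so the sketch can finish. Without it, both threads stay where they are indefinitely.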

Here are the stack traces from the two threads I mentioned:

First:

---------------------
"node-stopper" #272 prio=5 os_prio=0 tid=0x00007ffb1801d000 nid=0x126c waiting on condition [0x00007ffaf16d6000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000cb189368> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
    at org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.tryLock(StripedCompositeReadWriteLock.java:220)
    at org.apache.ignite.internal.processors.cache.GridCacheGateway.onStopped(GridCacheGateway.java:315)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.blockGateways(GridCacheProcessor.java:1102)
    at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2344)
    at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2228)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2612)
    - locked <0x00000000c71e9000> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2575)
    at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:379)
    at org.apache.ignite.failure.StopNodeFailureHandler$1.run(StopNodeFailureHandler.java:36)
    at java.lang.Thread.run(Thread.java:748)
---------------------

Second (omitting the application's own frames, which call into Ignite, at the bottom of the trace):

---------------------
"pool-18-thread-2" #127 prio=5 os_prio=0 tid=0x00007ffb3663e000 nid=0x1146 waiting for monitor entry [0x00007ffad94a6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2583)
    - waiting to lock <0x00000000c71e9000> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2575)
    at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:379)
    at org.apache.ignite.Ignition.stop(Ignition.java:225)
    at org.apache.ignite.internal.IgniteKernal.close(IgniteKernal.java:3568)
    application code
---------------------
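Since the second trace shows the application's own thread stuck inside the stop call, one possible application-side mitigation is to run that call on a separate daemon thread and wait for it only for a bounded time. The sketch below is generic and untested against Ignite: `BoundedStop` and `runWithTimeout` are hypothetical names, and the stop action is passed in as a plain `Runnable` (for example a lambda wrapping the application's existing close call).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStop {
    /**
     * Runs the given stop action on a daemon thread and waits at most the given time.
     *
     * @return true if the action completed in time, false if the wait timed out.
     */
    public static boolean runWithTimeout(Runnable stopAction, long timeout, TimeUnit unit)
            throws InterruptedException {
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-stop");
            t.setDaemon(true);   // a stuck stop call must not keep the JVM alive
            return t;
        });
        try {
            Future<?> result = exec.submit(stopAction);
            try {
                result.get(timeout, unit);
                return true;
            } catch (TimeoutException e) {
                // Best effort only: an interrupt does not release a thread
                // that is BLOCKED waiting to enter a monitor.
                result.cancel(true);
                return false;
            } catch (ExecutionException e) {
                throw new IllegalStateException("Stop action failed", e.getCause());
            }
        } finally {
            exec.shutdownNow();
        }
    }
}
```

This only keeps the caller from waiting forever. If the stop action is blocked on a monitor, as in the trace above, the helper thread stays blocked in the background, so it does not fix the underlying hang.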

Thank you!


ilya.kasnacheev

Re: Ignite hangs during shut down - 2.7

Hello!

however not exactly. Is it possible to create a reproducer for this behavior? It may already be fixed in master, since there were a few tickets regarding deadlocks on node stop.

Regards,
--
Ilya Kasnacheev


On Mon, Jun 10, 2019 at 21:31, Loredana Radulescu Ivanoff <[hidden email]> wrote: