Replace or Put after PutAsync causes Ignite to hang

classic Classic list List threaded Threaded
11 messages Options
Barney Pippin Barney Pippin
Reply | Threaded
Open this post in threaded view
|

Replace or Put after PutAsync causes Ignite to hang

This post was updated on .
Hi,

I'm working on a .Net Core (2+) application that uses Ignite. I've noticed some odd behaviour and have narrowed it down to a Put or a Replace call hanging after a PutAsync.

I started looking at the issue when one of our nodes was randomly logging a possible starvation in striped pool warning, hence the reduction of StripedThreadPoolSize to 1 to reproduce. However, I can't be totally sure that the hang I'm seeing in my test app is the same as with the simple configuration I'm using I'm not sure how to specify quiet mode to false.

I only see the issue being reproduced when I compile and run the test app leaving it running, then run a second instance of the same application.

The code below has no transactions specified as I stripped them out to keep the example minimal but we do need to use them in our actual application, hence the CacheAtomicityMode.Transactional setting.

I'm working with Visual Studio 2017 and Ignite 2.6. C# 7.3.

Can anyone help as to why this might be hanging? Is the test just nonsense as I've set the StripedThreadPoolSize to 1? If so why?

using System;
using System.Threading;
using System.Threading.Tasks;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache.Configuration;

namespace IgniteSimple
{
    public static class Program
    {
        public static async Task Main(string[] args)
        {
            using (var ignite = Ignition.Start(new IgniteConfiguration() { StripedThreadPoolSize = 1 }))
            {
                var cache = ignite.GetOrCreateCache<int, string>(
                    new CacheConfiguration("TestCache")
                    {
                        AtomicityMode = CacheAtomicityMode.Transactional,
                        CacheMode = CacheMode.Partitioned
                    });

                Console.WriteLine("PutAsync");
                await cache.PutAsync(1, "Test");

                Console.WriteLine("Replace");
                cache.Replace(1, "Testing"); // Hangs here

                Console.WriteLine("Wait");
                await Task.Delay(Timeout.Infinite);
            }
        }
    }
}

Thanks.

Barney Pippin Barney Pippin
Reply | Threaded
Open this post in threaded view
|

Re: Replace or Put after PutAsync causes Ignite to hang

After upgrading to Ignite 2.7 (via nuget) I get the following output
alongside the hang (I suspect this extra logging info is due to
IGNITE-6587):

PutAsync

Replace

Apr 12, 2019 8:56:09 AM org.apache.ignite.logger.java.JavaLogger error

SEVERE: Blocked system-critical thread has been detected. This can lead to
cluster-wide undefined behaviour [threadName=sys-stripe-0, blockedFor=11s]

Apr 12, 2019 8:56:09 AM java.util.logging.LogManager$RootLogger log

SEVERE: Critical system error detected. Will be handled accordingly to
configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false,
timeout=0, super=AbstractFailureHandler
[ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext
[type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker
[name=sys-stripe-0, igniteInstanceName=null, finished=false,
heartbeatTs=1555055758004]]]

class org.apache.ignite.IgniteException: GridWorker [name=sys-stripe-0,
igniteInstanceName=null, finished=false, heartbeatTs=1555055758004]
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
        at
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
        at
org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
        at
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2663)
        at
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7181)
        at
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2700)
        at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
        at
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
        at
org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Alexandr Shapkin Alexandr Shapkin
Reply | Threaded
Open this post in threaded view
|

Re: Replace or Put after PutAsync causes Ignite to hang

In reply to this post by Barney Pippin
Hi,

I took a look at the reproducer and it works just fine with different Ignite
and .net versions.

Is there just a single Ignite server with the default config?









--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Alex Shapkin
Barney Pippin Barney Pippin
Reply | Threaded
Open this post in threaded view
|

RE: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

This post was updated on .
Hi,

Thanks for looking. You need to run two instances of the reproducer. Let the first run until you can see "Wait" on the console then run the second. For me the second instance won't get past the Replace call in either 2.6 or 2.7.

It's using the default config with nothing else set up over and above the code you can see on the first post.

Thanks

-----Original Message-----
From: Alexandr Shapkin <lexwert@gmail.com> 
Sent: 15 April 2019 11:46
To: user@ignite.apache.org
Subject: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

Hi,

I took a look at the reproducer and it works just fine with different Ignite and .net versions.

Is there just a single Ignite server with the default config?









--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Barney Pippin Barney Pippin
Reply | Threaded
Open this post in threaded view
|

RE: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

Hi, is anyone else seeing this or able to reproduce?

Thanks!



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

Hello!

Yes, I can reproduce this scenario.

I attribute it to async/await, where there is a .Net callback which is invoked from cache operation and which calls cache operations, and this leads to attempts of scheduling striped pool operations to same stripe as cuirrently occupied.

Pavel, what do you think of that? Stack trace below:

"sys-stripe-0-#1" #12 prio=5 os_prio=0 tid=0x000055cecf9e2800 nid=0x3358 waiting on condition [0x00007f25b85c7000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2470)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2468)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4233)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2468)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2449)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.replace(GridCacheAdapter.java:2896)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.replace(IgniteCacheProxyImpl.java:1294)
        at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.replace(GatewayProtectedCacheProxy.java:1012)
        at org.apache.ignite.internal.processors.platform.cache.PlatformCache.processInStreamOutLong(PlatformCache.java:483)
        at org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.inStreamOutLong(PlatformTargetProxyImpl.java:67)
        at org.apache.ignite.internal.processors.platform.callback.PlatformCallbackUtils.inLongOutLong(Native Method)
        at org.apache.ignite.internal.processors.platform.callback.PlatformCallbackGateway.futureNullResult(PlatformCallbackGateway.java:643)
        at org.apache.ignite.internal.processors.platform.utils.PlatformFutureUtils$1.apply(PlatformFutureUtils.java:208)
        at org.apache.ignite.internal.processors.platform.utils.PlatformFutureUtils$1.apply(PlatformFutureUtils.java:189)
        at org.apache.ignite.internal.processors.platform.utils.PlatformFutureUtils$FutureListenable$1.apply(PlatformFutureUtils.java:382)
        at org.apache.ignite.internal.processors.platform.utils.PlatformFutureUtils$FutureListenable$1.apply(PlatformFutureUtils.java:377)
        at org.apache.ignite.internal.util.future.IgniteFutureImpl$InternalFutureListener.apply(IgniteFutureImpl.java:215)
        at org.apache.ignite.internal.util.future.IgniteFutureImpl$InternalFutureListener.apply(IgniteFutureImpl.java:179)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter$AsyncOpRetryFuture$1.apply(GridCacheAdapter.java:5022)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter$AsyncOpRetryFuture$1.apply(GridCacheAdapter.java:5017)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70)
        at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at org.apache.ignite.internal.processors.cache.GridCacheCompoundIdentityFuture.onDone(GridCacheCompoundIdentityFuture.java:56)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.onDone(GridNearTxFinishFuture.java:359)
        - locked <0x00000005cbbcbda0> (a org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.onDone(GridNearTxFinishFuture.java:75)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at org.apache.ignite.internal.util.future.GridCompoundFuture.checkComplete(GridCompoundFuture.java:285)
        at org.apache.ignite.internal.util.future.GridCompoundFuture.markInitialized(GridCompoundFuture.java:276)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.doFinish(GridNearTxFinishFuture.java:510)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:425)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$25.apply(GridNearTxLocal.java:3788)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$25.apply(GridNearTxLocal.java:3782)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at org.apache.ignite.internal.processors.cache.GridCacheCompoundFuture.onDone(GridCacheCompoundFuture.java:56)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onComplete(GridNearOptimisticTxPrepareFuture.java:307)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onDone(GridNearOptimisticTxPrepareFuture.java:285)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onDone(GridNearOptimisticTxPrepareFuture.java:79)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at org.apache.ignite.internal.util.future.GridCompoundFuture.checkComplete(GridCompoundFuture.java:285)
        at org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:144)
        at org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:45)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture$MiniFuture.onResult(GridNearOptimisticTxPrepareFuture.java:1046)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onResult(GridNearOptimisticTxPrepareFuture.java:195)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxPrepareResponse(IgniteTxHandler.java:744)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$100(IgniteTxHandler.java:119)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$2.apply(IgniteTxHandler.java:205)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$2.apply(IgniteTxHandler.java:203)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1197)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:127)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1093)
        at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:505)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
        at java.lang.Thread.run(Thread.java:748)

Regards,
--
Ilya Kasnacheev


чт, 25 апр. 2019 г. в 12:59, Barney Pippin <[hidden email]>:
Hi, is anyone else seeing this or able to reproduce?

Thanks!



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Barney Pippin Barney Pippin
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

Thanks for the response Ilya. Did you get a chance to look at this Pavel?
Thanks.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

Hello!

I have filed a ticket about this issue so it won't get lost.

Regards,
--
Ilya Kasnacheev


чт, 2 мая 2019 г. в 10:53, Barney Pippin <[hidden email]>:
Thanks for the response Ilya. Did you get a chance to look at this Pavel?
Thanks.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ptupitsyn ptupitsyn
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

Sorry guys, I've completely missed this thread, and the topic is very important.

First, a simple fix for the given example. Add the following on the first line of Main:
SynchronizationContext.SetSynchronizationContext(new ThreadPoolSynchronizationContext());

And put the ThreadPoolSynchronizationContext class somewhere:
class ThreadPoolSynchronizationContext : SynchronizationContext
{
    // No-op.
}


Now, detailed explanation. The problem exists forever in Ignite and is mentioned in the docs briefly [1].
Also mentioned in .NET docs (I've updated them a bit) [2]. 

Breakdown:
* Ignite (Java side) runs async callbacks (continuations) on system threads, and those threads have limitations (you should not call Ignite APIs from them in general)
* Ignite.NET wraps async operations into native .NET Tasks
* Usually `await ...` call in .NET will continue execution on the original Thread (simply put, actually it is more complex), so Ignite system thread issue is avoided
* However, Console applications have no `SynchronizationContext`, so the continuation can't be dispatched to original thread, and is executed on current (Ignite) thread
* Setting custom SynchronizationContext fixes the issue: all async continuations will be dispatched to .NET thread pool and never executed on Ignite threads

However, dispatching callbacks to a different thread causes performance hit, and Ignite favors performance over usability right now.
So it is up to the user to configure desired behavior.

Let me know if you need more details.

Thanks


On Thu, Aug 1, 2019 at 3:41 PM Ilya Kasnacheev <[hidden email]> wrote:
Hello!

I have filed a ticket about this issue so it won't get lost.

Regards,
--
Ilya Kasnacheev


чт, 2 мая 2019 г. в 10:53, Barney Pippin <[hidden email]>:
Thanks for the response Ilya. Did you get a chance to look at this Pavel?
Thanks.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
e.llull e.llull
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

Hi guys,

We are also facing a similar problem, if not the same. Our main difference
with the initial reproducer is that we are using the Thick Client. We
applied the suggested fix of setting the SynchronizationContext, but we also
perform a GetAsync after the initial PutAsync. Also, I added a loop around
the Replace because sometimes it takes several iterations to block.

Here is the code:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache.Configuration;
using Apache.Ignite.Core.Transactions;

namespace IgniteHangTest
{
    class Program : IDisposable
    {
        protected readonly IIgnite server;

        protected readonly IIgnite client;

        public static async Task Main(string[] args)
        {
            SynchronizationContext.SetSynchronizationContext(new
ThreadPoolSynchronizationContext());

            using (var program = new Program())
            {
                await program.Run();
            }
        }

        public Program() {
            server = Ignition.Start(IgniteConfiguration("server"));
            server.GetOrCreateCache<int, string>(new
CacheConfiguration("TestCache")
            {
                AtomicityMode = CacheAtomicityMode.Transactional,                
            });

            var clientConfiguration = IgniteConfiguration("client");
            clientConfiguration.ClientMode = true;
            client = Ignition.Start(clientConfiguration);
        }

        private async Task Run() {
            var cache = client.GetCache<int, string>("TestCache");

            Console.WriteLine("Put initial value");
            await cache.PutAsync(0, "Test");

            Console.WriteLine("Get initial value");
            string initialValue = await cache.GetAsync(0);  // if removed,
it works

            Console.WriteLine("Entering Replace loop");
            for(int i = 0; i < 100; i++)
            {
                cache.Replace(0, "Replace " + i);  // It blocks here
                Console.WriteLine("Loop: i = {0}", i);
            }

            Console.WriteLine("End");
        }

        public void Dispose() {
            Ignition.Stop("client", true);
            Ignition.Stop("server", true);
        }

        private IgniteConfiguration IgniteConfiguration(string instanceName)
        {
            return new IgniteConfiguration
            {
                IgniteInstanceName = instanceName,
                JvmOptions = new List<string> { "-DIGNITE_QUIET=false", },
                TransactionConfiguration = new TransactionConfiguration
                {
                    DefaultTimeout = TimeSpan.FromSeconds(5),
                    DefaultTransactionConcurrency =
TransactionConcurrency.Optimistic,
                    DefaultTransactionIsolation =
TransactionIsolation.Serializable
                },
            };
        }
    }

    class ThreadPoolSynchronizationContext : SynchronizationContext { }
}

In the reproducer we are starting two Ignite nodes, one as the server and
one with ClientMode = true. This is only in the reproducer, in the real use
case the server Ignite node is started in a different machine but the
problem also arises with the "external" server Ignite node.

If the line `string initialValue = await cache.GetAsync(0);` is removed the
programs finishes successfully.

In the console, the relevant logs are:

Critical system error detected. Will be handled accordingly to configured
handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=sys-stripe-6,
igniteInstanceName=client, finished=false, heartbeatTs=1565776844696]]]
class org.apache.ignite.IgniteException: GridWorker [name=sys-stripe-6,
igniteInstanceName=client, finished=false, heartbeatTs=1565776844696]
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
        at
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
        at
org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
        at
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
        at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
        at java.lang.Thread.run(Thread.java:748)

And the stack trace of the sys-stripe-6 thread is:
Thread [name="sys-stripe-6-#52%client%", id=82, state=WAITING, blockCnt=0,
waitCnt=11]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
        at
o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
        at
o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
        at
o.a.i.i.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2470)
        at
o.a.i.i.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2468)
        at
o.a.i.i.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4233)
        at
o.a.i.i.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2468)
        at
o.a.i.i.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2449)
        at
o.a.i.i.processors.cache.GridCacheAdapter.replace(GridCacheAdapter.java:2896)
        at
o.a.i.i.processors.cache.IgniteCacheProxyImpl.replace(IgniteCacheProxyImpl.java:1294)
        at
o.a.i.i.processors.cache.GatewayProtectedCacheProxy.replace(GatewayProtectedCacheProxy.java:1012)
        at
o.a.i.i.processors.platform.cache.PlatformCache.processInStreamOutLong(PlatformCache.java:483)
        at
o.a.i.i.processors.platform.PlatformTargetProxyImpl.inStreamOutLong(PlatformTargetProxyImpl.java:67)
        at
o.a.i.i.processors.platform.callback.PlatformCallbackUtils.inLongLongLongObjectOutLong(Native
Method)
        at
o.a.i.i.processors.platform.callback.PlatformCallbackGateway.futureObjectResult(PlatformCallbackGateway.java:626)
        at
o.a.i.i.processors.platform.utils.PlatformFutureUtils$1.apply(PlatformFutureUtils.java:219)
        at
o.a.i.i.processors.platform.utils.PlatformFutureUtils$1.apply(PlatformFutureUtils.java:189)
        at
o.a.i.i.processors.platform.utils.PlatformFutureUtils$FutureListenable$1.apply(PlatformFutureUtils.java:382)
        at
o.a.i.i.processors.platform.utils.PlatformFutureUtils$FutureListenable$1.apply(PlatformFutureUtils.java:377)
        at
o.a.i.i.util.future.IgniteFutureImpl$InternalFutureListener.apply(IgniteFutureImpl.java:215)
        at
o.a.i.i.util.future.IgniteFutureImpl$InternalFutureListener.apply(IgniteFutureImpl.java:179)
        at
o.a.i.i.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at
o.a.i.i.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at
o.a.i.i.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at
o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at
o.a.i.i.processors.cache.GridCacheFutureAdapter.onDone(GridCacheFutureAdapter.java:55)
        at
o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at
o.a.i.i.processors.cache.distributed.dht.GridPartitionedSingleGetFuture.onDone(GridPartitionedSingleGetFuture.java:774)
        at
o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at
o.a.i.i.processors.cache.distributed.dht.GridPartitionedSingleGetFuture.setResult(GridPartitionedSingleGetFuture.java:696)
        at
o.a.i.i.processors.cache.distributed.dht.GridPartitionedSingleGetFuture.onResult(GridPartitionedSingleGetFuture.java:551)
        at
o.a.i.i.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetResponse(GridDhtCacheAdapter.java:368)
        at
o.a.i.i.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.access$100(GridDhtColocatedCache.java:87)
        at
o.a.i.i.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$2.apply(GridDhtColocatedCache.java:132)
        at
o.a.i.i.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$2.apply(GridDhtColocatedCache.java:130)
        at
o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056)
        at
o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581)
        at
o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380)
        at
o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306)
        at
o.a.i.i.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
        at
o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295)
        at
o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569)
        at
o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1197)
        at
o.a.i.i.managers.communication.GridIoManager.access$4200(GridIoManager.java:127)
        at
o.a.i.i.managers.communication.GridIoManager$9.run(GridIoManager.java:1093)
        at
o.a.i.i.util.StripedExecutor$Stripe.body(StripedExecutor.java:505)
        at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:120)
        at java.lang.Thread.run(Thread.java:748)



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ptupitsyn ptupitsyn
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

Hi Eduard,

Yes, that is the same issue.
The workaround above is not enough, because SetSynchronizationContext affects only the current thread.
After the first async operation completes, the continuation is dispatched to some ThreadPool thread,
which may not have a SynchronizationContext set.

Updated workaround:
class ThreadPoolSynchronizationContext : SynchronizationContext
{
public override void Post(SendOrPostCallback d, object state)
{
ThreadPool.QueueUserWorkItem(s =>
{
SetSynchronizationContext(this);
d(s);
}, state);
}

Bear in mind, though - this can affect your entire application performance.
We are discussing Ignite-wide fix for this on the dev list.


On Wed, Aug 14, 2019 at 1:21 PM e.llull <[hidden email]> wrote:
Hi guys,

We are also facing a similar problem, if not the same. Our main difference
with the initial reproducer is that we are using the Thick Client. We
applied the suggested fix of setting the SynchronizationContext, but we also
perform a GetAsync after the initial PutAsync. Also, I added a loop around
the Replace because sometimes it takes several iterations to block.

Here is the code:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache.Configuration;
using Apache.Ignite.Core.Transactions;

namespace IgniteHangTest
{
    class Program : IDisposable
    {
        protected readonly IIgnite server;

        protected readonly IIgnite client;

        public static async Task Main(string[] args)
        {
            SynchronizationContext.SetSynchronizationContext(new
ThreadPoolSynchronizationContext());

            using (var program = new Program())
            {
                await program.Run();
            }
        }

        public Program() {
            server = Ignition.Start(IgniteConfiguration("server"));
            server.GetOrCreateCache<int, string>(new
CacheConfiguration("TestCache")
            {
                AtomicityMode = CacheAtomicityMode.Transactional,                 
            });

            var clientConfiguration = IgniteConfiguration("client");
            clientConfiguration.ClientMode = true;
            client = Ignition.Start(clientConfiguration);
        }

        private async Task Run() {
            var cache = client.GetCache<int, string>("TestCache");

            Console.WriteLine("Put initial value");
            await cache.PutAsync(0, "Test");

            Console.WriteLine("Get initial value");
            string initialValue = await cache.GetAsync(0);  // if removed,
it works

            Console.WriteLine("Entering Replace loop");
            for(int i = 0; i < 100; i++)
            {
                cache.Replace(0, "Replace " + i);  // It blocks here
                Console.WriteLine("Loop: i = {0}", i);
            }

            Console.WriteLine("End");
        }

        public void Dispose() {
            Ignition.Stop("client", true);
            Ignition.Stop("server", true);
        }

        private IgniteConfiguration IgniteConfiguration(string instanceName)
        {
            return new IgniteConfiguration
            {
                IgniteInstanceName = instanceName,
                JvmOptions = new List<string> { "-DIGNITE_QUIET=false", },
                TransactionConfiguration = new TransactionConfiguration
                {
                    DefaultTimeout = TimeSpan.FromSeconds(5),
                    DefaultTransactionConcurrency =
TransactionConcurrency.Optimistic,
                    DefaultTransactionIsolation =
TransactionIsolation.Serializable
                },
            };
        }
    }

    class ThreadPoolSynchronizationContext : SynchronizationContext { }
}

In the reproducer we are starting two Ignite nodes, one as the server and
one with ClientMode = true. This is only in the reproducer, in the real use
case the server Ignite node is started in a different machine but the
problem also arises with the "external" server Ignite node.

If the line `string initialValue = await cache.GetAsync(0);` is removed the
programs finishes successfully.

In the console, the relevant logs are:

Critical system error detected. Will be handled accordingly to configured
handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=sys-stripe-6,
igniteInstanceName=client, finished=false, heartbeatTs=1565776844696]]]
class org.apache.ignite.IgniteException: GridWorker [name=sys-stripe-6,
igniteInstanceName=client, finished=false, heartbeatTs=1565776844696]
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
        at
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
        at
org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
        at
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
        at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
        at java.lang.Thread.run(Thread.java:748)

And the stack trace of the sys-stripe-6 thread is:
Thread [name="sys-stripe-6-#52%client%", id=82, state=WAITING, blockCnt=0,
waitCnt=11]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
        at
o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
        at
o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
        at
o.a.i.i.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2470)
        at
o.a.i.i.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2468)
        at
o.a.i.i.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4233)
        at
o.a.i.i.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2468)
        at
o.a.i.i.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2449)
        at
o.a.i.i.processors.cache.GridCacheAdapter.replace(GridCacheAdapter.java:2896)
        at
o.a.i.i.processors.cache.IgniteCacheProxyImpl.replace(IgniteCacheProxyImpl.java:1294)
        at
o.a.i.i.processors.cache.GatewayProtectedCacheProxy.replace(GatewayProtectedCacheProxy.java:1012)
        at
o.a.i.i.processors.platform.cache.PlatformCache.processInStreamOutLong(PlatformCache.java:483)
        at
o.a.i.i.processors.platform.PlatformTargetProxyImpl.inStreamOutLong(PlatformTargetProxyImpl.java:67)
        at
o.a.i.i.processors.platform.callback.PlatformCallbackUtils.inLongLongLongObjectOutLong(Native
Method)
        at
o.a.i.i.processors.platform.callback.PlatformCallbackGateway.futureObjectResult(PlatformCallbackGateway.java:626)
        at
o.a.i.i.processors.platform.utils.PlatformFutureUtils$1.apply(PlatformFutureUtils.java:219)
        at
o.a.i.i.processors.platform.utils.PlatformFutureUtils$1.apply(PlatformFutureUtils.java:189)
        at
o.a.i.i.processors.platform.utils.PlatformFutureUtils$FutureListenable$1.apply(PlatformFutureUtils.java:382)
        at
o.a.i.i.processors.platform.utils.PlatformFutureUtils$FutureListenable$1.apply(PlatformFutureUtils.java:377)
        at
o.a.i.i.util.future.IgniteFutureImpl$InternalFutureListener.apply(IgniteFutureImpl.java:215)
        at
o.a.i.i.util.future.IgniteFutureImpl$InternalFutureListener.apply(IgniteFutureImpl.java:179)
        at
o.a.i.i.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
        at
o.a.i.i.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
        at
o.a.i.i.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
        at
o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
        at
o.a.i.i.processors.cache.GridCacheFutureAdapter.onDone(GridCacheFutureAdapter.java:55)
        at
o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
        at
o.a.i.i.processors.cache.distributed.dht.GridPartitionedSingleGetFuture.onDone(GridPartitionedSingleGetFuture.java:774)
        at
o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453)
        at
o.a.i.i.processors.cache.distributed.dht.GridPartitionedSingleGetFuture.setResult(GridPartitionedSingleGetFuture.java:696)
        at
o.a.i.i.processors.cache.distributed.dht.GridPartitionedSingleGetFuture.onResult(GridPartitionedSingleGetFuture.java:551)
        at
o.a.i.i.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetResponse(GridDhtCacheAdapter.java:368)
        at
o.a.i.i.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.access$100(GridDhtColocatedCache.java:87)
        at
o.a.i.i.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$2.apply(GridDhtColocatedCache.java:132)
        at
o.a.i.i.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$2.apply(GridDhtColocatedCache.java:130)
        at
o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056)
        at
o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581)
        at
o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380)
        at
o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306)
        at
o.a.i.i.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
        at
o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295)
        at
o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569)
        at
o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1197)
        at
o.a.i.i.managers.communication.GridIoManager.access$4200(GridIoManager.java:127)
        at
o.a.i.i.managers.communication.GridIoManager$9.run(GridIoManager.java:1093)
        at
o.a.i.i.util.StripedExecutor$Stripe.body(StripedExecutor.java:505)
        at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:120)
        at java.lang.Thread.run(Thread.java:748)



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/