We are using IA 2.8.1 with the C# client.
After triaging intermittent critical thread blockages, we determined that we have run into this problem, which was reported back in 2019.
This thread contains suggestions about using SetSynchronizationContext() in a single-threaded context, and overriding a method within it to direct all thread continuations into the .NET managed thread pool to resolve the issue.
Reading into the ticket there does not seem to be a good approach suggested to resolve this other than to use the public thread pool rather than the striped thread pool for task continuations.
None of the possible paths to solve this issue seem attractive for us:
- We use a lot of concurrency (as I suspect every Ignite system would), so the simple SynchronizationContext approach won't work.
- Overriding all callbacks via the SynchronizationContext into the managed thread pool not only imposes a performance penalty across the application, but may also direct threads that do not belong to the thread pool, and that are executing Ignite async operations, into the managed thread pool once the async operation has completed. The consequences of that would be difficult to predict.
- Using .ConfigureAwait(true) [which may not be supported in .NET Core] forces synchronization with the calling thread, which may be a .NET managed thread pool thread and so may not be available for some time.
Initial experiments suggest the SynchronizationContext approach may not work at all in our case, with the override solution appearing to make the problem worse.
Given the current issues with async Cache operations, should they be deprecated in the C# client until the underlying issues are resolved? It is hard to see how any non-trivial C# client based Ignite application can safely use them.
Title correction: Async operations in IA C# client appear dangerous
You are right, there is no efficient AND easy way to solve this.
Basically, user code has to append ContinueWith to every async Ignite API call manually:
t => t.Result,
This will move the continuation to the thread pool only for Ignite calls, where it is necessary.
An extension method can be cleaner, but this is still error-prone and verbose.
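For illustration, the extension-method idea could look roughly like the sketch below. This is a minimal sketch rather than anything shipped with Ignite: the name ContinueOnThreadPool is hypothetical, and the demo uses a TaskCompletionSource completed from a dedicated thread as a stand-in for an async Ignite call, so that it runs without the Ignite package. It uses t.GetAwaiter().GetResult() instead of t.Result so that a failure surfaces as the original exception rather than a nested AggregateException.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskContinuationExtensions
{
    // Hypothetical helper (not an Ignite API): the returned task completes on a
    // .NET thread pool thread, so code awaiting it resumes off the thread that
    // completed the original task.
    public static Task<T> ContinueOnThreadPool<T>(this Task<T> task)
    {
        return task.ContinueWith(
            t => t.GetAwaiter().GetResult(), // propagate the result or the original exception
            CancellationToken.None,
            TaskContinuationOptions.None,    // no ExecuteSynchronously: always queued to the scheduler
            TaskScheduler.Default);          // the .NET managed thread pool
    }

    // Same idea for operations that return no value.
    public static Task ContinueOnThreadPool(this Task task)
    {
        return task.ContinueWith(
            t => t.GetAwaiter().GetResult(),
            CancellationToken.None,
            TaskContinuationOptions.None,
            TaskScheduler.Default);
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Stand-in for an async Ignite call whose task is completed by a
        // dedicated (non thread pool) thread.
        var tcs = new TaskCompletionSource<int>();
        Task<int> wrapped = tcs.Task.ContinueOnThreadPool();
        new Thread(() => tcs.SetResult(42)).Start();

        int value = await wrapped;
        Console.WriteLine(value); // prints 42
    }
}
```

With a helper like this, each call site would read something like `await cache.GetAsync(key).ContinueOnThreadPool()`. It remains easy to forget on a single call, which is the error-prone part mentioned above.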
There should be a global Ignite setting to move all async continuations away from the striped pool.
I'm taking the ticket ASAP; it is a shame that we let it sit for so long.
Thanks for the quick response.
We have trialled the ContinueWith() suggestion, which has resolved the issue in testing so far. I suspect this will have only a very minor performance impact.
Just as a follow-up query: are async Compute methods susceptible to a similar issue, and should they have the same treatment using a ContinueWith()?
On Thu, Mar 11, 2021 at 8:45 PM Raymond Wilson <[hidden email]> wrote:
> Are async Compute methods susceptible to a similar issue
Short answer - yes, but to a lesser degree.
In a striped pool, threads are assigned to partitions.
If a thread for a partition gets blocked, other operations for that partition block,
potentially causing a deadlock.
Compute simply uses a fixed-size thread pool (IgniteConfiguration.PublicThreadPoolSize),
there is no striping. However, without ContinueWith, continuations will run on that public pool, possibly
causing starvation and/or slowdown of other Compute tasks.
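To make the striping point concrete, here is a toy model. It is not Ignite's implementation: the ToyStripedPool class and the "partition modulo stripe count" mapping are assumptions made purely for illustration. Each stripe is a single thread with its own queue, so a blocked stripe holds up every later operation routed to it while other stripes keep running.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Toy model only (hypothetical, not Ignite code): one thread and one queue per stripe.
public sealed class ToyStripedPool : IDisposable
{
    private readonly BlockingCollection<Action>[] _queues;

    public ToyStripedPool(int stripes)
    {
        _queues = new BlockingCollection<Action>[stripes];
        for (int i = 0; i < stripes; i++)
        {
            var queue = _queues[i] = new BlockingCollection<Action>();
            var thread = new Thread(() =>
            {
                // Each stripe thread drains only its own queue, in order.
                foreach (var work in queue.GetConsumingEnumerable()) work();
            }) { IsBackground = true };
            thread.Start();
        }
    }

    // Assumed mapping for the sketch: a partition is always served by the same stripe.
    public void Execute(int partition, Action work) =>
        _queues[partition % _queues.Length].Add(work);

    public void Dispose()
    {
        foreach (var q in _queues) q.CompleteAdding();
    }
}

public static class Program
{
    public static void Main()
    {
        using var pool = new ToyStripedPool(stripes: 2);
        var release = new ManualResetEventSlim(false);
        var blockerStarted = new ManualResetEventSlim(false);
        var samePartitionRan = new ManualResetEventSlim(false);
        var otherPartitionRan = new ManualResetEventSlim(false);

        // Occupies stripe 0, like a continuation doing blocking work on a stripe thread.
        pool.Execute(0, () => { blockerStarted.Set(); release.Wait(); });
        pool.Execute(0, samePartitionRan.Set);  // queued behind the blocker
        pool.Execute(1, otherPartitionRan.Set); // different stripe, unaffected

        blockerStarted.Wait();
        otherPartitionRan.Wait();
        Console.WriteLine($"partition 1 ran: {otherPartitionRan.IsSet}");              // True
        Console.WriteLine($"partition 0 ran while blocked: {samePartitionRan.IsSet}"); // False

        release.Set();
        samePartitionRan.Wait();
        Console.WriteLine($"partition 0 ran after release: {samePartitionRan.IsSet}"); // True
    }
}
```

If the blocking work on stripe 0 were itself waiting for the queued partition 0 operation, the stripe would never be released, which is the deadlock shape described above. A plain fixed-size pool has no such per-partition queue, so the cost there is starvation or slowdown rather than a partition-level deadlock.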
So yes, I would recommend having the same ContinueWith treatment.
I'm fixing this in  too.
On Thu, Mar 11, 2021 at 11:59 PM Raymond Wilson <[hidden email]> wrote: