Ignite 2.0.0 GridUnsafe unmonitor deadlock

dark dark

Hello, I'm using Ignite 2.0.0 and would like to ask about a deadlock.
Our usage pattern is to create a new cache for each time unit and to
destroy it after a certain period of time.

Example)

We keep a 3-minute window of data, one cache per minute, as shown below:

[00:00_Cache] [00:01_Cache] [00:02_Cache]

After one minute, we create a new cache [00:03_Cache] and destroy the
oldest cache [00:00_Cache].

[00:00_Cache] is destroyed!
[00:03_Cache] is new!

The current cache list is then:
[00:01_Cache] [00:02_Cache] [00:03_Cache]
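
The rotation described above can be sketched in plain Java. This is only an illustration of the naming scheme: the class and method names are hypothetical, and the actual Ignite calls (getOrCreateCache for the new name, and presumably a destroy call for the expired name) are left out so that the example runs without a cluster.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

public class RollingCacheNames {
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("HH:mm");

    /** Name of the cache holding data for the given minute, e.g. "00:03_Cache". */
    public static String nameFor(LocalTime minute) {
        return minute.format(FMT) + "_Cache";
    }

    /** Names of the caches that stay live once the cache for {@code newest} exists. */
    public static List<String> liveCaches(LocalTime newest, int windowMinutes) {
        List<String> names = new ArrayList<>();
        for (int i = windowMinutes - 1; i >= 0; i--) {
            names.add(nameFor(newest.minusMinutes(i)));
        }
        return names;
    }

    /** Name of the cache that falls out of the window and can be destroyed. */
    public static String expiredCache(LocalTime newest, int windowMinutes) {
        return nameFor(newest.minusMinutes(windowMinutes));
    }

    public static void main(String[] args) {
        LocalTime newest = LocalTime.of(0, 3);
        System.out.println("destroy: " + expiredCache(newest, 3));
        System.out.println("live:    " + liveCaches(newest, 3));
    }
}
```

With a 3-minute window and 00:03 as the newest minute, the expired name is 00:00_Cache and the live names are the three listed above.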

We do this so that the data for a given time period can be removed
quickly, rather than relying on cache expiry. From what we observed, this
removes a time period's data quickly without using much CPU.
We ran in this state for about 5 hours, then took 5 client nodes in the
topology down for a while and brought them back up. About ten minutes
later, a deadlock occurred with the following messages.

[19:48:51,290][WARN ][grid-timeout-worker-#15%null%][G] >>> Possible starvation in striped pool.
    Thread name: sys-stripe-3-#4%null%
    Queue: [Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicUpdateRequest [keys=[KeyCacheObjectImpl [part=179, val
    Deadlock: true
    Completed: 1054320
Thread [name="sys-stripe-3-#4%null%", id=21, state=BLOCKED, blockCnt=5364, waitCnt=1261740]
    Lock [object=o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCacheEntry@6c7a9d31, ownerName=sys-stripe-6-#7%null%, ownerId=24]
        at o.a.i.i.processors.cache.GridCacheMapEntry.markObsoleteIfEmpty(GridCacheMapEntry.java:2095)
        at o.a.i.i.processors.cache.CacheOffheapEvictionManager.touch(CacheOffheapEvictionManager.java:44)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.unlockEntries(GridDhtAtomicCache.java:2896)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1853)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1630)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3016)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:127)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:282)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:277)
        at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:863)
        at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:386)
        at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:308)
        at o.a.i.i.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:100)
        at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:253)
        at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1257)
        at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:885)
        at o.a.i.i.managers.communication.GridIoManager.access$2100(GridIoManager.java:114)
        at o.a.i.i.managers.communication.GridIoManager$7.run(GridIoManager.java:802)
        at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:483)
        at java.lang.Thread.run(Thread.java:745)

[19:48:51,423][WARN ][grid-timeout-worker-#15%null%][G] >>> Possible starvation in striped pool.
    Thread name: sys-stripe-5-#6%null%
    Queue: [Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicUpdateRequest [keys=[KeyCacheObjectImpl [part=541, val
    Deadlock: true
    Completed: 932925
Thread [name="sys-stripe-5-#6%null%", id=23, state=BLOCKED, blockCnt=5629, waitCnt=1137576]
    Lock [object=o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCacheEntry@449f1914, ownerName=sys-stripe-6-#7%null%, ownerId=24]
        at sun.misc.Unsafe.monitorEnter(Native Method)
        at o.a.i.i.util.GridUnsafe.monitorEnter(GridUnsafe.java:1193)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2815)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1741)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1630)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3016)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:127)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:282)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:277)
        at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:863)
        at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:386)
        at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:308)
        at o.a.i.i.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:100)
        at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:253)
        at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1257)
        at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:885)
        at o.a.i.i.managers.communication.GridIoManager.access$2100(GridIoManager.java:114)
        at o.a.i.i.managers.communication.GridIoManager$7.run(GridIoManager.java:802)
        at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:483)
        at java.lang.Thread.run(Thread.java:745)
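
To make the two warnings above easier to compare, here is a small sketch that pulls out which thread is blocked and which thread owns the lock it is waiting on. The class name and regular expressions are my own and only assume the "Thread [name=..." and "ownerName=..." fragments seen in this log, with each fragment on one line; this is not an Ignite tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StarvationLogOwners {
    // Blocked thread's name in a 'Thread [name="..."' line.
    private static final Pattern BLOCKED = Pattern.compile("Thread \\[name=\"([^\"]+)\"");
    // Lock owner in an 'ownerName=...' fragment.
    private static final Pattern OWNER = Pattern.compile("ownerName=([^,\\]]+)");

    /** Returns blocked-thread -> lock-owner pairs in the order they appear in the log. */
    public static Map<String, String> waitsFor(String log) {
        Map<String, String> edges = new LinkedHashMap<>();
        String blocked = null;
        for (String line : log.split("\\R")) {
            Matcher b = BLOCKED.matcher(line);
            if (b.find()) {
                blocked = b.group(1);
            }
            Matcher o = OWNER.matcher(line);
            if (o.find() && blocked != null) {
                edges.put(blocked, o.group(1));
                blocked = null; // pair each owner with the closest preceding thread
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
            "Thread [name=\"sys-stripe-3-#4%null%\", id=21, state=BLOCKED, blockCnt=5364, waitCnt=1261740]",
            "    Lock [object=o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCacheEntry@6c7a9d31, ownerName=sys-stripe-6-#7%null%, ownerId=24]",
            "Thread [name=\"sys-stripe-5-#6%null%\", id=23, state=BLOCKED, blockCnt=5629, waitCnt=1137576]",
            "    Lock [object=o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCacheEntry@449f1914, ownerName=sys-stripe-6-#7%null%, ownerId=24]");
        waitsFor(log).forEach((blocked, owner) ->
            System.out.println(blocked + " waits for " + owner));
    }
}
```

In this excerpt both blocked threads wait on entries owned by sys-stripe-6-#7%null%, whose own stack trace is not included above, so the excerpt alone does not show what that thread is waiting for.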

Deadlock jmc picture
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-1.ignite-deadlock-1>
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-2.ignite-deadlock-2>
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-3.png>

As the pictures above show, sys-stripe-5 and sys-stripe-6 appear as the
threads that own the locks. The Ignite cache configuration is shown below.

return ignite.getOrCreateCache(new CacheConfiguration<String, RollupMetric>()
            .setName(cacheName)
            .setCacheMode(CacheMode.PARTITIONED)
            .setAtomicityMode(CacheAtomicityMode.ATOMIC)
            .setRebalanceMode(CacheRebalanceMode.ASYNC)
            .setMemoryPolicyName(MEMORY_POLICY_NAME)
            .setBackups(1)
            .setStatisticsEnabled(true)
            .setManagementEnabled(true)
            .setCopyOnRead(false)
            .setQueryParallelism(20)
            .setLongQueryWarningTimeout(10000) // 10s
            .setEagerTtl(false)
            .setExpiryPolicyFactory(CreatedExpiryPolicy.factoryOf(new Duration(TimeUnit.DAYS, 365)))
            .setMaxConcurrentAsyncOperations(CacheConfiguration.DFLT_MAX_CONCURRENT_ASYNC_OPS * 10)
            .setAffinity(new CoupangAffinityFunction())
            .setIndexedTypes(String.class, RollupMetric.class));

The expiry policy is set to 1 year because, as described earlier, entries
are removed by destroying the whole cache rather than by expiry.

Ignite Memory Configuration
<property name="memoryConfiguration">
  <bean class="org.apache.ignite.configuration.MemoryConfiguration">
    <property name="memoryPolicies">
      <list>
        <bean class="org.apache.ignite.configuration.MemoryPolicyConfiguration">
          <property name="name" value="RollupMemory"/>
          <property name="pageEvictionMode" value="RANDOM_LRU"/>
          <property name="metricsEnabled" value="true"/>
          <property name="initialSize" value="21474836480"/>
          <property name="maxSize" value="21474836480"/>
        </bean>
      </list>
    </property>
    <property name="pageSize" value="4096"/>
    <property name="concurrencyLevel" value="8"/>
  </bean>
</property>
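
For readers checking the numbers, the sizes in the configuration above are in bytes. The following sketch only does the arithmetic; the class and method names are hypothetical, and the page count is a plain division that ignores any per-page or per-region overhead Ignite may add.

```java
public class MemoryPolicyMath {
    /** Converts a byte count to whole GiB (1 GiB = 1024^3 bytes). */
    public static long gibibytes(long bytes) {
        return bytes / (1024L * 1024L * 1024L);
    }

    /** Number of pages of the given size that fit into the byte count. */
    public static long pages(long bytes, int pageSize) {
        return bytes / pageSize;
    }

    public static void main(String[] args) {
        long maxSize = 21474836480L; // initialSize and maxSize from the configuration
        int pageSize = 4096;         // pageSize from the configuration
        System.out.println(gibibytes(maxSize) + " GiB");
        System.out.println(pages(maxSize, pageSize) + " pages");
    }
}
```

So the RollupMemory policy is a fixed 20 GiB region (initial and maximum size are equal), which corresponds to 5242880 pages of 4096 bytes.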

Why did the deadlock occur? Is there an option or usage pattern that
avoids it?

I suspect it is related to the client topology change. If so, how should
it be handled?

Please let me know if you have any additional questions.




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
dark dark

Re: Ignite 2.0.0 GridUnsafe unmonitor

Additionally, the clients write through cache.putAllAsync() calls.

The Ignite log shows calls to methods such as updateAllAsyncInternal0.
Could a lock problem arise when clients make these asynchronous calls at the same time? :(

dark dark

Re: Ignite 2.0.0 GridUnsafe unmonitor

A similar issue has come up again. On Stack Overflow I found a user with a similar problem: https://stackoverflow.com/questions/45028962/possible-starvation-in-striped-pool-with-deadlock-true-apache-ignite

To summarize: each putAllAsync call passes a Map of about 500 entries whose keys are effectively random values following a pattern like Timestamp_a.b.c. Do the entries have to be sorted by key before they are sent?
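
Why key order could matter can be illustrated without Ignite. The sketch below is plain Java and not Ignite's implementation: it assumes one lock per key, and every batch acquires its locks in sorted key order, so two threads submitting the same keys in opposite orders cannot end up waiting on each other. The class name and the sample keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class OrderedBatchLocking {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final AtomicInteger applied = new AtomicInteger();

    /** Locks every key of the batch in sorted order, "applies" the batch, then unlocks. */
    public void applyBatch(List<String> keys) {
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String key : new TreeSet<>(keys)) { // sorted and de-duplicated
                ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            applied.incrementAndGet(); // stand-in for the real update
        } finally {
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    /** Runs two threads that submit the same keys in opposite orders; returns batches applied. */
    public static int run(int rounds) {
        OrderedBatchLocking store = new OrderedBatchLocking();
        List<String> forward = Arrays.asList("1508587200_a.b.c", "1508587200_d.e.f");
        List<String> backward = Arrays.asList("1508587200_d.e.f", "1508587200_a.b.c");
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) store.applyBatch(forward); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) store.applyBatch(backward); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return store.applied.get();
    }

    public static void main(String[] args) {
        System.out.println("batches applied: " + run(10_000));
    }
}
```

If the locks were instead taken in the order each caller supplied, the forward and backward batches could each hold one lock while waiting for the other. Whether this is what happens inside lockEntries in Ignite 2.0.0 is exactly the open question here.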

dark dark

Re: Ignite 2.0.0 GridUnsafe unmonitor

It looks like I am answering my own question. lol

Maybe this is the issue.

I am calling putAllAsync with a HashMap. I hope that replacing it with a TreeMap will fix the problem.
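
A minimal sketch of that change, assuming String keys as in the cache configuration: copy the batch into a TreeMap so that every batch iterates its keys in the same natural order, and pass the result to putAllAsync in place of the HashMap. The helper name and sample keys are hypothetical, and the Ignite call itself is omitted so the example runs standalone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SortedBatch {
    /** Copies a batch into a TreeMap so its keys iterate in natural (sorted) order. */
    public static <V> TreeMap<String, V> sorted(Map<String, V> batch) {
        return new TreeMap<>(batch);
    }

    public static void main(String[] args) {
        Map<String, Integer> batch = new HashMap<>();
        batch.put("1508587260_d.e.f", 2);
        batch.put("1508587200_a.b.c", 1);
        batch.put("1508587200_x.y.z", 3);

        // HashMap iteration order is unspecified; the TreeMap copy is always sorted.
        System.out.println(new ArrayList<>(sorted(batch).keySet()));
    }
}
```

Building the batch in a TreeMap from the start would avoid the extra copy. Either way, this only guarantees a consistent key order on the client side; it has not yet been confirmed in this thread that it resolves the deadlock.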


I will try it and post to this thread again if the problem recurs.

Thanks a lot.

2017-10-21 22:05 GMT+09:00 김성진 <[hidden email]>:
A similar issue has re-emerged. When I looked at Stackoverflow, there was a user similar to me. https://stackoverflow.com/questions/45028962/possible-starvation-in-striped-pool-with-deadlock-true-apache-ignite

To summarize, I am sending a random value of a pattern like Timestamp_a.b.c to the key of Map at putAllAsync, about 500 times at a time. Do you have to send this part after sorting with key value?

2017-10-21 21:57 GMT+09:00 김성진 <[hidden email]>:
Additionally, Client use the cache.putAllAsync () call.

If you look at the Ignite log, you see a method call like updateAllAsyncInternal0.
At the same time, does the client have a lock issue when it asynchronously calls after sending a cache entry? :(

2017-10-21 21:06 GMT+09:00 dark <[hidden email]>:
Hello, I'm using Ignite 2.0.0, and I would like to ask if you have any doubts
about the deadlock.
The first use pattern is to create a new cache time unit, and after a
certain period of time, it will perform Destroy.

Example)

We create a cache that keeps the data of the 3-minute cycle as shown below

[00:00_Cache] [00:01_Cache] [00:02_Cache]

After one minute, create a new cache [00: 03_Cache] and clear old cache [00:
00_Cache].

[00:00_Cache] is destroy!
[00:03_Cache] is new!

below current cache list
[00:01_Cache] [00:02_Cache] [00:03_Cache]

The reason for using this is to remove the data of a certain time period
quickly rather than the expiry of Cache. As a result of eye observation, it
was possible to quickly remove data in the time zone without using a lot of
CPU.
In this state, I kept it for about 5 hours, and then I took down 5 Client
nodes that existed in Topology for a while and then uploaded them again.
Then, about ten minutes later, a deadlock occurred with the following
message.

[19:48:51,290][WARN ][grid-timeout-worker-#15%null%][G] >>> Possible starvation in striped pool.
    Thread name: sys-stripe-3-#4%null%
    Queue: [Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicUpdateRequest [keys=[KeyCacheObjectImpl [part=179, val
    Deadlock: true
    Completed: 1054320
Thread [name="sys-stripe-3-#4%null%", id=21, state=BLOCKED, blockCnt=5364, waitCnt=1261740]
    Lock [object=o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCacheEntry@6c7a9d31, ownerName=sys-stripe-6-#7%null%, ownerId=24]
        at o.a.i.i.processors.cache.GridCacheMapEntry.markObsoleteIfEmpty(GridCacheMapEntry.java:2095)
        at o.a.i.i.processors.cache.CacheOffheapEvictionManager.touch(CacheOffheapEvictionManager.java:44)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.unlockEntries(GridDhtAtomicCache.java:2896)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1853)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1630)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3016)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:127)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:282)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:277)
        at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:863)
        at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:386)
        at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:308)
        at o.a.i.i.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:100)
        at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:253)
        at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1257)
        at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:885)
        at o.a.i.i.managers.communication.GridIoManager.access$2100(GridIoManager.java:114)
        at o.a.i.i.managers.communication.GridIoManager$7.run(GridIoManager.java:802)
        at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:483)
        at java.lang.Thread.run(Thread.java:745)

[19:48:51,423][WARN ][grid-timeout-worker-#15%null%][G] >>> Possible starvation in striped pool.
    Thread name: sys-stripe-5-#6%null%
    Queue: [Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicUpdateRequest [keys=[KeyCacheObjectImpl [part=541, val
    Deadlock: true
    Completed: 932925
Thread [name="sys-stripe-5-#6%null%", id=23, state=BLOCKED, blockCnt=5629, waitCnt=1137576]
    Lock [object=o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCacheEntry@449f1914, ownerName=sys-stripe-6-#7%null%, ownerId=24]
        at sun.misc.Unsafe.monitorEnter(Native Method)
        at o.a.i.i.util.GridUnsafe.monitorEnter(GridUnsafe.java:1193)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2815)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1741)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1630)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3016)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:127)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:282)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:277)
        at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:863)
        at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:386)
        at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:308)
        at o.a.i.i.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:100)
        at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:253)
        at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1257)
        at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:885)
        at o.a.i.i.managers.communication.GridIoManager.access$2100(GridIoManager.java:114)
        at o.a.i.i.managers.communication.GridIoManager$7.run(GridIoManager.java:802)
        at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:483)
        at java.lang.Thread.run(Thread.java:745)

Deadlock jmc picture
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-1.ignite-deadlock-1>
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-2.ignite-deadlock-2>
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-3.png>

As the pictures above show, sys-stripe-5 and sys-stripe-6 appear as the
lock owners. The Ignite cache configuration is shown below.

return ignite.getOrCreateCache(new CacheConfiguration<String, RollupMetric>()
            .setName(cacheName)
            .setCacheMode(CacheMode.PARTITIONED)
            .setAtomicityMode(CacheAtomicityMode.ATOMIC)
            .setRebalanceMode(CacheRebalanceMode.ASYNC)
            .setMemoryPolicyName(MEMORY_POLICY_NAME)
            .setBackups(1)
            .setStatisticsEnabled(true)
            .setManagementEnabled(true)
            .setCopyOnRead(false)
            .setQueryParallelism(20)
            .setLongQueryWarningTimeout(10000) // 10s
            .setEagerTtl(false)
            .setExpiryPolicyFactory(CreatedExpiryPolicy.factoryOf(new Duration(TimeUnit.DAYS, 365)))
            .setMaxConcurrentAsyncOperations(CacheConfiguration.DFLT_MAX_CONCURRENT_ASYNC_OPS * 10)
            .setAffinity(new CoupangAffinityFunction())
            .setIndexedTypes(String.class, RollupMetric.class));

The expiry policy above is set to 1 year because entries are removed by
destroying the whole cache, as described earlier, rather than by expiry.
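
The rotation described in this post (one cache per minute, keep the newest three, destroy the oldest) could be sketched roughly as follows. This is only a minimal illustration in plain Java: the class RollingCacheWindow and its methods are hypothetical names, the cache names follow the "00:03_Cache" example from the post, and the sketch only tracks names. The comments mark where ignite.getOrCreateCache(...) and ignite.destroyCache(...) would be called in a real node.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper that tracks which per-minute caches are currently live. */
final class RollingCacheWindow {
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("HH:mm");

    private final int windowSize;
    private final Deque<String> live = new ArrayDeque<>();

    RollingCacheWindow(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Cache name for a given minute, following the "00:03_Cache" pattern. */
    static String cacheName(LocalTime minute) {
        return FMT.format(minute) + "_Cache";
    }

    /**
     * Registers the cache for a new minute and returns the names that fell
     * out of the window. A real implementation would call
     * ignite.getOrCreateCache(cfg) for the new name and
     * ignite.destroyCache(name) for every returned name.
     */
    List<String> roll(LocalTime minute) {
        live.addLast(cacheName(minute));
        List<String> expired = new ArrayList<>();
        while (live.size() > windowSize) {
            expired.add(live.removeFirst());
        }
        return expired;
    }

    /** Snapshot of the live cache names, oldest first. */
    List<String> liveCaches() {
        return new ArrayList<>(live);
    }

    public static void main(String[] args) {
        RollingCacheWindow window = new RollingCacheWindow(3);
        for (int m = 0; m < 4; m++) {
            List<String> expired = window.roll(LocalTime.of(0, m));
            System.out.println("live=" + window.liveCaches() + " destroy=" + expired);
        }
    }
}
```

With a window of three, the fourth call to roll reports 00:00_Cache as the one to destroy, which matches the example at the top of the post.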

Ignite Memory Configuration
<property name="memoryConfiguration">
  <bean class="org.apache.ignite.configuration.MemoryConfiguration">
    <property name="memoryPolicies">
      <list>
        <bean class="org.apache.ignite.configuration.MemoryPolicyConfiguration">
          <property name="name" value="RollupMemory"/>
          <property name="pageEvictionMode" value="RANDOM_LRU"/>
          <property name="metricsEnabled" value="true"/>
          <property name="initialSize" value="21474836480"/>
          <property name="maxSize" value="21474836480"/>
        </bean>
      </list>
    </property>
    <property name="pageSize" value="4096"/>
    <property name="concurrencyLevel" value="8"/>
  </bean>
</property>

What caused the deadlock? Is there an option or usage pattern that avoids
it?

I suspect it is related to the client topology changes. If so, how should
this be handled?

Please let me know if you need any additional information.




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/



Denis Magda

Re: Ignite 2.0.0 GridUnsafe unmonitor

Hi,

Yes, you’ve already worked out how to fix the deadlock: pass the keys to bulk operations such as putAll or removeAll in sorted order, for example in a TreeMap. Keys in a HashMap are unordered, which can lead to a deadlock when multiple bulk updates run in parallel.
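
To illustrate why key order matters, here is a toy model in plain Java. It is not Ignite code: LockOrderDemo, bulkUpdate and runPair are hypothetical names, and per-key ReentrantLocks stand in for the entry monitors taken by lockEntries in the stack traces. Each "bulk update" locks its keys in iteration order and holds them until it is done. A 2-second tryLock timeout is used so the demo terminates instead of hanging like a real monitor would.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model: a bulk update locks every key in iteration order before updating. */
final class LockOrderDemo {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Locks all keys in the given order; returns false if a lock is not free within 2 s. */
    private boolean bulkUpdate(List<String> keys) throws InterruptedException {
        int held = 0;
        try {
            for (String key : keys) {
                ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
                if (!lock.tryLock(2, TimeUnit.SECONDS)) {
                    return false; // with plain monitors this thread would block forever
                }
                held++;
                Thread.sleep(100); // widen the race window so the effect is visible
            }
            return true;
        } finally {
            // Release the locks this thread actually acquired.
            for (int i = 0; i < held; i++) {
                locks.get(keys.get(i)).unlock();
            }
        }
    }

    /** Runs two bulk updates in parallel; returns true only if both finished. */
    static boolean runPair(List<String> first, List<String> second) {
        LockOrderDemo demo = new LockOrderDemo();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Boolean> taskA = () -> demo.bulkUpdate(first);
            Callable<Boolean> taskB = () -> demo.bulkUpdate(second);
            Future<Boolean> a = pool.submit(taskA);
            Future<Boolean> b = pool.submit(taskB);
            boolean okA = a.get();
            boolean okB = b.get();
            return okA && okB;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList("a", "b");
        List<String> reversed = Arrays.asList("b", "a");
        System.out.println("same order finished:     " + runPair(sorted, sorted));
        System.out.println("opposite order finished: " + runPair(sorted, reversed));
    }
}
```

When both updates use the same key order, the second one simply waits for the first. With opposite orders, each thread typically ends up holding the lock the other one needs, which is the situation the sorted keys are meant to rule out.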

Furthermore, you might consider the ‘eagerTtl’ parameter together with an expiry policy instead of the custom code. That parameter instructs Ignite to remove expired entries proactively.

Lastly, upgrade to version 2.2; it’s much more stable than 2.0.

Denis

On Saturday, October 21, 2017, 김성진 <[hidden email]> wrote:
I think I'm answering my own question here. lol

Maybe this is the issue.

I am calling putAllAsync with a HashMap. I hope that replacing it with a TreeMap will fix the problem.


I will try it and post to this thread again if the problem recurs.

Thanks a lot.

2017-10-21 22:05 GMT+09:00 김성진 <[hidden email]>:
A similar issue has come up again. On Stack Overflow I found a user with a similar problem: https://stackoverflow.com/questions/45028962/possible-starvation-in-striped-pool-with-deadlock-true-apache-ignite

To summarize, I call putAllAsync with a Map whose keys are random values following a pattern like Timestamp_a.b.c, about 500 entries at a time. Do the entries have to be sorted by key before they are sent?

2017-10-21 21:57 GMT+09:00 김성진 <[hidden email]>:
Additionally, the clients use the cache.putAllAsync() call.

The Ignite log shows a method call such as updateAllAsyncInternal0.
Could there be a locking issue when a client sends cache entries through these asynchronous calls? :(




dark

Re: Ignite 2.0.0 GridUnsafe unmonitor

Thank you for your kind comments.

EagerTtl is usually useful. However, it did not fit our situation.

We put 40k entries/s into the Ignite server nodes. With the backup option
enabled to prevent data loss, a large number of cache entries expire at the
same time. (We expire a cache entry after 5 minutes without an update.) In
that situation, eager TTL's 500 ms interval and limit of 1000 expirations
were not enough to keep up with expiry, and the ttl-cleanup-worker kept the
CPU busy. So we chose to keep one cache per time period and destroy the
whole cache, which eliminates the expiration cost.
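
A rough back-of-the-envelope check of these numbers, as a sketch only. It assumes that "500ms interval and 1000 expiry" means at most 1000 entries are removed per 500 ms pass, which is a reading of this post and not a documented Ignite figure, and it looks at the whole stream rather than per node, since the node count is not given. ExpiryBacklog and its methods are hypothetical names.

```java
/** Back-of-the-envelope model of expiry throughput versus ingest rate. */
final class ExpiryBacklog {
    /** Entries the cleanup can remove per second for a given batch size and interval. */
    static long cleanupPerSecond(long batchSize, long intervalMillis) {
        return batchSize * 1000 / intervalMillis;
    }

    /** How fast the number of expired-but-not-removed entries grows, per second. */
    static long backlogGrowthPerSecond(long expiringPerSecond, long cleanupPerSecond) {
        return Math.max(0, expiringPerSecond - cleanupPerSecond);
    }

    public static void main(String[] args) {
        long ingest = 40_000;                        // entries/s, from the post
        long cleanup = cleanupPerSecond(1000, 500);  // assumed: 1000 entries per 500 ms pass
        // Once the 5-minute TTL has elapsed, entries expire as fast as they were written.
        System.out.println("cleanup/s = " + cleanup);
        System.out.println("backlog growth/s = " + backlogGrowthPerSecond(ingest, cleanup));
    }
}
```

Under that assumption the cleanup removes about 2,000 entries per second while 40,000 expire per second, so the backlog would grow by roughly 38,000 entries every second, which is consistent with the worker never catching up.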

<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/before_cpu.png>
< Before : EagerTtl is true >

<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/after_cpu.png>
< After : EagerTtl is false & cache destroy strategy >

Performance has definitely improved. However, my worry is whether off-heap
memory will be cleaned up properly and whether the cache metadata kept on
the Ignite Java heap will cause memory leaks.

Do you have any other concerns?

Thanks again for your reply. :)





dark

Re: Ignite 2.0.0 GridUnsafe unmonitor

Many people seem likely to send cache entries in bulk via a HashMap.
What I would like is for putAll to check the type of the incoming entries and log a warning that a deadlock may occur if the map is not a TreeMap.

My issue was resolved by changing the data structure from HashMap to TreeMap:

putAll(hashMap);  (deadlock!)
putAll(treeMap);  (works well)
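
For readers who want to apply the same change, a minimal sketch of the conversion in plain Java. SortedBatch and sortedCopy are hypothetical helper names, and the keys only imitate the Timestamp_a.b.c pattern mentioned earlier in the thread. The sorted copy is what would then be handed to cache.putAll(...) or cache.putAllAsync(...).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that turns an arbitrary batch into a key-sorted one. */
final class SortedBatch {
    /**
     * Copies the batch into a TreeMap. The TreeMap(Map) constructor always
     * uses the natural ordering of the keys, so every caller ends up with
     * the same key order no matter how the input map iterates.
     */
    static <K extends Comparable<? super K>, V> TreeMap<K, V> sortedCopy(Map<K, V> batch) {
        return new TreeMap<>(batch);
    }

    public static void main(String[] args) {
        Map<String, Integer> batch = new HashMap<>();
        batch.put("1508587731_c.d.e", 3);
        batch.put("1508587731_a.b.c", 1);
        batch.put("1508587730_x.y.z", 2);

        TreeMap<String, Integer> sorted = sortedCopy(batch);
        List<String> keys = new ArrayList<>(sorted.keySet());
        System.out.println(keys); // keys in ascending String order

        // In the application this is where the sorted batch would be sent,
        // e.g. cache.putAll(sorted) or cache.putAllAsync(sorted).
    }
}
```

The copy costs O(n log n) per batch, which for batches of about 500 entries should be negligible next to the network round trip.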

Thanks a lot :)



dmagda

Re: Ignite 2.0.0 GridUnsafe unmonitor

> Performance has definitely improved. However, my worry is whether off-heap
> memory will be cleaned up properly and whether the cache metadata kept on
> the Ignite Java heap will cause memory leaks.
>
> Do you have any other concerns?

There is nothing I’m concerned about here.


Denis


dmagda

Re: Ignite 2.0.0 GridUnsafe unmonitor

+ dev list

Igniters, that’s a relevant point below. Newcomers to Ignite tend to stumble on deadlocks simply because the keys are passed in an unordered HashMap. I propose the following:
- update the Javadoc of the bulk operations.
- print a warning if a HashMap is used and it contains more than one element.

Thoughts?


Denis

> On Oct 21, 2017, at 6:16 PM, dark <[hidden email]> wrote:
>
> Many people seem likely to send cache entries in bulk via a HashMap.
> How about logging a warning from inside putAll by checking whether the map
> is a TreeMap?

dsetrakyan

Re: Ignite 2.0.0 GridUnsafe unmonitor

Denis, 

We should definitely print out a thorough warning if a HashMap is passed into a bulk method (instead of a SortedMap). However, we should make sure that we print that warning only once and not every time the API is called.

Can you please file a ticket for 2.4?

D.



Dmitry Pavlov

Re: Ignite 2.0.0 GridUnsafe unmonitor

I agree with Denis. If we don't have such a warning, we will have to warn users continuously in wiki pages, blogs and presentations. It is simpler to warn from code.

What do you think about issuing the warning only if size > 1? A HashMap with 1 item will not cause a deadlock. Moreover, there can be a custom singleton Map provided by the user.
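
The check discussed so far in this thread (not a SortedMap, more than one element, warn only once) could look roughly like the sketch below. This is purely illustrative and is not Ignite's implementation: UnorderedBulkWarning and shouldWarn are hypothetical names, and a real version would call the node's logger where the caller of shouldWarn sits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard that decides whether a bulk call should log the warning. */
final class UnorderedBulkWarning {
    private final AtomicBoolean warned = new AtomicBoolean();

    /**
     * Returns true exactly once: on the first call that receives a map which
     * is not a SortedMap and has more than one entry.
     */
    boolean shouldWarn(Map<?, ?> batch) {
        if (batch instanceof SortedMap || batch.size() <= 1) {
            return false; // sorted, empty, or single-entry maps are not flagged
        }
        // compareAndSet makes the "only once" rule safe under concurrent calls.
        return warned.compareAndSet(false, true);
    }

    public static void main(String[] args) {
        UnorderedBulkWarning guard = new UnorderedBulkWarning();

        Map<String, Integer> hash = new HashMap<>();
        hash.put("a", 1);
        hash.put("b", 2);
        Map<String, Integer> tree = new TreeMap<>(hash);

        System.out.println(guard.shouldWarn(tree)); // sorted map: no warning
        System.out.println(guard.shouldWarn(hash)); // first unordered batch: warn
        System.out.println(guard.shouldWarn(hash)); // already warned: stay quiet
    }
}
```

Note that an instanceof check of this kind says nothing about whether two callers use the same ordering, so it can only be a hint.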

Sincerely,
Dmitriy Pavlov

dmagda

Re: Ignite 2.0.0 GridUnsafe unmonitor

Here is a ticket for the improvement:
https://issues.apache.org/jira/browse/IGNITE-6804

Denis


Vladimir Ozerov

Re: Ignite 2.0.0 GridUnsafe unmonitor

Guys,

Printing a warning in this case is a really strange idea. First, how would you explain it in the case of OPTIMISTIC/SERIALIZABLE transactions, where deadlocks are impossible? Second, what would you do if two sorted maps are passed one after another in a transaction? The user may still get a deadlock. Last, we are moving towards the SQL world, where "maps" simply do not exist and virtually any update could easily lead to a deadlock.
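
The second point can be made concrete with a small model in plain Java. It assumes a transaction that keeps its locks until commit, so its overall lock order is the concatenation of the sorted key orders of its bulk calls. TxLockOrder, lockOrder and hasLockOrderInversion are hypothetical names, and the check only detects an order inversion, which is a precondition for a deadlock rather than proof of one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the order in which a lock-holding transaction acquires keys. */
final class TxLockOrder {
    /** Each batch is locked in sorted order; batches are processed in call order. */
    @SafeVarargs
    static List<String> lockOrder(Collection<String>... batches) {
        Set<String> order = new LinkedHashSet<>();
        for (Collection<String> batch : batches) {
            order.addAll(new TreeSet<>(batch)); // keys already locked keep their first position
        }
        return new ArrayList<>(order);
    }

    /** True if some pair of keys is locked in opposite order by the two transactions. */
    static boolean hasLockOrderInversion(List<String> tx1, List<String> tx2) {
        for (int i = 0; i < tx1.size(); i++) {
            for (int j = i + 1; j < tx1.size(); j++) {
                int first = tx2.indexOf(tx1.get(i));
                int second = tx2.indexOf(tx1.get(j));
                if (first >= 0 && second >= 0 && second < first) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Both transactions pass sorted batches, but in a different sequence.
        List<String> tx1 = lockOrder(Arrays.asList("c", "d"), Arrays.asList("a", "b"));
        List<String> tx2 = lockOrder(Arrays.asList("a", "b"), Arrays.asList("c", "d"));
        System.out.println(tx1 + " vs " + tx2 + " inversion=" + hasLockOrderInversion(tx1, tx2));
    }
}
```

Here tx1 locks c before a while tx2 locks a before c, so sorting each map on its own does not give the two transactions a common lock order.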

Let's avoid strange warnings for a normal usage scenario. Denis, please close the ticket :-)))

Vladimir.



dmagda

Re: Ignite 2.0.0 GridUnsafe unmonitor

Vladimir,

That’s an oversight and a lack of explanation on our side. The goal is to avoid unexpected deadlocks when a user passes a HashMap to cache.putAll. Before printing out a warning we can filter out OPTIMISTIC/SERIALIZABLE and other suitable scenarios.

So you’re free to offer another solution aside from closing the ticket :) 

Denis

 

Dmitry Pavlov

Re: Ignite 2.0.0 GridUnsafe unmonitor

Vladimir, thank you. Good point about optimistic transactions, but putAll usage still requires sorted collections.
A user, of course, may also break this scenario by using sorted maps with inconsistent custom comparators (one ascending, one descending).
Producing a warning in Ignite code about a potential deadlock is a good usability feature.
Please look at the user list: deadlocks are a very frequent question.
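
The comparator case mentioned above is easy to reproduce in plain Java. In this sketch ComparatorMismatch and iterationOrder are hypothetical names. Both maps are TreeMaps, so a type check for SortedMap would accept them, yet they iterate over the same keys in opposite order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

/** Shows that two sorted maps can still disagree on key order. */
final class ComparatorMismatch {
    /** Key iteration order of a TreeMap built with the given comparator. */
    static List<String> iterationOrder(Comparator<String> cmp, String... keys) {
        TreeMap<String, Integer> map = new TreeMap<>(cmp);
        for (String key : keys) {
            map.put(key, 0);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        List<String> asc = iterationOrder(Comparator.naturalOrder(), "b", "a", "c");
        List<String> desc = iterationOrder(Comparator.reverseOrder(), "b", "a", "c");
        System.out.println("ascending:  " + asc);
        System.out.println("descending: " + desc);
    }
}
```

Callers that want a common order would have to agree on one comparator, for example the natural ordering of the key type.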

вт, 31 окт. 2017 г. в 21:02, Denis Magda <[hidden email]>:
Vladimir,

That’s an oversight and lack of explanation on our side. The goal is to avoid unexpected deadlocks when a user passed a HashMap in cache.putAll. Before printing out a warning we can filter out OPTIMISTIC/SERIALIZABLE and other suitable scenarios.

So you’re free to offer another solution aside from closing the ticket :) 

Denis

 
On Oct 31, 2017, at 10:55 AM, Vladimir Ozerov <[hidden email]> wrote:

Guys,

Printing a warning in this case is a really strange idea. First, how would you explain it in the case of OPTIMISTIC/SERIALIZABLE transactions, where deadlocks are impossible? Second, what would you do if two sorted maps are passed one after another in a transaction? The user may still get a deadlock. Last, we are moving towards the SQL world, where "maps" simply do not exist and virtually any update could easily lead to a deadlock.

Let's avoid strange warnings for a normal usage scenario. Denis, please close the ticket :-)))

Vladimir.

On Tue, Oct 31, 2017 at 8:34 PM, Denis Magda <[hidden email]> wrote:
Here is a ticket for the improvement:
https://issues.apache.org/jira/browse/IGNITE-6804


Denis

> On Oct 31, 2017, at 3:55 AM, Dmitry Pavlov <[hidden email]> wrote:
>
> I agree with Denis. If we don't have such a warning, we will have to continuously warn users in wiki pages, blogs and presentations. It is simpler to warn from code.
>
> What do you think about issuing the warning only if size > 1? A HashMap with 1 item will not cause a deadlock. Moreover, there can be some custom singleton Map provided by the user.
>
> Sincerely,
> Dmitriy Pavlov
>

vkulichenko

Re: Ignite 2.0.0 GridUnsafe unmonitor

I like the idea of printing out a warning if an unsorted map is provided. The fact that there are tons of other ways to get a deadlock doesn't mean that we should ignore this case, which is actually very common.

-Val
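
A rough sketch of what such a check could look like, in plain Java. This is not Ignite's actual implementation: the class UnsortedBatchWarning and its methods are hypothetical names made up for this illustration, and the message text is invented. It combines the conditions proposed in this thread: warn only when the map is not a SortedMap, only when it holds more than one entry, and only once per checker instance. It deliberately does not filter out OPTIMISTIC/SERIALIZABLE transactions, which a real check would also need to consider.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicBoolean;

class UnsortedBatchWarning {
    /** Flips to true after the first warning so later calls stay silent. */
    private final AtomicBoolean warned = new AtomicBoolean();

    /** A batch is considered risky if it is unsorted and has more than one key. */
    static boolean isRisky(Map<?, ?> batch) {
        return !(batch instanceof SortedMap) && batch.size() > 1;
    }

    /** Prints the warning at most once; returns true only when it was printed. */
    boolean warnIfRisky(Map<?, ?> batch) {
        if (isRisky(batch) && warned.compareAndSet(false, true)) {
            System.err.println("Unsorted map passed to a bulk operation; "
                + "key order may differ between callers and lead to a deadlock. "
                + "Consider using a SortedMap such as TreeMap.");
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        UnsortedBatchWarning check = new UnsortedBatchWarning();

        Map<Integer, String> hash = new HashMap<>();
        hash.put(1, "a");
        hash.put(2, "b");
        Map<Integer, String> sorted = new TreeMap<>(hash);

        System.out.println(check.warnIfRisky(sorted)); // false: sorted map
        System.out.println(check.warnIfRisky(hash));   // true: first unsorted multi-key batch
        System.out.println(check.warnIfRisky(hash));   // false: already warned once
    }
}
```

The compareAndSet call keeps the warn-once guarantee safe under concurrent callers without taking a lock on the hot path.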


yakov

Re: Ignite 2.0.0 GridUnsafe unmonitor

+1 for warning about potential deadlock and improving javadocs

--Yakov