I have a Kubernetes cluster (on GCP) running Apache Ignite 2.8.1 (upgraded
from 2.8.0) with GridGain Control Center installed. For the last week the
Ignite cluster has had no load (no read/write requests to the cluster). But I
am seeing the exception below on my cluster node, with a lot of threads in the
TIMED_WAITING and WAITING states. Any clue why this behaviour occurs? This is
the second time it has happened without any load on the cluster. Last week I
had the same issue, restarted the cluster, and kept it idle to confirm this
behaviour. I have also uploaded the complete log.
Re: Blocked system-critical thread has been detected
Your log doesn't have the full thread dumps, and I can't find some
information (e.g. topology snapshots). However, I see that the checkpoint
thread was blocked for a long time:
10.20.4.18:47500]-#2][G] Blocked system-critical thread has been detected.
This can lead to cluster-wide undefined behaviour
But I see that it was blocked for no longer than 3 minutes.
My guess is that the checkpoint lock can't be taken until some other operation
times out. It could be a network-related timeout or some other operation's
timeout. So please check your configuration, find where you have a 3-minute
timeout, and check what is related to it.
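
If it helps with the search, here is a rough sketch of one way to scan a Spring XML config for numeric properties set to 3 minutes (180000 ms). The class name TimeoutScan and the method findByValue are made up for illustration, and the sample fragment in main uses invented values, not your actual settings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ignite: lists numeric properties with a given value.
final class TimeoutScan {
    // Matches Spring-style entries such as <property name="x" value="123"/>.
    private static final Pattern NUMERIC_PROPERTY =
        Pattern.compile("<property\\s+name=\"([^\"]+)\"\\s+value=\"(\\d{1,18})\"");

    /** Returns the names of numeric properties whose value equals targetMillis. */
    static List<String> findByValue(String xml, long targetMillis) {
        List<String> names = new ArrayList<>();
        Matcher m = NUMERIC_PROPERTY.matcher(xml);
        while (m.find()) {
            if (Long.parseLong(m.group(2)) == targetMillis) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Invented fragment for demonstration only; the values are assumptions.
        String sample =
            "<property name=\"networkTimeout\" value=\"180000\"/>\n"
          + "<property name=\"failureDetectionTimeout\" value=\"10000\"/>\n";

        // 3 minutes expressed in milliseconds.
        System.out.println(findByValue(sample, 3 * 60 * 1000L)); // prints [networkTimeout]
    }
}
```

Note that this only finds values set explicitly in the file and written in milliseconds. A setting left at its default, or one expressed in seconds, would not show up, so the defaults still need to be checked by hand.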