Gracefully shutting down the data grid

shivakumar

Hi all,

I am trying to deactivate a cluster that a few clients are connected to over JDBC.
These clients insert records into many tables and run some long-running queries.
While this is going on I try to deactivate the cluster (I want to take a data backup, and the cluster has to be deactivated first), but the deactivation hangs: control.sh never returns control.
When I check the current cluster state through the REST API, it sometimes reports that the cluster is inactive.
After some time I try to activate the cluster again, but it returns this error:

[root@ignite-test]# curl "http://ignite-service-shiv.ignite.svc.cluster.local:8080/ignite?cmd=activate&user=ignite&password=ignite"  | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   207  100   207    0     0   2411      0 --:--:-- --:--:-- --:--:--  2406
{
  "successStatus": 0,
  "sessionToken": "654F094484E24232AA74F35AC5E83481",
  "error": "Failed to activate, because another state change operation is currently in progress: deactivate\nsuppressed: \n",
  "response": null
}


This means that my earlier deactivation did not complete.
Is there any other way to deactivate the cluster, to terminate the existing client connections, or to terminate the running queries?
I tried "kill -k -ar" from the Visor shell, but it restarted a few nodes and ended with an exception related to page corruption.
Note: My Ignite deployment is on Kubernetes.
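
To tell a deactivation that is still in progress apart from other failures, the REST response can be checked by script. This is only a minimal sketch: classify_response is a hypothetical helper, it relies on plain string matching rather than a JSON parser, and it assumes a successful call carries an empty or null "error" field. It looks at "error" rather than "successStatus" because the response above reports "successStatus": 0 even though the call failed.

```shell
#!/bin/sh
# Hypothetical helper: classify an Ignite REST response body read from stdin.
# Prints "busy" if another state change is still in progress, "ok" if the
# error field is empty or null (an assumption about successful responses),
# and "failed" for anything else.
classify_response() {
  body=$(cat)
  case "$body" in
    *"another state change operation is currently in progress"*)
      echo busy
      return
      ;;
  esac
  # Drop whitespace so that '"error": null' and '"error":null' both match.
  compact=$(printf '%s' "$body" | tr -d ' \t\n')
  case "$compact" in
    *'"error":""'*|*'"error":null'*) echo ok ;;
    *) echo failed ;;
  esac
}
```

Piping the output of the curl command above (with -s and without jq) into classify_response would print busy for the response shown.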

Any help is appreciated.

regards,
shiva


Denis Mekhanikov

Re: Gracefully shutting down the data grid

Shiva,

What version of Ignite do you use and do you have security configured in the cluster?

There was a bug in Ignite before version 2.7 that had similar symptoms: https://issues.apache.org/jira/browse/IGNITE-7624
It’s fixed under the following ticket: https://issues.apache.org/jira/browse/IGNITE-9535

Try updating to the latest version of Ignite and see if the issue is resolved there.

If this is not your case, then please collect thread dumps from all nodes and share them in this thread. Logs will also be useful.
Please don’t paste them into the message body; send them as attachments.
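
Since the deployment is on Kubernetes, collecting the dumps could look roughly like the sketch below. Everything specific in it is an assumption: collect_thread_dumps is a hypothetical helper, the namespace ignite is guessed from the service hostname in the original message, the label selector app=ignite is a placeholder, and it presumes the Ignite JVM runs as PID 1 in a container image that ships jstack.

```shell
#!/bin/sh
# Sketch: save one thread dump per Ignite pod into an output directory.
# Arguments (all optional, defaults are assumptions):
#   $1 namespace, $2 label selector, $3 output directory
collect_thread_dumps() {
  namespace=${1:-ignite}
  selector=${2:-app=ignite}
  outdir=${3:-thread-dumps}
  mkdir -p "$outdir" || return 1
  # "-o name" prints entries such as "pod/ignite-0".
  for pod in $(kubectl get pods -n "$namespace" -l "$selector" -o name); do
    name=${pod#pod/}
    # Assumes the JVM is PID 1 and jstack is available inside the container.
    kubectl exec -n "$namespace" "$name" -- jstack 1 > "$outdir/$name.txt" \
      || echo "thread dump failed for $name" >&2
  done
}
```

Calling collect_thread_dumps ignite app=ignite dumps would leave one .txt file per pod in the dumps directory, ready to be attached together with the logs.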

Denis
In reply to Shiva Kumar's message of 30 Sep 2019, 17:49 +0300.