Ignite graceful shutdown

crenique crenique

Ignite graceful shutdown

Hello,

We are running many Ignite server nodes on a Windows VM scale set in a cloud.
Sometimes, however, the cloud provider forcibly reboots multiple VMs for system
updates (even though the update policy is set to manual, it still forces a
reboot at times).

The cache mode is partitioned, with 2 backups.

The worst-case scenario would be the cloud provider rebooting all three
VMs (1 primary + 2 backups) at the same time. Then we might lose data.
So we have a Windows service set up to delay the reboot and buy some time to
gracefully shut down the local Ignite server node. (The Windows service spawns
the Ignite server node process as a child.)

If I call this method,
 Ignition.Shutdown(null, false);

- Does the shutdown method block until all local node data is rebalanced to
other nodes?

- Or is there another way to force all local node data to be moved to other
nodes?

- Also, would you recommend a way to handle the worst-case scenario in which
a primary node and its two backup nodes are rebooted at the same time?


Thanks
Sam

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
slava.koptilin slava.koptilin

Re: Ignite graceful shutdown

Hello Sam,

> if i call this method, Ignition.Shutdown(null, false);
The Ignition class does not have a `Shutdown` method [1], [2]. Perhaps
you mean `stop`.

> Does the shutdown method block until all local node data is rebalanced to
> other nodes?
No, it does not. The second parameter of the 'stop' method defines the behavior
of the stopped node regarding ComputeJobs, Ignite services etc.
If that parameter is set to true, running jobs are cancelled first; in either
case the Ignite instance waits for the jobs on the node to complete before
stopping.

> Or is there another way to force shift all local node data to other nodes?
I think the best way to handle this is to stop the nodes one by one. I mean the
following:
 - stop one node
 - wait until rebalancing finishes;
   for example, you can use Ignite events (EVT_CACHE_REBALANCE_STOPPED) [3]
   or check the rebalancing status via Visor
 - repeat for the next node, and so on
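
A minimal Java sketch of the one-by-one procedure above, using only the
standard library. The stopNode and rebalanceDone hooks are hypothetical
placeholders, not Ignite API: in a real deployment the first might ask the
Windows service on that VM to call Ignition.stop, and the second might be
driven by EVT_CACHE_REBALANCE_STOPPED events or a status check. Treat it as
an outline under those assumptions rather than a tested procedure.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Stops nodes one at a time, waiting for rebalancing to finish between stops. */
final class RollingStop {
    /**
     * @param nodes         node identifiers in stop order (hypothetical naming scheme)
     * @param stopNode      hypothetical hook that stops the given node
     * @param rebalanceDone hypothetical check: true once the remaining nodes finished rebalancing
     * @param pollMillis    delay between two checks
     * @param maxWaitMillis how long to wait for rebalancing after each stop
     * @return true if every node was stopped and rebalancing finished each time;
     *         false if a wait timed out or the thread was interrupted
     */
    static boolean stopOneByOne(List<String> nodes,
                                Consumer<String> stopNode,
                                BooleanSupplier rebalanceDone,
                                long pollMillis,
                                long maxWaitMillis) {
        for (String node : nodes) {
            stopNode.accept(node);
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
            // Do not touch the next node until the data has been redistributed.
            while (!rebalanceDone.getAsBoolean()) {
                if (System.nanoTime() - deadline >= 0) {
                    return false; // timed out: leave the remaining nodes running
                }
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return true;
    }
}
```

Returning false on a timeout leaves the remaining nodes running, so a stalled
rebalance does not end with every copy of a partition being stopped.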

[1] https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/Ignition.html
[2] https://ignite.apache.org/releases/latest/dotnetdoc/api/Apache.Ignite.Core.Ignition.html
[3] https://apacheignite.readme.io/docs/events

Hope this helps.

Thanks,
S.



crenique crenique

Re: Ignite graceful shutdown

1.
> > if i call this method, Ignition.Shutdown(null, false);
> The Ignition class does not have a `Shutdown` method [1], [2]. Perhaps
> you mean `stop`.

-> Ah, yes, I mean the Stop method in the Ignite .NET (C#) API:
 Apache.Ignite.Core.Ignition
     bool Stop(string name, bool cancel)

2.
> > Does the shutdown method block until all local node data is rebalanced to
> > other nodes?
> No, it does not. The second parameter of the 'stop' method defines the behavior
> of the stopped node regarding ComputeJobs, Ignite services etc.
> If that parameter is set to true, running jobs are cancelled first; in either
> case the Ignite instance waits for the jobs on the node to complete before
> stopping.

-> With cancel = true, is there a chance that the stop call would block for a
long time or hang, depending on what jobs are running? Would it be safe to
wrap it in a wait timeout and call stop(null, false) again if it takes too
long?


3.
> > Or is there another way to force shift all local node data to other nodes?
> I think the best way to handle this is to stop the nodes one by one. I mean the
> following:
>  - stop one node
>  - wait until rebalancing finishes;
>    for example, you can use Ignite events (EVT_CACHE_REBALANCE_STOPPED) [3]
>    or check the rebalancing status via Visor
>  - repeat for the next node, and so on

-> I guess I need to capture the EVT_CACHE_REBALANCE_STOPPED event on the
other nodes, because the stopped node can't capture the event itself?



Thanks
Sam





slava.koptilin slava.koptilin

Re: Ignite graceful shutdown

Hi Sam,

> is there a chance that stop method call would block long time or
> hang depending on what jobs are running?
> Would it be safe to wrap a wait timeout & call stop(null, false) again if
> taking too long?
In the vast majority of use cases I think it will not be a problem,
because tasks/jobs should be aware of the interruption (the cancellation
mechanism is implemented via the interrupt flag).
In any case, you can use this approach. I don't see any flaws.
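
A rough Java sketch of the timeout wrapper discussed above, using only the
standard library. StopCall is a hypothetical stand-in for
Ignition.stop(name, cancel) (or Ignition.Stop in .NET), and which flag to try
first is left to the caller. It assumes that a second stop call is safe while
the first one is still blocked; I have not verified that beyond what is said
in this thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Runs a blocking stop call with a timeout and retries once with the other flag. */
final class TimedStop {
    /** Hypothetical stand-in for Ignition.stop(name, cancel). */
    interface StopCall {
        boolean stop(boolean cancel);
    }

    /**
     * @param call          the stop operation to run
     * @param firstCancel   value of the cancel flag for the first attempt
     * @param timeoutMillis how long to wait for the first attempt
     * @return the result of whichever attempt completed, or false if interrupted
     */
    static boolean stopWithFallback(StopCall call, boolean firstCancel, long timeoutMillis) {
        // Daemon thread, so a hung first attempt cannot keep the JVM alive.
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "node-stop");
            t.setDaemon(true);
            return t;
        });
        try {
            Future<Boolean> first = worker.submit(() -> call.stop(firstCancel));
            return first.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The first attempt is still running: retry on this thread with the flag flipped.
            return call.stop(!firstCancel);
        } catch (ExecutionException e) {
            throw new IllegalStateException("stop call failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            worker.shutdown(); // does not interrupt an attempt that is still in flight
        }
    }
}
```

With the real API the call could look like
TimedStop.stopWithFallback(c -> Ignition.stop(null, c), true, 30_000), where
the 30 second timeout is only an example value.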

> guess i need to capture ignite events (EVT_CACHE_REBALANCE_STOPPED) event in
> the other nodes because stopped node can't capture event by itself ?
Yep, you need to capture that event on other nodes.
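
One possible shape for that on a surviving node, as a stdlib-only Java sketch.
RebalanceWaiter and its method names are made up for illustration. The
assumption is that a local listener for EVT_CACHE_REBALANCE_STOPPED (for
example, one registered through ignite.events().localListen, with that event
type enabled in the node configuration) forwards each event's cache name to
onRebalanceStopped.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Tracks which caches have reported "rebalance stopped" and lets a caller wait for all of them. */
final class RebalanceWaiter {
    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    private final CountDownLatch done = new CountDownLatch(1);

    /** @param cacheNames names of the caches whose rebalancing must finish */
    RebalanceWaiter(Set<String> cacheNames) {
        pending.addAll(cacheNames);
        if (pending.isEmpty()) {
            done.countDown(); // nothing to wait for
        }
    }

    /**
     * Call this from the event listener (assumed wiring, see above).
     *
     * @return true while other caches are still pending, false once all have reported
     */
    boolean onRebalanceStopped(String cacheName) {
        if (pending.remove(cacheName) && pending.isEmpty()) {
            done.countDown();
        }
        return !pending.isEmpty();
    }

    /** Blocks until every cache has reported or the timeout elapses; false on timeout or interrupt. */
    boolean await(long timeoutMillis) {
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The boolean returned by onRebalanceStopped is meant to double as the listener's
return value, on the assumption that returning false unsubscribes it.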

Thanks,
S.


