How to persist data only on selected nodes but not all nodes in cluster

xingjl6280

How to persist data only on selected nodes but not all nodes in cluster

Hi team,

My cluster is running in REPLICATED cache mode, to ensure no data is lost if
any node goes down.

Now I'm going to enable persistence, but I don't want every node to
hold a full backup.
Is it possible to make some of the nodes persist data, while the rest run in
pure in-memory mode, reading and writing the same set of data?

I'm quite confused by Baseline topology.
I saw this in the documentation:
====================================================================================
*Moreover, the cluster can have cluster nodes that are not a part of the
baseline topology such as:
Server nodes that either store data in memory or persist it to a 3rd party
database like RDBMS or NoSQL.*
====================================================================================


But I also found this statement, which seems to conflict with the one above:
====================================================================================
*The new node cannot hold data of caches/tables who persist data in Ignite
persistence.*
====================================================================================

Please kindly advise.
Thank you.

Regards



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Denis Mekhanikov

Re: How to persist data only on selected nodes but not all nodes in cluster

When persistence is configured in your cluster, a set of nodes forms the baseline topology (BLT). Those are the nodes that store the data and persist it on their disks.
Nodes outside of the BLT can query the data stored on the nodes that are in the BLT.
Normally, nodes are not added to the baseline automatically when they join the cluster. Adding them requires either manual action or configuring baseline auto-adjustment: https://www.gridgain.com/docs/latest/developers-guide/baseline-topology#baseline-topology-autoadjustment
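
For illustration, the manual and the automatic option might look roughly like this from Java. This is only a sketch, assuming Ignite 2.8 or later with ignite-core on the classpath and an already activated cluster. The user attribute "persist.data" and the class and method names are made up for the example and do not come from this thread.

```java
import java.util.List;
import java.util.stream.Collectors;

import org.apache.ignite.Ignite;
import org.apache.ignite.cluster.ClusterNode;

public class BaselineSketch {
    /**
     * Manual option: put only selected server nodes into the baseline.
     * Nodes are selected by a hypothetical user attribute "persist.data"
     * that would have to be set via IgniteConfiguration.setUserAttributes(...).
     */
    static void setBaselineManually(Ignite ignite) {
        // Manual baseline changes are rejected while auto-adjust is enabled.
        ignite.cluster().baselineAutoAdjustEnabled(false);

        List<ClusterNode> dataNodes = ignite.cluster().forServers().nodes().stream()
            .filter(n -> "true".equals(n.attribute("persist.data")))
            .collect(Collectors.toList());

        ignite.cluster().setBaselineTopology(dataNodes);
    }

    /** Automatic option: the baseline follows the server topology after a delay. */
    static void enableAutoAdjust(Ignite ignite) {
        ignite.cluster().baselineAutoAdjustEnabled(true);
        // Wait 30 seconds after the last topology change (example value).
        ignite.cluster().baselineAutoAdjustTimeout(30_000L);
    }
}
```

The same operations are also available through the control.sh --baseline command described in the linked documentation.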

You can add the nodes that you want to store and persist the data to the baseline. Others can work as "compute-only" nodes.
If you want to optimize the work of nodes that are outside of the BLT, consider using a near cache: https://www.gridgain.com/docs/latest/developers-guide/near-cache
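
A near cache could be opened on a non-data node roughly as follows. This is a sketch under assumptions: the cache name "myCache", the key and value types, and the near cache size of 10,000 entries are placeholders, and the cache must already exist in the cluster.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory;
import org.apache.ignite.configuration.NearCacheConfiguration;

public class NearCacheSketch {
    /** Opens a local near cache on top of an existing cluster cache. */
    static IgniteCache<String, String> openNearCache(Ignite ignite) {
        NearCacheConfiguration<String, String> nearCfg = new NearCacheConfiguration<>();

        // Keep at most 10,000 recently used entries locally (example value).
        nearCfg.setNearEvictionPolicyFactory(
            new LruEvictionPolicyFactory<String, String>(10_000));

        // "myCache" is a placeholder for a cache already defined in the cluster.
        return ignite.getOrCreateNearCache("myCache", nearCfg);
    }
}
```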

More information about the Baseline Topology feature: https://www.gridgain.com/docs/latest/developers-guide/baseline-topology

Denis

In reply to xingjl6280 <[hidden email]>, Wed, 2 Sep 2020, 05:45.
xingjl6280

Re: How to persist data only on selected nodes but not all nodes in cluster

Hi Denis,

Thanks for your reply.
I did some experiments. It seems a non-BLT node only works with PARTITIONED
cache mode, not with REPLICATED.
In REPLICATED mode, startup of the non-BLT node gets stuck without any
exception.
Is this by design?

My cluster holds many semaphores and atomic references to synchronize business logic, so I
cannot afford to lose any data partition.


Regards,
Johnny


ilya.kasnacheev

Re: How to persist data only on selected nodes but not all nodes in cluster

Hello!

Do you have a reproducer which highlights the issue?

Having said that, I don't really recommend having long-lived nodes outside the BLT. It may have been intended as a supported scenario, but I doubt it is tested very much. Nowadays it is assumed that any non-BLT node gets added to the baseline once auto-adjust kicks in.

You can try the following: declare a persistent data region on a subset of nodes (the ones that should hold data) and create all caches in that data region. Declare a small persistent default region for logistical reasons only (mixed in-memory/persistent clusters are problematic).
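
One possible reading of this suggestion in Java configuration is sketched below. It assumes that every server node gets the small persistent default region and that only the data-holding nodes additionally declare the named region. The region name "persistentData", the cache name "myCache", and the 64 MB size are placeholders. Depending on the Ignite version, a node filter (CacheConfiguration.setNodeFilter) may also be needed to keep the caches off the nodes that lack the region; that part is not confirmed in this thread.

```java
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class DataRegionSketch {
    /** Hypothetical name of the region that actually holds cache data. */
    static final String DATA_REGION = "persistentData";

    /** Builds a server node configuration; holdsData selects the data-holding variant. */
    static IgniteConfiguration nodeConfig(boolean holdsData) {
        DataStorageConfiguration storage = new DataStorageConfiguration();

        // Small persistent default region on every node, so the cluster
        // is not a mix of in-memory and persistent nodes.
        storage.getDefaultDataRegionConfiguration()
            .setPersistenceEnabled(true)
            .setInitialSize(64L * 1024 * 1024)
            .setMaxSize(64L * 1024 * 1024);

        if (holdsData) {
            // Only data-holding nodes declare the region used by the caches.
            storage.setDataRegionConfigurations(
                new DataRegionConfiguration()
                    .setName(DATA_REGION)
                    .setPersistenceEnabled(true));
        }

        return new IgniteConfiguration().setDataStorageConfiguration(storage);
    }

    /** Cache definition that places its data in the named region. */
    static CacheConfiguration<String, String> cacheConfig() {
        return new CacheConfiguration<String, String>("myCache")
            .setCacheMode(CacheMode.REPLICATED)
            .setDataRegionName(DATA_REGION);
    }
}
```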

Also, why not just use client nodes in place of the non-data-holding ones?
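
A client node along those lines might be started like this. It is a minimal sketch, assuming default discovery settings and ignite-core on the classpath; the names "bizLock" and "bizState" are invented for the example. Client nodes store no cache data and are never part of the baseline, yet they can still use distributed semaphores and atomic references.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteAtomicReference;
import org.apache.ignite.IgniteSemaphore;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ClientNodeSketch {
    public static void main(String[] args) {
        // Client mode: the node joins the cluster but holds no data.
        IgniteConfiguration cfg = new IgniteConfiguration().setClientMode(true);

        try (Ignite client = Ignition.start(cfg)) {
            // Create the structures if they do not exist yet (last argument).
            IgniteSemaphore sem = client.semaphore("bizLock", 1, true, true);
            IgniteAtomicReference<String> ref =
                client.atomicReference("bizState", "INIT", true);

            sem.acquire();
            try {
                ref.set("RUNNING");
            } finally {
                sem.release();
            }
        }
    }
}
```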

Regards,
--
Ilya Kasnacheev


In reply to xingjl6280 <[hidden email]>, Tue, 8 Sep 2020, 05:59.