CAP Theorem (CP? or AP?)

joseheitor

In this GridGain presentation:

https://www.youtube.com/watch?v=u8BFLDfOdy8&t=1806s

Valentin Kulichenko explains the CAP theorem and states that Apache Ignite
is designed to favour Strong-Consistency (CP) over High-Availability (AP).

However, in my test case, my system appears to be behaving as an AP system.
Here is my setup:

4 partitioned nodes in 2 availability-zones [AZa-1, AZa-2] [AZb-3, AZb-4],
configured as described in this post:

http://apache-ignite-users.70518.x6.nabble.com/RESOLVED-Cluster-High-Availability-tp25740.html

With 7,000 records loaded into a table in the cluster via the JDBC Thin client:

1. [OK] I can connect to any node and verify that there are 7,000 records
with a SELECT COUNT(*)

2. [OK] If I kill all nodes in AZ-a [AZa-1, AZa-2], and connect to one of
the remaining online nodes in AZ-b, I can still verify that there are 7,000
records with a SELECT COUNT(*)

3. [?] I then kill one of the remaining two nodes in AZ-b and connect to the
single remaining node. Now a SELECT COUNT(*) returns a value of 3,444
records.

This seems to illustrate that the partitioning and backup configuration is
working as intended. But if Ignite is strongly-consistent (CP), shouldn't
the final query fail rather than return an inaccurate result (AP)?

Or am I missing some crucial configuration element(s)?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Ivan Pavlukhin

Re: CAP Theorem (CP? or AP?)

Hi Jose,

First of all, you refer to a slide about Data Center Replication, a
commercial feature of GridGain. Ignite does not provide such a feature.
Also, SQL and the Cache API can behave differently.

You can check how the Cache API behaves in your experiments.
CacheAtomicityMode and PartitionLossPolicy (cache configuration
options) can change the behavior with respect to consistency.

On Mon, 24 Dec 2018 at 09:55, joseheitor <[hidden email]> wrote:




--
Best regards,
Ivan Pavlukhin
Stanislav Lukyanov

Re: CAP Theorem (CP? or AP?)

In reply to this post by joseheitor
When the cluster loses all copies of a partition, the behavior is defined by PartitionLossPolicy. The current default is IGNORE, which is indeed an AP rather than a CP option. You can set it to READ_WRITE_SAFE or READ_ONLY_SAFE to get CP behavior. I would also strongly advise doing so if you use native persistence, as there are issues with IGNORE in that mode.
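
To make the difference between the two policies concrete, here is a toy model in plain Java. It is only a sketch and does not use Ignite at all: the class `PartitionLossSketch`, its `LossPolicy` enum, the 8-partition layout, and the assumption that every partition has exactly one copy in each availability zone are all made up for illustration. Only the node names and the 7,000 records come from the original post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: this is NOT Ignite code. Names and layout are assumptions.
class PartitionLossSketch {
    enum LossPolicy { IGNORE, READ_WRITE_SAFE }

    // owners.get(p) is the set of nodes holding a copy of partition p.
    private final List<Set<String>> owners = new ArrayList<>();
    private final int[] recordsInPartition;
    private final Set<String> alive =
        new HashSet<>(Arrays.asList("AZa-1", "AZa-2", "AZb-3", "AZb-4"));
    private final LossPolicy policy;

    PartitionLossSketch(int partitions, int records, LossPolicy policy) {
        this.policy = policy;
        this.recordsInPartition = new int[partitions];
        for (int p = 0; p < partitions; p++) {
            // Assumption: one copy of every partition in each availability zone.
            String inA = (p % 2 == 0) ? "AZa-1" : "AZa-2";
            String inB = (p % 2 == 0) ? "AZb-3" : "AZb-4";
            owners.add(new HashSet<>(Arrays.asList(inA, inB)));
        }
        for (int k = 0; k < records; k++) {
            recordsInPartition[k % partitions]++; // spread keys evenly
        }
    }

    void kill(String node) {
        alive.remove(node);
    }

    // Models SELECT COUNT(*): sums the records of every partition that still has a live owner.
    long count() {
        long total = 0;
        for (int p = 0; p < owners.size(); p++) {
            boolean lost = Collections.disjoint(owners.get(p), alive);
            if (lost && policy == LossPolicy.READ_WRITE_SAFE) {
                // CP-style: refuse to answer rather than return a partial result.
                throw new IllegalStateException("Partition " + p + " is lost");
            }
            if (!lost) {
                total += recordsInPartition[p]; // AP-style: lost partitions are silently skipped
            }
        }
        return total;
    }

    public static void main(String[] args) {
        PartitionLossSketch ap = new PartitionLossSketch(8, 7000, LossPolicy.IGNORE);
        System.out.println(ap.count()); // all four nodes alive
        ap.kill("AZa-1");
        ap.kill("AZa-2");
        System.out.println(ap.count()); // AZ-b still holds a copy of everything
        ap.kill("AZb-3");
        System.out.println(ap.count()); // half of the partitions are gone
    }
}
```

With an even key spread the last count in this model is exactly half of the total; a real cluster hashes keys to partitions, so an uneven figure such as the 3,444 reported above is what you would expect. Constructing the same model with `LossPolicy.READ_WRITE_SAFE` makes the last `count()` throw instead.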

There was also a bug that prevented SQL from handling partition loss correctly. It was fixed for in-memory mode in 2.7 and for native persistence in 2.8.

Stan

joseheitor

Re: CAP Theorem (CP? or AP?)

Guys, thank you both for your informative and helpful responses.

I have explicitly configured the cache-template with the additional
property:

<property name="partitionLossPolicy" value="READ_WRITE_SAFE"/>

And have observed the following behaviour:

1. [OK] Attempting to get specific records that reside in a lost
partition does indeed throw an exception, as expected.

2. [?] A SELECT COUNT(*), however, still succeeds without error but
reports the wrong total number of records. Shouldn't any operation
against a cache with lost partitions result in an exception? How will my
application know whether the result is valid and can be trusted to be
accurate?

Another question, please: Stanislav, what is the issue that was fixed in
2.7 for in-memory mode but will only be fixed in 2.8 for Ignite native
persistence?

Thanks,
Jose



Stanislav Lukyanov

RE: CAP Theorem (CP? or AP?)

There were actually two bugs, but the problem is basically the same, just in different cases:

SQL + Partition Loss Policies issue (fixed in 2.7): https://issues.apache.org/jira/browse/IGNITE-8927 (the issue says “hang”, but the visible behavior actually varies)

SQL + Partition Loss Policies + Native Persistence issue (fixed in 2.8): https://issues.apache.org/jira/browse/IGNITE-9841

If you don’t use native persistence, then the various SELECTs should work as you expect on 2.7.

If you do need persistence, you could try working with master (e.g. take a nightly build, but don’t use it in any real environment).
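
On a version where SQL does not yet honour the loss policy, one possible application-side workaround is to check for lost partitions around any aggregate query and refuse to trust the result if there are any. The helper below is a minimal sketch of that idea in plain Java; `LostPartitionGuard` and `queryIfComplete` are hypothetical names, not part of Ignite. In a real application the first supplier could be backed by `IgniteCache.lostPartitions()` and the second by the actual query, but neither is called here so that the sketch runs without Ignite on the classpath.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.function.Supplier;

// Hypothetical helper, not an Ignite API: runs a query only if no partitions are reported lost.
final class LostPartitionGuard {
    private LostPartitionGuard() {
    }

    static <T> T queryIfComplete(Supplier<Collection<Integer>> lostPartitions, Supplier<T> query) {
        Collection<Integer> lostBefore = lostPartitions.get();
        if (!lostBefore.isEmpty()) {
            throw new IllegalStateException("Lost partitions before query: " + lostBefore);
        }
        T result = query.get();
        // Check again: partitions may have been lost while the query was running.
        Collection<Integer> lostAfter = lostPartitions.get();
        if (!lostAfter.isEmpty()) {
            throw new IllegalStateException("Lost partitions during query: " + lostAfter);
        }
        return result;
    }

    public static void main(String[] args) {
        // Healthy cluster (stubbed): the count is returned.
        long count = queryIfComplete(() -> Collections.<Integer>emptyList(), () -> 7000L);
        System.out.println(count);

        // Cluster with lost partitions (stubbed): the partial count is rejected.
        try {
            queryIfComplete(() -> Arrays.asList(0, 2, 4, 6), () -> 3444L);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The two checks are not atomic with the query, so this narrows the window for an unnoticed partial result rather than closing it; it is no substitute for running a version with the fixes above.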

 

Stan

 

From: [hidden email]
Sent: 24 December 2018, 18:40
To: [hidden email]
Subject: Re: CAP Theorem (CP? or AP?)

 
