Replicated cache leaks entries on 1.6 and 1.7-SNAPSHOT

Kristian Rosenvold

Replicated cache leaks entries on 1.6 and 1.7-SNAPSHOT

We're using a cache with CacheMode.REPLICATED.

Using 2 nodes, I start each node sequentially and they both get the
same number of elements in their caches (as expected so far).

Almost immediately, the caches start to drift out of sync; not all of
the elements are getting replicated. There is nothing in the
log to indicate anything peculiar happening.

Downgrading to 1.5 makes this problem go away.

Any suggestions?


Kristian
dsetrakyan

Re: Replicated cache leaks entries on 1.6 and 1.7-SNAPSHOT

Kristian, it is likely an environment problem rather than an Ignite problem. Can you create a simple reproducer that starts 2 nodes in the same JVM and proves that data is not replicated? If the problem is in Ignite, we will fix it ASAP.

Denis Magda

Re: Replicated cache leaks entries on 1.6 and 1.7-SNAPSHOT

Kristian,

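This topic looks similar to the following one [1]. The issue is probably the same, so I would prefer to discuss it in one place if you don’t mind.

[1] http://apache-ignite-users.70518.x6.nabble.com/Replicated-cache-leaks-entries-on-1-6-and-1-7-SNAPSHOT-td5704.html

Denis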

Kristian Rosenvold

Re: Replicated cache leaks entries on 1.6 and 1.7-SNAPSHOT

Denis, you linked back to my own post :)

I've left work for the weekend, but there is one piece of information
that I couldn't get out of my head: the database backing the cache always
contains fewer nodes than either of the cluster members, even though
there is no reported error.

This would actually be consistent with an inconsistent equals/hashCode
implementation on one of the cache keys where the upsert in the
database normalizes 2 objects that appear to be different down to the
same value. equals/hashCode is one of the scariest things around, and
I'm supposed to be good at that stuff :)
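
The effect described above can be sketched in plain Java, without Ignite. This is only an illustration under stated assumptions: the `SloppyKey` class, the case-insensitive `equals`, and the lower-casing "store" that stands in for a database upsert are all hypothetical and are not taken from the application discussed in this thread. A `HashMap` plays the role of the cache.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class InconsistentKeyDemo {
    /** Hypothetical key: equals() ignores case, but hashCode() does not. */
    static final class SloppyKey {
        final String id;

        SloppyKey(String id) { this.id = id; }

        @Override public boolean equals(Object o) {
            return o instanceof SloppyKey && ((SloppyKey) o).id.equalsIgnoreCase(id);
        }

        // Bug: a case-sensitive hash breaks the equals/hashCode contract.
        @Override public int hashCode() { return id.hashCode(); }
    }

    /** Counts the entries a hash-based map ends up with for the given ids. */
    static int cacheSize(String... ids) {
        Map<SloppyKey, String> cache = new HashMap<>();
        for (String id : ids) cache.put(new SloppyKey(id), id);
        return cache.size();
    }

    /** Counts the rows in a stand-in for a database that upserts on a normalized key. */
    static int storeSize(String... ids) {
        Map<String, String> store = new HashMap<>();
        for (String id : ids) store.put(id.toLowerCase(Locale.ROOT), id);
        return store.size();
    }

    public static void main(String[] args) {
        System.out.println("cache entries: " + cacheSize("ABC", "abc")); // 2
        System.out.println("store rows:    " + storeSize("ABC", "abc")); // 1
    }
}
```

The two keys are "equal" according to `equals`, yet they hash differently, so the map keeps both while the normalizing store keeps one. That matches the symptom of a backing database holding fewer rows than the in-memory cache with no error reported.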

Is Ignite known to be particularly picky about this?

Kristian


2016-06-17 14:58 GMT+02:00 Denis Magda <[hidden email]>:

> Kristian,
>
> This topic looks similar to the following one [1]. Probably the issue is the
> same so I would prefer to discuss this in one place if you don’t mind.
>
> [1]
> http://apache-ignite-users.70518.x6.nabble.com/Replicated-cache-leaks-entries-on-1-6-and-1-7-SNAPSHOT-td5704.html
AndreyVel

Re: Replicated cache leaks entries on 1.6 and 1.7-SNAPSHOT

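Hi Kristian, interesting idea about inconsistent equals/hashCode.
Could you show the code for review?

You can display cache statistics:

     // Enable metrics collection in the cache configuration before the cache is started.
     cacheCfg.setStatisticsEnabled(true);
     ...
     // Print the current metrics snapshot for this cache.
     System.out.println("cache.metrics: " + cache.metrics());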

Kristian Rosenvold

Re: Replicated cache leaks entries on 1.6 and 1.7-SNAPSHOT

Okay, I drove back to work to check this out. It turns out all my
troubles were being caused by inconsistent equals/hashCode. The key in
the cache was an abstract base class with multiple overrides (which in
itself I believe is one of those grayish areas wrt equals/hashCode). I
just changed this to String and it all worked perfectly. I'll find the
root cause on Monday :)
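
For readers wondering why an abstract key class with overriding subclasses is a gray area, here is a minimal plain-Java sketch. The actual key class is not shown in this thread, so `BaseKey`, `PlainKey`, and `VersionedKey` are hypothetical names invented for illustration, and a `HashSet` stands in for the cache. It shows one common way such hierarchies go wrong (an asymmetric `equals`), and how a canonical `String` key sidesteps it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class AbstractKeyDemo {
    /** Hypothetical abstract key; the base class treats "same id" as equal. */
    abstract static class BaseKey {
        final String id;

        BaseKey(String id) { this.id = id; }

        @Override public boolean equals(Object o) {
            return o instanceof BaseKey && ((BaseKey) o).id.equals(id);
        }

        @Override public int hashCode() { return id.hashCode(); }
    }

    /** Subclass that inherits equals()/hashCode() unchanged. */
    static final class PlainKey extends BaseKey {
        PlainKey(String id) { super(id); }
    }

    /** Subclass that tightens equals() and changes hashCode(). */
    static final class VersionedKey extends BaseKey {
        final int version;

        VersionedKey(String id, int version) { super(id); this.version = version; }

        @Override public boolean equals(Object o) {
            return o instanceof VersionedKey && super.equals(o)
                && ((VersionedKey) o).version == version;
        }

        @Override public int hashCode() { return Objects.hash(id, version); }
    }

    /** Number of distinct keys when the objects themselves are used as keys. */
    static int distinctObjectKeys(BaseKey... keys) {
        return new HashSet<>(Arrays.asList(keys)).size();
    }

    /** Number of distinct keys when only the id string is used as the key. */
    static int distinctStringKeys(BaseKey... keys) {
        Set<String> ids = new HashSet<>();
        for (BaseKey k : keys) ids.add(k.id);
        return ids.size();
    }

    public static void main(String[] args) {
        BaseKey plain = new PlainKey("42");
        BaseKey versioned = new VersionedKey("42", 1);

        System.out.println(plain.equals(versioned));  // true
        System.out.println(versioned.equals(plain));  // false, so equals() is not symmetric
        System.out.println(distinctObjectKeys(plain, versioned));  // 2
        System.out.println(distinctStringKeys(plain, versioned));  // 1
    }
}
```

Switching to a `String` key, as done here, removes the ambiguity because `String` has a single well-defined `equals`/`hashCode` pair.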

Kristian

