Affinity calls in stream receiver

Dave Harvey

Affinity calls in stream receiver

We have a custom stream receiver that makes affinity calls. This all functions properly, but we see a very large number of the following messages for the same two classes. We also just hit a 2 GB limit on Metaspace size, which we had come close to in the past.

[18:41:50,365][INFO][pub-#6954%GridGainTrial%][GridDeploymentPerVersionStore] Class was deployed in SHARED or CONTINUOUS mode: class com.....IgniteCallable

These affinity calls need to load classes that were loaded from client nodes, which may be related to why this is happening, but my primary suspect is the fact that both classes are nested. (I had previously hit an issue where setting the peer-class-loading "userVersion" would cause Ignite to throw exceptions when the client node attempted to activate the cluster. In that case, the Ignite call into the cluster was also using a nested class.)

We will try flattening these classes to see if the problem goes away.
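
For anyone wanting to watch for this kind of growth before the limit is hit, here is a minimal sketch that reads Metaspace usage and class counts through the standard java.lang.management beans. It is not Ignite-specific. The class name MetaspaceProbe is made up for illustration, and the pool name "Metaspace" is an assumption that holds for HotSpot JVMs (Java 8 and later) but may differ on other JVMs.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Ignite: reports class-loading and Metaspace figures.
final class MetaspaceProbe {
    /** Bytes used by the pool named "Metaspace" (HotSpot naming), or -1 if no such pool is exposed. */
    static long metaspaceUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    /** Number of classes currently loaded in this JVM. */
    static int loadedClassCount() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    /** Number of classes unloaded since the JVM started. */
    static long unloadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getUnloadedClassCount();
    }

    public static void main(String[] args) {
        System.out.printf("loaded=%d unloaded=%d metaspaceUsedBytes=%d%n",
            loadedClassCount(), unloadedClassCount(), metaspaceUsedBytes());
    }
}
```

Logging these values periodically on a server node should show whether the loaded-class count keeps climbing while the number of distinct application classes stays the same.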


Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast Ltd, an innovator in Software as a Service (SaaS) for business. Providing a safer and more useful place for your human generated data. Specializing in; Security, archiving and compliance. To find out more Click Here.

Denis Mekhanikov

Re: Affinity calls in stream receiver

David,

So, the problem is that the same class is loaded multiple times and it wastes the metaspace, right?
Could you share a reproducer?

Denis

Dave Harvey

Re: Affinity calls in stream receiver

We are testing whether removing the nested classes helps, and if so, we will create a reproducer.
For reference, IGNITE-7905 is the issue where ignite.active(true) fails from a client if userVersion is non-zero, which seems to be due to nested classes.

Dave Harvey

Re: Affinity calls in stream receiver

The nested class hypothesis seems unlikely.   We have 6000+
GridDeploymentClassLoaders on a node, because there are many instances of
"GridDeploymentPerVersionStore.SharedDeployment".

The userVersion is not changing, nor is the cluster topology.

I have enough data to debug this; I just need some time.
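
To illustrate why the number of class loaders matters more than the number of distinct classes, here is a small plain-Java sketch. It is not Ignite code: DeploymentLoader is a made-up stand-in for a per-deployment class loader, and a java.lang.reflect.Proxy class is used as a convenient substitute for a peer-loaded task class, since the JVM defines one proxy class per loader. The point it tries to show is that every new loader gets its own Class object (and so its own class metadata), while reusing a loader does not.

```java
import java.lang.reflect.Proxy;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;

// Hypothetical demo, not Ignite code.
final class LoaderPerDeploymentDemo {
    /** Empty loader that only delegates to its parent; stands in for one deployment's loader. */
    static final class DeploymentLoader extends ClassLoader {
        DeploymentLoader(ClassLoader parent) {
            super(parent);
        }
    }

    /**
     * Creates `requests` task instances and returns how many distinct Class objects back them.
     * With reuseLoader == false every request gets a fresh loader, as if no existing
     * deployment were ever reused.
     */
    static int distinctClasses(int requests, boolean reuseLoader) {
        ClassLoader parent = LoaderPerDeploymentDemo.class.getClassLoader();
        ClassLoader shared = new DeploymentLoader(parent);
        Set<Class<?>> classes = new HashSet<>();
        for (int i = 0; i < requests; i++) {
            ClassLoader loader = reuseLoader ? shared : new DeploymentLoader(parent);
            // The proxy class is defined in `loader`, so its identity is tied to that loader.
            Object task = Proxy.newProxyInstance(
                loader, new Class<?>[] {Callable.class}, (proxy, method, args) -> "ok");
            classes.add(task.getClass());
        }
        return classes.size();
    }

    public static void main(String[] args) {
        System.out.println("fresh loader per request: " + distinctClasses(5, false));
        System.out.println("one reused loader:        " + distinctClasses(5, true));
    }
}
```

Under these assumptions, thousands of loaders for the same two classes would mean thousands of copies of their metadata, which is consistent with Metaspace filling up even though the set of application classes is tiny.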



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Dave Harvey

Re: Affinity calls in stream receiver

We are running in SHARED mode on 2.5, and are currently quite suspicious of
the change below, which went into 2.4. The essence of the change is that, in
SHARED mode, it skips the code that will "Find existing deployments that need
to be checked whether they should be reused for this request":
 
https://github.com/apache/ignite/commit/d2050237ee2b760d1c9cbc906b281790fd0976b4#diff-3fae20691c16a617d0c6158b0f61df3c



Dave Harvey

Re: Affinity calls in stream receiver

We switched to CONTINUOUS mode based on the assumption that SHARED mode had
regressed in a way that allowed it to create many class loaders, and
eventually run out of Metaspace.  

CONTINUOUS mode failed much sooner, and we were able to reproduce that
failure and identify bugs in the code.   The code that tries to handle
cycles in a graph search fails the search on a cycle rather than just
breaking the recursion.
We filed https://issues.apache.org/jira/browse/IGNITE-9026 for this.
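
The difference between failing a search on a cycle and merely breaking the recursion can be shown with a generic sketch. This is not the Ignite code in question; the class, method names, and sample graph are all made up for illustration. Both variants do a depth-first search for a target node; the first treats a revisited node as a fatal condition and propagates the failure, the second simply stops exploring that branch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not Ignite code.
final class CycleSafeSearch {
    enum Outcome { FOUND, NOT_FOUND, FAILED }

    /** Problematic variant: revisiting a node fails the entire search. */
    static Outcome failOnCycle(Map<String, List<String>> graph, String node,
                               String target, Set<String> visited) {
        if (node.equals(target)) return Outcome.FOUND;
        if (!visited.add(node)) return Outcome.FAILED; // cycle treated as fatal
        for (String next : graph.getOrDefault(node, Collections.<String>emptyList())) {
            Outcome o = failOnCycle(graph, next, target, visited);
            if (o != Outcome.NOT_FOUND) return o; // FOUND and FAILED both end the search
        }
        return Outcome.NOT_FOUND;
    }

    /** Intended variant: revisiting a node only ends that branch of the recursion. */
    static boolean skipOnCycle(Map<String, List<String>> graph, String node,
                               String target, Set<String> visited) {
        if (node.equals(target)) return true;
        if (!visited.add(node)) return false; // already explored, try the other branches
        for (String next : graph.getOrDefault(node, Collections.<String>emptyList())) {
            if (skipOnCycle(graph, next, target, visited)) return true;
        }
        return false;
    }

    /** Sample graph: A -> B -> A forms a cycle, and A -> C -> D reaches the target D. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("A", Arrays.asList("B", "C"));
        g.put("B", Arrays.asList("A"));
        g.put("C", Arrays.asList("D"));
        return g;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = sampleGraph();
        System.out.println(failOnCycle(g, "A", "D", new HashSet<>()));
        System.out.println(skipOnCycle(g, "A", "D", new HashSet<>()));
    }
}
```

On the sample graph the first variant gives up as soon as it walks A -> B -> A, even though D is reachable through C, while the second finds D.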

Note: we concluded that this is unrelated to nested or anonymous classes,
contrary to what we originally assumed.



vdpyatkov

Re: Affinity calls in stream receiver

Hi David,

I think that if you have only two distinct classes, Metaspace should not contain 6000 classes.
If I am not mistaken, a server will hold those two classes once per owning client node (after a client node leaves the topology, its classes are unloaded from Metaspace).

Otherwise, please provide a reproducer in which the Metaspace overflow happens.





--
Vladislav Pyatkov