Behavior of init() for clustered singleton

dstieglitz

Hi folks:

We're trying to deploy a clustered singleton service. In our Service implementation object we have a reference to another object that is initialized in the init() method.

We've observed that upon topology change, this object can go "null."

What is the correct way to initialize objects like this? According to the example, it's done in the init() method (which we are doing now), but this seems to cause an issue on topology change. Is init() called again when the service has to migrate to a new node?

Dan
dsetrakyan


On Thu, Mar 31, 2016 at 8:58 AM, dstieglitz <[hidden email]> wrote:
> We've observed that upon topology change, this object can go "null."

This is normal, since the service state is not moved. All Ignite does is restart the service on another node right away and call init() again there.

> Is init() called again when the service has to migrate to a new node?

Yes.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Behavior-of-init-for-clustered-singleton-tp3819.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

dstieglitz

Following up on this...

Sorry for the vague description of the problem, but we are experiencing objects "going null" (as if they were garbage collected?) in our clustered singleton.

We have an instance variable of an object that is initialized in the service init() method. We have confirmed that on topology change, the object is properly re-initialized. However, after some period of time, for example, overnight, the object "goes null."

Are we doing this correctly? Should we store the object in the cluster?

The schedule class is here: https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/DistributedSchedulerServiceImpl.java

The object in question is the "DistributedScheduledThreadPoolExecutor"
yakov

Your example seems correct to me.
1. What do you mean by "goes null"?
2. I do not see any assignments other than the instantiation in the init() method.
3. Can you confirm that the service worked OK on some node, but after some time with no topology changes it started to throw an NPE? Can you please share the stack trace? It may reveal details we are missing now.

--Yakov


dstieglitz

We did find an exception in the remote job which seems to be related (stack trace below).

There are try/catch blocks that attempt to catch this exception, but it always appears in the logs. Is it possible to catch this?

I also noticed it is translated to a "null" IgniteException somewhere; I'm not sure if this is the correct behavior.

2016-04-07 17:51:38,505 [sys-#30%hapnin-grid%] ERROR task.GridTaskWorker  - Failed to obtain remote job result policy for result from ComputeTask.result(..) method (will fail the whole task): GridJobResultImpl [job=C2 [], sib=GridJobSiblingImpl [sesId=45f99d1f351-193a6965-27a9-43f0-bb84-af7d67c5e55b, jobId=55f99d1f351-4d30f145-aaa5-42b6-9194-27b6d20ac4bf, nodeId=4d30f145-aaa5-42b6-9194-27b6d20ac4bf, isJobDone=false], jobCtx=GridJobContextImpl [jobId=55f99d1f351-4d30f145-aaa5-42b6-9194-27b6d20ac4bf, timeoutObj=null, attrs={}], node=TcpDiscoveryNode [id=4d30f145-aaa5-42b6-9194-27b6d20ac4bf, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.30.1.34], sockAddrs=[dev2.localdomain/172.30.1.34:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /172.30.1.34:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1460051494963, loc=false, ver=1.5.0#20151229-sha1:f1f8cda2, isClient=false], ex=class o.a.i.IgniteException: null, hasRes=true, isCancelled=false, isOccupied=true]
class org.apache.ignite.IgniteException: Remote job threw user exception (override or implement ComputeTask.result(..) method if you would like to have automatic failover for this exception).
        at org.apache.ignite.compute.ComputeTaskAdapter.result(ComputeTaskAdapter.java:101)
        at org.apache.ignite.internal.processors.task.GridTaskWorker$3.apply(GridTaskWorker.java:909)
        at org.apache.ignite.internal.processors.task.GridTaskWorker$3.apply(GridTaskWorker.java:902)
        at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6429)
        at org.apache.ignite.internal.processors.task.GridTaskWorker.result(GridTaskWorker.java:902)
        at org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:798)
        at org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:995)
        at org.apache.ignite.internal.processors.task.GridTaskProcessor$JobMessageListener.onMessage(GridTaskProcessor.java:1219)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:821)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:103)
        at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:784)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteException: null
        at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1792)
        at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
        at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6397)
        at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
        at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1166)
        at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1770)
        ... 6 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.ignite.internal.processors.service.GridServiceProxy$ServiceProxyCallable.call(GridServiceProxy.java:382)
        at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1789)
        ... 13 more
Caused by: org.grails.ignite.DistributedRunnableException: invalid pattern: "0 0 2 * * ?"
        at org.grails.ignite.DistributedSchedulerServiceImpl.scheduleWithCron(DistributedSchedulerServiceImpl.java:162)
        ... 19 more
Caused by: org.grails.ignite.DistributedRunnableException: invalid pattern: "0 0 2 * * ?"
        at org.grails.ignite.DistributedScheduledThreadPoolExecutor.scheduleWithCron(DistributedScheduledThreadPoolExecutor.java:67)
        at org.grails.ignite.DistributedSchedulerServiceImpl.scheduleWithCron(DistributedSchedulerServiceImpl.java:149)
        ... 19 more
Caused by: it.sauronsoftware.cron4j.InvalidPatternException: invalid pattern: "0 0 2 * * ?"
        at it.sauronsoftware.cron4j.SchedulingPattern.<init>(Unknown Source)
        at it.sauronsoftware.cron4j.Scheduler.schedule(Unknown Source)
        at it.sauronsoftware.cron4j.Scheduler.schedule(Unknown Source)
        at org.grails.ignite.DistributedScheduledThreadPoolExecutor.scheduleWithCron(DistributedScheduledThreadPoolExecutor.java:61)
        ... 20 more
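
A side note on the innermost cause in this trace: as far as I know, cron4j parses five-field UNIX crontab patterns (minute, hour, day of month, month, day of week), while "0 0 2 * * ?" has six fields plus a "?" and looks like a Quartz-style expression. The sketch below is only a JDK-level pre-check under that assumption. It does not call cron4j, and the names CronPatternCheck, fields and looksLikeUnixCron are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of cron4j or grails-ignite.
class CronPatternCheck {
    /** Splits a pattern on whitespace into its individual fields. */
    static List<String> fields(String pattern) {
        return Arrays.asList(pattern.trim().split("\\s+"));
    }

    /** True if the pattern has exactly five fields and no '?' wildcard. */
    static boolean looksLikeUnixCron(String pattern) {
        return fields(pattern).size() == 5 && !pattern.contains("?");
    }

    public static void main(String[] args) {
        // The pattern from the stack trace: six fields and a '?'.
        System.out.println(looksLikeUnixCron("0 0 2 * * ?")); // false
        // A five-field pattern for 02:00 every day.
        System.out.println(looksLikeUnixCron("0 2 * * *"));   // true
    }
}
```

If the intent was "every day at 02:00", the five-field form would be "0 2 * * *".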
vkulichenko

Hi,

The try-catch block in the scheduleWithCron method just wraps the original exception in a DistributedRunnableException and rethrows it. The exception is then propagated to the node that invoked the service proxy.

Do you expect different behavior?

-Val
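
To illustrate the wrapping described above, here is a minimal JDK-only sketch. It shows two things: a reflective call wraps whatever the target method throws in an InvocationTargetException whose own message is null, and the original exception can still be recovered by walking the cause chain on the calling side. That this is where the "IgniteException: null" text in the log comes from is my assumption, not something confirmed in this thread. CauseChainDemo, schedule and rootCause are hypothetical names, and IllegalArgumentException stands in for the real exception types.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical demo class; it does not use Ignite or grails-ignite.
class CauseChainDemo {
    /** Stand-in for a service method that rejects its input. */
    static void schedule(String pattern) {
        throw new IllegalArgumentException("invalid pattern: \"" + pattern + "\"");
    }

    /** Follows getCause() links down to the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    /** Calls schedule() through reflection, as a proxy would, and returns what was thrown. */
    static Throwable invokeReflectively(String pattern) {
        try {
            Method m = CauseChainDemo.class.getDeclaredMethod("schedule", String.class);
            m.invoke(null, pattern);
            return null; // not reached here, because schedule() always throws
        } catch (InvocationTargetException e) {
            return e; // the wrapper added by the reflective call
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Throwable wrapper = invokeReflectively("0 0 2 * * ?");
        System.out.println(wrapper.getMessage());            // null
        System.out.println(rootCause(wrapper).getMessage()); // invalid pattern: "0 0 2 * * ?"
    }
}
```

A caller that wants the real reason can log rootCause(e) rather than e.getMessage().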
dstieglitz

Hi guys:

So, we've investigated this a bit further and we think the service is actually working, but the issue is that our debug display is showing null for some objects. We think this is because the service and those objects live on another node, and we're seeing null because they are not serializing across the grid.

Is that possible? If there are some objects in the service that don't serialize and you try to access them from a different node, would they just print out as null?

Dan
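
The effect asked about here can be reproduced with plain Java serialization, used below as a stand-in for whatever marshaller the grid applies. A field marked transient is skipped when the object is written, so the copy that arrives elsewhere sees null even though the original still holds a live reference. I have not checked how the field in the grails-ignite classes is declared, so treat this as a sketch of the general mechanism only. TransientFieldDemo, Task and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo; plain JDK serialization stands in for the grid's marshaller.
class TransientFieldDemo {
    /** Stand-in for a scheduled task that holds a node-local executor. */
    static class Task implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        // Thread pools are not serializable, so the reference is transient.
        transient ExecutorService executor;

        Task(String name, ExecutorService executor) {
            this.name = name;
            this.executor = executor;
        }

        String status() {
            return name + ": " + (executor == null ? "NULL EXECUTOR" : "executor present");
        }
    }

    /** Writes the object to bytes and reads it back, as a copy on another node would be built. */
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Task local = new Task("nightly", pool);
            Task remoteView = roundTrip(local);
            System.out.println(local.status());      // nightly: executor present
            System.out.println(remoteView.status()); // nightly: NULL EXECUTOR
        } finally {
            pool.shutdown();
        }
    }
}
```

The original object keeps working while the copy reports a null executor, which would match a scheduler that runs fine but shows null in a status display.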
vkulichenko

Dan,

What exactly is not serialized? As Dmitry pointed out earlier, the service state is not preserved when the service is redeployed, so you should reinitialize it in the init() method. If you still need to share the state, you can use the cache.

-Val
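
The redeployment behavior described in this thread can be modeled without a cluster. The sketch below uses a made-up MiniService interface as a stand-in for the service lifecycle (it is not the Ignite API) and simulates a topology change by discarding one instance and starting a fresh one. The point it tries to show is that node-local state has to be rebuilt in init() on every deployment, because nothing is carried over from the previous instance.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Hypothetical model of the lifecycle; no Ignite classes are used.
class RedeployDemo {
    /** Stand-in for a grid service lifecycle. */
    interface MiniService {
        void init();
        void cancel();
    }

    static class SchedulerService implements MiniService {
        // Node-local state: created in init(), never moved between nodes.
        private ScheduledExecutorService executor;
        private int initCount;

        @Override
        public void init() {
            executor = Executors.newSingleThreadScheduledExecutor();
            initCount++;
        }

        @Override
        public void cancel() {
            if (executor != null) {
                executor.shutdownNow();
                executor = null;
            }
        }

        boolean isReady() {
            return executor != null;
        }

        int initCount() {
            return initCount;
        }
    }

    /** Simulates a topology change: the old instance goes away and a new one is initialized. */
    static SchedulerService redeploy(SchedulerService old) {
        old.cancel(); // only possible if the old node is still alive
        SchedulerService fresh = new SchedulerService();
        fresh.init(); // state is rebuilt from scratch here
        return fresh;
    }

    public static void main(String[] args) {
        SchedulerService first = new SchedulerService();
        first.init();
        SchedulerService second = redeploy(first);
        System.out.println(first.isReady());    // false
        System.out.println(second.isReady());   // true
        System.out.println(second.initCount()); // 1
        second.cancel();
    }
}
```

Anything that must survive the move (for example, the list of scheduled jobs) would have to live outside the instance, such as in a cache, and be reloaded in init().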
dstieglitz

If you look at the line below:

https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/IgniteCronDistributedRunnableScheduledFuture.java#L79

We're seeing the string "NULL EXECUTOR" in our status. But based on the way the classes are initialized I don't think it's possible for that reference to be null. Also we've observed the scheduler working, so at this point I think our main issue was confusion caused by this seemingly null reference.

I'm not sure exactly what is not serialized; all we see is that this null check evaluates to true.
yakov

Guys,

It seems there can be a race condition between service method calls and initialization - org/apache/ignite/internal/processors/service/GridServiceProcessor.java:921

Alex G, Val, can you please check whether the service may be called prior to its initialization?

Dan, can you please add the service instance's identity hash code to the output in init() and the other service methods? Something like: System.out.println("Inside service XXX method [thread=" + Thread.currentThread().getName() + ", hash=" + System.identityHashCode(this) + ']');

--Yakov
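
A slightly fuller version of the trace line suggested above, as a sketch. It also adds a CountDownLatch so that a method called before init() has finished reports that fact instead of silently reading a null field. The latch is a hypothetical application-side guard, not something Ignite provides, and InitTraceDemo, trace and scheduleSomething are made-up names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical service skeleton for debugging call ordering.
class InitTraceDemo {
    // Released once init() has completed.
    private final CountDownLatch initialized = new CountDownLatch(1);
    private Object executor;

    /** Builds the trace line for the given method name. */
    String trace(String method) {
        return "Inside service " + method + " method [thread="
            + Thread.currentThread().getName()
            + ", hash=" + System.identityHashCode(this) + ']';
    }

    void init() {
        System.out.println(trace("init"));
        executor = new Object(); // placeholder for the real executor
        initialized.countDown();
    }

    /** Returns false if init() has not completed within the timeout. */
    boolean scheduleSomething(long timeoutMillis) {
        System.out.println(trace("scheduleSomething"));
        try {
            if (!initialized.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // called before init() finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return executor != null;
    }

    public static void main(String[] args) {
        InitTraceDemo svc = new InitTraceDemo();
        System.out.println(svc.scheduleSomething(10)); // false
        svc.init();
        System.out.println(svc.scheduleSomething(10)); // true
    }
}
```

If the hash printed in a service method differs from the one printed in init(), the call reached a different instance than the one that was initialized.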


dstieglitz

OK, I added the debug statements:

https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/DistributedSchedulerServiceImpl.java

Let me know if you want me to report anything from our application.
yakov

Please provide app logs after the issue gets reproduced.

--Yakov
