Continuous Query remote listener misses some events or responds really late

begineer

Hi,
I am currently facing an intermittent issue with continuous queries. I can't reliably reproduce it, but if anyone has faced this issue, please let me know.
My application is deployed on 12 nodes, with 5-6 services that each detect their respective events using a continuous query.
Let's say I have a cache of type
Cache<Long, Trade>, where Trade looks like this:
class Trade {
    int pkey;
    String type;
    ....
    TradeState state; // enum
}
Each CQ detects a new entry in the cache (with an updated state) and checks whether the trade's state matches its remote filter criteria.
A trade moves from state1 to state5. Each CQ listens for one state, does some processing, and moves the trade to the next state, where the next CQ detects it and acts accordingly.
The problem is that sometimes a trade gets stuck in some state and does not move. I have put log statements in the remote filter's predicate method (which checks the filter criteria), but these logs are not printed to the console. Sometimes the CQ detects events only after 4-5 hours.
I am using Ignite 1.8.2.
Has anyone seen this behavior? I would be grateful for any help.
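
For readers following along, the state pipeline described above can be sketched in plain Java. This is only a simulation under stated assumptions: SimulatedCache, listen, buildPipeline and the STATE1..STATE5 names are hypothetical stand-ins, not Ignite classes, and every event is delivered synchronously and reliably, which is exactly what the real cluster does not guarantee here.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongConsumer;

/** Plain-Java simulation of the state pipeline; none of these classes are Ignite APIs. */
public class StatePipelineSketch {
    /** Hypothetical state names standing in for state1..state5. */
    enum TradeState { STATE1, STATE2, STATE3, STATE4, STATE5 }

    /** Hypothetical stand-in for the cache: one listener per state, invoked on every put. */
    static class SimulatedCache {
        private final Map<Long, TradeState> data = new HashMap<>();
        private final Map<TradeState, LongConsumer> listeners = new EnumMap<>(TradeState.class);
        final List<String> history = new ArrayList<>();

        void listen(TradeState state, LongConsumer listener) {
            listeners.put(state, listener);
        }

        void put(long key, TradeState state) {
            data.put(key, state);
            history.add(key + ":" + state);
            // Plays the role of "remote filter matched, local listener called".
            LongConsumer listener = listeners.get(state);
            if (listener != null)
                listener.accept(key);
        }

        TradeState get(long key) {
            return data.get(key);
        }
    }

    /** Wires one listener per state; each one moves the trade to the next state. */
    static SimulatedCache buildPipeline() {
        SimulatedCache cache = new SimulatedCache();
        TradeState[] states = TradeState.values();
        for (int i = 0; i < states.length - 1; i++) {
            TradeState next = states[i + 1];
            cache.listen(states[i], key -> cache.put(key, next));
        }
        return cache;
    }

    public static void main(String[] args) {
        SimulatedCache cache = buildPipeline();
        cache.put(42L, TradeState.STATE1);
        System.out.println("final state: " + cache.get(42L));
        System.out.println("history: " + cache.history);
    }
}
```

In this simulation a single put in STATE1 cascades through all five states. If any one listener call is skipped, the trade stays in the state it last reached, which matches the "stuck trade" symptom.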
Sasha Belyak

Hi,
I tried to reproduce it on one host (with 6 Ignite server nodes), but everything works fine for me. Can you share your Ignite configuration, cache configuration, logs, or a reproducer?

2017-05-02 15:48 GMT+07:00 begineer <[hidden email]>:
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Continuous-Query-remote-listener-misses-some-events-or-respond-really-late-tp12338.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

begineer

Hi, thanks for looking into this. It's not easily reproducible; I only see it sometimes. Here are my cache and service configurations.

Cache configuration:

readThrough="true"
writeThrough="true"
writeBehindEnabled="true"
writeBehindFlushThreadCount="5"
backups="1"
readFromBackup="true"

Service configuration:

maxPerNodeCount="1"
totalCount="1"

Cache is distributed over 12 nodes.

Sasha Belyak

1) How do you use ContinuousQuery: with an initialQuery or without?
2) Did any nodes disconnect when you lost updates?
3) Did you log entries in CQ.localListener? Just to be sure that the error is in the CQ logic and not in your service logic.
4) Can someone update old entries? Maybe they just get into the CQ again after 4-5 hours because of an external update?

2017-05-03 17:13 GMT+07:00 begineer <[hidden email]>:
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Continuous-Query-remote-listener-misses-some-events-or-respond-really-late-tp12338p12382.html

begineer

1) How do you use ContinuousQuery: with an initialQuery or without?
   --- With an initial query that has the same predicate.
2) Did any nodes disconnect when you lost updates?
   --- No.
3) Did you log entries in CQ.localListener? Just to be sure that the error is in the CQ logic and not in your service logic.
   --- There are no log entries in the remote filter, nor in the local listener.
4) Can someone update old entries? Maybe they just get into the CQ again after 4-5 hours because of an external update?
   --- I tried adding the same events again just to trigger the event; sometimes the trade moves ahead (event discovered), sometimes it gets stuck in the same state.
Also, the CQ detects them on its own after the long delay mentioned; we don't add any events in that case.
Regards,
Surinder
Sasha Belyak

Can you share your log files?

2017-05-03 19:05 GMT+07:00 begineer <[hidden email]>:
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Continuous-Query-remote-listener-misses-some-events-or-respond-really-late-tp12338p12387.html

begineer

Umm, actually nothing gets logged in such a scenario. However, as you indicated earlier, I could see trades get stuck if a node leaves the grid (though not always). Do you know why that happens? Is that a bug?
Sasha Belyak

If the node with the CQ leaves the grid (or just reconnects to the grid, if it is a client node), you should recreate the CQ, because some cache updates can happen while the node with the CQ listener can't receive them. What happens in this case:
1) The node with the changed cache entry processes the CQ, the entry passes the remote filter, and the node tries to send a continuous query event message to the CQ node.
2) If the sender node can't deliver the message for any reason (the sender will retry a few times), it can't wait for the receiver too long, so it drops the message.
3) After the CQ node returns to the cluster, it must recreate the CQ so that the initialQuery is processed and picks up such events.
If you are sure that no CQ owner node left the grid, we need to keep investigating, because it may be a bug.
And yes, I agree it is not obvious that you must recreate the CQ after a client reconnect, but that is how Ignite works now.
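
The three steps above can be illustrated with a small plain-Java simulation. It is a sketch under stated assumptions, not Ignite code: SimulatedCache, startQuery and setListenerReachable are hypothetical names, events are pushed exactly once, and an event sent while the listener is unreachable is simply dropped. Re-running startQuery stands in for recreating the CQ with its initialQuery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

/** Plain-Java simulation of dropped CQ events; none of these classes are Ignite APIs. */
public class CqRecreateSketch {
    /** Hypothetical stand-in for a cache with a single continuous-query listener. */
    static class SimulatedCache {
        private final Map<Long, String> data = new TreeMap<>();
        private Consumer<Long> listener;
        private boolean listenerReachable;

        /** Registers the listener and returns the "initial query" result: keys already CHANGED. */
        List<Long> startQuery(Consumer<Long> newListener) {
            listener = newListener;
            listenerReachable = true;
            List<Long> existing = new ArrayList<>();
            for (Map.Entry<Long, String> e : data.entrySet())
                if ("CHANGED".equals(e.getValue()))
                    existing.add(e.getKey());
            return existing;
        }

        void setListenerReachable(boolean reachable) {
            listenerReachable = reachable;
        }

        void put(long key, String status) {
            data.put(key, status);
            // The event is pushed once; if the listener node is unreachable, it is dropped.
            if ("CHANGED".equals(status) && listener != null && listenerReachable)
                listener.accept(key);
        }
    }

    /** Runs the scenario and returns the keys the listener processed, in order. */
    static List<Long> run(boolean recreateAfterReconnect) {
        SimulatedCache cache = new SimulatedCache();
        List<Long> processed = new ArrayList<>();
        // Skip keys that were already processed, so a repeated initial query is harmless.
        Consumer<Long> listener = key -> {
            if (!processed.contains(key))
                processed.add(key);
        };

        cache.startQuery(listener).forEach(listener); // initial query, nothing to pick up yet
        cache.put(1L, "CHANGED");                     // delivered
        cache.setListenerReachable(false);            // listener node drops out
        cache.put(2L, "CHANGED");                     // dropped
        cache.setListenerReachable(true);             // listener node is back
        if (recreateAfterReconnect)
            cache.startQuery(listener).forEach(listener); // initial query recovers key 2
        cache.put(3L, "CHANGED");                     // delivered
        return processed;
    }

    public static void main(String[] args) {
        System.out.println("without recreate: " + run(false));
        System.out.println("with recreate:    " + run(true));
    }
}
```

Under these assumptions, run(false) never sees key 2, while run(true) recovers it through the repeated initial query. The listener is written to tolerate seeing the same key twice, since a recreated query returns entries that were already handled.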

2017-05-05 16:56 GMT+07:00 begineer <[hidden email]>:
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Continuous-Query-remote-listener-misses-some-events-or-respond-really-late-tp12338p12452.html

begineer

Thanks. In my application, all nodes are server nodes.
And how can we be sure that the node that was removed from or reconnected to the grid is the CQ node? It can be any node.
Also, is this issue possible in all of the scenarios below?
1. The node happens to be the CQ node, or any node?
2. The node is removed from the grid forcefully (manual shutdown).
3. The node went down for some reason and the grid dropped it.

The 3rd one looks like the safe option: since the node is dropped by the grid, the grid should be aware of where to shift the CQ. Please correct me if I am wrong.
Sasha Belyak

As far as I understand, you create the CQ in Service.init(), so the node running the service is the CQ node. All other nodes in the grid will send CQ events to this node to be processed in your service, and if you don't configure a nodeFilter for the service, any node can run it, so any node can be the CQ node.
But it shouldn't be a problem if you create the CQ in Service.init() and don't have too heavy a load on your cluster (anyway, if a data owner node fails to deliver messages to the node running the service (the CQ node), you should see it in the logs). If you give some code examples of how you use the CQ, I can say more.

2017-05-05 17:59 GMT+07:00 begineer <[hidden email]>:
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Continuous-Query-remote-listener-misses-some-events-or-respond-really-late-tp12338p12454.html

begineer

Hi, sorry for the late reply. The CQ is set up in the execute() method of the service, not in init(), but we do have an initialQuery in the CQ to scan existing entries matching the filter. Below is a snapshot of one of the many Ignite services, each set up to process a trade when it moves to a particular status.

As you can see, I have added logging to the remote filter predicate. But these logs don't get printed when a trade gets stuck at a particular status. So I assume the remote filter does not pick up the events it is supposed to track.

public enum TradeStatus {
        NEW, CHANGED, EXPIRED, FAILED, UNCHANGED, SUCCESS
}


/**
 * Ignite Service which picks up CHANGED trade delivery items
 */
public class ChangedTradeService implements Service {

        @IgniteInstanceResource
        private transient Ignite ignite;
        private transient IgniteCache<Long, Trade> tradeCache;
        private transient QueryCursor<Cache.Entry<Long, Trade>> cursor;
        // status: the TradeStatus this service handles (declaration not included in this snapshot)

        @Override
        public void init(ServiceContext serviceContext) throws Exception {
                tradeCache = ignite.cache("tradeCache");
        }

        @Override
        public void execute(ServiceContext serviceContext) throws Exception {
                ContinuousQuery<Long, Trade> query = new ContinuousQuery<>();
                // Local listener: runs on this node for every event that passes the remote filter
                query.setLocalListener((CacheEntryUpdatedListenerAsync<Long, Trade>) events -> events
                                .forEach(event -> process(event.getValue())));
                // Remote filter: evaluated on the nodes that own the data
                query.setRemoteFilterFactory(factoryOf(checkStatus(status)));
                // Initial query: picks up entries that already match when the CQ starts
                query.setInitialQuery(new ScanQuery<>(checkStatusPredicate(status)));
                // Assign the field (not a new local variable) so that cancel() can close the cursor
                cursor = tradeCache.query(query);
                cursor.forEach(entry -> process(entry.getValue()));
        }

        private void process(Trade item) {
                LOG.info("transition started for trade id :" + item.getPkey());
                //move the trade to next state(e.g SUCCESS) and next Service(contains CQ, which is looking for SUCCESS status) will pick this up for processing further and so on
                LOG.info("transition finished for trade id :" + item.getPkey());
        }

        @Override
        public void cancel(ServiceContext serviceContext) {
                // The cursor is null if cancel() is called before execute() has created the query
                if (cursor != null)
                        cursor.close();
        }

        static CacheEntryEventFilterAsync<Long, Trade> checkStatus(TradeStatus status) {
                return event -> event.getValue() != null && checkStatusPredicate(status).apply(event.getKey(), event.getValue());
        }

        // The predicate receives the cache value (a Trade), so it is typed on Trade, not TradeStatus
        static IgniteBiPredicate<Long, Trade> checkStatusPredicate(TradeStatus status) {
                return (k, v) -> {
                        LOG.debug("Status checking for: {} Event value: {} isStatus: {}", status, v, v.getStatus() == status);
                        return v.getStatus() == status;
                };
        }
}
Sasha Belyak

Thanks for your reply. From the code I see that you log only entries with non-null values. If you are absolutely sure that you never put null into the cache, I will create a load test to reproduce it and file an issue for you. But it would be great if you moved the logging before the event.getValue() != null check.
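
A minimal sketch of that suggestion, in plain Java rather than against the Ignite API: Event, checkStatus and the List<String> used as a log are hypothetical stand-ins for the cache event, the remote filter and the real logger. The only point is the ordering, with the log call placed before the null check so that every filter invocation leaves a trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Plain-Java sketch of "log before the null check"; none of these classes are Ignite APIs. */
public class FilterLoggingSketch {
    enum TradeStatus { NEW, CHANGED, EXPIRED, FAILED, UNCHANGED, SUCCESS }

    /** Hypothetical minimal event: a key plus a status that is null when there is no value. */
    static class Event {
        final long key;
        final TradeStatus status;

        Event(long key, TradeStatus status) {
            this.key = key;
            this.status = status;
        }
    }

    /** Builds a filter that records every call in the given log before deciding. */
    static Predicate<Event> checkStatus(TradeStatus wanted, List<String> log) {
        return event -> {
            // Log first: a call with a null value is now visible instead of silently filtered.
            log.add("filter called: key=" + event.key + " status=" + event.status);
            return event.status != null && event.status == wanted;
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Predicate<Event> filter = checkStatus(TradeStatus.CHANGED, log);
        System.out.println(filter.test(new Event(1L, null)));
        System.out.println(filter.test(new Event(2L, TradeStatus.CHANGED)));
        log.forEach(System.out::println);
    }
}
```

With the log call first, an empty log means the filter was never invoked for that update, while a log line with a null status means the filter ran and rejected the event. Those two cases are indistinguishable when logging happens only after the null check.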

On Wednesday, June 7, 2017, begineer wrote:
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Continuous-Query-remote-listener-misses-some-events-or-respond-really-late-tp12338p13476.html
begineer

Hi,
Thanks, I will move the logging as suggested. And that is correct, we don't store null in caches.