Slow invoke call

javastuff.sam@gmail.com

Slow invoke call

Hi,

I have a use case where we store a byte array in an OFFHEAP ATOMIC cache.
Multiple threads keep appending bytes to it using a remote operation
(entry processor / invoke call).

Below is the logic for remote operation.

if (mutableEntry.exists()) {
        MyObject m = (MyObject) mutableEntry.getValue();
        byte[] oldBytes = m.getContent();
        byte[] newBytes = newContent.getContent();
        // Allocate a new array for the old content plus the appended content.
        byte[] bytes = new byte[oldBytes.length + newBytes.length];
        System.arraycopy(oldBytes, 0, bytes, 0, oldBytes.length);
        System.arraycopy(newBytes, 0, bytes, oldBytes.length, newBytes.length);
        m.setContent(bytes);
        mutableEntry.setValue(m);
} else {
        mutableEntry.setValue(newContent);
}

We tested by adding about 25MB of content.

What we have seen:
1. In a test with 8 threads, the average time for invoke() is 36ms.
2. Of the total time taken to execute invoke(), 50% is spent in the remote
logic above. Where is the other 50% spent? Probably just in invoking the
method.

What can we tune here? We are considering invokeAll(); any more ideas?

If we increase this to 250MB, performance becomes very slow and after
hours it runs out of memory. setOffHeapMaxMemory(0) does not help on a 32GB
box.

We are using Ignite 1.9, Java 1.8, Parallel GC, and the app JVM is 4GB.
Unfortunately, we cannot upgrade the Ignite version for now.

Thanks,
-Sam




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Pavel Vinokurov

Re: Slow invoke call

Hi Sam,

When you store entries off-heap, any update operation requires copying the value from off-heap memory to the heap.
Frequent updates of a "heavy" entry therefore add significant overhead for copying data.
Have you tested with heap memory only?

Thanks,
Pavel

(In reply to the message of 2018-04-05 4:44 GMT+03:00 from [hidden email].)

javastuff.sam@gmail.com

Re: Slow invoke call

Thanks for the reply, Pavel.

"When you store entries into the off-heap, any update operation requires
copying value from the off-heap to the heap."
I am updating through a remote entry processor. Does that also need to copy
the value to the caller's heap? If so, there is no benefit in using an entry
processor here; I could fetch and put instead, probably with a lock.

Why is invoking the remote method taking 50% of the time? Is it because of
concurrency, with the entry processor executing under a lock internally?

Copying the existing and incoming bytes must also be generating a lot of
garbage for the GC.

Is there a better way to deal with this use case? Right now it seems slower
than DB updates.

Thanks,
-Sam



Pavel Vinokurov

Re: Slow invoke call

Sam,

>> Why is invoking the remote method taking 50% of the time? Is it because of
>> concurrency, with the entry processor executing under a lock internally?
Could you please share a small reproducer project?

>> Is there a better way to deal with this use case?
Have you tested with heap memory only?

Thanks,
Pavel




(In reply to the message of 2018-04-06 2:39 GMT+07:00 from [hidden email].)

vkulichenko

Re: Slow invoke call

In reply to this post by javastuff.sam@gmail.com
Hi Sam,

What does this byte array represent? You have two array copies in the entry
processor, so I'm not surprised it doesn't perform very well. That's also
the reason why it gets worse when the size of the array is increased. The fact
that you're using off-heap also makes it worse in this case, as it basically
adds another two copies: one to read from off-heap and one to write back.
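
To put rough numbers on why the cost grows with the array size, here is a minimal sketch that counts only the heap bytes copied by the append logic from the original post. The 64 KB chunk size and the class and method names are assumptions for illustration; the extra off-heap read and write-back copies are not counted. Because every append copies all existing content plus the new chunk, the total grows roughly quadratically with the final size.

```java
public class AppendCopyCost {
    /**
     * Heap bytes copied when a value grows to totalBytes through appends of
     * chunkBytes each, where every append copies the old content and the new
     * content into a freshly allocated array.
     */
    public static long bytesCopied(long totalBytes, long chunkBytes) {
        long copied = 0;
        long current = 0;
        while (current < totalBytes) {
            long chunk = Math.min(chunkBytes, totalBytes - current);
            copied += current + chunk; // old content + appended content
            current += chunk;
        }
        return copied;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long chunk = 64L * 1024L; // hypothetical size of each append
        System.out.println("25 MB total:  " + bytesCopied(25 * mb, chunk) / mb + " MB copied");
        System.out.println("250 MB total: " + bytesCopied(250 * mb, chunk) / mb + " MB copied");
    }
}
```

With a fixed chunk size, growing the final value 10 times multiplies the copied bytes by close to 100, which is consistent with the much worse behavior reported at 250MB.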

I would consider revisiting your data model. Is it possible, for example, to
create a new entry instead of appending to an existing array?

-Val



javastuff.sam@gmail.com

Re: Slow invoke call

Thanks Val.

I understand that array copy is a heavy operation and probably involves a lot
of memory allocation too. However, my profiler shows the complete
copy-and-append logic taking 50% of the total time of the invoke call. Hence
the question: should invoke take this much time, or is it the concurrency
needed for the atomic operation that is killing it?

I have already tried putting separate entries instead of appending to a single
byte array. However, this approach needs more logic to keep the sequence, plus
locking or synchronization during fetch or remove.
In a quick implementation of this new approach, I used a scan query with a
filter on the key for the fetch and remove calls. As expected, put was faster
(no entry processor, no array copy); however, I faced an issue with the scan
query. Probably one thread was iterating over the scan query while another
tried to put, and that is where the scan query bailed out with an exception.
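
One possible shape for the separate-entries approach that avoids the scan query is to key each chunk by id plus a sequence number and read it back with direct key lookups. The sketch below is only an illustration in plain Java: the two ConcurrentHashMap instances stand in for the cache and for a per-id sequence, and the class name, key format, and method names are all hypothetical. In Ignite the sequence would need a cluster-wide counter, which is not shown or tested here.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

public class ChunkedAppend {
    // Stand-ins for the cache and for a per-id sequence (both hypothetical).
    private final ConcurrentMap<String, byte[]> chunks = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, AtomicLong> nextSeq = new ConcurrentHashMap<>();

    private static String chunkKey(String id, long seq) {
        return id + "#" + seq;
    }

    /** Append: reserve the next sequence number, then put one new entry. */
    public void append(String id, byte[] content) {
        long seq = nextSeq.computeIfAbsent(id, k -> new AtomicLong()).getAndIncrement();
        chunks.put(chunkKey(id, seq), content.clone());
    }

    /** Read chunks by direct key lookup in sequence order. */
    public byte[] read(String id) {
        AtomicLong counter = nextSeq.get(id);
        long count = counter == null ? 0 : counter.get();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (long seq = 0; seq < count; seq++) {
            byte[] chunk = chunks.get(chunkKey(id, seq));
            if (chunk == null) {
                break; // sequence number reserved, but the chunk is not stored yet
            }
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        ChunkedAppend store = new ChunkedAppend();
        store.append("doc-1", new byte[] {1, 2});
        store.append("doc-1", new byte[] {3});
        System.out.println(Arrays.toString(store.read("doc-1")));
    }
}
```

Each append writes only the new bytes, and a reader that stops at the first missing chunk sees a consistent prefix without any locking. Removal and cleanup of chunks would still need their own logic.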



I am going to tweak this usecase further to get better results, any
ideas/input will be appreciated.

Thanks,
-Sambhav



vkulichenko

Re: Slow invoke call

Sam,

The entry processor is indeed executed within a lock; this is required to
achieve atomicity. So if there is high contention on a single key, requests
will wait for each other. This is yet another reason to make the entry
processor implementation as lightweight as possible, so that it does not
hold the lock for a long time.
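
The effect can be illustrated in plain Java, with ConcurrentHashMap.compute() standing in for the entry processor. This is an analogy under that assumption, not Ignite code, and the class and method names are hypothetical. compute() runs the update function atomically for the key, so concurrent appends to the same key wait for each other and the array copy happens while the key is locked.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SingleKeyContention {
    /** Appends chunks to one key from several threads; returns the final array length. */
    public static int run(int threads, int appendsPerThread, int chunkSize) {
        ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        byte[] chunk = new byte[chunkSize];
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    // The lambda runs atomically for "key"; other updates to it wait.
                    cache.compute("key", (k, old) -> {
                        if (old == null) {
                            return chunk.clone();
                        }
                        byte[] merged = Arrays.copyOf(old, old.length + chunk.length);
                        System.arraycopy(chunk, 0, merged, old.length, chunk.length);
                        return merged;
                    });
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.get("key").length;
    }

    public static void main(String[] args) {
        System.out.println("final length: " + run(8, 100, 16));
    }
}
```

No append is lost, which is the atomicity the lock provides, but the threads effectively take turns on the key, so the longer each copy takes, the longer every other caller waits.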

Scan query should not fail with an exception if there is a concurrent
update. Do you have the trace?

-Val



javastuff.sam@gmail.com

Re: Slow invoke call

Sorry, I do not have a trace for the scan query.
We moved away from the earlier implementation, and as of now it is not showing
the large latencies we saw earlier.

Thank you for the help.

Thanks,
-Sam




