Is invokeAll() considered a batch operation?

javadevmtl

Is invokeAll() considered a batch operation?

So I have 27 keys to insert but using the same "business" logic...

In the performance tips, it says to "reduce the number of jobs from 100 to 10".

So instead of doing a single invoke() per key, if I do one invokeAll() for a list of keys, is that considered a performance improvement?
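For readers comparing the two call shapes, here is a minimal plain-Java sketch, not Ignite's actual implementation: the "cache" is just a HashMap and the processor is a plain BiFunction. It only shows the semantics (one call per key vs. one call for a key set returning a result per key).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

// Toy in-memory model of the two call shapes being compared.
class BatchSketch {
    final Map<String, String> store = new LinkedHashMap<>();

    // one call (and, in a real cache, one potential round trip) per key
    String invoke(String key, BiFunction<String, String, String> proc) {
        String result = proc.apply(key, store.get(key));
        store.put(key, result);
        return result;
    }

    // one call for the whole key set, returning a result per key
    Map<String, String> invokeAll(Set<String> keys, BiFunction<String, String, String> proc) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String k : keys) results.put(k, invoke(k, proc));
        return results;
    }
}
```

As confirmed later in the thread, Ignite's real invokeAll() can additionally pack the keys for each node into fewer network messages, which is where the batching win over per-key invoke() comes from.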







javadevmtl

Re: Is invokeAll() considered a batch operation?

The more keys I try to add in one call, the slower it gets, but the time per batch remains constant.

It's a partitioned cache but no backups...

final Context ctx = vertx.getOrCreateContext();
TreeSet<String> keys = new TreeSet<String>();
keys.add(key1);
keys.add(key2);
keys.add(key3);
keys.add(key4);
...

cache.<String>invokeAll(keys, (entry, args) -> {
// Do something cool here...
return myStringValue;
});

So to do something cool...
1 key takes 2ms to do something cool...
3 keys take 10ms
27 keys take 30ms

But the time remains constant at least.

vkulichenko

Re: Is invokeAll() considered a batch operation?

What do you mean by "constant time"? You update a batch of 3 in 10 ms, which means that it will take 30 ms to update 9 entries (3 batches). This is 3 times less than batch of 27 that also takes 30ms, so batching gives you performance improvement. Am I missing something?

-Val
javadevmtl

Re: Is invokeAll() considered a batch operation?

Sorry for the confusion...

So...

- Just to be clear, running Ignite 1.3.0 on 2 nodes with a total of 64 cores and 256GB of RAM
- Cache is configured as off-heap partitioned with no backups, 96GB off-heap, and each node is started with -Xmx4g
- Each app is a web server that receives a JSON request, parses the JSON, and inserts the JSON properties as different keys (27 total).
- Requests are load balanced to both nodes.

All times quoted below include full business logic and network time. The JSON payload sent always contains the 27 properties. Only the app is recompiled to invokeAll() for 1 key, 3 keys, 9 keys, etc.

Also, time remains constant from 0 records inserted all the way up to millions (so at least this is a good thing), which also means the JVM is sufficiently warmed up with millions of calls.

01 keys: 2ms
03 keys: 4ms
09 keys: 9ms
18 keys: 16ms
27 keys: 25ms

I was hoping for slightly better latency... Here is the code; maybe you have suggestions for improvements?

http://pastebin.com/pR36DwdG





Sergi Vladykin

Re: Is invokeAll() considered a batch operation?

As I see it, you have your field names as keys; you can think of them as locks.
I mean that your scalability is limited by these 27 locks, and you are acquiring them all for each cache update.
How many threads do you use for your latency test?
I'd suggest using something like a tuple (fieldName, fieldValue) as the key and just an integer (the number of such keys) as the value, which is incremented, or set to 1 if absent.

Sergi
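Sergi's counter idea can be sketched in plain Java as a stand-in for a cache EntryProcessor; the composite string key and the method name `record` are illustrative only, and in Ignite the key would be a small key class.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for the increment-or-init logic described above:
// key = (fieldName, fieldValue) pair, value = an int count that is
// incremented on each update, or set to 1 if absent.
class FieldValueCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    // Composite key; in a cache this would be a small key class or tuple.
    private static String key(String fieldName, String fieldValue) {
        return fieldName + "=" + fieldValue;
    }

    // Equivalent of invoking an increment-or-init processor on the entry.
    int record(String fieldName, String fieldValue) {
        return counts.merge(key(fieldName, fieldValue), 1, Integer::sum);
    }
}
```

The point of keeping the value a bare int rather than a HashSet is that the entry stays tiny and cheap to serialize on every update.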



Sergi Vladykin

Re: Is invokeAll() considered a batch operation?

By the way, you can also use a field id (just number your fields from 0 to 26) instead of storing field names in counter keys; it will give some serialization speedup and better memory utilization.

Sergi
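A sketch of the field-id idea, assuming the 27 field names are known up front; the field names below are placeholders, not the poster's actual fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Number the fields once, in a fixed order, and put the small int id
// into the counter key instead of the field-name string.
class FieldIds {
    private static final Map<String, Integer> IDS = new LinkedHashMap<>();
    static {
        // placeholder names; the real app has 27 of these
        String[] fields = {"firstName", "lastName", "mobileNumber"};
        for (String f : fields) IDS.put(f, IDS.size());
    }

    static int idOf(String fieldName) {
        Integer id = IDS.get(fieldName);
        if (id == null) throw new IllegalArgumentException("unknown field: " + fieldName);
        return id;
    }
}
```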



Sergi Vladykin

Re: Is invokeAll() considered a batch operation?

Sorry, just noticed that you already use field values as cache keys.
But simplifying the counters to ints instead of HashMaps of HashSets probably still makes sense to try.

Sergi




javadevmtl

Re: Is invokeAll() considered a batch operation?

In reply to this post by Sergi Vladykin
Not sure I get what you mean? The field names were renamed to protect the innocent, lol.

I'm using the actual value for a specific field as the key.
So for instance if I submit:
First Name: John
Last Name: Smith
Mobile Number: 555-555-5555
Home Number: 123-555-5555

cache.put("John Smith", /* HashSet of phone numbers here */);
I do this because I want to count how many unique numbers this "person" has.

As for threads, I use 32 "web" threads for Vert.x and the defaults for Ignite, which should be 32 per node.
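The unique-number counting described above can be modeled in plain Java; this is a local stand-in for the cache value, whereas in the real app the set lives in the cache entry and is mutated via invoke().

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Per-person set of phone numbers: the unique count is just the set size.
class UniquePhones {
    private final Map<String, Set<String>> byPerson = new HashMap<>();

    // Add a number for a person, returning how many unique numbers they have.
    int add(String person, String phone) {
        Set<String> phones = byPerson.computeIfAbsent(person, p -> new HashSet<>());
        phones.add(phone);
        return phones.size();
    }
}
```

This is the HashSet-valued design Sergi suggests replacing with int counters keyed by (fieldName, fieldValue), since a bare int is much cheaper to serialize per update.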
yakov

Re: Is invokeAll() considered a batch operation?

In reply to this post by Sergi Vladykin

As a side note - can it be that the lambda passed to consume() is executed in 1 thread?

I agree with Sergi that string keys should be replaced with ints.

Alex G, can you check how we map invoke in an atomic cache? Can it be that processing 27 keys ends up sending 13 or 14 messages if the keys map to nodes this way: local node, remote node, local, remote...?

Can we optimize this to send 1 message and then process local set?




javadevmtl

Re: Is invokeAll() considered a batch operation?

Hi Yakov...

Each physical server node has 32 cores. When I deploy my Vert.x application, I tell it to create 32 instances of consumer.handler.

Vert.x is much like the Node.js reactor pattern. Each request is put into an event loop, and each instance of consumer.handler will take 1 event from the loop and process it. Vert.x underneath handles all the dirty work.

So everything that runs inside the consumer would be as if it were 1 single thread. Though of course Ignite works in its own thread pool; that doesn't change.

As far as the keys go, it's arbitrary data, so it's not attached to an identifier.

Here are some questions I try to answer:

Given the person "John Smith", how many unique phone numbers does he have?
Given the phone number "555-555-5555", how many unique persons are associated with it?

The other way I was trying to do this was using a data model where each record is serialized, but secondary indexes were proving slow.






javadevmtl

Re: Is invokeAll() considered a batch operation?

Or can I maybe use a hash function that creates a long out of the key and use that as the key?

Something like...

public static long hash(String string) {
  long h = 1125899906842597L; // prime seed
  int len = string.length();

  for (int i = 0; i < len; i++) {
    h = 31*h + string.charAt(i);
  }
  return h; // note: different strings can still collide on the same long
}
javadevmtl

Re: Is invokeAll() considered a batch operation?

Converting each key to a long equivalent using the hash method I posted above didn't improve the performance; it stayed the same, so it takes about 25ms to execute the 27 invokes...
javadevmtl

Re: Is invokeAll() considered a batch operation?

Just curious, are you guys looking at what Yakov suggested about the mapping to 1 event?

Thanks
alexey.goncharuk

Re: Is invokeAll() considered a batch operation?

I have just confirmed by running a benchmark on a cluster that invokeAll in an ATOMIC cache does NOT linearly increase latency, because we do send all updates at once. Here are my results:

Batch size    Latency (ns)
1             459,957.47
2             497,306.94
4             494,442.05
8             507,194.52
16            555,202.75
32            643,215.03

If you are using a TRANSACTIONAL cache, you will probably end up in a situation where each next key travels to a separate node, which would explain the behavior you see. Note that Ignite cannot change the order of lock acquisition for a TRANSACTIONAL cache, in order to avoid deadlocks. To improve performance in this case, you need to group keys by partition so that the maximum number of keys is sent to a node in one network hop.
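Alexey's grouping advice can be sketched as follows. Here `partitionOf()` is a hash-based stand-in; in real Ignite code you would ask the cache's Affinity (`ignite.affinity(cacheName).partition(key)`) for the actual partition of each key.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Group keys by partition so each node receives its keys in one hop.
class PartitionBatcher {
    // Stand-in affinity function; replace with the cache's real affinity.
    static int partitionOf(String key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Returns partition -> keys; the caller would then issue one
    // invokeAll() per group instead of letting keys interleave across nodes.
    static Map<Integer, Set<String>> group(Set<String> keys, int partitions) {
        Map<Integer, Set<String>> byPartition = new TreeMap<>();
        for (String k : keys) {
            byPartition.computeIfAbsent(partitionOf(k, partitions), p -> new HashSet<>()).add(k);
        }
        return byPartition;
    }
}
```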


dsetrakyan

Re: Is invokeAll() considered a batch operation?

Thanks, Alexey! This is very useful.



yakov

Re: Is invokeAll() considered a batch operation?

According to the pastebin link, an atomic cache is used.



Sergi Vladykin

Re: Is invokeAll() considered a batch operation?

Alexey,

Did you use OFFHEAP in your benchmark?
Also, it is interesting how it will behave with offheap enabled for relatively large values, like HashSet<String> of different sizes.

Sergi
 




javadevmtl

Re: Is invokeAll() considered a batch operation?

Alexey, when you tested various key insertion sizes

1, 2, 4, 8, 16 etc...

Were you inserting serially in a loop or concurrently? I.e., were you calling invokeAll() one after the other, or did you call invokeAll() from multiple threads?

In my case each web request will call invokeAll() for 27 keys. So if I have 3 web requests, that means that invokeAll() will be called 3 times and a total of 81 keys will be written.

Right now my app is doing 3,380 web requests a second, so that basically generates 182,520 operations: 91,260 reads and 91,260 writes. I guess that's not bad. Is it good by your standards?

Wondering if maybe I should have a few caches, each for a group of keys, to reduce the concurrency on a single cache. So instead of doing 27 keys on 1 cache, maybe I can split them across more caches...
javadevmtl

Re: Is invokeAll() considered a batch operation?

Hi, based on the numbers I mentioned above, does it make sense?

Since I'm running potentially about 180,000 operations per second, is it affecting the concurrency of a single cache?

So maybe I should break my problem into multiple caches? So group a few inserts and reads to a specific cache.
Sergi Vladykin

Re: Is invokeAll() considered a batch operation?

I don't think splitting into multiple caches will give you any improvement.
If it does, then it must be a bug in Ignite's atomic cache.

Sergi

