Fwd: Ignite partitioned mode not scaling

rajan rajan

Fwd: Ignite partitioned mode not scaling



---------- Forwarded message ---------
From: Rajan Ahlawat <[hidden email]>
Date: Thu, Jan 2, 2020 at 4:05 PM
Subject: Ignite partitioned mode not scaling
To: <[hidden email]>


We are moving from replicated mode (a 1-node cluster) to a multi-node partitioned cluster.
Our assumption was that the maximum QPS we can reach would increase as nodes are added to the cluster.
We compared the QPS that partitioned mode can serve within 50 ms as the number of nodes in the cluster increases, and found that performance actually degraded.
We are using Ignite key-value caches as well as SQL caches, with most of the data in the SQL caches; no persistence is being used.

Please let us know what we are doing wrong or what can be done to make it scale.
Here are the results of the perf tests:

50 ms at the 95th percentile: comparison of partitioned-mode configurations


Response time in ms (the three "read" columns)

cache mode (partitioned) | QPS  | read from sql table | read from sql table with join | read from sql table
1-node                   | 2600 | 48                  | 46                            | 47
3-node                   | 2190 | 50                  | 48                            | 49
3-node-1-backup          | 2200 | 55                  | 53                            | 54
5-node                   | 2000 | 54                  | 52                            | 53
5-node-2-backup          | 1990 | 51                  | 49                            | 50
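
For reference, the relative change in QPS against the 1-node baseline can be worked out from the reported figures. The sketch below is only illustrative arithmetic on the numbers in the table; the class and method names are made up for this example and are not part of the benchmark.

```java
final class QpsChange {
    /** Percentage change of qps relative to baseline (negative means a drop). */
    static double percentChange(double baseline, double qps) {
        return (qps - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        String[] configs = {"3-node", "3-node-1-backup", "5-node", "5-node-2-backup"};
        double[] qps = {2190, 2200, 2000, 1990};
        double baseline = 2600; // 1-node result from the table
        for (int i = 0; i < configs.length; i++) {
            System.out.printf("%-16s %+.1f%%%n", configs[i], percentChange(baseline, qps[i]));
        }
    }
}
```

On these numbers the drop is roughly 15-16% for the 3-node runs and about 23% for the 5-node runs.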
mcherkasov mcherkasov

Re: Ignite partitioned mode not scaling

Hi Rajan,

Could you please share the benchmark code with us?
Do you run queries against the same number of records each time?
What host machines do you use for your nodes? When you say that you have 5 nodes, does it mean that you use 5 dedicated machines, one for each node?
Also, it might be that the benchmark itself is the bottleneck: your system may be able to handle more QPS, but you need to run the benchmark from several machines. Please try to use at least 2 hosts for the benchmark application and check whether there are any changes in QPS.

Thanks,
Mike.

rajan rajan

Re: Ignite partitioned mode not scaling

Hi Mikhail,

> could you please share the benchmark code with us?
I first load around a million records into the cache. Then, through the cache service classes directly, I fetch those records at random.

> do you run queries against the same amount of records each time?
Yes. 2600 QPS means that it picks 2600 records at random over one second and runs get queries against the SQL caches of different tables.

> what host machines do you use for your nodes? when you say that you have 5 nodes, does it mean that you use 5 dedicated machines for each node?
Yes, these are five dedicated Linux machines.

> Also, it might be that the benchmark itself is the bottleneck, so your system can handle more QPS, but you need to run a benchmark from several machines. Please try to use at least 2 hosts for the benchmark application and check if there are any changes in QPS.
As you can see in the table, I have tried different combinations of nodes, and as the number of nodes increases, the QPS of requests served under 50 ms goes down each time.


rajan rajan

Re: Ignite partitioned mode not scaling

When the QPS is above 2000, I use multiple hosts for the application that sends requests to the cache.
If the benchmark were the bottleneck, we shouldn't see a drop from 2600 to 2200 when we go from a 1-node to a 3-node cluster.

mcherkasov mcherkasov

Re: Ignite partitioned mode not scaling

What type of client do you use? Is it the JDBC thin driver?

It would be best if you could share the benchmark source code, so we can see what queries you use, what flags you set on the queries, and so on.

rajan rajan

Re: Ignite partitioned mode not scaling

We are using the following Ignite client libraries:

org.apache.ignite:ignite-core:2.6.0
org.apache.ignite:ignite-spring-data:2.6.0

The benchmark source code is pretty simple. It does the following:

A pool created with Executors.newFixedThreadPool(threadPoolSize) runs the request tasks behind a rateLimiter.
Each thread runs get queries against three cache IgniteRepository tables, something like this:
memberCacheRepository.getMemberCacheObjectByMemberUuid(memberUuid)

The cache is created during Spring Boot application startup via igniteCacheConfiguration, like this:
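import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

CacheConfiguration createSqlCacheConfig(String cacheName, String dataRegion) {
    CacheConfiguration sqlCacheConfig = new CacheConfiguration(cacheName);
    // No backup copies of each partition.
    sqlCacheConfig.setBackups(0);
    // A write returns once the primary node has applied it; backups are not awaited.
    sqlCacheConfig.setWriteSynchronizationMode(CacheWriteSynchronizationMode.PRIMARY_SYNC);
    sqlCacheConfig.setCacheMode(CacheMode.PARTITIONED);
    sqlCacheConfig.setDataRegionName(dataRegion);
    return sqlCacheConfig;
}

I am sorry, but I won't be able to share the complete code. Please let me know what specific information is required.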
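
A stripped-down version of the benchmark loop described above might look roughly like the following. This is a sketch using only the JDK, not the actual benchmark: the class name, the `lookup` callback (a hypothetical stand-in for a call such as memberCacheRepository.getMemberCacheObjectByMemberUuid) and the nearest-rank percentile are assumptions, and the rate limiter is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.LongConsumer;

final class LatencyBenchmarkSketch {
    /** Nearest-rank percentile (0 < pct <= 100) of the given latencies. */
    static long percentile(long[] latencies, double pct) {
        long[] sorted = latencies.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(0, rank - 1)];
    }

    /** Issues `requests` random-key lookups on a fixed pool; returns latencies in ns. */
    static long[] run(int threads, int requests, long keyCount, LongConsumer lookup) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(pool.submit(() -> {
                    long key = ThreadLocalRandom.current().nextLong(keyCount);
                    long start = System.nanoTime();
                    lookup.accept(key); // hypothetical stand-in for the repository call
                    return System.nanoTime() - start;
                }));
            }
            long[] latencies = new long[requests];
            for (int i = 0; i < requests; i++) {
                latencies[i] = pending.get(i).get();
            }
            return latencies;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("benchmark task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // No-op lookup: replace with the real cache call when adapting this sketch.
        long[] latencies = run(8, 10_000, 1_000_000L, key -> { });
        System.out.println("p95 (ns): " + percentile(latencies, 95));
    }
}
```

The rate limiting mentioned above would wrap the submit loop; it is omitted here to keep the sketch free of external dependencies.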


ilya.kasnacheev ilya.kasnacheev

Re: Ignite partitioned mode not scaling

Hello!

Unfortunately, we can only help you if we have a runnable reproducer. That means you should prepare a stripped-down project that still exhibits this behavior but contains no proprietary information, and share it with us, e.g. via GitHub.

Regards,
--
Ilya Kasnacheev

