Get high throughput for loading data in ignite

rishi007bansod

I am loading data into a cache through KafkaStreamer and IgniteDataStreamer. What are the ideal IgniteDataStreamer settings (for parameters such as perNodeBufferSize, perNodeParallelOperations, and autoFlushFrequency) for loading data at a high rate? Are there any Linux system settings that can significantly increase data loading performance? What else can I try to improve data load throughput? My message size is 500 bytes and I am getting a rate of only 50k msgs/sec, i.e. 25 MB/s, but RAM write rates are up to 100 GB/sec. So how can I achieve this rate?
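The arithmetic in the question can be checked with a short sketch. The helper name `throughput_mb_per_sec` is hypothetical, and decimal megabytes (10^6 bytes) are assumed, which matches the 25 MB/s figure quoted in the post.

```python
def throughput_mb_per_sec(msgs_per_sec: float, msg_size_bytes: int) -> float:
    """Convert a message rate into MB/s (decimal megabytes, 10**6 bytes)."""
    return msgs_per_sec * msg_size_bytes / 1_000_000


if __name__ == "__main__":
    # Figures from the post: 50k msgs/sec at 500 bytes per message.
    rate = throughput_mb_per_sec(50_000, 500)
    print(f"{rate:.2f} MB/s")
```

This only converts units. It says nothing about where the time is spent, which is the question the replies focus on.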

Andrew Mashenkov

Hi,

To decide which Ignite or Linux settings to change, you first have to clearly understand where the bottleneck is.

What is the configuration of your grid and caches?
Where do you load the data from (network or disk), and how do you do it?



On Tue, Feb 7, 2017 at 6:41 PM, rishi007bansod wrote:
> I am loading data into a cache through KafkaStreamer and IgniteDataStreamer.
> What are the ideal IgniteDataStreamer settings (for parameters such as
> perNodeBufferSize, perNodeParallelOperations, and autoFlushFrequency) for
> loading data at a high rate? Are there any Linux system settings that can
> significantly increase data loading performance? What else can I try to
> improve data load throughput? My message size is 500 bytes and I am getting
> a rate of only 50k msgs/sec, i.e. 25 MB/s, but RAM write rates are up to
> 100 GB/sec. So how can I achieve this rate?
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Get-high-throughput-for-loading-data-in-ignite-tp10483.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

--
Best regards,
Andrey V. Mashenkov
Tel. +7-921-932-61-82

rishi007bansod

The cache is configured in off-heap, partitioned mode. Data is read from a Kafka topic. There must be some reference settings: for example, for a machine with a given amount of memory and number of cores, how can the data streamer be tuned? Which system parameters should I look at to improve data loading performance?

vkulichenko

I agree with Andrey: you should debug your application first and find out where the bottleneck is. Why do you think it's at the system level? I honestly doubt it.

-Val

rishi007bansod

Hi Val,

I have the following machine configuration:
      Number of CPU cores: 24
      RAM: 65 GB
      Processor: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

My message size is 512 bytes.

Currently I am getting a data loading throughput of about 230,000 messages/sec (i.e. about 117.76 MB/sec) on a single machine.
What is the maximum cache write rate achievable with Apache Ignite? What is the maximum expected data loading throughput in this configuration?

vkulichenko

Rishi,

There is no such thing as maximum possible throughput. First, it depends on many factors, and the only way to determine performance is to run a benchmark. Second, Ignite is a highly scalable system, so performance can always be improved by adding nodes and resources.

If you think there is a bottleneck, you need to isolate and locate it: review the code, run a profiler, and check CPU utilization, memory utilization, GC activity, and so on.

-Val
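
One way to start isolating a bottleneck, as suggested above, is to time each stage of the loading pipeline separately. The sketch below is a generic illustration and does not use the Ignite or Kafka APIs. The stage names and the `time_stages` and `slowest_stage` helpers are hypothetical, and the stages are stand-ins simulated with `time.sleep`.

```python
import time
from typing import Callable, Dict


def time_stages(stages: Dict[str, Callable[[], None]], repeats: int = 5) -> Dict[str, float]:
    """Run each stage `repeats` times and return total elapsed seconds per stage."""
    totals: Dict[str, float] = {}
    for name, stage in stages.items():
        start = time.perf_counter()
        for _ in range(repeats):
            stage()
        totals[name] = time.perf_counter() - start
    return totals


def slowest_stage(totals: Dict[str, float]) -> str:
    """Return the name of the stage that consumed the most time."""
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Simulated stages (assumption): in a real test these would wrap the
    # Kafka read, any deserialization, and the cache write.
    stages = {
        "read_from_kafka": lambda: time.sleep(0.001),
        "deserialize": lambda: time.sleep(0.005),
        "write_to_cache": lambda: time.sleep(0.002),
    }
    totals = time_stages(stages)
    for name, seconds in totals.items():
        print(f"{name}: {seconds:.3f}s")
    print("slowest:", slowest_stage(totals))
```

Coarse timing like this only shows which stage dominates. A profiler and system metrics are still needed to see why.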

rishi007bansod

Hi,
On one server node I am getting up to 230,000 messages/sec. To scale up this data loading rate I added one more node with the same configuration. But the data loading rate hardly scales: with 2 nodes connected I am getting only 270,000 messages/sec. What best practices should I follow while loading data so that the loading rate scales on this 2-node cluster?

Thanks.
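
To put the reported numbers in perspective, scaling efficiency can be computed as the cluster rate divided by the ideal linear rate. This is a minimal sketch and the helper name `scaling_efficiency` is hypothetical. With the figures from the post, going from one node to two gives a speedup of about 1.17x, or roughly 59% of ideal linear scaling.

```python
def scaling_efficiency(single_node_rate: float, cluster_rate: float, nodes: int) -> float:
    """Fraction of ideal linear scaling achieved by the cluster."""
    if nodes < 1 or single_node_rate <= 0:
        raise ValueError("nodes must be >= 1 and single_node_rate must be positive")
    return cluster_rate / (single_node_rate * nodes)


if __name__ == "__main__":
    # Figures from the post: 230,000 msgs/sec on one node, 270,000 on two.
    speedup = 270_000 / 230_000
    efficiency = scaling_efficiency(230_000, 270_000, 2)
    print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```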

Andrew Mashenkov

Hi Rishi,

Have you collected metrics during data loading: CPU, I/O, network? Where is the bottleneck?
How many backups did you set in CacheConfiguration: zero or more?

On Wed, Feb 15, 2017 at 10:54 PM, rishi007bansod wrote:
> Hi,
> On one server node I am getting up to 230,000 rec/sec. To scale up this
> data loading rate I added one more node with the same configuration. But
> the data loading rate hardly scales: with 2 nodes connected I am getting
> only 270,000 rec/sec. What best practices should I follow while loading
> data so that the loading rate scales on this 2-node cluster?
>
> Thanks.

--
Best regards,
Andrey V. Mashenkov

rishi007bansod

Hi,
In my case the CPU becomes the bottleneck when 2 nodes are connected. I have attached a snapshot of my atop output:
<http://apache-ignite-users.70518.x6.nabble.com/file/n10675/atop_log.png>
I have not set the number of backups; it is the default (i.e. 0) in my case. Is there any way I can improve performance? Which Ignite or system parameters should I look at? Are there any settings I should check when 2 or more nodes are connected in a grid? In my case the data loading rate is not scaling up with 2 nodes connected.

Thanks.

Andrew Mashenkov

Sorry for the late answer.

Common performance tips are described in [1] and [2].
Are all your Ignite nodes on a single machine?
Do you use a virtual environment such as VMware ESX? Ignite running inside a VM can have some performance issues.

I also noticed the following in the screenshot you provided:
- Almost half of the swap space is used. Is it used by Ignite?
- eth0 shows a 798 Mbps incoming data rate, which is ~200k msgs per sec (for 500-byte messages). Is that the maximum speed? How does it change when you add new nodes?
- lo0 shows 1 Gbps. It looks like Ignite inter-node communication. Is it possible to apply partition-aware data loading [3] to your test?

Please correct me if I missed something.

Your grid and cache configuration and JFR profiling results [4] may be helpful.
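
The eth0 estimate in the list above can be reproduced with a short calculation. This is a sketch under stated assumptions: the helper name `msgs_per_sec_from_link` is hypothetical, 1 Mbps is taken as 10^6 bits per second, and protocol overhead (TCP and Kafka framing) is ignored, so the real message rate would be somewhat lower.

```python
def msgs_per_sec_from_link(rate_mbps: float, msg_size_bytes: int) -> float:
    """Message rate implied by a link rate in megabits/s, ignoring protocol overhead."""
    bytes_per_sec = rate_mbps * 1_000_000 / 8
    return bytes_per_sec / msg_size_bytes


if __name__ == "__main__":
    # Figures from the post: 798 Mbps on eth0, 500-byte messages.
    print(f"{msgs_per_sec_from_link(798, 500):,.0f} msgs/sec")
```

For 798 Mbps and 500-byte messages this gives 199,500 msgs/sec, consistent with the ~200k figure quoted.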
 


On Thu, Feb 16, 2017 at 5:36 PM, rishi007bansod wrote:
> Hi,
> In my case the CPU becomes the bottleneck when 2 nodes are connected. I
> have attached a snapshot of my atop output:
> <http://apache-ignite-users.70518.x6.nabble.com/file/n10675/atop_log.png>
> I have not set the number of backups; it is the default (i.e. 0) in my
> case. Is there any way I can improve performance? Which Ignite or system
> parameters should I look at? Are there any settings I should check when 2
> or more nodes are connected in a grid? In my case the data loading rate is
> not scaling up with 2 nodes connected.
>
> Thanks.

--
Best regards,
Andrey V. Mashenkov

rishi007bansod

Hi,
Is it possible to use partition-aware data loading with streaming data? In my case the data is stored in Kafka and I then pass it to an Ignite instance, so how can partition-aware data loading be used here? The example at http://apacheignite.gridgain.org/docs/data-loading#section-partition-aware-data-loading only covers loading data from a persistent store. Is there any way to use partition-aware data loading when loading data from Kafka into Ignite?

Thanks.

vkulichenko

I think the answer is no. Partition-aware loading implies selecting a subset of data based on partition, while Kafka is a queue that streams entries. I actually don't see why you would use anything except IgniteDataStreamer to load data from Kafka.

-Val