Perfomance drop

classic Classic list List threaded Threaded
9 messages Options
Nikita Kuzin Nikita Kuzin
Reply | Threaded
Open this post in threaded view
|

Perfomance drop

Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Perfomance drop

Hello!

Have you tried collecting thread dumps/JFR/etc while such an operation is waiting? It could provide some clues.

Do you have GC log collected? Any long GC pauses?

Regards,
--
Ilya Kasnacheev


пт, 30 авг. 2019 г. в 14:43, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27
Nikita Kuzin Nikita Kuzin
Reply | Threaded
Open this post in threaded view
|

RE: Perfomance drop

Hello!

We collected GC log, it seems to be ok.

Long duration operations are very random events, we will try to collect.

Can the caches' eviction processes has influence on cache reading? (use of same threads)
_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 30 августа 2019 г. 15:49
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

Have you tried collecting thread dumps/JFR/etc while such an operation is waiting? It could provide some clues.

Do you have GC log collected? Any long GC pauses?

Regards,
--
Ilya Kasnacheev


пт, 30 авг. 2019 г. в 14:43, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

log.txt.1 (2).current (2M) Download Attachment
Nikita Kuzin Nikita Kuzin
Reply | Threaded
Open this post in threaded view
|

RE: Perfomance drop

Hello!

gc log for 3 days of working ignite.

There are 58 GCs with duration 100-200ms and 9 GCs with duration of 200-300ms

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Кузин Никита (Nikita Kuzin) <[hidden email]>
Отправлено: 31 августа 2019 г. 10:26
Кому: [hidden email] <[hidden email]>
Тема: RE: Perfomance drop
 
Hello!

We collected GC log, it seems to be ok.

Long duration operations are very random events, we will try to collect.

Can the caches' eviction processes has influence on cache reading? (use of same threads)
_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 30 августа 2019 г. 15:49
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

Have you tried collecting thread dumps/JFR/etc while such an operation is waiting? It could provide some clues.

Do you have GC log collected? Any long GC pauses?

Regards,
--
Ilya Kasnacheev


пт, 30 авг. 2019 г. в 14:43, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

log.txt.1 (3).current (8M) Download Attachment
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Perfomance drop

Hello!

If GC is fine next step is examining thread dumps and logs when a long operation is taking place. You can trigger it yourself by calling IgniteUtils.dumpThreads() in a test setting.

Regards,
--
Ilya Kasnacheev


пн, 2 сент. 2019 г. в 11:11, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hello!

gc log for 3 days of working ignite.

There are 58 GCs with duration 100-200ms and 9 GCs with duration of 200-300ms

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Кузин Никита (Nikita Kuzin) <[hidden email]>
Отправлено: 31 августа 2019 г. 10:26
Кому: [hidden email] <[hidden email]>
Тема: RE: Perfomance drop
 
Hello!

We collected GC log, it seems to be ok.

Long duration operations are very random events, we will try to collect.

Can the caches' eviction processes has influence on cache reading? (use of same threads)
_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 30 августа 2019 г. 15:49
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

Have you tried collecting thread dumps/JFR/etc while such an operation is waiting? It could provide some clues.

Do you have GC log collected? Any long GC pauses?

Regards,
--
Ilya Kasnacheev


пт, 30 авг. 2019 г. в 14:43, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27
Nikita Kuzin Nikita Kuzin
Reply | Threaded
Open this post in threaded view
|

RE: Perfomance drop

jfr from ignite is in attachment

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 2 сентября 2019 г. 14:08
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

If GC is fine next step is examining thread dumps and logs when a long operation is taking place. You can trigger it yourself by calling IgniteUtils.dumpThreads() in a test setting.

Regards,
--
Ilya Kasnacheev


пн, 2 сент. 2019 г. в 11:11, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hello!

gc log for 3 days of working ignite.

There are 58 GCs with duration 100-200ms and 9 GCs with duration of 200-300ms

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Кузин Никита (Nikita Kuzin) <[hidden email]>
Отправлено: 31 августа 2019 г. 10:26
Кому: [hidden email] <[hidden email]>
Тема: RE: Perfomance drop
 
Hello!

We collected GC log, it seems to be ok.

Long duration operations are very random events, we will try to collect.

Can the caches' eviction processes has influence on cache reading? (use of same threads)
_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 30 августа 2019 г. 15:49
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

Have you tried collecting thread dumps/JFR/etc while such an operation is waiting? It could provide some clues.

Do you have GC log collected? Any long GC pauses?

Regards,
--
Ilya Kasnacheev


пт, 30 авг. 2019 г. в 14:43, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

flight_recording_localhost9010.jfr (2M) Download Attachment
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Perfomance drop

Hello!

I can see that there's huge unevenness with regards to striped pool usage: around 80% of load end up on just two striped threads,
sys-stripe-11-#12%ITLXCLKAP-server425b5ae8-214a-4ff8-b1a4-943a1513c415% and
sys-stripe-9-#10%ITLXCLKAP-server425b5ae8-214a-4ff8-b1a4-943a1513c415%. That's 2/3 of all utilization of all theads.

I imagine there is a lot of contention around a very narrow set of keys. Maybe you update a few keys continuously which will cause locking issues. Please check your data partitioning and affinity.

Regards,
--
Ilya Kasnacheev


чт, 5 сент. 2019 г. в 18:12, Кузин Никита (Nikita Kuzin) <[hidden email]>:
jfr from ignite is in attachment

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 2 сентября 2019 г. 14:08
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

If GC is fine next step is examining thread dumps and logs when a long operation is taking place. You can trigger it yourself by calling IgniteUtils.dumpThreads() in a test setting.

Regards,
--
Ilya Kasnacheev


пн, 2 сент. 2019 г. в 11:11, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hello!

gc log for 3 days of working ignite.

There are 58 GCs with duration 100-200ms and 9 GCs with duration of 200-300ms

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Кузин Никита (Nikita Kuzin) <[hidden email]>
Отправлено: 31 августа 2019 г. 10:26
Кому: [hidden email] <[hidden email]>
Тема: RE: Perfomance drop
 
Hello!

We collected GC log, it seems to be ok.

Long duration operations are very random events, we will try to collect.

Can the caches' eviction processes has influence on cache reading? (use of same threads)
_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 30 августа 2019 г. 15:49
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

Have you tried collecting thread dumps/JFR/etc while such an operation is waiting? It could provide some clues.

Do you have GC log collected? Any long GC pauses?

Regards,
--
Ilya Kasnacheev


пт, 30 авг. 2019 г. в 14:43, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27
Nikita Kuzin Nikita Kuzin
Reply | Threaded
Open this post in threaded view
|

Re: Perfomance drop

Hello!

How we can debug it to find cache name that causes it?
Some extra logging or metrics?

Скачайте Outlook для Android


From: Ilya Kasnacheev <[hidden email]>
Sent: Wednesday, September 11, 2019 4:08:18 PM
To: [hidden email] <[hidden email]>
Subject: Re: Perfomance drop
 
Hello!

I can see that there's huge unevenness with regards to striped pool usage: around 80% of load end up on just two striped threads,
sys-stripe-11-#12%ITLXCLKAP-server425b5ae8-214a-4ff8-b1a4-943a1513c415% and
sys-stripe-9-#10%ITLXCLKAP-server425b5ae8-214a-4ff8-b1a4-943a1513c415%. That's 2/3 of all utilization of all theads.

I imagine there is a lot of contention around a very narrow set of keys. Maybe you update a few keys continuously which will cause locking issues. Please check your data partitioning and affinity.

Regards,
--
Ilya Kasnacheev


чт, 5 сент. 2019 г. в 18:12, Кузин Никита (Nikita Kuzin) <[hidden email]>:
jfr from ignite is in attachment

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 2 сентября 2019 г. 14:08
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

If GC is fine next step is examining thread dumps and logs when a long operation is taking place. You can trigger it yourself by calling IgniteUtils.dumpThreads() in a test setting.

Regards,
--
Ilya Kasnacheev


пн, 2 сент. 2019 г. в 11:11, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hello!

gc log for 3 days of working ignite.

There are 58 GCs with duration 100-200ms and 9 GCs with duration of 200-300ms

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Кузин Никита (Nikita Kuzin) <[hidden email]>
Отправлено: 31 августа 2019 г. 10:26
Кому: [hidden email] <[hidden email]>
Тема: RE: Perfomance drop
 
Hello!

We collected GC log, it seems to be ok.

Long duration operations are very random events, we will try to collect.

Can the caches' eviction processes has influence on cache reading? (use of same threads)
_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 30 августа 2019 г. 15:49
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

Have you tried collecting thread dumps/JFR/etc while such an operation is waiting? It could provide some clues.

Do you have GC log collected? Any long GC pauses?

Regards,
--
Ilya Kasnacheev


пт, 30 авг. 2019 г. в 14:43, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Perfomance drop

Hello!

Unfortunately it's hard to say, I'm afraid you will have to debug it or break down your tasks until you see what step causes this issue.

Regards,
--
Ilya Kasnacheev


ср, 11 сент. 2019 г. в 20:57, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hello!

How we can debug it to find cache name that causes it?
Some extra logging or metrics?

Скачайте Outlook для Android


From: Ilya Kasnacheev <[hidden email]>
Sent: Wednesday, September 11, 2019 4:08:18 PM
To: [hidden email] <[hidden email]>
Subject: Re: Perfomance drop
 
Hello!

I can see that there's huge unevenness with regards to striped pool usage: around 80% of load end up on just two striped threads,
sys-stripe-11-#12%ITLXCLKAP-server425b5ae8-214a-4ff8-b1a4-943a1513c415% and
sys-stripe-9-#10%ITLXCLKAP-server425b5ae8-214a-4ff8-b1a4-943a1513c415%. That's 2/3 of all utilization of all theads.

I imagine there is a lot of contention around a very narrow set of keys. Maybe you update a few keys continuously which will cause locking issues. Please check your data partitioning and affinity.

Regards,
--
Ilya Kasnacheev


чт, 5 сент. 2019 г. в 18:12, Кузин Никита (Nikita Kuzin) <[hidden email]>:
jfr from ignite is in attachment

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 2 сентября 2019 г. 14:08
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

If GC is fine next step is examining thread dumps and logs when a long operation is taking place. You can trigger it yourself by calling IgniteUtils.dumpThreads() in a test setting.

Regards,
--
Ilya Kasnacheev


пн, 2 сент. 2019 г. в 11:11, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hello!

gc log for 3 days of working ignite.

There are 58 GCs with duration 100-200ms and 9 GCs with duration of 200-300ms

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Кузин Никита (Nikita Kuzin) <[hidden email]>
Отправлено: 31 августа 2019 г. 10:26
Кому: [hidden email] <[hidden email]>
Тема: RE: Perfomance drop
 
Hello!

We collected GC log, it seems to be ok.

Long duration operations are very random events, we will try to collect.

Can the caches' eviction processes has influence on cache reading? (use of same threads)
_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27

От: Ilya Kasnacheev <[hidden email]>
Отправлено: 30 августа 2019 г. 15:49
Кому: [hidden email] <[hidden email]>
Тема: Re: Perfomance drop
 
Hello!

Have you tried collecting thread dumps/JFR/etc while such an operation is waiting? It could provide some clues.

Do you have GC log collected? Any long GC pauses?

Regards,
--
Ilya Kasnacheev


пт, 30 авг. 2019 г. в 14:43, Кузин Никита (Nikita Kuzin) <[hidden email]>:
Hi! 

We have application that goes to ignite node (only one node in grid). 
Application and ignite are located on one machine. 

And we noticed that sometimes that cache located at ignite is very slow: we execute method igniteCache.containsKey(key) and it can lasts 900ms, size of cache less than 10000 entries 
This cache is used mostly for read. 

Persistence is off 
All of JVM recommendations is applied 
OS: windows server 2012 R2

_________________________________
С уважением, Никита Кузин
Ведущий программист-разработчик

 Интернейшнл АйТи Дистрибьюшн

тел.: 84995021375 доб. 320
моб. тел.: 79260948887
115114, Москва, Дербеневская ул., 20-27