Ignite Map-Reduce Deadlocking, Running in SYS pool

Chris Software

Hello,

I am working on a project, and we have run into two related problems while running map-reduce on the Ignite File System (IGFS) cache.

We were originally on Ignite 2.6 but upgraded to 2.7.5 in an unsuccessful bid to resolve the problem.


Basically, if you run the test (mvn test), it deadlocks and hangs. We create two IgfsTasks and, for demonstration purposes, set the SYS thread pool size to 2. Each IgfsTask sleeps and then writes to a file. This causes a deadlock because:
1.  The IgfsTask runs in the SYS pool.
2.  The IGFS write uses a separate thread in the SYS pool.
3.  If no free thread is available, the write can never start, the task waiting on it never finishes, and the whole system hangs.
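
The three steps above can be reproduced without Ignite. The following is a minimal sketch in plain Java, not the attached test: a fixed-size ExecutorService stands in for the SYS pool, a CountDownLatch stands in for the sleep so that every outer task is guaranteed to hold a thread before any of them submits its write, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the scenario: outer tasks and their nested
// "writes" share one fixed-size pool, as the IgfsTask and IGFS write
// are described as sharing the SYS pool.
class PoolStarvationDemo {

    /**
     * Runs `tasks` outer tasks on a pool of `poolSize` threads. Each outer
     * task submits an inner job to the SAME pool and blocks on its result.
     * Returns true if all outer tasks finish within `timeoutMs`, else false.
     */
    static boolean runsToCompletion(int poolSize, int tasks, long timeoutMs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allStarted = new CountDownLatch(tasks);
        List<Future<String>> outer = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                outer.add(pool.submit(() -> {
                    // Wait until every outer task occupies a thread.
                    allStarted.countDown();
                    allStarted.await();
                    // Nested job needs another thread from the same pool.
                    Future<String> write = pool.submit(() -> "written-" + id);
                    return write.get(); // blocks while holding a pool thread
                }));
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            for (Future<String> f : outer) {
                long remaining = Math.max(deadline - System.nanoTime(), 0);
                try {
                    f.get(remaining, TimeUnit.NANOSECONDS);
                } catch (TimeoutException e) {
                    return false; // starved: nested jobs never got a thread
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            return true;
        } finally {
            pool.shutdownNow(); // interrupt any blocked workers so the JVM can exit
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool=2 completed=" + runsToCompletion(2, 2, 500));  // false
        System.out.println("pool=4 completed=" + runsToCompletion(4, 2, 2000)); // true
    }
}
```

With two tasks and two threads, both threads are held by tasks waiting for work that is stuck in the queue behind them, which matches the hang described above. The larger pool completes only because spare threads remain for the nested jobs.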

First, shouldn't executeAsync run the task in the PUBLIC pool? Using the SYS pool seems unnecessarily risky: we found that when it deadlocks, it locks up an entire cluster of many Ignite nodes. How do I get it to use the PUBLIC pool? Also, because it uses the SYS pool, the task actually seems to execute on the client. This is not obvious in this test, but in my real cluster of 30 nodes the client seems to be doing this work, which is a problem.

Second, is it bad form to open a file within a map-reduce? Even using the PUBLIC pool will not solve the inherent deadlock here: one thread depends on another thread in the same thread pool. That's an inherent risk. In our real process we open the file because we perform file transformations in the IgfsTask and then write the results out to temp files in the cluster. In the end, we collate all the temp files. Is there a better approach, or a safe way to open a file and write to it from within a reduce?

Thank you for your time!

Chris


ilya.kasnacheev

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

Hello!

This looks like a mistake. However, we are going to drop IGFS, so a fix should not be expected.

The recommended practical approach is to increase the number of threads in the system thread pool to a large value.
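
As a rough sketch of that workaround, assuming the ignite-core 2.x dependency is on the classpath: the system pool size can be set on IgniteConfiguration before the node starts. The class name and the value 64 are placeholders chosen for illustration, not recommendations from this thread.

```java
import org.apache.ignite.configuration.IgniteConfiguration;

// Hypothetical helper; only IgniteConfiguration and its setter/getter
// come from Ignite itself.
class SystemPoolConfig {

    /** Builds a node configuration with an enlarged system (SYS) thread pool. */
    static IgniteConfiguration withSystemPoolSize(int threads) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // More SYS threads leave room for nested IGFS writes while tasks run.
        cfg.setSystemThreadPoolSize(threads);
        return cfg;
    }

    public static void main(String[] args) {
        IgniteConfiguration cfg = withSystemPoolSize(64); // 64 is an arbitrary example value
        System.out.println("SYS pool size: " + cfg.getSystemThreadPoolSize());
        // A node would then be started with Ignition.start(cfg).
    }
}
```

A larger pool makes starvation less likely rather than impossible: if the number of concurrently running tasks reaches the pool size, the nested writes can still be left without a free thread.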

Regards,
--
Ilya Kasnacheev


In reply to Chris Software, Tue, Aug 27, 2019 at 00:34.


Chris Software

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

I see.  Thank you. 

In reply to Ilya Kasnacheev, Tue, Aug 27, 2019 at 12:30 PM.


Chris Software

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

Ilya,

When will there be an official announcement about dropping IGFS, including a schedule for ending support?

Thank you,

Chris

In reply to Chris Software, Tue, Aug 27, 2019 at 1:47 PM.


ilya.kasnacheev

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

Hello!


The vote has already passed; IGFS should not be in 2.8.

Regards,
--
Ilya Kasnacheev


In reply to Chris Software, Wed, Sep 11, 2019 at 16:52.


Chris Software

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

I respectfully suggest you update your public documentation ASAP, as people (like my team) are developing new software now, using IGFS, expecting that it will continue to be supported.  Please don't wait until you release 2.8. 

In reply to Ilya Kasnacheev, Wed, Sep 11, 2019 at 11:07 AM.


dmagda

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

Chris,

Thanks for the reminder. We'll update the docs next week. Sorry for wasting your time.

By the way, what task were you trying to solve with IGFS? We might find an alternative API or solution within Ignite, or recommend another project.

-
Denis


In reply to Chris Software, Wed, Sep 11, 2019 at 10:21 AM.