Recovering from a data region OOM condition

colinc

Recovering from a data region OOM condition

I wrote a test for what happens in the case that a DataRegion runs out of
memory. I filled up a cache with records until I received the expected
IgniteOutOfMemoryException. Then I tried to remove entries from the cache -
expecting that memory would be freed up again.

What I found is that any cache operation such as remove() or clearAll() threw
a further IgniteOutOfMemoryException. Although cache.size() decreased, memory
does not appear to be freed up - it is not possible to add new entries to
the cache even after a clearAll().

Is this expected behaviour? What is the recommended approach for dealing
with an OOM condition - other than to avoid it in the first place?

Thanks!



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
colinc

Re: Recovering from a data region OOM condition

Ignite DataRegionMetrics reports that memory *is* freed up when removing
items from the cache. However, Ignite continues to throw an OOM exception on
each subsequent cache removal. Cache puts are unsuccessful.

So although Ignite reports that the memory is free, it doesn't seem possible
to actually use it again following the OOM condition.
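
For anyone who wants to cross-check the metrics, here is a rough sketch of how
page-level figures can be turned into a byte estimate. It assumes the page
count and fill factor are read from DataRegionMetrics.getTotalAllocatedPages()
and getPagesFillFactor(), and that the page size is the configured one (4096
is only an example value). RegionUsage and its methods are hypothetical helper
names, not part of Ignite.

```java
/** Hypothetical helper (not an Ignite class): estimates data-region usage from page metrics. */
public final class RegionUsage {
    private RegionUsage() {
    }

    /** Estimated bytes holding data: allocated pages * page size * fill factor. */
    public static long usedBytes(long totalAllocatedPages, int pageSizeBytes, double pagesFillFactor) {
        return (long) (totalAllocatedPages * (double) pageSizeBytes * pagesFillFactor);
    }

    /** Fraction of the region's maximum size that the estimate above represents. */
    public static double usedFraction(long usedBytes, long maxSizeBytes) {
        if (maxSizeBytes <= 0) {
            throw new IllegalArgumentException("maxSizeBytes must be positive");
        }
        return (double) usedBytes / maxSizeBytes;
    }

    public static void main(String[] args) {
        long maxSize = 100L * 1024 * 1024;          // example region size, assumed
        long used = usedBytes(25_600, 4096, 0.5);   // example metric values, assumed
        System.out.println("used bytes: " + used);
        System.out.println("used fraction: " + usedFraction(used, maxSize));
    }
}
```

A falling estimate here only says that pages contain less data. It does not by
itself show that those pages can be reused for new entries, which is the
mismatch described above.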



Mikhail

Re: Recovering from a data region OOM condition

Which Ignite version are you using? Could you please share a reproducer with us?



colinc

Re: Recovering from a data region OOM condition

We were using Ignite 2.4 (update pending). Ignite 2.5 and later seem to
treat OOM as a critical error by default and stop the node. The reproducer
below uses a no-op failure handler to prevent this. It allocates a
100MB region (configurable; 100MB is quite small) and fills it up with
data. Afterwards, it attempts to clear data from the cache. Every cache
operation (even data removal) results in OOM errors.

package mytest;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.failure.NoOpFailureHandler;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.junit.Test;

import javax.cache.CacheException;

public class MemoryTest {

    private static final String CACHE_NAME = "cache";
    private static final Logger logger = LogManager.getLogger(MemoryTest.class);
    private static final String DEFAULT_MEMORY_REGION = "Default_Region";

    private static final String I1_NAME = "IgniteMemoryMonitorTest1";

    private static final long MEM_SIZE = 100L * 1024 * 1024;



    @Test
    public void testOOM() {
        try (Ignite ignite = startIgnite(I1_NAME)) {
            fillDataRegion(ignite);
            IgniteCache<Object, Object> cache = ignite.getOrCreateCache(CACHE_NAME);

            // Clear all entries from the cache to free up memory
            cache.clear();      // Fails here
            cache.put("Key", "Value");
        }
    }


    private Ignite startIgnite(String instanceName) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setIgniteInstanceName(instanceName);
        cfg.setDataStorageConfiguration(createDataStorageConfiguration());
        cfg.setFailureHandler(new NoOpFailureHandler());
        return Ignition.start(cfg);
    }

    private DataStorageConfiguration createDataStorageConfiguration() {
        return new DataStorageConfiguration()
                .setDefaultDataRegionConfiguration(
                        new DataRegionConfiguration()
                                .setName(DEFAULT_MEMORY_REGION)
                                .setInitialSize(MEM_SIZE)
                                .setMaxSize(MEM_SIZE)
                                .setMetricsEnabled(true));
    }



    private void fillDataRegion(Ignite ignite) {
        byte[] megabyte = new byte[1024 * 1024];

        int storedDataMB = 0;
        try {
            IgniteCache<Object, Object> cache = ignite.getOrCreateCache(CACHE_NAME);
            for (int i = 0; i < 200; i++) {
                cache.put(i, megabyte);
                storedDataMB++;
            }
        } catch (CacheException e) {
            logger.info("Out of memory: " + e.getClass().getSimpleName()
                    + " after " + storedDataMB + "MB", e);
        }
    }
}




colinc

Re: Recovering from a data region OOM condition

This appears to be a problem that is fixed in Ignite 2.7.


