How to end up the GC overhead problem in the IgniteRDD?

F7753

I run a Spark Streaming app on an Ignite cluster that overlaps with a Spark cluster (each Ignite server node is also a Spark worker node). The monitoring page shows that only one task succeeds, and after a while an OOM error is thrown.
Here is the full stack log:
--------------------------------------------------------------------------------------------------------------
[05-04-2016 19:11:44][INFO ][grid-timeout-worker-#97%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=5083f90f, name=null]
    ^-- H/N/C [hosts=4, nodes=7, CPUs=96]
    ^-- CPU [cur=6%, avg=0.96%, GC=0.13%]
    ^-- Heap [used=528MB, free=42.03%, comm=911MB]
    ^-- Public thread pool [active=0, idle=48, qSize=0]
    ^-- System thread pool [active=0, idle=48, qSize=0]
    ^-- Outbound messages queue [size=0]
Exception in thread "pub-#2%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.ignite.internal.processors.cache.GridCacheUtils.versionToBytes(GridCacheUtils.java:1122)
        at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.store(GridCacheQueryManager.java:406)
        at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.updateIndex(GridCacheMapEntry.java:3740)
        at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.initialValue(GridCacheMapEntry.java:3212)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$IsolatedUpdater.receive(DataStreamerImpl.java:1597)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerUpdateJob.call(DataStreamerUpdateJob.java:140)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.processRequest(DataStreamProcessor.java:304)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.access$000(DataStreamProcessor.java:49)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor$1.onMessage(DataStreamProcessor.java:79)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:821)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:103)
        at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:784)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
[05-Apr-2016 19:12:30][ERROR][tcp-disco-sock-reader-#13%null%][TcpDiscoverySpi] Runtime error caught during grid runnable execution: Socket reader [id=515, name=tcp-disco-sock-reader-#13%null%, nodeId=null]
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.io.BufferedInputStream.<init>(BufferedInputStream.java:195)
        at java.io.BufferedInputStream.<init>(BufferedInputStream.java:175)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$SocketReader.body(ServerImpl.java:4973)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
Exception in thread "tcp-disco-sock-reader-#13%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.io.BufferedInputStream.<init>(BufferedInputStream.java:195)
        at java.io.BufferedInputStream.<init>(BufferedInputStream.java:175)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$SocketReader.body(ServerImpl.java:4973)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[05-Apr-2016 19:12:33][ERROR][shmem-worker-#352%null%][TcpCommunicationSpi] Runtime error caught during grid runnable execution: ShmemWorker [endpoint=IpcSharedMemoryClientEndpoint [inSpace=IpcSharedMemorySpace [opSize=262144, shmemPtr=140320943636544, shmemId=17694742, semId=17203205, closed=true, isReader=true, writerPid=12466, readerPid=12010, tokFileName=/opt/apache-ignite-1.5.0.final-src/work/ipc/shmem/5083f90f-ffcd-40b5-b311-2802db6a3d7a-12010/gg-shmem-space-446-12466-262144, closed=true], outSpace=IpcSharedMemorySpace [opSize=262144, shmemPtr=140320608149568, shmemId=17727511, semId=17235974, closed=true, isReader=false, writerPid=12010, readerPid=12466, tokFileName=/opt/apache-ignite-1.5.0.final-src/work/ipc/shmem/5083f90f-ffcd-40b5-b311-2802db6a3d7a-12010/gg-shmem-space-447-12466-262144, closed=true], checkIn=false, checkOut=false]]
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.ignite.internal.managers.communication.GridIoMessageFactory.create(GridIoMessageFactory.java:575)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$ShmemWorker$1.create(TcpCommunicationSpi.java:2945)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readMessage(DirectByteBufferStreamImplV2.java:1093)
        at org.apache.ignite.internal.direct.DirectMessageReader.readMessage(DirectMessageReader.java:305)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerEntry.readFrom(DataStreamerEntry.java:139)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readMessage(DirectByteBufferStreamImplV2.java:1104)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.read(DirectByteBufferStreamImplV2.java:1566)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readCollection(DirectByteBufferStreamImplV2.java:1183)
        at org.apache.ignite.internal.direct.DirectMessageReader.readCollection(DirectMessageReader.java:327)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerRequest.readFrom(DataStreamerRequest.java:404)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readMessage(DirectByteBufferStreamImplV2.java:1104)
        at org.apache.ignite.internal.direct.DirectMessageReader.readMessage(DirectMessageReader.java:305)
        at org.apache.ignite.internal.managers.communication.GridIoMessage.readFrom(GridIoMessage.java:249)
        at org.apache.ignite.internal.util.nio.GridDirectParser.decode(GridDirectParser.java:76)
        at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onMessageReceived(GridNioCodecFilter.java:104)
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:107)
        at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onMessageReceived(GridConnectionBytesVerifyFilter.java:123)
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:107)
        at org.apache.ignite.internal.util.ipc.IpcToNioAdapter$HeadFilter.onMessageReceived(IpcToNioAdapter.java:212)
        at org.apache.ignite.internal.util.nio.GridNioFilterChain.onMessageReceived(GridNioFilterChain.java:173)
        at org.apache.ignite.internal.util.ipc.IpcToNioAdapter.serve(IpcToNioAdapter.java:122)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$ShmemWorker.body(TcpCommunicationSpi.java:2990)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:745)
Exception in thread "shmem-worker-#352%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.ignite.internal.managers.communication.GridIoMessageFactory.create(GridIoMessageFactory.java:575)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$ShmemWorker$1.create(TcpCommunicationSpi.java:2945)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readMessage(DirectByteBufferStreamImplV2.java:1093)
        at org.apache.ignite.internal.direct.DirectMessageReader.readMessage(DirectMessageReader.java:305)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerEntry.readFrom(DataStreamerEntry.java:139)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readMessage(DirectByteBufferStreamImplV2.java:1104)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.read(DirectByteBufferStreamImplV2.java:1566)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readCollection(DirectByteBufferStreamImplV2.java:1183)
        at org.apache.ignite.internal.direct.DirectMessageReader.readCollection(DirectMessageReader.java:327)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerRequest.readFrom(DataStreamerRequest.java:404)
        at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readMessage(DirectByteBufferStreamImplV2.java:1104)
        at org.apache.ignite.internal.direct.DirectMessageReader.readMessage(DirectMessageReader.java:305)
        at org.apache.ignite.internal.managers.communication.GridIoMessage.readFrom(GridIoMessage.java:249)
        at org.apache.ignite.internal.util.nio.GridDirectParser.decode(GridDirectParser.java:76)
        at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onMessageReceived(GridNioCodecFilter.java:104)
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:107)
        at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onMessageReceived(GridConnectionBytesVerifyFilter.java:123)
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:107)
        at org.apache.ignite.internal.util.ipc.IpcToNioAdapter$HeadFilter.onMessageReceived(IpcToNioAdapter.java:212)
        at org.apache.ignite.internal.util.nio.GridNioFilterChain.onMessageReceived(GridNioFilterChain.java:173)
        at org.apache.ignite.internal.util.ipc.IpcToNioAdapter.serve(IpcToNioAdapter.java:122)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$ShmemWorker.body(TcpCommunicationSpi.java:2990)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:745)
[05-Apr-2016 19:12:36][WARN ][grid-nio-worker-2-#102%null%][TcpCommunicationSpi] Communication SPI Session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/20.0.0.148:39577, writeTimeout=2000]
[05-Apr-2016 19:12:36][WARN ][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] Timed out waiting for message delivery receipt (most probably, the reason is in long GC pauses on remote node; consider tuning GC and increasing 'ackTimeout' configuration property). Will retry to send message with increased timeout. Current timeout: 10000.
[05-Apr-2016 19:12:38][ERROR][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node in order to prevent cluster wide instability.
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOfRange(Arrays.java:2694)
        at java.lang.String.<init>(String.java:203)
        at java.lang.StringBuilder.toString(StringBuilder.java:405)
        at java.net.Inet4Address.numericToTextFormat(Inet4Address.java:374)
        at java.net.Inet4Address.getHostAddress(Inet4Address.java:329)
        at java.net.InetAddress.toString(InetAddress.java:698)
        at java.net.InetSocketAddress$InetSocketAddressHolder.toString(InetSocketAddress.java:107)
        at java.net.InetSocketAddress.toString(InetSocketAddress.java:380)
        at java.lang.String.valueOf(String.java:2849)
        at java.lang.StringBuilder.append(StringBuilder.java:128)
        at java.util.AbstractCollection.toString(AbstractCollection.java:458)
        at java.lang.String.valueOf(String.java:2849)
        at org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:474)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:335)
        at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode.toString(TcpDiscoveryNode.java:607)
        at java.lang.String.valueOf(String.java:2849)
        at org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:474)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:335)
        at org.apache.ignite.spi.discovery.tcp.messages.TcpDiscoveryStatusCheckMessage.toString(TcpDiscoveryStatusCheckMessage.java:113)
        at java.lang.String.valueOf(String.java:2849)
        at java.lang.StringBuilder.append(StringBuilder.java:128)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.sendMessageAcrossRing(ServerImpl.java:2707)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processStatusCheckMessage(ServerImpl.java:4314)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2270)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:5784)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2161)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[05-Apr-2016 19:12:39][ERROR][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] Runtime error caught during grid runnable execution: IgniteSpiThread [name=tcp-disco-msg-worker-#2%null%]
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOfRange(Arrays.java:2694)
        at java.lang.String.<init>(String.java:203)
        at java.lang.StringBuilder.toString(StringBuilder.java:405)
        at java.net.Inet4Address.numericToTextFormat(Inet4Address.java:374)
        at java.net.Inet4Address.getHostAddress(Inet4Address.java:329)
        at java.net.InetAddress.toString(InetAddress.java:698)
        at java.net.InetSocketAddress$InetSocketAddressHolder.toString(InetSocketAddress.java:107)
        at java.net.InetSocketAddress.toString(InetSocketAddress.java:380)
        at java.lang.String.valueOf(String.java:2849)
        at java.lang.StringBuilder.append(StringBuilder.java:128)
        at java.util.AbstractCollection.toString(AbstractCollection.java:458)
        at java.lang.String.valueOf(String.java:2849)
        at org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:474)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:335)
        at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode.toString(TcpDiscoveryNode.java:607)
        at java.lang.String.valueOf(String.java:2849)
        at org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:474)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:335)
        at org.apache.ignite.spi.discovery.tcp.messages.TcpDiscoveryStatusCheckMessage.toString(TcpDiscoveryStatusCheckMessage.java:113)
        at java.lang.String.valueOf(String.java:2849)
        at java.lang.StringBuilder.append(StringBuilder.java:128)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.sendMessageAcrossRing(ServerImpl.java:2707)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processStatusCheckMessage(ServerImpl.java:4314)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2270)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:5784)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2161)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
Exception in thread "tcp-disco-msg-worker-#2%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOfRange(Arrays.java:2694)
        at java.lang.String.<init>(String.java:203)
        at java.lang.StringBuilder.toString(StringBuilder.java:405)
        at java.net.Inet4Address.numericToTextFormat(Inet4Address.java:374)
        at java.net.Inet4Address.getHostAddress(Inet4Address.java:329)
        at java.net.InetAddress.toString(InetAddress.java:698)
        at java.net.InetSocketAddress$InetSocketAddressHolder.toString(InetSocketAddress.java:107)
        at java.net.InetSocketAddress.toString(InetSocketAddress.java:380)
        at java.lang.String.valueOf(String.java:2849)
        at java.lang.StringBuilder.append(StringBuilder.java:128)
        at java.util.AbstractCollection.toString(AbstractCollection.java:458)
        at java.lang.String.valueOf(String.java:2849)
        at org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:474)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:335)
        at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode.toString(TcpDiscoveryNode.java:607)
        at java.lang.String.valueOf(String.java:2849)
        at org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:474)
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:335)
        at org.apache.ignite.spi.discovery.tcp.messages.TcpDiscoveryStatusCheckMessage.toString(TcpDiscoveryStatusCheckMessage.java:113)
        at java.lang.String.valueOf(String.java:2849)
        at java.lang.StringBuilder.append(StringBuilder.java:128)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.sendMessageAcrossRing(ServerImpl.java:2707)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processStatusCheckMessage(ServerImpl.java:4314)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2270)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:5784)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2161)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
Exception in thread "pub-#46%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "tcp-disco-srvr-#3%null%" [05-Apr-2016 19:12:55][ERROR][tcp-disco-multicast-addr-sender-#5%null%][TcpDiscoveryMulticastIpFinder] Runtime error caught during grid runnable execution: IgniteSpiThread [name=tcp-disco-multicast-addr-sender-#5%null%]
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:08][WARN ][tcp-comm-worker-#1%null%][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=6159]
Exception in thread "pub-#1%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "tcp-disco-multicast-addr-sender-#5%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:16][ERROR][grid-nio-worker-0-#100%null%][TcpCommunicationSpi] Caught unhandled exception in NIO worker thread (restart the node).
java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:16][ERROR][grid-nio-worker-1-#101%null%][TcpCommunicationSpi] Caught unhandled exception in NIO worker thread (restart the node).
java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:18][ERROR][grid-timeout-worker-#97%null%][GridTimeoutProcessor] Error when executing timeout callback: CancelableTask [id=2020e16e351-86c9ade6-6410-416b-9a19-7dd162d71b7b, endTime=1459854857511, period=60000, cancel=false, task=o.a.i.i.IgniteKernal$3@4917dd1a]
java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:18][ERROR][exchange-worker-#119%null%][GridCachePartitionExchangeManager] Runtime error caught during grid runnable execution: GridWorker [name=partition-exchanger, gridName=null, finished=false, isCancelled=false, hashCode=1164808095, interrupted=false, runner=exchange-worker-#119%null%]
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "exchange-worker-#119%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:18][ERROR][grid-timeout-worker-#97%null%][GridTimeoutProcessor] Runtime error caught during grid runnable execution: GridWorker [name=grid-timeout-worker, gridName=null, finished=false, isCancelled=false, hashCode=666288465, interrupted=false, runner=grid-timeout-worker-#97%null%]
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "grid-timeout-worker-#97%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:18][ERROR][nio-acceptor-#99%null%][TcpCommunicationSpi] Runtime error caught during grid runnable execution: GridWorker [name=nio-acceptor, gridName=null, finished=false, isCancelled=false, hashCode=1534248948, interrupted=false, runner=nio-acceptor-#99%null%]
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "nio-acceptor-#99%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:18][ERROR][grid-nio-worker-0-#100%null%][TcpCommunicationSpi] Runtime error caught during grid runnable execution: GridWorker [name=grid-nio-worker-0, gridName=null, finished=false, isCancelled=false, hashCode=900302186, interrupted=false, runner=grid-nio-worker-0-#100%null%]
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "grid-nio-worker-0-#100%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:18][ERROR][grid-nio-worker-1-#101%null%][TcpCommunicationSpi] Runtime error caught during grid runnable execution: GridWorker [name=grid-nio-worker-1, gridName=null, finished=false, isCancelled=false, hashCode=698839997, interrupted=false, runner=grid-nio-worker-1-#101%null%]
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "grid-nio-worker-1-#101%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-04-2016 19:13:22][INFO ][node-stop-thread][GridTcpRestProtocol] Command protocol successfully stopped: TCP binary
[05-Apr-2016 19:13:25][ERROR][node-stop-thread][IgniteKernal] Failed to pre-stop processor: GridProcessorAdapter []
java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:25][ERROR][node-stop-thread][G] Failed to properly stop grid instance due to undeclared exception.
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "tcp-comm-worker-#1%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:48][ERROR][tcp-comm-worker-#1%null%][TcpCommunicationSpi] Runtime error caught during grid runnable execution: IgniteSpiThread [name=tcp-comm-worker-#1%null%]
[05-Apr-2016 19:13:47][ERROR][grid-time-server-reader-#113%null%][GridClockServer] Runtime error caught during grid runnable execution: GridWorker [name=grid-time-server-reader, gridName=null, finished=false, isCancelled=false, hashCode=732206488, interrupted=false, runner=grid-time-server-reader-#113%null%]
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "grid-time-server-reader-#113%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "sys-#77%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "sys-#76%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
[05-Apr-2016 19:13:55][ERROR][node-stop-thread][TcpDiscoverySpi] Failed to stop the node in response to TcpDiscoverySpi's message worker thread abnormal termination.
java.lang.OutOfMemoryError: GC overhead limit exceeded
--------------------------------------------------------------------------------------------------------------
F7753

Re: How to end up the GC overhead problem in the IgniteRDD?

Is there a configuration option I can use to monitor GC behavior in Ignite?
It is curious that an OOM happens on a node with more than 100 GB of RAM.
Alexey Kuznetsov

Re: How to end up the GC overhead problem in the IgniteRDD?

Hi!

Are you sure that your JVM is started with arguments like -Xms50g -Xmx50g?
By default, ignite.sh starts a node with 1 GB of heap.

You can set the $JVM_OPTS variable to your JVM settings, or start the node like this: "ignite.sh -J-Xms50g -J-Xmx50g" (each JVM option needs its own -J prefix).
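
One way to confirm that the heap settings actually reached the JVM, and to get a first look at GC activity, is to query the standard JDK management beans from inside the node's JVM. The sketch below is not part of Ignite: the class name HeapAndGcReport and its method names are hypothetical, and it relies only on java.lang.Runtime and java.lang.management.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not an Ignite API): reports the heap limit and GC totals of the current JVM.
class HeapAndGcReport {
    // Maximum heap the JVM will attempt to use, in megabytes (this reflects -Xmx).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Accumulated GC time across all collectors, in milliseconds.
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MB");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("Total GC time: " + totalGcTimeMs() + " ms");
    }
}
```

If the reported maximum heap is around 1 GB or less on a server node, the -Xms/-Xmx options most likely did not reach the JVM. GC logging can also be switched on with standard JVM options such as -verbose:gc.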

--
Alexey Kuznetsov
GridGain Systems
www.gridgain.com
vkulichenko

Re: How to end up the GC overhead problem in the IgniteRDD?

Hi,

Your node has less than 1 GB of heap memory (see the line below). Please allocate a proper amount of heap to the server nodes, as Alexey suggested.

^-- Heap [used=528MB, free=42.03%, comm=911MB]

In addition, if you're going to have more than 10 GB of data per node, I would recommend using off-heap memory [1]. Otherwise, you will likely have long GC pauses.

[1] https://apacheignite.readme.io/docs/off-heap-memory
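
To illustrate why off-heap storage reduces GC pressure, here is a small JDK-only sketch. It does not use Ignite's off-heap configuration (see [1] for that); it only shows the general idea. A direct ByteBuffer keeps its bytes in native memory outside the Java heap, so the collector has to track only the small buffer object and not the data itself. The class name OffHeapSketch and the 64 MB size are arbitrary choices for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration (not Ignite code): data held outside the garbage-collected heap.
class OffHeapSketch {
    // Allocates `bytes` of native memory; only the ByteBuffer wrapper object lives on the heap.
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocateOffHeap(64 * 1024 * 1024);
        buf.putLong(0, 42L); // write a value at byte offset 0
        System.out.println("direct=" + buf.isDirect()
            + ", capacity=" + buf.capacity()
            + ", value=" + buf.getLong(0));
    }
}
```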

-Val
F7753

Re: How to end up the GC overhead problem in the IgniteRDD?

Thanks to Alexey and vkulichenko.
I modified ignite.sh and reconfigured the JVM parameters, and this worked.
May I ask another question? I now get this error:
"Failed to execute query. Add module 'ignite-indexing' to the classpath of all Ignite nodes"
I compiled the code with Maven with all dependencies added, and it runs successfully. Why do I still need to add 'ignite-indexing', and how do I do that?
vkulichenko

Re: How to end up the GC overhead problem in the IgniteRDD?

'ignite-indexing' is one of the optional modules; it is required to enable SQL queries and indexing. In a Maven-based project, simply add it as a dependency. On server nodes, move the 'libs/optional/ignite-indexing' folder with all its JARs into the 'libs' folder, and it will be added to the classpath. Everything in the 'optional' folder is excluded from the classpath by default.
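
A quick way to check whether a module made it onto a node's classpath is to try loading one of its classes. The sketch below uses only the JDK. The helper name ClasspathCheck is hypothetical, and the class name org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing is an assumption about what the 'ignite-indexing' JAR contains; substitute a class you know to be in the module if it differs.

```java
// Hypothetical diagnostic helper (not an Ignite API).
class ClasspathCheck {
    // Returns true if the named class can be found by this class's loader, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class name from the 'ignite-indexing' module; adjust if needed.
        String indexingClass = "org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing";
        System.out.println("ignite-indexing on classpath: " + isOnClasspath(indexingClass));
    }
}
```

Running this on each server node with the same classpath as the node shows quickly which nodes still lack the module.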

-Val