Activating Cluster taking too long

iostream

Hi!

I am experimenting with Ignite v2.1 with the persistent store enabled.

1. Created 8 caches and pumped data into them.
2. Restarted the ignite cluster.
3. Waited for all server nodes to join the cluster.
4. Called Ignite.active(true).

I waited over an hour for the cluster to become ACTIVE, but the activation failed. Is there any known issue with activating a cluster that has a large amount of data on disk?
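For anyone reproducing step 4, a bounded wait may be easier to diagnose than an open-ended one. The following is only a minimal sketch: `BoundedActivation` and `activateWithin` are hypothetical names, the `Runnable` stands in for the `Ignite.active(true)` call from the steps above, and it assumes that call blocks until activation completes.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the Ignite API.
final class BoundedActivation {

    /**
     * Runs the given activation call on a separate daemon thread and waits
     * for it at most the given timeout.
     *
     * @return true if the call returned in time, false on timeout or interrupt.
     */
    static boolean activateWithin(Runnable activation, long timeout, TimeUnit unit) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "cluster-activation");
            t.setDaemon(true); // do not keep the JVM alive for a stuck call
            return t;
        });
        try {
            Future<?> done = pool.submit(activation);
            done.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // The activation call itself threw; surface the original cause.
            throw new IllegalStateException("Activation failed", e.getCause());
        } finally {
            pool.shutdownNow(); // interrupts the worker if it is still running
        }
    }
}
```

With an `Ignite` instance named `ignite` (an assumed variable), the call would look like `BoundedActivation.activateWithin(() -> ignite.active(true), 10, TimeUnit.MINUTES)`. A `false` result is a cue to collect logs and thread dumps from all nodes instead of waiting longer.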

Number of clients - 8
Number of ignite servers - 8
Number of caches - 8
Disk usage of the persistent store per server node - around 50 GB

Cache configuration -

cacheConfig.setAtomicityMode(TRANSACTIONAL);
cacheConfig.setCacheMode(PARTITIONED);
cacheConfig.setBackups(1);
cacheConfig.setCopyOnRead(TRUE);
cacheConfig.setPartitionLossPolicy(IGNORE);
cacheConfig.setQueryParallelism(2);
cacheConfig.setReadFromBackup(TRUE);
cacheConfig.setRebalanceBatchSize(524288);
cacheConfig.setRebalanceThrottle(100);
cacheConfig.setRebalanceTimeout(10000);
cacheConfig.setIndexedTypes(A.class, B.class);
cacheConfig.setOnheapCacheEnabled(FALSE);

Client and Server Configuration -

<?xml version="1.0" encoding="UTF-8"?>
<beans
    xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
       http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans.xsd">
   
    <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="persistentStoreConfiguration">
            <bean class="org.apache.ignite.configuration.PersistentStoreConfiguration" />
        </property>
        <property name="binaryConfiguration">
            <bean class="org.apache.ignite.configuration.BinaryConfiguration">
                <property name="compactFooter" value="false" />
            </bean>
        </property>
        <property name="stripedPoolSize" value="24"/>
        <property name="systemThreadPoolSize" value="24"/>
        <property name="clientFailureDetectionTimeout" value="30000"/>
        <property name="failureDetectionTimeout" value="30000"/>
        <property name="communicationSpi">
            <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                <property name="messageQueueLimit" value="1024"/>
                <property name="slowClientQueueLimit" value="512"/>
                <property name="idleConnectionTimeout" value="3600000"/>
            </bean>
        </property>
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="socketTimeout" value="30000"/>
                <property name="networkTimeout" value="30000"/>
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.zk.TcpDiscoveryZookeeperIpFinder">
                        <property name="zkConnectionString" value="xyz"/>
                        <property name="basePath" value="xyz" />
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>

What I found in the logs -

  ^-- CPU [cur=0.1%, avg=0.98%, GC=0%]
    ^-- PageMemory [pages=2268513]
    ^-- Heap [used=409MB, free=59.99%, comm=1023MB]
    ^-- Non heap [used=69MB, free=95.43%, comm=71MB]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=7, qSize=0]
    ^-- Outbound messages queue [size=0]
[07:32:26,656][WARNING][exchange-worker-#50%null%][diagnostic] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], node=dcb07329-c5d6-404c-b4b1-3c0225e99a62]. Dumping pending objects that might be the cause:
[07:32:35,344][INFO][tcp-disco-ip-finder-cleaner-#4%null%][TcpDiscoveryZookeeperIpFinder] ZooKeeper IP Finder resolved addresses: [/x:47500, /x:47500, /x:47500, /x:47500, /x:47500, /x:47500, /x:47500, /x:47500, /x:47500]
[07:32:36,657][WARNING][exchange-worker-#50%null%][diagnostic] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], node=dcb07329-c5d6-404c-b4b1-3c0225e99a62]. Dumping pending objects that might be the cause:
[07:32:46,658][WARNING][exchange-worker-#50%null%][diagnostic] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], node=dcb07329-c5d6-404c-b4b1-3c0225e99a62]. Dumping pending objects that might be the cause:
[07:32:56,659][WARNING][exchange-worker-#50%null%][diagnostic] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], node=dcb07329-c5d6-404c-b4b1-3c0225e99a62]. Dumping pending objects that might be the cause:
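When the same warning repeats every 10 seconds as above, it can help to summarize which topology version and node the exchange is stuck on before sharing logs. This is a rough sketch only: `ExchangeWarnings` and `countByTopology` are hypothetical names, and the pattern is written against the exact message text shown above, so it may not match other Ignite versions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log-triage helper, not part of Ignite.
final class ExchangeWarnings {

    // Matches the "Failed to wait for partition map exchange" diagnostic line
    // and captures topVer, minorTopVer and the node id.
    private static final Pattern PME_WARNING = Pattern.compile(
        "Failed to wait for partition map exchange "
            + "\\[topVer=AffinityTopologyVersion \\[topVer=(\\d+), minorTopVer=(\\d+)\\], "
            + "node=([0-9a-fA-F-]+)\\]");

    /**
     * Counts the warnings per (topology version, node) pair, keeping the
     * order of first occurrence. Lines that do not match are ignored.
     */
    static Map<String, Integer> countByTopology(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = PME_WARNING.matcher(line);
            if (m.find()) {
                String key = "topVer=" + m.group(1) + "." + m.group(2) + " node=" + m.group(3);
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

A single key with a steadily growing count suggests that one exchange never completes, as opposed to a series of different exchanges each timing out.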




ezhuravlev

Re: Activating Cluster taking too long

Hi,

Please share the full logs from all nodes so I can help investigate the problem.

Evgenii