2.7.6 to 2.8.1 migration issue

Andrey Davydov

2.7.6 to 2.8.1 migration issue

 

Hello,

 

We are testing a migration from 2.7.6 to 2.8.1 in our DEV environment and hit a problem: the new version doesn’t start over the old data.

 

Some history:

Initially we had the following base cache configuration on 2.7.6:

 

    <bean id="cache-template" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration">
        <property name="atomicityMode" value="TRANSACTIONAL"/>
        <property name="writeSynchronizationMode" value="FULL_SYNC"/>
        <property name="rebalanceMode" value="ASYNC"/>
        <property name="maxConcurrentAsyncOperations" value="500"/>
        <property name="cacheMode" value="PARTITIONED"/>
        <property name="backups" value="2"/>
        <property name="dataRegionName" value="persistDataRegion"/>
        <property name="storeKeepBinary" value="true"/>
        <!-- Group the cache belongs to. -->
        <property name="groupName" value="applicationGroup"/>
        <property name="encryptionEnabled" value="false"/>

        <property name="affinity">
            <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
                <property name="excludeNeighbors" value="true"/>
                <property name="partitions" value="1024"/>
            </bean>
        </property>
    </bean>

 

We then created some caches using it:

 

                <bean parent="cache-template" class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="cache1"/>
                </bean>
                <bean parent="cache-template" class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="cache2"/>
                </bean>
                <bean parent="cache-template" class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="indexedTypes">
                        <list>
                            <value>java.lang.String</value>
                            <value>some.our.WellAnnotatedClass</value>
                        </list>
                    </property>
                </bean>

 

 

Some weeks later we updated the base cache configuration to use a topology validator:

 

 

    <bean id="base-cache-template" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration">
        <property name="topologyValidator" >
            <bean class="our.custom.ValidatorClass">
                <property name="minimalValidTopologyNodes" value="2"/>
            </bean>
        </property>
        <property name="sqlIndexMaxInlineSize" value="256"/>
    </bean>
    <bean id="cache-template" abstract="true" parent="base-cache-template" class="org.apache.ignite.configuration.CacheConfiguration">
        <property name="atomicityMode" value="TRANSACTIONAL"/>
        <property name="writeSynchronizationMode" value="FULL_SYNC"/>
        <property name="rebalanceMode" value="ASYNC"/>
        <property name="maxConcurrentAsyncOperations" value="500"/>
        <property name="cacheMode" value="PARTITIONED"/>
        <property name="backups" value="2"/>
        <property name="dataRegionName" value="persistDataRegion"/>
        <property name="storeKeepBinary" value="true"/>
        <!-- Group the cache belongs to. -->
        <property name="groupName" value="applicationGroup"/>
        <property name="encryptionEnabled" value="false"/>

        <property name="affinity">
            <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
                <property name="excludeNeighbors" value="true"/>
                <property name="partitions" value="1024"/>
            </bean>
        </property>
    </bean>

 

 

We ran the new configuration over the old data and everything started OK.

After that we updated our system and created some more caches:

 

 

                <bean parent="cache-template" class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="cache4"/>
                </bean>
                <bean parent="cache-template" class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="cache5"/>
                </bean>

 

 

Again we started the new version of the code and configuration over the old data directory, and everything started OK.

Now we have changed the Ignite version from 2.7.6 to 2.8.1 and tried to start the system over the old data. One node starts OK and the two others fail with the following error:

 

2020-06-18 15:37:12,598 [main] WARN   o.a.i.i.p.c.GridLocalConfigManager - Static configuration for the following caches will be ignored because a persistent cache with the same name already exist (see https://apacheignite.readme.io/docs/cache-configuration for more information): [...]

2020-06-18 15:37:12,602 [main] INFO   o.a.i.i.p.c.p.f.FilePageStoreManager - Cleanup cache stores [total=1, left=0, cleanFiles=false]

2020-06-18 15:37:12,602 [main] ERROR  o.a.i.i.IgniteKernal%dev_app - Exception during start processors, node will be stopped and close connections

org.apache.ignite.IgniteCheckedException: Topology validator mismatch for caches related to the same group [groupName=applicationGroup, existingCache=cache1, existingTopologyValidator=null, startingCache=cache4, startingTopologyValidator= our.custom.ValidatorClass]

      at org.apache.ignite.internal.processors.cache.GridCacheUtils.validateCacheGroupsAttributesMismatch(GridCacheUtils.java:1032) ~[ignite-core-2.8.1.jar:2.8.1]

      at org.apache.ignite.internal.processors.cache.ClusterCachesInfo.validateCacheGroupConfiguration(ClusterCachesInfo.java:2305) ~[ignite-core-2.8.1.jar:2.8.1]

      at org.apache.ignite.internal.processors.cache.ClusterCachesInfo.onStart(ClusterCachesInfo.java:286) ~[ignite-core-2.8.1.jar:2.8.1]
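
The exception reports that two caches in the same group disagree on their topology validator. As a rough plain-Java model of that kind of check (this is not Ignite's implementation; `GroupValidatorCheck` and `firstMismatch` are hypothetical names, and the assumption that cache1 was persisted before the validator was introduced is an inference from the history above), using the values from the error message:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical model of a per-group consistency check; not Ignite code. */
final class GroupValidatorCheck {
    /**
     * @param validatorByCache cache name -> topology validator class name
     *                         (null if none), for caches sharing one group.
     * @return a description of the first mismatch, or null if all caches agree.
     */
    static String firstMismatch(Map<String, String> validatorByCache) {
        String firstCache = null;
        String firstValidator = null;
        for (Map.Entry<String, String> e : validatorByCache.entrySet()) {
            if (firstCache == null) {
                // The first cache seen defines what the group expects.
                firstCache = e.getKey();
                firstValidator = e.getValue();
            } else if (!Objects.equals(firstValidator, e.getValue())) {
                return "existingCache=" + firstCache
                    + ", existingTopologyValidator=" + firstValidator
                    + ", startingCache=" + e.getKey()
                    + ", startingTopologyValidator=" + e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> group = new LinkedHashMap<>();
        group.put("cache1", null);                        // assumed: persisted without a validator
        group.put("cache4", "our.custom.ValidatorClass"); // created after the validator was added
        System.out.println(firstMismatch(group));
    }
}
```

In this model the check passes only when every cache in the group carries the same validator value, which is why a single cache with a different stored configuration is enough to stop the node.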

 

In our development environment only the first node performs write operations on cache1 (we don’t use a load balancer in dev, so the front end sends requests to the first node), and only this node starts without problems. The two other nodes crash.

We rolled back to 2.7.6 and all three nodes started OK.

 

Is there a way to start 2.8.1 over the old data?

 

 

Andrey.

 

ilya.kasnacheev

Re: 2.7.6 to 2.8.1 migration issue

Hello!

I think we restricted configuration of caches that share the same cache group. Previously, you could have e.g. caches with different atomicity mode in the same group, now you can't.

If this is the case, you won't be able to upgrade without migration. You need to copy data to a new cache in a new group, then delete the old conflicting cache.
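
The copy step could be sketched as below. This is a plain-Java stand-in, not Ignite code: `CacheCopySketch` and `copyAll` are hypothetical names, and ordinary maps play the role of the caches. On a real cluster the source would more likely be the cursor returned by `IgniteCache.query(new ScanQuery<>())` (adapted from `javax.cache.Cache.Entry`), the sink an `IgniteDataStreamer.addData(key, value)` call, and the final step `Ignite.destroyCache(name)` once the copy has been verified.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical helper modelling the copy step; not part of Ignite. */
final class CacheCopySketch {
    /** Pushes every entry from source into sink and returns the number copied. */
    static <K, V> long copyAll(Iterable<Map.Entry<K, V>> source, BiConsumer<K, V> sink) {
        long copied = 0;
        for (Map.Entry<K, V> e : source) {
            sink.accept(e.getKey(), e.getValue());
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        // Stands in for cache1 in applicationGroup.
        Map<String, String> oldCache = new LinkedHashMap<>();
        oldCache.put("k1", "v1");
        oldCache.put("k2", "v2");

        // Stands in for a cache created in a new group.
        Map<String, String> newCache = new LinkedHashMap<>();

        long n = copyAll(oldCache.entrySet(), newCache::put);
        System.out.println("copied " + n + " entries");
        // Only after verifying the copy would the old cache be destroyed.
    }
}
```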

Regards,
--
Ilya Kasnacheev



Andrey Davydov

RE: 2.7.6 to 2.8.1 migration issue

I did some investigation and found that the logic in GridCacheProcessor and ClusterCachesInfo has changed.

 

The old version had logic for configuration priority (see 2.7.6 GridCacheProcessor.java, line 901):

 

        if (CU.isPersistenceEnabled(ctx.config()) && ctx.cache().context().pageStore() != null) {
            Map<String, StoredCacheData> storedCaches = ctx.cache().context().pageStore().readCacheConfigurations();

            if (!F.isEmpty(storedCaches)) {
                for (StoredCacheData storedCacheData : storedCaches.values()) {
                    String cacheName = storedCacheData.config().getName();

                    //Ignore stored caches if it already added by static config(static config has higher priority).
                    if (!caches.containsKey(cacheName))
                        addStoredCache(caches, storedCacheData, cacheName, cacheType(cacheName), true, false);
                    else {
                        CacheConfiguration cfg = caches.get(cacheName).cacheData().config();
                        CacheConfiguration cfgFromStore = storedCacheData.config();

                        validateCacheConfigurationOnRestore(cfg, cfgFromStore);
                    }
                }
            }
        }

 

In the new version the configuration check is strict, and any modification of the cache configuration requires reloading the data.
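
One possible reading of the 2.7.6 snippet together with the 2.8.1 warning in the log ("Static configuration for the following caches will be ignored because a persistent cache with the same name already exist") is that the precedence between static and stored configuration flipped. The plain-Java model below illustrates that reading only; `ConfigPrecedenceSketch` and `effective` are hypothetical names, each cache's configuration is reduced to its validator class name, and the stored values are assumed from the history in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of configuration precedence on restart; not Ignite code. */
final class ConfigPrecedenceSketch {
    /**
     * @param stored           validator per cache as persisted on disk (null if none)
     * @param fromStaticConfig validator per cache from the current XML
     * @param staticWins       true models "static config has higher priority"
     */
    static Map<String, String> effective(Map<String, String> stored,
                                         Map<String, String> fromStaticConfig,
                                         boolean staticWins) {
        // Start with the lower-priority source, then overwrite with the higher one.
        Map<String, String> result =
            new LinkedHashMap<>(staticWins ? stored : fromStaticConfig);
        result.putAll(staticWins ? fromStaticConfig : stored);
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("cache1", null);                        // assumed: persisted before the validator
        stored.put("cache4", "our.custom.ValidatorClass");

        Map<String, String> xml = new LinkedHashMap<>();
        xml.put("cache1", "our.custom.ValidatorClass");
        xml.put("cache4", "our.custom.ValidatorClass");

        System.out.println("static wins: " + effective(stored, xml, true));
        System.out.println("stored wins: " + effective(stored, xml, false));
    }
}
```

When the static configuration wins, both caches end up with the same validator. When the stored configuration wins, cache1 keeps its missing validator while cache4 has one, which is the combination the exception complains about.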

 

As I found in my logs, the first node starts because it crashed last month and was recreated from scratch. So, to start on 2.8.1, I would have to run 2.7.6, restart the nodes one by one over an empty work directory, and wait for rebalancing each time.

 

So I see 2 problems:

 

  1. The migration process described above does not seem safe for production. Alternatively, I could implement some migration logic, i.e. create a new cache group, copy the data and so on, run this logic on 2.7.6 and then start 2.8.1. This looks safe but is more difficult. (Do you have any document on best practices for such tasks?)
  2. The new config validation blocks ANY configuration update: in my example we changed the topology validator, not the atomicity mode. So if I want to change the topology validator or some other parameter on 2.8.1 in the future, I must repeat the difficult migration process.

 

In my opinion, there are cache config parameters that must be the same across the cluster, and for existing caches Ignite must take them from the cache metadata in storage. There are also parameters that affect only the local node and may be taken from the current config. In our case we set up the new validator for all caches in the group. You could collect the mutable properties of the cache config in a separate configuration.

 

Andrey.

 
