Is there any way to force recover the cluster - copying running cluster datastore

Kamlesh Joshi

Hi Team,

               

We are trying to start another Ignite cluster from a copy of a running cluster's datastore (the source cluster's datastore is being modified while the copy is taken). When we start a server node on the copied datastore, it fails with the error below. The cluster configuration is included for reference:

 

pageSize=#{4 * 1024}
walMode=LOG_ONLY
walFlushFrequency=60000
rebalanceThreadPoolSize=8
rebalanceThrottle=100
rebalanceBatchSize=#{32 * 1024 * 1024}
storagePath=/datastore/datastore
walPath=/datastore1/wal
walArchivePath=/datastore1/archive
metadataWorkDir=/datastore/metadataWorkDir

 

 

[2019-06-05T12:21:52,943][INFO ][main][GridCacheDatabaseSharedManager] Read checkpoint status [startMarker=null, endMarker=null]
[2019-06-05T12:21:52,967][INFO ][main][PageMemoryImpl] Started page memory [memoryAllocated=128.0 MiB, pages=31744, tableSize=2.5 MiB, checkpointBuffer=100.0 MiB]
[2019-06-05T12:21:52,968][INFO ][main][GridCacheDatabaseSharedManager] Checking memory state [lastValidPos=FileWALPointer [idx=0, fileOff=0, len=0], lastMarked=FileWALPointer [idx=0, fileOff=0, len=0], lastCheckpointId=00000000-0000-0000-0000-000000000000]
[2019-06-05T12:21:52,973][ERROR][main][IgniteKernal%EDIFCustomer_DR] Exception during start processors, node will be stopped and close connections
org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []
        at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1742) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:980) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1069) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:955) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:854) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:724) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:693) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.Ignition.start(Ignition.java:352) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) [ignite-core-2.6.0.jar:2.6.0]
Caused by: org.apache.ignite.IgniteCheckedException: WAL history is too short [descs=[org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileDescriptor@4d6c2], start=FileWALPointer [idx=0, fileOff=0, len=0]]
        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.init(FileWriteAheadLogManager.java:3009) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2960) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2896) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.replay(FileWriteAheadLogManager.java:799) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1968) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:574) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.start0(GridCacheDatabaseSharedManager.java:525) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:61) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:700) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1739) ~[ignite-core-2.6.0.jar:2.6.0]
        ... 11 more
[2019-06-05T12:21:52,978][ERROR][main][IgniteKernal%EDIFCustomer_DR] Got exception while starting (will rollback startup routine).
org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []
        at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1742) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:980) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1069) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:955) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:854) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:724) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:693) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.Ignition.start(Ignition.java:352) [ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) [ignite-core-2.6.0.jar:2.6.0]
Caused by: org.apache.ignite.IgniteCheckedException: WAL history is too short [descs=[org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileDescriptor@4d6c2], start=FileWALPointer [idx=0, fileOff=0, len=0]]
        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.init(FileWriteAheadLogManager.java:3009) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2960) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2896) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.replay(FileWriteAheadLogManager.java:799) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1968) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:574) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.start0(GridCacheDatabaseSharedManager.java:525) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:61) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:700) ~[ignite-core-2.6.0.jar:2.6.0]
        at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1739) ~[ignite-core-2.6.0.jar:2.6.0]

 

 

So, is there any way to force this cluster to start from the copied datastore? The same situation could also arise if the WAL disk fails. How can we at least start the cluster with minimal data loss?

Any help would be highly appreciated!

 

Thanks and Regards,

Kamlesh Joshi

 


"Confidentiality Warning: This message and any attachments are intended only for the use of the intended recipient(s), are confidential and may be privileged. If you are not the intended recipient, you are hereby notified that any review, re-transmission, conversion to hard copy, copying, circulation or other use of this message and any attachments is strictly prohibited. If you are not the intended recipient, please notify the sender immediately by return email and delete this message and any attachments from your system.

Virus Warning: Although the company has taken reasonable precautions to ensure no viruses are present in this email. The company cannot accept responsibility for any loss or damage arising from the use of this email or attachment."

akurbanov

Hello,

There is no guarantee that data storage copied from a working cluster under
any kind of load will be usable. However, if the cluster is idle and no
checkpoint is running at that moment, the data can be copied to another
folder and serve as persistent storage for a new cluster.

The one step you missed, which leads to the "WAL history is too short"
exception, is removing the checkpoint markers:

Ignite maintains a relation between the WAL, WAL archive and cp folders: if
you remove the WAL and the WAL archive, you also have to remove the third
folder, "cp", because it contains checkpoint markers that reference the WAL
(which doesn't exist in your case).
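
The relation described above can be checked on disk before a node is
started. Below is a minimal Python sketch, not an Ignite tool: the function
name is hypothetical, the three directories must be passed in explicitly, and
the check only counts files without parsing any Ignite file format.

```python
from pathlib import Path


def has_dangling_checkpoint_markers(wal_dir, wal_archive_dir, cp_dir):
    """Return True if cp_dir holds files while both WAL folders hold none.

    Hypothetical helper: a filesystem-level heuristic only. A missing
    directory is treated the same as an empty one.
    """
    def file_count(directory):
        path = Path(directory)
        if not path.is_dir():
            return 0
        # Count regular files at any depth (WAL segments may sit in a
        # per-node subfolder).
        return sum(1 for entry in path.rglob("*") if entry.is_file())

    wal_files = file_count(wal_dir) + file_count(wal_archive_dir)
    return file_count(cp_dir) > 0 and wal_files == 0
```

A True result suggests the situation described in this reply: checkpoint
markers that point at WAL files which are no longer there.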

The folder located at $IGNITE_HOME/work/db/$NODE_UUID/cp has to be cleaned
up. After this, the node should start normally.
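
As a rough illustration of that clean-up, here is a minimal Python sketch that
moves the files out of the cp folder into a backup directory instead of
deleting them, so the step can be undone. The function name and the backup
directory are assumptions made for this example; it is not an Ignite utility,
and it should only be run while the node is stopped.

```python
import shutil
from pathlib import Path


def move_checkpoint_markers(cp_dir, backup_dir):
    """Move every regular file from cp_dir into backup_dir.

    Hypothetical helper: returns the names of the moved files in sorted
    order. Subdirectories of cp_dir, if any, are left untouched.
    """
    cp_path = Path(cp_dir)
    backup_path = Path(backup_dir)
    backup_path.mkdir(parents=True, exist_ok=True)

    moved = []
    for entry in sorted(cp_path.iterdir()):
        if entry.is_file():
            shutil.move(str(entry), str(backup_path / entry.name))
            moved.append(entry.name)
    return moved
```

The cp path has to be adjusted to the actual layout; with a custom
storagePath the folder may not be under $IGNITE_HOME/work at all.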



dmagda

In reply to this post by Kamlesh Joshi
I would discourage you from doing this if data consistency is important to you. What you see on the disk of one cluster node might be inconsistent with the state of the whole cluster and with the latest updates held in memory. Snapshots and backups can solve this task; search for the solution provided by GridGain.

-
Denis



Kamlesh Joshi

Thanks for the update, Denis.

If one of the WAL disks fails, is there any way to force the cluster to start or recover?

 

Thanks and Regards,

Kamlesh Joshi

 


Kamlesh Joshi

Hi Denis,

 

Only the WAL disk is corrupted; the datastore is intact. Is there any way we can restore the cluster? Some data loss is acceptable. Any suggestions?

 

Thanks and Regards,

Kamlesh Joshi

 

From: Kamlesh Joshi
Sent: Thursday, June 6, 2019 7:52 PM
To: [hidden email]
Subject: RE: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

 

Thanks for the update Denis.

 

If one of the WAL disks fails, is there any way to forcibly start or recover the cluster?

 

Thanks and Regards,

Kamlesh Joshi

 

From: Denis Magda <[hidden email]>
Sent: Thursday, June 6, 2019 4:44 PM
To: [hidden email]
Subject: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

 

I would discourage you from doing this if data consistency is important to you. What you see on the disk of one cluster node might be inconsistent with the state of the whole cluster and with the latest updates still held in memory. Snapshots and backups can solve your task. Search for the solution provided by GridGain.


-

Denis

 

 


ilya.kasnacheev

Re: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

Hello!

If your node did not crash during a checkpoint, completely removing the WAL files and WAL markers should set you back to some stable state.

If it did crash during a checkpoint, you can't start with a corrupted WAL.

Regards,
--
Ilya Kasnacheev
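
As a rough illustration of the advice above, here is a minimal Python sketch that sets WAL-related directories aside instead of deleting them, so the step can be undone. It is not an Ignite tool: `set_aside` and the backup location are hypothetical names, the node is assumed to be stopped, and the thread does not confirm which directory holds the "WAL markers", so the caller has to supply the directories.

```python
import shutil
from pathlib import Path


def set_aside(dirs, backup_root):
    """Move each existing directory under backup_root and recreate it empty.

    Hypothetical helper, not part of Ignite. Run it only while the node
    is stopped. Returns the (original, backup) path pairs that were moved.
    """
    backup_root = Path(backup_root)
    moved = []
    for index, d in enumerate(dirs):
        src = Path(d)
        if not src.is_dir():
            continue  # nothing to set aside for this path
        # Prefix with the position so equal directory names do not collide.
        dst = backup_root / f"{index}-{src.name}"
        if dst.exists():
            raise FileExistsError(f"backup target already exists: {dst}")
        backup_root.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        src.mkdir(parents=True)  # leave an empty directory in its place
        moved.append((src, dst))
    return moved
```

With the configuration quoted in this thread, a call might look like `set_aside(["/datastore1/wal", "/datastore1/archive"], "/datastore1/wal-backup")`, where the backup path is an assumption. Moving rather than deleting keeps the old files available if the node still refuses to start.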



Kamlesh Joshi

RE: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

Hi Ilya,

 

By WAL markers, do you mean the cp folder inside the datastore, or something else?

 

Thanks and Regards,

Kamlesh Joshi

 

From: Ilya Kasnacheev <[hidden email]>
Sent: Friday, June 14, 2019 6:59 PM
To: [hidden email]
Subject: Re: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

 

The e-mail below is from an external source. Please do not open attachments or click links from an unknown or suspicious origin.

Hello!

 

If your node did not crash during a checkpoint, complerely removing WAL files and WAL markers should set you back to some stable state.

 

If it did crash during a checkpoint, you can't start with corrupted WAL.

 

Regards,

--

Ilya Kasnacheev

 

 

ср, 12 июн. 2019 г. в 17:37, Kamlesh Joshi <[hidden email]>:

Hi Denis,

 

Only WAL disk is corrupted but datastore is intact by any way can we restore the cluster ? some data loss is fine. Any suggestion on this?

 

Thanks and Regards,

Kamlesh Joshi

 

From: Kamlesh Joshi
Sent: Thursday, June 6, 2019 7:52 PM
To: [hidden email]
Subject: RE: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

 

Thanks for the update Denis.

 

If one of the WAL disk gets failed, is there any way to start or recover the cluster forcefully ?

 

Thanks and Regards,

Kamlesh Joshi

 

From: Denis Magda <[hidden email]>
Sent: Thursday, June 6, 2019 4:44 PM
To: [hidden email]
Subject: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

 

The e-mail below is from an external source. Please do not open attachments or click links from an unknown or suspicious origin.

I would discourage you from doing this if data consistency is prominent for you. What you see on the disk of one cluster node might be inconsistent with the whole cluster state and the actual/last updates in memory. Snapshots and backups can solve your task. Google for a solution provided by GridGain.


-

Denis

 

 

On Wed, Jun 5, 2019 at 8:27 AM Kamlesh Joshi <[hidden email]> wrote:

Hi Team,

               

                We are trying to start another Ignite cluster by taking a copy of the running cluster’s datastore (source cluster’s datastore is getting modified in parallel). So, when we try to start the server node with copied datastore, it gives error as below. Also, giving cluster configuration for reference:

 

pageSize=#{4 * 1024}

walMode=LOG_ONLY

walFlushFrequency=60000

rebalanceThreadPoolSize=8

rebalanceThrottle=100

rebalanceBatchSize=#{32 * 1024 * 1024}

storagePath=/datastore/datastore

walPath=/datastore1/wal

walArchivePath=/datastore1/archive

metadataWorkDir=/datastore/metadataWorkDir

 

 

[2019-06-05T12:21:52,943][INFO ][main][GridCacheDatabaseSharedManager] Read checkpoint status [startMarker=null, endMarker=null]

[2019-06-05T12:21:52,967][INFO ][main][PageMemoryImpl] Started page memory [memoryAllocated=128.0 MiB, pages=31744, tableSize=2.5 MiB, checkpointBuffer=100.0 MiB]

[2019-06-05T12:21:52,968][INFO ][main][GridCacheDatabaseSharedManager] Checking memory state [lastValidPos=FileWALPointer [idx=0, fileOff=0, len=0], lastMarked=FileWALPointer [idx=0, fileOff=0, len=0], lastCheckpointId=00000000-0000-0000-0000-000000000000]

[2019-06-05T12:21:52,973][ERROR][main][IgniteKernal%EDIFCustomer_DR] Exception during start processors, node will be stopped and close connections

org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []

        at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1742) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:980) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1069) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:955) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:854) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:724) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:693) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.Ignition.start(Ignition.java:352) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) [ignite-core-2.6.0.jar:2.6.0]

Caused by: org.apache.ignite.IgniteCheckedException: WAL history is too short [descs=[org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileDescriptor@4d6c2], start=FileWALPointer [idx=0, fileOff=0, len=0]]

        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.init(FileWriteAheadLogManager.java:3009) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2960) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2896) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.replay(FileWriteAheadLogManager.java:799) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1968) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:574) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.start0(GridCacheDatabaseSharedManager.java:525) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:61) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:700) ~[ignite-core-2.6.0.jar:2.6.0]

       at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1739) ~[ignite-core-2.6.0.jar:2.6.0]

        ... 11 more

[2019-06-05T12:21:52,978][ERROR][main][IgniteKernal%EDIFCustomer_DR] Got exception while starting (will rollback startup routine).

org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []

        at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1742) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:980) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1069) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:955) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:854) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:724) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:693) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.Ignition.start(Ignition.java:352) [ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) [ignite-core-2.6.0.jar:2.6.0]

Caused by: org.apache.ignite.IgniteCheckedException: WAL history is too short [descs=[org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileDescriptor@4d6c2], start=FileWALPointer [idx=0, fileOff=0, len=0]]

        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.init(FileWriteAheadLogManager.java:3009) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2960) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2896) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.replay(FileWriteAheadLogManager.java:799) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1968) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:574) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.start0(GridCacheDatabaseSharedManager.java:525) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:61) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:700) ~[ignite-core-2.6.0.jar:2.6.0]

        at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1739) ~[ignite-core-2.6.0.jar:2.6.0]

 

 

So, is there any way to forcefully start this cluster with the copied datastore? The same situation may also arise if the WAL disk fails. How can we at least start the cluster with minimal data loss?

Any help would be highly appreciated!

 

Thanks and Regards,

Kamlesh Joshi

 


"Confidentiality Warning: This message and any attachments are intended only for the use of the intended recipient(s), are confidential and may be privileged. If you are not the intended recipient, you are hereby notified that any review, re-transmission, conversion to hard copy, copying, circulation or other use of this message and any attachments is strictly prohibited. If you are not the intended recipient, please notify the sender immediately by return email and delete this message and any attachments from your system.

Virus Warning: Although the company has taken reasonable precautions to ensure no viruses are present in this email. The company cannot accept responsibility for any loss or damage arising from the use of this email or attachment."


"Confidentiality Warning: This message and any attachments are intended only for the use of the intended recipient(s), are confidential and may be privileged. If you are not the intended recipient, you are hereby notified that any review, re-transmission, conversion to hard copy, copying, circulation or other use of this message and any attachments is strictly prohibited. If you are not the intended recipient, please notify the sender immediately by return email and delete this message and any attachments from your system.

Virus Warning: Although the company has taken reasonable precautions to ensure no viruses are present in this email. The company cannot accept responsibility for any loss or damage arising from the use of this email or attachment."


"Confidentiality Warning: This message and any attachments are intended only for the use of the intended recipient(s), are confidential and may be privileged. If you are not the intended recipient, you are hereby notified that any review, re-transmission, conversion to hard copy, copying, circulation or other use of this message and any attachments is strictly prohibited. If you are not the intended recipient, please notify the sender immediately by return email and delete this message and any attachments from your system.

Virus Warning: Although the company has taken reasonable precautions to ensure no viruses are present in this email. The company cannot accept responsibility for any loss or damage arising from the use of this email or attachment."

ilya.kasnacheev

Re: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

Hello!

Yes, I believe that's the correct dir. It contains checkpoint markers which you should remove.
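
A minimal sketch of that cleanup in Python, not an official Ignite tool. It assumes the node is stopped, that the checkpoint markers sit in a cp folder under the node's storage directory (as discussed in this thread), and that the directory arguments point at this node's own folders. The function name set_aside_wal_state and the backup directory are hypothetical. Files are moved aside rather than deleted, so the step can be undone.

```python
import shutil
from pathlib import Path


def set_aside_wal_state(node_store_dir, wal_dir, wal_archive_dir, backup_dir):
    """Move checkpoint markers, WAL segments and archived segments into backup_dir.

    Returns the list of destination paths that were created.
    All directory arguments are assumptions supplied by the caller.
    """
    backup = Path(backup_dir)
    moved = []
    sources = {
        "cp": Path(node_store_dir) / "cp",  # checkpoint markers (assumed location)
        "wal": Path(wal_dir),               # active WAL segments
        "archive": Path(wal_archive_dir),   # archived WAL segments
    }
    for label, src in sources.items():
        if not src.is_dir():
            continue  # e.g. the WAL disk is gone entirely
        dest = backup / label
        dest.mkdir(parents=True, exist_ok=True)
        for entry in sorted(src.iterdir()):
            target = dest / entry.name
            shutil.move(str(entry), str(target))
            moved.append(target)
    return moved
```

Ignite normally keeps a per-node subfolder under storagePath, walPath and walArchivePath, so the arguments would be those subfolders rather than the top-level paths from the configuration. Whether the node then starts in a usable state still depends on the caveat about checkpoints raised in this thread.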

Regards,
--
Ilya Kasnacheev


On Fri, Jun 14, 2019 at 17:30, Kamlesh Joshi <[hidden email]> wrote:

Hi Ilya,

 

By WAL markers, do you mean the cp folder inside the datastore, or something else?

 

Thanks and Regards,

Kamlesh Joshi

 

From: Ilya Kasnacheev <[hidden email]>
Sent: Friday, June 14, 2019 6:59 PM
To: [hidden email]
Subject: Re: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

 


Hello!

 

If your node did not crash during a checkpoint, completely removing the WAL files and WAL markers should set you back to some stable state.

 

If it did crash during a checkpoint, you can't start with corrupted WAL.
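
One rough way to see whether a checkpoint was in progress is to look for a start marker without a matching end marker. The Python sketch below assumes marker files in the cp folder are named <timestamp>-<checkpoint id>-START.bin and <timestamp>-<checkpoint id>-END.bin. That naming is an assumption to verify against your own cp folder, and the function name unfinished_checkpoints is hypothetical.

```python
import re
from pathlib import Path

# Assumed marker naming: <timestamp>-<checkpoint id>-START.bin / -END.bin
MARKER = re.compile(r"^(\d+)-(.+)-(START|END)\.bin$")


def unfinished_checkpoints(cp_dir):
    """Return IDs of checkpoints that have a START marker but no END marker."""
    started, ended = set(), set()
    for entry in Path(cp_dir).iterdir():
        match = MARKER.match(entry.name)
        if not match:
            continue  # ignore files that do not look like markers
        checkpoint_id, kind = match.group(2), match.group(3)
        (started if kind == "START" else ended).add(checkpoint_id)
    return sorted(started - ended)
```

An empty result only means that no unmatched start marker was found. It does not prove the copied page files are consistent, and a cp folder with no markers at all also returns an empty list.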

 

Regards,

--

Ilya Kasnacheev

 

 

On Wed, Jun 12, 2019 at 17:37, Kamlesh Joshi <[hidden email]> wrote:

Hi Denis,

 

Only the WAL disk is corrupted; the datastore is intact. Is there any way to restore the cluster? Some data loss is acceptable. Any suggestions?

 

Thanks and Regards,

Kamlesh Joshi

 

From: Kamlesh Joshi
Sent: Thursday, June 6, 2019 7:52 PM
To: [hidden email]
Subject: RE: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

 

Thanks for the update Denis.

 

If one of the WAL disks fails, is there any way to start or recover the cluster forcefully?

 

Thanks and Regards,

Kamlesh Joshi

 

From: Denis Magda <[hidden email]>
Sent: Thursday, June 6, 2019 4:44 PM
To: [hidden email]
Subject: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

 


I would discourage you from doing this if data consistency is important to you. What you see on the disk of one cluster node might be inconsistent with the overall cluster state and with the latest updates held in memory. Snapshots and backups are the right tools for this task. Google for the solution provided by GridGain.


-

Denis

 

 


Kamlesh Joshi

RE: [External]Re: Is there any way to force recover the cluster - copying running cluster datastore

Thanks Ilya, will give it a try!

 

Thanks and Regards,

Kamlesh Joshi

 
