Ignite for Spark on YARN Deployment

Nikolai Tikhonov-2

Re: Ignite for Spark on YARN Deployment

Hi Hongmei Zong,

Could you use the following configuration for Ignite [1]? You need to use the same configuration file for both the YARN integration and IgniteContext. The configuration uses TcpDiscoveryMulticastIpFinder, which should suit your case better. You can find more details here: https://apacheignite.readme.io/docs/cluster-config

Also, I see that you are trying to use different Ignite versions. Could you download Apache Ignite Fabric [2], copy it to HDFS (for example, /user/hongmei/ignite/apache-ignite-fabric-1.6.0-bin.zip) and add the following property to your properties file?

IGNITE_PATH=/user/hongmei/ignite/apache-ignite-fabric-1.6.0-bin.zip


On Fri, Jun 10, 2016 at 5:52 PM, Hongmei Zong <[hidden email]> wrote:
Hi Nikolai,

I ran the command to start spark-shell, and Spark started successfully. Then I imported two classes and created a new IgniteContext, and I got the error posted below. Any suggestions? Thank you very much!

Hongmei

First run:

/usr/bin/spark-shell --jars /u/hongmei/apache-ignite/libs/ignite-core-1.6.0.jar,/u/hongmei/apache-ignite/libs/optional/ignite-spark/ignite-spark-1.6.0.jar,/u/hongmei/apache-ignite/libs/cache-api-1.0.0.jar,/u/hongmei/apache-ignite/libs/optional/ignite-log4j/ignite-log4j-1.6.0.jar,/u/hongmei/apache-ignite/libs/optional/ignite-log4j/log4j-1.2.17.jar --packages org.apache.ignite:ignite-spark:1.6.0,org.apache.ignite:ignite-spring:1.6.0

Then import the classes and create a new IgniteContext (ic):

scala> import org.apache.ignite.spark._
import org.apache.ignite.spark._

scala> import org.apache.ignite.configuration._
import org.apache.ignite.configuration._

scala> val ic = new IgniteContext[Integer, Integer](sc, "config/ignite-default-config.xml")
16/06/10 10:32:17 INFO XmlBeanDefinitionReader: Loading XML bean definitions from URL [file:/u/hongmei/apache-ignite/config/ignite-default-config.xml]
16/06/10 10:32:18 INFO GenericApplicationContext: Refreshing org.springframework.context.support.GenericApplicationContext@3cfbcf6a: startup date [Fri Jun 10 10:32:18 EDT 2016]; root of context hierarchy
16/06/10 10:32:19 INFO IgniteKernal: 

>>>    __________  ________________  
>>>   /  _/ ___/ |/ /  _/_  __/ __/  
>>>  _/ // (7 7    // /  / / / _/    
>>> /___/\___/_/|_/___/ /_/ /___/   
>>> 
>>> ver. 1.6.0#20160518-sha1:0b22c45b
>>> 2016 Copyright(C) Apache Software Foundation
>>> 
>>> Ignite documentation: http://ignite.apache.org

16/06/10 10:32:19 INFO IgniteKernal: Config URL: n/a
16/06/10 10:32:19 INFO IgniteKernal: Daemon mode: off
16/06/10 10:32:19 INFO IgniteKernal: OS: Linux 2.6.32-573.26.1.el6.x86_64 amd64
16/06/10 10:32:19 INFO IgniteKernal: OS user: hongmei
16/06/10 10:32:19 INFO IgniteKernal: Language runtime: Scala ver. 2.10.4
16/06/10 10:32:19 INFO IgniteKernal: VM information: Java(TM) SE Runtime Environment 1.7.0_45-b18 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 24.45-b08
16/06/10 10:32:19 INFO IgniteKernal: VM total memory: 4.2GB
16/06/10 10:32:19 INFO IgniteKernal: Remote Management [restart: off, REST: on, JMX (remote: off)]
16/06/10 10:32:19 INFO IgniteKernal: IGNITE_HOME=/u/hongmei/apache-ignite
16/06/10 10:32:19 INFO IgniteKernal: VM arguments: [-Dhdp.version=2.3.4.0-3485, -Dscala.usejavacp=true, -Xms4500m, -Xmx4500m, -XX:MaxPermSize=1024m, -XX:PermSize=256m]
16/06/10 10:32:19 INFO IgniteKernal: Configured caches ['ignite-marshaller-sys-cache', 'ignite-sys-cache', 'ignite-atomics-sys-cache']

…….

Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start manager: GridManagerAdapter [enabled=true, name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1536)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:897)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1736)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1589)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1042)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:569)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:530)
at org.apache.ignite.Ignition.getOrStart(Ignition.java:414)
... 53 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000, reconCnt=10, maxAckTimeout=600000, forceSrvMode=false, clientReconnectDisabled=false]
at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:258)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:677)
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1531)
... 60 more
Caused by: class org.apache.ignite.spi.IgniteSpiException: Join process timed out, did not receive response for join request (consider increasing 'joinTimeout' configuration property) [joinTimeout=60000, sock=Socket[addr=c5hdp112.c5.runwaynine.com/10.138.10.47,port=47500,localport=59822]]
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1335)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)


scala>



On Jun 10, 2016, at 8:20 AM, Nikolay Tikhonov <[hidden email]> wrote:

Could you download this configuration file (https://gist.github.com/ntikhonov/d2a2ede2faca7b533dc643e0da475959) and put it at
/user/hongmei/ignite/config/ignite-default-config.xml? Also, please change IGNITE_XML_CONFIG from /user/hongmei/ignite/config/ to /user/hongmei/ignite/config/ignite-default-config.xml:

IGNITE_XML_CONFIG=/user/hongmei/ignite/config/ignite-default-config.xml

On Fri, Jun 10, 2016 at 2:58 PM, Hongmei Zong <[hidden email]> wrote:
Hi Nikolai,

Here is the log from one of the containers:

Logs for container_e24_1464374946035_32114_01_000014



Showing 4096 bytes.
SpringHelperImpl.applicationContext(IgniteSpringHelperImpl.java:391)
	at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.loadConfigurations(IgniteSpringHelperImpl.java:104)
	at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.loadConfigurations(IgniteSpringHelperImpl.java:98)
	at org.apache.ignite.internal.IgnitionEx.loadConfigurations(IgnitionEx.java:606)
	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:807)
	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:716)
	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:586)
	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:556)
	at org.apache.ignite.Ignition.start(Ignition.java:347)
	... 1 more
Caused by: org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException: Line 1 in XML document from URL [file:/disk2/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_32114/container_e24_1464374946035_32114_01_000014/./ignite-config.xml/] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
	at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:398)
	at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:335)
	at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:303)
	at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.applicationContext(IgniteSpringHelperImpl.java:379)
	... 9 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:441)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368)
	at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1436)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:999)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
	at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
	at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
	at org.springframework.beans.factory.xml.DefaultDocumentLoader.loadDocument(DefaultDocumentLoader.java:76)
	at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadDocument(XmlBeanDefinitionReader.java:428)
	at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:390)
	... 12 more
Failed to start grid: Failed to instantiate Spring XML application context [springUrl=file:/disk2/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_32114/container_e24_1464374946035_32114_01_000014/./ignite-config.xml/, err=Line 1 in XML document from URL [file:/disk2/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_32114/container_e24_1464374946035_32114_01_000014/./ignite-config.xml/] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.]


Thank you very much!
Hongmei

On Jun 10, 2016, at 5:56 AM, Nikolai Tikhonov <[hidden email]> wrote:

Hi Hongmei Zong!

Could you show the logs from another container that has completed (container_e24_1464374946035_29722_01_000015)?

On Thu, Jun 9, 2016 at 6:29 PM, Hongmei Zong <[hidden email]> wrote:
Hi Nikolay,

After I changed the value to IGNITE_XML_CONFIG=/user/hongmei/ignite/config/ (an HDFS path), Ignite on YARN is running now. I used the Hadoop UI console to check the application log; the stderr log for the containers is below:

It looks like the containers are allocated and then completed! The stderr log is very long, with container IDs running from XXXXX01_0000001 to XXXXXX01_013582. In the end, all of these containers completed.

I have no idea what is going on. Is something wrong?

There is no information in the stdout log.

Thank you!

Hongmei

Logs for container_e24_1464374946035_29722_01_000001


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.4.0-3485/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/disk2/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_29722/filecache/10/ignite-yarn.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/06/09 10:48:16 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
16/06/09 10:48:16 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
Jun 09, 2016 10:48:16 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Application master registered.
Jun 09, 2016 10:48:16 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 09, 2016 10:48:16 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
16/06/09 10:48:17 INFO impl.AMRMClientImpl: Received new token for : c5hdp108.c5.runwaynine.com:45454
16/06/09 10:48:17 INFO impl.AMRMClientImpl: Received new token for : c5hdp111.c5.runwaynine.com:45454
Jun 09, 2016 10:48:17 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_29722_01_000002.
16/06/09 10:48:17 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp108.c5.runwaynine.com:45454
Jun 09, 2016 10:48:17 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_29722_01_000003.
16/06/09 10:48:17 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp111.c5.runwaynine.com:45454
Jun 09, 2016 10:48:24 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000002. State: COMPLETE.
Jun 09, 2016 10:48:24 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000003. State: COMPLETE.
Jun 09, 2016 10:48:24 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 09, 2016 10:48:24 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
16/06/09 10:48:25 INFO impl.AMRMClientImpl: Received new token for : c5hdp109.c5.runwaynine.com:45454
16/06/09 10:48:25 INFO impl.AMRMClientImpl: Received new token for : c5hdp105.c5.runwaynine.com:45454
Jun 09, 2016 10:48:25 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_29722_01_000004.
16/06/09 10:48:25 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp108.c5.runwaynine.com:45454
Jun 09, 2016 10:48:25 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_29722_01_000005.
16/06/09 10:48:25 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp111.c5.runwaynine.com:45454
Jun 09, 2016 10:48:25 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000006. State: COMPLETE.
Jun 09, 2016 10:48:25 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000007. State: COMPLETE.
Jun 09, 2016 10:48:27 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000004. State: COMPLETE.
Jun 09, 2016 10:48:27 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000005. State: COMPLETE.
Jun 09, 2016 10:48:27 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 09, 2016 10:48:27 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 09, 2016 10:48:28 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_29722_01_000008.
16/06/09 10:48:28 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp108.c5.runwaynine.com:45454
16/06/09 10:48:28 INFO impl.AMRMClientImpl: Received new token for : c5hdp112.c5.runwaynine.com:45454
16/06/09 10:48:28 INFO impl.AMRMClientImpl: Received new token for : c5hdp114.c5.runwaynine.com:45454
Jun 09, 2016 10:48:28 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_29722_01_000009.
16/06/09 10:48:28 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp111.c5.runwaynine.com:45454
Jun 09, 2016 10:48:28 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000012. State: COMPLETE.
Jun 09, 2016 10:48:28 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000010. State: COMPLETE.
Jun 09, 2016 10:48:28 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000011. State: COMPLETE.
Jun 09, 2016 10:48:28 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000013. State: COMPLETE.
Jun 09, 2016 10:48:30 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000008. State: COMPLETE.
Jun 09, 2016 10:48:30 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000009. State: COMPLETE.
Jun 09, 2016 10:48:30 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 09, 2016 10:48:30 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 09, 2016 10:48:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_29722_01_000014.
16/06/09 10:48:31 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp108.c5.runwaynine.com:45454
16/06/09 10:48:31 INFO impl.AMRMClientImpl: Received new token for : c5hdp107.c5.runwaynine.com:45454
16/06/09 10:48:31 INFO impl.AMRMClientImpl: Received new token for : c5hdp110.c5.runwaynine.com:45454
16/06/09 10:48:31 INFO impl.AMRMClientImpl: Received new token for : c5hdp101.c5.runwaynine.com:45454
Jun 09, 2016 10:48:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_29722_01_000015.
16/06/09 10:48:31 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp111.c5.runwaynine.com:45454
Jun 09, 2016 10:48:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000020. State: COMPLETE.
Jun 09, 2016 10:48:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000019. State: COMPLETE.
Jun 09, 2016 10:48:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000018. State: COMPLETE.
Jun 09, 2016 10:48:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000017. State: COMPLETE.
Jun 09, 2016 10:48:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000016. State: COMPLETE.
Jun 09, 2016 10:48:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000021. State: COMPLETE.
Jun 09, 2016 10:48:33 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000014. State: COMPLETE.
Jun 09, 2016 10:48:33 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_29722_01_000015. State: COMPLETE.
Jun 09, 2016 10:48:33 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 09, 2016 10:48:33 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 09, 2016 10:48:34 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated

On Thu, Jun 9, 2016 at 10:21 AM, Nikolay Tikhonov <[hidden email]> wrote:
You set the wrong value for the IGNITE_XML_CONFIG property. The property should contain the path to the Ignite configuration file. For example:

IGNITE_XML_CONFIG=/u/hongmei/apache-ignite/config/default-config.xml

I think you can comment out this line in the properties file, and Ignite will start with the default configuration.
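
The rule above (the property must name a configuration file, not a directory) can be checked before submitting the application. The following is only a minimal sketch: the class XmlConfigPropertyCheck and its check method are hypothetical helpers, not part of Ignite, and the check only inspects the text of the value, so it cannot tell whether the file really exists on HDFS.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper (not part of Ignite): sanity-checks the IGNITE_XML_CONFIG value. */
final class XmlConfigPropertyCheck {
    /**
     * Returns a description of the problem, or null if the value looks like a file path.
     * Only the text of the value is inspected; HDFS is never contacted.
     */
    static String check(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        String value = props.getProperty("IGNITE_XML_CONFIG");

        // Property absent or commented out: nothing to check.
        if (value == null || value.trim().isEmpty())
            return null;

        value = value.trim();

        if (value.endsWith("/"))
            return "IGNITE_XML_CONFIG points to a directory: " + value;

        // Assumption for this sketch: configuration files are named *.xml.
        if (!value.endsWith(".xml"))
            return "IGNITE_XML_CONFIG does not end with .xml: " + value;

        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("IGNITE_XML_CONFIG=/user/hongmei/ignite/"));
        System.out.println(check("IGNITE_XML_CONFIG=/user/hongmei/ignite/config/ignite-default-config.xml"));
    }
}
```

The first call reports the directory value used earlier in this thread; the second accepts a path that names an XML file.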

On Thu, Jun 9, 2016 at 5:08 PM, Hongmei Zong <[hidden email]> wrote:
Hi Nikolai,

Thank you very much for prompt reply! 

I did not find the ignite-config.xml file under my Ignite home directory (/u/hongmei/apache-ignite/).

I found a "default-config.xml" at the path: /u/hongmei/apache-ignite/config/default-config.xml

<?xml version="1.0" encoding="UTF-8"?>

<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->

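<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd">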
    <!--
        Alter configuration below as needed.
    -->
        <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
                <property name="discoverySpi">
                        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                                <property name="ipFinder">
                                        <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                                                <property name="addresses">
                                                        <list>
                                                                <!--
                                                                 Explicitly specifying address of a local node to let it start
                                                                and operate normally even if there is no more nodes in the cluster.
                                                                You can also optionally specify an individual port or port range.
                                                                -->
                                                                <!--
                                                                <value>1.2.3.4</value>
                                                                -->
                                                                <!-- 
                                                                        IP Address and optional port range of a remote node.
                                                                        You can also optionally specify an individual port and don't set
                                                                        the port range at all.                                  
                                                                -->
                                                                <value>c5hdpe001.c5.runwaynine.com:47500..47509</value>
                                                                <value>c5hdpe002.c5.runwaynine.com:47500..47509</value>
                                                                <value>c5hdpe003.c5.runwaynine.com:47500..47509</value>

                                                        </list>
                                                </property>
                                        </bean>
                                </property>
                        </bean>
                </property>
        </bean>

</beans>


The "cluster.properties" at the path:  /u/hongmei/apache-ignite/config/cluster.properties

# The number of nodes in the cluster.
IGNITE_NODE_COUNT=2

# The number of CPU Cores for each Apache Ignite node.
IGNITE_RUN_CPU_PER_NODE=1

# The number of Megabytes of RAM for each Apache Ignite node.
IGNITE_MEMORY_PER_NODE=2048

# The version ignite which will be run on nodes.
IGNITE_VERSION=1.6.0

# The hdfs directory which will be used for saving Apache Ignite distributives.
IGNITE_RELEASES_DIR=/user/hongmei/ignite-yarn

#The directory which will be used for saving Apache Ignite distributives(copy .jar file to it).
IGNITE_WORKING_DIR=/user/hongmei/ignite/workdir

#The hdfs path to Apache Ignite config file.
#IGNITE_XML_CONFIG=/user/yarn/ignite/
IGNITE_XML_CONFIG=/user/hongmei/ignite/

#The hdfs path to libs which will be added to classpath.
IGNITE_USERS_LIBS=/user/hongmei/ignite/libs/

#The constraint on slave hosts.
#IGNITE_HOSTENAME_CONSTRAINT=

IGNITE_LOCAL_WORK_DIR=/u/hongmei/apache-ignite


Thank you very much!!!!
Hongmei




On Thu, Jun 9, 2016 at 9:43 AM, Nikolai Tikhonov <[hidden email]> wrote:
It seems that your Ignite configuration is invalid. Could you share ignite-config.xml and /u/hongmei/apache-ignite/config/cluster.properties?
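
One way to test a suspect configuration file locally is to run it through the JDK's own XML parser, which is where the "Content is not allowed in prolog" message in the trace below comes from. This is a minimal sketch using only the standard library; the class name XmlProbe is made up for illustration, and the check only covers well-formedness, not whether Spring or Ignite would accept the bean definitions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not part of Ignite): checks whether content is well-formed XML. */
final class XmlProbe {
    /** Returns null if the bytes parse as well-formed XML, otherwise the parser's error. */
    static String parseError(byte[] content) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());

            builder.parse(new ByteArrayInputStream(content));

            return null;
        }
        catch (SAXException | IOException | ParserConfigurationException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Probe a local copy of the configuration file given on the command line.
            System.out.println(parseError(Files.readAllBytes(Paths.get(args[0]))));
            return;
        }

        // No argument: probe one well-formed sample and one non-XML sample.
        System.out.println(parseError("<beans/>".getBytes(StandardCharsets.UTF_8)));
        System.out.println(parseError("IGNITE_NODE_COUNT=2".getBytes(StandardCharsets.UTF_8)));
    }
}
```

If a copy of the file that the container actually received fails this check, the problem is likely in what was shipped to the container (for example, something other than an XML file) rather than in the bean definitions themselves.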

On Thu, Jun 9, 2016 at 4:35 PM, Hongmei Zong <[hidden email]> wrote:
Thank you very much, Nikolai!

I found another issue regarding my Ignite YARN Integration:

I ran the Ignite YARN application on a client server machine, 'c5hdpe001', as follows:

[hongmei@c5hdpe001 apache-ignite]$ hadoop jar /u/hongmei/apache-ignite/libs/optional/ignite-yarn/ignite-yarn-1.6.0.jar /u/hongmei/apache-ignite/libs/optional/ignite-yarn/ignite-yarn-1.6.0.jar /u/hongmei/apache-ignite/config/cluster.properties
WARNING: Use "yarn jar" to launch YARN applications.
16/06/09 08:55:20 INFO impl.TimelineClientImpl: Timeline service address: http://c5hdp003.c5.runwaynine.com:8188/ws/v1/timeline/
16/06/09 08:55:22 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
16/06/09 08:55:24 INFO impl.YarnClientImpl: Submitted application application_1464374946035_29511
Jun 09, 2016 8:55:24 AM org.apache.ignite.yarn.IgniteYarnClient main
INFO: Submitted application. Application id: application_1464374946035_29511
Jun 09, 2016 8:55:27 AM org.apache.ignite.yarn.IgniteYarnClient main
INFO: Application application_1464374946035_29511 is RUNNING. 


Then I logged on to one of the container nodes, and the log is as follows. It failed to start the grid.
Any suggestions? Thank you very much!!!


class org.apache.ignite.IgniteException: Failed to instantiate Spring XML application context [springUrl=file:/disk/12/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_27403/container_e24_1464374946035_27403_01_110493/./ignite-config.xml/, err=Line 1 in XML document from URL [file:/disk/12/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_27403/container_e24_1464374946035_27403_01_110493/./ignite-config.xml/] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.]
        at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:904)
        at org.apache.ignite.Ignition.start(Ignition.java:350)
        at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:302)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to instantiate Spring XML application context [springUrl=file:/disk/12/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_27403/container_e24_1464374946035_27403_01_110493/./ignite-config.xml/, err=Line 1 in XML document from URL [file:/disk/12/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_27403/container_e24_1464374946035_27403_01_110493/./ignite-config.xml/] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.]
        at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.applicationContext(IgniteSpringHelperImpl.java:391)
        at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.loadConfigurations(IgniteSpringHelperImpl.java:104)
        at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.loadConfigurations(IgniteSpringHelperImpl.java:98)
        at org.apache.ignite.internal.IgnitionEx.loadConfigurations(IgnitionEx.java:606)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:807)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:716)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:586)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:556)
        at org.apache.ignite.Ignition.start(Ignition.java:347)
        ... 1 more
Caused by: org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException: Line 1 in XML document from URL [file:/disk/12/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_27403/container_e24_1464374946035_27403_01_110493/./ignite-config.xml/] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
        at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:398)
        at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:335)
        at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:303)
        at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.applicationContext(IgniteSpringHelperImpl.java:379)
        ... 9 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
        at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
        at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
        at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:441)
        at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368)
        at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1436)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:999)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
        at org.springframework.beans.factory.xml.DefaultDocumentLoader.loadDocument(DefaultDocumentLoader.java:76)
        at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadDocument(XmlBeanDefinitionReader.java:428)
        at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:390)
        ... 12 more
Failed to start grid: Failed to instantiate Spring XML application context [springUrl=file:/disk/12/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_27403/container_e24_1464374946035_27403_01_110493/./ignite-config.xml/, err=Line 1 in XML document from URL [file:/disk/12/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_27403/container_e24_1464374946035_27403_01_110493/./ignite-config.xml/] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.]


Hongmei


On Thu, Jun 9, 2016 at 6:04 AM, Nikolai Tikhonov <[hidden email]> wrote:
your_address1:47500..47510,your_address2:47500..47510 and your_address3:47500..47510 are the YARN master_host address, right?

No, these are the addresses of the hosts on which the YARN cluster is deployed. For example, suppose you have a YARN cluster that contains two servers: 10.0.0.1 and 10.0.0.2. In this case you will have the following configuration:

ipFinder.setAddresses(Arrays.asList("10.0.0.1:47500..47510", "10.0.0.2:47500..47510"));
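
For a cluster with more than a couple of hosts, the address list can be generated instead of typed by hand. The sketch below uses only the standard library; DiscoveryAddresses and withPortRange are hypothetical names introduced for illustration, and the host list is the two-server example from this message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Ignite): builds "host:first..last" address strings. */
final class DiscoveryAddresses {
    /** Formats each host with the same port range, in the form used in this thread. */
    static List<String> withPortRange(List<String> hosts, int firstPort, int lastPort) {
        if (firstPort > lastPort)
            throw new IllegalArgumentException("firstPort must not exceed lastPort");

        List<String> res = new ArrayList<>();

        for (String host : hosts)
            res.add(host + ":" + firstPort + ".." + lastPort);

        return res;
    }

    public static void main(String[] args) {
        List<String> addrs = withPortRange(Arrays.asList("10.0.0.1", "10.0.0.2"), 47500, 47510);

        // prints [10.0.0.1:47500..47510, 10.0.0.2:47500..47510]
        System.out.println(addrs);

        // With Ignite on the classpath, the list would then be passed on as in the line above:
        // ipFinder.setAddresses(addrs);
    }
}
```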










