Should the Ignition.start() method be called in a Spark IgniteRDD app?

F7753

Should the Ignition.start() method be called in a Spark IgniteRDD app?

I got an error when I submitted a jar. The cluster itself does not seem to be the problem, since I tested it against the official guide: https://apacheignite-fs.readme.io/docs/ignitecontext-igniterdd#section-running-sql-queries-against-ignite-cache and that simple use case works well. The exception tells me I should call 'Ignition.start()' before anything else, yet I did not have to do that in the guide's use case, so something else must be wrong. Here is the full stack trace:
-------------------------------------------------------------------------------------------------------------------
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, nobida145): class org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided name doesn't exist. Did you call Ignition.start(..) to start an Ignite instance? [name=null]
        at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1235)
        at org.apache.ignite.Ignition.ignite(Ignition.java:516)
        at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:150)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply$mcVI$sp(IgniteContext.scala:58)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply(IgniteContext.scala:58)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply(IgniteContext.scala:58)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:910)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
        at org.apache.spark.rdd.RDD.foreach(RDD.scala:910)
        at org.apache.ignite.spark.IgniteContext.<init>(IgniteContext.scala:58)
        at main.scala.StreamingJoin$.main(StreamingJoin.scala:180)
        at main.scala.StreamingJoin.main(StreamingJoin.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: class org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided name doesn't exist. Did you call Ignition.start(..) to start an Ignite instance? [name=null]
        at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1235)
        at org.apache.ignite.Ignition.ignite(Ignition.java:516)
        at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:150)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply$mcVI$sp(IgniteContext.scala:58)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply(IgniteContext.scala:58)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply(IgniteContext.scala:58)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
./igparkStr.sh: line 16: --repositories: command not found

-----------------------------------------------------------------------------------------------------------------
igsparkStr.sh is a shell script I wrote to submit the Spark app; its content is listed below:
-----------------------------------------------------------------------------------------------------------------
/opt/spark-1.6.1/bin/spark-submit --class main.scala.StreamingJoin   --properties-file conf/spark.conf ./myapp-1.0-SNAPSHOT-jar-with-dependencies.jar --packages org.apache.ignite:ignite-spark:1.5.0.final
 16   --repositories http://www.gridgainsystems.com/nexus/content/repositories/external
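For reference, here is a corrected sketch of the invocation (assuming a single spark-submit call was intended). Two things stand out: spark-submit treats everything after the application jar as application arguments, so --packages and --repositories should come before the jar, and the break before --repositories needs a line-continuation backslash, which is exactly what the "--repositories: command not found" error above points to:
-----------------------------------------------------------------------------------------------------------------
/opt/spark-1.6.1/bin/spark-submit --class main.scala.StreamingJoin \
  --properties-file conf/spark.conf \
  --packages org.apache.ignite:ignite-spark:1.5.0.final \
  --repositories http://www.gridgainsystems.com/nexus/content/repositories/external \
  ./myapp-1.0-SNAPSHOT-jar-with-dependencies.jar
-----------------------------------------------------------------------------------------------------------------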
F7753

Re: Should the Ignition.start() method be called in a Spark IgniteRDD app?

I have no idea how the Ignition.start() method gets called inside the Ignite kernel. Is there a document I can refer to?
vkulichenko

Re: Should the Ignition.start() method be called in a Spark IgniteRDD app?

Hi,

The code there is not completely correct, and there is a ticket for that [1]. Essentially, if there is some kind of misconfiguration, the corresponding exception is lost and the one you got is thrown instead. Most likely that's what is happening in your test.

I would recommend executing a Spark closure that simply calls Ignition.start() with your configuration. If my assumption is right, you will see the exception that causes the issue.
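
For example, a minimal sketch of that diagnostic (sc is your SparkContext; the config path is illustrative):

    // Run Ignition.start() in a one-partition closure on an executor so the
    // real startup exception, if any, shows up in the executor logs instead
    // of being swallowed.
    sc.parallelize(Seq(1), 1).foreachPartition { _ =>
      org.apache.ignite.Ignition.start("config/ignite-config.xml")
    }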

Let me know if it helps.

[1] https://issues.apache.org/jira/browse/IGNITE-2942

-Val
F7753

Re: Should the Ignition.start() method be called in a Spark IgniteRDD app?

I added 'Ignition.start()' as the first line of my 'main' method, but I still get the same error. I also noticed that the way I start Ignite in the Spark cluster seems to be wrong: there are more Ignite instances than I expected. There are 3 worker nodes in my cluster, and I had already started 3 Ignite servers with '${IGNITE_HOME}/bin/ignite.sh', but after I added 'Ignition.start()' the server count became 5, which does not seem right:
------------------------------------------------------------------------------------------------------------------
16/04/02 11:31:42 INFO AppClient$ClientEndpoint: Executor updated: app-20160402113142-0000/2 is now RUNNING
16/04/02 11:31:42 INFO AppClient$ClientEndpoint: Executor updated: app-20160402113142-0000/1 is now RUNNING
16/04/02 11:31:42 INFO AppClient$ClientEndpoint: Executor updated: app-20160402113142-0000/0 is now RUNNING
16/04/02 11:31:42 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
[11:31:51] Topology snapshot [ver=5, servers=5, clients=0, CPUs=96, heap=100.0GB]
[11:31:51] Topology snapshot [ver=6, servers=4, clients=0, CPUs=96, heap=53.0GB]
[11:31:57] Topology snapshot [ver=7, servers=5, clients=0, CPUs=96, heap=100.0GB]
[11:31:57] Topology snapshot [ver=8, servers=4, clients=0, CPUs=96, heap=53.0GB]
[11:32:12] Topology snapshot [ver=9, servers=5, clients=0, CPUs=96, heap=100.0GB]
[11:32:12] Topology snapshot [ver=10, servers=4, clients=0, CPUs=96, heap=53.0GB]
[11:32:14] Topology snapshot [ver=11, servers=5, clients=0, CPUs=96, heap=100.0GB]
[11:32:14] Topology snapshot [ver=12, servers=4, clients=0, CPUs=96, heap=53.0GB]
-------------------------------------------------------------------------------------------------------------------
The full stack trace is listed below as well (note how the topology above alternates between 5 and 4 servers, as if a node keeps joining and leaving); this does not seem easy to solve:
-------------------------------------------------------------------------------------------------------------------
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, nobida145): class org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided name doesn't exist. Did you call Ignition.start(..) to start an Ignite instance? [name=null]
        at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1235)
        at org.apache.ignite.Ignition.ignite(Ignition.java:516)
        at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:150)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply$mcVI$sp(IgniteContext.scala:58)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply(IgniteContext.scala:58)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply(IgniteContext.scala:58)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:910)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
        at org.apache.spark.rdd.RDD.foreach(RDD.scala:910)
        at org.apache.ignite.spark.IgniteContext.<init>(IgniteContext.scala:58)
        at main.scala.StreamingJoin$.main(StreamingJoin.scala:183)
        at main.scala.StreamingJoin.main(StreamingJoin.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: class org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided name doesn't exist. Did you call Ignition.start(..) to start an Ignite instance? [name=null]
        at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1235)
        at org.apache.ignite.Ignition.ignite(Ignition.java:516)
        at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:150)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply$mcVI$sp(IgniteContext.scala:58)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply(IgniteContext.scala:58)
        at org.apache.ignite.spark.IgniteContext$$anonfun$1.apply(IgniteContext.scala:58)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
-----------------------------------------------------------------------------------------------------------------

Here is the content of my Ignite config file:
-----------------------------------------------------------------------------------------------------------------
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean class="org.apache.ignite.configuration.IgniteConfiguration"/>
</beans>
-----------------------------------------------------------------------------------------------------------------
I created two caches:
-----------------------------------------------------------------------------------------------------------------
   
    Ignition.start()
    ...
    // fetch the small table data
    ...
    // ignite small table cache
    val smallTableCache = smallTableContext.fromCache(smallTableCacheCfg)
    ...
    smallTableCache.savePairs(smallTableCols_rdd)
    ...
    // fetch the source table data
    ...
    // ignite source table cache
    val sourceTableCache = streamTableContext.fromCache(sourceTableCacheCfg)
    ...
    // sourceTableCols rdd save to cache
    sourceTableCache.savePairs(sourceTableCols_rdd)
    ...
    val res = sourceTableCache.sql(queryString)
-------------------------------------------------------------------------------------------------------------------
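For completeness, the snippet above elides how the IgniteContext instances are created; a hedged sketch of what that construction typically looks like (types, variable names, and the config path are illustrative):

    import org.apache.ignite.spark.IgniteContext

    // Pass the same XML configuration the standalone nodes were started with,
    // so the nodes started on the executors join the same discovery group.
    val smallTableContext = new IgniteContext[String, String](sc, "path/to/ignite-config.xml")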
F7753

Re: Should the Ignition.start() method be called in a Spark IgniteRDD app?

What confuses me most is that when I used Ignite to run a Spark word-count app, it did not require me to call 'Ignition.start()', but when I use IgniteRDD to cache tables in a Spark Streaming app, the console throws this exception.
What's the difference?
agura-2

Re: Should the Ignition.start() method be called in a Spark IgniteRDD app?

The author has deleted this message.
F7753

Re: Should the Ignition.start() method be called in a Spark IgniteRDD app?

Hi agura,
I looked at the source code of IgniteContext.scala and noticed that the ignite() method calls 'Ignition.start()' inside a try-catch block.
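A paraphrased sketch of that pattern (illustrative, not the exact Ignite source):

    import org.apache.ignite.{Ignite, Ignition}
    import org.apache.ignite.configuration.IgniteConfiguration

    def ignite(igniteCfg: IgniteConfiguration): Ignite =
      try {
        // Look up an already-running instance; throws if none exists.
        Ignition.ignite(igniteCfg.getGridName)
      } catch {
        case _: Exception =>
          // Fall back to starting a new instance; per IGNITE-2942 the
          // original startup failure can be lost along this path.
          Ignition.start(igniteCfg)
      }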
The code I use is listed in this topic:
http://apache-ignite-users.70518.x6.nabble.com/How-to-solve-the-22-parameters-limit-under-scala-2-10-in-the-case-class-tc3847.html
You can find it in the 6th comment, written by me. Thanks a lot.
Alexei Scherbakov

Re: Should the Ignition.start() method be called in a Spark IgniteRDD app?

Hello,

I was not able to reproduce your case because the source code you've provided is incomplete.
Could you attach a fully working, minimal example that reproduces the problem?
Have you tried running your code in a local environment, using something like:
val ssc: SparkContext = new SparkContext("local[*]", "test")
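For instance, a minimal self-contained sketch along those lines (cache name and key/value types are illustrative):

    import org.apache.spark.SparkContext
    import org.apache.ignite.spark.IgniteContext
    import org.apache.ignite.configuration.IgniteConfiguration

    object LocalRepro {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local[*]", "test")
        // Starts embedded Ignite nodes through the context.
        val ic = new IgniteContext[String, Int](sc, () => new IgniteConfiguration())
        val cache = ic.fromCache("partitioned")
        cache.savePairs(sc.parallelize(1 to 10).map(i => ("key" + i, i)))
        println("cache size: " + cache.count())
        sc.stop()
      }
    }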
FYI: you have an error in ./igparkStr.sh:
./igparkStr.sh: line 16: --repositories: command not found

--

Best regards,
Alexei Scherbakov