Running pig on ignite hadoop

Evans Ye


Hi Ignite community,

Greetings!
I'm from the Bigtop community and would like to learn more about the Ignite Hadoop accelerator.
In particular, I'm interested in running Pig on Ignite.

I've already set up a cluster with Ignite installed and running.
I can run the pi example job successfully on Ignite as well. BTW, it's amazingly fast!

Then I started to test Pig:

grunt> A = load '/passwd';
grunt> B = foreach A generate $0;
grunt> dump B;

However, it failed.
Here's the console output, though it doesn't seem very helpful: https://gist.github.com/evans-ye/35a55ae9f95ccf36b796

I looked into the Ignite log and found this stack trace:

[13:33:49,455][ERROR][Hadoop-task-16820899-f25d-40ee-acc7-7ec0b86f4c27_10-MAP-0-0-#165%null%][HadoopRunnableTask] Task execution failed.
class org.apache.ignite.IgniteCheckedException: class org.apache.ignite.IgniteCheckedException: null
        at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2MapTask.run0(HadoopV2MapTask.java:102)
        at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2Task.run(HadoopV2Task.java:50)
        at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext.run(HadoopV2TaskContext.java:193)
        at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.runTask(HadoopRunnableTask.java:176)
        at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:120)
        at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:36)
        at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopExecutorService$2.body(HadoopExecutorService.java:183)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:107)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsupportedOperationException
        at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2Context.getInputSplit(HadoopV2Context.java:93)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.getInputSplit(WrappedMapper.java:76)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:202)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
        at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2MapTask.run0(HadoopV2MapTask.java:84)
        ... 8 more

        at org.apache.ignite.internal.processors.hadoop.HadoopUtils.transformException(HadoopUtils.java:273)
        at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext.run(HadoopV2TaskContext.java:196)
        at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.runTask(HadoopRunnableTask.java:176)
        at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:120)
        at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:36)
        at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopExecutorService$2.body(HadoopExecutorService.java:183)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:107)
        at java.lang.Thread.run(Thread.java:745)


Since I'm new to Ignite, I have no clue where to start looking into it.
If this turns out to be a silly configuration error, please just point me to the docs and I'll do my homework. :)


Thanks,
Evans

Ivan Veselovsky

Re: Running pig on ignite hadoop

The immediate reason for the failure is that the HadoopExternalSplit split type is not currently supported. It looks like you just hit a functionality gap, something that has not been implemented yet.
The exception you see is thrown from the code line marked "TODO"; see below.
Can you please submit a ticket to Jira (https://issues.apache.org/jira/secure/Dashboard.jspa) so we can schedule and implement it?
 
 
    /** {@inheritDoc} */
    @Override public InputSplit getInputSplit() {
        if (inputSplit == null) {
            HadoopInputSplit split = ctx.taskInfo().inputSplit();

            if (split == null)
                return null;

            if (split instanceof HadoopFileBlock) {
                HadoopFileBlock fileBlock = (HadoopFileBlock)split;

                inputSplit = new FileSplit(new Path(fileBlock.file()), fileBlock.start(), fileBlock.length(), null);
            }
            else if (split instanceof HadoopExternalSplit)
                throw new UnsupportedOperationException(); // TODO
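For readers following the thread: the gap above boils down to an instanceof dispatch that converts a HadoopFileBlock into a Hadoop FileSplit but has no conversion for HadoopExternalSplit. A mapper that calls context.getInputSplit() in its setup(), as Pig's PigGenericMapBase does, hits the unimplemented branch, while jobs like the pi example never call it and run fine. Below is a minimal, self-contained sketch of that pattern; every class in it is a stand-in invented for illustration, not Ignite's real API.

```java
// Stand-in types mirroring the shape of HadoopV2Context.getInputSplit().
// None of these classes are Ignite's; they only illustrate the dispatch gap.
public class SplitDispatchDemo {
    interface HadoopSplit { }  // stand-in for HadoopInputSplit

    static final class FileBlock implements HadoopSplit {  // stand-in for HadoopFileBlock
        final String file;
        final long start, length;

        FileBlock(String file, long start, long length) {
            this.file = file;
            this.start = start;
            this.length = length;
        }
    }

    static final class ExternalSplit implements HadoopSplit { }  // stand-in for HadoopExternalSplit

    /** Mirrors the conversion: file blocks are handled, external splits are not. */
    static String toMapReduceSplit(HadoopSplit split) {
        if (split == null)
            return null;

        if (split instanceof FileBlock) {
            FileBlock fb = (FileBlock) split;

            return "FileSplit[" + fb.file + ":" + fb.start + "+" + fb.length + "]";
        }

        // This is the branch Pig trips over: no conversion exists yet.
        throw new UnsupportedOperationException("external splits not supported");
    }

    public static void main(String[] args) {
        System.out.println(toMapReduceSplit(new FileBlock("/passwd", 0, 1024)));

        try {
            toMapReduceSplit(new ExternalSplit());
        }
        catch (UnsupportedOperationException e) {
            System.out.println("ExternalSplit -> " + e.getMessage());
        }
    }
}
```

Presumably the real fix would need to materialize the external split's serialized payload into the corresponding mapreduce InputSplit, which is what the Jira ticket would track.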