How to access data stored in IGFS or IGFS as Hadoop cache

yaoqin

How to access data stored in IGFS or IGFS as Hadoop cache

1. How do I access (set/get) data stored in IGFS, or in IGFS used as a Hadoop cache?

2. In IGFS-as-Hadoop-cache mode, does core-site.xml need to be configured? (Hadoop v2)

- What should fs.defaultFS be?

- Where should this core-site.xml file be put: in the Hadoop path, the Ignite path, or both?

Vladimir Ozerov

Re: How to access data stored in IGFS or IGFS as Hadoop cache

>> 1. How do I access (set/get) data stored in IGFS, or in IGFS used as a Hadoop cache?

This depends on how you want to access it. If you are accessing data from Hadoop, it is no different from any other Hadoop file system: you can use the FileSystem API, the "dfs" command line tool, etc. If you want to access data through the Ignite API, use the IgniteFileSystem API: https://apacheignite.readme.io/docs/igfs

>> 2. In IGFS-as-Hadoop-cache mode, does core-site.xml need to be configured? (Hadoop v2)

Yes, it needs to be configured.

>> - What should fs.defaultFS be?

The "fs.defaultFS" parameter should be set to the same value as "fs.default.name" in v1. It should be the URI of a running IGFS instance. For examples, please see the "Configure Hadoop" and "File System URI" paragraphs here: https://apacheignite.readme.io/docs/file-system

Let us know if you still have outstanding questions.

>> - Where should this core-site.xml file be put: in the Hadoop path, the Ignite path, or both?

core-site.xml is only needed by Hadoop. Please put it into the Hadoop path as usual.


On Sun, Sep 6, 2015 at 7:14 AM, yaoqin <[hidden email]> wrote:


yaoqin

Re: How to access data stored in IGFS or IGFS as Hadoop cache

1. If IGFS is accessed the same way as HDFS, how can I tell where the data is stored: in IGFS or in HDFS? (Consider IGFS in HDFS cache mode.)

And if I want to compare the performance of these two file systems, what should I do?

2. I tried to set fs.defaultFS to "igfs:///", but Hadoop could not start because no NameNode was detected.


From: Vladimir Ozerov [mailto:[hidden email]]
Sent: 2015-09-07 15:13
To: [hidden email]
Cc: Hejun (Jun He)
Subject: Re: How to access data stored in IGFS or IGFS as Hadoop cache

 


Vladimir Ozerov

Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

>> 1. If IGFS is accessed the same way as HDFS, how can I tell where the data is stored: in IGFS or in HDFS? (Consider IGFS in HDFS cache mode.)

IGFS doesn't provide a way to check whether a file is stored in it, in the secondary file system, or in both. You can query HDFS and check whether the file exists there. IGFS is a caching layer. Normally you should only need to know the mode in which a particular path works: PRIMARY, DUAL_SYNC, DUAL_ASYNC or PROXY. Please check the documentation here: https://apacheignite.readme.io/docs/modes

>> And if I want to compare the performance of these two file systems, what should I do?

I have no answer to this question, because it depends heavily on what you are going to measure and how. Please provide more information on your case and I will try to give you some ideas.

>> 2. I tried to set fs.defaultFS to "igfs:///", but Hadoop could not start because no NameNode was detected.

Yes, sorry, HDFS cannot start normally when fs.defaultFS in core-site.xml is set to a non-HDFS file system. In this case just set fs.defaultFS back to the default and access the file system using fully qualified paths. E.g., instead of "/path/to/my/file" try using "igfs:///path/to/my/file".
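
The fully qualified form described above can also be built programmatically. The following is a minimal sketch that uses only java.net.URI from the JDK. The class IgfsUris and its qualify method are hypothetical helper names, not part of Ignite or Hadoop, and the empty authority simply mirrors the "igfs:///" form used in this thread.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper (not part of Ignite or Hadoop) that builds fully qualified IGFS URIs. */
final class IgfsUris {
    private IgfsUris() {}

    /**
     * Prefixes an absolute path with the igfs scheme and an empty authority,
     * so that "/path/to/my/file" becomes "igfs:///path/to/my/file".
     */
    static URI qualify(String absolutePath) {
        if (absolutePath == null || !absolutePath.startsWith("/"))
            throw new IllegalArgumentException("Expected an absolute path: " + absolutePath);
        try {
            // Arguments: scheme, authority, path, query, fragment.
            // This constructor also percent-encodes characters that are illegal in a path.
            return new URI("igfs", "", absolutePath, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(qualify("/path/to/my/file"));
    }
}
```

The resulting string can then be passed wherever Hadoop accepts a path, for example to the "hadoop fs" command line, assuming the IGFS file system classes are registered in the Hadoop configuration.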

On Mon, Sep 7, 2015 at 2:57 PM, yaoqin <[hidden email]> wrote:



yaoqin

Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

I just tried to put a file (only about 400M in size) into IGFS using the code below.

The code is almost the same as the official examples, but it never finishes with a result.

By contrast, "hadoop fs -put xxxx" takes only about 7s.

 

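import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteFileSystem;
import org.apache.ignite.Ignition;
import org.apache.ignite.igfs.IgfsOutputStream;
import org.apache.ignite.igfs.IgfsPath;

public class IgfsWrite {
    public static void main(String[] args) throws IOException {
        Ignite ignite = Ignition.start("examples/config/filesystem/yq-example-igfs.xml");

        System.out.println("Ignite is started");
        long t1 = System.currentTimeMillis();

        File file = new File(args[0]);
        IgniteFileSystem fs = ignite.fileSystem("igfs");
        IgfsPath workDir = new IgfsPath("/examples/fs");
        IgfsPath fsPath = new IgfsPath(workDir, file.getName());

        System.out.println("Start copying file: " + file.getAbsolutePath());

        // try-with-resources closes both streams, so buffered data is flushed to IGFS.
        try (IgfsOutputStream os = fs.create(fsPath, true);
             FileInputStream fis = new FileInputStream(file)) {
            byte[] buf = new byte[2048];
            int read;

            // Copy the local file into IGFS one buffer at a time.
            while ((read = fis.read(buf)) != -1) {
                os.write(buf, 0, read);
            }
        }

        // Elapsed time in milliseconds, including the copy.
        System.out.println(System.currentTimeMillis() - t1);

        Ignition.stop(true);
    }
}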

 

 

From: Vladimir Ozerov [mailto:[hidden email]]
Sent: 2015-09-07 20:16
To: [hidden email]
Cc: Hejun (Jun He)
Subject: Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

 


dsetrakyan

Re: Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

Can you try putting a very small file in and see if it works? Let's get it to work first before trying to compare performance.

On Mon, Sep 7, 2015 at 5:36 AM, yaoqin <[hidden email]> wrote:

 

 



yaoqin

Re: Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

Yes, I tried a small file before, and it works.

I also used the official example "org.apache.ignite.examples.igfs.IgfsMapReduceExample" to upload the file (size 324.13) to IGFS, but it only succeeded for 128M. Where did the rest of the data go? (The second line shows the correct size of the file when uploaded using "hadoop fs -put XXXX".)


From: Dmitriy Setrakyan [mailto:[hidden email]]
Sent: 2015-09-08 1:05
To: user
Cc: Hejun (Jun He)
Subject: Re: Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

 


dsetrakyan

Re: Re: Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

Can you tell us how much memory you allocate for the JVM running IGFS? What is the -Xmx value?

Also, it would be great to find out whether you are running out of memory. Can you check in a profiler how much memory is used?

D.

On Mon, Sep 7, 2015 at 7:47 PM, yaoqin <[hidden email]> wrote:



dsetrakyan

Re: Re: Re: Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

Given that you have tried a smaller file and it worked, I think it is a heap allocation problem. Unlike HDFS, IGFS stores data in memory and therefore requires more RAM. Can you allocate -Xmx2g and see if the same problem remains?

D.

On Tue, Sep 8, 2015 at 12:39 AM, yaoqin <[hidden email]> wrote:

JVM_OPTS="-Xms1g -Xmx1g -server -XX:+AggressiveOpts -XX:MaxPermSize=256m"
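
Since the reply above points to heap size as the likely cause, a rough pre-flight check along the following lines may help before writing a large file. This is only a sketch under stated assumptions: the IGFS node runs in the same JVM as the check (as in the IgfsWrite program earlier in the thread), data is held on-heap on a single node, and the headroom factor is an arbitrary illustrative value rather than an Ignite sizing rule. HeapCheck and likelyFits are hypothetical names.

```java
import java.io.File;

/** Hypothetical helper, not part of Ignite: compares a file's size with the JVM's maximum heap. */
final class HeapCheck {
    private HeapCheck() {}

    /**
     * Returns true if fileBytes multiplied by headroomFactor does not exceed maxHeapBytes.
     * The factor is a caller-chosen guess at the overhead of holding the data in memory.
     */
    static boolean likelyFits(long fileBytes, long maxHeapBytes, double headroomFactor) {
        return fileBytes * headroomFactor <= maxHeapBytes;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java HeapCheck <file>");
            return;
        }

        long fileBytes = new File(args[0]).length();
        // maxMemory() reports the most heap this JVM will attempt to use, which is governed by -Xmx.
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        double headroomFactor = 3.0; // assumption for illustration only

        System.out.printf("file=%d bytes, max heap=%d bytes, likely fits=%b%n",
            fileBytes, maxHeapBytes, likelyFits(fileBytes, maxHeapBytes, headroomFactor));
    }
}
```

If the check reports that the file is unlikely to fit, raising -Xmx as suggested above is the first thing to try; the check itself says nothing about how Ignite actually lays out data.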



 

 

From: Dmitriy Setrakyan [mailto:[hidden email]]
Sent: 2015-09-08 12:07
To: yaoqin; [hidden email]; Hejun (Jun He)
Subject: Re: Re: Re: Re: How to access data stored in IGFS or IGFS as Hadoop cache

 



yaoqin

Re: How to access data stored in IGFS or IGFS as Hadoop cache

dsetrakyan

Re: How to access data stored in IGFS or IGFS as Hadoop cache

@yaoqin,

Did you forget to add email body?

D.


On Tue, Sep 8, 2015 at 11:40 PM, yaoqin <[hidden email]> wrote:




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/How-to-access-data-stored-in-IGFS-or-IGFS-as-Hadoop-cache-tp1278p1310.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

yaoqin

Re: How to access data stored in IGFS or IGFS as Hadoop cache

My apologies.

And thanks, it works now that I have added more JVM heap memory.


From: Dmitriy Setrakyan [mailto:[hidden email]]
Sent: 2015-09-09 15:04
To: user; yaoqin
Subject: Re: How to access data stored in IGFS or IGFS as Hadoop cache

 
