Hadoop client configuration IGFS - hdfs dfs

joaquinsanroman
Hadoop client configuration IGFS - hdfs dfs

Hi,

First of all, thank you very much for your help.

I have configured an IGFS cluster without an HDFS secondary file system,
because the intention is to use IGFS as independent storage.

The configuration file for all server nodes is the following:

<?xml version="1.0" encoding="UTF-8"?>

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans.xsd
       http://www.springframework.org/schema/util
       http://www.springframework.org/schema/util/spring-util.xsd">

   
    <description>
        Spring file for Ignite node configuration with IGFS and Apache
Hadoop map-reduce support enabled.
        Ignite node will start with this configuration by default.
    </description>

   
    <bean id="propertyConfigurer"
class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
        <property name="systemPropertiesModeName"
value="SYSTEM_PROPERTIES_MODE_FALLBACK"/>
        <property name="searchSystemEnvironment" value="true"/>
    </bean>

   
    <bean id="grid.cfg"
class="org.apache.ignite.configuration.IgniteConfiguration">
       
        <property name="connectorConfiguration">
            <bean
class="org.apache.ignite.configuration.ConnectorConfiguration">
                <property name="port" value="11211"/>
            </bean>
        </property>

        <property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.zk.ZookeeperDiscoverySpi">
      <property name="zkConnectionString"
value="sanlbeclomi0001.santander.pre.corp:2181,sanlbeclomi0002.santander.pre.corp:2181,sanlbeclomi0003.santander.pre.corp:2181"/>
      <property name="sessionTimeout" value="30000"/>
      <property name="zkRootPath" value="/apacheIgnite"/>
      <property name="joinTimeout" value="10000"/>
    </bean>
  </property>

       
        <property name="fileSystemConfiguration">
            <list>
                <bean
class="org.apache.ignite.configuration.FileSystemConfiguration">
                   
                    <property name="name" value="igfs"/>
                    <property name="ipcEndpointEnabled" value="true"/>

                    <property name="blockSize" value="#{128 * 1024}"/>
                    <property name="perNodeBatchSize" value="512"/>
                    <property name="perNodeParallelBatchCount" value="16"/>

                   
                    <property name="prefetchBlocks" value="32"/>

                     
                    <property name="dataCacheConfiguration">
                        <bean class="org.apache.ignite.configuration.CacheConfiguration">
                           
                            <property name="name" value="myDataCache"/>
                           
                            <property name="cacheMode" value="PARTITIONED"/>
                           
                            <property name="atomicityMode"
value="TRANSACTIONAL"/>
                            <property name="backups" value="1"/>
                        </bean>
                    </property>
                   
                   
                    <property name="metaCacheConfiguration">
                        <bean class="org.apache.ignite.configuration.CacheConfiguration">
                           
                            <property name="name" value="myMetaCache"/>
                           
                            <property name="cacheMode" value="PARTITIONED"/>
                           
                            <property name="atomicityMode"
value="TRANSACTIONAL"/>
                            <property name="backups" value="1"/>
                           
                        </bean>
                    </property>

                   
                    <property name="ipcEndpointConfiguration">
                        <bean class="org.apache.ignite.igfs.IgfsIpcEndpointConfiguration">
                            <property name="type" value="TCP"/>
                            <property name="host" value="0.0.0.0"/>
                            <property name="port" value="10500"/>
                        </bean>
                    </property>

                </bean>

            </list>
        </property>
    </bean>
</beans>

I can access IGFS through Java by setting the same configuration file
and adding Ignition.setClientMode(true).
In Spark, I am also able to access the data remotely by setting client mode
and including the libraries in the classpath.
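The Java access described above can be sketched as follows (a minimal sketch; the configuration file path is an assumption, the rest uses the standard Ignite API):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteFileSystem;
import org.apache.ignite.Ignition;
import org.apache.ignite.igfs.IgfsPath;

public class IgfsList {
    public static void main(String[] args) {
        // Join the cluster as a client node rather than starting a server.
        Ignition.setClientMode(true);
        // Same Spring XML as the server nodes (the path is an assumption).
        try (Ignite ignite = Ignition.start("config/igfs-config.xml")) {
            // Look up the file system by the name configured above ("igfs").
            IgniteFileSystem fs = ignite.fileSystem("igfs");
            for (IgfsPath p : fs.listPaths(new IgfsPath("/")))
                System.out.println(p);
        }
    }
}
```

This requires a running cluster reachable via the discovery configuration in the Spring file.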

My problem is when I try to get information or files directly through "hdfs
dfs". I have set the following properties inside core-site.xml:

<property>
    <name>fs.igfs.impl</name>
    <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
    <description>Class name mapping</description>
</property>

<property>
    <name>fs.AbstractFileSystem.igfs.impl</name>
    <value>org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem</value>
    <description>Class name mapping</description>
</property>

<property>
    <name>fs.igfs.igfs.config_path</name>
    <value>/tmp/IGFSConfigZookeeper.xml</value>
    <description>Class name mapping</description>
</property>

<property>
    <name>fs.igfs.igfs.endpoint.no_embed</name>
    <value>true</value>
    <description>Class name mapping</description>
</property>

The /tmp/IGFSConfigZookeeper.xml file has the same configuration as the
Ignite servers, but adds the property:

<property name="clientMode" value="true"/>
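In context, the client-side Spring file would look something like this (a sketch; only the differing property is shown, the rest matches the server file above):

```xml
<bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Join the cluster as a client node instead of a server. -->
    <property name="clientMode" value="true"/>
    <!-- discoverySpi, fileSystemConfiguration, etc. as in the server file -->
</bean>
```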

When I run the command "hdfs dfs -ls igfs://igfs@/", I get the error:

ls: Failed to communicate with IGFS: Failed to connect to IGFS
[endpoint=igfs://igfs@, attempts=[[type=SHMEM, port=10500,
err=java.io.IOException: Failed to connect shared memory endpoint to port
(is shared memory server endpoint up and running?): 10500], [type=TCP,
host=127.0.0.1, port=10500, err=java.io.IOException: Failed to connect to
endpoint [host=127.0.0.1, port=10500]]] (ensure that IGFS is running and
have IPC endpoint enabled; ensure that ignite-shmem-1.0.0.jar is in Hadoop
classpath if you use shared memory endpoint).

However, it works when I execute: hdfs dfs -ls igfs://igfs@/host1:10500/
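For reference, the documented IGFS URI format places the endpoint in the authority part (host1 here is a placeholder, as above):

```shell
# Explicit endpoint in the authority part:
hdfs dfs -ls igfs://igfs@host1:10500/
# No endpoint: the connector falls back to 127.0.0.1:10500,
# which matches the connection error above.
hdfs dfs -ls igfs://igfs@/
```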

I think the problem is that I am not setting the client connection properties
correctly in the /tmp/IGFSConfigZookeeper.xml file.

Could you help me with the error?

Thank you very much,
Regards.




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ilya.kasnacheev
Re: Hadoop client configuration IGFS - hdfs dfs

Hello!

I don't think you actually want to use SHMEM. How about just using @localhost:10500?
--
Ilya Kasnacheev


On Thu, 23 May 2019 at 13:15, joaquinsanroman <[hidden email]> wrote:
joaquinsanroman
Re: Hadoop client configuration IGFS - hdfs dfs

Hi Ilya,

I am not using SHMEM, because the client and the servers are on different
hosts.

If I use @localhost:10500 it will never connect, because no Ignite node is
running on localhost.

My intention is to connect to the cluster remotely.

Do you know how to do it?

Thank you very much,
Regards.



ilya.kasnacheev
Re: Hadoop client configuration IGFS - hdfs dfs

Hello!

I don't understand what you are trying to do. Do you want igfs://igfs@/ to spawn a client node that would connect to a cluster and do IGFS operations?

Regards,
--
Ilya Kasnacheev


On Thu, 23 May 2019 at 16:16, joaquinsanroman <[hidden email]> wrote:
joaquinsanroman
Re: Hadoop client configuration IGFS - hdfs dfs

Hi,

Yes, this is what I need. When I run "hdfs dfs -ls igfs://igfs@" on an
external node (one that does not run an Ignite node), it should connect to a
defined endpoint host:port to do IGFS operations.

I have checked the documentation and found two properties (File system URI:
https://apacheignite-fs.readme.io/docs/file-system):

- IgfsIpcEndpointConfiguration.host
- IgfsIpcEndpointConfiguration.port

I have configured them in my HDFS core-site.xml, but it fails with the same
error.

Regards!





ilya.kasnacheev
Re: Hadoop client configuration IGFS - hdfs dfs

Hello!

It seems that IGFS can try to use a client node from the same process if fs.igfs.igfs.endpoint.no_embed is set to false. But you have it set to true, so it will use IPC by host/port.

Regards,
--
Ilya Kasnacheev


On Thu, 23 May 2019 at 17:20, joaquinsanroman <[hidden email]> wrote:
joaquinsanroman
Re: Hadoop client configuration IGFS - hdfs dfs

Hi,

I set fs.igfs.igfs.endpoint.no_embed to false, but it still does not work.
This is the current situation:

[xxx@snnni0006 ~]$ hdfs getconf -confkey fs.igfs.igfs.endpoint.no_embed
false

[xxx@snnni0006 ~]$ hdfs getconf -confkey IgfsIpcEndpointConfiguration.host
snnni0010

[xxx@snnni0006 ~]$ hdfs getconf -confkey IgfsIpcEndpointConfiguration.port
10500

Am I configuring something wrong?

Regards.







ilya.kasnacheev
Re: Hadoop client configuration IGFS - hdfs dfs

Hello!

Unfortunately I'm not familiar enough with IGFS to answer such a question purely from configuration variables.

What does it say when it does not work? Have you tried starting an Ignite client in the same VM prior to the launch? If you don't have an Ignite client, you will have to use IPC.

Regards,
--
Ilya Kasnacheev


On Thu, 23 May 2019 at 18:19, joaquinsanroman <[hidden email]> wrote:
joaquinsanroman
Re: Hadoop client configuration IGFS - hdfs dfs

Hi,

Thank you very much for your help!

You mean that I need to run an Ignite client on the host from which I want
to make the query? With this configuration, hdfs will access the cluster
through the local client, right?

I would like to access it without having to run an Ignite client locally,
but this solves my problem temporarily.

If I wanted to access through IPC, what would I need to configure?

Kind regards.



ilya.kasnacheev
Re: Hadoop client configuration IGFS - hdfs dfs

Hello!

igfs://igfs@/host1:10500/ is access through IPC, and you said it works for you.

You will definitely need to run a client anyway if you want to use the Ignite configuration, with discovery and so on.

Regards,
--
Ilya Kasnacheev


On Thu, 23 May 2019 at 18:40, joaquinsanroman <[hidden email]> wrote:
joaquinsanroman
Re: Hadoop client configuration IGFS - hdfs dfs

Hi,

OK, I will start a client, because with igfs://igfs@/host1:10500/ it will
not balance the access across all Ignite servers.

Thank you very much,
Regards.



dmagda
Re: Hadoop client configuration IGFS - hdfs dfs


Not sure what your goals for IGFS are, but that thread is worth reading.

-
Denis


On Fri, May 24, 2019 at 3:24 AM joaquinsanroman <[hidden email]> wrote: