C++ Distributed cache for caching files

rajs123

C++ Distributed cache for caching files

Hi,

In my current program, I frequently need to read multiple files. Those files are currently stored on an NFS server, and I access them through it. I want to move to a distributed system with multiple machines, where the files are distributed across the caches of those machines.

Is there any way to do that using Ignite?

I understand this can be achieved using IGFS. However, it is implemented in Java and there is no C++ library for it.

I initially tried using a cache<string, string> of (filename, contents). What would the performance of such an approach be?
My file sizes are typically 20-50MB.
Is there any workaround?

Vladimir Ozerov

Re: C++ Distributed cache for caching files

Hi,

IGFS is a Hadoop-compliant distributed file system. If your application can work with Hadoop file systems, then you can just set up IGFS on top of one and cache data from the Hadoop file system.
If you want to use a POSIX-style distributed file system such as NFS, IGFS is not able to work with it at the moment, though we have it on the roadmap.

As for the performance of the straightforward solution with cache<string, byte*> or cache<string, string>, it is really hard to say what the performance gain will be, because it strongly depends on the actual data distribution. It is best to try it and share your performance numbers and configuration. We will analyze them and give recommendations.
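
One way to collect such numbers is to time the put and the get separately. The following is only a minimal sketch: `TimeMicros`, `RoundTrip` and `MeasureRoundTrip` are hypothetical names, and a `std::map` stands in for the Ignite cache so that the snippet builds without Ignite. With a real cache, the two lambdas would wrap `cache.Put(key, val)` and `cache.Get(key)` instead.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical helper: runs `op` once and returns the elapsed time
// in microseconds, measured with a monotonic clock.
template <typename Op>
long long TimeMicros(Op op) {
    auto start = std::chrono::steady_clock::now();
    op();
    auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
}

// Stand-in for the distributed cache (an assumption, so that the sketch
// compiles without Ignite).
typedef std::map<std::string, std::string> FakeCache;

// Timings of one put followed by one get, plus the value read back.
struct RoundTrip {
    long long putMicros;
    long long getMicros;
    std::string value;
};

// Stores `val` under `key`, reads it back, and reports both timings.
RoundTrip MeasureRoundTrip(FakeCache& cache, const std::string& key, const std::string& val) {
    RoundTrip r;
    r.putMicros = TimeMicros([&] { cache[key] = val; });
    r.getMicros = TimeMicros([&] { r.value = cache[key]; });
    return r;
}
```

For values in the 20-50MB range it is probably worth repeating the measurement several times and reporting the cache configuration together with the numbers.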

Vladimir.

On Thu, Apr 14, 2016 at 1:54 PM, rajs123 <[hidden email]> wrote:

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/C-Distributed-cache-for-caching-files-tp4158.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

rajs123

Re: C++ Distributed cache for caching files

Hi,

I tried to cache the contents of a file using the C++ cache, but I get the following error:
An error occurred: Java exception occurred [cls=org.apache.ignite.internal.processors.platform.PlatformNoCallbackException, msg=Callback handler is not set in native platform.]

I think it might have to do with the large size of the file content, so I tried the following as a value, which gives the same error.


string val = "";
for (int i = 0; i < 1000; i++)
        val = val + "aaaaa\n";


When the loop runs to only 100 iterations, the code works.

Current configuration of nodes:
2 nodes started with the following commands on the same machine.
./modules/platforms/cpp/ignite/ignite -springConfigUrl=config/config2.xml
./modules/platforms/cpp/ignite/ignite -springConfigUrl=config/config2.xml -jvmMaxMemoryMB=4096

config2.xml:
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                            http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="peerClassLoadingEnabled" value="true"/>
        <property name="cacheConfiguration">
            <list>
                <bean class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="backups" value="1"/>
                </bean>
            </list>
        </property>
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
                        <property name="addresses">
                            <list>
                                <value>10.0.1.48</value>
                            </list>
                        </property>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>

Vladimir Ozerov

Re: C++ Distributed cache for caching files

Hi,

This is a known issue which will be fixed in the upcoming 1.6 release - https://issues.apache.org/jira/browse/IGNITE-2564
For now you can do one of the following things:
1) Build Ignite from the current master branch.
2) Patch your Ignite version manually using the following Pull Request: https://github.com/apache/ignite/pull/460/files
3) Or, the most straightforward but not entirely correct workaround: change "1024" to some larger number in modules/platforms/cpp/core/src/impl/ignite_environment.cpp, method IgniteEnvironment::AllocateMemory.
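
The two test values from the earlier message fall on either side of that "1024". The sketch below rebuilds both values and compares their sizes with the constant. Treating 1024 as a capacity in bytes is an assumption made for illustration, not something stated in this thread, and `kInitialCapacity`, `BuildValue` and `ExceedsInitialCapacity` are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Assumed meaning of the "1024" literal: an initial buffer capacity in bytes.
const std::size_t kInitialCapacity = 1024;

// Rebuilds the test value from the earlier message:
// `repeats` copies of "aaaaa\n".
std::string BuildValue(int repeats) {
    std::string val;
    for (int i = 0; i < repeats; i++)
        val += "aaaaa\n";  // 6 bytes per iteration
    return val;
}

// True when the value alone already exceeds the assumed capacity.
bool ExceedsInitialCapacity(const std::string& val) {
    return val.size() > kInitialCapacity;
}
```

`BuildValue(100)` is 600 bytes and stays under the constant, while `BuildValue(1000)` is 6000 bytes and does not, which is consistent with the loop working up to 100 iterations. Under the same assumption, the replacement number would have to be larger than the biggest value to be stored, plus the key and any serialization overhead.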

Vladimir.

On Tue, Apr 19, 2016 at 10:17 AM, rajs123 <[hidden email]> wrote:

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/C-Distributed-cache-for-caching-files-tp4158p4312.html

rajs123

Re: C++ Distributed cache for caching files

Hi,

I changed 1024 to 2048 and recompiled the module.
I get the following error:

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_S_construct null not valid
Aborted

Igor Sapego

Re: C++ Distributed cache for caching files

Hi,

Can you share the code of your test so we can investigate it?

Best Regards,
Igor

On Tue, Apr 19, 2016 at 12:45 PM, rajs123 <[hidden email]> wrote:

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/C-Distributed-cache-for-caching-files-tp4158p4318.html

rajs123

Re: C++ Distributed cache for caching files

Code:

#include <iostream>
#include <string>
#include <fstream>
#include <streambuf>
#include <cerrno>
#include <ctime>
#include <chrono>
#include "ignite/ignite.h"
#include "ignite/ignition.h"

using namespace ignite;
using namespace cache;
using namespace std;

std::string get_file_contents(const char *filename)
{
    std::ifstream in(filename, std::ios::in | std::ios::binary);
    if (in)
    {
        return(std::string((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>()));
    }
    throw(errno);
}

void PutFile(Cache<string, string>& cache, const char * file) {
    cout<<file<<endl;
    string key(file);
    string val = "";
    auto start = std::chrono::high_resolution_clock::now();
    //val=get_file_contents(file);
    for (int i = 0; i < 1000; i++)
        val = val + "aaaaa\n";
    auto elapsed = std::chrono::high_resolution_clock::now() - start;
    long long microseconds = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    cout<<val.length()<<" "<<microseconds<<"!!"<<endl;

    start = std::chrono::high_resolution_clock::now();
    cache.Put(key, val);
    elapsed = std::chrono::high_resolution_clock::now() - start;
    microseconds = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    cout<<"Put time: "<<" "<<key<<" "<<microseconds<<"!!"<<endl;

    start = std::chrono::high_resolution_clock::now();
    val = cache.Get(key);
    elapsed = std::chrono::high_resolution_clock::now() - start;
    std::ofstream out("output");
    out << val;
    out.close();
    microseconds = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    cout<<"Get time: "<<microseconds<<" "<<val.length()<<" !!"<<endl;
    std::cout << ">>> Retrieved organization instance from cache: " << std::endl;

    /*auto start = std::chrono::high_resolution_clock::now();
      val = cache.Get(key);
      auto elapsed = std::chrono::high_resolution_clock::now() - start;
      long  long microseconds = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
      cout<<"Get time: "<<microseconds<<" "<<val.length()<<" !!"<<endl;
      std::cout << ">>> Retrieved organization instance from cache: " << std::endl;*/
    //std::cout << val << std::endl;
    std::cout << std::endl;
}

int main(int argc, const char * argv[]) {

    IgniteConfiguration cfg;

    cfg.jvmInitMem = 512;
    cfg.jvmMaxMem = 1024*2;

    cfg.springCfgPath = "config/example-cache.xml";
    std::cout << std::endl;
    std::cout << ">>> Example started ..." << std::endl;
    try {
        Ignite grid = Ignition::Start(cfg);
        Cache<string, string> cache = grid.GetOrCreateCache<string, string>("example");
        // Cache<string, string> cache = grid.GetCache<string, string>("example");

        PutFile(cache, argv[1]);

    } catch (IgniteError& err) {
        std::cout << "An error occurred: " << err.GetText() << std::endl;
    }

    std::cout << std::endl;
    std::cout << ">>> Example finished, press any key to exit ..." << std::endl;
    std::cout << std::endl;
    return 0;
}

Igor Sapego

Re: C++ Distributed cache for caching files

Hello,

This kind of error can occur if argv[1] is not set.
Do you pass an argument to the test when you run it?
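
This failure mode can be reproduced without Ignite: `PutFile` builds `string key(file)` from `argv[1]`, and constructing a `std::string` from a null pointer is undefined behavior, which libstdc++ reports as the `std::logic_error` quoted earlier in the thread. A minimal sketch of a guard follows; `FileArgument` is a hypothetical helper name and is not part of the posted test.

```cpp
#include <string>

// Hypothetical helper: returns true and fills `out` only when a file name
// was actually passed on the command line.
bool FileArgument(int argc, const char* argv[], std::string& out) {
    if (argc < 2 || argv[1] == 0)
        return false;  // argv[1] is a null pointer when no argument is given
    out = argv[1];
    return true;
}
```

Calling this in `main` before `PutFile(cache, argv[1])`, and printing a usage message when it returns `false`, would turn the abort into a readable error.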

Best Regards,
Igor

On Wed, Apr 20, 2016 at 5:34 PM, rajs123 <[hidden email]> wrote:

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/C-Distributed-cache-for-caching-files-tp4158p4376.html