MapReduce with external databases

tcostasouza

MapReduce with external databases

Hello,

I've just started evaluating Apache Ignite. Does it support data-collocated processing by overlaying an Ignite cluster on another storage cluster (e.g., Cassandra)?

Thanks
dsetrakyan

Re: MapReduce with external databases

tcostasouza wrote
Does it support data-collocated processing by overlaying an Ignite cluster on another storage cluster (e.g., Cassandra)?
Yes, Ignite supports a pluggable AffinityFunction API, which can be configured through CacheConfiguration.

You should be able to create an implementation that simply delegates to one of the Cassandra Partitioner implementations.

Is this what you were looking for?
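
To make the delegation idea concrete, here is a minimal sketch in plain Java with no Ignite or Cassandra dependency. The class name DelegatingPartitionMapper and the token function passed to it are assumptions for illustration only: the token function stands in for a Cassandra partitioner, and the class mirrors just the key-to-partition part of an affinity function. A real AffinityFunction implementation also has to assign partitions to nodes, which is not shown here.

```java
import java.util.function.ToLongFunction;

/** Hypothetical sketch: maps keys to partitions by delegating to an external token function. */
final class DelegatingPartitionMapper {
    /** Stand-in for a Cassandra partitioner: turns a key into a token. */
    private final ToLongFunction<Object> tokenFn;

    /** Total number of partitions. */
    private final int partitions;

    DelegatingPartitionMapper(ToLongFunction<Object> tokenFn, int partitions) {
        if (partitions <= 0)
            throw new IllegalArgumentException("partitions must be positive");

        this.tokenFn = tokenFn;
        this.partitions = partitions;
    }

    int partitions() {
        return partitions;
    }

    int partition(Object key) {
        // floorMod keeps the result in [0, partitions) even for negative tokens.
        return (int) Math.floorMod(tokenFn.applyAsLong(key), (long) partitions);
    }
}
```

Because the token comes from the external function, two keys that the external store places together also land in the same partition here, which is the property the delegation is meant to preserve.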
tcostasouza

Re: MapReduce with external databases

Hello!

But isn't this tied to Ignite's cache? Wouldn't I need to cache the external data first for this to work?

Thanks

On Mon, May 18, 2015 at 5:56 AM dsetrakyan [via Apache Ignite Users] <[hidden email]> wrote:
Yes, Ignite supports a pluggable AffinityFunction API, which can be configured through CacheConfiguration.

You should be able to create an implementation that simply delegates to one of the Cassandra Partitioner implementations.

Is this what you were looking for?



If you reply to this email, your message will be added to the discussion below:
http://apache-ignite-users.70518.x6.nabble.com/MapReduce-with-external-databases-tp311p312.html
Yakov Zhdanov

Re: MapReduce with external databases

I am not sure I understand you.

I assume your primary data access point will be an Ignite cache backed by Cassandra as a persistent store. Is that the idea?

Or do you want to run map-reduce jobs with Ignite on data currently stored in Cassandra and route those jobs to the hosts holding the data?

--
Yakov Zhdanov, Director R&D
GridGain Systems

2015-05-18 14:49 GMT+03:00 tcostasouza <[hidden email]>:
Hello!

But isn't this tied to Ignite's cache? Wouldn't I need to cache the external data first for this to work?

Thanks

Sent from the Apache Ignite Users mailing list archive at Nabble.com.

tcostasouza

Re: MapReduce with external databases

Hello,

The second one. I would like to run map-reduce jobs with Ignite on data currently stored in Cassandra and route those jobs to the hosts holding the data.

Cheers!

On Mon, May 18, 2015 at 9:04 AM Yakov Zhdanov [via Apache Ignite Users] <[hidden email]> wrote:
I am not sure I understand you.

I assume your primary data access point will be an Ignite cache backed by Cassandra as a persistent store. Is that the idea?

Or do you want to run map-reduce jobs with Ignite on data currently stored in Cassandra and route those jobs to the hosts holding the data?

--
Yakov Zhdanov, Director R&D
GridGain Systems

yakov

Re: MapReduce with external databases

I see now.

Well, you do not need to start or configure any caches in this case, but you will need to route your jobs or closures to the proper nodes manually, so you will have to get the data distribution info from Cassandra yourself.

See this example for the idea:

        // YourCassandraMapper is a placeholder for your own code that asks
        // Cassandra which host owns the given key.
        final String host = YourCassandraMapper.hostForKey(someKey);

        ignite.compute(
            // Restrict the computation to the Ignite nodes running on that host.
            ignite.cluster().forPredicate(
                new IgnitePredicate<ClusterNode>() {
                    @Override public boolean apply(ClusterNode node) {
                        return node.hostNames().contains(host);
                    }
                }))
            .call(
                new IgniteCallable<Object>() {
                    @Override public Object call() throws Exception {
                        // This callable will be executed on the host
                        // that holds the data.

                        return null;
                    }
                }
            );
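
For the "get distribution info from Cassandra" part, the following is a rough, self-contained sketch of what a key-to-host lookup and the grouping step of a map-reduce job could look like. It is not Cassandra's real partitioner: TokenRingSketch, its method names, and the single-token-per-host ring are all assumptions made for illustration, and replication and virtual nodes are ignored. In practice the token-to-host mapping would come from the Cassandra driver or cluster metadata.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified token ring: each host owns the tokens up to and including its own. */
final class TokenRingSketch {
    /** Ring position (token) to host name, kept sorted by token. */
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addHost(long token, String host) {
        ring.put(token, host);
    }

    String hostForToken(long token) {
        if (ring.isEmpty())
            throw new IllegalStateException("ring has no hosts");

        Map.Entry<Long, String> e = ring.ceilingEntry(token);

        // Wrap around to the first host when the token is past the last entry.
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    /** Map step: group tokens by owning host so that one job per host can be submitted. */
    Map<String, List<Long>> groupByHost(List<Long> tokens) {
        Map<String, List<Long>> byHost = new TreeMap<>();

        for (long t : tokens)
            byHost.computeIfAbsent(hostForToken(t), h -> new ArrayList<>()).add(t);

        return byHost;
    }
}
```

Each entry of the grouped map corresponds to one host name that can be fed to the node predicate in the example above. The callable sent to that host processes only its own list, and the caller combines the returned values as the reduce step.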

--Yakov

2015-05-18 15:04 GMT+03:00 tcostasouza <[hidden email]>:
Hello,

The second one. I would like to run map-reduce jobs with Ignite on data currently stored in Cassandra and route those jobs to the hosts holding the data.

Cheers!


tcostasouza

Re: MapReduce with external databases

Thanks for the info! I'll explore this!

Regards

On Mon, May 18, 2015, 09:27 Yakov Zhdanov-2 [via Apache Ignite Users] <[hidden email]> wrote:
I see now.

Well, you do not need to start or configure any caches in this case, but you will need to route your jobs or closures to the proper nodes manually, so you will have to get the data distribution info from Cassandra yourself.
