IgniteDataStreamer.keepBinary proposal

vtchernyi

IgniteDataStreamer.keepBinary proposal

Hi, community

 

I've just finished working through a short page [1] about Ignite data streaming and I want to share my impressions. The situation is common to many Ignite documentation pages, and my impressions of them are the same.

 

My task was to adapt IgniteDataStreamer to load data in the binary format, as described in my article [2]. I tried to use the same approach:

1) load data on the client node;

2) convert it to the binary form;

3) use IgniteDataStreamer/StreamReceiver pair (instead of ComputeTaskAdapter/ComputeJobAdapter) to ingest data in the cache.

 

I modified my production code to use IgniteDataStreamer<BinaryObject, BinaryObject> and StreamReceiver<BinaryObject, BinaryObject>, and tried to start it on a dev cluster made of 2 server nodes and 1 client node. The result: a ClassNotFoundException for a class that exists on the client node only.

 

The solution to the problem seems to be setting streamer.keepBinary(true), but page [1] never mentions it. I found that setter in the IgniteDataStreamer source code after a full day of troubleshooting. Definitely, "In Ignite We Trust" - what other reason would drive me to spend so much time?
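
For illustration, the three steps above with keepBinary enabled might look like the minimal sketch below. It is not taken from page [1] or from my production code: the cache name "records", the type names "RecordKey" and "Record", and the field names are hypothetical, and it assumes that the receiver class itself is available on the server nodes (on their classpath or through peer class loading). For simplicity it starts a single in-process node rather than a client connected to a cluster.

```java
import java.util.Collection;
import java.util.Map;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.IgniteException;
import org.apache.ignite.Ignition;
import org.apache.ignite.binary.BinaryObject;
import org.apache.ignite.stream.StreamReceiver;

public class BinaryStreamingSketch {
    /** Hypothetical cache name, used only for this sketch. */
    static final String CACHE = "records";

    /** Receiver that touches BinaryObjects only, so it needs no user POJO classes. */
    static class BinaryReceiver implements StreamReceiver<BinaryObject, BinaryObject> {
        @Override
        public void receive(IgniteCache<BinaryObject, BinaryObject> cache,
                            Collection<Map.Entry<BinaryObject, BinaryObject>> entries)
                throws IgniteException {
            for (Map.Entry<BinaryObject, BinaryObject> e : entries)
                cache.put(e.getKey(), e.getValue());
        }
    }

    /** Builds {@code count} binary records and streams them into the cache. */
    static void stream(Ignite ignite, int count) {
        ignite.getOrCreateCache(CACHE);

        // try-with-resources closes the streamer, which flushes buffered data.
        try (IgniteDataStreamer<BinaryObject, BinaryObject> streamer = ignite.dataStreamer(CACHE)) {
            // Without this flag the entries are deserialized before they reach the receiver.
            streamer.keepBinary(true);
            streamer.receiver(new BinaryReceiver());

            for (int i = 0; i < count; i++) {
                // Hypothetical type and field names; no matching Java classes are required.
                BinaryObject key = ignite.binary().builder("RecordKey")
                    .setField("id", i)
                    .build();
                BinaryObject val = ignite.binary().builder("Record")
                    .setField("id", i)
                    .setField("payload", "message-" + i)
                    .build();

                streamer.addData(key, val);
            }
        }
    }

    public static void main(String[] args) {
        // Starts one node with the default configuration.
        try (Ignite ignite = Ignition.start()) {
            stream(ignite, 5);
            System.out.println("Entries in cache: " + ignite.cache(CACHE).size());
        }
    }
}
```

In a real deployment the records would come from the data loaded on the client node instead of the loop counter, and the receiver could apply its own logic to the binary fields before writing to the cache.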

 

The code snippets on page [1] are hard to apply in real-world applications because they use only simple types such as String and Integer. They are more like unit tests.

 

My proposal: it would be great to create a small GitHub repo containing a complete, compilable code example, one repo for every page. I think such repos would keep new Ignite users in the project and prevent them from leaving.

 

Regards,

Vladimir Tchernyi

--

[1] https://ignite.apache.org/docs/latest/data-streaming

[2] https://www.gridgain.com/resources/blog/how-fast-load-large-datasets-apache-ignite-using-key-value-api

 

dmagda

Re: IgniteDataStreamer.keepBinary proposal

Hi Vladimir, 

Most of the code snippets are already arranged in complete, ready-to-use samples:

Anyway, those are code snippets that are injected into quite generic documentation pages. Your case represents a situation where someone needs to work with binary objects and streaming APIs. What if we add a data streamer example for BinaryObjects to Ignite's examples and reference that example from the documentation page? Are you interested in contributing the example?

-
Denis


On Fri, Dec 4, 2020 at 2:58 AM Vladimir Tchernyi <[hidden email]> wrote:

vtchernyi

Re: IgniteDataStreamer.keepBinary proposal

Hi Denis, 

I think the code examples we already have do not show the nature of Ignite as a DISTRIBUTED database. These examples are oriented toward a single-node start. An inexperienced user can get the false impression that a single Ignite node can outperform, for example, a commercial database server.

IMHO the documentation should be written for a multi-node Ignite cluster. I do not understand the purpose of showing how to stream 100_000 integer values into a cache defined as <Integer, String>. In the real world, I need to stream structured records (Kafka Avro messages), and I will create a POJO to hold each message. It is known that Ignite does not peer-deploy user POJOs, so using BinaryObject is the only way to forward my POJOs to the remote nodes (correct me if I am wrong).

I trust Ignite, and I have managed to create a really fast Ignite app in production. But recently I once again had a long-forgotten feeling: the page is nice but hard to use. I hope my experience will help to improve the documentation.

Vladimir

P.S.
As for contributing, I need some time to get my Kafka Ignite app into production to be sure of it. After that, I will be ready to contribute.

On Sat, Dec 5, 2020 at 06:31, Denis Magda <[hidden email]> wrote: