Best Practice for Leveraging Ignite to generate large number records files

diopek
This post has NOT been accepted by the mailing list yet.
We are developing a batch application that will eventually generate an unordered data file of around 1GB containing a large number of records, using multiple threads/processes. What would be the best practice for accomplishing this with Ignite? Should we use an Ignite cache with the write-behind (file system) flag enabled, or use an Ignite cache with one process writing records into it and another process reading and removing records from it? Any other suggestions for improving the performance of this batch would also be welcome.
The batch process also uses the Spring Batch partitioning (local/remote) feature. I noticed that the Spring Batch reference documentation mentions GridGain as one possible middleware that can serve as the grid fabric. Is there any utility library that provides this Spring Batch remote partitioning mechanism for Ignite/GridGain?
vkulichenko
Re: Best Practice for Leveraging Ignite to generate large number records files

It sounds like you can use IgniteDataStreamer: https://apacheignite.readme.io/v1.3/docs/data-streamers. I think you should implement and configure your own StreamReceiver that writes the data to files. Note that this way you will bypass the cache (i.e. no data will be saved in memory), but entries will still be mapped to nodes by affinity, just as happens with entries stored in a cache. The streamer does this mapping automatically, as well as the batching.
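
As a rough illustration of the file-writing part only, here is a minimal sketch of a helper that a custom StreamReceiver could delegate to. The class name BatchFileWriter, the Long/String key and value types, and the one-line-per-record output format are all assumptions made for the example, not anything prescribed by Ignite. The Ignite wiring is described only in the trailing comment so that the block compiles with the JDK alone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: appends every received batch to a single output file. */
class BatchFileWriter {
    private final Path outputFile;

    BatchFileWriter(Path outputFile) {
        this.outputFile = outputFile;
    }

    /**
     * Takes a batch in the same shape as the entries argument of
     * StreamReceiver.receive(cache, entries) and appends one line per value.
     * Synchronized on the assumption that batches may arrive on several threads.
     */
    synchronized void writeBatch(Collection<? extends Map.Entry<Long, String>> entries) {
        List<String> lines = new ArrayList<>(entries.size());
        for (Map.Entry<Long, String> e : entries) {
            lines.add(e.getValue());
        }
        try {
            // CREATE + APPEND: create the file on first use, then keep appending.
            Files.write(outputFile, lines, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}

// Wiring idea (needs ignite-core, so it is not compiled here):
// a class implementing StreamReceiver<Long, String> would call
// writer.writeBatch(entries) from its receive(cache, entries) method and be
// registered with streamer.receiver(...). The writer should be created on the
// receiving node rather than captured, since the receiver is sent to the nodes.
```

Since the file is meant to be unordered, appending batches in arrival order is enough here. Each node would end up with its own file, so a final step to concatenate them would still be needed if a single file is required.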

Hope this helps.

-Val