History, multiple source systems, Data Vault using Ignite...

Mikhail Fokanov

History, multiple source systems, Data Vault using Ignite...

Hello,

We have a typical task: we need to implement an application that will receive data (and updates) from multiple source systems. There will also be a default data source (our own), which can be updated by our application. Only the latest version of the data should be the "actual" one that can be retrieved from our application, but a full audit trail of updates from every system must always be kept so that issues can be investigated. The team is now considering Data Vault [1] as one possible solution (though to me it looks superfluous). Is it a good option to implement a Data Vault architecture on top of Ignite? Has anybody implemented applications with such requirements? We want to use Ignite because in the future we will have a data analytics process (machine learning). What possible solutions for this task using Ignite do you see?

[1] https://en.wikipedia.org/wiki/Data_vault_modeling 
--
Best Regards,
Mikhail
ezhuravlev

Re: History, multiple source systems, Data Vault using Ignite...

Hi,

Yes, it's possible to implement storage for previous values in Ignite. For example, you can use a ContinuousQuery listener annotated with @IgniteAsyncCallback to update values in your cache. When a value is updated, you can put the previous value (which is available in the event listener) into another cache of previous values and attach a version to it (for example, the time of the update). Documentation about ContinuousQuery: https://apacheignite.readme.io/docs/continuous-queries

Or, if you need stronger guarantees, you can use transactions: insert the previous value into the history cache and update the current value in a single transaction. https://apacheignite.readme.io/docs/transactions
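
The idea of moving the previous value into a separate history store can be sketched as follows. This is a plain-Java model rather than Ignite code: two in-memory maps stand in for the "current" and "history" caches, and a synchronized method stands in for the transaction. The names VersionedStore and Versioned are hypothetical and do not come from the Ignite API.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** A value tagged with the source system that wrote it and the time of the update. */
final class Versioned<V> {
    final V value;
    final String source;
    final Instant updatedAt;

    Versioned(V value, String source, Instant updatedAt) {
        this.value = value;
        this.source = source;
        this.updatedAt = updatedAt;
    }
}

/** Plain-Java stand-in (assumption): two maps play the role of two caches. */
class VersionedStore<K, V> {
    private final Map<K, Versioned<V>> current = new HashMap<>();
    private final Map<K, List<Versioned<V>>> history = new HashMap<>();

    /** Writes a new value; the value it replaces, if any, is appended to the history. */
    synchronized void put(K key, V value, String source, Instant now) {
        // Both steps happen under one lock, mimicking "one transaction".
        Versioned<V> previous = current.put(key, new Versioned<>(value, source, now));
        if (previous != null) {
            history.computeIfAbsent(key, k -> new ArrayList<>()).add(previous);
        }
    }

    /** Returns the latest ("actual") value, or null if the key is unknown. */
    synchronized V latest(K key) {
        Versioned<V> v = current.get(key);
        return v == null ? null : v.value;
    }

    /** Returns a copy of the superseded values for a key, oldest first. */
    synchronized List<Versioned<V>> historyOf(K key) {
        return new ArrayList<>(history.getOrDefault(key, new ArrayList<>()));
    }
}
```

After two writes to the same key from different source systems, latest(key) returns the second value and historyOf(key) holds the first one together with its source and update time, which is the audit trail the question asks for.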

Also, it looks like you will need to use Ignite Persistence: https://apacheignite.readme.io/docs/distributed-persistent-store

If you have any specific questions about the implementation, feel free to send them to the user list.

All the best,
Evgenii




2017-08-28 14:55 GMT+03:00 Mikhail <[hidden email]>:
Mikhail Fokanov

IgniteJdbcThinDriver statements accumulation

In reply to this post by Mikhail Fokanov
Hello,

I need to execute a lot of SQL statements over one connection using IgniteJdbcThinDriver. I get a memory leak because all statements accumulate in:
    private final ArrayList<JdbcThinStatement> stmts = new ArrayList<>(); (IgniteJdbcThinDriver:118).
Every statement is added to this list. As far as I can see, the list is cleared only by the onDisconnect() method, which is called only on error. So with many statements there will be a memory leak. The same situation may also occur with connection pools, because they can reuse one connection many times. Is this the intended behavior for the Ignite JDBC driver?
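
To make the reported behavior concrete, here is a toy model of a connection that registers every statement it creates. It is not Ignite's code: TrackingConnection, TrackedStatement, and the deregisterOnClose flag are hypothetical. The flag only illustrates one conceivable remedy (removing a statement from the list when it is closed) and says nothing about how the driver actually behaves or was changed.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy connection (assumption, not Ignite code) that tracks the statements it creates. */
class TrackingConnection {
    private final List<TrackedStatement> stmts = new ArrayList<>();
    private final boolean deregisterOnClose;

    TrackingConnection(boolean deregisterOnClose) {
        this.deregisterOnClose = deregisterOnClose;
    }

    /** Creates a statement and registers it in the connection's list. */
    synchronized TrackedStatement createStatement() {
        TrackedStatement stmt = new TrackedStatement(this);
        stmts.add(stmt);
        return stmt;
    }

    /** Called by a statement when it is closed. */
    synchronized void onStatementClosed(TrackedStatement stmt) {
        if (deregisterOnClose) {
            stmts.remove(stmt); // without this, closed statements stay reachable
        }
    }

    /** Number of statements the connection still holds a reference to. */
    synchronized int trackedCount() {
        return stmts.size();
    }
}

/** Toy statement that notifies its connection once when closed. */
class TrackedStatement implements AutoCloseable {
    private final TrackingConnection conn;
    private boolean closed;

    TrackedStatement(TrackingConnection conn) {
        this.conn = conn;
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            conn.onStatementClosed(this);
        }
    }
}
```

With deregisterOnClose set to false, the list grows by one entry per statement even when every statement is closed, which matches the accumulation described above. With the flag set to true, the list is empty again once all statements have been closed.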

--
Best Regards,
Mikhail
Mikhail Fokanov

time series

Hi Igniters,

Are there any best practices for storing time series data in Ignite? We need it for an extremely high-load IoT system. Cassandra is likely to be an appropriate solution, but the slow speed of analytical SQL queries is not acceptable for us. We could run Ignite on top of Cassandra, but we need to access the whole data set in Ignite, and a cache shouldn't be extremely large (e.g. there should be one cache per day).
We want an approach similar to, for example, [1]. However, writing such functionality from scratch has a lot of pitfalls. Are there any out-of-the-box features for time series data in Ignite? Does it sound reasonable to implement the rollover pattern in Ignite (as in ES)? Or are there other options?

[1] - https://www.elastic.co/blog/managing-time-based-indices-efficiently
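
The "cache per day" rollover idea can be sketched in plain Java as below. This is a model under stated assumptions, not an Ignite feature: DailyRollover, the "ts_" name prefix, and bucketing by UTC day are all hypothetical choices, and in-memory lists stand in for per-day caches. In an Ignite deployment the equivalent steps would presumably be creating a cache for each new day and destroying the oldest one.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Plain-Java model of daily rollover: one bucket per UTC day, oldest days dropped. */
class DailyRollover<V> {
    private final int daysToKeep;
    // Sorted by day, so the oldest bucket is always the first entry.
    private final TreeMap<LocalDate, List<V>> buckets = new TreeMap<>();

    DailyRollover(int daysToKeep) {
        if (daysToKeep < 1) {
            throw new IllegalArgumentException("daysToKeep must be >= 1");
        }
        this.daysToKeep = daysToKeep;
    }

    /** Hypothetical naming scheme for a per-day cache, e.g. "ts_2018-11-06". */
    static String cacheName(Instant ts) {
        return "ts_" + LocalDate.ofInstant(ts, ZoneOffset.UTC);
    }

    /** Adds a value to the bucket of its day, then evicts the oldest buckets if needed. */
    synchronized void add(Instant ts, V value) {
        LocalDate day = LocalDate.ofInstant(ts, ZoneOffset.UTC);
        buckets.computeIfAbsent(day, d -> new ArrayList<>()).add(value);
        // Keeps only the daysToKeep most recent days that contain data.
        while (buckets.size() > daysToKeep) {
            buckets.pollFirstEntry();
        }
    }

    /** Days that currently have a bucket, oldest first. */
    synchronized List<LocalDate> days() {
        return new ArrayList<>(buckets.keySet());
    }
}
```

Note that retention here counts buckets, not calendar days, so a value that arrives for a day older than the retained window creates a bucket that is evicted immediately. Whether late data should be rejected, archived, or kept is one of the pitfalls a real implementation would have to decide on.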

--
Best Regards,
Mikhail
Denis Magda-2

Re: time series

Ignite can store the whole data set on disk and X% of it in RAM thanks to native persistence, so you decide how much data you'd like to keep in RAM.

As for time series, I have heard that Ignite is being used for that use case. However, you might need more. My suggestion is to start and see how it goes.

--
Denis

On Tue, Nov 6, 2018 at 4:05 AM Mikhail <[hidden email]> wrote:
ilya.kasnacheev

Re: IgniteJdbcThinDriver statements accumulation

In reply to this post by Mikhail Fokanov
Hello!


Regards,
--
Ilya Kasnacheev


Tue, Oct 30, 2018 at 16:31, Mikhail <[hidden email]>: