SW recommendation: Ignite Native Persistence for traditional relational data warehouse?

m4mmr m4mmr

Hi,

I am working on a project where we are building a new database with strong
normalisation requirements - very much like a relational data warehouse. We
source the data from HDFS, and the maintenance team requires the data
movement to be implemented through SQL APIs.
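
To make the requirement concrete, a minimal sketch of the kind of normalised model and SQL-only data movement described above could look like the following. It uses Python's built-in sqlite3 module purely as a stand-in SQL engine so that it runs anywhere; the table names, column names and sample rows are hypothetical, and the SQL dialect and constraint support of Ignite or of a given RDBMS may differ.

```python
import sqlite3

# Stand-in engine: sqlite3 is used only so the sketch runs without a cluster.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical staging table, as if loaded from an HDFS extract.
    CREATE TABLE staging_orders (
        order_id INTEGER, customer_name TEXT, country TEXT, amount REAL
    );
    -- Normalised target model: each fact is stored once.
    CREATE TABLE country  (country_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                           country_id INTEGER REFERENCES country(country_id));
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customer(customer_id),
                           amount REAL);
""")
conn.executemany(
    "INSERT INTO staging_orders VALUES (?, ?, ?, ?)",
    [(1, "Acme", "SE", 100.0), (2, "Acme", "SE", 50.0), (3, "Globex", "NO", 75.0)],
)

# Data movement expressed purely in SQL (INSERT ... SELECT).
conn.executescript("""
    INSERT INTO country (name)
        SELECT DISTINCT country FROM staging_orders;
    INSERT INTO customer (name, country_id)
        SELECT DISTINCT s.customer_name, c.country_id
        FROM staging_orders s JOIN country c ON c.name = s.country;
    INSERT INTO orders (order_id, customer_id, amount)
        SELECT s.order_id, cu.customer_id, s.amount
        FROM staging_orders s JOIN customer cu ON cu.name = s.customer_name;
""")


def revenue_by_country(connection):
    """Join the normalised tables back together for reporting."""
    return connection.execute("""
        SELECT co.name, SUM(o.amount)
        FROM orders o
        JOIN customer cu ON cu.customer_id = o.customer_id
        JOIN country  co ON co.country_id  = cu.country_id
        GROUP BY co.name ORDER BY co.name
    """).fetchall()


totals = revenue_by_country(conn)
```

The point of the sketch is only that both the load and the reporting query are plain SQL over a normalised model; whichever engine is chosen has to support that style of work efficiently.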

Main question: Would it be viable to use Ignite with its native
persistence store as the storage layer when building a relational data
warehouse, so that we do not have to manage both an RDBMS and Ignite?

I don’t have a clear picture of what the limitations would be of using
native persistence instead of an RDBMS for persistence - but I have not seen a
single use case where someone uses native persistence for a relational data
warehouse either.

Again - I think I am missing some basic understanding of the native
persistence here - even after reading through the docs I could find. So
would be happy if someone could shed some light on it.

I would be very thankful for any kind of help or assistance!



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
dmagda dmagda

Re: SW recommendation: Ignite Native Persistence for traditional relational data warehouse?

Hello,

In short, the primary advantages of Ignite persistence are:
  • Multi-tiered storage across RAM and disk - 100% of the data is persisted to disk (the "warm" and "cold" data sets) while the "hot" data always stays in RAM. You lay out the data the way you need. Applications just run queries, and Ignite transparently goes to either RAM or disk. This property is crucial for real-time analytics and data warehouse offloading.
  • Instantaneous restarts and advanced high availability - after a full cluster restart, the cluster becomes fully operational as soon as the nodes are interconnected. There is no need to preload anything from disk to RAM (a consequence of the advantage above).
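
The two properties above can be illustrated with a toy model. This is only a conceptual sketch in plain Python, not Ignite's implementation or API: the class name `TieredStore`, the LRU eviction policy and the RAM capacity are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredStore:
    """Toy model: every record lives on 'disk'; a bounded LRU keeps hot ones in RAM."""

    def __init__(self, ram_capacity):
        self.ram_capacity = ram_capacity
        self.disk = {}              # stands in for the durable, complete copy
        self.ram = OrderedDict()    # stands in for the in-memory hot set
        self.disk_reads = 0

    def put(self, key, value):
        self.disk[key] = value      # 100% of the data is persisted
        self._cache(key, value)

    def get(self, key):
        if key in self.ram:         # hot path: served from RAM
            self.ram.move_to_end(key)
            return self.ram[key]
        self.disk_reads += 1        # cold path: transparently read from disk
        value = self.disk[key]
        self._cache(key, value)
        return value

    def restart(self):
        """Simulate a full restart: RAM is lost, the disk copy survives."""
        self.ram.clear()

    def _cache(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        while len(self.ram) > self.ram_capacity:
            self.ram.popitem(last=False)   # evict the least recently used entry


store = TieredStore(ram_capacity=2)
for i in range(5):
    store.put(i, f"row-{i}")

store.restart()                       # nothing is preloaded after the restart
value_after_restart = store.get(0)    # still answered, via the disk tier
```

The caller never says which tier to read from, and the store answers queries immediately after `restart()` without any warm-up step; the hot set simply refills as data is accessed.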

Now, about Ignite and data warehousing. Ignite is used for real-time analytics and Hadoop offloading, but don't treat it as a Hadoop replacement or as a data warehousing solution. Ignite is used together with Hadoop but deployed separately. Here are things to consider:
  • Use Ignite for business operations and computations that require low-latency responses (seconds or milliseconds) and high throughput. Preload the data needed for these computations into the Ignite cluster. Enable Ignite persistence for the sake of durability.
  • Keep using Hadoop for high-latency workloads (minutes and hours) and batch processing.
  • APIs: modify your applications so that Ignite APIs (SQL, compute grid, ML) are used to access the Ignite cluster. Spark can be used as a generic API that connects to both Hadoop and Ignite and runs joins across the two storages (use DataFrames).
  • Tooling: data preloading from Hadoop, bi-directional synchronization, advanced Spark integration, etc. - reach out to GridGain; they have been working on a special Hadoop pack.
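
The cross-storage join mentioned in the APIs item would normally be delegated to Spark DataFrames. Purely to illustrate the idea without a cluster, here is a plain-Python hash join between a small reference set (standing in for data held in the low-latency store) and a larger batch set (standing in for data scanned from the batch store). The function name, field names and rows are hypothetical, and nothing here is Spark or Ignite API.

```python
def hash_join(small_rows, large_rows, key):
    """Inner-join two lists of dicts on `key`, building the hash table on the small side."""
    index = {}
    for row in small_rows:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in large_rows:
        for match in index.get(row[key], []):
            joined.append({**match, **row})
    return joined


# Hypothetical hot reference data, as it might be held in the low-latency store.
accounts = [
    {"account_id": 1, "segment": "retail"},
    {"account_id": 2, "segment": "corporate"},
]
# Hypothetical batch history, as it might be scanned from the batch store.
transactions = [
    {"account_id": 1, "amount": 20.0},
    {"account_id": 2, "amount": 500.0},
    {"account_id": 1, "amount": 5.0},
    {"account_id": 3, "amount": 9.0},   # no matching account: dropped by the inner join
]

result = hash_join(accounts, transactions, "account_id")
```

Building the hash table on the smaller side keeps memory use bounded by the reference data while the larger side is streamed once.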

Hope this helps.

-
Denis


On Tue, May 14, 2019 at 3:20 PM m4mmr <[hidden email]> wrote:
> [...]
m4mmr m4mmr

Re: SW recommendation: Ignite Native Persistence for traditional relational data warehouse?

Hi Denis,

Thank you very much for a good response. It definitely helps. I hope I can
ask one follow-up question, on a point I did not make clear at the
beginning:

The business has a very strong (non-negotiable) requirement that the data
warehouse be modelled with a high degree of normalisation. These modelling
requirements prevent us from building the data warehouse on Hadoop (Hive).
We won't get rid of Hadoop though - it is still used for offloading the
operational source systems.

Therefore we need to:
1. Either build a data warehouse on an RDBMS on top of Hadoop - to meet
the business requirements.
2. Or build the relational data warehouse directly on Ignite with
native persistence - on top of Hadoop.

But after seeing your explanation I understand that option 2 above is
not really the way Ignite is supposed to be used - even if it is on top of
Hadoop. Did I get that right?

We can of course use Ignite with the RDBMS from option 1 as the persistence
store, to make the DWH offloading more effective later on. We will look into
that.

Again - thank you very much for your pedagogical response, Denis.



Denis Magda Denis Magda

Re: SW recommendation: Ignite Native Persistence for traditional relational data warehouse?

Hi,

> But after seeing your explanation below I understand that option 2 above is
> not really the way Ignite is supposed to be used - even if it is on top of
> Hadoop. Did I get that right?

In this configuration, Ignite will not sit on top of Hadoop; it will be deployed next to it, as separate storage for faster performance. Synchronization between the two is possible if needed.

As for the other details, let's connect privately; more details are needed to give useful suggestions.

--
Denis Magda


On Wed, May 15, 2019 at 4:15 AM m4mmr <[hidden email]> wrote:
> [...]
dmagda dmagda

Re: SW recommendation: Ignite Native Persistence for traditional relational data warehouse?

Are you available for a call? I would invite a solution architect to join us and figure out whether or not Ignite works for your use case.

-
Denis


On Wed, May 15, 2019 at 10:17 PM Denis Magda <[hidden email]> wrote:
> [...]