What does mean caching database exactly ?

classic Classic list List threaded Threaded
11 messages Options
tomk tomk
Reply | Threaded
Open this post in threaded view
|

What does mean caching database exactly ?

This post has NOT been accepted by the mailing list yet.
This post was updated on .
Hello,
I consider caching database (postgresSQL).
However, I don't understand way of working ignite, although I looked for docs.
Tell me please,
Is it possible to cache into RAM table from postgres SQL and fastly answer to SQL queries ? The aim is make it faster to query postgres.
I managed to establish connection with postgres (JDBC). (I separately created jdbc connection and create data source from that).  Now, I would like to cache into RAM memory entire table (using this JDBC connection) . Consequently, thanks to caching to memory RAM I am going to query postgres by SQL with low latencies.
I am not sure if my way is good
I am looking at  example:
https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/datagrid/store/jdbc

Can you help me?
vkulichenko vkulichenko
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

Hi,

Can you please properly subscribe to the mailing list so that the community can receive email notifications? Here is the instruction: http://apache-ignite-users.70518.x6.nabble.com/mailing_list/MailingListOptions.jtp?forum=1

tomk wrote
I consider caching database (postgresSQL).
However, I don't understand way of working ignite, although I looked for docs.
Tell me please,
Is it possible to cache into RAM table from postgres SQL and fastly answer to SQL queries ? The aim is make it faster to query postgres.
I managed to establish connection with postgres (JDBC). (I separately created jdbc connection and create data source from that).  Now, I would like to cache into RAM memory entire table (using this JDBC connection) . Consequently, thanks to caching to memory RAM I am going to query postgres by SQL with low latencies.
I am not sure if my way is good
I am looking at  example:
https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/datagrid/store/jdbc
If you run SQL in Ignite, it touches only data that is already in memory. So you have to load the whole data set that you're going to work with into Ignite cache prior to running the queries. But this should give you good performance improvement.

The data can be loaded from the Postgres as well as any other storage. For details refer to this page: https://apacheignite.readme.io/docs/data-loading

-Val
tomk tomk
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

This post has NOT been accepted by the mailing list yet.
This post was updated on .
Ok,
(1) Is it possible to use cached data from other application (something like JDBC Connection).  For example application A loaded something into database. Now, I would like to use this cache from other applicatiob B. Is it possible ? How ? Maybe REST API (I did read about it). Some JDBC connection ? I still think about the same (one for others applications).
To sum up the question:
Application A saves data into cache by throught-write (so data are in underlaying db and in memory cache) and application B get data from memory cache (by JDBC/REST). Is it possible ?

(2) As we can read this cache is working based on key-value mapping. I understand that cache is searched by key. For example, I can set key on tuple: <name, surname, sex>. Is it possible to get data from cache using other field, for example country ? What key influence on way of working cache ?  Should it be unambigous  ?

(3) I have read your link. Using through-writer and through reader requires the same connection jdbc (so the same application) ? I know that it is mentioned in link, but I have some doubts.
vkulichenko vkulichenko
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

1. You can have a cluster consisting of several server nodes with one or more caches storing some data. Then you can have multiple applications, each will start a client node to connect to the same cluster and use this data. JDBC and REST can also be used. So the answer is definitely YES, this is possible.

2. You can do simple gets by key as well as SQL queries for more complicated requests.

3. This link is not about read/write-through, but rather about bulk loading of large amounts of data from DB into the cache. You should use one of two approaches described there prior to executing SQL queries, because SQL does not support read-through and queries only data that is already in memory.

-Val
tomk tomk
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

This post has NOT been accepted by the mailing list yet.
This post was updated on .
I tell you what I am going to do. Then you can understand my problem.
I have some app A that save data to postgres. Simply, it inserts row.  

Second app B is querying these data. The idea is app B can fast query data using cached created by Ignite.

(0) Should server should be from java code ?

So I can see different ways:

(1) Second app (B) is running in mode server. At begin, it loadCache (all data from database), then it may fastly queries data.
Here, is some doubt - does it be able to see newly inserted data by app A ? App A (in this scenario) insert data into postgres - it doesnt know about ignite.  
From what I did read it is possible to create triggers in database, that by REST API of ignite force loading to cache newly inserted data.  I don't know if it is solution ok.  Maybe, it can see newly inserted data automatically ?

(2) First App (A, server mode) is using write-throught and read-throught. Then, newly inserted data (by A) are loaded into cache RAM and automatically to postgres. (It seems to be very nice).   Then clients (eg B, client mode) that will connect to ignite server (JDBC/REST) can SQL query data in RAM memory - so also newly inserted.

(3) First App (A, server mode)) does only write-throught.  Second App (B) (client mode) loads entire data into cache (once time -  during starting). Then it can SQL query data in cache and be able to see newly inserted data by app A (thanks to write-throught).

The difference between (2) and (3) is small - who should loadCache data ? A or B?


As you can see main aim: cache data from postgres  for query. New data are permanently inserted.
It seems that ideal solution is starting nodes in mode server by bin/ignite.sh.
Then App A do write-through on this sever node. When it comes App B it simply JDBC connect to this server. Also REST queries possible.
vkulichenko vkulichenko
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

Hi,

Can you please subscribe to the mailing list as I asked earlier? You just need to send couple of emails to do this. See details here: http://apache-ignite-users.70518.x6.nabble.com/mailing_list/MailingListOptions.jtp?forum=1

tomk wrote
I tell you what I am going to do. Then you can understand my problem.
I have some app A that save data to postgres. Simply, it inserts row.  

Second app B is querying these data. The idea is app B can fast query data using cached created by Ignite.

(0) Should server should be from java code ?

So I can see different ways:

(1) Second app (B) is running in mode server. At begin, it loadCache (all data from database), then it may fastly queries data.
Here, is some doubt - does it be able to see newly inserted data by app A ? App A (in this scenario) insert data into postgres - it doesnt know about ignite.  
From what I did read it is possible to create triggers in database, that by REST API of ignite force loading to cache newly inserted data.  I don't know if it is solution ok.  Maybe, it can see newly inserted data automatically ?

(2) First App (A, server mode) is using write-throught and read-throught. Then, newly inserted data (by A) are loaded into cache RAM and automatically to postgres. (It seems to be very nice).   Then clients (eg B, client mode) that will connect to ignite server (JDBC/REST) can SQL query data in RAM memory - so also newly inserted.

(3) First App (A, server mode)) does only write-throught.  Second App (B) (client mode) loads entire data into cache (once time -  during starting). Then it can SQL query data in cache and be able to see newly inserted data by app A (thanks to write-throught).

The difference between (2) and (3) is small - who should loadCache data ? A or B?


As you can see main aim: cache data from postgres  for query. New data are permanently inserted.
It seems that ideal solution is starting nodes in mode server by bin/ignite.sh.
Then App A do write-through on this sever node. When it comes App B it simply JDBC connect to this server. Also REST queries possible.
If DB is updated directly, Ignite is not aware about these changes. Such awareness can be achieved with DB triggers, periodic polling, etc., but the most effective approach is to use Ignite API to read and update the cache in both apps. There is no much difference between approaches 2 and 3, it's up to you who will load the data. Only make sure that all the required data is in memory prior to running the queries.

-Val
tomk tomk
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

This post was updated on .
In reply to this post by tomk
My approach is now:
start server by ./ignite.sh.
In config file .xml I set cache name, etc.
Then from writer I set the same name of cache.
And the same situation in reader application.
Only make sure that all the required data is in memory prior to running the queries.
What do you mean exactly ? When Writer-through saves data it must be in memory and database. So why I must make sure ?
ok ?

Writer shoulb be client mode?
vkulichenko vkulichenko
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

I meant the scenario when you already have some data in the DB, but start an empty cache. If you want to query the data that is in the DB, you need to load it to the cache first. This can also happen if the cluster crashes for some reason and you lose all in-memory data.

Make sense?

-Val
tomk tomk
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

This post was updated on .
Yes,
firstly I am going to loadCache contain of database. Then, I am going to write newly data (into cache and db - write throught). Loader app will be only read data from cache (not from db). Thanks to read-through it will see new data.

Tell me please,  what exactly I should say in .xml configure file ? only name for cache ?
Writer and reader are reader, server is running by ./ignite.sh

At this moment, it is working for me :)  But everything is in one application. Now, I must separate it - one app write-through, and second app read-through. However, I have no idea how to do it ? After all, CacheStoreAdapter makes me to implement load, loadCache, write, delete.
I don't need load and loadCache in my writer application. I know that loadCache is neccesary, because my start cache is empty, but loadCache may be done by loader-through application.

What should I do ?
vkulichenko vkulichenko
Reply | Threaded
Open this post in threaded view
|

Re: What does mean caching database exactly ?

Data loading happens on the server (in case you're using CacheStore for this). But someone has to initiate the process by calling loadCache method. It can be done by any of the apps, I believe it's better to do it by one that starts first.

-Val
Alexey Kuznetsov Alexey Kuznetsov
Reply | Threaded
Open this post in threaded view
|

Re: cache cannot load all the data into cache

In reply to this post by vkulichenko
Hi, Kevin.

Could you please make small example + db scripts to reproduce and debug this issue?


On Wed, Apr 13, 2016 at 10:34 AM, Zhengqingzheng <[hidden email]> wrote:
Dear all,
I am trying to load two tables data into caches to speed up my queries.
table1 contains 564 records, with one primary key as index.
definition of table content from java  as follows:
    @QuerySqlField
    private String orgId;

    @QuerySqlField(index=true)
    private String objId;

    @QuerySqlField
    private int numRows;

table2 contains 9626 records, with no primary key defined but a group index is defined.
definition of table2 from java as follows:
    @QuerySqlField
    private String orgId;

    @QuerySqlField(orderedGroups={@QuerySqlField.Group(
            name="objId_fieldName_idx", order=0, descending = true)})
    private String objId;

    @QuerySqlField(orderedGroups={@QuerySqlField.Group(
            name="objId_fieldName_idx", order=1, descending = true)})
    private String fieldName;

    @QuerySqlField
    private int fieldNum;

    @QuerySqlField
    private int statVal;

I defined two caches to load all the data from two tables:
the first cache load data from table1, and works fine.
but the second cache which load data from table2 cannot load all the data, only few of them.
I think this is due to the configuration of cache2 was probability wrong[ because cache content shows that objid was the unique key to retrieve the data record]:

                final String CACHE_NAME1= IgniteMetaDatabaseFieldStat.class.getSimpleName()+"_Cache";
        CacheConfiguration<String, IgniteMetaDatabaseFieldStat> cfg =
                new CacheConfiguration<String, IgniteMetaDatabaseFieldStat>(CACHE_NAME1);

        CacheJdbcPojoStoreExampleFactory<String, IgniteMetaDatabaseFieldStat> storeFactory =
                new CacheJdbcPojoStoreExampleFactory<String,IgniteMetaDatabaseFieldStat>();

        storeFactory.setDialect(new OracleDialect());

        JdbcType jdbcType = new JdbcType();

        jdbcType.setCacheName(CACHE_NAME1);
        jdbcType.setDatabaseSchema("besdb");
        jdbcType.setDatabaseTable("data_base_field_stat");

<b>
        jdbcType.setKeyType("java.lang.String");
        jdbcType.setKeyFields(new JdbcTypeField(Types.VARCHAR, "OBJID", String.class, "objId")
           /* ,new JdbcTypeField(Types.VARCHAR, "FIELDNAME", String.class, "fieldName")*/);
</b>
        jdbcType.setValueType("org.apache.ignite.examples.model.IgniteMetaDatabaseFieldStat");
        jdbcType.setValueFields(
                new JdbcTypeField(Types.VARCHAR,"ORGID", String.class, "orgId"),
                new JdbcTypeField(Types.VARCHAR,"OBJID", String.class, "objId"),
                new JdbcTypeField(Types.VARCHAR,"FIELDNAME", String.class, "fieldName"),
                new JdbcTypeField(Types.INTEGER,"FIELDNUM", Integer.class, "fieldNum"),
                new JdbcTypeField(Types.INTEGER,"STAT_VAL", Integer.class, "statVal")
                    );


        storeFactory.setTypes(jdbcType);

        cfg.setCacheStoreFactory(storeFactory);

        // Set atomicity as transaction, since we are showing transactions in the example.
        cfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
        cfg.setIndexedTypes(String.class, IgniteMetaDatabaseFieldStat.class);

        cfg.setReadThrough(true);
        cfg.setWriteThrough(true);

        cfg.setCacheMode(CacheMode.PARTITIONED);
        //cfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
        //cfg.setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED);
        //cfg.setOffHeapMaxMemory(64 * 1024L * 1024L);

        //cfg.setStartSize(100 * 1024 * 1024);
        cfg.setBackups(0);

please note the jdbcType.setKeyTypes and setKeyFields part. I want to use the <b>groupIndex</b> as the cache key setting, which was defined in the annotation part of           @QuerySqlField(orderedGroups={@QuerySqlField.Group(
            name="objId_fieldName_idx", order=0, descending = true)})
    private String objId;

    @QuerySqlField(orderedGroups={@QuerySqlField.Group(
            name="objId_fieldName_idx", order=1, descending = true)})
    private String fieldName;

but I don't know how to do that, in my example I just use objid as the key. In this case, if there are duplicate values come into the cache, the rest values was ignored.

How to setup the correct key for jdbcTypes in my cache configuration?


Best regards,
Kevin




--
Alexey Kuznetsov
GridGain Systems
www.gridgain.com