Suggest a better way to access a particular K-V store

classic Classic list List threaded Threaded
10 messages Options
adipro adipro
Reply | Threaded
Open this post in threaded view
|

Suggest a better way to access a particular K-V store

The K is key with type String.
The V is the value with type JSONObject/HashMap.

V is having the following structure:

{"agentId":<long>,"score":<double>,"url":<string>}

Now the query on this will be in such a way that -> Get top 50 values of "K"
where values in that K-V store are sorted based on "score". Provided that
K-V store has 10-50 million K-V pairs.

Can someone please suggest which data structure I need to use to solve this
particular case?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

Hello!

Ignite SQL table with index on score descending would fit it nicely.

You will have to convert JSON into BinaryObject or just extract score as a column.

Regards,
--
Ilya Kasnacheev


пн, 11 мая 2020 г. в 19:11, adipro <[hidden email]>:
The K is key with type String.
The V is the value with type JSONObject/HashMap.

V is having the following structure:

{"agentId":<long>,"score":<double>,"url":<string>}

Now the query on this will be in such a way that -> Get top 50 values of "K"
where values in that K-V store are sorted based on "score". Provided that
K-V store has 10-50 million K-V pairs.

Can someone please suggest which data structure I need to use to solve this
particular case?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
adipro adipro
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

We've already tried that but we are receiving these errors. We are using
index in a correct way only
We are using Ignite persistence.

```
[15:32:19,481][WARNING][long-qry-#110][LongRunningQueryManager] Query
execution is too long [duration=3898ms, type=MAP, distributedJoin=false,
enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
"__Z0"."ID" "__C0_0",
"__Z0"."URL" "__C0_1",
"__Z0"."SCORE" "__C0_2",
"__Z0"."APPNAME_ID" "__C0_3"
FROM "PUBLIC"."URLS" "__Z0"
WHERE "__Z0"."APPNAME_ID" = ?1
ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
    __Z0.ID AS __C0_0,
    __Z0.URL AS __C0_1,
    __Z0.SCORE AS __C0_2,
    __Z0.APPNAME_ID AS __C0_3
FROM PUBLIC.URLS __Z0
    /* PUBLIC.IDX_2_URLS */
    /* scanCount: 1342938 */
WHERE __Z0.APPNAME_ID = ?1
ORDER BY 3
FETCH FIRST ?2 ROWS ONLY
/* index sorted */, node=TcpDiscoveryNode
[id=1acf85ab-112a-4bbf-95f3-b6d3eb8c38b8,
consistentId=1acf85ab-112a-4bbf-95f3-b6d3eb8c38b8, addrs=ArrayList
[127.0.0.1, 192.168.177.58], sockAddrs=HashSet [/127.0.0.1:0,
/192.168.177.58:0], discPort=0, order=4, intOrder=4,
lastExchangeTime=1589276400188, loc=false,
ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=482, segment=0]
```
The thing is we are running this in an 8 GB machine. According to the
documents about tuning, we were told that 70% of the RAM should be given to
heap and non heap so we gave 2GB heap + 2GB offheap + 1GB directbuffmemory..

Till 60Lakh records, it didn't throw any warnings, but after that we are
receiving many warnings.

Is it got to do something with Off-heap size or BTree data size that is
stored in RAM? Should we increase RAM? We don't want to increase RAM. Is it
possible without increasing RAM? Or is there a way we can calculate how much
RAM we need for how much data?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
akorensh akorensh
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

Hi,
   Use these tips for memory planning:
https://apacheignite.readme.io/docs/capacity-planning#memory-capacity-planning-example

 https://apacheignite.readme.io/docs/capacity-planning#capacity-planning-faq 
(this has a spreadsheet w/capacity calculator)


 You could make you cache partitioned, and use compute jobs to run the query
simultaneously on multiple machines then reduce the results to have only the
top 50.

   see: https://apacheignite.readme.io/docs/compute-grid
    https://apacheignite.readme.io/docs/compute-tasks
   https://apacheignite.readme.io/docs/cache-modes
Thanks, Alex



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

In reply to this post by adipro
Hello!

Have you tried adding an index on URLS (APPNAME_ID, SCORE DESC)? (or ASC if you are going to order ascending)

Regards,
--
Ilya Kasnacheev


вт, 12 мая 2020 г. в 13:42, adipro <[hidden email]>:
We've already tried that but we are receiving these errors. We are using
index in a correct way only
We are using Ignite persistence.

```
[15:32:19,481][WARNING][long-qry-#110][LongRunningQueryManager] Query
execution is too long [duration=3898ms, type=MAP, distributedJoin=false,
enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
"__Z0"."ID" "__C0_0",
"__Z0"."URL" "__C0_1",
"__Z0"."SCORE" "__C0_2",
"__Z0"."APPNAME_ID" "__C0_3"
FROM "PUBLIC"."URLS" "__Z0"
WHERE "__Z0"."APPNAME_ID" = ?1
ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
    __Z0.ID AS __C0_0,
    __Z0.URL AS __C0_1,
    __Z0.SCORE AS __C0_2,
    __Z0.APPNAME_ID AS __C0_3
FROM PUBLIC.URLS __Z0
    /* PUBLIC.IDX_2_URLS */
    /* scanCount: 1342938 */
WHERE __Z0.APPNAME_ID = ?1
ORDER BY 3
FETCH FIRST ?2 ROWS ONLY
/* index sorted */, node=TcpDiscoveryNode
[id=1acf85ab-112a-4bbf-95f3-b6d3eb8c38b8,
consistentId=1acf85ab-112a-4bbf-95f3-b6d3eb8c38b8, addrs=ArrayList
[127.0.0.1, 192.168.177.58], sockAddrs=HashSet [/127.0.0.1:0,
/192.168.177.58:0], discPort=0, order=4, intOrder=4,
lastExchangeTime=1589276400188, loc=false,
ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=482, segment=0]
```
The thing is we are running this in an 8 GB machine. According to the
documents about tuning, we were told that 70% of the RAM should be given to
heap and non heap so we gave 2GB heap + 2GB offheap + 1GB directbuffmemory..

Till 60Lakh records, it didn't throw any warnings, but after that we are
receiving many warnings.

Is it got to do something with Off-heap size or BTree data size that is
stored in RAM? Should we increase RAM? We don't want to increase RAM. Is it
possible without increasing RAM? Or is there a way we can calculate how much
RAM we need for how much data?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
adipro adipro
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

My query is

SELECT * FROM URLS WHERE APPNAME_ID = ? ORDER BY SCORE LIMIT ?

That is giving me 0.7-1.0 msec read performance for a test run for about
some time. But the index with URLS (SCORE ASC, APPNAME_ID), I'm getting a
read performance of about 0.3 to 0.5 msec. i found it to be constant when
data grows.. whereas the index which you provided is getting delayed with
growing data. Ours is write intensive application.

But the thing is in deployment servers we are constantly receiving these
warnings

```
[22:26:31,046][WARNING][query-#48349][IgniteH2Indexing] Long running query
is finished [duration=3006ms, type=MAP, distributedJoin=false,
enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
"__Z0"."ID" "__C0_0",
"__Z0"."URL" "__C0_1",
"__Z0"."SCORE" "__C0_2",
"__Z0"."APPNAME_ID" "__C0_3"
FROM "PUBLIC"."URLS" "__Z0"
WHERE "__Z0"."APPNAME_ID" = ?1
ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
    __Z0.ID AS __C0_0,
    __Z0.URL AS __C0_1,
    __Z0.SCORE AS __C0_2,
    __Z0.APPNAME_ID AS __C0_3
FROM PUBLIC.URLS __Z0
    /* PUBLIC.IDX_2_URLS */
    /* scanCount: 1106428 */
WHERE __Z0.APPNAME_ID = ?1
ORDER BY 3
FETCH FIRST ?2 ROWS ONLY
/* index sorted */, node=TcpDiscoveryNode
[id=9a62f525-98a8-43d6-85da-2270f1ee4e7a,
consistentId=9a62f525-98a8-43d6-85da-2270f1ee4e7a, addrs=ArrayList
[127.0.0.1, 172.20.42.17], sockAddrs=HashSet [/172.20.42.17:0,
/127.0.0.1:0], discPort=0, order=3, intOrder=3,
lastExchangeTime=1589893316075, loc=false,
ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=262124, segment=0]
```


data metric at the point of warning ->

```
[22:34:05,866][INFO][grid-timeout-worker-#407][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=4f2b808c, uptime=04:19:01.321]
    ^-- H/N/C [hosts=4, nodes=4, CPUs=192]
    ^-- CPU [cur=0.03%, avg=13.79%, GC=0%]
    ^-- PageMemory [pages=927396]
    ^-- Heap [used=1521MB, free=62.86%, comm=4096MB]
    ^-- Off-heap [used=3665MB, free=56.84%, comm=8392MB]
    ^--   sysMemPlc region [used=0MB, free=99.98%, comm=100MB]
    ^--   default region [used=3665MB, free=55.26%, comm=8192MB]
    ^--   metastoreMemPlc region [used=0MB, free=99.94%, comm=0MB]
    ^--   TxLog region [used=0MB, free=100%, comm=100MB]
    ^-- Ignite persistence [used=3622MB]
    ^--   sysMemPlc region [used=0MB]
    ^--   default region [used=3622MB]
    ^--   metastoreMemPlc region [used=0MB]
    ^--   TxLog region [used=0MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=200, qSize=0]
    ^-- Striped thread pool [active=2, idle=198, qSize=0]
```

Can you please tell why these warnings are coming. Although in client side,
I added a check if query delays to print a warning. But it didn't through
any warnings in client machine. It's weird why this warning is coming in
server logs.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

Hello!

Is it possible that you are not closing your JDBC result sets (or corresponding Ignite QueryCursor)?

It's actually strange. The query in question seem rather bulky. It's not expected that it is sufficiently fast for you.

Regards,
--
Ilya Kasnacheev


вт, 19 мая 2020 г. в 20:36, adipro <[hidden email]>:
My query is

SELECT * FROM URLS WHERE APPNAME_ID = ? ORDER BY SCORE LIMIT ?

That is giving me 0.7-1.0 msec read performance for a test run for about
some time. But the index with URLS (SCORE ASC, APPNAME_ID), I'm getting a
read performance of about 0.3 to 0.5 msec. i found it to be constant when
data grows.. whereas the index which you provided is getting delayed with
growing data. Ours is write intensive application.

But the thing is in deployment servers we are constantly receiving these
warnings

```
[22:26:31,046][WARNING][query-#48349][IgniteH2Indexing] Long running query
is finished [duration=3006ms, type=MAP, distributedJoin=false,
enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
"__Z0"."ID" "__C0_0",
"__Z0"."URL" "__C0_1",
"__Z0"."SCORE" "__C0_2",
"__Z0"."APPNAME_ID" "__C0_3"
FROM "PUBLIC"."URLS" "__Z0"
WHERE "__Z0"."APPNAME_ID" = ?1
ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
    __Z0.ID AS __C0_0,
    __Z0.URL AS __C0_1,
    __Z0.SCORE AS __C0_2,
    __Z0.APPNAME_ID AS __C0_3
FROM PUBLIC.URLS __Z0
    /* PUBLIC.IDX_2_URLS */
    /* scanCount: 1106428 */
WHERE __Z0.APPNAME_ID = ?1
ORDER BY 3
FETCH FIRST ?2 ROWS ONLY
/* index sorted */, node=TcpDiscoveryNode
[id=9a62f525-98a8-43d6-85da-2270f1ee4e7a,
consistentId=9a62f525-98a8-43d6-85da-2270f1ee4e7a, addrs=ArrayList
[127.0.0.1, 172.20.42.17], sockAddrs=HashSet [/172.20.42.17:0,
/127.0.0.1:0], discPort=0, order=3, intOrder=3,
lastExchangeTime=1589893316075, loc=false,
ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=262124, segment=0]
```


data metric at the point of warning ->

```
[22:34:05,866][INFO][grid-timeout-worker-#407][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=4f2b808c, uptime=04:19:01.321]
    ^-- H/N/C [hosts=4, nodes=4, CPUs=192]
    ^-- CPU [cur=0.03%, avg=13.79%, GC=0%]
    ^-- PageMemory [pages=927396]
    ^-- Heap [used=1521MB, free=62.86%, comm=4096MB]
    ^-- Off-heap [used=3665MB, free=56.84%, comm=8392MB]
    ^--   sysMemPlc region [used=0MB, free=99.98%, comm=100MB]
    ^--   default region [used=3665MB, free=55.26%, comm=8192MB]
    ^--   metastoreMemPlc region [used=0MB, free=99.94%, comm=0MB]
    ^--   TxLog region [used=0MB, free=100%, comm=100MB]
    ^-- Ignite persistence [used=3622MB]
    ^--   sysMemPlc region [used=0MB]
    ^--   default region [used=3622MB]
    ^--   metastoreMemPlc region [used=0MB]
    ^--   TxLog region [used=0MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=200, qSize=0]
    ^-- Striped thread pool [active=2, idle=198, qSize=0]
```

Can you please tell why these warnings are coming. Although in client side,
I added a check if query delays to print a warning. But it didn't through
any warnings in client machine. It's weird why this warning is coming in
server logs.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
adipro adipro
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

Thanks for the reply.

For your reference I'm sharing the code.. Please have a look at it..

```
public Set<String> execQuery(Object... args){
            Set<String> urls = new LinkedHashSet<>();
            Long start = (Long)objects[2];
            Long end = (Long)objects[3];
            int i = 0;
            SqlFieldsQuery query = new SqlFieldsQuery("SELECT * FROM URLS
WHERE PROCESS_APPNAME_ID = ? ORDER BY SCORE LIMIT ?"));
            List queryObj = new ArrayList<>();
            queryObj.add(getGlobalAppId(objects[0].toString()));
            if(start != null && end != null){
                queryObj.add(end-start);
            }
            query.setArgs(queryObj.toArray());
            FieldsQueryCursor<List&lt;?>> cursor =
cacheHolder.getCache().query(query);
            for (List<?> row : cursor) {
                urls.add((String)row.get(1));
                i++;
            }
            return urls;
}
```

Is it that cursor.close() is missed here? What happens if it's not there?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
adipro adipro
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

Can someone please help regarding this issue?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Suggest a better way to access a particular K-V store

Hello!

Please try doing cursor.close() before return;

Regards,
--
Ilya Kasnacheev


ср, 3 июн. 2020 г. в 09:17, adipro <[hidden email]>:
Can someone please help regarding this issue?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/