Fetched result use too much time

classic Classic list List threaded Threaded
13 messages Options
Lucky Lucky
Reply | Threaded
Open this post in threaded view
|

Fetched result use too much time

Hi
    I have a table with 25,000,000 records.
    The sql like this:
        select fdatabasedid from databasedassign where fassingcuid in(3589 ids) group by fdatabasedid have count(fassingcuid)>=3589
    It return 1500 records.
    But It took 82 seconds!!!


    I see the explain like this:
    

 Any suggestions?
Thanks.
Lucky


 


111.png (60K) Download Attachment
222.png (50K) Download Attachment
Yakov Zhdanov Yakov Zhdanov
Reply | Threaded
Open this post in threaded view
|

Re: Fetched result use too much time


Thanks!
--
Yakov Zhdanov, Director R&D
GridGain Systems

2017-09-15 4:50 GMT+03:00 Lucky <[hidden email]>:
Hi
    I have a table with 25,000,000 records.
    The sql like this:
        select fdatabasedid from databasedassign where fassingcuid in(3589 ids) group by fdatabasedid have count(fassingcuid)>=3589
    It return 1500 records.
    But It took 82 seconds!!!


    I see the explain like this:
    

 Any suggestions?
Thanks.
Lucky


 


Lucky Lucky
Reply | Threaded
Open this post in threaded view
|

Re:Re: Fetched result use too much time

Hi , Yakov Zhdanov
    Actually I did not run H2 console, I run  like this :
        cache.query(new SqlFieldsQuery("explain select id from assignInfo "));

    I change to like this:
new SqlFieldsQuery("select * from Person p join table(id bigint = ?) i on p.id = i.id").setArgs(new Object[]{ new Integer[] {2, 3, 4} }))

but it also need take 82 seconds.That did not change anything.

Any other suggestion?
Thanks a lot.
Lucky




At 2017-09-15 22:17:30, "Yakov Zhdanov" <[hidden email]> wrote:

Thanks!
--
Yakov Zhdanov, Director R&D
GridGain Systems

2017-09-15 4:50 GMT+03:00 Lucky <[hidden email]>:
Hi
    I have a table with 25,000,000 records.
    The sql like this:
        select fdatabasedid from databasedassign where fassingcuid in(3589 ids) group by fdatabasedid have count(fassingcuid)>=3589
    It return 1500 records.
    But It took 82 seconds!!!


    I see the explain like this:
    

 Any suggestions?
Thanks.
Lucky


 




 

dsetrakyan dsetrakyan
Reply | Threaded
Open this post in threaded view
|

Re: Re: Fetched result use too much time

Lucky, 

We would like to see the output of the "EXPLAIN" command for the query that takes a long time, so we could make suggestions. Can you post it here?

D.

On Fri, Sep 15, 2017 at 11:50 PM, Lucky <[hidden email]> wrote:
Hi , Yakov Zhdanov
    Actually I did not run H2 console, I run  like this :
        cache.query(new SqlFieldsQuery("explain select id from assignInfo "));

    I change to like this:
new SqlFieldsQuery("select * from Person p join table(id bigint = ?) i on p.id = i.id").setArgs(new Object[]{ new Integer[] {2, 3, 4} }))

but it also need take 82 seconds.That did not change anything.

Any other suggestion?
Thanks a lot.
Lucky




At 2017-09-15 22:17:30, "Yakov Zhdanov" <[hidden email]> wrote:

Thanks!
--
Yakov Zhdanov, Director R&D
GridGain Systems

2017-09-15 4:50 GMT+03:00 Lucky <[hidden email]>:
Hi
    I have a table with 25,000,000 records.
    The sql like this:
        select fdatabasedid from databasedassign where fassingcuid in(3589 ids) group by fdatabasedid have count(fassingcuid)>=3589
    It return 1500 records.
    But It took 82 seconds!!!


    I see the explain like this:
    

 Any suggestions?
Thanks.
Lucky


 




 


Lucky Lucky
Reply | Threaded
Open this post in threaded view
|

Re:Re: Re: Fetched result use too much time


I do not understand your points exactlly.
but I have post the screenshot here.
It is the output of the "explain" query.
Can you see that?




At 2017-09-16 15:28:23, "Dmitriy Setrakyan" <[hidden email]> wrote:
Lucky, 

We would like to see the output of the "EXPLAIN" command for the query that takes a long time, so we could make suggestions. Can you post it here?

D.

On Fri, Sep 15, 2017 at 11:50 PM, Lucky <[hidden email]> wrote:
Hi , Yakov Zhdanov
    Actually I did not run H2 console, I run  like this :
        cache.query(new SqlFieldsQuery("explain select id from assignInfo "));

    I change to like this:
new SqlFieldsQuery("select * from Person p join table(id bigint = ?) i on p.id = i.id").setArgs(new Object[]{ new Integer[] {2, 3, 4} }))

but it also need take 82 seconds.That did not change anything.

Any other suggestion?
Thanks a lot.
Lucky




At 2017-09-15 22:17:30, "Yakov Zhdanov" <[hidden email]> wrote:

Thanks!
--
Yakov Zhdanov, Director R&D
GridGain Systems

2017-09-15 4:50 GMT+03:00 Lucky <[hidden email]>:
Hi
    I have a table with 25,000,000 records.
    The sql like this:
        select fdatabasedid from databasedassign where fassingcuid in(3589 ids) group by fdatabasedid have count(fassingcuid)>=3589
    It return 1500 records.
    But It took 82 seconds!!!


    I see the explain like this:
    

 Any suggestions?
Thanks.
Lucky


 




 




 

Lucky Lucky
Reply | Threaded
Open this post in threaded view
|

回复: Fetched result use too much time

In reply to this post by dsetrakyan
Because some reson,I can not copy the message to here.
Just can post screeshot.

Here is the sql and output of the "explain" command of the query.
There have 3589 id, for you can see the whole output,I reduce it to 3.
Fetch all result  took 82 seconds.
Thanks.





2017年09月16日 15:28[hidden email]
Lucky, 

We would like to see the output of the "EXPLAIN" command for the query that takes a long time, so we could make suggestions. Can you post it here?

D.

On Fri, Sep 15, 2017 at 11:50 PM, Lucky <[hidden email]> wrote:
Hi , Yakov Zhdanov
    Actually I did not run H2 console, I run  like this :
        cache.query(new SqlFieldsQuery("explain select id from assignInfo "));

    I change to like this:
new SqlFieldsQuery("select * from Person p join table(id bigint = ?) i on p.id = i.id").setArgs(new Object[]{ new Integer[] {2, 3, 4} }))

but it also need take 82 seconds.That did not change anything.

Any other suggestion?
Thanks a lot.
Lucky


Vladimir Ozerov Vladimir Ozerov
Reply | Threaded
Open this post in threaded view
|

Re: Fetched result use too much time

Hi Lucky,

Could you please share you data model and node/cache configuration? I want to make sure that proper indexes are set. I will be able to advise something then. As I quick suggestion you may try to increase query parallelism on your "databaseDAssignCache". Please try setting it to the number of cores on your server nodes. Relevant property - CacheConfifuration.queryParallelism. Btw, how many nodes do you have?

Also I am struggling to understand the number "3589". Why this number appears both as ">= 3589" condition and as a number of parameters inside "IN" clause? 

Vladimir.

On Mon, Sep 18, 2017 at 9:20 AM, Lucky <[hidden email]> wrote:
Because some reson,I can not copy the message to here.
Just can post screeshot.

Here is the sql and output of the "explain" command of the query.
There have 3589 id, for you can see the whole output,I reduce it to 3.
Fetch all result  took 82 seconds.
Thanks.





2017年09月16日 15:28[hidden email]
Lucky, 

We would like to see the output of the "EXPLAIN" command for the query that takes a long time, so we could make suggestions. Can you post it here?

D.

On Fri, Sep 15, 2017 at 11:50 PM, Lucky <[hidden email]> wrote:
Hi , Yakov Zhdanov
    Actually I did not run H2 console, I run  like this :
        cache.query(new SqlFieldsQuery("explain select id from assignInfo "));

    I change to like this:
new SqlFieldsQuery("select * from Person p join table(id bigint = ?) i on p.id = i.id").setArgs(new Object[]{ new Integer[] {2, 3, 4} }))

but it also need take 82 seconds.That did not change anything.

Any other suggestion?
Thanks a lot.
Lucky



Lucky Lucky
Reply | Threaded
Open this post in threaded view
|

回复: Fetched result use too much time


Please see the attachment.
I have set query parallelism  to 30. it took 42 seconds.
But it is not enough.
I excepted it took less than 3 seconds.

then,I have 3 nodes.

As for the 3589 number, we need to check the number of ID using in conditions. Only the number of times used is equal to the record in the in condition. That's the record we need. This is the business scenario required. I can't change this.

Thanks for your suggestion.
Lucky


2017年09月18日 21:55,[hidden email] :
Hi Lucky,

Could you please share you data model and node/cache configuration? I want to make sure that proper indexes are set. I will be able to advise something then. As I quick suggestion you may try to increase query parallelism on your "databaseDAssignCache". Please try setting it to the number of cores on your server nodes. Relevant property - CacheConfifuration.queryParallelism. Btw, how many nodes do you have?

Also I am struggling to understand the number "3589". Why this number appears both as ">= 3589" condition and as a number of parameters inside "IN" clause? 

Vladimir.


DatabaseDAssign.java (4K) Download Attachment
AbstractCacheConfiguration.java (6K) Download Attachment
Andrew Mashenkov Andrew Mashenkov
Reply | Threaded
Open this post in threaded view
|

Re: Fetched result use too much time

Lucky,


1. Looks like it make no sense to set query parallelism level higher number of available CPU on node.

2. Map query use index for field FASSIGCUID type of String and seems values are 16 chars length strings (32 bytes)
By default, values with size < 10 bytes can be inlined in index, so Ignite doesn't need to lookup a data page for value data. 
You can try to increase it up to 32 via cacheConfiguration.setSqlIndexMaxInlineSize(32) or JVM property -DIGNITE_MAX_INDEX_PAYLOAD_SIZE=32.

3. Ignite doesn't know whether your data is collocated by FDATABASEDID (group by clause) or not collocated.
So, Ignite can't apply HAVING condition instantly on map phase and have to load and merge all groups from all nodes before check for HAVING.
If it possible to collocate data on GROUP BY condition, you can hint Ignite with setting query flag:   sqlFieldsQuery.setCollocated(true).
However, I'm not sure it will help much and H2 will be able to make any optimization here.

4. Also, you can force Ignite to use different index. E.g. group index on FDATABASEDID and FASSIGCUID and same fields in different order.

5. Sometimes, Ignite change join order and it can cause unexcpected slowdown. You can try to change join order by changing tables positions in query string.
To preserve Ignite join order optimization you may use a flag:  sqlFieldsQuery.setEnforceJoinOrder(true).


Hope, this will help you.

On Tue, Sep 19, 2017 at 5:07 AM, Lucky <[hidden email]> wrote:

Please see the attachment.
I have set query parallelism  to 30. it took 42 seconds.
But it is not enough.
I excepted it took less than 3 seconds.

then,I have 3 nodes.

As for the 3589 number, we need to check the number of ID using in conditions. Only the number of times used is equal to the record in the in condition. That's the record we need. This is the business scenario required. I can't change this.

Thanks for your suggestion.
Lucky


2017年09月18日 21:55,[hidden email] :
Hi Lucky,

Could you please share you data model and node/cache configuration? I want to make sure that proper indexes are set. I will be able to advise something then. As I quick suggestion you may try to increase query parallelism on your "databaseDAssignCache". Please try setting it to the number of cores on your server nodes. Relevant property - CacheConfifuration.queryParallelism. Btw, how many nodes do you have?

Also I am struggling to understand the number "3589". Why this number appears both as ">= 3589" condition and as a number of parameters inside "IN" clause? 

Vladimir.




--
Best regards,
Andrey V. Mashenkov
Regards,
Andrew.
Lucky Lucky
Reply | Threaded
Open this post in threaded view
|

回复: Fetched result use too much time

    Thank you very much!
    1.query parallelism:this will cause a problem: fetch wrong reslut. 
       I set it to 10,and have table a with 150,000 records, table b with 12,000,000 records.
       when I query single table,the result is correct.
       but when the sql is like this:
           select a.id from a inner join b on a.id = b.tid 
      it got the wrong result. The result should be 11,000,000;but it just return 380,000 records.
      when I remove query parallelism setting,it return correctly. 

    2. I have modified ths property,and restart the server.for the record is too large, it need 4 hours to load data to ignite.So I have to wait.
    3.Actually, if I remove the group by clause and having condition, it took more time!
    4  and 5: I have try them before ,but it did not work.
Thanks again.
Lucky   

2017年09月21日 21:28[hidden email] 写道:
Lucky,


1. Looks like it make no sense to set query parallelism level higher number of available CPU on node.

2. Map query use index for field FASSIGCUID type of String and seems values are 16 chars length strings (32 bytes)
By default, values with size < 10 bytes can be inlined in index, so Ignite doesn't need to lookup a data page for value data. 
You can try to increase it up to 32 via cacheConfiguration.setSqlIndexMaxInlineSize(32) or JVM property -DIGNITE_MAX_INDEX_PAYLOAD_SIZE=32.

3. Ignite doesn't know whether your data is collocated by FDATABASEDID (group by clause) or not collocated.
So, Ignite can't apply HAVING condition instantly on map phase and have to load and merge all groups from all nodes before check for HAVING.
If it possible to collocate data on GROUP BY condition, you can hint Ignite with setting query flag:   sqlFieldsQuery.setCollocated(true).
However, I'm not sure it will help much and H2 will be able to make any optimization here.

4. Also, you can force Ignite to use different index. E.g. group index on FDATABASEDID and FASSIGCUID and same fields in different order.

5. Sometimes, Ignite change join order and it can cause unexcpected slowdown. You can try to change join order by changing tables positions in query string.
To preserve Ignite join order optimization you may use a flag:  sqlFieldsQuery.setEnforceJoinOrder(true).


Hope, this will help you.

Lucky Lucky
Reply | Threaded
Open this post in threaded view
|

Re:Re: Fetched result use too much time

In reply to this post by Andrew Mashenkov
    Thank you very much!
    1.query parallelism:this will cause a problem: fetch wrong reslut. 
       I set it to 10,and have table a with 150,000 records, table b with 12,000,000 records.
       when I query single table,the result is correct.
       but when the sql is like this:
           select a.id from a inner join b on a.id = b.tid 
      it got the wrong result. The result should be 11,000,000;but it just return 380,000 records.
      when I remove query parallelism setting,it return correctly. 

    2. I have modified ths property,and restart the server.for the record is too large, it need 4 hours to load data to ignite.So I have to wait.
    3.Actually, if I remove the group by clause and having condition, it took more time!
    4  and 5: I have try them before ,but it did not work.
Thanks again.
Lucky 




At 2017-09-21 21:28:40, "Andrey Mashenkov" <[hidden email]> wrote:
Lucky,


1. Looks like it make no sense to set query parallelism level higher number of available CPU on node.

2. Map query use index for field FASSIGCUID type of String and seems values are 16 chars length strings (32 bytes)
By default, values with size < 10 bytes can be inlined in index, so Ignite doesn't need to lookup a data page for value data. 
You can try to increase it up to 32 via cacheConfiguration.setSqlIndexMaxInlineSize(32) or JVM property -DIGNITE_MAX_INDEX_PAYLOAD_SIZE=32.

3. Ignite doesn't know whether your data is collocated by FDATABASEDID (group by clause) or not collocated.
So, Ignite can't apply HAVING condition instantly on map phase and have to load and merge all groups from all nodes before check for HAVING.
If it possible to collocate data on GROUP BY condition, you can hint Ignite with setting query flag:   sqlFieldsQuery.setCollocated(true).
However, I'm not sure it will help much and H2 will be able to make any optimization here.

4. Also, you can force Ignite to use different index. E.g. group index on FDATABASEDID and FASSIGCUID and same fields in different order.

5. Sometimes, Ignite change join order and it can cause unexcpected slowdown. You can try to change join order by changing tables positions in query string.
To preserve Ignite join order optimization you may use a flag:  sqlFieldsQuery.setEnforceJoinOrder(true).


Hope, this will help you.



Andrew Mashenkov Andrew Mashenkov
Reply | Threaded
Open this post in threaded view
|

Re: Re: Fetched result use too much time

Hi Lucky,

Looks like your query selectivity is poor and even with GroupBy large amount of data shoud be fetched to reduce node.

1. Is it possiblt to coolocate data on field used in OrderBy clause?
2. Looks weird that queryParallelizm cause wrong results. Looks like you have a single node grid and there is a bug in queryParallelizm feature.
 Also I can find what ignite version you use. Would you try to switch to the latest one?

On Tue, Oct 10, 2017 at 2:48 PM, Lucky <[hidden email]> wrote:
    Thank you very much!
    1.query parallelism:this will cause a problem: fetch wrong reslut. 
       I set it to 10,and have table a with 150,000 records, table b with 12,000,000 records.
       when I query single table,the result is correct.
       but when the sql is like this:
           select a.id from a inner join b on a.id = b.tid 
      it got the wrong result. The result should be 11,000,000;but it just return 380,000 records.
      when I remove query parallelism setting,it return correctly. 

    2. I have modified ths property,and restart the server.for the record is too large, it need 4 hours to load data to ignite.So I have to wait.
    3.Actually, if I remove the group by clause and having condition, it took more time!
    4  and 5: I have try them before ,but it did not work.
Thanks again.
Lucky 




At 2017-09-21 21:28:40, "Andrey Mashenkov" <[hidden email]> wrote:
Lucky,


1. Looks like it make no sense to set query parallelism level higher number of available CPU on node.

2. Map query use index for field FASSIGCUID type of String and seems values are 16 chars length strings (32 bytes)
By default, values with size < 10 bytes can be inlined in index, so Ignite doesn't need to lookup a data page for value data. 
You can try to increase it up to 32 via cacheConfiguration.setSqlIndexMaxInlineSize(32) or JVM property -DIGNITE_MAX_INDEX_PAYLOAD_SIZE=32.

3. Ignite doesn't know whether your data is collocated by FDATABASEDID (group by clause) or not collocated.
So, Ignite can't apply HAVING condition instantly on map phase and have to load and merge all groups from all nodes before check for HAVING.
If it possible to collocate data on GROUP BY condition, you can hint Ignite with setting query flag:   sqlFieldsQuery.setCollocated(true).
However, I'm not sure it will help much and H2 will be able to make any optimization here.

4. Also, you can force Ignite to use different index. E.g. group index on FDATABASEDID and FASSIGCUID and same fields in different order.

5. Sometimes, Ignite change join order and it can cause unexcpected slowdown. You can try to change join order by changing tables positions in query string.
To preserve Ignite join order optimization you may use a flag:  sqlFieldsQuery.setEnforceJoinOrder(true).


Hope, this will help you.






--
Best regards,
Andrey V. Mashenkov
Regards,
Andrew.
Lucky Lucky
Reply | Threaded
Open this post in threaded view
|

Re:Re: Re: Fetched result use too much time


My ignite version is 2.2.




在 2017-10-10 20:58:55,"Andrey Mashenkov" <[hidden email]> 写道:
Hi Lucky,

Looks like your query selectivity is poor and even with GroupBy large amount of data shoud be fetched to reduce node.

1. Is it possiblt to coolocate data on field used in OrderBy clause?
2. Looks weird that queryParallelizm cause wrong results. Looks like you have a single node grid and there is a bug in queryParallelizm feature.
 Also I can find what ignite version you use. Would you try to switch to the latest one?

On Tue, Oct 10, 2017 at 2:48 PM, Lucky <[hidden email]> wrote:
    Thank you very much!
    1.query parallelism:this will cause a problem: fetch wrong reslut. 
       I set it to 10,and have table a with 150,000 records, table b with 12,000,000 records.
       when I query single table,the result is correct.
       but when the sql is like this:
           select a.id from a inner join b on a.id = b.tid 
      it got the wrong result. The result should be 11,000,000;but it just return 380,000 records.
      when I remove query parallelism setting,it return correctly. 

    2. I have modified ths property,and restart the server.for the record is too large, it need 4 hours to load data to ignite.So I have to wait.
    3.Actually, if I remove the group by clause and having condition, it took more time!
    4  and 5: I have try them before ,but it did not work.
Thanks again.
Lucky 




At 2017-09-21 21:28:40, "Andrey Mashenkov" <[hidden email]> wrote:
Lucky,


1. Looks like it make no sense to set query parallelism level higher number of available CPU on node.

2. Map query use index for field FASSIGCUID type of String and seems values are 16 chars length strings (32 bytes)
By default, values with size < 10 bytes can be inlined in index, so Ignite doesn't need to lookup a data page for value data. 
You can try to increase it up to 32 via cacheConfiguration.setSqlIndexMaxInlineSize(32) or JVM property -DIGNITE_MAX_INDEX_PAYLOAD_SIZE=32.

3. Ignite doesn't know whether your data is collocated by FDATABASEDID (group by clause) or not collocated.
So, Ignite can't apply HAVING condition instantly on map phase and have to load and merge all groups from all nodes before check for HAVING.
If it possible to collocate data on GROUP BY condition, you can hint Ignite with setting query flag:   sqlFieldsQuery.setCollocated(true).
However, I'm not sure it will help much and H2 will be able to make any optimization here.

4. Also, you can force Ignite to use different index. E.g. group index on FDATABASEDID and FASSIGCUID and same fields in different order.

5. Sometimes, Ignite change join order and it can cause unexcpected slowdown. You can try to change join order by changing tables positions in query string.
To preserve Ignite join order optimization you may use a flag:  sqlFieldsQuery.setEnforceJoinOrder(true).


Hope, this will help you.


--
Best regards,
Andrey V. Mashenkov