Local Node Query Optimization

JohnnyS

Local Node Query Optimization

Hi

I'm using Apache Ignite to query a large spatial database using a z-order curve, and I'm trying to improve the query performance on the local node, i.e. I use sql.setLocal(true). The database consists of GPS tracks from multiple users, divided into tiles which I query. The data is meant to be mostly read-only, with updates once a day at most. I use partitioned mode.

I query the cache for tracks in a tile, and the program needs to iterate over the result set as quickly as possible or calculate the total number of users in a tile. I'm using a group index (tile id, user id) which is used by the query (I can see this in the H2 console). The records in a tile are sorted by the user id.

For a test tile that I've been using, the index limits the query to 33000 records, from which the total number of users needs to be calculated. Currently the query takes 120 ms when using an ONHEAP_TIERED cache and a single node. For an OFFHEAP_TIERED cache the query time increases to 200 ms. The performance scales linearly with the number of nodes when using a broadcast, which is good. However, I intend to use the cache as part of a web application, and the single-node performance needs to be faster. In a C++ implementation that I did previously (and which is hard to scale to multiple nodes and large data sets), a similar calculation can be done in a few ms when using cache-friendly containers.

My question is: is there anything I can do to increase the performance of the local node queries? I've been trying most of the performance tips from http://apacheignite.gridgain.org/v1.3/docs/performance-tips except for the externalizable part. For example, can I change H2 settings somehow, or use something similar to a local peek to increase the query performance?
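For reference, my setup is roughly like this (a minimal sketch; the class, cache and field names are made up for illustration):

```java
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.cache.query.annotations.QuerySqlField;
import org.apache.ignite.configuration.CacheConfiguration;

public class TileQueryExample {
    /** Value class; (tileId, userId) form the ordered group index. */
    public static class TrackRecord {
        @QuerySqlField(orderedGroups = {@QuerySqlField.Group(name = "tile_user_idx", order = 0)})
        private long tileId;

        @QuerySqlField(orderedGroups = {@QuerySqlField.Group(name = "tile_user_idx", order = 1)})
        private long userId;

        // GPS track payload omitted.
    }

    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Long, TrackRecord> ccfg = new CacheConfiguration<>("tracks");
            ccfg.setIndexedTypes(Long.class, TrackRecord.class);

            IgniteCache<Long, TrackRecord> cache = ignite.getOrCreateCache(ccfg);

            // Count distinct users in one tile, restricted to this node's partitions.
            SqlFieldsQuery qry = new SqlFieldsQuery(
                "select count(distinct userId) from TrackRecord where tileId = ?");
            qry.setArgs(42L);
            qry.setLocal(true);

            List<List<?>> res = cache.query(qry).getAll();
            System.out.println("Distinct users in tile: " + res.get(0).get(0));
        }
    }
}
```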

Thanks,
johnny
vkulichenko

Re: Local Node Query Optimization

Hi Johnny,

I'm not sure why it's so much slower than your C++ implementation (what did you use, BTW?). Most likely there are ways to improve local execution, and I think Sergi Vladykin should have some comments on this, as he is an expert in this part of Ignite.

But anyway, if performance scales, why do you want to use local queries? Your web app can start an embedded client node which will execute queries on a remote cluster, and that cluster can have as many nodes as you need. You can even have all the nodes on one machine if that's dictated by your environment.
This approach is much more flexible, because you can add nodes at any moment (for example, if your data set grows and performance degrades). With local queries you would need to switch to a bigger physical box to store the data, and there would be nothing you could do about performance.
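As a rough sketch (cache, class and field names are taken from your description above, so treat them as placeholders), the web application side could look like this:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class WebAppQueryClient {
    public static void main(String[] args) {
        // Join the cluster as a client node: it stores no data, and queries
        // are executed on the server nodes that own the partitions.
        Ignition.setClientMode(true);

        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Long, Object> cache = ignite.cache("tracks");

            SqlFieldsQuery qry = new SqlFieldsQuery(
                "select count(distinct userId) from TrackRecord where tileId = ?");
            qry.setArgs(42L);

            // Distributed (non-local) query: adding server nodes adds capacity.
            System.out.println(cache.query(qry).getAll());
        }
    }
}
```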

-Val
Sergi Vladykin

Re: Local Node Query Optimization

Hi,

First of all, I would not expect a general-purpose solution (Apache Ignite) to compare fairly in performance with a highly optimized custom implementation (yours with cache-friendly containers), simply because Ignite is a distributed, highly parallel system that supports many more use cases than you implemented.

There are multiple ways we can try to solve your problem:

1. Use a custom indexing implementation with your optimized low-latency C++ code inside. See https://ignite.incubator.apache.org/releases/1.3.0/javadoc/org/apache/ignite/spi/indexing/IndexingSpi.html

2. I would suggest trying the H2 geospatial extension, which can be used in Ignite instead of a z-order curve, to see whether it works better for you. See https://github.com/apache/incubator-ignite/blob/f9fe99963fd3defab639e9700dffc965cdd39bb1/modules/geospatial/src/test/java/org/apache/ignite/internal/processors/query/h2/GridH2IndexingGeoSelfTest.java
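Roughly, with the ignite-geospatial module on the classpath it could look like this (a sketch only; I'm assuming a JTS `Geometry` field and H2's `&&` intersection operator as in the linked test, and the class/field names are illustrative):

```java
import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.io.WKTReader;

import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.cache.query.annotations.QuerySqlField;

public class GeoTrack {
    /** A JTS geometry field with index = true gets a spatial index instead of a z-order key. */
    @QuerySqlField(index = true)
    private Geometry coords;

    @QuerySqlField
    private long userId;

    /** Count distinct users whose tracks intersect a bounding box. */
    static Object countUsersInBox(IgniteCache<Long, GeoTrack> cache) throws Exception {
        Geometry box = new WKTReader().read("POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))");

        SqlFieldsQuery qry = new SqlFieldsQuery(
            "select count(distinct userId) from GeoTrack where coords && ?");
        qry.setArgs(box);

        return cache.query(qry).getAll().get(0).get(0);
    }
}
```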

In addition to Val's comments, I also would not recommend running local queries against a partitioned cache, because by design that is not a fault-tolerant solution.

Sergi
JohnnyS

Re: Local Node Query Optimization

In reply to this post by vkulichenko
Hi

The C++ implementation is almost completely custom. I used zip_iterators from CUDA Thrust to create a "virtual container" in host memory from a bunch of STL vectors. I then sort this by key and use upper/lower bound searches to find the limits of the records in a tile, then run a simple scan over those to count the number of distinct users. I'm fairly sure this could be optimized further. Reading the STL vectors sequentially should be about as fast as it gets. If I insert more data, the container needs to be sorted again, which is not optimal.
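In Java terms, the counting step over the sorted range is essentially this (just an illustration of the idea, not the actual code):

```java
import java.util.List;

public class TileUserCount {
    /**
     * records is assumed to be sorted by (tileId, userId); [from, to) is the
     * range for one tile, found via the upper/lower bound search. Counting
     * distinct users is then a single sequential pass.
     */
    static int countDistinctUsers(List<long[]> records, int from, int to) {
        int count = 0;
        long prevUser = 0;

        for (int i = from; i < to; i++) {
            long user = records.get(i)[1]; // [0] = tileId, [1] = userId

            if (i == from || user != prevUser)
                count++;

            prevUser = user;
        }

        return count;
    }
}
```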

I could use global queries to count the number of distinct users. I also do a map-reduce-like reduction to aggregate the GPS tracks in a tile. To do this, I follow http://apache-ignite-users.70518.x6.nabble.com/Closure-method-on-a-Cache-Query-td456.html#a457 and do a query broadcast, reduce locally, and return the reduced result to the caller. If it is possible to create complex custom aggregate SQL functions, I could probably use those.
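The broadcast-and-reduce part looks roughly like this (a sketch only; the cache, class and field names follow my earlier description):

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.ignite.Ignite;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.lang.IgniteCallable;
import org.apache.ignite.resources.IgniteInstanceResource;

public class TileUserBroadcast {
    /** Runs a local query on every node and merges the partial results on the caller. */
    static int distinctUsersInTile(Ignite ignite, final long tileId) {
        Collection<Set<Long>> partials = ignite.compute().broadcast(new IgniteCallable<Set<Long>>() {
            @IgniteInstanceResource
            private Ignite local;

            @Override public Set<Long> call() {
                SqlFieldsQuery qry = new SqlFieldsQuery(
                    "select distinct userId from TrackRecord where tileId = ?");
                qry.setArgs(tileId);
                qry.setLocal(true); // each node scans only its own partitions

                Set<Long> users = new HashSet<>();

                try (QueryCursor<List<?>> cur = local.<Long, Object>cache("tracks").query(qry)) {
                    for (List<?> row : cur)
                        users.add((Long)row.get(0));
                }

                return users;
            }
        });

        // Final reduce on the caller: union of the per-node sets.
        Set<Long> all = new HashSet<>();
        for (Set<Long> p : partials)
            all.addAll(p);

        return all.size();
    }
}
```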

I tried profiling with VisualVM and the bottleneck might be the marshalling, but the results are somewhat conflicting, so I'm not sure about this. Hazelcast has binary and object in-memory formats for the values (http://docs.hazelcast.org/docs/latest/manual/html/map-inmemoryformat.html); I could try something similar if it is available in Ignite.

Regards,
johnny.
JohnnyS

Re: Local Node Query Optimization

In reply to this post by Sergi Vladykin
Sergi

Do you have any comments on the memory format and on changing/optimizing the H2 database settings? Can these be changed at cache startup? For example, running ANALYZE or setting EARLY_FILTER to true. Are the requirements for OPTIMIZE_DISTINCT (http://www.h2database.com/javadoc/org/h2/constant/DbSettings.html#OPTIMIZE_DISTINCT) met for the last field of an ordered group index in the H2 version Ignite uses?

1) Are there any more complex examples of a custom indexing implementation?

2) I don't think the geospatial extension fits this case. If I had point data, maybe, but I'm fairly sure that MapD is the solution for this kind of data. I actually have track data as the value, and I'm fairly sure that a Morton code is optimal for my case.

Regards,

j.
Sergi Vladykin

Re: Local Node Query Optimization

Johnny,

Ignite with ONHEAP_TIERED keeps Java objects in memory, which is the fastest possible format. I don't think the problem is there.
Also I don't believe you will find any magic SPEEDUP_MY_QUERY_100_TIMES option in H2.

As for the OPTIMIZE_DISTINCT option, I'm not sure. Ignite uses H2 1.3.175; you can check.

1) I don't see any, but it is a fairly simple interface. You just need to implement the `store`, `remove` and `query` methods, delegating to your C++ code. It will be instantiated on each data node, each query will be run on each node, and the results are merged. So your library will become distributed and scalable automatically. Note that Ignite assumes that the IndexingSpi supports concurrent updates and queries, so you have to implement correct synchronization yourself. (A rough skeleton follows after point 2 below.)

2) The geospatial extension supports more complex geometries than points, such as `LineString`, so I believe it should work. Still, I'm not sure whether it will be efficient enough for you.
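Regarding point 1, a very rough skeleton could look like the following. Treat it only as a shape and check the exact method signatures against the IndexingSpi javadoc linked earlier, since this is written from memory:

```java
import java.util.Collection;
import java.util.Iterator;

import javax.cache.Cache;

import org.apache.ignite.spi.IgniteSpiAdapter;
import org.apache.ignite.spi.IgniteSpiException;
import org.apache.ignite.spi.indexing.IndexingQueryFilter;
import org.apache.ignite.spi.indexing.IndexingSpi;

/** Skeleton of a custom IndexingSpi that delegates to a native (e.g. JNI-wrapped C++) index. */
public class NativeIndexingSpi extends IgniteSpiAdapter implements IndexingSpi {
    public Iterator<Cache.Entry<?, ?>> query(String spaceName, Collection<Object> params,
        IndexingQueryFilter filters) throws IgniteSpiException {
        // Run the query against the native index and wrap the hits as cache entries.
        throw new UnsupportedOperationException("TODO: delegate to the native index");
    }

    public void store(String spaceName, Object key, Object val, long expirationTime)
        throws IgniteSpiException {
        // Index (or re-index) the entry in the native structure.
        // Must be safe to call concurrently with query().
    }

    public void remove(String spaceName, Object key) throws IgniteSpiException {
        // Drop the entry from the native index.
    }

    public void onSwap(String spaceName, Object key) throws IgniteSpiException {
        // No-op if swap space is not used.
    }

    public void onUnswap(String spaceName, Object key, Object val) throws IgniteSpiException {
        // No-op if swap space is not used.
    }

    public void spiStart(String gridName) throws IgniteSpiException {
        // Initialize the native index.
    }

    public void spiStop() throws IgniteSpiException {
        // Release native resources.
    }
}
```

It would then be plugged in via IgniteConfiguration.setIndexingSpi(new NativeIndexingSpi()), and each node would index and query only the data it owns.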

Sergi
JohnnyS

Re: Local Node Query Optimization

Sergi

By using setSqlOnheapRowCacheSize(100000) in the cache configuration, the distinct query time went from 200 ms to 30 ms when using off-heap. For ONHEAP_TIERED, the query time was originally 120 ms and it ends up at the same optimized query time. I think I could live with these numbers, considering that I will be running the queries in parallel. I think the remaining problem is that with millions of rows, the majority of the data will not be on heap, and the queries will be slow for those.
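For reference, the relevant part of my test cache configuration is roughly this (cache name and sizes are illustrative):

```java
import org.apache.ignite.cache.CacheMemoryMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class OffheapTracksConfig {
    static CacheConfiguration<Long, Object> tracksConfig() {
        CacheConfiguration<Long, Object> ccfg = new CacheConfiguration<>("tracks");

        ccfg.setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED);
        ccfg.setOffHeapMaxMemory(10L * 1024 * 1024 * 1024); // 10 GB off-heap

        // Keep up to 100k recently used SQL rows on heap so the index does not
        // have to deserialize them from off-heap on every access.
        ccfg.setSqlOnheapRowCacheSize(100_000);

        return ccfg;
    }
}
```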

I couldn't find a way to change the H2 settings through the cache configuration. Can I edit the H2 URL somehow?

Regards,

j.
Sergi Vladykin

Re: Local Node Query Optimization

That's quite interesting. Maybe the slowdown was due to GC activity? How large was your dataset in that test?

No, editing the H2 URL is explicitly prohibited; there are quite a few things you could break, and most probably you would not improve anything much.
If you are interested, you can hack on the Ignite source code; I will be happy to accept your patch if you find something useful.

Sergi

JohnnyS

Re: Local Node Query Optimization

Based on jstat, the time taken by GC is substantially higher when I don't use setSqlOnheapRowCacheSize. The results can be found here: http://pastebin.com/JpuaKSUk. In the test I load the data into the cache, pause for a few seconds, and then run the same count-distinct query in a for loop a few hundred times. I start jstat during the pause.

Hacking on Ignite might not be worth the effort. If the H2 options are ever exposed through the cache configuration in the future, I can wait for that. I'm not expecting a huge performance boost.

Regards,
j.
Sergi Vladykin

Re: Local Node Query Optimization

What does your CacheConfiguration look like when you run with ONHEAP_TIERED? Do you have setOffHeapMaxMemory(-1)?
When you have anything other than -1, indexes are always kept off-heap; you will have on-heap indexes only with -1.
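In other words (a sketch, with an arbitrary cache name):

```java
import org.apache.ignite.configuration.CacheConfiguration;

public class IndexPlacement {
    static CacheConfiguration<Long, Object> onHeapIndexes() {
        CacheConfiguration<Long, Object> ccfg = new CacheConfiguration<>("tracks");
        ccfg.setOffHeapMaxMemory(-1); // off-heap disabled: entries and SQL indexes stay on heap
        return ccfg;
    }

    static CacheConfiguration<Long, Object> offHeapIndexes() {
        CacheConfiguration<Long, Object> ccfg = new CacheConfiguration<>("tracks");
        ccfg.setOffHeapMaxMemory(10L * 1024 * 1024 * 1024); // anything other than -1: SQL indexes go off-heap
        return ccfg;
    }
}
```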

H2 options will not be exposed in the configuration; I have already described the reasons for that.

Sergi

JohnnyS

Re: Local Node Query Optimization

I've set off-heap to 10 gigabytes for the test data. In production I will probably need to load hundreds of gigabytes into the cache, and I might also have to use swap space. For queries I need to keep the indexes on heap. Is there any other way to force this behavior besides using setSqlOnheapRowCacheSize?

Regards,

j.
Sergi Vladykin

Re: Local Node Query Optimization

If off-heap is enabled (setOffHeapMaxMemory is not -1), then SQL indexes are always off-heap. Setting setSqlOnheapRowCacheSize is the only way to put part of your indexed off-heap data on heap to improve performance.

If you set setSqlOnheapRowCacheSize larger than the size of your dataset, then effectively you will have all your data on heap, and it would be better to disable off-heap altogether.

Sergi

JohnnyS

Re: Local Node Query Optimization

Is there any way to circumvent this? I basically have Too Big Data (tm?) for memory only. I have to use RAID 0 SSDs or NVMe swap to store some or most of the data. I would still like to store the indexes on heap for fast queries, though. I'm basically trying to replicate Aerospike, but with a better query model.

Regards,

j.
Sergi Vladykin

Re: Local Node Query Optimization

I think asking how to circumvent this is the wrong question; these things are supposed to work this way.
If you need on-heap, just disable off-heap. If you have too much data for on-heap, use off-heap and maybe swap.
Otherwise you will end up with two copies of the same data, one on-heap and another off-heap, and the purpose of off-heap will be completely defeated anyway.

Even if you are using swap, only cache values are swapped to disk; the SQL index still resides in memory (either off-heap or on-heap), and an in-memory entry with the cache key is still required to access the swapped cache value. Thus swap will help only if you have relatively large cache values.

Sergi

