work around for problem where ignite query does not include objects added into Cache from within a transaction

kotamrajuyashasvi
work around for problem where ignite query does not include objects added into Cache from within a transaction

Hi

When a transaction is started and some rows are updated/inserted/deleted in
a cache that is in transactional mode, a subsequent SELECT query does not
reflect the results of the updates/inserts/deletes made earlier within the
transaction. I found out that Ignite SQL does not have this functionality
yet, but I need a temporary workaround for this problem. I have thought of
two possible solutions. I do not know whether they are feasible, or what
their impact on performance or functionality would be. Hence I need your
help.

Solution 1: Assuming all updates/inserts/deletes within a transaction are
temporarily stored in some cache and then flushed to the original cache on
commit or rollback, is there any way I could access that temporary cache,
run a SELECT query against it, and then decide which rows to include in the
query result?

Solution 2: Whenever I start a transaction, for every original cache I
create a new dynamic temporary cache, not in transactional mode (and without
any persistent store), using 'getOrCreateCache(..)' as mentioned in the
docs. Within the transaction, whenever an update/insert happens I write to
the corresponding temporary cache. For a SELECT, I query both the main cache
and the corresponding temporary cache, then merge the two result sets to
produce the actual result. I also maintain a list of rows/keys deleted
within the transaction and exclude those keys from query results. On commit
I sync the temporary caches into the original caches. Will creating
temporary caches for every transaction degrade performance? Is there any way
to create a temporary cache local to the transaction or node so that it does
not affect performance, since I am using the temporary cache mainly for SQL
queries?
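In case it helps the discussion, here is a minimal sketch of the merge logic I have in mind for solution 2. The class and method names are made up, and plain HashMaps stand in for the caches; real code would use IgniteCache instances obtained via getOrCreateCache(..) and SQL queries instead of map lookups:

```java
import java.util.*;

// Hypothetical overlay of a per-transaction temporary cache on top of a
// main cache. Writes and deletes stay in the overlay until commit.
class TxOverlay {
    final Map<Integer, String> mainCache;                   // committed data
    final Map<Integer, String> tempCache = new HashMap<>(); // uncommitted writes
    final Set<Integer> deleted = new HashSet<>();           // uncommitted deletes

    TxOverlay(Map<Integer, String> mainCache) { this.mainCache = mainCache; }

    void put(Integer key, String val) { tempCache.put(key, val); deleted.remove(key); }
    void remove(Integer key)          { tempCache.remove(key); deleted.add(key); }

    // "SELECT": query both caches, let the temp cache override the main
    // cache, and exclude keys deleted within the transaction.
    Map<Integer, String> select() {
        Map<Integer, String> result = new HashMap<>(mainCache);
        result.putAll(tempCache);
        result.keySet().removeAll(deleted);
        return result;
    }

    // On commit, sync the temporary cache into the original cache.
    void commit() {
        mainCache.putAll(tempCache);
        mainCache.keySet().removeAll(deleted);
        tempCache.clear();
        deleted.clear();
    }
}
```

The key point is that the main cache is never touched before commit, so other transactions keep seeing the old data.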


Please point out any drawbacks of the above solutions, and also any other
workarounds for this problem.

 



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Vladimir Ozerov
Re: work around for problem where ignite query does not include objects added into Cache from within a transaction

Hi,

Looks like you need a feature which is simply not implemented in Ignite at the moment. SELECTs are not transactional, and there is no easy way to make not-yet-committed updates visible to them. Solution 1 will not work, as there are no "temporary" caches in Ignite; not-yet-committed updates are stored in internal data structures. Solution 2 would lead to dramatic performance degradation.

I would suggest you rethink the whole approach - SELECTs do not have any transactional guarantees at the moment. Maybe you will be able to split the single transaction into multiple transactions and run the SQL query in between.

Vladimir.

On Sat, Sep 16, 2017 at 10:17 AM, kotamrajuyashasvi <[hidden email]> wrote:
> [quoted post trimmed]

rick_tem
Re: work around for problem where ignite query does not include objects added into Cache from within a transaction

I see that as a huge problem. Certainly one of the functions of Ignite is to
be faster than the database, but if it fails to meet all of the requirements
of what a database will do for you, what is the point of using it? Clearly a
database will keep read consistency between transactions. Most applications
I've worked with require that as well. If I understand correctly, this hole
makes querying the grid almost useless, as I can't count on it being consistent.



vkulichenko
Re: work around for problem where ignite query does not include objects added into Cache from within a transaction

As Vladimir mentioned, this is one of the features that is simply not
implemented yet. Transactional support for SQL is in development right now.
Hopefully we will have it sometime this year, or perhaps early next year.

-Val



dsetrakyan
Re: work around for problem where ignite query does not include objects added into Cache from within a transaction

In reply to this post by rick_tem

On Mon, Sep 18, 2017 at 8:01 AM, rick_tem <[hidden email]> wrote:
> [quoted post trimmed]

Rick, 
The Ignite community does understand this and is very honest about warning users about it.

The plan is to add this feature in the 2.4 release, hopefully by the end of the year.

D.
rick_tem
Re: work around for problem where ignite query does not include objects added into Cache from within a transaction

Okay, thank you.  Nice to know it is under development.

Best,
Rick



kotamrajuyashasvi
Re: work around for problem where ignite query does not include objects added into Cache from within a transaction

In reply to this post by Vladimir Ozerov
Hi

Thanks for your responses. I just wanted a temporary workaround until the
actual feature is implemented, even at the cost of performance. I have
thought of another approach to this problem.

I plan to use my own transaction mechanism instead of Ignite transactions. I
can use explicit locks to lock the keys, with my own commit and rollback
functionality. In the primary-key POJO class, along with the PK fields, I
add a field 'transaction_id' which holds the id (some unique id) of the
transaction in which the row was inserted; by default it is null. I do not
perform updates, deletes, or inserts directly with queries. For an update or
delete, I first run a SELECT query to get the keys (_key) that should be
deleted/updated along with their values (_val). Then I lock the keys using
explicit locks, and once I acquire a lock I compare the _val from the query
with cache.get(PK): if both are the same I continue, otherwise I rerun the
query. Before locking the keys I make a few checks:

(a) If a key's transaction_id is not null and differs from the present
transaction id, I discard the key and continue with the remaining keys (as
it belongs to some other transaction).

(b) If the key is already present within the transaction (i.e. the same key
fields except for transaction_id), I discard it, because that row has been
updated within the transaction and I need to use the updated value of the PK
if it is eligible for the query result.

(c) I maintain a HashMap<PK, Value> of the old cache rows I just locked, and
another HashMap<PK, Value> of new rows updated or inserted with
transaction_id set to the current transaction id. I also maintain a delete
HashSet of keys to be deleted within the transaction: whenever a delete
request comes I do not delete directly, I first push the key into the delete
HashSet and actually delete on commit. During a SELECT, if a key is present
in the delete HashSet I discard it; if the key is present in the oldcacherows
or newcacherows HashMap I do not acquire locks again.

I also acquire locks in a specific order to avoid deadlocks.

Most of these checks after the SELECT query are constant-time operations
(O(1)) since I use HashMaps and HashSets. Rerunning the query will degrade
performance, but it should not happen frequently.

Once I obtain the locks: for a delete I just push the key into the delete
HashSet. For an update, I update the value for the given key accordingly and
put it in the cache, but with the key's transaction_id field set to the
current transaction id; hence the old row still exists and remains visible to
other transactions. I also put this row into the newcacherows HashMap. For an
insert I likewise put the new row into the cache with transaction_id set to
the present transaction id, so it is ignored by other transactions, and I put
the inserted row into the newcacherows HashMap. During commit I put all the
rows in newcacherows into the cache with transaction_id as null, replacing
the old rows and the rows inserted during the transaction; I also delete any
rows still carrying the current transaction id, and delete the rows in the
delete HashSet. During rollback I just remove the temporary rows inserted
during the transaction, since until commit no row is ever actually inserted
or deleted with transaction_id as null. I also maintain a list of the locks I
acquired, and during commit/rollback I release them all.
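To make the bookkeeping concrete, here is a rough sketch of the scheme above. It is only a model: the class and field names are made up, a plain HashMap stands in for the Ignite cache, and the explicit-lock handling is omitted. Keys are (pk, transaction_id) pairs, where a null transaction_id means the row is committed:

```java
import java.util.*;

// Hypothetical model of the custom-transaction bookkeeping.
class ManualTx {
    record Key(int pk, String txId) {}   // txId == null => committed row

    final Map<Key, String> cache;
    final String txId;
    final Map<Key, String> oldRows = new HashMap<>();  // locked originals (lock handling omitted)
    final Map<Key, String> newRows = new HashMap<>();  // rows written by this tx
    final Set<Key> deleteSet = new HashSet<>();        // deletes deferred to commit

    ManualTx(Map<Key, String> cache, String txId) { this.cache = cache; this.txId = txId; }

    void update(int pk, String newVal) {
        Key committed = new Key(pk, null);
        oldRows.putIfAbsent(committed, cache.get(committed)); // remember the locked original
        Key pending = new Key(pk, txId);
        cache.put(pending, newVal);   // old row stays visible to other transactions
        newRows.put(pending, newVal);
    }

    void delete(int pk) { deleteSet.add(new Key(pk, null)); } // deferred

    void commit() {
        for (Map.Entry<Key, String> e : newRows.entrySet()) {
            cache.remove(e.getKey());                                 // drop tx-private copy
            cache.put(new Key(e.getKey().pk(), null), e.getValue());  // publish with null txId
        }
        cache.keySet().removeAll(deleteSet);
        clear();
    }

    void rollback() {
        // Only tx-private rows were ever written, so removing them restores the old state.
        cache.keySet().removeAll(newRows.keySet());
        clear();
    }

    private void clear() { oldRows.clear(); newRows.clear(); deleteSet.clear(); }
}
```

The sketch shows why other transactions are unaffected before commit: every pending write lives under a key tagged with the current transaction id, never under the committed (null) key.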

I also have a workaround for the problem where a SELECT query might return
partial results of a commit. The solution is to maintain a commit_bit in the
row object, which is 0 by default. While inserting into the cache during
commit I set commit_bit to 1; after the commit, i.e. once all rows are in
the cache, I update commit_bit back to 0 for all those rows. During a SELECT
one additional check is made:

(d) If any row has commit_bit set to 1, it indicates that this row was
inserted in the middle of a commit, so additional rows might still be
inserted/updated that could change the query result; hence the query is
rerun until no row has commit_bit set to 1.
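A minimal sketch of this commit_bit retry check, again with made-up names and a plain map standing in for the cache:

```java
import java.util.*;

// Hypothetical model of the commit_bit protocol: rows published mid-commit
// carry commitBit == 1, and a SELECT is retried while any such row is seen.
class CommitBitDemo {
    record Row(int pk, String val, int commitBit) {}

    // Publish a batch: first insert with commitBit = 1, then clear the bit
    // once the whole batch is in the cache.
    static void commitBatch(Map<Integer, Row> cache, List<Row> batch) {
        for (Row r : batch) cache.put(r.pk(), new Row(r.pk(), r.val(), 1));
        for (Row r : batch) cache.put(r.pk(), new Row(r.pk(), r.val(), 0));
    }

    // "SELECT": rerun until no visible row is flagged as mid-commit.
    static List<Row> select(Map<Integer, Row> cache) {
        while (true) {
            List<Row> rows = new ArrayList<>(cache.values());
            if (rows.stream().noneMatch(r -> r.commitBit() == 1)) return rows;
            // some row was caught mid-commit: retry the query
        }
    }
}
```

Note that in a real concurrent setting this retry loop needs a bound or backoff, since a stalled committer would otherwise make the SELECT spin forever.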


