Hi all,
I'm trying to commit a very large transaction (8M keys and ~4GB of data). After a while, I can see this diagnostic message in the node log:

[08:56:31,721][WARNING][sys-#989][diagnostic] >>> Transaction [startTime=08:55:22.095, curTime=08:56:31.712, ... *state=SUSPENDED* ...

Does anyone know why it is suspended, and how to avoid it?

Thanks in advance,
José

-- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Hi again,
For a smaller transaction that succeeds (1.2M keys and 600MB in size), I noticed that its state changes roughly as follows:

SUSPENDED -> ACTIVE -> COMMITTING ...

and it takes around 3 min to finish. For another test with 4M keys and 2GB, it is still in the SUSPENDED state after 30 min.

Is there a maximum number of keys or a maximum size for a single transaction? Is there any documentation out there about transaction states?

Another test with 2M keys and 1GB also remains in the SUSPENDED state after 11 minutes... I don't understand where the difference between this one and the successful one (1.2M keys and 600MB) could be.

Any idea is welcome.

I would also recommend taking a thread dump to see where this suspension is coming from. Please attach the thread dump here along with the reproducer.

ilya.kasnacheev
Hello!

I can see that the only occurrence of transaction suspending in our own code is in the thin client implementation. Do you happen to use a thin client for this operation?

Regards,
--
Ilya Kasnacheev

On Mon, 8 Feb 2021 at 20:32, akorensh <[hidden email]> wrote:
> I would also recommend taking a thread dump to see where this suspension is

In reply to this post by akorensh
Hi,
First off, thanks for your help.

In the test, I'm using a cluster with a single server node running the official 2.9.1 release. The client is a C++ Thin Client with transaction support (commit 685c1b70ca from the master branch).

The test is very simple:

    struct Blob
    {
        int8_t m_blob[512];
    };

    IgniteClient client = IgniteClient::Start(cfg);

    CacheClient<int32_t, Blob> cache =
        client.GetOrCreateCache<int32_t, examples::Blob>("vds");
    cache.Clear();

    // Build 2M entries of 512 bytes each.
    std::map<int32_t, Blob> map;
    for (uint32_t i = 0; i < 2000000; ++i)
        map.insert(std::make_pair(i, Blob()));

    ClientTransactions transactions = client.ClientTransactions();
    ClientTransaction tx = transactions.TxStart(PESSIMISTIC, READ_COMMITTED);

    // Everything is sent in a single PutAll inside the transaction.
    cache.PutAll(map);

    tx.Commit();

As you can see, the total size of the transaction (not taking keys into account) is 2M * 512B = 1GB. If we limit the loop to 1.9M entries, it works... and I've found where the problem is:

<http://apache-ignite-users.70518.x6.nabble.com/file/t3059/bug.png>

As you can see, "doubleCap" is an int, so trying to double it when "cap" is big enough makes it negative. The capacity is therefore never actually doubled, which leads to a reallocation of 1GB each time a new key-value entry is added to the TCP message.

Using integers to store the capacity in your C++ Thin Client implicitly limits the maximum transaction size to 1GB. Maybe you should consider using uint64_t instead...

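To make the overflow concrete, here is a minimal, self-contained sketch of the growth step described above. It is not the thin client's actual code: `WrapToInt32`, `NextCapacity32` and `NextCapacity64` are hypothetical names, and the shape of the growth step is only assumed from the description of "doubleCap". It compares doubling with a 32-bit capacity against doubling with a 64-bit one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the value a two's-complement 32-bit int would hold
// after wrap-around, computed without relying on signed overflow.
inline int32_t WrapToInt32(int64_t value)
{
    const uint32_t bits = static_cast<uint32_t>(value); // modulo 2^32, well defined
    if (bits <= 0x7FFFFFFFu)
        return static_cast<int32_t>(bits);
    return static_cast<int32_t>(static_cast<int64_t>(bits) - 4294967296LL);
}

// Assumed shape of the problematic growth step: the doubled capacity is
// kept in a 32-bit int and used only if it covers the required size.
inline int32_t NextCapacity32(int32_t cap, int32_t required)
{
    assert(cap >= 0 && required >= 0);
    const int32_t doubleCap = WrapToInt32(static_cast<int64_t>(cap) * 2);
    return doubleCap > required ? doubleCap : required;
}

// The same step with 64-bit capacities: doubling 1 GiB no longer wraps.
inline int64_t NextCapacity64(int64_t cap, int64_t required)
{
    assert(cap >= 0 && required >= 0);
    const int64_t doubleCap = cap * 2;
    return doubleCap > required ? doubleCap : required;
}
```

With a capacity of 1 GiB (`1 << 30`) and a request for 516 more bytes (an illustrative entry size), `NextCapacity32` returns exactly the requested size, so every further entry forces another reallocation, while `NextCapacity64` returns 2 GiB.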
In reply to this post by ilya.kasnacheev
Hello Ilya,
Yes, but it has nothing to do with suspending an active transaction... the problem is that the transaction never reaches the ACTIVE state because it takes a long time to create the TCP message. Please take a look at my previous post.

ilya.kasnacheev
In reply to this post by jjimeno
Hello!

Would you care to create a JIRA ticket for that issue?

Regards,
--
Ilya Kasnacheev

On Wed, 10 Feb 2021 at 14:18, jjimeno <[hidden email]> wrote:
> Hi,

I wouldn't mind, but I'm afraid I'm not allowed to... at least, I couldn't find the option on that page :)

ilya.kasnacheev
Hello!

I think you need to register first.

Btw, why do you need such large transactions? Have you considered the data streamer instead?

Regards,
--
Ilya Kasnacheev

On Wed, 10 Feb 2021 at 15:28, jjimeno <[hidden email]> wrote:
> I wouldn't mind, but I'm afraid I'm not allowed to... at least, I couldn't

Hi,
Because of the kind of product we have to develop, we currently have a set of scenarios with this kind of transaction, and we're evaluating several datastores such as RocksDB. Sadly, the timings there are considerably better than the ones I've got with Ignite... :(

The data streamer is not available in C++, afaik...

ilya.kasnacheev
Hello!

RocksDB is an embedded database, whereas Apache Ignite is a distributed database.

Regards,
--
Ilya Kasnacheev

On Wed, 10 Feb 2021 at 16:11, jjimeno <[hidden email]> wrote:
> Hi,

Zhenya Stanilovsky
In reply to this post by jjimeno
Hi!
I believe tx.putAll will be fixed soon :) I have a working prototype for now; I need a little more time to fix all the tests :)

ilya.kasnacheev
Hello!

Unfortunately, this C++ thin client / platforms issue is not so easily fixable. Apparently, our platforms interaction does not expect buffers larger than 2G.

Regards,
--
Ilya Kasnacheev

On Wed, 10 Feb 2021 at 16:43, Zhenya Stanilovsky <[hidden email]> wrote:

In reply to this post by ilya.kasnacheev
Hello!
That's exactly the reason why we would prefer to choose Ignite over RocksDB. Otherwise, we will have to implement scalability ourselves and, believe me, that's not something we would like to do.

We also know they're not directly comparable. We would agree to pay the price for scalability with slightly worse performance but, based on our tests, the price is too high.

For instance:
- Single-node cluster on the same host as the application (no communication over the wire, trying to get closer to an embedded database)
- A single user (no multiple users working either on the application or the database)

A transactional commit with 1.8M keys and 1GB in size takes 97 seconds with NO persistence, and this time doubles if persistence is enabled. RocksDB takes around 100 seconds to perform a transaction with 4M keys and 4GB in size, persistence included. As you can see, there is a huge difference.

On the other hand, limitations like the ones we have found in one month of research:
- PutAll performance in transactional cache <https://issues.apache.org/jira/browse/IGNITE-14076>
- Not asynchronous tcp connection <https://issues.apache.org/jira/browse/IGNITE-13997>
- The maximum transaction size of 1GB we are discussing in this thread

don't really help to go for Ignite, at least in our kind of project.

But we would still like to do more tests to be 100% sure about our decision. That's why I'd like to ask you:
- Should I get better performance in a multi-node cluster? Read/Write/Both?
- Should I do the tests in a different way?

Thanks in advance!

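For reference, the figures quoted above can be turned into rough average throughput. This is only back-of-the-envelope arithmetic on the numbers in this post, treating 1GB as 1000MB and taking "doubled" to mean 194 seconds; `ThroughputMBps` is a hypothetical helper, not part of any library.

```cpp
#include <cassert>

// Average throughput in MB/s for `gigabytes` of data written in `seconds`,
// treating 1 GB as 1000 MB (an assumption made for round numbers).
inline double ThroughputMBps(double gigabytes, double seconds)
{
    assert(seconds > 0.0);
    return gigabytes * 1000.0 / seconds;
}

// Figures quoted in the post:
//   Ignite, no persistence:   ThroughputMBps(1.0, 97.0)
//   Ignite, with persistence: ThroughputMBps(1.0, 194.0)  (97 s doubled)
//   RocksDB, persistent:      ThroughputMBps(4.0, 100.0)
```

That works out to roughly 10.3 MB/s for Ignite without persistence, about 5.2 MB/s with persistence, and about 40 MB/s for RocksDB.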
In reply to this post by ilya.kasnacheev
Hello!
I'm sorry to hear that. Do you think it could be fixed to at least reach that 2GB? Currently the limit is only 1GB in the C++ Thin Client.

Regards

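Until that limit changes, one possible workaround might be to keep each request small by splitting the map into batches and calling PutAll once per batch inside the same transaction. The sketch below covers only the splitting. `ForEachBatch` is a hypothetical helper, not part of the Ignite API, and whether batched PutAll calls perform acceptably in a transactional cache has not been verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical helper: passes the entries of `source` to `sink` in
// consecutive batches of at most `batchSize` entries and returns the
// number of batches handed over.
template <typename K, typename V, typename Sink>
std::size_t ForEachBatch(const std::map<K, V>& source, std::size_t batchSize, Sink sink)
{
    assert(batchSize > 0);

    std::size_t batches = 0;
    std::map<K, V> batch;

    for (typename std::map<K, V>::const_iterator it = source.begin();
         it != source.end(); ++it)
    {
        // The source is iterated in key order, so end() is always the right hint.
        batch.insert(batch.end(), *it);

        if (batch.size() == batchSize)
        {
            sink(batch);
            batch.clear();
            ++batches;
        }
    }

    // Hand over the final, partially filled batch, if any.
    if (!batch.empty())
    {
        sink(batch);
        ++batches;
    }

    return batches;
}
```

In the test program from this thread, the sink would presumably be a callable that does `cache.PutAll(batch)` between `TxStart` and `Commit`. With 512-byte values, a batch of 100000 entries keeps each payload at around 51MB, far below the 1GB limit.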
In reply to this post by Zhenya Stanilovsky
Great!... I'm really looking forward to it :)

stephendarlington
In reply to this post by jjimeno
I’ve not been following this thread closely, so I apologise if I’ve missed something.

On 11 Feb 2021, at 07:08, jjimeno <[hidden email]> wrote:
> - Should I get a better performance in a multi-node cluster? Read/Write/Both?

As per the documentation: “Ignite is designed and optimized for distributed computing scenarios. Deploy and benchmark a multi-node cluster rather than a single-node one.” (https://ignite.apache.org/docs/latest/perf-and-troubleshooting/general-perf-tips)

So yes, all else being equal, more nodes will give you better performance. Keeping the volume of data the same, doubling the number of nodes will roughly halve the number of reads/writes going to each node.

But even there, things like the use of transactions and thin clients will limit your throughput to well below what Ignite is capable of “flat out.”

Without analysing your architecture it’s difficult to give specific advice, but the best write performance is achieved with many nodes, fast disks, JVM tuning and thick clients using the data streamer API.

Regards,
Stephen

Hi, thanks for pointing it out.
This confirms our tests: moving from a single-node cluster to a two-node one cut the read timings to less than half!