Affinity Key column to be always part of the Primary Key

classic Classic list List threaded Threaded
13 messages Options
Naveen Naveen
Reply | Threaded
Open this post in threaded view
|

Affinity Key column to be always part of the Primary Key

Hi

I am using Ignite 2.3

Have 2 tables

Table 1: Customer - primary Key is PartyId
Table 2: Account - primary key is AccountID  (also has PartyID as one of the
column)

To keep both customer and account data for a customer  on the same node I
need to  use affinity key for Account table. And affinity key column should
be always part of the primary key.
So Account table will have a composite key consisting of AccountID, PartyID.

My question is, while querying the data from Account table, how does the
query work

1. If I only pass AccountID
2. If I pass both AccountID and PartyID

if I only pass AccountID, does it treat like querying the table based on the
primary key OR I need to create an index on AccountID as well to improve the
performance.

To keep the data unique, I dont need PartyID as part of Primary key on
Account table. Just for the sake of using Affinity functionality, I need to
use PartyID as part of Account table primary key.

Can someone answer this please?

Thanks
Naveen



Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Stanislav Lukyanov Stanislav Lukyanov
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

Generally, you should have an index for each combination of fields you use in
a query. Primary key gives you an implicit index, but you need to create the
rest yourself.

In your case, I'd suggest to have AccountID as a primary key (without
PartyID) and also create a compound index for the pair (AccountID, PartyID)
via CREATE INDEX.

Thanks,
Stan




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Naveen Naveen
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

Hi Stan

If I do not have the affinity key column as part of the primary key, it does
not allow me to create the table itself.

If I want to execute a join query on both the tables, I had to use affinity
key to collocate the data

select p.party_id, a.party_id, first_name, A.SERVICE_ID_LIST  from Customer
P, Account  A where P.PARTY_ID= A.PARTY_ID and A.PARTY_ID IN ('P1', 'P2');

How can we make use of Affinity to collocate CUSTOMER and ACCOUNT data in
this case.

Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Naveen Naveen
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

Do we have any update on this clarification regarding the affinity

Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Mikhail Mikhail
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

In reply to this post by Naveen
Hi Naveen


>If I do not have the affinity key column as part of the primary key, it
does
>not allow me to create the table itself.

Could you please explain how it doesn't allow you create the table? is there
any exceptions/errors messages?

Thanks,
Mike.




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Naveen Naveen
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

Hi Mike

I have created a table called CITY

: jdbc:ignite:thin://127.0.0.1> CREATE TABLE City (  city_id LONG PRIMARY
KEY, name VARCHAR)  WITH "template=replicated";
No rows affected (0.224 seconds)

Creating a table called Person with affinity key as city_id

0: jdbc:ignite:thin://127.0.0.1> CREATE TABLE IF NOT EXISTS Person ( age
int, id int, city_id LONG , name varchar, company varchar,  PRIMARY KEY
(name, id)) WITH "template=partitioned,backups=1,affinitykey=city_id,
key_type=PersonKey, value_type=MyPerson";

This is the exception I get

Error: Affinity key column must be one of key columns: CITY_ID
(state=42000,code=0)
java.sql.SQLException: Affinity key column must be one of key columns:
CITY_ID
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinConnection.sendRequest(JdbcThinConnection.java:671)
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.execute0(JdbcThinStatement.java:130)
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.execute(JdbcThinStatement.java:299)
        at sqlline.Commands.execute(Commands.java:823)
        at sqlline.Commands.sql(Commands.java:733)
        at sqlline.SqlLine.dispatch(SqlLine.java:795)
        at sqlline.SqlLine.begin(SqlLine.java:668)
        at sqlline.SqlLine.start(SqlLine.java:373)
        at sqlline.SqlLine.main(SqlLine.java:265)
0: jdbc:ignite:thin://127.0.0.1>

And, when I change the primary key to include affinity id, below DDL is
working fine.
0: jdbc:ignite:thin://127.0.0.1> CREATE TABLE IF NOT EXISTS Person ( age
int, id int, city_id LONG , name varchar, company varchar,  PRIMARY KEY
(name, id,city_id)) WITH
"template=partitioned,backups=1,affinitykey=city_id, key_type=PersonKey,
value_type=MyPerson";

This is what I was trying to explain, is affinity key to be part of the
primary key ??

If this is the case, whole my data model will change drastically.

Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Dave Harvey Dave Harvey
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

Yes, the affinity key must be part of the primary key.   Welcome to my world....

On Fri, Mar 16, 2018 at 3:23 AM, Naveen <[hidden email]> wrote:
Hi Mike

I have created a table called CITY

: jdbc:ignite:thin://127.0.0.1> CREATE TABLE City (  city_id LONG PRIMARY
KEY, name VARCHAR)  WITH "template=replicated";
No rows affected (0.224 seconds)

Creating a table called Person with affinity key as city_id

0: jdbc:ignite:thin://127.0.0.1> CREATE TABLE IF NOT EXISTS Person ( age
int, id int, city_id LONG , name varchar, company varchar,  PRIMARY KEY
(name, id)) WITH "template=partitioned,backups=1,affinitykey=city_id,
key_type=PersonKey, value_type=MyPerson";

This is the exception I get

Error: Affinity key column must be one of key columns: CITY_ID
(state=42000,code=0)
java.sql.SQLException: Affinity key column must be one of key columns:
CITY_ID
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinConnection.sendRequest(JdbcThinConnection.java:671)
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.execute0(JdbcThinStatement.java:130)
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.execute(JdbcThinStatement.java:299)
        at sqlline.Commands.execute(Commands.java:823)
        at sqlline.Commands.sql(Commands.java:733)
        at sqlline.SqlLine.dispatch(SqlLine.java:795)
        at sqlline.SqlLine.begin(SqlLine.java:668)
        at sqlline.SqlLine.start(SqlLine.java:373)
        at sqlline.SqlLine.main(SqlLine.java:265)
0: jdbc:ignite:thin://127.0.0.1>

And, when I change the primary key to include affinity id, below DDL is
working fine.
0: jdbc:ignite:thin://127.0.0.1> CREATE TABLE IF NOT EXISTS Person ( age
int, id int, city_id LONG , name varchar, company varchar,  PRIMARY KEY
(name, id,city_id)) WITH
"template=partitioned,backups=1,affinitykey=city_id, key_type=PersonKey,
value_type=MyPerson";

This is what I was trying to explain, is affinity key to be part of the
primary key ??

If this is the case, whole my data model will change drastically.

Thanks
Naveen



Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast Ltd, an innovator in Software as a Service (SaaS) for business. Providing a safer and more useful place for your human generated data. Specializing in; Security, archiving and compliance. To find out more Click Here.

Vladimir Ozerov Vladimir Ozerov
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

Internally Ignite is key-value storage. It use key to derive partition it belongs to. By default the whole key is used. Alternatively you can use @AffinityKey annotation in cache API or "affinityKey" option in CREATE TABLE to specify part of the key to be used for affinity calculation. Affinity column cannot belong to value because in this case single key-value pair could migrate between nodes during updates and IgniteCache.get(K) will not be able to locate the key in cluster.

On Fri, Mar 16, 2018 at 4:56 PM, David Harvey <[hidden email]> wrote:
Yes, the affinity key must be part of the primary key.   Welcome to my world....

On Fri, Mar 16, 2018 at 3:23 AM, Naveen <[hidden email]> wrote:
Hi Mike

I have created a table called CITY

: jdbc:ignite:thin://127.0.0.1> CREATE TABLE City (  city_id LONG PRIMARY
KEY, name VARCHAR)  WITH "template=replicated";
No rows affected (0.224 seconds)

Creating a table called Person with affinity key as city_id

0: jdbc:ignite:thin://127.0.0.1> CREATE TABLE IF NOT EXISTS Person ( age
int, id int, city_id LONG , name varchar, company varchar,  PRIMARY KEY
(name, id)) WITH "template=partitioned,backups=1,affinitykey=city_id,
key_type=PersonKey, value_type=MyPerson";

This is the exception I get

Error: Affinity key column must be one of key columns: CITY_ID
(state=42000,code=0)
java.sql.SQLException: Affinity key column must be one of key columns:
CITY_ID
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinConnection.sendRequest(JdbcThinConnection.java:671)
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.execute0(JdbcThinStatement.java:130)
        at
org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.execute(JdbcThinStatement.java:299)
        at sqlline.Commands.execute(Commands.java:823)
        at sqlline.Commands.sql(Commands.java:733)
        at sqlline.SqlLine.dispatch(SqlLine.java:795)
        at sqlline.SqlLine.begin(SqlLine.java:668)
        at sqlline.SqlLine.start(SqlLine.java:373)
        at sqlline.SqlLine.main(SqlLine.java:265)
0: jdbc:ignite:thin://127.0.0.1>

And, when I change the primary key to include affinity id, below DDL is
working fine.
0: jdbc:ignite:thin://127.0.0.1> CREATE TABLE IF NOT EXISTS Person ( age
int, id int, city_id LONG , name varchar, company varchar,  PRIMARY KEY
(name, id,city_id)) WITH
"template=partitioned,backups=1,affinitykey=city_id, key_type=PersonKey,
value_type=MyPerson";

This is what I was trying to explain, is affinity key to be part of the
primary key ??

If this is the case, whole my data model will change drastically.

Thanks
Naveen



Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast Ltd, an innovator in Software as a Service (SaaS) for business. Providing a safer and more useful place for your human generated data. Specializing in; Security, archiving and compliance. To find out more Click Here.


dsetrakyan dsetrakyan
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key



On Tue, Mar 20, 2018 at 2:09 PM, Vladimir Ozerov <[hidden email]> wrote:
Internally Ignite is key-value storage. It use key to derive partition it belongs to. By default the whole key is used. Alternatively you can use @AffinityKey annotation in cache API or "affinityKey" option in CREATE TABLE to specify part of the key to be used for affinity calculation. Affinity column cannot belong to value because in this case single key-value pair could migrate between nodes during updates and IgniteCache.get(K) will not be able to locate the key in cluster.

Vladimir, while it makes sense that the key must be composed of the ID and Affinity Key, I still do not understand why we require that user declares them both as PRIMARY KEY. Why do you need to enforce that explicitly? In my view you can do it automatically, if you see that the table has both, PRIMARY KEY and AFFINITY KEY declared.

Vladimir Ozerov Vladimir Ozerov
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

Because without AFFINITY KEY option we do not know order of fields within composite PK which is very important for index creation.

вт, 20 марта 2018 г. в 19:58, Dmitriy Setrakyan <[hidden email]>:
On Tue, Mar 20, 2018 at 2:09 PM, Vladimir Ozerov <[hidden email]> wrote:
Internally Ignite is key-value storage. It use key to derive partition it belongs to. By default the whole key is used. Alternatively you can use @AffinityKey annotation in cache API or "affinityKey" option in CREATE TABLE to specify part of the key to be used for affinity calculation. Affinity column cannot belong to value because in this case single key-value pair could migrate between nodes during updates and IgniteCache.get(K) will not be able to locate the key in cluster.

Vladimir, while it makes sense that the key must be composed of the ID and Affinity Key, I still do not understand why we require that user declares them both as PRIMARY KEY. Why do you need to enforce that explicitly? In my view you can do it automatically, if you see that the table has both, PRIMARY KEY and AFFINITY KEY declared.

Dave Harvey Dave Harvey
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

In reply to this post by dsetrakyan
What is your goal?   If you have a unique key that does not contain the affinity key, and the primary key contains both fields, you can create an index on that unique key, so that you could have fast node local lookups using SqlQuery().    If you want to find an object only by that original unique key from a random node, then you must either search all nodes, or not append an affinity key.    So if Ignite had done what you suggested, the cost of a single lookup by primary key would increase with node count, which would be at odds with it being declared a primary key.


On Tue, Mar 20, 2018 at 12:57 PM, Dmitriy Setrakyan <[hidden email]> wrote:


On Tue, Mar 20, 2018 at 2:09 PM, Vladimir Ozerov <[hidden email]> wrote:
Internally Ignite is key-value storage. It use key to derive partition it belongs to. By default the whole key is used. Alternatively you can use @AffinityKey annotation in cache API or "affinityKey" option in CREATE TABLE to specify part of the key to be used for affinity calculation. Affinity column cannot belong to value because in this case single key-value pair could migrate between nodes during updates and IgniteCache.get(K) will not be able to locate the key in cluster.

Vladimir, while it makes sense that the key must be composed of the ID and Affinity Key, I still do not understand why we require that user declares them both as PRIMARY KEY. Why do you need to enforce that explicitly? In my view you can do it automatically, if you see that the table has both, PRIMARY KEY and AFFINITY KEY declared.




Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast Ltd, an innovator in Software as a Service (SaaS) for business. Providing a safer and more useful place for your human generated data. Specializing in; Security, archiving and compliance. To find out more Click Here.

Naveen Naveen
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

My whole and sole requirement is to make lookup work fast, this can only be
achieved with lookup based on primary key, cluster knows which node is
holding this record based on the key, no other look-up can give best results
close to primary key lookup.

We have done some tests on lookup based on the indexed column, which is
giving (1/5)th of the TPS of lookup based on the primary key

Lookup with primary key was around 30K TPS
Lookup on indexed column was around 6 K TPS

Though its indexed, request still goes to all the nodes, instead of querying
the source table it will query index table which is small in size and
sorted, so retrieval process is faster, is this understanding correct ??

Results are justifiable, considering the architecture, where indexes are
distributed across the nodes.

In my view, affinity should not influence the data model, at the most it can
add  to the memory footprint by adding extra column to the table just for
the same of using affinity (collocating the data).

If the whole idea of having affinity key as part of the primary key is to
just identify the order of fields, then we should look for other ways not by
enforcing the affinity key as part of composite primary key.

Regards
Naveen





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Dave Harvey Dave Harvey
Reply | Threaded
Open this post in threaded view
|

Re: Affinity Key column to be always part of the Primary Key

Based this  latest description, simply not specifying an affinity key here would be sufficient.   But presumably you were specifying the affinity key to cause co-location.   The reason a lookup on the K of the KV pair is fast is because it can hash to a node.   If  the affinity key was not included in the K then you would not get colocation.  To do a lookup on only part of K means you cannot hash to node.    Assuming that you can't know the affinity key a priori, i.e., you know the person and not the company, there are other solutions.   Depending on the data, you may be able to use a replicated cache for other (company) data rather than trying to co-locate this record with that other data.  That increases the storage and insert/update cost for company records.     
There are also a number of solutions using Java to send closures to the server nodes.

On Wed, Mar 21, 2018 at 1:55 AM, Naveen <[hidden email]> wrote:
My whole and sole requirement is to make lookup work fast, this can only be
achieved with lookup based on primary key, cluster knows which node is
holding this record based on the key, no other look-up can give best results
close to primary key lookup.

We have done some tests on lookup based on the indexed column, which is
giving (1/5)th of the TPS of lookup based on the primary key

Lookup with primary key was around 30K TPS
Lookup on indexed column was around 6 K TPS

Though its indexed, request still goes to all the nodes, instead of querying
the source table it will query index table which is small in size and
sorted, so retrieval process is faster, is this understanding correct ??

Results are justifiable, considering the architecture, where indexes are
distributed across the nodes.

In my view, affinity should not influence the data model, at the most it can
add  to the memory footprint by adding extra column to the table just for
the same of using affinity (collocating the data).

If the whole idea of having affinity key as part of the primary key is to
just identify the order of fields, then we should look for other ways not by
enforcing the affinity key as part of composite primary key.

Regards
Naveen





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/



Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast Ltd, an innovator in Software as a Service (SaaS) for business. Providing a safer and more useful place for your human generated data. Specializing in; Security, archiving and compliance. To find out more Click Here.