Issues with Clients joining cluster on Kubernetes (EKS)

Aaron Stockton

I am currently running a 10-node Ignite cluster on EKS. I tried running it as a Kubernetes Deployment, but the thrash in getting all the nodes to join and agree on a topology was pretty painful, so it's currently running as a StatefulSet.

The issue I am seeing now is that when I deploy a client node via Kubernetes, the client node joins the cluster immediately, then fails to gossip, which causes it to be dropped from the cluster. After some timeout the node rejoins the cluster and is more or less stable from that point forward. I'm looking for advice on configuring the client so that it can join reliably.

I'll attach logs from the client, server, and the relevant client config.

Thanks in advance!
-aes

Attachments:
JobConfiguration.java (1K)
igniteClientLogs (68K)
igniteServerLogs (8K)
ezhuravlev

Re: Issues with Clients joining cluster on Kubernetes (EKS)

Hi,

I see this warning in the logs:

Local node's value of 'java.net.preferIPv4Stack' system property differs from remote node's (all nodes in topology should have identical value) [locPreferIpV4=true, rmtPreferIpV4=null, locId8=20c2dc57, rmtId8=699d2d8c, rmtAddrs=[ignite-jwp-0.ignite.ignite-jwp.svc.cluster.local/10.100.78.98, /127.0.0.1], rmtNode=ClusterNode [id=699d2d8c-a54b-4104-9f97-62d550352315, order=1, addr=[10.100.78.98, 127.0.0.1], daemon=false]]

So it probably makes sense to set the java.net.preferIPv4Stack property to true on all the nodes in the cluster.

Also, it would be great to check the INFO logs from the server nodes. Please add the -DIGNITE_QUIET=false property.
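
For reference, here is a minimal sketch of a startup check for these two properties. The class name JvmFlagCheck and the helper mismatches are hypothetical, and it assumes both values are passed as JVM system properties (for example -Djava.net.preferIPv4Stack=true -DIGNITE_QUIET=false). It uses only the JDK, not any Ignite API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: not part of Ignite. It only reports which of the
// expected system properties are missing or set to a different value.
public class JvmFlagCheck {

    /** Returns property name -> actual value ("null" if unset) for every mismatch. */
    public static Map<String, String> mismatches(Map<String, String> expected, Properties actual) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : expected.entrySet()) {
            String value = actual.getProperty(e.getKey());
            if (!e.getValue().equals(value)) {
                result.put(e.getKey(), String.valueOf(value));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("java.net.preferIPv4Stack", "true"); // should match on every node
        expected.put("IGNITE_QUIET", "false");            // enables INFO-level output

        Map<String, String> bad = mismatches(expected, System.getProperties());
        if (bad.isEmpty()) {
            System.out.println("All expected JVM properties are set.");
        } else {
            bad.forEach((k, v) -> System.out.println("Mismatch: " + k + " is " + v));
        }
    }
}
```

Running this in both the server and client containers with the same JVM options should show whether one side is missing a flag, which is what rmtPreferIpV4=null in the warning suggests for the server.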

Evgenii

In reply to Aaron Stockton <[hidden email]>, Wed, 1 May 2019, 00:42.
aes

Re: Issues with Clients joining cluster on Kubernetes (EKS)

Sure, I set preferIPv4Stack on the client and set IGNITE_QUIET to false. Logs
are attached. Thank you!

serverLogs.serverLogs
<http://apache-ignite-users.70518.x6.nabble.com/file/t2409/serverLogs.serverLogs>  
clientLogs.clientLogs
<http://apache-ignite-users.70518.x6.nabble.com/file/t2409/clientLogs.clientLogs>  



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ezhuravlev

Re: Issues with Clients joining cluster on Kubernetes (EKS)

Why are there no lines in the server log between these two entries?

[17:10:48,827][INFO][sys-#149][GridCachePartitionExchangeManager] Sending Full Message for AffinityTopologyVersion [topVer=10, minorTopVer=1] performed in 1 ms.
[19:39:58,910][INFO][disco-event-worker-#62][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=825f1597-6bd9-4317-9344-3ee9b29be501, addrs=[10.100.111.229, 127.0.0.1], sockAddrs=[/10.100.111.229:0, /127.0.0.1:0], discPort=0, order=11, intOrder=11, lastExchangeTime=1556825943775, loc=false, ver=2.7.0#20181130-sha1:256ae401, isClient=true]

I see that the client first tried to connect at 19:38, but there are no messages in the server log for that period. Any idea what could have happened there?
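
A gap like this can be spotted mechanically. The following is a rough sketch only: the class name LogGapFinder and the five-minute threshold are assumptions, and it relies on each log line starting with a [HH:mm:ss,SSS] timestamp as in the two entries above. Because those timestamps carry no date, a gap that crosses midnight would not be detected.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans log lines for silent periods longer than a threshold.
public class LogGapFinder {
    // Matches a leading timestamp such as [17:10:48,827]
    private static final Pattern TS = Pattern.compile("^\\[(\\d{2}:\\d{2}:\\d{2},\\d{3})\\]");
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    /** Returns one "previous -> current" entry per gap longer than the threshold. */
    public static List<String> findGaps(List<String> lines, Duration threshold) {
        List<String> gaps = new ArrayList<>();
        LocalTime prev = null;
        for (String line : lines) {
            Matcher m = TS.matcher(line);
            if (!m.find()) {
                continue; // continuation lines and stack traces carry no timestamp
            }
            LocalTime cur = LocalTime.parse(m.group(1), FMT);
            if (prev != null && Duration.between(prev, cur).compareTo(threshold) > 0) {
                gaps.add(prev + " -> " + cur);
            }
            prev = cur;
        }
        return gaps;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a server log file
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        findGaps(lines, Duration.ofMinutes(5)).forEach(g -> System.out.println("Gap: " + g));
    }
}
```

Run against the server log, this should report the interval between 17:10:48 and 19:39:58, which makes it easier to line up with the client's connection attempts.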
Evgenii

In reply to aes <[hidden email]>, Thu, 2 May 2019, 23:54.
aes

Re: Issues with Clients joining cluster on Kubernetes (EKS)

I didn't deploy the client right away, I got distracted :) In your experience,
is Ignite on Kubernetes considered stable? If so, I feel like there are some
settings I must be missing. Between something blocking the discovery thread
and the client timing out on every first attempt, I'm questioning its
viability on k8s. Thank you for your help.

-aes



ezhuravlev

Re: Issues with Clients joining cluster on Kubernetes (EKS)

Yes, I know of users who run deployments in Kubernetes without any issues. Did you follow these instructions: https://apacheignite.readme.io/docs/kubernetes-deployment ? Please make sure that the ports for Discovery and Communication are open in this environment.
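
One quick way to check this from inside the client pod is a plain TCP connect. This is a sketch under assumptions: the class name PortProbe is hypothetical, and 47500 and 47100 are Ignite's default discovery and communication ports, which may differ if your configuration overrides them. It uses only the JDK.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP connection can be opened at all.
// A successful connect does not prove that Ignite itself is healthy.
public class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or host could not be resolved
        }
    }

    public static void main(String[] args) {
        String host = args[0];        // e.g. the DNS name of a server pod
        int[] ports = {47500, 47100}; // assumed defaults: discovery, communication
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable=" + isReachable(host, port, 2000));
        }
    }
}
```

Running it from the client pod against each server pod's address, and from a server pod back to the client, helps rule out a one-directional network policy.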

Evgenii

In reply to aes <[hidden email]>, Fri, 3 May 2019, 20:18.
aes

Re: Issues with Clients joining cluster on Kubernetes (EKS)

Yes, I followed this guide to get started, and I have egress/ingress rules
that allow intra-cluster communication on every port, which I have verified.
If the discovery and communication ports weren't open, I feel like the client
would have a hard time connecting after the timeout elapsed, wouldn't it?



aes

Re: Issues with Clients joining cluster on Kubernetes (EKS)

Interestingly, when I downgraded from 2.7 to 2.6, the connection from the
client became much more stable. Is this a known regression in 2.7? Is it
something that will be addressed in a 2.7.X or 2.8 release?



Roman Guseinov

Re: Issues with Clients joining cluster on Kubernetes (EKS)

Hi aes,

I'm not aware of any Kubernetes-related issues that appeared in 2.7. As
Evgenii mentioned, there seem to be some issues on the server node: there
are no messages in the server logs for about two hours. Most likely this is
related to the environment (not Ignite).

Do you observe any connectivity issues between the server nodes? Is it
possible that you are running the client node outside the k8s environment?
If so, I would recommend using thin clients instead. Ignite client nodes are
part of the cluster and should be deployed in the same network and k8s
namespace.

Roman


