Request among nodes taking minimum of idleConnectionTimeout configuration in tcpcommunicationSpi

classic Classic list List threaded Threaded
4 messages Options
trans trans
Reply | Threaded
Open this post in threaded view
|

Request among nodes taking minimum of idleConnectionTimeout configuration in tcpcommunicationSpi

This post was updated on .
Hi Guys,

We have 4 clients(p1,p2,p3,p4) and 1 server running in Prague, then one
other node(i1) running in India.

But when we communicate from node in India to node in Prague then we see
that intermittently the request reaches to Prague after delay of around
30sec.
Node i1 logs has information that request was once tried and then next one
was tried after exact 30 sec.
idleConnectionTimeout is configured to 30s, so we think it may be related to
that.

Here is the configuration:

<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="      
http://www.springframework.org/schema/beans       
http://www.springframework.org/schema/beans/spring-beans.xsd">
  <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
    <property name="joinTimeout" value="15000" />
    <property name="socketTimeout" value="14000" />
    <property name="ClientReconnectDisabled" value="true" />
   
    <property name="localPort" value="48500" />
   
    <property name="localPortRange" value="20" />
    <property name="ipFinder">
      <bean
class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
        <property name="addresses">
          <list>
            <value>myPragueMachine4660:48500..48520</value>
       
          </list>
        </property>
      </bean>
    </property>
  </bean>
 
  <bean id="communicationSpiBean"
class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
   
   
    <property name="localPort" value="48100" />
    <property name="slowClientQueueLimit" value="100000" />
    <property name="idleConnectionTimeout" value="30000" />
    <property name="usePairedConnections" value="true" />
  </bean>
</beans>

Then after adding India machine entry in discovery ip, things started
working fine but we started seeing nearly twice of nodes in of topology, as
2 server and 8+ clients.

 <property name="ipFinder">
      <bean
class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
        <property name="addresses">
          <list>
            <value>myPragueMachine:48500..48520</value>
            <value>myIndiaMachine:48500..48520</value>
 
          </list>
        </property>
      </bean>
    </property>

Please refer to attached logs.

Update:Please also refer to below two message in same thread.

Can someone please clarify
1. why our request was failing intermittently and then after exactly 30 sec
of time it reaches to Prague machine again. It looks like first requests'
TCP communication was idle and timedout then another request went. If yes,
then how can we be sure about the reason.
2. Why the topology has multiple nodes once we add the India Machine info in
TCPDiscoverySPI.
3. Should we add IPs of machine in both Prague and India in TCPDiscoverySPI.

Thanks.logs.txt



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
trans trans
Reply | Threaded
Open this post in threaded view
|

Re: Request among nodes taking minimum of idleConnectionTimeout configuration in tcpcommunicationSpi



After enabling the debug level logging we notice below

Machine in india(10.101.137.123) running on port 56037, but why the tcp
client is being tried to create with port 48104. Please note that 48104 is
the port of the other node in prague.
Could you please point out the things done wrong.

-----------------------------------------------
15:57:29:339723|0274-00313:FFTracer_IgniteWrapper: {DEBUG} Action {>>
sendOrdered "loginrequest" to 73ecc8a5-9db9-4824-9044-8d0748a7bd65
[LoginRequest{token=''} company&user=business/test_web2,
TrackingId=NG_OMS_ADAPTER_TEST_APP04_755_33360, corrId=513]  tmo(PT0S)}
Thread {AuthenticationThread}
15:57:29:341183|0563-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Creating NIO client to node: TcpDiscoveryNode
[id=73ecc8a5-9db9-4824-9044-8d0748a7bd65,
consistentId=73ecc8a5-9db9-4824-9044-8d0748a7bd65, addrs=ArrayList
[0:0:0:0:0:0:0:1, 10.220.2.75, 127.0.0.1], sockAddrs=HashSet
[/0:0:0:0:0:0:0:1:0, /127.0.0.1:0, ff-cfgmanagers/10.101.137.123:0,
/10.220.2.75:0], discPort=0, order=2, intOrder=2,
lastExchangeTime=1594025456918, loc=false, ver=2.8.0#20200226-sha1:341b01df,
isClient=true]} Thread {AuthenticationThread}
15:57:29:343699|0348-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Addresses resolved from attributes
[rmtNode=73ecc8a5-9db9-4824-9044-8d0748a7bd65,
addrs=[ff-cfgmanagers/10.101.137.123:48104, /10.220.2.75:48104,
/127.0.0.1:48104, /0:0:0:0:0:0:0:1:48104], isRmtAddrsExist=true]} Thread
{AuthenticationThread}
15:57:29:814233|0416-00964:FFTracer_Ignite
[org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor]:
{DEBUG} Action {Timeout has occurred [obj=CancelableTask
[id=47eb8332371-e91dfaa7-692d-4855-95b0-7c4203e40a53, endTime=1594031249902,
period=2000, cancel=false,
task=org.apache.ignite.internal.processors.query.h2.ConnectionManager$$Lambda$136/423733787@7268b912],
process=true]} Thread {grid-timeout-worker-#442%WebCluster%}
15:57:29:844691|0240-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Client creation failed [addr=ff-cfgmanagers/10.101.137.123:48104,
err=java.net.SocketTimeoutException]} Thread {AuthenticationThread}
-----------------------------------------------------------

After 30 attempts, each every second, we see below:

------------------------------------------------------------
15:57:59:315536|0278-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Client creation failed [addr=ff-cfgmanagers/10.101.137.123:48104,
err=java.net.ConnectException: Connection refused: no further information]}
Thread {AuthenticationThread}
15:57:59:365793|0611-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Handshake timed out (will retry with increased timeout)
[connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=30000, startNanos=4058375492943000, currTimeout=600000],
addr=ff-cfgmanagers/10.101.137.123:48104, err=class
org.apache.ignite.spi.IgniteSpiOperationTimeoutException: Operation timed
out [timeoutStrategy= ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=30000, startNanos=4058375492943000, currTimeout=600000]]]}
Thread {AuthenticationThread}
15:57:59:366279|0647-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {WARN} Action
{Handshake timed out (will stop attempts to perform the handshake)
[node=73ecc8a5-9db9-4824-9044-8d0748a7bd65,
connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=30000, startNanos=4058375492943000, currTimeout=600000],
err=Operation timed out [timeoutStrategy= ExponentialBackoffTimeoutStrategy
[maxTimeout=600000, totalTimeout=30000, startNanos=4058375492943000,
currTimeout=600000]], addr=ff-cfgmanagers/10.101.137.123:48104,
failureDetectionTimeoutEnabled=true, timeout=0]} Thread
{AuthenticationThread}

----------------------------------------------------------
Then correct request:
----------------------------------------------------------
15:57:59:692154|0199-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Received remote node ID: 73ecc8a5-9db9-4824-9044-8d0748a7bd65} Thread
{AuthenticationThread}
15:57:59:692789|0339-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Writing handshake message [rmtNode=73ecc8a5-9db9-4824-9044-8d0748a7bd65,
msg=HandshakeMessage2 [connIdx=0, super=HandshakeMessage
[nodeId=8f16a66c-1597-4c29-9580-3a2203a62156, rcvCnt=0, connectCnt=2]]]}
Thread {AuthenticationThread}
15:57:59:693432|0206-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Waiting for handshake [rmtNode=73ecc8a5-9db9-4824-9044-8d0748a7bd65]}
Thread {AuthenticationThread}
15:57:59:854455|0223-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Received handshake message [rmtNode=73ecc8a5-9db9-4824-9044-8d0748a7bd65,
rcvCnt=352]} Thread {AuthenticationThread}
15:57:59:854712|0194-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {DEBUG} Action
{Handle acknowledgment [acked=352, rcvCnt=352, msgReqs=0]} Thread
{AuthenticationThread}
15:57:59:856221|0265-00966:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {INFO} Action
{Established outgoing communication connection
[locAddr=/10.101.137.123:56037, rmtAddr=/10.220.2.75:48104]} Thread
{grid-nio-worker-tcp-comm-1-#444%WebCluster%}
15:57:59:857060|2075-00313:FFTracer_Ignite [class
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {INFO} Action
{TCP client created [client=GridTcpNioCommunicationClient
[ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker
[super=AbstractNioClientWorker [idx=1, bytesRcvd=510, bytesSent=86617,
bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker
[name=grid-nio-worker-tcp-comm-1, igniteInstanceName=WebCluster,
finished=false, heartbeatTs=1594031279954, hashCode=660763771,
interrupted=false, runner=grid-nio-worker-tcp-comm-1-#444%WebCluster%]]],
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
inRecovery=null, outRecovery=GridNioRecoveryDescriptor [acked=352,
resendCnt=0, rcvCnt=0, sentCnt=352, reserved=true, lastAck=0,
nodeLeft=false, node=TcpDiscoveryNode
[id=73ecc8a5-9db9-4824-9044-8d0748a7bd65,
consistentId=73ecc8a5-9db9-4824-9044-8d0748a7bd65, addrs=ArrayList
[0:0:0:0:0:0:0:1, 10.220.2.75, 127.0.0.1], sockAddrs=HashSet
[/0:0:0:0:0:0:0:1:0, /127.0.0.1:0, ff-cfgmanagers/10.101.137.123:0,
/10.220.2.75:0], discPort=0, order=2, intOrder=2,
lastExchangeTime=1594025456918, loc=false, ver=2.8.0#20200226-sha1:341b01df,
isClient=true], connected=false, connectCnt=3, queueLimit=4096,
reserveCnt=93, pairedConnections=true], closeSocket=true,
outboundMessagesQueueSizeMetric=org.apache.ignite.internal.processors.metric.impl.LongAdderMetric@69a257d1,
super=GridNioSessionImpl [locAddr=/10.101.137.123:56037,
rmtAddr=/10.220.2.75:48104, createTime=1594031279954, closeTime=0,
bytesSent=0, bytesRcvd=0, bytesSent0=0, bytesRcvd0=0,
sndSchedTime=1594031279954, lastSndTime=1594031279954,
lastRcvTime=1594031279954, readsPaused=false,
filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@4e793be8,
directMode=true], GridConnectionBytesVerifyFilter], accepted=false,
markedForClose=false]], super=GridAbstractCommunicationClient
[lastUsed=1594031279954, closed=false, connIdx=0]], duration=30516ms]}
Thread {AuthenticationThread}





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
trans trans
Reply | Threaded
Open this post in threaded view
|

Re: Request among nodes taking minimum of idleConnectionTimeout configuration in tcpcommunicationSpi

In reply to this post by trans
Hi, can someone please suggest on above.
Why client is trying to use different port other than its for creating TCP
client and failing 30 times

Thanks.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
ilya.kasnacheev ilya.kasnacheev
Reply | Threaded
Open this post in threaded view
|

Re: Request among nodes taking minimum of idleConnectionTimeout configuration in tcpcommunicationSpi

Hello!

Maybe you have some firewall issue here, such as only one way connections are possible but not the other way around.

We do not recommend geo-distributed clustering with Apache Ignite.

Regards,
--
Ilya Kasnacheev


вт, 7 июл. 2020 г. в 06:59, trans <[hidden email]>:
Hi, can someone please suggest on above.
Why client is trying to use different port other than its for creating TCP
client and failing 30 times

Thanks.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/