Client connects to server after too long time interval (1 minute)

classic Classic list List threaded Threaded
6 messages Options
Dmitry Lazurkin Dmitry Lazurkin
Reply | Threaded
Open this post in threaded view
|

Client connects to server after too long time interval (1 minute)

Hello.

I need some advice. I have 1 client and 1 server. Ignite client
connection to server take ~1 minute. And I don't understand why.

In server log I see next entry:

2018-11-09 16:34:16,471 [tcp-disco-client-message-worker-#16] DEBUG
o.a.i.s.d.tcp.TcpDiscoverySpi - Redirecting message to client
[sock=Socket[addr=/10.37.92.222,port=48843,localport=47500],
locNodeId=c86fe162-0a55-4aaa-a2cd-f2ea1c644b5f,
rmtNodeId=f4856174-de1d-4eae-ad85-80681f783955,
msg=TcpDiscoveryNodeAddFinishedMessage
[nodeId=f4856174-de1d-4eae-ad85-80681f783955,
super=TcpDiscoveryAbstractMessage
[sndNodeId=c86fe162-0a55-4aaa-a2cd-f2ea1c644b5f,
id=a79abb8f661-c86fe162-0a55-4aaa-a2cd-f2ea1c644b5f,
verifierNodeId=c86fe162-0a55-4aaa-a2cd-f2ea1c644b5f, topVer=7,
pendingIdx=0, failedNodes=null, isClient=false]]]

Client log:

2018-11-09 16:35:06.585 DEBUG 10 --- [-sock-reader-#3]
o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Message has been received:
TcpDiscoveryNodeAddFinishedMessage
[nodeId=f4856174-de1d-4eae-ad85-80681f783955,
super=TcpDiscoveryAbstractMessage
[sndNodeId=c86fe162-0a55-4aaa-a2cd-f2ea1c644b5f,
id=a79abb8f661-c86fe162-0a55-4aaa-a2cd-f2ea1c644b5f,
verifierNodeId=c86fe162-0a55-4aaa-a2cd-f2ea1c644b5f, topVer=7,
pendingIdx=0, failedNodes=null, isClient=false]]

Difference is 50 seconds. Wireshark log entries on client and server
with TcpDiscoveryNodeAddFinishedMessage in packet bytes:

On server (10.48.14.1):
1648    16:34:16.473071    10.48.14.1    10.37.92.222    35780    TCP   
[TCP Window Full] 47500 → 48843 [ACK] Seq=180330 Ack=30711 Win=99584
Len=35712 TSval=3529221982 TSecr=2842678176
On client (10.37.92.222):
131    16:34:16.473103    10.48.14.1    10.37.92.222    35780    TCP   
[TCP Window Full] 47500 → 48843 [ACK] Seq=180330 Ack=30711 Win=99584
Len=35712 TSval=3529221982 TSecr=2842678176

I cann't understand that delay.

Thank you.

PS. This is kubernetes.


aealexsandrov aealexsandrov
Reply | Threaded
Open this post in threaded view
|

Re: Client connects to server after too long time interval (1 minute)

Hi,

Could you please attach the XML configurations of your client and server
nodes and logs?

BR,
Andrei





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Dmitry Lazurkin Dmitry Lazurkin
Reply | Threaded
Open this post in threaded view
|

Re: Client connects to server after too long time interval (1 minute)

Hi, Andrei. Thank you for reply.

I have found that problem is in unmarshalling:
2018-11-12 13:18:24.375  INFO 10 --- [-sock-reader-#3]
o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : 1
2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
node: 28ebe679-e815-4c16-bd5c-ee320caa3019
2018-11-12 13:18:24.375 ERROR 10 --- [-sock-reader-#3]
o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 1
2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
node: 12f39501-0fba-4c1c-9c37-dc35f33fcee9
2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
node: 40b74eed-a636-488f-b1a8-6d58f4bc9137
2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
node: f4c370e7-cf4a-4209-84fe-42fba4a30eef
2018-11-12 13:18:24.380 ERROR 10 --- [-sock-reader-#3]
o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 2
2018-11-12 13:19:44.965 ERROR 10 --- [-sock-reader-#3]
o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 3:
TcpDiscoveryNodeAddedMessage [node=TcpDiscoveryNode
[id=8807c1ce-1e4a-496b-9aea-66cee8d95434, addrs=[10.37.92.255],
sockAddrs=[client-98d86c46f-j76s5/10.37.92.255:0], discPort=0, order=0,
intOrder=12, lastExchangeTime=1542028704438, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=true],
dataPacket=org.apache.ignite.spi.discovery.tcp.internal.DiscoveryDataPacket@3b238c6e,
discardMsgId=null, discardCustomMsgId=null, top=[TcpDiscoveryNode
[id=28ebe679-e815-4c16-bd5c-ee320caa3019, addrs=[10.48.14.1],
sockAddrs=[ignite-1/10.48.14.1:47500], discPort=47500, order=1,
intOrder=1, lastExchangeTime=1542028704438, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], TcpDiscoveryNode
[id=f4c370e7-cf4a-4209-84fe-42fba4a30eef, addrs=[10.37.92.208],
sockAddrs=[/10.37.92.208:0], discPort=0, order=2, intOrder=2,
lastExchangeTime=1542028704448, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=true], TcpDiscoveryNode
[id=12f39501-0fba-4c1c-9c37-dc35f33fcee9, addrs=[10.37.92.205],
sockAddrs=[/10.37.92.205:0], discPort=0, order=5, intOrder=5,
lastExchangeTime=1542028714511, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=true], TcpDiscoveryNode
[id=40b74eed-a636-488f-b1a8-6d58f4bc9137, addrs=[10.37.92.222],
sockAddrs=[/10.37.92.222:0], discPort=0, order=15, intOrder=11,
lastExchangeTime=1542028724583, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=true]], clientTop=null,
gridStartTime=1542023768674, super=TcpDiscoveryAbstractMessage
[sndNodeId=null, id=0c5f7c70761-28ebe679-e815-4c16-bd5c-ee320caa3019,
verifierNodeId=28ebe679-e815-4c16-bd5c-ee320caa3019, topVer=0,
pendingIdx=0, failedNodes=null, isClient=true]]

Unmarshalling TcpDiscoveryNodeAddedMessage need 1 minute...(:

That's org.apache.ignite.marshaller.jdk.JdkMarshaller with logs:
    /** {@inheritDoc} */
    @SuppressWarnings({"unchecked"})
    @Override protected <T> T unmarshal0(InputStream in, @Nullable
ClassLoader clsLdr) throws IgniteCheckedException {
        assert in != null;

        if (clsLdr == null)
            clsLdr = getClass().getClassLoader();

        logger.error("Unmarshall 1");

        ObjectInputStream objIn = null;

        try {
            objIn = new JdkMarshallerObjectInputStream(new
JdkMarshallerInputStreamWrapper(in), clsLdr, clsFilter);

            logger.error("Unmarshall 2");

            T t = (T)objIn.readObject();

            logger.error("Unmarshall 3: " + t);

            return t;
        }
...


On 11/12/18 10:24, aealexsandrov wrote:

> Hi,
>
> Could you please attach the XML configurations of your client and server
> nodes and logs?
>
> BR,
> Andrei
>
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/



Dmitry Lazurkin Dmitry Lazurkin
Reply | Threaded
Open this post in threaded view
|

Re: Client connects to server after too long time interval (1 minute)

After reconnect unmarshalling of TcpDiscoveryNodeAddedMessage takes 20
seconds:
2018-11-12 14:10:36.105 ERROR 10 --- [-sock-reader-#3]
o.a.ignite.marshaller.jdk.JdkMarshaller : Unmarshall 1
2018-11-12 14:10:36.107 ERROR 10 --- [-sock-reader-#3]
o.a.ignite.marshaller.jdk.JdkMarshaller : Unmarshall 2
2018-11-12 14:10:56.262 ERROR 10 --- [-sock-reader-#3]
o.a.ignite.marshaller.jdk.JdkMarshaller : Unmarshall 3:
TcpDiscoveryNodeAddedMessage [node=TcpDiscoveryNode [id=06960cfd-17

Does class loader load many classes on initial connect?

On 11/12/18 16:28, Dmitry Lazurkin wrote:

> Hi, Andrei. Thank you for reply.
>
> I have found that problem is in unmarshalling:
> 2018-11-12 13:18:24.375  INFO 10 --- [-sock-reader-#3]
> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : 1
> 2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
> node: 28ebe679-e815-4c16-bd5c-ee320caa3019
> 2018-11-12 13:18:24.375 ERROR 10 --- [-sock-reader-#3]
> o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 1
> 2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
> node: 12f39501-0fba-4c1c-9c37-dc35f33fcee9
> 2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
> node: 40b74eed-a636-488f-b1a8-6d58f4bc9137
> 2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
> node: f4c370e7-cf4a-4209-84fe-42fba4a30eef
> 2018-11-12 13:18:24.380 ERROR 10 --- [-sock-reader-#3]
> o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 2
> 2018-11-12 13:19:44.965 ERROR 10 --- [-sock-reader-#3]
> o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 3:
> TcpDiscoveryNodeAddedMessage [node=TcpDiscoveryNode
> [id=8807c1ce-1e4a-496b-9aea-66cee8d95434, addrs=[10.37.92.255],
> sockAddrs=[client-98d86c46f-j76s5/10.37.92.255:0], discPort=0, order=0,
> intOrder=12, lastExchangeTime=1542028704438, loc=false,
> ver=2.6.0#20180710-sha1:669feacc, isClient=true],
> dataPacket=org.apache.ignite.spi.discovery.tcp.internal.DiscoveryDataPacket@3b238c6e,
> discardMsgId=null, discardCustomMsgId=null, top=[TcpDiscoveryNode
> [id=28ebe679-e815-4c16-bd5c-ee320caa3019, addrs=[10.48.14.1],
> sockAddrs=[ignite-1/10.48.14.1:47500], discPort=47500, order=1,
> intOrder=1, lastExchangeTime=1542028704438, loc=false,
> ver=2.6.0#20180710-sha1:669feacc, isClient=false], TcpDiscoveryNode
> [id=f4c370e7-cf4a-4209-84fe-42fba4a30eef, addrs=[10.37.92.208],
> sockAddrs=[/10.37.92.208:0], discPort=0, order=2, intOrder=2,
> lastExchangeTime=1542028704448, loc=false,
> ver=2.6.0#20180710-sha1:669feacc, isClient=true], TcpDiscoveryNode
> [id=12f39501-0fba-4c1c-9c37-dc35f33fcee9, addrs=[10.37.92.205],
> sockAddrs=[/10.37.92.205:0], discPort=0, order=5, intOrder=5,
> lastExchangeTime=1542028714511, loc=false,
> ver=2.6.0#20180710-sha1:669feacc, isClient=true], TcpDiscoveryNode
> [id=40b74eed-a636-488f-b1a8-6d58f4bc9137, addrs=[10.37.92.222],
> sockAddrs=[/10.37.92.222:0], discPort=0, order=15, intOrder=11,
> lastExchangeTime=1542028724583, loc=false,
> ver=2.6.0#20180710-sha1:669feacc, isClient=true]], clientTop=null,
> gridStartTime=1542023768674, super=TcpDiscoveryAbstractMessage
> [sndNodeId=null, id=0c5f7c70761-28ebe679-e815-4c16-bd5c-ee320caa3019,
> verifierNodeId=28ebe679-e815-4c16-bd5c-ee320caa3019, topVer=0,
> pendingIdx=0, failedNodes=null, isClient=true]]
>
> Unmarshalling TcpDiscoveryNodeAddedMessage need 1 minute...(:
>
> That's org.apache.ignite.marshaller.jdk.JdkMarshaller with logs:
>     /** {@inheritDoc} */
>     @SuppressWarnings({"unchecked"})
>     @Override protected <T> T unmarshal0(InputStream in, @Nullable
> ClassLoader clsLdr) throws IgniteCheckedException {
>         assert in != null;
>
>         if (clsLdr == null)
>             clsLdr = getClass().getClassLoader();
>
>         logger.error("Unmarshall 1");
>
>         ObjectInputStream objIn = null;
>
>         try {
>             objIn = new JdkMarshallerObjectInputStream(new
> JdkMarshallerInputStreamWrapper(in), clsLdr, clsFilter);
>
>             logger.error("Unmarshall 2");
>
>             T t = (T)objIn.readObject();
>
>             logger.error("Unmarshall 3: " + t);
>
>             return t;
>         }
> ...
>
>
> On 11/12/18 10:24, aealexsandrov wrote:
>> Hi,
>>
>> Could you please attach the XML configurations of your client and server
>> nodes and logs?
>>
>> BR,
>> Andrei
>>
>>
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
>

Dmitry Lazurkin Dmitry Lazurkin
Reply | Threaded
Open this post in threaded view
|

Re: Client connects to server after too long time interval (1 minute)

I've found. TcpDiscoveryNode try to resolve sockAddrs in readExternal
method from hostNames (first) and addrs (second). TcpDiscoverySpi fills
hostNames for node with local hostnames. And of course new client node
can resolve  only local hostname. In my case resolution of other node's
hostnames take 10 seconds for each hostname.

On 11/12/18 17:28, Dmitry Lazurkin wrote:

> After reconnect unmarshalling of TcpDiscoveryNodeAddedMessage takes 20
> seconds:
> 2018-11-12 14:10:36.105 ERROR 10 --- [-sock-reader-#3]
> o.a.ignite.marshaller.jdk.JdkMarshaller : Unmarshall 1
> 2018-11-12 14:10:36.107 ERROR 10 --- [-sock-reader-#3]
> o.a.ignite.marshaller.jdk.JdkMarshaller : Unmarshall 2
> 2018-11-12 14:10:56.262 ERROR 10 --- [-sock-reader-#3]
> o.a.ignite.marshaller.jdk.JdkMarshaller : Unmarshall 3:
> TcpDiscoveryNodeAddedMessage [node=TcpDiscoveryNode [id=06960cfd-17
>
> Does class loader load many classes on initial connect?
>
> On 11/12/18 16:28, Dmitry Lazurkin wrote:
>> Hi, Andrei. Thank you for reply.
>>
>> I have found that problem is in unmarshalling:
>> 2018-11-12 13:18:24.375  INFO 10 --- [-sock-reader-#3]
>> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : 1
>> 2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
>> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
>> node: 28ebe679-e815-4c16-bd5c-ee320caa3019
>> 2018-11-12 13:18:24.375 ERROR 10 --- [-sock-reader-#3]
>> o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 1
>> 2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
>> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
>> node: 12f39501-0fba-4c1c-9c37-dc35f33fcee9
>> 2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
>> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
>> node: 40b74eed-a636-488f-b1a8-6d58f4bc9137
>> 2018-11-12 13:18:24.375 DEBUG 10 --- [o-msg-worker-#4]
>> o.a.i.spi.discovery.tcp.TcpDiscoverySpi  : Received metrics from unknown
>> node: f4c370e7-cf4a-4209-84fe-42fba4a30eef
>> 2018-11-12 13:18:24.380 ERROR 10 --- [-sock-reader-#3]
>> o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 2
>> 2018-11-12 13:19:44.965 ERROR 10 --- [-sock-reader-#3]
>> o.a.ignite.marshaller.jdk.JdkMarshaller  : Unmarshall 3:
>> TcpDiscoveryNodeAddedMessage [node=TcpDiscoveryNode
>> [id=8807c1ce-1e4a-496b-9aea-66cee8d95434, addrs=[10.37.92.255],
>> sockAddrs=[client-98d86c46f-j76s5/10.37.92.255:0], discPort=0, order=0,
>> intOrder=12, lastExchangeTime=1542028704438, loc=false,
>> ver=2.6.0#20180710-sha1:669feacc, isClient=true],
>> dataPacket=org.apache.ignite.spi.discovery.tcp.internal.DiscoveryDataPacket@3b238c6e,
>> discardMsgId=null, discardCustomMsgId=null, top=[TcpDiscoveryNode
>> [id=28ebe679-e815-4c16-bd5c-ee320caa3019, addrs=[10.48.14.1],
>> sockAddrs=[ignite-1/10.48.14.1:47500], discPort=47500, order=1,
>> intOrder=1, lastExchangeTime=1542028704438, loc=false,
>> ver=2.6.0#20180710-sha1:669feacc, isClient=false], TcpDiscoveryNode
>> [id=f4c370e7-cf4a-4209-84fe-42fba4a30eef, addrs=[10.37.92.208],
>> sockAddrs=[/10.37.92.208:0], discPort=0, order=2, intOrder=2,
>> lastExchangeTime=1542028704448, loc=false,
>> ver=2.6.0#20180710-sha1:669feacc, isClient=true], TcpDiscoveryNode
>> [id=12f39501-0fba-4c1c-9c37-dc35f33fcee9, addrs=[10.37.92.205],
>> sockAddrs=[/10.37.92.205:0], discPort=0, order=5, intOrder=5,
>> lastExchangeTime=1542028714511, loc=false,
>> ver=2.6.0#20180710-sha1:669feacc, isClient=true], TcpDiscoveryNode
>> [id=40b74eed-a636-488f-b1a8-6d58f4bc9137, addrs=[10.37.92.222],
>> sockAddrs=[/10.37.92.222:0], discPort=0, order=15, intOrder=11,
>> lastExchangeTime=1542028724583, loc=false,
>> ver=2.6.0#20180710-sha1:669feacc, isClient=true]], clientTop=null,
>> gridStartTime=1542023768674, super=TcpDiscoveryAbstractMessage
>> [sndNodeId=null, id=0c5f7c70761-28ebe679-e815-4c16-bd5c-ee320caa3019,
>> verifierNodeId=28ebe679-e815-4c16-bd5c-ee320caa3019, topVer=0,
>> pendingIdx=0, failedNodes=null, isClient=true]]
>>
>> Unmarshalling TcpDiscoveryNodeAddedMessage need 1 minute...(:
>>
>> That's org.apache.ignite.marshaller.jdk.JdkMarshaller with logs:
>>     /** {@inheritDoc} */
>>     @SuppressWarnings({"unchecked"})
>>     @Override protected <T> T unmarshal0(InputStream in, @Nullable
>> ClassLoader clsLdr) throws IgniteCheckedException {
>>         assert in != null;
>>
>>         if (clsLdr == null)
>>             clsLdr = getClass().getClassLoader();
>>
>>         logger.error("Unmarshall 1");
>>
>>         ObjectInputStream objIn = null;
>>
>>         try {
>>             objIn = new JdkMarshallerObjectInputStream(new
>> JdkMarshallerInputStreamWrapper(in), clsLdr, clsFilter);
>>
>>             logger.error("Unmarshall 2");
>>
>>             T t = (T)objIn.readObject();
>>
>>             logger.error("Unmarshall 3: " + t);
>>
>>             return t;
>>         }
>> ...
>>
>>
>> On 11/12/18 10:24, aealexsandrov wrote:
>>> Hi,
>>>
>>> Could you please attach the XML configurations of your client and server
>>> nodes and logs?
>>>
>>> BR,
>>> Andrei
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>


Dmitry Lazurkin Dmitry Lazurkin
Reply | Threaded
Open this post in threaded view
|

Re: Client connects to server after too long time interval (1 minute)

In reply to this post by aealexsandrov
I've found. TcpDiscoveryNode try to resolve sockAddrs in readExternal
method from hostNames (first) and addrs (second). TcpDiscoverySpi fills
hostNames for node with local hostnames. And of course new client node
can resolve  only local hostname. In my case resolution of other node's
hostnames take 10 seconds for each hostname.