MQTT with Secure connection disconnection issue with IDFv3.0

rahul.b.patel
Posts: 57
Joined: Wed Apr 19, 2017 6:35 am

Postby rahul.b.patel » Wed May 23, 2018 7:16 am

Hello,
I have ported the Eclipse Paho MQTT client library from the link below,
https://github.com/eclipse/paho.mqtt.embedded-c

and added modifications for secure connection options with mbedTLS.

I tested it with the IDFv2.1 and IDFv3.0 stable releases and found an issue with the Disconnect event of the MQTT connection, described below. It occurs with IDFv3.0 only.

First of all, the relevant part of my code is as follows:

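Code:

/* Arguments omitted for brevity. */
NetworkConnect();
MQTTConnect();
while (1)
{
   /* Process MQTT traffic for up to 1000 ms. */
   MQTTYield(&client, 1000);
   if (!MQTTIsConnected(&client))
   {
      ESP_LOGE(TAG, "Disconnected.......");
      break;
   }
}
/* Connection lost: tear down the network and reconnect. */
DisconnectNetwork();
NetworkConnect();
MQTTConnect();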


Now the issue with IDFv3.0:
When the connection is lost (the STA on the ESP is disconnected), MQTTYield() in the above snippet does not return until the STA is connected again. So until the STA reconnects, MQTTYield() blocks indefinitely. I suspect that the timer in the MQTT library implementation does not expire.

But after changing back to IDFv2.1,
MQTTYield() returns after 1 sec as expected and prints the "Disconnected......." log on the console.


This issue occurs only with IDFv3.0. With IDFv2.1, MQTTYield() returns as soon as the STA gets disconnected, and my code then disconnects the network and tries to connect to it again.

I hope I have explained the issue well enough to give an idea of it.
It would be a great help if anybody has come across the same issue and has a resolution.

Is there any change in IDFv3.0 that could cause this issue? Looking at MQTTYield(), its timing is based on gettimeofday(), which in turn depends on FreeRTOS ticks. In IDFv3.0, the tick sources for the two CPUs were separated. Could that be coming into the picture? Well, it's just a guess.

If anybody has any idea, it would be a great help.
thanks.

rahul.b.patel
Posts: 57
Joined: Wed Apr 19, 2017 6:35 am

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Postby rahul.b.patel » Thu May 24, 2018 12:22 pm

Hi,
Does anyone have an idea whether changes in IDFv3.0 could cause this issue?

Thanks.

ESP_Angus
Posts: 1225
Joined: Sun May 08, 2016 4:11 am

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Postby ESP_Angus » Fri May 25, 2018 2:43 am

Hi Rahul,

You've explained the problem you're seeing well but without seeing your port it's not really possible to guess at what the problem might be. There aren't any changes in v3.0 that spring to mind as likely candidates, but it's hard to guess.

Suggest either using a JTAG debugger or adding some logging in the MQTT code so you can see (for example) each time the code goes into a delay loop or blocks for something. Check that the timeouts for delays and blocking calls are being set correctly. Maybe there is some integer overflow/underflow somewhere causing something to block indefinitely when it should be timing out.

Angus

rahul.b.patel
Posts: 57
Joined: Wed Apr 19, 2017 6:35 am

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Postby rahul.b.patel » Fri May 25, 2018 12:36 pm

Hi Angus,
Thanks for the helpful reply. I analyzed the scenario with some logging and found that mbedtls_ssl_read() normally sits in a blocking state.
The issue I am observing is that on a STA disconnect event, mbedtls_ssl_read() comes out of the blocking state in IDFv2.1, while it stays blocked in IDFv3.0.
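
Since mbedtls_ssl_read() ends up waiting in a socket receive call, one possible way to bound that wait is a receive timeout on the underlying socket. The sketch below uses the BSD-style SO_RCVTIMEO option with a struct timeval; whether lwIP honours it depends on its configuration (LWIP_SO_RCVTIMEO), so treat this as an assumption to verify. set_recv_timeout_ms(), recv_with_timeout() and RECV_TIMED_OUT are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical return code for "no data arrived within the timeout". */
#define RECV_TIMED_OUT (-2)

/* Set a receive timeout on a socket.
 * A timeout of 0 would mean "block forever", so it is rejected here.
 * Returns 0 on success, -1 on failure. */
static int set_recv_timeout_ms(int fd, long timeout_ms)
{
    struct timeval tv;

    if (timeout_ms <= 0) {
        return -1;
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Receive once. Returns the number of bytes read, 0 if the peer closed
 * the connection, -1 on error, or RECV_TIMED_OUT if the timeout expired. */
static int recv_with_timeout(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return RECV_TIMED_OUT;
    }
    return (int)n;
}
```

With a timeout in place, the receive path returns periodically, so the loop around MQTTYield() gets a chance to notice the lost connection. mbedTLS also has its own mechanism for this: mbedtls_ssl_conf_read_timeout() together with a timeout-aware receive callback (for example mbedtls_net_recv_timeout()) passed as the last argument of mbedtls_ssl_set_bio(). Which of the two fits a given port better is a judgment call.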

rahul.b.patel
Posts: 57
Joined: Wed Apr 19, 2017 6:35 am

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Postby rahul.b.patel » Sat May 26, 2018 4:23 am

Hi Angus,

I backported the LWIP component from IDFv2.1 to IDFv3.0 and now it works fine. So I need to find out which change in the LWIP component in IDFv3.0 creates this issue. Any suggestion from your side would be helpful.

thanks.

littlesky
Posts: 4
Joined: Fri Jun 09, 2017 7:49 am

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Postby littlesky » Mon May 28, 2018 6:26 am

I guess that it may be caused by the IP address. In IDF v2.1, the IP address is lost as soon as WiFi disconnects. But in IDF v3.0, the IP address is kept unchanged for 120 seconds after WiFi disconnects. Because the IP address is unchanged, the TCP connections bound to it are not changed either.

liuzhifu
Posts: 7
Joined: Tue Dec 13, 2016 2:18 am

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Postby liuzhifu » Tue Jun 05, 2018 3:21 am

Please refer to my mail about this issue:

This is an issue introduced in IDFv3.0.

In the original LWIP, when WiFi disconnects, tcp_abort() is called to kill all active/bound TCP connections.

In IDFv3.0 and the latest IDF, when WiFi disconnects, the TCP connections are not removed, and the application is responsible for removing them. When the WiFi connection recovers, the TCP connections do not need to be re-created; they are simply rebound to the new IP address.

The historical reasons for not removing TCP connections in IDFv3.0:
1. Most applications have a WiFi auto-reconnect mechanism, so the WiFi reconnects automatically if it disconnects for some reason.
2. If the WiFi successfully reconnects to the AP, the TCP connections do not need to be re-created.
3. If the WiFi no longer reconnects to the AP, or if esp_wifi_set_config() is called to reconnect the WiFi to a different AP, or if esp_wifi_stop()/esp_wifi_deinit() is called, then the application needs to close all socket connections.
4. Most applications, such as audio applications, use recv with a TIMEOUT value, so if the WiFi stays disconnected for a long time, recv returns because of the timeout.
5. However, some existing library APIs, such as mbedTLS, use the LWIP recv in block-forever mode. That is what our customer ran into in this issue.

So the suggestions for customers:
1. Ignore this issue if the WiFi will reconnect.
2. Close the sockets if the WiFi will not reconnect, or if it reconnects to a different AP.

In the future (sorry, I'm not sure of the exact date), we are considering adding a menuconfig option for this; it is disabled by default (the behavior is the same as IDFv2.1).
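
The second suggestion (close the sockets when the WiFi will not come back) could be sketched with a small registry of open sockets, as below. This is a minimal illustration using the BSD-style socket calls, not an ESP-IDF API: track_socket(), close_tracked_sockets() and MAX_TRACKED_SOCKETS are hypothetical names, and the sketch does no locking, so it assumes a single task registers and closes the sockets.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical limit on how many sockets the application tracks. */
#define MAX_TRACKED_SOCKETS 8

static int    tracked_fds[MAX_TRACKED_SOCKETS];
static size_t tracked_count;

/* Remember a socket so it can be closed when the link is given up.
 * Returns 0 on success, -1 if the fd is invalid or the table is full. */
static int track_socket(int fd)
{
    if (fd < 0 || tracked_count >= MAX_TRACKED_SOCKETS) {
        return -1;
    }
    tracked_fds[tracked_count++] = fd;
    return 0;
}

/* Shut down and close every tracked socket.
 * Returns how many sockets were closed successfully. */
static size_t close_tracked_sockets(void)
{
    size_t closed = 0;
    size_t i;

    for (i = 0; i < tracked_count; i++) {
        /* Result ignored: the peer may already be unreachable. */
        shutdown(tracked_fds[i], SHUT_RDWR);
        if (close(tracked_fds[i]) == 0) {
            closed++;
        }
    }
    tracked_count = 0;
    return closed;
}
```

In an ESP-IDF v3.0 application, the decision to call close_tracked_sockets() would typically follow a SYSTEM_EVENT_STA_DISCONNECTED event once reconnecting has been given up. Closing a socket from one task while another task is still blocked in recv on it may not be safe with lwIP, so it is probably better to have the task that owns the sockets do the closing, for example after a receive timeout.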

rahul.b.patel
Posts: 57
Joined: Wed Apr 19, 2017 6:35 am

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Postby rahul.b.patel » Tue Jun 05, 2018 8:32 am

Hi,
Thanks for the detailed explanation. So the application needs to handle closing all sockets on the WiFi STA disconnect event, assuming the worst case that the AP will not connect again.
