are TCP DUP ACKs caused by lwIP process being starved of time?

tschak909
Posts: 36
Joined: Mon Oct 26, 2020 8:17 pm

Post by tschak909 » Tue Dec 08, 2020 8:15 pm

Hello everybody,

I am one of the firmware developers on the #FujiNet project, which brings a network adapter to the #Atari8bit computers.

The firmware is here, and is free software:
https://github.com/FujiNetWIFI/fujinet-platformio

We are currently having issues with the Wi-Fi MODEM emulation that is part of the firmware (currently implemented here: https://github.com/FujiNetWIFI/fujinet- ... /modem.cpp). If a host sends too many one-byte packets (because the host has disabled Nagle's algorithm to improve interactive performance), the ESP32 does not respond to them fast enough, and the host re-sends the packet. The ESP32 does eventually acknowledge all the packets, if a bit late, so the host sees duplicate acknowledgements (DUP ACKs) and ultimately refuses to send more data because it thinks there is too much packet congestion.
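For anyone unfamiliar with the host-side setting mentioned above, here is a minimal sketch in C of how a host typically disables Nagle's algorithm, assuming a POSIX sockets API. The helper names `disable_nagle` and `nagle_disabled` are hypothetical and exist only for this illustration; they are not part of the FujiNet firmware or lwIP.

```c
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_NODELAY */
#include <sys/socket.h>   /* setsockopt, getsockopt */

/* Hypothetical helper: disable Nagle's algorithm on a TCP socket so that
 * small writes (such as single keystrokes) are sent immediately instead of
 * being coalesced. Returns 0 on success, -1 on error. */
int disable_nagle(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: report whether TCP_NODELAY is set on the socket.
 * Returns 1 if Nagle is disabled, 0 if it is enabled, -1 on error. */
int nagle_disabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
        return -1;
    return value != 0;
}
```

With this option set, every one-byte write on the host can go out as its own TCP segment, which is the traffic pattern that triggers the problem described here.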

My question is: if the primary thread of the program runs too quickly, does this starve lwIP of the time it needs to do basic TCP housekeeping, like acknowledging receipt of packets? If so, what's the correct solution?

We've already tried:

* Calling vPortYield() all over the parts of the code where received packets are processed.
* Adjusting SDK parameters to change the size of the static and dynamic buffers.
* Adjusting the process affinity of the lwIP process.
* Explicitly putting delays in where received packets are to be processed (as a test).


We're really scratching our heads here; any help would be appreciated.

-Thom Cherryhomes

ESP_Sprite
Posts: 9050
Joined: Thu Nov 26, 2015 4:08 am

Re: are TCP DUP ACKs caused by lwIP process being starved of time?

Post by ESP_Sprite » Sun Dec 13, 2020 3:04 am

Hard to say... in theory, lwIP and Wi-Fi do have their own tasks, and if they are starved, I could indeed imagine a deadlock happening. Is there any way you can lower the priority of your own tasks so lwIP/Wi-Fi get preferential execution time? Does that change the situation?

tschak909

Re: are TCP DUP ACKs caused by lwIP process being starved of time?

Post by tschak909 » Tue Dec 15, 2020 12:52 am

Thanks to the tireless efforts of a FujiNet user (Mark LeAir, aka John Polka), we now have an SDK option that fixed our issue:
[Codebox]
CONFIG_LWIP_TCP_QUEUE_OOSEQ=n
[/Codebox]

This turns off queuing of packets that arrive out of sequence, forcing the host to re-send them (which it will do anyway), but avoiding the duplicate acknowledgements.

Given that the target system is an 8-bit microcomputer, the resulting performance impact is negligible. Hopefully this will help someone else who runs into this problem.

-Thom

ESP_Sprite

Re: are TCP DUP ACKs caused by lwIP process being starved of time?

Post by ESP_Sprite » Tue Dec 15, 2020 1:41 am

Thanks for posting the solution as well!
