ESP32 randomly crashing with high EMI RS-232 comm

Posted: Thu Oct 29, 2020 3:43 pm
by FiwiDev
Hello all, I have a problem that has me stumped. I've designed a PCB around an ESP32 that provides WiFi connectivity and access to a large power supply. It communicates with the power supply's CPU over RS-232 (through a connector; the CPU is not on the same board). Most of the time it works perfectly well.
However, at higher power supply loads, a lot of EMI is induced in the RS-232 cable (confirmed with an oscilloscope). I am using a TTL buffer IC to protect the ESP32 from direct EMI, though that does not seem to help in the slightest. The issue is that ESP32 crashes correlate directly with power supply load (and therefore with EMI). My best guess is that the EMI induced in the RS-232 cable is causing an unhandled "framing error" in the ESP32 Arduino core?
I have also had it crash when running a lot of serial communications back and forth (changing settings, etc.) without EMI to blame. This is in part why I believe it may have something to do with an unhandled UART error, though I do not know how I would identify this.
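To narrow down whether corrupted bytes are reaching my parsing code at all, a filter along the following lines could sit between the UART and the command handler. This is only a minimal sketch: `LineFilter` and its methods are hypothetical names, and it assumes the power supply speaks printable-ASCII commands terminated by a newline, which may not match the real protocol. It caps the line length and discards any line that contained a non-printable byte, so noise cannot grow a buffer without bound, and the rejected count gives a rough measure of how much garbage arrives under load.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: accumulates received bytes into a line and rejects
// lines that are too long or contain non-printable bytes (likely noise).
class LineFilter {
public:
    explicit LineFilter(std::size_t maxLen) : maxLen_(maxLen) {}

    // Feed one received byte. Returns true when a complete, valid line is ready.
    bool feed(uint8_t b) {
        if (b == '\r') return false;              // tolerate CRLF line endings
        if (b == '\n') {
            bool ok = !corrupt_ && !buf_.empty();
            if (ok) line_ = buf_;
            else if (corrupt_) ++rejected_;       // count lines dropped as noise
            buf_.clear();
            corrupt_ = false;
            return ok;
        }
        if (b < 0x20 || b > 0x7E || buf_.size() >= maxLen_) {
            corrupt_ = true;                      // noise byte or overlong line
            return false;
        }
        if (!corrupt_) buf_.push_back(static_cast<char>(b));
        return false;
    }

    const std::string& line() const { return line_; }   // last valid line
    std::size_t rejected() const { return rejected_; }  // lines discarded so far

private:
    std::size_t maxLen_;
    std::string buf_;
    std::string line_;
    bool corrupt_ = false;
    std::size_t rejected_ = 0;
};
```

In the sketch, each byte returned by `Serial.read()` would be passed to `feed()`, and the command handler would only run when it returns true.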

Crash logs more often than not look like the following:

Code:

E (339575) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (339575) task_wdt:  - IDLE0 (CPU 0)
E (339575) task_wdt: Tasks currently running:
E (339575) task_wdt: CPU 0: CPU
E (339575) task_wdt: CPU 1: loopTask
E (339575) task_wdt: Aborting.
abort() was called at PC 0x400ee25f on core 0

Backtrace: 0x4008d0cc:0x3ffbe170 0x4008d2fd:0x3ffbe190 0x400ee25f:0x3ffbe1b0 0x4008505d:0x3ffbe1d0 0x4000bfed:0x3ffb9310 0x4008a5f1:0x3ffb9320 0x400893b7:0x3ffb9340 0x400ec4d9:0x3ffb9360 0x400ea865:0x3ffb9380 0x400df7d9:0x3ffb93a0 0x40089469:0x3ffb93e0
and...

Code:

E (1448275) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (1448275) task_wdt:  - IDLE0 (CPU 0)
E (1448275) task_wdt: Tasks currently running:
E (1448275) task_wdt: CPU 0: CPU
E (1448275) task_wdt: CPU 1: loopTask
E (1448275) task_wdt: Aborting.
abort() was called at PC 0x400ee25f on core 0

Backtrace: 0x4008d0cc:0x3ffbe170 0x4008d2fd:0x3ffbe190 0x400ee25f:0x3ffbe1b0 0x4008505d:0x3ffbe1d0 0x400df70f:0x3ffb93a0 0x40089469:0x3ffb93e0
One last one for good measure:

Code:

E (133129) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (133129) task_wdt:  - IDLE0 (CPU 0)
E (133129) task_wdt: Tasks currently running:
E (133129) task_wdt: CPU 0: CPU
E (133129) task_wdt: CPU 1: loopTask
E (133129) task_wdt: Aborting.
abort() was called at PC 0x400ee25f on core 0

Backtrace: 0x4008d0cc:0x3ffbe170 0x4008d2fd:0x3ffbe190 0x400ee25f:0x3ffbe1b0 0x4008505d:0x3ffbe1d0 0x400ec4c8:0x3ffb9360 0x400ea865:0x3ffb9380 0x400df701:0x3ffb93a0 0x40089469:0x3ffb93e0
I've run the ESP Exception Decoder on the crashes, but in each case it did not point to a potential problem spot. It is worth noting that in all three crash logs the task running on CPU 0 was "CPU". It is one of several tasks on CPU 0, but it is the only one that handles RS-232 communications.
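Comparing the three backtraces by hand, the first four frames are identical in every log (the third one matches the abort() PC, 0x400ee25f), and only the frames after that differ. A small host-side helper like the one below makes that comparison repeatable. It is a rough sketch with hypothetical function names: it pulls the program-counter half out of each PC:SP pair and counts how many leading frames two backtraces share. The extracted addresses could then be fed to the toolchain's addr2line together with the matching .elf file.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Extract the program-counter half of every "PC:SP" pair in a backtrace line.
std::vector<uint32_t> backtracePcs(const std::string& line) {
    std::vector<uint32_t> pcs;
    std::istringstream in(line);
    std::string tok;
    while (in >> tok) {
        std::size_t colon = tok.find(':');
        // Skip anything that is not of the form 0x...:0x... (e.g. "Backtrace:").
        if (tok.compare(0, 2, "0x") != 0 || colon == std::string::npos) continue;
        pcs.push_back(static_cast<uint32_t>(
            std::stoul(tok.substr(0, colon), nullptr, 16)));
    }
    return pcs;
}

// Number of leading frames that two backtraces have in common.
std::size_t commonPrefix(const std::vector<uint32_t>& a,
                         const std::vector<uint32_t>& b) {
    std::size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n]) ++n;
    return n;
}
```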

Maybe I'm mistaken about the RTOS, but I did not think RTOS tasks had to "yield" time; I assumed the task switcher would simply give each task a fixed number of CPU cycles. There is an "xTaskNotifyWait" call in the "CPU" task, but to my knowledge that is not required. In other words, even if something is going haywire with the Serial access (maybe Serial.available() returning a ridiculous number due to an RS-232 framing error? https://github.com/espressif/arduino-es ... art.c#L280), it should not cause an ESP32 crash, should it?
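One caveat to the scheduler assumption above: FreeRTOS is a fixed-priority preemptive scheduler and only time-slices between tasks of equal priority, so a task above idle priority that never blocks can keep IDLE0 from running, which is exactly what the task watchdog reports. If noise makes bytes arrive continuously, a `while (Serial.available())` style loop might never exit. Below is a minimal sketch of a read loop that cannot spin forever. `drainBounded` is a hypothetical helper, and it assumes only that the stream object exposes `available()` and `read()` the way Arduino's `Stream` does.

```cpp
#include <cstddef>
#include <cstdint>

// Read at most maxBytes from a stream-like object, passing each byte to onByte.
// Returns the number of bytes consumed, so the caller can block or yield
// afterwards even if the line is being flooded with noise.
template <typename StreamT, typename Handler>
std::size_t drainBounded(StreamT& s, std::size_t maxBytes, Handler onByte) {
    std::size_t n = 0;
    while (n < maxBytes && s.available() > 0) {
        int b = s.read();          // Arduino-style read(): -1 means no data
        if (b < 0) break;
        onByte(static_cast<uint8_t>(b));
        ++n;
    }
    return n;
}
```

Inside the "CPU" task this would be called once per loop iteration, followed by a blocking call such as `vTaskDelay(1)` or the existing `xTaskNotifyWait` with a finite timeout, so that the idle task gets a chance to run and reset the watchdog.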

Thoughts?