Are there any tools to help debug watchdog timeouts

rwel59
Posts: 97
Joined: Thu Oct 12, 2017 3:32 pm

Are there any tools to help debug watchdog timeouts

Post by rwel59 » Sun Mar 11, 2018 10:05 pm

I've got an application that sometimes runs for days without any problems. Other times I get watchdog timeout errors after minutes or hours, and the app hangs without performing any of its functions. I've ignored this problem to date, but now it's time to solve it.

I've tried to find the cause by reading the code and adding tons of log messages, but so far I haven't figured it out. Because the failure happens so infrequently, debugging with log messages is a very time-consuming process.

Are there any tools that will provide more information on where the timeout is being triggered?

kolban
Posts: 1683
Joined: Mon Nov 16, 2015 4:43 pm
Location: Texas, USA

Re: Are there any tools to help debug watchdog timeouts

Post by kolban » Sun Mar 11, 2018 10:16 pm

As I understand it, the purpose of a watchdog timeout is to detect tasks that aren't responding. To do this, we set a timer per task and ask, "Since the last time the timer fired, has this task proved it is alive?" If you are getting a watchdog exception, I am guessing it is because some task has not identified itself as alive by setting its liveness indicator. What are the details of your exception message? Do they give us details (or at least a clue) as to which task was thought to be in trouble?
Free book on ESP32 available here: https://leanpub.com/kolban-ESP32
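
To make the liveness idea above concrete, here is a minimal host-side sketch in Python. It is a toy model only: the class name `TaskWatchdog`, its methods, and the layout (one shared deadline plus one flag per task) are assumptions chosen for illustration and are not taken from the ESP-IDF source.

```python
class TaskWatchdog:
    """Toy model of a task watchdog: one shared deadline, one flag per task."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.deadline = timeout_s      # time of the next liveness check
        self.has_reset = {}            # task name -> "proved it is alive"

    def subscribe(self, task):
        """Start watching a task; it has not proved anything yet."""
        self.has_reset[task] = False

    def reset(self, task):
        """Called by a task to prove it is alive."""
        self.has_reset[task] = True

    def check(self, now_s):
        """Return the tasks that missed the deadline ([] if none, or not yet due)."""
        if now_s < self.deadline:
            return []
        late = sorted(t for t, ok in self.has_reset.items() if not ok)
        # Start a new period: every task has to prove itself again.
        for task in self.has_reset:
            self.has_reset[task] = False
        self.deadline = now_s + self.timeout_s
        return late
```

For example, if "IDLE" and a hypothetical "worker" task are both subscribed and only "worker" calls `reset()`, `check()` reports "IDLE" once the deadline passes. That mirrors a report in which the idle task is the one listed as not having reset the watchdog, because some other task kept it from running.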

rwel59
Posts: 97
Joined: Thu Oct 12, 2017 3:32 pm

Re: Are there any tools to help debug watchdog timeouts

Post by rwel59 » Sun Mar 11, 2018 10:21 pm

Unfortunately, no details are given in the monitor. I haven't been able to find any routines that are hanging. Was hoping there was some type of configuration flag or tool that would help identify what is hanging.

kolban
Posts: 1683
Joined: Mon Nov 16, 2015 4:43 pm
Location: Texas, USA

Re: Are there any tools to help debug watchdog timeouts

Post by kolban » Sun Mar 11, 2018 10:22 pm

What exactly does the monitor output show? Have you enabled verbose output for the execution of the project?

rwel59
Posts: 97
Joined: Thu Oct 12, 2017 3:32 pm

Re: Are there any tools to help debug watchdog timeouts

Post by rwel59 » Sun Mar 11, 2018 10:41 pm

Task watchdog got triggered. The following tasks did not reset the watchdog in time:
- IDLE (CPU 0)
Tasks currently running:
CPU 0: Task

Just turned verbose back on so maybe that will give me more - seems a bit strange hoping my code craps out...

update: verbose did not provide any additional info

WiFive
Posts: 3529
Joined: Tue Dec 01, 2015 7:35 am

Re: Are there any tools to help debug watchdog timeouts

Post by WiFive » Mon Mar 12, 2018 12:24 am

Well, this tells you which task to look at. Check whether it contains any while loops that never block or yield:

Tasks currently running:
CPU 0: Task
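
When the report is buried in a long captured log, a small host-side script can pull out the two pieces of information that matter: which tasks failed to reset the watchdog, and which task was running on each CPU at that moment. The following is a sketch only. The function name `parse_wdt_report` is hypothetical, and the line formats it matches are assumed from the output quoted in this thread, so other ESP-IDF versions may print something different.

```python
import re


def parse_wdt_report(text):
    """Split a task watchdog report into (starved tasks, running tasks)."""
    starved = []   # list of (task name, cpu) that did not reset the watchdog
    running = {}   # cpu -> name of the task running when the watchdog fired
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Task watchdog got triggered"):
            section = "starved"
        elif line.startswith("Tasks currently running"):
            section = "running"
        elif section == "starved":
            m = re.match(r"-\s*(.+?)\s*\(CPU (\d+)\)$", line)
            if m:
                starved.append((m.group(1), int(m.group(2))))
        elif section == "running":
            m = re.match(r"CPU (\d+):\s*(.*)$", line)
            if m:
                running[int(m.group(1))] = m.group(2)
    return starved, running
```

Applied to the report above, it yields IDLE on CPU 0 as the starved task and "Task" as the task that was running on CPU 0, which is the one to inspect.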

rwel59
Posts: 97
Joined: Thu Oct 12, 2017 3:32 pm

Re: Are there any tools to help debug watchdog timeouts

Post by rwel59 » Mon Mar 12, 2018 1:45 am

Thanks for pointing that out.
I am not in the habit of naming my tasks, so I had dismissed that line as meaningless, which it is at the moment. I will add task names to my code, and hopefully that will lead me in the right direction.

ESP_igrr
Posts: 2067
Joined: Tue Dec 01, 2015 8:37 am

Re: Are there any tools to help debug watchdog timeouts

Post by ESP_igrr » Mon Mar 12, 2018 2:00 am

In addition to naming your tasks, a couple more suggestions:

1) If you connect JTAG, you can halt the program (hit Ctrl-C in GDB or the "pause" button in Eclipse) and observe exactly where the "Task" is spending its time.

2) If you cannot connect JTAG, configuring the task watchdog to cause a panic (the TASK_WDT_PANIC option in menuconfig, under Component config, ESP32-specific) might give you a stack trace that points to the offending code. Note that this is not guaranteed to work, because it depends on which core the task is spinning on, but it still might give you some clue.
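
For reference, enabling that option in menuconfig should leave lines like the following in the project's sdkconfig file. This is a sketch that assumes an ESP-IDF release from the same period as this thread, where the options carry the plain `CONFIG_TASK_WDT` prefix; newer releases may use different names, so check menuconfig rather than editing the file by hand.

```
CONFIG_TASK_WDT=y
CONFIG_TASK_WDT_PANIC=y
```

After changing the setting, rebuild and flash so the new configuration takes effect.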

rwel59
Posts: 97
Joined: Thu Oct 12, 2017 3:32 pm

Re: Are there any tools to help debug watchdog timeouts

Post by rwel59 » Mon Mar 12, 2018 2:10 am

In general, there are no long-running tasks, so I don't think JTAG would help with this issue, although I have been considering it for other debugging.

I'm guessing my issue is related to wifi communications getting hung somewhere so I'll take a look at the panic config to see if that can provide more info.

rwel59
Posts: 97
Joined: Thu Oct 12, 2017 3:32 pm

Re: Are there any tools to help debug watchdog timeouts

Post by rwel59 » Mon Mar 12, 2018 12:17 pm

With the debugging suggestions here, I now know which task is causing the WDT. It's a pretty simple task, so it's not obvious why the WDT would be triggered. I'm trying to understand how to trace the error based on what the monitor is showing me, but I'm not having much luck. Could someone explain (or point to a link on) whether and how I can use the debug info to get closer to the root cause?

make monitor shows the following:

Task watchdog got triggered. The following tasks did not reset the watchdog in time:
- IDLE (CPU 0)
Tasks currently running:
CPU 0: wifiMonitor
Aborting.
abort() was called at PC 0x400d206b on core 0
0x400d206b: task_wdt_isr at /home/rick/esp/esp-idf/components/esp32/./task_wdt.c:236


Backtrace: 0x4008b0d0:0x3ffc07b0 0x4008b1cf:0x3ffc07d0 0x400d206b:0x3ffc07f0 0x4008269e:0x3ffc0810 0x4000bfed:0x00000000
0x4008b0d0: invoke_abort at /home/rick/esp/esp-idf/components/esp32/./panic.c:572

0x4008b1cf: abort at /home/rick/esp/esp-idf/components/esp32/./panic.c:572

0x400d206b: task_wdt_isr at /home/rick/esp/esp-idf/components/esp32/./task_wdt.c:236

0x4008269e: _xt_lowint1 at /home/rick/esp/esp-idf/components/freertos/./xtensa_vectors.S:1105
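
The monitor has already decoded most of the addresses above, but when a backtrace is captured without it (for example from a plain serial log), the same decoding can be done by hand. The sketch below extracts the program counter from each PC:SP pair on the "Backtrace:" line and builds an addr2line command for them. The function names are hypothetical, and the ELF path in the usage note is a placeholder for the project's own build output.

```python
import re


def backtrace_pcs(line):
    """Return the PC half of every PC:SP pair in a 'Backtrace:' line."""
    pairs = re.findall(r"(0x[0-9a-fA-F]{8}):(0x[0-9a-fA-F]{8})", line)
    return [pc for pc, _sp in pairs]


def addr2line_command(pcs, elf_path, tool="xtensa-esp32-elf-addr2line"):
    """Build the argument list that decodes the PCs against an ELF file."""
    # -p pretty-print, -f function names, -i inlined frames,
    # -a show addresses, -C demangle C++ names
    return [tool, "-pfiaC", "-e", elf_path] + list(pcs)
```

Calling `addr2line_command(backtrace_pcs(line), "build/app.elf")` gives a list that can be passed to `subprocess.run`. Note that every decoded frame in the output above (invoke_abort, abort, task_wdt_isr, _xt_lowint1) belongs to the watchdog interrupt path rather than to wifiMonitor, and the last address, 0x4000bfed, has no decoded line. That is consistent with the earlier caveat that the panic option is not guaranteed to point at the offending code.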
