Spurious panic in crosscore

benpeoples
Posts: 10
Joined: Wed May 31, 2017 4:21 pm

Postby benpeoples » Wed May 27, 2020 6:19 pm

I seem to be getting somewhat random panic resets on a device, and they seem to be happening in crosscore.

Complete core dump is here: https://pastebin.com/vCGet7eL

The really weird thing is that this happened on devices at a customer's location: two of them did this at precisely the same time, and both core dumps look nearly identical.

My guess is that there was some power glitch that caused one of the cores to lock up.

Does anyone see anything that would point to a software issue? Maybe we just need better power filtering on our inputs.

ESP_Sprite
Posts: 8921
Joined: Thu Nov 26, 2015 4:08 am

Postby ESP_Sprite » Thu May 28, 2020 6:59 am

I'm not sure you can conclude that the issue is in crosscore. All the other tasks are in there, mostly because they yielded, and that is the last position a task is in before it gets descheduled. All we know about the crashing tasks is that they called abort; the stack trace is broken from there on.

I agree that if two boards crash in the same way at exactly the same time, a power issue may be the culprit.

benpeoples

Postby benpeoples » Thu May 28, 2020 12:54 pm

Thanks! That makes sense.

Related to this, all of my stack traces seem to get corrupted in this way (our panic handler uploads the core dump stored on flash to the server upon reboot). Even a simple test that triggers an assert (e.g., assert(false);) leaves the current thread's stack corrupted above the abort(). Is that common behavior in a panic, or is there some problem in my handler?

We're just uploading the full contents of the coredump partition.

I don't want to post the ELF for the firmware publicly, but is there perhaps documentation on the coredump format, or someone who could look at the dump and tell me whether I'm corrupting it in some way?

ESP_Sprite

Postby ESP_Sprite » Thu May 28, 2020 9:01 pm

I doubt it's you corrupting the coredump; if I recall correctly, it's protected by a hash or CRC or something. Which ESP-IDF version do you compile with? It could be that something is marked as 'noreturn' in that version, which would break core dumps on abort.
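
If you want to rule out corruption on the upload path yourself, a server-side sanity check along these lines might help. This is only a sketch under assumptions that are not confirmed anywhere in this thread: it assumes the uploaded blob starts with a 32-bit little-endian total length and ends with a standard CRC-32 over everything before it. The function names (crc32_ieee, read_le32, coredump_blob_looks_intact) are made up for illustration, and the actual layout, checksum type and CRC seed for your ESP-IDF version should be checked against its core dump sources before trusting the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit.
 * ASSUMPTION: the dump uses this exact CRC variant; verify for your IDF version. */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return ~crc;
}

/* Read a 32-bit little-endian value regardless of host endianness. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the blob passes the check, 0 otherwise.
 * HYPOTHETICAL layout: [u32 total_len][payload ...][u32 crc32],
 * where total_len counts the length field, the payload and the CRC,
 * and the CRC covers every byte before it. The blob may be longer than
 * total_len, since the whole partition is uploaded. */
int coredump_blob_looks_intact(const uint8_t *blob, size_t blob_size)
{
    if (blob == NULL || blob_size < 8) {
        return 0;
    }
    uint32_t image_len = read_le32(blob);
    if (image_len < 8 || image_len > blob_size) {
        return 0;   /* erased flash (0xFFFFFFFF) or a truncated upload ends up here */
    }
    uint32_t stored = read_le32(blob + image_len - 4);
    return crc32_ieee(blob, image_len - 4) == stored;
}
```

If a check like this passes on the server but the backtrace is still broken above abort(), the upload path is probably not the problem and the dump was already like that on the device.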

benpeoples

Postby benpeoples » Fri May 29, 2020 11:57 pm

It appears to be the v4.0 release. It looks like I should go ahead and move to v4.0.1, but that does not seem to solve this particular issue.
