CRASH in NimBLE host task

jcolebaker
Posts: 60
Joined: Thu Mar 18, 2021 12:23 am

Postby jcolebaker » Tue Jul 13, 2021 10:26 pm

Hi, we're using the NimBLE Bluetooth host in a fairly complex ESP32 application. It works well when I test it, but a tester is reporting frequent crashes. They are using the same ESP32-based board, but a different Android phone to connect. The other main difference is that they are using it in a large office environment where there is a lot of BLE traffic.

We are using ESP-IDF 4.1 (boot: ESP-IDF v4.1.1-250-ga92185263)

Here's the log from the crash. It's always the same. It typically happens if they open and then close the connection from the app to the ESP32 board a few times. It looks like it happens just after the app subscribes to several BLE characteristics.

I (100006) BLTH: Handle 12 subscribed
I (100096) BLTH: Handle 15 subscribed
I (100186) BLTH: Handle 18 subscribed
I (100276) BLTH: Handle 21 subscribed
I (100366) BLTH: Handle 24 subscribed
Guru Meditation Error: Core  0 panic'ed (StoreProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x40128446  PS      : 0x00060b30  A0      : 0x800e304c  A1      : 0x3ffcddf0
A2      : 0x00000001  A3      : 0x00000c35  A4      : 0x80090a95  A5      : 0x3ffbe340
A6      : 0x00000003  A7      : 0x00060023  A8      : 0x800e066e  A9      : 0x3ffcddd0
A10     : 0x00000000  A11     : 0x00000000  A12     : 0x400e09e0  A13     : 0x3ffc3d8c
A14     : 0x3ffb5f78  A15     : 0x00000000  SAR     : 0x00000019  EXCCAUSE: 0x0000001d
EXCVADDR: 0x00000001  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff


ELF file SHA256: b8935011f1710251

Backtrace: 0x40128443:0x3ffcddf0 0x400e3049:0x3ffcde10 0x400e325d:0x3ffcde30 0x400dcc11:0x3ffcde50 0x400dcc33:0x3ffcde80 0x400dfa42:0x3ffcdea0 0x400d705a:0x3ffcdec0 0x400900c5:0x3ffcdee0

Rebooting...


I used the xtensa-esp32-elf-addr2line tool to expand the above backtrace. It always looks like this:

0x40128443: put_le16 at /opt/esp/idf/components/bt/host/nimble/nimble/porting/nimble/src/endian.c:24
0x400e3049: ble_hs_hci_cmd_send at /opt/esp/idf/components/bt/host/nimble/nimble/nimble/host/src/ble_hs_hci_cmd.c:80
0x400e325d: ble_hs_hci_cmd_send_buf at /opt/esp/idf/components/bt/host/nimble/nimble/nimble/host/src/ble_hs_hci_cmd.c:125
0x400dcc11: ble_hs_flow_tx_num_comp_pkts at /opt/esp/idf/components/bt/host/nimble/nimble/nimble/host/src/ble_hs_flow.c:99
0x400dcc33: ble_hs_flow_event_cb at /opt/esp/idf/components/bt/host/nimble/nimble/nimble/host/src/ble_hs_flow.c:120
0x400dfa42: ble_npl_event_run at /opt/esp/idf/components/bt/host/nimble/nimble/porting/npl/freertos/include/nimble/nimble_npl_os.h:121
 (inlined by) nimble_port_run at /opt/esp/idf/components/bt/host/nimble/nimble/porting/nimble/src/nimble_port.c:81
0x400d705a: BleHostTask at /__w/13/s/build/../components/wc_bluetooth/wc_bluetooth.c:133
0x400900c5: vPortTaskWrapper at /opt/esp/idf/components/freertos/port.c:143

Our code is in wc_bluetooth.c, which just calls nimble_port_run().

I tried increasing the NimBLE task stack size to 5120 (BT_NIMBLE_TASK_STACK_SIZE), and also turning off NimBLE debug logging, but it didn't help much (maybe the crash occurred less often but it still occurred).

Does this look like a stack overflow? Could it be caused by high BLE traffic in the area? Any other ideas?
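
One way to narrow this down is to read the register dump against the backtrace. On the Xtensa windowed ABI the current function's first two arguments normally sit in A2 and A3, so, assuming put_le16 has the signature (void *buf, uint16_t x), the dump suggests buf = 0x00000001 (the same value as EXCVADDR) and x = 0x0c35. Treated as an HCI opcode (OGF in the upper 6 bits, OCF in the lower 10), 0x0c35 splits into OGF 0x03 and OCF 0x35, which as far as I can tell is Host Number Of Completed Packets and so would be consistent with ble_hs_flow_tx_num_comp_pkts in the backtrace. That reading points at a bad command buffer pointer rather than a stack overflow, but it is an inference from the dump, not something confirmed. The sketch below is plain C for a host PC; the helper names are made up for illustration and are not NimBLE APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of NimBLE.
 * HCI opcode layout: upper 6 bits are the OGF, lower 10 bits the OCF. */
static uint16_t hci_opcode_make(uint8_t ogf, uint16_t ocf)
{
    assert(ogf < 64 && ocf < 1024);
    return (uint16_t)((ogf << 10) | ocf);
}

static uint8_t hci_opcode_ogf(uint16_t opcode)
{
    return (uint8_t)(opcode >> 10);
}

static uint16_t hci_opcode_ocf(uint16_t opcode)
{
    return (uint16_t)(opcode & 0x03ff);
}

/* Illustrative stand-in for a little-endian 16-bit store such as put_le16.
 * With buf == (void *)0x1, as A2 and EXCVADDR suggest, the first byte store
 * would fault on the ESP32, which matches a StoreProhibited exception. */
static void sketch_put_le16(void *buf, uint16_t x)
{
    uint8_t *p = buf;

    p[0] = (uint8_t)x;        /* low byte first */
    p[1] = (uint8_t)(x >> 8); /* then high byte */
}
```

Calling hci_opcode_ogf(0x0c35) and hci_opcode_ocf(0x0c35) gives 0x03 and 0x35, and sketch_put_le16 on a valid two-byte buffer stores 0x35 followed by 0x0c.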

jcolebaker
Posts: 60
Joined: Thu Mar 18, 2021 12:23 am

Re: CRASH in NimBLE host task

Postby jcolebaker » Wed Dec 08, 2021 10:43 pm

FYI, for anyone who has a similar problem: we identified a bug in the NimBLE stack. See this PR:

https://github.com/apache/mynewt-nimble/pull/1012

Not sure if or when the fix will be added to the ESP-IDF source; if it's not there, it is easy enough to apply it yourself.
