Heap corruption diagnostics causing heap corruption?.

permal
Posts: 384
Joined: Sun May 14, 2017 5:36 pm

Re: Heap corruption diagnostics causing heap corruption?.

Postby permal » Tue Feb 06, 2018 8:44 pm

ESP_Angus wrote: However, a number of class functions (for example operator= & operator[]) call cJSON functions which free data as part of their behaviour (cJSON_ReplaceItem* or cJSON_DeleteItem*). In the case that other references to these cJSON data structures exist, this looks to me like it could cause dangling pointers to freed memory.
So I had a look. You're correct that if you hold on to objects that have been replaced/erased you're going down a sad path. Just like cJSON, Value is not thread safe. However, none of the uses hold onto any objects, nor does it access it from multiple threads or delete items.
ESP_Angus wrote:Or you could look for a pure C++ JSON library which will be easier to integrate with C++ usage patterns.
jsoncpp seems like the obvious choice, but I feel cJSON is more lightweight, though I haven't actually tried. Also cJSON ships with IDF so it's one less dependency.

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Re: Heap corruption diagnostics causing heap corruption?.

Postby ESP_Angus » Wed Feb 07, 2018 3:34 am

permal wrote:
ESP_Angus wrote: However, a number of class functions (for example operator= & operator[]) call cJSON functions which free data as part of their behaviour (cJSON_ReplaceItem* or cJSON_DeleteItem*). In the case that other references to these cJSON data structures exist, this looks to me like it could cause dangling pointers to freed memory.
So I had a look. You're correct that if you hold on to objects that have been replaced/erased you're going down a sad path. Just like cJSON, Value is not thread safe. However, none of the uses hold onto any objects, nor does it access it from multiple threads or delete items.
OK, that's good to know. Probably the best I can help you from the IDF side at this point is if you can give me a test project (with source) that I can reproduce without specialised hardware. Sorry I can't be of more help.
ESP_Angus wrote:Or you could look for a pure C++ JSON library which will be easier to integrate with C++ usage patterns.
jsoncpp seems like the obvious choice, but I feel cJSON is more lightweight, though I haven't actually tried. Also cJSON ships with IDF so it's one less dependency.
I haven't looked closely at all, but you could also check out rapidJSON: http://rapidjson.org/

According to the blurb a "JSON Value" occupies 16 bytes only. The cJSON struct itself is 36 bytes, so there's potentially some room for savings there...

caseymdk
Posts: 15
Joined: Wed Feb 07, 2018 2:35 am

Re: Heap corruption diagnostics causing heap corruption?.

Postby caseymdk » Wed Feb 07, 2018 3:52 am

Hi guys!

Boy was I happy to see your post. I've been tearing my hair out all morning trying to isolate a heap corruption error I was getting. It matches the description you guys are seeing very closely: I'm getting a bad tail error, with comprehensive heap corruption detection on, which occurs after calling cJSON_Print. I had commented out a lot of my code trying to isolate it, and as it stands now, my code runs in a loop creating and destroying cJSON objects, calling heap_caps_check_integrity_all(true) each time. I get no errors for about 50 cycles, then no corrupt heap before cJSON_Print(), but heap corruption after cJSON_Print(), and then a crash when I try to free the chunk->data object later in the loop.

I just recently started experiencing this, and given that this issue was posted only a few days ago, I suspect it may be because of a recent change?

A snippet of the code around where the issue happens:

Code: Select all

		...
		printf("Before:\n");
		heap_caps_check_integrity_all(true);
		chunk->data = (uint8_t*)cJSON_Print(json_chunk); // Set attributes of this chunk
        	printf("After:\n");
		heap_caps_check_integrity_all(true);
		...
		
This results in:

Code: Select all

Before:
After:
Before:
After:
....... (repeats 50 times)......

Before:
After:
Before:
After:
Before:
After:
CORRUPT HEAP: Bad tail at 0x3ffe0cc5. Expected 0xbaad5678 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffe0cd0. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Block 0xfefefefc is outside heap (last valid block 0x3ffe0ccc)
Before:
CORRUPT HEAP: Bad tail at 0x3ffe0cc5. Expected 0xbaad5678 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffe0cd0. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Block 0xfefefefc is outside heap (last valid block 0x3ffe0ccc)
After:
CORRUPT HEAP: Bad tail at 0x3ffe0cc5. Expected 0xbaad5678 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffe0cd0. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Block 0xfefefefc is outside heap (last valid block 0x3ffe0ccc)
CORRUPT HEAP: Bad tail at 0x3ffe0cc5. Expected 0xbaad5678 got 0xfefefefe
assertion "head != NULL" failed: file "/home/me/esp/esp-idf/components/heap/./multi_heap_poisoning.c", line 199, function: multi_heap_free
abort() was called at PC 0x400d346f on core 0
0x400d346f: __assert_func at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c:63 (discriminator 8)


Backtrace: 0x40087c64:0x3ffb55b0 0x40087e07:0x3ffb55d0 0x400d346f:0x3ffb55f0 0x4008788b:0x3ffb5620 0x400826be:0x3ffb5640 0x40082add:0x3ffb5660 0x4000bec7:0x3ffb5680 0x400d32fa:0x3ffb56a0 0x400d2b39:0x3ffb56d0 0x400d0c3a:0x3ffb5f30
0x40087c64: invoke_abort at /home/me/esp/esp-idf/components/esp32/./panic.c:648

0x40087e07: abort at /home/me/esp/esp-idf/components/esp32/./panic.c:648

0x400d346f: __assert_func at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c:63 (discriminator 8)

0x4008788b: multi_heap_free at /home/me/esp/esp-idf/components/heap/./multi_heap_poisoning.c:313

0x400826be: heap_caps_free at /home/me/esp/esp-idf/components/heap/./heap_caps.c:401

0x40082add: _free_r at /home/me/esp/esp-idf/components/newlib/./syscalls.c:42

0x400d32fa: receipt_buffer_add at <snip> receipt_buffer.c:601

0x400d2b39: app_main at <snip> main.c:137

0x400d0c3a: main_task at /home/me/esp/esp-idf/components/esp32/./cpu_start.c:455

There is a fair bit of code in my program, and while I know there could be an issue that I'm causing myself, the similarity and timing of this thread makes me think otherwise. I am running the latest git clone of the esp-idf.

I will see if I can get this to happen with a test case.
Last edited by caseymdk on Wed Feb 07, 2018 7:25 am, edited 1 time in total.

permal
Posts: 384
Joined: Sun May 14, 2017 5:36 pm

Re: Heap corruption diagnostics causing heap corruption?.

Postby permal » Wed Feb 07, 2018 6:59 am

ESP_Angus wrote: OK, that's good to know. Probably the best I can help you from the IDF side at this point is if you can give me a test project (with source) that I can reproduce without specialised hardware. Sorry I can't be of more help.
You're being plenty helpful.
According to the blurb a "JSON Value" occupies 16 bytes only. The cJSON struct itself is 36 bytes, so there's potentially some room for savings there...
Oh, that is interesting. Though I'd rather solve the current issue without exchanging parts, at least confirm where the problem actually is located.

I'm looking into mocking the hardware in my project so that you can run it on any ESP32.

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Re: Heap corruption diagnostics causing heap corruption?.

Postby ESP_Angus » Wed Feb 07, 2018 7:16 am

Hi casey,
caseymdk wrote: I get no errors for about 50 cycles, then no corrupt heap before cJSON_Print(), but heap corruption after cJSON_Print(), and then a crash when I try to free the chunk->data object later in the loop.
This is starting to seem awfully convenient!
I just recently started experiencing this, and given that this issue was posted only a few days ago, I suspect it may be because of a recent change?
Did you see the post about cJSON 1.6.0 having a buffer overflow bug, fixed in 1.7.1? The latest master branch of ESP-IDF has 1.7.1, but it was only updated a couple of days ago with this commit: https://github.com/espressif/esp-idf/co ... b4f110d6ba

caseymdk
Posts: 15
Joined: Wed Feb 07, 2018 2:35 am

Re: Heap corruption diagnostics causing heap corruption?.

Postby caseymdk » Wed Feb 07, 2018 7:23 am

I did see the post about the bug in 1.6.0. I am running 1.7.1 while experiencing this bug. I checked out the IDF when it was using 1.6.0 of cJSON, and did not get the error...but did get a different one. I am not sure if it is related to the bug in 1.6.0, I will take a look tomorrow and post what I find. It did seem to be something very separate though: while still related to cJSON_Print, it did not cause any heap corruption errors. Rather, a function far down the stack trace in the cJSON library was being sent an invalid pointer as a parameter, if I remember correctly.

permal
Posts: 384
Joined: Sun May 14, 2017 5:36 pm

Re: Heap corruption diagnostics causing heap corruption?.

Postby permal » Wed Feb 07, 2018 7:29 am

Oh, I missed Casey's post - welcome to the club :)

So, another indication that cJSON_print() is to blame, that is very interesting. The question is now if the issue actually occurs in cJSON_print, or if it happens earlier.

Casey, are you running things on multiple threads? I.e. is this issue depending on timing?

caseymdk
Posts: 15
Joined: Wed Feb 07, 2018 2:35 am

Re: Heap corruption diagnostics causing heap corruption?.

Postby caseymdk » Wed Feb 07, 2018 7:46 am

permal wrote:Oh, I missed Casey's post - welcome to the club :)
Great to be here....? ;)
permal wrote:Casey, are you running things on multiple threads? I.e. is this issue depending on timing?
Nope, turned all other tasks off as I was investigating. When I turn on a resource monitor task, which prints all tasks and runtime, I get the same error, at nearly the same address. Indicates that it's not timing related...I think. As you experienced, I also do not get the error with "light impact" mode turned on - only comprehensive.

Were you running similar code that was working, until the 1.7.1 update? I had my code at a fairly stable point about a week ago...I'll see if I can do a git checkout, and run it with the most recent IDF. See what that yields.

I am going to do the slightly irresponsible thing, and continue my development with heap corruption detection off. Maybe my code will crash as I continue development and give us another nugget of info.

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Re: Heap corruption diagnostics causing heap corruption?.

Postby ESP_Angus » Wed Feb 07, 2018 8:05 am

I had a quick review of cJSON_Print() and saw a couple of places it may corrupt memory (to my eyes). Reported here: https://github.com/DaveGamble/cJSON/issues/241 - but I think this is not the issue you are both seeing, unless you're using the cJSON_Raw type a lot (I think you have to be calling cJSON_CreateRaw() for this to happen).

EDIT: Actually only one place, other one was me misreading the code.

permal
Posts: 384
Joined: Sun May 14, 2017 5:36 pm

Re: Heap corruption diagnostics causing heap corruption?.

Postby permal » Wed Feb 07, 2018 9:14 am

caseymdk wrote: Nope, turned all other tasks off as I was investigating. When I turn on a resource monitor task, which prints all tasks and runtime, I get the same error, at nearly the same address. Indicates that it's not timing related...I think. As you experienced, I also do not get the error with "light impact" mode turned on - only comprehensive.
I've been trying to turn off tasks and I2C communication in order to get a more easily diagnosed app, but in doing so the issue disappears. So, in my case it appears to be depending on timing, but it is not conclusive.
caseymdk wrote: Were you running similar code that was working, until the 1.7.1 update? I had my code at a fairly stable point about a week ago...I'll see if I can do a git checkout, and run it with the most recent IDF. See what that yields.
I had the same problems with 1.6 too, though I'm not certain when they actually started to appear. Right now it always happens at start up.
ESP_Angus wrote:I had a quick review of cJSON_Print()
Unfortunately, I'm not using cJSON_CreatRaw().

Who is online

Users browsing this forum: No registered users and 97 guests