ESP32 - GPIO speed lower than expected

Alizame

ESP32 - GPIO speed lower than expected

Post by Alizame » Tue Apr 04, 2017 3:16 pm

Hi guys :)

I recently got my ESP32 (it's Rev0, I already checked that) and it runs fine, except that I don't get why the maximum toggle frequency for GPIOs seems to be 4 MHz.
[Attachment: w1ts w1tc 4mhz.png]
Is this really the maximum? The peripheral bus clock rate is 80 MHz, so shouldn't it be 40 MHz? Or are the slow register addresses (as mentioned in the ECO PDF) now the standard?

I'm running the latest IDF (from GitHub) and my project is based on the esp-idf-template (again from GitHub).
My toggle function runs on Core1 (as a thread pinned to it).
I use GPIO.out_w1ts and GPIO.out_w1tc to set/clear the GPIO.

I have tried to simplify the code as far as possible to eliminate any logic errors in it.

The critical code section:

Code:

void toggle(void *pvParameter) {
	printf("toggle called.\n");
	portDISABLE_INTERRUPTS();

	while (1) {
		GPIO.out_w1ts |= (1 << GPIO_NUM_23);
		__asm__ __volatile__("nop;nop;nop;nop;nop;nop;nop;"); // Bug workaround (I found this snippet somewhere in this forum)

		GPIO.out_w1tc |= (1 << GPIO_NUM_23);
		__asm__ __volatile__("nop;nop;nop;nop;nop;nop;nop;");
	}

	// GPIO.out doesn't make any difference
	/* ... */

	portENABLE_INTERRUPTS();
}
My complete main.c file: https://pastebin.com/9eYRUMVs , maybe you can check for errors :)

So why do I want to toggle a GPIO as fast as possible? Why not...
I'm trying to get data as fast as possible into 4 x 8 D flip-flops (4 x 74AHCT574), so I have 8 data lines and 4 clock lines.
I'm new to the ESP32, so I don't yet know exactly how some of the peripherals/DMA work, and I thought this would be a good start.
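
For the 8 data lines described above, one way to keep the number of register writes per byte constant is to precompute a set mask and a clear mask. This is only a minimal sketch: the pin numbers in data_pins are purely hypothetical (the thread does not say which GPIOs are used), and masks_for_byte is an illustrative helper, not an ESP-IDF function.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_BITS 8

/* Hypothetical wiring: data bit i is connected to GPIO data_pins[i]. */
static const uint8_t data_pins[DATA_BITS] = {12, 13, 14, 15, 16, 17, 18, 19};

typedef struct {
    uint32_t set;   /* value to store to the write-1-to-set register   */
    uint32_t clear; /* value to store to the write-1-to-clear register */
} gpio_masks_t;

/* Split one data byte into the pins that must go high and the pins that
 * must go low, so the whole byte is applied with two stores. */
static gpio_masks_t masks_for_byte(uint8_t value) {
    gpio_masks_t m = {0, 0};
    for (int i = 0; i < DATA_BITS; i++) {
        assert(data_pins[i] < 32); /* only GPIO0..31 live in this register */
        uint32_t bit = UINT32_C(1) << data_pins[i];
        if (value & (1u << i)) {
            m.set |= bit;
        } else {
            m.clear |= bit;
        }
    }
    return m;
}
```

On the target, m.set would be stored to GPIO.out_w1ts and m.clear to GPIO.out_w1tc, followed by a pulse on the clock line of the flip-flop being loaded.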

I am thankful for any help and maybe you can point me in the right direction to solve my problem.

PS: I'm sorry for any spelling/grammar errors, I'm not a native speaker.

Hans Dorn

Re: ESP32 - GPIO speed lower than expected

Post by Hans Dorn » Wed Apr 05, 2017 1:25 am

Hi Alizame,

As far as I understand the w1ts/w1tc registers, you don't have to read them back to set/reset GPIO bits; a plain write is enough.
Try this version:

Code:

void toggle(void *pvParameter) {
   printf("toggle called.\n");
   portDISABLE_INTERRUPTS();

   while (1) {
      GPIO.out_w1ts = (1 << GPIO_NUM_23);
      __asm__ __volatile__("nop;nop;nop;nop;nop;nop;nop;"); // Bug workaround (I found this snippet somewhere in this forum)

      GPIO.out_w1tc = (1 << GPIO_NUM_23);
      __asm__ __volatile__("nop;nop;nop;nop;nop;nop;nop;");
   }

   // GPIO.out doesn't make any difference
   /* ... */

   portENABLE_INTERRUPTS();
}
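
Why a plain store is enough can be seen in a small host-side model of write-1-to-set and write-1-to-clear ports. This is a sketch of the register semantics only: gpio_model_t and the model_write_* helpers are hypothetical names, not the ESP-IDF GPIO struct, and nothing here touches real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the GPIO output latch, one bit per pin. */
typedef struct {
    uint32_t out;
} gpio_model_t;

/* A store to W1TS sets every pin whose bit is 1; zero bits are ignored. */
static void model_write_w1ts(gpio_model_t *g, uint32_t value) {
    g->out |= value;
}

/* A store to W1TC clears every pin whose bit is 1; zero bits are ignored. */
static void model_write_w1tc(gpio_model_t *g, uint32_t value) {
    g->out &= ~value;
}

/* Single-pin mask, valid for pins 0..31. */
static uint32_t pin_mask(unsigned pin) {
    assert(pin < 32);
    return UINT32_C(1) << pin;
}
```

Because the hardware already combines the written value with the current pin state, `GPIO.out_w1ts = mask` changes only the selected pins. The `|=` form adds a register read before each write, which costs extra bus time without changing the result.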

rsimpsonbusa

Re: ESP32 - GPIO speed lower than expected

Post by rsimpsonbusa » Wed Apr 05, 2017 1:43 am

Tried it with a logic analyzer, same result. Very slow. STM32F4 is 100 MHz.
[Attachment: Screen Shot 2017-04-04 at 8.41.38 PM.png]

ESP_Angus

Re: ESP32 - GPIO speed lower than expected

Post by ESP_Angus » Wed Apr 05, 2017 4:31 am

For pin twiddling, you can get to 10MHz if you remove the "bug workaround":

Code:

while (1) {
    GPIO.out_w1ts = (1 << TogglePin);
    GPIO.out_w1tc = (1 << TogglePin);
}
(No workaround is necessary here, the R0 silicon bug only triggers if you write to the same register multiple times in a row, but this code is writing two registers - OUT_W1TS & OUT_W1TC - alternately.)

A quick peek at the generated assembly shows that's pretty close to optimal:

Code:

xtensa-esp32-elf-objdump -S build/app-template.elf

Code:

        while (1) {
                GPIO.out_w1ts = (1 << TogglePin);
400f1d10:       7a0d81          l32r    a8, 400d0544 <_flash_cache_start+0x52c>
400f1d13:       7a0d91          l32r    a9, 400d0548 <_flash_cache_start+0x530>
400f1d16:       0020c0          memw
400f1d19:       2899            s32i.n  a9, a8, 8
                GPIO.out_w1tc = (1 << TogglePin);
400f1d1b:       0020c0          memw
400f1d1e:       3899            s32i.n  a9, a8, 12
400f1d20:       fffb06          j       400f1d10 <app_main+0x1c>
        ...
If you compile in Release mode then the extra compiler optimisation moves the jump target so it doesn't reload a8 & a9 each time around the loop, which doesn't actually change the result (due to CPU pipeline, buffered writes, slower I/O bus).

If you rewrite in inline assembler then you can even take out the memory barrier "memw" instructions for absolute minimum code size, but this also doesn't change the result - the CPU waits for the previous write to each register to complete before it writes there again.
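
The figures in this thread can be turned into a time per register write. The following is a minimal sketch, assuming the 80 MHz peripheral bus clock mentioned in the first post and that each square-wave period costs exactly two writes (one to OUT_W1TS, one to OUT_W1TC). The function names are made up for illustration.

```c
#include <assert.h>

/* One full period of the output needs two register writes (set, clear),
 * so the time per write is 1 / (2 * toggle frequency), here in ns. */
static double ns_per_write(double toggle_hz) {
    assert(toggle_hz > 0.0);
    return 1e9 / (2.0 * toggle_hz);
}

/* The same time expressed in cycles of a reference clock. */
static double cycles_per_write(double toggle_hz, double clock_hz) {
    assert(toggle_hz > 0.0);
    return clock_hz / (2.0 * toggle_hz);
}
```

Under these assumptions, the 10 MHz result corresponds to 50 ns, or 4 bus cycles, per write, and the originally measured 4 MHz to 125 ns, or 10 bus cycles. The 40 MHz expected in the first post would require every write to complete in a single bus cycle.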

We realise that 10MHz is not terribly fast, however using a fast CPU to bit-bang values is also much more fiddly than bit-banging on a simple 8-bit microcontroller. ESP32 has peripherals which help with this:
  • The RMT Peripheral is nominally designed to send/receive IR remote control signals, but can send/receive arbitrary pulse sequences. For example here is a project that uses it to drive addressable LED strings.
  • The SPI buses all function in DMA mode when using the IDF included driver, so you can clock data in/out at up to 80MHz.
  • The I2S peripheral can be used in a parallel mode to drive a single clock and multiple data inputs/output pins, also via DMA. I don't think a general-purpose driver is available for this yet. Ivan's camera example shows how to do it for clocking parallel data in, I'm not sure if a similar example exists for parallel data output yet.
If you can offload the bit-twiddling work to the peripherals, then the CPU can spend its time handling WiFi, Bluetooth, high-level logic, etc.

Alizame

Re: ESP32 - GPIO speed lower than expected

Post by Alizame » Wed Apr 05, 2017 2:04 pm

A new problem arrived :D
code: https://pastebin.com/7ZC6Y4yU

If I write to GPIO.out_w1ts directly (like "GPIO.out_w1ts = (1 << 23);"), my Core0 crashes or something like that (TG1WDT_SYS_RESET happens and Core0 doesn't print that it got to my loop function, but the GPIO gets set correctly until the ESP resets).
If I write it like "GPIO.out_w1ts |= (1 << 23);", everything works fine (but slower). Also, GPIO.out_w1tc works perfectly as expected.

Serial Output:
ets Jun 8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
ets Jun 8 2016 00:22:57

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0x00
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0008,len:8
load:0x3fff0010,len:3384
load:0x40078000,len:7428
load:0x40080000,len:252
entry 0x40080034
I (43) boot: ESP-IDF v2.0-rc1-387-g47b8f78 2nd stage bootloader
I (43) boot: compile time 15:13:16
I (44) boot: Enabling RNG early entropy source...
I (64) boot: SPI Speed : 80MHz
I (76) boot: SPI Mode : DIO
I (89) boot: SPI Flash Size : 4MB
I (101) boot: Partition Table:
I (112) boot: ## Label Usage Type ST Offset Length
I (135) boot: 0 nvs WiFi data 01 02 00009000 00006000
I (158) boot: 1 phy_init RF data 01 01 0000f000 00001000
I (182) boot: 2 factory factory app 00 00 00010000 00100000
I (205) boot: End of partition table
I (218) boot: Disabling RNG early entropy source...
I (235) boot: Loading app partition at offset 00010000
I (787) boot: segment 0: paddr=0x00010018 vaddr=0x00000000 size=0x0ffe8 ( 65512)
I (788) boot: segment 1: paddr=0x00020008 vaddr=0x3f400010 size=0x07004 ( 28676) map
I (804) boot: segment 2: paddr=0x00027014 vaddr=0x3ffb0000 size=0x023ac ( 9132) load
I (832) boot: segment 3: paddr=0x000293c8 vaddr=0x40080000 size=0x00400 ( 1024) load
I (857) boot: segment 4: paddr=0x000297d0 vaddr=0x40080400 size=0x1994c (104780) load
I (913) boot: segment 5: paddr=0x00043124 vaddr=0x400c0000 size=0x00000 ( 0) load
I (914) boot: segment 6: paddr=0x0004312c vaddr=0x00000000 size=0x0cedc ( 52956)
I (934) boot: segment 7: paddr=0x00050010 vaddr=0x400d0018 size=0x377ac (227244) map
I (960) cpu_start: Pro cpu up.
I (971) cpu_start: Starting app cpu, entry point is 0x40080b08
I (0) cpu_start: App cpu up.
I (1003) heap_alloc_caps: Initializing. RAM available for dynamic allocation:
I (1026) heap_alloc_caps: At 3FFAE2A0 len 00001D60 (7 KiB): DRAM
I (1046) heap_alloc_caps: At 3FFB7448 len 00028BB8 (162 KiB): DRAM
I (1068) heap_alloc_caps: At 3FFE0440 len 00003BC0 (14 KiB): D/IRAM
I (1089) heap_alloc_caps: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (1111) heap_alloc_caps: At 40099D4C len 000062B4 (24 KiB): IRAM
I (1131) cpu_start: Pro cpu start user code
I (1188) cpu_start: Starting scheduler on PRO CPU.
I (202) cpu_start: Starting scheduler on APP CPU.
I (202) wifi: wifi firmware version: 1f2a9e1
I (202) wifi: config NVS flash: enabled
I (202) wifi: config nano formating: disabled
I (202) wifi: Init dynamic tx buffer num: 32
I (202) wifi: wifi driver task: 3ffbd1d4, prio:23, stack:3584
I (202) wifi: Init static rx buffer num: 10
I (202) wifi: Init dynamic rx buffer num: 0
I (212) wifi: Init rx ampdu len mblock:7
I (212) wifi: Init lldesc rx ampdu entry mblock:4
I (212) wifi: wifi power manager task: 0x3ffc257c prio: 21 stack: 2560
I (222) wifi: wifi timer task: 3ffc35fc, prio:22, stack:3584
I (252) phy: phy_version: 350, Mar 22 2017, 15:02:06, 0, 0
I (252) wifi: mode : sta (24:0a:c4:07:8a:84)
toggle called.
ets Jun 8 2016 00:22:57

rst:0x8 (TG1WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0x00
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0008,len:8
load:0x3fff0010,len:3384
load:0x40078000,len:7428
load:0x40080000,len:252
entry 0x40080034
I (264) boot: ESP-IDF v2.0-rc1-387-g47b8f78 2nd stage bootloader
I (264) boot: compile time 15:13:16
I (264) boot: Enabling RNG early entropy source...
I (285) boot: SPI Speed : 80MHz
I (298) boot: SPI Mode : DIO
I (311) boot: SPI Flash Size : 4MB
I (323) boot: Partition Table:
I (335) boot: ## Label Usage Type ST Offset Length
I (357) boot: 0 nvs WiFi data 01 02 00009000 00006000
I (381) boot: 1 phy_init RF data 01 01 0000f000 00001000
I (404) boot: 2 factory factory app 00 00 00010000 00100000
I (427) boot: End of partition table
I (440) boot: Disabling RNG early entropy source...
I (457) boot: Loading app partition at offset 00010000
I (1009) boot: segment 0: paddr=0x00010018 vaddr=0x00000000 size=0x0ffe8 ( 65512)
I (1010) boot: segment 1: paddr=0x00020008 vaddr=0x3f400010 size=0x07004 ( 28676) map
I (1026) boot: segment 2: paddr=0x00027014 vaddr=0x3ffb0000 size=0x023ac ( 9132) load
I (1055) boot: segment 3: paddr=0x000293c8 vaddr=0x40080000 size=0x00400 ( 1024) load
I (1080) boot: segment 4: paddr=0x000297d0 vaddr=0x40080400 size=0x1994c (104780) load
I (1137) boot: segment 5: paddr=0x00043124 vaddr=0x400c0000 size=0x00000 ( 0) load
I (1138) boot: segment 6: paddr=0x0004312c vaddr=0x00000000 size=0x0cedc ( 52956)
I (1158) boot: segment 7: paddr=0x00050010 vaddr=0x400d0018 size=0x377ac (227244) map
I (1184) cpu_start: Pro cpu up.
I (1195) cpu_start: Starting app cpu, entry point is 0x40080b08
I (0) cpu_start: App cpu up.
I (1228) heap_alloc_caps: Initializing. RAM available for dynamic allocation:
I (1251) heap_alloc_caps: At 3FFAE2A0 len 00001D60 (7 KiB): DRAM
I (1271) heap_alloc_caps: At 3FFB7448 len 00028BB8 (162 KiB): DRAM
I (1292) heap_alloc_caps: At 3FFE0440 len 00003BC0 (14 KiB): D/IRAM
I (1314) heap_alloc_caps: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (1335) heap_alloc_caps: At 40099D4C len 000062B4 (24 KiB): IRAM
I (1356) cpu_start: Pro cpu start user code
I (1413) cpu_start: Starting scheduler on PRO CPU.
I (202) cpu_start: Starting scheduler on APP CPU.
I (202) wifi: wifi firmware version: 1f2a9e1
I (202) wifi: config NVS flash: enabled
I (202) wifi: config nano formating: disabled
I (202) wifi: Init dynamic tx buffer num: 32
I (202) wifi: wifi driver task: 3ffbd1d4, prio:23, stack:3584
I (202) wifi: Init static rx buffer num: 10
I (202) wifi: Init dynamic rx buffer num: 0
I (212) wifi: Init rx ampdu len mblock:7
I (212) wifi: Init lldesc rx ampdu entry mblock:4
I (212) wifi: wifi power manager task: 0x3ffc257c prio: 21 stack: 2560
I (222) wifi: wifi timer task: 3ffc35fc, prio:22, stack:3584
I (252) phy: phy_version: 350, Mar 22 2017, 15:02:06, 0, 0
I (252) wifi: mode : sta (24:0a:c4:07:8a:84)
toggle called.
As you can see, the "loop called" message from the loop function never gets printed and the watchdog starves ;(

I hope you can help me again ;)

ESP_Angus

Re: ESP32 - GPIO speed lower than expected

Post by ESP_Angus » Thu Apr 06, 2017 12:16 am

Alizame wrote:
If I write to GPIO.out_w1ts directly (like "GPIO.out_w1ts = (1 << 23);"), my Core0 crashes or something like that (TG1WDT_SYS_RESET happens and Core0 doesn't print that it got to my loop function, but the GPIO gets set correctly until the ESP resets).
If I write it like "GPIO.out_w1ts |= (1 << 23);", everything works fine (but slower). Also, GPIO.out_w1tc works perfectly as expected.

I think this is probably something to do with the fact that the task is permanently disabling interrupts. When I was playing with that code it also regularly reset itself.

If you disable interrupts in chunks, and come up for a breather now and then (just re-enabling them and immediately disabling them should be enough) then it should be OK.
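
A rough sketch of that "breather" pattern follows. The chunk size and the toggle_in_chunks helper are hypothetical, and the #else branch contains host-side stubs (assumptions for illustration) so the logic can be compiled and checked off-target. On the ESP32 the macros map to the same GPIO.out_w1ts / GPIO.out_w1tc writes and portDISABLE_INTERRUPTS / portENABLE_INTERRUPTS calls used earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "freertos/FreeRTOS.h"
#include "soc/gpio_struct.h"
#define SET_PINS(mask)   (GPIO.out_w1ts = (mask))
#define CLEAR_PINS(mask) (GPIO.out_w1tc = (mask))
#else
/* Host-side stubs: count the enable/disable calls and model the pins. */
static unsigned irq_disable_count, irq_enable_count;
static uint32_t fake_out;
#define portDISABLE_INTERRUPTS() (irq_disable_count++)
#define portENABLE_INTERRUPTS()  (irq_enable_count++)
#define SET_PINS(mask)   (fake_out |= (mask))
#define CLEAR_PINS(mask) (fake_out &= ~(mask))
#endif

/* Emit `pulses` high/low pulses on the pins in `mask`. Interrupts stay
 * disabled for at most `chunk` pulses at a time and are then briefly
 * re-enabled so that pending interrupts get a chance to run. */
static void toggle_in_chunks(uint32_t mask, unsigned pulses, unsigned chunk) {
    assert(chunk > 0);
    while (pulses > 0) {
        unsigned n = pulses < chunk ? pulses : chunk;
        portDISABLE_INTERRUPTS();
        for (unsigned i = 0; i < n; i++) {
            SET_PINS(mask);
            CLEAR_PINS(mask);
        }
        portENABLE_INTERRUPTS(); /* the breather */
        pulses -= n;
    }
}
```

Each breather leaves a gap in the output waveform, so the chunk size is a trade-off between an uninterrupted burst and how long interrupts stay blocked. A suitable value would have to be found by experiment.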

Alizame

Re: ESP32 - GPIO speed lower than expected

Post by Alizame » Thu Apr 06, 2017 11:00 am

OK, I can confirm that it has something to do with disabling the interrupts.
But is there any explanation why it only happens if I write to w1ts directly (and not with an OR), while everything works fine for w1tc?

Btw: is there any ETA on when the I2S parallel driver/documentation will be available?
