Long thread, and I can't say I have read every suggestion, but mine are about fixing the "broken" parts of the ESP32 before adding "more stuff" (with the possible exception of more usable GPIOs on the physical chip).
* Interrupt latency is quite poor. I don't understand why. If 8-bit microcontrollers can enter an interrupt in very few clocks (the 8031 hardware only pushes the return address), then why is the ESP32 so slow? If it is related to being compatible with C ISR functions, then we (the community) should look at how that can be fixed. But if it is in silicon, I urge Espressif to look into it.
* I2S: the Technical Reference Manual mentions that the FIFO buffer can be written to and read from via registers. But ESP-IDF has not mapped that out, and I can't get that direct access to work, so I suspect it was left out of ESP-IDF because the silicon has a bug. Should be fixed.
* Predictability. In a recent project, I was able to lock down the startup really tightly, so that I could sync the SPI and an MCPWM timer to within one clock, repeatable on every reset. BUT adding some code elsewhere, code that is not even executing yet, threw the carefully adjusted sync values off by hundreds of clocks, for no apparent reason. I only got a vague "Internal architecture problem" answer, and gave up.
* SPI CS handling in DMA mode, so that continuous DMA (one that never ends) can also drive CS low for N cycles. Useful for ADC/DAC streaming (audio, video, oscilloscopes, etc.) where the external chip requires a sync clock, latch pulse, start pulse, etc.
* A small FIFO on SPI that is directly accessible, without requiring DMA or slow setup, so the CPU can pump out (and read in) data word by word.
* Route (pre-scaled) clock(s) to a GPIO pin, without having to use the MCPWM or LED PWM peripheral.
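To make the pre-scaled clock item a bit more concrete, here is a rough sketch in plain C of the arithmetic I have in mind. It assumes an 80 MHz source clock (the usual ESP32 APB frequency) and an integer-only divider; both are assumptions for illustration. `SRC_CLK_HZ`, `prescaler_for` and `actual_hz` are made-up names, not ESP-IDF APIs or real registers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 80 MHz source clock (the usual ESP32 APB frequency). */
#define SRC_CLK_HZ 80000000u

/* Hypothetical helper: nearest integer divider for a wanted output frequency. */
static uint32_t prescaler_for(uint32_t target_hz)
{
    assert(target_hz > 0 && target_hz <= SRC_CLK_HZ);
    /* Add half the target before dividing so the result rounds to nearest. */
    return (SRC_CLK_HZ + target_hz / 2) / target_hz;
}

/* Frequency an integer divider would actually put on the pin (truncated to Hz). */
static uint32_t actual_hz(uint32_t divider)
{
    assert(divider > 0);
    return SRC_CLK_HZ / divider;
}
```

For example, `prescaler_for(8000000)` gives 10 and an exact 8 MHz output, while a 12.288 MHz audio master clock rounds to a divider of 7 and comes out at roughly 11.43 MHz, so an integer-only divider would not cover every use case.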
For the more exotic "next generation" chip and "keep dreaming" list:
* Should be called ESP42, because 42 is the meaning of life.
* I would like to see 4 user SPI ports. An interesting use case: neural networks, with thousands of ESP32s connected in a symmetric mesh. The idea was introduced by the Transputer in the 1980s, but it was way too costly ($400 just for the CPU) for most application areas back then.
* On-chip LoRa-compatible radio, and ESP-IDF support for LoRa and LoRaWAN. I only care about the 868/915 MHz bands.
* RISC-V is possibly a better choice in the long run, and probably easier for users to dig into than the Tensilica core's assembler, which is really hard to get enough information about.
* Many others mention ADC accuracy and both ADC and DAC precision/speed, and I agree.
* And of course, a SoC/PICO version... Love the PICO!