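Integrating offline speech recognition (ASR) and TTS: out-of-memory error during memory mapping, TTS dictionary cannot be mapped into the data space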

wangpengXX
Posts: 8
Joined: Thu Apr 22, 2021 4:05 am

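Integrating offline speech recognition (ASR) and TTS: out-of-memory error during memory mapping, TTS dictionary cannot be mapped into the data space

Post by wangpengXX » Tue Jul 05, 2022 9:37 am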

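I am integrating TTS and ASR and found that the recognition model and the TTS dictionary conflict when they are mapped. I suspect the cause is that only 4 MB of data space can be mapped. Switching to a 16 MB module did not help, so the flash size should not be the problem. I traced the 0x101 error to the page-mapping function, at the point where start == end. Without the recognition model, TTS works fine together with the wake-word model. How can this be solved? I hope Espressif can offer a solution. The dictionary and the model should each be around 3 MB, right? How exactly is the speech recognition model loaded? Is it also mapped?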
    end1 = region_begin + region_size - page_count + 1;
    for (start1 = region_begin; start1 < end1; ++start1) {
        int pageno = 0;
        int pos;
        DPORT_INTERRUPT_DISABLE();
        for (pos = start1; pos < start1 + page_count; ++pos, ++pageno) {
            int table_val = (int) DPORT_SEQUENCE_REG_READ((uint32_t)&DPORT_PRO_FLASH_MMU_TABLE[pos]);
            uint8_t refcnt = s_mmap_page_refcnt[pos];
            if (refcnt != 0 && table_val != pages[pageno]) {
                break;
            }
        }
        DPORT_INTERRUPT_RESTORE();
        // whole mapping range matched, bail out
        if (pos - start1 == page_count) {
            break;
        }
    }
    // checked all the region(s) and haven't found anything?
    if (start1 == end1) {
        *out_handle = 0;
        *out_ptr = NULL;
        ret = ESP_ERR_NO_MEM;
    }

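Re: Integrating offline speech recognition (ASR) and TTS: out-of-memory error during memory mapping, TTS dictionary cannot be mapped into the data space

Post by wangpengXX » Tue Jul 05, 2022 9:42 am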

I forgot to post the logs, so here they are. This is the log with the speech recognition model removed:
--- idf_monitor on COM6 115200 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
ets Jul 29 2019 12:21:46

rst:0x1 (POWERON_RESET),boot:0x3f (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:6960
ho 0 tail 12 room 4
load:0x40078000,len:11928
ho 0 tail 12 room 4
load:0x40080400,len:6624
entry 0x40080720
I (69) boot: Chip Revision: 3
I (76) boot_comm: chip revision: 3, min. bootloader chip revision: 0
I (43) boot: ESP-IDF v3.3.2-107-g722043f73-dirty 2nd stage bootloader
I (44) boot: compile time 16:40:24
I (45) boot: Enabling RNG early entropy source...
I (50) qio_mode: Enabling default flash chip QIO
I (55) boot: SPI Speed : 80MHz
I (59) boot: SPI Mode : QIO
I (63) boot: SPI Flash Size : 16MB
I (68) boot: Partition Table:
I (71) boot: ## Label Usage Type ST Offset Length
I (78) boot: 0 nvs WiFi data 01 02 00009000 00006000
I (86) boot: 1 phy_init RF data 01 01 0000f000 00001000
I (93) boot: 2 factory factory app 00 00 00010000 00300000
I (101) boot: 3 flash_test Unknown data 01 81 00310000 00084000
I (108) boot: 4 voice_data Unknown data 01 81 00400000 00300000
I (116) boot: End of partition table
I (120) boot_comm: chip revision: 3, min. application chip revision: 0
I (127) esp_image: segment 0: paddr=0x00010020 vaddr=0x3f400020 size=0x41168 (266600) map
I (204) esp_image: segment 1: paddr=0x00051190 vaddr=0x3ffb0000 size=0x03b18 ( 15128) load
I (208) esp_image: segment 2: paddr=0x00054cb0 vaddr=0x40080000 size=0x00400 ( 1024) load
0x40080000: _WindowOverflow4 at D:/ESP/esp-idf/esp-adf/esp-idf/components/freertos/xtensa_vectors.S:1779

I (211) esp_image: segment 3: paddr=0x000550b8 vaddr=0x40080400 size=0x0af58 ( 44888) load
I (233) esp_image: segment 4: paddr=0x00060018 vaddr=0x400d0018 size=0x48e9c (298652) map
0x400d0018: _flash_cache_start at ??:?

I (308) esp_image: segment 5: paddr=0x000a8ebc vaddr=0x4008b358 size=0x038c0 ( 14528) load
0x4008b358: __sfvwrite_r at /home/jeroen/esp8266/esp32/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdio/../../../.././newlib/libc/stdio/fvwrite.c:204

I (323) boot: Loaded app from partition at offset 0x10000
I (323) boot: Disabling RNG early entropy source...
I (323) psram: This chip is ESP32-D0WD
I (328) spiram: Found 64MBit SPI RAM device
I (332) spiram: SPI RAM mode: flash 40m sram 40m
I (338) spiram: PSRAM initialized, cache is in low/high (2-core) mode.
I (345) cpu_start: Pro cpu up.
I (349) cpu_start: Application information:
I (353) cpu_start: Project name: speech_recognition_example
I (360) cpu_start: App version: v2.3-6-g8121aa2-dirty
I (366) cpu_start: Compile time: Jul 4 2022 16:41:02
I (372) cpu_start: ELF file SHA256: d8241a8e76f4c41c...
I (378) cpu_start: ESP-IDF: v3.3.2-107-g722043f73-dirty
I (385) cpu_start: Starting app cpu, entry point is 0x40081410
0x40081410: call_start_cpu1 at D:/ESP/esp-idf/esp-adf/esp-idf/components/esp32/cpu_start.c:267

I (0) cpu_start: App cpu up.
I (1262) spiram: SPI SRAM memory test OK
I (1263) heap_init: Initializing. RAM available for dynamic allocation:
I (1263) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (1269) heap_init: At 3FFB4F90 len 0002B070 (172 KiB): DRAM
I (1276) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (1282) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (1289) heap_init: At 4008EC18 len 000113E8 (68 KiB): IRAM
I (1295) cpu_start: Pro cpu start user code
I (1300) spiram: Adding pool of 4096K of external SPI memory to heap allocator
I (202) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
I (203) spiram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (237) uart: queue free spaces: 20
I (238) gpio: GPIO[22]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (240) example_asr_keywords: Initialize SR wn handle
Quantized wakeNet5: wakeNet5_v1_nihaoxiaozhi_6_0.983_0.95, mode:0 (Oct 14 2020 16:26:17)
I (253) example_asr_keywords: keywords: nihaoxiaozhi (index = 1)
Fail: index is out of range, the min index is 1, the max index is 1
I (260) example_asr_keywords: keywords_num = 1, threshold = 0.000000, sample_rate = 16000, chunksize = 480, sizeof_uint16 = 2
I (278) example_asr_keywords: [ 1 ] Start codec chip
I (284) example_asr_keywords: [ 2.0 ] Create audio pipeline for recording
I (291) example_asr_keywords: [ 2.1 ] Create i2s stream to read audio data from codec chip
I (303) example_asr_keywords: [ 2.2 ] Create filter to resample audio data
I (308) example_asr_keywords: [ 2.3 ] Create raw to receive data
I (314) example_asr_keywords: [ 3 ] Register all elements to audio pipeline
I (322) example_asr_keywords: [ 4 ] Link elements together [codec_chip]-->i2s_stream-->filter-->raw-->[SR]
E (333) gpio: gpio_install_isr_service(412): GPIO isr service already installed
r
xx

----------------------------- ESP Audio Platform -----------------------------
| |
| ESP_AUDIO-v1.7.0-037bef3-09be8fe |
| Compile date: May 8 2021-17:38:00 |
------------------------------------------------------------------------------
W (386) I2S: I2S driver already installed
I (386) example_asr_keywords: esp_audio instance is:0x3f804c38

I (393) example_asr_keywords: [ 5 ] Start audio_pipeline
300000
0
5
17
inti voice set:template
ESP Chinese TTS v1.0 (Nov 20 2020 10:35:43)
欢迎使用语音合成

This is the log with wake word, recognition, and TTS all in use:
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:6960
ho 0 tail 12 room 4
load:0x40078000,len:11928
ho 0 tail 12 room 4
load:0x40080400,len:6624
entry 0x40080720
I (69) boot: Chip Revision: 3
I (76) boot_comm: chip revision: 3, min. bootloader chip revision: 0
I (43) boot: ESP-IDF v3.3.2-107-g722043f73-dirty 2nd stage bootloader
I (43) boot: compile time 16:40:24
I (45) boot: Enabling RNG early entropy source...
I (50) qio_mode: Enabling default flash chip QIO
I (55) boot: SPI Speed : 80MHz
I (59) boot: SPI Mode : QIO
I (63) boot: SPI Flash Size : 16MB
I (67) boot: Partition Table:
I (71) boot: ## Label Usage Type ST Offset Length
I (78) boot: 0 nvs WiFi data 01 02 00009000 00006000
I (86) boot: 1 phy_init RF data 01 01 0000f000 00001000
I (93) boot: 2 factory factory app 00 00 00010000 00300000
I (101) boot: 3 flash_test Unknown data 01 81 00310000 00084000
I (108) boot: 4 voice_data Unknown data 01 81 00400000 00300000
I (116) boot: End of partition table
I (120) boot_comm: chip revision: 3, min. application chip revision: 0
I (127) esp_image: segment 0: paddr=0x00010020 vaddr=0x3f400020 size=0x17407c (1523836) map
I (520) esp_image: segment 1: paddr=0x001840a4 vaddr=0x3ffb0000 size=0x03c4c ( 15436) load
I (525) esp_image: segment 2: paddr=0x00187cf8 vaddr=0x40080000 size=0x00400 ( 1024) load
0x40080000: _WindowOverflow4 at D:/ESP/esp-idf/esp-adf/esp-idf/components/freertos/xtensa_vectors.S:1779

I (527) esp_image: segment 3: paddr=0x00188100 vaddr=0x40080400 size=0x07f10 ( 32528) load
I (546) esp_image: segment 4: paddr=0x00190018 vaddr=0x400d0018 size=0x4c930 (313648) map
0x400d0018: _flash_cache_start at ??:?

I (625) esp_image: segment 5: paddr=0x001dc950 vaddr=0x40088310 size=0x06988 ( 27016) load
0x40088310: illegal_instruction_helper at D:/ESP/esp-idf/esp-adf/esp-idf/components/esp32/panic.c:715
(inlined by) xt_unhandled_exception at D:/ESP/esp-idf/esp-adf/esp-idf/components/esp32/panic.c:363

I (643) boot: Loaded app from partition at offset 0x10000
I (643) boot: Disabling RNG early entropy source...
I (643) psram: This chip is ESP32-D0WD
I (648) spiram: Found 64MBit SPI RAM device
I (653) spiram: SPI RAM mode: flash 40m sram 40m
I (658) spiram: PSRAM initialized, cache is in low/high (2-core) mode.
I (665) cpu_start: Pro cpu up.
I (669) cpu_start: Application information:
I (674) cpu_start: Project name: speech_recognition_example
I (680) cpu_start: App version: v2.3-6-g8121aa2-dirty
I (686) cpu_start: Compile time: Jul 4 2022 16:41:02
I (693) cpu_start: ELF file SHA256: 8fb1b1985dce9b11...
I (699) cpu_start: ESP-IDF: v3.3.2-107-g722043f73-dirty
I (705) cpu_start: Starting app cpu, entry point is 0x40081410
0x40081410: call_start_cpu1 at D:/ESP/esp-idf/esp-adf/esp-idf/components/esp32/cpu_start.c:267

I (690) cpu_start: App cpu up.
I (1583) spiram: SPI SRAM memory test OK
I (1583) heap_init: Initializing. RAM available for dynamic allocation:
I (1584) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (1590) heap_init: At 3FFB50D8 len 0002AF28 (171 KiB): DRAM
I (1596) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (1603) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (1609) heap_init: At 4008EC98 len 00011368 (68 KiB): IRAM
I (1615) cpu_start: Pro cpu start user code
I (1620) spiram: Adding pool of 4096K of external SPI memory to heap allocator
I (75) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
I (76) spiram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (110) uart: queue free spaces: 20
I (111) gpio: GPIO[22]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (113) example_asr_keywords: Initialize SR wn handle
Quantized wakeNet5: wakeNet5_v1_nihaoxiaozhi_6_0.983_0.95, mode:0 (Oct 14 2020 16:26:17)
I (126) example_asr_keywords: keywords: nihaoxiaozhi (index = 1)
Fail: index is out of range, the min index is 1, the max index is 1
I (133) example_asr_keywords: keywords_num = 1, threshold = 0.000000, sample_rate = 16000, chunksize = 480, sizeof_uint16 = 2
SINGLE_RECOGNITION: 2_0 MN1_4; core: 0; (May 15 2020 14:50:27)
SHIFT: 8, 12, 17, 17, 19, 17, 6, 16, 15, 14,
I (165) example_asr_keywords: keywords_num = 166 , sample_rate = 16000, chunksize = 480, sizeof_uint16 = 2
I (171) example_asr_keywords: [ 1 ] Start codec chip
I (177) example_asr_keywords: [ 2.0 ] Create audio pipeline for recording
I (184) example_asr_keywords: [ 2.1 ] Create i2s stream to read audio data from codec chip
I (196) example_asr_keywords: [ 2.2 ] Create filter to resample audio data
I (200) example_asr_keywords: [ 2.3 ] Create raw to receive data
I (207) example_asr_keywords: [ 3 ] Register all elements to audio pipeline
I (215) example_asr_keywords: [ 4 ] Link elements together [codec_chip]-->i2s_stream-->filter-->raw-->[SR]
E (226) gpio: gpio_install_isr_service(412): GPIO isr service already installed
r
xx

----------------------------- ESP Audio Platform -----------------------------
| |
| ESP_AUDIO-v1.7.0-037bef3-09be8fe |
| Compile date: May 8 2021-17:38:00 |
------------------------------------------------------------------------------
W (278) I2S: I2S driver already installed
I (279) example_asr_keywords: esp_audio instance is:0x3f80c300

I (286) example_asr_keywords: [ 5 ] Start audio_pipeline
300000
Couldn't map voice data partition!
101
17
17
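The two boot logs differ most visibly in segment 0 of the app image, the read-only data mapped at vaddr 0x3f400020: it is 0x41168 bytes without the recognition model and 0x17407c bytes with it. The sketch below is a host-side estimate, not ESP-IDF code. It assumes 64 KB MMU pages and a 64-entry (4 MB) data window, and the helper name pages_spanned is made up for illustration. It counts how many entries that segment would occupy before the TTS dictionary is mapped.

```c
#include <assert.h>
#include <stdint.h>

#define MMU_PAGE_SIZE     0x10000u  /* 64 KB per MMU entry (assumed) */
#define DATA_REGION_PAGES 64u       /* 4 MB data window (assumed) */

/* Number of 64 KB pages touched by a segment that starts at `vaddr`
 * and is `size` bytes long. Hypothetical helper, not an ESP-IDF API. */
static uint32_t pages_spanned(uint32_t vaddr, uint32_t size)
{
    assert(size > 0);
    uint32_t first = vaddr / MMU_PAGE_SIZE;
    uint32_t last = (vaddr + size - 1) / MMU_PAGE_SIZE;
    return last - first + 1;
}
```

Under these assumptions, pages_spanned(0x3f400020, 0x41168) is 5 and pages_spanned(0x3f400020, 0x17407c) is 24, which leaves 59 and 40 free entries. The 0x300000-byte voice_data partition needs 48 entries, so it would fit only in the first build. The values 5 and 24 also match the start positions reported in this thread.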

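Re: Integrating offline speech recognition (ASR) and TTS: out-of-memory error during memory mapping, TTS dictionary cannot be mapped into the data space

Post by wangpengXX » Wed Jul 06, 2022 9:40 am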

Code: Select all

esp_err_t IRAM_ATTR spi_flash_mmap_pages(const int *pages, size_t page_count, spi_flash_mmap_memory_t memory,
                         const void** out_ptr, spi_flash_mmap_handle_t* out_handle)
{
    esp_err_t ret;
    bool need_flush = false;
    if (!page_count) {
        return ESP_ERR_INVALID_ARG;
    }
    if (!esp_ptr_internal(pages)) {
        return ESP_ERR_INVALID_ARG;
    }
    for (int i = 0; i < page_count; i++) {
        if (pages[i] < 0 || pages[i]*SPI_FLASH_MMU_PAGE_SIZE >= g_rom_flashchip.chip_size) {
            return ESP_ERR_INVALID_ARG;
        }
    }
    mmap_entry_t* new_entry = (mmap_entry_t*) heap_caps_malloc(sizeof(mmap_entry_t), MALLOC_CAP_INTERNAL|MALLOC_CAP_8BIT);
    if (new_entry == 0) {
        
        return ESP_ERR_NO_MEM;
    }

    spi_flash_disable_interrupts_caches_and_other_cpu();

    spi_flash_mmap_init();
    // figure out the memory region where we should look for pages
    int region_begin;   // first page to check
    int region_size;    // number of pages to check
    uint32_t region_addr;  // base address of memory region
    get_mmu_region(memory,&region_begin,&region_size,&region_addr); // 48 pages, 3 MB
    if (region_size < page_count) {
       
        return ESP_ERR_NO_MEM;
    }
    // The following part searches for a range of MMU entries which can be used.
    // Algorithm is essentially naïve strstr algorithm, except that unused MMU
    // entries are treated as wildcards.
    // int start;
    // the " + 1" is a fix when loop the MMU table pages, because the last MMU page
    // is valid as well if it have not been used
    // end1 = region_begin + region_size - page_count + 1;
    end1 = 64;
    for (start1 = region_begin; start1 < end1; ++start1) {
        int pageno = 0;
        int pos;
        DPORT_INTERRUPT_DISABLE();
        for (pos = start1; pos < start1 + page_count; ++pos, ++pageno) {
            int table_val = (int) DPORT_SEQUENCE_REG_READ((uint32_t)&DPORT_PRO_FLASH_MMU_TABLE[pos]);
            uint8_t refcnt = s_mmap_page_refcnt[pos]; 
            if (refcnt != 0 && table_val != pages[pageno]) {
                break;
            }
        }
        DPORT_INTERRUPT_RESTORE();
        // whole mapping range matched, bail out
        if (pos - start1 == page_count) {
            break;
        }
    }
    // checked all the region(s) and haven't found anything?
    if (start1 == end1) {
       
        *out_handle = 0;
        *out_ptr = NULL;
        ret = ESP_ERR_NO_MEM;
    } else {
        // set up mapping using pages
        uint32_t pageno = 0;
        DPORT_INTERRUPT_DISABLE();
        for (int i = start1; i != start1 + page_count; ++i, ++pageno) {
            // sanity check: we won't reconfigure entries with non-zero reference count
            uint32_t entry_pro = DPORT_SEQUENCE_REG_READ((uint32_t)&DPORT_PRO_FLASH_MMU_TABLE[i]);
            uint32_t entry_app = DPORT_SEQUENCE_REG_READ((uint32_t)&DPORT_APP_FLASH_MMU_TABLE[i]);
            assert(s_mmap_page_refcnt[i] == 0 ||
                    (entry_pro == pages[pageno] &&
                     entry_app == pages[pageno]));
            if (s_mmap_page_refcnt[i] == 0) {
                if (entry_pro != pages[pageno] || entry_app != pages[pageno]) {
                    DPORT_PRO_FLASH_MMU_TABLE[i] = pages[pageno];
                    DPORT_APP_FLASH_MMU_TABLE[i] = pages[pageno];
                    need_flush = true;
                }
            }
            ++s_mmap_page_refcnt[i];
        }
        DPORT_INTERRUPT_RESTORE();
        LIST_INSERT_HEAD(&s_mmap_entries_head, new_entry, entries);
        new_entry->page = start1;
        new_entry->count = page_count;
        new_entry->handle = ++s_mmap_last_handle;
        *out_handle = new_entry->handle;
        *out_ptr = (void*) (region_addr + (start1 - region_begin) * SPI_FLASH_MMU_PAGE_SIZE);
        ret = ESP_OK;
    }

    /* This is a temporary fix for an issue where some
       cache reads may see stale data.

       Working on a long term fix that doesn't require invalidating
       entire cache.
    */
    if (need_flush) {
#if CONFIG_SPIRAM_SUPPORT
        esp_spiram_writeback_cache();
#endif
        Cache_Flush(0);
        Cache_Flush(1);
    }

    spi_flash_enable_interrupts_caches_and_other_cpu();
    if (*out_ptr == NULL) {
        free(new_entry);
    }
    return ret;
}
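After changing end1 to 64 in this function, the TTS handle can be created and digits can be synthesized, but speech cannot. The printed start1 is 24. I don't quite understand this part: with a 4 MB mapping space, was the original function only looking at the remaining 1 MB? Originally start1 = 5 and end1 = 17. I assume this code scans page by page and returns the page number once it finds a usable one, is that right? Could you explain this part? Also, with this change, does Chinese synthesis fail because there is not enough space, so the dictionary is not completely mapped?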
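As a rough illustration of what the search loop in the function above does, here is a simplified host-side model. It is a sketch under assumptions, not the ESP-IDF implementation: it assumes a 64-entry data window starting at 0x3F400000 with 64 KB pages, it ignores the case where an entry already holds the requested flash page, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define REGION_PAGES 64           /* data-window MMU entries (assumed) */
#define PAGE_SIZE    0x10000u     /* 64 KB per entry (assumed) */
#define REGION_BASE  0x3F400000u  /* vaddr of entry 0, as in the boot log */

/* Simplified first-fit: return the first index where `count` consecutive
 * entries are unused, or -1 if no such run exists inside the window. */
static int find_free_run(const uint8_t *refcnt, int count)
{
    int end = REGION_PAGES - count + 1;  /* exclusive bound on start */
    for (int start = 0; start < end; ++start) {
        int n = 0;
        while (n < count && refcnt[start + n] == 0) {
            ++n;
        }
        if (n == count) {
            return start;
        }
    }
    return -1;
}

/* Address one past the end of a mapping that begins at entry `start`. */
static uint32_t mapping_end(int start, int count)
{
    return REGION_BASE + (uint32_t)(start + count) * PAGE_SIZE;
}

/* Mark the first `used` entries as taken, as the app's rodata would. */
static void occupy_prefix(uint8_t *refcnt, int used)
{
    for (int i = 0; i < REGION_PAGES; ++i) {
        refcnt[i] = (i < used) ? 1 : 0;
    }
}
```

In this model, a 48-page request finds start 5 when 5 entries are in use and finds nothing when 24 are in use, because the bound 64 - 48 + 1 = 17 is reached first. That corresponds to start1 == end1 and ESP_ERR_NO_MEM (0x101). Forcing end1 = 64 lets start1 = 24 pass the check, but mapping_end(24, 48) is 0x3F880000, which lies past the end of the 4 MB window at 0x3F800000. Addresses from 0x3F800000 upward belong to external RAM (the log shows the esp_audio instance at 0x3f80c300), so the last 8 pages of the dictionary would probably not be readable through the returned pointer. That may be why only part of the synthesis works.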

tempo.tian
Posts: 39
Joined: Wed Jun 22, 2022 12:10 pm

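Re: Integrating offline speech recognition (ASR) and TTS: out-of-memory error during memory mapping, TTS dictionary cannot be mapped into the data space

Post by tempo.tian » Thu Jul 07, 2022 12:44 pm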

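Judging from the definitions, the ESP32 can mmap at most 4 MB of data space for user use.
Each additional mmap call reduces the space that is still available for mapping.
The TTS code maps 3 MB by default, so it should succeed if nothing has been mapped beforehand, and it will fail if something else has already been mapped.

To run speech recognition and wake word alongside TTS, you can try time-sharing: after wake-up, release those resources and then run TTS;
release TTS before running wake word again.

If you can use the ESP32-S3, there is one more option: the speech recognition model can be placed on an SD card or in SPIFFS to avoid taking up so much space, and both can then run at the same time.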
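The time-sharing idea can be sketched with a small host-side model of the mapping table. This is only an illustration under assumptions: the 64-entry window, the 19-page "model" mapping, and every name below are hypothetical. In the logs above, the recognition model appears to sit in the app's rodata segment, which cannot simply be unmapped at run time, so the pattern applies to data mapped explicitly from a partition. On the target, the corresponding ESP-IDF calls would be spi_flash_munmap() to release a mapping and spi_flash_mmap_get_free_pages() to check what is left.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REGION_PAGES 64  /* assumed size of the data window, in pages */

typedef struct {
    uint8_t refcnt[REGION_PAGES];
} page_table_t;

typedef struct {
    int start;  /* first entry of the mapping, -1 if not mapped */
    int count;  /* number of entries */
} mapping_t;

static void table_init(page_table_t *t)
{
    memset(t, 0, sizeof *t);
}

static int free_pages(const page_table_t *t)
{
    int n = 0;
    for (int i = 0; i < REGION_PAGES; ++i) {
        n += (t->refcnt[i] == 0);
    }
    return n;
}

/* First-fit allocation of `count` consecutive unused entries. */
static mapping_t map_pages(page_table_t *t, int count)
{
    mapping_t m = { -1, 0 };
    for (int start = 0; start + count <= REGION_PAGES; ++start) {
        int n = 0;
        while (n < count && t->refcnt[start + n] == 0) {
            ++n;
        }
        if (n == count) {
            memset(&t->refcnt[start], 1, (size_t)count);
            m.start = start;
            m.count = count;
            return m;
        }
    }
    return m;  /* no room: the caller must release something first */
}

/* Release a mapping so that its entries can be reused. */
static void unmap_pages(page_table_t *t, mapping_t *m)
{
    if (m->start < 0) {
        return;
    }
    memset(&t->refcnt[m->start], 0, (size_t)m->count);
    m->start = -1;
    m->count = 0;
}
```

In this model, with 5 pages of rodata and a 19-page model mapped, a 48-page TTS request fails. After unmap_pages() on the model it succeeds at entry 5, leaving 11 free pages.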