Page 1 of 1

Half-Duplex QSPI (Quad SPI/QIO) with DMA [IDFGH-3571]

Posted: Mon Jun 29, 2020 4:13 pm
by xieliwei
Hi all, I'm trying to get QSPI working with DMA, but can't seem to get the read phase to work at all.

From the documentation, the phrasing is somewhat ambiguous:
Half duplex mode is not compatible with DMA when both writing and reading phases exist.
I'm taking that to mean that it is not compatible when both read and write phases exist in a transaction as opposed to a device? And anyway since only half-duplex is possible in QSPI, this is the only choice I have.

Currently I'm connecting to an FPGA emulating a QSPI flash device. So in every transaction there will be a command, followed by address, then read or write phase.

Without DMA, everything works fine, but I am limited to 64 bytes per transaction.
With DMA, write works (I see the changed data being returned by the device on read using a logic analyser), but read only performs correctly for the first word (32-bits) of the first transaction.

To make it clear, each transaction is either read or write only (excluding the command and address fields, which are sent to the device every transaction, but which I assume does not count as being part of the "write phase")

To get a better sense of what's happening, I re-initialise the read buffer to a non-zero value after each transaction. It seems like the first word is always written to but everything else is untouched. Of all the transactions, only the first word of the first read transaction returns the correct value, further transactions return zero.

Thinking this may be a silicon bug (I read something about DMA bugs, but I don't think they're relevant), I also tested the same code on a 32E device with no difference in behaviour. Although I am not sure if v3.2.3 of the IDF currently bundled with ESP32 Arduino knows of the 32E silicon version.

I also have a logic analyser hooked up to the interface, and the QSPI flash device the FPGA is emulating is behaving correctly on both read and write. It might be worth noting that the flash device operates in mode 3.

This is the test program I have written, it uses the IOMUX pins for HSPI.

Code: Select all

// Include the libraries we need
#include "esp_heap_caps.h"
#include "driver/spi_master.h"

#define F_CLK 34
#define CRESET_B 32

#define F_3_SPI_P 14
#define F_3_SPI_N 12

#define F_10_SPI_P 13
#define F_10_SPI_N 4

#define F_26_SPI_P 2
#define F_26_SPI_N 15

//////// QSPI Master stuff
// In addr terms, not bytes
//#define SPI_TEST_LEN (SPI_MAX_DMA_LEN/2)
#define SPI_TEST_LEN 1024
//#define SPI_TEST_LEN 16

#define SPI_MOSI F_10_SPI_P
#define SPI_MISO F_3_SPI_N
#define SPI_WP F_26_SPI_P
#define SPI_HD F_10_SPI_N
#define SPI_CLK F_3_SPI_P
#define SPI_CS F_26_SPI_N

#define SPI_HOST HSPI_HOST
#define SPI_FREQ SPI_MASTER_FREQ_8M
//#define SPI_FREQ 10000

spi_device_handle_t dev;

// In addr terms, not bytes, +1 in case SPI_TEST_LEN is not even
#define SPI_BUFLEN (SPI_TEST_LEN+1)
#define SPI_QUEUELEN 6

// TX-RX DMA Buffers
uint16_t* spi_buf[SPI_QUEUELEN];
volatile uint8_t currbuf = 0;
// Data attached to transaction to specify which buffer it is using
uint8_t bufnum[SPI_QUEUELEN];

volatile bool spi_gotdata = false;
// Tracks number of transactions in flight
// Used to figure out when we have space for a new transaction
volatile uint8_t num_dat = 0;

//Called right before a transaction is started
void IRAM_ATTR SPI_pre_trans_cb(spi_transaction_t *trans) {
}

//Called after transaction is completed
void IRAM_ATTR SPI_post_trans_cb(spi_transaction_t *trans) {
  // Signal to main thread if a read transaction has been completed
  if (trans->rx_buffer != NULL) {
    spi_gotdata = true;
  }
}

bool queue_transaction(spi_transaction_t  *t) {
  num_dat++;
  if (spi_device_queue_trans(dev, t, 0) != ESP_OK) {
    num_dat--;
    return false;
  }
  return true;
}

bool queue_write_transaction(uint8_t n, uint16_t addr, size_t len) {
  spi_transaction_t* t = new spi_transaction_t;
  // Ensure that data is a multiple of 4 and at least 8 bytes long
  // Since len is counting number of 16-bit words, we compare 0b1 and 4
  // instead of 0b11 and 8
  while (len & 0b1 || len < 4) len++;
  t->flags = SPI_TRANS_MODE_QIO | SPI_TRANS_MODE_DIOQIO_ADDR;
  t->cmd = 0b00110010; //0x32
  t->addr = addr;
  t->length = len << 4;
  t->rxlength = 0;
  t->tx_buffer = spi_buf[n];
  t->rx_buffer = NULL;
  t->user = &(bufnum[n]);
  return queue_transaction(t);
}

bool queue_read_transaction(uint8_t n, uint16_t addr, size_t len) {
  spi_transaction_t* t = new spi_transaction_t;
  len++;
  // Ensure that data is a multiple of 4 and at least 8 bytes long
  // Since len is counting number of 16-bit words, we compare 0b1 and 4
  // instead of 0b11 and 8
  while (len & 0b1 || len < 4) len++;
  t->flags = SPI_TRANS_MODE_QIO | SPI_TRANS_MODE_DIOQIO_ADDR;
  t->cmd = 0b01101011; //0x6B
  t->addr = addr;
  t->length = 0;
  t->rxlength = len << 4;
  t->tx_buffer = NULL;
  t->rx_buffer = spi_buf[n];
  t->user = &(bufnum[n]);
  return queue_transaction(t);
}

uint16_t get_next_addr(bool advance) {
  static uint16_t addr = 0;
  uint16_t ret = addr;
  if (advance) {
    addr += SPI_TEST_LEN;
  }
  if (advance && !(ret & 0x1FFF)) {
    Serial.println(F("\r\nWraparound!"));
  }
  return ret;
}

// Returns current pointer value and increment it (or warparound)
uint8_t increment_currbuf_pointer() {
  uint8_t ret = currbuf;
  currbuf++;
  if (currbuf >= SPI_QUEUELEN) currbuf = 0;
  return ret;
}

uint16_t* get_next_spi_buffer(uint8_t* n) {
  *n = increment_currbuf_pointer();
  return spi_buf[*n];
}

void setup_qspi() {
  // Populate bufnum
  for (int i = 0; i < SPI_QUEUELEN; i++) {
    bufnum[i] = i;
  }

  // Configuration for the SPI bus
  spi_bus_config_t buscfg;
  memset(&buscfg, 0, sizeof(buscfg));
  buscfg.mosi_io_num = (gpio_num_t)SPI_MOSI;
  buscfg.miso_io_num = (gpio_num_t)SPI_MISO;
  buscfg.sclk_io_num = (gpio_num_t)SPI_CLK;
  buscfg.quadwp_io_num = (gpio_num_t)SPI_WP;
  buscfg.quadhd_io_num = (gpio_num_t)SPI_HD;
  buscfg.max_transfer_sz = SPI_BUFLEN * 2;
  buscfg.flags = SPICOMMON_BUSFLAG_MASTER | SPICOMMON_BUSFLAG_QUAD | SPICOMMON_BUSFLAG_NATIVE_PINS;
  buscfg.intr_flags = 0;

  // Configuration of SPI device
  spi_device_interface_config_t devcfg;
  memset(&devcfg, 0, sizeof(devcfg));
  devcfg.command_bits = 8;
  devcfg.address_bits = 16;
  devcfg.dummy_bits = 0; // Or 16 in QSPI?
  devcfg.mode = 3;
  devcfg.clock_speed_hz = SPI_FREQ;
  devcfg.input_delay_ns = 0;
  devcfg.spics_io_num = (gpio_num_t)SPI_CS;
  devcfg.queue_size = SPI_QUEUELEN;
  devcfg.flags = SPI_DEVICE_HALFDUPLEX;
  devcfg.pre_cb = SPI_pre_trans_cb;
  devcfg.post_cb = SPI_post_trans_cb;

  pinMode(SPI_CS, OUTPUT);
  digitalWrite(SPI_CS, HIGH);
  pinMode(SPI_CLK, OUTPUT);
  digitalWrite(SPI_CLK, HIGH);
  pinMode(SPI_MOSI, INPUT_PULLUP);
  pinMode(SPI_MISO, INPUT_PULLUP);
  pinMode(SPI_WP, INPUT_PULLUP);
  pinMode(SPI_HD, INPUT_PULLUP);

  // Allocate DMA-capable (word-aligned DRAM) memory
  for (int i = 0; i < SPI_QUEUELEN; i++) {
    spi_buf[i] = (uint16_t*)heap_caps_malloc(SPI_BUFLEN * 2, MALLOC_CAP_DMA|MALLOC_CAP_8BIT|MALLOC_CAP_INTERNAL);

    if (spi_buf[i] == NULL) {
      Serial.println(F("Err, SPI buffer malloc failure! Restarting..."));
      ESP.restart();
    }
    memset(spi_buf[i], 0xBB, SPI_BUFLEN * 2);
  }

  if (spi_bus_initialize(SPI_HOST, &buscfg, 2) != ESP_OK) {
    Serial.println(F("Err, SPI failure! Restarting..."));
    ESP.restart();
  }

  if (spi_bus_add_device(SPI_HOST, &devcfg, &dev) != ESP_OK) {
    Serial.println(F("Err, SPI device add failure! Restarting..."));
    ESP.restart();
  }

  // We acquire the bus forever
  if (spi_device_acquire_bus(dev, portMAX_DELAY) != ESP_OK) {
    Serial.println(F("Err, SPI not ready! Restarting..."));
    ESP.restart();
  }
}

void gen_const_data(uint16_t* dat, size_t len, uint16_t val) {
  for (int i = 0; i < len; i++) {
    dat[i] = SPI_SWAP_DATA_TX(val, 16);
  }
}

void gen_test_data(uint16_t* dat, size_t len) {
  for (int i = 0; i < len; i++) {
    dat[i] = SPI_SWAP_DATA_TX(i + 1, 16);
  }
}

void gen_rand_data(uint16_t* dat, size_t len) {
  uint16_t randval;
  for (int i = 0; i < len; i++) {
    // Without radio, this is pseudo-random, but that's fine for testing
    randval = esp_random() & 0xFFFF;
    // We don't really need to swap for random data, but... eh
    dat[i] = SPI_SWAP_DATA_TX(randval, 16);
  }
}


void qspi_poll() {
  uint8_t n;
  int i;
  spi_transaction_t *tx;
  spi_transaction_t *rx;
  if (spi_gotdata) {
    spi_gotdata = 0;
    if (spi_device_get_trans_result(dev, &tx, 0) != ESP_OK) {
      Serial.println(F("Could not retrieve TX transaction?"));
      return;
    }
    if (spi_device_get_trans_result(dev, &rx, 0) != ESP_OK) {
      Serial.println(F("Could not retrieve RX transaction, unsynced! Reboot!"));
      ESP.restart();
    }

    // Make sure data is correct
    uint16_t* txd = (uint16_t*)(tx->tx_buffer);
    uint16_t* rxd = (uint16_t*)(rx->rx_buffer);
    size_t txl = (tx->length) >> 4;
    size_t rxl = (rx->rxlength) >> 4;
    // Skip the first received word during comparison as it is the dummy word
    if (memcmp(txd, rxd+1, SPI_TEST_LEN * 2) != 0) {
      Serial.print(F("\r\nFailed! "));
      Serial.print(txl);
      Serial.print(' ');
      Serial.print(rxl);
      Serial.print(F("\r\nTX:      "));
      for (i = 0; i < txl; i++) {
        //for (i = 0; i < 10; i++) {
        Serial.printf("%04X ", SPI_SWAP_DATA_RX(txd[i], 16));
      }
      Serial.print(F("\r\nRX: "));
      for (i = 0; i < rxl; i++) {
        //for (i = 1; i <= 10; i++) {
        Serial.printf("%04X ", SPI_SWAP_DATA_RX(rxd[i], 16));
      }
      Serial.println();
    } else {
      Serial.print('.');
    }

    // Overwrite RX memory (not necessary if things work properly)
    // We do this to show that DMA does not touch past the first word
    memset(rx->rx_buffer, 0xCE, (rx->rxlength) >> 3);

    delete tx;
    // Decrease num_dat
    num_dat--;
    delete rx;
    // Decrease num_dat
    num_dat--;
  }

  if (num_dat != SPI_QUEUELEN) {
    uint16_t* buf = get_next_spi_buffer(&n);
    if (n & 0b1) {
      // n is odd (read)
      if (!queue_read_transaction(n, get_next_addr(true), SPI_TEST_LEN)) {
        Serial.println(F("Err, unable to re-queue RX, rebooting..."));
        ESP.restart();
      }
    } else {
      // n is even (write)
      // Generate new test data
      //gen_const_data(buf, SPI_TEST_LEN, get_next_addr(false));
      gen_const_data(buf, SPI_TEST_LEN, 0x5AA5);
      //gen_test_data(buf, SPI_TEST_LEN);
      if (!queue_write_transaction(n, get_next_addr(false), SPI_TEST_LEN)) {
        Serial.println(F("Err, unable to re-queue TX, rebooting..."));
        ESP.restart();
      }
    }
  }
}

void setup() {
  pinMode(F_CLK, INPUT);

  pinMode(CRESET_B, INPUT_PULLUP);
  //digitalWrite(CRESET_B, HIGH);

  // SPI CLK
  pinMode(SPI_CLK, OUTPUT);
  digitalWrite(SPI_CLK, HIGH);
  pinMode(SPI_MOSI, INPUT);
  pinMode(SPI_MISO, INPUT);
  pinMode(SPI_WP, INPUT);
  pinMode(SPI_HD, INPUT);
  // SPI CS
  pinMode(SPI_CS, OUTPUT);
  digitalWrite(SPI_CS, HIGH);

  Serial.begin(115200);
  Serial.println(esp_get_idf_version());
  Serial.println(F("BOOT"));

  // Wait for FPGA to initialise
  delay(5000);

  Serial.println(F("SPI"));
  setup_qspi();

  Serial.println(F("READY"));
}

void loop() {
  qspi_poll();
}
Can anyone tell me what seems to be the problem? Or are the assumptions I have made about DMA capability with half-duplex and/or command+address not counting as "write phase" erroneous?

The closest issue I can find is this: https://github.com/espressif/esp-idf/issues/598
Which does seem to suggest that "MISO phase after a MOSI phase" is the cause, and probably the origin of the quoted text (above) in the IDF documentation. Is this the issue I am facing? In the github issue, the phrasing seems to more strongly suggest read and write for a device instead of within a transaction, which worries me.

Re: Half-Duplex QSPI (Quad SPI/QIO) with DMA [IDFGH-3571]

Posted: Tue Jun 30, 2020 2:28 am
by ESP_Alvin
Moderator's note: edit the topic title for issue tracking, thanks for reporting.

Re: Half-Duplex QSPI (Quad SPI/QIO) with DMA [IDFGH-3571]

Posted: Tue Jun 30, 2020 3:08 am
by axellin
ESP_Alvin wrote:
Tue Jun 30, 2020 2:28 am
Moderator's note: edit the topic title for issue tracking, thanks for reporting.
It looks like IDFGH-3571 does not exist on https://github.com/espressif/esp-idf/issues
How to find IDFGH-3571 on the github issue tracking?

Re: Half-Duplex QSPI (Quad SPI/QIO) with DMA [IDFGH-3571]

Posted: Tue Jun 30, 2020 8:43 am
by xieliwei
axellin wrote:
Tue Jun 30, 2020 3:08 am
It looks like IDFGH-3571 does not exist on https://github.com/espressif/esp-idf/issues
How to find IDFGH-3571 on the github issue tracking?
I believe it is an internal tracking number. I wasn't sure if I should be opening a github issue as I can't confirm if I'm not making a mistake here. Do you have a similar issue?

Re: Half-Duplex QSPI (Quad SPI/QIO) with DMA [IDFGH-3571]

Posted: Tue Jun 30, 2020 9:02 am
by ESP_Alvin
Thanks for the feedback, the number is only for internal tracking. We will look into the issue. Thanks.

Re: Half-Duplex QSPI (Quad SPI/QIO) with DMA [IDFGH-3571]

Posted: Tue Jun 30, 2020 1:27 pm
by axellin
ESP_Alvin wrote:
Tue Jun 30, 2020 9:02 am
Thanks for the feedback, the number is only for internal tracking. We will look into the issue. Thanks.
I see several IDFGH-xxxx tag in esp-idf commit log but I'm not able to find
the link of IDFGH-xxxx. So sometimes it's difficult to know what issue was fixed
if suche tag is for your internal use.

Re: Half-Duplex QSPI (Quad SPI/QIO) with DMA [IDFGH-3571]

Posted: Wed Jul 01, 2020 1:36 am
by ESP_Alvin
Hi axellin,

Sorry for the confusion, will share updates if we have any progress for this issue. Thanks.