STM32F4 external SDRAM with HAL

The STM32F427/429 has a nice FMC (Flexible Memory Controller) peripheral, which is able to deal with all kinds of external memories. For now I want to add external SDRAM to my project to hold a large(ish) amount of data captured by sensors (images) for later retrieval over a slower UART. The 8 MB (64 Mbit) on the Discovery board is quite a bit, but it can hold only one raw 5-megapixel image, or perhaps several JPEGs. I would like to have a bit more, but let's figure out the setup on a DISCO board first, before trying to improve on that.

There are two ways of using external SDRAM:
 - kind of manual, where you use the HAL driver to issue read/write commands to the memory, which is uncomfortable to say the least if you want to use it as general-purpose memory.
 - convince the compiled code that it has more memory than the internal SRAM of the chip. More on this some other time though.


Hardware Setup

Memories are generally parallel devices with a fixed data bus width. The one on the DISCO board is an IS42S16400J, which has 4 internal banks, each containing 1M "cells" that are 16 bits wide. So our data bus will be 16 lines wide. 1M x 16 bit x 4 banks adds up to 64 Mbit, or 8 MBytes, of available memory.

These cells are laid out in a matrix and each one is accessed by a specific address. The address consists of a row number and a column number. The wider of the two (bitwise) defines the number of address lines required; usually that is the row number. In the datasheet the manufacturer defines row addresses A0-A11 and column addresses A0-A7, meaning that we will have 12 address bits, i.e. 12 address lines. The two parts are sent sequentially on the same address bus, first the row address, then the column address. Because both share the same pins, the "row address strobe" (RAS) and "column address strobe" (CAS) signals tell the memory which kind of address is currently on the bus. Without them a row address could not be told apart from a column address with the same bit pattern.
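The geometry above (12 row bits, 8 column bits, 4 banks, 16-bit cells) can be sanity-checked with a few lines of C. This is only a sketch; the function name sdram_capacity_bytes is made up for illustration and is not part of HAL.

```c
#include <assert.h>
#include <stdint.h>

/* Capacity of an SDRAM in bytes, derived from its geometry.
 * All four parameters are taken from the memory datasheet.
 * Hypothetical helper, not a HAL function. */
static uint64_t sdram_capacity_bytes(unsigned row_bits, unsigned col_bits,
                                     unsigned banks, unsigned data_width_bits)
{
    assert(data_width_bits % 8u == 0u);

    /* One bank is a matrix of 2^row_bits rows by 2^col_bits columns. */
    uint64_t cells_per_bank = 1ULL << (row_bits + col_bits);

    return cells_per_bank * banks * (data_width_bits / 8u);
}
```

For the IS42S16400J, sdram_capacity_bytes(12, 8, 4, 16) gives 8 * 1024 * 1024 bytes, matching the 8 MBytes worked out above.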

What does "4 internal banks" mean? It means that there are 4 memory matrices, each containing 1M cells. By default only one bank is used, unless you signal to the memory that you want to use another bank. This is done by a pair of signals, usually called "bank select" or "bank address" (BA0/BA1). Without using these you have only a quarter of your memory available.
Extra signals for clock enable (SDCKE) and chip select (SDNE) are used for enabling a particular memory in a multi-memory setup.
CAS, RAS and write enable (WE) in conjunction are used for sending commands to the memory: activate, read, write, precharge, refresh, mode register setup and such.
LDQM and UDQM (lower/upper byte input/output masks) select which byte of the 16-bit word takes part in a transfer: they mask bytes on writes and enable the outputs on reads.
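To make the RAS/CAS/WE combination more concrete, here is a small sketch of the standard SDR SDRAM command truth table as a lookup. The enum and function names are my own for illustration; the FMC generates these pin levels itself, so you never write this in firmware.

```c
#include <stdbool.h>

/* Commands of a standard SDR SDRAM (hypothetical names). */
typedef enum {
    SDRAM_CMD_LOAD_MODE_REGISTER,
    SDRAM_CMD_AUTO_REFRESH,
    SDRAM_CMD_PRECHARGE,
    SDRAM_CMD_ACTIVE,
    SDRAM_CMD_WRITE,
    SDRAM_CMD_READ,
    SDRAM_CMD_BURST_TERMINATE,
    SDRAM_CMD_NOP
} sdram_cmd_t;

/* Decode the levels sampled on a rising clock edge while chip select
 * is low. true means the (active-low) pin is at a high level. */
static sdram_cmd_t sdram_decode(bool ras, bool cas, bool we)
{
    static const sdram_cmd_t table[8] = {
        SDRAM_CMD_LOAD_MODE_REGISTER, /* RAS=L CAS=L WE=L */
        SDRAM_CMD_AUTO_REFRESH,       /* RAS=L CAS=L WE=H */
        SDRAM_CMD_PRECHARGE,          /* RAS=L CAS=H WE=L */
        SDRAM_CMD_ACTIVE,             /* RAS=L CAS=H WE=H */
        SDRAM_CMD_WRITE,              /* RAS=H CAS=L WE=L */
        SDRAM_CMD_READ,               /* RAS=H CAS=L WE=H */
        SDRAM_CMD_BURST_TERMINATE,    /* RAS=H CAS=H WE=L */
        SDRAM_CMD_NOP                 /* RAS=H CAS=H WE=H */
    };
    unsigned code = (ras ? 4u : 0u) | (cas ? 2u : 0u) | (we ? 1u : 0u);

    return table[code];
}
```

The address pins carry the row during ACTIVE and the column during READ/WRITE, which is how one set of pins serves both.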

Uhh, that's a lot of pins. Total of 38, if I'm not mistaken. Easy to eff something up. In addition to that, the traces should be more or less of equal length, because signal timings are rather tight (~11 ns).

Software Setup

To use external memory, you have to set up both parts:
  • MCU requires correct GPIO mappings and FMC peripheral setup
  • memory requires some configuration before use

Reference manual section 37.7.3 describes the steps required for SDRAM initialization. HAL takes care of the GPIO setup and the first 2 steps from the reference manual; afterwards you need to configure your particular memory. The STM32Cube firmware package has an example of doing all the steps in /Projects/STM32F429I-Discovery/Examples/FMC/FMC_SDRAM/. It looks nasty AF, but it works. I would like to understand what's happening there.
  1. System init with clocks and board support stuff
  2. hal_msp.c sets up memory GPIOs. OK, that's familiar
  3. FMC peripheral setup:
            // select bank. Note that these are 1-indexed
              hsdram.Init.SDBank             = FMC_SDRAM_BANK2;
            // length of column address. From memory datasheet A0-A7 equals 8 bits
              hsdram.Init.ColumnBitsNumber   = FMC_SDRAM_COLUMN_BITS_NUM_8;
            // length of row address. From memory datasheet A0-A11 equals 12 bits
              hsdram.Init.RowBitsNumber      = FMC_SDRAM_ROW_BITS_NUM_12;
            // data bus width in bits. It's 16 bit memory, so 16 is right
              hsdram.Init.MemoryDataWidth    = FMC_SDRAM_MEM_BUS_WIDTH_16;
            // count of internal memory banks (matrices)
              hsdram.Init.InternalBankNumber = FMC_SDRAM_INTERN_BANKS_NUM_4;
            // Column address strobe latency
              hsdram.Init.CASLatency         = FMC_SDRAM_CAS_LATENCY_3;
            // do we want to use write protection? This has to be unlocked manually, so no, not now
              hsdram.Init.WriteProtection    = FMC_SDRAM_WRITE_PROTECTION_DISABLE;
            // memory clock frequency divider
              hsdram.Init.SDClockPeriod      = FMC_SDRAM_CLOCK_PERIOD_2;
            // do we use bursts?
              hsdram.Init.ReadBurst          = FMC_SDRAM_RBURST_DISABLE;
            // Delay in read data path in system clock units
              hsdram.Init.ReadPipeDelay      = FMC_SDRAM_RPIPE_DELAY_1;
    
    Woah! Most of this stuff is pretty self-explanatory, but some of it isn't. First of all, we need to understand where the timings are coming from.
Timings come from the AHB interface clock, which is HCLK. Meaning that if we have a 180 MHz system clock, AHB will by default also run at 180 MHz. The SDRAM clock can be HCLK/2 or HCLK/3, in our case 90 MHz or 60 MHz. SDClockPeriod defines this divider, so we have 90 MHz set up for us. Each clock cycle will thus be roughly 11 ns long. Let's figure these parameters out:
  • CASLatency defines the delay, in SDRAM clock cycles, between a READ command and the moment data is available on the data bus. Check what the memory supports; in our case it's 2 or 3.
  • ReadPipeDelay is the extra delay, in HCLK cycles, that the FMC waits after the CAS latency before it reads the data.
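The clock arithmetic above is easy to reproduce. A minimal sketch, with made-up helper names, assuming HCLK is known and the divider is 2 or 3:

```c
#include <stdint.h>

/* SDRAM clock in Hz for a given HCLK and SDClockPeriod divider (2 or 3).
 * Hypothetical helper, not a HAL function. */
static uint32_t sdram_clock_hz(uint32_t hclk_hz, uint32_t divider)
{
    return hclk_hz / divider;
}

/* Length of one SDRAM clock cycle in picoseconds, rounded down. */
static uint32_t sdram_period_ps(uint32_t sdclk_hz)
{
    return (uint32_t)(1000000000000ULL / sdclk_hz);
}
```

With HCLK at 180 MHz and a divider of 2 this gives 90 MHz and a period of about 11.1 ns, which is the "1 unit" used for all timings that follow.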

   4. SDRAM timing setup. All of these parameters are given in memory clock cycles (1 unit = 11 ns) and should be at least as long as the timings specified by the memory manufacturer:
        // delay between a Load Mode Register command and an Active or Refresh command (ISSI Command Period (PRE to ACT): 15ns, ours: 22ns)
          SdramTiming.LoadToActiveDelay = 2;
        // delay from releasing the Self Refresh command to issuing the Activate command (ISSI Self-Refresh to Active Time: 70ns, ours: 77ns)
          SdramTiming.ExitSelfRefreshDelay = 7;
        // the minimum Self Refresh period (ISSI Command Period (ACT to PRE)? 42ns, ours: 44ns)
          SdramTiming.SelfRefreshTime = 4;
        // the delay between the Refresh command and the Activate command, and the delay between two consecutive Refresh commands
        // (ISSI Command Period (REF to REF / ACT to ACT): 63ns, ours: 77ns, could be reduced?)
          SdramTiming.RowCycleDelay = 7;
        // (I guess ISSI Data-in to PRECHARGE command: 2 clk, ours: 3)
          SdramTiming.WriteRecoveryTime = 3;
        // delay between a Precharge command and another command (ISSI Command Period (PRE to ACT): 15ns, ours: 22ns)
          SdramTiming.RPDelay = 2;
        // delay between the Activate command and a Read/Write command (ISSI Active Command To Read / Write Command Delay Time: 15ns, ours: 22ns)
          SdramTiming.RCDDelay = 2;
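The cycle counts in the timing setup can be derived from the datasheet nanosecond values by rounding up to whole SDRAM clock cycles. A rough sketch, where ns_to_cycles is a hypothetical helper and the nanosecond figures are the ones quoted in the comments above:

```c
#include <stdint.h>

/* Smallest number of whole SDRAM clock cycles that lasts at least t_ns.
 * Hypothetical helper, not a HAL function. */
static uint32_t ns_to_cycles(uint32_t t_ns, uint32_t sdclk_hz)
{
    /* t_ns * f is the cycle count scaled by 1e9; round up while dividing. */
    uint64_t scaled = (uint64_t)t_ns * sdclk_hz;

    return (uint32_t)((scaled + 999999999ULL) / 1000000000ULL);
}
```

At 90 MHz this gives 2 cycles for 15 ns, 4 for 42 ns and 7 for 70 ns, matching the values in the example. For 63 ns it gives 6, so if that is indeed the right datasheet figure for RowCycleDelay, the value of 7 leaves one cycle of margin.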

    5. Sending SDRAM commands to prepare memory for work. As I wrote earlier, these are a combination of RAS, CAS and WE toggles:
  static void SDRAM_Initialization_Sequence(SDRAM_HandleTypeDef *hsdram, FMC_SDRAM_CommandTypeDef *Command) {
    __IO uint32_t tmpmrd =0;
    /* Step 3:  Configure a clock configuration enable command */
    Command->CommandMode     = FMC_SDRAM_CMD_CLK_ENABLE;
    Command->CommandTarget    = FMC_SDRAM_CMD_TARGET_BANK2;
    Command->AutoRefreshNumber   = 1;
    Command->ModeRegisterDefinition = 0;

    /* Send the command */
    HAL_SDRAM_SendCommand(hsdram, Command, 0x1000);

    /* Step 4: Insert 100 ms delay */
    HAL_Delay(100);

    /* Step 5: Configure a PALL (precharge all) command */
    Command->CommandMode     = FMC_SDRAM_CMD_PALL;
    Command->CommandTarget       = FMC_SDRAM_CMD_TARGET_BANK2;
    Command->AutoRefreshNumber   = 1;
    Command->ModeRegisterDefinition = 0;

    /* Send the command */
    HAL_SDRAM_SendCommand(hsdram, Command, 0x1000);

    /* Step 6: Configure an Auto-Refresh command */
    Command->CommandMode     = FMC_SDRAM_CMD_AUTOREFRESH_MODE;
    Command->CommandTarget    = FMC_SDRAM_CMD_TARGET_BANK2;
    Command->AutoRefreshNumber   = 4;
    Command->ModeRegisterDefinition = 0;

    /* Send the command */
    HAL_SDRAM_SendCommand(hsdram, Command, 0x1000);

    /* Step 7: Program the external memory mode register */
    tmpmrd = (uint32_t)SDRAM_MODEREG_BURST_LENGTH_2          |
        SDRAM_MODEREG_BURST_TYPE_SEQUENTIAL   |
        SDRAM_MODEREG_CAS_LATENCY_3           |
        SDRAM_MODEREG_OPERATING_MODE_STANDARD |
        SDRAM_MODEREG_WRITEBURST_MODE_SINGLE;

    Command->CommandMode = FMC_SDRAM_CMD_LOAD_MODE;
    Command->CommandTarget    = FMC_SDRAM_CMD_TARGET_BANK2;
    Command->AutoRefreshNumber   = 1;
    Command->ModeRegisterDefinition = tmpmrd;

    /* Send the command */
    HAL_SDRAM_SendCommand(hsdram, Command, 0x1000);

    /* Step 8: Set the refresh rate counter */
    /* (15.62 us x Freq) - 20 */
    /* Set the device refresh counter */
    HAL_SDRAM_ProgramRefreshRate(hsdram, 0x056A);
  }
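The mode register word assembled in step 7 follows the usual SDR SDRAM layout: burst length in bits 2:0, burst type in bit 3, CAS latency in bits 6:4, operating mode in bits 8:7 and write burst mode in bit 9. A sketch of that encoding, with a function name of my own choosing (the SDRAM_MODEREG_* constants in the example are defined in the example's own header, not shown here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build an SDR SDRAM mode register value (standard operating mode).
 * Hypothetical helper illustrating the bit layout only. */
static uint32_t sdram_mode_register(unsigned burst_length, bool interleaved,
                                    unsigned cas_latency, bool single_write)
{
    uint32_t bl_code = 0u;

    /* Burst length is stored as a code: 1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3. */
    switch (burst_length) {
    case 1: bl_code = 0u; break;
    case 2: bl_code = 1u; break;
    case 4: bl_code = 2u; break;
    case 8: bl_code = 3u; break;
    default: assert(!"unsupported burst length"); break;
    }
    assert(cas_latency == 2u || cas_latency == 3u);

    return bl_code
         | (interleaved ? (1u << 3) : 0u)    /* bit 3: burst type        */
         | ((uint32_t)cas_latency << 4)      /* bits 6:4: CAS latency    */
         | (single_write ? (1u << 9) : 0u);  /* bit 9: write burst mode  */
}
```

The settings from step 7 (burst length 2, sequential, CAS latency 3, single-location writes) encode to 0x231. The CAS latency programmed here has to agree with hsdram.Init.CASLatency on the FMC side.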
What we are interested in here is at least the refresh counter. The datasheet specifies Refresh Cycle Time (4096) as 64 ms, meaning that 4096 refreshes have to happen during that period. This implies that a single refresh should happen every 64 ms / 4096, or about every 15.625 us. The FMC_SDRTR register description tells us that the value to be put there should be (refresh period x SDRAM clock frequency) - 20. For us that is (15.625 us x 90 MHz) - 20 = 1386.25, or hex 0x56a, which is the value we have.
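The same calculation in integer C, as a sketch. The helper name is hypothetical; the inputs are the refresh period and row count from the datasheet and the SDRAM clock.

```c
#include <stdint.h>

/* Value for the refresh counter:
 * (refresh period per row x SDRAM clock) - 20, rounded down.
 * Hypothetical helper, not a HAL function. */
static uint32_t sdram_refresh_count(uint32_t refresh_period_ms,
                                    uint32_t rows, uint32_t sdclk_hz)
{
    /* SDRAM clock cycles available for one full refresh of all rows. */
    uint64_t cycles_per_period = (uint64_t)refresh_period_ms * sdclk_hz / 1000u;

    return (uint32_t)(cycles_per_period / rows) - 20u;
}
```

sdram_refresh_count(64, 4096, 90000000) yields 1386, i.e. 0x056A, the value passed to HAL_SDRAM_ProgramRefreshRate() above.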

Attempt to use HAL SDRAM library

I wanted to try out reading and writing SDRAM using the HAL_SDRAM_Write_16b() function, and it turns out you have to figure out the correct address in the MCU's address space to pass into the function, not the address inside the memory chip itself. Ahh, ok. Reference manual section 37.4 states that banks 5 and 6 are used for addressing SDRAM devices. Each bank can be up to 256 MBytes, which should cover my planned 512 Mbit / 64 MByte memory. The SDRAM banks are located at 0xC000 0000 - 0xCFFF FFFF and 0xD000 0000 - 0xDFFF FFFF. HADDR[28] selects bank1/bank2 (table 258), and the second row of table 259 tells us the HADDR formatting for 16-bit memories.
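To get a feel for how a CPU address maps onto the memory, here is a sketch that builds an address from bank, row and column. It assumes the layout is bank bits, then row bits, then column bits, then one byte-select bit for a 16-bit bus, which is how I read the HADDR table; verify it against the reference manual before relying on it. The names are made up for illustration.

```c
#include <stdint.h>

#define SDRAM_BANK2_BASE 0xD0000000u /* FMC SDRAM bank 2 start address */

/* CPU address of a 16-bit cell, ASSUMING the order
 * [internal bank][row][column][byte select] from high to low bits.
 * Hypothetical helper, not a HAL function. */
static uint32_t sdram_cpu_address(uint32_t base, uint32_t bank,
                                  uint32_t row, uint32_t col,
                                  unsigned row_bits, unsigned col_bits)
{
    uint32_t cell_index = (((bank << row_bits) | row) << col_bits) | col;

    /* Each cell is 2 bytes wide, so the lowest address bit selects the byte. */
    return base + (cell_index << 1);
}
```

Under that assumption, with 12 row bits and 8 column bits, the last cell of the 8 MByte memory (bank 3, row 0xFFF, column 0xFF) lands at 0xD07FFFFE, and address 0xD0000400 is bank 0, row 2, column 0.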

After setting up memory I am able to write data directly to memory address and afterwards inspect it in debugger:
 *(__IO uint32_t*) SDRAM_BANK_ADDR = 0xdeadbeef;

However, using HAL_SDRAM_Write_16b() overwrites my address pointer and it writes gods know where. The function prototype tells us that it expects a pointer to the write start address, but internally it uses the value of that pointer itself as the address. Another quirk in the STM HAL. So you actually have to pass the address as an integer cast to a pointer, not a pointer to an integer containing the address. Oh, well.
   uint32_t pAddress = 0xd0000400;  // the target address itself, as a plain integer

   uint32_t data[2] = {0xb00b, 0xface};

   // pAddress is declared as uint32_t* in the prototype; BufferSize counts 16-bit items
   HAL_SDRAM_Write_16b(&hsdram1, (uint32_t*)pAddress, (uint16_t*)data, 4);
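Conceptually, a 16-bit write into memory-mapped SDRAM is just a loop of half-word stores through a volatile pointer. The sketch below is my own illustration of that idea, not the HAL source; it takes the address as an integer to make the "pass the address itself" point explicit.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy count 16-bit items from src to the memory-mapped address.
 * Hypothetical helper illustrating the idea, not the HAL implementation. */
static void sdram_write_16b(uintptr_t address, const uint16_t *src, size_t count)
{
    /* volatile keeps the compiler from merging or dropping the stores. */
    volatile uint16_t *dst = (volatile uint16_t *)address;

    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i];
    }
}
```

On the target this would be called with an address such as 0xd0000400; on a PC it can be tried against an ordinary array by passing (uintptr_t)array.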

 

Custom hardware

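Uhh, that took a while. Now let's try to port this stuff to our custom board with an extra large SDRAM.
The memory model I tested is the IS42S16320D-7TLI - 512 Mbit, 8M x 16 bits in 4 banks.
The memory uses 13 address lines and 16 data lines. 13 row address bits and 9 column address bits are expected for each transfer. (Note that 13 + 9 bits address only 4M cells per bank, while 8M cells per bank would need 10 column bits, so the column address range is worth double-checking against the datasheet.)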

If we compare the timing sections of the two datasheets, we can note that most of the timings are shorter for our larger memory, except that the refresh cycle time is defined for a 2x larger row count (8192 instead of 4096 in the same 64 ms). That means refreshes have to happen about twice as often, so the refresh counter value is roughly halved: (7.8125 us x 90 MHz) - 20 = 683.125, or rounded hex 0x2ab. Another extra thing to do is to increase RowBitsNumber in Init to 13 and the column bit count to 9.

A gotcha I lost 2 days on was the fact that STM32CubeMX does not generate the FMC_NBL0/1 (DQMH/L on the memory side) pin configuration if "16-bit byte enable" is not ticked. I checked EVERYTHING, from physical connections to footprint mappings on my board. That's, actually, how I found the issue - I had 2 more pins connected than Cube was showing.

Now I can write the whole memory with dummy data. All but the last 16 bytes of the 512 Mbit are filled with the same numbers from 1 to 15 with 0xBE prepended. The last one is the 32-bit word 0xDEADBEEF:
  Address   0 - 3     4 - 7     8 - B     C - F
  D0000000  BE010203  04050607  08090A0B  0C0D0E0F         
  D0000010  BE010203  04050607  08090A0B  0C0D0E0F         
  ...        
  D3FFFFD0  BE010203  04050607  08090A0B  0C0D0E0F         
  D3FFFFE0  BE010203  04050607  08090A0B  0C0D0E0F         
  D3FFFFF0  00080009  000A000B  000C00EF  EFBEADDE         
  D4000000  00??????  ????????  ????????  ????????         
  D4000010  ????????  ????????  ????????  ????????         
  D4000020  ????????  ????????  ????????  ????????         
  D4000030  ????????  ????????  ????????  ????????  
What is interesting, is that the debugger seems to be able to read one more byte than I expected.
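A fill-and-verify pass like the one behind this dump can be sketched as follows. The function names are hypothetical, and the byte order (0xBE followed by 1 to 15 in each 16-byte line) is my assumption based on how the dump is displayed.

```c
#include <stddef.h>
#include <stdint.h>

/* Expected byte at a given offset: 0xBE, then 1..15, repeating every 16 bytes. */
static uint8_t pattern_byte(size_t offset)
{
    size_t i = offset % 16u;

    return (i == 0u) ? (uint8_t)0xBEu : (uint8_t)i;
}

/* Fill len bytes starting at base with the repeating pattern. */
static void sdram_fill(volatile uint8_t *base, size_t len)
{
    for (size_t off = 0; off < len; off++) {
        base[off] = pattern_byte(off);
    }
}

/* Return the offset of the first mismatch, or len if everything matches. */
static size_t sdram_verify(const volatile uint8_t *base, size_t len)
{
    for (size_t off = 0; off < len; off++) {
        if (base[off] != pattern_byte(off)) {
            return off;
        }
    }
    return len;
}
```

On the target, base would be the SDRAM start address cast to a pointer and len the memory size. Byte-wide stores like these depend on the NBL0/1 lines being configured, so a verify failure on every other byte is a hint to look at those pins.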
