There are two ways of using external SDRAM:
- kind of manual, where you use the HAL driver to issue commands (read/write) to the memory, which is uncomfortable, to say the least, if you want to use it as general-purpose memory.
- convince the compiled code that it has more memory than the internal SRAM of the chip. More on this some other time though.
Hardware Setup
Memories generally are parallel and have some fixed data bus width. The one on the DISCO board is the IS42S16400J, which has 4 internal banks, each containing 1M "cells", each 16 bits wide. So our data bus width will be 16 lines. 1M x 16 bit x 4 banks add up to 64 Mbit or 8 MBytes of available memory.
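The capacity arithmetic above is easy to double-check in a few lines. This is just a sketch; the function names are made up for illustration and are not part of any ST library.

```c
#include <assert.h>
#include <stdint.h>

/* Total capacity in bits: cells per bank x bits per cell x number of banks. */
uint64_t sdram_capacity_bits(uint64_t cells_per_bank,
                             uint32_t bits_per_cell,
                             uint32_t banks)
{
    assert(bits_per_cell > 0 && banks > 0);
    return cells_per_bank * bits_per_cell * banks;
}

/* The same figure in bytes (8 bits per byte). */
uint64_t sdram_capacity_bytes(uint64_t cells_per_bank,
                              uint32_t bits_per_cell,
                              uint32_t banks)
{
    return sdram_capacity_bits(cells_per_bank, bits_per_cell, banks) / 8u;
}
```

For the IS42S16400J the inputs are 1M (1 << 20) cells per bank, 16 bits per cell and 4 banks.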
These cells are laid out in a matrix and each is accessed by a specific address. The address consists of a row number and a column number. The wider of the two (bitwise) defines the number of address lines required; usually it's the row number. In the datasheet the manufacturer defines row addresses A0-A11 and column addresses A0-A7, meaning that we will have 12 address bits, i.e. 12 address lines. Addresses are sent sequentially on the same address bus: first the row address, then the column address. To tell them apart, the "row address strobe" (RAS) and "column address strobe" (CAS) signals are used. They get asserted briefly to indicate what kind of address is on the bus. Without them the memory could not tell a row address from a column address, since both arrive on the same pins and their values may well coincide.
What does "4 internal banks" mean? It means that there are 4 memory matrices, each containing 1M cells. By default only one bank is used, unless you signal to the memory that you want to use another bank. This is done by a pair of signals, usually called "bank select" or "bank address" (BA0/BA1). Without using these you have only a quarter of your memory available.
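To make the bank/row/column idea concrete, here is a minimal sketch that splits a linear cell index into the three parts for the geometry described above (2 bank bits, 12 row bits, 8 column bits). The struct, the function name and the ordering (bank in the top bits, then row, then column) are my assumptions for illustration; the real mapping used by the FMC is the one given in the reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry of the IS42S16400J as described in the text. */
#define COL_BITS  8u   /* column address A0-A7 */
#define ROW_BITS  12u  /* row address A0-A11   */
#define BANK_BITS 2u   /* BA0/BA1              */

typedef struct {
    uint32_t bank;
    uint32_t row;
    uint32_t column;
} sdram_cell_addr;

/* Split a linear 16-bit-cell index into bank, row and column.
 * Assumed ordering (illustrative only): bank, then row, then column. */
sdram_cell_addr sdram_split(uint32_t cell_index)
{
    sdram_cell_addr a;
    assert(cell_index < (1u << (COL_BITS + ROW_BITS + BANK_BITS)));
    a.column = cell_index & ((1u << COL_BITS) - 1u);
    a.row    = (cell_index >> COL_BITS) & ((1u << ROW_BITS) - 1u);
    a.bank   = cell_index >> (COL_BITS + ROW_BITS);
    return a;
}
```

With this layout, every index from 1M upwards lands in banks 1 to 3, which is the three quarters of the memory you lose if BA0/BA1 are not driven.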
Extra signals for clock-enable (SDCKE) and chip-select (SDNE) are used for enabling particular memories in a multi-memory setup.
CAS, RAS and write-enable (WE) in conjunction are used for sending commands to the memory: read, write, refresh, mode setup and such.
LDQM and UDQM (lower/upper byte input-output masks) indicate which byte of the 16-bit word takes part in a transfer; a masked byte is not written.
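The RAS/CAS/WE combinations mentioned above form a small truth table. The sketch below decodes it the way I understand the standard SDR SDRAM command set (all commands are sampled with chip select low); the enum and function names are hypothetical, and the table should be checked against the datasheet of your particular memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard SDR SDRAM commands (chip select assumed low). */
typedef enum {
    SDRAM_CMD_LOAD_MODE_REGISTER,
    SDRAM_CMD_AUTO_REFRESH,
    SDRAM_CMD_PRECHARGE,
    SDRAM_CMD_ACTIVE,          /* latches the row address    */
    SDRAM_CMD_WRITE,           /* latches the column address */
    SDRAM_CMD_READ,            /* latches the column address */
    SDRAM_CMD_BURST_TERMINATE,
    SDRAM_CMD_NOP
} sdram_cmd;

/* Decode the active-low RAS#, CAS# and WE# pin levels (true = pin high). */
sdram_cmd sdram_decode(bool ras_n, bool cas_n, bool we_n)
{
    unsigned code = ((unsigned)ras_n << 2) | ((unsigned)cas_n << 1) | (unsigned)we_n;
    assert(code <= 7u);
    switch (code) {
    case 0u: return SDRAM_CMD_LOAD_MODE_REGISTER; /* RAS# L, CAS# L, WE# L */
    case 1u: return SDRAM_CMD_AUTO_REFRESH;       /* L L H */
    case 2u: return SDRAM_CMD_PRECHARGE;          /* L H L */
    case 3u: return SDRAM_CMD_ACTIVE;             /* L H H */
    case 4u: return SDRAM_CMD_WRITE;              /* H L L */
    case 5u: return SDRAM_CMD_READ;               /* H L H */
    case 6u: return SDRAM_CMD_BURST_TERMINATE;    /* H H L */
    default: return SDRAM_CMD_NOP;                /* H H H */
    }
}
```

Note how ACTIVE is the command with only RAS asserted and READ/WRITE are the ones with CAS asserted, which matches the row-then-column addressing described earlier.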
Uhh, that's a lot of pins. Total of 38, if I'm not mistaken. Easy to eff something up. In addition to that, the traces should be more or less of equal length, because signal timings are rather tight (one clock period is ~11 ns).
Software Setup
To use external memory, you have to set up both parts:
- MCU requires correct GPIO mappings and FMC peripheral setup
- memory requires some configuration before use
Reference manual section 37.7.3 describes the steps required for SDRAM initialization. HAL takes care of GPIO setup and the first 2 steps from the reference manual; afterwards you need to configure your particular memory. STM32CubeMX has an example of doing all the steps in /Projects/STM32F429I-Discovery/Examples/FMC/FMC_SDRAM/. It looks nasty AF, but it works. I would like to understand what's happening there.
1. System init with clocks and board support stuff
2. hal_msp.c sets up memory GPIOs. OK, that's familiar
3. FMC peripheral setup: Woah! Most of this stuff is pretty self-explanatory, but some of it isn't. First of all, we need to understand where the timings are coming from.
   - CASLatency defines the delay between a READ command and the moment data is available on the data bus. Check what the memory supports; in our case it's 2 or 3
   - ReadPipeDelay - as far as I can tell from the reference manual, this is an extra delay (in HCLK cycles) added after the CAS latency before the read data is sampled
4. SDRAM timing setup. These parameters are all expressed in SDRAM clock cycles (1 unit = 11 ns) and should be at least as long as the timings specified by the memory manufacturer:
5. Sending SDRAM commands to prepare the memory for work. As I wrote earlier, these are a combination of RAS, CAS and WE toggles:
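Two small helpers can take some of the guesswork out of steps 4 and 5. This is a sketch under assumptions: the function names are mine, and the mode register layout is the standard SDR SDRAM one (burst length in A0-A2, burst type in A3, CAS latency in A4-A6, write burst mode in A9), so verify it against your datasheet. The first helper rounds a datasheet minimum in nanoseconds up to whole clock cycles, the second builds the value sent with the "load mode register" command.

```c
#include <assert.h>
#include <stdint.h>

/* Round a datasheet minimum (in ns) up to whole SDRAM clock cycles. */
uint32_t ns_to_cycles(uint32_t min_ns, uint32_t sdclk_hz)
{
    const uint64_t ns_per_s = 1000000000ull;
    assert(sdclk_hz > 0);
    return (uint32_t)(((uint64_t)min_ns * sdclk_hz + ns_per_s - 1u) / ns_per_s);
}

/* Build a mode register value, assuming the standard SDR SDRAM layout. */
uint16_t sdram_mode_register(unsigned burst_length, int interleaved,
                             unsigned cas_latency, int single_write)
{
    unsigned bl_code = 0u;
    switch (burst_length) {
    case 1u: bl_code = 0u; break;
    case 2u: bl_code = 1u; break;
    case 4u: bl_code = 2u; break;
    case 8u: bl_code = 3u; break;
    default: assert(!"unsupported burst length"); break;
    }
    assert(cas_latency == 2u || cas_latency == 3u);
    return (uint16_t)(bl_code                      /* A2-A0: burst length     */
        | ((interleaved ? 1u : 0u) << 3)           /* A3: burst type          */
        | (cas_latency << 4)                       /* A6-A4: CAS latency      */
        | ((single_write ? 1u : 0u) << 9));        /* A9: write burst mode    */
}
```

As an example with made-up input (not taken from any datasheet), a 70 ns minimum at a 90 MHz SDRAM clock becomes 7 cycles. Burst length 1, sequential bursts, CAS latency 3 and single-location writes give a mode register value of 0x230.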
What we are interested in at this step is at least the refresh counter. In the datasheet the Refresh Cycle Time (4096) is specified as 64 ms, meaning that 4096 refreshes happen during that period. This implies that a single refresh should happen every 64 ms / 4096, or about every 15.6 us. The FMC_SDRTR register description tells us that the value to be put there should be (refresh period x SDRAM clock frequency) - 20. For us that would be (15.625 us x 90 MHz) - 20 = 1386.25, or hex 0x56a after rounding down. Which is the value we have.
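The refresh counter formula above is simple enough to wrap in a function, so the value can be recomputed for another memory or clock. A minimal sketch, with a function name of my own choosing and integer arithmetic that rounds down:

```c
#include <assert.h>
#include <stdint.h>

/* Refresh timer count: (refresh period / rows) x SDRAM clock - 20, rounded down. */
uint32_t sdram_refresh_count(uint32_t refresh_period_ms, uint32_t rows,
                             uint32_t sdclk_hz)
{
    assert(rows > 0);
    /* Clock cycles between two row refreshes. */
    uint64_t cycles = (uint64_t)refresh_period_ms * sdclk_hz / (1000ull * rows);
    assert(cycles > 20u);
    return (uint32_t)(cycles - 20u);
}
```

Calling sdram_refresh_count(64, 4096, 90000000) returns 1386, i.e. 0x56A.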
Attempt to use HAL SDRAM library
I wanted to try out read/write to SDRAM using the HAL_SDRAM_Write_16b() function, and it turns out you have to figure out the correct address in the MCU's memory map to pass into the function, not the address inside the memory chip itself. Ahh, ok. Reference manual section 37.4 states that banks 5 and 6 are used for addressing SDRAM devices. Each bank can be up to 256 MBytes. Should cover my planned 512Mbit/64MByte memory. SDRAM banks are located at addresses 0xC000 0000 - 0xCFFF FFFF and 0xD000 0000 - 0xDFFF FFFF. HADDR selects bank1/bank2 (table 258), and the second row of table 259 tells us the HADDR formatting for 16-bit memories.
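Turning a byte offset inside the SDRAM into the address the CPU sees is just a matter of adding the right base. A small sketch using the two address ranges quoted above; the macro and function names are hypothetical, not ST definitions:

```c
#include <assert.h>
#include <stdint.h>

#define SDRAM_BANK1_BASE 0xC0000000u  /* first SDRAM bank            */
#define SDRAM_BANK2_BASE 0xD0000000u  /* second SDRAM bank           */
#define SDRAM_BANK_SPAN  0x10000000u  /* 256 MBytes per bank         */

/* CPU address of a byte offset inside SDRAM bank 1 or 2. */
uint32_t sdram_cpu_address(unsigned sdram_bank, uint32_t byte_offset)
{
    assert(sdram_bank == 1u || sdram_bank == 2u);
    assert(byte_offset < SDRAM_BANK_SPAN);
    return (sdram_bank == 1u ? SDRAM_BANK1_BASE : SDRAM_BANK2_BASE) + byte_offset;
}
```

On the target, the result can be cast to a `volatile uint16_t *` and dereferenced directly. Which of the two banks applies depends on which SDNE/SDCKE pair the memory is wired to on your board.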
After setting up the memory I am able to write data directly to a memory address and afterwards inspect it in the debugger:
However, using HAL_SDRAM_Write_16b() overwrites my address pointer and writes gods know where. The function prototype tells us that it expects a pointer to the write start address, but internally it uses this pointer itself as the address. Another quirk in STM HAL. So you actually have to pass the integer address cast to a pointer, not a pointer to an integer containing the address. Oh, well.
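The pointer quirk is easier to see with a host-side stand-in. The function below is NOT the HAL function; it is a hypothetical imitation that, like the behaviour described above, treats the pointer argument itself as the destination. A plain array plays the role of the external memory so that the sketch runs on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the HAL write call: pAddress itself is the
 * destination, it is not a pointer to a variable holding the address. */
void fake_sdram_write_16b(uint32_t *pAddress, const uint16_t *pSrcBuffer,
                          uint32_t count)
{
    volatile uint16_t *dst = (volatile uint16_t *)pAddress;
    assert(pAddress != 0 && pSrcBuffer != 0);
    for (uint32_t i = 0; i < count; ++i) {
        dst[i] = pSrcBuffer[i];
    }
}

/* Stand-in for the external memory, aligned so the pointer cast is valid. */
_Alignas(uint32_t) uint16_t fake_sdram[8];
```

Calling `fake_sdram_write_16b((uint32_t *)fake_sdram, src, 2)` fills the first two half-words of the array. If you instead pass the address of a variable that holds the target address, the function happily overwrites that variable, which is exactly the "writes gods know where" effect. On the real target the equivalent would be passing something like `(uint32_t *)0xD0000000` (base address depending on the bank in use) as the address argument.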
Custom hardware
Uhh, that took a while. Now let's try to port this stuff to our custom board with an extra large SDRAM.
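The memory model I tested is IS42S16320D-7TLI: 512 Mbit, 8M x 16 bits in 4 banks.
The memory uses 13 address lines and 16 data lines. As far as I could tell, 13 row address bits and 9 column address bits are expected for each transfer. By the arithmetic, though, 8M cells per bank need 23 address bits, which with 13 row bits leaves 10 for the column, so the column width is worth double-checking against the datasheet.
If we compare the timing sections of the datasheets, we can note that most of the timings are shorter for our larger memory, except that the refresh cycle time is defined for a 2x larger row count (8192 instead of 4096). That means that the refresh counter value should be approximately 2x lower: (7.8125 us x 90 MHz) - 20 = 683.125, or rounded hex 0x2ab. Another extra thing to do is to increase RowBitsNumber in Init to 13 bits and the column count to 9 bits.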
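A quick sanity check of the geometry settings can save some head scratching: the row and column bit counts together must cover every cell of a bank. The sketch below is plain arithmetic with hypothetical helper names and a 16-bit data bus assumed; what the part really expects still has to come from its datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Address bits needed to index `cells` locations (must be a power of two). */
unsigned address_bits(uint64_t cells)
{
    unsigned bits = 0;
    assert(cells != 0 && (cells & (cells - 1u)) == 0);
    while ((1ull << bits) < cells) {
        ++bits;
    }
    return bits;
}

/* Bytes reachable with a given row/column/bank setup on a 16-bit data bus. */
uint64_t reachable_bytes(unsigned row_bits, unsigned col_bits, unsigned banks)
{
    return ((uint64_t)1 << (row_bits + col_bits)) * 2u * banks;
}
```

For 8M cells per bank, address_bits() gives 23. With 13 row bits, 9 column bits and 4 banks, reachable_bytes() comes out at 32 MBytes, while 10 column bits give the full 64 MBytes.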
A gotcha I lost 2 days on was the fact that STM32CubeMX does not generate the FMC_NBL0/1 (DQML/DQMH on the memory side) pin configuration if "16-bit byte enable" is not ticked. I checked EVERYTHING, from physical connections to footprint mappings on my board. That's actually how I found the issue - I had 2 more pins connected than Cube was showing.
Now I can write the whole memory with dummy data. All but the last 16 bytes of the 512 Mbit are filled with the same numbers from 1 to 15 with 0xBE prepended. The last one is the 32-bit word 0xDEADBEEF:
What is interesting is that the debugger seems to be able to read one more byte than I expected.