Zerowi bare-metal WiFi driver part 4: loading firmware

In the previous post we sent some commands to the WiFi chip, and got a response. To make the chip do anything useful, we need to program its internal CPU, as it doesn’t have code in ROM.

It does have configuration tables in ROM, that indicate what resources it possesses, and their locations, so the chip variants can all be programmed by a single driver. However, parsing these tables isn’t easy; the simplest code I’ve found is in the the Plan9 driver (see part 1 of this blog for details), and that is moderately impenetrable; here is the parser output for the ZeroW (values in hex):

chip ID A9A6 hex, 43430 decimal
coreid 800, corerev 31
  chipcommon 18000000
coreid 812, corerev 27
  d11ctl 18101000
coreid 829, corerev 15
  sdregs 18002000
  sdiorev 15
coreid 82a, corerev 9
  armcore 82a (ARMcm3)
  armregs 18003000
coreid 80e, corerev 16
  socramregs 18004000

I think these are the Intellectual Property (IP) cores within the chip, and the locations they occupy in the memory map, but in the absence of documentation, a lot of guesswork is required. So I decided to ignore the configuration tables, and just use the same addresses as the Linux driver, after it has done the decode. This makes my driver a lot less flexible, as the addresses will have to be changed for each new chip, but there aren’t many of them, so only a handful of definitions will need changing.

The most important number is the chip ID; it should be A9A6 hex, so it’d be a good idea to check our chip matches that. In part 3 my code did some preliminary SDIO initialisation, now to follow on from that:

// SD function numbers
#define SD_FUNC_BUS     0
#define SD_FUNC_BAK     1
#define SD_FUNC_RAD     2

// Maximum block sizes
#define SD_BAK_BLK_BYTES    64
#define SD_RAD_BLK_BYTES    512

// [0.243831] Set bus interface
sdio_cmd52_writes(SD_FUNC_BUS, BUS_SPEED_CTRL_REG, 0x03, 1);
sdio_cmd52_writes(SD_FUNC_BUS, BUS_BI_CTRL_REG, 0x42, 1);
// [17.999101] Set block sizes
sdio_cmd52_writes(SD_FUNC_BUS, BUS_BAK_BLKSIZE_REG, SD_BAK_BLK_BYTES, 2);
sdio_cmd52_writes(SD_FUNC_BUS, BUS_RAD_BLKSIZE_REG, SD_RAD_BLK_BYTES, 2);

The SD function numbers allow Command 52 & 53 to access 3 different interfaces within the chip: think ‘hardware functions’ rather than ‘software functions’. The SDIO bus interface is configured using the ‘bus’ function, and is set into high-speed mode (as discussed in part 2). Then the block sizes for the backplane (‘BAK’) and radio (‘RAD’) functions are set; these are limited to 64 & 512 bytes by the hardware. These will be used by command 53 when operating in multi-block mode.

#define BAK_BASE_ADDR           0x18000000              // CHIPCOMMON_BASE_ADDRESS

// [17.999944] Enable I/O 
sdio_cmd52_writes(SD_FUNC_BUS, BUS_IOEN_REG, 1<<SD_FUNC_BAK, 1);
if (!sdio_cmd52_reads_check(SD_FUNC_BUS, BUS_IORDY_REG, 0xff, 2, 1))
    log_error(0, 0);
// [18.001750] Set backplane window
sdio_bak_window(BAK_BASE_ADDR);
// [18.001905] Read chip ID 
sdio_cmd53_read(SD_FUNC_BAK, SB_32BIT_WIN, u32d.bytes, 4);

We now use the ‘bus’ function to enable the ‘backplane’ interface; by default, the IP cores in the chip are switched off to conserve power, and they need to be enabled; the second line of code checks that the core has actually powered up (I/O enabled -> I/O ready). Once the backplane function is enabled, we set a window pointing to the common base address (‘chipcommon’ in the Plan9 driver) then do a read, and we get hex values A6 A9 41 15, which is correct. However, some explanation is needed with regard to the backplane window.

Backplane window

You may recall that the commands we’re using here, CMD52 and CMD53, only have a 17-bit address range, yet the chip uses 32-bit addresses internally. The way this is handled is by writing a 24-bit value to 3 of the backplane registers, to act as an offset within the internal space.

// Backplane window
#define SB_32BIT_WIN    0x8000
#define SB_ADDR_MASK    0x7fff
#define SB_WIN_MASK     (~SB_ADDR_MASK)

// Set backplane window, don't set if already OK
void sdio_bak_window(uint32_t addr)
{
    static uint32_t lastaddr=0;
    
    addr &= SB_WIN_MASK;
    if (addr != lastaddr)
        sdio_cmd52_writes(SD_FUNC_BAK, BAK_WIN_ADDR_REG, addr>>8, 3);
    lastaddr = addr;
}
// Do 1 - 4 CMD52 writes to successive addresses
int sdio_cmd52_writes(int func, int addr, uint32_t data, int nbytes)
{
    int n=0;

    while (nbytes--)
    {
        n += sdio_cmd52(func, addr++, (uint8_t)data, SD_WR, 0, 0);
        data >>= 8;
    }
    return(n);
}

It is important to realise that this is a simple windowing scheme where the bottom 15 bits are provided by the offset, and the top 17 bits by the window: the two values aren’t added together. An additional complication (yes, really) is that there are 2 copies of the lower 15-byte address space; 0 – 7fff hex is for byte accesses, and 8000 – ffff hex is for 32-bit word accesses (offset SB_32BIT_WIN).

To give a concrete example, here is the analysis of the RPi driver fetching the CPU ID:

18.001455 * Cmd 52 92001400 Wr BAK  1000A 00
18.001481 * Rsp 52 00001000 Flags 10 data 00
18.001618 * Cmd 52 92001600 Wr BAK  1000B 00
18.001644 * Rsp 52 00001000 Flags 10 data 00
18.001750 * Cmd 52 92001818 Wr BAK  1000C 18 Bak Win 180000
18.001777 * Rsp 52 00001018 Flags 10 data 18
18.001905 * Cmd 53 15000004 Rd BAK  180000:8000 len 4
  Data   4 bytes: a6 a9 41 15 *

You can see the 3 CMD52 write cycles to set the window address, then the 4-byte read cycle, with the offset into the 32-bit area. The ‘win 180000’ and ‘180000:8000’ labels are my analysis code trying to be helpful, by saving the window value, and repeating it at the subsequent read cycle.

Firmware file

There are various firmware versions that could be used (see Cypress WICED) but I’m using the same version as the RPi driver, available here. It is around 300K bytes; eventually, it’ll be stored in the SD card filesystem, but for the time being I wanted a simpler storage mechanism, so attached an external SPI memory device, that can be programmed by a standard RPi utility, and is really easy to read back.

This extra hardware isn’t compulsory; there is an INCLUDE_FIRMWARE option in the source code to link the firmware file into the binary image; the functionality is the same, it just takes longer to load over the target serial link.

The device I used is an EN25Q80B, which has a megabyte of serial flash memory. MikroElektronika sell a small flash click board that is simple to connect to the ZeroW, as follows:

MicroE pi   RPi pin
Gnd          25
3V3          17
SDI          19
SDO          21
SCK          23
CS           24

This can be programmed using the following utilities that are included in the standard Linux distribution:

objcopy -F binary brcmfmac43430-sdio.bin flash.bin --pad-to 0x100000
sudo apt install flashrom
sudo modprobe spi_bcm2835
flashrom -p linux_spi:dev=/dev/spidev0.0,spispeed=1000 -w flash.bin

The version of flashrom I used does issue a warning that the Eon chip isn’t fully supported, but still programs it OK. Reading the chip is really easy:

#define SPI0_BASE       (REG_BASE + 0x204000)
#define SPI0_CS         (uint32_t *)SPI0_BASE
#define SPI0_FIFO       (uint32_t *)(SPI0_BASE + 0x04)
#define SPI0_CLK        (uint32_t *)(SPI0_BASE + 0x08)
#define SPI0_DLEN       (uint32_t *)(SPI0_BASE + 0x0c)
#define SPI0_DC         (uint32_t *)(SPI0_BASE + 0x14)

#define SPI0_CE0_PIN    8
#define SPI0_MISO_PIN   9
#define SPI0_MOSI_PIN   10
#define SPI0_SCLK_PIN   11

// Initialise flash interface (SPI0)
void flash_init(int khz)
{
    gpio_set(SPI0_CE0_PIN, GPIO_ALT0, GPIO_NOPULL);
    gpio_set(SPI0_MISO_PIN, GPIO_ALT0, GPIO_PULLUP);
    gpio_set(SPI0_MOSI_PIN, GPIO_ALT0, GPIO_NOPULL);
    gpio_set(SPI0_SCLK_PIN, GPIO_ALT0, GPIO_NOPULL);
    *SPI0_CS = 0x30;
    *SPI0_CLK = CLOCK_KHZ / khz;
}

// Set / clear SPI chip select
void spi0_cs(int set)
{
    *SPI0_CS = set ? *SPI0_CS | 0x80 : *SPI0_CS & ~0x80;
}

// Start a flash read cycle (EN25Q80 device)
void flash_open_read(int addr)
{
    uint8_t rxdata[4], txdata[4]={3, (uint8_t)(addr>>16), (uint8_t)(addr>>8), (uint8_t)(addr)};
    
    spi0_cs(1);
    spi0_xfer(txdata, rxdata, 4);
}
// Read next block
void flash_read(uint8_t *dp, int len)
{
    while (len--)
    {
        *SPI0_FIFO = 0;
        while((*SPI0_CS & (1<<17)) == 0) ;
        *dp++ = *SPI0_FIFO;
    }
}
// End a flash cycle
void flash_close(void)
{
    spi0_cs(0);
}

If you don’t want to bother with this, just set the INCLUDE_FIRMWARE option in the source code, which links the firmware file into the main executable.

File upload

Before we can upload the code, there is a lot more initialisation to be done; another 34 commands that I won’t be describing here, mainly because I’m having difficulty understanding them in the absence of documentation; for now, the source code is the only explanation you’ll get.

The process of transferring the file is made a bit more complicated by the windowing scheme I described earlier; we have to move that along after every 32K. Command 53 is used in multi-block mode, so one command is issued for multiple data blocks.

// Upload blocks of firmware from flash to chip RAM
int write_firmware(void)
{
    int len, n=0, nbytes=0, nblocks;
    uint32_t addr;

    flash_open_read(0);
    while (nbytes < FIRMWARE_LEN)
    {
        addr = sdio_bak_addr(nbytes);
        len = MIN(sizeof(txbuffer), FIRMWARE_LEN-nbytes);
        nblocks = len / SD_BAK_BLK_BYTES;
		if (nblocks > 0)
        {
            flash_read(txbuffer, nblocks*SD_BAK_BLK_BYTES);
            n = sdio_write_blocks(SD_FUNC_BAK, SB_32BIT_WIN+addr, txbuffer, nblocks);
            if (!n)
                break;
            nbytes += nblocks * SD_BAK_BLK_BYTES;
        }
        else
        {
            flash_read(txbuffer, len);
            txbuffer[len++] = 1;
            sdio_cmd53_write(SD_FUNC_BAK, SB_32BIT_WIN+addr, txbuffer, len);
            nbytes += len;
        }
    }
    flash_close();
    return(nbytes);
}
// Write multiple 64-byte command 53 blocks (max 32K in total)
int sdio_write_blocks(int func, int addr, uint8_t *dp, int nblocks)
{
    int n=0;
    SDIO_MSG rspx, cmd={.cmd53 = {.start=0, .cmd=1, .num=53,
        .wr=1, .func=func, .blk=1, .inc=1, .addrh=(uint8_t)(addr>>15)&3,
        .addrm=(uint8_t)(addr>>7), .addrl=(uint8_t)(addr&0x7f),
        .lenh=(uint8_t)(nblocks>>8)&1, .lenl=(uint8_t)nblocks, .crc=0, .stop=1}};

    clk_0(1);
    add_crc7(cmd.data);
    log_msg(&cmd);
    sdio_cmd_write(cmd.data, MSG_BITS);
    if (sdio_rsp_read(rspx.data, MSG_BITS, SD_CMD_PIN))
    {
        gpio_write(SD_D0_PIN, 4, 0xf);
        gpio_mode(SD_D0_PIN, GPIO_OUT);
        gpio_mode(SD_D1_PIN, GPIO_OUT);
        gpio_mode(SD_D2_PIN, GPIO_OUT);
        gpio_mode(SD_D3_PIN, GPIO_OUT);
        while (n++ < nblocks)
        {
            sdio_block_out(dp, SD_BAK_BLK_BYTES);
            sdio_rsp_read(rspx.data, BLOCK_ACK_BITS, SD_D0_PIN);
            dp += SD_BAK_BLK_BYTES;
            clk_0(2);
        }
        gpio_mode(SD_D0_PIN, GPIO_IN);
        gpio_mode(SD_D1_PIN, GPIO_IN);
        gpio_mode(SD_D2_PIN, GPIO_IN);
        gpio_mode(SD_D3_PIN, GPIO_IN);
    }
    clk_0(1);
    return(n);
}

Once that is complete, we must load in the configuration data, which is available here. A small amount of pre-processing is required, namely removing the comment lines, and replacing all the newline characters with nulls. Since the file is small, command 53 is used in single-block mode.

// Upload blocks of config data to chip NVRAM
int write_nvram(void)
{
    int nbytes=0, len;

    sdio_bak_window(0x078000);
    while (nbytes < config_len)
    {
        len = MIN(config_len-nbytes, SD_BAK_BLK_BYTES);
        sdio_cmd53_write(SD_FUNC_BAK, 0xfd54+nbytes, &config_data[nbytes], len);
        nbytes += len;
    }
    return(nbytes);
}

After another 12 initialisation commands, we can check if the code was loaded OK:

usdelay(50000);
if (!sdio_cmd52_reads(SD_FUNC_BAK, BAK_CHIP_CLOCK_CSR_REG, &u32d.uint32, 1) || u32d.uint8!=0xd0)
    log_error(0, 0);
// [19.190728]
sdio_cmd52_writes(SD_FUNC_BAK, BAK_CHIP_CLOCK_CSR_REG, 0xd2, 1);
sdio_bak_write32(SB_TO_SB_MBOX_DATA_REG, 0x40000);
sdio_cmd52_writes(SD_FUNC_BUS, BUS_IOEN_REG, (1<<SD_FUNC_BAK) | (1<<SD_FUNC_RAD), 1);
sdio_cmd52_reads(SD_FUNC_BUS, BUS_IORDY_REG, &u32d.uint32, 1);
usdelay(100000);
if (!sdio_cmd52_reads(SD_FUNC_BUS, BUS_IORDY_REG, &u32d.uint32, 1) || u32d.uint8!=0x06)
    log_error(0, 0);

If the first value is D0 hex, and the second is 6, then all is well, and after another 21 initialisation commands, we can think about doing something useful with the chip…

[Overview] [Previous part] [Next part]

Copyright (c) Jeremy P Bentham 2020. Please credit this blog if you use the information or software in it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s