PicoWi part 8: UDP server socket

So far I have used a large number of custom functions to configure and control the WiFi networking, but before adding yet more functionality, I need to offer a simpler (and more standard) way of doing all this programming.

When it comes to network programming on Linux or Windows systems, there is only one widely-used Application Programming Interface (API), namely Network Sockets, also known as Internet Sockets. A socket is an endpoint on the network, defined by an IP address & port number; in previous parts, I have used sockets to construct DHCP & DNS clients, and now will use them in a general-purpose programming interface.

UDP & TCP, Client & Server

To recap on the difference between User Datagram Protocol (UDP) & Transmission Control Protocol (TCP), also Client & Server:

  • UDP allows you to send blocks of data across the network, the maximum block size generally being around 1.5 kBytes. There is no synchronisation between sender & receiver, and no error handling, so there is no guarantee that the data will arrive at the destination.
  • TCP starts with an exchange of packets to synchronise the two endpoints, and thereafter the transfer is bidirectional and stream-orientated; it acts like a transparent pipe between the sockets, carrying arbitrary numbers of bytes from one socket to the other. When the transfer is complete, either side can close the connection, but the closure will only be complete when both sides acknowledge it.
  • A client initiates contact with a server, by sending UDP or TCP traffic to a well-known port on that system.
  • A server takes possession of (‘binds to’) a specific UDP or TCP port number, and runs continuously, waiting for a client to initiate contact.

I’m starting with the simplest of these options, namely a UDP server.

UDP server

To create a server socket, it is just necessary to issue a socket() call, followed by bind() to take possession of a specific port, by providing an initialised structure. The same code applies for both Linux, and my RP2040 socket API:

#define PORTNUM  8080    // Server port number
#define MAXLEN   1024    // Maximum data length

int server_sock; 
struct sockaddr_in server_addr, client_addr; 
    
server_sock = socket(AF_INET, SOCK_DGRAM, 0);
if (server_sock < 0)
{ 
    perror("socket creation failed"); 
    return(1); 
} 
memset(&server_addr, 0, sizeof(server_addr)); 
server_addr.sin_family    = AF_INET;
server_addr.sin_addr.s_addr = INADDR_ANY; 
server_addr.sin_port = htons(PORTNUM); 
if (bind(server_sock, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) 
{ 
    perror("bind failed"); 
    return(1); 
} 

You may be surprised at the use of 2 structures, namely sockaddr_in, and sockaddr. The former is specific to Internet Protocol, whereas the latter is general-purpose, capable of being used with other network protocols. The structure has to be filled in with the address family (Internet Protocol) the desired IP address (zero for any), and the port number, byte-swapped to big-endian.

The code then enters an endless loop, waiting for a a datagram to be received, printing the data (on the assumption it is text), and returning a text string in response:

printf("UDP server on port %u\n", PORTNUM);
while (1)
{
    memset(&client_addr, 0, sizeof(client_addr)); 
    addrlen = sizeof(client_addr);
    n = recvfrom(server_sock, buffer, MAXLEN-1, 0, 
        (struct sockaddr *)&client_addr, &addrlen);
    if (n > 0)
    {
        buffer[n] = 0; 
        printf("Client %s: %s\n", inet_ntoa(client_addr.sin_addr), buffer); 
        n = sprintf(buffer, "Test %u\n", testnum++);
        sendto(server_sock, buffer, n, MSG_CONFIRM, 
            (struct sockaddr *)&client_addr, addrlen);
    }
}

It may seem strange to set ‘addrlen’ every time we go round the loop; this is for safety, as recvfrom() can potentially modify its value. Also, there is no guarantee that the incoming string is null-terminated, so a null is added at the end of the datagram, just in case.

Blocking

The above code ‘blocks’ on the receive function, that is to say it waits indefinitely until the data arrives, which is standard practice on Linux or Windows sockets – it is sensible when we have a multi-tasking operating system, but inconvenient if we are restricted to a single task, as the CPU can’t run any other code while waiting for incoming data.

There are various ways round this limitation, for example Linux has a ‘select’ function that allows you to poll the interface, checking if any incoming data is available. Alternatively, we can set a timeout on the socket read, for example 0.1 seconds (100,000 microseconds):

struct timeval tv;
tv.tv_sec = 0;
tv.tv_usec = 100000;
setsockopt(server_sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

To demonstrate this functionality, we can add a useful feature whereby the LED is flashed every time an access to the server is made; if we set the timeout to 1000 microseconds, then the timeouts give us a convenient millisecond timer. The following code does a rapid flash of the LED until the unit joins the Wifi network, then a quick blink every 2 seconds, plus a flash every time the server is accessed.

sock_set_timeout(server_sock, 1000);
while (1)
{
    memset(&client_addr, 0, sizeof(client_addr)); 
    addrlen = sizeof(client_addr);
    n = recvfrom(server_sock, buffer, MAXLEN, 0, 
        (struct sockaddr *)&client_addr, &addrlen);
    if (n > 0)
    {
        buffer[n] = 0; 
        printf("Client %s: %s\n", inet_ntoa(client_addr.sin_addr), buffer); 
        n = sprintf(buffer, "Test %u\n", testnum++);
        sendto(server_sock, buffer, n, 0, (struct sockaddr *)&client_addr, addrlen);
        msec = 0;
    }
    else
    {
        msec++;
        if (msec == 1)
            wifi_set_led(1);
        else if (msec == 100)
            wifi_set_led(0);
        else if (msec == 200 && !link_check())
            msec = 0;
        else if (msec == 2000)
            msec = 0;
    }
}

Test program

See the introduction for details on how to rebuild and program the udp_socket_server application into the Pico.

The simplest way to test the server is using the netcat application, that can easily be installed on the Raspberry Pi using ‘apt install’; you just need to note the IP address of the Pico as reported on the serial console, and enter it into the netcat command line, e.g. for an address of 192.168.1.240:

echo -n "hello" | nc -4u -w1 192.168.1.240 8080

The server will respond with an incrementing test number, e.g.

Test 1

Alternatively, here is a simple Python client program:

# Simple UDP client example
import time, socket

addr = "192.168.1.240", 8080
data = 'test'

while True:  
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(1)
    client.sendto(str.encode(data), addr)
    print("Tx: " + data);        
    try:
        resp, server = client.recvfrom(1024)
    except socket.timeout:
        resp = None
    print("Rx: " + "timeout" if not resp else resp.decode('utf-8'))
    time.sleep(2)
# EOF

Or in C:

// Simple UDP client for Linux

#include <stdio.h> 
#include <string.h> 
#include <sys/socket.h> 
#include <arpa/inet.h> 
#include <netinet/in.h> 
#include <unistd.h>

#define SERVER_ADDR     "192.168.1.240"
#define SERVER_PORT     8080
#define MAXLEN          1024 

void sock_set_timeout(int sock, int usec);

int main(int argc, char **argv)
{
    int sock, n;
    struct sockaddr_in server_addr;
    char txdata[MAXLEN] = "test", rxdata[MAXLEN];

    sock = socket(AF_INET, SOCK_DGRAM, 0);
    sock_set_timeout(sock, 100000);
    bzero(&server_addr, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_addr.s_addr = INADDR_ANY;
    server_addr.sin_port = htons(SERVER_PORT);
    while (1)
    {
        printf("Tx %s:%u: %s\n", SERVER_ADDR, SERVER_PORT, txdata);
        sendto(sock, txdata, strlen(txdata), 0,
              (struct sockaddr *)&server_addr, sizeof(server_addr));
        n = recvfrom(sock, rxdata, MAXLEN-1, 0, NULL, NULL);
        if (n < 0)
            printf("Rx timeout\n");
        else
        {
            rxdata[n] = 0;
            printf("Rx %s\n", rxdata);
        }
        sleep(2);
    }
    return(0);
}

// Set socket timeout
void sock_set_timeout(int sock, int usec)
{
    struct timeval tv;
    
    tv.tv_sec = usec / 1000000;
    tv.tv_usec = usec % 1000000;
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

Project links
IntroductionProject overview
Part 1Low-level interface; hardware & software
Part 2Initialisation; CYW43xxx chip setup
Part 3IOCTLs and events; driver communication
Part 4Scan and join a network; WPA security
Part 5ARP, IP and ICMP; IP addressing, and ping
Part 6DHCP; fetching IP configuration from server
Part 7DNS; domain name lookup
Part 8UDP server socket
Source codeFull C source code

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.

PicoWi part 6: DHCP

In part 5, we joined a WiFi network, and used ‘ping’ to contact another unit on that network, but this was achieved by setting the IP address manually, which is generally known as using a ‘static’ IP.

The alternative is to use a ‘dynamic’ IP, that a central server (such as the WiFi Access Point) allocates from a pool of available addresses, using Dynamic Host Configuration Protocol (DHCP); this also provides other information such as a netmask & router address, to allow our unit to communicate with the wider Internet.

IP addresses and routing

So far, I’ve just said that an IP address consists of 4 bytes, that are usually expressed as decimal values with dotted notation, e.g. 192.168.1.2, but there is some extra complication.

Firstly it is important to note I’m using version 4 of the protocol (IPv4); there is a newer version (IPv6) with a much wider address range, but the older version is sufficient for our purposes, and easier to implement.

Next it is important to distinguish between a public and private IP address.

  • Public: an address that is accessible from the Internet, generally assigned by an Internet Service Provider (ISP)
  • Private: an address used locally within an organisation, that is not unique; generally assigned from the blocks 192.168.x.x, 172.16.x.x or 10.x.x.x

The address we’ll be getting from the DHCP server is probably private; if we are accessing the Internet, there will be one or more network devices (‘routers’) that perform public-to-private translation, and also security functions (‘firewalls’) to block malicious data.

If our unit has an IP address it wishes to contact, how does it know what to do? It just has to determine if the target address is local or remote by applying a netmask. For example if our unit is given the address 192.168.1.1 with netmask 255.255.255.0, then a logical AND of the two values means that our local network (known as a ‘subnet’) is 192.168.1. If the unit we’re contacting is on that subnet (i.e. the address begins with 192.168.1) then we just send out a local ARP request to convert their IP address into a MAC address, and start communicating.

If the target address isn’t on the same subnet (e.g. 192.168.2.1, 11.22.33.44, or anything else) then our unit contacts a router (using the address given in the DHCP response) and relies on the router to forward the data appropriately.

In the diagram above, there are networks with public addresses 11.22.33.44 and 22.33.44.55, and they both have private addresses in 192.168.1.x subnetworks; the job of the router is to move the data between these subnetworks by performing Network Address Translation (NAT) between them.

If unit 192.168.1.3 wants to contact 22.33.44.55 it will check the netmask, and because the target isn’t on the same subnetwork, the data will be sent to the router 192.168.1.1, which will forward it over the Internet.

If 192.168.1.3 wants to contact 192.168.1.2, ANDing with the netmask will show that they are both on the same subnet, so the data will be sent directly, bypassing using the router.

However, if 192.168.1.3 wants to send the data to 192.168.1.1 on the remote network, how does the router know what to do? The simple answer is “it doesn’t”, as addresses on the 192.168.1.x subnet aren’t unique, and there will be thousands (or millions!) of units with that same address around the world. Also the netmask clearly indicates that 192.168.1.1 must be on the same subnet as 192.168.1.3, so the data will be sent locally to 192.168.1.1, whether it exists or not; if it doesn’t exist, that’ll be flagged up by the ARP request failing.

There are various workarounds for this ‘NAT traversal’ problem, for example 192.168.1.3 sends the data to the router 22.33.44.55, which is configured to copy incoming data to 192.168.1.1, but there are major security risks associated with opening up a system to unfiltered Internet traffic, so for the purposes of this blog, I’m assuming that our unit will only be communicating with other units on the same subnetwork, or publicly-available systems on the Internet.

The above example assumes there is a single router for all outgoing traffic, and this is generally the case on a WiFi network, where the Access Point also acts as a router. However, on more complex networks there can be multiple routers to provide alternative routes to other networks or the Internet.

Client and server

The most common model for communication between two systems is client-server. The server runs continuously, waiting for a client to get in contact. The client uses a specific communications format (a ‘protocol’) to establish a link (‘connection’) to the server. The connection persists for as long as is needed to exchange the data, then it is closed by both sides.

Simpler protocols can dispense with the connection, but still retain the client-server model; for example, to fetch the time with Network Time Protocol (NTP) you just send a single message to a time server, and get a single message back with the time. This ‘connectionless’ approach means that a single ‘stateless’ server can handle very large numbers of clients, since it doesn’t have to track the state of its clients; an incoming request has all the information needed to send the response.

UDP message format

So there are two distinct ways for a client to communicate with a server; one creates a persistent connection, with both sides tracking the flow of data, and re-sending any data that is lost in transit: this is Transmission Control Protocol (TCP). The other way is User Datagram Protocol (UDP), which has no such tracking, or error correction; just send a block of data and hope it arrives.

This uncertainty means that, if faced with a choice, many programmers reject UDP as being too unreliable, however it does have a very important place in the suite of TCP/IP protocols, not least because it is used for DHCP.

A DHCP transmission consists of the following:

  • Ethernet header
  • IP header
  • UDP header
  • DHCP header
  • DHCP option data

We’ve already used the Ethernet and IP headers when sending an ICMP (ping) message, this time we’re stacking on a UDP header.

/* ***** UDP (User Datagram Protocol) header ***** */
typedef struct udph
{
    WORD  sport,            /* Source port */
          dport,            /* Destination port */
          len,              /* Length of datagram + this header */
          check;            /* Checksum of data, header + pseudoheader */
} UDPHDR;

There is a 16-bit length, which shows the total length of the header plus any data that follows, and a 16-bit checksum, which is calculated in an unusual manner; it incorporates the UDP header, parts of the IP header, and all the data that follows. The way this is calculated is to create a pseudo-header containing the relevant IP parts:

/* ***** Pseudo-header for UDP or TCP checksum calculation ***** */
/* The integers must be in hi-lo byte order for checksum */
typedef struct              /* Pseudo-header... */
{
    IPADDR sip,             /* Source IP address */
          dip;              /* Destination IP address */
    BYTE  z,                /* Zero */
          pcol;             /* Protocol byte */
    WORD  len;              /* UDP length field */
} PHDR;

So the UDP code has to prepare two headers, though the pseudo-header is only used for checksum calculation, and can be discarded after that is done.

// Add UDP header to buffer, return byte count
int ip_add_udp(BYTE *buff, WORD sport, WORD dport, void *data, int dlen)
{
    UDPHDR *udp=(UDPHDR *)buff;
    IPHDR *ip=(IPHDR *)(buff-sizeof(IPHDR));
    WORD len=sizeof(UDPHDR), check;
    PHDR ph;

    udp->sport = htons(sport);
    udp->dport = htons(dport);
    udp->len = htons(sizeof(UDPHDR) + dlen);
    udp->check = 0;
    len += ip_add_data(&buff[sizeof(UDPHDR)], data, dlen);
    check = add_csum(0, udp, len);
    IP_CPY(ph.sip, ip->sip);
    IP_CPY(ph.dip, ip->dip);
    ph.z = 0;
    ph.pcol = PUDP;
    ph.len = udp->len;
    udp->check = 0xffff ^ add_csum(check, &ph, sizeof(PHDR));
    return(len);
}

Port numbers

Another notable feature of the UDP header is the source & destination port numbers, and these deserve some explanation.

A port number can identify a specific service on a server; for example port 80 identifies an HTTP web server, and 67 is a DHCP server. These are ‘well-known’ port numbers and are in the range 0 to 1023. Ports numbered 1024 to 49151 are also used for specific server functionality that isn’t part of the original set, so are known as ‘registered’. The remaining numbers 49152 to 65535 are ‘dynamic’ ports, that are used temporarily by client applications.

When a client wishes to communicate with a server, it will obtain a dynamic port from its operating system, and use that port for the duration of a transaction, releasing it when the transaction is complete. In contrast, a server will generally monopolise a well-known or registered port on a permanent basis, though some servers additionally open up a dynamic port on a short-term basis to handle a specific interaction with the client, such as a file transfer.

Unusually, the DHCP server & client are both assigned well-known numbers, namely UDP 67 and 68. You may see these identified as BOOTP ports, since DHCP is based on the older BOOTP protocol, with some additions.

DHCP message format

DHCP is a 4-step process:

  • Discover: the unit broadcasts a request asking for network parameters, such as an IP address it can use, also a router address, and subnet mask.
  • Offer: the server responds with some proposed values, that the unit can accept or reject.
  • Request: the unit signifies its acceptance of the proposed values
  • ACK: the server acknowledges the request, indicating that the parameters have been assigned to the unit.

Once the parameters have been assigned, the server will generally attempt to keep them unchanged, such that every time the unit boots, it will get the same IP address. However, this is not guaranteed, and a busy server with a lot of temporary clients will be forced to re-use addresses from units that haven’t been active for a while.

The message format is based on the older protocol BOOTP:

typedef struct {
  	BYTE  opcode;   			/* Message opcode/type. */
	BYTE  htype;				/* Hardware addr type (net/if_types.h). */
	BYTE  hlen;					/* Hardware addr length. */
	BYTE  hops;					/* Number of relay agent hops from client. */
	DWORD trans;				/* Transaction ID. */
	WORD secs;					/* Seconds since client started looking. */
	WORD flags;					/* Flag bits. */
	IPADDR ciaddr,				/* Client IP address (if already in use). */
           yiaddr,				/* Client IP address. */
           siaddr,				/* Server IP address */
           giaddr;				/* Relay agent IP address. */
	BYTE chaddr [16];		    /* Client hardware address. */
	char sname[SNAME_LEN];	    /* Server name. */
	char bname[BOOTF_LEN];		/* Boot filename. */
	BYTE cookie[DHCP_COOKIE_LEN];   /* Magic cookie */
} DHCPHDR;

When making the initial discovery request, many of these values are unused; the ‘cookie’ is filled in with a specific 4-byte value (99, 130, 83, 99) that signal this is a DHCP request, not BOOTP. Then there is a data field with ‘option’ values; each entry has one byte indicating the option type, one byte indicating data length, and that number of data bytes. The options I use in the discovery request are a byte value of 1, indicating it is a discovery message, and 4 parameter values, indicating what should be provided by the server (1 for subnet mask, 3 for router address, 6 for nameserver address and 15 for network name).

// DHCP message options
typedef struct {
    BYTE typ1, len1, opt;
    BYTE typ2, len2, data[4];
    BYTE end;
} DHCP_MSG_OPTS;

// DHCP discover options
DHCP_MSG_OPTS dhcp_disco_opts = 
   {53, 1, 1,               // Msg len 1 type 1: discover
    55, 4, {1, 3, 6, 15},   // Param len 4: mask, router, DNS, name
    255};                   // End

The resulting offer from the server probably includes much more than we asked for; this is what my server returns:

    Option: (53) DHCP Message Type (Offer)
    Option: (54) DHCP Server Identifier (192.168.1.254)
    Option: (51) IP Address Lease Time (7 days)
    Option: (58) Renewal Time Value (3 days, 12 hours)
    Option: (59) Rebinding Time Value (6 days, 3 hours)
    Option: (1) Subnet Mask (255.255.255.0)
    Option: (28) Broadcast Address (192.168.1.255)
    Option: (15) Domain Name ("home")
    Option: (6) Domain Name Server (192.168.1.254)
    Option: (3) Router (192.168.1.254)
    Option: (255) End

You’ll see that the Access Point 192.168.1.254 is acting as a router and nameserver; we’ll be looking at the Domain Name System (DNS) in the next part of this blog.

If the unit wants to accept these proposed settings, it must send a request containing the proposed IP address. This can have the same format as the discovery, with a byte value of 3, indicating it is a request message, and a the 4-byte address value:

// DHCP request options
DHCP_MSG_OPTS dhcp_req_opts = 
   {53, 1, 3,               // Msg len 1 type 3: request
    50, 4, {0, 0, 0, 0},    // Address len 4 (copied from offer)
    255};                   // End

Assuming all is OK, the ACK response from the server will be similar to the offer, maybe with more values added (such as vendor-specific information), so an important part of the receiver code is the scanning of the parameters to find the values that are needed.

State machine

If we were in a multi-tasking environment, the DHCP process might basically consist of a sequence of 4 function calls, each function stopping (‘blocking’) until it is complete:

send_discovery()
receive_offer()
send_request()
receive_ack()

Since we don’t currently have multi-tasking, we can’t adopt this approach, as it would block any other code from running, and in the event of an error, one of these functions might stall indefinitely. Instead, we have to adopt a ‘polled’ approach, where we keep on re-visiting this process to see what (if anything) has changed. The key to this is to have a single ‘state’ variable that reflects what has happened, e.g. it has a value of 1 when we have sent the discovery, 2 when we have received an offer, and so on.

// Poll DHCP state machine
void dhcp_poll(void)
{
    static uint32_t dhcp_ticks=0;
    
    if (dhcp_state == 0 ||              // Send DHCP Discover
       (dhcp_state != DHCPT_ACK && ustimeout(&dhcp_ticks, DHCP_TIMEOUT)))
    {
        ustimeout(&dhcp_ticks, 0);
        IP_ZERO(my_ip);
        ip_tx_dhcp(bcast_mac, bcast_ip, DHCP_REQUEST, 
                   &dhcp_disco_opts, sizeof(dhcp_disco_opts));
        dhcp_state = DHCPT_DISCOVER;
    }
    else if (dhcp_state == DHCPT_OFFER) // Received Offer, send Request
    {
        ustimeout(&dhcp_ticks, 0);
        IP_CPY(dhcp_req_opts.data, offered_ip);
        ip_tx_dhcp(host_mac, bcast_ip, DHCP_REQUEST, 
                   &dhcp_req_opts, sizeof(dhcp_req_opts));
        dhcp_state = DHCPT_REQUEST;
    }
}

The polling of the DHCP state also incorporates a timeout, that is triggered in the event of an error; with a simple 4-step protocol like this, we can just restart the process from the beginning, rather than trying to work out where the error occurred.

Example program

There is one example program dhcp.c that fetches IP addresses and netmask from a DHCP server, and prints the result:

Joining network
Joined network
Tx DHCP DISCOVER
Rx DHCP OFFER 192.168.1.240
Tx DHCP REQUEST
Rx DHCP OFFER 192.168.1.240
Rx DHCP OFFER 192.168.1.240
Rx DHCP ACK 192.168.1.240 mask 255.255.255.0 router 192.168.1.254 DNS 192.168.1.254
DHCP complete, IP address 192.168.1.240 router 192.168.1.254
192.168.1.254->192.168.1.240 ARP request
192.168.1.240->192.168.1.254 ARP response

The display mode is set to include DHCP:

set_display_mode(DISP_INFO|DISP_JOIN|DISP_ARP|DISP_DHCP);

This allows you to see the message-passing; it isn’t unusual to receive duplicate messages, and in the DHCP OFFER above. The ARP display is also enabled so you can see the router using ARP to check the newly-assigned address.

It will be necessary to change the default SSID and PASSWD to match your network; for details on how to build & load the application, see the introduction.

Project links
IntroductionProject overview
Part 1Low-level interface; hardware & software
Part 2Initialisation; CYW43xxx chip setup
Part 3IOCTLs and events; driver communication
Part 4Scan and join a network; WPA security
Part 5ARP, IP and ICMP; IP addressing, and ping
Part 6DHCP; fetching IP configuration from server
Part 7DNS; domain name lookup
Part 8UDP server socket
Source codeFull C source code

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.

PicoWi part 5: ARP, IP and ICMP

In part 4, the wireless chip was connected to a WiFi network, so it can now send & receive data on that network, but we still have to encode the data for transmission, and decode it for reception.

We’re using a ‘full MAC’ chip, so all the low-level WiFi interfacing is handled within the chip. When transmitting, it encrypts our data, and adds the necessary 802.11 headers so that it will accepted by the network access point; when receiving, the headers are stripped off and the data is decrypted before being passed over to the Pico CPU.

This doesn’t just make our encoding & decoding tasks easier, it also ensures that the transmissions fully conform to the (exceedingly complex) 802.11 rules; if your interest is in creating non-standard wireless transmissions, then I’m afraid this project will be of no help.

TCP/IP

The suite of protocols used for data transmission over the Internet are generally known as Transmission Control Protocol / Internet Protocol, or TCP/IP. We’ll only be using a small subset of these protocols, and the initial task is just to handle Address Resolution Protocol (ARP) and Internet Control Message Protocol (ICMP). This will allow us to send & receive diagnostic ‘ping’ messages, and do some simple benchmarks by communicating with another system.

TCP/IP uses a three-tier addressing system; at the highest level, there are names with dotted notation, such as iosoft.blog or http://www.google.com. To access the computer at this address, two further steps are required:

  • a Domain Name System (DNS) database lookup is used to convert the name into an Internet Protocol (IP) address, which has 4 numeric values in dotted notation, for example 192.168.5.1
  • an Address Resolution Protocol (ARP) message is sent out on the network, with a request to convert the remote unit’s IP address into a Media Access and Control (MAC) address, which has 6 bytes, that are normally printed with a colon separator, e.g. 28:CD:C1:00:12:34

The first of these will be tackled in the next part of this project; for now, I’m assuming that the unit has obtained an IP address from somewhere, and knows the IP address of another unit it wishes to communicate with, for example the WiFi access point.

Address Resolution Protocol (ARP)

This is probably the simplest of all TCP/IP protocols; the unit broadcasts a request in a specific format, giving the IP address it wants to contact, and if any unit on the same ‘subnet’ has that address, then it will respond with its 6-byte MAC address. That is used for outgoing messages, but for incoming messages our unit must listen out for ARP broadcasts, and if a request matches its IP address, it should respond with the MAC address.

The ARP message format can be encapsulated within a C structure:

typedef unsigned char  BYTE;
typedef unsigned short WORD;
typedef unsigned int   DWORD;
typedef BYTE MACADDR[MACLEN];
typedef unsigned int IPADDR;

/* ***** ARP (Address Resolution Protocol) packet ***** */
typedef struct
{
    WORD hrd,           /* Hardware type */
         pro;           /* Protocol type */
    BYTE  hln,          /* Len of h/ware addr (6) */
          pln;          /* Len of IP addr (4) */
    WORD op;            /* ARP opcode */
    MACADDR  smac;      /* Source MAC addr */
    IPADDR   sip;       /* Source IP addr */
    MACADDR  dmac;      /* Destination MAC addr */
    IPADDR   dip;       /* Destination IP addr */
} ARPKT;

This is the first of many C structures for TCP/IP, and I’ve chosen to define 8, 16 and 32-bit values as BYTE, WORD and DWORD for clarity.

To broadcast this message, we need to add on Ethernet header, giving a source MAC address (the MAC address of our unit, as reported by the WiFi chip) the destination MAC address (broadcast, which is all-ones, i.e. FF:FF:FF:FF:FF:FF) and a protocol ID, which indicates that we’re sending an ARP packet.

/* Ethernet (DIX) header */
typedef struct {
    MACADDR dest;               /* Destination MAC address */
    MACADDR srce;               /* Source MAC address */
    WORD    ptype;              /* Protocol type or length */
} ETHERHDR;
#define PCOL_ARP    0x0806      /* Protocol type: ARP */
#define PCOL_IP     0x0800      /*                IP */

There are a lot of similarities between the higher level of wired (Ethernet) and wireless (802.11) protocols, so it makes sense that both use the same network address structure.

Creating an ARP request is really just a fill-in-the-blanks exercise:

#define HTYPE       0x0001  /* Hardware type: ethernet */
#define ARPPRO      0x0800  /* Protocol type: IP */
#define ARPREQ      0x0001  /* ARP request */
#define ARPRESP     0x0002  /* ARP response */

// Add Ethernet header to buffer, return byte count
WORD ip_add_eth(BYTE *buff, MACADDR dmac, MACADDR smac, WORD pcol)
{
    ETHERHDR *ehp = (ETHERHDR *)buff;

    MAC_CPY(ehp->dest, dmac);
    MAC_CPY(ehp->srce, smac);
    ehp->ptype = htons(pcol);
    return(sizeof(ETHERHDR));
}

// Create an ARP frame, return length
int ip_make_arp(BYTE *buff, MACADDR mac, IPADDR addr, WORD op)
{
    int n = ip_add_eth(buff, op==ARPREQ ? bcast_mac : mac, my_mac, PCOL_ARP);
    ARPKT *arp = (ARPKT *)&buff[n];

    MAC_CPY(arp->smac, my_mac);
    MAC_CPY(arp->dmac, op==ARPREQ ? bcast_mac : mac);
    arp->hrd = htons(HTYPE);
    arp->pro = htons(ARPPRO);
    arp->hln = MACLEN;
    arp->pln = sizeof(DWORD);
    arp->op  = htons(op);
    arp->dip = addr;
    arp->sip = my_ip;
    if (display_mode & DISP_ARP)
        ip_print_arp(arp);
    return(n + sizeof(ARPKT));
}

// Convert byte-order in a 'short' variable
WORD htons(WORD w)
{
    return(w<<8 | w>>8);
}

All network data is in big-endian format (most-significant byte first), but the RP2040 processor is little-endian, so the 16-bit values need to be byte-swapped.

To transmit the message, all that is needed is to add on the SDPCM layer for the WiFi chip, and copy it into an outgoing message buffer:

// Transmit an ARP frame
int ip_tx_arp(MACADDR mac, IPADDR addr, WORD op)
{
    int n = ip_make_arp(txbuff, mac, addr, op);
    
    return(ip_tx_eth(txbuff, n));
 }

// Send transmit data
int ip_tx_eth(BYTE *buff, int len)
{
    return(event_net_tx(buff, len));
}
// Transmit network data
int event_net_tx(void *data, int len)
{
    TX_MSG *txp = &tx_msg;
    int txlen = sizeof(SDPCM_HDR)+2+sizeof(BDC_HDR)+len;
    
    display(DISP_DATA, "Tx_DATA len %d\n", len);
    disp_bytes(DISP_DATA, data, len);
    display(DISP_DATA, "\n");
    txp->sdpcm.len = txlen;
    txp->sdpcm.notlen = ~txp->sdpcm.len;
    txp->sdpcm.seq = sd_tx_seq++;
    memcpy(txp->data, data, len);
    if (!wifi_reg_val_wait(10, SD_FUNC_BUS, SPI_STATUS_REG, 
            SPI_STATUS_F2_RX_READY, SPI_STATUS_F2_RX_READY, 4))
        return(0);
    return(wifi_data_write(SD_FUNC_RAD, 0, (uint8_t *)txp, (txlen+3)&~3));
}

The transmit data length is rounded up to the nearest 4 bytes, as the WiFi DMA controller only works handles complete 4-byte words.

ARP reception

An incoming message will arrive as an ‘event’ from the WiFi chip, and a handler function first checks that it is valid:

// Handler for incoming ARP frame
int arp_event_handler(EVENT_INFO *eip)
{
    ETHERHDR *ehp=(ETHERHDR *)eip->data;

    if (eip->chan == SDPCM_CHAN_DATA &&
        eip->dlen >= sizeof(ETHERHDR)+sizeof(ARPKT) &&
        htons(ehp->ptype) == PCOL_ARP &&
        (MAC_IS_BCAST(ehp->dest) ||
         MAC_CMP(ehp->dest, my_mac)))
    {
        return(ip_rx_arp(eip->data, eip->dlen));
    }
    return(0);
}

If the incoming message is an ARP request, then the receiver function transmits an appropriate response. If it is a response, the resulting MAC address is saved, for use in future transmissions:

// Receive incoming ARP data
int ip_rx_arp(BYTE *data, int dlen)
{
    ETHERHDR *ehp=(ETHERHDR *)data;
    ARPKT *arp = (ARPKT *)&data[sizeof(ETHERHDR)];
    WORD op = htons(arp->op);

    if (arp->dip == my_ip)
    {
        if (op == ARPREQ)
            ip_tx_arp(ehp->srce, arp->sip, ARPRESP);
        else if (op == ARPRESP)
            ip_save_arp(arp->smac, arp->sip);
        return(1);
    }
    return(0);
}

Ping

Ping request & response format

Having obtained the 6-byte MAC address of a unit we wish to communicate with, what can we send to it? The obvious choice is a diagnostic ‘ping’, that echoes back the data we send, and measures the round-trip time.

Ping uses the Internet Control Message Protocol (ICMP), with an IP header for the address information:

/* ***** ICMP (Internet Control Message Protocol) header ***** */
typedef struct
{
    BYTE  type,         /* Message type */
          code;         /* Message code */
    WORD  check,        /* Checksum */
          ident,        /* Identifier */
          seq;          /* Sequence number */
} ICMPHDR;
#define ICREQ           8   /* Message type: echo request */
#define ICREP           0   /*               echo reply */

/* ***** IP (Internet Protocol) header ***** */
typedef struct
{
    BYTE   vhl,         /* Version and header len */
           service;     /* Quality of IP service */
    WORD   len,         /* Total len of IP datagram */
           ident,       /* Identification value */
           frags;       /* Flags & fragment offset */
    BYTE   ttl,         /* Time to live */
           pcol;        /* Protocol used in data area */
    WORD   check;       /* Header checksum */
    IPADDR sip,         /* IP source addr */
           dip;         /* IP dest addr */
} IPHDR;
#define PICMP   1           /* Protocol type: ICMP */
#define PTCP    6           /*                TCP */
#define PUDP   17           /*                UDP */

Creating an ICMP request largely consists of filling in the values within these structures, and adding some arbitrary data on the end, but there are some issues to bear in mind:

  • As with ARP, all values are big-endian (most significant byte first) so byte-swaps are needed
  • Potentially the IP message (known as a ‘datagram’) may travel very long distances, with a large number of ‘hops’ between computers, and each of these hops will have a maximum data size it can accommodate, which is known as a Maximum Transmission Unit (MTU). To allow a large datagram to be sent across a link with a smaller MTU, there is a technique called ‘IP fragmentation’, whereby the transmission is chopped up into smaller parts, and the parts are reassembled at the receiving end. For simplicity, we won’t initially support fragmentation, which means we have an MTU of around 1.5K bytes.
  • There is a checksum across the IP header and ICMP data, and this is calculated using a method that performs identically on big-endian and little-endian processors.
/* Calculate TCP-style checksum, add to old value */
WORD add_csum(WORD sum, void *dp, int count)
{
    WORD n=count>>1, *p=(WORD *)dp, last=sum;

    while (n--)
    {
        sum += *p++;
        if (sum < last)
            sum++;
        last = sum;
    }
    if (count & 1)
        sum += *p & 0x00ff;
    if (sum < last)
        sum++;
    return(sum);
}

Ping reception

If the unit has received a unicast ICMP request, then it should return a response to the sender that basically copies everything in the request, but with the source & destination addresses swapped, and the message type changed from request to reply. Theoretically, the ICMP checksum needs to be re-computed, but as it is just a sum of 16-bit words, it isn’t affected by the address swap. So we can just re-use the existing checksum, adjusted for the change from the request value of 8 to the response value of 0:

// Receive incoming ICMP data
int ip_rx_icmp(BYTE *data, int dlen)
{
    ETHERHDR *ehp=(ETHERHDR *)data;
    IPHDR *ip = (IPHDR *)&data[sizeof(ETHERHDR)];
    ICMPHDR *icmp = (ICMPHDR *)&data[sizeof(ETHERHDR)+sizeof(IPHDR)];
    int n;

    if (display_mode & DISP_ICMP)
        ip_print_icmp(ip);
    if (icmp->type == ICREQ)
    {
        ip_add_eth(data, ehp->srce, my_mac, PCOL_IP);
        ip->dip = ip->sip;
        ip->sip = my_ip;
        icmp->check = add_csum(icmp->check, &icmp->type, 1);
        icmp->type = ICREP;
        n = htons(ip->len);
        return(ip_tx_eth(data, sizeof(ETHERHDR)+n+sizeof(ICMPHDR)));
    }
    else if (icmp->type == ICREP)
    {
        ping_rx_time = ustime();
    }
    return(0);
}

Example program: ping.c

This program generates pins every 2 seconds, and responds to incoming ping requests. It uses a hard-coded IP addresses, for itself and the target of the outgoing pings:

IPADDR myip   = IPADDR_VAL(192,168,1,165);
IPADDR hostip = IPADDR_VAL(192,168,1,1);

‘myip’ should be set to a suitable unused IP address on your subnet (e.g. 182.168.1 in the above example); you can check if an address is unused by pinging it.

‘hostip’ should be set to the address of another unit on the network that can accept pings, or the address of the WiFi Access Point.

You’ll also need to set the name & password for the WiFi network you are using:

// The hard-coded password is for test purposes only!!!
#define SSID                "testnet"
#define PASSWD              "testpass"

See the PicoWi introduction for a description of the build process, and the connection of a serial console to see the diagnostic messages.

The LED will flash rapidly for a few seconds until the device is connected to the network; it will then switch on when a ping is sent, and off when it is received; on my network, the ping time is generally quite short, so only a brief flash is visible if everything is working correctly.

Ping times

On an Ethernet network, it is usual to see fast & repeatable values for the ping round-trip time. However wireless networks aren’t as predictable, since all units that are on the same radio channels will be competing for air-time, not just with your network, but any other networks within range.

So the response time will vary, depending on the activity of any networks sharing the same WiFi channels; here is a a typical example of 20 pings, using the time reported on the Pico console:

Round-trip time for PicoWi ping

A ping time of 1.2 milliseconds is quite respectable, considering that a Pi 4 on the same network takes a minimum of 1.9 ms.

The graph was plotted using GNUplot; if you want to replicate it, the console output is captured to pings.txt, then pre-processed using awk:

awk -F [=\ ] '/time/ { print $(NF-1) }' pings.txt > pings.csv

This script should also work with the console output of Linux pings. The result is then fed to GNUplot; the command-line has been split into 4 for clarity:

gnuplot -e "set term png size 420,240 font 'sans,8'; \
  set title 'Ping time'; set grid; set key noautotitle; \
  set ylabel 'Time (ms)' offset 2; set output 'pings.png'; \
  plot 'pings.csv' with boxes"
Project links
IntroductionProject overview
Part 1Low-level interface; hardware & software
Part 2Initialisation; CYW43xxx chip setup
Part 3IOCTLs and events; driver communication
Part 4Scan and join a network; WPA security
Part 5ARP, IP and ICMP; IP addressing, and ping
Part 6DHCP; fetching IP configuration from server
Part 7DNS; domain name lookup
Part 8UDP server socket
Source codeFull C source code

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.

PicoWi part 4: scan and join a network

By the end of part 3, the WiFi chip was up & running, and as a simple test of WiFi operation, we’ll next scan the neighbourhood for WiFi networks, then attempt to join a network.

Scanning a network

As a quick check of wireless functionality, it can be useful to scan for WiFi networks within range. Before starting that, we need to send some IOCTL commands to configure various parameters, such as the network band.

The main problem with IOCTL calls is their sheer variety, that might require data in a specific format, or maybe no data at all. I haven’t been able to find a document that describes them, the only publicly-available documentation seems to be the source code . So when developing, it is quite possible to use the wrong IOCTL command, or send it the wrong data, and we need a way of reporting the error, without adding a lot of print function calls.

All my IOCTL functions return 0 if there wasn’t a reply, and -1 if the response indicated an error, so we can just chain commands using the short-circuit AND functionality to ensure execution will stop when an error occurs, and print the last IOCTL command that was executed:

#define IOCTL_WAIT  30 // Time to wait for ioctl response (msec)

const EVT_STR escan_evts[] = {EVT(WLC_E_ESCAN_RESULT), EVT(WLC_E_SET_SSID), EVT(-1)};

// Start a network scan
int scan_start(void)
{
    int ret;
    
    events_enable(escan_evts);
    ret = ioctl_wr_int32(WLC_SET_SCAN_CHANNEL_TIME, IOCTL_WAIT, SCAN_CHAN_TIME) > 0 &&
        ioctl_set_uint32("pm2_sleep_ret", IOCTL_WAIT, 0xc8) > 0 &&
        ioctl_set_uint32("bcn_li_bcn", IOCTL_WAIT, 1) > 0 &&
        ioctl_set_uint32("bcn_li_dtim", IOCTL_WAIT, 1) > 0 &&
        ioctl_set_uint32("assoc_listen", IOCTL_WAIT, 0x0a) > 0 &&
        ioctl_wr_int32(WLC_SET_BAND, IOCTL_WAIT, WIFI_BAND_ANY) > 0 &&
        ioctl_wr_int32(WLC_UP, IOCTL_WAIT, 0) > 0;
    ioctl_err_display(ret);
    return(ret);
}

// Display last IOCTL if error
void ioctl_err_display(int retval)
{
    IOCTL_MSG *msgp = &ioctl_txmsg;
    IOCTL_HDR *iohp = (IOCTL_HDR *)&msgp->data[msgp->cmd.sdpcm.hdrlen];
    char *cmds = iohp->cmd==WLC_GET_VAR ? "GET" : 
                 iohp->cmd==WLC_SET_VAR ? "SET" : "";
    char *data, *name;
    
    if (retval <= 0)
    {
        data = (char *)&msgp->data[msgp->cmd.sdpcm.hdrlen+sizeof(IOCTL_HDR)];
        name = iohp->cmd==WLC_GET_VAR || iohp->cmd==WLC_SET_VAR ? data : "";
        printf("IOCTL error: cmd %lu %s %s\n", iohp->cmd, cmds, name);
    }
}

We can check the code functioning by forcing an error, e.g. temporarily reducing the timeout value for a command such as ‘bcn_li_dtim’ to zero, in which case the code reports the following which, although somewhat terse, does indicate the source of the problem:

IOCTL error: cmd 263 SET bcn_li_dtim

To start a scan, we need one more IOCTL, with an data structure that sets some more parameters:

#define SSID_MAXLEN         32
#define SCANTYPE_ACTIVE     0
#define SCANTYPE_PASSIVE    1

#pragma pack(1)
typedef struct {
    uint32_t version;
    uint16_t action,
             sync_id;
    uint32_t ssidlen;
    uint8_t  ssid[SSID_MAXLEN],
             bssid[6],
             bss_type,
             scan_type;
    uint32_t nprobes,
             active_time,
             passive_time,
             home_time;
    uint16_t nchans,
             nssids;
    uint8_t  chans[14][2],
             ssids[1][SSID_MAXLEN];
} SCAN_PARAMS;

SCAN_PARAMS scan_params = {
    .version=1, .action=1, .sync_id=0x1, .ssidlen=0, .ssid={0},
    .bssid={0xff,0xff,0xff,0xff,0xff,0xff}, .bss_type=2,
    .scan_type=SCANTYPE_PASSIVE, .nprobes=~0, .active_time=~0,
    .passive_time=~0, .home_time=~0, .nchans=0
};

ioctl_set_data("escan", IOCTL_WAIT, &scan_params, sizeof(scan_params));

After that command is sent, we should receive several responses in the form of events; at least one from each WiFi network in range. The scan event handler has to byte-swap any 16 or 32-bit values, since they are in ‘network’ byte-order (big-endian); the handler function was described in the previous part of this blog.

It isn’t unusual for the same network to be reported more than once, e.g.

8C:59:73:xx:xx:xx 'Post_Office' chan 3
E8:65:D4:xx:xx:xx 'Court Hotel' chan 1
8C:59:73:xx:xx:xx 'Post_Office' chan 3
00:11:22:xx:xx:xx 'Virginia' chan 6
20:B0:01:xx:xx:xx 'vodafone' chan 6
6A:A2:22:xx:xx:xx '[hidden]' chan 6
..and so on..

In the tests I have done, the total time from power-up to receiving the last scan entry is around 2.1 seconds, which is surprisingly fast, considering how much chip-initialisation has been required.

Joining a network

This requires a large number of IOCTL commands to set up the WiFi interface, and there is little point in my listing all of them here, so I’m concentrating on specific settings of interest.

  • Country: this is required in order to set domain-specific parameters. I’m taking the easy way out, and specifying a country code of ‘XX’, which is a common set of world-wide characteristics.
  • Multicast: there is one MAC address set to 01:00:5E:00:00:FB which is the standard for IP v4
  • Power saving: this is disabled by default, but can be compiled in if required, though it does significantly increase WiFi response times, as the device will sleep when idle, and takes some time to wake up & respond.
  • Authentication: this uses a WPA2 pre-shared key, stored in plaintext, which is a major weakness in network security.
  • Network name: the SSID is also stored as plaintext.

Once the network join has been initiated, we receive a stream of events to show progress. These can be viewed by calling set_display_mode with DISP_EVENT. A typical joining sequence might be:

Join secure network:
  Rx_EVT  87 ASSOC_REQ_IE,  flags 0, status 0, reason 0
  Rx_EVT   3 AUTH,          flags 0, status 0, reason 0
  Rx_EVT  88 ASSOC_RESP_IE, flags 0, status 0, reason 0
  Rx_EVT   7 ASSOC,         flags 0, status 0, reason 0
  Rx_EVT  16 LINK,          flags 1, status 0, reason 0
  Rx_EVT   1 JOIN,          flags 0, status 0, reason 0
  Rx_EVT   0 SET_SSID,      flags 0, status 0, reason 0
  Rx_EVT  46 PSK_SUP,       flags 0, status 6, reason 0
  ..then Rx_DATA for broadcast/multicast network traffic..

Automatic reassociation after joining a network:
  Rx_EVT  46 PSK_SUP,       flags 0, status 6, reason 14
  Rx_EVT  87 ASSOC_REQ_IE,  flags 0, status 0, reason 0
  Rx_EVT   3 AUTH,          flags 0, status 0, reason 0
  Rx_EVT  88 ASSOC_RESP_IE, flags 0, status 0, reason 0
  Rx_EVT   9 REASSOC,       flags 0, status 0, reason 0
  Rx_EVT  16 LINK,          flags 1, status 0, reason 0
  Rx_EVT  46 PSK_SUP,       flags 0, status 6, reason 0
  Rx_EVT   1 JOIN,          flags 0, status 0, reason 0
 ..then Rx DATA flow continues..

Join open network (no security):
  Rx_EVT  87 ASSOC_REQ_IE,  flags 0, status 0, reason 0
  Rx_EVT   3 AUTH,          flags 0, status 0, reason 0
  Rx_EVT  88 ASSOC_RESP_IE, flags 0, status 0, reason 0
  Rx_EVT   7 ASSOC,         flags 0, status 0, reason 0
  Rx_EVT  16 LINK,          flags 1, status 0, reason 0
  Rx_EVT   1 JOIN,          flags 0, status 0, reason 0
  Rx_EVT   0 SET_SSID,      flags 0, status 0, reason 0
  ..then Rx_DATA for broadcast/multicast network traffic..

SSID not found:
  Rx_EVT   0 SET_SSID,      flags 0, status 3, reason 0

Password incorrect:
  Rx_EVT  87 ASSOC_REQ_IE,  flags 0, status 0, reason 0
  Rx_EVT   3 AUTH,          flags 0, status 0, reason 0
  Rx_EVT  88 ASSOC_RESP_IE, flags 0, status 0, reason 0
  Rx_EVT   7 ASSOC,         flags 0, status 0, reason 0
  Rx_EVT  16 LINK,          flags 1, status 0, reason 0
  Rx_EVT   1 JOIN,          flags 0, status 0, reason 0
  Rx_EVT   0 SET_SSID,      flags 0, status 0, reason 0
  Rx_EVT  46 PSK_SUP,       flags 0, status 8, reason 15
  Rx_EVT  46 PSK_SUP,       flags 0, status 8, reason 14
  ..then the same sequence repeated..

The ‘status’ values are common to all the events:

  • 0: success
  • 3: no networks
  • 6: unsolicited
  • 8: partial

The ‘reason’ values are specific to an event, for example in PSK_SUP, 14 means that a de-authentication request has been received, and 15 indicates that a timeout of the pre-shared key handshake has occurred.

Also there is no guarantee that the events will arrive in this order; for example, when I tested on a different Access Point, the last 3 events were PSK_SUP, JOIN, and SET_SSID.

I have also tested the responses to network events:

Orderly shutdown of WiFi at the access point:
  Rx_EVT  12 DISASSOC_IND,  flags 0, status 0, reason 8
  Rx_EVT   3 AUTH,          flags 0, status 5, reason 0
  Rx_EVT  46 PSK_SUP,       flags 0, status 6, reason 0
  Rx_EVT  16 LINK,          flags 0, status 0, reason 2

Restore WiFi after orderly shutdown:
  Rx_EVT  87 ASSOC_REQ_IE,  flags 0, status 0, reason 0
  Rx_EVT   3 AUTH,          flags 0, status 0, reason 0
  Rx_EVT  88 ASSOC_RESP_IE, flags 0, status 0, reason 0
  Rx_EVT   9 REASSOC,       flags 0, status 0, reason 0
  Rx_EVT  16 LINK,          flags 1, status 0, reason 0
  Rx_EVT  46 PSK_SUP,       flags 0, status 6, reason 0
  Rx_EVT   1 JOIN,          flags 0, status 0, reason 0
  ..then the data flow resumes..

Power-down of the access point:
  Rx_EVT  16 LINK,          flags 0, status 0, reason 1

Restore power to the access point:
  Rx_EVT  16 LINK,          flags 0, status 0, reason 1
  Rx_EVT  87 ASSOC_REQ_IE,  flags 0, status 0, reason 0
  Rx_EVT   3 AUTH,          flags 0, status 0, reason 0
  Rx_EVT  88 ASSOC_RESP_IE, flags 0, status 0, reason 0
  Rx_EVT   9 REASSOC,       flags 0, status 0, reason 0
  Rx_EVT  16 LINK,          flags 1, status 0, reason 0
  Rx_EVT  46 PSK_SUP,       flags 0, status 6, reason 0
  Rx_EVT   1 JOIN,          flags 0, status 0, reason 0
  ..then the data flow resumes..

Network unavailable on startup:
  Rx_EVT   0 SET_SSID,      flags 0, status 3, reason 0

Network becomes available after startup:
  Nothing!

Try to join a secure network, using no security 
  Rx_EVT   0 SET_SSID,      flags 0, status 0, reason 0

So the good news is that the WiFi chip can automatically reconnect to the network under some circumstances, but the bad news is that it will not always reconnect, and I can find no single event showing if the device is connected or not. Rather than attempting to decode the events in detail, I’ve used an overall timeout for joining a network (default 10 seconds); if that fails there is a rest period (currently also 10 seconds) before the next re-connection attempt.

Example programs

There are two examples; see the introduction for details of how to re-build and run the code.

scan.c does a single scan, and returns a list of networks found. The result returned by the WiFi chip is displayed as-is, so may contain duplicates.

join.c joins a given network, reporting on progress; the network name and password must be entered in the source code:

#define SSID                "testnet"
#define PASSWD              "testpass"

The on-board LED flashes at 5 Hz prior to connection, and at 1 Hz when connected.

In the next part I’ll start using TCP/IP protocols.

Project links
IntroductionProject overview
Part 1Low-level interface; hardware & software
Part 2Initialisation; CYW43xxx chip setup
Part 3IOCTLs and events; driver communication
Part 4Scan and join a network; WPA security
Part 5ARP, IP and ICMP; IP addressing, and ping
Part 6DHCP; fetching IP configuration from server
Part 7DNS; domain name lookup
Part 8UDP server socket
Source codeFull C source code

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.

PicoWi part 3: IOCTLs and events

Part 2 described how the CYW43439 WiFi chip is initialised, but used an IOCTL call and an event check without explaining what these are, or how they work, so now is the time to rectify that deficiency.

An IOCTL (Input/Output Control) call is sent by the Pico host CPU (RP2040) to the ARM CPU in the WiFi chip, to read or write configuration data, or send a specific command. An event is an unsolicited block of data sent from the WiFi CPU to the host; it can be a notification that an action is complete, or some data that has arrived over the WiFi network.

IOCTLs

A simple example of an IOCTL is a request for the 6-byte WiFi MAC address.

uint8_t mac[6];
ioctl_get_data("cur_etheraddr", 10, mac, 6);

This sends the IOCTL command GET_VAR, with a string to identify the item of interest, and a timeout in milliseconds.

#define WLC_GET_VAR 262

// Get data block from IOCTL variable
int ioctl_get_data(char *name, int wait_msec, uint8_t *data, int dlen)
{
    return(ioctl_cmd(WLC_GET_VAR, name, strlen(name)+1, wait_msec, false, data, dlen));
}

The request must be packed into a structure, for transmission the the WiFi CPU; this has 2 headers, the first is an ‘SDIO/SPI Bus Layer’ (SDPCM) header, followed by an IOCTL header:

// SDPCM header
typedef struct {
    uint16_t len,       // sdpcm_header.frametag
             notlen;
    uint8_t  seq,       // sdpcm_sw_header
             chan,
             nextlen,
             hdrlen,
             flow,
             credit,
             reserved[2];
} SDPCM_HDR;

// IOCTL header
typedef struct {
    uint32_t cmd;       // cdc_header
    uint16_t outlen,
             inlen;
    uint32_t flags,
             status;
} IOCTL_HDR;

// IOCTL command with SDPCM and IOCTL headers
typedef struct
{
    SDPCM_HDR sdpcm;
    IOCTL_HDR ioctl;
    uint8_t data[IOCTL_MAX_BLKLEN];
} IOCTL_CMD;

The first two 16-bit words of the SDPCM header contain the data length, and its bitwise inverse, then the most important fields are:

  • Chan: a number identifying which ‘channel’ is associated with the data: IOCTL channel is 0, event is 1, and data is 2.
  • Hdrlen: the length of the SDPCM header plus any padding. My code doesn’t use any padding, but the response from the WiFi chip often has a lot of padding.
  • Flow & Credit: used to track the WiFi buffer utilisation

This is followed by the IOCTL header, with a command number (262 for GET_VAR) and a data length value.

The whole message plus data is written to the SPI interface:

// Do an IOCTL transaction, get response
// Return 0 if timeout, -1 if error response
int ioctl_cmd(int cmd, char *name, int namelen, int wait_msec, int wr, void *data, int dlen)
{
    IOCTL_CMD *cmdp = &ioctl_txmsg.cmd;
    int txdlen = ((namelen + dlen + 3) / 4) * 4, ret = 0;
    int hdrlen = sizeof(SDPCM_HDR) + sizeof(IOCTL_HDR);
    int txlen = hdrlen + txdlen;

    memset(cmdp, 0, sizeof(ioctl_txmsg));
    cmdp->sdpcm.notlen = ~(cmdp->sdpcm.len = txlen);
    cmdp->sdpcm.seq = sd_tx_seq++;
    cmdp->sdpcm.chan = SDPCM_CHAN_CTRL;
    cmdp->sdpcm.hdrlen = sizeof(SDPCM_HDR);
    cmdp->ioctl.cmd = cmd;
    cmdp->ioctl.outlen = txdlen;
    cmdp->ioctl.flags = ((uint32_t)ioctl_reqid++ << 16) | (wr ? 2 : 0);
    if (namelen)
        memcpy(cmdp->data, name, namelen);
    if (wr && dlen>0)
        memcpy(&cmdp->data[namelen], data, dlen);
    wifi_data_write(SD_FUNC_RAD, 0, (void *)cmdp, txlen);
    ..continued below..

The code now waits for a response, but it is important to note that the first response it receives may be associated with a completely different request, or network data. So it is essential to check that the response matches the command, and if not, keep on checking for a matching response.

    ..continued from above..
    while (wait_msec>=0 && !(ret=ioctl_resp_match(cmd, data, dlen)))
    {
        wait_msec -= IOCTL_POLL_MSEC;        
        usdelay(IOCTL_POLL_MSEC * 1000);
    }
    return(ret);
}

// Read an ioctl response, match the given command, any command if 0
// Return 0 if no response, -1 if error response
int ioctl_resp_match(int cmd, void *data, int dlen)
{
    int rxlen=0, n=0, hdrlen;
    IOCTL_MSG *rsp = &ioctl_rxmsg;
    IOCTL_HDR *iohp;
    
    if ((rxlen = event_read(rsp, 0, 0)) > 0)
    {
        iohp = (IOCTL_HDR *)&rsp->data[rsp->cmd.sdpcm.hdrlen]; 
        hdrlen = rsp->cmd.sdpcm.hdrlen + sizeof(IOCTL_HDR);
        if (rsp->rsp.chan==SDPCM_CHAN_CTRL && 
            (cmd==0 || cmd==iohp->cmd))
        {
            n = MIN(dlen, rxlen-hdrlen);
            if (data && n>0)
                memcpy(data, &rsp->data[hdrlen], n);
            if (cmd)
            {
                if (iohp->status)
                    n = -1;
            }
        }
    }
    return(cmd==0 ? rxlen : n>0 ? n : 0);
}

You’ll note that the response has been obtained using the ‘event_read’ function, which handles all incoming data (solicited or unsolicited) from the WiFi interface; it will be described in detail below.

The IOCTL response has a similar format to the request, except that it generally has a lot of padding after the SDPCM header. This means that (unlike the transmit message) the receiver has to decode the SDPCM header ‘hdrlen’ value, in order to know how much padding has been added in front of the IOCTL header.

In addition to the IOCTL GET_VAR call that reads the value of a variable, given its name as a string, and its partner SET_VAR that writes a new value to that variable, there nearly 300 other IOCTL calls, such as SET_ANTDIV (command 64) which controls the antenna diversity, or UP (command 2) which is used to activate the WiFi interface.

Events

The WiFi chip signals an event when it has something to report to the host processor, for example it has succeeded in joining a WiFi network, or it has just received a data packet from that network.

As discussed above, there is a time-delay associated with any IOCTL command, so the IOCTL response might arrive within a stream of other events. So my code treats any incoming message as a potential event, and establishes its purpose by decoding the SDPCM header.

This raises the question of how the host CPU knows that there is an incoming event; the answer is that it can poll the BUS_SPI_STATUS_REG, to see if the ‘function 2 packet available’ flag is set. Alternatively, to avoid excessive polling cycles, the host can just check the IRQ line (described in part 1) and if that is high, there is an event pending. I use a combined approach; check the IRQ line, but is there hasn’t been any event for 10 milliseconds, check the status register:

#define SPI_STATUS_LEN_SHIFT            9
#define SPI_STATUS_LEN_MASK             0x7ff

// Get ioctl response, async event, or network data.
int event_get_resp(void *data, int maxlen)
{
    uint32_t val=0;
    int rxlen=0;
    
    val = wifi_reg_read(SD_FUNC_BUS, SPI_STATUS_REG, 4);
    if ((val != ~0) && (val & SPI_STATUS_PKT_AVAIL))
    {
        rxlen = (val >> SPI_STATUS_LEN_SHIFT) & SPI_STATUS_LEN_MASK;
        rxlen = MIN(rxlen, maxlen);
        // Read event data if present
        if (data && rxlen>0)
            wifi_data_read(SD_FUNC_RAD, 0, data, rxlen);
        // ..or clear interrupt, and discard data
        else
        {
            val = wifi_reg_read(SD_FUNC_BUS, SPI_INTERRUPT_REG, 2);
            wifi_reg_write(SD_FUNC_BUS, SPI_INTERRUPT_REG, val, 2);
            wifi_reg_write(SD_FUNC_BAK, SPI_FRAME_CONTROL, 0x01, 1);
        }
    }
    return(rxlen);
}

The status register has a flag to indicate data is available on function 2 (the radio interface), and also a length value, indicating how many bytes there are to read. Once that has been read in, the SDPCM header is checked, and the data after that header is copied into a buffer.

// Get ioctl response, async event, or network data
// Optionally copy data after SDPCM & BDC headers into a buffer, return its length
int event_read(IOCTL_MSG *rsp, void *data, int dlen)
{
    int rxlen=0, n=0, hdrlen;
    SDPCM_HDR *sdp=&rsp->cmd.sdpcm;
    BDC_HDR *bdcp;
    
    if ((rxlen = event_get_resp(rsp, sizeof(IOCTL_MSG))) >= sizeof(SDPCM_HDR)+sizeof(BDC_HDR))
    {
        if ((sdp->len ^ sdp->notlen) == 0xffff)
        {
            hdrlen = sdp->hdrlen;
            bdcp = (BDC_HDR *)&rsp->data[hdrlen];
            hdrlen += sizeof(BDC_HDR) + bdcp->offset*4;
            n = MIN(dlen, rxlen-hdrlen);
            if (data && n>0)
                memcpy(data, &rsp->data[hdrlen], n);
        }
    }
    return(dlen>0 ? (n>0 ? n : 0) : rxlen);
}

At the top of these function calls is the polling function, which stores the SDPCM values in a local structure (EVENT_INFO), and takes appropriate action with the data. The reason why a local structure is used is that the event header is in ‘network’ byte-order, which is big-endian (most-significant byte first), so the data is byte-swapped before being stored locally.

Since there may be multiple event handlers, and the the the polling function can’t know which one is the correct destination for the event, it calls each one in turn, stopping when one returns a non-zero value, indicating that it has accepted the event.

// Poll for async event, put results in info structure
int event_poll(void)
{
    EVENT_INFO *eip = &event_info;
    IOCTL_MSG *iomp = &ioctl_rxmsg;
    ESCAN_RESULT *erp=(ESCAN_RESULT *)rxdata;
    EVENT_HDR *ehp = &erp->eventh;
    int n = event_read(iomp, rxdata, sizeof(rxdata));
    
    if (n > 0)
    {
        eip->chan = iomp->rsp.sdpcm.chan;
        eip->flags = SWAP16(ehp->flags);
        eip->event_type = SWAP32(ehp->event_type);
        eip->status = SWAP32(ehp->status);
        eip->reason = SWAP32(ehp->reason);
        eip->data = rxdata;
        eip->dlen = n;
        if (eip->chan == SDPCM_CHAN_CTRL)
            display(DISP_EVENT, "\n");
        else if ((eip->chan==SDPCM_CHAN_EVT || eip->chan==SDPCM_CHAN_DATA) &&
            n >= sizeof(ETHER_HDR)+sizeof(BCMETH_HDR)+sizeof(EVENT_HDR))
            ok = event_handle(eip);
    }
    return(ok);
}

Handling an event

The code calls handler functions in turn, until one returns a non-zero value, indicating it has accepted the event.

#define MAX_HANDLERS    10
typedef int (*event_handler_t)(EVENT_INFO *eip);
event_handler_t event_handlers[MAX_HANDLERS];
int num_handlers;

// Run event handlers, until one returns non-zero
int event_handle(EVENT_INFO *eip)
{
    int ret=0;
    
    for (int i=0; i<num_handlers && !ret; i++)
        ret = event_handlers[i](eip);
    return(ret);
}

An event handler is called with a pointer to the EVENT_INFO structure, which basically contains a copy of the SDPCM header information (in the correct byte-order) and a pointer to the data after that header. The function must return zero if it hasn’t recognised the event. As an example, here is a simple handler that displays the result of a network scan:

// Handler for scan events
int scan_event_handler(EVENT_INFO *eip)
{
    ESCAN_RESULT *erp=(ESCAN_RESULT *)eip->data;
    int ret = eip->chan==SDPCM_CHAN_EVT && eip->event_type==WLC_E_ESCAN_RESULT;
    
    if (ret)
    {
        if (erp->eventh.status == 0)
        {
            printf("Scan complete\n");
            ret = -1;
        }
        else
        {
            printf("%s '", mac_addr_str(erp->info.bssid));
            disp_ssid(&erp->info.ssid_len);
            printf("' chan %d\n", SWAP16(erp->info.channel));
        }
    }
    return(ret);
}

Note that the ESCAN_RESULT data is in ‘network’ byte-order, so needs to be byte-swapped before being displayed.

This handler has to be added to the array of handlers using a function call:

add_event_handler(scan_event_handler);

This allows you to implement your own event handlers, in addition to, or instead of, the functions I have provided.

Enabling events

There are over 140 possible events, and by default they are disabled; we need to enable those we are interested in, such as network authentication & joining, so we can detect any problems.

The enabling process uses a (very large) bitfield, each bit indicating whether an event is enabled or disabled; the resulting byte array is sent to the WiFi CPU using an IOCTL call.

#define EVENT_MAX           208
#define SET_EVENT(msk, e)   msk[4 + e/8] |= 1 << (e & 7)

uint8_t event_mask[EVENT_MAX / 8];

// Enable events
int events_enable(const EVT_STR *evtp)
{
    memset(event_mask, 0, sizeof(event_mask));
    while (evtp->num >= 0)
    {
        if (evtp->num / 8 < sizeof(event_mask))
            SET_EVENT(event_mask, evtp->num);
        evtp++;
    }
    return(ioctl_set_data("bsscfg:event_msgs", 10, event_mask, sizeof(event_mask)));
}

I have used an unusual method to specify the events that are to be enabled; a macro is used to store the event number, and a string corresponding to the event name. This means that I can display event names (instead of numbers) on a diagnostic console, which is very useful to show any problems.

// Storage for event number, and string for diagnostics
typedef struct {
    int num;
    char *str;
} EVT_STR;
#define EVT(e)      {e, #e}

const EVT_STR join_evts[]={EVT(WLC_E_JOIN), EVT(WLC_E_ASSOC), EVT(WLC_E_REASSOC), 
    EVT(WLC_E_ASSOC_REQ_IE), EVT(WLC_E_ASSOC_RESP_IE), EVT(WLC_E_SET_SSID),
    EVT(WLC_E_LINK), EVT(WLC_E_AUTH), EVT(WLC_E_PSK_SUP),  EVT(WLC_E_EAPOL_MSG),
    EVT(WLC_E_DISASSOC_IND), EVT(WLC_E_DISASSOC_IND), EVT(-1)};

In the next part of this project we’ll be scanning and joining a network.

Project links
IntroductionProject overview
Part 1Low-level interface; hardware & software
Part 2Initialisation; CYW43xxx chip setup
Part 3IOCTLs and events; driver communication
Part 4Scan and join a network; WPA security
Part 5ARP, IP and ICMP; IP addressing, and ping
Part 6DHCP; fetching IP configuration from server
Part 7DNS; domain name lookup
Part 8UDP server socket
Source codeFull C source code

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.

PicoWi part 2: initialisation

PicoWi initialisation steps

In part 1, I described the low-level hardware & software interface to the Broadcom / Cypress / Infineon CYW43439 WiFi chip, and its close relative, the CYW4343W.

Now we need to initialise the chip, so it is ready to receive network commands and data. This involves sending three files to the chip, and unfortunately there is no simple Application Programming Interface (API) to do this; it is necessary to send a detailed sequence of commands, and if there are any errors, you usually end up with a completely unresponsive WiFi chip.

The first step is to check the Active Low Power (ALP) clock; this involves setting a register, then waiting for the WiFi chip to acknowledge that setting.

bool wifi_init(void)
{
    wifi_reg_write(SD_FUNC_BAK, BAK_CHIP_CLOCK_CSR_REG, SD_ALP_REQ, 1);
    if (!wifi_reg_val_wait(10, SD_FUNC_BAK, BAK_CHIP_CLOCK_CSR_REG, 
                           SD_ALP_AVAIL, SD_ALP_AVAIL, 1))
        return(false);

This wait-and-loop scenario is quite common, so I’ve created a specific function for it:

// Check register value every msec, until correct or timeout
bool wifi_reg_val_wait(int ms, int func, int addr, uint32_t mask, uint32_t val, int nbytes)
{
    bool ok;
    
    while (!(ok=wifi_reg_read_check(func, addr, mask, val, nbytes)) && ms--)
        usdelay(1000);
    return(ok);
}

// Read & check a masked value, return zero if incorrect
bool wifi_reg_read_check(int func, int addr, uint32_t mask, uint32_t val, int nbytes)
{
    return((wifi_reg_read(func, addr, nbytes) & mask) == val);
}

A value is obtained from the register, masked with an AND-function, then compared with the required value. If the comparison is false, the code delays for one millisecond, then tries again, until the given time (in milliseconds) has expired.

This raises the question as to what the code should do when it encounters an error such as this; should it try to re-send the command? In practice, the timeout generally means that the internal state of the chip is incorrect; for example, there may have been a bug in the code, or a power glitch, and the only way to correct this situation is to re-power the chip, and start again – fortunately the initialisation process is quite fast (it only takes a few seconds) so this isn’t a major problem.

Assuming the ALP check passes, there are some more register write cycles that I can’t explain in detail, as I don’t have access to any information about the chip that isn’t publicly available.

We then make 2 writes to registers in banked memory, that do deserve more explanation.

#define BAK_BASE_ADDR           0x18000000
#define SRAM_BASE_ADDR          (BAK_BASE_ADDR+0x4000)
#define SRAM_BANKX_IDX_REG      (SRAM_BASE_ADDR+0x10)
#define SRAM_BANKX_PDA_REG      (SRAM_BASE_ADDR+0x44)

wifi_bak_reg_write(SRAM_BANKX_IDX_REG, 0x03, 4);
wifi_bak_reg_write(SRAM_BANKX_PDA_REG, 0x00, 4);

The backplane function can only access a 32K block in the WiFi RAM, so the two addresses we’re writing (18004010 and 18004044 hex) are outside its access range, and we have to use bank-switching. There is a simple check to see if the bank has changed since the last access, in which case no switching is needed:

#define SB_32BIT_WIN    0x8000
#define SB_ADDR_MASK    0x7fff
#define SB_WIN_MASK     (~SB_ADDR_MASK)

// Set backplane window if address has changed
void wifi_bak_window(uint32_t addr)
{
    static uint32_t lastaddr=0;

    addr &= SB_WIN_MASK;
    if (addr != lastaddr)
        wifi_reg_write(SD_FUNC_BAK, BAK_WIN_ADDR_REG, addr>>8, 3);
    lastaddr = addr;
}

// Write a 1 - 4 byte value via the backplane window
int wifi_bak_reg_write(uint32_t addr, uint32_t val, int nbytes)
{
    wifi_bak_window(addr);
    return(wifi_reg_write(SD_FUNC_BAK, addr, val, nbytes));
}

We can now load the binary ARM firmware into the WiFi processor; it is in a file that is unique to the specific Wifi chip, so different files are needed for the CYW43439 and CYW4343w; these are the only differences in the way the chips are programmed.

const unsigned char fw_firmware_data[] = {
0x00,0x00,0x00,0x00,0x65,0x14,0x00,0x00,0x91,0x13,0x00,0x00,0x91,0x13,0x00,0x00,0x91,0x13,0x00,0x00,0x91,0x13,0x00,0x00,0x91,0x13,0x00,0x00,0x91,0x13,0x00,0x00,0x91,0x13,0x00,0x00,0x91,0x13,0x00,0x00,
..and so on..
}
const unsigned int fw_firmware_len = sizeof(fw_firmware_data);

wifi_data_load(SD_FUNC_BAK, 0, fw_firmware_data, fw_firmware_len);

As previously mentioned, the backplane can only access a 32K block, and each SPI access to the backplane is limited to 64 bytes, so the data loading function walks though the memory in 64-byte blocks, and when 32K is reached, the access window is moved up, and the loading resumes at address 0.

#define MAX_BLOCKLEN    64

// Load data block into WiFi chip (CPU firmware or NVRAM file)
int wifi_data_load(int func, uint32_t dest, const unsigned char *data, int len)
{
    int nbytes=0, n;
    uint32_t oset=0;
    
    wifi_bak_window(dest);
    dest &= SB_ADDR_MASK;
    while (nbytes < len)
    {
        if (oset >= SB_32BIT_WIN)
        {
            wifi_bak_window(dest+nbytes);
            oset -= SB_32BIT_WIN;
        }
        n = MIN(MAX_BLOCKLEN, len-nbytes);
        wifi_data_write(func, dest+oset, (uint8_t *)&data[nbytes], n);
        nbytes += n;
        oset += n;
    }
    return(nbytes);
}

After a delay to allow the WiFi chip to settle, the next item to be loaded is the non-volatile RAM (NVRAM) data. This is in the form of a C character array, with each entry being null-terminated, e.g.

const unsigned char fw_nvram_data[0x300] = {
    "manfid=0x2d0"  "\x00"
    "prodid=0x0727" "\x00"
    "vendid=0x14e4" "\x00"
    ..and so on..
}
const unsigned int fw_nvram_len = sizeof(fw_nvram_data);

Loading this file uses the same function as the firmware, with a different base, and a write-cycle to confirm the file length:

#define NVRAM_BASE_ADDR     0x7FCFC

wifi_data_load(SD_FUNC_BAK, NVRAM_BASE_ADDR, fw_nvram_data, fw_nvram_len);
n = ((~(fw_nvram_len / 4) & 0xffff) << 16) | (fw_nvram_len / 4);
wifi_reg_write(SD_FUNC_BAK, SB_32BIT_WIN | (SB_32BIT_WIN-4), n, 4);

Now it is necessary to reset the WiFi processor core, wait for the indication that the High Throughput (HT) clock is available, then wait for an ‘event’ that signals the device is ready. The most common fault I experienced when developing the code was that it gets stuck at this point, waiting for a confirmation that never comes.

// Reset, and wait for High Throughput (HT) clock ready
wifi_core_reset(false);
if (!wifi_reg_val_wait(50, SD_FUNC_BAK, BAK_CHIP_CLOCK_CSR_REG, 
                              SD_HT_AVAIL, SD_HT_AVAIL, 1))
    return(false);
// Wait for backplane ready
if (!wifi_rx_event_wait(100, SPI_STATUS_F2_RX_READY))
    return(false);

Events are the main way that the WiFi chip sends signals or data asynchronously to the RP2040; for a detailed description of how they work, see the next part.

Once the system has signalled it is ready, the Country Locale Matrix (CLM) file has to be loaded. This binary file limits the WiFi parameters (e.g. transmit power level) to be within the regulatory constraints for the specific RF hardware and locale.

const unsigned char fw_clm_data[] = {
	0x42,0x4C,0x4F,0x42,0x3C,0x00,0x00,0x00,0xFA,0x69,0xE0,0xBB,
	0x01,0x00,0x00,0x00,0x02,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    ..and so on..
};
const unsigned int fw_clm_len = sizeof(fw_clm_data);

wifi_clm_load(fw_clm_data, fw_clm_len);

The loading function uses a specific structure to send IOCTL data blocks to the WiFi chip, with flags to mark the beginning & end of the sequence:

#define MAX_LOAD_LEN        512

typedef struct {
	uint16_t flag;
	uint16_t type;
	uint32_t len;
	uint32_t crc;
} CLM_LOAD_HDR;

typedef struct {
    char req[8];
    CLM_LOAD_HDR hdr;
} CLM_LOAD_REQ;

// Load CLM
int wifi_clm_load(const unsigned char *data, int len)
{
    int nbytes=0, oset=0, n;
    CLM_LOAD_REQ clr = {.req="clmload", .hdr={.type=2, .crc=0}};
    
    while (nbytes < len)
    {
        n = MIN(MAX_LOAD_LEN, len-nbytes);
        clr.hdr.flag = 1<<12 | (nbytes?0:2) | (nbytes+n>=len?4:0);
        clr.hdr.len = n;
        ioctl_set_data2((void *)&clr, sizeof(clr), 1000, (void *)&data[oset], n);
        nbytes += n;
        oset += n;
    }
    return(nbytes);
}

Example program

To show all this code in action, we can run our first complete program;

// PicoWi blinking LED test

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>
#include "picowi_pico.h"
#include "picowi_spi.h"
#include "picowi_init.h"

int main() 
{
    uint32_t led_ticks;
    bool ledon=false;
    
    io_init();
    printf("PicoWi LED blink\n");
    set_display_mode(DISP_INFO);
    if (!wifi_setup())
        printf("Error: SPI communication\n");
    else if (!wifi_init())
        printf("Error: can't initialise WiFi\n");
    else
    {
        ustimeout(&led_ticks, 0);
        while (1)
        {
            if (ustimeout(&led_ticks, 500000))
                wifi_set_led(ledon = !ledon);
        }
    }
}

This sets up the WiFi chip as described above, prints the 6-byte MAC address, then just loops, flashing the LED that is attached to the Wifi chip at 1 Hz.

The display_mode function controls how much diagnostic information you might want to see, using a bitfield so you can combine multiple options:

// Display mask values
#define DISP_NOTHING    0       // No display
#define DISP_INFO       0x01    // General information
#define DISP_SPI        0x02    // SPI transfers
#define DISP_REG        0x04    // Register read/write
#define DISP_SDPCM      0x08    // SDPCM transfers
#define DISP_IOCTL      0x10    // IOCTL read/write
#define DISP_EVENT      0x20    // Event reception
#define DISP_DATA       0x40    // Data transfers

This can potentially provide a lot of diagnostic information, the main limitation being the speed of the console display – I use a serial link at 460800 baud, since the default of 9600 is much too slow. To see some of the internal workings of PicoWi, try:

set_display_mode(DISP_INFO|DISP_REG|DISP_IOCTL);

You can call this function multiple times with different mode values, to concentrate the diagnostic information on a specific area of interest, and avoid displaying a lot of unwanted information while the WiFi chip is being initialised.

The other unusual feature is the use of the ustimeout function, which I’ve used it in place of the more conventional delay function call, as I don’t want the delay to block all other CPU activity. In a simple program this isn’t an issue, but in later examples I want to do other things (such as checking for events) while waiting for the LED to blink, so can’t use a simple delay.

The ustimeout function takes two arguments; a pointer to a variable, and a timeout value in microseconds (zero if immediate). When the specified time has elapsed, the function returns a non-zero value and reloads the variable with the current time. So you can add extra function calls to the main loop, without affecting the LED blinking.

The code to control the LED uses a single Device I/O Control (IOCTL) call with 2 arguments; the first is a bit-mask, and the second is the value:

#define SD_LED_GPIO     0

// Set WiFi LED on or off
void wifi_set_led(bool on)
{
    ioctl_set_intx2("gpioout", 10, 1<<SD_LED_GPIO, on ? 1<<SD_LED_GPIO : 0);
}

For details on how to build & run this example program, see the introduction.

IOCTL calls are the primary mechanism for high-level communication with the WiFi chip; see the next part for a detailed description.

Project links
IntroductionProject overview
Part 1Low-level interface; hardware & software
Part 2Initialisation; CYW43xxx chip setup
Part 3IOCTLs and events; driver communication
Part 4Scan and join a network; WPA security
Part 5ARP, IP and ICMP; IP addressing, and ping
Part 6DHCP; fetching IP configuration from server
Part 7DNS; domain name lookup
Part 8UDP server socket
Source codeFull C source code

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.

PicoWi part 1: low-level interface

Pi Pico W wireless architecture

The WiFi interface on the Pico W uses the Broadcom/Cypress/Infineon CYW43439; this is a ‘full’ Media Access and Control (MAC) chip, so in theory you can just tell it to join a network, or send a block of data, and it’ll handle all the low-level operations.

However, in practice there is a lot more complication than that, and it takes a very large number of carefully-timed commands before the chip will start up, let alone do anything useful. This is because it actually contains two processors (ARM M3 and D11), each with their own memory and I/O, and they both have to be programmed before any network operations can start.

The CYW43439 is part of a large family of 43xxx wireless interfaces; most use PCI, USB or SDIO interface for communications with a host processor, but in this case communications is via a Serial Peripheral Interface (SPI) that is half-duplex, i.e. a single wire is used to carry commands & data to the WiFi chip, and also the responses from that chip, as described in the device datasheet.

SPI interface

Pi Pico-W interface to CYW43439

This excerpt from the Pico-W circuit diagram shows the interface between the CYW43439 WiFi chip and the RP2040 CPU. The connections on the WiFi chip are labelled as if there were an SDIO interface, since they are dual-function:

SDIO functionSPI functionRP2040 pin
WL_REG_ONPower up (ON)GP23
SDIO_CLKClock (CLK)GP29
SDIO_CMDData in (MOSI)GP24
SDIO_DATA0Data out (MISO)GP24 via 470R
SDIO_DATA1Interrupt (IRQ)GP24 via 10K
SDIO_DATA2Mode select (SEL)GP24
SDIO_DATA3Chip select (CS)GP25

On chips with dual interfaces, the state of DATA2 at power-up determines which interface is to be used; for SPI, this pin must be held low, before REG_ON is set high to power up the chip.

A single data line is shared between CMD for commands & data going to the WiFi chip, and DATA0 for the returned responses. Just in case there is a clash of I/O (e.g. both the CPU and WiFi chip transmitting at the same time) there is a 470 ohm protection resistor in series with DATA0.

The chip-select line is as usual for SPI interfaces; when it is high, the WiFi interface is disconnected from the data lines. This allows the over-worked data line to be used for a third purpose, namely interrupt request (IRQ) to the RP2040 CPU; when the interface is idle, IRQ is normally low, but goes high when the WiFi chip has some data to send (e.g. a new data packet has been received). To ensure that the IRQ line doesn’t interfere with communications, it is connected via a 10K resistor.

Debug with CYW4343W

When developing this software, there was a major problem; the Pico-W components and PCB tracks are so fine that I couldn’t attach an oscilloscope or logic analyser to the SPI connections. This makes debugging the low-level drivers very difficult, especially when programming the PIO peripheral in the RP2040.

The solution I adopted was to add a second CYW43xxx interface to the Pico; unfortunately I couldn’t find a convenient CYW43439 module, but Avnet sell an FPGA add-on board (part number AES-PMOD-MUR-1DX-G) with a Murata 1DX module containing a CYW4343W. This is sufficiently similar to the 43439 that no code modifications are required, just a different firmware file, as described in part 2 of this post.

Circuitry to add Murata 1DX (CYW4343W) module to Pi Pico

I’ve had to tweak the resistor values slightly; this is because D0 – D4 pins on the module have 10K pullup resistors, so R2 has to be lower to compensate.

The choice of GPIO pins is completely arbitrary, since the code can work with any pin taking any function; I chose D0 – D3 as GP16 – GP19, since it will allow me to experiment with an SDIO interface at a future date. If you are only interested in emulating the standard Pico-W interface, then the connections to GP16 – GP18 can be omitted, since the SPI select line (D2) has a pull-down resistor.

The resulting circuitry fits very neatly onto a Pico prototyping board; by keeping the connections short, it works fine with SPI speeds up to 16 MHz. The only problems I encountered in constructing the hardware were that the PMOD connector has an unusual pin-numbering, and is mis-labelled as Bluetooth.

Pi Pico with Murata 1DX (CYW4343W) add-on module

Using this setup, it is easy to capture the SPI waveforms; here is an oscilloscope trace using a (relatively leisurely) 2 MHz clock for clarity.

SPI transfer waveforms

This shows a 4-byte command on the MOSI line, followed by a 4-byte MISO response, which also appears on the MOSI line, due to the 470 ohm resistor linking the two.

SPI software

Normally we’d just use the RP2040 built-in SPI controller to access the WiFi chip, but that has specific sets of pins it can use, which differ from those that are connected to the chip. This isn’t a major problem, as we can use the built-in Programmble I/O (PIO) to do the transfers, but initially I’d just like to check that the hardware works, before diving into PIO programming. So the first step is to write a ‘bit-bashed’ driver, that uses direct access to the I/O bits.

The write-cycle is quite conventional, where a bit (most-significant bit first) is put onto the data line, and the clock is toggled high then low:

// Write data to SPI interface
void spi_write(uint8_t *data, int nbits)
{
    uint8_t b=0;
    int n=0;

    io_mode(SD_CMD_PIN, IO_OUT);
    usdelay(SD_CLK_DELAY);
    b = *data++;
    while (n < nbits)
    {
        IO_WR(SD_CMD_PIN, b & 0x80);
        IO_WR(SD_CLK_PIN, 1);
        b <<= 1;
        if ((++n & 7) == 0)
            b = *data++;
        IO_WR(SD_CLK_PIN, 0);
    }
    usdelay(SD_CLK_DELAY);
    io_mode(SD_CMD_PIN, IO_IN);
    usdelay(SD_CLK_DELAY);
}

If the command is a data-read, the data is read immediately, then the clock is toggled high and low. This is unusual, and can cause confusion in some protocol decoders:

// Read data from SPI interface
void spi_read(uint8_t *data, int nbits)
{
    uint8_t b;
    int n=0;

    data--;
    while (n < nbits)
    {
        b = IO_RD(SD_DIN_PIN);
        IO_WR(SD_CLK_PIN, 1);
        if ((n++ & 7) == 0)
            *++data = 0;        
        *data = (*data << 1) | b;
        IO_WR(SD_CLK_PIN, 0);
    }
 }

A write-cycle to the WiFi chip involves creating a command message as described in the data sheet, setting chip-select (CS) low, and transferring that message, followed by the data.

#define SWAP16_2(x) ((((x)&0xff000000)>>8) | (((x)&0xff0000)<<8) | \
                    (((x)&0xff00)>>8)      | (((x)&0xff)<<8))
#define SD_FUNC_BUS         0
#define SD_FUNC_BAK         1
#define SD_FUNC_RAD         2
#define SD_FUNC_SWAP        4
#define SD_FUNC_BUS_SWAP    (SD_FUNC_BUS | SD_FUNC_SWAP)
#define SD_FUNC_MASK        (SD_FUNC_SWAP - 1)

typedef struct
{
    uint32_t len:11, addr:17, func:2, incr:1, wr:1;
} SPI_MSG_HDR;

// SPI message
typedef union
{
    SPI_MSG_HDR hdr;
    uint32_t vals[2];
    uint8_t bytes[2048];
} SPI_MSG;

// Write a data block using SPI
int wifi_data_write(int func, int addr, uint8_t *dp, int nbytes)
{
    SPI_MSG msg={.hdr = {.wr=1, .incr=1, .func=func&SD_FUNC_MASK,
                         .addr=addr, .len=nbytes}};
    if (func & SD_FUNC_SWAP)
        msg.vals[0] = SWAP16_2(msg.vals[0]);
    io_out(SD_CS_PIN, 0);
    usdelay(SD_CLK_DELAY);
    if (nbytes <= 4)
    {
        memcpy(&msg.bytes[4], dp, nbytes);
        spi_write((uint8_t *)&msg, 64);
    }
    else
    {
        spi_write((uint8_t *)&msg, 32);
        usdelay(SD_CLK_DELAY);
        spi_write(dp, nbytes*8);
    }
    usdelay(SD_CLK_DELAY);
    io_out(SD_CS_PIN, 1);
    usdelay(SD_CLK_DELAY);
    return(nbytes);
}

The command header contains:

  • Length: byte-count of data
  • Address: location to receive the data
  • Function number: destination for the transfer
  • Increment: flag to enable address auto-increment
  • Write: flag to indicate a write-cycle

The unusual item is the function number, that selects which peripheral within the WiFi chip will receive the data; a value of 0 selects the SPI interface, 1 the backplane, and 2 the radio. Functions 0 & 1 are limited to a maximum size of 64 bytes, and are generally used for device configuration, whilst function 2 is used for transferring network data, which can be up to 2048 bytes. I’ve also created a dummy function 4, which is used to signal that a word-swap is required, when the chip is uninitialised.

The read cycle has a similar structure:

// Read data block using SPI
int wifi_data_read(int func, int addr, uint8_t *dp, int nbytes)
{
    SPI_MSG msg={.hdr = {.wr=0, .incr=1, .func=func&SD_FUNC_MASK,
                         .addr=addr, .len=nbytes}};
    uint8_t data[4];

    if (func & SD_FUNC_SWAP)
        msg.vals[0] = SWAP16_2(msg.vals[0]);
    else if (func == SD_FUNC_BAK)
        msg.hdr.len += 4;
    io_out(SD_CS_PIN, 0);
    usdelay(SD_CLK_DELAY);
    spi_write((uint8_t *)&msg, 32);
    io_mode(SD_CMD_PIN, IO_IN);
    usdelay(SD_CLK_DELAY);
    if (func == SD_FUNC_BAK)
        spi_read(data, 32);
    usdelay(SD_CLK_DELAY);
    spi_read(dp, nbytes*8);
    usdelay(SD_CLK_DELAY);
    io_mode(SD_CMD_PIN, IO_OUT);
    io_out(SD_CS_PIN, 1);
    return(nbytes);
}

When making a ‘backplane’ read, the first 4 return bytes are discarded; they are padding to give the remote peripheral time to respond.

Now that we have the necessary read/write functions, we can perform a simple check to see if the WiFi chip is responding. The data sheet describes several ‘gSPI registers’ and the ‘test read-only’ register at address 0x14 has the defined constant 0xFEEDBEAD. The first attempt to read this register generally fails, but subsequent reads should return the desired value:

#define SPI_TEST_VALUE 0xfeedbead
bool ok=0;
for (int i=0; i<4 && !ok; i++)
{
    usdelay(2000);
    val = wifi_reg_read(SD_FUNC_BUS_SWAP, 0x14, 4);
    ok = (val == SPI_TEST_VALUE);
}
if (!ok)
    printf("Error: SPI test pattern %08lX\n", val);

Next we need to configure the SPI interface to our preferences, using register 0, as described in the datasheet. The main change is to eliminate the awkward byte-swapping, but to do that, we need to send a byte-swapped command:

// Write a register using SPI
int spi_reg_write(int func, uint32_t addr, uint32_t val, int nbytes)
{
    if (func&SD_FUNC_SWAP && nbytes>1)
        val = SWAP16_2(val);
    return(wifi_data_write(func, addr, (uint8_t *)&val, nbytes));
}

wifi_reg_write(SD_FUNC_BUS_SWAP, SPI_BUS_CONTROL_REG, 0x204b3, 4);

Now we can re-read the test register without the awkward byte-swapping:

wifi_reg_read(SD_FUNC_BUS, 0x14, 4);

Another parameter we’ve set is ‘high-speed mode’, which means that reading & writing occur on the rising clock edge.

Using RP2040 PIO

To maximise the speed of SPI transfers, we need to use a peripheral within the RP2040 CPU. Normally this would be an SPI controller, but this can not control the pins that are connected to the WiFi chip, so we have to use the Programmable I/O (PIO) peripheral instead.

There are plenty of online tutorials explaining how PIO works; it is basically a small state-machine that is programmed in assembly-language. It operates in a highly deterministic fashion, at a rate of up to 125M instructions per second, so is ideally suited to handling the SPI interface.

I wanted to use the PIO as a direct replacement for the bit-bashed spi_read and spi_write functions described above, so the PIO program is:

; Pico PIO program for half-duplex SPI transfers
.program picowi_pio
.side_set 1
.wrap_target
.origin 0
public stall:               ; Stall here when transfer complete
    pull            side 0  ; Get byte to transmit from FIFO
loop1:
    nop             side 0  ; Idle with clock low
    in pins, 1      side 0  ; Fetch next Rx bit
    out pins, 1     side 0  ; Set next Tx bit 
    nop             side 1  ; Idle high
    jmp !osre loop1 side 1  ; Loop if data in shift reg
    push            side 0  ; Save Rx byte in FIFO
.wrap
; EOF

The origin statement ensures the program is loaded at address zero, rather than the default, which is to load it at the top of program memory.

The ‘pull’ instruction fetches an 8-bit value from the transmit first-in first-out (FIFO) buffer, then there is a loop to output & input the individual bits of that byte until the transmit shift register is empty, and the receive register is full, so the latter can be pushed onto the receive FIFO.

So for SPI write, the associated C code just needs to keep the 4-entry transmit FIFO topped up with the outgoing data, and discard the incoming data, so the receive FIFO doesn’t overflow.

static PIO my_pio = pio0;
uint my_sm = pio_claim_unused_sm(my_pio, true);
io_rw_8 *my_txfifo = (io_rw_8 *)&my_pio->txf[0];

// Write data block to SPI interface,
// When complete, set data pin as I/P (so it is available for IRQ)
void pio_spi_write(unsigned char *data, int len)
{
    config_output(1);
    pio_sm_clear_fifos(my_pio, my_sm);
    while (len)
    {
        if (!pio_sm_is_tx_fifo_full(my_pio, my_sm))
        {
            *my_txfifo = *data++;
            len --;
        }
        if (!pio_sm_is_rx_fifo_empty(my_pio, my_sm))
            pio_sm_get(my_pio, my_sm);
    }
    while (!pio_sm_is_tx_fifo_empty(my_pio, my_sm) || !pio_complete())
    {
        while (!pio_sm_is_rx_fifo_empty(my_pio, my_sm))
            pio_sm_get(my_pio, my_sm);
    }
    pio_sm_get(my_pio, my_sm);
    config_output(0);
}

The SPI read code fills the transmit FIFO with null bytes, and fetches the incoming data from the receive FIFO:

// Read data block from SPI interface
void pio_spi_read(unsigned char *data, int rxlen)
{
    int txlen=rxlen;
    pio_sm_clear_fifos(my_pio, my_sm);
    while (rxlen > 0 || !pio_complete())
    {
        if (txlen>0 && !pio_sm_is_tx_fifo_full(my_pio, my_sm))
        {
            *my_txfifo = 0;
            txlen--;
        }
        if (!pio_sm_is_rx_fifo_empty(my_pio, my_sm))
        {
            *data++ = pio_sm_get(my_pio, my_sm);
            rxlen--;
        }
    }
}

Since the reading of a WiFi register involves an SPI write cycle closely followed by a read cycle, it is important that the write cycle is complete before the read cycle starts. This issue proved to be the biggest problem with the PIO code; it is easy to detect when the transmit FIFO is empty, but the code must carry on waiting until the last bit of the last byte has been shifted out. This means that I have to use an explicitly-coded loop in the assembly language, with a check of shift-register-empty, rather than using the auto-load capability of the input & output instructions.

The other tricky issue was how the assembly-language program should signal to the C program that the output-shift is complete. In theory, I can use an IRQ flag to do this signalling, but in practice I could not make that work reliably – the technique would only work at specific clock frequencies, which suggested that there might be a critical race between the two sets of code. The problem with timing-sensitive code is that a small unrelated change to the main program (e.g. addition of an interrupt) can cause the code to fail in a manner that is very difficult to diagnose, so it is essential that the code works reliably over a wide range of SPI frequencies.

The solution I adopted is encapsulated in the pio_complete function:

// Check to see if PIO transfer complete (stalled at FIFO pull)
static inline int pio_complete(void)
{
    return(my_pio->sm[my_sm].addr == picowi_pio_offset_stall);
}

This compares the current PIO execution address with the ‘stall’ label in the PIO code; if the transmit FIFO is empty, and this comparison is true, then the PIO is stalled waiting for more data, having shifted out everything it was given.

In a single-threaded program, it won’t be too difficult to keep the transmit FIFO topped up, and the receive FIFO emptied, so there is no risk of an overflow or underflow causing problems. However, this is more difficult when the program is multi-tasking, so it will be necessary to add Direct Memory Access (DMA) transfers to the current code.

In the next part we’ll initialise the WiFi chip.

Project links
IntroductionProject overview
Part 1Low-level interface; hardware & software
Part 2Initialisation; CYW43xxx chip setup
Part 3IOCTLs and events; driver communication
Part 4Scan and join a network; WPA security
Part 5ARP, IP and ICMP; IP addressing, and ping
Part 6DHCP; fetching IP configuration from server
Part 7DNS; domain name lookup
Part 8UDP server socket
Source codeFull C source code

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.

PicoWi: standalone WiFi driver for the Pi Pico W

Introduction

Raspberry Pi Pico W

The aim of this project is to provide a fast WiFi driver for the CYW43439 chip on the Pi Pico W module, with C code running on the RP2040 processor; it can also be used with similar Broadcom / Cypress / Infineon chips that have an SPI interface, such as the CYW4343W.

It is based on my Zerowi project which performs a similar function on the Pi Zero device CYW43438, which has an SDIO interface. However, due to myriad difficulties getting the code running, the code has been restructured and simplified to emphasise the various stages in setting up the chip, and to provide copious run-time diagnostics.

The structured approach of the WiFi drivers is mirrored in the example programs, and the individual parts of this blog; they range from a simple LED-flash program, to one that provides some TCP/IP functionality.

A major problem with debugging Pico-W code is the difficulty attaching any hardware diagnostic tools, such as an oscilloscope or logic analyser; this has been addressed by supporting an add-on board with a Murata 1DX module and CYW4343W chip; full details are given in part 1 of this project.

Development environment

For simplicity, I use a Raspberry Pi 4 to build the code and program the Pico, with two I/O lines connected to the Pico SWD interface. This is really easy to set up, using a single script that installs the SDK and all the necessary software tools on the Pi 4:

wget https://raw.githubusercontent.com/raspberrypi/pico-setup/master/pico_setup.sh
chmod +x pico_setup.sh
./pico_setup.sh

For SWD programming, the Pico must be connected to the I/O pins on the Pi as follows:

Pico SWCLK   Pi pin 22 (GPIO 25)
Pico GND     Pi pin 20
Pico SWDIO   Pi pin 18 (GPIO 24)

The serial interface is used extensively for displaying diagnostic information; pin 1 of the Pico is the serial output, and pin 3 is ground. I use a 3.3 volt FTDI USB-serial adaptor to display the serial output on a PC, but the Pi 4 serial input could be used instead.

If you want to develop using a Windows PC, you can find several online guides as to how this might be set up, but I experienced problems with 2 of the methods I tried. So far, the only good setup for Windows I’ve found is VisualGDB, which has a wide range of very useful features, but is not free.

The PicoWi source code is on github. To load it on the Pi 4:

cd ~
git clone http://github.com/jbentham/picowi
cd ~/picowi/build
chmod +x prog
chmod +x reset

I’ve included the necessary CMakeLists.txt, so the project can be built using the following commands:

cd ~/picowi/build
cmake ..      # create the makefiles
make picowi   # make the picowi library
make blink    # make the 'blink' application
./prog blink  # program the RP2040, and run the application

When building the current pico-SDK, there are some ‘parameter passing’ warnings when the pio assembler is compiled; these can be ignored.

Compilation is reasonably fast on the Pi 4; once the SDK libraries have been built, you can do a complete re-build of the PicoWi library and application within 10 seconds, and reprogram the RP2040 in under 3 seconds.

The ‘reset’ command is useful when you just want to restart the Pico, without loading any new code.

If OpenOCD reports ‘read incorrect DLIPDR’ then there is a problem with the wiring. I’ve set the SWD speed to 2 MHz, which should work error-free, providing the wires are sufficiently short (e.g. under 6 inches or 150 mm) and there are good power & ground connections between the Pi & Pico. I use a short USB cable to power the Pico from the Pi, and this is generally problem-free, though sometimes the Pi won’t boot with the Pico connected; this appears to be a USB communication problem.

Compile-time settings

There is currently only one setting in the CMakeLists.txt file, to choose between the on-board CYW43439 device, or an external CYW4343W module:

# Set to 0 for Pico-W CYW43439, 1 for Murata 1DX (CYW4343W)
set (CHIP_4343W 0)

The Pico-specific settings are in picowi_pico.h:

#define USE_GPIO_REGS   0       // Set non-zero for direct register access
                                // (boosts SPI from 2 to 5.4 MHz read, 7.8 MHz write)
#define SD_CLK_DELAY    0       // Clock on/off delay time in usec
#define USE_PIO         1       // Set non-zero to use Pico PIO for SPI
#define PIO_SPI_FREQ    8000000 // SPI frequency if using PIO

These affect the way the SPI interface is driven; the default is to use the Pico PIO (programmable I/O) with the given clock frequency; 8 MHz is a conservative value, I have run it at 12MHz, and higher speeds should be possible with some tweaking of the I/O settings.

Setting USE_PIO to zero will enable a ‘bit-bashed’ (or ‘bit-banged’) driver; this can run over 7 MHz if using direct register writes, or 2 MHz if using normal function calls.

You’ll note that I haven’t included a driver for the SPI peripheral inside the RP2040; this would have been easier to use than the PIO peripheral, but the on-board CYW43439 chip isn’t connected to suitable I/O pins. The actual pins used are defined in picowi_pico.h:

#define SD_ON_PIN       23
#define SD_CMD_PIN      24
#define SD_DIN_PIN      24
#define SD_D0_PIN       24
#define SD_CS_PIN       25
#define SD_CLK_PIN      29
#define SD_IRQ_PIN      24

You’ll see that pin 24 is performing multiple functions; this hardware configuration is discussed in detail in the next part of this blog. If you are using an external module, the pin definitions can be modified to use any of the RP2040 I/O pins.

Diagnostic settings

My code makes extensive use of a serial console for diagnostic purposes, and I generally use an FTDI USB-serial adaptor connected to pin 1 of the Pico module to monitor this. In view of the large amount of information that can be produced, I modify the serial interface to run at 460800 baud:

Edit pico-sdk/src/rp2_common/hardware_uart/include/hardware/uart.h
Change definition to:
  #define PICO_DEFAULT_UART_BAUD_RATE 460800

I originally tried running the interface at 921600 baud, but this resulted in occasional characters being lost, which is a major problem when trying to understand what is going wrong with the code.

You can use the Pico USB link instead; it must be enabled in the CMakeLists.txt, using the name of the main file, for example to enable it for the ‘ping’ example program:

pico_enable_stdio_usb(ping 1)

Then you can use a terminal program such as minicom on the Pi 4 to view the console:

# Run minicom, press ctrl-A X to exit. 
minicom -D /dev/ttyACM0 -b 460800

A disadvantage of this approach is that when the Pico is reprogrammed, its CPU is reset, which causes a failure of the USB link. After a few seconds, the link is re-established, but there will be a gap in the console display, which can be misleading. Also, there is the possibility that the extra workload of maintaining a (potentially very busy) USB connection might cause timing problems, so if you are making extensive use of the diagnostics, it might be necessary to use a hard-wired serial interface.

You can control the extent to which diagnostic data is reported on the console; this is done by inserting function calls, rather than using compile-time definitions, to give fine-grained control. The display options are in a bitfield, so can be individually enabled or disabled, for example:

// Display SPI traffic details
set_display_mode(DISP_INFO|DISP_EVENT|DISP_SDPCM|
                 DISP_REG|DISP_JOIN|DISP_DATA);

// Display nothing
set_display_mode(DISP_NOTHING);

// Display ARP and ICMP data transfers
set_display_mode(DISP_ARP|DISP_IP|DISP_ICMP);

WiFi network

For the time being, the code does not support the Access Point functionality within the WiFi chip. It can only join a network that is unencrypted, or with WPA1 or WPA2 encryption, as set in the file picowi_join.h:

// Security settings: 0 for none, 1 for WPA_TKIP, 2 for WPA2
#define SECURITY            2

The network name (SSID) and password are defined in the ‘main’ file for each application, e.g. join.c or ping.c, which means they are insecure, as they can be seen by anyone with access to the source code or binary executable:

// Insecure password for test purposes only!!!
#define SSID                "testnet"
#define PASSWD              "testpass"

Other resources

The data sheets for the CYW43439 and CYW4343W are well worth a read, as they contain a good description of the low-level SPI interface, but contain nothing on the inner workings of these incredibly complicated chips. The Infineon WICED development environment has very comprehensive coverage of the WiFi chips, though it would take some work to port this code to the RP2040. The Pi Pico SDK contains the full source code to drive the CYW43439, with the lwIP (lightweight IP) open-source TCP/IP stack.

I’m using a different approach, with a completely new low-level driver, and a built-in IP stack to maximise throughput, as described in the following parts:

Project links
IntroductionProject overview
Part 1Low-level interface; hardware & software
Part 2Initialisation; CYW43xxx chip setup
Part 3IOCTLs and events; driver communication
Part 4Scan and join a network; WPA security
Part 5ARP, IP and ICMP; IP addressing, and ping
Part 6DHCP; fetching IP configuration from server
Part 7DNS; domain name lookup
Part 8UDP server socket
Source codeFull C source code

I’ll be releasing updates with more TCP/IP functionality.

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.

Web display for Pi Pico oscilloscope

Web oscilloscope display

In part 1 of this series, I added WiFi connectivity to the Pi Pico using an ESP32 moduleand MicroPython. Part 2 showed how Direct Memory Access (DMA) can be used to get analog samples at regular intervals from the Pico on-board Analog Digital Converter (ADC).

I’m now combining these two techniques with some HTML and Javascript code to create a Web display in a browser, but since this code will be quite complicated, first I’ll sort out how the data is fetched from the Pico Web server.

Data request

The oscilloscope display will require user controls to alter the sample rate, number of samples, and any other settings we’d like to change. These values must be sent to the Web server, along with a filename that will trigger the acquisition. To fetch 1000 samples at 10000 samples per second, the request received by the server might look like:

GET /capture.csv?nsamples=1000&xrate=10000

If you avoid any fancy characters, the Python code in the server that extracts the filename and parameters isn’t at all complicated:

ADC_SAMPLES, ADC_RATE = 20, 100000
parameters = {"nsamples":ADC_SAMPLES, "xrate":ADC_RATE}

# Get HTTP request, extract filename and parameters
req = esp.get_http_request()
if req:
    line = req.split("\r")[0]
    fname = get_fname_params(line, parameters)

# Get filename & parameters from HTML request
def get_fname_params(line, params):
    fname = ""
    parts = line.split()
    if len(parts) > 1:
        p = parts[1].partition('?')
        fname = p[0]
        query = p[2].split('&')
        for param in query:
            p = param.split('=')
            if len(p) > 1:
                if p[0] in params:
                    try:
                        params[p[0]] = int(p[1])
                    except:
                        pass
    return fname

The default parameter names & values are stored in a dictionary, and when the URL is decoded, and names that match those in the dictionary will have their values updated. Then the data is fetched using the parameter values, and returned in the form of a comma-delimited (CSV) file:

if CAPTURE_CSV in fname:
    vals = adc_capture()
    esp.put_http_text(vals, "text/csv", esp32.DISABLE_CACHE)

The name ‘comma-delimited’ is a bit of a misnomer in this case, we just with the given number of lines, with one floating-point voltage value per line.

Requesting the data

Before diving into the complexities of graphical display and Javascript, it is worth creating a simple Web page to fetch this data.

The standard way of specifying parameters with a file request is to define a ‘form’ that will be submitted to the server. The parameter values can be constrained using ‘select’, to avoid the user entering incompatible numbers:

<html><!DOCTYPE html><html lang="en">
<head><meta charset="utf-8"/></head><body>
  <form action="/capture.csv">
    <label for="nsamples">Number of samples</label>
    <select name="nsamples" id="nsamples">
      <option value=100>100</option>
      <option value=200>200</option>
	  <option value=500>500</option>
      <option value=1000>1000</option>
    </select>
    <label for="xrate">Sample rate</label>
    <select name="xrate" id="xrate">
      <option value=1000>1000</option>
      <option value=2000>2000</option>
	  <option value=5000>5000</option>
      <option value=10000>10000</option>
    </select>
	<input type="submit" value="Submit">
  </form>
</body></html>

This generates a very simple display on the browser:

Form to request ADC samples

On submitting the form, we get back a raw list of values:

CSV data

Since the file we have requested is pure CSV data, that is all we get; the controls have vanished, and we’ll have to press the browser ‘back’ button if we want to retry the transaction. This is quite unsatisfactory, and to improve it there are various techniques, for example using a template system to always add the controls at the top of the data. However, we also want the browser to display the data graphically, which means a sizeable amount of Javascript, so we might as well switch to a full-blown AJAX implementation, as mentioned in the first part.

AJAX

To recap, AJAX originally stood for ‘Asynchronous JavaScript and XML’, where the Javascript on the browser would request an XML file from the server, then display data within that file on the browser screen. However, there is no necessity that the file must be XML; for simple unstructured data, CSV is adequate.

The HTML page is similar to the previous one, the main changes are that we have specified a button that’ll call a Javascript function when clicked, and there is a defined area to display the response data; this is tagged as ‘preformatted’ so the text will be displayed in a plain monospaced style.

  <form id="captureForm">
    <label for="nsamples">Number of samples</label>
    <select name="nsamples" id="nsamples">
      <option value=100>100</option>
      <option value=200>200</option>
	  <option value=500>500</option>
      <option value=1000>1000</option>
    </select>
    <label for="xrate">Sample rate</label>
    <select name="xrate" id="xrate">
      <option value=1000>1000</option>
      <option value=2000>2000</option>
	  <option value=5000>5000</option>
      <option value=10000>10000</option>
    </select>
    <button onclick="doSubmit(event)">Submit</button>
  </form>
  <pre><p id="responseText"></p></pre>

The button calls the Javascript function ‘doSubmit’ when clicked, with the click event as an argument. As this button is in a form, by default the browser would attempt to re-fetch the current document using the form data, so we need to block this behaviour and substitute the action we want, which is to wait until the response is obtained, and display it in the area we have allocated. This is ‘asynchronous’ (using a callback function) so that the browser doesn’t stall waiting for the response.

function doSubmit() {
  // Eliminate default action for button click
  // (only necessary if button is in a form)
  event.preventDefault();

  // Create request
  var req = new XMLHttpRequest();

  // Define action when response received
  req.addEventListener( "load", function(event) {
    document.getElementById("responseText").innerHTML = event.target.responseText;
  } );

  // Create FormData from the form
  var formdata = new FormData(document.getElementById("captureForm"));

  // Collect form data and add to request
  var params = [];
  for (var entry of formdata.entries()) {
    params.push(entry[0] + '=' + entry[1]);
  }
  req.open( "GET", "/capture.csv?" + encodeURI(params.join("&")));
  req.send();
}

The resulting request sent by the browser looks something like:

GET /capture.csv?nsamples=100&xrate=1000

This is created by looping through the items in the form, and adding them to the base filename. When doing this, there is a limited range of characters we can use, in order not to wreck the HTTP request syntax. I have used the ‘encodeURI’ function to encode any of these unusable characters; this isn’t necessary with simple parameters that are just alphanumeric values, but if I’d included a parameter with free-form text, this would be needed. For example, if one parameter was a page title that might include spaces, then the title “Test page” would be encoded as

GET /capture.csv?nsamples=100&xrate=1000&title=Test%20page

You may wonder why I am looping though the form entries, when in theory they can just be attached to the HTTP request in one step:

// Insert form data into request - doesn't work!
req.open("GET", "/capture.csv");
req.send(formdata);

I haven’t been able to get this method to work; I think the problem is due to the way the browser adapts the request if a form is included, but in the end it isn’t difficult to iterate over the form entries and add them directly to the request.

The resulting browser display is a minor improvement over the previous version, in that it isn’t necessary to use the ‘back’ button to re-fetch the data, but still isn’t very pretty.

Partial display of CSV data

Graphical display

There many ways to display graphic content within a browser. The first decision is whether to use vector graphics, or a bitmap; I prefer the former, since it allows the display to be resized without the lines becoming jagged.

There is a vector graphics language for browsers, namely Scalable Vector Graphics (SVG) and I have experimented with this, but find it easier to use Javascript commands to directly draw on a specific area of the screen, known as an ‘HTML canvas’, that is defined within the HTML page:

<div><canvas id="canvas1"></canvas></div>

To draw on this, we create a ‘2D context’ in Javascript:

var ctx1 = document.getElementById("canvas1").getContext("2d");

We can now use commands such as ‘moveto’ and ‘lineto’ to draw on this context; a useful first exercise is to draw a grid across the display.

var ctx1, xdivisions=10, ydivisions=10, winxpad=10, winypad=30;
var grid_bg="#d8e8d8", grid_fg="#40f040";
window.addEventListener("load", function() {
  ctx1 = document.getElementById("canvas1").getContext("2d");
  resize();
  window.addEventListener('resize', resize, false);
} );

// Draw grid
function drawGrid(ctx) {
  var w=ctx.canvas.clientWidth, h=ctx.canvas.clientHeight;
  var dw = w/xdivisions, dh=h/ydivisions;
  ctx.fillStyle = grid_bg;
  ctx.fillRect(0, 0, w, h);
  ctx.lineWidth = 1;
  ctx.strokeStyle = grid_fg;
  ctx.strokeRect(0, 1, w-1, h-1);
  ctx.beginPath();
  for (var n=0; n<xdivisions; n++) {
    var x = n*dw;
    ctx.moveTo(x, 0);
    ctx.lineTo(x, h);
  }
  for (var n=0; n<ydivisions; n++) {
    var y = n*dh;
    ctx.moveTo(0, y);
    ctx.lineTo(w, y);
  }
  ctx.stroke();
}

// Respond to window being resized
function resize() {
  ctx1.canvas.width = window.innerWidth - winxpad*2;
  ctx1.canvas.height = window.innerHeight - winypad*2;
  drawGrid(ctx1);
}

I’ve included a function that resizes the canvas to fit within the window, which is particularly convenient when getting a screen-grab for inclusion in a blog post:

All that remains is to issue a request, wait for the response callback, and plot the CSV data onto the canvas.

var running=false, capfile="/capture.csv"

// Do a single capture (display is done by callback)
function capture() {
  var req = new XMLHttpRequest();
  req.addEventListener( "load", display);
  var params = formParams()
  req.open( "GET", capfile + "?" + encodeURI(params.join("&")));
  req.send();
}

// Display data (from callback event)
function display(event) {
  drawGrid(ctx1);
  plotData(ctx1, event.target.responseText);
  if (running) {
    window.requestAnimationFrame(capture);
  }
}

// Get form parameters
function formParams() {
  var formdata = new FormData(document.getElementById("captureForm"));
  var params = [];
  for (var entry of formdata.entries()) {
    params.push(entry[0]+ '=' + entry[1]);
  }
  return params;
}

A handy feature is to have the display auto-update when the current data has been displayed; I’ve done this by using requestAnimationFrame to trigger another capture cycle, if the global ‘running’ variable is set. Then we just need some buttons to control this feature:

<button id="single" onclick="doSingle()">Single</button>
<button id="run"  onclick="doRun()">Run</button>
// Handle 'single' button press
function doSingle() {
  event.preventDefault();
  running = false;
  capture();
}

// Handle 'run' button press
function doRun() {
  event.preventDefault();
  running = !running;
  capture();
}

The end result won’t win any prizes for style or speed, but it does serve as a useful basis for acquiring & displaying data in a Web browser.

100 Hz sine wave

You’ll see that the controls have been rearranged slightly, and I’ve also added a ‘simulate’ checkbox; this invokes MicroPython code in the Pico Web server that doesn’t use the ADC; instead it uses the CORDIC algorithm to incrementally generate sine & cosine values, which are multiplied, with some random noise added:

# Simulate ADC samples: sine wave plus noise
def adc_sim():
    nsamp = parameters["nsamples"]
    buff = array.array('f', (0 for _ in range(nsamp)))
    f, s, c = nsamp/20.0, 1.0, 0.0
    for n in range(0, nsamp):
        s += c / f
        c -= s / f
        val = ((s + 1) * (c + 1)) + random.randint(0, 100) / 300.0
        buff[n] = val
    return "\r\n".join([("%1.3f" % val) for val in buff])
Distorted sine wave with random noise added

Running the code

If you haven’t done so before, I suggest you run the code given in the first and second parts, to check the hardware is OK.

Load rp_devices.py and rp_esp32.py onto the Micropython filesystem, not forgetting to modify the network name (SSID) and password at the top of that file. Then load the HTML files rpscope_capture, rpscope_ajax and rpscope_display, and run the MicroPython server rp_adc_server.py using Thonny. The files are on Github here.

You should then be able to display the pages as shown above, using the IP address that is displayed on the Thonny console; I’ve used 10.1.1.11 in the examples above.

When experimenting with alternative Web pages, I found it useful to run a Web server on my PC, as this allows a much faster development process. There are many ways to do this, the simplest is probably to use the server that is included as standard in Python 3:

python -m http.server 8000

This makes the server available on port 8000. If the Web browser is running on the same PC as the server, use the ‘localhost’ address in the browser, e.g.

http://127.0.0.1:8000/rpscope_display.html

This assumes the HTML file is in the same directory that you used to invoke the Web server. If you also include a CSV file named ‘capture.csv’, then it will be displayed as if the data came from the Pico server.

However, there is one major problem with this approach: the CSV file will be cached by the browser, so if you change the file, the display won’t change. This isn’t a problem on the Pico Web server, as it adds do-not-cache headers in the HTTP response. The standard Python Web server doesn’t do that, so will use the cached data, even after the file has changed.

One other issue is worthy of mention; in my setup, the ESP32 network interface sometimes locks up after it has transferred a significant amount of data, which means the Web server becomes unresponsive. This isn’t an issue with the MicroPython code, since the ESP32 doesn’t respond to pings when it is in this state. I’m using ESP32 Nina firmware v 1.7.3; hopefully, by the time you read this, there is an update that fixes the problem.

Copyright (c) Jeremy P Bentham 2021. Please credit this blog if you use the information or software in it.

Pi Pico ADC input using DMA and MicroPython

Analog data capture using DMA

This is the second part of my Web-based Pi Pico oscilloscope project. In the first part I used an Espressif ESP32 to add WiFi connectivity to the Pico, and now I’m writing code to grab analog data from the on-chip Analog-to-Digital Converter (ADC), which can potentially provide up to 500k samples/sec.

High-speed transfers like this normally require code written in C or assembly-language, but I’ve decided to use MicroPython, which is considerably slower, so I need to use hardware acceleration to handle the data rate, specifically Direct Memory Access (DMA).

MicroPython ‘uctypes’

MicroPython does not have built-in functions to support DMA, and doesn’t provide any simple way of accessing the registers that control the ADC, DMA and I/O pins. However it does provide a way of defining these registers, using a new mechanism called ‘uctypes’. This is vaguely similar to ‘ctypes’ in standard Python, which is used to define Python interfaces for ‘foreign’ functions, but defines hardware registers, using a very compact (and somewhat obscure) syntax.

To give a specific example, the DMA controller has multiple channels, and according to the RP2040 datasheet section 2.5.7, each channel has 4 registers, with the following offsets:

0x000 READ_ADDR
0x004 WRITE_ADDR
0x008 TRANS_COUNT
0x00c CTRL_TRIG

The first three of these require simple 32-bit values, but the fourth has a complex bitfield:

Bit 31:   AHB_ERROR
Bit 30:   READ_ERROR
..and so on until..
Bits 3-2: DATA_SIZE
Bit 1:    HIGH_PRIORITY
Bit 0:    EN

With MicroPython uctypes, we can define the registers, and individual bitfields within those registers, e.g.

from uctypes import BF_POS, BF_LEN, UINT32, BFUINT32
DMA_CHAN_REGS = {
    "READ_ADDR_REG":       0x00|UINT32,
    "WRITE_ADDR_REG":      0x04|UINT32,
    "TRANS_COUNT_REG":     0x08|UINT32,
    "CTRL_TRIG_REG":       0x0c|UINT32,
    "CTRL_TRIG":          (0x0c,DMA_CTRL_TRIG_FIELDS)
}
DMA_CTRL_TRIG_FIELDS = {
    "AHB_ERROR":   31<<BF_POS | 1<<BF_LEN | BFUINT32,
    "READ_ERROR":  30<<BF_POS | 1<<BF_LEN | BFUINT32,
..and so on until..
    "DATA_SIZE":    2<<BF_POS | 2<<BF_LEN | BFUINT32,
    "HIGH_PRIORITY":1<<BF_POS | 1<<BF_LEN | BFUINT32,
    "EN":           0<<BF_POS | 1<<BF_LEN | BFUINT32
}

The UINT32, BF_POS and BF_LEN entries may look strange, but they are just a way of encapsulating the data type, bit position & bit count into a single variable, and once that has been defined, you can easily read or write any element of the bitfield, e.g.

# Set DMA data source to be ADC FIFO
dma_chan.READ_ADDR_REG = ADC_FIFO_ADDR

# Set transfer size as 16-bit words
dma_chan.CTRL_TRIG.DATA_SIZE = 1

You may wonder why there are 2 definitions for one register: CTRL_TRIG and CTRL_TRIG_REG. Although it is useful to be able to manipulate individual bitfields (as in the above code) sometimes you need to write the whole register at one time, for example to clear all fields to zero:

# Clear the CTRL_TRIG register
dma_chan.CTRL_TRIG_REG = 0

An additional complication is that there are 12 DMA channels, so we need to define all 12, then select one of them to work on:

DMA_CHAN_WIDTH  = 0x40
DMA_CHAN_COUNT  = 12
DMA_CHANS = [struct(DMA_BASE + n*DMA_CHAN_WIDTH, DMA_CHAN_REGS)
    for n in range(0,DMA_CHAN_COUNT)]

DMA_CHAN = 0
dma_chan = DMA_CHANS[DMA_CHAN]

To add even more complication, the DMA controller also has a single block of registers that are not channel specific, e.g.

DMA_REGS = {
    "INTR":               0x400|UINT32,
    "INTE0":              0x404|UINT32,
    "INTF0":              0x408|UINT32,
    "INTS0":              0x40c|UINT32,
    "INTE1":              0x414|UINT32,
..and so on until..
    "FIFO_LEVELS":        0x440|UINT32,
    "CHAN_ABORT":         0x444|UINT32
}

So to cancel all DMA transactions on all channels:

DMA_DEVICE = struct(DMA_BASE, DMA_REGS)
dma = DMA_DEVICE
dma.CHAN_ABORT = 0xffff

Single ADC sample

MicroPython has a function for reading the ADC, but we’ll be using DMA to grab multiple samples very quickly, so this function can’t be used; we need to program the hardware from scratch. A useful first step is to check that we can produce sensible values for a single ADC sample. Firstly the I/O pin needs to be set as an analog input, using the uctype definitions. There are 3 analog input channels, numbered from 0 to 2:

import rp_devices as devs
ADC_CHAN = 0
ADC_PIN  = 26 + ADC_CHAN
adc = devs.ADC_DEVICE
pin = devs.GPIO_PINS[ADC_PIN]
pad = devs.PAD_PINS[ADC_PIN]
pin.GPIO_CTRL_REG = devs.GPIO_FUNC_NULL
pad.PAD_REG = 0

Then we clear down the control & status register, and the FIFO control & status register; this is only necessary if they have previously been programmed:

adc.CS_REG = adc.FCS_REG = 0

Then enable the ADC, and select the channel to be converted:

adc.CS.EN = 1
adc.CS.AINSEL = ADC_CHAN

Now trigger the ADC for one capture cycle, and read the result:

adc.CS.START_ONCE = 1
print(adc.RESULT_REG)

These two lines can be repeated to get multiple samples.

If the input pin is floating (not connected to anything) then the value returned is impossible to predict, but generally it seems to be around 50 to 80 units. The important point is that the value fluctuates between samples; if several samples have exactly the same value, then there is a problem.

Multiple ADC samples

Since MicroPython isn’t fast enough to handle the incoming data, I’m using DMA, so that the ADC values are copied directly into memory without any software intervention.

However, we don’t always want the ADC to run at maximum speed (500k samples/sec) so need some way of triggering it to fetch the next sample after a programmable delay. The RP2040 designers have anticipated this requirement, and have equipped it with a programmable timer, driven from a 48 MHz clock. There is also a mechanism that allows the ADC to automatically sample 2 or 3 inputs in turn; refer to the RP2040 datasheet for details.

Assuming the ADC has been set up as described above, the additional code is required. First we define the DMA channel, the number of samples, and the rate (samples per second).

DMA_CHAN = 0
NSAMPLES = 10
RATE = 100000
dma_chan = devs.DMA_CHANS[DMA_CHAN]
dma = devs.DMA_DEVICE

We now have to enable the ADC FIFO, create a 16-bit buffer to hold the samples, and set the sample rate:

adc.FCS.EN = adc.FCS.DREQ_EN = 1
adc_buff = array.array('H', (0 for _ in range(NSAMPLES)))
adc.DIV_REG = (48000000 // RATE - 1) << 8
adc.FCS.THRESH = adc.FCS.OVER = adc.FCS.UNDER = 1

The DMA controller is configured with the source & destination addresses, and sample count:

dma_chan.READ_ADDR_REG = devs.ADC_FIFO_ADDR
dma_chan.WRITE_ADDR_REG = uctypes.addressof(adc_buff)
dma_chan.TRANS_COUNT_REG = NSAMPLES

The DMA destination is set to auto-increment, with a data size of 16 bits; the data request comes from the ADC. Then DMA is enabled, waiting for the first request.

dma_chan.CTRL_TRIG_REG = 0
dma_chan.CTRL_TRIG.CHAIN_TO = DMA_CHAN
dma_chan.CTRL_TRIG.INCR_WRITE = dma_chan.CTRL_TRIG.IRQ_QUIET = 1
dma_chan.CTRL_TRIG.TREQ_SEL = devs.DREQ_ADC
dma_chan.CTRL_TRIG.DATA_SIZE = 1
dma_chan.CTRL_TRIG.EN = 1

Before starting the sampling, it is important to clear down the ADC FIFO, by reading out any existing samples – if this step is omitted, the data you get will be a mix of old & new, which can be very confusing.

while adc.FCS.LEVEL:
    x = adc.FIFO_REG

We can now set the START_MANY bit, and the ADC will start generating samples, which will be loaded into its FIFO, then transferred by DMA to the RAM buffer. Once the buffer is full (i.e. the DMA transfer count has been reached, and its BUSY bit is cleared) the DMA transfers will stop, but the ADC will keep trying to put samples in the FIFO until the START_MANY bit is cleared.

adc.CS.START_MANY = 1
while dma_chan.CTRL_TRIG.BUSY:
    time.sleep_ms(10)
adc.CS.START_MANY = 0
dma_chan.CTRL_TRIG.EN = 0

We can now print the results, converted into a voltage reading:

vals = [("%1.3f" % (val*3.3/4096)) for val in adc_buff]
print(vals)

As with the single-value test, the displayed values should show some dithering; if the input is floating, you might see something like:

['0.045', '0.045', '0.047', '0.046', '0.045', '0.046', '0.045', '0.046', '0.046', '0.041']

Running the code

If you are unfamiliar with the process of loading MicroPython onto the Pico, or loading files into the MicroPython filesystem, I suggest you read my previous post.

The source files are available on Github here; you need to load the library file rp_devices.py onto the MicroPython filesystem, then run rp_adc_test.py; I normally run this using Thonny, as it simplifies the process of editing, running and debugging the code.

In the next part I combine the ADC sampling and the network interface to create a networked oscilloscope with a browser interface.

Copyright (c) Jeremy P Bentham 2021. Please credit this blog if you use the information or software in it.