DNS – Lean2

The Domain Name System (DNS) is the top layer of the 3-layer addressing scheme, as described in the previous part of this blog. To recap:

The Domain Name System translates a domain name such as www.google.com into an Internet Protocol (IP) address; an IPv4 address consists of 4 decimal numbers in the range 0-255 with ‘.’ separators, e.g. 142.250.187.228.

The Address Resolution Protocol (ARP) translates an IP address into a Media Access Control (MAC) address, consisting of 6 hexadecimal values with ‘:’ separators, e.g. 28:CD:C1:00:D9:20.

The MAC address is used for all communication within a network, to identify a sender or intended recipient.

In the early days of the Internet there was a simple one-to-one relationship between a domain name and an IP address, so DNS just required a simple lookup on a remote database. Nowadays the name-to-address mapping is much more complex, but we can still take a simple approach: give DNS a name, and it will return an IP address, or several IP addresses if there are alternatives.

DNS specification

The specifications for DNS, and all other aspects of TCP/IP communication, are in the form of documents called Request For Comments (RFC); DNS is RFC-1034 and RFC-1035. These documents are freely available, and don’t just describe the protocol, but also give background information on the rationale for the design decisions that were made. RFCs are generally very clearly written, with a minimum of jargon, and are well worth a read.

A DNS lookup consists of a single User Datagram Protocol (UDP) message sent to a server, and a single matching UDP response. UDP is an ‘unreliable’ protocol, so there is no guarantee that a response will be received; if none arrives, the request can just be repeated.

DNS request

The request consists of a fixed-format header, plus variable-format data. The fixed section has just six 2-byte values:

typedef struct
{
    WORD ident,     // Message identification number
         flags,     // Option flags
         n_query,   // Number of queries
         n_ans,     // Number of answers
         n_auth,    // Number of authority records
         n_rr;      // Number of additional records
} DNS_HDR;

All values are in ‘network’ byte-order, so are most-significant-byte first. The meaning of the flags can best be described by looking at the decode of a typical request in Wireshark:

Domain Name System (query)
    Transaction ID: 0x1234
    Flags: 0x0100 Standard query
        0... .... .... .... = Response: Message is a query
        .000 0... .... .... = Opcode: Standard query (0)
        .... ..0. .... .... = Truncated: Message is not truncated
        .... ...1 .... .... = Recursion desired: Do query recursively
        .... .... .0.. .... = Z: reserved (0)
        .... .... ...0 .... = Non-authenticated data: Unacceptable
    Questions: 1
    Answer RRs: 0
    Authority RRs: 0
    Additional RRs: 0
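As a rough illustration of the flag layout in this decode, the individual fields can be picked out of the 16-bit flags word with masks. This is only a sketch: the macro and function names are my own (they are not part of the project code), it uses the standard stdint types rather than BYTE and WORD, and the bit positions are those given in RFC 1035.

```c
#include <assert.h>
#include <stdint.h>

// Bit masks for the 16-bit DNS flags word (hypothetical names)
#define DNS_FLAG_QR     0x8000  // 0 = query, 1 = response
#define DNS_FLAG_OPCODE 0x7800  // 4-bit opcode, 0 = standard query
#define DNS_FLAG_TC     0x0200  // Message truncated
#define DNS_FLAG_RD     0x0100  // Recursion desired
#define DNS_FLAG_RCODE  0x000F  // Response code, 0 = no error

// Read the flags word from a raw DNS header (bytes 2 and 3, MSB first)
uint16_t dns_get_flags(const uint8_t *hdr)
{
    assert(hdr != 0);
    return (uint16_t)((hdr[2] << 8) | hdr[3]);
}

// Non-zero if the message is a response
int dns_is_response(uint16_t flags)
{
    return (flags & DNS_FLAG_QR) != 0;
}

// Opcode value (0 for a standard query)
int dns_opcode(uint16_t flags)
{
    return (flags & DNS_FLAG_OPCODE) >> 11;
}

// Response code (0 if there was no error)
int dns_rcode(uint16_t flags)
{
    return flags & DNS_FLAG_RCODE;
}
```

For the request shown above, the flags word is 0x0100, so only the ‘recursion desired’ bit is set.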

The header is followed by a variable-length question field containing the domain name, then a 16-bit query type and query class, which normally both have a value of 1 (type A, class IN). Each part of the name is prefixed by a length byte, and the name is terminated by a zero length, e.g.

LEN w w w  LEN r a s p b e r r y p i  LEN o r g  NUL Type Class
03  777777 0b  7261737062657272797069 03  6f7267 00  0001 0001

The code to create the header and data is relatively simple:

// Format DNS request data, given name string
int dns_add_hdr_data(BYTE *buff, char *s)
{
    BYTE *p, *q;
    DNS_HDR *dhp = (DNS_HDR *)buff;
    int len = sizeof(DNS_HDR);
    static int ident = 1;

    memset(dhp, 0, sizeof(DNS_HDR));
    dhp->ident = htons(ident++);
    dhp->flags = htons(0x100);  // Recursion desired
    dhp->n_query = htons(1);
    p = q = &buff[len];
    while (*s)                  // Prefix each part with length byte
    {
        p++;                    // Leave a gap for the length byte
        while (*s && *s != '.')
            *p++ = (BYTE)*s++;
        *q = (BYTE)(p - q - 1); // Go back and fill in the length
        q = p;
        if (*s)                 // Skip the '.' delimiter
            s++;
    }
    *p++ = 0;   // Null terminator
    *p++ = 0;   // Type A (host address)
    *p++ = 1;
    *p++ = 0;   // Class IN
    *p++ = 1;
    return (p - buff);
}

Transmitting this message is just a question of adding on the Ethernet, IP and UDP headers; this is done in a slightly strange order, since the UDP header must include a checksum calculated from the subsequent (DNS) data.

// Transmit DNS request
int dns_tx(MACADDR mac, IPADDR dip, WORD sport, char *s)
{
    // Ethernet header first, then DNS data, leaving a gap for IP & UDP headers
    int len = ip_add_eth(txbuff, mac, my_mac, PCOL_IP);
    int dlen = dns_add_hdr_data(&txbuff[len + sizeof(IPHDR) + sizeof(UDPHDR)], s);

    // Fill in the IP and UDP headers, now that the DNS data is in place
    len += ip_add_hdr(&txbuff[len], dip, PUDP, sizeof(UDPHDR) + dlen);
    len += udp_add_hdr_data(&txbuff[len], sport, DNS_SERVER_PORT, 0, dlen);
    return (ip_tx_eth(txbuff, len));
}
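To show why the data has to be in place before the UDP header is completed, here is a minimal sketch of the standard UDP checksum calculation. It is not the project's own udp_add_hdr_data() code; the function names are hypothetical, and it assumes IPv4 addresses held as 4-byte arrays. The checksum is a ones-complement sum over a ‘pseudo-header’ (source and destination IP addresses, protocol number 17, and UDP length), followed by the UDP header and all the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Add a block of bytes to a running ones-complement sum, treating the
// data as big-endian 16-bit words; an odd final byte is padded with zero,
// so only the last block added may have an odd length
uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    assert(data != 0 || len == 0);
    while (len > 1)
    {
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)(data[0] << 8);
    return sum;
}

// Fold the carries back in, and return the inverted 16-bit result
uint16_t csum_finish(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

// UDP checksum; 'udp' points to the UDP header with its checksum field
// set to zero, and 'udp_len' is the length of UDP header plus data
uint16_t udp_checksum(const uint8_t sip[4], const uint8_t dip[4],
                      const uint8_t *udp, uint16_t udp_len)
{
    uint8_t pseudo[4] = {0, 17, (uint8_t)(udp_len >> 8), (uint8_t)udp_len};
    uint32_t sum = csum_add(0, sip, 4);
    uint16_t result;

    sum = csum_add(sum, dip, 4);
    sum = csum_add(sum, pseudo, 4);
    sum = csum_add(sum, udp, udp_len);
    result = csum_finish(sum);
    return result ? result : 0xffff;    // Zero means 'no checksum' in UDP
}
```

A receiver can check the message by running the same sum over the pseudo-header and the complete UDP message, including the checksum field; csum_finish() then returns zero if the data is intact.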

DNS response

The response (if any) will arrive from the WiFi chip as an ‘event’, and it has to go through IP and UDP pre-processing before arriving at the DNS decoder, or any other protocol handler that matches the incoming data. I’ve already established an ‘add_event_handler’ mechanism for distributing events to various handlers, and am using a similar mechanism for distributing incoming UDP traffic, by giving the standard DNS port number:

#define DNS_SERVER_PORT	53
add_event_handler(udp_event_handler);
udp_sock_init(udp_dns_handler, zero_ip, 0, DNS_SERVER_PORT);

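// Return number of DNS responses (answer records) in a received frame
int dns_num_resps(BYTE *buff, int len)
{
    // DNS header follows the Ethernet, IP and UDP headers
    DNS_HDR *dhp = (DNS_HDR *)&buff[sizeof(ETHERHDR) + sizeof(IPHDR) + sizeof(UDPHDR)];
    // Only read the count if the frame has data after the DNS header;
    // htons() does the same byte swap as network-to-host conversion
    return (len > sizeof(ETHERHDR)+sizeof(IPHDR)+sizeof(UDPHDR)+sizeof(DNS_HDR) ?
        htons(dhp->n_ans) : 0);
}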

// Handler for UDP DNS response
int udp_dns_handler(UDP_SOCKET *usp)
{    
    char temps[300];
    IPADDR addr;
    int oset = 0;
    
    if (display_mode & DISP_DNS)
    {
        printf("Rx %s: ", dns_hdr_str(temps, usp->data, usp->dlen));
        printf("%s\n", dns_name_str(temps, usp->data, usp->dlen, &oset, 0, 0));
        for (int n = 0; n < dns_num_resps(usp->data, usp->dlen); n++)
            printf("%s\n", dns_name_str(temps, usp->data, usp->dlen, &oset, 0, addr));
    }
    return (1);
}

The response handler just iterates through the responses and prints them out:

Tx DNS 1 query: www.raspberrypi.org type A
Tx UDP 192.168.1.139:1234->192.168.1.254:53 len 37
Rx UDP 192.168.1.254:53->192.168.1.139:1234 len 85
Rx DNS 1 query, 3 resp: www.raspberrypi.org type A
  www.raspberrypi.org type A 104.22.1.43
  www.raspberrypi.org type A 104.22.0.43
  www.raspberrypi.org type A 172.67.36.98

There are 3 responses, but the way they are structured is surprising; this is the binary data:

0040                                 03 77 77 77 0b 72   ...........www.r
0050   61 73 70 62 65 72 72 79 70 69 03 6f 72 67 00 00   aspberrypi.org..
0060   01 00 01 c0 0c 00 01 00 01 00 00 00 df 00 04 68   ...............h
0070   16 01 2b c0 0c 00 01 00 01 00 00 00 df 00 04 68   ..+............h
0080   16 00 2b c0 0c 00 01 00 01 00 00 00 df 00 04 ac   ..+.............
0090   43 24 62                                          C$b

The first entry has the same format as the request, with the domain name www.raspberrypi.org having its delimiters replaced by length-bytes. However, the subsequent responses have the length byte replaced by the value 0xc0, followed by a value of 0x0c. This 16-bit value is the result of a compression scheme; the two most-significant bits are set to indicate a compressed entry, and the remaining 14 bits are an offset value pointing at the duplicate text (calculated from the start of the DNS header). Here the offset is 0x0c (12), which is the name in the first entry, immediately after the 12-byte header.
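The compression scheme can be illustrated with a short name decoder. This is a sketch of the general technique, not the dns_name_str() function from the project; the function name and arguments are my own, and the message pointer is assumed to be the start of the DNS header, so pointer offsets can be used directly as array indexes. A limit on the number of pointers followed is included, so a malformed message can't cause an endless loop.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Decode a DNS name at offset 'oset' in message 'msg' (which starts at
// the DNS header), following any compression pointers.
// Writes a dotted string to 'out', and returns the offset of the byte
// after the name at its original position, or -1 if malformed.
int dns_decode_name(const uint8_t *msg, int msglen, int oset,
                    char *out, int maxlen)
{
    int end = -1, jumps = 0, n = 0;

    assert(msg != 0 && out != 0 && maxlen > 0);
    while (oset >= 0 && oset < msglen)
    {
        int len = msg[oset++];
        if ((len & 0xc0) == 0xc0)           // Compression pointer
        {
            if (oset >= msglen || ++jumps > 16)
                return -1;
            if (end < 0)
                end = oset + 1;             // Name ends after 2-byte pointer
            oset = ((len & 0x3f) << 8) | msg[oset];
        }
        else if (len == 0)                  // Zero length: end of name
        {
            out[n] = 0;
            return end < 0 ? oset : end;
        }
        else                                // Ordinary label
        {
            if (len > 63 || oset + len > msglen || n + len + 2 > maxlen)
                return -1;
            if (n)
                out[n++] = '.';
            memcpy(&out[n], &msg[oset], (size_t)len);
            n += len;
            oset += len;
        }
    }
    return -1;
}
```

With the response shown above, decoding at offset 12 (the question) and at offset 37 (the first answer, which is just the pointer c0 0c) both produce the string www.raspberrypi.org.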

The domain name (or compression pointer) is followed by 16-bit type & class words (usually both 1), a 32-bit time-to-live value, then a 16-bit data length (4 for an IPv4 address) and the 4 address bytes.

This makes the decoder quite complex; typically the item most of interest is the IP address, but it is necessary to decode the 3 parts of the name (or the compression pointer) first; see the function dns_name_str() for my version of this.
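If only the IP addresses are needed, the names can be skipped rather than decoded. The following is a minimal sketch under that assumption, with hypothetical function names that are not in the project code; it takes a pointer to the start of the DNS header, steps over the question, then collects the address from each type A answer. All multi-byte values are read a byte at a time, so it doesn't matter where they fall in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Skip over a (possibly compressed) name; return offset after it, or -1
static int dns_skip_name(const uint8_t *msg, int msglen, int oset)
{
    while (oset < msglen)
    {
        int len = msg[oset];
        if ((len & 0xc0) == 0xc0)       // Pointer: always ends the name
            return oset + 2 <= msglen ? oset + 2 : -1;
        oset += len + 1;                // Length byte, plus label (if any)
        if (len == 0)
            return oset;                // Zero length ends the name
    }
    return -1;
}

// Big-endian 16-bit read using byte accesses
static uint16_t get16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

// Copy up to 'max' IPv4 addresses from the answers in a DNS response
// into 'addrs'; return the number found, or -1 if message is malformed
int dns_get_addrs(const uint8_t *msg, int msglen, uint8_t addrs[][4], int max)
{
    int nq, na, oset = 12, count = 0;

    assert(msg != 0 && addrs != 0);
    if (msglen < 12)
        return -1;
    nq = get16(&msg[4]);
    na = get16(&msg[6]);
    while (nq-- > 0)                    // Skip questions: name, type, class
    {
        if ((oset = dns_skip_name(msg, msglen, oset)) < 0)
            return -1;
        oset += 4;
    }
    while (na-- > 0)                    // Answers: name, then 10 fixed bytes
    {
        int type, dlen;
        if ((oset = dns_skip_name(msg, msglen, oset)) < 0 || oset + 10 > msglen)
            return -1;
        type = get16(&msg[oset]);       // Class is at +2, time-to-live at +4
        dlen = get16(&msg[oset + 8]);
        oset += 10;
        if (oset + dlen > msglen)
            return -1;
        if (type == 1 && dlen == 4 && count < max)
            memcpy(addrs[count++], &msg[oset], 4);
        oset += dlen;
    }
    return count;
}
```

Given the 85-byte response shown in the hex dump, this should return a count of 3, with the addresses in the same order as the printout.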

Misaligned data values

A major issue that arose while debugging the message decoder is the fact that the 16- and 32-bit values in the response may not be aligned on a 2- or 4-byte boundary. This issue mainly affects protocol decoding on small embedded systems, but it has a really bad outcome (crashing the CPU), so some explanation is in order. Here is a test program, with a simplified version of the DNS response, just a null-terminated string followed by a 4-byte IP address:

#include <stdio.h>
#include <string.h>

char data[] = {'a', 'b', 'c', 0, 0x11, 0x22, 0x33, 0x44};

typedef unsigned int IPADDR;

int main(int argc, char *argv[])
{
    char *p = &data[strlen(data) + 1];
    IPADDR addr = *(IPADDR *)p;
    printf("%s %X\n", data, addr);
}

When this program is run on the Pico board, or a little-endian Linux system, the result is printed out as expected:

abc 44332211

The program can now be changed so the string is 1 character longer:

char data[] = {'a', 'b', 'c', 'd', 0, 0x11, 0x22, 0x33, 0x44};

The Linux system works as expected, printing out:

abcd 44332211

However, the program fails on the Pico, and prints nothing out. If run under a debugger, a ‘hard fault’ trap is reported, with the call stack showing that the problem occurs when the address value is set. This is because the address is no longer on a 4-byte boundary, and the simpler RP2040 processor (with Arm Cortex-M0+ cores) can’t handle this misaligned transfer, whereas the PC processor can. So it is quite easy to write some code that works fine on a PC, and sometimes fails on the Pico; for example, my early DNS tests used the domain name ‘pool.ntp.org’, and in the response to that query all the 16-bit values happen to be on 2-byte boundaries, so the code worked fine. On switching to ‘www.raspberrypi.org’, which has an odd number of bytes in its encoded name, these values became misaligned, so the code failed.

There are various workarounds for this problem, the simplest being not to use casts; my newer code defines the IP address as a byte array, which can be copied byte-by-byte to avoid any misalignment issues. It isn’t sensible to use this approach on 16-bit port values, but these are generally in a byte array and need to be swapped from big-endian to little-endian, so we can use a byte pointer, e.g.

BYTE *buff;
WORD val = htonsp(buff);

// Convert byte-order in a 'short' variable, given byte pointer
WORD htonsp(BYTE *p)
{
    WORD w = (WORD)(*p++) << 8;
    return(w | *p);
}
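The same idea extends to 32-bit values such as the time-to-live, or the address in the test program above. This is a minimal sketch with hypothetical function names, using the standard stdint types: either copy the bytes into an aligned variable with memcpy(), or assemble the value from individual bytes, which also takes care of the byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Read a 32-bit value in host byte order from any address, by copying
// the bytes into an aligned variable rather than casting the pointer
uint32_t get_u32(const void *p)
{
    uint32_t val;

    assert(p != 0);
    memcpy(&val, p, sizeof(val));
    return val;
}

// Read a big-endian (network order) 32-bit value from any address
uint32_t get_be32(const uint8_t *p)
{
    assert(p != 0);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Replacing the cast in the test program with get_u32(p) should give the same value on a little-endian system, without the alignment restriction.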

Test program

The ‘dns’ program joins a network using the name ‘testname’ and password ‘testpass’, which need to be changed to match your network. It uses DHCP to fetch an IP address, and the addresses of a router and DNS server. ARP is then used to resolve the server IP address to a MAC address, and that is used to contact the nameserver, asking it to resolve the name www.raspberrypi.org, and printing the result.

For more information on compiling and running the code, see the introduction.

Project links
Introduction	Project overview
Part 1	Low-level interface; hardware & software
Part 2	Initialisation; CYW43xxx chip setup
Part 3	IOCTLs and events; driver communication
Part 4	Scan and join a network; WPA security
Part 5	ARP, IP and ICMP; IP addressing, and ping
Part 6	DHCP; fetching IP configuration from server
Part 7	DNS; domain name lookup
Part 8	UDP server socket
Part 9	TCP Web server
Part 10	Web camera
Source code	Full C source code

Copyright (c) Jeremy P Bentham 2022. Please credit this blog if you use the information or software in it.


PicoWi part 7: DNS