
A Web camera is quite a demanding application, since it requires a continuous stream of data to be sent over the network at high speed. The data volume is determined by the image size, and the compression method; the raw data for a single VGA-size (640 x 480 pixel) image is over 600K bytes, so some compression is desirable. Some cameras have built-in JPEG compression, which can compress the VGA image down to roughly 30K bytes, and it is possible to send a stream of still images to the browser, which will display them as if they came from a video-format file. This approach (known as motion-JPEG, or MJPEG) has a disadvantage in terms of inter-frame compression; since each frame is compressed in isolation, the compressor can’t reduce the filesize by taking advantage of any similarities between adjacent frames, as is done in protocols such as MPEG. However, MJPEG has the great advantage of simplicity, which makes it suitable for this demonstration.
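As a rough check on those figures, the raw size follows from width x height x bytes per pixel. The sketch below assumes 2 bytes per pixel (a 16-bit format such as RGB565 or YUV422; the pixel format is not stated above), and the function names are purely illustrative:

```c
#include <stdint.h>

// Raw (uncompressed) frame size in bytes.
// A bytes_per_pixel value of 2 is an assumption (16-bit RGB565 or YUV422).
uint32_t raw_frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

// Approximate compression ratio, rounded down to a whole number.
// Returns 0 if the JPEG size is zero.
uint32_t compression_ratio(uint32_t raw_bytes, uint32_t jpeg_bytes)
{
    return jpeg_bytes ? raw_bytes / jpeg_bytes : 0;
}
```

With these assumptions, raw_frame_bytes(640, 480, 2) gives 614400 bytes, and a 30K byte (30720 byte) JPEG corresponds to a compression ratio of 20.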
Camera
The standard cameras for the full-size Raspberry Pi boards have a CSI (Camera Serial Interface) conforming to the specification issued by the MIPI (Mobile Industry Processor Interface) alliance. This high-speed connection is unsuitable for use with the Pico; we need something with a slower-speed SPI (Serial Peripheral Interface), and JPEG compression ability.
The camera I used is the 2 megapixel Arducam, which uses the OV2640 sensor, combined with an image processing chip. It has I2C and SPI interfaces; the former is primarily for configuring the sensor, with the latter being for data transfer. Sadly the maximum SPI frequency is specified as 8 MHz, which compares unfavourably with the 60 MHz SPI we are using to communicate with the network.
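To put the 8 MHz limit in perspective, a lower bound on the time needed to clock an image out of the camera can be estimated from the clock rate alone. This is only a sketch: it ignores command bytes, chip-select handling and software overhead, and spi_min_transfer_us is an illustrative name, not part of the project code:

```c
#include <stdint.h>

// Minimum time in microseconds to clock nbytes over SPI at freq_hz,
// counting 8 clock cycles per byte and nothing else.
// Returns 0 if freq_hz is zero.
uint32_t spi_min_transfer_us(uint32_t nbytes, uint32_t freq_hz)
{
    if (freq_hz == 0)
        return 0;
    return (uint32_t)(((uint64_t)nbytes * 8u * 1000000u) / freq_hz);
}
```

For a 30K byte (30720 byte) JPEG, this gives 30720 microseconds at 8 MHz, against 4096 microseconds at 60 MHz.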
The connections specified by Arducam are:
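Signal | Connection |
---|---|
SPI SCK | GPIO pin 2 |
SPI MOSI | GPIO pin 3 |
SPI MISO | GPIO pin 4 |
SPI CS | GPIO pin 5 |
I2C SDA | GPIO pin 8 |
I2C SCL | GPIO pin 9 |
Power | 3.3V |
Ground | GND |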
In addition, GPIO pin 0 is used as a serial console output; the data rate is 115200 baud by default.
I2C and SPI tests
The first step is to check that the i2c interface is connected correctly, by checking an ID register value:
#define CAM_I2C i2c0
#define CAM_I2C_ADDR 0x30
#define CAM_I2C_FREQ 100000
#define CAM_PIN_SDA 8
#define CAM_PIN_SCL 9
i2c_init(CAM_I2C, CAM_I2C_FREQ);
gpio_set_function(CAM_PIN_SDA, GPIO_FUNC_I2C);
gpio_set_function(CAM_PIN_SCL, GPIO_FUNC_I2C);
gpio_pull_up(CAM_PIN_SDA);
gpio_pull_up(CAM_PIN_SCL);
WORD w = ((WORD)cam_sensor_read_reg(0x0a) << 8) | cam_sensor_read_reg(0x0b);
if (w != 0x2640 && w != 0x2641 && w != 0x2642)
printf("Camera i2c error: ID %04X\n", w);
// Read camera sensor i2c register
BYTE cam_sensor_read_reg(BYTE reg)
{
    BYTE b;
    i2c_write_blocking(CAM_I2C, CAM_I2C_ADDR, &reg, 1, true);
    i2c_read_blocking(CAM_I2C, CAM_I2C_ADDR, &b, 1, false);
    return (b);
}
Then we can check the SPI interface by writing values to a register, and reading them back:
#define CAM_SPI spi0
#define CAM_SPI_FREQ 8000000
#define CAM_PIN_SCK 2
#define CAM_PIN_MOSI 3
#define CAM_PIN_MISO 4
#define CAM_PIN_CS 5
spi_init(CAM_SPI, CAM_SPI_FREQ);
gpio_set_function(CAM_PIN_MISO, GPIO_FUNC_SPI);
gpio_set_function(CAM_PIN_SCK, GPIO_FUNC_SPI);
gpio_set_function(CAM_PIN_MOSI, GPIO_FUNC_SPI);
gpio_init(CAM_PIN_CS);
gpio_set_dir(CAM_PIN_CS, GPIO_OUT);
gpio_put(CAM_PIN_CS, 1);
if ((cam_write_reg(0, 0x55), cam_read_reg(0) != 0x55) || (cam_write_reg(0, 0xaa), cam_read_reg(0) != 0xaa))
printf("Camera SPI error\n");
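The cam_write_reg() and cam_read_reg() helpers used in this test are not listed here. As a rough sketch, and assuming the usual ArduCAM SPI convention (a two-byte transfer in which the top bit of the first byte is set for a write and clear for a read), the command bytes could be built as follows. The function names and CAM_SPI_WRITE_FLAG are hypothetical, and this is not necessarily how the project's own helpers are written:

```c
#include <stdint.h>

// Hypothetical name: bit 7 of the first byte marks a register write
// (assumption based on the usual ArduCAM SPI convention)
#define CAM_SPI_WRITE_FLAG 0x80

// Build the 2-byte SPI frame for a register write
void cam_spi_write_frame(uint8_t reg, uint8_t val, uint8_t frame[2])
{
    frame[0] = reg | CAM_SPI_WRITE_FLAG;
    frame[1] = val;
}

// Build the 2-byte SPI frame for a register read.
// The second byte is a dummy, sent while the register value is clocked back.
void cam_spi_read_frame(uint8_t reg, uint8_t frame[2])
{
    frame[0] = reg & 0x7F;
    frame[1] = 0;
}
```

Each frame would be sent with chip-select held low for both bytes; for a read, the register value is the second byte received.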
Initialisation
The sensors require a large number of i2c register settings in order to function correctly. These are just ‘magic numbers’ copied across from the Arducam source code. The last block of values specifies the sensor resolution, which is set at compile-time. The options are 320 x 240 (QVGA), 640 x 480 (VGA), 1024 x 768 (XGA) and 1600 x 1200 (UXGA), e.g.
// Horizontal resolution: 320, 640, 1024 or 1600 pixels
#define CAM_X_RES 640
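Since only four widths are supported, the matching height can be derived at compile-time, with a build error for anything else. This is a minimal sketch; CAM_Y_RES is a hypothetical name, and the real header may select its register tables differently:

```c
// Horizontal resolution: 320, 640, 1024 or 1600 pixels
#ifndef CAM_X_RES
#define CAM_X_RES 640
#endif

// Derive the vertical resolution (CAM_Y_RES is a hypothetical name)
#if CAM_X_RES == 320
#define CAM_Y_RES 240
#elif CAM_X_RES == 640
#define CAM_Y_RES 480
#elif CAM_X_RES == 1024
#define CAM_Y_RES 768
#elif CAM_X_RES == 1600
#define CAM_Y_RES 1200
#else
#error "CAM_X_RES must be 320, 640, 1024 or 1600"
#endif
```

Rejecting unsupported values with #error means a mistyped resolution is caught by the compiler, rather than producing a camera that is configured with the wrong register block.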
Capturing a frame
A single frame is captured by writing to a few registers, then waiting for the camera to signal that the capture (and JPEG compression) is complete. The size of the image varies from shot to shot, so it is necessary to read some register values to determine the actual image size. In reality, the camera has a tendency to round up the size, and pad the end of the image with some nulls, but this doesn’t seem to be a problem when displaying the image.
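// Read single camera frame
int cam_capture_single(void)
{
    int tries = 1000, ret = 0, n = 0;

    // Clear the capture-done flag, then start a new capture
    cam_write_reg(4, 0x01);
    cam_write_reg(4, 0x02);
    // Poll the capture-done flag, giving up after 1000 tries
    while ((cam_read_reg(0x41) & 0x08) == 0 && tries)
    {
        usdelay(100);
        tries--;
    }
    // If the capture completed, get the image size from the camera
    if (tries)
        n = cam_read_fifo_len();
    if (n > 0 && n <= (int)sizeof(cam_data))
    {
        cam_select();
        // Read a single dummy byte, then the whole image
        spi_read_blocking(CAM_SPI, 0x3c, cam_data, 1);
        spi_read_blocking(CAM_SPI, 0x3c, cam_data, n);
        cam_deselect();
        ret = n;
    }
    return (ret);
}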
Reading the picture from the camera just requires the reading of a single dummy byte, then the whole block that represents the image; it is a complete JFIF-format picture, so no further processing needs to be done. If the browser has requested a single still image, we just send the whole block as-is to the client, with an HTTP header specifying “Content-Type: image/jpeg”.
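Although the null padding at the end of the image doesn't seem to cause any problems, it could be trimmed by searching backwards for the JPEG end-of-image marker (the bytes 0xFF 0xD9). The following is a sketch of that idea, not part of the project code; jpeg_trimmed_len is a hypothetical name:

```c
#include <stddef.h>
#include <stdint.h>

// Return the length of the data up to and including the last
// JPEG end-of-image marker (0xFF 0xD9), or 0 if no marker is found.
size_t jpeg_trimmed_len(const uint8_t *data, size_t len)
{
    while (len >= 2)
    {
        if (data[len - 2] == 0xFF && data[len - 1] == 0xD9)
            return len;
        len--;
    }
    return 0;
}
```

The returned value could be used in place of the length reported by the camera when sending the image; a return of 0 would suggest that the capture is incomplete or corrupt.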
The following image was captured by the camera at 640 x 480 resolution:

MJPEG video
As previously mentioned, the Web server can stream video to the browser, in the form of a continuous flow of JPEG images. This requires a few special steps:
- In the response header, the server defines the content-type as “multipart/x-mixed-replace”
- To enable the browser to detect when one image ends, and another starts, we need a unique marker. This can be anything that isn’t likely to occur in the data stream; I’ve specified “boundary=mjpeg_boundary”
- Before each image, the boundary marker must be sent, followed by the content-type (“image/jpeg”) and a blank line to mark the end of the header.
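The steps above can be sketched as two small functions that format the headers into a buffer. This is an illustration rather than the server's actual code: the function names and the HTTP status line are assumptions, and in a multipart body the boundary string is preceded by two dashes:

```c
#include <stdio.h>

#define MJPEG_BOUNDARY "mjpeg_boundary"

// Write the HTTP response header that starts the stream.
// Returns the number of characters written (as snprintf does).
int mjpeg_stream_header(char *buf, size_t maxlen)
{
    return snprintf(buf, maxlen,
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: multipart/x-mixed-replace; boundary=" MJPEG_BOUNDARY "\r\n"
        "\r\n");
}

// Write the part header that precedes each JPEG image:
// boundary marker, content-type, then a blank line
int mjpeg_part_header(char *buf, size_t maxlen)
{
    return snprintf(buf, maxlen,
        "--" MJPEG_BOUNDARY "\r\n"
        "Content-Type: image/jpeg\r\n"
        "\r\n");
}
```

The stream header would be sent once, then the part header and JPEG data repeated for every frame, with each image typically followed by a carriage-return and line-feed before the next boundary.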
Timing
The timing will be quite variable, since it depends on the image complexity and network configuration, but here are the results of some measurements when fetching a single JPEG image over a small local network, using binary (not base64) mode:
Resolution (pixels) | Image capture time (ms) | Image size (kbyte) | TCP transfer time (ms) | TCP speed (kbyte/s) |
---|---|---|---|---|
320 x 240 | 153 | 10.2 | 4.4 | 2310 |
640 x 480 | 292 | 25.6 | 10.9 | 2350 |
1024 x 768 | 321 | 49.1 | 21.5 | 2285 |
1600 x 1200 | 420 | 97.3 | 42.4 | 2292 |
The webcam code triggers an image capture, then after the data has been fetched into the CPU RAM buffer, it is sent to the network stack for transmission. There would be some improvement in the timings if the next image were fetched while the current image is being transmitted; however, the improvement would be quite small, since the overall time is dominated by the time taken for the camera to capture and compress the image.
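The size of that improvement can be estimated from the table. The sketch below assumes that the frame period is simply the capture time plus the TCP transfer time when the two run one after the other, and the longer of the two when they overlap; any other overhead is ignored, and the function names are illustrative:

```c
// Frame rate when capture and transmission happen one after the other
double fps_sequential(double capture_ms, double transfer_ms)
{
    return 1000.0 / (capture_ms + transfer_ms);
}

// Frame rate when the next capture overlaps the current transmission:
// the slower of the two stages sets the pace
double fps_overlapped(double capture_ms, double transfer_ms)
{
    double slower = capture_ms > transfer_ms ? capture_ms : transfer_ms;
    return 1000.0 / slower;
}
```

Using the 640 x 480 figures (292 ms capture, 10.9 ms transfer), this gives roughly 3.30 frames per second sequentially and 3.42 when overlapped, a gain of under 4%.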
Using the Web camera
There is only one setting at the top of camera/cam_2640.h, namely the horizontal resolution:
// Horizontal resolution: 320, 640, 1024 or 1600 pixels
#define CAM_X_RES 640
Then the binary is built and the CPU is programmed in the usual way:
make web_cam
./prog web_cam
At boot-time the IP address will be reported on the serial console; use this to access the camera or video Web pages in a browser, e.g.
http://192.168.1.240/camera.jpg
http://192.168.1.240/video
It is important to note that a new image capture is triggered every time the Web page is accessed, so any attempt to simultaneously access the pages from more than one browser will fail. To allow simultaneous access by multiple clients, a double-buffering scheme needs to be implemented.
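One possible shape for such a double-buffering scheme is sketched below: the camera always fills the buffer that clients are not reading, and the buffers swap roles when a capture completes. This is an outline under stated assumptions, not code from the project; FRAME_STORE, the function names and the buffer size are all hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_BUF_SIZE 1024   // illustrative; a real JPEG buffer would be far larger

typedef struct {
    uint8_t data[2][FRAME_BUF_SIZE];  // two frame buffers
    size_t  len[2];                   // number of valid bytes in each buffer
    int     ready;                    // index of last complete frame, -1 if none
} FRAME_STORE;

// Mark both buffers as empty
void frame_store_init(FRAME_STORE *fs)
{
    memset(fs, 0, sizeof(*fs));
    fs->ready = -1;
}

// Buffer the camera should fill next: the one not marked as ready
uint8_t *frame_store_capture_buf(FRAME_STORE *fs)
{
    return fs->data[fs->ready == 0 ? 1 : 0];
}

// Call when a capture of n bytes into the capture buffer has finished
void frame_store_publish(FRAME_STORE *fs, size_t n)
{
    int idx = fs->ready == 0 ? 1 : 0;
    fs->len[idx] = n;
    fs->ready = idx;
}

// Get the latest complete frame; returns its length, or 0 if none yet
size_t frame_store_latest(const FRAME_STORE *fs, const uint8_t **p)
{
    if (fs->ready < 0)
        return 0;
    *p = fs->data[fs->ready];
    return fs->len[fs->ready];
}
```

Every client would then be served from frame_store_latest() instead of triggering its own capture. A client that is still transmitting an older frame when its buffer is reused would need extra protection, such as a per-buffer reference count, which this sketch omits.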
Project links | |
---|---|
Introduction | Project overview |
Part 1 | Low-level interface; hardware & software |
Part 2 | Initialisation; CYW43xxx chip setup |
Part 3 | IOCTLs and events; driver communication |
Part 4 | Scan and join a network; WPA security |
Part 5 | ARP, IP and ICMP; IP addressing, and ping |
Part 6 | DHCP; fetching IP configuration from server |
Part 7 | DNS; domain name lookup |
Part 8 | UDP server socket |
Part 9 | TCP Web server |
Part 10 | Web camera |
Source code | Full C source code |
Copyright (c) Jeremy P Bentham 2023. Please credit this blog if you use the information or softw