Programming PSoC: an ARM CPU with programmable hardware

Want to craft your own CPU peripheral? Want to experiment with programmable hardware, but are deterred by the complexity and cost? Read on…

Cypress PSoC kit

The Cypress PSoC family are strange devices; a conventional CPU combined with programmable hardware, that gives unparalleled flexibility to any embedded system.

However, this flexibility means that the ‘PSoC Creator’ development environment is unlike anything else, and can be a bit intimidating for any developer who isn’t used to programmable hardware.

In this blog, I try to de-mystify the development process, hopefully demonstrating that the learning curve isn’t as bad as you might think; you can create a simple demonstration from scratch quicker than on most other platforms.

Why use a PSoC?

  • Combines a conventional CPU with programmable logic
  • Ideal for learning about programmable hardware
  • Completely free development environment – no registration required
  • Low-cost development boards (under US $15)
  • Powerful logic optimisation software
  • Comprehensive analogue capability
  • Vast amount of online documentation and worked examples
  • Wide supply-voltage range (1.71 to 5.5 volts).
  • USB connectivity, simple Bluetooth add-on

So why isn’t everyone using these devices? Well, I must admit that when I first ran their ‘PSoC Creator’ IDE, it was a bit overwhelming – there are so many features it is difficult to know where to start, and I’m too impatient to sit through all the training videos.

So I’ll try to get you up to speed quickly, on the assumption you are familiar with the usual process for developing embedded software. The project I’ve chosen is really trivial – flashing an LED – but this is always the first step I take with any new hardware, and in this case I’ll start by using conventional software techniques, then show how the same application can be implemented purely in hardware, with the CPU doing nothing!

.. and in case you’re wondering whether I’m being paid by Cypress to create an overly-enthusiastic write-up, the answer is no, I have no commercial relationship with them: I just think this is really interesting technology, that deserves a wider audience.



For this demonstration, all you need is the Cypress CY8CKIT-059, which is a self-contained evaluation board with a detachable USB interface for programming and debugging. It has a PSoC 5LP device with:

  • 80 MHz ARM Cortex-M3 CPU
  • 256 kbytes flash, 64 kbytes RAM
  • 38 GPIO pins
  • Full-speed USB
  • 24 ‘Programmable Universal Digital Blocks’ (known as UDBs)

The board is in a conventional dual-in-line format, ideal for prototyping on 0.1″ pitch breadboard, and at under US $15, you’re getting a lot of hardware for the money.

The board has 2 snap-apart sections, a ‘kitprog’ programmer on the left, and the actual PSoc on the right. For the time being, I’d suggest you keep the board intact, as you lose some handy features (e.g. a debug serial link to the PC) if the sections are snapped apart and re-joined by an SWD cable.

PSoc Creator

PSoC Creator

This is a Windows-only application; I’m using version 4.2, and (unlike most programmable logic software) you don’t have to register, or get license keys from the vendor – all the features are available for free, right from the start.

I’ll admit that it can be a bit intimidating to a software developer; the central section in the image above looks like (and is) a circuit schematic, and you don’t normally get that in a software IDE. Once you start using the software, it gets even stranger, as every time you build the application, umpteen C and Verilog files are created; you can get an out-of-control feeling when faced with all the automatically-generated files you didn’t write.

However, fairly quickly you get to appreciate the amount of work that has gone into these pre-prepared (and very well-documented) hardware & software devices – they do make your job a lot easier, and having access to the source files is excellent for experimentation & learning.

Flashing LED example

We’ll start by creating a conventional flashing-LED example in software, then transition to a programmable-hardware version.

When PSOC Creator is launched, you’ll see a Start Page, with the usual option of creating a new Design Project. Click this, and select the Target Device as PSoC 5LP Cy8C5888LTI-LP097. If you get this wrong, no great harm will result, the device programming will fail, with a helpful message as to the device you’ve set, and the device that is actually fitted to the board.

You’ll then be asked whether you want to run a code example or use a template; there are a large number of examples you could try, but for the purposes of this demonstration select an Empty Schematic, and choose workspace & project name you like; I’ve used ‘blink_led’.


Now you’re faced with a blank circuit diagram, and may feel completely out of your comfort zone; why is a schematic needed, when all you want to do is blink an LED in software? The answer is that a PSoC, unlike a normal CPU, has really flexible peripherals, with no real pre-programmed function. So until you specify what you want by placing a component in a circuit diagram (the ‘TopDesign schematic’), the functionality doesn’t exist.

Blank schematic

To give you a quick guide to this screen, on the left you’ll see various design-wide resources that can be specified, such as I/O pins and the CPU clock. There are tabs for ‘Source’ and ‘Components’; you’ll frequently be switching between these views of your design. On the right side is the Component Catalogue containing various (soft) components that can be dragged onto the schematic.

The lower bar shows the result of compilation & device programming, as with many IDEs. The other area of note is the narrow menu bar to the left of the schematic; this selects drawing tools, the most important is the second one down, which is used to wire up the components.

Select a ‘Digital Output Pin’ from ‘Ports and Pins’ in the catalogue, and drag it onto the schematic. Zoom in so it is visible (I use control-mousewheel to do this). The default name is Pin_1 since all resources are auto-numbered, so double-click it and rename to LED_pin. You’ll see all sorts of other options, hinting at the extent to which everything is programmable; the only change you need to make is to un-check the ‘HW connection’ checkbox.

Configuring LED pin

It seems strange to have a hardware pin with no hardware, but what you are actually doing is removing the need for a hardware connection within the programmable logic, since we want the pin to be driven by software only.

Schematic with LED pin

You’ll also see a prompt to open the datasheet; try this, and you’ll be pleasantly surprised at the amount of information in the PDF. All components are documented in this way, which is one of the major strengths of the Cypress software tools.

Having defined an output to drive the LED, we need to link that output to a real hardware pin on the device. This is done by double-clicking ‘Pins’ in the Design Wide Resources, which displays a pretty little diagram of the device, with a hint that the LED_pin needs to be assigned.

Hardware pin assignment

Clicking the drop-down Port menu gives a long list of possible candidates; according to the kit packaging the LED is on port P2_1, so that is what we’ll select.

Before writing code, we need to define the CPU clock; double-click ‘Clocks’ in the design-wide resources, and you’ll see a spreadsheet of all the clocks we currently have.

Clock settings

The BUS_CLK needs to be defined, so double-click it and an even prettier flowchart pops up.

Clock flowchart

The IMO is the primary internal oscillator, and in the above settings I’ve used the PLL to boost it from 3 to 50 MHz. In theory the CPU can go up to 80 MHz, but then the timing analyser complains about the clock tolerance; I guess it implies that boosting an internal 3 MHz signal up to the CPU maximum isn’t a good idea, so I’ve backed it off to 50 MHz.

Back to the circuit diagram, right-click the LED pin, and open the datasheet; scroll through this remarkably long document until you get to the API definitions, and you’ll see that the function Pin_Write(uint8 value) is used to drive it. We’ll be using that function in the software.


The source file ‘main.c’ was automatically created with the project; double-click it to open, and you’ll see a minimal framework.

One surprise is that there is a #include of the file ‘project.h’, with an error message saying it doesn’t exist. Don’t worry about it; this file contains the hardware definitions that are needed by the software, and will be generated at build-time.

The simplest way to add a delay is to use the built-in functions, so main.c becomes:

#include "project.h"

int main(void)

The datasheet said the API function was Pin_Write, so why is it now called LED_pin_Write? This is an unusual aspect of PSoC programming, caused by the very tight integration between the hardware & software definitions of a device. At build-time, the system creates a custom version of the API to match your device settings in the schematic; the standard prefix ‘Pin’ is replaced with the name we provided, so Pin_Write becomes LED_pin_Write. Quite unconventional, but you soon get to appreciate the convenience of having a single shared namespace for the hardware & software.

In the Build menu, click ‘Build blink_led’ and watch as a myriad of files are generated, hopefully with no errors. If you see an error ‘No input on Instance “LED_pin”‘ then you didn’t disable the ‘HW connection’ on the LED pin (see above), so the software is expecting it to be wired to some internal logic.

Connect the Kitprog USB interface on the left-hand side of the board, and select Program in the Debug menu. If this is the first time you’ve used the board, you may be prompted to select it from a list.

After a brief flurry of programming, the blue LED on the board should start flashing at 1 Hz.

Hardware-only version

We’ll now use the PSoC’s programmable hardware to do the same task, as a jumping-off point for your own experimentation. There are various ways to do this, I’ve chosen one that demonstrates how components are wired up in the schematic.

First you need to double-click the LED pin on the schematic, and re-enable ‘HW connection’ so we can drive it from internal logic.

Now we need to create a suitable clock frequency; we’ll start with a 1 MHz signal, and divide it down to 1 Hz for the LED. Find the ‘Clock’ component in the System section of the Component Catalog, drag it onto the schematic, double click it, and use the following settings.

Clock settings

Now find the Frequency Divider in the Utility section of the Component Catalog, and drag it onto the schematic. Double-click, and set a divisor of 1000000.

Schematic components

All components follow the convention of having inputs on the left, outputs on the right; this part has Enable, Reset and Clock inputs. These have to be connected to something, you can’t leave inputs unconnected. Since we want the divider to be permanently enabled, and we don’t want it to be reset, we need to force Enable high, and Reset low, so find Logic High and Logic Low in the Digital Logic section of the Component Catalog, and drag one of each onto the schematic.

Now join the component pins using the wiring tool, which is the second one down in the left margin of the schematic; click on a pin, and the wire extends to the cursor; click on another pin, and they are joined. The end result should look something like:

Completed schematic

Once joined, you can drag a component and it will remain connected; to delete a connection, click on a wire section, and press Delete.

All that remains is to get rid of our software, since everything is being done in hardware; change main.c to:

#include "project.h"
int main(void)

Now rebuild the project, and program the device. You’ve created your first PSoC hardware project, and hopefully I’ve convinced you to carry on experimenting with these remarkable devices.

Copyright (c) Jeremy P Bentham 2019. Please credit this blog if you are using the information or software in it.

Creating real-time Web graphics with Python

Drawing graphics in browser-friendly SVG, with Javascript animation

My previous attempt at real-time graphics involved direct-drawing the shapes using Python and PyQt. This works well for a standalone system, but is less useful in the current world of ‘everything-on-a-browser’. To make the graphics Web-friendly, there is a choice; either create an animated bitmap (e.g. GIF, PNG or JPEG) or use vector graphics (SVG).

Vector graphics have various advantages; not just that they can be resized without degradation (pixellation) but also that they can preserve some of the structural characteristics of the image that is being animated. For example, we can draw a detailed circuit diagram complete with voltage and current measurements, and if it is too large to display in a browser screen, the user can zoom out to get the big picture, then zoom in to see the data in a specific area; the components will always appear well-drawn, no matter what scale is used.

Animated circuit diagram
Zoomed in to show voltage & current values

If the components are drawn sensibly they can easily be annotated & animated to reflect the current data values. Unfortunately, although most graphics packages can export SVG, very few preserve the underlying structure of the parts. For example, the above circuit diagram was originally exported from Visio as an 800K file, filled with tiny vector & text fragments, that were very difficult to relate back to the original components. The final version, created by a Python program from a netlist, was 33KB in size, and much easier to animate.

I certainly wouldn’t give up on the idea of animating pre-existing circuit diagrams, but for the purposes of this blog, am taking the easy way out, and creating the graphics from scratch in Python. Also, it is quite a big step to create & animate a full circuit diagram in one go, so I’ll start with something much simpler, a 7-segment digital clock.

Clock display in Web browser

This will involve Python, SVG, style sheets and even a smattering of Javascript, so is complicated enough for starters. If you are going to do a lot of graphical manipulation, it is well worth reading the SVG specification.


My code is compatible with Python 2.7 or 3.x, you just need to install the ‘svgwrite’ package using pip or pip3.

As its name suggests, svgwrite is a relatively slim wrapper that simplifies the task of writing SVG files, for example:

# Create simple SVG
import svgwrite
FILENAME      = "simple.svg"
WIDTH, HEIGHT = 250, 150

dwg = svgwrite.Drawing(FILENAME, (WIDTH, HEIGHT))

If the resulting SVG file is dragged into a browser, it looks like:


There are various points to note (apart from the author’s strange colour choices):

  • The origin (x=0, y=0) is in the top left corner
  • Coordinates are specified as (x,y) tuples
  • The order of drawing is significant: earlier objects may be obscured by later
  • Lines outside the drawing area are clipped
  • To eliminate fill, you have to use ‘fill=”none”‘, rather than the more Pythonic ‘fill=None’

It is instructive to view the resulting file in a text editor; after the namespace boilerplate, there is a close match with the Python code:

<?xml version="1.0" encoding="utf-8" ?>
<svg baseProfile="full" height="150" version="1.1" width="250" 
 <defs />
 <rect fill="none" height="149" stroke="red" width="249" x="0" y="0" />
 <line stroke="green" stroke-width="8" x1="0" x2="249" y1="0" y2="149" />
 <circle cx="125" cy="75" fill="moccasin" r="75" stroke="salmon" stroke-width="3" />
 <circle cx="125" cy="75" fill="moccasin" r="50" stroke="salmon" stroke-width="3" />


The most obvious issue is the repetition of the colour definitions; using a style sheet would shorten the file, and separate the style from the drawing commands, making it easier to change the colours, for example:

import svgwrite

FILENAME      = "simple.svg"
WIDTH, HEIGHT = 250, 150

STYLES = """
  .circle_style  {stroke-width:3; stroke:salmon; fill:moccasin}

dwg = svgwrite.Drawing(FILENAME, (WIDTH, HEIGHT))

The browser display is exactly the same, and the SVG code now has an entry in the ‘defs’ section, with a strange addition:

  .circle_style {stroke-width:3; stroke:salmon; fill:moccasin}

The CDATA encapsulation is a signal to the browser that the normal search for HTML tags should be disabled; it isn’t really necessary in this case, but is essential if, for example, you are inserting a Javascript comparison such as ‘a < b’, since the browser would normally treat ‘<‘ as the start of an HTML tag.

Two important points:

  1. Note the underscore on the end of ‘class_’ in the Python code. If you forget that, Python will error out, as ‘class’ is a reserved word; svgwrite automatically strips off the trailing underscore when writing the SVG file.
  2. If you make any errors in the definition, the whole class will be ignored, and the object will be drawn in the default style.

The first issue is very common; you will see many complaints from svgwrite users that they can’t assign a class. The second issue can be annoying, since a trivial mistake can produce an ugly (or invisible!) mess, for example the following display results from putting the colour names in quotes (stroke:”salmon”; fill:”moccasin”).


The ‘developer tools’ mode in Chrome has highlighted the incorrect colour definitions.


A useful SVG feature is that several items can be grouped together and treated as one, for example, instead of adding the 2 circles directly to the drawing, they can be added to a group, which is then added to the drawing:

g = svgwrite.container.Group(class_="circle_style")
g.add(,HEIGHT/2), HEIGHT/2))
g.add(,HEIGHT/2), HEIGHT/3))

This is of most use when handling a complex shape, as all grouped items will be moved as one; if you want to move the 2 grouped circles to the right by 10 pixels, the first line becomes:

g = svgwrite.container.Group(class_="circle_style", transform="translate(10 0)")

You might think that the ‘add’ function would allow an offset to be applied to the items, but that isn’t true; it simply inserts the item into the file.

Using predefined items

If the the same element appears 2 or more times in your graphics, it is worthwhile defining it in the ‘defs’ section, then just adding it to the drawing with a ‘use’ command, which also allows the predefined graphic to be positioned and styled as required.

g = svgwrite.container.Group()
g.add(,HEIGHT/2), HEIGHT/2))
g.add(,HEIGHT/2), HEIGHT/3))
dwg.add(dwg.use(g, class_="circle_style"))
dwg.add(dwg.use(g, insert=(40,0), style="stroke:green; fill:palegreen"))


You’ll note that I’ve removed the class from the group definition, and instead applied class & style definitions when the graphic is used. This isn’t compulsory; you can specify a class when the group is defined, but then it remains fixed, which limits the flexibility of your predefined objects – once a graphic is defined you can’t change its underlying structure.

It can also be really useful to name the predefined items using an ‘id’ parameter; they can then be inserted into the document by referencing that name, improving readability.

g = svgwrite.container.Group(id="circles")
g.add(,HEIGHT/2), HEIGHT/2))
g.add(,HEIGHT/2), HEIGHT/3))
dwg.add(dwg.use("#circles", class_="circle_style"))

The ‘#’ prefix is necessary to turn the ID into an Internationalised Resource Identifier (IRI).

Colour gradient

The green line looks a bit boring, so I wanted to add a colour gradient, making it look 3-dimensional. All the examples I’ve seen are of horizontal or vertical boxes, but I needed the shading to be aligned along the axis of the diagonal line, so it looks like a round bar. My intention was to create some very clever code to do this, but the result always looked terrible, so in the end I just hacked the parameters to make it look roughly right. Sorry!

lg = svgwrite.gradients.LinearGradient(
     start=(0,0), end=(40,0),
     id="blue_grad", gradientUnits="userSpaceOnUse",
    (0,0), (WIDTH-1,HEIGHT-1),


Modifying objects

In the above code, we’ve been creating a graphic object and and adding it to the drawing in a single line. Alternatively, we can access and modify the object’s attributes using its dictionary, for example changing the stroke colour:

line = dwg.line((10,0), (110, 100), stroke="red")
# Change to green using either..
line["stroke"] = "green"
# ..or..


The intention is to create a working seven-segment clock display; we could possibly do this in Python by continually updating an SVG file, and persuading the browser to reload it, but a much simpler method is to use Javascript.

Each object (or group of objects) to be animated must have a unique ‘id’ tag.

dwg.add(dwg.line((20,0), (120, 100), stroke="red", id="myline"))

To change this line, we have to wait until the browser has loaded the graphics, so the code is triggered by window.onload event:

SCRIPT = """
window.onload = function()
    var myline = document.getElementById("myline");
    if (myline)
        myline.setAttribute("stroke", "yellow");

That’s how easy it is to modify a graphic at run-time using Javascript.

If you are a newcomer to that language, the developer’s console in the browser is incredibly useful; you can add diagnostic printouts using console.log(), and they will only show up when the developer tools are enabled. Here is the Chrome console when ‘console.log(myline);’ has been added to Javascript; hovering the cursor over the text display causes the associated graphic bounding box to be highlighted:


SVG clock

I started off by drawing the segments using simple flat-colour lines:


However the colour gradients look much more attractive..

seg_clock I decided to use them instead. This triggered a cascade of problems, because using colour gradients on lines is much more restrictive than rectangles, in respect of the overall dimensions the browser uses to calculate gradient values (see the documentation on gradientUnits); when lines are drawn in various on-screen positions using the same gradient, only one is the correct colour.

The way round this problem was to draw only one shaded line, then use the ‘transform’ function to rotate and move a copy of that line. The parameters for the transformation are in the form of a 6-value array, which allows complex changes to be made in one step. Simplistically, the following transformations can be made:

(1, 0, 0, 1, x, y)          Move (translate) by x,y
(sx, 0, 0, sy, 0, 0)        Scale by sx, sy
(cos, sin, -sin, cos, 0, 0) Rotate by angle
(1, 0, tan, 1, 0, 0)        x-skew by angle
(1, tan, 0, 1, 0, 0)        y-skew by angle

A transformation array is needed for each of the 7 segments, so if segment A is the original drawn line:


..the 7 segment transformations are..

# Transform matrix for each segment
SEG_MATRIX = ((1, 0,  0, 1,      0,        0), # A
              (0, 1, -1, 0, SEGLEN,        0), # B
              (0, 1, -1, 0, SEGLEN,   SEGLEN), # C
              (1, 0,  0, 1,      0, SEGLEN*2), # D
              (0, 1, -1, 0,      0,   SEGLEN), # E
              (0, 1, -1, 0,      0,        0), # F
              (1, 0,  0, 1,      0,   SEGLEN)) # G

The resulting lines are in a rectangular grid; the final flourish is to use a ‘skew’ transformation on the complete digit to give it a lean to the right:

# Matrix to slightly skew the displays
SKEW_MATRIX = "matrix(1,0,-0.1,1,0,0)"
g = Group(transform=SKEW_MATRIX)

The remaining question is how to do the animation; the simplest (and hopefully fastest) method I could think of is to stack all the digits 0 – 9 on top of each other, setting them transparent (opacity = 0). Each one is labelled with its position on the display, and its value, so ‘digit00’ is the left-most digit, with a value of 0, ‘digit01’ is the left-most digit with a value of 1, ‘digit10’ is the second digit with a value of 0, and so on. All the Javascript code has to do is set the opacity on the required digits, so to display the number 2345, digit02, digit13, digit24, and digit35 are opaque, the rest are transparent.

Source code

Here is the entire source code, compatible with Python 2.7 or 3.x

To view the resulting SVG file, click here.

# SVG clock with 7-segment display

import svgwrite
from svgwrite.container import Group

# 7-seg display
SEGWID      = 5             # Width & length of a segment
SEGLEN      = 20
NDIGITS     = 6             # Number of digits
DIG_PITCH   = 35            # Pitch of digits
DIG_SCALE   = 1             # Scaling factor for main digits
FRAC_SCALE  = 0.7           # Scaling factor for fractional part
LMARGIN     = 20            # Left margin
VMARGIN     = 5             # Vertical margins

GRAD_COLOURS= 'lightblue','blue'# Gradient colours
FNAME       = "svg_clock.svg"   # Filename


# Transform matrix for each segment
SEG_MATRIX = ((1,0, 0,1,0,      0),        # A
              (0,1,-1,0,SEGLEN, 0),        # B
              (0,1,-1,0,SEGLEN, SEGLEN),   # C
              (1,0, 0,1,0,      SEGLEN*2), # D
              (0,1,-1,0,0,      SEGLEN),   # E
              (0,1,-1,0,0,      0),        # F
              (1,0, 0,1,0,      SEGLEN))   # G

# Matrix to slightly skew the displays
SKEW_MATRIX = "matrix(1,0,-0.1,1,0,0)"

# Matrix to scale a segment down to a decimal point
DP_MATRIX   = "matrix(%g,0,0,1,0,0)" % (float(SEGWID)/SEGLEN)

# Segments to be activated for digits 0 - 9
dig_segs = ("abcdef", "bc", "abdeg", "abcdg", "bcfg",
            "acdfg", "acdefg", "abc", "abcdefg", "abcdfg")

SCRIPT = """
    var ndigits=6, last;

    // Initialise
    window.onload = function()
        last = 0;
        setInterval(tick_handler, 100)

    // Timer tick handler
    function tick_handler()
        var now = new Date();
        var t = now.getHours()*10000 + now.getMinutes()*100 +
        if (t != last)
            set_value(t, last);
        last = t;

    // Set a value on the display
    function set_value(val, oldval)
        var i;
        for (i=0; i<ndigits; i++)
            set_digit(ndigits-1-i, oldval, 0);
            set_digit(ndigits-1-i, val, 1);
            val = Math.floor(val / 10);
            oldval = Math.floor(oldval / 10);

    // Set opacity of a single display digit
    function set_digit(dig, val, op)
        var digit = document.getElementById("digit"+dig%10+val%10);
        if (digit)
            digit.setAttribute("opacity", op);

# Create maximised SVG drawing
def create_svg(fname, size):
    return svgwrite.Drawing(fname,
            width="100%", height="100%",
            viewBox=("0 0 %u %u" % size),

# Add Javascript to drawing
def add_script(dwg, script):

# Create linear gradient, add to drawing defs
def create_lin_grad(dwg, id, start, end, colours):
    grad = svgwrite.gradients.LinearGradient(
           start=start, end=end,
           id=id, gradientUnits="userSpaceOnUse")

# Create a single segment, add to drawing defs
def create_seg(dwg, id, length, width, stroke):
    seg = dwg.line((0,0), (length,0), id=id,
                   stroke="url(#%s)" % stroke,
                   stroke_width=width, stroke_linecap="round")

# Create digits 0 - 9 by transforming a segment, add to drawing defs
def create_digits(dwg, digid, segid):
    for dig, segs in enumerate(dig_segs):
        g = Group(id="%s%u" % (digid,dig), transform=SKEW_MATRIX)
        for seg in segs:
            num = ord(seg) - ord('a')
            mat = SEG_MATRIX[num]
            g.add(dwg.use('#'+segid, transform="matrix%s" % str(mat)))

# Create decimal point by scaling a segment, add to drawing defs
def create_dp(dwg, dpid, segid):
    g = Group(id=dpid, transform=SKEW_MATRIX)
    g.add(dwg.use('#'+segid, transform=DP_MATRIX))

# Add seven-segment display
# Each digit is a group of coincident transparent chars 0 - 9
def add_display(dwg, pos, pitch):
    dpos = list(pos)
    for n in range(0, NDIGITS):
        scale = DIG_SCALE if n<4 else FRAC_SCALE
        g = Group(transform = "translate(%g %g) scale(%g)" %
            (dpos[0], dpos[1], scale))
        for i in range(0, 10):
            g.add(dwg.use("#dig%u"%i, id="digit%u%u"%(n,i), opacity=0))
        if n == 2:
            dwg.add(dwg.use("#dp", insert=(dpos[0]-SEGLEN*0.7,
        dpos[0] += pitch * scale

if __name__ == '__main__':
    dwg = create_svg(FNAME, DWG_SIZE)
    add_script(dwg, SCRIPT)
    create_lin_grad(dwg, "blue_grad", (0,0), (0,SEGWID*3/4),
    create_seg(dwg, "seg", SEGLEN, SEGWID, "blue_grad")
    create_digits(dwg, "dig", "seg")
    create_dp(dwg, "dp", "seg")
    add_display(dwg, (LMARGIN, VMARGIN), DIG_PITCH)
            (LMARGIN-2+DIG_PITCH*4, DWG_SIZE[1]-5),
            style="font-size:8px; font-family:Arial", fill="gray"))


Copyright (c) Jeremy P Bentham 2019. Please credit this blog if you use the information or software in it.

Raspberry Pi and OpenOCD

In previous blog posts I used an FTDI module and pure Python code to access the internals of an ARM CPU using the SWD interface. I want to expand this technique to provide a more comprehensive real-time display of the CPU status, but the FTDI interface is quite limiting; what I need is an fast intelligent SWD/JTAG adaptor, with a network interface so I can do both local and remote diagnosis.

Enter the raspberry Pi: a lot of computing power at very low cost, either using the built-in HDMI display output, or running ‘headless’ over a wireless network, providing diagnostic data to a remote display.

Connecting the Pi to the target system could hardly be simpler; 3 wires (clock, data & ground) are sufficient to access data from most CPUs with an SWD interface.

Raspberry Pi 3 SWD interface to STMicro ARM CPU
Raspberry Pi ZeroW JTAG interface to Atmel ARM CPU

Software-wise, OpenOCD has all the SWD/JTAG features you’ll ever need, accessed through a network interface; installation may be a bit intimidating if you’re not an experienced Linux user, but is really quite easy, as this blog will (hopefully) demonstrate.

What you end up with is a really powerful local/remote debugger for very little money; around $10 US, in the case of the Pi Zero W.

Installing OpenOCD

You need any Raspberry Pi (RPi), versions 0 to 3. The slower boards will have longer boot times & build times but are otherwise fully functional. The OS version I used was ‘Raspbian Stretch with desktop’; the ‘recommended software’ add-ons are not necessary. The total image size on SD card is around 3 GB.

A convenient way to avoid re-typing the instructions below is to enable the Secure Shell (SSH) protocol using the ‘Raspberry Pi Configuration utility’, then run a remote ssh client (e.g. ‘putty’ on Windows) to access the RPi over the network; you can then cut and paste a command line into the ssh window without re-typing. If in doubt as to the IP address of your RPi, hover the cursor over the network icon in the top-right corner, and the address will be shown, e.g. (or run ‘ifconfig’ if in text mode).

It is best to install OpenOCD from source, as the pre-built images often lack important functionality. Installation instructions can be found on many Web sites, for example Adafruit “Programming Microcontrollers using OpenOCD on a Raspberry Pi”. In summary, the steps are:

cd ~
sudo apt-get update
sudo apt-get install git autoconf libtool make pkg-config libusb-1.0-0 libusb-1.0-0-dev
git clone
cd openocd
./configure --enable-sysfsgpio --enable-bcm2835gpio
sudo make install

The ‘make’ step takes approximately an hour on the slower boards, or 15 minutes on the faster.

Configuration files

OpenOCD has a wide variety of options, so generally needs more than one configuration file, to define:

  • Debug adaptor (in our case, the RPi)
  • Communication method (SWD or JTAG)
  • Target CPU.

There are a large number of files in /usr/local/share/openocd/scripts, most notably the ‘interface’ and ‘target’ sub-directories, however there are so many permutations that it is unlikely you’ll find everything you need, so we need to think about creating our own files.

The most important first step is to work out how the RPi will be connected to the target system…

RPi I/O connections

At the time of writing, there are 3 versions of the RPi I/O connector, and 3 different pin-numbering schemes, so it is easy to get confused. The older boards may be considered obsolete, but are still more than adequate for running OpenOCD, so mustn’t be excluded.

The numbering schemes are:

  1. Connector pin numbers: sequential 1 – 26 or 1 – 40
  2. GPIO bit numbers (also known as Broadcom or BCM numbers) 0 – 27
  3. WiringPi numbers, as used in the Python library

I’ll only be using the first 2 of these. The older boards have 26 pins, the newer 40.

RPi 26-way connector with GPIO numbers

Pins 3 & 5 were initially GPIO 0 and 1, but later became GPIO 2 and 3; they are best avoided.

RPi 40-way connector with GPIO numbers

On the 40-way connector GPIO21 has become 27, so should also be avoided. The choice of ground pin is arbitrary; any of them can be used, but I avoid pin 6, as any mis-connection to the supply pins can result in significant damage.


The SWD connections given in the OpenOCD configuration file ‘raspberrypi2-native.cfg’ are:

raspberrypi2-native SWD connections

The relevant lines in the configuration file are:

# SWD                 swclk swdio
# Header pin numbers: 22    18
bcm2835gpio_swd_nums  25    24

bcm2835gpio_srst_num  18
reset_config srst_only srst_push_pull

In many applications the reset signal is unnecessary – and undesirable, if the objective is to perform non-intrusive monitoring of a running system.


JTAG is an older (and more widely available) standard for debugging, that requires 4 wires in the place of 2 for SWD. There is a standard mapping between them (SWCLK is TCK, SWDIO is TMS), but the JTAG connections in the standard OpenOCD configuration file ‘raspberrypi-native.cfg’ use completely different pins:

raspberrypi-native JTAG connections

The relevant lines in the configuration file are:

# JTAG                tck tms tdi tdo
# Header pin numbers: 23  22  19  21
bcm2835gpio_jtag_nums 11  25  10  9

# bcm2835gpio_srst_num 24
# reset_config srst_only srst_push_pull

As standard, the reset definition is commented out.

I’m not a fan of this pinout scheme; I’d like a single setup that covers both SWD and JTAG.

Other pin functions

You might wish to use the RPi for other diagnostic functions, such as monitoring a serial link, so these pins have to be kept free. The following diagram shows the alternative pin functions.


You can use any of the blue or yellow pins for the SWD/JTAG interface, it is just a question as to which other functionality you may be needing.

Combining SWD and JTAG

The compromise I’ve adopted is to preserve the existing SWD arrangement, but move the JTAG pins so one set of connections can serve both SWD & JTAG on either the 26-way or the 40-way connectors – and I’ve also avoided using any of the predefined pins, so there are no conflicts with other functionality.

rpi_swd_jtag SWD and JTAG connections

The relevant section of the configuration file is:

# SWD                swclk swdio
# Header pin numbers 22    18
bcm2835gpio_swd_nums 25    24

# JTAG                tck tms tdi tdo
# Header pin numbers  22  18  16  15 
bcm2835gpio_jtag_nums 25  24  23  22

Target system connections

The connection points on the target system will vary from board to board; for a previous demonstration I used a ‘blue pill’ STM32F103 board that has ground, SWD clock & data conveniently on some separate header pins, but the most common standard for JTAG & SWD connections is a 20-way 2-row header, as follows:

JTAG     SWD     20-way pin
Ground   Ground  4, 6, 8, 10, 14, 16, 18, 20
TRST             3
TDI              5
TMS      SWDIO   7
TCK      SWCLK   9
TDO              13
RESET            15

There is generally a keyway on the odd-numbered side of the connector.


Two reset signals are defined: TRST is ‘tap reset’, that is intended to just reset the diagnostic port; the other signal marked RESET (which OpenOCD refers to as SRST or ‘system reset’) should reset all devices, as if a reset button has been pressed. In the experimentation I’ve done, the reset lines haven’t been needed, but this is very processor-specific; sometimes the RESET line has to be used to gain control of the target system.

It is convenient to use ribbon cable for wiring up the interface, especially if the wires follow the resistor colour code:

RPi pin  Colour  20-way pin  JTAG/SWD
9        Brown   20          Ground
12       Red     15          Reset
16       Orange  5           TDI
15       Yellow  13          TDO
18       Green   7           TMS/SWDIO
22       Blue    9           TCK/SWCLK

Or in graphical form…


Interface configuration file

The above examples show how the SWD/JTAG connections are handled, but some more data is needed to fully configure the RPi interface, most notably the I/O base address and clock scaling; this tells OpenOCD where to find the I/O interface, and how to compute its speed.

There are 2 possible values for the I/O base address: the RPi zero and v1 use 0x20000000, and v2+ use 0x3F000000. If you are unsure which value to use, the boards have an excellent feature called Device Tree that documents the current hardware configuration; enter the following command in a console window:

xxd -c 4 -g 4 /proc/device-tree/soc/ranges

The base I/O address is the second value returned, for example:

RPi zero:
00000000: 7e000000  ~...
00000004: 20000000   ...
00000008: 02000000  ....
RPi v3:
00000000: 7e000000 ~...
00000004: 3f000000 ?..
00000008: 02000000 ....
..and so on..

The clock scaling is less critical, since we’re generally aiming for around 1 MHz, which gives quite a bit of leeway in terms of being fast or slow. This is fortunate, because it is difficult to find a definitive explanation of the values that should be used for all hardware & clock settings. My understanding, from reading the source code, is that every I/O read or write instruction is followed by a loop containing NOP (CPU idle) cycles to space out the operations; this number is known as the ‘jtag_delay’, and is calculated by:

(speed_coeff / khz) - speed_offset;

..where speed_coeff & speed_offset are the two scaling parameters, and khz is the desired SWD/JTAG clock speed in kHz (all the values are integers). Obviously the delay is very CPU-dependant; the standard values in the files are:

Rpi zero and v1:
  bcm2835gpio_speed_coeffs 113714 28
RPi v2+:
  bcm2835gpio_speed_coeffs 146203 36

These do seem to give roughly the right answers, and there isn’t any great necessity for the delays to be accurate – when viewed on an oscilloscope, you can see some of the cycles being stretched by an incoming interrupt, so they never will be as accurate as a pure hardware solution.

Adaptor configuration files

Combining all the information above, here are the two adaptor configuration files: rpi1.cfg for RPi zero & v1, and rpi2.cfg for v2+

# rpi1.cfg: OpenOCD interface on RPi zero and v1

# Use RPi GPIO pins
interface bcm2835gpio

# Base address of I/O port
bcm2835gpio_peripheral_base 0x20000000

# Clock scaling
bcm2835gpio_speed_coeffs 113714 28

# SWD                swclk swdio
# Header pin numbers 22    18
bcm2835gpio_swd_nums 25    24

# JTAG                tck tms tdi tdo
# Header pin numbers  22  18  16  15 
bcm2835gpio_jtag_nums 25  24  23  22
# rpi2.cfg: OpenOCD interface on RPi v2+

# Use RPi GPIO pins
interface bcm2835gpio

# Base address of I/O port
bcm2835gpio_peripheral_base 0x3F000000

# Clock scaling
bcm2835gpio_speed_coeffs 146203 36

# SWD                swclk swdio
# Header pin numbers 22    18
bcm2835gpio_swd_nums 25    24

# JTAG                tck tms tdi tdo
# Header pin numbers  22  18  16  15 
bcm2835gpio_jtag_nums 25  24  23  22

Running OpenOCD

Finally we get to run OpenOCD, but in addition to the adaptor configuration, we need to give some details about the interface & target CPU.

The command line consists of configuration files prefixed by -f, and commands prefixed by -c. In reality, a configuration file is just a series of commands; for example you can select JTAG operation using the command-line option:

openocd -c "transport select jtag"

This is exactly the same as:

openocd -f select_jtag.cfg

where the file ‘select_jtag.cfg’ has the line:

transport select jtag

So we’ll use a mixture of commands and files on our command line. The following example is for an RPi v3 driving an SWD interface into a STM32F103 processor;  I’ve used backslash continuation characters at the end of each line to make the commands more readable:

sudo openocd -f rpi2.cfg \
             -c "transport select swd" \
             -c "adapter_khz 1000" \
             -f target/stm32f1x.cfg

Some hardware operations require superuser privileges, hence the use of ‘sudo’. The usual security warnings apply when doing this; you can try without, there will just be a ‘permission denied’ error if it fails.

For a list of supported CPUs, see the files in /usr/local/share/openocd/scripts/target

When OpenOCD runs, with a bit of luck, you’ll see something like:

BCM2835 GPIO nums: swclk = 25, swdio = 24
BCM2835 GPIO config: tck = 25, tms = 24, tdi = 23, tdo = 22
adapter speed: 1000 kHz
adapter speed: 1000 kHz
adapter_nsrst_delay: 100
none separate
cortex_m reset_config sysresetreq
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : BCM2835 GPIO JTAG/SWD bitbang driver
Info : JTAG and SWD modes enabled
Info : clock speed 1001 kHz
Info : SWD DPIDR 0x1ba01477
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : Listening on port 3333 for gdb connections

If there is a configuration or wiring error, OpenOCD usually (but not always!) returns to the command line, for example if the SWDIO line is disconnected:

BCM2835 GPIO nums: swclk = 25, swdio = 24
BCM2835 GPIO config: tck = 25, tms = 24, tdi = 23, tdo = 22
adapter speed: 1000 kHz
adapter speed: 1000 kHz
adapter_nsrst_delay: 100
none separate
cortex_m reset_config sysresetreq
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : BCM2835 GPIO JTAG/SWD bitbang driver
Info : JTAG and SWD modes enabled
Info : clock speed 1001 kHz
Info : SWD DPIDR 0x02192468

..and then OpenOCD terminates back to the command line..

The clue is in the SWD Data Port ID Register (DPIDR) value. According to the datasheet for the STM32F103 CPU, this should be 1BA01477. With a data line fault, every time OpenOCD runs, a different value is returned, e.g. 0x00e65468, 0x02192468, 0x00433468 and so on; the software is just picking up noise on the data line.

A disconnected clock line is harder to diagnose, as OpenOCD just terminates after the ‘clock speed’ report, with no error indication. Try using the -d option to invoke a debug display, and you’ll see lines like

JUNK DP read reg 0 = ffffffff

which suggests that all is not well in the hardware interface.

Another thing to try in the event of a failure is adding or removing a reset line, and changing its configuration entries; if there is a reset problem you’ll probably see the DPIDR value reported correctly, but other functions may not work.

What now?

Having just written 2100 words and drawn 8 diagrams, I’m going to take a short break. However, first I ought to give some indication as to how you control this OpenOCD setup.

The sign-on text mentions a telnet interface on port 4444, so we can use that; the commands highlighted in bold:

sudo apt-get install telnet  # ..if not already installed

telnet localhost 4444
Trying ::1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger

As standard, this interface only works when telnet is running locally on the Raspberry Pi. To open it up to a wider network, add the command ‘bindto’ to the configuration. However, this option comes with a major security warning; think very carefully before making the system accessible to everyone on the network.

Refer to the OpenOCD documentation for information on the large number of commands that can be used over telnet, for example displaying memory using ‘mdw’ or halting the processor using ‘halt’. When finished, close the telnet link with ‘exit’.

Open-source development toolchain

To learn more about the way OpenOCD can be used with GCC and GDB to program & debug ARM target systems, take a look at this post.

Copyright (c) Jeremy P Bentham 2019. Please credit if you use the information or software in here.

Viewing ARM CPU activity in real time


In previous blog posts, I have described how an FTDI USB device can be programmed in Python to access the SWD bus of an ARM microprocessor. This allows the internals of the CPU to be accessed, without disrupting the currently running program.

In this blog I take the process one step further, and add a graphical front-end, that shows the CPU activity in real time; if you want to see it in action, take a look at this video.

If you need a more powerful debug system, take a look at my post OpenOCD on the Raspberry Pi. I’ve also created a remote graphical front-end that uses a Web browser for display, instead of running PyQt locally, click here for details.


The target system in the demonstration is a ‘blue pill’ STM32F103 board, with a 7-segment display and pushbutton. This CPU board is particularly convenient because it has a 4-pin connector with SWD clock & data; it can be seen on the right-hand side of the photograph above.

The SWD connection is described in detail in this post, in brief, the circuit is:

SWD cable circuit diagram

Take care when making any connections to power lines, especially the 5 volt line on the FTDI module; if mis-connected, high currents can flow, resulting in significant damage.

The demonstration system has a Kingbright SC56-11LGWA common-cathode 7-segment display; the anodes are driven directly from the CPU I/O pins with 220 ohm series resistors. The pin mapping (reading from top left of the display) is:

Segment   CPU Pin (via 220R)
g         PB11
f         PB10
a         PB1
b         PB0
e         PB12
d         PB13
c         PB14
DP        PB15

Button    PB3 (button shorts this pin to ground)

You may wonder why I have used such a complex mapping; why not use 8 consecutive pins? The answer is that the above arrangement makes the wiring easier, at the expense of a little software complexity. This trade-off is quite common in commercial projects, where the demands of cost-saving often lead to significant complexity in the hardware configuration.



The full source code is available on Github, and is compatible with Python 2.7 or 3.x. It has been tested on Windows, and is theoretically Linux-compatible, except for a problem reading data back from USB, as described in the earlier blog – this issue needs to be resolved. The Python library dependencies are:

  • PyQt v4 or v5 (GPL version)
  • ftd2xx
  • pypiwin32 (if running Python 3.x)

These can be installed with ‘pip’ or ‘pip3’ as usual.


The following text describes how I created the graphical front-end, for the benefit of anyone wishing to understand or modify the code; this isn’t necessarily the best way, it is just what I did based on past experience.

There are many options for creating a GUI in Python; I’ve used PyQt in the past, so that is the option I’ve chosen here. In case you’re unfamiliar with it, the current version is 5, but many installations are still version 4, so I’ve written the code to be compatible with both, even though this does involve some manipulation of the libraries:

    from PyQt5.QtGui import QBrush, QPen, QColor, QFont, QTextCursor, QFontMetrics, QPainter
    from PyQt5.QtWidgets import QApplication, QGraphicsScene, QGraphicsView, QGraphicsSimpleTextItem
    from PyQt5 import QtCore, QtWidgets
    from PyQt4.QtGui import QBrush, QPen, QColor, QFont, QTextCursor, QFontMetrics, QPainter
    from PyQt4.QtGui import QApplication, QGraphicsScene, QGraphicsView, QGraphicsSimpleTextItem
    from PyQt4 import QtCore, QtGui as QtWidgets

It is possible to create the entire front-end graphically using Qt designer, then just import the GUI file and it will be displayed exactly as designed, so in theory you only need to write the event-handlers for the various actions. Personally, I find this approach a bit tricky when implementing more complex GUIs, so tend to use the PyQt function calls to build the graphics from scratch.

Main Window

The main window is as simple as possible, containing only one central widget:

class MyWindow(QtWidgets.QMainWindow, MyWidget):
    graph_updater = QtCore.pyqtSignal(str)

    def __init__(self, parent=None):
        QtWidgets.QMainWindow.__init__(self, parent)
        self.widget = MyWidget(self)

A ‘graph_updater’ signal is defined so that any thread can send a request to update the graphics; this will be queued until the main GUI thread is back in control, so there is no ambiguity over which thread is performing the updates.

The single widget contains all the graphical elements, and the hierarchy is important if  they are to be displayed correctly; for example, if the window is resized, I want all the elements to be resized in proportion, and that is only possible if the correct parent-child relationship is maintained.

The upper area of the widget is graphics, the lower is text; they are combined using a vertical layout widget.

self.text = QtWidgets.QTextEdit()
self.scene = QGraphicsScene()
self.view = MyView(self.scene)
layout = QtWidgets.QVBoxLayout()
layout.addWidget(self.view, 30)
layout.addWidget(self.text, 10)

Text display

The text area is used to display all kinds of diagnostic information, so it is convenient to redirect the console print function to display here. This is done by creating ‘write’ and ‘flush’ functions in the widget, which emit a signal linked to an updater function:

text_updater = QtCore.pyqtSignal(str)
# Handler to update text display
def update_text(self, text):
    disp = self.text.textCursor()           # Move cursor to end
    text = str(text).replace("\r", "")      # Eliminate CR
    while text:
        head,sep,text = text.partition("\n")# New line on LF
        if sep:
    self.text.ensureCursorVisible()    # Scroll if necessary

# Handle sys.stdout.write: update text display
def write(self, text):
def flush(self):

Now all that is necessary is to redirect console output to the main widget..

sys.stdout = self

..and magically anything in a ‘print’ function will appear in the text display. The PyQt Signal interface ensures there are no threading problems, you can still use the print function anywhere.

Graphical display

When displaying graphics, Qt (and hence PyQt) makes a distinction between the graphical objects (the ‘scene’) and their realisation on the screen (the ‘view’).

When first experimenting with background patterns, I discovered one (Dense1Pattern) that creates rows of holes similar to the prototyping board. Having fixed on this, it was logical to draw all objects with respect to this grid, i.e. in nominal units of 0.1 inches as on the prototyping board, though the graphics will expand or shrink when the window size is changed.


To draw an object, it is just added to the scene, and it will automatically be displayed, for example to draw a circle:

# Add circle to grid, given centre
    def draw_circle(self, gpos, size, pen, brush=PIN_ON_BRUSH):
        size *= GRID_PITCH
        x,y = self.grid_pos(gpos)
        p = self.scene.addEllipse(0, 0, size, size, pen, brush)
        p.setPos(x-size/2.0, y-size/2.0)
        return p

You can see the conversions from screen-units to grid-units. One less-obvious aspect of this code is that the ellipse is originally drawn at the 0,0 origin, then moved into place, rather than being drawn at the final location; this simplifies any subsequent operations such as movement or rotation.


The drawing function returns the object that has been created, which must be saved somewhere, so it can be animated.  The most obvious form of animation is to replace that object with another, e.g. one drawn in dark red to light red, but there is an easier way to modify any drawn object: change the opacity – the extent to which the object is transparent or opaque. If a bright red object is drawn with an opacity value of 0.1, it becomes a faint dark red; changing the opacity to 1.0 restores the full strong colour.

So the objects that are to be animated are stored as a list in a dictionary, indexed on the signal name (e.g. ‘PB10’); when that signal changes state, it is only necessary to walk the list, setting the opacity as required.

# Set pin (or segment) on/off state
# Format is 'name=value', e.g.  'PA10=1'
def set_pin(self, s):
    name, eq, num = s.partition('=')
    if eq and name in self.sigpins:
        val = int(num, 16)
        for p in self.sigpins[name]:
            if int(p.opacity()) != val:
                p.setOpacity(PIN_ON_OPACITY if val else PIN_OFF_OPACITY)

7-segment display

As it happens, PyQt has a widget to draw 7-segment displays, but in this case it is easier to animate if drawn from scratch. The standard segment notation is:


To draw this in one continuous operation, I start with segment F, then ABCDEG. Once drawn, the list is rearranged into ABCDEFGH order, where H is the decimal point.

SWD interface

The SWD interface uses an FTDI USB device as described in detail here. The software used in this project is very similar, but has been optimised to scan the I/O ports quite fast, around 100 times per second. The way this has been achieved is to group all the commands to the FTDI device together, so a single block of data requests is sent out over the USB bus, then a single block of responses is read back.

All the outgoing requests are buffered, rather than being sent individually – this is quite a simple change, you just have to remember to flush the buffer before reading back the results. However you then have the problem that the returned data block consists of the data from several requests – how do you work out which data value corresponds to which request? The method I’ve adopted is to create a polling list, with objects representing the memory addresses to be polled

poll_vars = []  # List of variables to be polled

# Storage class for variable to be polled
class Pollvar(object):
    def __init__(self, name, addr):, self.addr = name, addr
        self.value = None

# Add variable to the polling list
def poll_add_var(name, addr):
    poll_vars.append(Pollvar(name, addr))

The data requests are generated by walking down the list, then when the responses arrive, the ‘value’ fields are filled in sequentially, or set to ‘none’ if there was an error.

# Send out poll requests
def poll_send_requests(h):
    for pv in poll_vars:
        swd.swd_wr(h, swd.SWD_AP, APORT_TAR, pv.addr, True, False)
        swd.swd_idle_bytes(h, 2)
        swd.swd_rd(h, swd.SWD_AP, APORT_DRW, True, False)
        swd.swd_rd(h, swd.SWD_AP, APORT_DRW, True, False)

# Get poll responses
def poll_get_responses(h):
    for pv in poll_vars:
        swd.swd_wr(h, swd.SWD_AP, APORT_TAR, pv.addr, False, True)
        swd.swd_rd(h, swd.SWD_AP, APORT_DRW, False, True)
        req = swd.swd_rd(h, swd.SWD_AP, APORT_DRW, False, True)
        pv.value = if ( is not None and
                    req.ack.value==swd.SWD_ACK_OK) else None

It is slightly confusing that the request & response use the same swd_wr and swd_rd functions; the key to understanding this code is to look at the boolean values. ‘True, False’ means that a transmission is sent, but no data is read back; ‘False, True’ is the opposite, in that nothing is sent, the response is just read back.

If you want to see the SWD transactions, try setting swd.VERBOSE to True:

SWD interface: FT232H device in Single RS232-HS

Rd 0 IDCODE  1BA01477 Ack 1
Wr 0 ABORT   0000001E Ack 1
Wr 4 CTRL    50000000 Ack 1
Rd 4 STATUS  F0000040 Ack 1
DP ident: 1BA01477
Wr 8 SELECT  000000F0 Ack 1
Rd C DRW/BD3 00003BDB Ack 1
Rd C DRW/BD3 14770011 Ack 1
AP ident: 14770011
Wr 8 SELECT  00000000 Ack 1
Wr 0 CSW/BD0 22000002 Ack 1
Wr 4 TAR/BD1 40010C08
Wr 4 TAR/BD1 40010C08 Ack 1
Rd C DRW/BD3 14770011 Ack 1
Rd C DRW/BD3 000043D9 Ack 1
..and so on..

Looking at the last block of 6 transactions, the first 3 are a write-cycle to the SWD TAR (transfer address) register, then a dummy read and the actaul read cycle. These are the requests being sent to the target system; nothing is being read back so the read-data & ack-status values are unknown. The second block of 3 are the readback of these requests, so the actual values are displayed.

An ‘ack’ value of anything apart from 1 is an error; a value of 0 or 7 suggests the target isn’t responding, 2 means it is trying to send a delayed response, and 4 indicates a hard error – this is sticky, so will persist until cleared by writing to the ‘abort’ register.

Code structure

The ‘reporta’ project source files on Github are:   main program   PyQt interface    ARM processor definitions    SWD interface FTDI device driver

The lower 3 files are very similar to their counterparts in these posts, basically I’ve copied the files with minor modifications. Each file has a ‘main’ function to demonstrate its functionality; for example, you can see the PyQt interface running without the FTDI hardware, just by executing

The main ‘reporta’ program has no command-line options, currently the I/O port of interest is hard-coded; to add other ports (or CPU memory locations) to the polling list, just use the poll_add_var function.

As with the previous posts, I must state that so far I’ve only tested this technique on the STM32F103 processors; others may require custom powerup sequences, or special incantations to gain access to the CPU memory – but hopefully my Python code will provide a useful starting-point.

Copyright (c) Jeremy P Bentham 2018. Please credit this blog if you are using the information or software in it.

Programming FTDI devices in Python: Part 5

Doing something useful with SWD

In part 4 we got as far as reading in the CPU identification, which is of no real use; in this part we’ll actually read some of the CPU internals, but first we need to understand how SWD accesses are controlled.

SWD cable showing resistor

DAP: DP and AP

You may think that we just need to feed a 32-bit address into the SWD port, and get a value back, but the reality is much more complicated. The SWD clock & data lines are connected to the CPU Debug Port (SW-DP), which has its own address space. Read/write accesses to the CPU memory aren’t controlled by the DP; you have to go through the Access Port (AP), which also has a separate memory space, and is bank-switched.


We’ve already accessed the Debug Port ID code register, the other read-write registers are:

# Debug Port (SWD-DP) registers
# See ARM DDI 0314H "Coresight Components Technical Reference Manual"
DPORT_IDCODE        = 0x0   # ID Code / abort
DPORT_ABORT         = 0x0
DPORT_CTRL          = 0x4   # Control / status
DPORT_STATUS        = 0x4
DPORT_SELECT        = 0x8   # Select
DPORT_RDBUFF        = 0xc   # Read buffer

These registers control various aspects of the debug interface, for example the ‘abort’ register is used to reset the internal logic after an error, and the ‘control’ register is used to power up the peripherals needed for debugging.

The ‘select’ register has bitfields to control AP and DP bank-switching; it is convenient to define this using CTYPES:

from ctypes import Structure, Union, c_uint
# AHB-AP Select Register
class DP_SELECT_REG(Structure):
    _fields_ = [("DPBANKSEL",   c_uint, 4),
                ("APBANKSEL",   c_uint, 4),
                ("Reserved",    c_uint, 16),
                ("APSEL",       c_uint, 8)]
class DP_SELECT(Union):
    _fields_ = [("reg",   DP_SELECT_REG),
                ("value", c_uint)]
dp_select = DP_SELECT()

The union allows us to specify the whole 32–bit register when doing SPI read & write cycles, but still access individual bitfields within it.


The AP registers are:

# Access Port (AHB-AP) registers, high nybble is bank number
# See ARM DDI 0337E: Cortex M3 Technical Reference Manual page 11-38
APORT_CSW           = 0x0   # Control status word
APORT_TAR           = 0x4   # Transfer address
APORT_DRW           = 0xc   # Data read/write
APORT_BANK0         = 0x10  # Banked registers
APORT_BANK1         = 0x14
APORT_BANK2         = 0x18
APORT_BANK3         = 0x1c
APORT_DEBUG_ROM_ADDR= 0xf8   # Address of debug ROM
APORT_IDENT         = 0xfc   # AP identification

The Control Status Word register controls various aspects of the CPU memory accesses, for example the data size, and auto-increment for reading blocks of memory:

# AHB-AP Control Status Word Register
class AP_CSW_REG(Structure):
    _fields_ = [("Size", c_uint, 3),
                ("Res1", c_uint, 1),
                ("AddrInc", c_uint, 2),
                ("DbgStatus", c_uint, 1),
                ("TransInProg", c_uint, 1),
                ("MODE", c_uint, 4),
                ("Res2", c_uint, 13),
                ("HProt1", c_uint, 1),
                ("Res3", c_uint, 3),
                ("MasterType", c_uint, 1),
                ("Res4", c_uint, 2)]
class AP_CSW(Union):
    _fields_ = [("reg", AP_CSW_REG),
                ("value", c_uint)]
ap_csw = AP_CSW()

Because the AP can be accessing main CPU memory, it has two time-dependencies:

  • the AP may return a ‘wait’ indication in the status field so the transaction has time to go though
  • the data isn’t returned immediately; first you have to do a dummy read cycle, then a second read cycle that actually returns the data

So to set up a CPU memory read cycle, we need to configure the AP (including its bank-switching) then set a transfer address

# Configure AP memory accesses
def ap_config(d, inc, size):
    dp_select.reg.APBANKSEL = 0     # Zero bank
    swd.swd_wr(d, swd.SWD_DP, DPORT_SELECT, dp_select.value)
    ap_csw.reg.MasterType = 1
    ap_csw.reg.HProt1 = 1           # Enable incrementing, set access size
    ap_csw.reg.AddrInc = 1 if inc else 0
    ap_csw.reg.Size = 0 if size==8 else 1 if size==16 else 2
    return swd.swd_wr(d, swd.SWD_AP, APORT_CSW, ap_csw.value)

# Set AP memory address
def ap_addr(d, addr):
    swd.swd_wr(d, swd.SWD_AP, APORT_TAR, addr)
    swd.swd_idle_bytes(d, 2)            # Idle to avoid 'wait' response

# Do an immediate read of a 32-bit CPU memory location
def cpu_mem_read32(d, addr):
    ap_addr(d, addr)                              # Address to read
    swd.swd_rd(d, swd.SWD_AP, APORT_DRW)          # Dummy read cycle
    r = swd.swd_rd(d, swd.SWD_AP, APORT_DRW)      # Read data
    return if r.ack.value==swd.SWD_ACK_OK else None

There are 2 idle (null) bytes after the target memory address is set. These give the AP time to process the address value before it is used; if omitted, the AP gives an ack value of 2 (‘wait’) on the next transaction.

User interface

The console-based user interface is minimal; it just allows you to specify the STM32F103  GPIO port or memory address to be accessed, e.g.

python gpiob
DP ident: ack 1, value 1BA01477h
Powerup:  ack 1, value F0000040h
AP ident: ack 1, value 14770011h
40010C00: GPIOB CRL=44488433 CRH=33333344 IDR=00007FDA ODR=00007C1A

python 0
DP ident: ack 1, value 1BA01477h
Powerup:  ack 1, value F0000040h
AP ident: ack 1, value 14770011h
00000000: 20005000

If a gpio port is specified, four of its register values are printed out (CRL, CRH, IDR, ODR); the other diagnostic values can be useful if the access fails. Further help is available by setting the VERBOSE flag at the top of the part 4 file, which enables a printout of all the SWD cycles:

# Command line with invalid address
python 800000 
  Rd 0 IDCODE  1BA01477 Ack 1
DP ident: ack 1, value 1BA01477h
  Wr 0 ABORT   0000001E Ack 1
  Wr 4 CTRL    50000000 Ack 1
  Rd 4 STATUS  F0000000 Ack 1
Powerup:  ack 1, value F0000000h
  Wr 8 SELECT  000000F0 Ack 1
  Rd C DRW/BD3 00000000 Ack 1
  Rd C DRW/BD3 14770011 Ack 1
AP ident: ack 1, value 14770011h
  Wr 8 SELECT  00000000 Ack 1
  Wr 0 CSW/BD0 22000012 Ack 1
  Wr 4 TAR/BD1 00800000 Ack 1
  Rd C DRW/BD3 14770011 Ack 1
  Rd C DRW/BD3 00000000 Ack 4
00800000: ?

Source code

Here is the final batch of source-code. I’ve only tested the it with STM32F1 CPUs, so if you are trying to communicate with something else, there is a strong possibility you’ll need to make some changes. Points to bear in mind:

  1. Check the the CPU supports SWD, and the connections it uses.
  2. Check for any special power-up requirements, e.g. sending the reset sequence multiple times, or setting registers to enable debugging mode
  3. Check that the SWD clock and signal lines are toggling OK.
  4. Watch out for acknowledgement values 2 and 4, indicating a problem.
  5. Once an error occurs, it will persist over successive cycles until reset by writing to the ABORT register.
  6. Good luck!
# Python FTDI SWD CPU memory read from
# Compatible with Python 2.7 or 3.x
# v0.01 JPB 8/12/18

import sys, ftdi_py_part3 as ft, ftdi_py_part4 as swd
from ctypes import Structure, Union, c_uint

# STM32F1 address values for testing
# Address of GPIO Ports A - E on STM32F1
ports = {"GPIOA":0x40010800, "GPIOB":0x40010C00, "GPIOC":0x40011000,
         "GPIOD":0x40011400, "GPIOE":0x40011800}
# GPIO registers at offsets 0, 4, 8, 12
gpio_regs = ("CRL", "CRH", "IDR", "ODR")

# Debug Port (SWD-DP) registers
# See ARM DDI 0314H "Coresight Components Technical Reference Manual"
DPORT_IDCODE        = 0x0   # ID Code / abort
DPORT_ABORT         = 0x0
DPORT_CTRL          = 0x4   # Control / status
DPORT_STATUS        = 0x4
DPORT_SELECT        = 0x8   # Select
DPORT_RDBUFF        = 0xc   # Read buffer

# Access Port (AHB-AP) registers, high nybble is bank number
# See ARM DDI 0337E: Cortex M3 Technical Reference Manual page 11-38
APORT_CSW           = 0x0   # Control status word
APORT_TAR           = 0x4   # Transfer address
APORT_DRW           = 0xc   # Data read/write
APORT_BANK0         = 0x10  # Banked registers
APORT_BANK1         = 0x14
APORT_BANK2         = 0x18
APORT_BANK3         = 0x1c
APORT_DEBUG_ROM_ADDR= 0xf8   # Address of debug ROM
APORT_IDENT         = 0xfc   # AP identification

# DP Select Register
class DP_SELECT_REG(Structure):
    _fields_ = [("DPBANKSEL",   c_uint, 4),
                ("APBANKSEL",   c_uint, 4),
                ("Reserved",    c_uint, 16),
                ("APSEL",       c_uint, 8)]
class DP_SELECT(Union):
    _fields_ = [("reg",   DP_SELECT_REG),
                ("value", c_uint)]
dp_select = DP_SELECT()

# AHB-AP Control Status Word Register
class AP_CSW_REG(Structure):
    _fields_ = [("Size",        c_uint, 3),
                ("Res1",        c_uint, 1),
                ("AddrInc",     c_uint, 2),
                ("DbgStatus",   c_uint, 1),
                ("TransInProg", c_uint, 1),
                ("MODE",        c_uint, 4),
                ("Res2",        c_uint, 13),
                ("HProt1",      c_uint, 1),
                ("Res3",        c_uint, 3),
                ("MasterType",  c_uint, 1),
                ("Res4",        c_uint, 2)]
class AP_CSW(Union):
    _fields_ = [("reg",   AP_CSW_REG),
                ("value", c_uint)]
ap_csw = AP_CSW()

# Select AP bank, do read cycle
def ap_banked_read(d, addr):
    dp_select.reg.APBANKSEL = addr >> 4;
    swd.swd_wr(d, swd.SWD_DP, DPORT_SELECT, dp_select.value)
    swd.swd_rd(d, swd.SWD_AP, addr&0xf)
    return swd.swd_rd(d, swd.SWD_AP, addr&0xf)

# Configure AP memory accesses
def ap_config(d, inc, size):
    dp_select.reg.APBANKSEL = 0     # Zero bank
    swd.swd_wr(d, swd.SWD_DP, DPORT_SELECT, dp_select.value)
    ap_csw.reg.MasterType = 1
    ap_csw.reg.HProt1 = 1           # Enable incrementing, set access size
    ap_csw.reg.AddrInc = 1 if inc else 0
    ap_csw.reg.Size = 0 if size==8 else 1 if size==16 else 2
    return swd.swd_wr(d, swd.SWD_AP, APORT_CSW, ap_csw.value)

# Set AP memory address
def ap_addr(d, addr):
    swd.swd_wr(d, swd.SWD_AP, APORT_TAR, addr)
    swd.swd_idle_bytes(d, 2)            # Idle to avoid 'wait' response

# Do an immediate read of a 32-bit CPU memory location
def cpu_mem_read32(d, addr):
    ap_addr(d, addr)                              # Address to read
    swd.swd_rd(d, swd.SWD_AP, APORT_DRW)          # Dummy read cycle
    r = swd.swd_rd(d, swd.SWD_AP, APORT_DRW)      # Read data
    return if r.ack.value==swd.SWD_ACK_OK else None

if __name__ == "__main__":
    mem = sys.argv[1].upper() if len(sys.argv) > 1 else None
    dev = ft.ft_open()
    if not dev:
        print("Can't open FTDI device")
    ft.set_bitmode(dev, 0, 2)           # Enable SPI
    ft.set_spi_clock(dev, 1000000)      # Set SPI clock
    ft.ft_write(dev, (0x80, 0, ft.OPS)) # Set outputs
    swd.swd_reset(dev)                  # Send SWD reset sequence
    resp = swd.swd_rd(dev, swd.SWD_DP, DPORT_IDCODE) # Request & response
    if resp is None:
        print("No response")
        print("DP ident: ack %u, value %08Xh" % (resp.ack.value,
        swd.swd_wr(dev, swd.SWD_DP, DPORT_ABORT, 0x1e)    # Clear errors
        swd.swd_wr(dev, swd.SWD_DP, DPORT_CTRL,  0x5<<28) # Powerup request
        resp = swd.swd_rd(dev, swd.SWD_DP, DPORT_STATUS)  # Get status
        print("Powerup:  ack %u, value %08Xh" % (resp.ack.value,
        resp = ap_banked_read(dev, APORT_IDENT)           # Get AP ident
        print("AP ident: ack %u, value %08Xh" % (resp.ack.value,
        ap_config(dev, 1, 32);                            # Configure AP RAM accesses
        s = ""
        if mem in ports:
            addr = ports[mem]
            s = "%08X: %s" % (addr, mem)
            for reg in gpio_regs:
                val = cpu_mem_read32(dev, addr)
                s += " %s=%08X" % (reg, val)
                addr += 4
                addr = int(mem, 16)
                addr = None
            if addr is not None:
                s = "%08X:" % addr
                val = cpu_mem_read32(dev, addr)
                s += " %08X" % val if val is not None else " ?"



Next steps

Whilst this code is an interesting framework for experimentation, it lacks some of the features and error-handling of a ‘real’ application, for example it’d be nice to draw an animated picture of the CPU, showing the SWD poll results graphically. That is the subject of another post.

Copyright (c) Jeremy P Bentham 2018. Please credit this blog if you are using the information or software in it.

Programming FTDI devices in Python: Part 4

First steps toward viewing CPU internals with SWD

What is SWD?

If you want to access the internals of a programmable device, there used to be only one way: a JTAG interface. This uses 4 signals: TDI, TDO, TCK and TMS, and is quite complex; it can handle multiple daisy-chained devices, of various types. When you add in a variety of USB-JTAG adaptors and APIs to serve data to higher-level GUIs, you have a very complex piece of software; for an illustration of this, take a look at OpenOCD

More recently, ARM introduced a simpler 2-wire protocol, called SWD. It has just 2 connections, clock and bi-directional data, but has most of the capabilities of the older JTAG systems. Software such as OpenOCD has been extended to incorporate the SWD protocol, but is still very complex; I felt there was a need for a simple-as-possible implementation, in a high-level language, that could easily be combined with custom GUI to display the CPU internals in whatever fashion suits your application; maybe an animated diagram of the CPU, display of serial data streams, or graphs of analogue values.

So the Python SWD project was born, and I needed to select a USB device for the interface. The more modern FTDI parts have the MPSSE protocol engine, which (as we’ll see later) is ideally suited for the SWD protocol, and there are a wide variety of FTDI cables and modules at reasonable cost.

In the previous blog posts I’ve documented some preliminary steps to understand the FTDI hardware, and how it can be driven from Python; now we have a major test, implementing the SWD protocol.


FTDI adaptors

We’ll only be using 3 pins (clock, data out, data in) on the adaptor, so it isn’t difficult to wire up and FTDI cable or module, the only requirements are that the device supports the MPSSE protocol, and has a 3.3V output. If the module has a 5 volt pin, you do need to be careful not to short-circuit or mis-connect it, as it can source quite high currents (over an amp) and do significant damage. If you peer closely at the above diagram, you’ll see top-right an Adafruit FT232H module with connector pins fitted but missing pin 1; this is so I can’t accidentally destroy my test CPU by accidentally connecting the SWD to 5 volts.

In the introduction I mentioned that the SWD protocol has a bi-directional data line, but unfortunately the FTDI adaptors don’t provide a bi-directional mode – we need to combine the data input & output lines to provide this. This is done by putting a resistor in series with the FTDI output, so that the target system can pull that line high or low when required.

SWD cable circuit diagram
SWD cable with heatshrink pulled back to show DI-DO resistor

A similar scheme is mentioned in the OpenOCD documentation, but they suggest a value of 470 ohms. I’ve gone with 1K because at its lowest drive setting, FTDI chips such as the FT232H only source 4 mA, and I’m never keen on overloading outputs, no matter how harmless this is supposed to be – but feel free to follow the majority opinion, and go with 470 ohms.

Some people suggest that is is necessary for the FTDI adaptor and target CPU to share a common supply. Professional JTAG adaptors do this – they take a supply from the target system, and use level-shifters to ensure the signals are of the right amplitude – but it should’t be necessary providing your supplies are of reasonable quality. However, you must resist the temptation to make the cables very long; we’re dealing with fast edge-sensitive signals, so I’d keep the cable length below 6 inches (150 mm).

A convenient way of incorporating the resistor in a cable is by soldering & covering with heat-shrink tubing; at a pinch you could use a screw-terminal block, but try to keep the assembly reasonably compact to avoid EMC problems.

SWD Protocol

There are 3 main difficulties with this protocol:

  • Bit-oriented rather than byte-oriented
  • Bi-directional data line
  • Intolerant of errors

The first of these is quite a culture-shock; when dealing with bit values, they are usually aggregated up to the nearest byte or word. This isn’t good enough for SWD; if you are supposed to be sending 2 bits, it must be 2 bits, not padded out to the nearest byte.

The second issue makes debugging the software quite challenging; if there is a bug that causes both sides to transmit at the same time, it is difficult to work out which side is at fault.

The third issue is actually a design feature; in the event of an error, the CPU interface is designed to stop transmitting, to avoid further data collisions – but when writing your own code, you often find the target CPU stops talking; it refuses to communicate, and you don’t know why.

To give an example, here is the standard SWD read transaction on which all data transfers are based, taken from the original ARM document “Serial Wire Debug and the CoreSight Debug and Trace Architecture”. All transactions are initiated and controlled by the SWD adaptor, the target CPU just ‘fills in the blanks’ in the messages it is given.


We start with the data line being idle, which (very confusingly) can be either high or low. The clock line can be either be running continuously, or can stop between transactions; a bit like Ethernet, in that the recipient looks for a specific marker in the data, and ignores everything until that is received. In this context, the marker is at least 2 low (zero) bits, followed by a high ‘start’ bit, then there are 7 bits of header data. If you want to know the meaning of these bits, ARM have copious online documentation, such as the “CoreSight Components Technical Reference Manual”.

After the initial transmission to the CPU, the adaptor inserts a dummy ‘turnaround’ bit where it stops driving the data line, letting the target CPU take over. The adaptor continues toggling the clock line while the CPU sends 3 acknowledgement bits; if these show a positive response (100, l.s.bit first, so a value of 1) then 32 bits of data will follow, and a parity bit. This concludes the transaction, but another turnaround bit is needed so the SWD adaptor can start driving the bus again.

Alternatively, the acknowledgement bits may show an error (001, which is a value of 4), in which case the CPU will stop communicating, or a ‘wait’ indication (010, a value of 2), which means the data isn’t yet available – try again later.

After this transaction, another may follow immediately, or a minimum of 2 zero bits may be inserted to idle the data line – a clean transition between transactions is essential, with no spurious additional bits.

Python implementation

After several false starts, I ended up creating my own class to store bit values; there are various bit-handling libraries around, but my requirements are so simple that these are massive overkill.

# Class for a multi-bit value
class Bitval(object):
    def __init__(self, value, nbits, name="", rd=False):
        self.value = value
        self.nbits = nbits = name
        self.rd = rd

That’s all – it is just a vehicle for storing one or more bits; the ‘name’ isn’t strictly necessary, but is useful in identifying one bit-value amongst many others.

The ‘rd’ flag indicates whether the value should be write-only, or whether we need the value to be read back from the target system. For example, in the SWD read cycle above, we need to know the ‘ack’ and ‘data’ values, but aren’t really interested in reading back the other bits we’re sending – and the FTDI device provides a convenient way of controlling whether input data is read back or not (command bit 5: ‘TDO/DI data input’).

Creating the SWD request is just a question of stacking the bit-values in a list, e.g.

# 1 start bit, 1 AP bit, 1 read bit, 2 address bits...
Bitval(1, 1, "Start"),      Bitval(ap,   1, "AP"),
Bitval(1, 1, "Read"),       Bitval(addr, 2, "Addr"), etc..

Our USB driver software just churns out the bit-values in sequence, then gets any responses that are required; it doesn’t need to understand what each bit-value means. All that is needed is a bit of support code, to allow iteration through the list, and give access to the important element values:

# Create an SWD read request for a given AP or DP address
class swd_rd_request(object):
    def __init__(self, ap, addr):
        addr >>= 2
        hpar = ap ^ 1 ^ (addr & 1) ^ (addr>>1 & 1)
        self.ack = Bitval(0, 3, "Ack", 1) = Bitval(0, 32, "Data", 1)
        self.dparity = Bitval(0, 1, "DParity", 1)
        self.bitvals = (
            Bitval(1, 1, "Start"),      Bitval(ap, 1, "AP"),
            Bitval(1, 1, "Read"),       Bitval(addr, 2, "Addr"),
            Bitval(hpar, 1, "HParity"), Bitval(0, 1, "Stop"),
            Bitval(1, 1, "Park"),       Bitval(0, 1, "Turn"),
            self.ack,         ,
            self.dparity,               Bitval(0, 1, "Turn"))

    # Allow the bitval list to be iterated
    def __getitem__(self, idx):
        bv = self.bitvals[idx]
        return bv

Having set up a class for our data transaction, it is easy to transmit the data, and evaluate the response:

req = swd_rd_request(ap, addr)  # Create request
for bv in req:                  # For each bit-value..
    spi_write_bitval(h, bv)     # ..send bit(s) out
for bv in req:                  # For each bit-value..
    if bv.rd:                   # .. with 'rd' flag set..
        spi_read_bitval(h, bv)  # .. read bit(s) in

Since the request is a class instance, we can access the returned bits in an intuitive way, e.g. simplistically:

if req.ack.value == 1:     # If request was acknowledged OK..
    print(  # ..print returned data value

SWD reset

Now that we have an easy way to send an SWD request, can we read something out from the CPU? Nearly, there is just the reset process to go through.

To unlock the CPU SWD interface and start communicating, we need to send a lengthy bit sequence, namely at least 50 ‘1’ bits, then 0111 1001 1110 0111 (9E E7 hex, l.s.bit first), then at least another 50 ‘1’ bits, then at least 2 ‘0’ bits. this serves 2 purposes:

  • It provides a unique bit-pattern, that can’t be confused with a normal request
  • It gives time for the CPU SWD interface to be powered up

The second point is important; ARM CPUs are designed with power-saving in mind, and parts of the CPU may be powered down when not in use, so need some time to wake up. This especially applies if the CPU is in a deep sleep mode; it may require the startup sequence to be sent several times before the CPU is sufficiently awake to respond to requests. Despite the ‘reset’ name, this sequence does not reset the error-handling of the SWD interface; that must be done using a separate write-cycle.

Reading the CPU ID

After sending the startup sequence, the first request should be a read of the CPU ID; not just because it is a simple read-only value, but also because the CPU specification may require that it be read before anything else.

We need to set the ‘ap’ and ‘addr’ values in the code above; I’ll describe these settings in detail in the next post, but for now, it is sufficient to say that the ID register is at DP address 0, so ap=0, addr = 0.

So we just need to send the startup bit pattern, then a request with these values, and read back the result. If we’re unlucky, it’ll be all-0s because the target CPU isn’t communicating, or all-1s because the data line is floating; ideally it is something between those, that is consistent every time it is read; on an STMicro Cortex M3 CPU (STM32F103) it is 1BA01477, see your CPU’s data sheet for the corresponding value.

See below for the full Python source code to read the ID register; I can’t claim this code is particularly useful on its own, but in the next post we’ll start to explore some more useful data requests.

Regrettably the code doesn’t work on Linux with libftdi. In all my tests the SPI write cycles work fine, but the read cycles always return null data. To be investigated.

# Python FTDI SWD example from
# Compatible with Python 2.7 or 3.x
# v0.01 JPB 8/12/18

import time, ftdi_py_part3 as ft

VERBOSE  = False    # Flag to display SWD read/write cycles
ERRVAL = 0xEEEEEEEE # Dummy value returned if read cycle fails

SWD_DP          = 0     # AP/DP flag bits
SWD_AP          = 1
DPORT_IDCODE    = 0x0   # ID Code address

SWD_ACK_OK      = 1     # SWD Ack values

FTDI_MODE_BITBANG   = 1     # MPSSE modes

FTDI_SPI_WR_CLK_NEG = 0x01  # SPI command bit values
FTDI_SPI_WR_TDI     = 0x10
FTDI_SPI_RD_TDO     = 0x20
FTDI_SPI_WR_TMS     = 0x40

# Commands to read, write, and read+write SPI data

# Class for a bit value (1 - 32 bits)
class Bitval(object):
    def __init__(self, value, nbits, name="", rd=False):
        self.value = value
        self.nbits = nbits = name
        self.rd = rd

# Send SWD reset; at least 50 high bits around 0111 1001 1110 0111
# (9E E7 lsb-first), then at least 2 null bits before start bit
def swd_reset(d):
    rst = (0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF, 0x9E,0xE7,
    spi_write_bytes(d, SPI_WR_BYTES, rst)
    spi_write_bits(d, SPI_WR_BITS, 0, 4)

# Send a number of idle (zero) bytes
def swd_idle_bytes(d, n):
    data = n * [0]
    spi_write_bytes(d, SPI_WR_BYTES, data)

# Create an SWD read request for a given AP or DP address
class swd_rd_request(object):
    def __init__(self, ap, addr):
        addr >>= 2
        hpar = ap ^ 1 ^ (addr & 1) ^ (addr>>1 & 1)
        self.ack = Bitval(0, 3, "Ack", 1) = Bitval(0, 32, "Data", 1)
        self.dparity = Bitval(0, 1, "DParity", 1)
        self.bitvals = (
            Bitval(1,    1, "Start"),  Bitval(ap,   1, "AP"),
            Bitval(1, 1, "Read"),      Bitval(addr, 2, "Addr"),
            Bitval(hpar, 1, "HParity"),Bitval(0, 1, "Stop"),
            Bitval(1,    1, "Park"),   Bitval(0,    1, "Turn"),
            self.ack,        ,
            self.dparity,              Bitval(0, 1, "Turn"))

    # Allow the bitval list to be iterated
    def __getitem__(self, idx):
        bv = self.bitvals[idx]
        return bv

# Create an SWD write request for a given AP or DP address
class swd_wr_request(object):
    def __init__(self, ap, addr, value):
        addr >>= 2
        hpar = ap ^ (addr & 1) ^ (addr>>1 & 1)
        self.ack = Bitval(0, 3, "Ack", 1) = Bitval(value, 32, "Data")
        self.dparity = Bitval(parity32(value), 1, "DParity")
        self.bitvals = (
            Bitval(1,    1, "Start"),  Bitval(ap,   1, "AP"),
            Bitval(0, 1, "Read"),      Bitval(addr, 2, "Addr"),
            Bitval(hpar, 1, "HParity"),Bitval(0, 1, "Stop"),
            Bitval(1,    1, "Park"),   Bitval(0,    1, "Turn"),
            self.ack,                  Bitval(0,    1, "Turn"),
  ,                 self.dparity)

    # Allow the bitval list to be iterated
    def __getitem__(self, idx):
        bv = self.bitvals[idx]
        return bv

# Send an SWD read request and/or get the response
def swd_rd(d, ap, addr, tx=True, rx=True):
    req = swd_rd_request(ap, addr)
    ok = False
    if tx:
        spi_write_bitvals(d, req)
        ok = True
    if rx:
        ok = spi_read_bitvals(d, req)
    if VERBOSE:
        if rx:
            print("  Rd %X %-7s %08lX Ack %u" % (addr,
                  apreg_str(addr) if ap else dpreg_str(addr, 1),
        , req.ack.value))
            print("  Rd %X %-7s" % (addr,
                  apreg_str(addr) if ap else dpreg_str(addr, 1)))
    return req if ok else None

# Send an SWD write request and/or get the response
def swd_wr(d, ap, addr, value, tx=True, rx=True):
    req = swd_wr_request(ap, addr, value)
    ok = False
    if tx:
        spi_write_bitvals(d, req)
        ok = True
    if rx:
        ok = spi_read_bitvals(d, req)
    if VERBOSE:
        if rx:
            print("  Wr %X %-7s %08lX Ack %u" % (addr,
                  apreg_str(addr) if ap else dpreg_str(addr, 0),
        , req.ack.value))
            print("  Wr %X %-7s %08lX" % (addr,
                  apreg_str(addr) if ap else dpreg_str(addr, 0),
    return req if ok else None

# Return DP register string
def dpreg_str(reg, rd):
    if rd:
        s = ("IDCODE" if reg==0 else "STATUS" if reg==4 else
             "RESEND" if reg==8 else "RDBUFF")
        s = ("ABORT " if reg==0 else "CTRL" if reg==4 else
             "SELECT" if reg==8 else "RDBUFF")
    return s

# Return AP register string; see Cortex-M3 'AHB-AP programmers model'
def apreg_str(reg):
    return ("CSW/BD0" if reg==0 else "TAR/BD1" if reg==4 else
            "BD2/RAR" if reg==8 else "DRW/BD3")

# Write bitval requests
def spi_write_bitvals(d, bitvals):
    for bv in bitvals:
        spi_write_bitval(d, bv)

# Read bitval responses
def spi_read_bitvals(d, bitvals):
    ok = True
    for bv in bitvals:
        ok = spi_read_bitval(d, bv)
        if not ok:
    return ok

# Write a bit value to SPI interface
# If read-flag is set, use read+write, otherwise just write
def spi_write_bitval(d, bv):
    value, nbits = bv.value, bv.nbits
    cmd = SPI_RD_WR_BITS if bv.rd else SPI_WR_BITS
    while nbits > 0:
        n = min(nbits, 8)
        spi_write_bits(d, cmd, value&0xff, n)
        value >>= n
        nbits -= n

# Read a bit value (max 32 bits) from SPI, if read-flag is set
def spi_read_bitval(d, bv):
    ok = True
    if bv.rd:
        bv.value = shift = 0
        nbits = bv.nbits
        while ok and nbits >= 8:    # Get whole bytes
            data = spi_read_bytes(d, 1)
            if len(data) > 0:
                byt = data[0] >> max(8-nbits, 0)
                bv.value |= byt  0:
                bv.value = data[0]
                bv.value = ERRVAL
                ok = False
    return ok

# Write SPI command and data bytes to the device
def spi_write_bytes(d, cmd, data):
    n = len(data) - 1
    ft.ft_write(d, [cmd, n&0xff, n>>8] + list(data))

# Read data bytes back from SPI
def spi_read_bytes(d, nbytes):
    return ft.ft_read(d, nbytes)

# Write SPI command and up to 8 bits to the device
def spi_write_bits(d, cmd, byt, nbits):
    ft.ft_write(d, (cmd, nbits-1, byt))

# Read data bits back from SPI
# Bits are left-justified in the byte, so must be shifted down
def spi_read_bits(d, nbits):
    data = ft.ft_read(d, 1)
    return [data[0] >> (8-nbits)] if len(data)>0 else []

# Calculate parity of 32-bit integer
def parity32(i):
    i = i - ((i >> 1) & 0x55555555)
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
    i = (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24
    return i & 1

if __name__ == "__main__":
    dev = ft.ft_open()
    if not dev:
        print("Can't open FTDI device")
        ft.set_bitmode(dev, 0, 2)           # Enable SPI
        ft.set_spi_clock(dev, 1000000)      # Set SPI clock
        ft.ft_write(dev, (0x80, 0, ft.OPS)) # Set outputs
        swd_reset(dev)                      # Send SWD reset sequence
        r = swd_rd(dev, SWD_DP, DPORT_IDCODE) # Request & response
        if r is None:
            print("No response")
            print("SWD ack %u, ID %08Xh" % (r.ack.value,

In the next post we’ll be doing something a bit more useful – accessing the CPU address space.

Copyright (c) Jeremy P Bentham 2018. Please credit this blog if you use the information or software in it.

Programming FTDI devices in Python: Part 3

Driving an SPI device using MPSSE

Synchronous protocols: MPSSE

In a synchronous protocol (such as SPI or I2C) both clock and data signals are transmitted from sender to receiver, so the two remain in sync. This is in contrast to asynchronous (e.g. RS-232) protocols where markers in the data are used to establish & maintain sync.

The newer FTDI chips have a very strong capability in this area, which they call Multi-Protocol Synchronous Serial Engine, or MPSSE. This mode is enabled by the same command we use to enable bitbanging; the first argument is unused, and the second argument has the value 2 for MPSSE.

d.ftdi_fn.ftdi_set_bitmode(0, 2) # MPSSE using pylibftdi or..
d.setBitMode(0, 2)               # ..using ftd2xx

Bits 0 and 1 are chosen as outputs since they are normally SPI clock and data out; see part 1 for information on I/O pins usage.

Once MPSSE is set up, it is controlled by reading & writing byte streams; command bytes with optional arguments and data. The commands are detailed in FTDI application note 108 ‘Command Processor for MPSSE and MCU Host Bus Emulation Modes’, and at first sight there appears to be a bewildering array of options; the key to understanding them is that each command is actually a bitfield, namely:

Bit   MPSSE Function      If set    If clear
0     Write clock edge    -ve       +ve
1     Bit/byte mode       bit       byte
2     Read clock edge     -ve       +ve
3     LS/MS bit first     LS        MS
4     TDI/DO data output  On        Off
5     TDO/DI data input   On        Off
6     TMS data output     On        Off
7     Zero

On a normal microcontroller serial interface you set up the transfer parameters (clock edge, bit-order, word-length) in advance, then just do byte or word transfers based on those settings. The FTDI interface is completely different; the parameters are specified for each transfer, and you can freely intersperse commands with different word-lengths, clock edges etc. Bits 4 -6 are particularly strange, in that they allow you to control the flow of data to & from the chip; if just bit 4 is set you have a write-only interface, just bit 5 and it is read-only. What use is that? Very useful, if we’re doing more complex protocols such as SWD, but for simpler read/write tasks you’d probably want to leave DO & DI enabled (not TMS, unless you’re implementing JTAG) .

These serial-data commands have bit 7 clear, but the FTDI application note describes various other commands that are available if bit 7 is set; for example, to set an I/O pin in MPSSE mode the following commands are used:

Command   Data bytes    Action
80h       Value, Mask   Write low byte (ADBUS)
82h       Value, Mask   Write high byte (ACBUS)
81h                     Read low byte (ADBUS)
83h                     Read high byte (ACBUS)

For serial output we need to set the SPI clock and MOSI pins (bits 0 & 1) to be outputs, so the command to be sent is:

OPS = 0x03
ft_write(d, (0x80, 0, OPS))

This makes the clock & MOSI lines into outputs,  with a value of 0

MPSSE example: SPI output

The MPSSE command structure is easiest to explain with a worked example, and since SPI (Serial Peripheral Interface) is the simplest clocked serial protocol it supports, we’ll start with that.

SPI normally has 4 lines; clock, data out, data in, and chip-select. Since it can be ambiguous as to which direction ‘out’ and ‘in’ refer to, those terms are normally qualified as MOSI (Master Out Slave In) and MISO (Master In Slave Out). The chip-select is used to mark the beginning and end of a transaction, and to identify which chip is being addressed out of (potentially) several chips on the bus.

For this example I’ll be using SPI to drive a MAX6969 LED driver chip; this is used in various low-cost multiple-LED displays, in this case the MikroElektronika UT-L 7-SEG R click with dual 7-segment displays.


This can be set to select 3.3 or 5 volt operation by re-soldering a resistor; to save this complication, we’ll leave it in the default 3.3V mode. That means we need an FTDI module with 3.3V outputs, since they must match the supply voltage – if you doubt this, check the ‘absolute maximum’ values in the MAX6969 data sheet.


With a supply of 3.3V, the data and clock inputs can only go 0.3V higher before bad things happen – you ignore Absolute Maximum ratings at your peril. So our FTDI interface needs to be 3.3V; any such module with MPSSE capability will do, I’ll use the C232HM-DDHSL-0 cable.

It can only supply a maximum current of 200 mA to the power-hungry display module; lighting 16 segments at around 20 mA each will easily overload this supply, so we need an external 3.3V source, with at least 0.5A capacity. The connections are:

Display pin  Function      FTDI cable  Function
3            Load Enable   Grey        GPIOL0 (ADBUS4)
4            Clock         Orange      TCK
5            Data out      Green       TDO
6            Data in       Yellow      TDI
7            3.3V            
8 & 9        Ground        Black       GND
16           PWM           Brown       TMS (ADBUS3)

It may look like I’ve got the input and output lines the wrong way round, but FTDI are using the device-oriented JTAG pin identifiers, so TDO is actually MISO, and TDI is MOSI.

Pin 3 ‘load enable’ is similar to ‘chip enable’, and is connected to an I/O line that can be toggled; we’ll be looking at its exact function later. Pin 16 is a line that can be used to vary the display brightness using pulse-width modulation; it must be driven high to illuminate the display.

When first using new hardware, it is well worth checking the supply current with an ammeter, and making a note of it; this board takes 4 mA at 3.3V; not a lot!

Device configuration

The default data rate is less than ideal for our application, so we need to set something better; 1 MHz is a good safe starting-point for most SPI devices. Command 86 hex sets the data rate, followed by the low byte and high byte of the frequency divisor (which turns out to be 5 for 1 MHz).

hz = 1000000                            # Desired SPI frequency
div = int((12000000 / (hz * 2)) - 1)
ft_write(d, (0x86, div%256, div//256))  # Return byte count (3)

Now we can write some data to the SPI interface, and view the result on an oscilloscope. According to the MPSSE function table , a command value of 10h will send a byte value to DO, with +ve clocks, M.S.Bit first. After the command byte, you send a word value, L.S.Byte first, that tells the command how long the data is, minus 1 byte; in this case, we’re sending 1 byte of SPI data with value 55h, so the whole command in hex is 10 00 00 55

def ft_write_cmd_bytes(d, cmd, data):
    n = len(data) - 1
    ft_write(d, [cmd, n%256, n//256] + list(data))
# Set bit 0 & 1 (TCK, TDI) as O/Ps, output 1 byte: 55h
ft_write(d, (0x80, SPI_MASK, SPI_MASK))
ft_write_cmd_bytes(d, 0x10, [0x55])

This is the resulting oscilloscope display


In case you aren’t used to looking round an oscilloscope display, the top figures say what the vertical & horizontal sensitivities are, in units per division (i.e. units per large square). So the data and clock lines are 2 volts per division vertically, which looks roughly right, since we’re expecting 3.3V signals. The horizontal scale is 2 microseconds per division, which also looks right, since we get 2 clock cycles per division, so each cycle has a period of 1 microsecond, corresponding to a frequency of 1MHz.

You’ll also see that the data line changes at the same time as the clock line goes from low to high, i.e. at the positive-going clock edge. This is to be expected since bit 0 if the command (‘write clock edge’) is set to zero (‘+ve’). This is important because the display chip will be using a specific clock edge to read in the data bits, and if we have chosen the wrong edge, the data will be changing while it is being read in, with highly unpredictable results. The relevant text in the MAX6969 data sheet says “DIN is the serial-data input, and must be stable when it is sampled on the rising edge of CLK”.

So we need to set command bit 0 so that the data changes on the falling clock edge, and is stable on the rising edge. There are also 2 other issues to address:

  • The ‘load enable’ pin must be toggled to latch the data into the display after it has been sent. This is somewhat unusual, since normally a chip-enable signal is asserted before the data is sent, and negated afterwards, but we do need to toggle the load-enable line or nothing will be visible.
  • The PWM signal needs to be asserted to illuminate the display. It is intended to be driven from a pulse-width-modulated (PWM) signal to give variable intensity, but since we don’t have that, needs to be turned full-on.

So here is our next attempt, writing 2 bytes (one for each display) with data changing on negative edge, latching the transferred data, and turning on the display.

OE, LE = 0x08, 0x10
ft_write(d, (0x80, 0, OPS+OE+LE))         # Set outputs
ft_write_cmd_bytes(d, 0x11, (0x3f, 0x06)) # Write seg data
ft_write(d, (0x80, LE, OPS+OE+LE))        # Latch = 1
ft_write(d, (0x80, OE, OPS+OE+LE))        # Latch = 0, disp = 1
ft_write(d, (0x80, 0,  OPS+OE+LE))        # Latch = disp = 0

If all is well, the number 10 appears on the display when it is enabled. How did I know that the hex values 3F and 06 were needed to display 0 and 1? Rather than work out the segment-to-I/O-bit mapping for myself, I just looked at the C code on the MikroElektronik Web page, that gave the values for 0 – 9, and copied the first 2.

Here is the resulting waveform; it is quite instructive to match the bit values with the high/low states of the MOSI line.


Avoiding disaster

An earlier version of the SPI write code looked like this:

ft_write(d, (0x80, OPS, OPS))
ft_write_cmd_bytes(d, 0x11, (0x3f, 0x06))

Looks quite harmless, but the oscilloscope showed a major problem; see the highlighted areas on the clock trace.


There are some sizeable glitches in our clock signal, which is very bad news. Will the MAX6969 chip see these pulses, or ignore them, and if they are accepted, will a high or low level be read in? Having glitches like this on the clock line is very risky; even if it works now, it could suddenly stop working with a minor rearrangement of the components or wiring.

The solution is to set the outputs to zero when enabling MPSSE mode:

ft_write(d, (0x80, 0, OPS))
ft_write_cmd_bytes(d, 0x11, (0x3f, 0x06))

This issue demonstrates how a software bug has the potential to create a subtle hardware problem; it is always worth checking the waveforms with an oscilloscope, if at all possible.

Source code

Here is the source code, tested on Windows using the D2XX driver, and Linux using pylibftdi – just set the FTD2XX value appropriately.

# Python FTDI SPI example from
# Compatible with Python 2.7 or 3.x
# Drives a MikroElektronika UT-L 7-SEG R display
# v0.01 JPB 8/12/18

FTD2XX       = True  # Set False if using pylibftdi
FTDI_TIMEOUT = 1000  # Timeout for D2XX read/write (msec)

if FTD2XX:
    import sys, time, ftd2xx as ftd
    import sys, time, pylibftdi as ftdi

# Segment bit values for digits 0 - 9
dig_segs = 0x3F,0x06,0x5B,0x4F,0x66,0x6D,0x7D,0x07,0x7F,0x6F

DIG1 = 0    # Digits to be displayed
DIG2 = 1
OPS  = 0x03 # Bit mask for SPI clock and data out
OE   = 0x08 #              display output enable
LE   = 0x10 #              display latch enable

# Set mode (bitbang / MPSSE)
def set_bitmode(d, bits, mode):
    return (d.setBitMode(bits, mode) if FTD2XX else
            d.ftdi_fn.ftdi_set_bitmode(bits, mode))

# Open device for read/write
def ft_open(n=0):
    if FTD2XX:
        d =
        d.setTimeouts(FTDI_TIMEOUT, FTDI_TIMEOUT)
        d = ftdi.Device(device_index=n)
    return d

# Set SPI clock rate
def set_spi_clock(d, hz):
    div = int((12000000 / (hz * 2)) - 1)  # Set SPI clock
    ft_write(d, (0x86, div%256, div//256)) 

# Read byte data into list of integers
def ft_read(d, nbytes):
    s =
    return [ord(c) for c in s] if type(s) is str else list(s)

# Write list of integers as byte data
def ft_write(d, data):
    s = str(bytearray(data)) if sys.version_info<(3,) else bytes(data)
    return d.write(s)

# Write MPSSE command with word-value argument
def ft_write_cmd_bytes(d, cmd, data):
    n = len(data) - 1
    ft_write(d, [cmd, n%256, n//256] + list(data))

if __name__ == "__main__":
    dev = ft_open(0)
    if dev:
        print("FTDI device opened")
        set_bitmode(dev, OPS, 2)              # Set SPI mode
        set_spi_clock(dev, 1000000)           # Set SPI clock
        ft_write(dev, (0x80, 0, OPS+OE+LE))   # Set outputs
        data = dig_segs[DIG1], dig_segs[DIG2] # Convert digits to segs
        ft_write_cmd_bytes(dev, 0x11, data)   # Write seg bit data
        ft_write(dev, (0x80, LE, OPS+OE+LE))  # Latch = 1
        ft_write(dev, (0x80, OE, OPS+OE+LE))  # Latch = 0, disp = 1
        print("Displaying '%u%u'" % (DIG2, DIG1))
        ft_write(dev, (0x80, 0,  OPS+OE+LE))  # Latch = disp = 0
        print("Display off")

See the next post for an introduction to the SWD protocol.

Copyright (c) Jeremy P Bentham 2018. Please credit this blog if you use the information or software in it.

Programming FTDI devices in Python: Part 2

Using pylibftdi on Linux

Linux and pylibftdi

In the first part, I used the FTDI Windows D2XX driver and Python ftd2xx library to do some simple I/O testing on an FTDI module. However, when attempting to run the same code on Linux, I had problems getting the d2xx driver to cooperate with the ftd2xx library, so switched to using the pylibftdi library, which uses the open-source libftdi driver in place of d2xx.

Installation is as you’d expect: use sudo apt-get (or equivalent) to install libftdi 1.x, then sudo pip (or pip3) to install pylibftdi. There are then a couple of Linux-specific issues:

  • If you want to avoid running all your code as root, you need to give permission for user-space applications to access the USB device. FTDI devices usually have a PID of 0403, so to allow user access to all FTDI devices, create a file /etc/udev/rules.d/99-libftdi.rules with the contents:
    SUBSYSTEMS=="usb", ATTRS{idVendor}=="0403", GROUP="dialout", MODE="0660"

    Then re-plug the device for the new rule to take effect. This rule will apply to all devices with the FTDI vendor ID, for security reasons you may need to restrict it to devices that match a specific product ID as well.

  •  By default, Linux will load ftdi_sio and usbserial kernel modules when the device is plugged in. Various ways of preventing this have been posted on the Internet, but fortunately libftdi works fine without having to unload the modules.

To see if the library is working, try listing all the FTDI devices:

import sys, pylibftdi as ftdi

This should return a list of devices, each with manufacturer, description and serial number, e.g.

[('FTDI', 'FT2232H MiniModule', 'FT2FNDAR')]

Device I/O

Enabling bitbang mode is similar to the D2XX function calls used in part 1:

d = ftdi.Device()                  # Open first device
OP = 1                             # Bit 0 will be an output
d.ftdi_fn.ftdi_set_bitmode(OP, 1)  # Return 0 if bitbang mode set OK

A baud rate of 9600 is set by default, so the O/P bit clock rate should be 16 times that, as discussed in the previous post.

The differences in Python 2.7 and 3.x byte string handling are still a potential problem, though the pylibftdi library goes some way towards addressing this by using a ‘Latin 1’ encoder to translate between unicode and binary strings. However, to simplify the manipulation and storage of transmit & receive data, I am retaining the list-of-integers storage method used in the first part;

def ft_write(d, data):
    s = str(bytearray(data)) if sys.version_info<(3,) else bytes(data)
    return d.write(s)
ft_write(d, (OP, 0))

def ft_read(d, nbytes):
    s =
    return [ord(c) for c in s] if type(s) is str else list(s)
ft_read(d, 1)

As before, the read cycle returns [254], since the unused pins are floating high, and we have set bit 0 low.

See the next post for an example of code that runs on Linux and Windows.

Copyright (c) Jeremy P Bentham 2018. Please credit this blog if you use the information or software in it.

Caterpillar pic chip small

Programming FTDI devices in Python: Part 1

If you are a Python programmer, and need a simple USB interface for some hardware, read on…

FTDI are well known for their USB-to-serial chips, but the later models (such as FT2232C and FT232H) have various other capabilities; when combined with Python, you get a simple yet powerful method of controlling & monitoring a wide variety of hardware devices.


Various FTDI-equipped modules and cables are available.


A few points to bear in mind:

  • The module may need to have some of its pins linked together, otherwise it won’t power up.
  • If the module has a 5 Volt output pin, take care when connecting it up; if mis-connected, a sizeable current may flow, causing significant damage.
  • Each FTDI device has a unique set of capabilities; check the datasheet to make sure the part has the facilities you need.

Device driver

As standard, when an FTDI device is plugged into a Windows PC, the operating system loads the default Virtual Com Port driver, that can only handle asynchronous serial (RS232-type) protocols. However, we want to be a bit more adventurous, so need to substitute the ‘d2xx’ driver, available from the FTDI drivers page. A quick way to check which driver is active is to look at the Device Manager; if the FTDI part appears as a COM port, it is asynchronous-only.

Use ‘pip’ to install a Python library that will access the d2xx driver; there are several available (such as pyftdi, pylibftdi) but the only one that worked seamlessly with Python 2.7 and 3.x on my systems was the simplest: ftd2xx, which is just a CTYPES wrapper round the d2xx API

pip install ftd2xx

A quick check to see if all is well (Python 2.7 or 3.x):

import sys, ftd2xx as ftd
d =    # Open first FTDI device

This should print some dictionary entries for the device, e.g.

{'serial': 'FT28C9AHA', 'type': 4L, 'id': 67330064L, 'description': 'DLP2232M A'}

If this fails, it is usually because the device is still using the VCP driver, or the Python library can’t find the necessary FTDI files (ftd2xx.lib, and ftd2xx.dll or ftd2xx64.dll);  they need to be somewhere on the executable PATH.

Linux drivers are discussed in the next post.


Before sending any data to the device, we need to establish which pins does what, as all pin functions are pre-assigned. Each chip has 1 or more ‘channels’, i.e. protocol engines, so a 2-channel device can drive 2 separate protocol streams, though there may be a limitation on the protocols a channel can handle. Each channel is assigned to one or more ports, which are usually 8-bit, but may have as many as 10 or as few as 4 bits. The first port of the first channel is identified as ADBUS; if that channel has a second port, it would be ACBUS. The first port of the second channel (if present) is BDBUS, the second port of that channel would be BCBUS.

The serial I/O functions are generally constrained to the lower few bits of the first port, the rest of the lines act as general status or handshake I/O. For example, the 2-channel FT2232C device channel A has pins ADBUS 0 – 7 and ACBUS 0 – 3:

CHANNEL A                           CHANNEL B
ADBUS0   TxD    D0        TCK/SK    BDBUS0   TxD    D0        -
ADBUS1   RxD    D1        TDI/DO    BDBUS1   RxD    D1        -
ADBUS2   RTS    D2        TDO/DI    BDBUS2   RTS    D2        -
ADBUS3   CTS    D3        TMS/CS    BDBUS3   CTS    D3        -
ADBUS4   DTR    D4        GPIOL0    BDBUS4   DTR    D4        -
ADBUS5   DSR    D5        GPIOL1    BDBUS5   DSR    D5        -
ADBUS6   DCD    D6        GPIOL2    BDBUS6   DCD    D6        -
ADBUS7   RI     D7        GPIOL3    BDBUS7   RI     D7        -
ACBUS0   TXDEN  WR        GPIOH0    BCBUS0   TXDEN  WR        -
ACBUS1   SLEEP  RD        GPIOH1    BCBUS1   SLEEP  RD        -
ACBUS3   TXLED  OSC       GPIOH3    BCBUS3   TXLED  OSC       -

The GPIOL and GPIOH prefixes refer to the low & high byte output commands that we’ll encounter later when using MPSSE mode for synchronous protocols; also note that channel B is unusable in that mode.

A possible source of confusion is that pins 1 and 2 in MPSSE mode are identified as TDI/DO and TDO/DI, implying that they can act as inputs or outputs. This is incorrect: in MPSSE mode, pin 1 is normally an output, and pin 2 is an input. The reason for the TDI and TDO labels is that they refer to the JTAG protocol, which is defined from the point of view of the device being controlled, not the controller – so the DO and DI labels apply in normal usage.

Also note that the device has a tendency to keep its previous settings, even after a reset. For this reason, all programs using the ftd2xx library normally start by clearing everything in the device to zero, just in case a preceding program has left some settings active. For simplicity, the code given below ignores this requirement, and assumes the device has been  re-plugged just before the code was run.

Bitbang mode: toggling an I/O pin

‘bitbashing’ which FTDI call ‘bitbanging’, refers to driving the I/O pins directly, rather than using an I/O protocol embedded in the device.

The FTDI device powers up in ‘reset mode’ and must be set to bitbang mode using the setBitmode function. One advantage of using the Python ftd2xx library is that the function arguments are as documented in the FTDI ‘D2XX Programmers Guide’:

OP = 0x01            # Bit mask for output D0
d.setBitMode(OP, 1)  # Set pin as output, and async bitbang mode
d.write(str(OP))     # Set output high
d.write(str(0))      # Set output low

Having set our chosen pin as an output, and enabled bitbang mode, writing a string to the device handle will set its state. The ‘write’ functions returns the number of characters written, which is 1 in this case. If your application involves sending out a succession of O/P pulses, you’ll want to know how fast the operation is; sending the following commands:

d.write(chr(OP)); d.write(chr(0))

results in a positive pulse somewhere between 500 microseconds and 2 milliseconds wide. This will be too variable and too slow for many applications, so an alternative is to write a string containing multiple data values, e.g.

d.write(chr(OP) + chr(0))

This results in a pulse 50 nanoseconds wide, which is probably too narrow for most applications, however in theory you can just duplicate a command to stretch it out, for example to generate a pulse of 200 nanoseconds:

d.write(chr(OP)*4 + chr(0))

This approach is somewhat inefficient, and works fine on Python 2.7, but not on Python 3.x; if you connect an  oscilloscope to the output, you’ll see a couple of cycles of 10 MHz square-wave, instead of a single broad pulse. Using a USB analyser to monitor the data, it is apparent that the code is sending the bytes 01 00 01 00 01 instead of  01 01 01 01 00; the length is correct, but the data values are wrong, because of the different ways Python 2.7 and 3.x store their strings.

Byte values in Python 2.7 and 3.x

The default string type can’t be used for byte data in 3.x, as the characters are 16-bit Unicode values, not bytes. There are various ways round the problem, the simplest is to force the string type to binary:


This is fine for preformatted strings, but gets rather messy if the data is being computed on-the-fly. There are plenty of alternative suggestions on the Internet, but many don’t work in special cases, such as bit 7 being set. The ‘bytearray’ type would be useful, except that it is rejected as an unknown type by the ftd2xx library. The ‘bytes’ datatype is good on v3, but works very differently on v2.7, so for my development I reluctantly decided to store all outgoing data as lists of integers, with a version-dependant conversion function on writing, e.g.

def ft_write(d, data):
    s = str(bytearray(data)) if sys.version_info<(3,) else bytes(data)
    return d.write(s)

ft_write(d, [OP]*4 + [0])

Slowing down output transitions

Sending multiple output commands to slow down the output transitions is quite inefficient, and unworkable for really long pulses. A better alternative is to program the baud rate generator (the same generator as used for serial communications), which synchronises the transitions, e.g.

ft_write(d, (OP, 0))

The FTDI Application Note states that the output is clocked at 16 times the baud rate, so  9600 baud should result in a timing of 6.51 microseconds per bit. However, on an FT2232H module the time was measured as 20.825 microseconds, so that logic seemingly doesn’t apply to all modules.

Reading input pins

Finally we get to read some data in:


The length of 1 returns an 8-bit value corresponding to the I/O pin states; as before, the returned type depends on the Python version, so I convert it to a list of integers:

def ft_read(d, nbytes):
    s =
    return [ord(c) for c in s] if type(s) is str else list(s)

print(ft_read(d, 1))

Unused inputs float high, and the last output command drove the ADBUS0 output low, so the value printed is 254 in a list, [254].

You can implement quite complex protocols using simple I/ O commands; write-cycles can be chained to output complex sequences, but there is quite a speed-penalty every time a read-cycle has to be interleaved. In recognition of this, many FTDI chips have a more complex capability, which they call MPSSE (Multi-Protocol Synchronous Serial Engine); that’ll be the subject of a later blog post…

See the next post to run the code on Linux…

Copyright (c) Jeremy P Bentham 2018. Please credit this blog if you are using the information or software in it.