2023 March 21 15:15
Another update!
I wanted to share what I have been doing for the last few weeks. I “pivoted” from focusing on 32-bit targets – RISC-V and ARM Cortex-M – to 8-bit targets – in particular, 8051, PIC16, and PIC18.
A friend cajoled me into adding support for the 8051 – an architecture that I had for years avoided because I found it ugly. He decided that the Silabs EFM8 “busy bee” looked interesting, and he bought an EFM8BB1LCK board. These are very cheap – about $6.
I held my nose, read what I could find about the 8051, and wrote an assembler and disassembler. This was surprisingly hard for such a simple architecture. It’s mostly regular, but the irregularities are enough that the exceptions dominate the rules, in both the assembler and disassembler.
The EFM8BB1 has a (supposedly) built-in serial bootloader, but after many failures to connect to it, we researched on the forums and found that the chips on the EFM8BB1LCK boards have had the bootloader replaced by a demo application! So I found a .hex file, and, using Silabs’ Simplicity IDE, my friend was successful getting the serial bootloader programmed onto the chip. We used this to write and debug a serial chat program, which has since replaced the bootloader on his chip.
There is no “Forth” to speak of for the 8051, however. The support that muforth provides is limited to writing assembler words, flashing them onto a chip, and interactively executing them.
Flushed with success, we looked around for other likely suspects and found that Atmel (now Microchip) still make AT89LP parts – the most interesting being the AT89LP52, AT89LP828, and AT89LP6440. These are all 1-cycle 8051 cores – twelve times faster than the original 8051! – in either DIP28 or DIP40 packages. I bought a couple of 6440’s, breadboarded them, and proceeded to figure out how to program them via SPI.
Curiously, the protocol for programming the AT89LP was almost identical to that for the AVR! I remembered that the first AVR parts had an AT90 prefix, so clearly Atmel had developed some silicon IP with the AT89 that they carried over to the AVR.
I had existing code to program AVR chips using an S08, but I realized that for this code to be useful to anyone else it had to be more generic. There was never much hobbyist interest in the S08, and after Freescale was absorbed by NXP, the prices for S08 chips tripled or quadrupled, making them rather unattractive. I had used, as AVR programmers, both breadboarded S08QG DIP parts and an S08JS16 that I had soldered onto a proto board. Almost no one else in the world is using the S08JS16. Just Google it.
I decided that using a $10 STM32 Discovery board would be a better option, so I proceeded to write SPI code for that, and came up with a way to talk to running programming code using ST-LINK and a simple in-memory semaphore. Once I gave up on trying to use the STM32 SPI peripheral and simply bitbanged the port pins, this worked flawlessly.
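An in-memory semaphore of this kind can be pictured as a small mailbox in target RAM. What follows is a minimal simulation, not muforth's actual code: the mailbox layout (a flag, a command, a result), the names `FLAG`, `CMD`, `RESULT`, and both classes are assumptions made for illustration. The host stands in for the debugger, which can only read and write memory; the target polls the flag.

```python
# Hypothetical mailbox layout in target RAM (list indices, not real addresses).
FLAG, CMD, RESULT = 0, 1, 2
IDLE, REQUEST = 0, 1


class Target:
    """Stands in for the programming code running on the microcontroller."""

    def __init__(self):
        self.ram = [IDLE, 0, 0]  # the shared mailbox

    def poll(self):
        # One pass of the target's main loop: act only on a pending request.
        if self.ram[FLAG] == REQUEST:
            self.ram[RESULT] = self.execute(self.ram[CMD])
            self.ram[FLAG] = IDLE  # hand the mailbox back to the host

    def execute(self, cmd):
        # Placeholder for real work, such as clocking a byte out over SPI.
        return cmd ^ 0xFF


class Host:
    """Stands in for the debugger side, which can only read and write RAM."""

    def __init__(self, target):
        self.target = target

    def call(self, cmd):
        ram = self.target.ram
        ram[CMD] = cmd       # write the argument first,
        ram[FLAG] = REQUEST  # then raise the flag, so a request is never seen half-written
        while ram[FLAG] != IDLE:
            self.target.poll()  # on real hardware the target runs by itself
        return ram[RESULT]


host = Host(Target())
print(hex(host.call(0x5A)))  # prints 0xa5
```

The ordering is the whole trick: arguments are written before the flag is raised, and the result is written before the flag is cleared, so neither side ever reads a partial message.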
After this 8051 adventure, I realized that I missed working with 8-bit micros. There is something down-to-earth and real about them. Maybe because they make everything a bit of a challenge? I don’t know, but it struck me that a fast 8-bit micro does a great job if all you need to do is toggle a few pins and communicate over serial (UART, SPI, I2C, CAN) interfaces.
Notwithstanding a brief flirtation with Microchip’s PIC18F14K50 – which I had decided would make a cheap and cheerful USB-to-whatever interface (and, like many Microchip parts, it comes in breadboardable DIP packages) – I had kind of given up on the PIC. The architecture had started to annoy me, and, frankly, the parts were the same price as a 32-bit micro. I had given up on worrying about DIP versus surface mount and instead focused on finding reasonably-priced vendor boards, like the MSP430 Launchpads and STM32 Discovery boards.
Aside: Of course, the Raspberry Pi Pico is the everything killer here. For less than the cost of a DIP40 8051 from Microchip you get a fast dual-core 32-bit chip with lots of flash memory. I got annoyed with the RP2040 for a while, but it’s still on my to-do list to continue working on it, and it would also be a great platform for bitbang-programming 8051s and PICs.
Newly 8-bit curious again, I poked around on Microchip’s site and found that they had a new-ish series of parts – the PIC18-Q – that were reasonably fast (16 MHz instruction rate), reasonably cheap ($2 or less), had lots of flash and ram, came in a variety of DIP packages (14, 20, 28, 40), and had tons of interesting peripherals (including 12-bit ADC and CAN), so I sampled a few parts; I’m mostly interested in the PIC18-Q41, PIC18-Q43, and PIC18-Q84 families.
The Q parts use an “SPI-compatible” serial programming protocol. SPI-compatible here means that it’s MSB-first, and that the commands and data are multiples of 8 bits in size. But it is half-duplex (sharing a single data line) rather than full-duplex (separate uni-directional MISO and MOSI lines), so I ended up bitbanging this too: I failed to convince/cajole/coerce the STM32 SPI to actually work in half-duplex mode (which it supposedly supports).
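A minimal sketch of MSB-first, half-duplex bit-banging over a clock line and one shared data line. Everything here is an assumption for illustration: the pin methods (`set_clock`, `set_data`, `read_data`, `release_data`), the choice of clock edges, and the `FakeTarget` whose "reply with the byte plus one" behaviour is invented so the code can be exercised. Real edge timing and commands have to come from Microchip's programming specification.

```python
def send_byte(pins, value):
    """Shift one byte out, most significant bit first."""
    for i in range(7, -1, -1):
        pins.set_data((value >> i) & 1)
        pins.set_clock(1)
        pins.set_clock(0)  # the fake target samples on this falling edge


def recv_byte(pins):
    """Shift one byte in, most significant bit first, on the same data line."""
    value = 0
    for _ in range(8):
        pins.set_clock(1)  # the fake target presents its next bit on the rising edge
        value = (value << 1) | pins.read_data()
        pins.set_clock(0)
    return value


class FakeTarget:
    """Invented device model: answers each received byte with that byte plus one."""

    def __init__(self):
        self.clock = 0
        self.data = 0
        self.host_drives = True  # who currently owns the shared data line
        self.rx = 0
        self.rx_bits = 0
        self.tx = []  # reply bits waiting to be clocked out

    def set_data(self, bit):
        assert self.host_drives, "host must own the data line to drive it"
        self.data = bit

    def release_data(self):
        self.host_drives = False  # host turns its data pin into an input

    def read_data(self):
        return self.data

    def set_clock(self, level):
        rising = level == 1 and self.clock == 0
        falling = level == 0 and self.clock == 1
        self.clock = level
        if falling and self.host_drives:
            self.rx = ((self.rx << 1) | self.data) & 0xFF
            self.rx_bits += 1
            if self.rx_bits == 8:
                reply = (self.rx + 1) & 0xFF
                self.tx = [(reply >> i) & 1 for i in range(7, -1, -1)]
                self.rx_bits = 0
        elif rising and not self.host_drives and self.tx:
            self.data = self.tx.pop(0)


pins = FakeTarget()
send_byte(pins, 0x41)
pins.release_data()  # turn the line around before reading
print(hex(recv_byte(pins)))  # prints 0x42
```

The only difference from full-duplex SPI is the turnaround: the host must stop driving the data pin before it clocks in the reply.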
I can program the chips that I have, but I have no program to program onto them! I wanted to write a serial chat program (naturally), but I need an equates file before I can do that.
While debugging a PIC programming issue (a long story better told elsewhere), I discovered that Microchip has something like Keil’s CMSIS-Pack service: a “packs repository” site that indexes and makes available machine-readable files useful for programming their chips. I downloaded the “pack” for the PIC18-Q and initially was looking at the C and assembler support files (.h and .inc, resp.), but I later discovered something called an “ini” file that contains exactly what I wanted: a simple, machine-readable description of the memory layout and i/o registers.
This spawned the pic-chip-equates project – a sibling to similar STM32 and Kinetis projects for generating ARM equates files – which anyone can now use to generate muforth equates files for PIC16 and PIC18 chips! It’s not totally finished, but it really only needs a bit of polish.
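The general shape of an equates generator is simple: read register names and addresses, then emit one definition per register. The sketch below shows only that shape. The input lines (`NAME=ADDRESS`) are an invented stand-in and not the real layout of Microchip's ini files, the register names are made up, and the output format (`address equ NAME`) is likewise an assumption rather than a statement of what pic-chip-equates emits.

```python
def parse_registers(text):
    """Parse hypothetical NAME=HEXADDR lines into (name, address) pairs."""
    regs = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, _, addr = line.partition("=")
        regs.append((name.strip(), int(addr.strip(), 16)))
    return sorted(regs, key=lambda reg: reg[1])  # emit in address order


def emit_equates(regs):
    """Render one equate per register, in an assumed 'addr equ NAME' style."""
    return [f"{addr:04x} equ {name}" for name, addr in regs]


SAMPLE = """
# made-up registers, for illustration only
REG_B=0102
REG_A=0100
"""

for line in emit_equates(parse_registers(SAMPLE)):
    print(line)
```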
I still need to write my serial chat program, but at least now I have all the tools at hand.
2023 January 01 01:06
For the last two years I have been egregiously silent on these pages, and I have also been neglecting muforth, only working on it in short bursts.
For the last several weeks, however, I have been giving muforth my full attention, and have been thinking about almost nothing else! I want to give a brief overview of what I have been up to. I’ll start with changes to the host Forth, and then talk about the targets that have seen the most progress.
Core (host-side Forth)
The biggest change here was a new approach to the 64-bitness of muforth.
For several years now, ever since muforth became a 64-bit Forth – as far as the user is concerned – the implementation has annoyed me. I wanted the experience of using it to be identical on 32-bit and 64-bit machines. All user-visible values, the stacks, and even the dictionary itself, were 64-bit. The way I accomplished this on 32-bit machines was to have all pointers be stored in slots 64 bits wide, but use only half of the space. I defined a struct that contained a 32-bit pointer, and 32 bits of padding, and I accessed these pointers via macros. On 64-bit machines the macros were no-ops. This all worked, but it was clumsy, annoying, and confusing. Sometimes, making changes to code only a few weeks after writing it, I would not be entirely sure what the macros were doing.
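The old scheme can be pictured as an 8-byte slot holding a 4-byte pointer plus 4 bytes of padding. The sketch below models that layout with Python's `ctypes`. The names `PaddedAddr`, `Cell`, `addr`, and `padding` are illustrative assumptions; the real definition was a C struct accessed through macros.

```python
import ctypes


class PaddedAddr(ctypes.Structure):
    """A 32-bit pointer stored in a slot 64 bits wide."""

    _fields_ = [
        ("addr", ctypes.c_uint32),     # the half of the slot that is used
        ("padding", ctypes.c_uint32),  # the half that is wasted
    ]


class Cell(ctypes.Union):
    """One 64-bit cell, viewed either as a value or as a padded pointer."""

    _fields_ = [
        ("value", ctypes.c_int64),
        ("ptr", PaddedAddr),
    ]


cell = Cell()
cell.ptr.addr = 0x20000000
print(ctypes.sizeof(PaddedAddr), ctypes.sizeof(Cell))  # prints 8 8
```

Every pointer access has to go through the `.ptr.addr` view, which is the job the macros did in C.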
There had to be a better way.
My first thought was to keep the 64-bit variables, arrays, and stacks, but to make all of the pointers and threading 32-bit – on both 32-bit and 64-bit platforms. To make this work in both settings, I was going to use relative addresses in the heap. I thought it might be rather nice, actually, if the heap started at 0000_0000. I started down this path, and got quite far along, when I changed my mind and decided that I wanted, instead, to have the implementation use the native host pointer size for pointers, and 64 bits for everything else. That’s what we have now.
If I decide I ever want to change back to the relative-addressed heap – which is elegant in its own way – it will be very easy, since I have identified everywhere that manipulates pointers. I call them “addrs” – both in the C and the Forth code.
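A minimal sketch of the relative-addressed heap idea: addresses are 32-bit offsets from the start of the heap, so the same 4-byte "addr" works on any host, while data cells stay 64-bit. The class and method names are assumptions for illustration, as is the little-endian byte order.

```python
import struct


class Heap:
    """A byte heap addressed by 32-bit offsets, with the heap starting at 0."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.here = 0  # next free offset

    def allot(self, nbytes):
        if self.here + nbytes > len(self.mem):
            raise MemoryError("heap exhausted")
        addr = self.here
        self.here += nbytes
        return addr

    def store_addr(self, at, addr):
        struct.pack_into("<I", self.mem, at, addr)  # addrs are 32-bit

    def fetch_addr(self, at):
        return struct.unpack_from("<I", self.mem, at)[0]

    def store_cell(self, at, value):
        struct.pack_into("<q", self.mem, at, value)  # cells are 64-bit

    def fetch_cell(self, at):
        return struct.unpack_from("<q", self.mem, at)[0]


heap = Heap(64)
var = heap.allot(8)   # a 64-bit variable
heap.store_cell(var, -42)
link = heap.allot(4)  # a 32-bit slot that points at it
heap.store_addr(link, var)
print(heap.fetch_cell(heap.fetch_addr(link)))  # prints -42
```

Keeping every pointer access behind `store_addr` and `fetch_addr` is what would make a later switch between native and relative addresses cheap.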
One other change, rather minor in comparison, but another wart that it felt good to remove, was the mechanism for hiding colon words as they are being defined. In most Forths a colon definition doesn’t show up in the dictionary until it is complete (ie, until ; is executed). There are various ways this is accomplished, but I had adopted an elegant idea that Martin Tracy used in his zenFORTH: the newly created word is only linked into the dictionary after it is complete.
For simplicity in bootstrapping from C to Forth, however, I had made all new words immediately visible once they are added to the dictionary by the C code. Partway into startup.mu4 I wrote Forth code to reverse the C code’s behavior, by unlinking colon words from the dictionary, and only relinking them once they were complete. This always felt like an ugly hack, and I wanted to fix it.
The solution was to have the C code create unlinked words, but remember the word and the vocabulary chain that it was defined on. Later, by executing (from Forth) the C primitive show, the word is finally linked onto the chain. This works remarkably well, and is very simple.
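The deferred-linking idea can be sketched in a few lines. This is a model of the technique only: the class names and the `create`, `show`, and `find` methods are assumptions for illustration and do not mirror muforth's C code.

```python
class Word:
    def __init__(self, name):
        self.name = name
        self.link = None  # previous word on the chain, once linked


class Chain:
    """A vocabulary chain: a linked list with the newest word first."""

    def __init__(self):
        self.last = None

    def find(self, name):
        word = self.last
        while word is not None:
            if word.name == name:
                return word
            word = word.link
        return None


class Dictionary:
    def __init__(self):
        self.pending = None  # (word, chain) created but not yet visible

    def create(self, name, chain):
        word = Word(name)
        self.pending = (word, chain)  # remember it, but do not link it
        return word

    def show(self):
        if self.pending is None:
            return
        word, chain = self.pending
        word.link = chain.last  # only now does the word join the chain
        chain.last = word
        self.pending = None


forth = Chain()
d = Dictionary()
d.create("square", forth)
print(forth.find("square"))       # prints None: still hidden
d.show()
print(forth.find("square").name)  # prints square
```

A side effect of hiding the word until it is complete is that a redefinition can refer to the previous definition of the same name while it is being compiled.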
I need to bring a similar mechanism to the treatment of colon words in the target compilers, however. Right now, any target colon words are immediately visible in the dictionary.
MSP430 target
After acquiring a few MSP430F5529 Launchpad boards – which I decided on because of their reasonable cost ($13 or so) and on-chip USB – I slowly added the requisite support to muforth: equates, USB BSL (bootstrap loader) support, clock support, core voltage support, and, finally, chat code that I could put into the Flash. It took a very long time to get to the end of that journey. The clocks and core voltage management system on that chip are insanely complicated. The chapter on the voltage supervisor peripheral is so full of twisty little acronyms all alike that it’s mind numbing. I barely escaped with my sanity!
Happily, it’s done now, and the board is a fine workhorse.
I added a basic Forth-style multi-tasker and got a periodic timer interrupt working, so the pieces are there for doing real-time programming!
ARM target (STM32 and Raspberry Pi Pico)
Through the agency of two other projects – STM32 and Kinetis “equates” generators – I have added several chip equates files to muforth, and used these to write some simple example Flash-based startup code – all in Forth! – for the STM32F0, STM32F072, and STM32F3 Discovery boards.
Since the acquisition of Freescale by NXP I have been slowly losing interest in supporting the Kinetis ARM parts. I was really excited when the first FRDM board came out, but supporting it was a pain, OpenSDA is a mess, and there just seemed to be more interest elsewhere – especially in the STM32 parts. So I started to focus more on those.
Because they have a “user” USB port in addition to the ST-LINK debug port, I decided to get a couple of STM32F3 and STM32F072 Discovery boards. I wrote some USB firmware, also in Forth, that successfully enumerated if I compiled it into and ran it from RAM, but if I flashed the code, nothing happened. I assumed this was some kind of timing problem, but nothing I tried fixed the problem. This made me a bit crazy, and I almost gave up on ST and STM32!
I finally realized that I had a bug in the ARM target variable defining word that caused variables defined in Flash to not work! After fixing this my USB code worked flawlessly. ;-)
After learning about the Pico last summer (2021), I decided it was probably going to be a big deal, based on its cost, groovy capabilities (dual cores! lotsa Flash! a weird I/O coprocessor!), and because of the popularity and name recognition of the Raspberry Pi Foundation.
Currently, it’s possible to write and execute RAM-based ARM assembler and Forth code on the Pico. There is an equates file, good PICOBOOT support, a working stage2 flash loader, and a UF2 file generator. I need to write serial and USB chat code, and I should probably write an assembler for the PIO (the curious coprocessor).
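For readers unfamiliar with UF2, here is a minimal sketch of what a generator has to produce, based on the published UF2 block layout: 512-byte blocks, each with a 32-byte header, a 476-byte data area, and a closing magic word. It is not muforth's generator; the function name `uf2_blocks` and the fixed 256-byte payload are choices made for this example.

```python
import struct

UF2_MAGIC_START0 = 0x0A324655  # "UF2\n" when stored little-endian
UF2_MAGIC_START1 = 0x9E5D5157
UF2_MAGIC_END = 0x0AB16F30
UF2_FLAG_FAMILY_ID_PRESENT = 0x00002000
RP2040_FAMILY_ID = 0xE48BFF56
PAYLOAD_SIZE = 256  # bytes of image carried per block
DATA_AREA = 476     # size of the data field in every block


def uf2_blocks(image, base_addr, family_id=RP2040_FAMILY_ID):
    """Split a binary image into a list of 512-byte UF2 blocks."""
    chunks = [image[i:i + PAYLOAD_SIZE] for i in range(0, len(image), PAYLOAD_SIZE)]
    blocks = []
    for n, chunk in enumerate(chunks):
        header = struct.pack(
            "<8I",
            UF2_MAGIC_START0,
            UF2_MAGIC_START1,
            UF2_FLAG_FAMILY_ID_PRESENT,
            base_addr + n * PAYLOAD_SIZE,  # where this payload should land
            PAYLOAD_SIZE,
            n,                             # block number
            len(chunks),                   # total number of blocks
            family_id,
        )
        data = chunk.ljust(DATA_AREA, b"\x00")  # zero-pad payload and unused area
        blocks.append(header + data + struct.pack("<I", UF2_MAGIC_END))
    return blocks


blocks = uf2_blocks(bytes(range(256)) + b"\xAA" * 44, 0x10000000)
print(len(blocks), len(blocks[0]))  # prints 2 512
```

Writing the concatenated blocks to a file gives something that can be dragged onto the Pico's boot drive; 0x10000000 is the start of the RP2040's flash address window.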
I ported the MSP430 multitasker code to the ARM, added a nifty decompiler, got the SysTick timer working, and am now puzzling out how to get the “chat” code and the ST-LINK debug code to work with a multitasking target. I have been testing the tasking and timer code on the STM32 boards, but getting it to run on the Pico will be trivial – it’s just ARM assembler and Forth. There is nothing chip-specific in it.
RISC-V target
Most of the changes to the RISC-V target have either involved bringing changes over from the ARM code, or adding support for the GigaDevice GD32VF103. I ported the serial code that I had written for the HiFive1 to the Longan Nano, which was trivial to program via its UART pins thanks to the STM32 serial bootloader support that I had previously added to muforth. GigaDevice – which got its start in the microcontroller market by making clones of STM32 chips – ported the serial bootloader code to the GD32VF103, and it worked flawlessly.
I can’t say the same for the USB DFU bootloader code. But I can say that their porting effort was flawless: they ported over all of ST’s strange bugs, and behavior that diverges from their own documentation! Despite my assertions in at least one Git commit, I was unable to get the USB DFU bootloader to work reliably. Since the serial bootloader does work reliably, I’m not worried about this, but it would be nice to get it working.
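Part of why the serial bootloader is easy to support is that its framing is tiny. The sketch below builds the three kinds of frame used by ST's USART bootloader protocol as documented by ST: a command byte followed by its complement, a big-endian address followed by an XOR checksum, and a length-prefixed data block followed by an XOR checksum. The function names are assumptions, and this is not muforth's implementation.

```python
from functools import reduce

SYNC = 0x7F  # sent once so the bootloader can detect the baud rate
ACK = 0x79
NACK = 0x1F
CMD_READ_MEMORY = 0x11
CMD_GO = 0x21
CMD_WRITE_MEMORY = 0x31


def xor_checksum(data):
    """XOR of all bytes, the checksum the protocol uses for multi-byte frames."""
    return reduce(lambda a, b: a ^ b, data, 0)


def command_frame(cmd):
    """A command is the command byte followed by its bitwise complement."""
    return bytes([cmd, cmd ^ 0xFF])


def address_frame(addr):
    """Four address bytes, most significant first, then their XOR."""
    body = addr.to_bytes(4, "big")
    return body + bytes([xor_checksum(body)])


def data_frame(data):
    """Length-minus-one, the data bytes, then the XOR of all of those."""
    if not 1 <= len(data) <= 256:
        raise ValueError("a data frame carries 1 to 256 bytes")
    body = bytes([len(data) - 1]) + bytes(data)
    return body + bytes([xor_checksum(body)])


print(command_frame(CMD_WRITE_MEMORY).hex())   # prints 31ce
print(address_frame(0x08000000).hex())         # prints 0800000008
print(data_frame(b"\x01\x02\x03\x04").hex())   # prints 030102030407
```

A write is then: send the command frame, wait for `ACK`, send the address frame, wait for `ACK`, send the data frame, wait for `ACK`.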
I’m still really excited about the world of RISC-V, and I want to focus a lot of effort and energy on this target.
Porting over the ARM decompiler code and the multitasker will be almost trivial, both because the structure of ITC (indirect-threaded code) Forths is very regular and because ARM and RISC-V are both RISC architectures (though ARM is arguably less so).
I need to tackle RISC-V interrupts – CLINT, PLIC, and CLIC! – and also get the USB controller working on the GD32VF103 – which is a complicated host/OTG/device controller from the STM32F105 family.
Lots to do, but most of it should be fun!
Read the 2021 journal.