MicroComp

Overview:

CPU	MicroComp
Data	8 bits
Memory	2x 16 bits, with a 12-bit program counter
Registers	3 1/2
Architecture	Modified Harvard
Technology	TTL
Status	The machine is able to run Fibonacci and Hello World.
Successor	None
Time period	December 2017 - present

Problems:

The microprogram counter is unstable.

Lessons:

"Broadside" chips such as 74574 and 74245 are a lot nicer to wire than chips such as 74244.
Buses are very messy and difficult to wire on solderless breadboards.
When determining which features to include in a CPU, consider how many extra chips each feature uses. I did not do this, so I only realized much later that allowing the C register to be read would only require one or two extra chips and add much more programming flexibility.
When determining which features to include in a CPU, consider how much that feature will be used. I did not do this, so I only realized much much later that disallowing the C register from being read would save a chip without significantly reducing programming flexibility.
Use a PCB.
Simulate the machine before building it.
Load the microcode output into a register to prevent glitches and latency issues.
Logic analyzers are helpful.
If it occurs to you that something might be a problem later, then thoroughly investigate it. It is very likely that it will cause trouble later.

If you read about Mini Comp, you would know that it failed me and that I replaced it with MicroComp. Yes, I do find it kind of odd that I named a 4-bit computer Mini Comp and a larger 8-bit computer MicroComp. Design went well, but it inherits its predecessor's instruction set style. That is not great, but it works because four times as many instructions are available.

The plan to make MicroComp started around the end of November 2017, but because of my busy schedule at school, design did not actually start until winter break. Just like Mini Comp, MicroComp received a load-store architecture that is very difficult to work with. At least it can manipulate arrays without self-modifying code. To make it a bit easier to program, I planned to make a virtual CPU with an instruction set that only operates on memory. This is a bit similar to Gigatron's vCPU, except that instead of running from data memory, it will be required to run from program memory.

Over spring break I built the computer. At first the wiring didn't bother me, but after spending hours cutting, stripping, and poking the wires into the breadboard, it became quite tedious. I did finish, but I do not think I powered it on until summer arrived.

My original plan was to disconnect all control signals that could possibly cause bus contention before I turned on power. Instead, I got impatient and turned it on anyway. The first thing I noticed was that the computer was drawing over 2 A. TTL is power hungry, but 2 A did not seem right. Assuming each of the 40 chips consumed 20 mA of current, the computer should draw about 800 mA. I suspected that most of the wiring was correct, and that there were probably only a few components to blame. An infrared thermometer was very useful and showed that the capacitors and all of the chips were at a relatively low temperature, except for the flag register 3-state buffer, which read 105 °F. After rotating the chip 180°, the power supply current was about 800 mA. Much better.

After fixing a few wiring errors, it was time to run the test program. The first instruction was LDI 0xDE A with an opcode of 0x01. At least, that is what it should have been. The opcode is located in the upper 5 bits of the instruction byte (resulting in 0x08), but instead I placed it in the lower 5 bits (resulting in 0x01). That meant that the CPU had to sit until the next time I could get to school, since I did not have a sufficient flash programmer (the one I made doesn't count). I don't remember how or when, but I also discovered a few bugs in the microcode that I later fixed.
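The bit placement mistake is easy to see in a few lines. This is only an illustrative sketch in Python; the helper names are hypothetical, and the only facts taken from the text are the 5-bit opcode field and the two resulting byte values.

```python
OPCODE_BITS = 5        # width of the opcode field, from the text
INSTRUCTION_BITS = 8   # one instruction byte

def encode_upper(opcode):
    """Place the opcode in the upper 5 bits of the byte (what the hardware expects)."""
    return (opcode & 0x1F) << (INSTRUCTION_BITS - OPCODE_BITS)

def encode_lower(opcode):
    """Place the opcode in the lower 5 bits of the byte (the mistake)."""
    return opcode & 0x1F

print(hex(encode_upper(0x01)))  # 0x8
print(hex(encode_lower(0x01)))  # 0x1
```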

Since MicroComp did not have any programs it could run, I decided to simulate it when I came across this warning. I found some more bugs in the microcode and in the schematics, which shows just how useful this step is. I fixed the bugs in the microcode and then programmed it into flash at school.

MicroComp no microcode (jpg) — MicroComp without microcode or program flash

Once the final version of the microcode and test program were programmed, I went step-by-step through the test program, comparing the predicted result with the actual result. I had tested all of the instructions up to the addition instruction when I found that the low nybble of the ALU was adding 1 when it was not supposed to. My first thought was that the carry in (Cn) was stuck high because of improper wiring. I checked Cn, but it was low. A0 and B0 were in the correct state as well. When I checked the wires connected to the pins I found that they matched the schematic, and the schematic had been verified in simulation using the very same test program. At this point I became suspicious of the ALU (74381), so I took it out and tested it under the same conditions. It worked fine. I rechecked the wiring of Cn, and found that the carry bit in the flag register was not connected to it. Once I made the connection, addition worked fine. So why did Cn act like it was high, but when checked was low? What probably happened is that the pin floated high because it is TTL, but the LED that I tested the pin with pulled the pin low. If I had tested the two pins at the same time, it would have taken a lot less time to diagnose the problem.
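The symptom matches what a carry-in that reads as 1 does to a 4-bit add. The following is a minimal Python sketch of plain binary addition on one nybble, not a model of the 74381 itself; the function name is hypothetical.

```python
def add_nybble(a, b, carry_in):
    """Return (sum, carry_out) for a 4-bit add with carry-in."""
    total = (a & 0xF) + (b & 0xF) + (carry_in & 1)
    return total & 0xF, total >> 4

# Carry-in driven low, as intended: 2 + 3 = 5
print(add_nybble(2, 3, 0))  # (5, 0)

# Carry-in floating and read as high: the result is one too large
print(add_nybble(2, 3, 1))  # (6, 0)
```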

74381 being tested (jpg) — Testing suspicious 74381

A little before I found the addition bug, I noticed that the instruction SWF A F sometimes moved A to F but not F to A. After the addition bug was fixed, this is the only way it would behave. Unfortunately, I think the only way I can get around this is to load the microcode into a register before the instruction is executed, or to slightly redesign the ISA. I will probably choose the latter. This may involve adding a 3-state output to the C register so that it can be used as an ALU input. This would make arrays a lot easier to implement. I refuse to take this machine apart after I have spent this much time on it. Maybe while I'm at it I'll add DMA.

2019-07-19

I redid the microcode, but I did not want to generate the binary by hand as I had been doing previously, so I wrote a simple microassembler. Besides providing a useful tool to assemble microcode, it also allows me to easily modify the instruction set. Microcode v1.1.0 was slightly better than microcode v0.3, but I think I can vastly improve the ISA by increasing the number of instructions and adding multiple addressing modes to instructions that were formerly only meant for registers. (A few hours later...) The additional addressing modes required the destruction of the contents of some of the registers, so I came up with an instruction set that worked directly on memory. A comparison of Fibonacci using both instruction sets revealed that the load-store version (v1.1.0) was both faster and smaller than the memory-memory version. As a result, the only change I am making to v1.1.0 is to double the number of instructions to 64. I will have unused opcodes, but that is fine since I have plenty of memory in the microcode ROMs.

2019-07-21

Once again, the VHDL simulation is proving helpful with debugging. I seem to have found a bug in Vivado's behavioral simulation though. When I load a new version of the microcode into Vivado, I am able to read it from its text editor, but when I run a simulation using that version of the microcode, nothing changes. To fix this, I have to modify one of the VHDL files. Once I do that, the simulator uses the correct version of the microcode. It seems like this is a problem with caching memory files and not using the original copy. I guess this is a case of a feature that has turned into a bug.

Fibonacci works in simulation! Now I need to figure out how to program the ROMs. I guess I will take a break from MicroComp and work on the flash programmer.

2020-01-08

I have done nothing with the programmer.

The MicroComp hardware has almost reached v1.1. I have made all the changes required to run a Fibonacci program. This leaves the C register data buffer as the only modification left to add. So do we get blinky lights? No we do not. I don't have the annoying swap A and F bug anymore, but I am having a problem where the A register is being loaded without a clock pulse. Hopefully putting capacitors on a few of the clock pins will make it run properly. It should definitely be easier to solve than the Swap bug.

MicroComp 2020-01-08 (jpg) — MicroComp as of 2020-01-08. There is now a third board.

MicroComp and logic analyzer (jpg) — Debugging with a logic analyzer (LA)

MicroComp IR LA trace loop (jpg) — Instruction register and microprogram counter during execution of the Fibonacci program. I spy glitches. Pin #2 of the LA is broken, which is why it is low.

MicroComp IR LA trace start (jpg) — Program start. Pin #3 is unreliable.

I have decided to stop using EAGLE because of its limitations (and the fact that I didn't want to sign in to Autodesk), so the MicroComp schematics are now available for KiCAD as well. I used the built-in conversion tool, so not everything may work correctly.

2020-01-13

I worked on MicroComp a bit more. I tried adding a capacitor to the ACLK signal, and the LEDs started blinking! They are not, however, blinking out the Fibonacci sequence like they are supposed to. Instead they are adding two each iteration.

2024-12-16

I have made a little bit of progress in the last five years. The first thing I did was to add registers on the output of the microcode to prevent the deadly race conditions. I was surprised to find that it worked in simulation on the first try. I can't recall if I needed to change the microcode for that. I think I did and it probably delayed instruction decode by a clock cycle or two. Yes, that's a major hit to performance, but I don't currently care provided the machine works.

Microcode synchronization registers (jpg) — The new microcode word synchronization registers

The lower left side of the left breadboard has the three microcode memories. If you look at older photos of the computer, you may notice that they are thinner. That's because I shocked some of my flash chips and now I only have three total. The computer needs four: three for microcode, one for program. At some point I'll have to order more flash, but for now I'm using SRAM instead. Of course the downside of SRAM is that it has to be loaded each time the computer is powered on, which means I need an additional mechanism to do so.

Microcode loader (jpg) — Microcode loader

The blue microcontroller is an LPC1769. Two digital I/O pins are dedicated to the UART, and nearly all the rest are used to program the SRAMs. The big green and gray board is an Atmel STK500 (Huh. I suppose they're Microchip now.) which I am using solely for the RS-232 to UART converter. Beside the STK500 is the flash programmer, which is not in use.

The LPC1769 is running the duck-lisp virtual machine, which runs bytecode sent to it by my laptop. The laptop compiles a duck-lisp script to bytecode, during which a macro reads the contents of the microcode images into the program. The bytecode is sent to the microcontroller which then executes it. The purpose of this complicated system is to 1) have fun, and 2) allow a tight debug cycle. Once the loader is working properly, I should be able to load the latest microcode, run it on the TTL computer, and when it fails, simply fix the microcode sources and run a single command to compile and load the new microcode. Unfortunately, the loader isn't working yet, due to a couple of the GPIOs acting weirdly. From what I recall, one bit can't go high and another bit can't go low. I haven't yet pulled out my oscilloscope to figure out what's going on.

The hardware isn't the only thing that's had a significant change. It occurred to me that it's going to be a while until I have a computer I can write software for, so I figured I'd write a simulator for it. It is a pretty straightforward conversion of the schematic into C, with some statements reordered so that everything happens in the right order. The other reason I wrote it was that I spent a few years writing duck-lisp, and I recently realized that if I could make a simple bytecode virtual machine for the TTL computer, then I would only have to make a few tweaks to the duck-lisp compiler in order to have a high level language. A bytecode VM also has the advantage that its instruction density is much higher than native machine code, meaning that if I can make the VM small enough, it actually increases the maximum size of the programs that can be run. The trade off is that it will slow down the machine.

The assembler is written in duck-lisp, which is surprisingly well suited to this task. The syntax is the same as native duck-lisp, but looks almost like normal assembly due to parenthesis inference.

I started writing the VM and have got as far as instruction dispatch. Dispatch is done by using the opcode to index a 128 by 2 byte table of addresses that selects the correct subroutine for the corresponding instruction.
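A table-driven dispatch of this shape can be sketched as follows. This is Python for illustration only, not the VM's actual assembly. The 128 entries of 2 bytes each come from the text; the low-byte-first ordering, the handler addresses, and the function names are assumptions.

```python
TABLE_ENTRIES = 128  # one entry per opcode, from the text
ENTRY_SIZE = 2       # each entry is a 2-byte address, from the text

def build_table(handlers):
    """Pack {opcode: address} into a 128 x 2 byte table.

    Unused opcodes are left pointing at address 0 (an arbitrary choice here).
    """
    table = bytearray(TABLE_ENTRIES * ENTRY_SIZE)
    for opcode, address in handlers.items():
        offset = opcode * ENTRY_SIZE
        table[offset] = address & 0xFF             # low byte first (assumed order)
        table[offset + 1] = (address >> 8) & 0xFF  # then high byte
    return bytes(table)

def lookup(table, opcode):
    """Return the subroutine address stored for an opcode."""
    offset = (opcode & 0x7F) * ENTRY_SIZE
    return table[offset] | (table[offset + 1] << 8)

# Hypothetical handler addresses for two opcodes
table = build_table({0x00: 0x0120, 0x7F: 0x0ABC})
print(hex(lookup(table, 0x7F)))  # 0xabc
```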

A while ago I noticed that adding a single 3-state buffer would allow reads from the C register. I was disappointed about this limitation, since it seemed like such a great gain for just a single chip that I left out of the design simply because I didn't stop and calculate how many chips the feature would actually need. Well, after writing some assembly, I've noticed that I haven't needed to read the C register even once so far. I'm not completely sure, but I think it's because there are so few registers, which forces the B and C registers to constantly be loaded with new memory addresses to fetch from and write to. The lack of registers gives the feeling of using a memory-memory architecture, but without the convenience. I might write up a better explanation later. But I don't think I will make it so that C can be read.

I also noticed that I tended to write the instructions ldi (low address) b and ldi (high address) c a lot. Given how often they are used together, it seemed like I could save a byte by creating a new instruction to load both the C and B registers at the same time. So now that code is written ldi address cb. This should slightly increase execution speed and I still have plenty of unused opcodes to spare for future extensions.
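The byte saving can be checked with a little arithmetic. This Python sketch assumes each ldi is one opcode byte followed by one immediate byte, and that the combined form is one opcode byte followed by two immediate bytes; those sizes are assumptions chosen to match the one-byte saving mentioned above. The low byte goes to B and the high byte to C, as in the text.

```python
def split_address(address):
    """Return (low byte for B, high byte for C) of a 16-bit address."""
    return address & 0xFF, (address >> 8) & 0xFF

# Assumed encodings: opcode byte + immediate byte(s)
SEPARATE_BYTES = 2 * (1 + 1)  # ldi low b, then ldi high c
COMBINED_BYTES = 1 + 2        # ldi address cb

low, high = split_address(0x1234)
print(hex(low), hex(high))              # 0x34 0x12
print(SEPARATE_BYTES - COMBINED_BYTES)  # 1
```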

2025-05-20

Loading the microcode SRAM works after I fixed a couple wiring errors. I think I got confused last time which wires were generating which signals. Probing wires with the oscilloscope revealed the problem.

2025-05-30

I added support for programming to the flash programmer's code and was able to flash a chip with the classic Fibonacci program. I turned on and reset the computer and noticed that it was (and still is) very unstable. I think this is related to the microprogram counter. On the oscilloscope I can see it sometimes skip from state 3 to state 5, skipping 4. I don't know why it does this yet. I was able to get it to execute the program though, but the pattern that showed up on the LEDs wasn't correct. I wrote the numbers down and found that it was the Fibonacci number sequence, but with an extra 1 added each iteration: 0, 1, 2, 4, 7, 11, and so on. I wrote about this exact problem many years ago, but I thought I fixed it. I checked the carry-in to the ALU and it was connected to the carry generator, but nothing else. I connected it to the flags register bit it was supposed to be connected to and then it finally gave me Fibonacci: 0, 1, 1, 2, 3, 5, and so on.

This is a milestone. I don't know if I have a working universal computer, but I do have a working computer.

I know that reading and writing registers works, I know that basic arithmetic works, and I know that jumping works.

Next I tried Hello World. I erased the first sector of flash, verified it was erased, programmed it, read it back to make sure it programmed properly, and found that it was still erased. It seems that the bytecode that I sent to the flash chip wasn't executing properly. It turned out that there were a couple bugs that should have prevented programming from ever working. I don't know why it worked with the Fibonacci program. I finally got Hello World programmed into flash. I was able to get a stable run of the program and it displayed the ASCII values for each character on the LED bar. The program is supposed to restart after it finishes printing out the message though, and it didn't. I'm not yet sure why. It's possible that branching isn't working quite right.

I know that reading arrays from program memory works.

Video of the Fibonacci program.

2025-06-02

Here is a video of the Hello World program I mentioned, but it now repeats as intended. The problem was that to jump, the program counter's clock needs to be toggled, but if the jump didn't occur because the selected condition code wasn't set, the program counter's clock would be toggled anyway. The program counter was already pointing at the next instruction, and toggling the program counter incremented it, so the next opcode would be skipped. A workaround would be to put a nop after every branch instruction. The solution I used though was to tie the /JMP signal to the program counter enable. This disables increments, but it still permits resets and loads.
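The skipped opcode and the fix can be modeled with a small sketch. It is written in Python and assumes a loadable counter where a load takes priority over counting; the function names and the addresses are hypothetical, and the real counter chips may behave differently in detail.

```python
def clock_pc(pc, load, enable, target):
    """One clock edge on a loadable counter: load wins over increment."""
    if load:
        return target
    if enable:
        return pc + 1
    return pc

def branch_step(pc, condition_met, target, fixed):
    """Model the clock edge issued by the jump micro-step.

    pc already points at the instruction after the branch.
    """
    jmp_n = 0  # /JMP is active (low) during this micro-step
    # Before the fix the counter was always enabled; after it, enable follows /JMP.
    enable = bool(jmp_n) if fixed else True
    return clock_pc(pc, load=condition_met, enable=enable, target=target)

# Branch not taken, before the fix: the next opcode at 0x10 is skipped
print(hex(branch_step(0x10, False, 0x40, fixed=False)))  # 0x11
# Branch not taken, after the fix: execution continues at 0x10
print(hex(branch_step(0x10, False, 0x40, fixed=True)))   # 0x10
# Branch taken: the load still works with the enable held low
print(hex(branch_step(0x10, True, 0x40, fixed=True)))    # 0x40
```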

Since I have no text display, here is the decoded text:
Hello, world! Hel

I know that branching works. I believe this makes the computer Turing complete from a practical standpoint.

Here's some celebratory cylon eyes:

I turned up the clock rate. According to the oscilloscope, the clock rate (the yellow trace) varies widely and is around 1 MHz. The signal looks horrible because I measured it by connecting the probe to a floating wire next to but not touching the clock signal. When I connect the probe directly to the signal the computer tends to crash. It also tends to crash when I touch the insulated clock wire.

1ish MHz clock of the cylon program (jpg)

This is a bit of the output port that is driving the LEDs (the cyan trace).

Signal at an output LED while running the cylon program at 1ish MHz (jpg)

This project is going surprisingly well.

I think the next step is to hook up the I/O port to the laptop using the LPC1769, which means I need to tear up the circuit it's connected to, which means I need to buy some flash chips or build an EPROM programmer so I can make the microcode permanent.

2025-06-21

I wrote an 8-bit multiply routine. It accepts two hard-coded 8-bit numbers and multiplies them together, giving an 8-bit result. It failed, so I started debugging the hardware and I eventually found that for the clr f instruction, the ALU was subtracting instead of outputting 0x00. What caused it was an ALU operation select signal connected to the data bus instead of the instruction register. Apparently I misplaced the wire by one pin on the instruction register. Shifting it back from the register input to the register output fixed that problem.

The program still didn't work though, so I kept on probing signals. It eventually dawned on me that the mistake was in the software.

              clr f
              clr b
              sub a b b
              ldi multiply8-end cb
              br.z

clr f sets register F to 0. clr b sets register B to 0. But unlike clr f, clr b also loads F with the ALU flags. So in this case, F (which was 0) gets overwritten with garbage, which sets the carry bit and prevents sub a b b from properly checking if the A register is zero. Swapping the positions of clr f and clr b fixed this, and then the program multiplied correctly.
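The ordering problem can be reproduced in a toy model. This Python sketch rests on several assumptions that are not from the text: the flag bit positions, the exact flag value that clr b leaves in F, and a subtraction that treats a set carry bit as a borrow. It only shows why the order of the two clears matters.

```python
CARRY = 0x01  # hypothetical bit position of the carry flag in F
ZERO = 0x02   # hypothetical bit position of the zero flag in F

def branch_taken(order, a):
    """Run the two clears in the given order, then sub a b b, and
    report whether br.z would be taken (result equal to zero)."""
    regs = {"a": a, "b": 0x55, "f": 0xFF}  # arbitrary starting values for b and f
    for op in order:
        if op == "clr f":
            regs["f"] = 0             # clears F and leaves the flags alone
        elif op == "clr b":
            regs["b"] = 0
            regs["f"] = CARRY | ZERO  # assumed flags left behind by the clear
    borrow = regs["f"] & CARRY        # assumed: carry set means borrow one
    regs["b"] = (regs["a"] - regs["b"] - borrow) & 0xFF
    return regs["b"] == 0

# Original order: F is clobbered by clr b, so A == 0 is not detected
print(branch_taken(["clr f", "clr b"], a=0))  # False
# Swapped order: F is still 0 when the subtraction runs
print(branch_taken(["clr b", "clr f"], a=0))  # True
```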

I've started planning how to add more memory and IO to the computer. I think it will be annoying to do. I have two 8255 parallel IO chips. Each one provides 24 bits of parallel I/O. I've got a D8253C with three 16-bit timers. I don't think that will be useful now, but it could be nice later, maybe for managing multiplexed 7-segment displays. I have a CDP6402C UART which looks easy to use, but I will need to find a crystal for it. I would like to use it to load programs into the program RAM, but I don't have a program RAM yet, so I'll have to add that. The UART has two additional output bits, one for transmit finished and one for receive finished. To read these signals I would need to feed them into one of the 8255's. This isn't ideal though. If this were something like a Z80, I'd prefer to tie them to interrupts, but MicroComp doesn't have interrupts.

This is annoying, so (I think) I've figured out a way to hack interrupts in. I would need to add two additional registers to capture the PC value when an interrupt occurs. These read-only registers would be mapped into data memory with the rest of the I/O. I could potentially use them for PC-relative addressing too, but I'm not too interested in that. This method of relative addressing would be slow, and I'm content recompiling (or relinking?) the program each time I change the entry address. For triggering and enabling interrupts I'll probably need a few flip-flops and glue gates. To make the control logic handle the interrupt, I can tie the interrupt signal to address bit 10 of the microcode memories. This will act kind of like segmentation. The bit flips, and a whole new instruction set is used, perhaps with each instruction doing the same thing: jumping to the interrupt service routine. In order to jump, MicroComp needs to load the B and C registers with the address. That would be complicated, so an interrupt can simply reset the program counter to 0x0000 instead. The interrupt vector is the reset vector. This would require the boot code to check if an interrupt is active, and if it is, handle it. Now for the tricky bit. An interrupt cuts off program execution at some random point, executes an entirely different subroutine, then jumps back to the point of the interrupt and continues execution. In order to do this it needs to save and restore the registers. Many computers save the registers on the stack. MicroComp has no stack, and it would probably require a minimum of five other chips to add. The Zilog Z80 saves its registers by swapping them with shadow registers. This would require me to duplicate the four registers and then add another microcode chip, or else add multiplexing logic for the register control signals. I've noticed a coding pattern that I could exploit though, which would allow for no extra logic to be added, but at the cost of extra latency. 
Programs seem to work like this: data is loaded into the registers and operated on, overwriting the previous contents, then is written back to memory and no longer needs to be saved. I think it would be possible to make the assembler automatically detect these points where all data has been written back to memory and is effectively garbage. On the last store back to memory before the data becomes garbage, the assembler inserts a variant of the store instruction that enables interrupts for just that instruction. The stm and stp instructions are the only instructions that save data to memory, so these are the only instructions that need interrupt-enabled variants. They store the data to memory, check if an interrupt is waiting, and if so, reset the machine. Connecting the interrupt signal to address bit 10 would determine whether the store instruction (the variant with interrupts enabled) resets the machine. Tada! We have interrupts with only two extra chips.
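The bank selection through address bit 10 can be sketched like this. The Python below assumes the existing microcode address occupies bits 0 through 9; how the opcode and microprogram counter are packed into those bits is not specified here, so the sketch treats them as one opaque value. The function name is hypothetical.

```python
BANK_BIT = 10  # microcode address bit driven by the interrupt signal, from the text

def microcode_address(base, interrupt_pending):
    """Combine the existing low address bits with the interrupt bank bit.

    base: the low 10 address bits (assumed width, layout unspecified).
    """
    bank = 1 if interrupt_pending else 0
    return (bank << BANK_BIT) | (base & 0x3FF)

# Same instruction and step, two different microcode words
print(hex(microcode_address(0x155, False)))  # 0x155
print(hex(microcode_address(0x155, True)))   # 0x555
```

With no interrupt pending, the interrupt-enabled store variants would fetch the same microcode as before; with one pending, they would fetch from the second bank, where the microcode can reset the program counter.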

Finally, I have an 8237 DMA controller, which I believe is designed for use with the 8088 CPU. I don't yet know if I will be able to make it work with the machine, but it would be nice because moving data is sluggish as is. I think I would map it into data memory so it could interact with I/O. It's a shame though that I won't be able to do DMA transfers between data and program memory. I will need a way for the CPU to relinquish control of memory and I/O to the DMA controller. I was thinking of adding a bus transceiver to the data bus between the CPU and the memory and I/O because of the large fan out. This will require I rewire or even pull up the data SRAM I already have so that it is on the "outside" of the transceiver and not directly connected to the CPU. The transceiver would be put into hi-Z during DMA transfers, so the RAM needs to be on the same side of the transceiver as the I/O.

I bought some more caterpillars, fed them, and placed them in the habitat.

Or to phrase it another way, I bought some more 39SF020A flash chips and used three of them for microcode.

MicroComp using flash for microcode again (jpg)

The microcontroller for loading the microcode SRAMs is no longer needed, so there are no more microprocessors on the breadboards. I might add an AVR though to manage some tasks like reset and communication with the laptop.

All resources are now hosted on GitHub. The last non-GitHub version will be kept below, but I am now pushing new versions to GitHub.
MicroComp scratchwork, tools, and software
MicroComp KiCAD schematics
MicroComp VHDL sources

Files:

MicroComp reference v1.1.1 (txt)
MicroComp reference v1.0 (txt)

MicroComp KiCAD Schematics v1.1.1 (zip)
MicroComp EAGLE Schematics v1.1.1 (sch)
MicroComp v1.0 EAGLE schematics (sch)

Test programs (zip)

Microcode v1.1.1 (zip)
Microcode v1.1.0 (zip)
Microcode v0.3 (zip)
Microcode v0.3 for simulation (zip)
Microcode v0.2 (zip)
Microcode v0.1 (zip)

This microcode assembler detects improper syntax by displaying the text "Segmentation Fault" and exiting.
Buggy microcode assembler (zip)

Note that the VHDL sources include the simulation microcode.
MicroComp v1.1.1 VHDL sources (zip)
MicroComp v1.0 VHDL sources (zip)

The MicroComp simulation sources are dependent upon these 7400 series VHDL modules
7400 series VHDL sources (zip)

Anything related to development of MicroComp.
Development scratch-work (zip)