VOGONS


First post, by GigAHerZ

Rank: Oldbie

A while ago I wrote about the Neo6502: Olimex Neo6502 - Software defined machine!?
It seems to be (mostly) a real 6502 CPU with a "virtual"/fake environment created for it by a tiny RPi Pico, making it basically a software-defined computer platform.

... But it is still in hardware.

I'm very... very-very comfortable in software engineering. On the hardware side, it's the very opposite. Hardware issues and debugging also have a round-trip time that is orders of magnitude slower when trying out different ideas and fixes.

So I figured that, since I can write code, I could write code that simulates hardware defined as pins and the connections between them. (I also learn best by building: once I've built myself a virtual MOS 6522, for example, I know very well how to use it.) I've been playing around with this for two days, and I think I've gotten somewhere that I kind of like.
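To make that concrete, the core idea in code looks roughly like this (a minimal sketch, not my actual implementation; all names are made up):

```csharp
using System.Collections.Generic;
using System.Linq;

// Hardware as data: pins carry a level, nets short pins together,
// and devices read/drive the pins they are attached to.
public enum Level { Low, High, Floating }

public sealed class Pin
{
    public Level Value { get; set; } = Level.Floating;
}

public sealed class Net
{
    private readonly List<Pin> _pins = new();

    public void Connect(Pin pin) => _pins.Add(pin);

    // Naive resolution: propagate the first driven (non-floating)
    // value to every connected pin. Real bus contention is messier.
    public void Resolve()
    {
        var driven = _pins.FirstOrDefault(p => p.Value != Level.Floating);
        var value = driven?.Value ?? Level.Floating;
        foreach (var p in _pins) p.Value = value;
    }
}
```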

The picture shows a "platform" consisting of a 6502 CPU (I wrapped the Asm6502 library into a "Device" compatible with my framework) and 64 kB of RAM, plus pull-up/pull-down resistors provided by the "platform" itself. And it works! But it is somewhat slow: on a Ryzen 7 3700X running Fedora 42 KDE, I get a platform cycle frequency of ~35 kHz. Enough to play around with, but nowhere near enough for ~1 MHz simulations, unfortunately.
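The ~35 kHz figure comes from a measurement loop along these lines (a sketch; platformCycle stands in for whatever advances the simulation by one clock cycle):

```csharp
using System;
using System.Diagnostics;

// Run N platform cycles against a stopwatch and report kHz.
static double MeasureKilohertz(Action platformCycle, int cycles = 1_000_000)
{
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < cycles; i++) platformCycle();
    sw.Stop();
    return cycles / sw.Elapsed.TotalSeconds / 1_000.0;
}

// Usage: Console.WriteLine($"{MeasureKilohertz(() => { /* one cycle */ }):F1} kHz");
```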

I tried Google and even AI to search for something pre-made. There are SpiceSharp, LogicGraph and SharpCircuit, but they are all much lower-level simulations and therefore quite probably slower than even my handmade solution. (They can do analog circuits, too.)

Currently I have only a picture to share, showing what the code for such a "computer" looks like. Any feedback on ideas, concepts and even practical developer experience is very welcome.

Last edited by GigAHerZ on 2026-02-07, 17:01. Edited 2 times in total.

"640K ought to be enough for anybody." - And i intend to get every last bit out of it even after loading every damn driver!
A little about software engineering: https://byteaether.github.io/

Reply 2 of 8, by GigAHerZ

Rank: Oldbie

Another day, another rewrite! 😁

So I fundamentally changed how the simulation works. I no longer go through all the pin networks and "logic devices" (which should produce immediate output based on their inputs), calculating their values (potentially multiple times until they stabilize), and then through all the "clocked devices" to let them do their work and make their changes to the pins.
Instead, I rewrote the system in a "reactive" style. Each device can register the pins it wants to react to (not necessarily all input pins; just the clock, for example), and it will be triggered only when a registered pin changes value.
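In sketch form, the reactive scheme works something like this (names are illustrative, not the real API):

```csharp
using System;
using System.Collections.Generic;

// A stateful pin: it keeps its value, and a change notifies exactly
// the devices that registered for it, nothing else.
public sealed class ReactivePin
{
    private readonly List<Action<ReactivePin>> _listeners = new();
    private bool _value;

    public bool Value
    {
        get => _value;
        set
        {
            if (_value == value) return;            // no change, no work
            _value = value;
            foreach (var l in _listeners) l(this);  // trigger subscribers only
        }
    }

    public void Subscribe(Action<ReactivePin> listener) => _listeners.Add(listener);
}

// A clocked device registers only its clock pin, not every input:
public sealed class ClockedDevice
{
    public ClockedDevice(ReactivePin clk) =>
        clk.Subscribe(pin => { if (pin.Value) OnRisingEdge(); });

    private void OnRisingEdge() { /* sample inputs, drive outputs */ }
}
```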

As a result, performance jumped quite a bit!

The scenario is a three-device setup: a 6502 CPU, 64 kB of SRAM, and a small device (I call it the "chipset") that translates memory-related control signals from the 6502 into SRAM-compatible signals. (Think of simple 74LS glue logic, GAL chips, or an extremely under-utilized FPGA.)
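For a feel of what the "chipset" does, here's a plausible sketch (pin names assumed; the real 6502 bus timing is subtler than this): it turns the CPU's single R/W line, qualified by PHI2, into the SRAM's active-low strobes.

```csharp
// Glue logic between a 6502-style bus and a standard SRAM.
public sealed class GlueLogic
{
    public bool RW   { get; set; }  // from the CPU: high = read, low = write
    public bool Phi2 { get; set; }  // clock phase 2: the bus is valid while high

    public bool OeN => !(Phi2 && RW);    // /OE asserted (low) only for reads
    public bool WeN => !(Phi2 && !RW);   // /WE asserted (low) only for writes
    public bool CeN => !Phi2;            // /CE asserted only while PHI2 is high
}
```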

Before the rewrite I got a cycle performance of about 25 kHz in debug mode and 45 kHz in release mode.
Now I get about 120 kHz in debug mode and 170 kHz in release mode.

And while I didn't have to change much of what I had already implemented for the devices (CPU, RAM, glue logic), I did add the trigger-pin definitions to each device, and I was even able to simplify their implementations a bit. (Pins became stateful with this rewrite, so there is no need to re-write values to the pins over multiple cycles if you want to keep them on the bus for a while.)

Pretty nice! And because it is a reactive system now, I can model a huge design/hardware set, as devices are triggered only when necessary. The earlier system became slower with every component I added, even if it never did any work. (Like the "synthetic" device that functions as a set of pull-up and pull-down resistors.)

Btw, did I mention that it runs a real 6502 assembly program from memory? It is fully functional.

EDIT:

NO WAY!

More optimizing, and I got 350 kHz in debug mode and 1.4 MHz in release mode!

"640K ought to be enough for anybody." - And i intend to get every last bit out of it even after loading every damn driver!
A little about software engineering: https://byteaether.github.io/

Reply 3 of 8, by kingcake

Rank: Oldbie

You just need state machines. This is basic stuff.

Reply 4 of 8, by GigAHerZ

Rank: Oldbie
kingcake wrote on 2026-01-30, 20:36:

You just need state machines. This is basic stuff.

Yet, I have to build one myself because nobody has written one yet! It is that basic!

"640K ought to be enough for anybody." - And i intend to get every last bit out of it even after loading every damn driver!
A little about software engineering: https://byteaether.github.io/

Reply 5 of 8, by kingcake

Rank: Oldbie
GigAHerZ wrote on 2026-01-30, 20:41:
kingcake wrote on 2026-01-30, 20:36:

You just need state machines. This is basic stuff.

Yet, I have to build one myself because nobody has written one yet! It is that basic!

Wut. State machines are a basic computer science concept and are everywhere. You should look into industrial control state machine frameworks for C. You basically just define all the states and the framework generates the rest.

I don't know if anything like that exists for .NET languages, but I would start by porting one of those first. It will make what comes next much faster, and portable.
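Something like this, in C# terms (an illustrative table-driven sketch, not any specific framework): you declare the transition table, and a generic step function does the rest.

```csharp
using System.Collections.Generic;

enum State { Idle, Fetch, Execute }
enum Input { Clock, Reset }

class TableFsm
{
    // The whole machine is data: (current state, input) -> next state.
    private readonly Dictionary<(State, Input), State> _table = new()
    {
        [(State.Idle,    Input.Clock)] = State.Fetch,
        [(State.Fetch,   Input.Clock)] = State.Execute,
        [(State.Execute, Input.Clock)] = State.Fetch,
        [(State.Execute, Input.Reset)] = State.Idle,
    };

    public State Current { get; private set; } = State.Idle;

    public State Step(Input input) =>
        Current = _table.TryGetValue((Current, input), out var next) ? next : Current;
}
```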

Reply 6 of 8, by GigAHerZ

Rank: Oldbie

Sure-sure, kingcake.

Anyways...

I've set up cc65 development in my solution, side-by-side with the C# circuit simulation. So when I run the project, I compile both the hardware design project and the assembly code for the 6502. With a single press of F5, I get the full development cycle on two levels (hardware design and 6502 software). I see a bug, change the code, and one second later I'm running the new machine. (You can't even flash a ROM chip that fast!)
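The hookup in the project file looks roughly like this (a sketch; file names and the linker config are illustrative): a pre-build target assembles the 6502 program, so F5 rebuilds both levels.

```xml
<Target Name="Assemble6502" BeforeTargets="Build">
  <!-- cl65 is cc65's compile-and-link driver; -C picks the memory layout. -->
  <Exec Command="cl65 -t none -C memory.cfg -o rom.bin ewoz.s" />
</Target>
```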

I've now quickly built a tiny terminal component (using raylib) and connected it to the machine. I also found an enhanced WozMon (EWOZ) and ported it to my machine.

As a result, I have a fully working 6502 computer, completely defined/designed in code. At any point I can pause the whole thing and look into every chip, every bus and every line and see what value it has. When debugging, I can watch the values and single-step not the machine, but the processing of a single component.

Elementary in software development, but crazy in the context of hardware design.

And the speed hovers around 1-1.2 MHz. (I let it run as fast as possible.)
Similar to VIC-20 and Commodore 64 clock speeds.

EDIT: I had missed one silly unoptimized piece of code. Now it runs at around 1.5 MHz.

"640K ought to be enough for anybody." - And i intend to get every last bit out of it even after loading every damn driver!
A little about software engineering: https://byteaether.github.io/

Reply 7 of 8, by GigAHerZ

Rank: Oldbie

So... uhm... I might have had a small "manic" period or something. I was motivated enough for about a whole week to do single 20-hour stunts, and once even a 40-hour stunt, skipping a night.

And because I disliked the MOS6502 library's lack of pin-level simulation (and therefore my hacks around it), I wrote my own 6502 implementation. It is not just cycle-accurate with all the quirks (like certain instructions pushing random values onto the address bus in the middle of their work), but also pin/wire-accurate!
And I also wrote a full implementation of the 16C550 UART chip based on the datasheet. (Very much preferred over the MOS 6551 ACIA.) It has all the features the datasheet covers: 16-byte FIFOs with 4 interrupt trigger levels (at 1, 4, 8 and 14 characters), the character timeout interrupt, all the configuration registers to control all of that, etc.
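The receive-FIFO trigger logic, for example, boils down to something like this (field names are mine; FCR bits 6-7 select the fill level at which the received-data interrupt asserts):

```csharp
using System.Collections.Generic;

// A sketch of the 16C550's RX FIFO trigger behaviour.
public sealed class Uart16C550Rx
{
    private readonly Queue<byte> _rxFifo = new();  // 16-byte receive FIFO
    private int _triggerLevel = 1;

    public void SetFifoControl(byte fcr) =>
        _triggerLevel = ((fcr >> 6) & 0b11) switch { 0 => 1, 1 => 4, 2 => 8, _ => 14 };

    public bool RxInterruptPending => _rxFifo.Count >= _triggerLevel;

    public void OnByteReceived(byte b)
    {
        if (_rxFifo.Count < 16) _rxFifo.Enqueue(b);  // overrun handling omitted
    }
}
```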
The whole framework went through something like 10 rewrites during this time.

And it works! I have my WozMon running, and my terminal is connected to the 16C550 UART chip's external TX/RX lines. I have, in total, 4 implementations of the terminal interface:
1: The aforementioned full 16C550 implementation. As it has RX/TX lines, it's a terminal!
2: My own simplified terminal chip that just reads the data bus and blurts it out. It has Read/Write and ChipEnable control pins, so it's basically a data-bus sniffer with control signals.
3: TcpTerminal. It creates a small TCP listener and uses that for external communication, which lets you use your favorite terminal software. The "other side" has an RX/TX interface, because it, too, is a terminal!
4: I tried out the raylib library. (It can also be seen in the previous post's screenshot.) A very simple keyboard reader and text renderer. The "other side" yet again has an RX/TX interface.
... and anything with the terminal interface can be connected to anything else with a virtual "cross cable" (RX <-> TX, TX <-> RX). So I have my 16C550 and the simple terminal chip inside the simulation, and the TCP terminal and raylib terminal outside it.
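The cross-cable itself is almost nothing (ITerminal is an illustrative interface, reusing the ReactivePin sketch from my earlier post):

```csharp
public interface ITerminal
{
    ReactivePin Tx { get; }   // serial data out
    ReactivePin Rx { get; }   // serial data in
}

public static class CrossCable
{
    // A's TX drives B's RX and vice versa, so any two terminals connect.
    public static void Connect(ITerminal a, ITerminal b)
    {
        a.Tx.Subscribe(pin => b.Rx.Value = pin.Value);
        b.Tx.Subscribe(pin => a.Rx.Value = pin.Value);
    }
}
```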

It supports combinatorial logic and runs wire-net calculations until they stabilize. It also supports listening to pins only once they are stable. And you can create mixed devices, where some pins feed combinatorial logic while others cause a trigger only once the input has stabilized and changed. Each trigger carries an identifier flag, too, so when your logic executes, it knows exactly which pin(s) caused that particular execution. (Combinatorial RESET vs. stable CLK-based execution in the same chip, for example.)
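The "run until stable" part can be pictured like this (the interface is illustrative): keep re-evaluating the combinatorial devices until a full pass changes nothing, and only then fire the stable-pin listeners.

```csharp
using System.Collections.Generic;

public interface ICombinatorial
{
    bool Evaluate();   // returns true if any output pin changed
}

public static class Settler
{
    public static bool Settle(IReadOnlyList<ICombinatorial> devices, int maxPasses = 100)
    {
        for (int pass = 0; pass < maxPasses; pass++)
        {
            bool changed = false;
            foreach (var d in devices) changed |= d.Evaluate();
            if (!changed) return true;   // stable: safe to notify listeners
        }
        return false;                    // never settled (oscillating loop)
    }
}
```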
All of this currently runs on my PC at about 1.17 MHz while sitting at the WozMon prompt in its keyboard-check loop. When I let WozMon blurt out huge sections of memory, the simulation speed drops to about 1.1 MHz.

I need to clean the thing up a bit, move some code into better-named files and whatnot. Mostly "cosmetics". So no public release yet. But as a teaser, I present a guide to creating your own chips for this simulator!
And the idea of implementing a Z80 CPU the same way I just did the 6502 is starting to torment me...

But now, I need a sleep...

EDIT: Updated the guide. I think it is as comprehensive as it can be for developing new devices for my simulator. Now I've got to work on a guide to building computers: basically how to connect together all those devices you have. That one is a lot simpler.

"640K ought to be enough for anybody." - And i intend to get every last bit out of it even after loading every damn driver!
A little about software engineering: https://byteaether.github.io/

Reply 8 of 8, by GigAHerZ

Rank: Oldbie

I've pivoted... to C++.

There are 2 big advantages to going with C++:
* I can compile Verilog devices into C++ and write an extremely thin wrapper around the generated code to incorporate them into my simulator.
* SPEED! It's two orders of magnitude faster, for reasons I haven't been able to identify. With crazy optimizations, C# achieves a cycle frequency of about 3 MHz. C++ easily does 100+ MHz!

There's also a big disadvantage to C++: I really can't do C++. 😁
But I have the simulator framework itself working properly now. "Just" need the CPU and stuff.

... and I probably need to create my own 65C02 CPU implementation.
The Verilog variants I've found are either buggy or are the 6502 without the "C" in the middle.
And none of the C++ versions I've found actually do half-clock emulation. (Clock input, emulate one state while the clock is high, emulate another state while the clock is low, etc.)
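The half-clock idea, sketched in C# to match what my existing core does (names are mine): the CPU sees the raw clock pin and does different work on each half-cycle.

```csharp
public sealed class HalfClockCore
{
    private bool _lastClk;

    public void OnClockPinChange(bool clk)
    {
        if (clk == _lastClk) return;   // act only on actual edges
        _lastClk = clk;
        if (clk) HighPhase();          // e.g. sample or drive the data bus
        else     LowPhase();           // e.g. set up the next address
    }

    private void HighPhase() { /* first half-cycle's work */ }
    private void LowPhase()  { /* second half-cycle's work */ }
}
```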

So I think I need to start creating, in C++ (which I don't really know), a micro-instruction, cycle-accurate, pipelined 65C02 implementation similar to the one I've already done in C#. (Btw, the C# version, even though it emulates the 65C02 more accurately and at a lower level, is more performant than the widely used Mos6502 library.)

I can't leave two orders of magnitude of performance on the table and continue with C#...

"640K ought to be enough for anybody." - And i intend to get every last bit out of it even after loading every damn driver!
A little about software engineering: https://byteaether.github.io/