VOGONS


Does anyone own a MISTer FPGA and how is it?


Reply 80 of 92, by vstrakh

Rank: Member
Shreddoc wrote on 2025-05-23, 03:21:

FPGA *is* emulation

FPGA is a hardware implementation. It's a real digital circuit performing the same function; it doesn't matter whether I build a D flip-flop out of discrete transistors or use a 7474.
Ditching lots of discrete elements in favor of one big VLSI chip that does everything the original circuit did is another implementation, not "emulation".
Whether it does a good or bad job of being a clone implementation is irrelevant to the fact that an FPGA is just roughly the same as VLSI (very-large-scale integration), but reconfigurable.

Reply 81 of 92, by Shreddoc

Rank: Oldbie

FPGA cores are written by people. People make choices and sometimes they make mistakes or compromises.

There are countless instances in human-written MiSTer cores where the choice has been deliberately made to deviate from the exact form of the original hardware - sometimes because those forms aren't known or developed yet! - in the name of *emulating the output* of the original hardware. Sometimes that deviation is significant.

For example, is the ao486 core "accurate"? Of course not. It contains *many* deliberate shortcuts in order to attain the performance it has within the Cyclone V FPGA's limited resources. And are all of the MiSTer's vaunted arcade cores fully "accurate"? Of course not. Sometimes there are glaring inaccuracies in MiSTer cores. For example, the incomplete OPL2 FPGA module, which is in turn adopted by several arcade cores, still has some pretty shocking performance in some areas... assuming you like your game music to actually sound complete, with all its instruments, etc., throughout your entire game.

It largely comes down to the difference between "timing accuracy" and "emulation accuracy". Is MiSTer a huge aid to the former - and a lot simpler and cheaper than the PC required to, in its own way, match it? Hell yes, no question there - there never was. But FPGA is certainly not some magical guarantee that the emulation accuracy is written to the 100% standard that some people imagine it to be. Emulation rarely is. That's no jibe against the incredibly talented developers. And MiSTer is certainly not the only possible way to achieve incredibly high-quality emulation performance. Sometimes the software emulation experience can even be further advanced than its FPGA counterpart, shocking as that may seem to some!

MiSTer just happens to be, imo, the best option currently available, all things considered - as it has been for quite a few years now, and will probably continue to be for a few more yet. There are few things I recommend to retroheads more vehemently, because I love it... but let's not suffer the misconception that MiSTer's human-written cores - with their myriad imperfections and deliberate shortcuts where it suits, using modern hardware of a completely different form than the original, with the primary goal of iteratively reconstructing the output of the original hardware to the best of the authors' ability - are somehow "not emulation".

Supporter of PicoGUS, PicoMEM, mt32-pi, WavetablePi, Throttle Blaster, Voltage Blaster, GBS-Control, GP2040-CE, RetroNAS.

Reply 82 of 92, by doublebuffer

Rank: Member

Don't trust emulation, buy the original! (Intel propaganda campaign in the '80s/'90s)

Reply 83 of 92, by Shreddoc

Rank: Oldbie

Propaganda-type views are exactly what created the stupid stigma around that word "emulation", for which the dictionary definitions are variations on "the endeavor to equal, or imitate".

Repeat after me: being a form (or "implementation", if you like) of emulation doesn't somehow make the MiSTer FPGA less. Banish the stigma and stop being scared of a word.

Supporter of PicoGUS, PicoMEM, mt32-pi, WavetablePi, Throttle Blaster, Voltage Blaster, GBS-Control, GP2040-CE, RetroNAS.

Reply 84 of 92, by vstrakh

Rank: Member

Emulation of the IBM PC!
"the endeavor to equal, or imitate" by all definitions.
People at Citygate and Harris had to make choices, and they had to make compromises. But is it cycle-accurate to the IBM 5170? Of course not.

The attachment motherboard-286-citygate-td60c.jpg is no longer available

Reply 85 of 92, by amadeus777999

Rank: Oldbie

Emulation is nice, especially when it comes to developing something - it can make life WAY easier.
Otherwise I prefer original hardware as the gold standard - the hardware in itself is part of the magic, not only its functionality.
"Perfect" emulation may be a fool's errand.

Reply 86 of 92, by javispedro1

Rank: Member
doublebuffer wrote on 2025-05-22, 22:43:

No, not yet even in practice. Triple buffering requires 2 frames of lag to achieve a smooth frame rate. So for an arcade game running at 60 fps, the powerful PC and the display need to run at 180 fps.

Why all this? You do not require "triple buffering" at all if you treat the PC like the 1980s machine it is and just race the beam, like an FPGA without buffering is going to do. Depending on what arcade machine you want to emulate, the PC is going to be able to do it without breaking a sweat, so OS-added jitter is going to be irrelevant*, and it can easily generate 300 fps or 6000 fps if need be (unless you want absurd resolutions, but FPGAs are going to struggle there, if only because of the timing requirements of high-clock HDMI...).

*OS-added latency is not irrelevant, but there are still OSes out there that add no presentation latency. I hope (but do not know) Windows does it for full-screen games w/o compositor.
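
To make "racing the beam" concrete, here is a minimal sketch in C of the idea - emulate and hand off one scanline at a time instead of whole frames. Every name in it is a hypothetical placeholder, not any real emulator's API:

#include <stdint.h>

enum { WIDTH = 320, LINES = 240 };

static uint32_t framebuffer[LINES][WIDTH];

/* Placeholder: run the emulated machine for exactly one scanline's
   worth of cycles, producing that line's pixels. */
static void emulate_one_scanline(int line, uint32_t *pixels)
{
    for (int x = 0; x < WIDTH; x++)
        pixels[x] = 0xFF000000u | (uint32_t)(line * x);  /* dummy pattern */
}

/* Placeholder: push one finished line toward the display - on suitable
   hardware this could be a write into a mapped front buffer. */
static void present_scanline(int line, const uint32_t *pixels)
{
    (void)line;
    (void)pixels;
}

int main(void)
{
    /* One frame: emulate, then present, line by line.  Worst-case lag
       is ~1 scanline instead of the 1-2 frames a buffered pipeline adds. */
    for (int line = 0; line < LINES; line++) {
        emulate_one_scanline(line, framebuffer[line]);
        present_scanline(line, framebuffer[line]);
    }
    return 0;
}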

doublebuffer wrote on 2025-05-22, 22:43:

It's time to stop spreading this FUD (fear, uncertainty, doubt). There is a fundamental difference between naturally parallel FPGAs and (semi*) fake-parallel CPUs.

There's no FUD here. It is emulation under a different form. There are differences, sure, but they are smaller than you think, and there is an extremely smooth transition between the two, which makes a blanket black/white "it is emulation / it is not emulation" categorization absurd. E.g. what if my FPGA "core" is just a soft processor running a program? What if I use the helper CPU that many FPGAs come with? What if I use the helper ALUs/DSPs/multipliers/whatever? What if my core is proven to be logically equivalent at an external level, yet internally it has completely different state machines? You can't have an identical replica of the original ASIC in an FPGA, due to physics/timing alone (forget cycle-accurate -- I mean glitch-behavior-accurate). You can have a pretty convincing analogue, though.

I subscribe to the opinion that there is nothing to be ashamed of in the word "emulation", because it is what it is. And it is a pretty well-entrenched word even in logic design.

If you want a headache, are virtual machines also emulation? x86 VMs may not emulate the CPU, but they definitely emulate e.g. IDE controllers. Is a 2020s PC an emulator of the original IBM PC?

Last edited by javispedro1 on 2025-05-30, 10:43. Edited 1 time in total.

Reply 87 of 92, by SScorpio

Rank: Oldbie
javispedro1 wrote on 2025-05-29, 13:35:

Why all this? You do not require "triple buffering" at all if you treat the PC like the 1980s machine it is and just race the beam, like an FPGA without buffering is going to do. Depending on what arcade machine you want to emulate, the PC is going to be able to do it without breaking a sweat, so OS-added jitter is going to be irrelevant*, and it can easily generate 300 fps or 6000 fps if need be (unless you want absurd resolutions, but FPGAs are going to struggle there, if only because of the timing requirements of high-clock HDMI...).

*OS-added latency is not irrelevant, but there are still OSes out there that add no presentation latency. I hope (but do not know) Windows does it for full-screen games w/o compositor.

You can't race the beam; it's frame buffers and multiple levels of abstraction. Once Windows is out of the way, you still have the display drivers themselves, which are their own complete mini-OSes. We have insane amounts of power at our fingertips, but we don't have direct access to it. You could code emulators on a real-time OS, but even some of the modern multi-core monsters still couldn't lock the threads that simulate each chip to a core without swapping.

There are some new display techniques for high-refresh-rate monitors. You get a 16.66667 ms delay, but with 240 Hz, or even better 480 Hz, you display only a part of the frame lit up, another part dimmer, and the rest dark. That mimics the electron gun sweep. It's not line-by-line, but it can look good to the naked eye.
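
As a rough sketch of how such a rolling scan could be computed - all numbers and names below are illustrative assumptions, not any shipping implementation:

#include <stdio.h>

enum { SRC_HZ = 60, PANEL_HZ = 240, ROWS = 240 };
enum { SUBFRAMES = PANEL_HZ / SRC_HZ };   /* 4 panel subframes per source frame */

/* Brightness multiplier (0..256) for a panel row in a given subframe:
   a bright band sweeps down the screen like the electron gun, trailed
   by a dim "phosphor decay" band, with everything else dark. */
static int rolling_scan_gain(int row, int subframe)
{
    int band_h   = ROWS / SUBFRAMES;   /* height of the lit band        */
    int band_top = subframe * band_h;  /* band moves down each subframe */
    int dist     = row - band_top;

    if (dist >= 0 && dist < band_h)
        return 256;                    /* fully lit                     */
    if (dist >= -band_h && dist < 0)
        return 64;                     /* just swept past: dim trail    */
    return 0;                          /* dark                          */
}

int main(void)
{
    for (int sf = 0; sf < SUBFRAMES; sf++)
        printf("subframe %d: row 30 -> %3d, row 120 -> %3d, row 210 -> %3d\n",
               sf,
               rolling_scan_gain(30, sf),
               rolling_scan_gain(120, sf),
               rolling_scan_gain(210, sf));
    return 0;
}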

Reply 88 of 92, by Shreddoc

Rank: Oldbie
javispedro1 wrote on 2025-05-29, 13:35:
Shreddoc wrote on 2025-05-22, 21:52:

It's time to stop spreading this FUD (fear, uncertainty, doubt). There is a fundamental difference between naturally parallel FPGAs and (semi*) fake-parallel CPUs.

Just to be clear, the text you quoted was not written by me. It is from this post written by member 'doublebuffer'.

I am in general agreement with your standpoint.

Supporter of PicoGUS, PicoMEM, mt32-pi, WavetablePi, Throttle Blaster, Voltage Blaster, GBS-Control, GP2040-CE, RetroNAS.

Reply 89 of 92, by javispedro1

Rank: Member
SScorpio wrote on 2025-05-29, 22:21:

You can't race the beam; it's frame buffers and multiple levels of abstraction. Once Windows is out of the way, you still have the display drivers themselves, which are their own complete mini-OSes. We have insane amounts of power at our fingertips, but we don't have direct access to it. You could code emulators on a real-time OS, but even some of the modern multi-core monsters still couldn't lock the threads that simulate each chip to a core without swapping.

I think this is over-exaggerating. Sure, display drivers are complicated, but this is because they do a lot of stuff (they _include_ multiple compilers!). However, you do not have to use the advanced stuff, and most definitely not to emulate a 2D game of that era. The basic presentation layer - I cannot say it's exactly like the 1980s, but it is not THAT different (framebuffers, planes, flips, CRTCs: all of this is pretty much the same). If it were too complicated, or you didn't have direct access, you would not be able to run real-mode OSes using a basic BIOS written in assembly... and, so far, you still can.

I do not know, but I'm reasonably sure you can have a very low-latency presentation mode even on Windows, where you can race the beam and show tearing as much as you want, likely using full-screen mode or at least disabling compositing in some way. The game logic itself, the monitor, and/or the transport likely add more latency than the OS or the GPU does, and you cannot avoid that even with custom hardware...

Shreddoc wrote on 2025-05-29, 22:31:

Just to be clear, the text you quoted was not written by me. It is from this post written by member 'doublebuffer'.

Sorry, corrected.

Reply 90 of 92, by SScorpio

Rank: Oldbie
javispedro1 wrote on 2025-05-30, 10:49:

I think this is over-exaggerating. Sure, display drivers are complicated, but this is because they do a lot of stuff (they _include_ multiple compilers!). However, you do not have to use the advanced stuff, and most definitely not to emulate a 2D game of that era. The basic presentation layer - I cannot say it's exactly like the 1980s, but it is not THAT different (framebuffers, planes, flips, CRTCs: all of this is pretty much the same). If it were too complicated, or you didn't have direct access, you would not be able to run real-mode OSes using a basic BIOS written in assembly... and, so far, you still can.

I do not know, but I'm reasonably sure you can have a very low-latency presentation mode even on Windows, where you can race the beam and show tearing as much as you want, likely using full-screen mode or at least disabling compositing in some way. The game logic itself, the monitor, and/or the transport likely add more latency than the OS or the GPU does, and you cannot avoid that even with custom hardware...

It's a different architecture and you just can't do it on modern systems. Racing the beam is sending out what should be drawn at the exact time it should be drawn, so you are generating the video signal the display is receiving and actively drawing. Anything modern uses a frame buffer, which stores everything that should be drawn and then sends it out. Screen tearing happens when another complete frame finishes and updates the buffer while it is still being scanned out. There was no screen tearing when racing the beam. In a perfect configuration, the top of the screen being drawn would be at least 16.6667 ms behind native hardware and FPGA. You can make a program with latency so low that a human can never see it, but it's still not accurate, and the previously mentioned light guns will not work.

Reply 91 of 92, by javispedro1

Rank: Member
SScorpio wrote on 2025-05-30, 11:45:

It's a different architecture and you just can't do it on modern systems. Racing the beam is sending out what should be drawn at the exact time it should be drawn, so you are generating the video signal the display is receiving and actively drawing. Anything modern uses a frame buffer, which stores everything that should be drawn and then sends it out. Screen tearing happens when another complete frame finishes and updates the buffer while it is still being scanned out. There was no screen tearing when racing the beam.

The only thing a lack of framebuffer does is that it _forces_ you to race the beam as the only viable strategy; but even with a framebuffer, you can most definitely race the beam if you want to -- just write to video memory at 'inconvenient' times. I'm quite sure there is some PC demo doing it somewhere. It's not that the VGA has a secret hidden memory where it keeps a copy of the framebuffer that you cannot modify.
I mention tearing because it is a rather visible indicator that such a video mode & game is not flipping completed frames on demand, i.e. it is using a traditional pipeline (double-buffered or not). I do not imply that tearing is an obligatory side effect, just that the fact that it can happen shows that the "modern" GPU pipeline is really not that dissimilar from one from the 80s.
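
For what it's worth, here is a minimal sketch of that "write at inconvenient times" experiment, assuming a 16-bit DOS compiler with Borland-style dos.h (int86, MK_FP, and far pointers are assumptions about the toolchain):

#include <dos.h>

int main(void)
{
    /* A000:0000 is the live mode 13h framebuffer - scanout reads the
       same memory we write, there is no hidden copy. */
    unsigned char far *vram = (unsigned char far *)MK_FP(0xA000, 0);
    union REGS r;
    unsigned frame, i;

    r.x.ax = 0x0013;            /* enter 320x200x256 (mode 13h) */
    int86(0x10, &r, &r);

    /* Repaint as fast as possible, deliberately ignoring vertical
       retrace (pollable via bit 3 of port 0x3DA): mid-scanout writes
       visibly change the frame currently being drawn. */
    for (frame = 0; frame < 1000; frame++)
        for (i = 0; i < 64000u; i++)
            vram[i] = (unsigned char)(frame + (i >> 6));

    r.x.ax = 0x0003;            /* back to text mode */
    int86(0x10, &r, &r);
    return 0;
}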

SScorpio wrote on 2025-05-30, 11:45:

In a perfect configuration, the top of the screen being drawn would be at least 16.6667 ms behind native hardware and FPGA. You can make a program with latency so low that a human can never see it, but it's still not accurate, and the previously mentioned light guns will not work.

But this would be because of the monitor (and scanout) latency, which no FPGA is going to avoid, rather than the GPU latency. There are working light guns for CRT VGA-based IBM PCs, after all. (And I think it would be an interesting experiment to see if they can still be made to work on a "modern" GPU with native VGA output.)

I.e. my bet is that a "modern" PC is not forcing you to double-buffer any more than an FPGA would need to, depending on the output.

Reply 92 of 92, by SScorpio

Rank: Oldbie
javispedro1 wrote on 2025-05-30, 12:07:

The only thing a lack of framebuffer does is that it _forces_ you to race the beam as the only viable strategy; but even with a framebuffer, you can most definitely race the beam if you want to -- just write to video memory at 'inconvenient' times. I'm quite sure there is some PC demo doing it somewhere. It's not that the VGA has a secret hidden memory where it keeps a copy of the framebuffer that you cannot modify.
I mention tearing because it is a rather visible indicator that such a video mode & game is not flipping completed frames on demand, i.e. it is using a traditional pipeline (double-buffered or not). I do not imply that tearing is an obligatory side effect, just that the fact that it can happen shows that the "modern" GPU pipeline is really not that dissimilar from one from the 80s.

You could do that, but emulating a system and trying to stay ahead of the beam doesn't seem possible to me. You are running GHz faster, but latency is the killer. Modern memory is optimized for bandwidth, while access times are the important part here.

The DE10-Nano the MiSTer runs on is an SoC that includes DDR3 memory onboard. However, there's a required SDRAM add-on connected to the GPIO header, because the access latency of the DDR3 is too high - you get graphical corruption where memory reads don't complete in time.
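
A back-of-the-envelope illustration of why that latency matters - the numbers below are assumptions picked for illustration, not DE10-Nano measurements:

#include <stdio.h>

int main(void)
{
    double dot_clock_hz      = 6.0e6;  /* classic-arcade pixel clock (assumed)          */
    double ddr3_worstcase_ns = 200.0;  /* random access via SoC interconnect (assumed)  */
    double sdram_fixed_ns    = 70.0;   /* deterministic GPIO SDRAM access (assumed)     */

    double budget_ns = 1e9 / dot_clock_hz;   /* time available per pixel */

    printf("per-pixel budget : %.0f ns\n", budget_ns);          /* ~167 ns */
    printf("DDR3 worst case  : %.0f ns -> can miss the beam\n", ddr3_worstcase_ns);
    printf("GPIO SDRAM       : %.0f ns -> always answers in time\n", sdram_fixed_ns);
    return 0;
}

The point isn't the exact figures, but that the DDR3 number varies from access to access while the SDRAM one does not - and a beam-synchronous core needs the fixed answer.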

You can brute-force a ton of things with the raw performance of modern hardware. But there are reasons FPGAs and ASICs exist and are still used in modern systems. Sometimes you need a circuit that guarantees timings, and then pushes to and pulls from buffers for the rest.