VOGONS


MartyPC


Reply 80 of 173, by GloriousCow

Rank: Member
VileR wrote on 2023-11-19, 20:16:

Ah, nice! Yeah, that looks pretty good for such a simple setup. I honestly think some of the features in that script of mine turned out to be overkill (rendering shadow masks at realistic sizes still sucks, for one - typical target resolutions make it look extremely artifacted). Would be nice to have some extra eye-candy, but 'nice to have' is a pretty low priority after all.

Eventually I'd like to implement librashader https://snowflakepowe.red/blog/introducing-li … ader-2023-01-14 which is a nice Rust-based solution that would give MartyPC access to all those tasty Libretro shaders. Those will likely run in a full window. The nice thing about MartyPC's built-in shader is that it's applied by the scaler, so we can have those shader effects at configurable zoom levels and things.

VileR wrote on 2023-11-19, 20:16:

I thought I saw a bit of vertical blurring which didn't seem to coincide with the darkening between scanlines. But on second thought, it may be enough to simply make that darkening stronger... 200-line CRTs typically show fully-separated scanlines, so that should increase the 'realism' factor as well.

I think I see it now. I've moved aspect correction into the shader and it looks a bit better. Resizing with a shader is still a bit tricky; ideally we would resize in two passes - first to the nearest integer scale with nearest-neighbor, and then a final resize with linear filtering. This would reduce the amount of blur. I am still scan-doubling 200-line modes in software, though, which does help reduce the blur a bit.
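
To make the two-pass idea concrete, here's a minimal sketch of the size calculation (hypothetical helper, not MartyPC's actual scaler code):

```rust
// Two-pass resize: pass 1 scales to the largest integer multiple of the source
// that fits the destination (nearest-neighbor), pass 2 covers the remaining
// fractional stretch with linear filtering.
fn two_pass_sizes(src_w: u32, src_h: u32, dst_w: u32, dst_h: u32) -> ((u32, u32), (u32, u32)) {
    // Largest integer factor that still fits inside the destination (at least 1).
    let scale = (dst_w / src_w).min(dst_h / src_h).max(1);
    let pass1 = (src_w * scale, src_h * scale); // nearest-neighbor to here
    let pass2 = (dst_w, dst_h);                 // linear filter for the rest
    (pass1, pass2)
}

fn main() {
    // e.g. 640x480 (scan-doubled, aspect-corrected CGA) into a 1600x1200 window
    let (p1, p2) = two_pass_sizes(640, 480, 1600, 1200);
    println!("pass 1: {:?}, pass 2: {:?}", p1, p2); // pass 1: (1280, 960), pass 2: (1600, 1200)
}
```

The nearest-neighbor pass keeps pixel edges crisp up to the integer multiple, and the short linear pass only has to cover the remaining fractional stretch.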

VileR wrote on 2023-11-19, 20:16:

My bad, bungled choice of words there... yeah, dumping works just fine; I meant looking at VRAM in the memory viewer.

Currently in 0.2 you will see VRAM in the memory viewer, but just the first plane. I need to figure out a better interface for choosing what you are going to see.

VileR wrote on 2023-11-19, 20:16:

Perhaps there's such a thing as a generic debugging front-end which could somehow be integrated? Admittedly such things often make for bigger headaches than rolling your own from scratch, but who knows... it's also true that Lua scripting engines are everywhere, so maybe that would be more straightforward than I think it is.

One thing that is commonly done is to implement a 'GDB stub', which allows any GDB client to connect to your emulator. VirtualXT does this. That's handy, but unfortunately GDB doesn't support the 8088's segmented memory model, so setting breakpoints is not a lot of fun when you're breaking out the hex calculator every time to calculate a flat address. Despite that, it might still be worth doing.
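
For reference, the flat-address arithmetic you end up doing by hand is just the standard real-mode segment:offset calculation; a minimal sketch:

```rust
// 8088 real-mode address translation: flat = segment * 16 + offset.
// On an 8088 the result wraps within the 20-bit (1 MB) address space.
fn flat_address(segment: u16, offset: u16) -> u32 {
    (((segment as u32) << 4) + offset as u32) & 0xF_FFFF
}

fn main() {
    // A breakpoint at F000:E05B (the classic IBM BIOS POST entry point) becomes:
    println!("{:05X}", flat_address(0xF000, 0xE05B)); // prints FE05B
}
```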

VileR wrote on 2023-11-19, 20:16:

Oh yeah, you mentioned that 'mysterious console message' in the other thread about the half-height vertical blanking period. 😀 (So was the fix related to the scanline-counting thing, after all?)

Yeah, there were several fixes, to both CGA and CPU emulation, and that was one of them. I wasn't letting a counter overflow when it needed to. On the CPU side I needed better HALT and resume-from-halt timings, and the bus sniffer traces helped me find a stupid bug where I wasn't ticking devices during interrupts.

I've decided to eventually treat a hardware interrupt like a pseudo-instruction; from a microcode perspective, it pretty much is one. Rather than tack the cycles from an interrupt onto an existing instruction, eventually you'll see an "intr" mnemonic appear in the instruction history with those cycles.

Then there's an interesting period where PIT channel timer #1 is set to '1'. I believe reenigne intended to quickly refresh all of DRAM before setting the counter to 19, but it seems that a value of 1 is too fast and actually disables DRAM refresh entirely - luckily it's not off long enough to cause problems, but it throws your timing off if you don't emulate that too. I think that's actually what kills Area5150 on the Book8088.
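
Just to illustrate the 'intr' idea, a hypothetical sketch of what such an instruction-history entry could look like (made-up types, not MartyPC's actual data structures):

```rust
// Hypothetical history entry: a hardware interrupt gets its own record with its
// own cycle count instead of inflating the cycle count of the previous instruction.
enum HistoryEntry {
    Instruction { cs: u16, ip: u16, mnemonic: String, cycles: u32 },
    HardwareInterrupt { vector: u8, cycles: u32 }, // displayed as "intr" in the history view
}

fn format_entry(e: &HistoryEntry) -> String {
    match e {
        HistoryEntry::Instruction { cs, ip, mnemonic, cycles } => {
            format!("{:04X}:{:04X} {:<8} [{} cycles]", cs, ip, mnemonic, cycles)
        }
        HistoryEntry::HardwareInterrupt { vector, cycles } => {
            format!("          intr {:02X}   [{} cycles]", vector, cycles)
        }
    }
}

fn main() {
    let history = vec![
        HistoryEntry::Instruction { cs: 0xF000, ip: 0xE05B, mnemonic: "cli".into(), cycles: 2 },
        HistoryEntry::HardwareInterrupt { vector: 0x08, cycles: 61 }, // cycle count illustrative
    ];
    for entry in &history {
        println!("{}", format_entry(entry));
    }
}
```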

I was writing up a blog article on the process of debugging all that, but it got a little longwinded and cluttered with images, I don't know if I can make it an easier read or if I'll just scrap it...

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 81 of 173, by jal

Rank: Oldbie

Maybe I'm overlooking some setting, but MartyPC starts up huge (taking almost the full height of my screen). Is there any way to start it smaller? (I know I can resize it, but I'd rather it started smaller directly.)

Reply 82 of 173, by Scali

Rank: l33t
GloriousCow wrote on 2023-11-19, 21:00:

Then there's an interesting period where PIT channel timer #1 is set to '1'. I believe reenigne intended to quickly refresh all of DRAM before setting the counter to 19, but it seems that a value of 1 is too fast and actually disables DRAM refresh entirely

Actually, I believe the reason for setting the counter to 1 is to be able to trigger a new counter value 'immediately'.
That is, the counter value is latched.
So when you write a new counter value without sending the command to reprogram the PIT first, the PIT will continue its current countdown to 0, and will load the new value at that point.
If you set the counter to 1, the PIT will effectively pick up the new counter value 'immediately', since the current count reaches 0 almost instantly.
So when you want to synchronize the counter in a cycle-exact way, this is a nice trick that has predictable timing, with just a single IO write that you have to time correctly.
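
An emulation-side sketch of that latching behaviour (heavily simplified rate-generator semantics, not MartyPC's or a real 8253's full implementation):

```rust
// Simplified PIT channel: a count written while counting is latched and only
// takes effect when the current count expires, which is what makes a reload
// value of 1 a 'write once, takes effect immediately' trick.
struct PitChannel {
    count: u16,                  // current down-counter
    reload: u16,                 // active reload value
    pending_reload: Option<u16>, // newly written count, waiting for expiry
}

impl PitChannel {
    fn write_count(&mut self, value: u16) {
        self.pending_reload = Some(value); // latched, not applied yet
    }

    // Returns true on the tick where the counter expires (output pulse / DREQ).
    fn tick(&mut self) -> bool {
        if self.count <= 1 {
            self.reload = self.pending_reload.take().unwrap_or(self.reload);
            self.count = self.reload;
            true
        } else {
            self.count -= 1;
            false
        }
    }
}

fn main() {
    let mut ch1 = PitChannel { count: 1, reload: 1, pending_reload: None };
    ch1.write_count(19); // new refresh period, as in the lockstep setup code
    ch1.tick();          // the count of 1 expires on the very next tick...
    assert_eq!(ch1.count, 19); // ...and the new value is loaded right away
}
```

With the reload at 1 the channel expires on every tick, so whatever you write next is picked up after at most one PIT tick - which is exactly the predictable timing the trick relies on.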

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 83 of 173, by GloriousCow

Rank: Member
Scali wrote on 2023-11-20, 13:58:
GloriousCow wrote on 2023-11-19, 21:00:

Then there's an interesting period where PIT channel timer #1 is set to '1'. I believe reenigne intended to quickly refresh all of DRAM before setting the counter to 19, but it seems that a value of 1 is too fast and actually disables DRAM refresh entirely

Actually, I believe the reason for setting the counter to 1 is to be able to trigger a new counter value 'immediately'.

I've seen that trick used for timer #0 - it's done in Tetra3D - and it has some interesting side effects, since the reload is in LSBMSB mode. The short reload interval can cause a reload to occur between the LSB and MSB writes, and having just the LSB set would lead to an invalid reload value. What actually occurs in hardware is behavior I hadn't seen documented: the timer reloads again on the MSB write, so the reload value gets corrected. But whereas a fast timer #0 is harmless with interrupts off, a fast timer #1 will be triggering the motherboard's DMA logic.

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 84 of 173, by Scali

Rank: l33t

I assume the code is in the initialization of an effect that requires the system to be in 'lockstep'. In which case the DRAM refresh has to be synchronized on specific positions on each scanline. A frame is 19912 PIT ticks, and that is 262 scanlines, so you have 19912/262 = 76 PIT ticks per scanline. The default DRAM refresh of 18 does not fit properly to a scanline, but 76 is a nice multiple of 19.
But just having exactly 4 DRAM refreshes per scanline still doesn't allow you to run cycle-exact code, because you don't know exactly where they occur yet.
So you also want to start the refreshes at a fixed point relative to the start of a scanline.
I suppose that is what this code does, in which case it would be a 'happy accident' if setting the counter to 1 disables DRAM refresh, because otherwise the bus would get hammered and your CPU would be starved of the bus cycles it needs to fetch instructions and execute code.
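
Spelling out the arithmetic above as a quick sanity check:

```rust
fn main() {
    // One 262-line CGA frame is 19912 PIT ticks (the PIT and the CGA derive
    // from the same 14.318 MHz base clock, so these ratios are exact).
    let pit_ticks_per_frame = 19912u32;
    let scanlines_per_frame = 262u32;
    let ticks_per_scanline = pit_ticks_per_frame / scanlines_per_frame;

    assert_eq!(ticks_per_scanline, 76);
    assert_eq!(ticks_per_scanline % 19, 0); // a refresh period of 19 fits exactly...
    assert_eq!(ticks_per_scanline / 19, 4); // ...giving 4 DRAM refreshes per scanline
    assert_ne!(ticks_per_scanline % 18, 0); // the default period of 18 does not fit

    println!("{} PIT ticks per scanline", ticks_per_scanline);
}
```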

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 85 of 173, by reenigne

Rank: Oldbie
Scali wrote on 2023-11-20, 13:58:

Actually, I believe the reason for setting the counter to 1 is to be able to trigger a new counter value 'immediately'.

Yes, exactly - this is just how I turn DRAM refresh off. Timer 1 is also set to mode 0 at the same time. The fast refresh rate to ensure all the DRAM is refreshed before and after a "refresh off" period is done by setting timer 1 to mode 2 period 2. I'm pretty sure I checked that that actually does the refreshing I expect it to! It's certainly possible that mode 0 period 1 has unwanted side-effects on other machines that don't have exactly the same refresh circuitry as the 5150/5155/5160, though. The code for this is at https://github.com/reenigne/reenigne/blob/mas … common.asm#L342 .

Reply 86 of 173, by GloriousCow

Rank: Member
Scali wrote on 2023-11-20, 19:06:

I assume the code is in the initialization of an effect that requires the system to be in 'lockstep'. In which case the DRAM refresh has to be synchronized on specific positions on each scanline. A frame is 19912 PIT ticks, and that is 262 scanlines, so you have 19912/262 = 76 PIT ticks per scanline. The default DRAM refresh of 18 does not fit properly to a scanline, but 76 is a nice multiple of 19.

I'm familiar with this technique. MartyPC's DMA scheduler is updated with the state of timer channel #1 on any change - Kefrens, Wibble and Lake wouldn't work without it.

Scali wrote on 2023-11-20, 19:06:

But just having exactly 4 DRAM refreshes per scanline still doesn't allow you to run cycle-exact code, because you don't know exactly where they occur yet.

There appears to be a little bit of wiggle room, at least. MartyPC 0.1.3's DMA logic is actually off by one cycle - the effects still run, but it causes the 8088MPH CPU test to fail.

reenigne wrote on 2023-11-21, 08:51:

Yes, exactly - this is just how I turn DRAM refresh off. Timer 1 is also set to mode 0 at the same time. The fast refresh rate to ensure all the DRAM is refreshed before and after a "refresh off" period is done by setting timer 1 to mode 2 period 2. I'm pretty sure I checked that that actually does the refreshing I expect it to! It's certainly possible that mode 0 period 1 has unwanted side-effects on other machines that don't have exactly the same refresh circuitry as the 5150/5155/5160, though. The code for this is at https://github.com/reenigne/reenigne/blob/mas … common.asm#L342 .

I guess I'm curious then: if you just want to stop DRAM refresh, why not set a mode and let it sit there waiting for a reload?

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 87 of 173, by Scali

Rank: l33t
GloriousCow wrote on 2023-11-21, 15:00:

There appears to be a little bit of wiggle room, at least. MartyPC 0.1.3's DMA logic is actually off by one cycle - the effects still run, but it causes the 8088MPH CPU test to fail.

Yes, there is a bit of 'self-correction', so to speak.
That is, technically the only time-critical operations are writes to the CRTC. Since it is clocked at 1/4th of the CPU clock, the target isn't really a specific CPU cycle, but rather a specific IO cycle for the CRTC.
So, if your CPU happens to arrive a cycle early, it will just have to wait a cycle longer before the CRTC can process the next IO write.
What happens between these time-critical IO writes doesn't matter down to the last cycle, as long as the writes remain in sync with the CRTC. The code in between is just an abstract block, a black box. The machine state does not have any externally observable effects there, so it doesn't matter if it is cycle-exact. It just has to be 'close enough'.

I actually used that 'self-correcting' or 'self-calibrating' property with a PC speaker routine some time ago... I wanted to write PDM data at the highest possible rate. It appears that you can only write to the 8255 every other cycle.
So while the 8255 runs at 1.19 MHz on paper, you can't update it at 1.19 MHz. It only does 595 kHz max. But it does that consistently if you just bang the IO port with a fast enough CPU.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 88 of 173, by reenigne

Rank: Oldbie
GloriousCow wrote on 2023-11-21, 15:00:

I guess I'm curious then: if you just want to stop DRAM refresh, why not set a mode and let it sit there waiting for a reload?

Yes, I think that would work too. I guess when writing this code it either just didn't occur to me to do a partial initialisation, or I thought there was a possibility of some circumstance where some more refreshes might occur after the mode set and interfere with the lockstep code. It's a bit belt-and-braces because at the time I didn't really have a good way to test that it was working properly, so I tried to account for any possible bug I could think of without determining if it could actually happen or not.

Reply 89 of 173, by GloriousCow

Rank: Member
jal wrote on 2023-11-20, 13:40:

Maybe I'm overlooking some setting, but MartyPC starts up huge (taking almost the full height of my screen). Is there any way to start it smaller? (I know I can resize it, but I'd rather it started smaller directly.)

Not at this time; but I can add an option for startup window scaling. It will try to double the emulated video card's resolution if it will fit on your monitor. This was convenient for me since I always have a bunch of debug windows open and need the space; I recognize it is less convenient for just running the emulator.

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 90 of 173, by jal

Rank: Oldbie
GloriousCow wrote on 2023-11-21, 22:43:

Not at this time; but I can add an option for startup window scaling. It will try to double the emulated video card's resolution if it will fit on your monitor. This was convenient for me since I always have a bunch of debug windows open and need the space; I recognize it is less convenient for just running the emulator.

Yeah, I can see that 😀. With regards to the debug windows, it would be really great if they could be moved outside the main window, as they're now overlapping what's on screen.

JAL

Reply 91 of 173, by GloriousCow

Rank: Member
jal wrote on 2023-11-22, 08:35:
GloriousCow wrote on 2023-11-21, 22:43:

Not at this time; but I can add an option for startup window scaling. It will try to double the emulated video card's resolution if it will fit on your monitor. This was convenient for me since I always have a bunch of debug windows open and need the space; I recognize it is less convenient for just running the emulator.

Yeah, I can see that 😀. With regards to the debug windows, it would be really great if they could be moved outside the main window, as they're now overlapping what's on screen.

The gui windows are rendered to a texture, so there's no easy way to put them 'outside' of a window, alas. The pro of an immediate mode gui is that it's super easy to update state every frame for live debug displays; the cons are almost everything else 😁

On the other hand, I am working right now on multiple window support. So you can have the GUI 'debug workspace' in one window, and the video adapter output in another window. This will also make multiple monitors possible. In conjunction with a new machine configuration system, you should be able to do things like define a fantasy computer with two CGA cards mapped at different IO and memory locations. What you'd do with that, I have no idea. But I'm kind of curious what people might come up with if given total control over machine organization. I envision someone being able to design a custom, 8088-based single board computer in MartyPC just by defining the machine configuration, testing and developing the BIOS for it, before even ordering a single PCB.

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 92 of 173, by jal

Rank: Oldbie

Multiple window support would be really great in combination with MDA/Hercules emulation, as many developers had both MDA and CGA back in the day, for easy debugging on one monitor (MDA) with the output on the other (CGA). (Of course, with a built-in debugger that's less of a concern, but hard-core Turbo Debugger debugging ftw! 😁)

Reply 93 of 173, by VileR

Rank: l33t
GloriousCow wrote on 2023-11-19, 21:00:

Eventually I'd like to implement librashader https://snowflakepowe.red/blog/introducing-li … ader-2023-01-14 which is a nice Rust-based solution that would give MartyPC access to all those tasty Libretro shaders. Those will likely run in a full window. The nice thing about MartyPC's built-in shader is that it's applied by the scaler, so we can have those shader effects at configurable zoom levels and things.

I recently came across ShaderGlass, which lets you apply libretro shaders to arbitrary portions of your desktop. But those screenshots over there illustrate the biggest issue with that kind of setup: resolution and aspect are obviously wrong in most of them, and the headache of getting it anywhere near the ballpark is passed on to the user. Putting the scaler in control of shader settings should definitely be preferable.

GloriousCow wrote on 2023-11-19, 21:00:

I think I see it now. I've moved aspect correction into the shader and it looks a bit better. Resizing with a shader is still a bit tricky; ideally we would resize in two passes - first to the nearest integer scale with nearest-neighbor, and then a final resize with linear filtering. This would reduce the amount of blur. I am still scan-doubling 200-line modes in software, though, which does help reduce the blur a bit.

The existing/shader-less rendering would benefit from this idea as well - with aspect correction enabled, when the window is large enough to make x2 scaling kick in, I've noticed that the vertical filtering artifacts also get scaled up by 2.

GloriousCow wrote on 2023-11-19, 21:00:

One thing that is commonly done is to implement a 'GDB stub', which allows any GDB client to connect to your emulator. VirtualXT does this. That's handy, but unfortunately GDB doesn't support the 8088's segmented memory model, so setting breakpoints is not a lot of fun when you're breaking out the hex calculator every time to calculate a flat address. Despite that, it might still be worth doing.

Ah, yeah, doesn't sound too optimal...

I was thinking some more about how the debugging interface itself could be made a bit more convenient, and recalled this 'DebugBox' thing that someone made a while ago for DOSBox (or started to make, anyway): https://user-images.githubusercontent.com/150 … 61e80df71c0.png. This already looks pretty similar to the MartyPC debug setup, except that the emulated video output comes in its own 'widget' and can be moved/resized. Something like that would add lots of flexibility... and this also has me wondering about making the (other) widgets scalable/resizable, maybe allowing for custom fonts/sizes for the text.
Does any of that sound doable? 😀

GloriousCow wrote on 2023-11-19, 21:00:

I was writing up a blog article on the process of debugging all that, but it got a little longwinded and cluttered with images, I don't know if I can make it an easier read or if I'll just scrap it...

"Long winded and cluttered with images" has never stopped *me* before, so I'd say go for it!

[ WEB ] - [ BLOG ] - [ TUBE ] - [ CODE ]

Reply 94 of 173, by GloriousCow

Rank: Member
VileR wrote on 2023-11-28, 14:47:

I recently came across ShaderGlass, which lets you apply libretro shaders to arbitrary portions of your desktop. But those screenshots over there illustrate the biggest issue with that kind of setup: resolution and aspect are obviously wrong in most of them, and the headache of getting it anywhere near the ballpark is passed on to the user. Putting the scaler in control of shader settings should definitely be preferable.

Yeah, and tight integration with the emulator gives you a lot of niceties, like your scanlines automatically adjusting between 15 kHz and 21 kHz modes on EGA... I'm also planning a simple effect that triggers if you power the emulated machine off.

VileR wrote on 2023-11-28, 14:47:
GloriousCow wrote on 2023-11-19, 21:00:

I think I see it now. I've moved aspect correction into the shader and it looks a bit better. Resizing with a shader is still a bit tricky; ideally we would resize in two passes - first to the nearest integer scale with nearest-neighbor, and then a final resize with linear filtering. This would reduce the amount of blur. I am still scan-doubling 200-line modes in software, though, which does help reduce the blur a bit.

The existing/shader-less rendering would benefit from this idea as well - with aspect correction enabled, when the window is large enough to make x2 scaling kick in, I've noticed that the vertical filtering artifacts also get scaled up by 2.

The old system has pretty much been replaced, so I'm just looking forward. Multi-pass scaling won't really be in for 0.2, as I've already gotten bogged down with too many major refactors, but maybe it can make it into 0.2.1.

VileR wrote on 2023-11-28, 14:47:

I was thinking some more about how the debugging interface itself could be made a bit more convenient, and recalled this 'DebugBox' thing that someone made a while ago for DOSBox (or started to make, anyway): https://user-images.githubusercontent.com/150 … 61e80df71c0.png. This already looks pretty similar to the MartyPC debug setup, except that the emulated video output comes in its own 'widget' and can be moved/resized. Something like that would add lots of flexibility... and this also has me wondering about making the (other) widgets scalable/resizable, maybe allowing for custom fonts/sizes for the text.
Does any of that sound doable? 😀

That's generally how most emulators that use an immediate-mode gui like egui/imgui work, and it's fairly simple to do. The first thing that happens in the rendering pipeline is that the internal RGBA buffer is turned into a texture. Instead of sending that through the scaler shader pipeline to be rendered to the screen, we can convert it to a texture handle that egui can display as a widget. Although we won't have any shader effects (at least until I figure out how to render to a texture - which I'd like to do, so we can have 'post-processed' screenshots). This is originally how I was planning to handle a secondary MDA - it was just going to be rendered to its own widget.
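
As a rough sketch of that texture-to-widget path (hypothetical names, and assuming a recent egui - the exact `Ui::image` signature has changed between versions):

```rust
use egui::{ColorImage, TextureHandle, TextureOptions};

// Hypothetical display widget that owns the egui texture for one emulated card.
struct DisplayWidget {
    texture: Option<TextureHandle>,
}

impl DisplayWidget {
    // Called once per emulated frame with the card's RGBA output buffer.
    fn update(&mut self, ctx: &egui::Context, w: usize, h: usize, rgba: &[u8]) {
        let image = ColorImage::from_rgba_unmultiplied([w, h], rgba);
        if let Some(tex) = &mut self.texture {
            tex.set(image, TextureOptions::NEAREST); // reuse the existing handle
        } else {
            self.texture = Some(ctx.load_texture("display", image, TextureOptions::NEAREST));
        }
    }

    // Called from the egui draw pass; shows the frame as an ordinary widget.
    fn show(&self, ui: &mut egui::Ui) {
        if let Some(tex) = &self.texture {
            ui.image((tex.id(), tex.size_vec2())); // newer egui; older versions take (id, size) as two arguments
        }
    }
}
```

As noted above, no shader pass is involved here - the widget just shows the raw RGBA frame.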

One optimization I was planning was to render the gui every other frame, since there's not much difference in seeing a number in a debug widget update in 16ms vs 32ms, and if you have a lot of debug windows open the draw calls start to add up. But obviously if we're putting our video output in a widget we would want 60/70Hz updates. I guess we can make this dynamic depending on whether you have an open display window or not.

Another cool thing you could do is have two widgets showing the same display but with different apertures or options - so you could see a cropped, normal CGA output in one window and the entire NTSC field with hblank/vblank in another window with a CRTC register write overlay on.

There's a lot of customization you can do in egui in terms of colors and font sizes (font faces, not so much), but the main issue is I don't want to turn the configuration file into a huge monstrosity of gui options. There's an active proposal for egui to be able to read from a 'stylesheet': https://github.com/emilk/egui/issues/3284 and I'm somewhat inclined to wait for that feature rather than try to implement extensive styling options myself.

As for resizing widgets, I'd like to make any widget resizable where it makes sense to do so - the disassembly, instruction history, and memory viewer should definitely resize to show you as much as you want to see. It's just a bit trickier: the immediate-mode flow is normally that you send the gui some data and a window expands to fit it. If you expand a window manually, we need to somehow measure how much data can fit in the new size, and have a channel to request an update to the amount of data sent going forward. Nothing impossible, it just takes more work, and that's why I didn't initially do it. A better design for the memory viewer might be to pass it a closure it can use to request a buffer of whatever size it needs, instead of sending it a fixed-size buffer of bytes - but I'm not sure that's feasible if I move the gui stuff to its own thread...
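
A rough sketch of the 'measure the window, then request that much data' idea (hypothetical names, assuming egui; not how the current memory viewer works):

```rust
// Inside the memory viewer's draw code: figure out how many rows fit in the
// space the user has resized the window to, then ask the emulator core for
// exactly that many bytes on the next frame.
fn memory_viewer_ui(ui: &mut egui::Ui, rows_requested: &mut usize, bytes: &[u8], base: u32) {
    let row_height = ui.text_style_height(&egui::TextStyle::Monospace);
    let visible_rows = (ui.available_height() / row_height).floor() as usize;

    // Send the new row count back to whatever fills `bytes` (channel, shared state, etc.).
    *rows_requested = visible_rows.max(1);

    // Draw whatever we were given this frame (16 bytes per row).
    for (i, chunk) in bytes.chunks(16).take(visible_rows).enumerate() {
        let hex: String = chunk.iter().map(|b| format!("{:02X} ", b)).collect();
        ui.monospace(format!("{:05X}: {}", base as usize + i * 16, hex));
    }
}
```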

Anyway, rambling. UI design is hard.

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 95 of 173, by GloriousCow

Rank: Member

This took a while, but the display framework I've built should be pretty flexible, and it abstracts the backend in preparation for an SDL frontend.

[Attachment: multiple_windows.png (119.96 KiB) - martypc multiple window support]

A single video card can be rendered to multiple windows with different parameters. Watch 8088mph in composite and RGBI at the same time! Go nuts!

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 97 of 173, by GloriousCow

Rank: Member
jal wrote on 2023-12-01, 23:28:

Very nice! Do you think you'll add MDA/Hercules support in the future?

MDA for sure; I just need a good strategy for drawing the 9-pixel-wide glyphs. I was pondering just hacking in a quick MDA by taking my CGA and changing the IO port/memory map... unless I'm missing some critical detail, it would basically work, just with the wrong font.
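
For what it's worth, the commonly documented rule for that 9th column is: leave it blank, except for the box-drawing range C0h-DFh, where column 8 is repeated so horizontal lines connect. A sketch (generic MDA behaviour, not MartyPC code):

```rust
// Expand one 8-pixel-wide glyph row to 9 pixels, MDA-style.
// For character codes 0xC0..=0xDF the 9th column copies column 8 (bit 0 of the
// glyph row) so box-drawing characters join up; everything else gets a blank column.
fn expand_glyph_row(glyph_row: u8, char_code: u8) -> u16 {
    let ninth = if (0xC0..=0xDF).contains(&char_code) {
        (glyph_row & 0x01) as u16 // rightmost pixel of the 8-pixel glyph
    } else {
        0
    };
    ((glyph_row as u16) << 1) | ninth
}

fn main() {
    // 0xC4 is the single horizontal line character; on its line row the glyph is
    // 0xFF, and the expansion keeps it solid across all 9 pixels.
    assert_eq!(expand_glyph_row(0xFF, 0xC4), 0x1FF);
    // A normal character ('A') gets a blank 9th column.
    assert_eq!(expand_glyph_row(0xFF, b'A'), 0x1FE);
}
```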

Hercules is of course possible but I don't know much about it and would need to do some research.

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 98 of 173, by Scali

Rank: l33t
GloriousCow wrote on 2023-12-02, 18:40:

Hercules is of course possible but I don't know much about it and would need to do some research.

Hercules is quite simple. It is basically an MDA card with 64k of memory and a hack to offer a graphics mode alongside the standard MDA functionality.
Another feature is that the second 32k of memory is disabled by default, to make it compatible with a CGA card (MDA memory being at segment B000, and CGA at B800, so the second 32k would overlap with CGA when enabled).
Its graphics mode is 720x348 monochrome, requiring 32k. The second 32k can be used as a backbuffer, and you can flip buffers at any moment.
Like MDA and CGA, it uses a stock 6845 CRTC.
The graphics mode is therefore hacked as a pseudo-text mode, where a character is 16 pixels wide and 4 scanlines high, and the 6845 is programmed to a virtual textmode of 45x87 (which indeed means that the timings in graphics mode aren't 100% the same as in textmode, but close enough).
This trick is similar to how CGA circumvents the limit of 127 rows of the 6845. CGA uses a max scanline address of 2 for the 6845 to switch between two framebuffers, one for even scanlines and one for odd scanlines.
Hercules does the same thing, but then with 4-way interleaving (so max scanline address set to 4) to fit all 348 scanlines within the 127 row limit, so you have 4 framebuffers of 8k each.
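
As a sketch, the standard Hercules pixel-to-address mapping that falls out of that interleaving looks like this (generic Hercules layout, nothing MartyPC-specific):

```rust
// Hercules graphics addressing: 720x348, 1 bpp, 90 bytes per scanline,
// 4-way interleaved into four 8 KiB banks selected by (y mod 4).
// `page` selects the first (B000:0000) or second (B800:0000) 32 KiB buffer.
fn herc_pixel_addr(x: u32, y: u32, page: u32) -> (u32, u8) {
    let offset = (y % 4) * 0x2000 + (y / 4) * 90 + x / 8;
    let mask = 0x80u8 >> (x % 8); // bit 7 is the leftmost pixel in each byte
    (0xB0000u32 + page * 0x8000 + offset, mask)
}

fn main() {
    // Pixel (0,0) of the first page lives at B000:0000, bit 7.
    assert_eq!(herc_pixel_addr(0, 0, 0), (0xB0000, 0x80));
    // Pixel (0,1) is one 8 KiB bank further on.
    assert_eq!(herc_pixel_addr(0, 1, 0), (0xB2000, 0x80));
}
```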

One interesting bit of trivia might be that the original MDA uses a 16.257 MHz crystal, while various clones, including the Hercules, use a 16.000 MHz crystal for whatever reason.
So if you're going for cycle-exactness, you may want to offer both options.

Last edited by Scali on 2023-12-02, 19:26. Edited 1 time in total.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 99 of 173, by jal

Rank: Oldbie

There's a lot of info on the PC gfx cards in this book I digitized a while ago (hence the obscure URL). It includes Hercules and its successors.

JAL