VOGONS


Reply 100 of 194, by superfury

Rank l33t++
ALEKS wrote on 2021-05-12, 17:15:
Then it appears the accelerator emulation code is still somehow faulty. Theoretically StarCraft would not be directly using the video card but would pass through the Windows abstraction layer (including the driver).

I had myself some visual pixel artifacts but in my case they were all caused by bad hardware design which I corrected in the second revision of the PCB.

In your case it might be either memory timing errors (but in that case all the screen would get corrupted, not just the font rendering) or some obscure register settings that you haven't figured out, yet.

I have not had any time to follow this thread or look in the datasheet but once I free myself from other projects I will try to also take a look.

Well, in my emulator's case (at least with Starcraft), it looks like some kind of CPU bug. I see it repeatedly faulting in the motherboard's i430fx BIOS ROM (segment F000) in Virtual 8086 mode. Those are all faults on writes to ports 43h and 40h. The only other accesses I see are to port 20h for interrupt acknowledge (which looks like the PIT timer being acknowledged by its IRQ0 handler, according to the IMR on the master PIC). I also saw a single IRQ15 in there (which was the secondary master CD-ROM drive containing the Starcraft CUE/ISO).

I do know it's feeding the accelerator incorrect data through the memory aperture most of the time (validated by forcing the A-character from the CGA 8x8 font ROM as a replacement for the mix map byte, based on the x/y accelerator coordinates). And sometimes it's actually at least partially correct.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 101 of 194, by superfury

Hmmm.... Looked at what it was doing and saw it kept polling the CD-ROM drives every now and then.
Then just inserted the Starcraft CUE/ISO into both drives and started it again.

It seems to be loading now?

[Attachment: 1479-Starcraft loading from both CD-ROM drives.png (Starcraft loading from both CD-ROM drives)]

The "Loading" text is appearing and disappearing over and over again.

Perhaps there's some weird kind of CD-ROM driver bug in there? The MS-DOS OAKCDROM.SYS driver also seems to behave strangely on the machine (somehow only detecting the secondary slave CD-ROM and not the secondary master CD-ROM)?

Edit: It seems to run fine, at least the main menu so far (it's slow though).

Reply 102 of 194, by ALEKS

Rank Newbie

With real hardware, on a Pentium 233 MHz class CPU with the ET4000/W32i card, StarCraft should be near real-time in terms of performance.

TX486DLC / 40 MHz | 32 Mb RAM | 16-bit ISA Backplane | Tseng Labs ET4000/W32i 2 Mb | I/O Interface | Audio Interface | PC Speaker Driver | Signal View Interface
3.5" & 5.25" FDD | 4 x 512 Mb CF | HP 82341D Interface | Intel EtherExpress 16

Reply 103 of 194, by superfury

ALEKS wrote on 2021-05-13, 16:42:

With real hardware, on a Pentium 233 MHz class CPU with the ET4000/W32i card, StarCraft should be near real-time in terms of performance.

Well, the emulated CPU is running at ~20% of realtime speed at 3 MIPS at the moment (that's just the interpreting CPU emulator).
The hardware speed is relative to it: each CPU tick advances the hardware by the elapsed emulated time (in nanoseconds, as a 64-bit floating point value). So all oscillators stay in step with each other (e.g. the VGA at its proper 25/28MHz; the Tseng has higher clock speeds though (see Dosbox's Tseng speeds, with the VGA ones at a proper 50MHz)).
The main bottleneck is the CPU and memory emulation itself, which is already quite optimized for word/doubleword accesses.
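
To make the timing scheme a bit more concrete, here is a minimal sketch of hardware being ticked from CPU time. It is not UniPCemu's actual code: the names `Oscillator` and `run_cpu` are made up for illustration, and treating every instruction as taking the same amount of time is a simplifying assumption.

```python
class Oscillator:
    """A device clock that is advanced by elapsed emulated time (hypothetical helper)."""

    def __init__(self, hz):
        self.ns_per_tick = 1e9 / hz   # length of one clock period in nanoseconds
        self.pending_ns = 0.0         # emulated time not yet turned into ticks
        self.ticks = 0

    def advance(self, elapsed_ns):
        # Accumulate the elapsed time and convert whole periods into ticks.
        self.pending_ns += elapsed_ns
        whole = int(self.pending_ns // self.ns_per_tick)
        self.pending_ns -= whole * self.ns_per_tick
        self.ticks += whole


def run_cpu(instructions, mips, devices):
    """Advance every device by the time each emulated instruction takes.

    Assumption: a fixed instruction time derived from the MIPS rating.
    """
    ns_per_instruction = 1e9 / (mips * 1e6)
    for _ in range(instructions):
        for device in devices:
            device.advance(ns_per_instruction)


# Usage: one emulated millisecond at 3 MIPS, with the two standard VGA dot clocks.
vga_25 = Oscillator(25_175_000)
vga_28 = Oscillator(28_322_000)
run_cpu(3000, 3, [vga_25, vga_28])
```

Because every device is driven by the same emulated time, the clocks keep their ratio to each other no matter how fast or slow the host runs the CPU core.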

Edit: It seems to run fine with the disc inserted into both CD-ROM drives? Perhaps some weird kind of master/slave drive-related bug?
I did notice MS-DOS OAKCDROM.SYS suddenly only detecting the secondary slave and not its master somehow?

Reply 104 of 194, by superfury

Just found out something interesting: Windows 95 "C" seems to set the virtual bus size setting to a value of 2 while rendering the text?
Otherwise, it seems to be just a normal transaction (the Y offset seems to be set to 0x500)? 0x500 is the effective row size increase in linear VRAM for each transferred scanline.
Edit: The Virtual Bus Size register and the way it changes input handling aren't implemented in my emulator at all... Perhaps that's the main issue here?

Edit: Implemented the virtual bus size and its effect on the remainder of the current accelerator line. This follows the behaviour of cpu_input in the 86box emulator, which seems to imply that the remainder of the queued virtual bus size bytes is discarded when the accelerator's X position register overflows. My emulator wasn't performing that behaviour yet.

Although I messed up a bit when rebooting Windows 95 with it: I forgot to allocate the required FIFO buffer for the virtual bus queue, causing it to write enqueued data to a non-existent buffer and never see it fill up enough to start emptying it 😖
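
A minimal sketch of the buffering idea described above, under my own assumptions: CPU writes are gathered until a full virtual-bus group is present, the group is then fed to the accelerator byte by byte, and whatever is left of the group is dropped once the X position wraps. The class and field names are hypothetical and this is neither UniPCemu's nor 86box's real code.

```python
class VirtualBusQueue:
    """Toy model of the virtual bus size buffering (illustrative only)."""

    def __init__(self, bus_bytes, line_bytes):
        self.bus_bytes = bus_bytes    # bytes gathered before the accelerator runs
        self.line_bytes = line_bytes  # bytes the accelerator consumes per line
        self.pending = []             # the FIFO; it has to exist before use
        self.x = 0                    # position within the current line
        self.consumed = []            # bytes the accelerator actually processed

    def cpu_write(self, value):
        self.pending.append(value & 0xFF)
        if len(self.pending) < self.bus_bytes:
            return                    # group not complete yet, nothing happens
        group, self.pending = self.pending, []
        for byte in group:
            self.consumed.append(byte)
            self.x += 1
            if self.x >= self.line_bytes:
                self.x = 0            # X overflowed: start the next line
                break                 # and discard the rest of this group


# Usage: a group size of 2 with one data byte per line.
queue = VirtualBusQueue(bus_bytes=2, line_bytes=1)
for value in (0x18, 0x00, 0x3C, 0x00):
    queue.cpu_write(value)
```

With these settings only the first byte of each pair reaches the accelerator; the padding byte just completes the group.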

Edit: Hmmm.... Having implemented the virtual bus size buffering, the display gets even more corrupted?
Edit: The start menu seems to hang some time, waiting for more input?

Reply 105 of 194, by superfury

OK. So the display gets even more corrupted by implementing the second buffer that's invisible to the CPU directly in the hardware (the virtual bus size, which is also implemented as a kind of buffer in the 86box emulator).

Perhaps the emulation is going fine, but there's some kind of processing error somewhere? Like bit truncation or something like that?

Edit: Currently trying to whip up a simple Turbo Pascal 7.0 program to verify correct execution on UniPCemu's emulated hardware.
Edit: While running the simple program to test font rendering with a virtual bus size of 16 pixels in mix map mode, I found a bug: the virtual bus size was being applied from the uninitialized variable it was supposed to set. It calculated the low 2 bits of the virtual bus size register, but instead of using them to index a proper lookup table, it indexed with the uninitialized variable (which is always 0 when starting)!
Edit: And another bug found: filling up the virtual bus queue caused the detection logic to think the transfer was terminating or finishing (since it was never put in the active state, which it expects for running transfers). So with another bugfix it no longer terminates the transfer while the virtual bus queue is still being filled (except for an explicit terminate operation, which is handled like before).
Edit: Next, a little error in my testing code: the Reload Control Register should be 0x00 instead of 0x03 (to facilitate proper loading of the pattern/source address internal registers).
Edit: Then, the classic mistake: forgetting to subtract 1 for the destination Y offset register (640 instead of its proper value of 639).

[Attachment: 1489-UniPCemu ACLtest result.png (The ACLtest program result on UniPCemu.)]

It now works properly (at least with my simple Turbo Pascal 7.0 compiled testing program)! 😁
At least my test code runs without issues, although it just takes the A-character from the CGA 8x8 font (from Dosbox) and renders it using the accelerator through 16 8-bit writes. These come in sets of 2, where the first byte is the font data and the second is discarded because the virtual bus size is set to 2. So it writes each byte of the ROM font to the accelerator aperture, followed by a 0x00 byte to make it perform the transfer (which is discarded).
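
For illustration, the write sequence described above could be generated like this. This is a Python sketch rather than the Turbo Pascal test itself; the glyph bytes are an illustrative 8x8 'A' (I'm assuming they match the ROM font, and that the most significant bit is the leftmost pixel), and `aperture_writes` and `render_rows` are made-up helper names.

```python
# Illustrative 8x8 glyph for 'A', one byte per row (assumed, not read from a ROM).
GLYPH_A = [0x30, 0x78, 0xCC, 0xCC, 0xFC, 0xCC, 0xCC, 0x00]


def aperture_writes(glyph, padding=0x00):
    """Return the byte sequence to write to the accelerator aperture.

    Each font row is followed by a padding byte that only completes the
    2-byte virtual bus group and is then discarded.
    """
    writes = []
    for row in glyph:
        writes.append(row)
        writes.append(padding)
    return writes


def render_rows(glyph):
    """Render the glyph as text, assuming the MSB is the leftmost pixel."""
    return ["".join("#" if row & (0x80 >> bit) else "." for bit in range(8))
            for row in glyph]


# Usage: 8 rows become 16 aperture writes.
writes = aperture_writes(GLYPH_A)
for line in render_rows(GLYPH_A):
    print(line)
```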

Reply 106 of 194, by ALEKS

That sounds like good news!

Can you please share the Turbo Pascal code? I was just thinking: would it make sense to write a small TSR that redirects all DOS print character or print string functions through the accelerator? I think a simple hook on the required interrupt vector 21h would be sufficient.

For sure the card itself is very fast in DOS, but why not? It would be a premiere: accelerated 80 x 25 DOS text-mode. I am curious how fast would it render software using TurboVision (Dos Navigator for instance) or other text-mode interfaces.

Reply 107 of 194, by root42

Rank l33t
ALEKS wrote on 2021-05-15, 06:20:

That sounds like good news!

Can you please share the Turbo Pascal code? I was just thinking: would it make sense to write a small TSR that redirects all DOS print character or print string functions through the accelerator? I think a simple hook on the required interrupt vector 21h would be sufficient.

For sure the card itself is very fast in DOS, but why not? It would be a premiere: accelerated 80 x 25 DOS text-mode. I am curious how fast would it render software using TurboVision (Dos Navigator for instance) or other text-mode interfaces.

Those programs will poke into the VGA memory for sure and avoid calling DOS or BIOS printing services. I doubt it would make any difference. But trying it would be fun nevertheless!

YouTube and Bonus
80486DX@33 MHz, 16 MiB RAM, Tseng ET4000 1 MiB, SnarkBarker & GUSar Lite, PC MIDI Card+X2+SC55+MT32, OSSC

Reply 108 of 194, by ALEKS

You are absolutely right! I got carried away by enthusiasm and forgot for a second that the TurboVision-based programs of back then wrote to the VGA RAM directly. And I think the other DOS shells with proprietary video output implementations, such as Norton Commander or Volkov Commander, used the same technique.

I even used this technique myself in my DOS text-mode UI framework called VersaVision. I have used a streamlined (and updated) version of VersaVision in my AIF (Audio Interface) driver/initialization program.
My original VersaVision implementation is about 160 Kb of Pascal and assembly source code, but it is buggy and I haven't worked on it for a very long time (over 10 years, I think?).

But the interest remains in accelerating DOS commandline only text output. I am pretty sure the DIR command uses standard DOS interrupts for writing to the console output.

Reply 109 of 194, by superfury

ALEKS wrote on 2021-05-15, 06:20:

That sounds like good news!

Can you please share the Turbo Pascal code? I was just thinking: would it make sense to write a small TSR that redirects all DOS print character or print string functions through the accelerator? I think a simple hook on the required interrupt vector 21h would be sufficient.

For sure the card itself is very fast in DOS, but why not? It would be a premiere: accelerated 80 x 25 DOS text-mode. I am curious how fast would it render software using TurboVision (Dos Navigator for instance) or other text-mode interfaces.

Just confirmed that Windows 95's annoying text rendering problem is solved now, with those latest improvements and bugfixes!

[Attachment: 1493-Windows 95 finally working in accelerated mode.png (Windows 95 finally working in accelerated mode!)]

The Pascal code can be found in UniPCemu's repository, along with the other test programs I've written for UniPCemu, inside the UniPCemu/pascal folder (named ACLTEST.PAS, including file extension).

Reply 110 of 194, by ALEKS

Great job on fixing the ACL emulation.
I found TESTACL.PAS on the UniPCemu repository. I ran it on the real PC and it works great.

I will play around with it a bit trying to capture standard output text and display it in ACL mode in 640 x 480 or 720 x 400 graphics mode.

By the way, using ACLTEST with AL = 30h (which would be 800 x 600 / 256 colors) outputs weird symbols instead of the test letter A.

Cheers,
A.

Reply 111 of 194, by superfury

ALEKS wrote on 2021-05-15, 11:19:
Great job on fixing the ACL emulation.
I found TESTACL.PAS on the UniPCemu repository. I ran it on the real PC and it works great.

I will play around with it a bit trying to capture standard output text and display it in ACL mode in 640 x 480 or 720 x 400 graphics mode.

By the way, using ACLTEST with AL = 30h (which would be 800 x 600 / 256 colors) outputs weird symbols instead of the test letter A.

Cheers,
A.

Well, it has the destination Y offset hardcoded to the scanline distance for mode 2Eh (line 156). It should theoretically work if you adjust it to the proper value calculated from the (S)VGA offset register. Multiply that according to the VGA specs to obtain the distance between scanlines in VRAM addresses: it should be 4 times the 2*offset register value (see the FreeVGA documentation for a better explanation), so effectively 8*offset register, minus one.
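
As a worked example of that calculation, here is a small sketch. The function name is made up, and the offset register values used below (80, 100, 128) are assumptions chosen so that the scanline distance equals the width of an 8-bit mode; only the 639 result for the 640-wide mode is confirmed by the fix mentioned earlier in the thread.

```python
def destination_y_offset(offset_register):
    """Destination Y offset derived from the (S)VGA offset register.

    Scanline distance in VRAM addresses = 4 * (2 * offset register);
    the accelerator register takes that distance minus one.
    """
    bytes_per_scanline = 4 * (2 * offset_register)
    return bytes_per_scanline - 1


# Usage: assumed offset register values for 640, 800 and 1024 bytes per scanline.
for offset_register in (80, 100, 128):
    print(offset_register, destination_y_offset(offset_register))
```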

Edit: Just a sec, I'll modify the code to do that automatically.
Btw, mind that the code is built for 8-bit color mode, not 16-bit color mode. 16-bit color mode would need extra logic to produce 16-bit fonts from the 8-bit font used (or a double-wide hardcoded 16-bit font in its place, with adjustments to the rendering code and parameters).

Reply 112 of 194, by superfury

I've just finished a modification to the testing code.
It will now perform the 8x8 pixel rendering of the A-character in 640x480 (mode 2Eh), 800x600 (mode 30h) and finally 1024x768 (mode 38h). Like with the old single-mode 640x480 test, simply press a key to make it return to text mode and start the next test (it will do modes 2Eh, 30h and 38h, then return to MS-DOS).

Edit: About the issue with Disney's Villain's Revenge and Starcraft (still need to verify Disney though):
It seems that somehow Windows 9x does actually see that the CD-ROM is inserted into the secondary master drive, but won't run the app properly unless it's installed in the secondary slave drive? Also, MS-DOS 6.22 OAKCDROM.SYS+MSCDEX only seems to detect the secondary slave CD-ROM drive and not its master CD-ROM drive?

Edit: Windows 98 seems fixed as well!

Reply 113 of 194, by superfury

Only VDIAG seems to still fail, on the Horizontal/Vertical/Diagonal Polyline Method 1 (as it calls it). The first two end up writing some weird odd/even interlaced lines on the screen (in two different colors), and the diagonal one writes 4 different colors in diagonal stripes. That in turn makes the three squares at the bottom half of the screen fail on the rightmost one during the final tests (the combination of 3 operators, I believe), because two of those differently colored diagonal stripes affect its output. The other two seem to be unaffected because they're based on the top half of the screen, which is rendered correctly?

Inspecting the registers while the Horizontal and Diagonal Polyline Method 1 results are displayed shows it's in the non-CPU transfer mode (mode 0), at least once it has completed?
Edit: It's the "Blt (Dst and Pat) OR Scr" that's failing, because the Horizontal/Vertical/Diagonal Polyline Method 1 test left a wrong pattern at its location (two different colors: yellow (bottom left triangle/corner of the square) and red (remainder)):

[Attachment: 45-VDIAG running it's final test.png (VDIAG running its final test.)]
[Attachment: 46-Remaining VDIAG errors.png (VDIAG's final errors.)]

Reply 114 of 194, by superfury

These are the results from the Horizontal/Vertical/Diagonal Polyline Method 1:

[Attachment: 47-VDIAG Horizontal Polyline Method 1.png (VDIAG Horizontal Polyline Method 1)]
[Attachment: 48-VDIAG Vertical Polyline Method 1.png (VDIAG Vertical Polyline Method 1)]
[Attachment: 49-VDIAG Diagonal Polyline Method 1.png (VDIAG Diagonal Polyline Method 1)]

Reply 115 of 194, by superfury

A simple improvement seems to fix those bugs: changing the mode 0 handling to not empty the accelerator queue until the entire operation has reached terminal count (effectively the Y position register overflowing the programmed Y count, i.e. becoming higher than the programmed Y count to be exact).
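
Roughly, the terminal count condition can be sketched like this. It is a toy model with made-up names, not the emulator's code, and it assumes the X and Y count registers hold "count minus one" values.

```python
class Mode0Operation:
    """Toy model of a mode 0 operation that keeps its queue until terminal count."""

    def __init__(self, x_count, y_count):
        self.x_count = x_count        # assumed to be width - 1
        self.y_count = y_count        # assumed to be height - 1
        self.x = 0
        self.y = 0
        self.queue_emptied = False

    def step(self):
        """Process one unit; return True while the operation is still running."""
        if self.queue_emptied:
            return False
        self.x += 1
        if self.x > self.x_count:     # end of the current line
            self.x = 0
            self.y += 1
            if self.y > self.y_count: # Y position is now higher than the Y count
                self.queue_emptied = True   # terminal count: only now empty the queue
        return not self.queue_emptied


# Usage: a 4x2 operation keeps its queue across the end of the first line.
operation = Mode0Operation(x_count=3, y_count=1)
steps = 0
while operation.step():
    steps += 1
```

The point is that finishing a single line is not enough to release the queue; only the Y overflow is.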

Reply 116 of 194, by superfury

I've just been thinking...

The W32i documentation says that the pattern/source registers are 3-stage registers (the others being 2-stage). So the chain is queue into initial into internal.
This can be seen in its example for the suspend and resume handling (chapters 2.11.8.1 and 2.11.8.2).

So does that mean that (if the PCem-based emulators are correct) the resume operation always performs an actual 3-stage shift, and that only starting an operation by writing to the MMU aperture (the only other way to start it) is sensitive to the bits of the queued reload control register?
So during an MMU aperture write when not started yet:
- Non-source/pattern registers are shifted to the internal stage (always 2-stage).
- Source/pattern registers are shifted into the initial stage (always performing the 2-stage shift).
After those, if (and only if) the bit in the internal Reload Control Register is cleared, shift the initial source/pattern registers into the internal source/pattern registers (a 2nd shift, from initial to internal, making it the same operation as the Resume operation). Or perhaps, when the reload control bit is set and an MMU write triggers the queue to shift, simply restore those internal registers from a backup of their previous value (stored at the beginning of the shifting operation)?

So, basically, the Reload Control register bits act by suppressing the third shift to the internal registers? But only when not performing the Resume operation.
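
To spell out that hypothesis (and it is only my hypothesis, not confirmed hardware behaviour), here is a small sketch. All names are made up; the model only covers a single source/pattern register.

```python
class ThreeStageRegister:
    """Toy model of a source/pattern register: queued -> initial -> internal."""

    def __init__(self):
        self.queued = 0
        self.initial = 0
        self.internal = 0

    def shift(self, suppress_internal_load):
        # The queued value always moves into the initial stage.
        self.initial = self.queued
        # The last shift into the internal stage can be suppressed.
        if not suppress_internal_load:
            self.internal = self.initial


def start_by_mmu_write(register, reload_bit_set):
    """Hypothesis: a set reload control bit suppresses the internal load."""
    register.shift(suppress_internal_load=reload_bit_set)


def resume(register):
    """Hypothesis: resume always performs the full shift."""
    register.shift(suppress_internal_load=False)


# Usage: with the reload bit set, the internal value survives an MMU-triggered start.
register = ThreeStageRegister()
register.internal = 0x1000
register.queued = 0x2000
start_by_mmu_write(register, reload_bit_set=True)
```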

Then it would make sense what the W32i documentation says about the 3-stage shift queue?
Edit: But I see something strange in WhatVGA: FillRect and CopyRect seem to start the operation by writing 0x09 to perform a load and resume operation. But if the Reload Control Register had an effect on that (as it would imply, otherwise it wouldn't work?), the documented code in the W32i documentation wouldn't work (since the internal registers wouldn't be loaded when they need to be)?
Or does the implicit load of the internal registers only happen when the resume bit is set together with the restore bit (so both bits set)? In that case the documented code would work properly (if they're set separately, keeping the internal values correct)?

Reply 117 of 194, by superfury

Also had to change the behaviour of the XYST bit a bit:
- When an operation finishes normally or is terminated, it's cleared. Writing it as 1 will set it, but that won't start an operation unless the operation is started through the MMU, or the CPU writes it as set after loading the queue by writing the ACL Operation State Register with bit 0 set.
- So if the XYST bit is written directly after a suspend/termination, a resume operation (bit 3) or an MMU aperture write starting an operation, the XYST bit will be set (as it's RW according to the documentation) and the SSO bit will be set properly as well. But it won't actually start the accelerator unless the queue is loaded first through bit 0 of the ACL Operation State Register. That enables the behaviour where writing XYST with 1 really starts the accelerator again (setting the XYST bit has no effect until this is done).

So in that way the special behaviour of the accelerator being active together with the XYST bit being readable by software is achieved in the emulated hardware, together with the code mentioned in 2.11.8.2 working as expected (and not immediately starting the operation when SAVE1 is first written back to the accelerator queue together with the XYST bit).
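
A minimal sketch of that XYST gating, as a toy state machine. The names are made up, it leaves out the MMU start path and the SSO bit, and clearing the "queue loaded" flag when an operation finishes is my own assumption.

```python
class XystModel:
    """Toy model: XYST is readable/writable but only starts a loaded queue."""

    def __init__(self):
        self.xyst = False
        self.queue_loaded = False
        self.running = False

    def write_operation_state(self, load_queue):
        # Bit 0 of the ACL Operation State Register: load the queue.
        if load_queue:
            self.queue_loaded = True

    def write_xyst(self, value):
        # The bit itself always takes the written value (it is RW).
        self.xyst = bool(value)
        # It only starts the accelerator when the queue was loaded first.
        if self.xyst and self.queue_loaded:
            self.running = True

    def finish(self):
        # Normal completion or termination clears XYST.
        self.xyst = False
        self.running = False
        self.queue_loaded = False     # assumption: must be loaded again next time


# Usage: writing XYST alone sets the bit without starting anything.
model = XystModel()
model.write_xyst(1)
started_without_load = model.running
model.write_operation_state(load_queue=True)
model.write_xyst(1)
```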

Reply 118 of 194, by root42

ALEKS wrote on 2021-05-15, 09:31:

But the interest remains in accelerating DOS commandline only text output. I am pretty sure the DIR command uses standard DOS interrupts for writing to the console output.

You could probably make an accelerated command.com... With soft-scrolling maybe? 😀

Reply 119 of 194, by superfury

root42 wrote on 2021-05-17, 12:48:
ALEKS wrote on 2021-05-15, 09:31:

But the interest remains in accelerating DOS commandline only text output. I am pretty sure the DIR command uses standard DOS interrupts for writing to the console output.

You could probably make an accelerated command.com... With soft-scrolling maybe? 😀

As far as I know, this already happens running the MS-DOS command prompt in Windows 9x. It's using the accelerator to render the command prompt on the screen (unless in full-screen mode).
