VOGONS


First post, by superfury

Would it be difficult to upgrade the ET4000AX emulation in UniPCemu to a W32 version?

Afaik that just adds a hardware cursor (a mouse cursor sprite, probably applied during video rendering at the sequencer level?), some extra VRAM apertures and (perhaps more difficult) a BitBlt engine?

So the hardware cursor would need to be implemented in the active display renderer, and the VRAM windows added to the memory maps (relatively easy?)?

What about the BitBlt engine it uses? Would that be very difficult to implement (I'm looking at the WhatVGA documentation here)? Does it run in parallel to the ET4000AX part and the sequencer-through-DAC pipeline? Can it be optional (and the cursor emulation as well)?

Last edited by superfury on 2021-03-23, 08:21. Edited 1 time in total.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows and PSP on itch.io

Reply 1 of 69, by BloodyCactus

If all you read is the Wikipedia page, sure.

It actually accelerates Windows 3.1 functions, has dual screen access which acts as an overlay over the existing screen (so two full CRT controllers), compound BitBlts, and all 256 Microsoft Raster Operation opcodes in hardware.

--/\-[ Stu : Bloody Cactus :: http://kråketær.com :: http://mega-tokyo.com ]-/\--

Reply 2 of 69, by superfury

Can it be implemented without those? Like reporting unsupported for most/all of them (only implementing the ET4000AX and memory extensions, perhaps the cursor only)?
Also, is the (mouse?) cursor blitted as well, or processed during rendering (overwriting DAC input or sequencer output)?

Reply 3 of 69, by superfury

Just implemented most registers except the CRTC/Sprite ones (apart from the identification registers, which are ROM, and the CRTC/Sprite register selection at the xA address). Based on the VGADoc documentation: https://www.cs.utexas.edu/~dahlin/Classes/439 … gadoc/TSENG.TXT

So that's an ET4000/W32 without accelerator functions, but with detection plus base ET4000 and VRAM support (minus the extra memory windows).

CRTC/Sprite registers always read 0xFF atm.

Currently still disabled by a flag in the source code (to be configured later).

Reply 4 of 69, by superfury

Eventually added the memory-mapped registers (mostly ignored) and the other window, which is accepted but otherwise ignored by the card atm.

I see it booting the OS eventually but no VRAM character fonts seem to be loaded, thus a black screen is displayed?
Edit: The cause was faulty memory wrapping in the Tseng ET4000/W32 chip emulation only (not on the normal ET4000). It was calculating a mask for VRAM addressing, and instead of properly applying size-1 (the bits to mask for) it was using size, which meant that only the single bit just above the topmost valid address bit could be set for VRAM reads and writes!

Having fixed that, the W32 chip now properly displays and boots.
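
To illustrate the off-by-one above, here is a minimal sketch of the two mask calculations. The function names are made up for this example and aren't UniPCemu's actual code, and it assumes the VRAM size is a power of two:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only (not UniPCemu's code).
   Both assume vram_size is a power of two, e.g. 0x100000 for 1MB. */

/* The faulty variant: using the size itself as the mask leaves only
   the one bit just above the valid address range. */
static uint32_t vram_mask_buggy(uint32_t vram_size)
{
    return vram_size;
}

/* The corrected variant: size-1 keeps every address bit below the size. */
static uint32_t vram_mask_fixed(uint32_t vram_size)
{
    return vram_size - 1;
}

/* Apply a mask to an address, as a VRAM read or write would. */
static uint32_t vram_wrap(uint32_t address, uint32_t mask)
{
    return address & mask;
}
```

With a 1MB size (0x100000), an address such as 0xA1234 collapses to 0 with the faulty mask and stays 0xA1234 with the corrected one, which would fit fonts never showing up in VRAM.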

I did notice, however, that the BIOS seems to detect 2MB display memory, while only 1MB is installed?

Does anyone know how the BIOSes detect the installed display memory?
Edit: Hmmm... WhatVGA hangs/crashes when it has the extended functions for ET4000/W32 memory enabled somehow?
Edit: It seems to be waiting on BFF36 (M+36) to clear bit 2, which currently always reports set?

Now the question: how much needs to be implemented, and how much can be left as ROM values in the registers, for the software to not hang and the remainder of the card (minus any BitBlt operations) to operate correctly...

Reply 5 of 69, by superfury

Just implemented a storage backend for all registers (both memory-mapped and at I/O ports 217A/217B (officially 21xA)). The only exceptions are the registers that are defined only once for CRTCB: those, along with the CRTC/Sprite Control and Image Port Control registers, resolve to the same registers in both the CRTCB and Sprite windows.

Then the Status Registers (M+35 and M+36) are all read-only, with 0x00 being reported atm (so software can continue?).
Edit: WhatVGA doesn't hang anymore, but it messes up the display and seemingly the character fonts in VRAM as well?

Reply 6 of 69, by superfury

Just implemented the MMU 0-2 window areas and their functionality (just VRAM accesses, though).

It's still unclear what WhatVGA means by the "accelerator registers" when M+13h bits 0-2 are set? Currently I just make it float the bus during reads and do nothing when written (though it still claims the access, instead of letting anything else like RAM (normal RAM installed) or ROM (like the BIOS etc.) respond to the address).

Reply 7 of 69, by superfury

Is there any diagnostic software I can use to check the UniPCemu implementation of the ET4000/W32 video card? (Of course one that doesn't require full implementation of the actual BitBlt functionality)

It's weird that WhatVGA somehow seems to corrupt the video RAM (plane 2 in particular)?

Edit: Hmmm... Inspecting the highest VRAM address actually written by software, I see that the ET4000/W32 only writes up to 256KB of VRAM? It says it detects 1MB, but what does it base that on?

Reply 8 of 69, by superfury

Just managed to find the issue with WhatVGA corrupting video RAM:

The ET4000 (AX and Rev E) documentation lists bit 5 of the Video System Configuration Register 1 as indicating contiguous mode (essentially matching linear mode on the linear memory aperture).

But the ET4000/W32 documentation in WhatVGA's TSENG.TXT changes this bit: it now enables the Memory Mapped Registers (a.k.a. Tseng Addressing Mode). So enabling this bit doesn't cause the video memory to be presented in a contiguous way (linear memory buffer)! That only happens on the extended memory area when it's enabled by bit 4 of said register!

Edit: It makes me wonder, though... What exactly is this bit 5 on the ET4000AX chips?

The documentation says the following about it:

Bit 5, when set to 1, will enable the address mapping of the display memory to be contiguous. This enabled much more efficient use of the ET4000's internal resources and thereby, improves the performance. When set to 0, will enable address mapping compatible with the VGA's. Note that the use of this bit will not affect the compatiblity with all video modes unless the software assumes the relationship of the address mapping between modes.

So does that mean it only makes chained mode behave in a more optimized way, without affecting the way VRAM is presented to the CPU?

Reply 9 of 69, by superfury

Huh? Weird? When switching from an ET4000AX to an ET4000/W32, Windows 95 thinks the ET4000 has been removed with nothing in return?
Edit: Even weirder, it thinks that the "Gameport Joystick" has been removed from the system?

Reply 10 of 69, by superfury

Just now reinstalling Windows 95 OSR 2.5 "C" within UniPCemu, since the device detection went faulty somehow...

Now, looking at the detlog.txt during detection, I see the following:

Checking for: XGA/2 Display Adapter
QueryIOMem: Caller=DETECTXGA2, rcQuery=0
IO=3b0-3bb,3c0-3df
Checking for: Tseng Labs W32 Display Adapter
QueryIOMem: Caller=DETECTTSENGW32, rcQuery=0
IO=3b0-3bb,3c0-3df
Checking for: Tseng Labs Display Adapter
QueryIOMem: Caller=DETECTTSENGW128, rcQuery=0
IO=3b0-3bb,3c0-3df
Detected: *PNP091A\0000 = [11] Tseng Labs ET4000
IO=3b0-3bb,3c0-3df
Mem=a0000-affff,b8000-bffff,c0000-c7fff

So it's somehow detecting it as an ET4000AX instead of an ET4000/W32?
That's of course incorrect, since the W32 is actually emulated (mostly, except the sprite and the acceleration itself)?

Reply 11 of 69, by superfury

Hmmmm... According to PCem, only bit 5 of the video system configuration 1 register enables the MMU registers in the selected VRAM window. Windows 95 seems to agree (seeing it poll said registers with only bit 5 set and bit 3 cleared). But the documentation on the ET4000/W32p and ET4000/W32i, as well as WhatVGA, says that both bits 3 and 5 need to be set for this to be enabled?

Reply 13 of 69, by superfury

But polling the register has nothing to do with its functionality. Reading the register has no effect on what those bits do to the VRAM window presented to the CPU? It's what the CPU writes to it that determines that?

What's now happening is that the OS writes bit 5 set and bit 3 cleared and reads that back. And it's the write that actually sets the functionality of the VRAM window: in this case bit 5 being set, with bit 3 set or not (doesn't matter), enables the MMU register area in the VRAM window.
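
As a sketch of the two interpretations being compared here (the helper name and the `strict_docs` switch are hypothetical; only the bit numbers come from the discussion above):

```c
#include <stdint.h>

/* Bits of Video System Configuration Register 1 as discussed above. */
#define VSCONF1_BIT3 0x08
#define VSCONF1_BIT5 0x20

/* Hypothetical helper: is the MMU register area visible in the VRAM window?
   strict_docs != 0: follow the W32i/W32p documentation and WhatVGA
                     (bits 3 and 5 must both be set).
   strict_docs == 0: follow what PCem does and what Windows 95 appears
                     to expect (bit 5 alone is enough). */
static int w32_mmu_registers_enabled(uint8_t vsconf1, int strict_docs)
{
    uint8_t required = strict_docs ? (uint8_t)(VSCONF1_BIT3 | VSCONF1_BIT5)
                                   : (uint8_t)VSCONF1_BIT5;
    return (vsconf1 & required) == required;
}
```

With the value Windows 95 writes (bit 5 set, bit 3 cleared, so 0x20), the documented rule keeps the registers hidden while the PCem-style rule exposes them.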

What's now happening with Windows 95 is that the CPU seems to hang during boot once again, but since it's not polling or doing anything with the ET4000 chip, I assume that the problem with the video card is something different now?
I see that the display is garbled, taking up about half the screen with a bit of residue below that, which probably means that it has just entered the graphics mode for displaying the desktop (the mode after it has displayed its boot screen). So what I'm seeing is probably the 256-color boot screen (probably LOGO.SYS) interpreted using the desktop width and pixel depth instead of as an 8-bit, 320-pixel-wide display (so reinterpreted as 640 pixels of 8-bit colors, with each pair of even and odd scanlines displayed on a single line?).
Edit: Changing the horizontal pitch to A0h in the precalcs used for rendering (leaving the backend registers unaffected) shows the Windows boot screen (LOGO.SYS) on part of the screen, with wrong colours of course, repeated 4 times horizontally.

Reply 14 of 69, by superfury

I just implemented the handling of the starting of the accelerator and the MMU write algorithms.

Now the hardware will start ticking, handling the transfers and operations. That also covers the interrupt logic (connected to the VGA IRQ) and becoming idle, together with the X/Y block status changing.
Also, write waitstates on the accelerator are now handled, with reads from the accelerator returning 0xFF for the moment (not handled yet).

So the bits in the status and interrupt registers will now toggle, with the VGA IRQ being handled accordingly.

Although the accelerator performs a NOP and finishes on the next clock for now (essentially replacing the actual operation with a NOP until it's actually implemented).

Reply 15 of 69, by superfury

Just implemented placeholders for the accelerator processing, together with 8:1 inputs for CPU mix data.
Also added precalcs support for the ACL queued registers (internal state), suspend (waiting for the operation to complete before terminating, together with storing the precalcs in CPU-readable registers) and terminate operations, as well as an empty function for loading and storing the registers into precalcs (internal state) when an operation is started. Still need to implement the loading into the precalcs.
Edit: Just implemented the loading of the precalcs and the storing of the precalcs into the internal pattern and internal source registers for the CPU to read.

The main thing on the accelerator that's left now is the actual accelerator operations (all the processing of the inputs and outputs into and from video memory and ROPs (ATM unknown how these actually work?)) and perhaps the reading from the accelerator (is this present at all in a real ET4000/W32(/i/p)?).

Reply 16 of 69, by superfury

Reduced port 217B back to being an 8-bit I/O port, responding only to said port for reads and writes.
Also implemented the image port to map to the selected VRAM addresses (row widths of 0 keep it stuck to the first scanline for any scanline following it). This is done by a simple divide and modulo operation to determine the VRAM address added to the start address in the IMA registers.
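
A minimal sketch of that divide/modulo mapping, under assumptions: the function and parameter names are invented for this example, and the use of a separate per-row offset (pitch) in VRAM is an assumption rather than something confirmed above:

```c
#include <stdint.h>

/* Hypothetical mapping of a byte offset inside the image port window to a
   VRAM address. All names are made up for illustration.
   start      : starting VRAM address (as programmed in the IMA registers)
   row_width  : bytes transferred per row
   row_offset : assumed distance in VRAM between two rows (pitch)
   offset     : byte offset of the access inside the image port window */
static uint32_t imageport_vram_address(uint32_t start, uint32_t row_width,
                                       uint32_t row_offset, uint32_t offset)
{
    uint32_t row = 0;         /* A row width of 0 never leaves the first row */
    uint32_t column = offset; /* ...so the offset is used as-is in that case */
    if (row_width != 0)
    {
        row = offset / row_width;    /* Which row the access lands on */
        column = offset % row_width; /* Position inside that row */
    }
    return start + row * row_offset + column;
}
```

The zero check also keeps the divide and modulo from ever seeing a divisor of 0.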

So all that's currently left is the actual implementation of the raster operations, which is currently unknown? The W32i documentation gives a ROP list, but I don't know anything about how it works?
Edit: Apparently it's just a mask determining whether a bit is to be masked? So the destination bit selects the even or odd bits of the ROP, the source bit the lower or upper half of each nibble, and the pattern bit the low or high nibble. ANDing all those masks together with the Raster Operation (like ROP&destmask&srcmask&patmask), when any bit is left set, the corresponding bit in the result is set.

UniPCemu now retrieves the bit (using a simple preshifted mask) from all three and uses them to select the three masks (55/AA for destination, 33/CC for source and 0F/F0 for pattern), ANDing them with each other and the ROP used. Then, if the outcome has a set bit (thus is non-zero), it sets the respective bit in the result. The bit being processed is simply a left-shifting value from 0x01 to 0x80 (reaching 0x100 stops processing). Although it can maybe be optimized from ((x!=0)&1) to (x>>i) when adding a second counter to the loop.
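
The mask method described above is equivalent to using the destination, source and pattern bits as a 3-bit index into the ROP byte. A minimal sketch of that indexed form (the function name is made up; this isn't the UniPCemu code):

```c
#include <stdint.h>

/* Evaluate a ternary raster operation on one byte each of destination,
   source and pattern. For every bit position the three input bits form an
   index (pattern = bit 2, source = bit 1, destination = bit 0) that selects
   one of the 8 bits of the ROP code. This matches the 0x55/0xAA, 0x33/0xCC
   and 0x0F/0xF0 masks: ANDing them leaves exactly that one ROP bit. */
static uint8_t rop3_apply(uint8_t rop, uint8_t destination,
                          uint8_t source, uint8_t pattern)
{
    uint8_t result = 0;
    int i;
    for (i = 0; i < 8; ++i)
    {
        unsigned d = (destination >> i) & 1u;
        unsigned s = (source >> i) & 1u;
        unsigned p = (pattern >> i) & 1u;
        unsigned index = (p << 2) | (s << 1) | d; /* Which ROP bit applies */
        result |= (uint8_t)(((rop >> index) & 1u) << i);
    }
    return result;
}
```

For example, ROP 0xCC copies the source, 0xF0 copies the pattern, 0xAA leaves the destination unchanged and 0x66 gives source XOR destination.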

Since, according to the ET4000/W32i documentation, LineDraw doesn't exist on the ET4000/W32(i), this means the algorithm (together with all other ET4000/W32 functionality) is now fully implemented (although no wrapping is supported so far).

One thing I did implement is that while the terminate operation behaves as documented, the suspend operation still requires sending leftover data before it's suspended (the remainder of the pixels for a running transfer), and will suspend on completion of said part of the operation (maybe requiring CPU data to be input).

Reply 17 of 69, by superfury

Hmmm... Without the wrapping implemented, I see that the internal address registers and the destination address registers end up with multi-GB addresses in them (probably due to crazy overflow) when testing BitBlt using WhatVGA?
The X-coordinate looks within normal range? It's just the address registers and Y coordinates that look out-of-range?

Edit: After changing it to return the x coordinates like PCem-X does, the output changes a bit, but it doesn't seem to hang anymore and I can return to the main menu by pressing escape in WhatVGA.
Although on the second try, the screen is messed up even in text mode?

Edit: One thing is obvious: the accelerator seems to go haywire once it's in no-input mode (in other words: free running with only VRAM inputs and outputs).
Is that possible with an ET4000/W32? In this case, it's set up with the lower bits of MMU mapped register 9C being 0?
Or does that just mean it's not supposed to be doing anything?

So my question is: what happens when the ACL Routing Control Register is set to Routing of CPU data: CPU data not used (0)?
Edit: Or does this just mean that it ignores what's written by the CPU (not using it in that way), only using the address (if required by the settings) to trigger a transfer and transformation of data?

Edit: Modified 21xA register 0x8C to have the W32 version (currently 0h) as a ROM in the upper bits, and made it and the sprite row offset high register readable when reading said indexed register (a duplicate that went missing in the copy-paste from the CRTC registers).

Reply 18 of 69, by superfury

This is how I've implemented applying the Raster Operations (all based on the queue input being filled, which is only used for source and mixmap inputs):

//result: bit0=Set to have handled tick, bit1=Set to immediately check for termination on the same clock.
byte Tseng4k_tickAccelerator_step(byte noqueue)
{
byte destination,source,pattern,mixmap,ROP,result,operationx,ROPmask;
uint_32 destinationaddress;
word ROPbits;
byte ROPmaskdestination[2] = {0x55,0xAA};
byte ROPmasksource[2] = {0x33,0xCC};
byte ROPmaskpattern[2] = {0x0F,0xF0};
//noqueue: handle without queue only. Otherwise, ticking an input on the currently loaded queue or no queue processing.
//acceleratorleft is used to process a queued 8-pixel block from the CPU! With a 1:1 ratio instead of a 1:8 ratio, it's simply set to 1!
switch (et34k(getActiveVGA())->W32_MMUregisters[1][0x9C] & 7) //What kind of operation is used?
{
case 0: //CPU data isn't used!
//Handling without CPU data now!
if (noqueue && (et34k(getActiveVGA())->W32_acceleratorleft == 0)) return 0; //NOP when not a queue version and not processing!
break;
case 1: //CPU data is source data!
if (noqueue && (et34k(getActiveVGA())->W32_acceleratorleft==0)) return 0; //NOP when not a queue version and not processing!
break;
case 2: //CPU data is mix data!
if (noqueue && (et34k(getActiveVGA())->W32_acceleratorleft==0)) return 0; //NOP when not a queue version and not processing!
break;
case 4: //CPU data is X count
if (noqueue && (et34k(getActiveVGA())->W32_acceleratorleft==0)) return 0; //NOP when not a queue version and not processing!
break;
case 5: //CPU data is Y count
if (noqueue && (et34k(getActiveVGA())->W32_acceleratorleft==0)) return 0; //NOP when not a queue version and not processing!
break;
default: //Reserved
return 1|2; //Not handled yet! Terminate immediately on the same clock!
break;
}

et34k(getActiveVGA())->W32_acceleratorbusy |= 2; //Busy accelerator!

if (et34k(getActiveVGA())->W32_acceleratorleft == 0) //Need to start a new block?
{
switch (et34k(getActiveVGA())->W32_MMUregisters[1][0x9C] & 7) //What kind of operation is used?
{
case 0: //CPU data isn't used!
//Handling without CPU data now!
et34k(getActiveVGA())->W32_acceleratorleft = 1; //Default: only processing 1!
break;
case 1: //CPU data is source data!
//Only 1 pixel is processed!
et34k(getActiveVGA())->W32_acceleratorleft = 1; //Default: only processing 1!
break;
case 2: //CPU data is mix data!
et34k(getActiveVGA())->W32_acceleratorleft = 8; //Processing 8 pixels!
et34k(getActiveVGA())->W32_ACLregs.latchedmixmap = et34k(getActiveVGA())->W32_MMUqueueval[et34k(getActiveVGA())->W32_MMUqueueval_offset]; //Latch the written value!
break;
case 4: //CPU data is X count
//Only 1 pixel is processed!
et34k(getActiveVGA())->W32_acceleratorleft = 1; //Default: only processing 1!
break;
case 5: //CPU data is Y count
//Only 1 pixel is processed!
et34k(getActiveVGA())->W32_acceleratorleft = 1; //Default: only processing 1!
break;
default: //Reserved
return 1|2; //Not handled yet! Terminate immediately on the same clock!
break;
}
}

//We're ready to start handling a pixel. Now, handle the pixel!
destination = et4k_readlinearVRAM(et34k(getActiveVGA())->W32_ACLregs.destinationaddress); //Read destination!
source = et4k_readlinearVRAM(et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress + et34k(getActiveVGA())->W32_ACLregs.sourcemap_x); //Read source at the source X offset!
pattern = et4k_readlinearVRAM(et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress + et34k(getActiveVGA())->W32_ACLregs.patternmap_x); //Read pattern at the pattern X offset!
mixmap = 0xFF; //Assumed 1 if not provided by CPU!
operationx = et34k(getActiveVGA())->W32_ACLregs.Xposition; //TODO

//Apply CPU custom inputs!
if ((et34k(getActiveVGA())->W32_MMUregisters[1][0x9C] & 7)==2) //Mixmap from CPU?
{
mixmap = et34k(getActiveVGA())->W32_ACLregs.latchedmixmap; //The used mixmap instead!
}
else if ((et34k(getActiveVGA())->W32_MMUregisters[1][0x9C] & 7)==1) //Source is from CPU?
{
source = et34k(getActiveVGA())->W32_MMUqueueval[et34k(getActiveVGA())->W32_MMUqueueval_offset]; //Latch the written value!
}

//Now, determine and apply the Raster Operation!
operationx &= 7; //Wrap!
if (et34k(getActiveVGA())->W32_ACLregs.XYdirection&1) //Negative X?
{
operationx = 7-operationx; //Reversed order!
}
ROP = et34k(getActiveVGA())->W32_ACLregs.BGFG_RasterOperation[((mixmap>>(operationx&7))&1)];
result = 0; //Initialize the result!
ROPbits = 0x01; //What bit to process!
for (;ROPbits<0x100;) //Check all bits!
{
if ((ROP&ROPmaskdestination[(destination&1)]&ROPmasksource[(source&1)]&ROPmaskpattern[(pattern&1)])!=0)
{
result |= ROPbits; //Set in the result!
}
ROPbits <<= 1; //Next bit to check!
//Shift in the next bit to check!
destination >>= 1;
source >>= 1;
pattern >>= 1;
}

//Finally, writeback the result to destination in VRAM!
et4k_writelinearVRAM(et34k(getActiveVGA())->W32_ACLregs.destinationaddress,result); //Write back!

//Increase X/Y positions accordingly!
//Clear et34k(getActiveVGA())->W32_acceleratorbusy on terminal count reached!
if (et4k_stepx()==3) //X and Y overflow?
{
et34k(getActiveVGA())->W32_acceleratorbusy &= ~2; //Finish operation!
et34k(getActiveVGA())->W32_acceleratorleft = 0; //Nothing left!
return 1|2; //Terminated immediately on the same clock!
}

//Apply timing remainder calculation
if (et34k(getActiveVGA())->W32_acceleratorleft) //Anything left ticking?
{
--et34k(getActiveVGA())->W32_acceleratorleft; //Ticked one pixel of the current block!
}
return 1|2; //Handled! Terminated immediately on the same clock!
}

Is that correct behaviour? Currently the stepping works like PCem-X, though without wrapping implemented yet.

Reply 19 of 69, by superfury

I've applied the source/pattern address wrapping as a simple addition or subtraction to/from the internal addresses, based on the programmed virtual line length, but initialized the counters and addresses like PCem does?

Starting an operation (calculated like PCem does):

	et34k(getActiveVGA())->W32_ACLregs.patternmap_x = et34k(getActiveVGA())->W32_ACLregs.patternmap_y = et34k(getActiveVGA())->W32_ACLregs.sourcemap_x = et34k(getActiveVGA())->W32_ACLregs.sourcemap_y = 0; //Init!
//Perform what wrapping?
et34k(getActiveVGA())->W32_ACLregs.patternwrap_x = Tseng4k_wrap_x[et34k(getActiveVGA())->W32_ACLregs.Xpatternwrap]; //What horizontal wrapping to use!
et34k(getActiveVGA())->W32_ACLregs.patternwrap_y = Tseng4k_wrap_x[et34k(getActiveVGA())->W32_ACLregs.Ypatternwrap]; //What vertical wrapping to use!
et34k(getActiveVGA())->W32_ACLregs.sourcewrap_x = Tseng4k_wrap_x[et34k(getActiveVGA())->W32_ACLregs.Xsourcewrap]; //What horizontal wrapping to use!
et34k(getActiveVGA())->W32_ACLregs.sourcewrap_y = Tseng4k_wrap_x[et34k(getActiveVGA())->W32_ACLregs.Ysourcewrap]; //What vertical wrapping to use!
//Perform wrapping of the inputs!
//First, wrap pattern!
if (et34k(getActiveVGA())->W32_ACLregs.patternwrap_x!=(uint_32)~0) //X wrapping?
{
et34k(getActiveVGA())->W32_ACLregs.patternmap_x = et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress & et34k(getActiveVGA())->W32_ACLregs.patternwrap_x; //Wrap X!
et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress &= ~et34k(getActiveVGA())->W32_ACLregs.patternwrap_x; //Mask off what's moved to patternmap x!
}
if (et34k(getActiveVGA())->W32_ACLregs.patternwrap_y!=(uint_32)~0) //Y wrapping?
{
et34k(getActiveVGA())->W32_ACLregs.patternmap_y = (et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress / (((uint_64)et34k(getActiveVGA())->W32_ACLregs.patternwrap_x) + 1)) & (et34k(getActiveVGA())->W32_ACLregs.Ypatternwrap - 1);
et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress &= ~((((uint_64)et34k(getActiveVGA())->W32_ACLregs.patternwrap_x + 1) * (uint_64)et34k(getActiveVGA())->W32_ACLregs.patternwrap_y) - 1);
}
//Next, wrap source!
if (et34k(getActiveVGA())->W32_ACLregs.sourcewrap_x!=(uint_32)~0) //X wrapping?
{
et34k(getActiveVGA())->W32_ACLregs.sourcemap_x = et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress & et34k(getActiveVGA())->W32_ACLregs.sourcewrap_x; //Wrap X!
et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress &= ~et34k(getActiveVGA())->W32_ACLregs.sourcewrap_x; //Mask off what's moved to sourcemap x!
}
if (et34k(getActiveVGA())->W32_ACLregs.sourcewrap_y != (uint_32)~0) //Y wrapping?
{
et34k(getActiveVGA())->W32_ACLregs.sourcemap_y = (et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress / (((uint_64)et34k(getActiveVGA())->W32_ACLregs.sourcewrap_x) + 1)) & (et34k(getActiveVGA())->W32_ACLregs.Ysourcewrap - 1);
et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress &= ~((((uint_64)et34k(getActiveVGA())->W32_ACLregs.sourcewrap_x + 1) * (uint_64)et34k(getActiveVGA())->W32_ACLregs.sourcewrap_y) - 1);
}
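
The X part of the wrap setup above boils down to splitting an address with a power-of-two mask. A reduced sketch with made-up names; the stepping helper is an assumption about how the X offset could wrap afterwards, since et4k_stepx isn't shown here:

```c
#include <stdint.h>

/* Hypothetical reduced state: the base address with the wrapped bits
   removed, plus the X offset that was split off from it. */
typedef struct
{
    uint32_t base; /* Address with the low (wrapped) bits cleared */
    uint32_t x;    /* Offset inside the wrap window */
} wrap_split_t;

/* Split an address using a wrap mask of the form (2^n)-1, e.g. 7 for an
   8-byte wrap. Mirrors the "& wrap" and "&= ~wrap" pair in the listing. */
static wrap_split_t split_wrapped_address(uint32_t address, uint32_t wrap_mask)
{
    wrap_split_t result;
    result.x = address & wrap_mask;
    result.base = address & ~wrap_mask;
    return result;
}

/* Assumed stepping rule: advance the X offset and wrap inside the window. */
static uint32_t step_wrapped_x(uint32_t x, uint32_t wrap_mask)
{
    return (x + 1) & wrap_mask;
}
```

The effective address is then always base + x, so the offset can wrap without the base ever drifting.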

Horizontal/vertical timing:

byte et4k_stepy()
{
++et34k(getActiveVGA())->W32_ACLregs.Yposition;
et34k(getActiveVGA())->W32_ACLregs.destinationaddress = et34k(getActiveVGA())->W32_ACLregs.destinationaddress_backup; //Make sure that we're jumping from the original!
if (et34k(getActiveVGA())->W32_ACLregs.XYdirection&2) //Negative Y?
{
et34k(getActiveVGA())->W32_ACLregs.destinationaddress -= et34k(getActiveVGA())->W32_ACLregs.destinationYoffset + 1; //Next address!
et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress -= et34k(getActiveVGA())->W32_ACLregs.patternYoffset + 1; //Next address!
et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress -= et34k(getActiveVGA())->W32_ACLregs.sourceYoffset + 1; //Next address!
--et34k(getActiveVGA())->W32_ACLregs.patternmap_y;
if (et34k(getActiveVGA())->W32_ACLregs.patternmap_y == ((uint_32)~0)) //Overflow?
{
if (et34k(getActiveVGA())->W32_ACLregs.patternwrap_y != (uint_32)~0) //Wrapping Y?
{
et34k(getActiveVGA())->W32_ACLregs.patternmap_y = et34k(getActiveVGA())->W32_ACLregs.patternwrap_y; //Returning to the bottom!
et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress += ((et34k(getActiveVGA())->W32_ACLregs.patternYoffset + 1) * (((uint_64)et34k(getActiveVGA())->W32_ACLregs.patternwrap_y) + 1)); //Apply Y address wrap!
}
}
--et34k(getActiveVGA())->W32_ACLregs.sourcemap_y;
if (et34k(getActiveVGA())->W32_ACLregs.sourcemap_y == ((uint_32)~0)) //Overflow?
{
if (et34k(getActiveVGA())->W32_ACLregs.sourcewrap_y != (uint_32)~0) //Wrapping Y?
{
et34k(getActiveVGA())->W32_ACLregs.sourcemap_y = et34k(getActiveVGA())->W32_ACLregs.sourcewrap_y; //Returning to the bottom!
et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress += ((et34k(getActiveVGA())->W32_ACLregs.sourceYoffset + 1) * (((uint_64)et34k(getActiveVGA())->W32_ACLregs.sourcewrap_y) + 1)); //Apply Y address wrap!
}
}
}
else //Positive Y?
{
et34k(getActiveVGA())->W32_ACLregs.destinationaddress += et34k(getActiveVGA())->W32_ACLregs.destinationYoffset + 1; //Next address!
et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress += et34k(getActiveVGA())->W32_ACLregs.patternYoffset + 1; //Next address!
et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress += et34k(getActiveVGA())->W32_ACLregs.sourceYoffset + 1; //Next address!
++et34k(getActiveVGA())->W32_ACLregs.patternmap_y;
if ((uint_64)et34k(getActiveVGA())->W32_ACLregs.patternmap_y > (uint_64)et34k(getActiveVGA())->W32_ACLregs.patternwrap_y) //Wrapping point reached?
{
et34k(getActiveVGA())->W32_ACLregs.patternmap_y = 0; //Reset!
et34k(getActiveVGA())->W32_ACLregs.internalpatternaddress -= (et34k(getActiveVGA())->W32_ACLregs.patternYoffset + 1) * (et34k(getActiveVGA())->W32_ACLregs.patternwrap_y + 1); //Go back to the backup address!
}
++et34k(getActiveVGA())->W32_ACLregs.sourcemap_y;
if ((uint_64)et34k(getActiveVGA())->W32_ACLregs.sourcemap_y > (uint_64)et34k(getActiveVGA())->W32_ACLregs.sourcewrap_y) //Wrapping point reached?
{
et34k(getActiveVGA())->W32_ACLregs.sourcemap_y = 0; //Reset!
et34k(getActiveVGA())->W32_ACLregs.internalsourceaddress -= (et34k(getActiveVGA())->W32_ACLregs.sourceYoffset + 1) * (et34k(getActiveVGA())->W32_ACLregs.sourcewrap_y + 1); //Go back to the backup address!
}
}
et34k(getActiveVGA())->W32_ACLregs.destinationaddress_backup = et34k(getActiveVGA())->W32_ACLregs.destinationaddress; //Save the new line on the destination address to jump back to!
if (et34k(getActiveVGA())->W32_ACLregs.Yposition>et34k(getActiveVGA())->W32_ACLregs.Ycount)
{
//Leave Y position and addresses alone!
return 2; //Y count reached!
}
return 0; //No overflow!
}

et4k_stepx returning 3 means that the BitBlt operation has finished.
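
et4k_stepx itself isn't listed above. As a rough sketch of how the two flags could combine, assuming bit 0 means the X count was exceeded and bit 1 is the value returned by the Y step (all names and the reduced state are hypothetical, and the address updates are left out):

```c
#include <stdint.h>

/* Hypothetical reduced blit state: positions and programmed counts only.
   As in the listing above, a count of N means N+1 pixels or lines. */
typedef struct
{
    uint32_t x, y;           /* Current positions */
    uint32_t xcount, ycount; /* Programmed counts */
} blit_pos_t;

/* Step to the next line. Returns 2 once the Y count has been exceeded. */
static uint8_t sketch_stepy(blit_pos_t *p)
{
    ++p->y;
    return (p->y > p->ycount) ? 2 : 0;
}

/* Step to the next pixel. Returns bit 0 when a line has finished, combined
   with the Y step result, so 3 means both X and Y have finished. */
static uint8_t sketch_stepx(blit_pos_t *p)
{
    ++p->x;
    if (p->x > p->xcount) /* End of this line? */
    {
        p->x = 0; /* Restart at the beginning of the next line */
        return (uint8_t)(1 | sketch_stepy(p));
    }
    return 0; /* Still inside the line */
}
```

Under these assumptions a blit with both counts set to 1 (2x2 pixels) returns 3 on its fourth step.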

All lines in each generated block in WhatVGA display the same line from the patterns, but somehow all of them display the wrong horizontal patterns? The first block in the center becomes completely white instead of the requested output?
