I've just been thinking: F6F6:F7F9... Couldn't the F6 parts be data read from the floppy disk? I know that it formats the track with the filler byte set to F6. Perhaps a buffer overflow of some kind (either by the DMA controller or by the CPU itself) when reading the sector(s) back to RAM?
So it could be a DMA malfunction, or perhaps a REP(Z/NZ) kind of overflow?
Edit: I see that the last DMA transfer from the FDC(18 sectors long) was somehow to physical address 0? The DMA page register and the address registers were loaded with 0x00 bytes, while the current address register ended up at 0x2400.
So all data in memory from physical memory location 0 to 2400h has been overwritten with the sector data read from track 0 head 1? And since those sectors have just been formatted with the 0xF6 filler byte, it means that all that low memory has been filled with 0xF6 bytes by the FDC DMA read operation!
So any IRQ that triggers when it's complete (which it will, since the FDC completes its operation) causes it to shift execution to F6F6:F6F6 or perhaps F6F6:anyaddress in real mode!
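To make that reasoning concrete, here is a minimal sketch of what such a transfer does to the real-mode IVT. It assumes 512-byte sectors, a DMA transfer starting at physical address 0, and the usual PC mapping of IRQ 6 (the floppy IRQ) to INT 0Eh; the `memory` array and the `read_ivt` helper are purely illustrative and not taken from UniPCemu.

```python
SECTOR_SIZE = 512   # assumed bytes per sector
SECTORS = 18        # length of the DMA transfer seen in the log
FILLER = 0xF6       # format filler byte

memory = bytearray(0x10000)  # first 64 KiB of emulated RAM, zeroed

# DMA channel 2 programmed with page 0 and base address 0:
# every byte coming from the FDC lands in low memory.
dma_address = 0x0000
for _ in range(SECTORS * SECTOR_SIZE):
    memory[dma_address] = FILLER
    dma_address += 1

def read_ivt(memory, vector):
    """Return (segment, offset) of a real-mode interrupt vector."""
    base = vector * 4
    offset = memory[base] | (memory[base + 1] << 8)
    segment = memory[base + 2] | (memory[base + 3] << 8)
    return segment, offset

segment, offset = read_ivt(memory, 0x0E)  # IRQ 6 -> INT 0Eh (assumed mapping)
print(f"DMA current address ended at {dma_address:#06x}")
print(f"INT 0Eh now points to {segment:04X}:{offset:04X}")
```

18 sectors of 512 bytes are 0x2400 bytes, which matches where the current address register ended up, and every vector in the IVT reads back as F6F6:F6F6.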
Why in the world would Windows 95's setup wizard set up the DMA controller for reading to physical address 0? No software in its right mind with an IVT at that address would do that, unless there's a bug in the software?
So the issue here isn't the Floppy, but that the DMA controller is programmed for address 0 when it's going to read data to RAM after formatting the disk's first track, overwriting the real mode IVT?
Seeing as the DMA hi/lo and page address registers are all written as 0 (since they aren't cleared in any other way), I'd assume there's some instruction that's writing 0 to their origin bytes or perhaps not write to said memory at all(uninitialized memory = BIOS leftover(which clears it during POST) = zeroed RAM)? Anyone?
Perhaps some malfunctioning MOV, REP'd instruction, PUSH or POP?
Just checked out MS-DOS 6.22 format.com again with the formatting of the floppy.
Guess what: The exact same problem with reading sectors to memory address 0(setting DMA base and page registers to 0 for the read)!
So, if it has the same issue(besides triple faulting on it), it would actually be a faster and easier way of finding the CPU(or maybe hardware, but unlikely) bug, since it doesn't require the entire Windows 95 setup wizard to complete and get a log!
Edit: OK. Managed to set some breakpoints in UniPCemu and find out the locations that the DMA controller's channel 2(for the FDC) DMA address and page registers were written:
f000:ca5e first
f000:ca62 second
f000:ca6a page
So, the issue might be somewhere inside the BIOS itself? Or perhaps MS-DOS calling the BIOS functionality(e.g. INT 13h or something like that)...
Edit: OK. I eventually see(directly after the format track request from the BIOS) a strange interrupt being called:
INT 13h
EAX=00000412 EBX=0 ECX=1 EDX=0 ESP=000008F4 EBP=00000904 ESI=00000C54 EDI=0000034E
EIP=00001120 EFLAGS=00000202 CS=025A SS=011E DS=0070 ES=0 FS=0 GS=0 TR=0 LDTR=0
That's also the last INT 13h function that's called before the CPU resets.
So the cause is at:
025A:111E, according to my debugging. There it requests the track to be formatted using INT 13h function 05h. Directly after it, the invalid Verify sectors(INT 13h function 04h) occurs with ES:BX=0:0.
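For reference, the register dump can be decoded with the standard INT 13h CHS register layout (AH = function, AL = sector count, CH/CL = cylinder and sector, DH = head, DL = drive). This is only a sketch; the `decode_int13_chs` helper is a made-up name for illustration.

```python
def decode_int13_chs(ax, bx, cx, dx, es):
    """Decode the registers of a CHS-style INT 13h call."""
    ch = (cx >> 8) & 0xFF
    cl = cx & 0xFF
    return {
        "function": (ax >> 8) & 0xFF,           # AH
        "sector_count": ax & 0xFF,              # AL
        "cylinder": ch | ((cl & 0xC0) << 2),    # CH plus the top 2 bits of CL
        "sector": cl & 0x3F,                    # low 6 bits of CL
        "head": (dx >> 8) & 0xFF,               # DH
        "drive": dx & 0xFF,                     # DL
        "buffer": (es, bx),                     # ES:BX
    }

# Values from the logged call: EAX=00000412 EBX=0 ECX=1 EDX=0 ES=0
call = decode_int13_chs(ax=0x0412, bx=0, cx=1, dx=0, es=0)
print(call)
```

With the logged values this decodes as function 04h (verify sectors), 18 sectors, cylinder 0, sector 1, head 0, drive 0, with the buffer at 0:0.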
Edit: It appears that the invalid command for the verify sectors also uses the same address when it calls INT 13h?
Edit: Yay! A huge log(4.16GB) with the error hopefully somewhere in there?
Edit: This is said log: https://www.dropbox.com/s/t3ymp2827mu2yh1/deb … cleared.7z?dl=0
Anyone can see what's going wrong?
Edit: OK. The very first read is a read to address 18EC:0(length: 2 sectors), thus 18EC0 as a linear address.
Edit: The instructions following the INT 13h for that matches what it's supposed to at f000:0000ca5c, writing said data to the DMA address registers.
Edit: Then it calls function 18h of the BIOS INT 13h (set media type for format). Those parameters seem fine (setup for 80 tracks, 18 SPT, drive 0).
Edit: Then a format track command to the BIOS is issued. 12 sectors to format, track 0, head 0, drive 0, sector buffer ES:BX=70:53B=Memory address C3B.
Edit: f000:0000ca5c confirms it's using the correct buffer address, at 000c3b.
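The segment:offset arithmetic used in these checks can be sketched as follows. The 20-bit mask assumes 8086-style address wraparound (A20 disabled); the `linear` helper is just an illustrative name.

```python
def linear(segment, offset):
    """Real-mode linear address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(linear(0x18EC, 0x0000)))  # buffer of the very first read
print(hex(linear(0x0070, 0x053B)))  # format track buffer
print(hex(linear(0x0000, 0x0000)))  # the bad verify call
```

The first read lands at 0x18EC0, the format buffer at 0xC3B (matching what the BIOS writes to the DMA address registers), and the verify call at 0.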
Edit: The very next INT 13h call is actually the incorrect AX=0412h, ES:BX=0:0 verify sectors command. So something above that is going wrong?
Edit: OK. The 0 value loaded into ES is loaded a bit further up at 025a:00000e45, loading ds:[0537] with the BX value, thus generating the NULL ES:BX value.
And said BX value of 0 is generated the instruction before that, at 025a:00000e43, where it clears the BX register using a XOR BX,BX.
Edit: BX seems to be saved right before that clearing of BX(which is eventually loaded into ES), at 025a:00000e42. It's simply pushed on the stack at SS:SP=11E:08FC. It was 53B at said point.
Edit: OK. Something's weird with that real RAM from address c38 there? Or is it just badly logged?
Edit: OK. That's fine. It was a bug with the logging to the log file, wrapping the low 4 bits of the address to 3 bits instead of the correct 4 bits(the size of the cached memory address). So address *8-*F were becoming *0-*7 in the log.
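A hypothetical model of that logging bug (not the actual UniPCemu code) is a mask that is one bit too narrow when the offset inside the cached 16-byte line is added back:

```python
def logged_address_buggy(address):
    """Keeps only 3 low bits of the offset, so *8-*F alias to *0-*7."""
    return (address & ~0xF) | (address & 0x7)

def logged_address_fixed(address):
    """Keeps all 4 low bits, matching the size of the cached memory address."""
    return (address & ~0xF) | (address & 0xF)

for address in (0xC37, 0xC38, 0xC3B, 0xC3F):
    print(f"{address:03X}: buggy={logged_address_buggy(address):03X} "
          f"fixed={logged_address_fixed(address):03X}")
```

Under this model address C38 shows up as C30 and C3F as C37 in the log, while the memory contents themselves are untouched.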
It doesn't look very logical what it's doing, it's essentially making ES=BX=0 because of the clearing and writing to ds:[537]? It starts at 025a:00000e42.
So everything from 025a:00000e43 onwards seems like nonsense to me. It doesn't make any sense to run code that way, since the only possible outcome of that memory access is overwriting yourself and the OS's critical data?
The issue with the verify command from the BIOS seems to have been that the DMA controller was put in Verify mode, but my emulation was still writing the data read from the peripheral (the FDC) to memory. In verify mode, no data should actually be written to memory: it's just read from the FDC and discarded.
Now, with the DMA controller fixed to behave properly(read from the FDC and discard during INT 13h verify sector(s)), the MS-DOS 6.22 setup properly continues! 😁
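A minimal sketch of the corrected behaviour, based on the 8237 mode register layout (bits 3-2 select the transfer type: 00 = verify, 01 = write to memory, 10 = read from memory). The `dma_cycle` function is hypothetical, and the mode bytes 0x42 and 0x46 are the values a PC BIOS commonly programs for channel 2 verify and read operations, used here as an assumption.

```python
VERIFY_TRANSFER = 0b00  # cycle runs, memory is neither read nor written
WRITE_TRANSFER = 0b01   # device -> memory
READ_TRANSFER = 0b10    # memory -> device

def dma_cycle(mode, memory, address, device_byte):
    """Run one DMA cycle; return the byte handed to the device, if any."""
    transfer_type = (mode >> 2) & 0b11
    if transfer_type == WRITE_TRANSFER:
        memory[address] = device_byte
        return None
    if transfer_type == READ_TRANSFER:
        return memory[address]
    return None  # verify: the FDC data is simply discarded

memory = bytearray(b"\x11" * 16)             # stand-in for the IVT
dma_cycle(0x42, memory, 0, 0xF6)             # verify mode: memory stays intact
print(memory[0] == 0x11)
dma_cycle(0x46, memory, 0, 0xF6)             # write transfer: memory changes
print(memory[0] == 0xF6)
```

With the transfer type honoured, the base address of 0 that the BIOS programs for a verify is harmless, because nothing is ever stored there.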
Now the big question: what happens with read commands on an unformatted track?
I'd assume ST1=5 and ST2=1(to indicate it couldn't find anything on the disk).
What about the ST0 register? IC probably 1 for abnormal termination(0x40). What about Unit Check and Not Ready? Is just unit check set(drive faulted), Not ready(it's not ready to be read) or both?
Not ready is probably not the case(it's just not formatted), so just Unit Check is set in this case?
Edit: After some more tinkering(proper unformatted sector skipping, ST2 bit 3 being set when nothing is found, ST1 bit 1, 3 and 7 being set and ST0 bits 3 and 4 being set), I've managed to get it past the invalid sector read and on to formatting.
Then I noticed that the new formatting method I implemented was misbehaving, since the SPT of an unformatted track was effectively 1(1 unformatted sector is the minimum for the IMD disk image). So I added a check for the format command to the sector increase function to ignore the SPT setting of the drive to check for overflow(when it reached physical EOT, which is not available for Format Track commands(only for read/write commands)).
OK. Having fixed those issues, format.com now tries to physically format the disk.
It's using the sector timings for the 4-byte packet format timings(except 4-byte 'sectors' instead of the usual 512 byte rate, so 128 times faster).
It managed to get up to 8 sectors buffered, at which point format.com suddenly aborts and tries again.
So there's clearly some conflicting timing or timer at work?
Edit: OK. The issue was with the sector rate being used for the format command. When it's determining the sector rate using the RPM and SPT, for normal commands this is fine(e.g. 300 RPM with 18 SPT). But when using the format command on the unformatted track(as is only the case with unformatted IMD disk images), the SPT setting is 1(since there is only one unformatted 'sector' on the disk image). So it'll try to format the disk using a 300 RPM 1 SPT speed, for a total of 18 sectors. Since it handles 300 sectors per minute at that speed, it'll take the time of 18 full tracks being formatted to actually format the track!
I've now adjusted the speed it formats at to perform the format at a proper RPM speed (with the format's track length field taking the place of the SPT (which is 1) in the drive's known geometry).
So in that case, the format will properly finish in 1 track's spinning time(from index hole to index hole), instead of 18 times(with 18 sectors), 9 times(with 9 sectors) etc. as much time.
Which explains why it'll abort after 8 sectors' time. It's a total of 8 spins at 300 RPM, which is 5 rotations per second, so about 1.6 seconds until it aborts the formatting.
Edit: Then adjusted the timing to be more correct. Then I found out that it was still using 1 RPM timing, due to the check limiting the format command's Track Length to be at most 1 (thus not faster than 1 track (SPT) at 300 RPM), while it's supposed to be the opposite: at least 1, with the format command's Track Length making it faster than that (up to a factor of 255 when formatting with a track length of 0xFF).
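The timing problem can be illustrated with a small calculation. This is only a sketch of the arithmetic described above, with `format_track_seconds` as a made-up helper: the sector rate is derived from the RPM and from how many sectors are assumed to pass the head per rotation.

```python
def format_track_seconds(rpm, sectors_to_format, sectors_per_rotation):
    """Time to format a track when the sector rate is derived from
    the RPM and the number of sectors assumed per rotation."""
    rotation_seconds = 60.0 / rpm
    seconds_per_sector = rotation_seconds / max(1, sectors_per_rotation)
    return sectors_to_format * seconds_per_sector

# Old behaviour: the unformatted IMD track reports SPT=1.
slow = format_track_seconds(300, 18, 1)
# New behaviour: the format command's track length (18) replaces the SPT.
fast = format_track_seconds(300, 18, 18)
print(f"SPT=1: {slow:.1f} s, track length 18: {fast:.1f} s")
```

With an SPT of 1 the 18-sector format takes 18 rotations, while using the track length brings it back to a single rotation of 0.2 seconds at 300 RPM.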
One other thing I've also changed is the way the unmounted disks are handled. They now error out in a normal way, setting ST1 to 0x5, ST2 to 0x1, ST0 not ready and unit check to 1 and the ST0 condition to error out(highest 2 bits = 01h), indicating that the drive isn't ready.
ST1 will also set bit 7 now when it can't find the sector ID after two rotations of the disk(which will make the format.com and reading of the disk work properly).
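As a sketch, the result bytes for the no-media case can be composed from the status register bit masks of the uPD765/82077AA family. The datasheets name ST0 bit 4 "Equipment Check", which is what I've been calling unit check here; the `no_media_result` helper is hypothetical.

```python
# ST0 bits
ST0_IC_ABNORMAL = 0x40       # interrupt code 01b: abnormal termination
ST0_EQUIPMENT_CHECK = 0x10   # "unit check" in this thread
ST0_NOT_READY = 0x08
# ST1 bits
ST1_NO_DATA = 0x04
ST1_MISSING_ADDRESS_MARK = 0x01
# ST2 bits
ST2_MISSING_DATA_ADDRESS_MARK = 0x01

def no_media_result(drive, head):
    """Return (ST0, ST1, ST2) for a command on a drive without a disk."""
    st0 = (ST0_IC_ABNORMAL | ST0_EQUIPMENT_CHECK | ST0_NOT_READY
           | ((head & 1) << 2) | (drive & 3))
    st1 = ST1_NO_DATA | ST1_MISSING_ADDRESS_MARK
    st2 = ST2_MISSING_DATA_ADDRESS_MARK
    return st0, st1, st2

st0, st1, st2 = no_media_result(drive=0, head=0)
print(f"ST0={st0:02X} ST1={st1:02X} ST2={st2:02X}")
```

For drive 0, head 0 this gives ST1=05 and ST2=01 as described, with ST0 carrying the abnormal termination code plus both the not ready and unit check bits.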
Edit: Just confirmed that format.com is now properly formatting the IMD disk image in ascending order(track 0 head 0, track 0 head 1, track 1 head 0, track 1 head 1 etc.).
Edit: Just confirmed it. Format.com now properly formats the disk and it's properly readable and writable by MS-DOS 6.22 without visible issues. 😁
OK. The FDC in Windows NT 3.1 can format and use the IMD disk image without visible issues as well.
And, interestingly enough, CheckIt from its MS-DOS prompt crashes within it (executing a 0xFFFF instruction).
And XTIDECFG (the configuration tool for the XT-IDE BIOS) crashes Windows NT 3.1 with a 0x7F Unexpected Kernel Fault BSOD, which is kind of strange, but it tries at least (no FDC errors there).
Still need to test Windows 95 RTM though. I sure hope it'll work once I've tested its FDC crash... Probably fixed, because the issue with the DMA controller is now fixed (which should no longer crash Windows)...
Edit: OK. After fixing the not ready behaviour of the FDC(hang the controller when no disk is present instead of returning an error code for read/write/format/read sector ID), Windows NT should detect them correctly now again(the disk not being inserted).
Having experimented a bit with Windows 95 RTM setup, I found out that somehow Windows 95 doesn't seem to like unformatted floppy disks to be inserted when creating the boot disk. After having used a formatted disk(formatted using Windows NT 3.1), it proceeds to format the disk normally and copy the files to the startup disk. 😁
Also, for MS-DOS, the unformatted floppy needs to be formatted with the /U option for unconditional format, because otherwise it'll try to check for previous data on the disk, which of course fails(since there's no sectors on the disk yet). Unconditional format seems to bypass that, formatting the disk normally.
Windows 95 setup now manages to create the boot disk without issues (only when using a formatted disk, like an IMD disk image that has been formatted already or a normal static disk image (IMA/DSK/IMG format) that already has all its sector IDs in the correct place).
Or perhaps it only needs the first track to be properly formatted for the startup disk creation to succeed?
OK. MS-DOS 6.22 still requires the /u parameter for format.com, because otherwise it'll try to check the existing disk format, which will report failure and not ready for the unformatted disk, causing format.com to abort(the same effect as Windows 95 executing dir on the floppy disk, the drive not becoming ready).
When specifying the /u parameter, however it proceeds to properly format the disk. That seems to work correctly.
Edit: Just confirmed the floppy disk and DMA emulation now working properly for all 3 operating systems(Windows NT 3.1, MS-DOS 6.22 and Windows 95)! 😁
Just tried Debian Bo again against the current FDC emulation.
I see it's somehow actually executing 4 sense interrupts after resetting the controller, which clears the ST0's error indication bits after that(although still returning the bits set on the last drive(drive #3, which together with drive #2 doesn't exist)).
Then immediately after the results are in, it somehow starts another Sense Interrupt command, which of course errors out as an invalid command, returning ST0=0x80? The linux driver doesn't seem to like that?
That's at line 1720 of the linux 2.0.33's floppy.c driver(linux/drivers/block/floppy.c). But isn't an IRQ supposed to happen when the floppy is reset?
Just adjusted the reset IRQ to occur after 20us. Does anyone know what the exact timing is? Linux 2.0.33's floppy driver seems to depend on it?
Also, just tried changing the behaviour for unmounted floppy disk drives (drives without any media inserted) to, instead of hanging the controller (as Bochs does), actually give an error result with ST0.Ready=0 in its reporting.
Turns out, with Read ID on Windows NT it just infinitely tries to re-execute the command incorrectly and thinks the disk is unformatted, while Windows 95 thinks the disk is unformatted.
So the only correct behaviour seems to be to hang the controller when unmounted, give the error result when unformatted, error out in another way when the sector isn't found(same error with the first sector not being found), and the next sector not being found giving a slightly different error(the sector following the previous sector is a mismatch instead of a nothing found at all). And of course the final result that can happen is the sector normally being found, in which case the result code becomes a normal success value(with any specific flags set when the cylinder ID etc. isn't matched, WC, Deleted sector marks encountered etc.), otherwise it's erroring out in a normal documented way(according to the documentation).
OK. I've fixed some behaviour on the FDC: the IMD disk image Read ID command now properly detects the amount of sectors on the selected track; the FDC no longer hangs when a Sense Interrupt command completes after the result byte of 0x80 is sent (instead returning to command mode, as the documentation says); DumpReg and unknown commands no longer raise an IRQ; drive polling mode still raises an IRQ after reset when it's disabled; and unknown FDC commands always report 0x80 for the result (not adding in the drive number, physical head and not ready/unit check bits to the result, instead clearing them).
Now Linux seems to see that a disk is in there and is able to mount and read it(haven't tested writing/formatting, though).
I only still see issues with disk changing somehow? When I change the disk, Linux doesn't seem to notice that the disk is changed(I see no reads from the disk change flag(the DIR register's highest bit)). I do see that the reset procedure clears said flag, which should be correct behaviour?
Just improved a slight behavioural thing: when seeking during an implicit seek, it will now seek according to the current idea of the cylinder compared to the requested cylinder of the command. When the idea is different from what is requested, instead of what it did before(making the idea and physical cylinder the value of the requested cylinder), it will now actually try to seek the difference between the requested cylinder and the idea of the current cylinder. And said difference is also increasing the physical cylinder by the same amount, which will (like the seek command already did) now clip to the last track of the disk(which it didn't before). So if the physical cylinder is misaligned(track 0 isn't track 0), it will stay that way, unless seeking back to track 0 will cause it to clip back to 0, at which point the idea and physical track position become aligned again.
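The implicit seek behaviour described above can be sketched like this. It is a simplified model, not the actual emulator code: `implicit_seek` and its parameters are illustrative names, and it assumes the controller's idea of the cylinder becomes the requested cylinder once the seek is done.

```python
def implicit_seek(idea, physical, requested, last_track):
    """Seek by the difference between the requested cylinder and the
    controller's idea of the current cylinder, clipping the physical
    head position to the range of the drive."""
    delta = requested - idea
    physical = max(0, min(last_track, physical + delta))
    return requested, physical  # (new idea, new physical cylinder)

# Misaligned by +2: the misalignment is kept while seeking around.
print(implicit_seek(idea=0, physical=2, requested=5, last_track=79))
# Head physically behind the idea: seeking to 0 clips and realigns.
print(implicit_seek(idea=5, physical=3, requested=0, last_track=79))
```

In the first case the head ends up on physical cylinder 7 while the controller believes it is on 5; in the second case the clip at track 0 brings both back to 0.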
OK. One little question: what happens when a multi-sector read is performed while for example the sector numbers on the disk are interleaved? E.g. 1,9,2,8,3,7,4,6,5 and reading 9 sectors from sector 1 onwards using a read data command? Will the FDC read 9 sectors? Or will it abort after 1 sector with an error message?