I just fixed some bugs and made reads/writes with a sector count of 0 (which actually means 256 sectors) work properly (it was literally reading 0 sectors before, leaving a junk buffer).
I've also fixed the cases of reading past the end of the disk (LBA out of bounds), which caused a wrong error with the Read Multiple and Write Multiple commands (it now reports the error once the CPU has finished reading the block, or partial block, when it encounters read errors or an out-of-bounds LBA). In the out-of-bounds case, it now reads normally up to the final LBA of the disk, and everything after that is filled with zeroes (since you can't read past the end of a physical disk).
The strange thing that's now happening with Windows NT 3.1 is that I see it trying to read 100h sectors (a sector count register value of 0) and 3Fh sectors from the disk using the 0x20 Read Sector(s) command.
But the main issue with those seems to be that once it starts reading the data (done using a REP INSW), it seems to give up (resetting the ATA controller) before the transfer of all sectors is completed?
The entire transfer of 256 sectors would take about 7000ns times 200 (according to PCem: https://bitbucket.org/pcem_emulator/pcem/src/ … fault/src/ide.c ).
UniPCemu currently bases its timing information on the assumption that IDE_TIME is a duration of 7 microseconds?
Edit: Looking at the code again, it seems IDE_TIME is actually 10*TIMER_USEC, while TIMER_USEC is 1us? So IDE_TIME is 10us. And that means that UniPCemu's timing is too small and fast?
Since IDE_TIME is 10us, transferring one sector from disk actually takes 200*IDE_TIME (2000us) for the initial sector, then 6*IDE_TIME (60us) for each sector after that? So it's 2ms for the first sector, then 60us for each subsequent sector? Wouldn't that result in an extremely slow hard disk (top speed about 8.5MB/s with all sectors in a row, but seeing as you can only ask it to transfer 256 in a row, that's limited)? Those 256 sectors take 2ms for the first plus 60us for each of the other 255, thus 17.3ms for 256 sectors. So that's 0.0173 seconds for 256 sectors, thus ~14,797 sectors/second, so a rate of about 7.22MB/s?
Thinking about it, the largest delay is 200 times that, so 2 milliseconds? But 60us to transfer a whole sector isn't much, is it (since one clock cycle takes about 30ns)? That doesn't leave much time to do much else? A 33MHz 80386 running instructions in 1 cycle would be able to fit about 1980 instructions? But that's assuming 80486+ speeds, which a 80386 doesn't even have; going by Dosbox's/UniPCemu's 3 MIPS, it would take about 333.33ns to execute an instruction at that speed. That's a result of moving 180 bytes out of said buffer before another interrupt hits in that time?
Edit: Whoops, I made a mistake there. It's not that moving the buffer itself takes the time of 180 bytes. It's actually about 90 instructions (assuming 2 cycles/instruction) for the hard disk to get ready for its next task, not the other way around.