The UM8886BF has the bit set in its PCI configuration space as if it supported bus mastering, but it doesn't support the Intel Triton-compatible PCI Bus Master IDE Controller specification described here: https://pdos.csail.mit.edu/6.828/2018/reading … E-BusMaster.pdf, which is what the DMA checkbox in Win95 expects. The Linux 2.2/2.4 IDE code has an exception written into it to block DMA on the UM8886BF for this reason (in there among all the other buggy PCI IDE implementations). At least, unlike the more notorious PCI IDE chipsets, it won't lose data.
At least one revision of the UM8886BF, the one found on my GA-586AM, does support bus mastering, but it does not support scatter-gather I/O. Scatter-gather matters because in a paging-based OS a buffer that is contiguous in virtual memory is usually not contiguous in physical memory: every time the buffer crosses a page boundary, the physical address jumps. The Triton takes a table in memory listing each physical address and the number of bytes to transfer there, and walks that table on its own. With the UM8886BF, the CPU has to babysit the chip instead: load it with a memory address and dword count, spin until it finishes, load it with the next page's physical address, and so on. It was still faster than PIO in my testing, but it underperforms compared to a proper Triton.
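To make the page-boundary problem concrete, here is a rough sketch in C of how a driver might split a virtually contiguous buffer into the per-page entries a Triton-style controller consumes. The entry layout (32-bit physical address, 16-bit byte count, end-of-table flag in bit 31) follows the Bus Master IDE spec as I understand it. The names build_prd_table, prd_entry and demo_virt_to_phys are made up for illustration, and the address translation is a fake stand-in for whatever the OS really provides. It also assumes the buffer is word-aligned and the length is even.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define PRD_EOT   0x80000000u  /* end-of-table flag, bit 31 of the second dword */

/* One physical region descriptor as laid out in memory for the controller. */
typedef struct {
    uint32_t phys_addr;    /* physical base address of this region */
    uint32_t count_flags;  /* bits 15:0 = byte count, bit 31 = end of table */
} prd_entry;

/* Hypothetical translation hook; a real driver would ask the OS instead. */
typedef uint32_t (*virt_to_phys_fn)(uintptr_t vaddr);

/* Fake mapping for demonstration only: scatters consecutive virtual pages
 * across non-adjacent physical pages, the way a paging OS typically does. */
uint32_t demo_virt_to_phys(uintptr_t vaddr)
{
    uint32_t page   = (uint32_t)(vaddr / PAGE_SIZE);
    uint32_t offset = (uint32_t)(vaddr % PAGE_SIZE);
    return 0x100000u + ((page * 7u) % 16u) * PAGE_SIZE + offset;
}

/* Split [vaddr, vaddr + len) at page boundaries and emit one entry per piece.
 * Returns the number of entries written, or 0 if the table is too small. */
size_t build_prd_table(uintptr_t vaddr, size_t len, virt_to_phys_fn v2p,
                       prd_entry *table, size_t max_entries)
{
    size_t n = 0;

    while (len > 0) {
        if (n == max_entries)
            return 0;

        /* Bytes left before the next page boundary. */
        size_t in_page = PAGE_SIZE - (size_t)(vaddr & (PAGE_SIZE - 1));
        size_t chunk   = len < in_page ? len : in_page;

        table[n].phys_addr   = v2p(vaddr);
        table[n].count_flags = (uint32_t)chunk;  /* at most 4096, fits 16 bits */

        vaddr += chunk;
        len   -= chunk;
        n++;
    }

    if (n > 0)
        table[n - 1].count_flags |= PRD_EOT;  /* mark the last entry */

    return n;
}
```

On a Triton the driver hands the controller the address of this table once and waits for a single completion. Without scatter-gather, the CPU has to feed the chip each of these pieces by hand and wait in between.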
It's been some time since I looked at it, but as far as I remember it doesn't actually do DMA transfers between the chip and the drive. Instead it does bus-mastered PIO, as if that were a thing. It might do the opposite too.
The UM8886AF and BF do have a "FIFO mode" which speeds up PIO reads, but the FIFO has to be manually toggled between disabled, filling, and draining during each read. Multi-sector transfers also help. You might try the driver from UMC that supports these features, but it's said to be buggy on anything newer than Win95a. I've done a bit more research in this area but am not prepared to release anything yet, partly because of the risk of data loss; DM me to discuss more.