Yes.
8-bit mode (assuming an XT) gets penalized twice: the data path is half as wide as in 16-bit mode, AND the 4.77 MHz bus clock is roughly 20% slower than the 6 MHz clock of the 16-bit AT bus.
However, what you are actually measuring depends on what you compare, and it is easy to end up comparing apples to oranges.
If you are comparing a benchmark that does straight 8-bit writes on both, you end up just measuring the increase in speed of the bus.
If you are comparing 8-bit writes to 16-bit writes, this is apples to oranges; you are measuring the speedup the software gets by leveraging a larger word size on the data bus. Assuming 0 wait states and ideal bus optimization, you can expect a 100% increase in throughput per clock cycle from writing 16-bit words instead of 8-bit words.
(It is further complicated in the real world by some CPU operations requiring more than one clock cycle to complete. The real-world time to complete the same instruction can differ quite drastically between CPUs whose implementations finish it in 2 cycles instead of 3, or 1 cycle instead of 2, for example. Some of the instructions that changed between the 286 and 386, for instance, had such improvements in cycle counts, in addition to the word size and data bus width being made larger. Similar story for 386 to 486, and 486 to Pentium. Some versions of some chips had differences that made a big impact in this area, like the 386SX vs 386DX, which had a 16-bit vs 32-bit external bus; the narrower variant needs wait states or multiple bus cycles to do a 32-bit write. There are a lot of devils in the details with some arbitrary benchmarking methods.)
If you are comparing 8-bit writes on an XT's 4.77 MHz bus against 16-bit writes on a 386's 8 MHz bus, you are comparing very different beasts straight up. All things being considered ideal, you are looking at roughly 3.4 times the theoretical throughput (8 x 16 bits against 4.77 x 8 bits per unit time, or about a 235% increase). (Possibly larger when you get things like cache involved, but that would mostly only happen on READS, not writes.)
[NOTE: this is meant as a comparison of the rate at which you can redraw the screen on an actual XT versus a 386 AT clone.]
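For anyone who wants to play with the numbers, here is a rough Python sketch of the idealized arithmetic behind the three comparisons. It assumes zero wait states and the same number of bus clocks per transfer on every machine, which real ISA hardware does not deliver, so only the ratios mean anything. The function name and the `cycles_per_transfer` parameter are made up for illustration.

```python
def ideal_throughput(bus_mhz, width_bits, cycles_per_transfer=1):
    """Idealized bytes/second: one transfer every `cycles_per_transfer` bus clocks.

    Assumption: no wait states, no CPU overhead. Real transfers take several
    clocks each, so treat the absolute figures as meaningless and look only
    at the ratios between configurations.
    """
    return bus_mhz * 1_000_000 * (width_bits // 8) / cycles_per_transfer


xt_8bit = ideal_throughput(4.77, 8)      # XT, 8-bit writes
at_8bit = ideal_throughput(6.0, 8)       # 6 MHz AT bus, 8-bit writes
at_16bit = ideal_throughput(6.0, 16)     # 6 MHz AT bus, 16-bit writes
clone_16bit = ideal_throughput(8.0, 16)  # 8 MHz 386 clone, 16-bit writes

# Same write width on both machines: you only measure the bus clock.
bus_only = at_8bit / xt_8bit             # about 1.26x

# Same bus, wider writes: you only measure the word size.
width_only = at_16bit / at_8bit          # exactly 2x

# Different bus AND different width: both effects multiplied together.
both = clone_16bit / xt_8bit             # about 3.35x

print(f"bus clock only:  {bus_only:.2f}x")
print(f"word size only:  {width_only:.2f}x")
print(f"clock and width: {both:.2f}x")
```

Under these assumptions the XT versus 8 MHz 16-bit case comes out around 3.35x. Bump `cycles_per_transfer` for one side to see how quickly wait states eat into that.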
As for having a slow ISA video card: remember, these devices had... conflicting... intentions in the marketplace compared with what we want them for now. Lots of commodity parts were made for displaying boring business data on the screen, and little else. How fast you can redraw the screen is immaterial for such applications. WordPro, WordPerfect, Lotus 1-2-3, and pals really just wanted the higher display modes to cram more text / data on the screen at once, not to get more than 30 fps in a raycast FPS game. 😜
The PC very much wanted to be seen as a business instrument, and not a glorified game console, you see.