First post, by computergeek92
How much faster is a Pentium MMX 166 vs a Classic Pentium 166? What about comparing 200MHz versions?
Dedicated Windows 95 Aficionado for good reasons:
http://toastytech.com/evil/setup.html
The doubled L1 cache of the MMX makes more of a difference than the MMX instructions do, at least for ~1997-1998 era software.
A Pentium MMX 166MHz performs about the same as a Pentium 200MHz. It all depends on the application used though.
Acer Helios Neo 16 | i7-13700HX | 64G DDR5 | RTX 4070M | 32" AOC 75Hz 2K IPS + 17" DEC CRT 1024x768 @ 85Hz
Win11 + Virtualization => Emudeck @consoles | pcem @DOS~Win95 | Virtualbox @Win98SE & softGPU | VMware @2K&XP | ΕΧΟDΟS
I remember it being a bit slower in some operations - but I could be wrong 😉
Visit my AmiBay items for sale (updated: 2025-03-14). I also take requests 😉
https://www.amibay.com/members/kixs.977/#sales-threads
Some results for Voodoo 2 and SLI: http://www.philscomputerlab.com/voodoo-2-and- … ng-project.html
Depending on what you're running your mileage will vary.
wrote:I remember it being a bit slower in some operations - but I could be wrong 😉
That is also what I recall. The reason for this is that they had to add an extra pipeline stage to incorporate the new MMX extensions.
As far as I recall, they were about the same in most cases, certainly not like a PMMX166 being as fast as a P200, when MMX wasn't used. But cache-heavy stuff may benefit the PMMX more.
Well, if you look at Phil's links above, in games, the MMX 166 actually performs a little better than a Classic P200.
wrote:Well, if you look at Phil's links above, in games, the MMX 166 actually performs a little better than a Classic P200.
Games are best-case here, they probably use a lot of hand-optimized MMX code, and also benefit most from cache.
I don't think this is representative of other applications, or perhaps even representative of running software-rendered games.
The L1 makes more difference than you think because it doesn't need super-specific optimization, in contrast to the MMX instruction support.
I agree that this is the best-case, BUT in this case the P166 MMX flies ABOVE the P200. I said that on average, it's about the same between those two.
Somewhere between 0% better and 20% better (a P166 to a P166MMX), and somewhere inside the +10% to +30% margin in more advanced software (like 3d games).
Keep in mind that a P200 is clocked 20.5% higher than a P166 MMX and that most uses for old PCs nowadays are games, and my 1st answer in this topic is completely explained. 😀
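For reference, the clock arithmetic behind that figure can be sketched as below. The nominal numbers (200 vs 166) give about 20.5%; with the standard 66.67 MHz bus and the 2.5x / 3x multipliers the real clocks are 166.67 and 200 MHz, which works out to 20%. The function name is only illustrative.

```python
# Clock advantage of a P200 over a P166, nominal vs. actual clocks.
# Assumption: standard 66.67 MHz bus with 2.5x (P166) and 3x (P200) multipliers.
BUS_MHZ = 200 / 3


def clock_advantage(fast_mhz, slow_mhz):
    """Return how much higher fast_mhz is than slow_mhz, in percent."""
    return (fast_mhz / slow_mhz - 1) * 100


nominal = clock_advantage(200, 166)
actual = clock_advantage(3 * BUS_MHZ, 2.5 * BUS_MHZ)
print(f"nominal: {nominal:.1f}%  actual: {actual:.1f}%")
```

So a P166 MMX needs roughly a 20% per-clock gain to break even with a classic P200.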
wrote:The L1 makes more difference than you think because it doesn't need super-specific optimization, in contrast to the MMX instruction support.
The thing is though, if you optimize your code tightly for a regular Pentium, your 16K L1 cache may be enough, most of the time.
wrote:Somewhere between 0% better and 20% better (a P166 to a P166MMX), and somewhere inside the +10% to +30% margin in more advanced software (like 3d games).
It can actually be slower: as I said, the longer pipeline adds latency for certain instructions.
I think you also have to make a big distinction between software 3d and hardware accelerated 3d.
Hardware acceleration probably benefits a lot from MMX to pack and stream the data to the videochip.
Software renderers are less 'streaming' in that sense. They could benefit from a hand-optimized MMX rasterizer (especially when rendering RGB, but Pentiums were really too slow for that), but if you run a regular rasterizer (e.g. Doom/Quake in software mode), I bet the difference isn't that large.
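To make the "packed RGB" point concrete, here is a rough Python emulation of what a single MMX instruction such as PADDUSB does: it adds eight unsigned bytes at once, clamping each at 255, so two 32-bit pixels can be brightened or blended in one operation. This only models the arithmetic, not the speed; the helper names and the RGBA byte layout are assumptions for illustration.

```python
# Emulation of the MMX PADDUSB instruction on 64-bit integers.
# Assumption: pixels are stored as (R, G, B, A) bytes, pixel 0 in the low bytes.

def paddusb(a, b):
    """Add eight unsigned byte lanes of a and b, saturating each lane at 255."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        x = (a >> shift) & 0xFF
        y = (b >> shift) & 0xFF
        result |= min(x + y, 0xFF) << shift
    return result


def pack_pixels(pixels):
    """Pack two 4-channel pixels into one 64-bit word."""
    word = 0
    channels = [c for pixel in pixels for c in pixel]
    for i, channel in enumerate(channels):
        word |= (channel & 0xFF) << (8 * i)
    return word


def unpack_pixels(word):
    """Split a 64-bit word back into two 4-channel pixels."""
    channels = [(word >> (8 * i)) & 0xFF for i in range(8)]
    return [tuple(channels[0:4]), tuple(channels[4:8])]


base = pack_pixels([(200, 100, 50, 255), (10, 20, 30, 40)])
light = pack_pixels([(100, 100, 100, 0), (5, 5, 5, 5)])
print(unpack_pixels(paddusb(base, light)))
```

Without MMX, each of those eight channels needs its own add and clamp, which is why per-pixel RGB work is where the instructions could pay off.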
Good old Toms Hardware review of Pentium MMX:
http://www.tomshardware.com/reviews/pentium-m … tations,19.html
The new Pentium MMX hardly shows any improvement for DOS Gamers. An increase of 2.5% is hardly worth mentioning.
wrote:Good old Toms Hardware review of Pentium MMX:
http://www.tomshardware.com/reviews/pentium-m … tations,19.html
Ah yes, they tested Quake in software mode, and indeed, little difference.
I guess it depends very much on how you used your Pentium/PMMX. I was writing optimized software renderers in assembly back in the day. Slightly before the big bang of 3d hardware acceleration. I never actually coupled a Pentium or Pentium MMX to a 3d accelerator I think. Moved to PII by that time.
I also did a few benchmarks: thandor.net - Pentium MMX 200. Red is the MMX CPU, and grey is the regular Pentium. Both at 200MHz with little differences.
As mentioned by others: back in its day MMX wasn't widely used, so the L1 cache made the (little) difference.
thandor.net - hardware
And the rest of us would be carousing the aisles, stuffing baloney.
In Phil's benchmark suite :
R9 3900X/X470 Taichi/32GB 3600CL15/5700XT AE/Marantz PM7005
i7 980X/R9 290X/X-Fi titanium | FX-57/X1950XTX/Audigy 2ZS
Athlon 1000T Slot A/GeForce 3/AWE64G | K5 PR 200/ET6000/AWE32
Ppro 200 1M/Voodoo 3 2000/AWE 32 | iDX4 100/S3 864 VLB/SB16
wrote:In Phil's benchmark suite
Where is it from?
wrote:wrote:In Phil's benchmark suite
Where is it from?
Phil. (I couldn't resist:)
Funny 🤣
http://www.philscomputerlab.com/486-benchmark-suite.html
It doesn't include speedsys though. This is the suite used for this project if you want to compare other people's systems: Phil's Ultimate VGA Benchmark Database Project
wrote:That is also what I recall. The reason for this is that they had to add an extra pipeline stage to incorporate the new MMX extensions.
As far as I recall, they were about the same in most cases, certainly not like a PMMX166 being as fast as a P200, when MMX wasn't used. But cache-heavy stuff may benefit the PMMX more.
^ This.
At least Voodoo II cards under Windows perform better on the MMX. Not sure if it's a driver optimization. Here the MMX166 can beat the P200. Not sure about other cards, like Nvidia / ATI.
The other benefit is that the MMX draws less power and runs a bit cooler, but fewer boards work with it.
wrote:It doesn't include speedsys though.
Neither the 486 benchmark suite nor the Ultimate VGA Benchmark includes speedsys, and neither contains the table above. Hence the mystery remains.
wrote:Phil. (I couldn't resist:)
Your wisdom is beyond your years.
Common knowledge would be that PMMX 166 roughly compares to P1 200 on average.
At that speed the bus bandwidth starts to become a bottleneck so a P1 200 is less than 20% faster than a P1 166.
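A rough way to see the bus effect is an Amdahl-style estimate: both chips sit on the same 66 MHz bus (2.5x vs 3x multiplier), so only the part of the runtime spent inside the core speeds up with the clock. The memory-bound fractions below are made-up illustrative values, not measurements.

```python
# Amdahl-style estimate of P200 vs P166 on the same 66 MHz bus.
# Assumption: memory-bound time does not shrink when only the multiplier rises.

def relative_speed(core_ratio, memory_bound_fraction):
    """Overall speedup when only the core-bound share scales with core_ratio."""
    core_share = 1.0 - memory_bound_fraction
    new_time = core_share / core_ratio + memory_bound_fraction
    return 1.0 / new_time


CORE_RATIO = 3.0 / 2.5  # 3x multiplier vs 2.5x multiplier

for fraction in (0.0, 0.2, 0.4):  # hypothetical memory-bound shares
    gain = (relative_speed(CORE_RATIO, fraction) - 1) * 100
    print(f"memory-bound {fraction:.0%}: P200 is {gain:.1f}% faster")
```

Only a workload that never waits on memory gets the full 20%; any memory-bound share pulls the gain below that.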
Aside from the larger cache (16kB code + 16kB data instead of 8kB code + 8kB data), the MMX chip has some other changes as well.
The larger cache probably has the biggest influence, though.
Quake is heavily optimized for the original Pentium. I suppose that includes its cache size, so the larger cache shouldn't have too big an influence here.
wrote:Hardware acceleration probably benefits a lot from MMX to pack and stream the data to the videochip.
I thought that 3D hardware acceleration means that the program is doing lots of floating point calculations. Since it is somewhat expensive to switch between floating point and MMX modes (that's one of the improvements in SSE) MMX shouldn't be much help here. I thought that MMX is much more useful for sound processing (eg. Unreal), or maybe software textured 3D (there is a special version of POD IIRC). But since Scali does lots of 3D programming he might know what he's talking about.