VOGONS


Pentium MMX 450MHz

First post, by ph4nt0m

Played with a Pentium MMX (Tillamook) recently: SL2Z4, 250nm, 266MHz, PPGA for Socket 7, 2.0Vcore and 2.5Vio. Modded my MVP3-based FIC VA-503+ mainboard so that it could set the maximum 4x multiplier for this CPU. It overclocks like a charm. The only drawback is that it doesn't work with the cache installed on the mainboard, which has to be disabled.

266MHz default
up to 350MHz at 2.0V
400MHz at 2.2V
450MHz at 2.7V
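
For a rough idea of what these speeds ask of the mainboard, the implied bus clock is simply core clock divided by multiplier. This is only a sketch: it assumes the 4x multiplier was used at every setting, which the post does not state (350MHz in particular may well have been reached with a different multiplier), and `bus_clock_mhz` is a name made up for illustration.

```python
# Core clocks (MHz) and core voltages (V) as listed in the post.
SETTINGS = [(266, 2.0), (350, 2.0), (400, 2.2), (450, 2.7)]

MULTIPLIER = 4.0  # assumption: the 4x multiplier is used at every setting


def bus_clock_mhz(core_mhz, multiplier=MULTIPLIER):
    """Bus clock implied by a core clock at a given multiplier."""
    return core_mhz / multiplier


for core, vcore in SETTINGS:
    bus = bus_clock_mhz(core)
    print(f"{core} MHz at {vcore:.1f} V -> {bus:.1f} MHz bus at {MULTIPLIER:g}x")
```

Under that assumption, 450MHz needs a 112.5MHz bus and 400MHz a 100MHz bus.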

cE51Wxr.png

It seems this Tillamook is the fastest non-AMD CPU for Socket 7, so I have run Quake v1.06 timedemo1 to compare it with K6-2 and K6-3 also at 450MHz and also with the mainboard cache disabled:

Tillamook = 67.9 fps
K6-2 = 61.5 fps
K6-3 = 77.6 fps
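
The relative standings in these cache-disabled results can be worked out directly. A minimal sketch using only the three figures above; `lead_percent` is a hypothetical helper name.

```python
# Quake v1.06 timedemo1 results (fps) at 450MHz, mainboard cache disabled,
# copied from the post.
CACHE_OFF = {"Tillamook": 67.9, "K6-2": 61.5, "K6-3": 77.6}


def lead_percent(a, b):
    """How much faster result a is than result b, in percent."""
    return (a / b - 1.0) * 100.0


tillamook_over_k62 = lead_percent(CACHE_OFF["Tillamook"], CACHE_OFF["K6-2"])
k63_over_tillamook = lead_percent(CACHE_OFF["K6-3"], CACHE_OFF["Tillamook"])

print(f"Tillamook over K6-2: {tillamook_over_k62:.1f}%")
print(f"K6-3 over Tillamook: {k63_over_tillamook:.1f}%")
```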

It beats the K6-2 by 10% with half as much L1 cache! That's what the pipelined FPU is good for. The K6-2 does get quite a boost with the 1MB write-back mainboard cache enabled, though:

K6-2 = 76.0 fps
K6-3 = 83.0 fps

Now I have to figure out how to get this cache working with Tillamook.

My Active Sales on CPU-World

Reply 1 of 45, by Horun

Interesting! Have you figured out how fast it will run with the external cache still enabled? That might point to how you can get it working at 450MHz... just a thought.

Hate posting a reply and then have to edit it because it made no sense 😁 First computer was an IBM 3270 workstation with CGA monitor. Stuff: https://archive.org/details/@horun

Reply 2 of 45, by luckybob

EWW. 2.7V is REALLY scary. I mean if you are keeping it at sub-ambient temps it might be okay, but even so I'd expect that chip to fail in short order. I feel 10% voltage boost is the MAXIMUM safe overclocking voltage.
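
To put the 10% rule of thumb next to the voltages in the first post, here is a small sketch. It assumes the 2.0V stock Vcore quoted there as the baseline; `overvolt_percent` is a hypothetical helper name.

```python
STOCK_VCORE = 2.0  # volts, stock core voltage given in the first post


def overvolt_percent(vcore, stock=STOCK_VCORE):
    """Percentage by which vcore exceeds stock, rounded to one decimal."""
    return round((vcore / stock - 1.0) * 100.0, 1)


for vcore in (2.2, 2.7):
    pct = overvolt_percent(vcore)
    verdict = "within" if pct <= 10.0 else "beyond"
    print(f"{vcore:.1f} V is {pct:.1f}% over stock, {verdict} a 10% margin")
```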

It is a mistake to think you can solve any major problems just with potatoes.

Reply 3 of 45, by Horun

luckybob wrote on 2020-04-20, 02:26:

EWW. 2.7V is REALLY scary. I mean if you are keeping it at sub-ambient temps it might be okay, but even so I'd expect that chip to fail in short order. I feel 10% voltage boost is the MAXIMUM safe overclocking voltage.

🤣 thanks ! I missed the "400MHz at 2.2V, 450MHz at 2.7V" part. Agree !


Reply 5 of 45, by The Serpent Rider

2.7V is nothing to worry about for a 250nm CPU. Although the optimal choice would be 400MHz at 2.2V - best voltage/speed ratio.

I must be some kind of standard: the anonymous gangbanger of the 21st century.

Reply 6 of 45, by mpe

Nice. I am also keeping my S7 Tillamook 233 for these kinds of experiments.

What sort of SRAM chips is your motherboard using?

I remember that the problem with Tillamook and L2 cache is the low-level output voltage of synchronous SRAM chips. The maximum voltage Tillamook can recognise as a low signal is 0.5V (Vil). Many SRAM chips output more than that.
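
The condition described here boils down to a single comparison between the SRAM's low-level output voltage and the CPU's input-low threshold. A minimal sketch, taking the 0.5V figure from this post as given; the SRAM voltages in the loop are hypothetical values chosen for illustration and do not come from any datasheet.

```python
TILLAMOOK_VIL_MAX = 0.5  # volts, threshold quoted in the post


def low_is_recognised(sram_vol, vil_max=TILLAMOOK_VIL_MAX):
    """True if the SRAM's low-level output is low enough to read as logic 0."""
    return sram_vol <= vil_max


# Hypothetical low-level output voltages, for illustration only.
for vol in (0.2, 0.4, 0.8):
    state = "ok" if low_is_recognised(vol) else "too high"
    print(f"SRAM low level {vol:.1f} V: {state}")
```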

Blog|NexGen 586|S4

Reply 7 of 45, by ph4nt0m

I have fixed the cache issue 😀

There are BRDY# and ADS# signals on the P5 system bus which indicate beginning of every read/write bus transaction. These are shadowed by BRDYC# and ADSC# signals intended for mainboard cache subsystem because the system bus in dual processor mode is heavily loaded. Intel removed support for dual processing from their embedded Pentium processors including Tillamook, so 7 signal pins were disabled (CPUTYP, D/P#, FRCMC#, PBGNT#, PBREQ#, PHIT#, PHITM#). BRDYC# and ADSC# were dropped as well. It seems many desktop mainboard manufacturers have actually used BRDYC# and ADSC# in their designs. No surprise Tillamook cannot work with the cache there. However if you connect BRDYC# to BRDY# and ADSC# to ADS#, it works just fine.

LQupD9x.png

Quake timedemo1 = 77.0 fps

K6-2 beaten decisively by a competitor with half as many transistors 😁

P.S. Had to increase Vcore to 2.8V at 450MHz to deal with a higher load due to the L2 cache.


Reply 8 of 45, by feipoa

Impressive results! How far into Windows testing did you get at 2.8 V on a Tillamook? I wonder how long-term stable 450 MHz would be. If setting up a system with this chip, I'd personally settle for 400 MHz and 2.2 V.

Based on your L2-enabled DOS Quake results, it looks like the Tillamook at 450 MHz yields the same performance as a K6-3 450. I guess the only way to top that on a super socket 7 is to run a K6-2/3+ at 500-600 MHz.

The ADS# to ADSC# and BRDY# to BRDYC# bridging has been known for some years now. There should be a post on Vogons about it. I tried numerous motherboards with these bridges on the CPU and still found getting L2 to function was hit-or-miss. I found one board which only liked the ADS# to ADSC# bridge, but not the BRDY# to BRDYC# bridge. The only board I have which ran stable with the L2 cache enabled was the NEC Proserva V Plus system. I had nothing but problems with the FIC VA-503+ board in the past (early 2000s) and chucked it during a move, so I cannot comment on its performance with the L2 and Tillamook.

Plan your life wisely, you'll be dead before you know it.

Reply 9 of 45, by H3nrik V!

ph4nt0m wrote on 2020-04-20, 23:38:

There are BRDY# and ADS# signals on the P5 system bus which indicate beginning of every read/write bus transaction. These are shadowed by BRDYC# and ADSC# signals intended for mainboard cache subsystem because the system bus in dual processor mode is heavily loaded. Intel removed support for dual processing from their embedded Pentium processors including Tillamook, so 7 signal pins were disabled (CPUTYP, D/P#, FRCMC#, PBGNT#, PBREQ#, PHIT#, PHITM#). BRDYC# and ADSC# were dropped as well. It seems many desktop mainboard manufacturers have actually used BRDYC# and ADSC# in their designs. No surprise Tillamook cannot work with the cache there. However if you connect BRDYC# to BRDY# and ADSC# to ADS#, it works just fine.

So .. 2 shorted connections on the CPU and it worked? Is that the case?

Please use the "quote" option if asking questions to what I write - it will really up the chances of me noticing 😀

Reply 10 of 45, by feipoa

There's a 3rd short for 4x as well.
EDIT: This is what I have in my notes. ph4nt0m, are these the mods you made?

Tillamook_CPU-side_mods_for_L2.jpg


Reply 11 of 45, by frudi

ph4nt0m wrote on 2020-04-20, 23:38:

K6-2 beaten decisively by a competitor with half as many transistors 😁

feipoa wrote on 2020-04-21, 06:37:

Based on your L2-enabled DOS Quake results, it looks like the Tillamook at 450 MHz yields the same performance as a K6-3 450. I guess the only way to top that on a super socket 7 is to run a K6-2/3+ at 500-600 MHz.

What am I missing? With L2 enabled, K6-2 and Tillamook perform virtually identically (76 vs 77 fps), while the K6-3 sails past both at 83 fps.

Seems odd, honestly. Isn't the Pentium MMX supposed to be easily faster per clock in Quake than any K6 family CPU? Compared to the L2 disabled scores, it seems the Tillamook gains far less from enabling L2 than the K6-2 (while K6-3's small gains make sense because of on-die L2). Maybe there's some other bottleneck at play?
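
The gains in question can be put into numbers from the figures posted earlier in the thread. A minimal sketch; `gain_percent` is a hypothetical helper, and note that the Tillamook cache-on run was made at 2.8V rather than 2.7V, still at 450MHz.

```python
# Quake timedemo1 results (fps) at 450MHz, from earlier posts in the thread.
CACHE_OFF = {"Tillamook": 67.9, "K6-2": 61.5, "K6-3": 77.6}
CACHE_ON = {"Tillamook": 77.0, "K6-2": 76.0, "K6-3": 83.0}


def gain_percent(cpu):
    """Percentage fps gain from enabling the mainboard cache."""
    return (CACHE_ON[cpu] / CACHE_OFF[cpu] - 1.0) * 100.0


for cpu in CACHE_OFF:
    print(f"{cpu}: {CACHE_OFF[cpu]} -> {CACHE_ON[cpu]} fps ({gain_percent(cpu):+.1f}%)")
```

That works out to roughly +13% for the Tillamook, +24% for the K6-2 and +7% for the K6-3.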

Reply 12 of 45, by ph4nt0m

H3nrik V! wrote on 2020-04-21, 06:40:
ph4nt0m wrote on 2020-04-20, 23:38:

There are BRDY# and ADS# signals on the P5 system bus which indicate beginning of every read/write bus transaction. These are shadowed by BRDYC# and ADSC# signals intended for mainboard cache subsystem because the system bus in dual processor mode is heavily loaded. Intel removed support for dual processing from their embedded Pentium processors including Tillamook, so 7 signal pins were disabled (CPUTYP, D/P#, FRCMC#, PBGNT#, PBREQ#, PHIT#, PHITM#). BRDYC# and ADSC# were dropped as well. It seems many desktop mainboard manufacturers have actually used BRDYC# and ADSC# in their designs. No surprise Tillamook cannot work with the cache there. However if you connect BRDYC# to BRDY# and ADSC# to ADS#, it works just fine.

So .. 2 shorted connections on the CPU and it worked? Is that the case?

That's pretty much it. Two zero ohm resistors and a piece of wire.

W33 to W35 for the 4x multiplier
Y3 to X4 for BRDYC#
AM2 to AJ5 for ADSC#

Just like in Feipoa's image. Although my FIC VA-503+ has been heavily customised over the years. In particular, all major capacitors and FETs were replaced. Much better stability.

frudi wrote on 2020-04-21, 07:50:

What am I missing? With L2 enabled, K6-2 and Tillamook perform virtually identically (76 vs 77 fps), while the K6-3 sails past both at 83 fps.

Seems odd, honestly. Isn't the Pentium MMX supposed to be easily faster per clock in Quake than any K6 family CPU? Compared to the L2 disabled scores, it seems the Tillamook gains far less from enabling L2 than the K6-2 (while K6-3's small gains make sense because of on-die L2). Maybe there's some other bottleneck at play?

Many ppl run K6-3 / K6-2+ / K6-3+ with mainboard cache disabled because they think it doesn't matter much. It still does if you pay attention to cacheable range.

Pentium MMX has a much better pipelined FPU, but K6 has a much better out-of-order execution ALU and twice as much L1 cache. However, both I-cache and D-cache are 4-way set associative on Pentium MMX, while on K6 they are 2-way set associative with an additional 20KB of instruction predecode cache. I guess 4-way vs. 2-way is why K6-2 is more sensitive to the presence/absence of mainboard cache. I also noticed a long time ago that integer performance of K6-2 fails to scale very well above the 4x multiplier, even with mainboard cache enabled.


Reply 13 of 45, by ShovelKnight

I run my K6-III+ with the motherboard cache disabled. It's actually slightly faster in DOS Quake that way, and I benchmarked it many times to make sure it's not a measurement error. Other games are slightly slower but the difference is about the same as the difference between K6-III+ and K6-II+ at the same clock speed.

Reply 14 of 45, by ph4nt0m

Just for the record, screen shots of K6-2 and K6-3 running on the same mainboard with the same settings.

K6-2 results (screenshot 1ZAraqC.png):

Cache Level 1
1677.63 MB/s read
1643.74 MB/s write
1645.15 MB/s move
1655.50 MB/s average

Cache Level 2
528.49 MB/s read
245.38 MB/s write
245.38 MB/s move
339.75 MB/s average

Memory
337.35 MB/s read
110.00 MB/s write
110.00 MB/s move
185.78 MB/s average

Cache Level 1 (MMX)
2443.02 MB/s read
2425.74 MB/s write
1355.89 MB/s move
2074.89 MB/s average

Cache Level 2 (MMX)
528.51 MB/s read
245.39 MB/s write
245.39 MB/s move
339.76 MB/s average

Memory (MMX)
337.35 MB/s read
110.00 MB/s write
110.00 MB/s move
185.79 MB/s average

K6-3 results (screenshot ldPvFab.png):

Cache Level 1
1706.25 MB/s read
1704.43 MB/s write
1706.86 MB/s move
1705.85 MB/s average

Cache Level 2
1144.62 MB/s read
981.20 MB/s write
1056.69 MB/s move
1060.83 MB/s average

Cache Level 3
490.72 MB/s read
241.09 MB/s write
241.08 MB/s move
324.30 MB/s average

Memory
321.43 MB/s read
144.88 MB/s write
144.88 MB/s move
203.73 MB/s average

Cache Level 1 (MMX)
2489.91 MB/s read
2489.26 MB/s write
1373.35 MB/s move
2117.51 MB/s average

Cache Level 2 (MMX)
1615.80 MB/s read
1248.84 MB/s write
1169.15 MB/s move
1344.59 MB/s average

Cache Level 3 (MMX)
528.51 MB/s read
245.34 MB/s write
232.93 MB/s move
335.59 MB/s average

Memory (MMX)
337.36 MB/s read
144.89 MB/s write
123.91 MB/s move
202.05 MB/s average

It's interesting that memory write performance is much better on K6-3 while performance related to mainboard cache is identical. The CPUs are K6-2/450AFX and K6-III/450AHX

Tillamook's results:

Cache Level 1
572.14 MB/s read
84.91 MB/s write
1710.51 MB/s move
789.18 MB/s average

Cache Level 2
312.31 MB/s read
84.58 MB/s write
229.02 MB/s move
208.64 MB/s average

Memory
225.33 MB/s read
84.73 MB/s write
110.01 MB/s move
140.02 MB/s average

Cache Level 1 (MMX)
1019.22 MB/s read
169.60 MB/s write
524.73 MB/s move
571.18 MB/s average

Cache Level 2 (MMX)
429.44 MB/s read
168.95 MB/s write
229.04 MB/s move
275.81 MB/s average

Memory (MMX)
281.48 MB/s read
169.21 MB/s write
110.02 MB/s move
186.90 MB/s average
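
A quick way to sanity-check these figures: the "average" column appears to be the plain arithmetic mean of read, write and move. That is inferred from the numbers themselves, not from any Speedsys documentation. The sketch below recomputes it for the Tillamook rows and also prints the read/write ratio; `mean3` is a hypothetical helper name.

```python
# Tillamook Speedsys figures from the post: (read, write, move, average), MB/s.
TILLAMOOK = {
    "Cache Level 1": (572.14, 84.91, 1710.51, 789.18),
    "Cache Level 2": (312.31, 84.58, 229.02, 208.64),
    "Memory": (225.33, 84.73, 110.01, 140.02),
    "Cache Level 1 (MMX)": (1019.22, 169.60, 524.73, 571.18),
    "Cache Level 2 (MMX)": (429.44, 168.95, 229.04, 275.81),
    "Memory (MMX)": (281.48, 169.21, 110.02, 186.90),
}


def mean3(read, write, move):
    """Arithmetic mean of the three throughput figures."""
    return (read + write + move) / 3.0


for name, (read, write, move, listed) in TILLAMOOK.items():
    computed = mean3(read, write, move)
    ratio = read / write
    print(f"{name}: mean {computed:.2f} (listed {listed:.2f}), read/write {ratio:.2f}x")
```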

BTW, why is write performance such BS on many older CPUs, including Pentium MMX? That cannot be right with write-back caches.


Reply 15 of 45, by ph4nt0m

ShovelKnight wrote on 2020-04-21, 14:13:

I run my K6-III+ with the motherboard cache disabled. It's actually slightly faster in DOS Quake that way, and I benchmarked it many times to make sure it's not a measurement error. Other games are slightly slower but the difference is about the same as the difference between K6-III+ and K6-II+ at the same clock speed.

What mainboard, how much cache and how much memory installed?


Reply 16 of 45, by luckybob

I really don't have a lot of faith in Speedsys' benchmarks. It's really just a number that doesn't really correlate to the speed of the system as a whole.

That said, it's still super impressive.

I'd love to see how it runs early Win98 titles.

Also, what are you doing to keep the chip cool? At 2.8V I certainly hope it's at least on water cooling, or some kind of exotic socket A/370 cooler. I'd hate to see such a high performing Tillamook die.


Reply 17 of 45, by ShovelKnight

ph4nt0m wrote on 2020-04-21, 14:25:
ShovelKnight wrote on 2020-04-21, 14:13:

I run my K6-III+ with the motherboard cache disabled. It's actually slightly faster in DOS Quake that way, and I benchmarked it many times to make sure it's not a measurement error. Other games are slightly slower but the difference is about the same as the difference between K6-III+ and K6-II+ at the same clock speed.

What mainboard, how much cache and how much memory installed?

Gigabyte GA-5AX Rev. 4.1 (ALi Aladdin V), 512 KB cache, 384 MB RAM

Reply 18 of 45, by ph4nt0m

luckybob wrote on 2020-04-21, 15:19:

I really don't have a lot of faith in Speedsys' benchmarks. It's really just a number that doesn't really correlate to the speed of the system as a whole.

That said, it's still super impressive.

I'd love to see how it runs early Win98 titles.

Also, what are you doing to keep the chip cool? At 2.8V I certainly hope it's at least on water cooling, or some kind of exotic socket A/370 cooler. I'd hate to see such a high performing Tillamook die.

I haven't got much faith in Speedsys either, but it's one of the most popular benchmarks for old x86 hardware. Still something to think about.

Tillamook at 266MHz is rated by Intel for 4A on 2.0Vcore and 0.4A on 2.5Vio. That's 9W maximum. I estimate 13.3A on 2.8Vcore and 1.1A on 3.2Vio at 450MHz = 41W. I know that's a lot. Even K6-3 450MHz, which was a power hog back in the day burning poor Socket 7 mainboards, consumed only 30W. I currently use a cooler suitable for early Socket A Athlons, which consumed up to 65W. I am still considering installing something better. A PPGA-packaged Pentium MMX can also be delidded. There is some kind of thermal paste between the nickel-plated copper cover and the gold-plated core. I can replace it with indium.
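
The wattage figures above are just current times voltage, summed over the core and I/O rails. A minimal sketch that reproduces the arithmetic using only the currents and voltages quoted in this post (the overclocked currents are the poster's own estimates); `rail_power` is a hypothetical helper name.

```python
def rail_power(amps, volts):
    """Power drawn from one supply rail, in watts."""
    return amps * volts


# Intel's rating at 266MHz as quoted above: 4A at 2.0V core, 0.4A at 2.5V I/O.
stock_watts = rail_power(4.0, 2.0) + rail_power(0.4, 2.5)

# Estimate at 450MHz as quoted above: 13.3A at 2.8V core, 1.1A at 3.2V I/O.
overclocked_watts = rail_power(13.3, 2.8) + rail_power(1.1, 3.2)

print(f"stock: {stock_watts:.1f} W")
print(f"overclocked estimate: {overclocked_watts:.1f} W")
```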

ShovelKnight wrote on 2020-04-21, 20:08:

5AX Rev. 4.1 (ALi Aladdin V), 512 KB cache, 384 MB RAM

GA-5AX rev. 4.1 comes with the M1541 north bridge rev. E. It means external tag and a 128MB cacheable range.


Reply 19 of 45, by ShovelKnight

ph4nt0m wrote on 2020-04-21, 21:12:

GA-5AX rev. 4.1 comes with the M1541 north bridge rev. E. It means external tag and a 128MB cacheable range.

I know, but it makes no difference for DOS because the programs are loaded “from the bottom” and Quake for DOS only allocates 64 MB of RAM anyway. I get exactly the same results with 128 MB of RAM.
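
The reasoning here can be sketched as a simple range check. This is a deliberately simplified model: it treats memory as one flat block allocated from address 0 upward and takes the 128MB cacheable range and the 64MB Quake allocation from the posts above as given. `fully_cacheable` is a hypothetical helper, not anything the chipset or DOS exposes.

```python
MB = 1024 * 1024

CACHEABLE_LIMIT = 128 * MB  # cacheable range quoted in the thread


def fully_cacheable(base, size, limit=CACHEABLE_LIMIT):
    """True if the block [base, base + size) lies entirely below the limit."""
    return base + size <= limit


# A 64MB allocation starting at the bottom of memory stays inside the range.
print(fully_cacheable(0, 64 * MB))

# All 384MB of installed RAM does not.
print(fully_cacheable(0, 384 * MB))
```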

By the way, Speedsys shows that the L3 cache bandwidth on this motherboard is about 180 MB/s while main memory bandwidth is 160 MB/s so the difference is not that impressive.