VOGONS


Reply 80 of 251, by zyzzle

Rank Member
kalohimal wrote on 2020-07-08, 14:45:

Next revision... perhaps to include AMD K10 multiplier support per Falcosoft's suggestion? I hesitate to include CPUs beyond K8/Core2, as their chipsets no longer support Windows 98, hence they are not ideal for retro gaming, although they could still be used for DOS gaming if slowed down enough.

Oh... please do include the latest Intel Core families, if possible! There are those of us who use the latest systems, and still want to play old DOS games, at retro speeds. It pains me to see that you think the response to your BOMBSHELL program has been "negligible". Those are sad words.

CPUSPD works perfectly on my Intel Atom and C2D system. I can adjust multiplier AND voltage, which is wonderful indeed!

I've tested v1.4 of CPUSPD on my i5 5200. Disabling the L1 cache works, and the fidvid setting kind of works: I can only change from f1500 to f1900 with stable results. All other values are impossible. It appears the multiplier is locked on my CPU at either 21x or 25x.

Attempting throttling on my Core i5 gives an immediate segfault. Disabling the L1 cache works, but then gives a segfault without reporting status. With the L1 cache disabled, the speed of the system is reduced to a 386 level. It seems impossible to disable the L2 cache on the Core i5, as option -c2d slows the system down to the same level as option -cd.

Will it be impossible to add throttling to Intel Core family CPUs? If I may help in any way by sending you reports of my southbridge chipset, or any other debug info, please ask. By the way, the old utility FDAPM from way back in 2009 actually does support throttling, at least on my CPU / motherboard. It offers 8 throttling options. Setting /SPEED 1 seems to be identical to the 87.5% ODPM throttle setting. Paradoxically, it seems to allow throttling on the Intel Core family, at least on my Lenovo Core i5 laptop!

I thank you for creating this CPUSPD!

Reply 81 of 251, by kalohimal

Rank Member

@zyzzle

Thank you very much for your encouragement. I've never tested the program at all on any Core i system, as my original plan was to support up to Core 2 only. I'm quite surprised that south bridge throttling didn't work though, as it is based on ACPI. I did have a chance before to look at FDAPM's source code, and their method is very similar. The major difference is FDAPM uses DOS int 15h 87h "move block" in real mode to copy ACPI data from high memory for examination, while my program can access high memory directly in protected mode. Multiplier is not expected to work as Intel had changed the SpeedStep mechanism from msr 198/199 to something else, so it's another surprise when you said it worked partially. I do have some Core i PCs lying around, so maybe I'll take a look later.

Btw I'll be releasing version 1.5 soon. It has only one added feature which is ODCM. But with it, late Pentium4/D, Dual Core, Core, and Core 2 systems now have 4 hardware "knobs" to control the CPU speed: cache, multiplier, throttle, and odcm. I'm very excited about it. Again not sure if ODCM would work on Core i CPUs though 🤣.

It actually doesn't really matter how good or bad the response is, as my intention is just hoping to be able to contribute something back to the community, while having some fun programming. If the program is useful to some people, then I will be most delighted already.

Slow down your CPU with CPUSPD for DOS retro gaming.

Reply 82 of 251, by Falcosoft

Rank Oldbie
PARUS wrote on 2020-07-09, 02:51:

... And the K10 architecture has a minimum available multiplier of x0.5, i.e. the CPU will run at 100 MHz with a 200 MHz FSB (HyperTransport 800 MHz).

I have tried this and it really works! If you program FID=0 DID=4 into the MSR, it functions as a 0.5x multiplier, i.e. 100 MHz.
Tested with AIDA16: a K10 at 100 MHz is not a very fast 16-bit processor, so with 16-bit code it performs like a P55C at 75 MHz. It's slow enough that the infamous Runtime error 200 with Borland-compiled executables is NOT triggered.
Its speed at 100 MHz with 32-bit code is roughly equivalent to a 200 MHz P55C.
Interestingly, the Throttle DOS utility works perfectly with AMD SB710/750 chipsets (Throttle detects the chipset as ATI SB600), so finer tuning is also possible.
Another interesting fact: although K10 has an invariant TSC (unlike K8), so the TSC clock does not change when you change performance states, it can be reset to the new Pstate-0 value by toggling bit 24 in MSR 0xC0010015. This way software can also detect the CPU as a real 100 MHz one.
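For readers who want to check the FID/DID arithmetic, here is a minimal Python sketch. It assumes the core clock formula from AMD's Family 10h BKDG as I recall it (100 MHz * (FID + 10h) / 2^DID) and the 200 MHz reference clock mentioned above. The function names and the accepted FID/DID ranges are my own assumptions for illustration; this is not the K10Min source.

```python
# Assumed formula (AMD Family 10h BKDG, from memory):
#   core clock = 100 MHz * (FID + 0x10) / 2**DID
REFERENCE_MHZ = 200  # the "FSB 200MHz" reference clock quoted in the post


def k10_core_mhz(fid: int, did: int) -> float:
    """Core clock in MHz for a given FID/DID pair (hypothetical helper)."""
    if not 0 <= fid <= 0x3F:  # assumed 6-bit FID field
        raise ValueError("FID out of range")
    if not 0 <= did <= 4:     # assumed divisor range 1, 2, 4, 8, 16
        raise ValueError("DID out of range")
    return 100.0 * (fid + 0x10) / (1 << did)


def k10_multiplier(fid: int, did: int) -> float:
    """Multiplier relative to the 200 MHz reference clock."""
    return k10_core_mhz(fid, did) / REFERENCE_MHZ


if __name__ == "__main__":
    # FID=0, DID=4 is the combination reported to work in this post.
    print(k10_core_mhz(0, 4), "MHz, multiplier", k10_multiplier(0, 4))
```

With FID=0 and DID=4 the formula gives 100 * 16 / 16 = 100 MHz, which matches the 0.5x multiplier reported here.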

Unfortunately CpuSpd exited with a SIGSEGV fault when I tried to use it to set throttling with the t parameter.

I have attached the proof of concept executable (K10Min) with source:

Attachment: K10MIN.zip (3.37 KiB, 107 downloads, license: CC-BY-4.0)

Website, Facebook, Youtube
Falcosoft Soundfont Midi Player + Munt VSTi + BassMidi VSTi
VST Midi Driver Midi Mapper

Reply 83 of 251, by kalohimal

Rank Member

Ok, 2 items added to the to-do list: investigate the throttle segfault for Core i and for K10 south bridges. That'll keep me busy for a while. 😀 I do have an AM3 board with SB710, so further testing is not difficult.


Reply 84 of 251, by kalohimal

Rank Member

CPUSPD version 1.5 released:

  • Added support for ODCM (On Demand Clock Modulation), for Intel Pentium 4 and newer CPUs (Pentium D, Dual Core, Core, Core2).

This is very exciting news for owners of Pentium 4, Pentium D, Dual Core, Core, and Core 2 CPUs. For these CPUs we now have 4 types of hardware control over the CPU speed: cache, multiplier (for late Pentium 4/D and newer CPUs), throttle, and ODCM! Combining these features, we can now have very fine-grained control over the CPU speed.

CPU ODCM (On Demand Clock Modulation) uses the clock modulation hardware to control the duty cycle of the internal CPU clock. It works independently of south bridge clock throttling, but uses the same method of skipping clock pulses. For example, if it is set to 1/8, it will skip 7 clock pulses before asserting 1 clock, effectively slowing the CPU clock frequency down to 1/8. Intel extended the resolution to 1/16 for later CPUs. ODCM's clock duty cycle is a lot finer compared to throttling, so on motherboards with bad keyboard/mouse response when throttling, ODCM will provide a much smoother experience.

New command added:

  • o[xx] - display current CPU on demand clock modulation (ODCM) value if run without any parameters, or set ODCM to xx, where xx is 1 to 8 (or 16 depending on CPU). Note that 8 (or 16) will disable ODCM.
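As a rough illustration of what an o[xx] setting could translate to in hardware, here is a minimal Python sketch. It assumes the IA32_CLOCK_MODULATION layout as I understand it from the Intel SDM (MSR 19Ah, bit 4 = enable, bits 3:1 = duty cycle in 1/8 steps, bit 0 extending the field to 1/16 steps on CPUs that support it). The mapping from xx to the register value and the function names are my assumptions, not CPUSPD's actual code.

```python
IA32_CLOCK_MODULATION = 0x19A  # MSR address per the Intel SDM
ODCM_ENABLE = 1 << 4           # bit 4: on-demand clock modulation enable


def odcm_msr_value(setting: int, steps: int = 8) -> int:
    """Register value for an o[xx]-style setting (assumed mapping)."""
    if steps not in (8, 16):
        raise ValueError("steps must be 8 or 16")
    if not 1 <= setting <= steps:
        raise ValueError("setting out of range")
    if setting == steps:
        return 0                              # full speed: modulation disabled
    if steps == 8:
        return ODCM_ENABLE | (setting << 1)   # bits 3:1 hold the duty cycle
    return ODCM_ENABLE | setting              # bits 3:0 hold the extended duty cycle


def effective_mhz(core_mhz: float, setting: int, steps: int = 8) -> float:
    """Nominal clock after modulation: 'setting' working pulses out of 'steps'."""
    return core_mhz * setting / steps


if __name__ == "__main__":
    for xx in range(1, 9):
        print(f"o{xx}: MSR value {odcm_msr_value(xx):#04x}, "
              f"{effective_mhz(1600, xx):.0f} MHz nominal at 1600 MHz")
```

A real tool would read the MSR first and preserve the reserved bits when writing; that part is left out to keep the sketch short.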

Test results (O8 means ODCM disabled):

Attachment: cpuspd3.jpg (128.61 KiB, 2142 views, license: fair use/fair dealing exception)

Credit: thanks to PARUS again for providing information on ODCM programming.

Please download from first post.

Last edited by kalohimal on 2020-07-11, 15:10. Edited 4 times in total.


Reply 85 of 251, by kalohimal

Rank Member
Falcosoft wrote on 2020-07-11, 11:52:
I have tried this and it really works! If you are programming FID=0 DID=4 to MSR it functions as a 0.5x multiplier i.e. 100MHz. […]

Thanks again for the great info Falcosoft! You're tempting me hard to do K10 😀


Reply 86 of 251, by PARUS

Rank Oldbie

It's funny. I have already been credited twice in this project, but I haven't even tried the program myself 😀

I have a Pentium 4 with SB-Link bus, a Pentium 4 with ISA bus, and a Core2Duo with ISA bus. I have been using all these parameters (multiplier, ODCM, L1/L2, FSB throttle) for a long time via good but different utilities. I have to go to my old flat, where I keep almost all my retro hardware, and take one or two builds for testing and using this awesome program.

Thank you very much kalohimal!

Reply 87 of 251, by kalohimal

Rank Member

@PARUS

This is great stuff. With a Dual Core E6700/3.2G, and a Biostar P4M900-M7 FE with VIA 8237 south bridge, we can now get the following:

  • Cache on/off : 2 steps
  • ODCM 1-8 : 8 steps
  • SB throttling 1-16 : 16 steps
  • Multiplier 6.0-12.0 with 0.5 increment : 13 steps
  • L2 cache on/off : the effect is negligible so we don't count this.

So in total, we get 2 x 8 x 16 x 13 = 3328 possible steps for CPU speed control! This is an amazingly fine adjustment! Even with Intel ICH6/7 south bridge which has 1-8 throttling, we still get 1664 possible steps! With the support of Yamaha YMF744 for both ICH6/7 and VT8237, this is going to be the ultimate all-in-one retro PC for DOS, Win98 and XP. 😁
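The step count is simply the product of the number of positions of each knob. A minimal Python sketch of that arithmetic, using only the figures listed above (the dictionary keys and the helper name are mine):

```python
from math import prod


def multiplier_steps(lowest: float, highest: float, increment: float) -> int:
    """Number of multiplier settings from lowest to highest, inclusive."""
    return int(round((highest - lowest) / increment)) + 1


knobs_via = {
    "cache on/off": 2,
    "ODCM 1-8": 8,
    "SB throttling 1-16": 16,
    "multiplier 6.0-12.0": multiplier_steps(6.0, 12.0, 0.5),
}

# Same setup, but with the 1-8 throttling range of an Intel ICH6/7 south bridge.
knobs_ich = dict(knobs_via, **{"SB throttling 1-16": 8})

total_via = prod(knobs_via.values())
total_ich = prod(knobs_ich.values())

if __name__ == "__main__":
    print("VIA 8237:", total_via, "combinations")
    print("ICH6/7:  ", total_ich, "combinations")
```

Note that this counts combinations of settings, not distinct speeds: several combinations can land on the same nominal clock.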

Update: with the above setup and the following settings:
cache on, odcm 1, throttle 1, multiplier 6
The CPU frequency reported from the performance counter (command i) is at 30MHz! OMG 🤣. With cache turned off, everything is in slow motion. When a dir is performed the text came out character by character just like in the movies. 8088 speed perhaps? Idk 🤣


Reply 88 of 251, by kalohimal

Rank Member

Just tried running Landmark 2.0 with cache off, odcm 1, throttle 1, multiplier 6. It reported the system as an AT with a clock speed of... 0.5MHz! Lol

Attachment: IMG_20200717_222124.jpg (1.09 MiB, 2091 views, license: fair use/fair dealing exception)
Attachment: IMG_20200718_001606.jpg (1.28 MiB, 2085 views, license: fair use/fair dealing exception)


Reply 89 of 251, by PARUS

Rank Oldbie

Thousands of steps are all very well for Pentium, Pentium II and Pentium III speeds. But we must not forget the very slow 8086-80386 speeds:

PARUS wrote on 2020-07-09, 02:33:
Old DOS games strongly prefer a small, integer CPU multiplier relative to the FSB: 1, 2, 3, 4. Small and integer (I repeat): this is one of the guarantees of good-quality throttling for old (~early 90s) and the oldest DOS games.
<...>
examples (very useful with P4 processors):
Multiplier x12 + ODCM 75% (1/4) = real multiplier x3
Multiplier x16 + ODCM 87.5% (1/8) = real multiplier x2
Multiplier x20 + ODCM 75% (1/4) = real multiplier x5

I.e. if we end up with a non-integer CPU:FSB ratio, we may see nonlinear gameplay timing in old DOS games.
Therefore your minimum x6 multiplier is not good. I advise a minimum x8 multiplier for all Core2/PentiumE users, because ODCM=1 then gives 8:8 = 1. With a x6 multiplier, ODCM=1 gives 6:8 = 3:4 = 0.75 CPU:FSB. But with ODCM=4 the x6 multiplier becomes good again, because 6:2 = 3:1 = 3 CPU:FSB.

And at slow 8086-80386 speeds, ODCM works better at values 1, 2 and 4. In these cases we get equal intervals between working cycles. With values 3, 5, 6 or 7 the working cycles are not at regular intervals, and again we may see nonlinear gameplay timing in old DOS games (not always, but very probably).
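To check which multiplier/ODCM pairs give an integer CPU:FSB ratio, here is a minimal Python sketch. It assumes the ODCM numbering used in this thread (setting n means n working pulses out of 8); the function names are mine.

```python
from fractions import Fraction


def effective_ratio(multiplier, odcm: int, steps: int = 8) -> Fraction:
    """Effective CPU:FSB ratio = multiplier * (working pulses / total pulses)."""
    return Fraction(multiplier) * Fraction(odcm, steps)


def is_integer_ratio(multiplier, odcm: int, steps: int = 8) -> bool:
    """True when the effective ratio is a whole number."""
    return effective_ratio(multiplier, odcm, steps).denominator == 1


if __name__ == "__main__":
    # Pairs taken from the examples in this post.
    for mult, odcm in [(12, 2), (16, 1), (20, 2), (8, 1), (6, 1), (6, 4)]:
        ratio = effective_ratio(mult, odcm)
        kind = "integer" if is_integer_ratio(mult, odcm) else "NOT integer"
        print(f"x{mult} with ODCM={odcm}: ratio {ratio} ({kind})")
```

Using Fraction keeps half-step multipliers such as 6.5 exact, so the integer test does not suffer from floating-point rounding.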

ODCM=1 (87.5%) |.......|.......|.......
ODCM=2 (75%)   |...|...|...|...|...|...
ODCM=3 (62.5%) |..|.|..|..|.|..|..|.|..
ODCM=4 (50%)   |.|.|.|.|.|.|.|.|.|.|.|.
ODCM=5 (37.5%) ||.|.||.||.|.||.||.|.||.
ODCM=6 (25%)   |||.|||.|||.|||.|||.|||.
ODCM=7 (12.5%) |||||||.|||||||.|||||||.

A vertical line is a working cycle here, and a dot is an empty cycle. Please look for the rows where the intervals between working cycles are equal. They occur only at values 1, 2 and 4.
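The same point can be checked mechanically. The Python sketch below spreads n working cycles as evenly as possible over an 8-cycle window and then measures the distance between consecutive working cycles. The spreading rule is my own assumption for illustration; it is not a statement about how Intel's hardware actually places the pulses.

```python
def pulse_pattern(working: int, slots: int = 8) -> str:
    """Spread 'working' pulses over 'slots' cycles as evenly as possible."""
    return "".join(
        "|" if (i * working) % slots < working else "."
        for i in range(slots)
    )


def gaps(pattern: str) -> list:
    """Distances between consecutive working cycles, wrapping around the window."""
    positions = [i for i, c in enumerate(pattern) if c == "|"]
    n = len(pattern)
    return [
        ((positions[(k + 1) % len(positions)] - p - 1) % n) + 1
        for k, p in enumerate(positions)
    ]


def evenly_spaced_settings(slots: int = 8) -> list:
    """ODCM settings whose working cycles are all the same distance apart."""
    return [
        n for n in range(1, slots)
        if len(set(gaps(pulse_pattern(n, slots)))) == 1
    ]


if __name__ == "__main__":
    for n in range(1, 8):
        p = pulse_pattern(n)
        print(f"ODCM={n}: {p}  gaps={gaps(p)}")
    print("evenly spaced:", evenly_spaced_settings())
```

Even with the most even spreading possible, only the settings that divide 8 can have equal intervals, which is why 3, 5, 6 and 7 always come out irregular.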

Reply 90 of 251, by PARUS

Rank Oldbie

UPD: the same goes for south bridge chipset (FSB) throttling too: use the 50%, 75% and 87.5% values only. In most cases the 50% value is enough if we use ODCM at the same time; it's enough for getting 386-486 speeds. I don't recommend more than 50% chipset slowdown, as it may stop some peripherals such as the keyboard from working.

As you can see, the thousands of steps shrink to a fairly restricted set of values.

Reply 91 of 251, by kalohimal

Rank Member

I do not know exactly how Intel or the south bridge manufacturers implement hardware ODCM and throttling in their CPUs and chips, but here is what I suspect is happening.

I do agree with you that odd ODCM values will have uneven spacing between working cycles. For the multiplier, I doubt it will have any ill effect. The external FSB frequency goes through the internal PLL(s) in order to be multiplied. To the ODCM module, the only effect of different multipliers is the frequency it sees, i.e. the width of the pulses differs for different multipliers. Since ODCM modulates the pulses inside the CPU, the frequency is a lot higher. For example, the Dual Core E6700 has an FSB of 266MHz, and this is multiplied by 12 by the PLL to get the maximum clock of 3.2GHz. So the ODCM module sees 3.2GHz instead of 266MHz. When the CPU is set to the minimum multiplier of 6, the frequency it sees becomes 1.6GHz. This 1.6GHz is then modulated according to the duty cycle setting per your pulse diagram above. Because of the much higher frequency, the effect of the "uneven" odd ODCM values is minimized, and this represents a huge advantage over south bridge throttling, as the south bridge only has the 266MHz FSB to work with (as an analogy, the eyes will see flickering at 30Hz but not at 120Hz). That's why ODCM can provide a much smoother "throttle" of the CPU than the south bridge. The worst case is when the south bridge (or BIOS) has a poor implementation of throttling, e.g. if it groups the pulses together instead of spreading them out:

Case 1. 50% duty cycle spread out: |.|.|.|.
Case 2. 50% duty cycle grouped: ||||....

Coupled with a slow FSB clock, case 2 will certainly provide much rougher throttling than case 1. Other than that, I would say the rest would be fine. I do have 2 motherboards with the exact same P4M900 south bridge, one from Biostar and the other from MSI. The MSI board experiences sluggishness in keyboard/mouse when throttled, while the Biostar is completely fine. I believe this is a case of bad implementation of throttling in the MSI BIOS, since the chips are identical. In the MSI case, it would then be recommended to use ODCM instead of south bridge throttling. (As a side note, the MSI board also has an incompatibility issue with the Yamaha YMF744, so it's in fact a bad choice for a retro gaming setup.)
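To put rough numbers on the "spread out versus grouped" argument, here is a minimal Python sketch that measures the longest stretch of skipped pulses in each pattern and converts it to time at a given clock. The 266 MHz and 1600 MHz figures come from the post; the helper names are mine, and the model simply follows the reasoning above rather than any documented chipset behaviour.

```python
def longest_idle_run(pattern: str) -> int:
    """Longest run of skipped pulses ('.') in a repeating pattern."""
    if "|" not in pattern:
        return len(pattern)
    # Doubling the pattern lets a run wrap from the end back to the start.
    return max(len(run) for run in (pattern + pattern).split("|"))


def longest_stall_ns(pattern: str, clock_mhz: float) -> float:
    """Longest stall in nanoseconds when each pulse lasts one clock period."""
    period_ns = 1000.0 / clock_mhz
    return longest_idle_run(pattern) * period_ns


SPREAD = "|.|.|.|."   # case 1: 50% duty cycle, spread out
GROUPED = "||||...."  # case 2: 50% duty cycle, grouped

if __name__ == "__main__":
    for name, pattern in (("spread", SPREAD), ("grouped", GROUPED)):
        for mhz in (266.0, 1600.0):
            stall = longest_stall_ns(pattern, mhz)
            print(f"{name:8s} at {mhz:6.0f} MHz: longest stall {stall:.3f} ns")
```

Both patterns deliver the same average speed; the difference is only in how long the CPU sits idle in one go, which is the quantity that matters for the smoothness argument.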


Reply 92 of 251, by JazeFox

Rank Member

Hi!

Great piece of software! Thank you.
I used it on a SBC with a soldered VIA Eden ESP8000 CPU (Nehemiah core) with CLE266/VT8235M chipset and it worked perfectly! Finally, I can change the multiplier now.

I also tried it on a different SBC with a VIA Eden 500 (Esther core, ID(6,D,0)) and a VIA CX700M chipset, and it did not work (I see the Esther core is not supported in CpuSpd). I don't know if it is possible to change this CPU's multiplier, but I'm asking just in case.

CpuSpd report:

VIA Eden Processor 500MHz ID(6,D,0)
CPU frequency is 498MHz (TSC)
Current multiplier: -0.1 (min 0.0, max 0.0)
CPU throttle is 16/16

Thank you!

Reply 93 of 251, by kalohimal

Rank Member

Hi JazeFox,

Thanks for the feedback. Currently the program supports VIA C3 only; C7 and Esther are not supported at the moment. It should report "Multiplier control is not supported" instead of the weird -0.1 multiplier. I'll take a look at this bug.


Reply 94 of 251, by kalohimal

Rank Member

@JazeFox

After some digging I managed to find the datasheet for the C7-M (Esther model A & D). However, I could not find any information regarding MSR 110Ah, which is the VIA PowerSaver control register. From various info scattered around the internet, it seems that the C7 multiplier could be set using the standard Intel SpeedStep MSRs. If you're willing to help out with the testing, we could give it a try using both the PowerSaver method and the SpeedStep method. It could be a long shot though.


Reply 97 of 251, by lordmogul

Rank Newbie

Will be interesting to see how it reacts to my Xeon setup (LGA771 on LGA775, but technically still a Core 2, so it should work, right?). Will give it a try there.

P3 933EB @1035 (7x148) | CUSL2-C | GF3Ti200 | 256M PC133cl3 @148cl3 | 98SE & XP Pro SP3
X5460 @4.1 (9x456) | P35-DS3R | GTX660Ti | 8G DDR2-800cl5 @912cl6 | XP Pro SP3 & 7 SP1
3570K @4.4 GHz | Z77-D3H | GTX1060 | 16G DDR3-1600cl9 @2133cl12 | 7 SP1

Reply 98 of 251, by kalohimal

Rank Member

Modded Xeon X54x0? I have a similar setup, except that it recently acted up and won't boot, so I haven't done any tests on it. But yeah, it should work, including ODCM. 😀


Reply 99 of 251, by kalohimal

Rank Member

Btw, for the upcoming version 1.6 I will be adding support for VIA Esther (multiplier and voltage control), and also on/off control of L2 cache, I-cache, D-cache, and branch prediction for C3. I've ordered a C3 Ezra-T 1GHz and expect it to arrive within this week. I will release once it is fully tested.
