VOGONS


First post, by bucket

Rank Member

Is there an advantage to using a Pentium 4 chip with a non-NT OS?
Maybe I'm remembering this wrong, but I seem to recall that hyperthreading (as opposed to multi-core or multi-socket) is done at a low hardware level, so that even OSs that don't support multiple CPUs can see a benefit.

Reply 1 of 18, by H3nrik V!

Rank Oldbie

You need an NT-based OS to take advantage of HT. But a lot of people are running P4 Win98 systems, as it's a pretty fast combination ...

Please use the "quote" option if asking questions to what I write - it will really up the chances of me noticing 😀

Reply 2 of 18, by ElectroSoldier

Rank Oldbie

I believe the first Hyper-Threading Pentium 4 was the 3.06HT 533 (SL6S5); the 2.8GHz 533 was non-HT. It's a very capable CPU for Windows 98.
Windows 98 was pretty much dead by the time the P4 2.8/533/512 was released. Using it for Windows 98 would put a lot of power in a Windows 98 PC. It wasn't done at the time because people wanted to use XP by then, but if you did buy a PC and get 98 installed on it, it would have been a very powerful Win98 PC.

When I think of the Pentium 4, I split it up into pre-HT and post-HT generations. There is no point at all in running a dual-CPU, HT or multi-core setup under Windows 98, as it will never use a CPU beyond the first one, be that a CPU in its own socket, a Hyper-Threading logical CPU or an extra core.

Reply 3 of 18, by waterbeesje

Rank Oldbie

Don't forget that HT just helps with multitasking; it's not a second CPU. Without HT, the CPU still has the same raw processing power. Also, some of these CPUs have more cache, so they can be a little bit faster too.
So yes, I'm a proud user of an HT CPU with Win98.

Stuck at 10MHz...

Reply 4 of 18, by Trashbytes

Rank Oldbie

98 can't take advantage of HT regardless; it has zero support for multiple cores or HT. Turning HT off will actually help 98, and the larger L2 cache that HT CPUs normally have will also be a big help.

Reply 5 of 18, by kolderman

Rank l33t
bucket wrote on 2024-02-16, 05:08:

Is there an advantage to using a Pentium 4 chip with a non-NT OS?
Maybe I'm remembering this wrong, but I seem to recall that hyperthreading (as opposed to multi-core or multi-socket) is done at a low hardware level, so that even OSs that don't support multiple CPUs can see a benefit.

Pipelining happens at a low level (also branch prediction etc.). HT just presents fake cores to the OS but uses normal pipelining to achieve some parallelism.

Reply 7 of 18, by ElectroSoldier

Rank Oldbie

Not quite, but it's a decent start.

Win98 is single-threaded: it can run multiple processes at once, but each process can run only one thread of execution.
A P4 HT is one core that can run two threads at once, so the second thread will not be used, as Win98 will never try to run two of them.
Win98 might or might not run on the first thread, depending on the scheduler.

I think you can tell Win98 to set affinity to the second thread, but I've never really looked into it at that level; I just turn off HT and leave it at that.
What I do know is that it's single-threaded and will only ever run one thread of execution.

Reply 8 of 18, by Trashbytes

Rank Oldbie
ElectroSoldier wrote on 2024-02-16, 11:08:

Not quite, but it's a decent start.

Win98 is single-threaded: it can run multiple processes at once, but each process can run only one thread of execution.
A P4 HT is one core that can run two threads at once, so the second thread will not be used, as Win98 will never try to run two of them.
Win98 might or might not run on the first thread, depending on the scheduler.

I think you can tell Win98 to set affinity to the second thread, but I've never really looked into it at that level; I just turn off HT and leave it at that.
What I do know is that it's single-threaded and will only ever run one thread of execution.

IIRC 98 doesn't have the ability to set thread affinity, as it has no idea the second thread/core exists, so as far as the OS is concerned it's only got one single core/thread to work with.

Now it's possible the CPU hardware scheduler could move things around at a low hardware level that isn't apparent to the OS, but I doubt any performance increase would be noticed.

Reply 9 of 18, by kolderman

Rank l33t
Trashbytes wrote on 2024-02-16, 11:12:

Now it's possible the CPU hardware scheduler could move things around at a low hardware level that isn't apparent to the OS, but I doubt any performance increase would be noticed.

Pipelining, out-of-order instruction execution, branch prediction: these are all ways CPUs do things "at a low hardware level that isn't apparent to the OS", and they are where much of the "single-core" performance gain came from.

Pipelining is literally where a "single thread of execution" is executed in parallel where possible...HT makes it somewhat more efficient by giving the core more instructions it can try and pipeline....but really the main advantage comes from reducing the amount of context switching the OS needs to do...but in practice the benefits of HT are marginal or non-existent.

Reply 10 of 18, by kolderman

Rank l33t
ElectroSoldier wrote on 2024-02-16, 11:08:

A P4 HT is one core that can run two threads at once, so the second thread will not be used, as Win98 will never try to run two of them.
Win98 might or might not run on the first thread, depending on the scheduler.

I would not say this is entirely accurate. "A P4 HT is one core that presents to the OS as 2 cores" is more accurate. It has no more computation units than a regular non-HT core (if it did, it would be closer to a real second core), and it cannot pipeline any more instructions than a non-HT core, although it has more to work with, being aware of two instruction streams through its two IPs.

HT simplifies the operations of the OS, and the intermingling of instructions from two instruction streams ("threads") increases the potential for parallel pipelining. That's about it.
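
The point about two instruction streams sharing one core can be illustrated with a toy model. This is only a sketch under stated assumptions, not how a real Pentium 4 schedules work: the core issues one instruction per cycle, each instruction is described only by the number of stall cycles that follow it (for example a cache miss), and the function names `run_back_to_back` and `run_smt` are made up for this illustration. Context-switch cost is ignored.

```python
def run_back_to_back(streams):
    """One logical CPU: streams run one after another.

    Every instruction takes 1 issue cycle plus its stall cycles,
    and the core sits idle during those stalls.
    """
    return sum(1 + stall for stream in streams for stall in stream)


def run_smt(streams):
    """Several logical CPUs on ONE core (toy Hyper-Threading model).

    Each cycle the core issues at most one instruction, taken from
    any stream that is not currently stalled (round-robin choice).
    Returns the cycle at which the last stream has finished.
    """
    pending = [list(s) for s in streams]   # remaining instructions per stream
    ready_at = [0] * len(pending)          # cycle when each stream may issue again
    finished_at = 0
    cycle = 0
    last = -1                              # stream that issued most recently
    while any(pending):
        n = len(pending)
        for offset in range(1, n + 1):
            t = (last + offset) % n
            if pending[t] and ready_at[t] <= cycle:
                stall = pending[t].pop(0)
                ready_at[t] = cycle + 1 + stall
                finished_at = max(finished_at, ready_at[t])
                last = t
                break
        cycle += 1
    return finished_at


# Two identical streams: plain instructions (0) mixed with stalling ones (3).
stream_a = [0, 3, 0, 3]
stream_b = [0, 3, 0, 3]

print("one stream alone:      ", run_smt([stream_a]))
print("two streams, no HT:    ", run_back_to_back([stream_a, stream_b]))
print("two streams, HT (toy): ", run_smt([stream_a, stream_b]))
```

In this model a single stream takes the same time with or without the second logical CPU, which matches the thread's point that an OS using only one logical CPU gains nothing from HT. The gain appears only when a second stream is available to fill the stall cycles.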

Reply 11 of 18, by ElectroSoldier

Rank Oldbie

From what I understand, the early OSs like 2k and XP didn't see a P4 HT as one CPU with two cores but as two separate CPUs.
They both use a "per socket" licence, and if you put a quad-core CPU into Win2k it will only see the first 2 CPUs (0,1). I've had Win2k on a dual Xeon quad-core and it only uses the first core of each socket.
So "A P4 HT is one core that presents to the OS as 2 CPUs" is even more accurate. No?

I guess my question is:

While 2k and XP are multi-CPU, are they aware of multiple cores on a single CPU?

I think XP is, because it can see all 8 cores on an 8-core i7.

Reply 12 of 18, by spiroyster

Rank Oldbie
ElectroSoldier wrote on 2024-02-16, 13:53:

From what I understand, the early OSs like 2k and XP didn't see a P4 HT as one CPU with two cores but as two separate CPUs.
They both use a "per socket" licence, and if you put a quad-core CPU into Win2k it will only see the first 2 CPUs (0,1). I've had Win2k on a dual Xeon quad-core and it only uses the first core of each socket.
So "A P4 HT is one core that presents to the OS as 2 CPUs" is even more accurate. No?

I guess my question is:

While 2k and XP are multi-CPU, are they aware of multiple cores on a single CPU?

I think XP is, because it can see all 8 cores on an 8-core i7.

XP SP3 introduced the "GetLogicalProcessorInformation" function in its API, so Win2K is probably not aware of logical vs. physical cores and cannot distinguish between them. The Win2K licence is per CPU (logical and physical). The XP licence is per socket (physical); it is also limited by logical core count, but that limit is a lot higher.

https://learn.microsoft.com/en-gb/windows/win … DN#requirements
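
As a rough, portable stand-in for querying the Win32 call named above, Python's standard `os.cpu_count()` reports how many logical CPUs the running OS exposes. It does not separate physical cores from HT siblings, which is the distinction `GetLogicalProcessorInformation` was added to provide. The helper name `describe_logical_cpus` is made up for this sketch.

```python
import os


def describe_logical_cpus():
    """Report the number of logical CPUs the OS exposes.

    os.cpu_count() counts logical CPUs, so an HT-enabled single-core
    CPU under an HT-capable OS would show up as 2 here. It may return
    None when the count cannot be determined.
    """
    count = os.cpu_count()
    if count is None:
        return "logical CPU count could not be determined"
    return f"the OS exposes {count} logical CPU(s)"


print(describe_logical_cpus())
```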

Reply 13 of 18, by dionb

Rank l33t++

The only difference between multiple cores on a single CPU vs multiple separate CPUs is licensing. MS at one point (when they still had the Home/Pro distinction but dual-core CPUs were becoming the norm in the mid/low end) decided to treat a dual-core CPU as a single one for the purposes of licensing, so that you could utilize a dual-core with Windows Home edition.

Technically the only distinction in Windows is between uniprocessor and multiprocessor HAL.

The latter is required to utilize multiple CPUs, multiple cores or indeed multiple threads in the same CPU.

Note that the practical performance gain of HT on the P4 ranged from negligible to negative (higher overhead of the multiprocessor HAL), particularly in games.

Here are some benchmarks:
https://www.tomshardware.com/reviews/single-c … ion,549-12.html
(Q3A, slightly slower with HT)
https://www.tomshardware.com/reviews/single-c … ion,549-14.html
(Comanche, one bench very slightly faster, one very slightly slower)

Where HT (and indeed SMP) shines is in predictable number crunching, such as with MP3 encoding:
https://www.tomshardware.com/reviews/single-c … ion,549-15.html
Or video encoding:
https://www.tomshardware.com/reviews/single-c … ion,549-16.html

Reply 14 of 18, by fosterwj03

Rank Member

I think the answer also depends on what the OP means by non-NT OS. I use a P4 as my retro rocket for OS/2 2.x, Windows 3.0a, old versions of Solaris, and, yes, NT 3.1 because it's my newest/fastest system with integrated, fully functional ISA slots. I need ISA for sound since these OSs don't have drivers for PCI sound cards.

In my opinion, these OSs definitely benefit from the faster processing speed of the P4 (3.066 GHz in my case) even without HT available.

Reply 15 of 18, by chinny22

Rank l33t++
ElectroSoldier wrote on 2024-02-16, 13:53:

From what I understand, the early OSs like 2k and XP didn't see a P4 HT as one CPU with two cores but as two separate CPUs.
They both use a "per socket" licence, and if you put a quad-core CPU into Win2k it will only see the first 2 CPUs (0,1). I've had Win2k on a dual Xeon quad-core and it only uses the first core of each socket.
So "A P4 HT is one core that presents to the OS as 2 CPUs" is even more accurate. No?

I guess my question is:

While 2k and XP are multi-CPU, are they aware of multiple cores on a single CPU?

I think XP is, because it can see all 8 cores on an 8-core i7.

2k and below are not; they will treat each core as a CPU (I use this to "cheat" my server builds to max out the CPU count).
XP and above are, so they will support your dual quad-core setup (and no more cheating for my builds).
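
The two counting schemes described in this thread can be compared with a small sketch. It only models the posters' description, not Microsoft's actual licensing rules: the function name, its parameters and the limit values are hypothetical, and the separate logical-core cap mentioned earlier for XP is left out.

```python
def usable_logical_cpus(sockets, cores_per_socket, threads_per_core,
                        limit, per_socket):
    """Logical CPUs an OS would use under a simple CPU-count limit.

    per_socket=False: every logical CPU counts against the limit
                      (how the thread describes Win2K and older).
    per_socket=True:  only physical sockets count against the limit
                      (how the thread describes XP and newer).
    """
    per_socket_cpus = cores_per_socket * threads_per_core
    if per_socket:
        return min(sockets, limit) * per_socket_cpus
    return min(sockets * per_socket_cpus, limit)


# Dual quad-core Xeon (no HT), with an assumed limit of 2.
print("counted per logical CPU:", usable_logical_cpus(2, 4, 1, 2, per_socket=False))
print("counted per socket:     ", usable_logical_cpus(2, 4, 1, 2, per_socket=True))
```

The sketch only says how many logical CPUs end up in use. Which ones the OS picks (for example the first core of each socket, as reported above) depends on enumeration order and is not modelled.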

Reply 16 of 18, by douglar

Rank Oldbie
dionb wrote on 2024-02-16, 14:26:

The only difference between multiple cores on a single CPU vs multiple separate CPUs is licensing.

Until you run into NUMA aware stuff in Windows Server 2003 / Vista.

Did XP SP2 support NUMA? Did Foster MP support NUMA? My recollection is kind of fuzzy. By the time I was interested in learning about it, it was already established.

Reply 17 of 18, by dionb

Rank l33t++
douglar wrote on 2024-02-19, 13:49:
dionb wrote on 2024-02-16, 14:26:

The only difference between multiple cores on a single CPU vs multiple separate CPUs is licensing.

Until you run into NUMA aware stuff in Windows Server 2003 / Vista.

Did XP SP2 support NUMA? Did Foster MP support NUMA? My recollection is kind of fuzzy. By the time I was interested in learning about it, it was already established.

NUMA clustering requires OS support, but that support isn't needed to work on a clustered system; it just makes the OS aware of which resources are 'near' and which are 'far', and so lets it optimize by keeping related stuff on the same cluster nodes.

As the CPUs themselves back then didn't handle the interconnect (the way Opterons do), NUMA wasn't a CPU feature but a chipset feature. IBM's Summit chipset enabled it, letting you connect up to four 4-CPU nodes, and that was a Foster MP chipset.
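
The 'near' versus 'far' idea can be sketched as a placement rule. This is a hypothetical illustration, not how any Windows release or the Summit chipset actually allocates memory: the function name `pick_memory_node` and the `free_pages` mapping are invented here, and nodes are just integer labels.

```python
def pick_memory_node(thread_node, free_pages):
    """Choose which node's memory to allocate from.

    A NUMA-aware OS prefers the node the thread runs on ('near').
    Only if that node has no free pages does it fall back to a
    remote ('far') node, here the one with the most free pages.
    A NUMA-unaware OS would still work, but could not make this choice.
    """
    if free_pages.get(thread_node, 0) > 0:
        return thread_node
    remote = {node: pages for node, pages in free_pages.items()
              if node != thread_node and pages > 0}
    if not remote:
        raise MemoryError("no free pages on any node")
    return max(remote, key=remote.get)


# Thread on node 0: local memory is used while any is free.
print(pick_memory_node(0, {0: 10, 1: 50}))
# Node 0 exhausted: fall back to the remote node with the most free pages.
print(pick_memory_node(0, {0: 0, 1: 5, 2: 9}))
```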

Reply 18 of 18, by douglar

Rank Oldbie
dionb wrote on 2024-02-19, 16:10:

As the CPUs themselves back then didn't handle the interconnect (the way Opterons do), NUMA wasn't a CPU feature but a chipset feature. IBM's Summit chipset enabled it, letting you connect up to four 4-CPU nodes, and that was a Foster MP chipset.

That makes sense because true dual core chips weren't a thing yet. The chipset allowed for tiered clusters with up to 4 CPUs per cluster. I'm starting to remember.

https://www.hpcwire.com/2001/11/30/ibm-summit … es-for-takeoff/

So did it have a driver for shrink-wrapped Windows, or did it require a special funky hacked-up version of the OS?