VOGONS


First post, by Marco

Hi all,

I could finally answer a question from my childhood: how much faster was my schoolmate's 386DX/25 than my 386SX/25?
To find out, I compared both systems (SX: VLSI SCAMP, DX: Symphony with 128 KB cache):

Upfront:
- The results might be a bit more in favor of the DX because of its 128 KB cache. I think you can see this especially in the Wintach results. Disabling the external cache in the BIOS was not an option, as it slowed the DX down far too much.
- I was particularly keen to see DOS4GW performance. The difference is indeed considerable. Some remarks in dos-forum stating that the performance difference grows with a 32-bit extender were entirely right. 16-bit vs. 32-bit data bus, clear enough. I couldn't notice much improvement in QfG4/PQ4, though.
- The VGA Thruput results are strange. I don't know why they are slower on the DX. Maybe it is due to a SCAMP feature named "Automatic Bus Speed-up on Video Access", though I don't think so.

It might not be too interesting for most but I just wanted to share 😀

Last edited by Marco on 2024-02-03, 20:05. Edited 5 times in total.

1) VLSI SCAMP 311 | 386SX25@30 | 16MB | CL-GD5434 | CT2830| SCC-1 | MT32 | Fast-SCSI AHA 1542CF + BlueSCSI v2/15k U320
2) SIS486 | 486DX/2 66(@80) | 32MB | TGUI9440 | SG NX Pro 16 | LAPC-I

Reply 1 of 14, by rasz_pl

I think you have the vidspeed read/write results the wrong way around. VGA cards are optimized for write speed (FIFOs) and really suck at reading.
Most of your vidspeed numbers, except for VGA write, look bogus. At a 13 MHz ISA clock the theoretical maximum transfer speed is somewhere around 8.5 MB/s.

Last edited by rasz_pl on 2024-01-30, 13:33. Edited 1 time in total.

Open Source AT&T Globalyst/NCR/FIC 486-GAC-2 proprietary Cache Module reproduction

Reply 2 of 14, by Marco

Thanks. I updated the attachment accordingly:

1. Vidspeed: Swapped read and write
2. Vidspeed: Added the correct parameters. "*" is called "normal RAM speed", not bus speed.
3. Wintach: Sub-benches now in correct order

Reply 3 of 14, by rasz_pl

19 write / 16 read RAM speed is really fantastic for a 25 MHz 386; usually you need a very well tuned 40 MHz system to get there. Can you verify those numbers with cachechk (https://www.sac.sk/files.php?d=13&p=4)?
"cachechk.exe -x7" should show both RAM read and write speeds.

Doom at 12-19 fps on a 386 tells me it was running at low detail in a very small window?

An 11 MB/s vidspeed write is not possible on a 14 MHz ISA bus with 1 WS. Was that a BIOS setting called "IO=1WS"? Was it necessary for going from 12 to 13.75 MHz? In any case 11 MB/s is a fantastic result. The absolute ISA maximum is (2 bytes x clock in MHz) / 2 cycles for 0 WS and (2 bytes x clock in MHz) / 3 cycles for 1 WS.
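
To put numbers on that formula, here is a small Python sketch. It is only an illustration, not taken from any of the tools in this thread: the function name is made up, and it assumes a 16-bit transfer moves 2 bytes and costs 2 bus cycles at 0 WS plus one extra cycle per wait state.

```python
def isa_peak_mb_per_s(clock_mhz, wait_states=0, bytes_per_transfer=2):
    """Theoretical peak ISA throughput in MB/s (1 MB = 10**6 bytes)."""
    # Assumption: 2 bus cycles per transfer at 0 WS, one more per wait state.
    cycles_per_transfer = 2 + wait_states
    return bytes_per_transfer * clock_mhz / cycles_per_transfer


for clock in (12.0, 13.75):
    for ws in (0, 1):
        peak = isa_peak_mb_per_s(clock, ws)
        print(f"{clock:5.2f} MHz, {ws} WS: {peak:5.2f} MB/s")
```

At 13.75 MHz this gives roughly 9.17 MB/s with 1 WS and 13.75 MB/s with 0 WS, so an 11 MB/s reading only fits under the 0 WS ceiling.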

Sorry for nitpicking, I'm really interested in your comparison.

Reply 4 of 14, by Marco

0. I am happy about the challenging questions, but also quite annoyed at some of the errors I made.

1. See the cachechk -x7 results below. What's your view? Note: these are for the SX. Do you need the DX as well?
2. Doom was benched using Phil's dosbench pack with its Doom "low settings" option, which is indeed a very small window.
3. You are right. The 11 MB/s came from a run with an even higher ISA bus speed. Furthermore, 16-bit I/O is set to 0 WS. Why the confusion? I was also testing with an even higher ISA bus speed, where I had to increase the wait states, and I mixed the two up. Current results are 9127 / 3624 at WS=0.
4. I am using a 13.75 MHz ISA clock because I raised the CPU oscillator to 55 MHz (= 27.5 MHz CPU clock) and use only the internal divider for the ISA bus instead of an external one (maybe you remember the other thread in this forum).
5. For the DX/25 results I have to reconnect the board again. Anything in particular I should check?
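
The clock arithmetic in point 4 can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the ISA divider of 2 is an assumption inferred from the 27.5 MHz and 13.75 MHz figures rather than a documented chipset setting. The halving of the oscillator reflects the 386 running at half its CLK2 input.

```python
def clock_chain(oscillator_mhz, isa_divider=2):
    """Return (cpu_mhz, isa_mhz) for a given oscillator frequency."""
    cpu_mhz = oscillator_mhz / 2      # the 386 core runs at half of CLK2
    isa_mhz = cpu_mhz / isa_divider   # assumed internal divider of 2
    return cpu_mhz, isa_mhz


print(clock_chain(50.0))  # (25.0, 12.5)
print(clock_chain(55.0))  # (27.5, 13.75)
```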

Attachments

  • IMG_7306.jpeg
  • IMG_7310.jpeg
  • IMG_7308.jpeg
  • IMG_7309.jpeg

Reply 5 of 14, by Marco

Here is a short update on the DX/25 figures:

I attached:
- Vidspeed * results
- Cachechk results
- Thruput results

Maybe everything is as it should be and the vidspeed results are just cache-based?

Attachments

  • IMG_7316.jpg
  • IMG_7315.jpeg
  • IMG_7313.jpeg

Reply 7 of 14, by rasz_pl

Marco wrote on 2024-01-30, 14:24:

1. See the cachechk -x7 results below. What's your view? Note: these are for the SX. Do you need the DX as well?

Sorry, I should have said -x1. The -x7 parameter runs all tests, including stupid ones like cache offset; all I really wanted to see was the read and write speeds. But the screenshot still shows all the important data: 87/48 us/KB R/W = 11.5/20.8 MB/s. Those are really impressive numbers for a 386, especially considering the low FSB.
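
For anyone who wants to redo that conversion, here is a minimal Python sketch. The function name is made up, and it assumes 1 KB = 1000 bytes and 1 MB = 10**6 bytes, which is the convention that reproduces the 11.5/20.8 figures above; with 1 KB = 1024 bytes the results would come out slightly higher.

```python
def us_per_kb_to_mb_per_s(us_per_kb):
    """Convert a microseconds-per-KB timing into MB/s."""
    # 1000 bytes in N microseconds = 1000 / N bytes per microsecond = 1000 / N MB/s.
    return 1000.0 / us_per_kb


for label, us in (("read", 87), ("write", 48)):
    print(f"{label}: {us} us/KB = {us_per_kb_to_mb_per_s(us):.1f} MB/s")
```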

Marco wrote on 2024-01-30, 14:24:

2. Doom was benched using Phil's dosbench pack with its Doom "low settings" option, which is indeed a very small window.

I don't really understand Phil's intentions with that setting; it's not like anyone would play it like that 😀

Marco wrote on 2024-01-30, 14:24:

4. I am using a 13.75 MHz ISA clock because I raised the CPU oscillator to 55 MHz (= 27.5 MHz CPU clock) and use only the internal divider for the ISA bus instead of an external one (maybe you remember the other thread in this forum).

Now I remember 😀

Marco wrote on 2024-01-30, 14:24:

5. For the DX 25 results I have to reconnect the board again. Anything particular I should check ?

I wonder why the DX motherboard has significantly slower ISA transfer results. Is there anything in the BIOS that can be tweaked? ~5 MB/s suggests it adds two wait states to VGA transactions.
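
The wait-state estimate can be worked backwards from a measured figure, as in the rough Python sketch below. It is an illustration only: the function name is made up, it extends the "2 cycles at 0 WS, one more per wait state" rule to any number of wait states, and the ISA clocks in the loop are hypothetical, since the DX board's ISA clock is not given in this thread.

```python
def implied_wait_states(measured_mb_per_s, isa_clock_mhz, bytes_per_transfer=2):
    """Estimate wait states per 16-bit transfer from measured throughput."""
    # Bus cycles spent per transfer at the measured rate.
    cycles_per_transfer = bytes_per_transfer * isa_clock_mhz / measured_mb_per_s
    # Assumption: a 0 WS transfer takes 2 cycles, each wait state adds one.
    return cycles_per_transfer - 2


for clock in (8.33, 10.0, 12.5):  # hypothetical ISA clocks in MHz
    ws = implied_wait_states(5.0, clock)
    print(f"{clock:5.2f} MHz ISA clock: about {ws:.1f} wait states")
```

Under these assumptions, 5 MB/s corresponds to exactly two wait states at a 10 MHz ISA clock. Real measurements include other overhead, so treat the result as an upper bound rather than a chipset setting.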

Reply 8 of 14, by Marco

1. Indeed I did!!! I played it on that (non-optimized) 386SX/25 back in the day. I don't know how I managed, but it was fun. The same applies to various other games where I now say: noooooo, never.

2. I went through all available settings, even using the AMI tool to identify hidden settings. I really have no idea. I will take some photos, but only later today.

Regards

Reply 9 of 14, by Marco

Takedasun wrote on 2024-01-30, 20:26:
Marco wrote on 2024-01-30, 19:03:

- Thruput

Where can I download this program?

I will put it online this evening, as I couldn't find an online source either.

Reply 10 of 14, by Takedasun

Marco wrote on 2024-01-31, 06:42:
Takedasun wrote on 2024-01-30, 20:26:
Marco wrote on 2024-01-30, 19:03:

- Thruput

Where can I download this program?

I will put it online this evening as I also couldn’t find an online resource

Thank you.

Reply 11 of 14, by Deunan

Marco wrote on 2024-01-29, 18:32:

- I was particularly keen to see DOS4GW performance. The difference is indeed considerable. Some remarks in dos-forum stating that the performance difference grows with a 32-bit extender were entirely right. 16-bit vs. 32-bit data bus, clear enough. I couldn't notice much improvement in QfG4/PQ4, though.

Doom might be a bit of an outlier because, the way the story goes, it was optimized for the 486: tighter loops that fit into the 486 cache (to preserve more cache space for data) instead of the usual 386 approach of unrolling the code as much as possible. So in this particular case even an external cache might provide more performance than a non-cached system, as long as the RAM wait states are not zero.
That being said, in many cached 386DX systems the RAM latency is worse than in uncached ones, the reason being that the chipset is simplified and doesn't access cache and RAM concurrently. The cache is tried first and RAM only after a cache miss, so disabling the cache entirely might result in a worse penalty than expected.
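
A toy model makes the serial-lookup penalty easier to see. The Python sketch below is an illustration only: the function name and all timings (40 ns cache, 160 ns RAM, 90% hit rate) are hypothetical values, not measurements from any board in this thread, and it assumes a chipset that still pays the lookup delay on every miss.

```python
def avg_access_ns(hit_rate, cache_ns, ram_ns, serial_lookup):
    """Average memory access time for a simple one-level cache model."""
    # Serial lookup: a miss pays the cache check first, then the RAM access.
    # Concurrent lookup: the RAM access starts in parallel with the check.
    miss_ns = cache_ns + ram_ns if serial_lookup else ram_ns
    return hit_rate * cache_ns + (1.0 - hit_rate) * miss_ns


# Hypothetical timings, chosen only to show the shape of the effect.
for hit_rate in (0.9, 0.0):
    serial = avg_access_ns(hit_rate, 40, 160, serial_lookup=True)
    concurrent = avg_access_ns(hit_rate, 40, 160, serial_lookup=False)
    print(f"hit rate {hit_rate:.0%}: serial {serial:.0f} ns, concurrent {concurrent:.0f} ns")
```

With a hit rate of zero, which is roughly what a disabled cache looks like in this model, every access on the serial design costs the full lookup plus RAM time.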

What I'm trying to say here is that you are not measuring just the CPU performance and its bus limitations, but also the quality of the motherboard chipset. A decent 386SX mobo might be able to access RAM with zero WS and make up a lot of the performance lost against the 32-bit bus of a 386DX on a poor, slow mobo. That's why I like the cheap ALI-based 386SX-40 mobos. Even the 33 MHz ones are nice. Very simple design, not much to go wrong with those, and the SX-40 performance almost reaches the level of a 386DX-33, with a small cache as well, on a so-so mobo (like an OPTi chipset).

As I see it, the main reason the 386DX performs better in protected mode (including DOS extenders) than the SX is code, not data. The DX can fetch code in 32-bit chunks, even in 16-bit mode, so it already has an advantage. In protected mode the opcode encoding gets longer due to SIB bytes and prefixes for 16/32-bit data access; it's closer to 3 bytes on average rather than 2, so the 386SX will have more stalls. The SX chip is meant for real mode and pure DOS. At 40 MHz it can offer acceptable performance in some protected-mode code, but anyone targeting mid-90s games using DOS extenders should look for a cached 386DX at 40 MHz, or even a 486 (maybe even a DX2) if they want Doom to be playable by today's standards.
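
The code-fetch argument can be put into rough numbers with the Python sketch below. It is a deliberately crude illustration: the function name is made up, the 2-byte and 3-byte average instruction lengths are the estimates from the paragraph above, and the model ignores the prefetch queue, alignment and data accesses.

```python
def fetch_cycles_per_instruction(avg_instruction_bytes, bus_width_bytes):
    """Average bus cycles needed just to fetch one instruction."""
    return avg_instruction_bytes / bus_width_bytes


BUS_WIDTH_BYTES = {"386SX": 2, "386DX": 4}                      # 16-bit vs 32-bit data bus
AVG_INSTRUCTION_BYTES = {"real mode": 2, "protected mode": 3}   # rough estimates from the post

for cpu, width in BUS_WIDTH_BYTES.items():
    for mode, length in AVG_INSTRUCTION_BYTES.items():
        cycles = fetch_cycles_per_instruction(length, width)
        print(f"{cpu}, {mode}: {cycles:.2f} bus cycles per instruction fetch")
```

In this model the SX goes from 1.0 to 1.5 fetch cycles per instruction when moving to protected mode, while the DX stays below one cycle in both cases.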

Reply 12 of 14, by Marco

Very interesting insights. Thanks

Reply 13 of 14, by Marco

I updated the file again, as I redid the Windows benchmarks. The SX@27 numbers are now at a more realistic level relative to the DX. The previous numbers were wrong #sorry
