VOGONS


First post, by ux-3

User metadata
Rank Oldbie

I am still unsure how to configure the VLB correctly. This is my board (scroll down to the second board, version 2.2):
https://wiki.preterhuman.net/Shuttle_HOT-409_486_VLB
The last 4 tables at the very bottom of the link concern the VESA configuration.
On my board, socket U43 (next to the VLB slot) is empty. According to the "VESA mode configuration" table, my board can only run in slave mode.
The last table is "VESA DEVICE COMPATIBILITY CONFIGURATION". What is that?

What if I plug in one or two VLB cards? Would/should I need to set any of that?
I noticed that with two VLB cards (VGA, I/O) inserted, I get brief pauses in FM-synthesized music, as if it were waiting for something.

Any info is welcome!

Retro PC warning: The things you own end up owning you.

Reply 1 of 23, by dominusprog

User metadata
Rank Oldbie

Since U43 is empty, you won't be able to use the VLB in master mode; it'll work only in slave mode. For your setup I believe that you should set both J29 and J39 to 2&3. Also set J20 for 0 wait states.

https://theretroweb.com/motherboard/manual/32938.pdf

A-Trend ATC-1020 V1.1 ❇ Cyrix 6x86 150+ @ 120MHz ❇ 32MiB EDO RAM (8MiBx4) ❇ A-Trend S3 Trio64V2 2MiB
Creative AWE64 Value ❇ 8.4GiB Quantum Fireball ❇ Win95 OSR2 Plus!

Reply 3 of 23, by mkarcher

User metadata
Rank l33t
ux-3 wrote on 2024-05-29, 20:16:
dominusprog wrote on 2024-05-29, 20:00:

Since U43 is empty, you won't be able to use the VLB in master mode; it'll work only in slave mode.

Yes, but what does that mean?
Can I only install one VLB card?

You can install one or two VLB cards, as long as the cards only respond to actions generated by the processor ("slave"), and do not try to access your main memory on their own ("master"). This is completely different from the terminology used by IDE: You do not need to configure your VL cards as master and slave, like you do with two hard drives.

Unless you have a VL SCSI controller like the Adaptec 2842, you will likely only have slave devices, and it will work perfectly with the "slave only" setting. If you install two cards at 40MHz or higher, you might need to fiddle around with the "compatibility jumpers" to find a configuration that works stably.

Reply 4 of 23, by ux-3

User metadata
Rank Oldbie
dominusprog wrote on 2024-05-29, 20:00:

For your setup I believe that you should set both J29 and J39 to 2&3.

Setting this (or Mode 1) causes read errors from the CF card. Only Mode 0 works. (That is settled, then.)

mkarcher wrote on 2024-05-29, 20:25:

you will likely only have slave devices, and it will work perfectly with the "slave only" setting.

Good to know, they are slaves then.

mkarcher wrote on 2024-05-29, 20:25:

If you install two cards at 40MHz or higher, you might need to fiddle around with the "compatibility jumpers" to find a configuration that works stably.

That would be setup zero then. The others fail.

The interesting thing is: the pauses in the music happen only with my VLB graphics card, when the CF card's LED comes on bright (a longer read). It doesn't happen when I use my non-VLB VGA card. The choice of I/O card (ISA or VLB) makes no difference.
The VLB graphics card somehow causes these pauses even when it is alone on the VL bus. Changing the card's jumper for >33 MHz makes no difference.

Edit:
I also swapped the sound card from ESS to Creative; it made no difference.
It seems the VLB VGA card causes this. It has an S3 P86C805 chip, 1MB (Miro Crystal 8S).


Reply 5 of 23, by pitchshifter

User metadata
Rank Newbie

Hello all, I have similar doubts here.
I have a VLB VGA and an I/O card; can I place the VGA in any slot? Does it lose performance?
If I change my DX2-66 to a DX2-80 at FSB 40, will it be stable with both cards? Should the VGA be in slot 1 at FSB 40?

Thank you

Reply 6 of 23, by mkarcher

User metadata
Rank l33t
pitchshifter wrote on 2024-08-01, 14:52:

I have a VLB VGA and an I/O card; can I place the VGA in any slot? Does it lose performance?
If I change my DX2-66 to a DX2-80 at FSB 40, will it be stable with both cards? Should the VGA be in slot 1 at FSB 40?

Both your VGA and your I/O card are "slave devices" and work in any VL slot (unless you are calling an Adaptec 2842 SCSI controller an "I/O card" to be humble). All three slots provide the same performance, so just looking at digital signals, there should be no difference in which card gets put where (assuming no slot has bent contacts, of course...). At 33MHz, everything will likely work fine with both cards. At FSB40, you might or might not get issues. There is no general rule of thumb on how to distribute the cards for maximum compatibility, because the optimal configuration may depend on the trace layout on the mainboard, the signal drive strength of the CPU you installed, and the load characteristics and trace lengths of the cards you install. Shuffling cards around might help in borderline cases.

Your BIOS setup might have a setting called "LDEV decode" or "ELBA sampling point", which can be set to "T1/T2" or "early/late". Choosing "late" or "T2" decreases bus performance, but gives VL cards more time to claim a cycle. At 40MHz, the official recommendation is to use late decoding, whereas at 33MHz, early decoding may be used. As I understand it, the "<=33 / >33MHz" jumper on many VL boards (which oftentimes just affects one of the "system type identification" pins on the VL connector) is used to inform cards of the speed required to claim the cycle. Most cards just ignore it and claim cycles as fast as they can, which works at 40MHz most of the time even if the chipset uses "early decoding". Anyway, if you get issues at 40MHz, changing this setting to "late" (assuming you have it in your setup) might help.

Reply 7 of 23, by BitWrangler

User metadata
Rank l33t++

IDK if it was just a coincidence, or a "feature" of the mostly VIA-based motherboards I had, but all my performance VLB builds back in the day ended up with graphics nearest the CPU and the I/O furthest away from the CPU, with an empty slot in between if necessary. Anyway, that's what worked for me when doing 40MHz plus.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 8 of 23, by ux-3

User metadata
Rank Oldbie

I am using a 486-40, and I found no new problems at 40 MHz. Going down to 33 didn't make any issue disappear.


Reply 9 of 23, by pitchshifter

User metadata
Rank Newbie

This board is 5V only, although there are variants with 3.3V. The PCB also has a place for the voltage regulator, but I will try the ST DX2-80, which works with 5V.
I will try it and inform you about the results, whether it's buggy or not 😀~
This board is weird; I can't get VGA output from ISA cards, only VLB ones.

Reply 10 of 23, by BitWrangler

User metadata
Rank l33t++

Check you haven't got a mono/color jumper set to mono; the board might force it for ISA and ignore it for VLB.

Edit: I think I've seen a sneaky little cryptic CMOS setup option, maybe in advanced chipset setup on some boards that said something like "Local Bus Graphics" en/dis, which would either enable onboard, PCI, or VLB and disable ISA depending on system.

Last edited by BitWrangler on 2024-08-03, 15:45. Edited 1 time in total.


Reply 11 of 23, by maxtherabbit

User metadata
Rank l33t
mkarcher wrote on 2024-08-01, 18:03:
pitchshifter wrote on 2024-08-01, 14:52:

I have a VLB VGA and an I/O card; can I place the VGA in any slot? Does it lose performance?
If I change my DX2-66 to a DX2-80 at FSB 40, will it be stable with both cards? Should the VGA be in slot 1 at FSB 40?

Both your VGA and your I/O card are "slave devices" and work in any VL slot […] Anyway, if you get issues at 40MHz, changing this setting to "late" (assuming you have it in your setup) might help.

are you familiar with what the BIOS settings:

"VESA Master Cycle: Delay ADSJ/Non-delay ADSJ"
and
"Delay Internal ADSJ: Disabled/Enabled"

could refer to with respect to VLB? This is an ALi chipset with Phoenix BIOS

Reply 12 of 23, by mkarcher

User metadata
Rank l33t
maxtherabbit wrote on 2024-08-02, 20:41:

are you familiar with what the BIOS settings:

"VESA Master Cycle: Delay ADSJ/Non-delay ADSJ"
and
"Delay Internal ADSJ: Disabled/Enabled"

could refer to with respect to VLB? This is an ALi chipset with Phoenix BIOS

I can guess what this setting means, but I don't know for sure. "ADS" is the name of the signal that is issued by the initiator (master) and tells any potential target (slave) that there is a valid cycle definition (address and cycle type) on the lines A0..A31, M/IO, D/C and W/R. ADS is usually driven by the initiator at the same time as the cycle definition signals, which must be valid some minimum time before the rising edge of the CLK signal at which the ADS signal "is effective". Targets should look at ADS at the rising edge of the clock, and if they detect ADS at that point, they can assume that the address has been on the bus for "some time". This time is called "address setup time with respect to CLK".

Usually, a master implementation does not know when the clock is going to appear, so it can't set up the cycle definition e.g. 10ns before it; instead, the cycle definition and ADS appear some time after the previous clock. Let's assume this is 15ns after the previous clock. At 33MHz bus speed, the clock period is 30ns, so a signal appearing 15ns after the previous clock provides 15ns setup time. At 40MHz bus speed, the clock period decreases to 25ns, so the setup time seen by the target shrinks from 15ns to 10ns. At 50MHz FSB clock, the cycle time is just 20ns, so a signal appearing 15ns after the previous clock is only valid 5ns before the next clock, which may be too short for peripherals. Delaying ADS, so that it gets picked up one clock later (either every time, or only if it arrives "too late" within the current clock cycle), ensures a longer setup time, but costs one FSB clock per cycle affected by the delay.

So in short: "Enable" generates a certain kind of waitstate at the start of a cycle, that can be used to make the system stable at high FSB clocks.
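The arithmetic above can be sketched in a few lines of Python. Note that the 15ns cycle-definition delay is the assumed example figure from this post, not a datasheet value:

```python
def setup_time_ns(fsb_mhz, definition_delay_ns=15.0, ads_delay_clocks=0):
    """Setup time seen by a VL target: the cycle definition appears a
    fixed time after the previous rising clock edge, and must be stable
    before the clock edge at which ADS# is sampled."""
    period_ns = 1000.0 / fsb_mhz
    # Delaying ADS# by one clock adds a full bus period to the setup time.
    return period_ns * (1 + ads_delay_clocks) - definition_delay_ns

for fsb in (33.333, 40.0, 50.0):
    print(f"{fsb:.0f} MHz: {setup_time_ns(fsb):.1f} ns plain, "
          f"{setup_time_ns(fsb, ads_delay_clocks=1):.1f} ns with ADS# delay")
```

This reproduces the 15/10/5 ns shrinkage described above, and shows how one delayed clock buys a whole extra period of setup time.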

Reply 13 of 23, by maxtherabbit

User metadata
Rank l33t

Thanks. Wonder what the 'J' part could be

Reply 14 of 23, by mkarcher

User metadata
Rank l33t
maxtherabbit wrote on 2024-08-03, 15:50:

Thanks. Wonder what the 'J' part could be

Maybe it's some convention to indicate an active-low signal, like "/ADS", "nADS", "ADS*", or ADS with a bar over it. ADS on the 386/486/Pentium FSB is active low. It might also be a qualifier for some internal or forwarded version of the ADS pin, which is likely connected to both the 486 /ADS pin and the VL /ADS pin. Most likely the chipset can distinguish VL master cycles from other cycles using the state of the FSB arbitration unit, not by having different ADS pins for VL and FSB.

Reply 15 of 23, by pitchshifter

User metadata
Rank Newbie

Hi again, I only now got the ST DX2-80, and I must say that, at least for me, it's buggy with the VLB controller working at 40 FSB. It freezes graphics.
With an ISA controller it works fine. Compared with the Intel DX2-66, SpeedSys went from 24.91 to 32.21,
and 3DBench from 44.5 to 51.7, although Doom showed little improvement, 25.61 to 27.34.
I'm happy anyways 😜

Reply 16 of 23, by jakethompson1

User metadata
Rank Oldbie
mkarcher wrote on 2024-08-03, 13:17:
maxtherabbit wrote on 2024-08-02, 20:41:

are you familiar with what the BIOS settings:

"VESA Master Cycle: Delay ADSJ/Non-delay ADSJ"
and
"Delay Internal ADSJ: Disabled/Enabled"

could refer to with respect to VLB? This is an ALi chipset with Phoenix BIOS

I can guess what this setting means, but I don't know for sure. […] So in short: "Enable" generates a certain kind of waitstate at the start of a cycle, that can be used to make the system stable at high FSB clocks.

I have been messing with the UMC 418 SVGA I have been posting about recently on an MB-1433UIV, which is a UMC 498 chipset board.
When I switched out a 486DX2-66 for a UMC Green CPU and raised the bus to 40 MHz, I now have to enable "CPU ADS# Delay 1T or Not" when using the VLB SVGA (ISA SVGA of course is fine).

But: cache read performance as reported by cachechk decreases from 51.6 MB/s to 47.9 MB/s with this enabled; does that mean the ADS# signal going into the cache controller is also delayed? Is that a needless delay or could the extra loading from the VLB card's presence (as opposed to the bus speed or speed of components on that card) be what is causing the problems on the bus, and the card might affect what the cache controller sees on the bus as well, so therefore its sampling of the bus should also be delayed?

As I understand it looking over the VL-Bus spec:

- LDEV# has nothing to do with ADS# or the clock, and is constantly being generated by cards from the current state of the bus.
- Delaying chipset sampling of LDEV# avoids an erroneous ISA cycle if VLB devices are too slow to assert it.
- How frequent would the opposite problem happen--LDEV# is asserted spuriously due to some intermediate state on the bus, and the card is too slow to de-assert it before the next clock cycle, so an access to an ISA device is missed?
- Motherboard cache and DRAM hits supersede LDEV# sampling, and for this reason, delaying LDEV# sampling should never slow down the cache or DRAM.
- The penalty for delaying LDEV# sampling is paid on all VLB cycles and all ISA cycles.
- VLB ID2/ID3 jumpers would have no effect and can't solve this problem, because the chipset doesn't care about those jumpers.

- ADS# is asserted by the CPU (and then de-asserted) to indicate the start of a bus cycle.
- Delaying ADS# allows the address and bus cycle definition signals to stabilize for an extra entire clock cycle before VLB cards try to interpret them.
- I don't follow exactly what issue requires that--ADS# and the subsequent rising edge get seen by the VLB card too early, or too late, versus the address and bus cycle definition signals?
- Is this caused by the motherboard layout and/or capacitive loading from other local bus devices being too high for the current clock rate, slowing down the arrival of bus signals at the SVGA VLB card? Or is it that the propagation through the logic on the VLB card is too slow once the signals have arrived? (or some combination of both)
- At least in the case of the UM498, delaying ADS# slows down all cache and DRAM access by one cycle as well.
- VLB ID2/ID3 jumpers would have no effect and can't solve this problem, because the chipset doesn't care about those jumpers.

- LRDY# is asserted by the VLB device to end the bus cycle.
- Assertion of LRDY# could occur before (on a read cycle) data headed from the VLB card to the CPU has stabilized, causing the CPU to see the wrong data?
- VLB cards could suppress their own generation of LRDY# for one cycle (e.g., via the ID3 jumper) OR the chipset could be software-programmable to delay the sampling of LRDY#, disregarding cards that assert it too quickly - the effect should be the same.
- The penalty for delaying LRDY# generation by a card is paid only on a VLB cycle to that particular card; the penalty for delaying LRDY# sampling is paid on all VLB cycles across all cards.

Reply 17 of 23, by jakethompson1

User metadata
Rank Oldbie

Because I had maxed out at 2-1-1-1 read and 1 WS, I tried replacing the 15ns "CE" brand tag RAM that came with the board with a 15ns UMC one, and this has seemingly fixed the need for the ADS# delay as well. As I know that unbuffered address lines often go to the tag RAM: was the prior chip loading down the address lines and slowing their change in state?

Reply 18 of 23, by mkarcher

User metadata
Rank l33t
jakethompson1 wrote on 2025-07-22, 02:35:

I have been messing with the UMC 418 SVGA I have been posting about recently on an MB-1433UIV, which is a UMC 498 chipset board.
When I switched out a 486DX2-66 for a UMC Green CPU and raised the bus to 40 MHz, I now have to enable "CPU ADS# Delay 1T or Not" when using the VLB SVGA (ISA SVGA of course is fine).

OK, so this shows that something requires the increased setup time of the address and/or command lines at 40MHz.

jakethompson1 wrote on 2025-07-22, 02:35:

But: cache read performance as reported by cachechk decreases from 51.6 MB/s to 47.9 MB/s with this enabled; does that mean the ADS# signal going into the cache controller is also delayed? Is that a needless delay or could the extra loading from the VLB card's presence (as opposed to the bus speed or speed of components on that card) be what is causing the problems on the bus, and the card might affect what the cache controller sees on the bus as well, so therefore its sampling of the bus should also be delayed?

The option is called "CPU ADS# Delay", not "VL ADS# Delay", so it is not surprising that ADS# going anywhere, including to the cache controller, is delayed by that option. 51.6MB/s is 310ns/cache line. 47.9MB/s is 334ns/cache line, so the difference is in fact about 25ns per cache line, which is one clock. As the cache line is burst, only a single ADS# is required. Interestingly, the rates are 12.4 and 13.4 FSB cycles per cache line, which is way more than the 5 cycles (or 6 cycles with the extra ADS# delay) required by a 2-1-1-1 burst. This is not surprising, though, as it is well-known that REP LODSD on the 486 processor is not optimized in microcode and does not hit the bus limit.
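That arithmetic can be checked in a couple of lines, assuming cachechk reports decimal MB/s and a 16-byte 486 cache line:

```python
LINE_BYTES = 16          # 486 cache line size
CLK_NS = 25.0            # 40 MHz FSB clock period

def ns_per_line(mb_per_s):
    # Convert a cachechk MB/s figure to nanoseconds per cache line.
    return LINE_BYTES / (mb_per_s * 1e6) * 1e9

fast = ns_per_line(51.6)   # ADS# delay off
slow = ns_per_line(47.9)   # ADS# delay on
print(round(fast), round(slow), round(slow - fast))      # difference ~ one clock
print(round(fast / CLK_NS, 1), round(slow / CLK_NS, 1))  # FSB clocks per line
```

The difference comes out to roughly one 25ns clock per cache line, and the per-line cycle counts match the 12.4/13.4 figures above.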

jakethompson1 wrote on 2025-07-22, 02:35:

As I understand it looking over the VL-Bus spec, LDEV# has nothing to do with ADS# or the clock, and is constantly being generated by cards from the current state of the bus.
Delaying chipset sampling of LDEV# avoids an erroneous ISA cycle if VLB devices are too slow to assert it.

Wow, I didn't know the VL specification is publicly available. And now, looking for it, I found a 7-year-old VOGONS post. Yeah, LDEV# is typically generated combinatorially from the address and command signals. So LDEV# is generated by a VL card for I/O address 3C0, but not for memory address 3C0. On the ISA bus, 16-bit negotiation was also performed combinatorially, before the command line (/MEMR, /MEMW, /IOR, /IOW) went active; but as the cycle type was not visible before the command was active, on ISA we have separate lines for "if the address that is currently visible on SA0..SA9/SA15 is an I/O address, please perform a 16-bit cycle" and "if the address that is currently visible on LA17..LA23 is the memory address of the next cycle, please perform a 16-bit cycle".

To be clear, the LDEV# delay is not just about erroneously starting an ISA cycle: if LDEV# is sampled too early and the LDEV# signal of a VL card is missed, you have two devices handling the same cycle, which may conflict on RDY# and, in the case of reads, also cause a bus conflict on the data lines.

jakethompson1 wrote on 2025-07-22, 02:35:

How frequent would the opposite problem happen--LDEV# is asserted spuriously due to some intermediate state on the bus, and the card is too slow to de-assert it before the next clock cycle, so an access to an ISA device is missed?

"Missing an access to an ISA device" is way worse than you make it sound, to me. If the chipset sees a spurious LDEV# signal, it assumes the VL card will generate (L)RDY# and never supplies RDY# itself, so you get a hard lock-up of the front-side bus, because no device is going to terminate that cycle. I expect that both problems (erroneously missing LDEV# and spuriously recognizing LDEV#) are equally likely, as both kinds of errors are caused by the setup time of the address and command lines relative to the LDEV# sample point being too short. This can cause LDEV# to be "still active" because the previous cycle accessed a VL card that does not release LDEV# in time, and LDEV# to be "still inactive" if the previous cycle did not also hit the card and the card doesn't manage to assert LDEV# in time.

jakethompson1 wrote on 2025-07-22, 02:35:

Motherboard cache and DRAM hits supersede LDEV# sampling, and for this reason, delaying LDEV# sampling should never slow down the cache or DRAM.
The penalty for delaying LDEV# sampling is paid on all VLB cycles and all ISA cycles.

I sincerely hope every VL chipset works that way. And in fact I did observe horribly bad ET4000 ISA performance if late LDEV# sampling is activated, even if no VL card is installed.

jakethompson1 wrote on 2025-07-22, 02:35:

VLB ID2/ID3 jumpers would have no effect and can't solve this problem, because the chipset doesn't care about those jumpers

Reading the actual specification is interesting. Before reading the specification, I believed that ID3 ("CPU speed") actually indicates the LDEV# sample point, with "<=33MHz" meaning "early" aka "end of first T2", and ">33MHz" meaning "late" aka "end of second T2". Obviously, this is not true, and the only specification given for the LDEV# sample point is that it should be 20ns after the address got valid. That's a full clock period at 50MHz, but depending on the speed of the mainboard, maybe sampling LDEV# early is still just possible at 50MHz. Address and command lines are valid some time before the end of T1, so the "early" sample point allows the "address/command setup time to rising CLK with ADS# asserted" plus a whole bus clock, minus the delay in the chipset LDEV# recognition circuit (including the gate combining the totem-pole LDEV# signals from multiple slots). But I think this is enough parroting of section 3.1 of the VL 2.0 specification for this post 😉 .
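Whether the early sample point can still satisfy that 20ns requirement can be estimated with the same kind of budget as before. The 15ns address-valid delay is again the assumed example figure from upthread, not a measured one, and the chipset's internal delay is left as a parameter:

```python
LDEV_DECODE_NS = 20.0  # VL 2.0: LDEV# may take 20ns after the address is valid

def early_sample_margin_ns(fsb_mhz, addr_valid_after_clk_ns=15.0,
                           chipset_delay_ns=0.0):
    """Time from 'address valid' to the early LDEV# sample point (end of
    the first T2), minus the 20ns the cards are allowed to take."""
    period_ns = 1000.0 / fsb_mhz
    # Remainder of T1, plus the whole first T2, minus chipset-internal delay.
    available = (period_ns - addr_valid_after_clk_ns) + period_ns - chipset_delay_ns
    return available - LDEV_DECODE_NS

print(early_sample_margin_ns(33.333))  # comfortable margin
print(early_sample_margin_ns(50.0))    # barely positive: "just possible"
```

With these assumed numbers the margin at 50MHz is only a few nanoseconds, which matches the "maybe sampling LDEV# early is still just possible at 50MHz" guess above.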

The specification also contains the rationale for ID2 (allow 0WS writes). This signal is meant to forbid VL cards to drive RDY# on the first T2, as the chipset might drive RDY# during the first T2 cycle. I first wondered why only 0WS writes can be forbidden, but there is nothing about 0WS reads (which made no sense to me: if it is about arbitrating who may drive RDY#, the requirement for VL cards to back off RDY# is valid for reads as well as writes). I finally understood that 0WS reads are generally not supported, even if 0WS writes are. Section 2.2.2 of the VL 2.0 specification makes it clear: VL cards are not supposed to implement 0WS read cycles, because cache controllers may speculatively enable the data output of the cache chips on reads before even decoding whether an address is cacheable. The board thus claims the right to own the data lines on the first T2, no matter what address is accessed, so a 0WS read cannot be implemented, as that would require the VL card to drive the data lines on the first T2.

I assume the 0WS write stuff is similar. A board that does not allow "high speed writes" may reserve the right to speculatively enable the output driver for RDY# during the first T2 for 0WS cache-hit writes. On cache misses and non-cacheable addresses (e.g. VL memory space), the driven output will stabilize to high early enough before the end of the first T2 to satisfy the 486 RDY# setup time requirement, but a VL card can not interfere here. It makes some sense to have this feature speed-dependent (most mainboard manuals ask you to allow 0WS writes at <= 33MHz, and forbid 0WS writes above 33MHz): the cacheability decision can start as soon as the addresses are valid (just like the LDEV# determination). If the FSB clock is low enough, the decision that an address is not in DRAM range may still happen during T1, so the chipset knows it doesn't need to drive RDY# on the first T2 (there are no non-cacheable, non-DRAM 0WS cycles). At higher FSB frequencies, the decision that the cycle clearly cannot be a mainboard 0WS cycle might happen only after the rising clock edge that starts T2, so a VL card shouldn't drive RDY#.
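A toy numeric version of that speed dependence, with both delay figures picked purely for illustration (neither the 15ns address-valid delay nor the 12ns decode time is from a datasheet):

```python
def dram_decode_done_in_t1(fsb_mhz, addr_valid_after_clk_ns=15.0,
                           decode_ns=12.0):
    """Can the chipset decide 'this is not a 0WS mainboard cycle' before T1
    ends? If yes, it can safely leave RDY# to the VL card on the first T2."""
    period_ns = 1000.0 / fsb_mhz
    return addr_valid_after_clk_ns + decode_ns <= period_ns

print(dram_decode_done_in_t1(33.333))  # decision fits in T1 -> 0WS writes OK
print(dram_decode_done_in_t1(40.0))    # decision spills into T2 -> forbid 0WS writes
```

With these assumed delays the decision just fits inside T1 at 33MHz but not at 40MHz, mirroring the usual "allow 0WS writes at <= 33MHz" jumper advice.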

Interestingly, most VL cards just ignore the ID bits, even if some chips (e.g. the S3 Trio64) have configuration bits like "allow 0WS write cycles" and "decode addresses on ADS# or one cycle later".

jakethompson1 wrote on 2025-07-22, 02:35:

ADS# is asserted by the CPU (and then de-asserted) to indicate the start of a bus cycle
Delaying ADS# allows the address and bus cycle definition signals to stabilize for an extra entire clock cycle before VLB cards try to interpret them
I don't follow exactly what issue requires that--ADS# and the subsequent rising edge get seen by the VLB card too early, or too late, versus the address and bus cycle definition signals?
Is this caused by the motherboard layout and/or capacitive loading from other local bus devices being too high for the current clock rate, slowing down the arrival of bus signals at the SVGA VLB card? Or is it that the propagation through the logic on the VLB card is too slow once the signals have arrived? (or some combination of both)
At least in the case of the UM498, delaying ADS# slows down all cache and DRAM access by one cycle as well

The idea of asserting ADS# one cycle late is to add a full FSB clock period to the setup time of the address and command lines. Those lines are guaranteed to be valid some setup time before ADS#. The master will drive the address and command lines some clearly defined time after the end of the previous clock (so some time into T1), and the remaining part of T1, which shrinks as the FSB clock increases, is the setup time. You can look at 486 data sheets to find the maximum time between the previous rising clock edge and the validity of the address and command lines. The higher the 486 FSB clock specification, the less time the 486 may "waste" before having valid address and command outputs. The remaining time of the cycle is required to deal with charging capacitive loads on the front-side bus (likely including VL devices, as VL is typically unbuffered) and propagation along the traces on the board, and yet the signals have to arrive some time before the next clock edge at the VL target. The "some time before" (the setup time at the receiver) is meant for propagation delays through the logic on the VL card.

The VL 2.0 specification guarantees 7ns setup time at FSB33 and 5ns setup time at FSB40 and FSB50. VL card designers should know this constraint and design their cards in a way that this amount of setup time is sufficient. Hmm, well... now look at the CL-GD542x data sheet, which requires at least 8ns setup time for the address, command, and UADDR# lines. UADDR# is meant to be decoded using external logic, and well, the signals may already be 1 nanosecond late at the inputs of the decoder if the VL board is at the edge of allowed timings. Now, if the inputs are 1 nanosecond too late, how the heck are you supposed to generate the output in time?! Delaying ADS# by one clock would surely help.
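Plugging the figures from the paragraph above into a margin calculation (7ns/5ns guaranteed setup from the VL 2.0 spec, 8ns required by the CL-GD542x datasheet as cited from memory here; treat the exact values accordingly):

```python
# Guaranteed address/command setup time at the VL connector (VL 2.0 spec)
GUARANTEED_SETUP_NS = {33: 7.0, 40: 5.0, 50: 5.0}
CHIP_REQUIRED_NS = 8.0  # CL-GD542x address/command/UADDR# setup requirement

def setup_margin_ns(fsb_mhz, ads_delay_clocks=0):
    """Setup margin at the chip: guaranteed bus setup (plus any full clock
    periods gained by delaying ADS#) minus what the chip demands."""
    period_ns = 1000.0 / fsb_mhz
    extra = period_ns * ads_delay_clocks  # one delayed clock buys a full period
    return GUARANTEED_SETUP_NS[fsb_mhz] + extra - CHIP_REQUIRED_NS

print(setup_margin_ns(40))                      # negative: spec-legal board can violate the chip
print(setup_margin_ns(40, ads_delay_clocks=1))  # positive: ADS# delay restores margin
```

So a card built around such a chip can be out of margin on a perfectly spec-compliant board, and a single delayed ADS# clock turns a 3ns shortfall into over 20ns of slack.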

jakethompson1 wrote on 2025-07-22, 03:41:

Because I had maxed out at 2-1-1-1 read and 1 WS, I tried replacing the 15ns "CE" brand Tag RAM that came with the board with a 15ns UMC one, and this seemingly has fixed the need for ADS# delay as well--as I know that often unbuffered address lines go to the Tag RAM, was the prior one loading down the address lines and slowing their change in state?

The CE tag RAM and the VL graphics card together loaded the address lines hard enough that some required setup time was not met. The UMC RAM together with the VL card does not do that. Now, this can have two reasons: either the CE tag RAM has a higher input capacitance than the UMC RAM, or (and that's what I suspect) the UMC RAM might actually be slightly faster than the CE RAM, and the critical path is not the VL, but the 2-1-1-1 cache lookup. So with the UMC tag RAM, the address lines are still as slow as they were with the CE RAM, but as the cache lookup is faster with the UMC RAM, the tag signals arrive in time for the 2-clock leadoff cycle when you use the UMC RAM, but they just miss the required "tag setup time" with the slower CE RAM.

You can verify whether that hypothesis is correct by re-installing the CE tag RAM, configuring no ADS# delay, but set the cache timing to 3-1-1-1. If my guess is right, this still works. Adding the ADS# delay slowed everything down by one FSB clock, which is a quite undirected approach. If the tag RAM access time is the limiting factor in your configuration, just slowing down the cache timing should help equally well, but keep VL performance high (although VL target performance might be limited with that UMC418 graphics card anyway).

jakethompson1 wrote on 2025-07-22, 02:35:

LRDY# is asserted by the VLB device to end the bus cycle
Assertion of LRDY# could occur before (on a read cycle) data headed from the VLB card to the CPU has stabilized, causing the CPU to see the wrong data?

VLB cards could suppress their own generation of LRDY# for one cycle (e.g., via ID3 jumper) OR the chipset could be software-programmable to delay the sampling of LRDY#, disregarding cards that assert it too quickly - the effect should be the same

A VL card is not supposed to assert LRDY# before it drives the data lines with valid data. The data lines are likely directly connected to the front-side bus, while the LRDY# signal might be buffered in the chipset. So it is extremely unlikely for LRDY# to "arrive before the data", as long as the VL card is compliant. You are right that a card that implements a ROM (e.g. a BIOS) might need an extra wait state if the FSB clock is high, so it might use the ID3 signal to decide the number of wait states for ROM reads. That's a valid approach.

You are also correct that many chipsets are able to do something that will effectively delay LRDY# by one clock. This is not meant to fix a broken VL target that asserts LRDY# before driving the data, though. The reason for that chipset feature, called "resynchronization", is to make sure that LRDY# meets the setup and hold time requirements of the 486 processor. The 486 requires LRDY# not to change from some time before the rising edge of the clock (the setup time) until some time after the rising edge (the hold time). If LRDY# is low during all of that window, the cycle is finished. If LRDY# is high during all of it, a wait state is added. If LRDY# changes during that window, undefined things might happen (most likely, it is just undefined whether LRDY# is recognized or not). VL cards are likely to output LRDY# some time after the previous clock edge, and the higher the FSB clock, the less time remains for LRDY# to propagate over the board traces and through some processing in the chipset. At some FSB frequency, the LRDY# signal might be too late to meet the setup time requirement, so it might arrive during the sampling window in which LRDY# is forbidden to change. This is what "resynchronization" fixes: if the chipset "resynchronizes" LRDY#, it samples LRDY# shortly after the rising edge of CLK, and outputs that sampled value for the complete clock period, including the hold time of the next clock period. So if LRDY# is asserted later than the chipset sample point, LRDY# will not be seen by the processor at the end of that cycle, but at the end of the next one. Most importantly, though, this scheme ensures that LRDY# on the processor side does not change during the setup and hold window.
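A toy cycle-by-cycle model of that behaviour (my own simplification, not actual chipset logic) shows the trade-off: in transparent mode the CPU sees LRDY# in the same clock the card asserts it, while in resynchronized mode the chipset registers it on the rising edge, so the CPU sees it one clock later, but stable for a full period.

```python
# Toy model of LRDY# resynchronization. lrdy_per_clock gives the LRDY#
# state at the chipset input in each clock (True = asserted). This is a
# simplified illustration, not a description of any specific chipset.
def cpu_sees(lrdy_per_clock, resync):
    if not resync:
        return list(lrdy_per_clock)   # transparent: forwarded combinationally
    seen, reg = [], False
    for lrdy in lrdy_per_clock:
        seen.append(reg)              # CPU samples the value registered last edge
        reg = lrdy                    # chipset registers LRDY# at this edge
    return seen

cycle = [False, False, True, True]    # target asserts LRDY# in clock 3
print(cpu_sees(cycle, resync=False))  # → [False, False, True, True]
print(cpu_sees(cycle, resync=True))   # → [False, False, False, True]
```

The resynchronized trace ends the cycle one clock later (the extra wait state jakethompson1 mentions), but the registered output can never change inside the CPU's setup/hold window.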

jakethompson1 wrote on 2025-07-22, 02:35:

The penalty for delaying LRDY# generation by a card is paid only on a VLB cycle to that particular card; the penalty for delaying LRDY# sampling is paid on all VLB cycles across all cards.

If the chipset is not in LRDY# resynchronization mode (that is, not sampling LRDY# and forwarding it only after the next clock has started), it is in transparent mode, and will forward LRDY# as soon as it can (though that configurable latch/flip-flop/register still has some propagation delay), so it is not really valid to talk about "sampling" in that case. I expect RDY# resynchronization to apply not just to LRDY# from the VL slots, but likely also to RDY# generated by the ISA bridge, though that might depend on the chipset.

As the 486 processor reads the data when it sees RDY#, VL cards are required to keep driving data on read cycles until RDY# has arrived at the processor (or whatever initiated the cycle). Because LRDY# might be sent through a resynchronization circuit, or the reader might not be the processor at all but an ISA bus master (including the ISA DMA circuitry), the VL card cannot rely on the read being done at the next clock edge. That's why VL includes RDYRTN#, which tells the card when the read cycle is actually over, so the card may stop driving the bus.

Reply 19 of 23, by jakethompson1

User metadata
Rank Oldbie
Rank
Oldbie
mkarcher wrote on 2025-07-22, 21:49:

You can verify whether that hypothesis is correct by re-installing the CE tag RAM, configuring no ADS# delay, but set the cache timing to 3-1-1-1. If my guess is right, this still works. Adding the ADS# delay slowed everything down by one FSB clock, which is a quite undirected approach. If the tag RAM access time is the limiting factor in your configuration, just slowing down the cache timing should help equally well, but keep VL performance high (although VL target performance might be limited with that UMC418 graphics card anyway).

In a quick test, 3-1-1-1 & 1 W/S is working; however, the system only "lost" the ability to operate at 2-1-1-1 & 1 W/S with the CE after switching the ISA SVGA card out for the VLB SVGA. Perhaps the CE was just barely fast enough in the first place, and the presence of the VLB card on the address lines pushed it over the edge into being too slow?