Why would you need 3x 2kb tag-ram chips for 16kb of L2 cache in a 386? \ VOGONS

First post, by vetz

Posted on 2024-04-04, 10:11

vetz Offline

Rank: l33t
Posts: 4164
Joined: 2012-04-23, 17:13

This question comes from this thread: Help with Zenith Data System Expansion cards (Z-386/25)

Here is the cache card from my Zenith Z-386 / 20:

It has 8x MCM6270P25 (4kx4) SRAM chips for 16kb of cache and 3x MCM4180P25 (4kx4) for tag-ram. I find 6kb to be a very strange amount of tag-ram, and I've never seen another machine with such an odd number. Maybe someone can shed some light on this? The maximum supported memory in the system is 52mb (20mb on the mainboard and 32mb on the Zenith memory expansion board). I can also add that the machine uses 72-pin FPM memory (which is very early for a machine launched in Q1 1990!). I also want to upgrade the cache to 64kb (which according to the service manual is the maximum supported). Do I then need 3x 8kb chips (16kx4)?

3D Accelerated Games List (Proprietary APIs - No 3DFX/Direct3D)
3D Acceleration Comparison Episodes

Reply 1 of 10, by mkarcher

Posted on 2024-04-04, 10:40

mkarcher Offline

Rank: l33t
Posts: 2701
Joined: 2019-01-19, 16:29
Location: Germany

The tag RAM is 12 bits wide. So the three 4k x 4 tag SRAM chips are used as a 4k x 12 memory array. Assuming the board tags and caches individual 32-bit values, you have 4K cache lines of 32 bits each, and a 12 bit tag for each of the cache lines.
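As a rough sanity check of those numbers, here is a minimal Python sketch. The chip counts and sizes come from the posts above; the 4-byte line size is the same assumption made in the post (one 32-bit word per cache line), and all variable names are made up for illustration.

```python
# Geometry of the tag array, using the figures quoted in the thread.
CHIPS = 3                     # three tag SRAM chips on the card
BITS_PER_CHIP = 4             # each chip is 4k x 4
ENTRIES_PER_CHIP = 4 * 1024   # 4k locations per chip
LINE_BYTES = 4                # assumption: one 32-bit word per cache line

# Chips placed side by side widen the word; the depth stays at 4k.
tag_width_bits = CHIPS * BITS_PER_CHIP
tag_entries = ENTRIES_PER_CHIP

# One tag entry per cache line, so the data array holds entries * line size.
cache_bytes = tag_entries * LINE_BYTES

# Raw tag storage, which is where the odd-looking "6kb" comes from.
tag_storage_bytes = tag_entries * tag_width_bits // 8

print(tag_width_bits, tag_entries, cache_bytes, tag_storage_bytes)
```

Under these assumptions the script prints `12 4096 16384 6144`, i.e. a 4k x 12 tag array covering 16 KB of data with 6 KB of tag storage.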

Note that the tag SRAM chips are not just 4k x 4 SRAMs; they contain two integrated extra functions. They can be completely cleared in a single operation taking just 70ns, and they have a built-in comparator with a single output indicating whether the data presented at the data pins is identical to the data stored in the memory. Those special chips fell out of favour when chipsets got complex enough to do these operations inside the chipset, so they are hard to get nowadays.

Assuming a direct mapped cache, the cacheable area is 2^(tag bits) * cache size at max (if there is a "valid" bit inside the tag, decrement the tag bit count by one), so you get 4096 * 16k = 64M cacheable area (assuming "always valid" operation) or 32M cacheable area (assuming a dedicated valid bit). Bulk-clearable address tag comparator RAMs fit very well into a system with a dedicated valid bit. You just fix the "input" for that bit to 1, and the chip will output "no match" for all locations that have not yet been written.
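The formula in that paragraph can be written out as a small Python sketch. It assumes a direct-mapped cache, as the post does; the function name `cacheable_bytes` and its parameters are hypothetical and only serve the illustration.

```python
def cacheable_bytes(tag_bits: int, cache_bytes: int, valid_bit: bool = False) -> int:
    """Maximum cacheable area of a direct-mapped cache.

    If one tag bit is reserved as a "valid" flag, only the remaining
    bits carry address information.
    """
    address_tag_bits = tag_bits - 1 if valid_bit else tag_bits
    return (2 ** address_tag_bits) * cache_bytes


MIB = 1024 * 1024
CACHE = 16 * 1024  # 16 KB of cache data, as on the card in the first post

print(cacheable_bytes(12, CACHE) // MIB)                  # "always valid" case
print(cacheable_bytes(12, CACHE, valid_bit=True) // MIB)  # dedicated valid bit
```

This prints `64` and `32`, matching the 64M and 32M figures above. Which of the two cases applies to the Zenith card is not established in the thread.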

If you upgrade to 64K of cache, you need two fewer tag bits for the same cacheable area. Assuming all 12 bits are used, you still need 10 tag bits, which means you need 3 chips of special 16k x 4 tag SRAM including an integrated comparator and bulk clear. If you can take a hit of a factor of 4 on the cacheable area by just using 8 tag bits, you might get away with 2 of those special tag SRAM chips.
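The chip count for the 64K case follows from the same relationship. A minimal sketch, assuming a direct-mapped cache, 4-bit-wide tag chips, no separate valid bit, and power-of-two sizes; `tag_chips_needed` is a made-up helper name.

```python
import math


def tag_chips_needed(cacheable_bytes: int, cache_bytes: int, bits_per_chip: int = 4):
    """Return (tag bits, number of 4-bit tag chips) for a direct-mapped cache.

    Assumes both sizes are powers of two and no tag bit is spent on "valid".
    """
    ratio = cacheable_bytes // cache_bytes
    tag_bits = ratio.bit_length() - 1   # exact log2 for a power of two
    return tag_bits, math.ceil(tag_bits / bits_per_chip)


KIB, MIB = 1024, 1024 * 1024

print(tag_chips_needed(64 * MIB, 16 * KIB))  # current 16K setup
print(tag_chips_needed(64 * MIB, 64 * KIB))  # 64K cache, same cacheable area
print(tag_chips_needed(16 * MIB, 64 * KIB))  # 64K cache, a quarter of the area
```

The three lines print `(12, 3)`, `(10, 3)` and `(8, 2)`: ten tag bits still need three 4-bit chips, and only the drop to eight bits saves one.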

I assume that the output enable signal of the tag chips is permanently grounded, causing them to never operate in "read" mode that presents the contents of the RAM on the data pins, but they keep operating in "compare" mode (outputting the match signal on a dedicated output-only pin) or "write" mode. In that case, the data pins are input-only and can be directly connected to the processor address bus (possibly buffered). The 573 chips are latch chips. I guess they are address latches that are used to finish a read/write operation while the 80386 already presents the next address due to "address pipelining". In that case, the tag chips might very well be connected to the unlatched addresses (to start the tag comparison early), while the data chips are connected to the latched addresses (to be able to handle the data late). For 4k chips, you only need 12 address bits, and for 16k chips, you need 14 address bits. The two 573 chips provide 16 latch bits, so it might be possible to go up to 256k of cache using a similar board design - provided you have enough unobtainium at hand to get two 64k x 4 or one 64k x 8 self-clearing tag-comparing SRAM chips, or you use a PAL to decode between multiple smaller tag SRAMs.
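The address-bit counts in that paragraph can be checked with a short sketch. It assumes one 32-bit word per cache entry (as earlier in the post) and that all 16 latch bits would be used as index bits, which is a guess about the board and not something confirmed in the thread. The names are hypothetical.

```python
LINE_BYTES = 4  # assumption: each cache entry is one 32-bit word


def index_bits(cache_bytes: int, line_bytes: int = LINE_BYTES) -> int:
    """Number of address bits needed to select one cache entry."""
    entries = cache_bytes // line_bytes
    return entries.bit_length() - 1   # exact log2 for a power of two


LATCH_BITS = 2 * 8  # two 573-type octal latches, 8 bits each

# Largest cache whose index would still fit into the latched address bits.
max_entries = 2 ** LATCH_BITS
max_cache_bytes = max_entries * LINE_BYTES

print(index_bits(16 * 1024), index_bits(64 * 1024))
print(max_entries, max_cache_bytes // 1024)
```

This prints `12 14` and `65536 256`: 16 latched bits would address 64k entries of 4 bytes each, i.e. 256 KB of cache data.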

Reply 2 of 10, by vetz

Posted on 2024-04-04, 11:12

Thank you for the very detailed answer. That explains it and I certainly learned something new! I believe I read in the manual somewhere that the cache memory operates at 32-bit, so your assumption is most likely correct.

Now I gotta check how hard it is to acquire those special tag-ram chips.

Reply 3 of 10, by vetz

Posted on 2024-04-04, 11:54

mkarcher wrote on 2024-04-04, 10:40:

If you upgrade to 64K of cache, you need two fewer tag bits for the same cacheable area. Assuming all 12 bits are used, you still need 10 tag bits, which means you need 3 chips of special 16k x 4 tag SRAM including an integrated comparator and bulk clear. If you can take a hit of a factor of 4 on the cacheable area by just using 8 tag bits, you might get away with 2 of those special tag SRAM chips.

I have some trouble finding 16k x 4 tag-ram chips with comparator in DIP-22 package. Do they exist? When Googling I found this page: http://www.cpu-ns32k.net/National.html. I know this computer is not x86 based, but the cache setup seems very similar.

For high performance the CPU board has a 64 kBytes cache. The data memory of the cache is made of 8 fast SRAMs from Performance Semiconductor. The SRAM device P4C188-25PC is organized as 16k x 4 bits and has an access time of 25 ns. They are located at the left edge. The tag memory of the cache is made of four special SRAMs from SGS-Thomson. The tag memory chip MK41H80N-25 contains a 4k x 4 bits memory array and a 4-bit comparator. 12 address bits and one valid bit are used for the tag. The result is a 256 MBytes cachable address space. The tag memories are located at the right edge. Both types of memory use a 22-pin DIP package.

From what the quote above says, it could indicate I could keep the tag-ram chips when upgrading the 8 cache chips to 16k x 4?

Reply 4 of 10, by vetz

Posted on 2024-04-04, 13:08

I'm also wondering if 64kb is actually supported on this specific cache board. The manual states 64kb is possible, but in the parts list I see that the 64kb cache board has a different part no. The part no is not similar to Zenith's hardware part numbers (which start with 85), so I'm not sure if this is a different base board, or just different memory chips. If it is a different base cache memory board, that might be due to the different pinouts on the 16k x 4 and 4k x 4 SRAM chips for DIP22?

I can see 16k has 13 address lines vs 11 for 4k. The chip enable and output enable pins are also different. On 16k, output enable doesn't exist on the DIP22 version and pin 10 has become chip enable (instead of pin 9 on 4k).

Reply 5 of 10, by mkarcher

Posted on 2024-04-04, 13:38

vetz wrote on 2024-04-04, 11:54:

I have some trouble finding 16k x 4 tag-ram chips with comparator in DIP-22 package. Do they exist?

Definitely not in DIP22: That type of chip requires at least

14 address pins
4 data pins
1 write enable pin
1 read enable pin
1 global clear pin
1 match output pin
2 supply pins

This is 24 pins. There is no way to fit that into DIP-22. Even if you omit the read enable pin that is likely unused in your design, it's 23 pins.
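The pin budget above is simple enough to tally in a few lines of Python. The pin counts are the ones listed in the post; the dictionary and variable names are only for illustration.

```python
# Minimum pin budget for a 16k x 4 tag SRAM with comparator and bulk clear,
# as listed in the post above.
PINS = {
    "address": 14,
    "data": 4,
    "write enable": 1,
    "read enable": 1,
    "global clear": 1,
    "match output": 1,
    "supply": 2,
}

DIP22_PINS = 22

total = sum(PINS.values())
without_read_enable = total - PINS["read enable"]

print(total, without_read_enable)
print(total <= DIP22_PINS, without_read_enable <= DIP22_PINS)
```

This prints `24 23` and `False False`: even without the read enable pin, the chip needs one pin more than a DIP-22 package offers.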

I have no idea whether 16k x 4 with comparator exist in other packages, like DIP-24. The IDT7174 was a quite popular 8k x 8 chip with comparator, so there definitely were some 64kBit chips with integrated comparator.

vetz wrote on 2024-04-04, 11:54:

When Googling I found this page: http://www.cpu-ns32k.net/National.html. I know this computer is not x86 based, but the cache setup seems very similar.

National NS32K documentation wrote:
For high performance the CPU board has a 64 kBytes cache. The data memory of the cache is made of 8 fast SRAMs from Performance Semiconductor. The SRAM device P4C188-25PC is organized as 16k x 4 bits and has an access time of 25 ns. They are located at the left edge. The tag memory of the cache is made of four special SRAMs from SGS-Thomson. The tag memory chip MK41H80N-25 contains a 4k x 4 bits memory array and a 4-bit comparator. 12 address bits and one valid bit are used for the tag. The result is a 256 MBytes cachable address space. The tag memories are located at the right edge. Both types of memory use a 22-pin DIP package.

From what the quote above says, it could indicate I could keep the tag-ram chips when upgrading the 8 cache chips to 16k x 4?

No, definitely not if the cache is based around single 32-bit words. You need one entry in the tag RAM per "cache line". As your "cache line" is just a single 32-bit word, you need one tag entry per DWORD, so 4K tag entries are good for 16k of cache only. I suspect the NS32K system uses 16-byte cache lines, just as the 486 processor's L1 cache does. In that case, you need one tag entry per 16 bytes, and a 4K tag can support 64KB of cache.
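The dependence of tag depth on line size can be sketched as follows. The 4-byte line for the Zenith card is the assumption used throughout the thread, and the 16-byte line for the NS32K board is only the suspicion voiced above, not a confirmed figure. `tag_entries_needed` is a hypothetical helper.

```python
def tag_entries_needed(cache_bytes: int, line_bytes: int) -> int:
    """One tag entry is needed per cache line."""
    return cache_bytes // line_bytes


KIB = 1024

# Zenith card as assumed in the thread: one 32-bit word per line.
print(tag_entries_needed(16 * KIB, 4))    # existing 16K cache
print(tag_entries_needed(64 * KIB, 4))    # 64K cache with the same line size

# NS32K board, if it really uses 16-byte lines (a suspicion, not a fact).
print(tag_entries_needed(64 * KIB, 16))
```

This prints `4096`, `16384` and `4096`: with 4-byte lines a 64K cache needs 16k tag entries, while 16-byte lines would let 4k entries cover the same 64K.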

vetz wrote on 2024-04-04, 13:08:
I'm also wondering if 64kb is actually supported in this specific cache board. The manual states 64kb is possible, but in the parts list I see a 64kb cache board have a different part no

...

I can see 16k has 13 address lines vs 11 for 4k. The chip enable and output enable pins are also different. On 16k, output enable doesn't exist on the DIP22 version and instead pin 10 has become chip enable (instead of pin 9 on 4k).

As already discussed in the first paragraph of this post, you won't fit the tag chip required for 16K individually cached dwords into the DIP22 sockets. As you also found out, the 16k x 4 and the 4k x 4 chips have incompatible pinouts as well (although you seem to have miscounted the address lines: it should be 14 (A0..A13) vs. 12 (A0..A11), not 13 vs 11). Furthermore, on 16K cache, you need to feed the CPU address lines A14 and A15 as data into the tag comparators, whereas on 64K cache, you feed those lines as addresses into both the tag RAM chips and the data chips. Thus again, it is clearly impossible to support both the 16K scheme and the 64K scheme on the same board without jumpers. Typical early 386/486 boards had around 4 to 6 jumpers to accommodate different cache sizes.
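The changing role of A14 and A15 can be illustrated by splitting an address into tag, index and byte offset. This is a generic direct-mapped split under the thread's assumption of 4-byte lines, not a description of the actual Zenith wiring; `split_address` is a made-up helper.

```python
def split_address(addr: int, cache_bytes: int, line_bytes: int = 4):
    """Split a physical address into (tag, index, offset) for a
    direct-mapped cache with power-of-two sizes."""
    offset_bits = line_bytes.bit_length() - 1
    idx_bits = (cache_bytes // line_bytes).bit_length() - 1
    offset = addr & (line_bytes - 1)
    index = (addr >> offset_bits) & ((1 << idx_bits) - 1)
    tag = addr >> (offset_bits + idx_bits)
    return tag, index, offset


# An address with only A14 and A15 set.
addr = (1 << 14) | (1 << 15)

print(split_address(addr, 16 * 1024))  # A14/A15 land in the tag
print(split_address(addr, 64 * 1024))  # A14/A15 land in the index
```

This prints `(3, 0, 0)` and `(0, 12288, 0)`. With 16K of cache the two lines are compared as tag data, while with 64K they select the entry, so they would have to go to the address pins of both the tag and data chips.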

Reply 6 of 10, by MikeSG

Posted on 2024-04-04, 14:36

MikeSG Offline

Rank: Member
Posts: 147
Joined: 2023-02-14, 08:31

This is 256KB using 3 tag chips with 22 pins each, marked W22B65AK-20. I can't find the datasheet anywhere, so I don't know whether they are 64K x 4 or whether they have a comparator.

From an ECS SL486VE board.

Reply 7 of 10, by vetz

Posted on 2024-04-04, 18:11

Again thanks for a very good reply. I agree, the cache memory board I have at the moment does not support a direct 64k upgrade by swapping out the memory chips. I will have to look for the 64k board going forward, which is probably unobtanium at this point. Can't find much information about it on the internet and no pictures.

Yes, I missed that the address lines start at 0.

Reply 8 of 10, by mkarcher

Posted on 2024-04-04, 20:09

vetz wrote on 2024-04-04, 18:11:

Again thanks for a very good reply. I agree, the cache memory board I have at the moment does not support a direct 64k upgrade by swapping out the memory chips. I will have to look for the 64k board going forward, which is probably unobtanium at this point. Can't find much information about it on the internet and no pictures.

So basically you have two options: Either you reverse engineer the system well enough to design your own 64K cache board or find out how to bodge the 16K board into a 64K board, or you try to obtain the original 64K cache board. Most likely all information required for an electronics engineer to design a new board or a 64K mod of the 16K board can be obtained from your system using just eye-sight and a continuity buzzer.

Well, you also have the third option to just accept that either of the first two options is too inconvenient and you keep that system at 16K cache and spend your time on other projects. Choose the way that fits you best.

Reply 9 of 10, by CoffeeOne

Posted on 2024-04-04, 20:24

CoffeeOne Offline

Rank: Oldbie
Posts: 1157
Joined: 2019-12-25, 16:12
Location: Austria

MikeSG wrote on 2024-04-04, 14:36:

This is 256KB using 3 TAGs with 22pins. W22B65AK-20. Can't find the datasheet anywhere so they may/may not be 64KBx4, and whether or not they have comparator.

From an ECS SL486VE board.

These are classical 16kx4 chips on your EISA board. No comparator of course.
https://www.datasheets360.com/pdf/-9176828267189411086

Reply 10 of 10, by vetz

Posted on 2024-04-04, 20:28

mkarcher wrote on 2024-04-04, 20:09:

So basically you have two options: Either you reverse engineer the system well enough to design your own 64K cache board or find out how to bodge the 16K board into a 64K board, or you try to obtain the original 64K cache board. Most likely all information required for an electronics engineer to design a new board or a 64K mod of the 16K board can be obtained from your system using just eye-sight and a continuity buzzer.

Well, you also have the third option to just accept that either of the first two options is too inconvenient and you keep that system at 16K cache and spend your time on other projects. Choose the way that fits you best.

Modding or designing a new board is way beyond my skills and knowledge, so it's the third option for me, except I'll put in an Ebay search just in case the original board shows up one day 😀

The Zenith Z-386 also looks to be quite a rare machine. I only know of OtakuN3rd in the community who owns one besides myself, so the market for a new design is going to be very limited.
