I'm surprised by all the problems you're having with Tualatin P3s and slotkets.
I've modded a few Tualatins (Celerons and P3s) by isolating one pin and then doing the bridge. No problems with any of my Slot 1 boards: two different Asus P3B-Fs, an Abit VT6X4 and an Abit BF6. The slotkets are an MSI MS-6905 and some others. The only requirement is that the slotket must support at least Coppermine for it to work with a Tualatin.
All of them go up to 150 MHz FSB, and the BF6 goes even higher, until the motherboard (or the AGP bus) hits its limit.
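To put a rough number on the AGP limit: this is only a sketch, and it assumes the 440BX boards (P3B-F, BF6) run AGP at 2/3 of the FSB, which as far as I know is the lowest AGP ratio that chipset offers. The function name and the 66.67 MHz nominal figure are my own choices for illustration, and it doesn't apply to the VIA-based VT6X4.

```python
from fractions import Fraction

# Nominal AGP clock in MHz (approximate spec value, used only for comparison).
AGP_NOMINAL_MHZ = 66.67


def agp_clock_mhz(fsb_mhz, divider=Fraction(2, 3)):
    """Return the AGP clock for a given FSB, assuming a fixed AGP:FSB ratio.

    The 2/3 default is an assumption about 440BX boards, not a measured value.
    """
    return float(fsb_mhz * divider)


if __name__ == "__main__":
    for fsb in (100, 133, 150):
        agp = agp_clock_mhz(fsb)
        overclock = agp / AGP_NOMINAL_MHZ - 1
        print(f"FSB {fsb} MHz -> AGP {agp:.1f} MHz ({overclock:+.0%} vs nominal)")
```

Under that assumption the AGP bus is already running about 50% over its nominal clock at 150 MHz FSB, so it's no wonder the graphics card or the bus gives up somewhere above that.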
The P3-S with 512KB cache is way faster than the 256KB Tualatin; it really surprised me how big the difference was. For example, Need for Speed Underground is almost unplayable with the Celeron, while with the P3-S it's fully playable even at lower clocks. The system is an Asus P3B-F with a GeForce FX5900XT.