Well, I found in the S3 datasheet that the RDY ctl min wait states setting is controlled by bit 4 of the CR40 register. I wrote a small utility that displays a few S3 options from various CRs. With the 0WS ROM the bit was 0 and with the 1WS ROM the bit was 1, as expected. The CRs seem to be unlocked by the BIOS. So I cleared bit 4 at runtime on the 1WS ROM and read it back to confirm the value had changed, but it had no effect on performance, strange... I could search the disassembled 1WS ROM for any CR40 manipulation and try to change it there.
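For anyone who wants to repeat the test, here is a rough sketch of the read, clear bit 4, read back sequence. It is not my actual utility: the register access goes through two hypothetical hook functions (`crtc_read_fn` / `crtc_write_fn`), which on real hardware would write the index to 3D4h and read or write the data at 3D5h using whatever port I/O calls your compiler provides. The `sim_*` functions are only a stand-in register file so the logic can be tried without a card.

```c
#include <assert.h>
#include <stdint.h>

#define CR40_INDEX      0x40u
#define CR40_RDY_WS_BIT 4u   /* as observed in the post: 0 on the 0WS ROM, 1 on the 1WS ROM */

/* Hypothetical access hooks (assumption, not from any S3 SDK).
 * On real hardware: write index to 3D4h, then read/write 3D5h. */
typedef uint8_t (*crtc_read_fn)(uint8_t index);
typedef void    (*crtc_write_fn)(uint8_t index, uint8_t value);

/* Return value with the given bit cleared; other bits are preserved. */
static uint8_t clear_bit(uint8_t value, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(value & ~(1u << bit));
}

/* Clear CR40 bit 4 and read it back.
 * Returns 1 if the read-back shows the bit cleared, 0 otherwise. */
static int clear_cr40_wait_state(crtc_read_fn rd, crtc_write_fn wr)
{
    uint8_t old = rd(CR40_INDEX);
    wr(CR40_INDEX, clear_bit(old, CR40_RDY_WS_BIT));
    return ((rd(CR40_INDEX) >> CR40_RDY_WS_BIT) & 1u) == 0;
}

/* Simulated register file, only for trying the logic off-hardware. */
static uint8_t sim_regs[256];
static uint8_t sim_read(uint8_t index)                 { return sim_regs[index]; }
static void    sim_write(uint8_t index, uint8_t value) { sim_regs[index] = value; }
```

Doing a read-modify-write instead of writing a fixed value keeps the other CR40 bits as the BIOS left them, so only the wait state bit differs between the two test runs.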
Another thought. Both ROMs reported that the memory config is 2-cycle EDO. Could changing it to 1-cycle EDO make it faster?
I also plan to add code for reading the MCLK PLL regs to check how fast the memory clock is running.
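A possible starting point for the MCLK readout, as a sketch only. It assumes a Trio64-family register layout, which I have not verified against this particular chip: SR10 holds N in bits 4-0 and R in bits 6-5, SR11 holds M in bits 6-0, the reference crystal is 14.318 MHz, and the output is fref * (M+2) / ((N+2) * 2^R). The function name `s3_mclk_khz` is made up for illustration, and it takes the two raw register values rather than touching the hardware.

```c
#include <stdint.h>

#define S3_REF_KHZ 14318u   /* assumed 14.318 MHz reference crystal */

/* Decode MCLK in kHz from raw SR10/SR11 values.
 * Assumed layout (Trio64-style, check your chip's datasheet):
 *   SR10 bits 4-0 = N, bits 6-5 = R;  SR11 bits 6-0 = M
 *   f = fref * (M + 2) / ((N + 2) * 2^R)                      */
static uint32_t s3_mclk_khz(uint8_t sr10, uint8_t sr11)
{
    uint32_t n = sr10 & 0x1Fu;
    uint32_t r = (sr10 >> 5) & 0x03u;
    uint32_t m = sr11 & 0x7Fu;

    /* Integer division truncates to whole kHz. */
    return (S3_REF_KHZ * (m + 2u)) / ((n + 2u) << r);
}
```

For example, with these assumptions SR10 = 22h (N = 2, R = 1) and SR11 = 1Ah (M = 26) would decode to 14318 * 28 / 8 = 50113 kHz, so about 50.1 MHz.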
Gigabyte GA-P67-DS3-B3, Core i7-2600K @4,5GHz, 8GB DDR3, 128GB SSD, GTX970(GF7900GT), SB Audigy + YMF724F + DreamBlaster combo + LPC2ISA