VOGONS


First post, by Kahenraz

I've been running into this error a lot as I try to incorporate a workflow in Windows 98 that includes Cygwin, for the familiar Bash shell and utilities. I suspected that Cygwin was a trigger rather than the cause, and have been investigating this for some time. I tried various methods of causing memory pressure and different ideas to simulate how Cygwin behaves, and finally found that it seems to be connected to spawning lots of short-lived child processes. This unstable state can be induced by opening and closing Notepad (or some other program) repeatedly until it fails to open. At this point, the system will be unable to open a DOS command window. Win32 applications will typically still work, but there is a chance that the entire system will lock up.

All of the references I've found regarding this error point to the vcache or to having too much memory installed. While this error may be induced in those scenarios as well, and the error message itself may be the same, I'm not confident that the problem is identical. I've tried all kinds of different configurations of real hardware, VMs, updates, and patches, but I haven't found anything that mitigates this problem, or even what causes it. It's a disease that can only be cured by a system reboot. Whatever "memory" Windows thinks it has run out of, it seems it can never be reclaimed.

Does anyone have any information on this, what may be causing it, how to avoid it, and how to recover from this state?

Lots of information here:
https://msfn.org/board/topic/183777-how-to-de … n-windows-9xme/

I also made a video of the problem in action:
https://youtu.be/ssDf-fF6ULY

Reply 1 of 26, by Jo22

Is virtual memory enabled?

Some old programs like Photoshop Elements 2.0 will complain if virtual memory is not available, no matter how much real, physical memory there is.
The PC I experienced this on had 3.5 GB of free memory on XP and swap disabled.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 2 of 26, by Kahenraz

Yes, I never disable virtual memory. I have tried both letting Windows manage virtual memory for me and specifying the swap file settings myself, with either a growing or a fixed size. The issue doesn't appear to involve swap file usage at all, because when there is sufficient physical memory available, the swap file is never used (it doesn't grow).

This problem is the state that the operating system can find itself in where some kind of "memory" resource has become exhausted. I can induce this state during testing by using my stress-open tool. I wrote this tool as a reaction to finding my system falling into this state with normal use and trying to find a way to induce it on demand for testing and replication.

This is after a fresh reboot. The tool opens Notepad (a native Win32 program) and closes it gracefully with the WM_CLOSE message (this is what happens when you click the X button). The stress-open tool is also a Win32 binary; there is no 16-bit code here. But somehow the operating system becomes unable to open a DOS window, despite still being able to run Win32 applications.
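
For anyone who wants to try reproducing this, a test loop of this kind could be sketched roughly as follows. This is not the actual stress-open source: the names `stress_loop`, `launch_and_close` and `fake_launch`, the 5-second timeouts and the cycle cap are all assumptions made for illustration. The Win32 part is only compiled on Windows; `fake_launch` is a stand-in so the loop itself can be dry-run anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: start one child, close it again, and
   return 1 on success or 0 if the child could not be started. */
typedef int (*launch_fn)(void *ctx);

/* Repeat launch/close cycles until one fails or max_cycles is reached.
   Returns the number of cycles that completed successfully. */
int stress_loop(int max_cycles, launch_fn launch, void *ctx)
{
    int done = 0;
    assert(launch != NULL && max_cycles >= 0);
    while (done < max_cycles && launch(ctx))
        done++;
    return done;
}

/* Stand-in launcher for dry runs: succeeds until the counter that ctx
   points to reaches zero, then reports a failed launch. */
int fake_launch(void *ctx)
{
    int *remaining = ctx;
    if (*remaining == 0)
        return 0;
    (*remaining)--;
    return 1;
}

#ifdef _WIN32
#include <windows.h>

/* EnumWindows callback: post WM_CLOSE to every top-level window that
   belongs to the process id passed in lparam. */
static BOOL CALLBACK close_windows_of(HWND hwnd, LPARAM lparam)
{
    DWORD pid = 0;
    GetWindowThreadProcessId(hwnd, &pid);
    if (pid == (DWORD)lparam)
        PostMessageA(hwnd, WM_CLOSE, 0, 0);
    return TRUE; /* keep enumerating */
}

/* Start the command line in ctx, wait until it is idle, ask it to
   close gracefully, then wait for it to exit. */
int launch_and_close(void *ctx)
{
    char cmd[MAX_PATH];
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;

    lstrcpynA(cmd, (const char *)ctx, sizeof cmd);
    ZeroMemory(&si, sizeof si);
    si.cb = sizeof si;
    if (!CreateProcessA(NULL, cmd, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        return 0; /* the failed launch this test is hunting for */

    WaitForInputIdle(pi.hProcess, 5000); /* give the window time to appear */
    EnumWindows(close_windows_of, (LPARAM)pi.dwProcessId);
    WaitForSingleObject(pi.hProcess, 5000);
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return 1;
}
#endif
```

On Windows, something like `stress_loop(1000, launch_and_close, "notepad.exe")` would return the number of clean open/close cycles before the first failed launch, which is the point at which DOS windows stop opening for me.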

What DOES happen right before this state is my tool tries to open notepad.exe but it fails to open. At this point I can continue to open Notepad as normal, but I can't run anything that uses the DOS subsystem. Maybe it's some kind of race condition that corrupts memory somewhere in the kernel.

Does Windows 98 support JIT debugging? Maybe this can provide some more information as to what causes Notepad to crash.

Reply 3 of 26, by Jo22

I see. Hmm.. Maybe it has to do with GDI/Kernel resources somehow?
Windows 9x has similar restrictions to Windows 3.x.
I say similar, because some resource constraints may have been relaxed (resources became larger),
while the old GDI objects are more limited on Windows 95+ (GDI stayed a 16/32-bit hybrid, being 16-bit at its core).
That's why, in rare circumstances, Win16/32 programs could run on Windows 3.1, but not on Windows 95.

Perhaps I'm remembering it wrong, but I think that Creatures! (an AI simulation disguised as a game) ran out of resources less often on Windows 3.1+Win32s (and perhaps on Macintosh, but that used different executables).

Reply 4 of 26, by Kahenraz

According to Resource Meter, there are plenty of GDI resources. If there were none, then I wouldn't be able to open Win32 applications as I demonstrate in the video. I can actually gobble up all of the GDI resources by opening too many applications and not closing them, and the result is completely different. Resource Meter will actually display a popup warning if you are running out of GDI resources.

Reply 5 of 26, by Matchstick

Do Start > Run > System.ini
Scroll to the [vcache] section
Add the line
MaxFileCache=512000
Save, exit and reboot

I also recommend, Auto-Patcher for Windows 98:
https://retrosystemsrevival.blogspot.com/2018 … e-december.html

Reply 6 of 26, by Kahenraz

Matchstick wrote on 2022-07-31, 21:53:
Do Start > Run > System.ini
Scroll to the [vcache] section
Add the line
MaxFileCache=512000
Save, exit and reboot

I also recommend, Auto-Patcher for Windows 98:
https://retrosystemsrevival.blogspot.com/2018 … e-december.html

Increasing the vcache does not solve this issue.

Reply 7 of 26, by BitWrangler

After years of doing battle with the stacks in GDI and USER.exe, I'd blame them. I'm not sure the Resource Meter tells the whole tale when they're fragmented: a program tries to grab a contiguous block and they go "sorry mate, ain't got nothing bigger than 2k together."
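
The fragmentation scenario can be illustrated with a toy model. The sketch below is not how USER or GDI actually manage their memory; the 16-slot arena, the one-slot granularity and all function names are assumptions, chosen only to show how a "percent free" style meter can look healthy while the largest contiguous free run is too small for a request.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SLOTS 16 /* toy arena size, an arbitrary assumption */

/* used[i] != 0 means slot i is allocated. */
typedef struct { unsigned char used[ARENA_SLOTS]; } arena;

/* Total number of free slots: what a simple "percent free" meter sees. */
size_t arena_free_total(const arena *a)
{
    size_t n = 0;
    for (size_t i = 0; i < ARENA_SLOTS; i++)
        if (!a->used[i])
            n++;
    return n;
}

/* Length of the longest run of adjacent free slots: what actually
   limits the biggest single allocation. */
size_t arena_largest_free_run(const arena *a)
{
    size_t best = 0, run = 0;
    for (size_t i = 0; i < ARENA_SLOTS; i++) {
        run = a->used[i] ? 0 : run + 1;
        if (run > best)
            best = run;
    }
    return best;
}

/* First-fit allocation of len contiguous slots.
   Returns the start index, or -1 if no free run is long enough. */
int arena_alloc(arena *a, size_t len)
{
    size_t run = 0;
    assert(len > 0 && len <= ARENA_SLOTS);
    for (size_t i = 0; i < ARENA_SLOTS; i++) {
        run = a->used[i] ? 0 : run + 1;
        if (run == len) {
            size_t start = i + 1 - len;
            for (size_t j = start; j <= i; j++)
                a->used[j] = 1;
            return (int)start;
        }
    }
    return -1;
}

/* Release len slots starting at start. */
void arena_free(arena *a, size_t start, size_t len)
{
    assert(start + len <= ARENA_SLOTS);
    for (size_t j = start; j < start + len; j++)
        a->used[j] = 0;
}
```

Fill all 16 slots one at a time, free every other one, and the arena reports 8 of 16 slots free while `arena_alloc(&a, 2)` still fails, because no two free slots are adjacent.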

I'm also wondering about the possibility of a physical memory wraparound: have up to X amount of memory and it's okay, have X+1MB and it's off the top of the counter and it thinks you've got 1MB.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 8 of 26, by Kahenraz

I've tried as low as 32MB of physical memory, but it didn't help. I also thought that fragmentation might be the issue, but it doesn't make a whole lot of sense if I'm just closing and reopening a single application.

Reply 9 of 26, by weedeewee

Have you tried opening "MS-DOS Prompt" in safe mode?

Right to repair is fundamental. You own it, you're allowed to fix it.
How To Ask Questions The Smart Way
Do not ask Why !
https://www.vogonswiki.com/index.php/Serial_port

Reply 12 of 26, by BitWrangler

This is "the boss", the end guy... it's how you know you've got good at Win9x. You have avoided all the other traps, the reasons for blue screens, the mazes of configurations and other riddles; you've beaten them all. If you hadn't, 9x would crash and reboot for other reasons and you'd never come across this. Not that many regular users came across it, either because of that or because they'd give up on the final level, run their machine for a measly few hours and turn it off. So yay, you've nearly "won" Windows 9x, right? Nope, this boss is unbeatable. You can avoid and evade him a bit longer, but he always gets you. However, Microsoft obscured the obviousness of this feature (to them)... and didn't put up a blue screen saying "Congratulations adventurer, you have completed 99% of Windows 98, this Operating System is beneath you now, bring your wallet to your Microsoft dealer to upgrade to Windows NT."

So yah, no fixes, only mitigation.

Reply 13 of 26, by BitWrangler

This may help some...
https://www.betaarchive.com/wiki/index.php?ti … _Archive/229670

If your intended result is "leave it running for weeks unsupervised" then I doubt you'll get there, but stuff like this may extend the reboot window to 3 or 4 days.

Reply 14 of 26, by Kahenraz

I have tried using the latest GDI versions I could find, version 4.10.2227 for Windows 98 via Q918547, but the problem still exists. I don't think that this is related to GDI, otherwise I would see a loss of GDI resources, which I do not.

My concern is that this is a bug that exists in Windows 9x/ME that can occur even when you do everything right. You don't run any bad programs that leak memory, you shut down your programs gracefully, you follow all of the rules to play nice with the operating system. In my test case, I'm just opening and closing programs gracefully.

I would like to think that the system would remain stable so long as you did not run any bad code, but it looks like there is something that is still lurking beneath the surface that is making things unstable.

Reply 15 of 26, by weedeewee

Would it be possible to make a video from boot-up, showing all the actions taken and programs started, up to the point where the problem occurs?

Reply 16 of 26, by BitWrangler

Kahenraz wrote on 2022-08-07, 19:56:

My concern is that this is a bug that exists in Windows 9x/ME that can occur even when you do everything right.

It exists in 3.xx too, though a bit worse; they mitigated it a little for 9x but didn't cure it. There was something I was using in 3.x to rinse and defrag the stacks, but I forget if it was incompatible with 9x or I just forgot about it for a few years. I am trying to recall the name; it was probably available in the tucows or simtel win3 collections. That was only a bit of a band-aid though: it lets you work a bit longer and save things, but you still need to reboot soon. This bug caused fun times on a packing line with a label printer back in the day. Halfway through second shift it would usually crash, because it popped up a requester for every order and eventually enough orders smashed the stacks. The IT guy was both a) annoyed every time we got him out to it and b) in denial that it was happening, so we had to give up on him after a couple of weeks and "accidentally" knock the reset or power at the start of the shift.

Usual mitigation tricks to make it run longer: lowest useful resolution, lowest useful color depth (usually 256 colors), no custom background, no custom icons (default icons only), clear the desktop and program groups down to essential items only, don't open huge directories in the file manager just to gawp, and turn on text-only listing with no icons if you need to work in directories. It still crashes, but maybe it takes longer now. Card games can also eat it up quickly. I think Solitaire uses a stack object for each card, so 52 spaces are gone just by running it, plus another one for each back style you look at or switch to. Sometimes they clear out gracefully, sometimes they don't.

Reply 17 of 26, by Kahenraz

It seems as though people are chiming in with ideas without reading the thread I linked on the MSFN forum. I can see that there is a lot to read to get caught up, so let me summarize where I'm at currently with some quotes from the relevant post:

https://msfn.org/board/topic/183777-how-to-de … comment=1223019

I have replicated this on fast hardware (i945/Pentium D/Core 2), age-appropriate hardware (i440/Pentium 3/650MHz/128MB RAM), and in a virtual machine (VMware) on a Ryzen 1800X. The easiest way to put the computer in this state is to open and close an application repeatedly (such as Notepad).

I have started configuring both hardware and VMs, Windows 98 and ME, to the following conservative values for vcache, and it seems reasonable given the amount of physical memory available:

[vcache]
MinFileCache=1024
MaxFileCache=32768

As far as testing various patches, updates, and service packs, things that I have used or tried:

I need to use this patch to allow Windows 98 to run on my Ryzen CPU, tested 0.7.45 and 0.7.47, with and without Q288430:
Patch for Windows 95/98/98 SE/Me to run on newest CPUs

I have tried the following service packs for Windows 98 SE (3.65 and 3.66):
http://www.techtalk.cc/viewtopic.php?t=65

Windows ME Service Pack 1.05
https://retrosystemsrevival.blogspot.com/2019 … e-pack-102.html

Auto-Patcher for Windows 98
https://retrosystemsrevival.blogspot.com/2018 … e-december.html

I have tried Windows 98 with a stock IE5, IE6SP1, and IE6SP1 with the shell32.dll update:
https://msfn.org/board/topic/84451-98-fe-98-s … shell32dll-fix/

I tried varying sizes for the swap file, but with sufficient memory (at least 256MB), it appears that the swap file is never even used (it's never expanded), so the problem seems to lie elsewhere.

I have tried replacing himem.sys with HIMEMX. I have tried with and without rloew's memory patch. I have tried varying amounts of physical memory: 32MB, 128MB, 256MB, 512MB, and 1.5GB. I have tried large swap sizes and small swap sizes. I have tried varying the amount of vcache, up to 512MB when large amounts of physical memory are installed.

I understand that my method of replicating this error is unrealistic, opening and closing Notepad or some other program hundreds of times, but it is simply meant to induce the error with the absolute smallest amount of noise, to eliminate the chance that it may be caused by something else. In my case, it is triggered VERY often when I use Cygwin, as it spawns lots of child processes that perform some action and then exit. I have been pursuing this issue for weeks and I've finally made some progress.

I have a copy of the system in this very state and can provide it in a VM, if there is an expert available to help look into this.

I've been trying to monitor the different memory areas that are trackable by System Monitor, and I think I found something consistent. The memory that displays as "Locked non-cache pages" is a resource that always seems to be consumed but never freed. Maybe it's related?

Reply 18 of 26, by Kahenraz

Interesting! "Locked non-cache pages" will increase when I open multiple instances of programs such as a command.com window or Notepad and go back down when I close them. But if I open a single instance and then close it, over and over, the pages slowly become consumed over time and are not freed.

Maybe this is the cause of the leak?

Edit:

After further testing, if I continue to push the system and keep opening and closing Notepad, eventually the system will start reporting that there isn't enough memory even for Win32 applications. It seems that the issue simply breaks DOS functionality much earlier. I can open one Notepad, but there isn't enough memory for a second one, despite there being almost 200MB of available physical memory and the entire swap file.

At this point, the entire system starts to struggle under its own weight of doing nothing at all. Opening My Computer causes it to hang as it can't get the resources it wants, even though Explorer is already running. It's very bizarre.

Attachment: 20220807_225254_resize_30.jpg (318.65 KiB, license CC-BY-4.0)

Reply 19 of 26, by JH64

Hello, here are my 3 guesses at the cause of an out-of-memory error when there is still enough free RAM:
1) A driver needs contiguous physical space, but RAM is too fragmented for it to fit. Drivers allocating memory at runtime were rare in the 9x era, though. (My 3D driver needs one or more 32 MB blocks, and after a few app/game installations and uninstallations it won't get them.)
2) Out of descriptors in 16-bit protected mode: there are only 8192 global segment descriptors. Each can address up to 64K of memory, but it can be less, and several descriptors can point to the same memory. A malfunctioning 16-bit program or driver can waste them very quickly.
3) Out of Windows handles: Windows 9x has a very limited number of WINAPI HANDLE descriptors, and most of them are global (shared files, mutexes, pipes, ...). If they are not freed by the programmer, they are lost.
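
To put numbers on the second point: an x86 protected-mode selector is 16 bits wide, of which 13 bits are the table index, 1 bit selects the global or local table, and 2 bits are the requested privilege level. That is where the limit of 8192 entries per table comes from. The sketch below only decodes a selector and works out the resulting ceiling; the helper names are made up for illustration, and it says nothing about how many descriptors Windows 9x actually hands out.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SELECTOR_INDEX_BITS   = 13,                       /* bits 3..15 of a selector */
    DESCRIPTORS_PER_TABLE = 1 << SELECTOR_INDEX_BITS, /* 8192 entries */
    SEGMENT_16BIT_MAX     = 64 * 1024                 /* 64 KiB per 16-bit segment */
};

/* Index of the descriptor inside its table. */
unsigned selector_index(uint16_t sel) { return sel >> 3; }

/* 0 = global table (GDT), 1 = local table (LDT). */
unsigned selector_table(uint16_t sel) { return (sel >> 2) & 1u; }

/* Requested privilege level, 0..3. */
unsigned selector_rpl(uint16_t sel) { return sel & 3u; }

/* Build a selector from its three fields. */
uint16_t make_selector(unsigned index, unsigned table, unsigned rpl)
{
    assert(index < DESCRIPTORS_PER_TABLE && table < 2 && rpl < 4);
    return (uint16_t)((index << 3) | (table << 2) | rpl);
}

/* Upper bound on the memory reachable through one full table of
   16-bit segments that do not overlap. */
uint32_t max_bytes_one_table(void)
{
    return (uint32_t)DESCRIPTORS_PER_TABLE * SEGMENT_16BIT_MAX;
}
```

8192 descriptors of at most 64 KiB each give a ceiling of 512 MiB per table, and in practice far less, because many segments are small and every leaked selector keeps its slot occupied.
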
Here is one example of the third case, from winpthreads:

int pthread_cond_destroy (pthread_cond_t *c)
{
  //...
  if (!TryEnterCriticalSection (&_c->waiters_count_lock_)){
    // ^ fail every time with ERROR_CALL_NOT_IMPLEMENTED
    // ...
    return EBUSY;
    // ...and 2 semaphores are wasted
  }
  // ...
  // real clean-up here
  DestroySemaphoreWin(_c->sema_q);
  DestroySemaphoreWin(_c->sema_b);

  LeaveCriticalSection (&_c->waiters_count_lock_);
  DeleteCriticalSection(&_c->waiters_count_lock_);
  //...
}

Pthreads are part of Cygwin/MinGW, and nearly every program compiled with them uses it. (I had this problem with my Mesa3D port: a 3D benchmark ran for about 15 minutes, then crashed, and could not be run again because of an out-of-memory error.)
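
The pattern in the excerpt can be reproduced without any Windows API. In the sketch below every name is hypothetical: `try_lock_stub` stands in for a `TryEnterCriticalSection` that always reports failure, and a plain counter stands in for the pool of semaphore handles. It is only meant to show why each create/destroy cycle loses two handles when the destroy function bails out before its clean-up, and that the count stays flat when the clean-up actually runs.

```c
#include <assert.h>

/* Hypothetical system-wide handle counter standing in for semaphores. */
static int live_handles = 0;

typedef int (*try_lock_fn)(void);

/* Stand-in for a try-lock call that is not implemented:
   it never acquires the lock. */
int try_lock_stub(void) { return 0; }

/* Stand-in for a working try-lock. */
int try_lock_ok(void) { return 1; }

typedef struct { int sema_q, sema_b; } cond_var; /* 1 = handle held */

/* Creating the object takes two handles. */
void cond_init(cond_var *c)
{
    c->sema_q = c->sema_b = 1;
    live_handles += 2;
}

/* Same shape as the excerpt: if the lock cannot be taken, return "busy"
   before the clean-up, so both handles stay allocated.
   Returns 0 on success, -1 when busy. */
int cond_destroy(cond_var *c, try_lock_fn try_lock)
{
    assert(c != 0 && try_lock != 0);
    if (!try_lock())
        return -1; /* early exit: nothing is released */
    live_handles -= c->sema_q + c->sema_b;
    c->sema_q = c->sema_b = 0;
    return 0;
}

/* Run the given number of create/destroy rounds and report how many
   handles are still allocated afterwards. */
int leaked_after(int cycles, try_lock_fn try_lock)
{
    int before = live_handles;
    for (int i = 0; i < cycles; i++) {
        cond_var c;
        cond_init(&c);
        cond_destroy(&c, try_lock);
    }
    return live_handles - before;
}
```

With the always-failing stub, 100 cycles leave 200 handles allocated; with a working lock the count returns to where it started. A program that creates and destroys such objects in a loop would drain a small global pool quickly.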

A missing InitializeCriticalSection in Wine caused a similar problem. KernelEx solves some of these problems (for example the one with TryEnterCriticalSection, though not in all cases), but not all of them.

I personally suspect that something in Cygwin could be the source of these problems. Try looking through its source and searching for functions like TryEnterCriticalSection, InitializeCriticalSectionAndSpinCount, SetCriticalSectionSpinCount or just LoadLibraryW.

All these problems are “by design” and cannot be solved in general. You can only find the source of the leak and plug it 🙁