VOGONS


First post, by BardBun

User metadata
Rank Member

Hi,

I'm trying to troubleshoot / optimize the performance of an old DOS game called "X-COM Apocalypse".
The game has several oddities: certain files are installed, changed or used depending on the detected CPU type (486 or Pentium), which can cause errors and weird game behaviour.

However, the smoothest experience you can get with it so far is with the "auto" cputype setting (near-perfect performance 95% of the time with the right settings, although its feature set might cause weird things to happen).

Judging from this post here:
Re: New 'cputype' option question...

The feature sets can vary heavily depending on the cputype chosen.

Would it be possible to implement a "pentium_fast" option that, along with the default features listed in that post, also uses the "loose page privilege check for more speed"?

The regular "pentium_slow" setting is just too slow to allow a pleasant game experience, and I assume that the "tight page privilege check" is the only part of it that makes it run so horribly slow compared to the other cputypes.

At the same time, the Pentium CPU type and its instruction set are needed to make sure the game performs properly and at its best.

Last edited by BardBun on 2020-11-20, 21:52. Edited 1 time in total.

Reply 1 of 14, by ripsaw8080

User metadata
Rank DOSBox Author

AFAIK, the only difference for 486 vs. 586 is different UFO2P.EXE and TACP.EXE executables. While that is a quite significant thing, I just want to be specific because you were a bit vague about what is different.

Are you looking for better operation, meaning fewer bugs, from the 586-specific executables, or are you only interested in more speed? I ask because it's hardly certain that a hypothetical cputype=pentium_fast and the 586-specific executables would result in greater performance than cputype=auto and the 486-specific executables.

Reply 2 of 14, by BardBun

User metadata
Rank Member

Both actually.

From my own testing and the general information I could gather, the game can bug out here and there, for example because the CPU type option causes a weird install.
One such case is the installer mismatching UFO2P.exe with a renamed TACP4.exe. You can mitigate that by copying the XCOM3 folder from the CD over, renaming it to XCOMA, doing a full install on top of it, and then copying the XCOM3 content over again (except for the options.dat file this time), so that all files are as they are meant to be without any weird install alterations. Even then, things might still not work the way they should with the "auto" cputype.

One thing I've observed, for example, is the default (fallback) AI breaking. (This is the basic AI that takes over when the Learning AI is not working.)
UFOs just circling around the Cityscape without shooting or depositing aliens into the city. Hostile units during tactical missions just running around but never using their weapons.
Then there are things like the market sort of breaking around mid/end game, with nothing being purchasable anymore, no more soldiers to recruit and so on.

Also, though I still have to do some testing on this, this might also be one of the reasons for the Learning AI not working when using an image file instead of the original CD, like in the digital distributions of the game from GoG or Steam.

The game prioritizes installed files over the files on the CD (i.e. if you run the original UK version but copy over the SMK video files of the German version, those will be played instead of the English SMK files on the CD), and by that logic it might also be able to bypass the CD checks for the Learning AI if it reads the files from the install folder instead.
But with the weird installation issue and CPU checks of the game, that might already cause some weird hiccups that just break the AI instantly, as it is very fragile to begin with due to the whole game being super rushed.

The game performs CPU checks in two different places. One is during the install, where it decides to alter several game files if it detects the CPU to be an 80486 instead of a Pentium, causing possible mismatches of the UFO and TAC files and maybe slight alterations to other files.
I do not know where the other CPU check is performed, but it might affect the game in similar ways and prevent it from performing the way it would if it detected a Pentium CPU.

In general, the 486 files of the game were optimized after the initial Pentium game files were created. Seeing how the game was incredibly rushed in its final stages, it seems likely that the watering down of instructions and functions in those files was so rushed that there was no time to properly test them, leading to the weird issues I've observed.

Then of course there is the possibility of running the game with near-perfect performance using a "pentium_fast" cputype, just as is already possible with the "auto" type.

#Apocalypse: cycles=60000 (windowed) ; cycles=100000 (fullscreen) ; cputype=auto ; core=dynamic

These are the DOSBox settings needed to make the game run almost perfectly with what DOSBox currently has to offer (at 1280x960 and 1920x1440 resolution, openglnb). In addition, the [mixer] prebuffer and blocksize settings have to be increased to the maximum allowed value, "8192", to avoid sound popping, and the sound card chosen in the game's Setup has to be "SoundBlaster Pro".
(luckily the sound delay is so little it doesn't bother you during the game but instead allows a smooth sound output, regardless if turn based or real time mode)

Here are 2 short videos that show how fluent the game can run with those settings:
https://www.youtube.com/watch?v=D5bWeRkttJg&t=2m13s
https://www.youtube.com/watch?v=KFFL4bgrNf4&t=2m51s

Those settings almost fully mitigate slowdowns caused by too many explosions at once or by several levels being filled with smoke or stun grenades, or at least reduce them to a point where the game is still perfectly playable and the slight slowdowns won't bother you.

Compared to that, with the "pentium_slow" cputype the game just runs too slow overall. With the same settings as above, only 20000 to 40000 cycles are possible without it turning into a sound-popping, stuttery mess. Any time there is an explosion, the game slows down or briefly freezes while it processes what happens next. Any time option faster than "normal" causes a stuttery fast forward where it almost looks like everyone is teleporting 2-3 tiles at once.
(I tried lots of different settings with "pentium_slow" but unlike my optimized "auto" settings, they are nowhere near the amazing almost perfect performance)

My hope is that with a "pentium_fast" option, the weird buggy game behaviour can be fully avoided while allowing the game to be played with as good of a performance as in those videos I linked.
That way during the CPU checks the game will always detect a Pentium CPU.
There is little or no information about which instruction sets the Pentium game files make use of, but seeing how several issues can occur mid/late game with the "auto" setting, I assume they do take advantage of some of them, as the game was initially designed to be played on Pentium CPUs with hasty backwards compatibility for 486 CPUs.

And maybe along with that the Learning AI can even be made to work on the digital distributions of the game, which also come with horribly wonky installs. (The GoG Galaxy install has mismatched UFO2P and TAC2P files, the Steam version is about as screwed up, and both use the original 1.0 UK version.)

I only have very basic knowledge about programming and how DOSBox itself works, but judging from that CPU type forum post, might the "loose page" vs. "tight page" check be what holds the "pentium_slow" option back in terms of performance?

Anyway, if there are more questions I'll gladly try and answer them for you.
Thanks so far for your interest 😁

Reply 3 of 14, by jmarsh

User metadata
Rank Oldbie

I don't think the page access checks would be having much of an impact unless there is a particular scenario happening: pages are being accessed by ring 0 but are marked as restricted for ring 3, so they get mapped/unmapped for every single access instead of remaining mapped. The same thing happens in Descent, which has notable speed differences when run with/without cputype set to pentium_slow.
For this reason I also don't think making a new cputype is the solution; reworking the paging functions is a better fix (likely by flushing the TLB on CPL changes).
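
The scenario described here can be illustrated with a toy model. The sketch below is not DOSBox's paging code, and every name in it (Access, Policy, count_remaps) is hypothetical. It only counts how often a page mapping has to be rebuilt under two policies: never keeping ring-3-restricted pages mapped, versus keeping everything mapped and dropping all mappings whenever the CPL changes. Privilege faults are deliberately not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Toy model only: none of these names exist in DOSBox.
struct Access {
    std::uint32_t page;  // virtual page number being touched
    int cpl;             // privilege level of the access (0 = ring 0, 3 = ring 3)
};

enum class Policy {
    RemapRestricted,   // pages restricted for ring 3 are never kept mapped
    FlushOnCplChange   // all pages stay mapped; mappings are dropped when CPL changes
};

// Returns how many times a page mapping had to be (re)built for the trace.
std::size_t count_remaps(const std::vector<Access>& trace,
                         const std::unordered_set<std::uint32_t>& supervisor_only,
                         Policy policy) {
    std::unordered_set<std::uint32_t> mapped;  // pages currently mapped
    std::size_t remaps = 0;
    int last_cpl = -1;  // no access seen yet
    for (const Access& a : trace) {
        assert(a.cpl == 0 || a.cpl == 3);
        if (policy == Policy::FlushOnCplChange && a.cpl != last_cpl) {
            mapped.clear();  // privilege level changed: start from a clean slate
        }
        last_cpl = a.cpl;
        const bool restricted = supervisor_only.count(a.page) != 0;
        const bool cacheable =
            policy == Policy::FlushOnCplChange || !restricted;
        if (cacheable && mapped.count(a.page) != 0) {
            continue;  // mapping is still valid, nothing to rebuild
        }
        ++remaps;
        if (cacheable) {
            mapped.insert(a.page);
        }
    }
    return remaps;
}
```

With a trace of 1000 ring-0 accesses to a single restricted page, the first policy rebuilds the mapping 1000 times and the second only once, which is the kind of gap that could explain the slowdown if the game really behaves this way.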

Reply 4 of 14, by ripsaw8080

User metadata
Rank DOSBox Author

I see no indications of "weird" installs. The only files that differ in installs with cputype=auto and cputype=pentium_slow are the UFO2P.EXE and TACP.EXE executables, and they are correctly copied from the CD image. Going by which files get copied, UFO2P4.EXE and TACP4.EXE on the CD are the 486 versions, and the ones without "4" are the 586 versions.

I'm not sure what happens when you use cputype=auto with installed 586-specific executables, or vice versa, but I'd suggest not mismatching like that as it could cause issues.

Regarding "fluency" and "smoothness" of gameplay, there are factors not related to CPU emulation to consider, such as the output= setting. For example, using output=opengl(nb) with vsync enabled is known to cause hitches/stuttering. Also, be careful using videos as an example, as they can create the appearance of smooth frame progression that was not necessarily the case as the video frames were being captured.

Reply 6 of 14, by jmarsh

User metadata
Rank Oldbie
BardBun wrote on 2020-10-25, 00:30:

#Apocalypse: cycles=60000 (windowed) ; cycles=100000 (fullscreen) ; cputype=auto ; core=dynamic

What happens when you use these settings with cputype=486_slow ? Since that's basically the same as auto but with the additional page checks that pentium_slow uses.

Reply 7 of 14, by BardBun

User metadata
Rank Member
jmarsh wrote on 2020-10-25, 11:10:
BardBun wrote on 2020-10-25, 00:30:

#Apocalypse: cycles=60000 (windowed) ; cycles=100000 (fullscreen) ; cputype=auto ; core=dynamic

What happens when you use these settings with cputype=486_slow ? Since that's basically the same as auto but with the additional page checks that pentium_slow uses.

It's the same poor performance as "pentium_slow"; the same goes for "386_slow".

I've done extensive testing with all cpu and core types along with several different cycle values to come up with the perfect DOSBox settings for the game (for a guide for new and returning Apocalypse players). However, only "auto" and "386" are able to provide a near-perfect experience without freezes or hard slowdowns, and only with the settings that you've quoted. Of those two options, "auto" is the preferable one.

I can make a comparison video if you want, but I feel like this discussion is over, judging from ripsaw8080's second reply (fair enough :I ), so I don't know if that's even worth the time. But if you are interested, sure.

The issues I've listed above still persist, and those are only a part of all the weird things that can happen with Apocalypse. The only way to make the game run without those weird bugs is to use the "pentium_slow" cputype, which results in very unenjoyable performance overall, with tons of slowdowns and freezes due to how DOSBox handles that cputype compared to "auto".

The only way to play the game with near perfect performance is with these settings "#Apocalypse: cycles=60000 (windowed) ; cycles=100000 (fullscreen) ; cputype=auto ; core=dynamic" which, due to the CPU type and the CPU checks of the game, can cause weird things to happen in the long run.

Apocalypse overall is a very weird game when it comes to working correctly, mainly because of how rushed it was in its final stages.
Lots of things can break, the Learning AI only works with the original UK CD (MP191 207 D01R) and the fact that the installer does some weird things during the install with renaming files depending on the cputype just adds to that.

One key to preventing that is a proper manual install along with the game only registering a Pentium type CPU, but unfortunately the performance of DOSBox's "pentium_slow" makes it so that you'd rather avoid that and go for "auto" instead.

Reply 8 of 14, by ripsaw8080

User metadata
Rank DOSBox Author
BardBun wrote on 2020-10-25, 14:32:

I feel like this discussion is over with judging from ripsaw8080's second reply

Regarding your assertion that the 586-specific executables are less buggy than the 486-specific executables: I can neither confirm nor deny it. Therefore your point about being able to use the 586-specific executables without the performance degradation of added page checks still stands.

Reply 9 of 14, by hail-to-the-ryzen

User metadata
Rank Member

Another possible factor is the specific compiler's optimization of the code. A compiler's default settings may produce better-tested code, while a build for a non-default CPU type may (1) hit a compiler bug or (2) rely on an expectation that the CPU can decode more than one instruction at the same time. Perhaps #2 is testable by comparing the performance between the dynamic and normal cores for a set of games. It is also important to compare against real hardware.

I have noted this to occur in builds of software but have not tested it outside emulation. For use in dosbox, it probably makes more sense to prefer a 386 or 486 build of a game instead of using any Pentium build, at least in the case that the differences between the builds are unknown.

Reply 10 of 14, by jmarsh

User metadata
Rank Oldbie
ripsaw8080 wrote on 2020-10-25, 03:25:

I'm not sure what happens when you use cputype=auto with installed 586-specific executables, or vice versa, but I'd suggest not mismatching like that as it could cause issues.

If the pentium/586 executables actually run with cputype=auto (i.e. they don't complain about the CPU not being a pentium) then I think it would be the easiest solution for OP; the only emulation differences between auto and pentium_slow are the way the CPU identifies itself and the page access checking. "auto" still supports the pentium-specific rdtsc opcode (which might explain some of the AI behavioural differences if it's being used as a random seed source...).
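
Purely as an illustration of the seed hypothesis above (nothing here is taken from the game or from DOSBox, and whether the game actually seeds anything from RDTSC is unknown), the sketch below shows why a generator seeded from a timestamp counter behaves differently from run to run. The Lcg class and first_roll function are hypothetical; the timestamp value is passed in by the caller so the example stays portable instead of executing RDTSC itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical generator used only for illustration.
class Lcg {
public:
    explicit Lcg(std::uint32_t seed) : state_(seed) {}

    // One step of a 32-bit linear congruential generator
    // (multiplier and increment are the well-known Numerical Recipes constants).
    std::uint32_t next() {
        state_ = state_ * 1664525u + 1013904223u;
        return state_;
    }

private:
    std::uint32_t state_;
};

// Seeds a fresh generator with the low 32 bits of a timestamp counter value
// and returns the first "dice roll" in the range [0, sides).
std::uint32_t first_roll(std::uint32_t timestamp_low, std::uint32_t sides) {
    assert(sides > 0);
    Lcg rng(timestamp_low);
    return (rng.next() >> 16) % sides;  // use the higher-quality upper bits
}
```

The same seed always yields the same roll, while seeds that differ by a single tick can already yield a different one, so code that seeds from a timestamp counter would not behave identically across runs or across emulated CPU types that handle the instruction differently.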

Reply 11 of 14, by ripsaw8080

User metadata
Rank DOSBox Author

The 586 executables do run with cputype=auto, and the only instance of CPUID appears to be in the DOS4GW extender. I found a few instances of RDTSC in one subroutine in TACP.EXE (tactical), but not in UFO2P.EXE (cityscape), and no other Pentium-only instructions. So, other than the case of RDTSC, I guess the main difference between the 486 and 586 executables might be processor-specific compiler optimization.
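
For readers who want to repeat a search like this, here is a minimal sketch of a naive byte scan. It assumes the standard two-byte encodings (RDTSC is 0F 31, CPUID is 0F A2); the function name find_0f_opcode and the constants are hypothetical, and this is not necessarily how the instances above were found. A raw byte scan cannot tell code from data or from operand bytes, so every hit still has to be confirmed in a disassembler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kRdtsc = 0x31;  // RDTSC is encoded as 0F 31
constexpr std::uint8_t kCpuid = 0xA2;  // CPUID is encoded as 0F A2

// Returns every offset at which the byte pair 0F <second> occurs in the image.
// Hits are candidates only: the pair may also appear inside data or operands.
std::vector<std::size_t> find_0f_opcode(const std::vector<std::uint8_t>& image,
                                        std::uint8_t second) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i + 1 < image.size(); ++i) {
        if (image[i] == 0x0F && image[i + 1] == second) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

Loading an executable into a std::vector<std::uint8_t> and calling find_0f_opcode(image, kRdtsc) gives a list of file offsets to inspect.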

Reply 12 of 14, by Delfino Furioso

User metadata
Rank Newbie

sorry to chime in with a partially related question, but since you guys are talking about cpu types...

some time ago I compiled a comparison table with information gathered from the wiki and this discussion

Screenshot-3.png

is this still accurate?

I was thinking of updating the wiki with something similar...
Would it be useful, in your opinion?
Would it be overkill for the average user's minimum required knowledge?

Reply 13 of 14, by ripsaw8080

User metadata
Rank DOSBox Author

The cputype condition in prefix_0f.cpp for the RDTSC instruction might give the impression that only pentium_slow supports it:

	CASE_0F_B(0x31)    /* RDTSC */
		{
			if (CPU_ArchitectureType<CPU_ARCHTYPE_PENTIUMSLOW) goto illegal_opcode;

But if you check the internal id number of cputypes in cpu.h you'll find that the mixed type (i.e. auto) is the highest of all.

#define CPU_ARCHTYPE_MIXED        0xff
#define CPU_ARCHTYPE_386SLOW      0x30
#define CPU_ARCHTYPE_386FAST      0x35
#define CPU_ARCHTYPE_486OLDSLOW   0x40
#define CPU_ARCHTYPE_486NEWSLOW   0x45
#define CPU_ARCHTYPE_PENTIUMSLOW  0x50

In short, cputype=auto supports RDTSC.
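
The comparison can be checked in isolation. The sketch below copies the constant values from the cpu.h excerpt above into constexpr variables and mirrors the quoted condition in a hypothetical helper, rdtsc_allowed; it is not part of DOSBox, only a standalone restatement of the logic.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the cpu.h excerpt quoted in this post.
constexpr std::uint32_t CPU_ARCHTYPE_MIXED       = 0xff;
constexpr std::uint32_t CPU_ARCHTYPE_386SLOW     = 0x30;
constexpr std::uint32_t CPU_ARCHTYPE_386FAST     = 0x35;
constexpr std::uint32_t CPU_ARCHTYPE_486OLDSLOW  = 0x40;
constexpr std::uint32_t CPU_ARCHTYPE_486NEWSLOW  = 0x45;
constexpr std::uint32_t CPU_ARCHTYPE_PENTIUMSLOW = 0x50;

// Hypothetical helper mirroring the quoted check: RDTSC is treated as an
// illegal opcode only when the architecture type is below PENTIUMSLOW.
constexpr bool rdtsc_allowed(std::uint32_t arch_type) {
    return !(arch_type < CPU_ARCHTYPE_PENTIUMSLOW);
}

// Evaluated at compile time: the mixed type (cputype=auto) passes the check,
// while the 386 and 486 types do not.
static_assert(rdtsc_allowed(CPU_ARCHTYPE_MIXED), "auto allows RDTSC");
static_assert(rdtsc_allowed(CPU_ARCHTYPE_PENTIUMSLOW), "pentium_slow allows RDTSC");
static_assert(!rdtsc_allowed(CPU_ARCHTYPE_486NEWSLOW), "486 types reject RDTSC");
static_assert(!rdtsc_allowed(CPU_ARCHTYPE_386FAST), "386 types reject RDTSC");
```

Because 0xff is larger than 0x50, the "less than" test is false for the mixed type and execution falls through to the RDTSC implementation.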

Reply 14 of 14, by xcomcmdr

User metadata
Rank Oldbie
BardBun wrote on 2020-10-25, 14:32:

Apocalypse overall is a very weird game when it comes to working correctly, mainly because of how rushed it was in its final stages.
Lots of things can break, the Learning AI only works with the original UK CD (MP191 207 D01R) and the fact that the installer does some weird things during the install with renaming files depending on the cputype just adds to that.

It also works with the original French CD release.
I know because I played it on real hardware back in the day (Intel Pentium III @ 450 MHz); save-scumming through the whole campaign made the Learning AI very hard, and my game on Beginner difficulty took me a full year non-stop.