VOGONS


First post, by danoon

User metadata
Rank Member

I've noticed that the DOSBox 0.74 dynamic core will run "Betrayal in Antara" under Windows 3.1/Win32s, but the normal core will not.

I've also noticed that the Dosbox Megabuild 6 will run Windows 98 just fine with the dynamic core but not with the normal core. (I know Win98 is not officially supported, I just give it as another example where the dynamic core does better than the normal core)

I was hoping some Dosbox devs might be able to speak to any known problems with the normal core.

I'm mainly interested in "Betrayal in Antara" since the official DOSBox build runs it. I would like my Java DOSBox port to have the same compatibility with games as the official build, but for obvious reasons, the Java core is based on the normal core and not the x86 dynamic core.

Thanks,

Reply 1 of 14, by wd

User metadata
Rank DOSBox Author

It's not a bug but a feature of the dynamic core: it avoids the pagefault recursion that is part of the
original design of the normal core, dating from when dosbox was transitioned to support paging.

Maybe check the svn logs, think Antara should be mentioned there but don't remember.

Reply 4 of 14, by danoon

WD - Thanks for the hint about page faults.

It looks like Betrayal in Antara has a double page fault which doesn't seem to be supported by the normal core.

I started looking at implementing double faults. I found a table of conditions that can lead to a double fault (interrupt 8) at http://www.logix.cz/michal/doc/i386/chp09-08.htm, and I verified that a 0 is pushed onto the stack as the error code when creating the exception. Based on when it would previously crash, it appears that my double fault routine gets executed around the right time, but so far no luck. I bet I messed up the eip, or perhaps there is more going on that I don't understand.

Reply 5 of 14, by wd

There are no "double pagefaults". What you're seeing is a pagefault that is handled by win3x but does not
return to the code where it started, so depending on the situation a "hanging" pagefault may or may not
create problems.

Double/triple faults are something completely different: they occur if, for example, the ivt/idt is bad or intentionally wrong
and a fault happens, so the fault handler cannot be invoked, which causes a second fault, and so on.
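
To make the distinction concrete, here is a minimal sketch of how a CPU decides whether a second exception, raised while it is still trying to deliver the first one, escalates to a double fault (vector 8). The classes follow the Intel manuals' benign/contributory/page-fault scheme; the function names are made up for this example, and vector 9 is simply left in the benign group, which may not match every manual revision. A page fault that happens later, inside an already running handler, does not go through this check at all and is just another page fault.

```cpp
#include <cassert>
#include <cstdint>

// Exception classes used for double-fault detection.
enum class ExcClass { Benign, Contributory, PageFault };

// Classify an exception vector (hypothetical helper for this sketch).
ExcClass classify(std::uint8_t vector) {
    switch (vector) {
        case 0:   // divide error
        case 10:  // invalid TSS
        case 11:  // segment not present
        case 12:  // stack fault
        case 13:  // general protection
            return ExcClass::Contributory;
        case 14:  // page fault
            return ExcClass::PageFault;
        default:  // everything else is treated as benign here
            return ExcClass::Benign;
    }
}

// True if `second`, raised while the CPU is still delivering `first`,
// is turned into a double fault instead of being handled on its own.
bool escalatesToDoubleFault(std::uint8_t first, std::uint8_t second) {
    const ExcClass a = classify(first);
    const ExcClass b = classify(second);
    if (a == ExcClass::Contributory) return b == ExcClass::Contributory;
    if (a == ExcClass::PageFault) return b != ExcClass::Benign;
    return false;  // a benign first exception never escalates
}

// Small usage check.
void demoDoubleFault() {
    assert(escalatesToDoubleFault(14, 14));   // #PF while delivering #PF
    assert(escalatesToDoubleFault(14, 13));   // #GP while delivering #PF
    assert(!escalatesToDoubleFault(13, 14));  // #PF while delivering #GP is handled normally
}
```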

Reply 6 of 14, by danoon

WD - Thanks for the explanation of double faults. The more I read about them, the more it seemed like that couldn't be the case, since the Intel specs I was reading said that after a double fault the current task state would be bad.

I didn't know that a page fault could be handled but not return back to the same instruction. That would explain a lot.

http://www.boxedwine.org/

Reply 7 of 14, by wd

It's actually quite simple to have it not return, a pagefault is nothing more than a pmode interrupt
so it pushes stuff on the stack, sets some special information and calls the handler. The handler routine
can do anything it likes, examining the code that caused the pagefault, then paging in memory,
then resuming execution at the faulting instruction (or leave out this point and do something else).
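
A toy model of that description, as a sketch only: the "CPU" records the faulting address, builds a frame with the saved eip and error code, calls the handler, and then resumes wherever the frame points. All names here (`FaultFrame`, `deliverPageFault`, and so on) are invented for the example and are not DOSBox code. The return value shows the check an emulator would need if it wants to know whether the handler came back to the instruction that faulted.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// What the handler finds on its stack (simplified).
struct FaultFrame {
    std::uint32_t savedEip;   // where the final IRET will resume
    std::uint32_t errorCode;  // page-fault error code
};

struct Cpu {
    std::uint32_t eip = 0;
    std::uint32_t cr2 = 0;    // faulting linear address
};

using Handler = std::function<void(Cpu&, FaultFrame&)>;

// Deliver a page fault and report whether execution resumed
// at the instruction that caused it.
bool deliverPageFault(Cpu& cpu, std::uint32_t linAddr,
                      std::uint32_t errorCode, const Handler& handler) {
    const std::uint32_t faultingEip = cpu.eip;
    cpu.cr2 = linAddr;                       // "sets some special information"
    FaultFrame frame{faultingEip, errorCode};
    handler(cpu, frame);                     // the handler can do anything it likes
    cpu.eip = frame.savedEip;                // models the IRET
    return cpu.eip == faultingEip;
}

void demoHandlers() {
    Cpu cpu;
    cpu.eip = 0x1000;

    // Handler that pages the memory in and leaves the frame alone.
    Handler resume = [](Cpu&, FaultFrame&) {};
    assert(deliverPageFault(cpu, 0xC0000000u, 0x2, resume));
    assert(cpu.eip == 0x1000);

    // Handler that decides to continue somewhere else.
    Handler redirect = [](Cpu&, FaultFrame& f) { f.savedEip = 0x2000; };
    assert(!deliverPageFault(cpu, 0xC0000000u, 0x2, redirect));
    assert(cpu.eip == 0x2000);
}
```

The second case is the "hanging" pagefault situation: code that waits for the handler to return to the faulting instruction will keep waiting.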

Reply 8 of 14, by danoon

I replaced the page fault handler / loop with this:

paging.cr2=lin_addr;
CPU.CPU_Exception(CPU.EXCEPTION_PF,faultcode);
throw new PageFaultException();

Sorry this is Java code. I didn't know how to achieve the same thing as easily in C++ (my C++ is getting rusty 😊), plus I wasn't sure if there was a good reason (portability, etc.) that dosbox doesn't just jump to the top of the running loop.

I catch this java exception in the normal loop

try {
    ret=CPU.cpudecoder.call();
    if (ret<0) return 1;
    if (ret>0) {
        /*Bitu*/int blah=Callback.CallBack_Handlers[ret].call();
        if (blah!=0) return blah;
    }
} catch (Paging.PageFaultException e) {
    // CPU_Exception has already set up the guest's handler; just carry on with the loop
}

In the dosbox code, cpu.mpl is saved and then set to 3 for the duration of the page fault, and restored when the page fault returns. I didn't have a good way to do this, so I just left it out. Does anyone know if this will have a bad side effect? I assume it was there for a reason.

Along with h-a-l-9000's paging patch that I ported to Java (For Testers: CGA/VGA Video BIOS separation and Paging patch), this makes it so that Win98 boots all the way and explorer doesn't crash with my Java port. "Betrayal in Antara" also gets a lot farther, at least it opens many more files as seen in the debug log, before it crashes the Java code. So perhaps this fix addressed one issue and I'm on to another.

It would be interesting to come up with a C patch that does the same thing; perhaps Windows and "Betrayal in Antara" could then run with the DOSBox normal core.
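
For comparison, the same control flow can be written with C++ exceptions. This is only a sketch on a toy machine, not a DOSBox patch: `Machine`, `touch`, `step` and `guestHandler` are invented names, the "program" is just a list of pages to touch, and the guest's handler is stood in for by a host function that marks the page present. The point it shows is that the memory access throws, the loop catches at the top, and the same instruction is then executed again.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Thrown by a memory access that hits a non-present page.
struct PageFaultException {
    std::uint32_t page;
};

struct Machine {
    std::vector<std::uint32_t> program;  // program[eip] = page touched by that instruction
    std::vector<bool> present;           // one flag per page
    std::size_t eip = 0;
    int retired = 0;                     // instructions that completed
    int faults = 0;                      // page faults delivered
};

// Memory access: throws instead of running a nested CPU loop.
void touch(const Machine& m, std::uint32_t page) {
    if (!m.present.at(page)) throw PageFaultException{page};
}

// One instruction: state is only changed after the access succeeded.
void step(Machine& m) {
    touch(m, m.program.at(m.eip));
    ++m.eip;
    ++m.retired;
}

// Stand-in for the guest's handler: map the page in.
void guestHandler(Machine& m, std::uint32_t page) {
    m.present.at(page) = true;
    ++m.faults;
}

// Main loop: a fault unwinds to here and the instruction is retried.
void runLoop(Machine& m) {
    while (m.eip < m.program.size()) {
        try {
            step(m);
        } catch (const PageFaultException& e) {
            guestHandler(m, e.page);
        }
    }
}

void demoUnwind() {
    Machine m;
    m.program = {0, 1, 0, 2};
    m.present = {false, false, false};
    runLoop(m);
    assert(m.retired == 4);
    assert(m.faults == 3);  // pages 0, 1 and 2 each fault once
}
```

This only works because `step` changes nothing before the access; an instruction that updates registers first would be retried from a half-finished state.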

Reply 9 of 14, by wd

Exception handling like that is usually very slow (because normal codeflow assumes exceptions are, um, exceptional).
I don't know how far this would affect java though.
The main reason back then to use the current logic is that not only the cores may trigger pagefaults, but the callback
code as well, so your code wouldn't account for that (if left alone like that).

The cpu.mpl is used to get better pagefault codes, since some memory accesses that are done in the internal code
belong to the system even though the current mode is user-code. You should try to retain these, but not too many
bad things will happen if you kick out the whole mpl logic altogether (minus effects on win9x systems).
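
A sketch of what "better pagefault codes" can mean. The three low bits of the x86 page-fault error code are architectural (bit 0: protection violation rather than not-present page, bit 1: write access, bit 2: user-mode access). Everything else here is an assumption made for illustration: the `CpuState` struct, the rule that an access counts as a user access when `(cpl & mpl) != 0`, and the guard class that saves, overrides and restores `mpl`.

```cpp
#include <cassert>
#include <cstdint>

// x86 page-fault error code bits.
constexpr std::uint32_t PF_PRESENT = 0x1;  // set: protection violation, clear: page not present
constexpr std::uint32_t PF_WRITE   = 0x2;  // set: the access was a write
constexpr std::uint32_t PF_USER    = 0x4;  // set: the access was made in user mode

struct CpuState {
    unsigned cpl = 3;  // current privilege level
    unsigned mpl = 3;  // assumed override: 0 while the emulator accesses memory on the system's behalf
};

// Saves mpl, overrides it, and restores it when the scope ends.
class MplGuard {
public:
    MplGuard(CpuState& cpu, unsigned mpl) : cpu_(cpu), saved_(cpu.mpl) { cpu_.mpl = mpl; }
    ~MplGuard() { cpu_.mpl = saved_; }
    MplGuard(const MplGuard&) = delete;
    MplGuard& operator=(const MplGuard&) = delete;
private:
    CpuState& cpu_;
    unsigned saved_;
};

// Build the error code for a faulting access (assumed rule for the user bit).
std::uint32_t faultCode(const CpuState& cpu, bool writing, bool pagePresent) {
    std::uint32_t code = 0;
    if (pagePresent) code |= PF_PRESENT;
    if (writing) code |= PF_WRITE;
    if ((cpu.cpl & cpu.mpl) != 0) code |= PF_USER;
    return code;
}

void demoFaultCode() {
    CpuState cpu;  // cpl = 3, mpl = 3: ordinary user-mode access
    assert(faultCode(cpu, true, false) == (PF_WRITE | PF_USER));
    {
        MplGuard systemAccess(cpu, 0);  // e.g. a descriptor table read done for the guest
        assert(faultCode(cpu, true, false) == PF_WRITE);
    }
    assert(cpu.mpl == 3);  // restored when the guard went out of scope
}
```

A guard object like this also restores the value when the scope is left through a C++ exception, which matters if page faults unwind the core that way.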

Reply 11 of 14, by danoon

I'm messing around with the Dosbox Megabuild 6 source since it has the paging patches and Windows 98 runs just fine with the dynamic core. I'm still trying to get the normal core to run Win98, and I'm getting a lot closer. It can boot up and run Diablo without any errors. But some things, like IE5, cause crashes.

I modified the code so that page faults will pop back to the top of the main cpu loop instead of running another instance of the loop in place. The only other changes I made were to the normal core string functions, so that they update edi/esi/ecx immediately instead of using local variables; that way page faults work correctly with them.

I was wondering if someone had some pointers as to what I might have overlooked and what else might be different between the normal and dynamic cores.

EXPLORER caused an invalid page fault in
module OLE32.DLL at 0167:7ffa9fd5.
Registers:
EAX=0007620 CS=0167 EIP=7ffa9fd5
EFLGS=00000287
EBX=00c2f9e4 SS=016f ESP=00c2f8a0
EBP=00c2f8c5
ECX=00000010 DS=016f ESI=9000c2f8 FS=239f
EDX=00000010 ES=016f EDI=00c2f9e4 GS=0000
Bytes at CS:EIP:
f3 a6 0f 85 77 06 00 00 8b 7d 08 33 c0 ab ab ab

static void PAGING_NewPageFault(PhysPt lin_addr, Bitu page_addr,
        bool prepare_only, Bitu faultcode) {
    paging.cr2=lin_addr;
    //LOG_MSG("FAULT q%d, code %x", pf_queue.used, faultcode);
    //PrintPageInfo("FA+",lin_addr,faultcode, prepare_only);

    if (prepare_only) {
        cpu.exception.which = EXCEPTION_PF;
        cpu.exception.error = faultcode;
    } else {
        if (in_callback==0) {
            // Not inside a callback: raise the fault and unwind to the setjmp in DOSBOX_RunMachine
            FillFlags();
            CPU_Exception(EXCEPTION_PF,faultcode);
            longjmp(top_of_loop, 1);
        }
        // Inside a callback: fall back to the nested pagefault loop
        // Save the state of the cpu cores
        LazyFlags old_lflags;
        memcpy(&old_lflags,&lflags,sizeof(LazyFlags));
        CPU_Decoder * old_cpudecoder;
        old_cpudecoder=cpudecoder;
        cpudecoder=&PageFaultCore;
        if (pf_queue.used >= PF_QUEUESIZE) E_Exit("PF queue overrun.");
        PF_Entry * entry=&pf_queue.entries[pf_queue.used++];
        entry->cs=SegValue(cs);
        entry->eip=reg_eip;
        entry->page_addr=page_addr;
        entry->mpl=cpu.mpl;
        cpu.mpl=3;
        CPU_Exception(EXCEPTION_PF,faultcode);
#if C_DEBUG
        // DEBUG_EnableDebugger();
#endif
        DOSBOX_RunMachine();
        pf_queue.used--;
        LOG(LOG_PAGING,LOG_NORMAL)("Left PageFault for %x queue %d",lin_addr,pf_queue.used);
        memcpy(&lflags,&old_lflags,sizeof(LazyFlags));
        cpudecoder=old_cpudecoder;
        //LOG_MSG("FAULT exit");
    }
}

jmp_buf top_of_loop;

void DOSBOX_RunMachinePF(void){
    Bitu ret;
    do {
        ret=(*loop)();
    } while (!ret);
}

void DOSBOX_RunMachine(void){
    Bitu ret;
    // longjmp(top_of_loop, 1) from the pagefault code lands here
    setjmp(top_of_loop);
    do {
        ret=(*loop)();
    } while (!ret);
}

static Bitu Normal_Loop(void) {
    Bits ret;
    while (1) {
        if (PIC_RunQueue()) {
            ret=(*cpudecoder)();
            if (GCC_UNLIKELY(ret<0)) return 1;
            if (ret>0) {
                // Track callback nesting so the pagefault code knows when it must not longjmp
                in_callback++;
                Bitu blah=(*CallBack_Handlers[ret])();
                in_callback--;
                if (GCC_UNLIKELY(blah)) return blah;
            }

Partial extract of string changes

static void DoString(STRING_OP type) {
    if (core.prefixes & PREFIX_ADDR)
        DoString32(type);
    else
        DoString16(type);
}

static void DoString16(STRING_OP type) {
    PhysPt si_base,di_base;
    Bitu count,count_left;
    Bits add_index;

    si_base=BaseDS;
    di_base=SegBase(es);
    count=reg_cx;
    if (!TEST_PREFIX_REP) {
        count=1;
    } else {
        CPU_Cycles++;
        /* Calculate amount of ops to do before cycles run out */
        if ((count>(Bitu)CPU_Cycles) && (type<R_SCASB)) {
            count_left=count-CPU_Cycles;
            count=CPU_Cycles;
            CPU_Cycles=0;
            LOADIP; //RESET IP to the start
        } else {
            /* Won't interrupt scas and cmps instruction since they can interrupt themselves */
            if ((count<=1) && (CPU_Cycles<=1)) CPU_Cycles--;
            else if (type<R_SCASB) CPU_Cycles-=count;
            count_left=0;
        }
    }
    add_index=cpu.direction;
    if (count) switch (type) {
    case R_OUTSB:
        for (;count>0;count--) {
            IO_WriteB(reg_dx,LoadMb(si_base+reg_si));
            reg_si+=add_index;
            if (TEST_PREFIX_REP) reg_cx--;
        }
        break;
    case R_OUTSW:
        add_index<<=1;
        for (;count>0;count--) {
            IO_WriteW(reg_dx,LoadMw(si_base+reg_si));
            reg_si+=add_index;
            if (TEST_PREFIX_REP) reg_cx--;
        }
        break;
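
The idea behind the string changes, updating the registers after every element so that a page fault can simply re-run the instruction, can be tried out in isolation. The following is a self-contained toy model, not DOSBox code: `Regs`, `Memory`, `repMovsb` and `runWithRestart` are invented names, the memory faults exactly once at a chosen address, and the direction flag is assumed to be clear (forward copy).

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Fault : std::runtime_error {
    Fault() : std::runtime_error("page fault") {}
};

struct Regs {
    std::uint32_t esi = 0, edi = 0, ecx = 0;
};

// Flat memory whose first access to `faultAt` throws once.
struct Memory {
    std::vector<std::uint8_t> bytes;
    std::uint32_t faultAt = 0;
    bool armed = false;

    std::uint8_t& at(std::uint32_t addr) {
        if (armed && addr == faultAt) {
            armed = false;  // the "handler" will have paged it in by the retry
            throw Fault{};
        }
        return bytes.at(addr);
    }
};

// REP MOVSB with registers committed after every element, so a fault
// leaves esi/edi/ecx describing exactly the work that is still left.
void repMovsb(Regs& r, Memory& mem) {
    while (r.ecx != 0) {
        const std::uint8_t value = mem.at(r.esi);  // may fault, nothing committed yet
        mem.at(r.edi) = value;                     // may fault, nothing committed yet
        ++r.esi;
        ++r.edi;
        --r.ecx;                                   // commit this element
    }
}

// Re-execute the instruction after each fault; returns the fault count.
int runWithRestart(Regs& r, Memory& mem) {
    int faults = 0;
    for (;;) {
        try {
            repMovsb(r, mem);
            return faults;
        } catch (const Fault&) {
            ++faults;
        }
    }
}

void demoRestart() {
    Memory mem;
    mem.bytes = std::vector<std::uint8_t>(16, 0);
    mem.bytes[0] = 1; mem.bytes[1] = 2; mem.bytes[2] = 3; mem.bytes[3] = 4;
    mem.faultAt = 10;  // third destination byte
    mem.armed = true;

    Regs r;
    r.esi = 0; r.edi = 8; r.ecx = 4;
    assert(runWithRestart(r, mem) == 1);
    assert(mem.bytes[8] == 1 && mem.bytes[11] == 4);
    assert(r.ecx == 0);
}
```

With a local copy of the count that is only written back at the end, the retry after the fault would start again from the original ecx while the pointers had already moved, or the other way round, depending on what had been written back.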

Reply 12 of 14, by wd

danoon wrote: "I modified the code so that page faults will pop back to the top of the main cpu loop instead of running another instance of the loop in place."

I think that won't work in the general case since the normal core wasn't written with atomic instruction behaviour in mind,
so if you just jump out on pagefaults some registers may already be modified so the state you're leaving the core in
is messed up (that's why the current pagefault handler tries to return to the intercepted instruction).

Reply 13 of 14, by danoon

WD: Thank you for putting up with my questions. I too thought that one of the instructions wasn't reentrant after the page fault, and with your second opinion I redoubled my efforts. I logged all the instructions that generated page faults while Win98 ran; there were only 59 different ops, so I focused on each of them. Turns out I missed jumps.

#define JumpCond32_d(COND) {            \
    SAVEIP;                             \
    if (COND) reg_eip+=Fetchds();       \
    reg_eip+=4;                         \
    continue;                           \
}

Obviously eip shouldn't be saved until after the fetch.

Now I no longer get the crashes around IE 5. In fact Win98 SE seems pretty stable running on my java port of the normal core.
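
The ordering problem in that macro can be reproduced on a small model. This is a sketch under assumptions, not the actual fix from the port: `Core`, `fetchDisp32` and the two jump functions are invented for the example. `regEip` stands for the architectural eip that a fault handler gets to see, and `cseip` for the decoder's running fetch pointer. If the displacement fetch faults after eip has been saved, the handler sees an address in the middle of the instruction; if the fetch comes first, eip still points at the start.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

struct Fault : std::runtime_error {
    Fault() : std::runtime_error("page fault") {}
};

struct Core {
    std::uint32_t regEip = 0;       // architectural EIP, what a fault handler sees
    std::uint32_t cseip = 0;        // decoder's fetch pointer
    bool operandFaults = false;     // does reading the displacement fault?
    std::int32_t displacement = 0;  // value the fetch returns

    std::int32_t fetchDisp32() {
        if (operandFaults) throw Fault{};
        cseip += 4;
        return displacement;
    }
};

// Order as in the quoted macro: save eip first, then fetch.
void jumpSaveThenFetch(Core& c, bool cond) {
    c.regEip = c.cseip;  // now points just past the opcode bytes
    if (cond) c.regEip += static_cast<std::uint32_t>(c.fetchDisp32());
    c.regEip += 4;       // skip the displacement itself
}

// Reordered: fetch first, touch eip only once nothing can fault anymore.
void jumpFetchThenSave(Core& c, bool cond) {
    const std::int32_t disp = c.fetchDisp32();
    c.regEip = c.cseip;  // cseip is already past the displacement
    if (cond) c.regEip += static_cast<std::uint32_t>(disp);
}

// EIP a handler would see if the displacement fetch faults.
// The instruction starts at 0x100 and has a two-byte opcode.
std::uint32_t eipSeenByHandler(void (*jump)(Core&, bool)) {
    Core c;
    c.regEip = 0x100;
    c.cseip = 0x102;
    c.operandFaults = true;
    try {
        jump(c, true);
    } catch (const Fault&) {
        // fault delivered; regEip is what would be pushed for the handler
    }
    return c.regEip;
}

void demoJumpOrder() {
    assert(eipSeenByHandler(jumpSaveThenFetch) == 0x102);  // mid-instruction
    assert(eipSeenByHandler(jumpFetchThenSave) == 0x100);  // instruction start
}
```

Without a fault both orders compute the same jump target, which is why the problem only shows up when the displacement happens to sit on a page that is not present.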