VOGONS


Reply 60 of 110, by gulikoza

User metadata
Rank Oldbie

It crashes in the dynamic linker helper, which is why it works the second time. I found a nice page describing the process: http://timetobleed.com/dynamic-linking-elf-vs-mach-o

When the PLT already contains the address of cos() (because it was resolved during a previous run with the normal core), there is no problem. If it has not been resolved yet, dyld_stub_binder() is called. It crashes at the first instruction that uses xmm0, so I assume the error is that the stack is not 16-byte aligned at that point.
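
A minimal sketch of the alignment arithmetic behind that guess. It assumes the x86-64 calling convention, where rsp must be a multiple of 16 immediately before a call instruction, and that the generated code is entered with rsp off by 8 (that depends on what the caller pushed, so treat it as an assumption). The function names are made up for illustration and are not DOSBox code.

```cpp
#include <cassert>
#include <cstdint>

// True if rsp has the 16-byte alignment the x86-64 ABI expects
// immediately before a `call` instruction.
bool aligned_for_call(std::uint64_t rsp) {
    return (rsp & 0xF) == 0;
}

// A `call` pushes an 8-byte return address, so rsp is 8 lower on entry.
std::uint64_t rsp_after_call(std::uint64_t rsp) {
    return rsp - 8;
}

// Bytes to subtract from rsp so that the next `call` is aligned again.
std::uint64_t padding_needed(std::uint64_t rsp) {
    assert(rsp % 8 == 0);  // the stack is assumed to stay 8-byte aligned
    return rsp & 0xF;
}

// Hypothetical walk-through: aligned caller, misaligned callee, 8 bytes of padding.
void check_alignment_example() {
    const std::uint64_t caller_rsp = 0x7fff5fbff000ULL;  // example value only
    const std::uint64_t callee_rsp = rsp_after_call(caller_rsp);
    assert(aligned_for_call(caller_rsp));
    assert(!aligned_for_call(callee_rsp));
    assert(padding_needed(callee_rsp) == 8);
    assert(aligned_for_call(callee_rsp - padding_needed(callee_rsp)));
}
```

If a callee then uses an aligned SSE store such as movaps or movdqa on a stack slot, a stack that is off by 8 faults, which would match a crash at the first xmm0 instruction.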

Why is 0x08 added to rsp in gen_call_function_setup()?

http://www.si-gamer.net/gulikoza

Reply 62 of 110, by gulikoza

Replaced gen_call_function_raw() with gen_call_function_setup() and it works the first time as well, so it is the stack alignment 😎. It is quite a lot slower, though... I guess the real solution would be to see which calls really need an aligned stack and fix only those 😁

Reply 63 of 110, by wd

User metadata
Rank DOSBox Author

Will have a look at that. Possibly only floating-point stuff needs the alignment, and some of it may go through that function (and the _raw variant takes too much of a shortcut for that...).

Thanks for figuring that out 😀

Reply 64 of 110, by wd

Could try this:

static void INLINE gen_call_function_raw(void * func) {
    cache_addb(0x48);
    cache_addw(0xec83);
    cache_addb(0x08);           // sub rsp,0x08 (keeps the stack 16-byte aligned for the call)

    cache_addb(0x48);
    cache_addb(0xb8);           // mov rax,imm64
    cache_addq((Bit64u)func);
    cache_addw(0xd0ff);         // call rax

    cache_addb(0x48);
    cache_addw(0xc483);
    cache_addb(0x08);           // add rsp,0x08
}
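
To make the byte layout of that sequence easier to check, here is a small stand-alone sketch that emits the same bytes into a vector. ByteSink and emit_aligned_call() are hypothetical stand-ins; the only assumption about the real cache_add* helpers is that they append values in little-endian order.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the code cache: values are appended in little-endian order.
struct ByteSink {
    std::vector<std::uint8_t> bytes;
    void addb(std::uint8_t v) { bytes.push_back(v); }
    void addw(std::uint16_t v) {
        addb(static_cast<std::uint8_t>(v & 0xFF));
        addb(static_cast<std::uint8_t>(v >> 8));
    }
    void addq(std::uint64_t v) {
        for (int i = 0; i < 8; ++i)
            addb(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF));
    }
};

// Emits the same 20 bytes as the gen_call_function_raw() variant above.
std::vector<std::uint8_t> emit_aligned_call(std::uint64_t func) {
    ByteSink c;
    c.addb(0x48); c.addw(0xec83); c.addb(0x08);  // 48 83 ec 08   sub rsp,0x08
    c.addb(0x48); c.addb(0xb8);   c.addq(func);  // 48 b8 imm64   mov rax,func
    c.addw(0xd0ff);                              // ff d0         call rax
    c.addb(0x48); c.addw(0xc483); c.addb(0x08);  // 48 83 c4 08   add rsp,0x08
    assert(c.bytes.size() == 20);                // 4 + 10 + 2 + 4 bytes
    return c.bytes;
}
```

The sequence grows from 12 to 20 bytes, and the 64-bit immediate moves from offset 2 to offset 6. Anything that patches the call in place has to account for both.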

Reply 66 of 110, by Dominus

User metadata
Rank DOSBox Moderator

Ok, tested it just now.
I assumed wd's changes need to go on top of gulikoza's diff (at risc_x64.h line 336).
With this, the dynamic core still crashes in PCPBENCH. It also crashes if PCPBENCH is first run in the normal core and I then switch to dynamic.

Windows 3.1x guide for DOSBox
60 seconds guide to DOSBox
DOSBox SVN snapshot for macOS (10.4-11.x ppc/intel 32/64bit) notarized for gatekeeper

Reply 68 of 110, by Dominus

Don't know how to trace this. Sorry.

Reply 69 of 110, by gulikoza

I was able to steal the macbook for the weekend again 😁

With wd's changes, PCPBENCH works fine here. Dominus, are you sure you changed gen_call_function_raw()? That one is at line 378 here, although my risc_x64.h is probably modified. gen_call_function_setup() should be left as it is (line 343 here). And yes, these changes should go on top of my patch from the previous page.

edit: nvm, I probably moved gen_call_function_raw(), but it should still work. Let me try again with a clean risc_x64.h.

Reply 70 of 110, by Dominus

Yes, please let me know which changes are correct. I'm getting confused 😉

Reply 71 of 110, by gulikoza

Ok... checked out risc_x64.h from svn again. Touched the file (so the modifications get picked up by make). Ran dosbox: crash before the prompt. Applied my patch: crash just after running PCPBENCH. Pasted wd's code at line 336: PCPBENCH works. 😀

Reply 73 of 110, by wd

Ok, so your patch (fixed memory addressing) + the stack adjustment I posted + small cache blocks makes this work? Nice.

Does it also work with the cache block entries set to 32 and _raw replaced by the other function setup routine? The logic behind my small change was simply to always align the stack to a 16-byte boundary, since that seems to be an ABI requirement, but I'll check where we call that (though it doesn't sound like it would be related to the max cache block entries).

Reply 74 of 110, by gulikoza

Do you know where this code comes from?

0x11e04cd8c: and rsp,0xfffffffffffffff0
0x11e04cd90: add rsp,0x8
0x11e04cd94: push rax
0x11e04cd95: mov rax,0x10000c340
0x11e04cd9f: call rax
0x11e04cda1: pop rsp
0x11e04cda2: or al,al
0x11e04cda4: jne 0x11e04d0b2
0x11e04cdaa: mov di,0x44a
0x11e04cdae: mov WORD PTR [rip-0x1dde28a9],di # 0x10026a50c <cpu_regs+12>
0x11e04cdb5: mov di,WORD PTR [rip-0x1dde28b0] # 0x10026a50c <cpu_regs+12>
0x11e04cdbc: mov si,0xf
0x11e04cdc0: mov eax,edi
0x11e04cdc2: add eax,esi
0x11e04cdc4: jmp 0x11e04cdcc
0x11e04cdc6: nop
0x11e04cdc7: nop
0x11e04cdc8: nop
0x11e04cdc9: nop
0x11e04cdca: nop
0x11e04cdcb: nop
0x11e04cdcc: add BYTE PTR [rax],al
0x11e04cdce: call rax
0x11e04cdd0: add rsp,0x8
0x11e04cdd4: mov WORD PTR [rip-0x1dde28cf],ax # 0x10026a50c <cpu_regs+12>
0x11e04cddb: mov di,WORD PTR [rip-0x1dde28d6] # 0x10026a50c <cpu_regs+12>
0x11e04cde2: mov si,0xf0
0x11e04cde6: mov eax,edi
0x11e04cde8: and eax,esi

It crashes at 0x11e04cdcc. 0x11e04cdce looks like the second part of gen_call_function_raw(), but add BYTE PTR [rax],al?? The memory at that address is:

(gdb) x/24xb 0x11e04cdc4
0x11e04cdc4: 0xeb 0x06 0x90 0x90 0x90 0x90 0x90 0x90
0x11e04cdcc: 0x00 0x00 0xff 0xd0 0x48 0x83 0xc4 0x08
0x11e04cdd4: 0x66 0x89 0x05 0x31 0xd7 0x21 0xe2 0x66

The last translated opcode (if my logging is correct) is 0x75.

Reply 75 of 110, by Dominus

Thanks, so the cache blocks setting is the difference, and that is why it crashed for me...

Reply 76 of 110, by gulikoza

I guess those NOPs come from gen_fill_function_ptr() but they no longer correctly overwrite the function call. Disabling DRC_FLAGS_INVALIDATION fixes the crash.
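
The addresses in the listing from my previous post are consistent with that guess. A small sketch of the arithmetic, using only values from that listing; kPos is inferred (it is where "mov eax,edi" starts) and the helper name is made up:

```cpp
#include <cassert>
#include <cstdint>

// Target of a 2-byte short jump (opcode 0xeb, signed 8-bit displacement).
// The displacement is relative to the address of the next instruction.
std::uint64_t short_jmp_target(std::uint64_t insn_addr, std::int8_t rel8) {
    return insn_addr + 2 + static_cast<std::uint64_t>(static_cast<std::int64_t>(rel8));
}

// Addresses taken from the gdb listing; kPos is inferred from it.
const std::uint64_t kPos     = 0x11e04cdc0ULL;  // start of the overwritten call sequence
const std::uint64_t kJmpAddr = 0x11e04cdc4ULL;  // bytes eb 06

// Length of the call sequence without and with the sub/add rsp pair.
const std::uint64_t kOldCallLen = 12;  // mov rax,imm64 (10) + call rax (2)
const std::uint64_t kNewCallLen = 20;  // plus sub rsp,8 (4) and add rsp,8 (4)

// The stale stub skips 12 bytes and lands on the crash address;
// the next real instruction only starts 20 bytes after kPos.
void check_stale_skip() {
    assert(short_jmp_target(kJmpAddr, 0x06) == kPos + kOldCallLen);  // 0x11e04cdcc
    assert(kPos + kNewCallLen == 0x11e04cdd4ULL);
}
```

So the jump lands on the leftover tail of the 64-bit immediate (00 00), followed by the old call rax and add rsp,0x8, instead of skipping the whole sequence.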

Reply 77 of 110, by wd

gulikoza wrote: "I guess those NOPs come from gen_fill_function_ptr() but they no longer correctly overwrite the function call."

Wow, yeah, sorry, had a hard time remembering how advanced that thing is 😉

We have 8 additional bytes now, so the blocks in gen_fill_function_ptr() look like this:

    *(Bit32u*)(pos+0)=0xf001f889;   // mov eax,edi; add eax,esi
    *(Bit32u*)(pos+4)=0x90900eeb;   // jmp +0x0e: skip the rest of the 20-byte call sequence
    *(Bit32u*)(pos+8)=0x90909090;   // nop padding
    *(Bit32u*)(pos+12)=0x90909090;
    *(Bit32u*)(pos+16)=0x90909090;
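
A quick way to double-check the new skip distance is to build those 20 bytes in a plain buffer and compute where the short jump lands. This is only a sketch; store_le32(), build_add_stub() and skip_target() are made-up helpers, not DOSBox functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::size_t kCallSeqLen = 20;  // length of the call sequence with sub/add rsp

// Stores a 32-bit value in little-endian byte order, as a Bit32u store does on x86-64.
void store_le32(std::uint8_t* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        p[i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Builds the 20-byte replacement stub from the five words quoted above.
std::array<std::uint8_t, kCallSeqLen> build_add_stub() {
    std::array<std::uint8_t, kCallSeqLen> buf{};
    const std::uint32_t words[5] = {0xf001f889u, 0x90900eebu, 0x90909090u,
                                    0x90909090u, 0x90909090u};
    for (std::size_t i = 0; i < 5; ++i)
        store_le32(buf.data() + 4 * i, words[i]);
    return buf;
}

// Offset from pos at which the short jump stored at pos+4 lands.
std::size_t skip_target(const std::array<std::uint8_t, kCallSeqLen>& buf) {
    assert(buf[4] == 0xeb);  // short jmp opcode
    return 4 + 2 + buf[5];   // offset of the next instruction + displacement
}
```

With the displacement 0x0e the jump lands at pos+20, directly behind the add rsp,0x08, whereas a displacement of 0x06 only reaches pos+12.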

I'll see about merging these things into current sources unless somebody is keen on doing that...

Reply 78 of 110, by gulikoza

There's a problem with the default case (pos+2), I guess. This becomes pos+6 when _raw is called, but remains (?) pos+2 when _setup is called. I'm not sure I can find all the cases where pos+6 is required, so I've added -4 to cache.pos in _setup in the following patch. Feel free to change that (or put something in a comment) 😀

The following patch has all the changes merged and seems to work (on Mac OS X at least; I'll try to test Linux later as well).
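
To illustrate the pos+2 versus pos+6 difference, a small sketch that patches the 64-bit call target in a byte buffer. The offsets follow from the byte layout of "48 b8 imm64" with and without a 4-byte sub rsp,0x08 in front; the helper names are hypothetical, and which offset _setup really needs is exactly the open question above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Offset of the immediate when "48 b8 imm64" is the first instruction at pos.
const std::size_t kImmOffsetPlain = 2;
// Offset when a 4-byte "sub rsp,0x08" precedes the mov.
const std::size_t kImmOffsetWithSub = 4 + 2;

// Overwrites the call target stored at pos+imm_offset
// (host byte order, which is little-endian on x86-64).
void patch_call_target(std::uint8_t* pos, std::size_t imm_offset, std::uint64_t fn) {
    assert(pos != nullptr);
    std::memcpy(pos + imm_offset, &fn, sizeof fn);
}

// Reads the call target back from the same location.
std::uint64_t read_call_target(const std::uint8_t* pos, std::size_t imm_offset) {
    assert(pos != nullptr);
    std::uint64_t fn = 0;
    std::memcpy(&fn, pos + imm_offset, sizeof fn);
    return fn;
}
```

Patching at pos+2 in the longer sequence would overwrite the last two bytes of the sub rsp,0x08 and the mov opcode itself, so the offset has to match the layout that was actually emitted.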

Attachment: risc_x64.diff (8 KiB, 226 downloads; file license: Fair use/fair dealing exception)
