PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-1-31 @ 18:45

So I started this over 5 years ago, left it unfinished for a few years and now started from scratch and got it into a fit state for submitting...

This patch adds a dynamic recompiler for 32-bit PowerPC, based on the existing dynrec framework. I've only tested it on a Wii, but there should be no reason for it not to work on PowerPC-based Macs. As far as performance goes, with core=normal I get 0.7 fps from PCPBENCH and with core=dynamic I get 3.1 fps. There are some other big-endian improvements that get it up to 4.0 fps, but I haven't included them here as they aren't related to dynrec.

I haven't touched any of the autoconfigure scripts, so config.h needs the following settings:
#define C_TARGETCPU POWERPC
#define C_DYNREC 1
#define WORDS_BIGENDIAN
The compiler needs to support gcc inline assembly (checked via defined(__GNUC__)) for dcache flushing/icache invalidation. There doesn't seem to be a portable way to achieve this, but they're not supervisor level instructions so should be fine for any userspace program to use.
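
For readers unfamiliar with the flush/invalidate step, here is a minimal sketch of what such a routine can look like. It is not taken from the patch: the function name, the 32-byte cache line size and the fallback to GCC's __builtin___clear_cache on non-PowerPC hosts are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed line size: 32 bytes matches the Wii's CPU and G3/G4-class parts,
// but other PowerPC cores use different sizes.
static const std::uintptr_t kCacheLine = 32;

// Hypothetical helper (not from the patch): make code written to
// [start, start + len) visible to instruction fetch.
// Returns the number of cache lines covered by the range.
static std::size_t flush_code_range(void* start, std::size_t len) {
    if (len == 0) return 0;
    const std::uintptr_t addr  = reinterpret_cast<std::uintptr_t>(start);
    const std::uintptr_t first = addr & ~(kCacheLine - 1);
    const std::uintptr_t last  = (addr + len - 1) & ~(kCacheLine - 1);
    const std::size_t lines =
        static_cast<std::size_t>((last - first) / kCacheLine) + 1;
#if defined(__GNUC__) && (defined(__powerpc__) || defined(__ppc__))
    // Push modified data cache lines out to memory.
    for (std::size_t i = 0; i < lines; i++)
        __asm__ __volatile__("dcbst 0,%0" : : "r"(first + i * kCacheLine) : "memory");
    __asm__ __volatile__("sync" : : : "memory");
    // Drop stale instruction cache lines, then discard prefetched instructions.
    for (std::size_t i = 0; i < lines; i++)
        __asm__ __volatile__("icbi 0,%0" : : "r"(first + i * kCacheLine) : "memory");
    __asm__ __volatile__("sync\n\tisync" : : : "memory");
#elif defined(__GNUC__)
    // Other GCC targets: let the compiler builtin do whatever the host needs.
    char* begin = static_cast<char*>(start);
    __builtin___clear_cache(begin, begin + len);
#endif
    return lines;
}
```

A recompiler would call something like this once per finished block, after the last instruction has been written and before jumping into it.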

Some comments on the changes:
- I had to name the FPU_Rec struct so it could be forward-declared in risc_ppc.h (having a dedicated register pointed to it helps FPU heavy code).
- Removed some WORDS_BIGENDIAN guards in the self-modifying code detection. They weren't needed, as the additions aren't meant to overflow between bytes.
- Made dyn_run_code() get called before dyn_return(BR_Link1/BR_Link2) and shuffled their locations a bit. The reason for this is that the PPC dynrec generates its epilog once in gen_run_code() and then puts a jump to it whenever gen_return_function() is called, rather than emitting a full epilog every time. If dyn_return() was called before dyn_run_code() the address of the epilog is unknown.
- Added the missing cache_block_before_close()/cache_block_closing() calls for those blocks.
- The dynrec decoder wasn't differentiating between little-endian (host) memory access and regular memory access. I added new functions where necessary (hopefully caught them all) and aliased them to the regular functions when WORDS_BIGENDIAN is not defined.
- dyn_ret_near() was bugged: it tried to write a dword to &reg_ip, which overran on big-endian.
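
To illustrate the distinction between little-endian memory access and regular (host-order) access mentioned in the list above, here is a small sketch. The function names are hypothetical and are not the ones added by the patch; the point is only that emulated x86 data has a fixed byte order, while ordinary C++ variables use whatever order the host CPU uses.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical names; the patch's actual functions are not reproduced here.

// Read a 32-bit value stored in little-endian byte order.
// Gives the same result on little- and big-endian hosts.
static std::uint32_t read_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Write a 32-bit value in little-endian byte order.
static void write_le32(unsigned char* p, std::uint32_t v) {
    p[0] = static_cast<unsigned char>(v);
    p[1] = static_cast<unsigned char>(v >> 8);
    p[2] = static_cast<unsigned char>(v >> 16);
    p[3] = static_cast<unsigned char>(v >> 24);
}

// Read a 32-bit value in the host's native byte order, e.g. an ordinary
// C++ variable that the generated code accesses directly.
static std::uint32_t read_native32(const void* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

On a little-endian host the two kinds of read behave identically, which is why aliasing one to the other is safe when WORDS_BIGENDIAN is not defined.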
Attachments
ppc_dynrec.diff
(43.35 KiB) Downloaded 4 times
Last edited by jmarsh on 2019-10-06 @ 18:41, edited 2 times in total.
jmarsh
Member
 
Posts: 285
Joined: 2014-1-04 @ 09:17

Re: PowerPC Dynamic Recompiler (patch)

Postby digger » 2019-2-01 @ 01:37

Promising work! :)

How well do you think this will run on Talos and/or Blackbird hardware by Raptor Computing Systems? https://www.raptorcs.com/content/base/products.html

Is there anybody here on Vogons who owns such a system? If not, perhaps you could get in touch with the guy who manages the Talospace blog and ask if he'd be willing to try this out on his hardware: https://www.talospace.com/

He's quite the low-level software wizard himself, by the way. He's also the developer behind TenFourFox, the Firefox fork for PowerPC Macs that he still maintains and updates regularly.
digger
Member
 
Posts: 225
Joined: 2010-2-12 @ 18:15
Location: Amsterdam, the Netherlands

Re: PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-2-01 @ 10:24

Afaict that's a different ISA. PowerPC (what this dynrec is) branched off from the POWER line, so while they have a similar base they're not compatible.

Re: PowerPC Dynamic Recompiler (patch)

Postby digger » 2019-2-01 @ 16:34

jmarsh wrote:Afaict that's a different ISA. PowerPC (what this dynrec is) branched off from the POWER line, so while they have a similar base they're not compatible.


Hmmm, according to this blog post on Talospace, it should be possible for these systems to virtualize (and partially emulate) any PowerPC CPU. To quote the blog post directly:

However, KVM-PR can also emulate other instructions and their desired behaviour, which theoretically allows it to act like any supported Power ISA or PowerPC CPU, including a G3, G4 or G5. Instructions which aren't supported natively are trapped and executed just like supervisor-level instructions, and everything else can still run on the metal.


So if I understand this correctly, it does have to trap and emulate certain instructions, but most of them can be executed natively. That should still allow these systems from Raptor CS to run DOSBox with your PowerPC Dynamic Recompiler patch with reasonable to good performance, wouldn't it? Perhaps you could even modify your patch to avoid any instructions that POWER8 and POWER9 would have to emulate, or at least reduce the number of those in your code as much as possible.

Of course, to develop, test and debug for the POWER8 and/or POWER9 architectures, you would have to be in possession of such a system, and this hardware is not exactly cheap. But if you don't have it, perhaps someone else here on the forum who has such a machine could try this out for you, using KVM-PR if necessary? :)

Re: PowerPC Dynamic Recompiler (patch)

Postby Qbix » 2019-2-28 @ 17:06

Thank you for the patch. As I said on IRC, I am not too sure about your fix for dyn_ret_near.

Your fixes/changes didn't fix the problem that I have been chasing for a while now.
I'll make a topic about it, as at the moment I am not really sure how to find out what is going wrong.

The reordering that you did doesn't seem to break the x64 dynrec.
Water flows down the stream
How to ask questions the smart way!
Qbix
DOSBox Author
 
Posts: 10919
Joined: 2002-11-27 @ 14:50
Location: Fryslan

Re: PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-2-28 @ 18:44

There's actually a bug in gen_and_imm, but luckily none of the current use cases for that function trigger it. I can fix it up on the weekend and tweak the dyn_ret_near fix to be more correct rather than just doing the same as little-endian systems.

Re: PowerPC Dynamic Recompiler (patch)

Postby Qbix » 2019-2-28 @ 19:59

Do you think the bigop?reg_eip:reip,bigop version was better, or what we came up with on IRC (which is also used in other places)?

Re: PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-2-28 @ 22:59

Get rid of the zero extension when decode.big_op is false and use "decode.big_op?(void*)(&reg_eip):(void*)(&reg_ip),decode.big_op)" when storing the value from the host reg.
If the value is zero extended it's fine to write 32 bits to &reg_eip (that's what the normal and full cores always do), but why make 16-bit code emit an extra instruction when we can just write 16 bits directly instead?
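
As a rough model of why the store width matters, the sketch below treats eip as a plain 32-bit variable and ip as its low 16 bits. Everything here (names, layout detection) is hypothetical and only meant to show the idea: on a big-endian host the ip half sits 2 bytes into eip, so a 4-byte write at that address would spill 2 bytes past the register.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Byte offset of the low 16 bits (the "ip" half) inside a 32-bit "eip",
// detected at run time: 0 on little-endian hosts, 2 on big-endian hosts.
static std::size_t ip_offset_in_eip() {
    const std::uint32_t one = 1;
    unsigned char bytes[4];
    std::memcpy(bytes, &one, sizeof one);
    return bytes[0] == 1 ? 0 : 2;
}

// Hypothetical model of storing a near-return target.
// big_op == true : overwrite all 32 bits of eip.
// big_op == false: overwrite only the 16-bit ip half, leaving the rest alone.
static void store_return_target(std::uint32_t& eip, std::uint32_t value, bool big_op) {
    if (big_op) {
        eip = value;
    } else {
        const std::uint16_t ip = static_cast<std::uint16_t>(value);
        unsigned char* base = reinterpret_cast<unsigned char*>(&eip);
        // Writing 4 bytes at this address would run 2 bytes past eip on a
        // big-endian host, so write exactly 2.
        std::memcpy(base + ip_offset_in_eip(), &ip, sizeof ip);
    }
}
```

Choosing the address and the width together, as in the expression quoted above, keeps the 16-bit case to a single narrow store.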

The "if (bytes) gen_add_direct_word(&reg_esp,bytes,true);" statement is a bit of a worry because it assumes the stack address size is always 32 bits but I guess it is not likely for sp to overflow. Could maybe be fixed with a branch based on cpu.stack.big, but if the new sp value needs to be fixed there's a good chance an SS exception should be taken too.
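
For illustration, a stack pointer adjustment that respects the stack address size could look roughly like this when written as plain C++ rather than generated code. The function is hypothetical; only the idea of switching on a 16-bit versus 32-bit stack comes from the paragraph above, and it deliberately ignores the question of whether an exception should be raised.

```cpp
#include <cstdint>

// Hypothetical helper: add `bytes` to the stack pointer.
// stack_big == true : 32-bit stack, plain addition on esp.
// stack_big == false: 16-bit stack, only sp (the low 16 bits) changes and
//                     wraps around, the upper 16 bits are preserved.
static std::uint32_t add_to_stack_pointer(std::uint32_t esp,
                                          std::uint32_t bytes,
                                          bool stack_big) {
    if (stack_big) return esp + bytes;
    const std::uint32_t mask = 0xFFFFu;
    return (esp & ~mask) | ((esp + bytes) & mask);
}
```

The two branches only differ when sp crosses 0xFFFF, which is the overflow case described as unlikely.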

Re: PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-3-12 @ 07:43

Related: Here is a patch that fixes drive_fat.cpp to work on big-endian systems (including fixing a bug that allocates one too many clusters when a file's length is a multiple of the cluster size) and makes use of gcc's bswap builtins (which for PowerPC translate to lwbrx/stwbrx) for host memory access.
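
Two small sketches of the ideas mentioned here, with hypothetical names and not copied from drive_fat.cpp. The first shows a cluster count that does not over-allocate when the length is an exact multiple of the cluster size, next to one formula that would show the described symptom (the actual buggy code may have looked different). The second shows a little-endian load that uses GCC's __builtin_bswap32 on big-endian hosts.

```cpp
#include <cstdint>
#include <cstring>

// Number of clusters needed to hold file_len bytes (rounds up).
static std::uint32_t clusters_needed(std::uint32_t file_len, std::uint32_t cluster_size) {
    return file_len / cluster_size + (file_len % cluster_size != 0 ? 1 : 0);
}

// For comparison only: this formula allocates one cluster too many whenever
// file_len is an exact multiple of cluster_size.
static std::uint32_t clusters_off_by_one(std::uint32_t file_len, std::uint32_t cluster_size) {
    return file_len / cluster_size + 1;
}

// Load a little-endian 32-bit value from memory. On big-endian GCC targets
// the native load is followed by a byte swap; on little-endian hosts the
// native load is already correct.
static std::uint32_t load_le32(const void* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
    v = __builtin_bswap32(v);
#endif
    return v;
}
```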
Attachments
drive_fat_BE.diff
(8.14 KiB) Downloaded 31 times

Re: PowerPC Dynamic Recompiler (patch)

Postby fr500 » 2019-4-18 @ 03:57

I don't have a PPC Mac so I wasn't able to test this on such a system.
I was able to test on WiiU via RetroArch (I have a DOSBox fork that sticks as closely as possible to upstream here https://github.com/fr500/dosbox-svn/tree/ppc)

Sadly it crashes and it's well beyond what I could possibly solve.
Image

If there is anything I can do to contribute towards a fix I'd be willing to!
OFC I don't expect anyone to put serious time on this considering the niche status of the WiiU
fr500
Newbie
 
Posts: 16
Joined: 2018-7-10 @ 22:49

Re: PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-4-18 @ 04:23

I would need to double check the reported SR1/DSI values, but at first glance it looks like the memory that is malloc'd to hold the dynamic code isn't marked as executable. Does RetroArch have any other cores that use a dynarec on Wii U, whose authors might know how to change memory protection?

Re: PowerPC Dynamic Recompiler (patch)

Postby fr500 » 2019-4-18 @ 13:55

Sadly no, WiiU isn't all that popular actually.

One of the toolchain developers said this:

Doesn't look like an easy one - likely some bad pointer math, or they're relying on some mprotect-ish function that's not really a thing on WiiU. dynarecs have gotta have at least a little bit of WiiU-specific code to make 'em work, be it through the usual OSCodegen methods or something a lil' more kernel-ly - you've gotta take the generated code and mark it as executable before you can run it.

Taking a guess based on the stuff on git w/o looking at a binary, my money's on this pointer being uninitialized, which isn't great because we try and jump to it here. Should've been initialized here, can't trace it much further than that


The code locations he's referring to are:
https://github.com/fr500/dosbox-svn/blo ... c.cpp#L142
https://github.com/fr500/dosbox-svn/blo ... c.cpp#L254
https://github.com/libretro/dosbox-libr ... che.h#L636

Re: PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-4-19 @ 02:08

It's not uninitialised, it's just not executable. You need to use the wiiu equivalent of mmap/mprotect to make it so, not sure if that's what OSCodegen does or if it's more complicated e.g. having to switch memory back and forth between writable or executable.
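
For reference, this is roughly what the mmap/mprotect approach looks like on a POSIX system. It is only a sketch with hypothetical helper names, and it says nothing about how the Wii U does it: OSCodegen or whatever the console requires is not shown here.

```cpp
#include <cstddef>
#include <sys/mman.h>

// Allocate a readable and writable buffer for generated code.
// Returns nullptr on failure.
static void* alloc_code_buffer(std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Switch the buffer to read + execute once code has been written into it.
static bool set_executable(void* p, std::size_t size) {
    return mprotect(p, size, PROT_READ | PROT_EXEC) == 0;
}

// Switch the buffer back to read + write before modifying the code again.
static bool set_writable(void* p, std::size_t size) {
    return mprotect(p, size, PROT_READ | PROT_WRITE) == 0;
}

// Release the buffer.
static void free_code_buffer(void* p, std::size_t size) {
    munmap(p, size);
}
```

Flipping between writable and executable like this is the "back and forth" variant; some systems also allow a single mapping that is writable and executable at the same time.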

Re: PowerPC Dynamic Recompiler (patch)

Postby Dominus » 2019-6-26 @ 18:01

Initial test, and I can't even compile PPC with a dynrec core. It seems I need some magic so that the OS X PPC build is seen as PPC.
Just adding a #define PowerPC and #define C_DYNREC in config.h was not enough, as it probably pulled in code for OS X.
Dominus
DOSBox Moderator
 
Posts: 7978
Joined: 2002-10-03 @ 09:54
Location: Ludwigsburg

Re: PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-6-26 @ 19:02

These are the lines to put in config.h (if they're not there already or set to different values):
#define C_TARGETCPU POWERPC
#define C_DYNREC 1
#define WORDS_BIGENDIAN

Re: PowerPC Dynamic Recompiler (patch)

Postby Dominus » 2019-6-26 @ 19:20

That helped somewhat but ran into
In file included from core_dynrec.cpp:155:
core_dynrec/risc_ppc.h:489:41: error: invalid suffix "b10100" on integer constant
core_dynrec/risc_ppc.h:536:31: error: invalid suffix "b10100" on integer constant
core_dynrec/risc_ppc.h:548:26: error: invalid suffix "b01100" on integer constant
core_dynrec/risc_ppc.h:561:26: error: invalid suffix "b00100" on integer constant
core_dynrec/risc_ppc.h:588:26: error: invalid suffix "b00100" on integer constant
core_dynrec/risc_ppc.h:597:26: error: invalid suffix "b00100" on integer constant
core_dynrec/risc_ppc.h:646:31: error: invalid suffix "b10100" on integer constant
core_dynrec/risc_ppc.h:654:30: error: invalid suffix "b10100" on integer constant
core_dynrec/risc_ppc.h:660:31: error: invalid suffix "b10100" on integer constant
core_dynrec/risc_ppc.h:676:31: error: invalid suffix "b10100" on integer constant

Re: PowerPC Dynamic Recompiler (patch)

Postby krcroft » 2019-9-25 @ 04:52

Dominus,

What kind of PPC machine and OS are you using?

I'd like to chip in some help here too; I have a Power Mac G4 'Sawtooth' collecting dust that I previously ran Gentoo on back in the day.

I'd start fresh with whatever OS jmarsh, QBix, or yourself feel is the best target (I realize on Linux it doesn't matter much.. just the kernel, user-space libraries, and build suite; but I might as well start off on the right foot).

I also figured I could be orthogonal to whatever you're using, to maximize our test coverage.
krcroft
Member
 
Posts: 330
Joined: 2017-4-29 @ 15:07
Location: Ogden's Retreat

Re: PowerPC Dynamic Recompiler (patch)

Postby Dominus » 2019-9-25 @ 05:07

I'm cross compiling for OS X PPC on an OS X 10.14 machine (yes, that works if you have all the tools from way back :))

Re: PowerPC Dynamic Recompiler (patch)

Postby jmarsh » 2019-9-25 @ 05:55

The "b" (binary) literals can be replaced with their decimal values if anyone else is stuck using a compiler as ancient as Apple's gcc.
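
For anyone doing that replacement by hand, the three literals that appear in the error output earlier in the thread work out as follows. The constant names are made up for this example, and the small parser is only there as a way to double check other literals before substituting them.

```cpp
// The three binary literals from the compiler errors, written in decimal.
static const unsigned kBits10100 = 20; // was 0b10100
static const unsigned kBits01100 = 12; // was 0b01100
static const unsigned kBits00100 = 4;  // was 0b00100

// Hypothetical checker: convert a string of '0'/'1' digits to its value,
// stopping at the first character that is not a binary digit.
static unsigned parse_binary(const char* digits) {
    unsigned value = 0;
    for (; *digits == '0' || *digits == '1'; ++digits)
        value = (value << 1) | static_cast<unsigned>(*digits - '0');
    return value;
}
```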

Re: PowerPC Dynamic Recompiler (patch)

Postby krcroft » 2019-9-25 @ 14:57

Dominus wrote:I'm cross compiling for OS X PPC on a OS X 10.14 (yes that works if you have all the tools from way back :))

Ahh, right on Dominus! With you covering OSX, I'll go with Debian to give coverage for those trying to use a modern Linux.

Edit: I might be sticking with Gentoo; Debian dropped PPC support in Stretch, while Jessie only supports up to kernel 2.6.x. CentOS dropped PPC support in 8 and newer, while 7 only supports kernel 3.10. ArchLinux gave up on PPC around 2013. OpenSuse dropped PPC in 12, while 11.3 only supports kernel 3.x. YellowDogLinux's latest release was from 2012 (hello seven years' worth of security vulnerabilities), so that's not an option.

Gentoo's PPC port is active and as of today supports kernel 4.19, and it looks like kernel coverage will keep advancing too. I'm not a fan of how slow emerge and portage were back in the day, but hopefully they've improved things.
Last edited by krcroft on 2019-9-25 @ 15:42, edited 4 times in total.
