VOGONS


PowerPC Dynamic Recompiler (patch)

Reply 60 of 137, by kas1e

Rank Newbie

@jmarsh
Collected some more info. I have two PPC-based machines running AmigaOS4. One is a Pegasos2 with a "true" 32-bit PPC (I assume these were used in some old PPC Macs too): a Motorola MPC 7447/7457 running at 1 GHz. The other is a fairly recent machine, an AmigaOne X5000 with a P5020 CPU running at 2.0 GHz. I don't fully understand at the moment whether it is a full PowerPC or Power plus emulated calls to make it "PowerPC"; it is at least based on https://en.wikipedia.org/wiki/PowerPC_e5500 , but I'm not sure if it is really PowerPC.

Anyway, I ran a compiled binary with your PPC JIT code on both machines, and it crashes the same way on both. So it's at least certainly not a PowerPC vs Power difference, but something general about how all this is handled on AmigaOS4 itself.

Here are the crash logs with all the info about registers and so on:

Pegasos2 (the one with the Motorola MPC 7447/7457): http://kas1e.mikendezign.com/aos4/dosbox/jit/ … og_pegasos2.txt
X5000 (the one with Freescale's P5020): http://kas1e.mikendezign.com/aos4/dosbox/jit/ … shlog_x5000.txt

As far as I can see, the contents of the special registers differ between the two logs. Only the MSR register, while different overall, ends with the same four hex digits in both cases: F030. Not sure if that points us to anything, though 😀

I also checked whether I get into the "#if (C_HAVE_MPROTECT)" block in cache.h at all, and no, I don't. I.e. the C_HAVE_MPROTECT parts are skipped for me.

EDIT: In case it matters, here is readelf's output (to see the sections, their flags, etc.): http://kas1e.mikendezign.com/aos4/dosbox/jit/readelf.txt
EDIT2: And yes, we don't have mprotect() or sys/mman.h at all here.

Reply 61 of 137, by jmarsh

Rank Oldbie

It definitely sounds like the memory allocated for the dynamic code is not being marked as executable.
It also sounds like AmigaOS has its own functions for specialised memory allocation: https://wiki.amigaos.net/wiki/Exec_Memory_Allocation

You'd need to use this instead of malloc to allocate the memory for cache_code_start_ptr.

Reply 62 of 137, by kas1e

OK, thanks, I'll ask our developers about it.

If things go the easy route, then probably all we need is in cpu/core_dynrec/cache.h: in cache_init(), add an AmigaOS4 ifdef so that cache_code_start_ptr = _not_malloc_but_something_for_memory_execute(CACHE_TOTAL+CACHE_MAXSIZE+PAGESIZE_TEMP-1+PAGESIZE_TEMP);

Reply 63 of 137, by jmarsh

From looking at that page, I think it would be something like this:

cache_code_start_ptr = (Bit8u*)IExec->AllocVec(CACHE_TOTAL+CACHE_MAXSIZE+PAGESIZE_TEMP-1+PAGESIZE_TEMP, MEMF_EXECUTABLE);

Or if it complains that AllocVec is deprecated/obsolete:

cache_code_start_ptr = (Bit8u*)IExec->AllocVecTags(CACHE_TOTAL+CACHE_MAXSIZE+PAGESIZE_TEMP-1+PAGESIZE_TEMP, AVT_Type, MEMF_EXECUTABLE, TAG_END);

And you'll probably need to include this header at the top of the file, unless it's already done by default:

#include <proto/exec.h>
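
As a rough sketch of how the AmigaOS4 branch could sit next to the existing malloc() path, including the page alignment that (as far as I can tell) is the reason the size expression carries the extra PAGESIZE_TEMP-1. The helper names alloc_code_cache_raw and align_to_page are made up for illustration, the constant values are placeholders rather than the ones from cache.h, and the __amigaos4__ compiler macro is an assumption about the toolchain:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

#if defined(__amigaos4__)   // assumed predefined macro of the AmigaOS4 toolchain
#include <proto/exec.h>
#endif

// Placeholder values for illustration only; the real ones live in cache.h.
static const std::size_t CACHE_TOTAL   = 1024 * 1024 * 8;
static const std::size_t CACHE_MAXSIZE = 4096 * 2;
static const std::size_t PAGESIZE_TEMP = 4096;

// Hypothetical helper: returns a raw buffer big enough for the code cache
// plus the slack needed to align it to a page boundary afterwards.
static std::uint8_t* alloc_code_cache_raw() {
    const std::size_t size =
        CACHE_TOTAL + CACHE_MAXSIZE + PAGESIZE_TEMP - 1 + PAGESIZE_TEMP;
#if defined(__amigaos4__)
    // Ask Exec for memory that is allowed to hold executable code.
    return static_cast<std::uint8_t*>(
        IExec->AllocVecTags(size, AVT_Type, MEMF_EXECUTABLE, TAG_END));
#else
    // Any other host: plain malloc(), as in the unpatched code path.
    return static_cast<std::uint8_t*>(std::malloc(size));
#endif
}

// Hypothetical helper: rounds a pointer up to the next PAGESIZE_TEMP boundary.
static std::uint8_t* align_to_page(std::uint8_t* p) {
    std::uintptr_t v = reinterpret_cast<std::uintptr_t>(p);
    v = (v + PAGESIZE_TEMP - 1) & ~static_cast<std::uintptr_t>(PAGESIZE_TEMP - 1);
    return reinterpret_cast<std::uint8_t*>(v);
}
```

In cache_init() the result of the first helper would be stored in cache_code_start_ptr and the aligned pointer used as the start of the cache; on any non-Amiga host the sketch simply falls back to malloc().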

Reply 64 of 137, by kas1e

Yes, you were right of course. This:

cache_code_start_ptr=(Bit8u*)AllocVecTags(CACHE_TOTAL+CACHE_MAXSIZE+PAGESIZE_TEMP-1+PAGESIZE_TEMP, AVT_Type, MEMF_EXECUTABLE, TAG_DONE);

plus the include of proto/exec.h makes it compile and run with core:dynamic without a crash!

Now I'm trying to find some tests to see if it works and speeds things up. Probably the first one should be PCPBench?

EDIT: Found PCPBench. Do I just need to run it as is and then hit ESC after it reaches 100%? No other keys are needed?

Reply 66 of 137, by kas1e

It seems that when I set a fixed number of CPU cycles, absolutely nothing changes. I.e. if I set, for example, 5000 cycles, the PCPBench numbers are just the same whether I use the normal or the dynamic core.

But if I set cycles to "max", then there is definitely a difference. On my 2 GHz CPU I get 4.2 fps with cycles=max, core=normal and 15.2 fps with cycles=max, core=dynamic! So, about a 3.5-4 times difference.

However, with the dynamic core in use, when I hit ESC to exit PCPBench, it always freezes the whole OS. I assume that's because somewhere in cache.h there is a free() for that malloc(), and it now tries to free() the memory from the new AllocVecTags, while it should be FreeVecTags... but that's just a wild guess.
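
If the freeze on exit does come from a mismatched release, the usual fix is to keep allocation and release in one place so they cannot drift apart. A minimal sketch, assuming FreeVec() is the Exec counterpart to AllocVecTags() (worth confirming against the Exec documentation linked above) and using a hypothetical CodeCacheBuffer wrapper that is not part of DOSBox:

```cpp
#include <cstddef>
#include <cstdlib>

#if defined(__amigaos4__)   // assumed predefined macro of the AmigaOS4 toolchain
#include <proto/exec.h>
#endif

// Hypothetical wrapper: owns the code-cache buffer and always releases it
// with the call that matches the way it was allocated.
class CodeCacheBuffer {
public:
    explicit CodeCacheBuffer(std::size_t size) : ptr_(allocate(size)) {}
    ~CodeCacheBuffer() { release(ptr_); }

    // Non-copyable, so the buffer cannot be released twice.
    CodeCacheBuffer(const CodeCacheBuffer&) = delete;
    CodeCacheBuffer& operator=(const CodeCacheBuffer&) = delete;

    void* get() const { return ptr_; }

private:
    static void* allocate(std::size_t size) {
#if defined(__amigaos4__)
        return IExec->AllocVecTags(size, AVT_Type, MEMF_EXECUTABLE, TAG_END);
#else
        return std::malloc(size);
#endif
    }

    static void release(void* p) {
        if (p == nullptr) return;
#if defined(__amigaos4__)
        IExec->FreeVec(p);   // assumption: FreeVec() pairs with AllocVecTags()
#else
        std::free(p);
#endif
    }

    void* ptr_;
};
```

The same idea works without a class: whatever function in cache.h currently calls free() on cache_code_start_ptr would need the same ifdef as the allocation.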

Reply 68 of 137, by kas1e

Yeah, it seems that DOSBox freezes for no particular reason. For example, right now it froze even with the normal core when I just ran it another time and typed "mount c c".
Another time it froze on exit from PCPBench too, even with core:normal. So there is indeed something wrong somewhere in DOSBox itself...

Btw, in terms of other speed-ups for big-endian: I'm not sure if it will help, but please check this archive: http://os4depot.net/share/emulation/computer/dosbox.lha

Inside there is dosbox.diff, where the author did the following 7 or 8 years ago:

So the changes I applied are :
- src/cpu/core_normal.cpp : Activated C_CORE_INLINE, what increases the required
memory to compile, it only worked with 512 MB and the OS 4.1 paging system !
- include/paging.h : Used instructions to read half-words and words with
reversed bytes (PPC rulez !). Also changed a structure reorganizing TLB fields
in order to decrease the pressure on the cache.
- include/paging.h (again !) : Tried to improve the memory access through TLB.
- src/cpu/modrm.cpp and other files : I reduced the size of arrays to decrease
the pressure on the cache.
- include/render.h : Grouped some structure fields to improve alignment and
cache efficiency.
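
For readers unfamiliar with the "reversed bytes" reads mentioned in that list: the guest x86 memory is little-endian, so a big-endian host has to swap bytes on every 16- or 32-bit access. PowerPC has byte-reversed load instructions (lhbrx, lwbrx) for this; presumably that is what the diff uses. The portable sketch below shows the same operation in plain C++, with hypothetical helper names read_le16 and read_le32:

```cpp
#include <cstdint>

// Read a 16-bit little-endian value from guest memory.
// Works the same on little- and big-endian hosts because it
// assembles the value byte by byte instead of casting the pointer.
inline std::uint16_t read_le16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Read a 32-bit little-endian value from guest memory.
inline std::uint32_t read_le32(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

For the bytes 78 56 34 12 in memory, read_le32 gives 0x12345678 on any host. A dedicated byte-reversed load does this in a single instruction, which is where the speed-up in the diff would come from.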

I'm not sure, though, how those changes can (or can't) be applied here, but they are certainly general ones and not specific to the dynamic core.

Another thing that has always bothered me with previous big-endian DOSBox versions: the diskmag "Hugi" has wrong colors, which looks very much like a little- vs big-endian problem. On the other hand, it may not be DOSBox at all but the diskmag's own code. Have you ever tried it?
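
To illustrate why an endian mix-up tends to show up as wrong colors rather than a crash, here is a small sketch using a 16-bit RGB565 pixel. It is only an illustration of the general effect, not a diagnosis of the Hugi problem; whether Hugi uses this pixel format at all is an assumption, and all function names are made up:

```cpp
#include <cstdint>

struct Rgb { unsigned r, g, b; };

// Pack 8-bit channels into a 16-bit RGB565 pixel (5 bits red, 6 green, 5 blue).
inline std::uint16_t pack_rgb565(unsigned r, unsigned g, unsigned b) {
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Expand an RGB565 pixel back to (approximate) 8-bit channels.
inline Rgb unpack_rgb565(std::uint16_t p) {
    return Rgb{ ((p >> 11) & 0x1Fu) << 3,
                ((p >> 5)  & 0x3Fu) << 2,
                 (p        & 0x1Fu) << 3 };
}

// Swap the two bytes of a 16-bit value, which is what happens when a pixel
// written in one byte order is read back in the other.
inline std::uint16_t swap16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}
```

With these helpers, pure red (255, 0, 0) packs to 0xF800; after a byte swap it unpacks as (0, 28, 192), i.e. mostly blue. The picture stays recognizable, only the colors are off.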

Reply 69 of 137, by kas1e

I just applied the SDL2 patch to DOSBox (from Re: An adaptation to SDL 2.0 (Alpha-level Android build attached) ) and so far all the freezes are gone.

So now I tried Mortal Kombat 3 with and without the JIT: with your JIT it is definitely MUCH better and more fluid. Visually, with core:normal I get, let's say, 25-30 fps, but with core:dynamic I definitely get more than 60. And that's with scan3x.

Btw, I read that it is possible to install Win95 in DOSBox, so I will try that next with your PPC JIT. It will probably be usable now!

Reply 70 of 137, by kas1e

@jmarsh
May I ask if you can share your other big-endian speed improvements that aren't related to the dynarec? In the first post you say that "There are some other big-endian improvements that can be made that get it up to 4.0 but I haven't included them here as they aren't related to dynrec." It would be very interesting to try applying them and test how much faster things get on AmigaOS4. I can also donate a bit, if you are interested of course 😀

Reply 72 of 137, by kas1e

Do you mean the fat_driver endian patch? Yes, I applied that one. But should it make any difference to FPS in DOSBox as a whole?

Btw, I also noticed that I never reach 100% CPU load, neither with normal nor with dynamic. Maybe it is somehow related to "pressure on the cache"?

For example, in the Doom timedemo with dynamic, I never get above 50% CPU load.

Reply 73 of 137, by kas1e

@jmarsh
You may be interested: I made a short video showing how, with your PPC dynamic core, I run some "greedy" stuff on my machine. I had to use 1920x1080 to capture everything properly, so I also used big scalers (like normal4x), which eat CPU; in reality, when the full-screen modes are "original", things are even faster.

Here it is: https://youtu.be/WhiAq_giV3Y

I also found another game that has the same color issues on PPC as the Hugi diskmag: 11th Hour. It runs fine, but all the colors are swapped, or the palette is just wrong, I don't know. See:

hugi: http://kas1e.mikendezign.com/aos4/dosbox/dosb … colors_hugi.jpg
11th hour: http://kas1e.mikendezign.com/aos4/dosbox/dosb … lors_11hour.jpg

Reply 75 of 137, by kas1e

@jmarsh
Not sure if you are interested in this as well, but I have a working Win95 image that runs fine with core:normal; once I set core:dynamic, it fails to boot (just a black screen after the Windows 95 loading screen disappears). I tested the same image on win32, and on x86 it works with both the normal and dynamic cores. I know that Win95 is of little interest here, as it is not officially supported by DOSBox, but maybe it points to some bug in the PPC dynamic recompiler?

Reply 76 of 137, by kas1e

@jmarsh
Also, Tomb Raider fails to run with the dynamic core on PPC. It just shows a black screen (while with normal it shows the animation), and then when I hit ESC (when there should be a menu), with the dynamic core I get "ERROR: Could not allocate enough memory" and it exits. With the normal core all is fine in that regard, even if I set just 16 MB in the config.

With some luck it does sometimes run with the dynamic core (with a trashed-looking intro), but not very often: about one in 5-10 runs. The game itself is then all fine, with good speed.

Last edited by kas1e on 2020-01-22, 20:01. Edited 1 time in total.

Reply 77 of 137, by kas1e

@jmarsh
About those color issues: Tomb Raider has them in the intro too: http://kas1e.mikendezign.com/aos4/dosbox/dosb … _tombraider.jpg

It's not that this bug happens everywhere, of course; so far just in 3 programs: the Hugi diskmag and 2 games, 11th Hour and now the Tomb Raider intro. You could, for example, try running just the Hugi diskmag to see whether you have these issues:
https://files.scene.org/get:nl-ftp/mags/hugi/hugi17.zip , just 3 MB, nothing to install, only unpack and run "hugi17d.exe".

I just need to be 100% sure that the colors are shown correctly on your Mac PPC machine, so we can then start SDL2 testing. And this time it's of course not related to the dynamic core; it's in the PPC build as a whole.

Last edited by kas1e on 2020-01-23, 08:09. Edited 1 time in total.

Reply 78 of 137, by kas1e

@jmarsh
Another report: with the dynamic core, I can't run "setup.exe" of the game Screamer 2. Installing from CD is fine, but then running "setup.exe" from the installed game fails with the dynamic core, while it works OK with the normal core. Running setup.exe with the normal core and then switching to the dynamic core when starting the game works. Sadly, though, it's the 4th program that has distorted colors on my PPC build:

http://kas1e.mikendezign.com/aos4/dosbox/dosb … s_screamer2.jpg
http://kas1e.mikendezign.com/aos4/dosbox/dosb … er2_in_game.jpg

I wrote to our SDL2 maintainer; it's the first time he has seen this kind of issue, and he asks for exactly Hugi to be tested on at least one other PPC setup, to be 100% sure it's not a DOSBox endian issue.

Last edited by kas1e on 2020-01-23, 08:05. Edited 1 time in total.

Reply 79 of 137, by Dominus

Rank DOSBox Moderator

With SDL 1.2.x there is a color bug that shows up on OS X at higher color depths. Maybe it's the same or similar.
I haven't tested whether the bug is still present in the SDL2 build I do for OS X. Will need to check.