VOGONS


First post, by roozbeh

Rank Newbie

Hi,
I am working on a port to smartphone (Windows CE) devices.
Everything is working, but it is very slow!
I am using core_normal.
By the way, what are the differences between core_full and core_simple?
Can I gain any speed by using core_simple?

Also, about core_normal: I was thinking that turning that big switch/case into a jump lookup table might give me some speed. For example, I could convert all those cases into functions and look up their addresses in a table.
Do you think that would give me any speedup?

Or what else can I do?
Currently the game seems to work best with cycles=3000,
and it is very slow!

regards
roozbeh

Reply 1 of 9, by ASM

Rank Newbie

Also, about core_normal: I was thinking that turning that big switch/case into a jump lookup table might give me some speed. For example, I could convert all those cases into functions and look up their addresses in a table.
Do you think that would give me any speedup?

C/C++ compilers typically compile switch statements into jumps through lookup tables (even without optimizations turned on), so there is usually no need to do this manually.

If you want more speed, you should consider writing a dynamic recompilation core.
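
For reference, the two dispatch styles being compared can be sketched roughly as below. This is only a toy illustration: `ToyCpu`, the opcodes, and the handler names are all made up and have nothing to do with DOSBox's real core_normal code. Whether the switch version ends up as a jump table depends on the compiler and target, so checking the generated assembly is still the way to find out.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical toy machine state; not DOSBox's real CPU structures.
struct ToyCpu {
    std::int32_t acc = 0;
};

// Made-up opcodes, for illustration only.
enum ToyOp : std::uint8_t { OP_INC = 0, OP_DEC = 1, OP_DOUBLE = 2, OP_COUNT = 3 };

// Variant 1: one big switch. The compiler decides whether this becomes a
// jump table, a chain of compares and branches, or something else.
inline void run_switch(ToyCpu& cpu, const std::vector<std::uint8_t>& code) {
    for (std::uint8_t op : code) {
        switch (op) {
            case OP_INC:    cpu.acc += 1; break;
            case OP_DEC:    cpu.acc -= 1; break;
            case OP_DOUBLE: cpu.acc *= 2; break;
            default:        break;  // unknown opcodes are ignored in this sketch
        }
    }
}

// Variant 2: one function per opcode, looked up in a hand-written table.
using Handler = void (*)(ToyCpu&);
inline void do_inc(ToyCpu& cpu)    { cpu.acc += 1; }
inline void do_dec(ToyCpu& cpu)    { cpu.acc -= 1; }
inline void do_double(ToyCpu& cpu) { cpu.acc *= 2; }

inline void run_table(ToyCpu& cpu, const std::vector<std::uint8_t>& code) {
    static const std::array<Handler, OP_COUNT> table = { do_inc, do_dec, do_double };
    for (std::uint8_t op : code) {
        if (op < OP_COUNT) table[op](cpu);  // same "ignore unknown" rule as above
    }
}
```

Both variants compute the same result. Note that the table version pays for an indirect call per instruction and keeps the compiler from inlining the handlers, so it is not automatically faster than the switch.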

Reply 2 of 9, by roozbeh

Rank Newbie

Yes, thanks for the hint.
I was wondering why the x86 code was so small!
But I am compiling for ARM using Embedded Visual C++, and when I looked at its assembly output I saw a very long series of jumps,
so I think it doesn't turn the switch into a jump table.

Would converting it into a jump table give a big speedup?

regards

Reply 3 of 9, by `Moe`

Rank Oldbie

Typical PDAs run an ARM CPU at around 400 MHz. If it had an x86 CPU (it doesn't), that would be enough to play most non-action-shooter games using the dynamic core. Since the dynamic core is not available for ARM CPUs, the observed 3000 cycles with the normal core (simple should indeed be a little bit faster) are to be expected. There's nothing you can fundamentally change about this unless you write a dynamic core for ARM.

The question of jump tables or not is usually best answered by your compiler: it knows best which variant is better.

Reply 4 of 9, by roozbeh

Rank Newbie

Yes,
it runs fairly well on PDAs at 400 MHz, but my target is a smartphone running at around 100 MHz!

What is the idea behind the dynamic core?
Is it possible to write one for ARM CPUs?

Reply 5 of 9, by `Moe`

Rank Oldbie

It should be. Since I now own an iPAQ, that's something that fascinates me too, but it's not easy at all. I have no idea how long such a task would take; I expect many weeks of coding and testing.
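
As for the idea behind a dynamic core: instead of decoding every guest instruction each time it runs, a block of guest code is translated once, cached, and reused on later visits. The sketch below only shows that translate-once, run-many structure. It is a simplification under loud assumptions: a real dynamic core emits actual host machine code (ARM instructions in this case), while this toy caches a list of host function pointers. `GuestCpu`, `GuestOp`, and `BlockCache` are hypothetical names, not DOSBox types.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical guest state and opcodes, for illustration only.
struct GuestCpu {
    std::int32_t acc = 0;
    std::size_t pc = 0;
};
enum GuestOp : std::uint8_t { G_INC = 0, G_DOUBLE = 1, G_END = 2 };

// Stand-in for generated host code: one host routine per guest operation.
using HostStep = void (*)(GuestCpu&);
inline void host_inc(GuestCpu& c)    { c.acc += 1; }
inline void host_double(GuestCpu& c) { c.acc *= 2; }

struct TranslatedBlock {
    std::vector<HostStep> steps;  // "compiled" form of the guest block
    std::size_t next_pc = 0;      // guest address to continue at afterwards
};

class BlockCache {
public:
    explicit BlockCache(const std::vector<std::uint8_t>& mem) : mem_(mem) {}

    // Run the block starting at cpu.pc, translating it first if it is new.
    void run_block(GuestCpu& cpu) {
        auto it = cache_.find(cpu.pc);
        if (it == cache_.end()) {
            it = cache_.emplace(cpu.pc, translate(cpu.pc)).first;
            ++translations_;
        }
        for (HostStep step : it->second.steps) step(cpu);  // no decoding here
        cpu.pc = it->second.next_pc;
    }

    std::size_t translations() const { return translations_; }

private:
    // Decode guest opcodes up to G_END (or end of memory) exactly once.
    TranslatedBlock translate(std::size_t pc) const {
        TranslatedBlock block;
        while (pc < mem_.size() && mem_[pc] != G_END) {
            if (mem_[pc] == G_INC)         block.steps.push_back(host_inc);
            else if (mem_[pc] == G_DOUBLE) block.steps.push_back(host_double);
            ++pc;  // unknown opcodes are skipped in this sketch
        }
        block.next_pc = (pc < mem_.size()) ? pc + 1 : pc;  // step past G_END
        return block;
    }

    std::vector<std::uint8_t> mem_;  // private copy of guest memory
    std::unordered_map<std::size_t, TranslatedBlock> cache_;
    std::size_t translations_ = 0;
};
```

Running the same block repeatedly translates it only once. The hard parts of a real implementation are left out entirely here: generating correct host code, handling self-modifying code, and emulating flags.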

Reply 6 of 9, by Qbix

Rank DOSBox Author

It depends on how well you can write PPC asm 😀


Reply 7 of 9, by `Moe`

Rank Oldbie

ARM, not PPC, and "not at all yet" - I usually pick these things up as I go. I didn't know anything about OpenGL either, when I started the hq scaler 😀

Reply 8 of 9, by roozbeh

Rank Newbie

Thanks, guys, for the hints so far!
Now cycles=6000 seems to work too!
I am now using core_simple plus some minor optimizations.

Anyway, how can I tell whether my changes really make things faster?
So far I only judge it by raising the cycles value and seeing whether programs speed up or not. Is there a better way?

Reply 9 of 9, by `Moe`

Rank Oldbie

The best indicator is "sound continuity". Play something with sound and check that sound is playing fine. If you experience dropouts/stuttering, your cycles are too high.
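
As a complement to listening for dropouts, a change can also be timed directly: run a fixed amount of emulated work before and after the change and compare the rate. The harness below is a minimal sketch under stated assumptions. `fake_core_run` is a made-up stand-in for "execute N emulated instructions" and would be replaced by a call into the real core, and `std::chrono` assumes a C++11 toolchain (on an older Windows CE compiler, a platform timer such as `GetTickCount` would have to take its place).

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the emulator core: does some cheap work per
// "instruction" and returns a checksum so the loop cannot be optimized away.
inline std::uint32_t fake_core_run(std::uint64_t instructions) {
    std::uint32_t state = 1;
    for (std::uint64_t i = 0; i < instructions; ++i) {
        state = state * 1664525u + 1013904223u;
    }
    return state;
}

struct BenchResult {
    double seconds;                  // wall-clock time for the whole run
    double instructions_per_second;  // 0 if the run was too short to measure
    std::uint32_t checksum;          // value returned by the run function
};

// Time one call of run(instructions) with a monotonic clock.
template <typename RunFn>
BenchResult benchmark_core(RunFn run, std::uint64_t instructions) {
    const auto start = std::chrono::steady_clock::now();
    const std::uint32_t checksum = run(instructions);
    const auto stop = std::chrono::steady_clock::now();

    BenchResult result;
    result.seconds = std::chrono::duration<double>(stop - start).count();
    result.instructions_per_second =
        result.seconds > 0.0 ? static_cast<double>(instructions) / result.seconds : 0.0;
    result.checksum = checksum;
    return result;
}
```

Since the DOSBox configuration describes cycles as roughly the number of instructions emulated per millisecond, dividing `instructions_per_second` by 1000 gives a figure that can be compared loosely against the cycles setting. Repeating the measurement a few times and comparing the best runs helps filter out noise from the rest of the system.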