VOGONS


First post, by robertmo

Rank l33t++

Would it be possible to do it this way?

While one core is computing a result that will be either 1 or 0, two other cores could follow the code at the same time:
the second core assuming the result is 1,
and the third core assuming the result is 0.

By the time the first core gets the result, we will already have the result for the next operation.

An 8-core CPU would be 3x faster than a 1-core CPU this way.
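
As a rough sketch of where the 3x figure comes from: if both outcomes of every pending 1-or-0 result are followed, level k of lookahead needs 2^k cores, so d levels need 2^d - 1 cores in total. The function name below is made up for illustration, and the model assumes that every step takes the same time and that starting, synchronizing and discarding paths costs nothing.

```python
def lookahead_depth(cores: int) -> int:
    """Steps that can be in flight at once when both outcomes of every
    pending 1-or-0 result are followed on separate cores (idealized)."""
    depth = 0
    used = 0
    # Level k needs 2**k cores: 1 for the current step, 2 for the next, 4 after that.
    while used + 2 ** depth <= cores:
        used += 2 ** depth
        depth += 1
    return depth

for n in (1, 3, 7, 8, 15):
    print(n, "cores ->", lookahead_depth(n), "x")
```

Under this model 7 cores are enough for 3x and the eighth core stays idle. Doubling the core count adds only one more step of lookahead, so the gain grows with the logarithm of the number of cores.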

Reply 1 of 10, by root42

Rank l33t

A similar thing happens on CPUs with speculative execution. The main problem is that not all code is simply calculating a result. You basically have three kinds of operations: arithmetic, reading data, and writing data. The first two can be easily parallelized. Writing data, however, has side effects and may influence computations further down the line. That is why speculative execution will eventually fail, given a long enough lookahead. The same problem would apply to your multi-core speculative execution model.

YouTube and Bonus
80486DX@33 MHz, 16 MiB RAM, Tseng ET4000 1 MiB, SnarkBarker & GUSar Lite, PC MIDI Card+X2+SC55+MT32, OSSC

Reply 3 of 10, by root42

Rank l33t

Then you lose performance. Suddenly you will be bandwidth limited, because you will be writing several times the amount of data you would normally write.
The pipelines in a normal CPU delay writing results, and can probably keep as much state as the pipeline is deep. But if you try to emulate this at the core level, you will have to write to real memory, which is much slower and much more costly.
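
A toy model of the extra writes, not taken from any real emulator: each speculative path keeps its writes in a private buffer, and only the buffer matching the real result is committed. The names `run_path`, `memory` and `buffers` are hypothetical. On separate cores those buffers would sit in memory rather than inside a pipeline, which is the cost described above.

```python
def run_path(memory: dict, assumed_result: int) -> dict:
    """Follow the code under one assumed result, buffering writes privately."""
    writes = {}  # private write buffer, invisible to the other paths
    writes["flag"] = assumed_result
    writes["counter"] = memory["counter"] + (1 if assumed_result else -1)
    return writes

memory = {"flag": 0, "counter": 10}

# Both outcomes are followed, so both sets of writes are produced.
buffers = {outcome: run_path(memory, outcome) for outcome in (0, 1)}
speculative_writes = sum(len(w) for w in buffers.values())

actual_result = 1                      # what the first core eventually computes
memory.update(buffers[actual_result])  # commit the winning path, drop the other

print("writes produced:", speculative_writes)
print("writes kept:", len(buffers[actual_result]))
print(memory)
```

Half of the writes are thrown away after one level of lookahead. With every further level the number of paths doubles, so the share of wasted writes grows accordingly.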


Reply 6 of 10, by aqrit

Rank Member

No.
1. Most instructions depend on the output of a prior instruction.
2. The real CPU is already speculating, where possible.
3. 300+ cycles are needed for synchronization between threads, every time.
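
To put point 3 in numbers, here is a back-of-the-envelope model. It assumes a fixed synchronization cost of 300 cycles (the figure from the list, taken as a constant) and compares running two dependent steps back to back with running them speculatively side by side. The function name and the cost model are illustrative assumptions, not measurements.

```python
SYNC_CYCLES = 300  # figure from the list above, assumed constant here

def speculation_gain(work_cycles: int, sync_cycles: int = SYNC_CYCLES) -> float:
    """Speedup of one speculated step versus running two steps serially.

    Serial:     step A, then step B                 -> 2 * work_cycles
    Speculated: A and both candidate Bs run side by
                side, then the cores sync once      -> work_cycles + sync_cycles
    """
    serial = 2 * work_cycles
    speculated = work_cycles + sync_cycles
    return serial / speculated

for work in (10, 100, 300, 3000):
    print(work, "cycles per step ->", round(speculation_gain(work), 2))
```

In this model a value below 1 means the speculative version is slower. The work between two synchronization points has to exceed the synchronization cost before speculation pays off at all, and a step of a few instructions is far below that.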

Reply 7 of 10, by Pickle

Rank Member

Actually, there are some cases where multithreading can help. For instance, a long time ago there was a device called the GP2X. It actually had 2 ARM cores; the second core was limited but could do some processing in parallel.
Another user created a method of running the OPL emulation on the second core, which gave the main application a boost. Keep in mind that the main ARM core, even with the dynarec we had at the time, was only capable of hundreds of cycles, so offloading the OPL emulation was more noticeable.
But running such a setup on your i7 is not likely to be noticeable.

Reply 8 of 10, by gdjacobs

Rank l33t++

I believe there are patches in the wild for multithreaded/multiprocess peripheral emulation. This is an important thing for unified builds on *Pi boards.

All hail the Great Capacitor Brand Finder

Reply 9 of 10, by cyclone3d

Rank l33t++

In theory it should not be that difficult to multi-thread DOSBox. You should be able to have one thread run the CPU, one for video, one for sound, and maybe one for input.

Maybe I will look into that once I get somewhere with optimizing the current code, which should give about a 20% decrease in needed CPU cycles, judging by my past work doing the same thing. I dropped that work and eventually lost the code, because all I got when I released the optimized code was complaints asking why I hadn't released a compiled binary.

So yeah.. it really put me off doing anything else with it for quite a few years.
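
The thread-per-subsystem split suggested above (CPU, video, sound, input) could be structured roughly like this for the CPU and sound pair. This is only a sketch of the hand-off pattern, not DOSBox code: the function names and the command format are made up, and Python is used purely for illustration. In standard CPython the two threads would not actually execute in parallel, so it shows the structure rather than any speedup.

```python
import queue
import threading

def cpu_thread(sound_q: queue.Queue, steps: int) -> None:
    """Stand-in for the emulated CPU: emits sound register writes."""
    for i in range(steps):
        sound_q.put(("write", i % 256, i))  # (command, register, value)
    sound_q.put(None)                       # sentinel: no more work

def sound_thread(sound_q: queue.Queue, rendered: list) -> None:
    """Stand-in for sound emulation: consumes the writes in order."""
    while True:
        item = sound_q.get()
        if item is None:
            break
        _, reg, value = item
        rendered.append((reg, value))

sound_q = queue.Queue(maxsize=64)  # bounded, so the CPU cannot run far ahead
rendered = []
workers = [
    threading.Thread(target=cpu_thread, args=(sound_q, 1000)),
    threading.Thread(target=sound_thread, args=(sound_q, rendered)),
]
for t in workers:
    t.start()
for t in workers:
    t.join()
print(len(rendered), "register writes handled by the sound thread")
```

The bounded queue is the only synchronization point: the CPU thread blocks when the sound thread falls behind, which keeps the two sides from drifting too far apart in emulated time.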

Yamaha modified setupds and drivers
Yamaha XG repository
YMF7x4 Guide
Aopen AW744L II SB-LINK