VOGONS


Reply 101 of 212, by root42

User metadata
Rank l33t

Now with an OpenGL-scaled window, which I think also makes the rendering a tiny bit faster:

https://youtu.be/nmTuMeVV7s4

Next up would be to render only one sample per pixel. Right now we have massive overdraw. Also thinking about how to process multiple pixels at once.

YouTube and Bonus
80486DX@33 MHz, 16 MiB RAM, Tseng ET4000 1 MiB, SnarkBarker & GUSar Lite, PC MIDI Card+X2+SC55+MT32, OSSC

Reply 102 of 212, by Deksor

User metadata
Rank l33t

With OpenGL you could also transform the image to scale it to a 4:3 aspect ratio!
I've seen that you made much more progress than I did ... My experiments were a failure.

Basically what I noticed is that some "lines" aren't always exactly the same length and I tried to modify them so each of them would "scale" to the proper length.

Also, in the program, I noticed that the "X" value could also represent time.
The electron beam takes a fixed amount of time to get from the left to the right of the screen, right? (According to http://minuszerodegrees.net/mda_cga_ega/mda_cga_ega.htm, it takes 63.7 µs to do that.)
Is it possible to store some time representation in these samples? If you know exactly when the "pixel" was recorded, you know where it's supposed to appear.
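
A minimal sketch of that idea, assuming the analyzer samples at a fixed 24 MHz and the pixel clock is 14.31818 MHz (both figures come up later in this thread). With a fixed sample rate, the sample index already is a timestamp, so no extra time field is needed. The function name `sample_to_column` is hypothetical.

```c
#include <stdint.h>

/* Assumed capture parameters; adjust them to the actual setup. */
#define SAMPLE_HZ 24000000u  /* logic analyzer sample rate */
#define PIXEL_HZ  14318180u  /* pixel clock of the video mode */

/* Map the number of samples since the start of the visible line to a
   pixel column. Sample n was taken n / SAMPLE_HZ seconds into the line,
   and the beam advances PIXEL_HZ columns per second. */
static inline uint32_t sample_to_column(uint32_t samples_since_line_start)
{
    return (uint32_t)(((uint64_t)samples_since_line_start * PIXEL_HZ)
                      / SAMPLE_HZ);
}
```

The 64-bit intermediate keeps the multiplication from overflowing for any realistic line length.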

Trying to identify old hardware ? Visit The retro web - Project's thread The Retro Web project - a stason.org/TH99 alternative

Reply 103 of 212, by matze79

User metadata
Rank l33t

How much processing power does it require?

https://www.retrokits.de - blog, retro projects, hdd clicker, diy soundcards etc
https://www.retroianer.de - german retro computer board

Reply 104 of 212, by Deksor

User metadata
Rank l33t

At the moment it keeps one core of my Core i5 at 100%.

Maybe, to run it on multiple cores, the stream could be stored in a buffer, and once VSYNC arrives the buffer gets interpreted and displayed?

By the way, I have received my 6€ logic analyzer ... but I have never messed with such a thing before ^^'

What should I do to make a capture ?


Reply 105 of 212, by Predator99

User metadata
Rank l33t
Deksor wrote on 2020-07-17, 12:53:

Basically what I noticed is that some "lines" aren't always exactly the same length and I tried to modify them so each of them would "scale" to the proper length.

Indeed, this is the reason for the remaining jitter. I hope it is fixed now; I cannot watch the YouTube videos at the moment. As you can see from the purple line, it should always be the same length, +/- 1 pixel.

Reply 106 of 212, by Deksor

User metadata
Rank l33t

Yeah, root42 seems to have fixed this. However, in the Commander Keen recording I can see lots of tearing ... Where does that come from?


Reply 107 of 212, by Predator99

User metadata
Rank l33t
Deksor wrote on 2020-07-17, 13:06:

At the moment it keeps one core of my Core i5 at 100%.

Maybe, to run it on multiple cores, the stream could be stored in a buffer, and once VSYNC arrives the buffer gets interpreted and displayed?

By the way, I have received my 6€ logic analyzer ... but I have never messed with such a thing before ^^'

What should I do to make a capture ?

Great!
First install the Saleae software to test.

Then install sigrok-cli.

Then make the connection to the EGA card according to the layout I posted above.

Reply 108 of 212, by Predator99

User metadata
Rank l33t
Deksor wrote on 2020-07-17, 13:17:

Yeah root42 seems to have fixed this. However in the Commander Keen recording, I can see lots of tearing ... Where does that come from ?

This is a known Commander Keen issue and is not related to the software or the capture.

Reply 109 of 212, by root42

User metadata
Rank l33t

Yes, it currently saturates one core more or less. Storing a whole frame is too much. Storing one line, yes, that would make sense and is how e.g. the OSSC works as well. Otherwise you also get higher lag, which will be significant anyway due to the complexity of the whole process.
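
A minimal sketch of buffering one line at a time, assuming one byte per sample with HSYNC on bit 6 and active high. Both the bit position and the polarity are assumptions (they depend on the wiring and the video mode), and `line_buffer` and `line_buffer_push` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_CAP  4096  /* assumed upper bound on samples per scanline */
#define HSYNC_BIT 0x40  /* assumed position of HSYNC in a sample byte */

typedef struct {
    uint8_t samples[LINE_CAP];
    size_t  count;   /* samples collected for the current line */
    uint8_t prev;    /* previous sample, used for edge detection */
} line_buffer;

/* Feed one sample. On a rising HSYNC edge, return the number of samples
   in the line that just ended and start a new line; the finished samples
   stay in lb->samples until the next push overwrites them. Otherwise
   store the sample and return 0. */
static size_t line_buffer_push(line_buffer *lb, uint8_t sample)
{
    int rising = (sample & HSYNC_BIT) && !(lb->prev & HSYNC_BIT);
    lb->prev = sample;
    if (rising) {
        size_t completed = lb->count;
        lb->count = 0;
        return completed;
    }
    if (lb->count < LINE_CAP)   /* drop samples of over-long lines */
        lb->samples[lb->count++] = sample;
    return 0;
}
```

The caller would process `lb->samples` as soon as a nonzero length comes back, which keeps the added latency at roughly one scanline.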

My plan is to process multiple pixels at once, but I am not quite sure how to do that. There are several possibilities. With regular 32 or 64 bit data types we can already process 4-8 pixels on a single core using bit manipulation, but it will make the code a bit convoluted. Let's see how that turns out. Also for the 320x200 mode we can skip drawing all but one sample per pixel. At the moment we have a 3-4x overdraw.
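
A minimal sketch of the bit-manipulation approach with a 64-bit type, assuming one byte per sample with the six color bits in the low bits (as the `& 0x3f` mask used elsewhere in this thread suggests). The helper names are hypothetical. One AND strips the sync bits from eight samples, and one rotate-and-compare detects a run of eight identical samples, which is common with 3-4x overdraw.

```c
#include <stdint.h>
#include <string.h>

#define COLOR_MASK64 0x3f3f3f3f3f3f3f3fULL

/* Load eight samples and strip the non-color bits from all of them
   with a single AND. memcpy keeps the unaligned load well defined. */
static inline uint64_t load8_colors(const uint8_t *src)
{
    uint64_t v;
    memcpy(&v, src, sizeof v);
    return v & COLOR_MASK64;
}

/* True if all eight bytes of v hold the same value: rotating by one
   byte leaves v unchanged only when every byte equals its neighbor. */
static inline int all_bytes_equal(uint64_t v)
{
    return v == ((v << 8) | (v >> 56));
}
```

When `all_bytes_equal` holds, a single palette lookup covers all eight samples; otherwise the code can fall back to the per-sample path.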


Reply 110 of 212, by root42

User metadata
Rank l33t

Furthermore, at least on macOS, resizing the window stops the rendering. We need to split reading, processing and rendering into different threads, so that reading and processing continue even if rendering is stalled by UI events. But one problem at a time...


Reply 111 of 212, by Benedikt

User metadata
Rank Oldbie
root42 wrote on 2020-07-17, 13:22:

Yes, it currently saturates one core more or less. Storing a whole frame is too much. Storing one line, yes, that would make sense and is how e.g. the OSSC works as well. Otherwise you also get higher lag, which will be significant anyway due to the complexity of the whole process.

My plan is to process multiple pixels at once, but I am not quite sure how to do that. There are several possibilities. With regular 32 or 64 bit data types we can already process 4-8 pixels on a single core using bit manipulation, but it will make the code a bit convoluted. Let's see how that turns out. Also for the 320x200 mode we can skip drawing all but one sample per pixel. At the moment we have a 3-4x overdraw.

Before we go the SIMD route, we should try to fully exploit the optimization potential that has already been identified.
The Bresenham-like algorithm for efficient scaling could take care of the 3-4x overdraw that you mentioned and would probably be substantially faster.
I'd also say that the algorithm should maintain a pointer into the frame buffer that only gets reinitialized once per line and then simply advances to the next pixel.
The pixel output and color conversion could then be done using something like *target++ = egapal[color & 0x3f]. The pset function would become unnecessary.
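
A minimal sketch of the lookup-table approach. The bit layout is an assumption: bits 0-2 are taken as primary B, G, R (2/3 intensity, 0xAA) and bits 3-5 as secondary b, g, r (1/3 intensity, 0x55), with output packed as 0x00RRGGBB. The real capture wiring may order the bits differently. `init_egapal` and `convert_line` are hypothetical names.

```c
#include <stdint.h>

static uint32_t egapal[64];

/* Precompute all 64 EGA colors once, so the per-pixel work is a single
   table lookup instead of shifts and multiplies. */
static void init_egapal(void)
{
    for (unsigned c = 0; c < 64; ++c) {
        uint32_t r = 0xAAu * ((c >> 2) & 1) + 0x55u * ((c >> 5) & 1);
        uint32_t g = 0xAAu * ((c >> 1) & 1) + 0x55u * ((c >> 4) & 1);
        uint32_t b = 0xAAu * ( c       & 1) + 0x55u * ((c >> 3) & 1);
        egapal[c] = (r << 16) | (g << 8) | b;
    }
}

/* Convert one line with a running target pointer; no per-pixel address
   calculation and no pset call. Uses one sample per output pixel. */
static void convert_line(uint32_t *target, const uint8_t *source,
                         unsigned width)
{
    uint32_t *end = target + width;
    while (target < end)
        *target++ = egapal[*source++ & 0x3f];
}
```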

Reply 112 of 212, by root42

User metadata
Rank l33t

Looking at the assembly that the compiler produces I think the pointer into the framebuffer might not help that much. It already optimizes really heavily. It even precalculates constants:

	.quad	4599042267697835596     ## double 0.29813084112149535
...
imull $5570560, %esi, %esi ## imm = 0x550000
movl %r13d, %ebx
andl $2, %ebx
orl %edi, %ebx
imull $21760, %ebx, %ebx ## imm = 0x5500

We can try to change that, doesn't hurt, but I wouldn't think that it is a significant improvement. The address calculation into the framebuffer is rather cheap. The overwrite is of course much more expensive. But getting it to read and process four pixels at once would be way more significant, as single byte reads are rather inefficient.


Reply 113 of 212, by Benedikt

User metadata
Rank Oldbie

Of course the compiler optimizes. It does not usually change your algorithm, though. I totally agree that buffering the input would make sense, ideally in a ring buffer big enough that it definitely contains one full line.
As far as the scaling is concerned, this might then already be the full algorithm for one line, including pixel output:

for (target = fb + y * stride, erracc = phase; target < fb + y * stride + width; *target++ = egapal[*source++ & 0x3f], erracc += sclk)
while ((erracc -= pclk) >= 0) ++source;

Needless to say, source would have to be initialized to point to the beginning of a line.
sclk would be 24000000, pclk e.g. 14318180 or 7159090 and fb the frame buffer start.
If pclk is 7159090, phase could be a constant, as well.
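
The same loop, unrolled into a standalone function as a sketch. The logic follows the one-liner above; the function name `scale_line`, the explicit palette parameter and the returned source pointer are additions for illustration, not part of the actual program.

```c
#include <stdint.h>

/* Resample one line of captured samples to `width` output pixels.
   erracc tracks, in clock units, how far the sample stream is ahead of
   the pixel grid. For every output pixel the source advances by
   sclk / pclk samples on average, using integer arithmetic only.
   Returns the source position after the last consumed sample. */
static const uint8_t *scale_line(uint32_t *target, unsigned width,
                                 const uint8_t *source,
                                 const uint32_t *pal,
                                 int32_t sclk, int32_t pclk, int32_t phase)
{
    int32_t erracc = phase;
    for (unsigned x = 0; x < width; ++x) {
        /* Skip the surplus samples that belong to this pixel. */
        while ((erracc -= pclk) >= 0)
            ++source;
        /* Emit exactly one sample per output pixel. */
        *target++ = pal[*source++ & 0x3f];
        erracc += sclk;
    }
    return source;
}
```

With sclk = 24000000 and pclk = 14318180, erracc stays between -pclk and sclk, so a 32-bit signed accumulator does not overflow.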

Reply 115 of 212, by Predator99

User metadata
Rank l33t

First thing I did back home: test the latest version 😉

I had to change this line back to fgetc(stdin) to compile under Windows:
value = getchar_unlocked(); // fgetc(stdin);

It looks FANTASTIC!

Image errors are gone now - thanks! So my QB64 code can go to the bin 😉

Only the colors did not improve...

[Attachment: popc.jpg]
[Attachment: ckk.jpg]

Reply 116 of 212, by Deksor

User metadata
Rank l33t

I had these color issues on Windows too. You have to change the byte-order part to this:

#if SDL_BYTEORDER == SDL_BIG_ENDIAN
static const Uint32 rmask = 0xff000000;
static const Uint32 gmask = 0x00ff0000;
static const Uint32 bmask = 0x0000ff00;
static const Uint32 amask = 0x000000ff;
#else
static const Uint32 rmask = 0x00ff0000;
static const Uint32 gmask = 0x0000ff00;
static const Uint32 bmask = 0x000000ff;
static const Uint32 amask = 0xff000000;
#endif


Reply 118 of 212, by Predator99

User metadata
Rank l33t

Does anybody have an idea for an EGA game with lots of action in the intro? I would be happy to upload a longer dump file later for testing and watching...

The prices for the 6€ analyzer should go up now 😁

Reply 119 of 212, by Deksor

User metadata
Rank l33t

Maybe Alpha Waves? It's a 3D game that supports CGA, EGA, VGA and even Hercules. The only issue is that its EGA and VGA modes have some sort of tearing.
