very slow drawing with DJGPP

Re: very slow drawing with DJGPP

Postby jmarsh » 2019-3-30 @ 23:08

Bytes are 8 bits, you're only shifting by 4?

Postby thrawn235 » 2019-3-31 @ 16:04

jmarsh wrote:Bytes are 8 bits, you're only shifting by 4?

Lol that was indeed the error.

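Code:
void GraphicsEngine::DrawRectangleASM(int x, int y, int w, int h, int color)
{
   int screenWidth = 320;

   // Requires h > 0. Widths not divisible by 4 leave the leftmost w % 4 pixels unfilled.
   asm volatile(
      // eax = &backBuffer[0] + y * screenWidth + x
      "movl %1, %%eax;"
      "imull %6, %%eax;"
      "addl %0, %%eax;"
      "addl %5, %%eax;"

      // copy the low byte of color into all four bytes of edx
      "movb %4, %%dl;"
      "shll $8, %%edx;"
      "movb %4, %%dl;"
      "shll $8, %%edx;"
      "movb %4, %%dl;"
      "shll $8, %%edx;"
      "movb %4, %%dl;"

      "movl %3, %%ebx;"                     // ebx = rows left to draw
      "1:;"
      "   movl %2, %%ecx;"
      "   subl $4, %%ecx;"                  // ecx = offset of the last dword in the row
      "   jb 3f;"                           // w < 4: nothing to store
      "   2:;"
      "      movl %%edx, (%%eax, %%ecx);"   // four pixels per store
      "      subl $4, %%ecx;"
      "      jae 2b;"                       // stop once the offset drops below 0
      "   3:;"
      "   addl %6, %%eax;"                  // next scanline
      "   decl %%ebx;"
      "   jnz 1b;"

      :
      :"m"(x), "m"(y), "m"(w), "m"(h), "m"(color), "r"(&backBuffer[0]), "m"(screenWidth)
      :"eax", "ebx", "ecx", "edx", "cc", "memory");
}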


That's what I came up with.
It seems to work fine too.
(The only downside is that the width has to be divisible by 4, but that's fine.)
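For comparison, here is a minimal portable C++ sketch of the same idea (replicate the color byte into a 32-bit word and store four pixels at a time), with a byte-wise tail so the width no longer has to be divisible by 4. `FillRect8` and its `pitch` parameter are made-up names for illustration, not part of the engine above, and no clipping is done.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original engine): fills a w x h rectangle
// in an 8-bit buffer whose scanlines are `pitch` bytes apart. No clipping.
void FillRect8(std::uint8_t* buffer, int pitch, int x, int y, int w, int h,
               std::uint8_t color)
{
    assert(buffer != nullptr && w >= 0 && h >= 0);

    // Replicate the color byte into all four bytes of a 32-bit word,
    // the same value the movb/shl sequence builds in edx.
    const std::uint32_t fill = color * 0x01010101u;

    for (int row = 0; row < h; ++row) {
        std::uint8_t* dst = buffer + (y + row) * pitch + x;
        int remaining = w;

        // Four pixels per store; memcpy sidesteps alignment concerns.
        for (; remaining >= 4; remaining -= 4, dst += 4)
            std::memcpy(dst, &fill, 4);

        // Tail: the 0 to 3 pixels left over when w is not a multiple of 4.
        for (; remaining > 0; --remaining)
            *dst++ = color;
    }
}
```

A call such as `FillRect8(backBuffer, 320, 10, 10, 110, 110, 7)` would then cover all 110 columns. Whether a compiler turns this into something as fast as the hand-written loop would need to be measured.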

It's very fast though.
Drawing the rectangles and flipping takes just about 1ms now.


Thanks all for helping me.

I'm going to do the same thing to my draw sprite function now.

Postby BloodyCactus » 2019-4-01 @ 14:23

thrawn235 wrote:Its very fast though.
Drawing the rectangles and Flipping takes just about 1ms now.


That is 1000 Hz (or 1000 fps), which is why DOSBox is not representative of real hardware.

Postby root42 » 2019-4-01 @ 14:28

If you give me a binary, I can test it on my 386 with an ET4000 on ISA bus. That should probably take longer than 1ms...

Postby thrawn235 » 2019-4-01 @ 16:46

BloodyCactus wrote:which is 1000 Hz (or 1000 fps). hence why dosbox is not representative of real hardware.

There are two things to note here.
I don't wait for vertical retrace or anything, and I've also set the DOSBox cycles to the maximum
in dosbox.conf:
Code:
cycles=max


root42 wrote:If you give me a binary, I can test it on my 386 with an ET4000 on ISA bus. That should probably take longer than 1ms...

Sure, we can test that.
I wasn't really trying to say anything about real hardware though; it was just that my original code was very slow, and I thought I had done something wrong (which I had, lol).

It would be interesting nonetheless.

The output is rather crude. It just draws 2 rectangles and prints the time in ms over and over.
On ESC it ends, and you can read the numbers.
That's how it works in my DOSBox anyway.

Edit:
One more thing.
The ASM code is designed for the 486. Maybe it works on a 386 too, I don't know.

Edit2:
It still works with DOSBox set to 386, for whatever that's worth.
Attachments
CWSDPMI.EXE (20.83 KiB)
mode13.exe (1.3 MiB)

Postby Scali » 2019-4-01 @ 17:03

thrawn235 wrote:The ASM Code is designed for 486. Maybe it works on a 386 too. don't know.


The only regular instruction that 486 has that 386 doesn't is bswap as far as I can remember.

Postby root42 » 2019-4-01 @ 18:02

Scali wrote:
thrawn235 wrote:The ASM Code is designed for 486. Maybe it works on a 386 too. don't know.


The only regular instruction that 486 has that 386 doesn't is bswap as far as I can remember.


CMPXCHG, XADD and a bunch of cache instructions are also new.

Postby root42 » 2019-4-01 @ 21:17

Ok, I tested it:

https://youtu.be/8-7N-BWnCcg

Result is 46ms per frame. That's a bit more than 21 FPS.

Postby thrawn235 » 2019-4-02 @ 11:46

That's ... not great lol

But it's also not terrible.

I suspect the printing of the numbers slows it down a bit.
And if I can implement proper page flipping with VESA, it should easily be 30-40% faster than now.

Of course drawing real sprites is slower, plus all the game logic.
But I think I can get something that will be playable.

Postby Scali » 2019-4-02 @ 12:41

thrawn235 wrote:I suspect the printing of the numbers slows it down a bit.


Yup, the observer effect.
Generally a good approach is to just do a simple measurement, and only calculate times/framerates once or twice a second.
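One way to follow that advice is to count frames and only compute a rate once per reporting interval. This is a rough sketch; `FrameRateMeter` and the 500 ms default are assumptions for illustration, and the millisecond timestamp would come from whatever timer the engine already has.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the per-frame cost is one increment and one compare;
// the division happens only when a reporting interval has elapsed.
class FrameRateMeter {
public:
    explicit FrameRateMeter(std::uint32_t intervalMs = 500)
        : intervalMs_(intervalMs)
    {
        assert(intervalMs > 0);
    }

    // Call once per frame with the current time in milliseconds.
    // Returns true when FramesPerSecond() holds a freshly computed value.
    bool EndFrame(std::uint32_t nowMs)
    {
        if (!started_) {            // first call only starts the window
            started_ = true;
            windowStartMs_ = nowMs;
            return false;
        }
        ++frames_;
        const std::uint32_t elapsed = nowMs - windowStartMs_;
        if (elapsed < intervalMs_)
            return false;
        fps_ = frames_ * 1000.0 / elapsed;
        frames_ = 0;
        windowStartMs_ = nowMs;
        return true;
    }

    double FramesPerSecond() const { return fps_; }

private:
    std::uint32_t intervalMs_;
    std::uint32_t windowStartMs_ = 0;
    std::uint32_t frames_ = 0;
    double fps_ = 0.0;
    bool started_ = false;
};
```

In the main loop this would be `if (meter.EndFrame(nowMs)) { /* print meter.FramesPerSecond() */ }`, so the slow text output runs about twice a second instead of every frame.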

thrawn235 wrote:And if i can implement proper page flipping with VESA it should easily be 30-40% faster than now.


VESA? Mode X I suppose?
At least, when you're targeting 386, it would not be realistic to expect more than regular VGA support. 386 also has its hands full with just 320x200 8 bit colour.

Drawing sprites would be best with a compiled sprite routine, which would be barely slower than what you have now. In fact, it might be somewhat faster because it's effectively an unrolled loop.
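A real compiled sprite is generated machine code, which is hard to show portably. As a stand-in (a different, simpler technique), the sketch below precomputes the opaque runs of a sprite once, so drawing needs no per-pixel transparency test. `SpriteSpan`, `CompileSprite` and `DrawSpans` are invented names, and clipping is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// One horizontal run of opaque pixels, positioned relative to the sprite origin.
struct SpriteSpan {
    int dx;
    int dy;
    std::vector<std::uint8_t> pixels;
};

// Scan the w x h source image once and keep only the opaque runs.
std::vector<SpriteSpan> CompileSprite(const std::uint8_t* src, int w, int h,
                                      std::uint8_t transparent)
{
    assert(src != nullptr && w > 0 && h > 0);
    std::vector<SpriteSpan> spans;
    for (int y = 0; y < h; ++y) {
        const std::uint8_t* row = src + y * w;
        int x = 0;
        while (x < w) {
            while (x < w && row[x] == transparent) ++x;   // skip transparent pixels
            const int start = x;
            while (x < w && row[x] != transparent) ++x;   // collect an opaque run
            if (x > start)
                spans.push_back({start, y,
                                 std::vector<std::uint8_t>(row + start, row + x)});
        }
    }
    return spans;
}

// Draw by copying each run; the caller must keep the sprite inside the buffer.
void DrawSpans(std::uint8_t* buffer, int pitch, int x, int y,
               const std::vector<SpriteSpan>& spans)
{
    for (const SpriteSpan& s : spans)
        std::memcpy(buffer + (y + s.dy) * pitch + x + s.dx,
                    s.pixels.data(), s.pixels.size());
}
```

The design trade is the same as with compiled sprites: more memory per sprite and a one-time preprocessing step in exchange for a branch-light draw loop.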

Postby root42 » 2019-4-02 @ 12:45

I think over the ISA bus you won't see much better values. It is limited to 16 bits at a time, clocked at 8MHz. This should allow for 16MiB/s of peak performance if there is nothing else working on the bus and the CPU doesn't do anything else, or you have DMA transfers. The 21 FPS are of course not optimal and show only about 1.3 MiB/s of throughput. Hence it is probably much better if you write directly to VGA RAM and do page flipping. It's important that you don't have to read back from VGA memory, so that turnaround won't be long.
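The throughput figure can be checked with a few lines of arithmetic, assuming each frame moves exactly one 320x200 8-bit buffer (64000 bytes) and nothing else. The helper names are made up for this sketch.

```cpp
#include <cassert>

// Bytes per second needed to move `bytesPerFrame` once every `msPerFrame` ms.
double BytesPerSecond(double bytesPerFrame, double msPerFrame)
{
    assert(msPerFrame > 0.0);
    return bytesPerFrame * 1000.0 / msPerFrame;
}

// Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes).
double ToMiB(double bytes)
{
    return bytes / (1024.0 * 1024.0);
}
```

`ToMiB(BytesPerSecond(320.0 * 200.0, 46.0))` comes out at about 1.33 MiB/s, in line with the estimate above. One 16-bit transfer per 8 MHz clock would be 16,000,000 bytes/s, roughly 15.3 MiB/s, which should be read as an upper bound rather than something the bus sustains.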

You CAN try to render to RAM and 'rep movsl' the buffer (or whatever subset of it that changed) to video memory, which probably will give you the best performance, if you are NOT using page flipping. Even with page flipping there are probably optimized ways to copy sprites and characters into video RAM. Using transparency will necessitate some tricks though, as will sprite scaling.
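Copying only the part that changed could look roughly like this. It is a sketch under the assumption that both buffers share the same layout; `Rect` and `CopyDirtyRect` are hypothetical names, and `std::memcpy` stands in for the hand-written copy loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Region that changed since the last frame, in pixels.
struct Rect {
    int x, y, w, h;
};

// Copy only the rows and columns inside `r` from the RAM buffer to the
// destination. Both buffers are `pitch` bytes wide and `height` rows tall.
void CopyDirtyRect(std::uint8_t* dst, const std::uint8_t* src,
                   int pitch, int height, Rect r)
{
    assert(dst != nullptr && src != nullptr);
    assert(r.x >= 0 && r.y >= 0 && r.w >= 0 && r.h >= 0);
    assert(r.x + r.w <= pitch && r.y + r.h <= height);

    for (int row = 0; row < r.h; ++row) {
        const std::size_t offset =
            static_cast<std::size_t>(r.y + row) * pitch + r.x;
        std::memcpy(dst + offset, src + offset, static_cast<std::size_t>(r.w));
    }
}
```

For the 110x110 test rectangle this moves 12100 bytes instead of 64000, which matters most on a slow bus.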

Postby thrawn235 » 2019-4-04 @ 14:43

So.
I've done a couple of things.
I wrote a VESA init method, so I can use the VESA modes now, including proper page flipping.
For the regular mode 13h (without page flipping) I've reimplemented the memcopy in assembly.
I also made sure that the time spent printing the milliseconds won't show up in the calculated time.

In my DOSBox it runs too fast for a proper measurement. It's below 1ms.

The problem with my page flip is that the screen still flickers like crazy.
Something is wrong. The picture should be static like in mode 13h...

Code:
void GraphicsEngine::Flip()
{
   if(pageFlipping)
   {
      // Show the page that was just drawn and make the other page the draw target.
      if(currentPageAddress == screenMemory)
      {
         currentPageAddress = screenMemory + screenWidth*screenHeight;
         SetDisplayStart(0, 0);
      }
      else
      {
         currentPageAddress = screenMemory;
         SetDisplayStart(screenHeight, 0);
      }
   }
   else
   {
      unsigned int maxScreenOffset = screenWidth * screenHeight;

      // Copy the back buffer to the screen four bytes at a time.
      asm volatile("movl $0, %%ecx;"
         "loop%=:;"
         "   movl (%%esi, %%ecx), %%eax;"
         "   movl %%eax, (%%edi, %%ecx);"
         "   addl $4, %%ecx;"
         "   cmpl %0, %%ecx;"
         "   jb loop%=;"      // jb, not jbe: stop once ecx reaches the buffer size
         :
         :"m"(maxScreenOffset), "D"(&screenMemory[0]), "S"(backBuffer)
         :"eax", "ecx", "cc", "memory");
   }
}


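Code:
void GraphicsEngine::SetDisplayStart(int newStartScanline, int newStartPixelOnScanline)
{
   __dpmi_regs r;
   r.x.ax = 0x4F07;                    // VBE function 07h: set/get display start
   r.h.bh = 0x00;                      // reserved, must be 0
   r.h.bl = 0x80;                      // 80h: set display start during vertical retrace
   r.x.cx = newStartPixelOnScanline;   // first displayed pixel in the scanline
   r.x.dx = newStartScanline;          // first displayed scanline

   __dpmi_int(0x10, &r);
}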


If somebody wants to test it, I've attached the exe.
It would be cool to know how it performs on real hardware.
(Use mode 100 in VESA mode. Everything else is untested.)
Attachments
mode13.exe (1.3 MiB)

Postby BloodyCactus » 2019-4-04 @ 16:24

you need to wait for vertical retrace, polling the vsync flag, then copy/pageflip

Postby root42 » 2019-4-04 @ 16:43

BloodyCactus wrote:you need to wait for vertical retrace, polling the vsync flag, then copy/pageflip


For my let's code series I put the code up on github. There's also a wait for retrace function that OP can copy:

https://gist.github.com/root42/8e147c5e ... a5aba52317

Code:
#define INPUT_STATUS 0x3DA
#define VRETRACE_BIT 0x08

void wait_for_retrace()
{
  while( inp( INPUT_STATUS ) & VRETRACE_BIT );
  while( ! (inp( INPUT_STATUS ) & VRETRACE_BIT) );
}
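
A variant of the same polling loop that cannot hang if the retrace bit never changes (for example under an emulator or video BIOS that does not toggle it) might look like the sketch below. The status read is passed in as a callable so the logic can be tried without hardware; on DJGPP it could wrap `inportb(0x3DA)` from `<pc.h>`. `WaitForRetraceStart` and the `maxPolls` limit are assumptions made for this example.

```cpp
#include <cassert>

const int kVRetraceBit = 0x08;   // bit 3 of VGA input status register 1 (port 0x3DA)

// ReadStatus is any callable returning the current status byte.
// Returns true once a new retrace has started, false if maxPolls reads
// went by without seeing one.
template <typename ReadStatus>
bool WaitForRetraceStart(ReadStatus readStatus, long maxPolls)
{
    assert(maxPolls > 0);
    long polls = 0;

    // If a retrace is already in progress, let it finish first.
    while (readStatus() & kVRetraceBit)
        if (++polls >= maxPolls) return false;

    // Then wait for the start of the next retrace.
    while (!(readStatus() & kVRetraceBit))
        if (++polls >= maxPolls) return false;

    return true;
}
```

A sensible value for `maxPolls` depends on CPU speed and would have to be found by experiment.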

Postby Scali » 2019-4-04 @ 17:03

BloodyCactus wrote:you need to wait for vertical retrace, polling the vsync flag, then copy/pageflip


That might depend...
On a 6845, the screen offset register is latched. So you basically 'fire-and-forget', the value will become active for the next frame.
Which means you would actually first perform the pageflip and THEN wait for vertical retrace, to wait for the actual flip to occur, before you start drawing in the new backbuffer (else you're actually drawing into what is still the frontbuffer).
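
The ordering can be illustrated with a toy model of a latched start register. This is not hardware access, only an assumption-laden sketch with invented names (`LatchedCrtc`, page numbers 0 and 1) that shows why the old front page is not safe to draw into until the retrace has happened.

```cpp
#include <cassert>

// Toy model: writes to the start register are remembered, but only take
// effect at the next vertical retrace.
struct LatchedCrtc {
    int activePage = 0;    // page the display is scanning out right now
    int pendingPage = 0;   // page most recently written to the start register

    // Fire and forget: nothing visible changes yet.
    void SetDisplayStart(int page)
    {
        assert(page == 0 || page == 1);
        pendingPage = page;
    }

    // The latch is taken over at the frame boundary.
    void VerticalRetrace() { activePage = pendingPage; }

    bool IsVisible(int page) const { return page == activePage; }
};
```

After `SetDisplayStart(1)`, page 0 is still visible until `VerticalRetrace()` runs, so in this model the frame loop would be: draw, set display start, wait for retrace, then start drawing into the other page.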

Postby ripsaw8080 » 2019-4-04 @ 17:47

FYI, waiting for retrace (e.g. BL=80h when calling INT 10h/AX=4F07h) is implemented in SVN, but in 0.74(-2) you can load UniVBE or so to get the feature.

Postby thrawn235 » 2019-4-04 @ 18:54

Ok.
You guys are right. Waiting for retrace fixes the flickering.

But I don't understand why it flickers in the first place. I draw the image to the off buffer, then I swap the pointers, and then I draw to the other buffer (that's now the off buffer).

So how can it display an incomplete image? Shouldn't there always be a complete image in the front buffer?

Also, and that's really weird:
With waiting for retrace enabled, my time function just prints "e" all the time instead of a number?!
But only in VESA mode; in mode 13h (with WaitForRetrace) it still works???

Code:
void GraphicsEngine::WaitForRetrace()
{
    /* if a vertical retrace is already in progress, wait for it to end */
    while ((inportb(0x03da) & 0x08) == 8) {};
    /* then wait for the next vertical retrace to begin */
    while ((inportb(0x03da) & 0x08) != 8) {};
}


Code:
//Main loop
while(running)
{
   engine->time->SetFrameStartTimeStamp();

   engine->input->PollKeys();
   if(engine->input->KeyDown(1))
   {
      running = false;
   }

   engine->graphics->DrawRectangleASM(0,0,320,200,5);
   engine->graphics->DrawRectangleASM(10,10,110,110,7);

   engine->graphics->WaitForRetrace();
   engine->graphics->Flip();

   time = engine->time->GetTicksSinceFrameStart();
   cout<<engine->time->TicksToMilliSeconds(time)<<" ";
}

Postby BloodyCactus » 2019-4-04 @ 19:27

thrawn235 wrote:But i dont understand why it flickers in the first place. i draw the image to the off buffer, then i swap the pointers, and then i draw to the other buffer (thats now the off buffer)

so how can it display an incomplete image. shouldn't there always be a complete image in the front buffer ?


Because you're drawing at a speed that's different from the screen's, i.e. the screen refreshes at, say, 60Hz or 70Hz, but you're drawing at 80 or 50 or something.

A real CRT and an LCD will give different results.

Waiting for retrace syncs up your drawing with the actual refresh rate of the screen.

Postby Scali » 2019-4-04 @ 19:35

thrawn235 wrote:so how can it display an incomplete image. shouldn't there always be a complete image in the front buffer ?


As long as there are two different images in the two buffers, and you switch between them somewhere outside the vertical blank area, then you can get flicker, because part of the current displayed frame comes from one buffer, and part comes from another.

Drawing a screen takes time. The old CRT is quite intuitive that way: the cathode ray actually traces the screen one scanline at a time. So the output of the video card is always a single pixel at a time, and that is whatever the ray draws at that moment. Proper timing in the video card circuitry makes sure that pixels are switched at the correct time, scanlines are switched at the correct time, and eventually a vertical blank is inserted, so the ray can return to the top of the screen again.

Modern digital screens are not quite that 'direct', but the general idea still holds: The video memory is scanned from left to right, top to bottom, as it is sent to the internal framebuffer inside your flatscreen. So the concepts of vsync and tearing still apply.
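
The effect described above can be reproduced in a few lines. This is a deliberately tiny simulation with made-up names, in which page A holds the old image, page B the new one, and the display reads one scanline per step from whichever page is selected at that moment.

```cpp
#include <cassert>
#include <string>

// Returns one character per scanline showing which page it was read from.
// flipAtRow is the scanline at which the display start is switched to page B;
// 0 means the switch happened during the blanking period before the frame,
// and a negative value means no switch at all.
std::string ScanOutFrame(int rows, int flipAtRow)
{
    assert(rows > 0);
    char visible = 'A';
    std::string shown;
    for (int row = 0; row < rows; ++row) {
        if (row == flipAtRow)
            visible = 'B';      // unsynchronized flip takes effect immediately
        shown += visible;
    }
    return shown;
}
```

`ScanOutFrame(8, 3)` gives a frame whose top three scanlines come from the old page and the rest from the new one, which is the tear. `ScanOutFrame(8, 0)` gives a frame taken entirely from page B.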