How to write to screen.... in a FASTER way?

Fox
Posts: 353
Joined: Aug 08, 2006 13:39
Location: Lille, France

Postby Fox » Dec 19, 2010 9:58

Hi all!

I am currently working on a program that tries to refresh the screen at a rate of ~60 Hz.

Unfortunately, this is painfully *slow* :(

The screen size I am working on is 160x144 magnified 4x (that is really 640x576).
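For a sense of scale, here is a rough budget for redrawing the whole magnified screen every frame. It is only a sketch: the 4 bytes per pixel figure assumes a 32 bpp screen mode, and the constant names are made up for the illustration.

```freebasic
' Back-of-the-envelope budget for a full redraw at the sizes quoted above.
Const SrcW = 160, SrcH = 144     ' source resolution
Const Magnify = 4                ' magnification factor
Const Hz = 60                    ' target refresh rate
Const BytesPerPixel = 4          ' assumption: 32 bpp screen mode

Dim As Long outW = SrcW * Magnify                  ' 640
Dim As Long outH = SrcH * Magnify                  ' 576
Dim As Long pixelsPerFrame = outW * outH           ' 368640
Dim As LongInt pixelsPerSecond = CLngInt(pixelsPerFrame) * Hz        ' 22118400
Dim As LongInt bytesPerSecond = pixelsPerSecond * BytesPerPixel      ' 88473600

Print "Output size:       "; outW; " x"; outH
Print "Pixels per frame:  "; pixelsPerFrame
Print "Pixels per second: "; pixelsPerSecond
Print "Bytes per second:  "; bytesPerSecond
```

So a full redraw means about 22 million pixel writes per second, which is why skipping unchanged regions matters.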

This is my current code:

Code:

#MACRO DetectScreenChanges()
  LcdModifiedFirstY = 255
  LcdModifiedFirstX = 255
  LcdModifiedLastX = 0
  FOR y = 0 TO 143
    LcdModifiedScanline(y) = 0
    FOR x = 0 TO 159
      IF ScreenBuffer(x, y) <> ScreenOldBuffer(x, y) THEN
        ScreenOldBuffer(x, y) = ScreenBuffer(x, y)
        IF LcdModifiedScanline(y) = 0 THEN
          LcdModifiedScanline(y) = 1
          IF (LcdModifiedFirstY > y) THEN LcdModifiedFirstY = y
          LcdModifiedLastY = y
        END IF
        IF (LcdModifiedFirstX > x) THEN LcdModifiedFirstX = x
        IF (LcdModifiedLastX < x) THEN LcdModifiedLastX = x
      END IF
    NEXT x
  NEXT y
#ENDMACRO


SUB RefreshScreen()
  DIM AS UINTEGER x, y
  DetectScreenChanges()

  IF (LcdModifiedFirstY < 255) THEN
    DIM AS UINTEGER a, b, c, d
    DIM AS UBYTE ScaleStepX, ScaleStepY
    DIM AS UINTEGER PTR sPtr32 = Screenptr
    SCREENLOCK()
    FOR ScaleStepY = 0 TO (GraphicScaleFactor - 1) STEP FastScale
      a = (GraphicWidth * ScaleStepY) + ScreenBorderRight + (ScreenBorderTop * GraphicWidth)
      FOR y = LcdModifiedFirstY TO LcdModifiedLastY
        IF (LcdModifiedScanline(y) = 1) THEN
          b = y * GraphicWidth * GraphicScaleFactor
          FOR x = LcdModifiedFirstX TO LcdModifiedLastX
            c = (x * GraphicScaleFactor) + b + a
            FOR ScaleStepX = 0 TO (GraphicScaleFactor - 1)
                  sPtr32[c + ScaleStepX] = ScreenPalette32(ScreenBuffer(x, y))
            NEXT ScaleStepX
          NEXT x
        END IF
      NEXT y
    NEXT ScaleStepY
    SCREENUNLOCK()
  END IF
END SUB


I render my screen's content into an array called ScreenBuffer(x, y), and then I call this sub (RefreshScreen) to draw it on screen. ScreenBuffer() is an array of UBYTEs, and each value is an index into the palette (the palette colors are stored in the array called ScreenPalette32()).

The variable GraphicScaleFactor is equal to 4 (but could be anything, to scale the screen more or less); ScaleStepX and ScaleStepY are the loop counters that run from 0 to GraphicScaleFactor - 1. The variable FastScale is always equal to 1.

There is also a macro called DetectScreenChanges(), which detects whether the screen changed and sets the variables LcdModifiedFirstX, LcdModifiedFirstY, LcdModifiedLastX and LcdModifiedLastY (this way I don't have to redraw the screen if its content hasn't changed).
I also keep track of modifications on every line, to avoid redrawing lines that haven't changed since the last refresh (that's what the LcdModifiedScanline() array is for).

Is there any chance for me to speed up this code? I wonder if using SDL would improve the refresh rate?
Last edited by Fox on Dec 19, 2010 16:50, edited 1 time in total.
fxm
Posts: 9256
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Postby fxm » Dec 19, 2010 16:27

Unless I am mistaken, here are some very slight improvements to the loops (in #MACRO DetectScreenChanges() and in SUB RefreshScreen()):

Code:

#MACRO DetectScreenChanges()
  LcdModifiedFirstY = 255
  LcdModifiedFirstX = 255
  LcdModifiedLastX = 0
  FOR y = 0 TO 143
    LcdModifiedScanline(y) = 0
    FOR x = 0 TO 159
      IF ScreenBuffer(x, y) <> ScreenOldBuffer(x, y) THEN
        ScreenOldBuffer(x, y) = ScreenBuffer(x, y)
        IF LcdModifiedScanline(y) = 0 THEN
          IF (LcdModifiedFirstY > y) THEN LcdModifiedFirstY = y
          LcdModifiedLastY = y
          LcdModifiedScanline(y) = 1
        END IF
        IF (LcdModifiedFirstX > x) THEN LcdModifiedFirstX = x
        IF (LcdModifiedLastX < x) THEN LcdModifiedLastX = x
      END IF
    NEXT x
  NEXT y
#ENDMACRO


SUB RefreshScreen()
  DIM AS UINTEGER x, y
  DetectScreenChanges()
  IF (LcdModifiedFirstY < 255) THEN
    DIM AS UINTEGER a, b, c, d
    DIM AS UBYTE ScaleStepX, ScaleStepY
    DIM AS UINTEGER PTR sPtr32 = Screenptr
    SCREENLOCK()
    FOR ScaleStepY = 0 TO (GraphicScaleFactor - 1) STEP FastScale
      a = (GraphicWidth * ScaleStepY) + ScreenBorderRight + (ScreenBorderTop * GraphicWidth)
      FOR y = LcdModifiedFirstY TO LcdModifiedLastY
        IF (LcdModifiedScanline(y) = 1) THEN
          b = y * GraphicWidth * GraphicScaleFactor + a
          FOR x = LcdModifiedFirstX TO LcdModifiedLastX
            c = (x * GraphicScaleFactor) + b
            FOR ScaleStepX = 0 TO (GraphicScaleFactor - 1)
                  sPtr32[c + ScaleStepX] = ScreenPalette32(ScreenBuffer(x, y))
            NEXT ScaleStepX
          NEXT x
        END IF
      NEXT y
    NEXT ScaleStepY
    SCREENUNLOCK()
  END IF
END SUB

"A fellow countryman who is also under the snow, waiting for Christmas!"

Postby Fox » Dec 19, 2010 16:57

fxm wrote: I hope I am not mistaken, and here are some very slight improvements in the loops


Thank you, dear compatriot! :)

Thanks, I edited my first post to add your optimizations (and also added another very small one that I just noticed).
Still, as you say, these loop optimizations are really minor and don't provide any noticeable speedup :-P

What's eating most of my CPU time seems to be this:

Code:

DIM AS UINTEGER PTR sPtr32 = Screenptr
SCREENLOCK()
(...)
sPtr32[c + ScaleStepX] = ScreenPalette32(ScreenBuffer(x, y))
(...)
SCREENUNLOCK()


I thought that writing directly into the ScreenPtr memory area was the fastest possible way to do graphics in FreeBASIC... but well... it appears to be still too slow :-/
Is there really no faster way? I am wondering about trying out SDL... any chance it would be faster? Does anyone have experience with that?
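One way to check where the time really goes is to time the pixel writes on their own, without any window. The sketch below is only an assumption-laden stand-in: frameBuf replaces the real screen memory, and FillFrame, OutW, OutH and Frames are hypothetical names. It uses ULong for pixels because that type stays 32 bits wide on 64-bit FreeBASIC.

```freebasic
' Hypothetical timing harness: measures raw 32-bit writes into an off-screen buffer.
Const OutW = 640, OutH = 576     ' magnified size from the first post
Const Frames = 200               ' number of full-frame fills to average over

Dim Shared As ULong frameBuf(0 To OutW * OutH - 1)   ' stand-in for screen memory

' Write one colour value to every pixel of the buffer.
Sub FillFrame(ByVal colour As ULong)
  Dim As ULong Ptr p = @frameBuf(0)
  For i As Long = 0 To OutW * OutH - 1
    p[i] = colour
  Next
End Sub

Dim As Double t0 = Timer          ' Timer returns seconds as a Double
For f As Long = 1 To Frames
  FillFrame(f)
Next
Dim As Double elapsed = Timer - t0

Print "Average per frame (ms): "; elapsed * 1000 / Frames
```

If this number is far below the 16.7 ms available per frame at 60 Hz, the pixel writes themselves are probably not the bottleneck and the surrounding loops or the change detection deserve a closer look.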

Postby fxm » Dec 19, 2010 17:27

- I don't know SDL.

- In this specific case only (to speed things up), you can also try moving the other declarations out of SUB RefreshScreen(), so that they are not re-declared (with the associated allocation on the program stack) on every call:

Code:

DIM SHARED AS UINTEGER PTR sPtr32   ' a shared initializer must be a constant: assign sPtr32 = Screenptr once the screen mode is set
DIM SHARED AS UINTEGER x, y
DIM SHARED AS UINTEGER a, b, c ' , d
DIM SHARED AS UBYTE ScaleStepX, ScaleStepY
dafhi
Posts: 1258
Joined: Jun 04, 2005 9:51

Postby dafhi » Dec 20, 2010 10:47

hey Fox!

Code:

Dim SHARED AS UBYTE ScaleStepX, ScaleStepY
Dim Shared As UInteger GWxGSF, GraphicScaleFactorM

' ...
' ...
' ...

Sub RefreshScreen()
Dim As UInteger PTR sPtr32 = Screenptr
DIM As UInteger x, y
Dim As UINTEGER a, b, c
  DetectScreenChanges()
  IF (LcdModifiedFirstY < 255) THEN
    GWxGSF = GraphicWidth * GraphicScaleFactor
    GraphicScaleFactorM = GraphicScaleFactor - 1
    SCREENLOCK()
    FOR ScaleStepY = 0 TO GraphicScaleFactorM STEP FastScale
      a = (GraphicWidth * ScaleStepY) + ScreenBorderRight + (ScreenBorderTop * GraphicWidth)
      FOR y = LcdModifiedFirstY TO LcdModifiedLastY
        IF (LcdModifiedScanline(y) = 1) THEN
          b = y * GWxGSF + a
          c = LcdModifiedFirstX * GraphicScaleFactor + b  ' offset of the first modified column (x is not set yet at this point)
          FOR x = LcdModifiedFirstX TO LcdModifiedLastX
            FOR ScaleStepX = 0 TO GraphicScaleFactorM
                  sPtr32[c + ScaleStepX] = ScreenPalette32(ScreenBuffer(x, y))
            NEXT ScaleStepX
            c += GraphicScaleFactor
          NEXT x
        END IF
      NEXT y
    NEXT ScaleStepY
    SCREENUNLOCK()
  END IF
END SUB


Pending modifications in your other subs, you might also be able to keep ready-made 32-bit colors in a 2D array and skip the palette lookup, like this:
sPtr32[c + ScaleStepX] = ScreenPalette32(x, y)

FreeBASIC's graphics are fast. I've used SDL mainly for its ability to make a resizable window.

Postby dafhi » Dec 20, 2010 12:20

I just visualized exactly what this sub is for .. magnification.

I'll share some tidbits I've found over the years. A long time ago, the online documentation for a commercial BASIC mentioned that its compiler was designed to put the variables of the innermost nested loops into CPU registers.

I'm thinking most modern compilers use a similar technique.

So I'm guessing this is good:

Sub RefreshScreen()
Dim As UInteger PTR sPtr32
DIM As UInteger x, y
Dim As UINTEGER a, b, c

...

let's discuss a new loop

Code:

#Include "fbgfx.bi"

Using FB
Dim Shared As Event e   'Allow user to escape anim loop

Dim Shared As Const UInteger   ScaleX=4,ScaleY=4
Dim Shared As Const UInteger   Width8=100,Height8=100,LenInteger = Len(Integer)
Dim Shared As UInteger         BufWidth,BufHeight, ScalePitch
Dim Shared As UInteger         WidthM, HeightM, Width8M, Height8M

' Compute the scaled buffer size first: the "M" (minus one) bounds depend on it
BufWidth = Width8 * ScaleX
BufHeight = Height8 * ScaleY
ScalePitch = BufWidth * ScaleY
WidthM = BufWidth - 1
HeightM = BufHeight - 1
Width8M = Width8 - 1
Height8M = Height8 - 1

Dim Shared As UByte            Image8(Width8M,Height8M)
Dim Shared As UInteger         Image32(WidthM,HeightM)
Dim Shared As UInteger        Pal32(31)
Dim Shared As UInteger         MicroX, MicroY
Dim Shared As UInteger Ptr      sPtrY, sPtrY2, sPtrY3

Sub RandPixels8
Dim As UInteger I, J
   For I = 0 To Height8M
      For J = 0 To Width8M
         Image8(J,I) = Int(Rnd * 32)   ' array is indexed (x, y): J is the column, I the row
      Next
   Next
   For I = 0 To 31
      Pal32(I) = Int(Rnd * 16777216)
   Next
End Sub
Sub RefreshScreen()
Dim As UInteger PTR sPtr32
Dim As UINTEGER x, y, MyColor
sPtrY3 = ScreenPtr
For MicroY = 0 to Height8M
  sPtrY2 = sPtrY3   
  For MicroX = 0 to Width8M
    MyColor = Pal32(Image8(MicroX, MicroY))
    sPtrY = sPtrY2
    For y = 1 To ScaleY
      sPtr32 = sPtrY
      For x = 1 To ScaleX
        *sPtr32 = MyColor
          sPtr32 += 1
      Next
      sPtrY += BufWidth
    Next
    sPtrY2 += ScaleX
  Next MicroX
  sPtrY3 += ScalePitch
Next MicroY
End Sub
Sub KeyEvents '' fbgfx.bi
   If (screenevent(@e)) then
      select case e.type
      case EVENT_KEY_PRESS
         If e.scancode = SC_ESCAPE Then End
      End Select
   End If
End Sub

ScreenRes BufWidth,BufHeight,32,,&h20

Do
   KeyEvents
   RandPixels8
   ScreenLock
   RefreshScreen
   ScreenUnLock
   Sleep 8 ''let Operating System breathe
Loop
Last edited by dafhi on Dec 21, 2010 5:14, edited 11 times in total.
TJF
Posts: 3486
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Postby TJF » Dec 20, 2010 12:22

@Fox:

Did you think about hardware acceleration yet? From my point of view using cairo or OpenGL will be a good idea (faster and less effort).
sir_mud
Posts: 1401
Joined: Jul 29, 2006 3:00
Location: US

Postby sir_mud » Dec 20, 2010 22:46

Use Static for variables in functions if you don't want them to be reallocated every time. It doesn't clutter up the global namespace and it preserves locality. Also, the code seems very complex for what it does; maybe it just needs to be simplified?
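A minimal illustration of the difference, using a made-up function (CallCount is not from the code in this thread): the Static variable is allocated once and keeps its value between calls, while the Dim variable is a fresh stack local each time.

```freebasic
' Counts its own calls to show that a Static local survives between calls.
Function CallCount() As Long
  Static As Long calls      ' allocated once, zero-initialised, visible only in this function
  Dim As Long scratch       ' ordinary local: created on the stack and zeroed on every call
  scratch += 1              ' always 1 after this line
  calls += scratch          ' accumulates across calls
  Return calls
End Function

Print CallCount()   ' first call returns 1
Print CallCount()   ' second call returns 2
Print CallCount()   ' third call returns 3
```

Since creating a handful of scalar locals on the stack is cheap anyway, the speed gain from Static in a routine like RefreshScreen is likely to be small.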

Postby Fox » Dec 21, 2010 18:37

Hi all, thanks for all your suggestions :)

I replaced all my DIM declarations with STATIC declarations, but it didn't change the overall speed in any noticeable way... Still, the most time-consuming part is writing to the screen.

@TJF: I have absolutely no experience with hardware acceleration... But indeed, it could be a good idea to see what can be achieved with OpenGL - I will definitely check this out :)

Postby TJF » Dec 21, 2010 19:26

@Fox:

For 2D OpenGL you can find some examples here:

http://other.paul-grunewald.de/ogl/

The text is in German, but the examples are in FreeBasic :)

Postby dafhi » Dec 22, 2010 1:53

@Fox

My latest edit has a working sample of animated random pixels - if you haven't seen that, do check it out =)

In short, if your 8-bit buffer is the only *source* data, you'll want to do this:

For y =
  For x = ..
    mycolor = Pal32(Image8(x,y))
    For yDest = ..
      For xDest = ..
        sPtr32 = sBase + yDest * Buf32Width + xDest   ' sBase: the screen pointer as a UInteger Ptr (name assumed)
        *sPtr32 = mycolor

...

Because all you're doing is creating "big" pixels right?

If you're copying data from other sources also, I can see why you'd need to do it the way you're doing .. but I realized one more optimization if that's the case ..

FOR x = LcdModifiedFirstX TO LcdModifiedLastX
  MyColor = ScreenPalette32(ScreenBuffer(x, y))
  FOR ScaleStepX = 0 TO GraphicScaleFactorM
    sPtr32[c + ScaleStepX] = MyColor
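Put together, the "one lookup per source pixel" idea could look like the sketch below. It goes one step further than what was posted above: each magnified row is built once and then duplicated with memcpy, which is a swapped-in technique, not something from the code in this thread. ScaleFrame, its parameters and the demo arrays are hypothetical names, the pitch is counted in pixels, and ULong is used because it stays 32 bits wide on 64-bit FreeBASIC.

```freebasic
#include "crt/string.bi"   ' memcpy

Const SrcW = 160, SrcH = 144

' Magnify src() into dst. dstPitch is the destination row length in pixels.
Sub ScaleFrame(ByVal dst As ULong Ptr, ByVal dstPitch As Long, ByVal factor As Long, _
               src() As UByte, pal() As ULong)
  Dim As Long rowBytes = SrcW * factor * SizeOf(ULong)
  For y As Long = 0 To SrcH - 1
    Dim As ULong Ptr firstRow = dst + y * factor * dstPitch
    Dim As ULong Ptr p = firstRow
    For x As Long = 0 To SrcW - 1
      Dim As ULong c = pal(src(x, y))     ' one palette lookup per source pixel
      For sx As Long = 1 To factor
        *p = c
        p += 1
      Next
    Next
    ' The other (factor - 1) rows of this band are identical: copy instead of recomputing
    For sy As Long = 1 To factor - 1
      memcpy(firstRow + sy * dstPitch, firstRow, rowBytes)
    Next
  Next
End Sub

' Demo on an off-screen buffer (no window needed)
Dim Shared As UByte img(0 To SrcW - 1, 0 To SrcH - 1)
Dim Shared As ULong pal32(0 To 255)
Dim Shared As ULong outBuf(0 To SrcW * 4 * SrcH * 4 - 1)

pal32(1) = &hFF0000          ' palette entry 1 = red
img(1, 0) = 1                ' second pixel of the first source row
ScaleFrame(@outBuf(0), SrcW * 4, 4, img(), pal32())
Print Hex(outBuf(4))         ' first pixel of the 4x4 block produced by img(1, 0)
```

With a real window, dst would presumably be the ScreenPtr value (used between ScreenLock and ScreenUnlock) and dstPitch the byte pitch reported by ScreenInfo divided by 4, assuming a 32 bpp mode.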

Postby Fox » Dec 22, 2010 7:26

@TJF: Ah zoo, ich spreche kein deutsche. Das ist a hund. Pommes-frittes bitte. (and that's about it for my German skills) :-D But I will definitely look into these examples, they look interesting (and maybe running the whole website through a translator will help me understand what it's about) :)

@dafhi: Yeah, I saw your code, I just haven't had time yet to dig into it, but today I am starting to understand it! Using pointers instead of loading color values looks like a really neat idea, I will test this when I am back home. I didn't know such a trick was possible at all (and yes, my 8-bit array is the only source for graphics in my program, and all I am doing is indeed drawing "big pixels") :)
