How to write to screen.... in a FASTER way?

Fox
Posts: 353
Joined: Aug 08, 2006 13:39
Location: Lille, France

Post by Fox »

Hi all!

I am currently working on a program that tries to refresh the screen at a rate of ~60 Hz.

Unfortunately, this is painfully *slow* :(

The screen size I am working on is 160x144 magnified 4x (that is really 640x576).

This is my current code:

Code: Select all

#MACRO DetectScreenChanges()
  LcdModifiedFirstY = 255
  LcdModifiedFirstX = 255
  LcdModifiedLastX = 0
  FOR y = 0 TO 143
    LcdModifiedScanline(y) = 0
    FOR x = 0 TO 159
      IF ScreenBuffer(x, y) <> ScreenOldBuffer(x, y) THEN
        ScreenOldBuffer(x, y) = ScreenBuffer(x, y)
        IF LcdModifiedScanline(y) = 0 THEN
          LcdModifiedScanline(y) = 1
          IF (LcdModifiedFirstY > y) THEN LcdModifiedFirstY = y
          LcdModifiedLastY = y
        END IF
        IF (LcdModifiedFirstX > x) THEN LcdModifiedFirstX = x
        IF (LcdModifiedLastX < x) THEN LcdModifiedLastX = x
      END IF
    NEXT x
  NEXT y
#ENDMACRO


SUB RefreshScreen()
  DIM AS UINTEGER x, y
  DetectScreenChanges()

  IF (LcdModifiedFirstY < 255) THEN
    DIM AS UINTEGER a, b, c, d
    DIM AS UBYTE ScaleStepX, ScaleStepY
    DIM AS UINTEGER PTR sPtr32 = Screenptr
    SCREENLOCK()
    FOR ScaleStepY = 0 TO (GraphicScaleFactor - 1) STEP FastScale
      a = (GraphicWidth * ScaleStepY) + ScreenBorderRight + (ScreenBorderTop * GraphicWidth)
      FOR y = LcdModifiedFirstY TO LcdModifiedLastY
        IF (LcdModifiedScanline(y) = 1) THEN
          b = y * GraphicWidth * GraphicScaleFactor
          FOR x = LcdModifiedFirstX TO LcdModifiedLastX
            c = (x * GraphicScaleFactor) + b + a
            FOR ScaleStepX = 0 TO (GraphicScaleFactor - 1)
              sPtr32[c + ScaleStepX] = ScreenPalette32(ScreenBuffer(x, y))
            NEXT ScaleStepX
          NEXT x
        END IF
      NEXT y
    NEXT ScaleStepY
    SCREENUNLOCK()
  END IF
END SUB
I render my screen's content into an array called "ScreenBuffer(x,y)", and then I call this sub (RefreshScreen) to draw it on screen. ScreenBuffer() is an array of UBYTEs, and each value is an index into the palette (palette colors are stored in the array called ScreenPalette32()).

The scale factor (GraphicScaleFactor) is equal to 4, so the loop counters ScaleStepX and ScaleStepY each run from 0 to 3 (the factor could be anything, to scale the screen more or less), and the variable "FastScale" is always equal to 1.

There is also a routine called "DetectScreenChanges()", which detects whether the screen changed and fills in the variables LcdModifiedFirstX, LcdModifiedFirstY, LcdModifiedLastX and LcdModifiedLastY (this way I don't have to redraw the screen if its content hasn't changed).
I also keep track of modifications on every line, to avoid redrawing screen lines that haven't changed since the last refresh (that's what the LcdModifiedScanline() array is for).

Is there any chance for me to speed up this code? I wonder if using SDL would improve the refresh speed?
Last edited by Fox on Dec 19, 2010 16:50, edited 1 time in total.
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Post by fxm »

Unless I am mistaken, here are some very slight improvements in the loops (in both #MACRO DetectScreenChanges() and SUB RefreshScreen()):

Code: Select all

#MACRO DetectScreenChanges()
  LcdModifiedFirstY = 255
  LcdModifiedFirstX = 255
  LcdModifiedLastX = 0
  FOR y = 0 TO 143
    LcdModifiedScanline(y) = 0
    FOR x = 0 TO 159
      IF ScreenBuffer(x, y) <> ScreenOldBuffer(x, y) THEN
        ScreenOldBuffer(x, y) = ScreenBuffer(x, y)
        IF LcdModifiedScanline(y) = 0 THEN
          IF (LcdModifiedFirstY > y) THEN LcdModifiedFirstY = y
          LcdModifiedLastY = y
          LcdModifiedScanline(y) = 1
        END IF
        IF (LcdModifiedFirstX > x) THEN LcdModifiedFirstX = x
        IF (LcdModifiedLastX < x) THEN LcdModifiedLastX = x
      END IF
    NEXT x
  NEXT y
#ENDMACRO


SUB RefreshScreen()
  DIM AS UINTEGER x, y
  DetectScreenChanges()
  IF (LcdModifiedFirstY < 255) THEN
    DIM AS UINTEGER a, b, c, d
    DIM AS UBYTE ScaleStepX, ScaleStepY
    DIM AS UINTEGER PTR sPtr32 = Screenptr
    SCREENLOCK()
    FOR ScaleStepY = 0 TO (GraphicScaleFactor - 1) STEP FastScale
      a = (GraphicWidth * ScaleStepY) + ScreenBorderRight + (ScreenBorderTop * GraphicWidth)
      FOR y = LcdModifiedFirstY TO LcdModifiedLastY
        IF (LcdModifiedScanline(y) = 1) THEN
          b = y * GraphicWidth * GraphicScaleFactor + a
          FOR x = LcdModifiedFirstX TO LcdModifiedLastX
            c = (x * GraphicScaleFactor) + b
            FOR ScaleStepX = 0 TO (GraphicScaleFactor - 1)
              sPtr32[c + ScaleStepX] = ScreenPalette32(ScreenBuffer(x, y))
            NEXT ScaleStepX
          NEXT x
        END IF
      NEXT y
    NEXT ScaleStepY
    SCREENUNLOCK()
  END IF
END SUB
 
"A fellow countryman who is also under the snow, waiting for Christmas!"
Fox
Posts: 353
Joined: Aug 08, 2006 13:39
Location: Lille, France

Post by Fox »

fxm wrote: I hope I am not mistaken, and here are some very slight improvements in the loops
Thank you, dear compatriot! :)

Thanks, I edited my first post to add your optimizations (and also added another very small one that I just noticed).
Still, as you say, these loop optimizations are really minor and don't provide any noticeable speedup :-P

What's eating most of my CPU time seems to be this:

Code: Select all

DIM AS UINTEGER PTR sPtr32 = Screenptr
SCREENLOCK()
(...)
sPtr32[c + ScaleStepX] = ScreenPalette32(ScreenBuffer(x, y))
(...)
SCREENUNLOCK()
I thought that writing directly into the ScreenPtr memory area was the fastest possible way to do graphics in FreeBASIC... but well... it appears to be still too slow :-/
Is there really no faster way? I am wondering about trying out SDL... any chance it would be faster? Does anyone have experience with that?
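
Before switching libraries, it may help to measure where the time actually goes. The following is only a minimal sketch: `AverageSeconds` and `FakeRefresh` are hypothetical names that do not come from the thread, and the workload is a stand-in that writes to ordinary memory rather than to the screen. Only `Timer` is a FreeBASIC built-in here.

```freebasic
' Hypothetical helper (not from the thread): average wall-clock seconds
' spent in one call of `work`, measured over `repeats` calls.
Function AverageSeconds(ByVal work As Sub(), ByVal repeats As Integer) As Double
    Dim As Double t0 = Timer            ' Timer returns elapsed seconds as a Double
    For i As Integer = 1 To repeats
        work()
    Next
    Return (Timer - t0) / repeats
End Function

' Stand-in workload: fills a 640x576 buffer of 32-bit values in plain memory.
' Replace it with the real RefreshScreen to time the actual drawing code.
Dim Shared As ULong dummy(0 To 640 * 576 - 1)

Sub FakeRefresh()
    For i As Integer = 0 To UBound(dummy)
        dummy(i) = CULng(i)
    Next
End Sub

Print "average ms per call: "; AverageSeconds(@FakeRefresh, 100) * 1000
```

Timing the change detection and the pixel writes separately in this way would show which of the two dominates, rather than relying on a guess.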
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Post by fxm »

- I don't know SDL.

- In this specific case only (for speed), you can also try moving the other declarations out of SUB RefreshScreen(), so that they are not declared and allocated on the program stack again at every call:

Code: Select all

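DIM SHARED AS UINTEGER PTR sPtr32   ' no initializer here: Screenptr is not a constant
DIM SHARED AS UINTEGER x, y
DIM SHARED AS UINTEGER a, b, c ' , d
DIM SHARED AS UBYTE ScaleStepX, ScaleStepY
sPtr32 = Screenptr   ' assign once, after the graphics mode has been set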
dafhi
Posts: 1641
Joined: Jun 04, 2005 9:51

Post by dafhi »

hey Fox!

Code: Select all

Dim SHARED AS UBYTE ScaleStepX, ScaleStepY
Dim Shared As UInteger GWxGSF, GraphicScaleFactorM

' ...
' ...
' ...

Sub RefreshScreen()
Dim As UInteger PTR sPtr32 = Screenptr
DIM As UInteger x, y
Dim As UINTEGER a, b, c
  DetectScreenChanges()
  IF (LcdModifiedFirstY < 255) THEN
    GWxGSF = GraphicWidth * GraphicScaleFactor
    GraphicScaleFactorM = GraphicScaleFactor - 1
    SCREENLOCK()
    FOR ScaleStepY = 0 TO GraphicScaleFactorM STEP FastScale
      a = (GraphicWidth * ScaleStepY) + ScreenBorderRight + (ScreenBorderTop * GraphicWidth)
      FOR y = LcdModifiedFirstY TO LcdModifiedLastY
        IF (LcdModifiedScanline(y) = 1) THEN
          b = y * GWxGSF + a
          c = LcdModifiedFirstX * GraphicScaleFactor + b   ' offset of the first modified column
          FOR x = LcdModifiedFirstX TO LcdModifiedLastX
            FOR ScaleStepX = 0 TO GraphicScaleFactorM
              sPtr32[c + ScaleStepX] = ScreenPalette32(ScreenBuffer(x, y))
            NEXT ScaleStepX
            c += GraphicScaleFactor
          NEXT x
        END IF
      NEXT y
    NEXT ScaleStepY
    SCREENUNLOCK()
  END IF
END SUB
Depending on how your other subs can be modified, you might also be able to keep ready-made 32-bit colors in a two-dimensional array and skip the palette lookup:
sPtr32[c + ScaleStepX] = ScreenPalette32(x, y)

FreeBASIC's graphics are fast. I've used SDL mainly for its ability to create a resizable window.
dafhi
Posts: 1641
Joined: Jun 04, 2005 9:51

Post by dafhi »

I just visualized exactly what this sub is for: magnification.

I'll share some tidbits I've picked up over the years. There's a commercial BASIC whose online documentation (a long time ago) mentioned that its compiler was designed to put the variables of the innermost nested loops into CPU registers.

I'm thinking most modern compilers use a similar technique.

So I'm guessing it is good to keep these as locals of the sub:

Sub RefreshScreen()
Dim As UInteger PTR sPtr32
DIM As UInteger x, y
Dim As UINTEGER a, b, c

...

Let's discuss a new loop:

Code: Select all

#Include "fbgfx.bi"

Using FB
Dim Shared As Event e	'Allow user to escape anim loop

Dim Shared As Const UInteger	ScaleX=4,ScaleY=4
Dim Shared As Const UInteger	Width8=100,Height8=100,LenInteger = Len(Integer)
Dim Shared As UInteger			BufWidth,BufHeight, ScalePitch
Dim Shared As UInteger			WidthM, HeightM, Width8M, Height8M

BufWidth = Width8 * ScaleX	' compute the buffer size first ...
BufHeight = Height8 * ScaleY
WidthM = BufWidth - 1		' ... because these two depend on it
HeightM = BufHeight - 1
Width8M = Width8 - 1
Height8M = Height8 - 1
ScalePitch = BufWidth * ScaleY

Dim Shared As UByte				Image8(Width8M,Height8M)
Dim Shared As UInteger			Image32(WidthM,HeightM)
Dim Shared As UInteger  		Pal32(31)
Dim Shared As UInteger			MicroX, MicroY
Dim Shared As UInteger Ptr		sPtrY, sPtrY2, sPtrY3

Sub RandPixels8
Dim As UInteger I, J
	For I = 0 To Height8M
		For J = 0 To Width8M
			Image8(J,I) = Int(Rnd * 32)	'(x, y) order, as used in RefreshScreen
		Next
	Next
	For I = 0 To 31
		Pal32(I) = Int(Rnd * 16777216)
	Next
End Sub
Sub RefreshScreen()
Dim As UInteger PTR sPtr32
Dim As UINTEGER x, y, MyColor
sPtrY3 = ScreenPtr
For MicroY = 0 to Height8M
  sPtrY2 = sPtrY3	
  For MicroX = 0 to Width8M
    MyColor = Pal32(Image8(MicroX, MicroY))
    sPtrY = sPtrY2
    For y = 1 To ScaleY
      sPtr32 = sPtrY
      For x = 1 To ScaleX
        *sPtr32 = MyColor
       	sPtr32 += 1
      Next
      sPtrY += BufWidth 
    Next
    sPtrY2 += ScaleX
  Next MicroX
  sPtrY3 += ScalePitch
Next MicroY
End Sub
Sub KeyEvents '' fbgfx.bi
	If (screenevent(@e)) then
		select case e.type
		case EVENT_KEY_PRESS
			If e.scancode = SC_ESCAPE Then End
		End Select
	End If
End Sub

ScreenRes BufWidth,BufHeight,32,,&h20

Do
	KeyEvents
	RandPixels8
	ScreenLock
	RefreshScreen
	ScreenUnLock
	Sleep 8 ''let Operating System breathe
Loop
Last edited by dafhi on Dec 21, 2010 5:14, edited 11 times in total.
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Post by TJF »

@Fox:

Have you thought about hardware acceleration yet? From my point of view, using Cairo or OpenGL would be a good idea (faster and less effort).
sir_mud
Posts: 1401
Joined: Jul 29, 2006 3:00
Location: US

Post by sir_mud »

Use STATIC for variables in functions if you don't want them to be reallocated every time. It doesn't clutter up the global namespace, and it preserves locality. Also, the code seems very complex for what it does; maybe it just needs to be simplified?
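
To illustrate the difference between the two kinds of local variable, here is a small sketch. The function names `NextFrameNumber` and `AlwaysOne` are made up for this example and do not appear in the thread.

```freebasic
' Hypothetical demo functions (not from the thread).
Function NextFrameNumber() As Integer
    Static As Integer frame   ' set to 0 once; keeps its value between calls
    frame += 1
    Return frame
End Function

Function AlwaysOne() As Integer
    Dim As Integer frame      ' created and zeroed again on every call
    frame += 1
    Return frame
End Function

Print NextFrameNumber()   ' first call
Print NextFrameNumber()   ' second call continues from the stored value
Print AlwaysOne()
Print AlwaysOne()         ' same result as the line above
```

A STATIC variable stays visible only inside its procedure, which is the locality advantage over DIM SHARED mentioned above. Whether it brings a measurable speedup for a handful of integers is a separate question that only timing can answer.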
Fox
Posts: 353
Joined: Aug 08, 2006 13:39
Location: Lille, France

Post by Fox »

Hi all, thanks for all your suggestions :)

I replaced all my DIM declarations with STATIC declarations, but it didn't change the overall speed in any noticeable way... The most time-consuming part is still the operation of writing to the screen.

@TJF: I have absolutely no experience with hardware acceleration... But indeed, it could be a good idea to see what can be achieved with OpenGL. I will definitely check this out :)
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Post by TJF »

@Fox:

For 2D OpenGL you can find some examples here:

http://other.paul-grunewald.de/ogl/

The text is in German, but the examples are in FreeBASIC :)
dafhi
Posts: 1641
Joined: Jun 04, 2005 9:51

Post by dafhi »

@Fox

My latest edit has a working sample of animated random pixels. If you haven't seen it, do check it out =)

In short, if your only *source* data is the 8-bit array, you'll want to do this:

For y = ..
  For x = ..
    mycolor = Pal32(Image8(x,y))
    For yDest = ..
      For xDest = ..
        sPtr32 = CPtr(UInteger Ptr, ScreenPtr) + yDest * Buf32Width + xDest
        *sPtr32 = mycolor

...

Because all you're doing is creating "big" pixels, right?

If you're also copying data from other sources, I can see why you'd need to do it the way you're doing, but I realized one more optimization if that's the case: look up the color once per source pixel instead of once per screen pixel.

FOR x = LcdModifiedFirstX TO LcdModifiedLastX
  MyColor = ScreenPalette32(ScreenBuffer(x, y))
  FOR ScaleStepX = 0 TO GraphicScaleFactorM
    sPtr32[c + ScaleStepX] = MyColor
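
The "big pixel" idea can be tried out away from the screen. The sketch below magnifies a tiny 8-bit indexed image into a 32-bit buffer held in ordinary memory. All names (`Magnify`, `src`, `dst`, `pal`, `ZOOM`) are illustrative and not from the thread, and it assumes the destination rows are contiguous. It uses ULong for pixels because, as far as I know, UInteger is 64 bits wide on 64-bit builds of current FreeBASIC, which would not match a 32 bpp pixel.

```freebasic
' Sketch only: all names here are illustrative, not taken from the thread.
Const SRC_W = 4, SRC_H = 2, ZOOM = 3
Const DST_W = SRC_W * ZOOM, DST_H = SRC_H * ZOOM

Dim Shared As UByte src(0 To SRC_W * SRC_H - 1)   ' 8-bit palette indices, row-major
Dim Shared As ULong pal(0 To 255)                 ' 32-bit colors
Dim Shared As ULong dst(0 To DST_W * DST_H - 1)   ' stands in for a 32 bpp screen

' d points at the first destination pixel; dstPitch is the row length in
' pixels (assumption: rows are contiguous, no padding).
Sub Magnify(ByVal d As ULong Ptr, ByVal dstPitch As Integer)
    For y As Integer = 0 To SRC_H - 1
        For x As Integer = 0 To SRC_W - 1
            ' One palette lookup per source pixel, reused for ZOOM * ZOOM writes.
            Dim As ULong c = pal(src(y * SRC_W + x))
            Dim As ULong Ptr row = d + (y * ZOOM) * dstPitch + x * ZOOM
            For sy As Integer = 1 To ZOOM
                For sx As Integer = 0 To ZOOM - 1
                    row[sx] = c
                Next
                row += dstPitch                   ' step to the next destination line
            Next
        Next
    Next
End Sub

' Grey-scale palette and a single bright source pixel at (1, 0).
For i As Integer = 0 To 255
    pal(i) = RGB(i, i, i)
Next
src(1) = 200
Magnify(@dst(0), DST_W)
```

To draw on the real screen, the same sub could be called between SCREENLOCK and SCREENUNLOCK with the screen pointer cast to ULong Ptr, provided the pitch passed in matches the actual screen pitch in pixels.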
Fox
Posts: 353
Joined: Aug 08, 2006 13:39
Location: Lille, France
Contact:

Post by Fox »

@TJF: Ah zoo, ich spreche kein deutsche. Das ist a hund. Pommes-frittes bitte. (and that's it for my German skills) :-D But I will definitely look into these examples, they look interesting (and maybe running the whole website through a translator will help me understand what it's about) :)

@dafhi: Yeah, I saw your code, I just hadn't had time yet to dig into it, but today I am starting to understand it! Using pointers instead of loading color values looks like a really neat idea; I will test this when I am back home. I didn't know such a trick was possible at all (and yes, my 8-bit array is the only source of graphics in my program, and all I am doing is indeed drawing "big pixels") :)