RGB and RGBA

speedfixer

RGB and RGBA

Postby speedfixer » Oct 19, 2019 9:06

Shouldn't the return value be a ulong instead of uinteger?
fxm

Re: RGB and RGBA

Postby fxm » Oct 19, 2019 9:20

dkl's answer in 2015 on the same question about the FB 1.02 release (viewtopic.php?p=206790#p206790):
dkl wrote:
D.J.Peters wrote: Have the RGB and RGBA macros been changed too?

So far they're still using UInteger; but yea, it might be useful to change them to ULong.

That must have been forgotten afterwards (or not applied, for a reason I do not know).
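
For readers following along, a minimal sketch of the size difference behind the question. It only relies on the documented widths of the standard types (Long/ULong are always 32-bit, Integer/UInteger follow the target's pointer size); it makes no claim about what RGB returns in any particular fbc release.

```freebasic
' ULong is 4 bytes on every target; UInteger is 4 bytes on 32-bit
' targets and 8 bytes on 64-bit targets.
print "sizeof(ulong)    = "; sizeof(ulong)
print "sizeof(uinteger) = "; sizeof(uinteger)

#ifdef __FB_64BIT__
   print "64-bit target: a UInteger colour is twice as wide as a ULong"
#else
   print "32-bit target: UInteger and ULong have the same size"
#endif
```

So a 32-bit colour value fits in either type, but only ULong has the same size on both targets.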
dodicat

Re: RGB and RGBA

Postby dodicat » Oct 19, 2019 18:49

For me, using double is faster than ulong here (by about 5% to 10%).
Tested on 32-bit and 64-bit, with gas (32-bit only) and gcc, with and without optimisations.
So is double (on both 32-bit and 64-bit) a feasible alternative to ulong for a colour?
Also, if I use double for the loop variables,
e.g.
for x as double=0 to w-1
for y as double=0 to h-1
pd=point(x,y)
next
next
I get an extra boost.

Processor: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz (2 CPUs), ~3.0GHz
using win 10

Code: Select all

 

screen 20,32
dim as integer w,h
screeninfo w,h

dim as double pd
dim as ulong pu
dim as double accd
dim as double accu
dim as double t,t2

for x as long=0 to w-1
    for y as long=0 to h-1
        pset(x,y),rgb(rnd*255,rnd*255,rnd*255)
       point(x,y)   'warm up
   next
next
'==============

for k as long=1 to 14
t=timer
for x as long=0 to w-1
    for y as long=0 to h-1
       pd=point(x,y)
   next
next
t2=timer
accd+=(t2-t)
dim as ulong clr=pd
print "time for double",t2-t,clr;" = ";cast(ubyte ptr,@clr)[2];" ";cast(ubyte ptr,@clr)[1];" ";cast(ubyte ptr,@clr)[0]

sleep 50
t=timer
for x as long=0 to w-1
    for y as long=0 to h-1
       pu=point(x,y)
   next
next
t2=timer
accu+=(t2-t)
print "time for ulong",t2-t,pu;" = ";cast(ubyte ptr,@pu)[2];" ";cast(ubyte ptr,@pu)[1];" ";cast(ubyte ptr,@pu)[0]

print
next k

print "Total time for double ";accd
print "Total time for ulong  ";accu
sleep
 
jj2007

Re: RGB and RGBA

Postby jj2007 » Oct 19, 2019 19:01

Intel Core i5:
With Gas, no difference
With Gcc32, ulong is 5% slower
With Gcc64, ulong is 30% slower

But I wonder what you are measuring here...

Code: Select all

t=timer
for x as long=0 to w-1
    for y as long=0 to h-1
       pd=point(x,y)
   next
next
t2=timer
accd+=(t2-t)

Code: Select all

t=timer
for x as long=0 to w-1
    for y as long=0 to h-1
       pu=point(x,y)
   next
next
t2=timer
accu+=(t2-t)
Last edited by jj2007 on Oct 19, 2019 19:08, edited 1 time in total.
fxm

Re: RGB and RGBA

Postby fxm » Oct 19, 2019 19:08

But if we swap the order of tests in the code (first, Ulong), Ulong becomes faster!
dodicat

Re: RGB and RGBA

Postby dodicat » Oct 19, 2019 19:42

jj2007,
I am using a double to capture a 32-bit colour instead of a ulong.

If I also use double for the loop variables and pick ulong/double at random, then double still has a small edge here with 32 bits, and ulong has a small edge with 64 bits.

Code: Select all


screen 20,32
dim as integer w,h
screeninfo w,h

dim as double pd
dim as ulong pu
dim as double accd
dim as double accu
dim as double t,t2
dim as long numdoubles,numulongs

for x as long=0 to w-1
    for y as long=0 to h-1
        pset (x,y),rgb(rnd*255,rnd*255,rnd*255)
       point(x,y)   'warm up
   next
next
'==============

#macro dbl
t=timer
for x as double=0 to w-1
    for y as double=0 to h-1
       pd=point(x,y)
   next
next
t2=timer
accd+=(t2-t)
dim as ulong clr=pd
print "time for double",t2-t,clr;" = ";cast(ubyte ptr,@clr)[2];" ";cast(ubyte ptr,@clr)[1];" ";cast(ubyte ptr,@clr)[0]
#endmacro

#macro ulng
t=timer
for x as long=0 to w-1
    for y as long=0 to h-1
       pu=point(x,y)
   next
next
t2=timer
accu+=(t2-t)
print "time for ulong",t2-t,pu;" = ";cast(ubyte ptr,@pu)[2];" ";cast(ubyte ptr,@pu)[1];" ";cast(ubyte ptr,@pu)[0]
#endmacro

randomize
for k as long=1 to 44
   
if rnd>.5 then
    ulng
    numulongs+=1
else
    dbl
    numdoubles+=1
end if

next k

print "Average time for double ";accd/numdoubles
print "Average time for ulong  ";accu/numulongs
sleep
 

Integer number crunching was always faster with fb.
Now it looks like floats are catching up (probably thanks to the CPUs).
I'll test on 32-bit Linux later.
coderJeff

Re: RGB and RGBA

Postby coderJeff » Nov 15, 2020 2:09

speedfixer wrote:Shouldn't the return value be a ulong instead of uinteger?


Following up on this thread and bug ticket #924, "RGB() & RGBA() macros return incorrect data-type".

The proposal is for the RGB and RGBA macros to return the ULONG type. Timings depend on the back end and the optimization level.

For example,
- 64-bit gcc -O 3 gives best results with UBYTE components and 'OR' operations
- 64-bit gcc -O 2 gives best results with longint components and '+' operations (but who would use longint for bytes??)
- 32-bit gas, doesn't really matter, it's kind of all the same

Some test code to print out timings:

Code: Select all

#define RGB_CONST(r,g,b) culng(r)
#define RGBA_CONST(r,g,b,a) culng(r)

#define RGB_OLD(r,g,b) culng((cuint(r) shl 16) or (cuint(g) shl 8) or cuint(b) or &hFF000000)
#define RGBA_OLD(r,g,b,a) culng((cuint(r) shl 16) or (cuint(g) shl 8) or cuint(b) or (cuint(a) shl 24))

#define RGB_NEW1(r, g, b)        culng((cubyte(255) shl 24) + (cubyte(r) shl 16) + (cubyte(g) shl 8) + cubyte(b))
#define RGBA_NEW1(r, g, b, a)    culng((cubyte(a) shl 24) + (cubyte(r) shl 16) + (cubyte(g) shl 8) + cubyte(b))

#define RGB_NEW2(r, g, b)        culng((cubyte(255) shl 24) or (cubyte(r) shl 16) or (cubyte(g) shl 8) or cubyte(b))
#define RGBA_NEW2(r, g, b, a)    culng((cubyte(a) shl 24) or (cubyte(r) shl 16) or (cubyte(g) shl 8) or cubyte(b))

#define RGB_NEW3(r, g, b)        culng(culng(cubyte(255) shl 24) + culng(cubyte(r) shl 16) + culng(cubyte(g) shl 8) + culng(cubyte(b)))
#define RGBA_NEW3(r, g, b, a)    culng(culng(cubyte(a) shl 24) + culng(cubyte(r) shl 16) + culng(cubyte(g) shl 8) + culng(cubyte(b)))

#define RGB_NEW4(r, g, b)        culng(culng(cubyte(255) shl 24) or culng(cubyte(r) shl 16) or culng(cubyte(g) shl 8) or culng(cubyte(b)))
#define RGBA_NEW4(r, g, b, a)    culng(culng(cubyte(a) shl 24) or culng(cubyte(r) shl 16) or culng(cubyte(g) shl 8) or culng(cubyte(b)))

/'
#define RGB_NEW3(r, g, b)        culng(culng(&hff000000) or (culng(r) shl 16) or (culng(g) shl 8) or culng(b))
#define RGBA_NEW3(r, g, b, a)    culng((culng(a) shl 24) or (culng(r) shl 16) or (culng(g) shl 8) or culng(b))

#define RGB_NEW4(r, g, b)        (cubyte(255) shl 24 + cubyte(r) shl 16 + cubyte(g) shl 8 + cubyte(b))
#define RGBA_NEW4(r, g, b, a)    (cubyte(a) shl 24 + cubyte(r) shl 16 + cubyte(g) shl 8 + cubyte(b))
'/

dim shared result as long

const COUNT = 10000000
const UNITN = 1000000
const UNIT  = "MHz"

#macro DO_RGB_TEST_F( func )

   t = timer
   i = count
   while i
      result += func( r, g, b )
      i -= 1

      r += 1
      r and= &hff
      g += 1
      g and= &hff
      b += 1
      b and= &hff
   wend
   t = timer - t
   print cuint( COUNT/t/UNITN ),

#endmacro

#macro DO_RGBA_TEST_F( func )
   t = timer
   i = count
   while i
      result += func( r, g, b, a )
      i -= 1

      a += 1
      a and= &hff
      r += 1
      r and= &hff
      g += 1
      g and= &hff
      b += 1
      b and= &hff
   wend
   t = timer - t
   print cuint( COUNT/t/UNITN ),
#endmacro

#macro DO_RGB_TEST_T( func0, func1, func2, func3, func4, func5, component_t )

   scope
      dim i as integer
      dim r as component_t = 0
      dim g as component_t = 0
      dim b as component_t = 0
      dim as double t

      print #component_t,

      DO_RGB_TEST_F( func0 )
      DO_RGB_TEST_F( func1 )
      DO_RGB_TEST_F( func2 )
      DO_RGB_TEST_F( func3 )
      DO_RGB_TEST_F( func4 )
      DO_RGB_TEST_F( func5 )
      
      print
   end scope
#endmacro

#macro DO_RGBA_TEST_T( func0, func1, func2, func3, func4, func5, component_t )

   scope
      dim i as integer
      dim a as component_t = 0
      dim r as component_t = 0
      dim g as component_t = 0
      dim b as component_t = 0
      dim as double t

      print #component_t,

      DO_RGBA_TEST_F( func0 )
      DO_RGBA_TEST_F( func1 )
      DO_RGBA_TEST_F( func2 )
      DO_RGBA_TEST_F( func3 )
      DO_RGBA_TEST_F( func4 )
      DO_RGBA_TEST_F( func5 )

      print
   end scope
#endmacro


#macro DO_TEST( test, func0, func1, func2, func3, func4, func5 )
   print #test
   print "Type", "NOP", "UL OLD", "UL UB +", "UL UB OR", "UL UL UB +", "UL UL UB or"
   test( func0, func1, func2, func3, func4, func5, byte )
   test( func0, func1, func2, func3, func4, func5, ubyte )
   test( func0, func1, func2, func3, func4, func5, short )
   test( func0, func1, func2, func3, func4, func5, ushort )
   test( func0, func1, func2, func3, func4, func5, long )
   test( func0, func1, func2, func3, func4, func5, ulong )
   test( func0, func1, func2, func3, func4, func5, longint )
   test( func0, func1, func2, func3, func4, func5, ulongint )
   test( func0, func1, func2, func3, func4, func5, single )
   test( func0, func1, func2, func3, func4, func5, double )
   print
#endmacro

#ifdef __FB_64BIT__
   print "64-bit - ";
#else
   print "32-bit - ";
#endif
print __FB_BACKEND__ & " - fbc " & __FB_VERSION__ & " - timings in " & UNIT

DO_TEST( DO_RGB_TEST_T , RGB_CONST , RGB_OLD , RGB_NEW1 , RGB_NEW2 , RGB_NEW3 , RGB_NEW4  )

DO_TEST( DO_RGBA_TEST_T, RGBA_CONST, RGBA_OLD, RGBA_NEW1, RGBA_NEW2, RGBA_NEW3, RGBA_NEW4 )

#if 1

print "testing that all macros produce the same output ... takes a while"

for a as integer = 0 to 255
   for r as integer = 0 to 255
      for g as integer = 0 to 255
         for b as integer = 0 to 255
            scope
               var c1 = RGB_OLD( r, g, b )
               var c2 = RGB_NEW1( r, g, b )
               var c3 = RGB_NEW2( r, g, b )
               var c4 = RGB_NEW3( r, g, b )
               var c5 = RGB_NEW4( r, g, b )
               if (( c1 <> c2 ) or ( c1 <> c3 ) or ( c1 <> c4 ) or ( c1 <> c5 )) then
                  print "ERROR"
                  end
               end if
            end scope
            scope
               var c1 = RGBA_OLD( r, g, b, a )
               var c2 = RGBA_NEW1( r, g, b, a )
               var c3 = RGBA_NEW2( r, g, b, a )
               var c4 = RGBA_NEW3( r, g, b, a )
               var c5 = RGBA_NEW4( r, g, b, a )
               if (( c1 <> c2 ) or ( c1 <> c3 ) or ( c1 <> c4 ) or ( c1 <> c5 )) then
                  print "ERROR"
                  end
               end if
            end scope
         next
      next
   next
next

print "OK"

#endif


There isn't going to be one fastest for every combination of code gen options, so I'm just going to pick something that looks decent.
speedfixer

Re: RGB and RGBA

Postby speedfixer » Nov 15, 2020 18:05

The issue had nothing to do with speed. It was that the byte size of the return value differs between targets, which **in function** required changes to code moving between 64-bit and 32-bit.

This matters when I move from one architecture to another: I had to make changes when I moved working code from my usual 64-bit dev systems to 32-bit micros (Pi, etc.). That just seems wrong.
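
One way to sidestep the issue in user code, sketched here under assumptions: `Colour32` and `MakeColour` are hypothetical names made up for this illustration, not part of FreeBASIC. The idea is simply to keep colours in a fixed-width ULong, using the same &hAARRGGBB layout as the macros tested earlier in the thread, so the stored size does not depend on what RGB returns.

```freebasic
' Hypothetical names (Colour32, MakeColour); a sketch only.
type Colour32 as ulong   ' always 4 bytes, on 32-bit and 64-bit targets

function MakeColour( r as ubyte, g as ubyte, b as ubyte, a as ubyte = 255 ) as Colour32
   ' Pack the components as &hAARRGGBB, as the macros in this thread do
   return (culng(a) shl 24) or (culng(r) shl 16) or (culng(g) shl 8) or culng(b)
end function

dim as Colour32 c = MakeColour( 10, 20, 30, 40 )
print hex( c, 8 )                  ' packed colour as 8 hex digits
print "bytes used: "; sizeof( c )
```

Code written this way should need no changes when moved between 32-bit and 64-bit targets, since every variable holding a colour is explicitly 4 bytes wide.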

Thanks for the attention.

david
