## fast sin compare

srvaldez
Posts: 1656
Joined: Sep 25, 2005 21:54

### fast sin compare

While trying to decipher the _Sin6th function from the post at viewtopic.php?f=7&t=26248, I searched the net for alternatives and found fast_sin at https://www.gamedev.net/forums/topic/62 ... oximation/
It turns out to be much faster and more accurate (about 3 decimal digits). Here are the times on my PC:

```
32-bit, gen gcc -Wc -O2
time for FB sin =  18.01501279999502
sum = -6.369349492274523e-013
time for fast_sin =  4.04145359992981
sum =  2.114974861910923e-013
time for _Sin6th =  11.92606279999018
sum =  67.97632024315362

64-bit gen gcc -Wc -O2
time for FB sin =  42.23232729989104
sum =  8.659739592076221e-015
time for fast_sin =  3.331463000038639
sum =  2.114974861910923e-013
time for _Sin6th =  6.449131899978966
sum =  67.97632024315362
```

One surprise was the much longer execution time for FB sin on x64, although its result is more accurate; the sum should be 0.
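The reason the sum should be 0: the loop runs over a range that is symmetric about zero, and sine is odd, sin(-x) = -sin(x), so every term has a partner that cancels it. A small sketch of that reasoning follows; the limit of 1000 is an arbitrary illustration value, not the benchmark range.

```freebasic
' Sketch: summing an odd function over a symmetric range.
' The limit of 1000 is an arbitrary illustration value.
dim as double running = 0, paired = 0

for x as double = -1000 to 1000
   running += sin(x)            ' same accumulation order as the benchmark loop
next

for x as double = 1 to 1000
   paired += sin(x) + sin(-x)   ' each pair cancels before it is accumulated
next

print "running sum = "; running
print "paired sum  = "; paired
```

The running sum usually ends up as a tiny nonzero value because of rounding during accumulation, which is why a result around 1e-13 still counts as "zero" in the timings.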

```
'https://www.freebasic.net/forum/viewtopic.php?f=7&t=26248
Function _Sin6th(fX As Double) As Double
    asm
      jmp _Sin6th_Start
         _Sin6th_Mul: .double 683565275.57643158
         _Sin6th_Div: .double -0.0000000061763971109087229
         _Sin6th_Rnd: .double 6755399441055744.0

      _Sin6th_Start:
         movq xmm0, [fX]
         mulsd xmm0, [_Sin6th_Mul]
         addsd xmm0, [_Sin6th_Rnd]
         movd ebx, xmm0

         lea  eax, [ebx*2+0x80000000]
         sar  eax, 2
         imul eax
         sar  ebx, 31
         lea  eax, [edx*2-0x70000000]
         lea  ecx, [edx*8+edx-0x24000000]
         imul edx
         xor  ecx, ebx
         lea  eax, [edx*8+edx+0x44A00000]
         imul ecx

         cvtsi2sd xmm0, edx
         mulsd xmm0, [_Sin6th_Div]
         movq [Function], xmm0
    End Asm
End Function

'https://www.gamedev.net/forums/topic/621589-extremely-fast-sin-approximation/
function fast_round(byval x as double) as long
   dim MAGIC_ROUND as const double = 6755399441055744.0
   union fast_trunc
      d as double
      type
         lw as long
         hw as long
      end type
   end union
   dim fast_trunc as fast_trunc
   fast_trunc.d = x
   fast_trunc.d += MAGIC_ROUND
   return fast_trunc.lw
end function

'https://www.gamedev.net/forums/topic/621589-extremely-fast-sin-approximation/
function fast_sin(byval x as double) as double
   dim PI as const double = 3.14159265358979323846264338327950288
   dim INVPI as const double = 0.31830988618379067153776752674502872
   dim A as const double = 0.00735246819687011731341356165096815
   dim B as const double = -0.16528911397014738207016302002888890
   dim C as const double = 0.99969198629596757779830113868360584
   dim k as long
   dim x2 as double
   k = fast_round(INVPI * x)
   x -= k * PI
   x2 = x * x
   x = x * (C + (x2 * (B + (A * x2))))
   if k mod 2 then
      x = -x
   end if
   return x
end function

dim as double t, y
t=timer
y=0
for x as double=-214748364 to 214748364
   y += sin(x)
next
print "time for FB sin = ";timer-t
print "sum = ";y
t=timer
y=0
for x as double=-214748364 to 214748364
   y += fast_sin(x)
next
print "time for fast_sin = ";timer-t
print "sum = ";y
t=timer
y=0
for x as double=-214748364 to 214748364
   y += _Sin6th(x)
next
print "time for _Sin6th = ";timer-t
print "sum = ";y
```
jj2007
Posts: 850
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

### Re: fast sin compare

FB32
Gas:

```
time for FB sin =  17.03938541668578
sum =  6.441038566222523e-013
time for fast_sin =  28.36938175893437
sum = -3.419486915845482e-014
time for _Sin6th =  9.583104912452711
sum =  67.97632024315428
```

-gen gcc -Wc -O2 -s console:

```
time for FB sin =  18.63207066554548
sum = -6.378231276471524e-013
time for fast_sin =  4.682086121603732
sum = -6.447741104052829e-013
time for _Sin6th =  10.41132472365041
sum =  67.97632024315428
```

No assembler code? I'd like to test it against Sinus(), which is about 3.5 times as fast as _Sin6th.
srvaldez
Posts: 1656
Joined: Sep 25, 2005 21:54

### Re: fast sin compare

hi jj2007
As you noticed, gcc does a good job at optimizing. That's one reason I frown on non-portable asm code: asm is fun, but inline asm in FB x64 is problematic. Try compiling with -O3 or -Ofast.
jj2007
Posts: 850
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

### Re: fast sin compare

srvaldez wrote:try compiling with -O3 or -Ofast.

I tried gcc -Wc -O3 and -Wc -O3 but it fails miserably with TmpFile.asm:340: Error: symbol `_Sin6th_Mul' is already defined
Same code compiles just fine with -gen gcc -Wc -O2
Probably a bug...
UEZ
Posts: 184
Joined: May 05, 2017 19:59
Location: Germany

### Re: fast sin compare

Well, when I replace only the calls to _Sin6th with fast_sin in the 3D Sine Wave demo, the frame rate drops from 64 fps to 50 fps!

Compiled with standard -s gui (Win32).
srvaldez
Posts: 1656
Joined: Sep 25, 2005 21:54

### Re: fast sin compare

@UEZ
you need to compile using gcc, for example: fbc 3D_Sine_Wave.bas -gen gcc -Wc -O2
fast sine is only faster when optimized using gcc, however FB graphics are quite a bit slower when compiled with gcc
UEZ
Posts: 184
Joined: May 05, 2017 19:59
Location: Germany

### Re: fast sin compare

srvaldez wrote:@UEZ
you need to compile using gcc, for example: fbc 3D_Sine_Wave.bas -gen gcc -Wc -O2
fast sine is only faster when optimized using gcc, however FB graphics are quite a bit slower when compiled with gcc

Indeed, the frame rate now goes up to 75 fps when only the calls within the _3Dto2D2 sub are changed to fast_sin.

Is there also a fast_cos counterpart function available?
srvaldez
Posts: 1656
Joined: Sep 25, 2005 21:54

### Re: fast sin compare

UEZ wrote:Is there also a fast_cos counterpart function available?

You can make a copy of fast_sin and change x to PI/2 - x, or you could change the calls to cos into fast_sin(PI/2 - x). Here's a modified fast_sin that gives cos:

```
'https://www.gamedev.net/forums/topic/621589-extremely-fast-sin-approximation/
function fast_cos(byval x as double) as double
   dim PI as const double = 3.14159265358979323846264338327950288
   dim PI2 as const double = 1.5707963267948966192313216916397514421
   dim INVPI as const double = 0.31830988618379067153776752674502872
   dim A as const double = 0.00735246819687011731341356165096815
   dim B as const double = -0.16528911397014738207016302002888890
   dim C as const double = 0.99969198629596757779830113868360584
   dim k as long
   dim x2 as double
   x = PI2 - x
   k = fast_round(INVPI * x)
   x -= k * PI
   x2 = x * x
   x = x * (C + (x2 * (B + (A * x2))))
   if k mod 2 then
      x = -x
   end if
   return x
end function
```
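The other option, leaving fast_sin untouched and shifting the argument at the call site, could look roughly like the sketch below, which also compares the result against the built-in Cos. The wrapper name fast_cos_via_sin and the constant HALF_PI are hypothetical, the scan range and step are arbitrary, and CLng() is used as a portable stand-in for fast_round() only so that the snippet compiles on its own.

```freebasic
' Sketch: cosine via the unmodified sine approximation.
' fast_cos_via_sin and HALF_PI are hypothetical names; CLng() is a
' portable stand-in for fast_round() so that this compiles on its own.
const HALF_PI as double = 1.5707963267948966

function fast_sin(byval x as double) as double
   const PI as double = 3.14159265358979323846264338327950288
   const INVPI as double = 0.31830988618379067153776752674502872
   const A as double = 0.00735246819687011731341356165096815
   const B as double = -0.16528911397014738207016302002888890
   const C as double = 0.99969198629596757779830113868360584
   dim as long k = clng(INVPI * x)      ' nearest multiple of PI
   x -= k * PI                          ' reduce to roughly -PI/2 .. PI/2
   dim as double x2 = x * x
   x = x * (C + (x2 * (B + (A * x2))))
   if k mod 2 then x = -x               ' odd multiples of PI flip the sign
   return x
end function

function fast_cos_via_sin(byval x as double) as double
   return fast_sin(HALF_PI - x)         ' cos(x) = sin(PI/2 - x)
end function

' Scan a little more than one period and record the largest deviation.
dim as double worst = 0
for x as double = -6.3 to 6.3 step 0.001
   dim as double e = abs(fast_cos_via_sin(x) - cos(x))
   if e > worst then worst = e
next
print "max abs error vs Cos = "; worst
```

Shifting at the call site avoids keeping two nearly identical functions in sync, at the cost of one extra subtraction and function call per evaluation.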
deltarho[1859]
Posts: 1566
Joined: Jan 02, 2017 0:34
Location: UK

### Re: fast sin compare

The following compiles with -O3 and -Ofast. Strap yourselves into your chair. <smile>

```
Function _Sin6th(fX As Double) As Double
   Asm
    jmp 0f
  1: .double 683565275.57643158
  2: .double -0.0000000061763971109087229
  3: .double 6755399441055744.0

  0:
     movq xmm0, [fX]
     mulsd xmm0, [1b]
     addsd xmm0, [3b]
     movd ebx, xmm0
     lea  eax, [ebx*2+0x80000000]
     sar  eax, 2
     imul eax
     sar  ebx, 31
     lea  eax, [edx*2-0x70000000]
     lea  ecx, [edx*8+edx-0x24000000]
     imul edx
     xor  ecx, ebx
     lea  eax, [edx*8+edx+0x44A00000]
     imul ecx
     cvtsi2sd xmm0, edx
     mulsd xmm0, [2b]
     movq [Function], xmm0
   End Asm
End Function
```
UEZ
Posts: 184
Joined: May 05, 2017 19:59
Location: Germany

### Re: fast sin compare

@srvaldez: thanks. :-)

deltarho[1859] wrote:The following compiles with -O3 and -Ofast. Strap yourselves into your chair. <smile>


What is the meaning of the suffixes f or b (jmp 0f, mulsd xmm0, [1b])?
deltarho[1859]
Posts: 1566
Joined: Jan 02, 2017 0:34
Location: UK

### Re: fast sin compare

@UEZ

Put srvaldez's and mine side by side and ponder. You will only get halfway through a cup of tea and the penny will drop - trust me. <smile>
UEZ
Posts: 184
Joined: May 05, 2017 19:59
Location: Germany

### Re: fast sin compare

deltarho[1859] wrote:@UEZ

Put srvaldez's and mine side by side and ponder. You will only get halfway through a cup of tea and the penny will drop - trust me. <smile>

Well, I googled a little bit: the f suffix means forward to the next label with that number, and b means backward to the previous one. Is this faster than the original code?

Btw, the asm functions _Sin6th / _Cos6th were written by Eukalyptus. ;-)
deltarho[1859]
Posts: 1566
Joined: Jan 02, 2017 0:34
Location: UK

### Re: fast sin compare

UEZ wrote:Is this faster than the original code?

It is infinitely faster because the original code will not compile with optimisation levels -O3 and -Ofast.
Under certain circumstances, GCC may duplicate (or remove duplicates of) your assembly code when optimizing. This can lead to unexpected duplicate symbol errors during compilation if your asm code defines symbols or labels.

When we use local/temporary labels the compiler generates unique labels and will not shoot itself in the foot.
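As a stripped-down illustration of those local labels: in GNU assembler syntax a numeric label such as `1:` may be defined any number of times, and `1b` / `1f` refer to the nearest definition looking backward / forward. The sketch below assumes a 32-bit x86 target; sum_to_n is a made-up example function and not part of the code discussed in this thread.

```freebasic
' Sketch (32-bit x86 only): adds n + (n-1) + ... + 1 with a local label.
' sum_to_n is a hypothetical example; n must be at least 1.
' "1:" is a local label, so the block stays valid even if the
' compiler emits it more than once; "jnz 1b" jumps back to it.
function sum_to_n(byval n as long) as long
   asm
      mov ecx, [n]
      xor eax, eax
   1:
      add eax, ecx
      dec ecx
      jnz 1b
      mov [Function], eax
   end asm
end function

print sum_to_n(10)
```

On such a target the call is expected to yield 55. A named label in place of `1:` would work the same way until the optimizer duplicates the block, which is the point at which the "already defined" error appears.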
jj2007
Posts: 850
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

### Re: fast sin compare

Here is deltarho's last code with slight modifications to allow more realistic testing: I split the loop in two, an outer one with an integer counter, and an inner one with a double counter that covers the 0...359.99 degrees range. The step is tweaked to arrive at a zero sum:

```
for x as integer=0 to outerloops
   for s as double=0 to 6.2831853 Step 0.01750190891   ' make sure it's 360 inner iterations
      innercount+=1
      y += sin(s)
   next
next
```
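The step value appears to be very close to 2π/359, so the inner loop visits 0 and then 359 further points up to just below the 6.2831853 limit, which gives the intended 360 values per outer pass. A quick sketch to count the iterations for that step (the comparison line is only there for reference):

```freebasic
' Sketch: count how often the inner loop body runs for the given step.
dim as long innercount = 0

for s as double = 0 to 6.2831853 step 0.01750190891
   innercount += 1
next

print "inner iterations = "; innercount
print "2*pi / 359       = "; 6.283185307179586 / 359   ' for comparison with the step
```

Because the last point sits almost exactly at 2π, where the sine is close to zero, the remaining points cover one full period evenly, which is what drives the sum toward zero.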

Results:

```
time for FB sin =  0.1608874157742548     for  3600000 iterations, average:  44 ms/call
sum = -4.244895989445813e-005
time for MB sin =  0.03104239161274336    for  3600000 iterations, average:  8 ms/call
sum =  1.076452668051075e-014
time for fast_sin =  0.0363557692970744   for  3600000 iterations, average:  10 ms/call
sum = -4.243645435448729e-005
time for _Sin6th =  0.0875843159346914    for  3600000 iterations, average:  24 ms/call
sum =  0.01235279427191839
```

Both FB sin and fast_sin arrive at the same (small) sum, _Sin6th is a bit off but nothing to worry about. Timings are as expected, but probably it's nanosecs, not milliseconds; I'll have to check that one more ;-)
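A quick check of the units: dividing each elapsed time (in seconds, as Timer reports it) by the 3600000 iterations and scaling by 1e9 reproduces the printed averages, which supports the nanosecond reading. A minimal sketch using the timings from the results; the sub name report is made up for this example.

```freebasic
' Sketch: convert total elapsed seconds into time per call.
' "report" is a hypothetical helper name.
const ITERATIONS as double = 3600000

sub report(byval title as string, byval elapsed as double)
   dim as double ns = elapsed / ITERATIONS * 1e9   ' seconds -> nanoseconds per call
   print title; ": "; int(ns); " ns/call"
end sub

report("FB sin", 0.1608874157742548)     ' about 44
report("MB sin", 0.03104239161274336)    ' about 8
report("fast_sin", 0.0363557692970744)   ' about 10
report("_Sin6th", 0.0875843159346914)    ' about 24
```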

The full source is further down.
Last edited by jj2007 on Jan 04, 2018 15:51, edited 1 time in total.
dodicat
Posts: 5159
Joined: Jan 10, 2006 20:30
Location: Scotland

### Re: fast sin compare

Sinus.dll is not very well regarded.
http://www.completelyuninstallprogram.com/sinus-dll/
But I took a chance with your dll.
It scanned OK with Avira.

I see it only works with -gen gcc 32 bits.

But what is it, does it return the sin of an angle?
Or is it really discounts and best prices?

```
#inclib "mbsinusdll"
declare function mbsinus cdecl alias "MbSinP"( as double) as double
dim as function  cdecl(as double)as double psinus=@mbsinus
for n as long=0 to 10
print psinus(n),mbsinus(n)
next
sleep
```