## Need speedup

New to FreeBASIC? Post your questions here.
12val12newakk
Posts: 20
Joined: Nov 14, 2019 17:04

### Need speedup

I need speedup iteration sub
Time complexity O(n2)

physik not simple 1/r^2
therefore, finished libraries are not suitable
+ and I do not want someone else's
I will probably correct the laws of interaction between the bodies of the system, so the possibility of further corrections should remain

Code: Select all

` sub vals ()       for n=0 to Nmass         dim as single dx,dy ,a,df,dfy,dfx,xm,ym        dim as single Rglue ,Rgquadro     for m=0 to Nmass         if n<>m then                   dx=x(m)-x(n):dy=y(m)-y(n)                      Rgquadro=(dx*dx+dy*dy): Rglue=sqr (Rgquadro)             Select Case   Rglue                 Case is >140                       df=9512/((Rgquadro))                                              Case 10 to 140                     df=((-1e12 /Rgquadro)/Rgquadro/Rgquadro)+.4                 Case 0 to 10                        df=0 : NUM_ERROR=NUM_ERROR+1                            x_ERR_m(m) =x(m):x_ERR_m(m) =y(m):                           x_ERR_n(n) =x(n):x_ERR_n(n) =y(n):                           n_ERR_part =n                           m_ERR_part =m              End Select                       df=df*mc(m)                      dfy=dfy+df*(dy/Rglue):dfx=dfx+df*(dx/Rglue)                                         end if       next m      dfy=dfy-y(n)*Kcnt:dfx=dfx-x(n) *Kcnt   rem   Centering power     vy(n)=vy(n)*.99999:vx(n)=vx(n)*.99999          rem dissipation     vy(n)=vy(n)+dfy*dt:vx(n)=vx(n)+dfx*dt                y(n)=y(n)+vy(n)*dt   :   x(n)=x(n)+vx(n)*dt     dx=x(m)-x(n):dy=y(m)-y(n)   next n end sub`
Posts: 2109
Joined: May 24, 2007 22:10
Location: The Netherlands

### Re: Need speedup

Hi, it is collision detection between many particles right?
What is often done is: Make a grid to divide the particles.
Different type of grids can be used. Most simple is a square/cubic grid.
Then only check collisions between particles within 1 grid cell AND with all neighboring cells. (or something like that).
The idea is that assigning the particles to the grid cell requires only 1 iteration. Then the number of collision checks needed is less. Particles in separated cells can never collide.
There are examples online, but I don't have a direct link at the moment.

Also, for small speed improvement, see if you can eliminate the 'square root' call. Do comparison against the 'squared' value.

Edit:
Here is 1 link that does not use a fixed grid, but a 'quad tree':
https://gamedevelopment.tutsplus.com/tu ... amedev-374
Another technique seems to be 'spacial hashing', which I am not familiar with.
https://conkerjo.wordpress.com/2009/06/ ... ollisions/
Last edited by badidea on Nov 15, 2019 10:58, edited 2 times in total.
12val12newakk
Posts: 20
Joined: Nov 14, 2019 17:04

### Re: Need speedup

There are no explicit conflicts - this is what distinguishes this system
just at a distance of less than 140
the repulsive force grows in proportion to the 6th degree
there are no clear boundaries of bodies (balls).
Radii of bodies are drawn conditionally.
D.J.Peters
Posts: 8114
Joined: May 28, 2005 3:28
Contact:

### Re: Need speedup

compare yours :-(

Code: Select all

`for n = first to last   for m = first to last    if (n<>m) then calc_forces_gravitiy_with_cell(n,m)  nextnext`
with this :-)

Code: Select all

`for n = first to last-1  for m = n+1 to last    calc_forces_gravitiy_with_cell(n,m)  nextnext`
dodicat
Posts: 6628
Joined: Jan 10, 2006 20:30
Location: Scotland

### Re: Need speedup

You have
dim as single dx,dy ,a,df,dfy,dfx,xm,ym
all within the outer loop.

You could just do all these dims before starting the loops.
i.e.
dim as single dx,dy ,a,df,dfy,dfx,xm,ym
for n =0 to Nmass
dfx=0
dfy=0
for m =0 to Nmass

. . .
. . .
12val12newakk
Posts: 20
Joined: Nov 14, 2019 17:04

### Re: Need speedup

big boom in system diese code

dodicat tnx maybe it will add 1 percent to speed
deltarho
Posts: 2576
Joined: Jan 02, 2017 0:34
Location: UK

### Re: Need speedup

Where we put Dim has scope implications but I do not think that it has any noticeable effect on speed. I looked at Dim inside loops a while ago and concluded that they are only executed once and not on every pass of the loop.
Last edited by deltarho on Nov 15, 2019 13:51, edited 1 time in total.
deltarho
Posts: 2576
Joined: Jan 02, 2017 0:34
Location: UK

### Re: Need speedup

maybe it will add 1 percent to speed

With multi-tasking operating systems we can get 1% fluctuations in speed without doing anything.
Posts: 2109
Joined: May 24, 2007 22:10
Location: The Netherlands

### Re: Need speedup

This: dfy = dfy + df * (dy / Rglue)
Can be: dfy += df * (dy / Rglue)
But I don't expect speed improvement from this.
Likewise: dfy -= y(n) * Kcnt
And: df *= mc(m)

Not tested, maybe access to x(n) and y(n) inside the 'm-loop' can be speed up with an intermediate variable assigned in the 'n-loop'. But maybe optimized already by the compiler.

When compiling via GCC (default for 64-bit), there are also compiler options to optimize speed (-O1, -O2, -O3).

At the end: dx=x(m)-x(n):dy=y(m)-y(n) seems to have no purpose?

This part can have 1 multiplication less: dfy=dfy+df*(dy/Rglue):dfx=dfx+df*(dx/Rglue)
Via: dfy += (df / Rglue) * dy : dfx += (df / Rglue) * dx
To: extra_var = df / Rglue : dfy += extra_var * dy : dfx += extra_var * dx

BTW: Which force scales with the 6th power of 1/R ?
D.J.Peters
Posts: 8114
Joined: May 28, 2005 3:28
Contact:

### Re: Need speedup

may be you don't understand my hint in scope of speed !

let say your n,m loop's count like this

n 0,1,2,3,4
m 0,1,2,3,4

n=0 (m 1..4)
if n<>m then calculate(0,1)
if n<>m then calculate(0,2)
if n<>m then calculate(0,3)
if n<>m then calculate(0,4)
n=1 (m 0,2..4)
if n<>m then calculate(1,0) ' error 1 is the same as calculate(0,1)
if n<>m then calculate(1,2)
if n<>m then calculate(1,3)
if n<>m then calculate(1,4)
n=2 (m 0..1,3..4)
if n<>m then calculate (2,0) ' error 2 is the same as calculate(0,2)
if n<>m then calculate (2,1) ' error 3 is the same as calculate(1,2)
if n<>m then calculate (2,3)
if n<>m then calculate (2,4)
n=3 (m 0..2,4)
if n<>m then calculate (3,0) ' error 4 is the same as calculate(0,3)
if n<>m then calculate (3,1) ' error 5 is the same as calculate(1,3)
if n<>m then calculate (3,2) ' error 6 is the same as calculate(2,3)
if n<>m then calculate (3,4)
n=4 (m 0..3)
if n<>m then calculate (4,0) ' error 7 is the same as calculate(0,4)
if n<>m then calculate (4,1) ' error 8 is the same as calculate(1,4)
if n<>m then calculate (4,2) ' error 9 is the same as calculate(2,4)
if n<>m then calculate (4,3) ' error 10 is the same as calculate(3,4)

etc.

i posted this

n 0,1,2,3
m n+1 to 4

n=0 (m=1..4)
calculate (0,1)
calculate (0,2)
calculate (0,3)
calculate (0,4)

n=1 (m=2..4)
calculate (1,2)
calculate (1,3)
calculate (1,4)

n=2 (m=3..4)
calculate (2,3)
calculate (2,4)

n=3 (m=4..4)
calculate (3,4)

With other words you caluclate pairs of particles more than once !

Joshy
12val12newakk
Posts: 20
Joined: Nov 14, 2019 17:04

### Re: Need speedup

D.J.Peters ))
in my procedure, one-pass calculation
increments of velocity and body coordinates N
in relation to M .. but not back
(1,0) not the same (0,1) !
Provoni
Posts: 380
Joined: Jan 05, 2014 12:33
Location: Belgium

### Re: Need speedup

@12val12newakk

You may want to try to replace:

Code: Select all

`Select Case Rglue`

with:

Code: Select all

`Select Case Rgquadro`

and change the select case values accordingly. Removing the Rglue dependency may improve the CPU's out of order processing and speed up the code.

- Can you use a look up table to get the Rglue value instead of SQR?
- 64-bit FreeBASIC? If not try it.
- What compiler flags are you using? Try "-gen gcc -Wc -march=native,-Ofast,-funroll-loops,-fomit-frame-pointer,-ftree-loop-im,-fivopts".
RockTheSchock
Posts: 231
Joined: Mar 12, 2006 16:25

### Re: Need speedup

Maybe it's faster if you change this by using less divisions

Code: Select all

`df=((-1e12 /Rgquadro)/Rgquadro/Rgquadro)+.4`

also you could change the select case:

Code: Select all

`'Maybe ? Dim as Double RgquadroIf Rglue >140 Then   df=9512/((Rgquadro))Else   If rglue>= 10 Then      df=((-1e12 /(Rgquadro*Rgquadro*Rgquadro)+.4   Else      df=0      NUM_ERROR=NUM_ERROR+1      x_ERR_m(m) =x(m)  ' value in x_ERR_m(m)      x_ERR_m(m) =y(m)  ' is overwritten here ???      x_ERR_n(n) =x(n)      x_ERR_n(n) =y(n)  ' same here ???      n_ERR_part =n      m_ERR_part =m   EndIfEndIf`
12val12newakk
Posts: 20
Joined: Nov 14, 2019 17:04

### Re: Need speedup

At the end: dx=x(m)-x(n):dy=y(m)-y(n) seems to have no purpose?

BTW: Which force scales with the 6th power of 1/R ?[/quote]

Thanks not used deleted

6th power of 1/R roughly similar but can always be fixed
RockTheSchock
Posts: 231
Joined: Mar 12, 2006 16:25

### Re: Need speedup

Another option is to use ASM with AVX / AVX2 instructions:
Each YMM register can hold and do simultaneous operations on eight 32-bit single-precision floating point numbers. So basically if the algorithm could be optimally transformed you could gain almost 8 times the speed. Realistically it could be around twice as fast.