Speed of FreeBasic

zxretrosoft
Posts: 22
Joined: Apr 23, 2013 19:12

Speed of FreeBasic

Post by zxretrosoft »

Hello everyone,

I would like to ask about something: there are still discussions about which language is faster, which is slower, and so on.

It seems to me that FB is very, very fast.

Is there a recent test?

For example, relevant comparisons of at least FB vs Python vs C++ vs Java.

Thank you in advance! ;-)

P. S.
I found a trivial comparison here, but it only includes FreePascal:
https://geekregator.com/2015-01-15-benc ... ascal.html
lizard
Posts: 440
Joined: Oct 17, 2017 11:35
Location: Germany

Re: Speed of FreeBasic

Post by lizard »

zxretrosoft wrote: It seems to me that FB is very, very fast.

Is there a recent test?
Yes, many think it is almost as fast as C, if not equally fast. Here is a comparison:

https://benchmarksgame-team.pages.debia ... stest.html

Edit: inserted new address
Last edited by lizard on Jun 15, 2018 23:36, edited 1 time in total.
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Speed of FreeBasic

Post by dodicat »

Here is the raw C code translated without any external libraries.

Code:



sub Sieve(maxNum as long)
    dim as long i, j
    dim as byte ptr _data
    _data = allocate(maxNum + 1)
    clear *_data, 1, maxNum + 1          ' set every entry to 1
    for i = 2 to maxNum
        if (_data[i]) then
            ' no "step i" here, so this clears every entry from i+i upwards,
            ' not only the multiples of i
            for j = i + i to maxNum
                _data[j] = 0
            next j
        end if
    next i
    deallocate(_data)
end sub

dim as double t, acc, diff
for z as long = 1 to 7                   ' 7 timed passes
    t = timer
    for n as long = 1 to 10000           ' 10000 sieve calls per pass
        Sieve(100000)
    next
    diff = timer - t
    acc += diff
    print diff, z; " of "; 7
next z
print "Mean  "; acc / 7
sleep


  
With 32-bit -O3 optimisation the loop is short-circuited (optimised away), so no result is available.
With 64-bit -O3 optimisation I get about 2.56 seconds:

Code:

 2.627888816758059           1 of  7
 2.551011909265071           2 of  7
 2.55284057641984            3 of  7
 2.55291243645479            4 of  7
 2.553794946317794           5 of  7
 2.551769177312963           6 of  7
 2.551165552897146           7 of  7
Mean   2.563054773632238
  
Win 10
fb 1.05

32 bit gcc-5.2.0

64 bit gcc-5.2.0
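
The timing loop probably vanishes under 32-bit -O3 because Sieve produces nothing that the rest of the program uses. A minimal sketch of one way around that, assuming a hypothetical function CountPrimes (not part of the code above) that returns the number of primes found, with the results accumulated and printed so the work is harder for the optimiser to discard. It also strides the inner loop with Step i, as a conventional sieve does, so its timings are not comparable with the figures above:

```freebasic
' Sketch only: CountPrimes and total are illustrative names, not from the post above.
Function CountPrimes(ByVal maxNum As Long) As Long
    Dim As Long i, j, count
    Dim As UByte Ptr flags = Allocate(maxNum + 1)
    If flags = 0 Then Return -1              ' allocation failed
    Clear *flags, 1, maxNum + 1              ' mark every number as a candidate
    For i = 2 To maxNum
        If flags[i] Then
            count += 1                       ' i survived, so it is prime
            For j = i + i To maxNum Step i
                flags[j] = 0                 ' cross out the multiples of i
            Next j
        End If
    Next i
    Deallocate(flags)
    Return count
End Function

Dim As Double t = Timer
Dim As LongInt total
For n As Long = 1 To 10000
    total += CountPrimes(100000)             ' the result is used, so the call matters
Next
Print "primes per run:"; total \ 10000
Print "elapsed:"; Timer - t; " seconds"
```

Whether a given gcc version still manages to simplify this is something to check on the actual setup; the point is only that a benchmark whose result is never used gives the optimiser permission to delete it.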
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Re: Speed of FreeBasic

Post by srvaldez »

I translated the C n-body program to FB: http://benchmarksgame.alioth.debian.org/u64q/nbody.html
Compile command: fbc -w all -gen gcc -fpu sse -Wc -O3 n-body.bas
On my Mac I get the following output:

Code:

-0.169075164
-0.169059907
elapsed time 11.43654584884644 seconds
 
Strangely, if I remove the Private before the functions and subs, the time is about 12 seconds.

Code:

'http://benchmarksgame.alioth.debian.org/u64q/program.php?test=nbody&lang=gcc&id=1
' The Computer Language Benchmarks Game
' http://benchmarksgame.alioth.debian.org/

' contributed by Christoph Bauer
'
' http://benchmarksgame.alioth.debian.org/license.html

' translated to FreeBasic by srvaldez

Declare Function main(Byval argc As Long) As Long

    Dim As Double t=Timer
    main(50000000)
    Print "elapsed time ";timer-t;" seconds"
End


Const pi = 3.141592653589793
Const solar_mass = (4 * pi) * pi
Const days_per_year = 365.24

Type planet
    x As Double
    y As Double
    z As Double
    vx As Double
    vy As Double
    vz As Double
    mass As Double
End Type

Private Sub advance(Byval nbodies As Long, bodies() As planet, Byval dt As Double)
    Dim i As Long
    Dim j As Long
    For i = 0 To nbodies-1
        Dim b As planet Ptr = @bodies(i)
        For j = i + 1 To nbodies-1
            Dim b2 As planet Ptr = @bodies(j)
            Dim dx As Double = b->x - b2->x
            Dim dy As Double = b->y - b2->y
            Dim dz As Double = b->z - b2->z
            Dim distance As Double = Sqr(((dx * dx) + (dy * dy)) + (dz * dz))
            Dim mag As Double = dt / ((distance * distance) * distance)
            b->vx -= (dx * b2->mass) * mag
            b->vy -= (dy * b2->mass) * mag
            b->vz -= (dz * b2->mass) * mag
            b2->vx += (dx * b->mass) * mag
            b2->vy += (dy * b->mass) * mag
            b2->vz += (dz * b->mass) * mag
        Next
    Next
    For i = 0 To nbodies-1
        Dim b As planet Ptr = @bodies(i)
        b->x += dt * b->vx
        b->y += dt * b->vy
        b->z += dt * b->vz
    Next
End Sub

Private Function energy(Byval nbodies As Long, bodies() As planet) As Double
    Dim e As Double
    Dim i As Long
    Dim j As Long
    e = 0.0
    For i = 0 To nbodies-1
        Dim b As planet Ptr = @bodies(i)
        e += (0.5 * b->mass) * (((b->vx * b->vx) + (b->vy * b->vy)) + (b->vz * b->vz))
        For j = i + 1 To nbodies-1
            Dim b2 As planet Ptr = @bodies(j)
            Dim dx As Double = b->x - b2->x
            Dim dy As Double = b->y - b2->y
            Dim dz As Double = b->z - b2->z
            Dim distance As Double = Sqr(((dx * dx) + (dy * dy)) + (dz * dz))
            e -= (b->mass * b2->mass) / distance
        Next
    Next
    Return e
End Function

Private Sub offset_momentum(Byval nbodies As Long, bodies() As planet)
    Dim px As Double = 0.0
    Dim py As Double = 0.0
    Dim pz As Double = 0.0
    Dim i As Long
    For i = 0 To nbodies-1
        px += bodies(i).vx * bodies(i).mass
        py += bodies(i).vy * bodies(i).mass
        pz += bodies(i).vz * bodies(i).mass
    Next
    bodies(0).vx = (-px) / ((4 * 3.141592653589793) * 3.141592653589793)
    bodies(0).vy = (-py) / ((4 * 3.141592653589793) * 3.141592653589793)
    bodies(0).vz = (-pz) / ((4 * 3.141592653589793) * 3.141592653589793)
End Sub

Const NBODIES = 5
Extern     bodies(0 To 4) As planet
Dim Shared bodies(0 To 4) As planet = {(0, 0, 0, 0, 0, 0,_
                                      (4 * 3.141592653589793) * 3.141592653589793),_
                                      (4.84143144246472090e+00, -1.16032004402742839e+00,_
                                      -1.03622044471123109e-01, 1.66007664274403694e-03 * _
                                      365.24, 7.69901118419740425e-03 * 365.24, _
                                      (-6.90460016972063023e-05) * 365.24, _
                                      9.54791938424326609e-04 * ((4 * 3.141592653589793) * _
                                      3.141592653589793)), (8.34336671824457987e+00, _
                                      4.12479856412430479e+00, -4.03523417114321381e-01, _
                                      (-2.76742510726862411e-03) * 365.24, _
                                      4.99852801234917238e-03 * 365.24, _
                                      2.30417297573763929e-05 * 365.24, _
                                      2.85885980666130812e-04 * ((4 * 3.141592653589793) * _
                                      3.141592653589793)), (1.28943695621391310e+01, _
                                      -1.51111514016986312e+01, -2.23307578892655734e-01, _
                                      2.96460137564761618e-03 * 365.24, _
                                      2.37847173959480950e-03 * 365.24, _
                                      (-2.96589568540237556e-05) * 365.24, _
                                      4.36624404335156298e-05 * ((4 * 3.141592653589793) * _
                                      3.141592653589793)), (1.53796971148509165e+01, _
                                      -2.59193146099879641e+01, 1.79258772950371181e-01, _
                                      2.68067772490389322e-03 * 365.24, _
                                      1.62824170038242295e-03 * 365.24, _
                                      (-9.51592254519715870e-05) * 365.24, _
                                      5.15138902046611451e-05 * ((4 * 3.141592653589793) * _
                                      3.141592653589793))}

Private Function main(Byval argc As Long) As Long
    Dim n As Long = argc
    Dim i As Long
    offset_momentum(5, bodies())
    Print Using "##.#########"; energy(NBODIES, bodies())
    For i = 1 To n
        advance(NBODIES, bodies(), 0.01)
    Next
    Print Using "##.#########"; energy(NBODIES, bodies())
    Return 0
End Function
Imortis
Moderator
Posts: 1924
Joined: Jun 02, 2005 15:10
Location: USA

Re: Speed of FreeBasic

Post by Imortis »

Using -gen gcc -O3 is not really fair in this context, because that just hands the compilation over to GCC. If you want a good idea of how fast the EXEs produced by FBC itself are, you need to let FBC do the compiling.
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Re: Speed of FreeBasic

Post by srvaldez »

@Imortis
I don't agree; fbc x64 always uses -gen gcc, even if you leave that option out of the command.
zxretrosoft
Posts: 22
Joined: Apr 23, 2013 19:12

Re: Speed of FreeBasic

Post by zxretrosoft »

Thank you very much, friends!

@lizard
But isn't FB included in this test? Is that the 11.43 seconds that srvaldez writes about?
lizard
Posts: 440
Joined: Oct 17, 2017 11:35
Location: Germany

Re: Speed of FreeBasic

Post by lizard »

zxretrosoft wrote: @lizard
But isn't FB included in this test? Is that the 11.43 seconds that srvaldez writes about?
If you compile with -gen gcc, fbc produces C code that is then compiled with gcc, AFAIK. So it is actually gcc, which is near #1 at benchmarksgame. Naturally, it would have to be run on their hardware and OS for a perfect comparison. But we can say FB is close to the top. :-)
Last edited by lizard on Mar 20, 2018 22:04, edited 1 time in total.
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Speed of FreeBasic

Post by dodicat »

The comparison should really be made with -gen gas.
It's a bit slower.

Code:

 6.381060839107448           1 of  7
 6.304388219459455           2 of  7
 6.309766430873751           3 of  7
 6.308480820393896           4 of  7
 6.308385006978185           5 of  7
 6.311566354033488           6 of  7
 6.305176284668931           7 of  7
Mean   6.31840342221645
  
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Re: Speed of FreeBasic

Post by srvaldez »

I really don't understand your objections to -gen gcc. gcc is used as a backend, just as -gen gas uses the GNU assembler as a backend. If you really don't want to use -gen gcc, that's your choice, but then you can only compile 32-bit applications.
See https://superuser.com/a/1198792
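
For anyone unsure which backend and target a given binary was built with, here is a small sketch using fbc's intrinsic defines __FB_BACKEND__, __FB_64BIT__ and __FB_VERSION__. The labels printed are just illustrative:

```freebasic
' Sketch only: reports how this program was compiled.
#ifdef __FB_64BIT__
    Print "target : 64-bit"
#else
    Print "target : 32-bit"
#endif
Print "backend: "; __FB_BACKEND__    ' "gas" or "gcc", depending on -gen
Print "fbc    : "; __FB_VERSION__
```

Printing this alongside the timings makes it easier to tell afterwards which numbers came from which backend.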
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK

Re: Speed of FreeBasic

Post by deltarho[1859] »

In dodicat's code there is a high level of precision but with some code timings we can have 'rogue' values. Very often the first timing may be the fastest. I have never been able to fathom out why that happens. On other occasions some timings may be a lot slower than the average.

We could have a sophisticated algorithm to examine the results and remove those which would have a profound effect on the average.

On the other hand a decidedly unsophisticated method is to take the median and not the mean.

To exaggerate this approach I forced dodicat's code to give the first 'diff' as zero. This is what I got.

Code:

 0             1 of  7
 1.579375311659533           2 of  7
 1.597423764760606           3 of  7
 1.578401235354249           4 of  7
 1.579799527593423           5 of  7
 1.598054361660616           6 of  7
 1.578834251282387           7 of  7
Mean   1.358841207472973
The initial 'rogue' value has been very influential.

Equally we have rogue values much slower than the rest.

By choosing the median any rogue values have a zero effect.

The author of the opening post's link is using medians.

I am inclined to agree with that.
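
As a minimal sketch of the idea, assuming a hypothetical helper Median (it is not part of dodicat's code), the seven timings above can be sorted and the middle one picked. The helper assumes a non-empty array:

```freebasic
' Sketch only: Median is an illustrative helper, not from any code in this thread.
Function Median(a() As Double) As Double
    Dim As Long lb = LBound(a), ub = UBound(a)
    Dim As Long n = ub - lb + 1
    Dim s(0 To n - 1) As Double              ' sorted copy; the caller's array is untouched
    For i As Long = 0 To n - 1
        s(i) = a(lb + i)
    Next
    ' Insertion sort: plenty for a handful of timings.
    For i As Long = 1 To n - 1
        Dim As Double v = s(i)
        Dim As Long j = i - 1
        Do While j >= 0
            If s(j) <= v Then Exit Do
            s(j + 1) = s(j)
            j -= 1
        Loop
        s(j + 1) = v
    Next
    If (n And 1) Then Return s(n \ 2)        ' odd count: the middle value
    Return (s(n \ 2 - 1) + s(n \ 2)) / 2     ' even count: mean of the middle two
End Function

' The seven 'diff' values from the run above, including the forced zero.
Dim As Double diffs(1 To 7) = {0, 1.579375311659533, 1.597423764760606, _
    1.578401235354249, 1.579799527593423, 1.598054361660616, 1.578834251282387}
Dim As Double acc
For i As Long = 1 To 7
    acc += diffs(i)
Next
Print "Mean   "; acc / 7
Print "Median "; Median(diffs())
```

With these values the median is the fourth sorted timing, 1.579375311659533, so the forced zero has no influence on it, while the mean is pulled down to about 1.3588.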

Added: It is a pointless exercise to compare an FB test with the opening post's link. For a true comparison, either the opening post's link should include FB in its tests, or we should run C, Java and so on alongside FB on one setup. In other words, they should all use the same CPU.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: Speed of FreeBasic

Post by jj2007 »

deltarho[1859] wrote: We could have a sophisticated algorithm to examine the results and remove those which would have a profound effect on the average.
Yes, that is a good approach (see Best strategy for timings):
- discard the first n timings (the first 10%, ...?) to load the cache
- sort the results
- eliminate the highest 20%, i.e. the spikes
- calculate the average of the remaining values.
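
The four steps above could be sketched roughly as follows. TrimmedMean and its parameters skipFirst and dropTop are hypothetical names chosen for this illustration, and the sample data is simply dodicat's seven 64-bit timings from earlier in the thread, with one warm-up run skipped and the slowest 20% dropped:

```freebasic
' Sketch only: TrimmedMean, skipFirst and dropTop are illustrative names.
Function TrimmedMean(a() As Double, ByVal skipFirst As Long, ByVal dropTop As Double) As Double
    Dim As Long n = UBound(a) - LBound(a) + 1 - skipFirst
    If n < 1 Then Return 0                   ' nothing left to average
    Dim s(0 To n - 1) As Double
    For i As Long = 0 To n - 1
        s(i) = a(LBound(a) + skipFirst + i)  ' step 1: discard the first timings
    Next
    ' Step 2: sort ascending (insertion sort is enough for small sets).
    For i As Long = 1 To n - 1
        Dim As Double v = s(i)
        Dim As Long j = i - 1
        Do While j >= 0
            If s(j) <= v Then Exit Do
            s(j + 1) = s(j)
            j -= 1
        Loop
        s(j + 1) = v
    Next
    ' Step 3: eliminate the highest share, i.e. the spikes.
    Dim As Long keep = n - Int(n * dropTop)
    If keep < 1 Then keep = 1
    ' Step 4: average what remains.
    Dim As Double total
    For i As Long = 0 To keep - 1
        total += s(i)
    Next
    Return total / keep
End Function

Dim As Double timings(1 To 7) = {2.627888816758059, 2.551011909265071, _
    2.55284057641984, 2.55291243645479, 2.553794946317794, _
    2.551769177312963, 2.551165552897146}
Print "Trimmed mean "; TrimmedMean(timings(), 1, 0.2)
```

The cut-offs are passed in as arguments precisely because they are a matter of choice; nothing here suggests particular values are the right ones.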

But before doing all that, it would be a good idea to agree on a testbed that simulates some relevant common tasks, like:
- calculations of all sorts, integers vs floats etc
- string generation, concatenation
- string parsing
- file loading and storing
- sorting
- filtering
...
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK

Re: Speed of FreeBasic

Post by deltarho[1859] »

In Statistics for the Terrified 'Mean versus Median' is considered.

In a nutshell it seems that we should use mean "with symmetrically distributed data" and median otherwise.

When you and I were discussing timings a little while ago you showed me that my assumption that timings were normally distributed was wrong. It took a while, <smile>, but I eventually conceded that you were correct.

It follows then, from the above link, that we should be using median and not mean.

I wrote "We could have a sophisticated algorithm to examine the results and remove those which would have a profound effect on the average." to which you wrote "Yes, that is a good approach".

I disagree. What we are doing is conditioning the data by using arbitrary values such as the first 10% and the highest 20%. I doubt that we would get a consensus on these values and there would be a temptation to tweak them for some reason.

With the median approach any 'rogue' values are automatically filtered out without using any arbitrary conditioning values.

If the data is symmetrical then the mean and median will be pretty much the same. With dodicat's timings, which are precise, the conclusion would be the same whether we used mean or median. In your 'Best strategy for timings' graph, using the mean would not be a good approach, which is why you considered a different approach.

I have a suspicion that if we have 31, say, timings where there is a small number of low values and a number of spikes then your approach and the median approach may very well result in a similar conclusion. However, in my case I would simply sort the results and choose the 16th value; job done.<smile>

Having a maths background I would always be attracted to an elegant solution but, in this case, the pragmatism of a median is the better solution.
TeeEmCee
Posts: 375
Joined: Jul 22, 2006 0:54
Location: Auckland

Re: Speed of FreeBasic

Post by TeeEmCee »

You can't compare the speed of C and FB unless you run both on the same hardware! Quoting timings of just one is useless.
I compared srvaldez's n-body code against C++:

cpu: AMD FX-6100 (Bulldozer) @ 3.3GHz (with boosting above 3.3GHz disabled in BIOS)
fbc: 1.06 built from git
gcc: 7.3.0
linux 4.14.19, echo performance > /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

I used the fastest n-body solution in C++ which didn't use SSE2 (here), and the fastest which did (here) -- which is tied for first place with Fortran -- and srvaldez's port of the C++ code to FB.
I compared both with no special compiler flags, and those compiler flags used by the benchmarksgame.

Summary of results:

32 bit builds:

Code:

Program    Compiler args                                                             Time/sec
FB         -fpu sse -O 3                                                             40.13
FB         -fpu x87 -O 3                                                             43.45
FB         -gen gcc -O 3                                                             12.63
FB         -gen gcc -O 3 -Wc -march=native,-fomit-frame-pointer,-mfpmath=sse,-msse3  9.21
C++        -O3                                                                       14.87
C++        -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3                9.73
C++ & SSE  -O3                                                                       8.03
C++ & SSE  -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3                5.73
As you can see, FB actually outperformed C++ here - the reason is probably that srvaldez translated a different C++ implementation than the one I used. This is the fault of the people who submitted C++ nbody implementations, for not submitting an optimal one that didn't use SSE. I definitely think you should NOT compare to the C++ & SSE2 implementation, because it's not pure C++, it's like using inline assembler!

fbc's gas backend does rather few optimisations - it doesn't even try to keep variables in registers between different lines of code! - so being only 3x slower than GCC is rather remarkable. However, it's OK at compiling expressions that fit onto a single line of code, which is why it produces faster assembly for math-heavy code than it does in general.

64 bit builds:

Code:

Program    Compiler args                                                             Time/sec
FB         -gen gcc -O 3                                                             9.51
FB         -gen gcc -O 3 -Wc -march=native,-fomit-frame-pointer,-mfpmath=sse,-msse3  8.71
C++        -O3                                                                       8.07
C++        -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3                7.27
C++ & SSE  -O3                                                                       5.48
C++ & SSE  -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3                5.95
Interestingly, this time srvaldez's FB implementation is about 18-20% slower than the C++ one. Odd!
Here, attempting to micromanage g++ seems to muck up the use of SSE2 intrinsics.

Minimum times out of 7 runs are used above. Here are the complete results, with medians and all times:

Code:


---32 bit---

fbc nbody.bas -arch 32 -fpu sse -O 3
Min: 40.13	Median: 40.29	All: [40.13, 40.21, 40.22, 40.29, 40.52, 40.69, 40.98]

fbc nbody.bas -arch 32 -fpu x87 -O 3
Min: 43.45	Median: 44.25	All: [43.45, 44.15, 44.19, 44.25, 44.29, 44.36, 44.45]

fbc nbody.bas -arch 32 -gen gcc -O 3
Min: 12.63	Median: 12.63	All: [12.63, 12.63, 12.63, 12.63, 12.63, 12.64, 12.71]

fbc nbody.bas -arch 32 -gen gcc -O 3 -Wc -march=native,-fomit-frame-pointer,-mfpmath=sse,-msse3
Min: 9.21	Median: 9.21	All: [9.21, 9.21, 9.21, 9.21, 9.21, 9.21, 9.21]

g++ nbody.cpp -o nbody_cpp -m32 -O3
Min: 14.87	Median: 14.88	All: [14.87, 14.88, 14.88, 14.88, 14.89, 14.9, 14.9]

g++ nbody.cpp -o nbody_cpp -m32 -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3
Min: 9.73	Median: 9.77	All: [9.73, 9.73, 9.74, 9.77, 9.86, 9.89, 9.96]

g++ nbody_sse.cpp -o nbody_sse_cpp -m32 -O3
Min: 8.03	Median: 8.14	All: [8.03, 8.09, 8.09, 8.14, 8.15, 8.18, 8.2]

g++ nbody_sse.cpp -o nbody_sse_cpp -m32 -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3
Min: 5.73	Median: 6.07	All: [5.73, 5.84, 5.91, 6.07, 6.19, 6.42, 7.13]

---64 bit---

fbc nbody.bas -arch 64 -gen gcc -O 3
Min: 9.51	Median: 9.51	All: [9.51, 9.51, 9.51, 9.51, 9.52, 9.52, 9.6]

fbc nbody.bas -arch 64 -gen gcc -O 3 -Wc -march=native,-fomit-frame-pointer,-mfpmath=sse,-msse3
Min: 8.71	Median: 8.71	All: [8.71, 8.71, 8.71, 8.71, 8.72, 8.72, 8.73]

g++ nbody.cpp -o nbody_cpp -m64 -O3
Min: 8.07	Median: 8.07	All: [8.07, 8.07, 8.07, 8.07, 8.08, 8.11, 8.15]

g++ nbody.cpp -o nbody_cpp -m64 -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3
Min: 7.27	Median: 7.28	All: [7.27, 7.27, 7.28, 7.28, 7.28, 7.28, 7.39]

g++ nbody_sse.cpp -o nbody_sse_cpp -m64 -O3
Min: 5.48	Median: 5.57	All: [5.48, 5.48, 5.56, 5.57, 5.58, 5.64, 5.66]

g++ nbody_sse.cpp -o nbody_sse_cpp -m64 -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3
Min: 5.95	Median: 5.98	All: [5.95, 5.98, 5.98, 5.98, 5.99, 6.02, 6.27]
Here is the script I used to produce these results:

Code:

#!/bin/bash
# bash rather than sh: the {1..7} brace expansion below is not POSIX sh

print() {
    echo
    echo $*
    $*
}

time_runs() {
    prog=$1
    outfile=times_$2.txt
    rm -f $outfile

    for i in {1..7}; do
        /usr/bin/time -f '%e' $prog 2>&1 >$outfile.out | tee -a $outfile
    done

    # This uses pythonpy and numpy to compute medians. https://github.com/Russell91/pythonpy
    cat $outfile | py --ji -l '"Min: %.2f\tMedian: %.2f\tAll: %s" % (min(l), numpy.median(l), sorted(l))'
}

print fbc nbody.bas -arch 32 -fpu sse -O 3
time_runs ./nbody 32_bas_gas_sse

print fbc nbody.bas -arch 32 -fpu x87 -O 3
time_runs ./nbody 32_bas_gas_fpu

print fbc nbody.bas -arch 32 -gen gcc -O 3
time_runs ./nbody 32_bas_gas_gcc

print fbc nbody.bas -arch 32 -gen gcc -O 3 -Wc -march=native,-fomit-frame-pointer,-mfpmath=sse,-msse3
time_runs ./nbody 32_bas_gas_gcc_native

print fbc nbody.bas -arch 64 -gen gcc -O 3
time_runs ./nbody 64_bas_gas_gcc

print fbc nbody.bas -arch 64 -gen gcc -O 3 -Wc -march=native,-fomit-frame-pointer,-mfpmath=sse,-msse3
time_runs ./nbody 64_bas_gas_gcc_native



print g++ nbody.cpp -o nbody_cpp -m32 -O3
time_runs './nbody_cpp 50000000' 32_cpp

print g++ nbody.cpp -o nbody_cpp -m32 -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3
time_runs './nbody_cpp 50000000' 32_cpp_native

print g++ nbody.cpp -o nbody_cpp -m64 -O3
time_runs './nbody_cpp 50000000' 64_cpp

print g++ nbody.cpp -o nbody_cpp -m64 -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3
time_runs './nbody_cpp 50000000' 64_cpp_native



print g++ nbody_sse.cpp -o nbody_sse_cpp -m32 -O3
time_runs './nbody_sse_cpp 50000000' 32_cpp_sse

print g++ nbody_sse.cpp -o nbody_sse_cpp -m32 -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3
time_runs './nbody_sse_cpp 50000000' 32_cpp_sse_native

print g++ nbody_sse.cpp -o nbody_sse_cpp -m64 -O3
time_runs './nbody_sse_cpp 50000000' 64_cpp_sse

print g++ nbody_sse.cpp -o nbody_sse_cpp -m64 -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3
time_runs './nbody_sse_cpp 50000000' 64_cpp_sse_native

I think there are different ways to go about timing a language: either you can ask what is the fastest possible way to implement some algorithm in a language, or you can ask how fast a more idiomatic implementation is, i.e. one using code that is the most natural for that language. Both performance comparisons are interesting for different use cases. These language benchmark games always go for maximum performance, often using pretty ugly code. Sometimes (IIRC 'fasta' in Lua) they cheat outright by just calling some external numeric library to do the computations. For the n-body problem, unlike the fastest C++ solution, the fastest program, in Fortran, doesn't use CPU intrinsics!
deltarho[1859] wrote: In dodicat's code there is a high level of precision but with some code timings we can have 'rogue' values. Very often the first timing may be the fastest. I have never been able to fathom out why that happens. On other occasions some timings may be a lot slower than the average.
I am speculating, but perhaps this is caused by the CPU throttling up to an unsustainably high frequency for a short time, then reducing to a more sustainable (but still above baseline) frequency thereafter, to meet power draw and temperature requirements which are averages over time rather than instantaneous.

Whenever I do any timing, I first switch the kernel's frequency governor to a constant frequency. This greatly reduces timing jitter, but on my machine does not actually disable the CPU's millisecond-scale frequency adjustment (which you can see by running cpufreq-aperf).
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: Speed of FreeBasic

Post by jj2007 »

deltarho[1859] wrote: I disagree. What we are doing is conditioning the data by using arbitrary values such as the first 10% and the highest 20%. I doubt that we would get a consensus on these values and there would be a temptation to tweak them for some reason.
Engineers and mathematicians often disagree ;-)
TeeEmCee wrote: I used the fastest n-body solution in C++ which didn't use SSE2 (here), and the fastest which did (here) -- which is tied for first place with Fortran -- and srvaldez's port of the C++ code to FB.
This is probably the innermost loop:

Code:

        for(unsigned i=0,k=0; i < bodies.size()-1; ++i) {
            Body& iBody = bodies[i];
            for(unsigned j=i+1; j < bodies.size(); ++j,++k) {
                iBody.vx -= r[k].dx * bodies[j].mass * mag[k];
                iBody.vy -= r[k].dy * bodies[j].mass * mag[k];
                iBody.vz -= r[k].dz * bodies[j].mass * mag[k];
				
                bodies[j].vx += r[k].dx * iBody.mass * mag[k];
                bodies[j].vy += r[k].dy * iBody.mass * mag[k];
                bodies[j].vz += r[k].dz * iBody.mass * mag[k];
            }
        }
When timing this code, you use only a tiny subset of the functions of a language. It is not at all representative of the tasks that a programming language has to perform in real life applications.