@jj2007
Yes,
HERE
It is a pity that the forum software does not show post numbers.
From that post we have (-gen gcc -s console):
elapsed Asm: 172ms 24250000 leap years found (inline)
elapsed Asm: 175ms 24250000 leap years found (inline)
elapsed Asm: 174ms 24250000 leap years found (inline) (repeated 5x to make sure timings are reliable)
elapsed Asm: 173ms 24250000 leap years found (inline)
elapsed Asm: 175ms 24250000 leap years found (inline)
elapsed Asm: 699ms 24250000 leap years found (function)
elapsed D: 878ms 24250000 leap years found
elapsed C: 560ms 24250000 leap years found
elapsed B: 534ms 24250000 leap years found
elapsed R2: 625ms 24250000 leap years found (Roland Chastain 2)
elapsed AL: 904ms 24250000 leap years found (ULong)
elapsed AI: 2639ms 24250000 leap years found (ULongInt)
From your latest figures we have:
elapsed Asm: 184 ms 24250000 leap years found (inline)
elapsed Asm: 187 ms 24250000 leap years found (inline)
elapsed Asm: 178 ms 24250000 leap years found (inline)
elapsed Asm: 358 ms 24250000 leap years found (function) <<< naked cdecl
elapsed Asm: 361 ms 24250000 leap years found (function)
elapsed Asm: 361 ms 24250000 leap years found (function)
elapsed D: 1737 ms 24250000 leap years found
elapsed C: 1627 ms 24250000 leap years found
elapsed B: 1615 ms 24250000 leap years found
elapsed R2: 1618 ms 24250000 leap years found (Roland Chastain 2)
elapsed AL: 1782 ms 24250000 leap years found (ULong)
elapsed AI: 3317 ms 24250000 leap years found (ULongInt)
The inline figures in the two lists differ by less than 7%, a spread we can get simply by running the same code at different times. For some years now I have not regarded code A as faster than code B unless the difference is greater than 7%.
My point about "bear little resemblance" is with regard to D, ..., AI. R2, for example, shows 625 ms in the first list and 1618 ms in the second.
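The 7% rule of thumb can be written down as a small check. This is only a sketch in Python (chosen for brevity, it is not the language used in this thread). The function names, and the choice to compare mean timings relative to the smaller one, are my assumptions. The figures are the inline and R2 timings from the two lists above.

```python
from statistics import mean

NOISE_THRESHOLD = 0.07  # the 7% rule of thumb from the post


def relative_difference(a_ms, b_ms):
    """Gap between two timings as a fraction of the smaller one (assumed definition)."""
    return abs(a_ms - b_ms) / min(a_ms, b_ms)


def is_real_difference(a_ms, b_ms, threshold=NOISE_THRESHOLD):
    """True only if the gap exceeds the run-to-run noise threshold."""
    return relative_difference(a_ms, b_ms) > threshold


# Inline asm timings (ms) from the first and second lists.
inline_first = mean([172, 175, 174, 173, 175])
inline_second = mean([184, 187, 178])

# R2 timings (ms) from the first and second lists.
r2_first, r2_second = 625, 1618

print(is_real_difference(inline_first, inline_second))  # False: within noise
print(is_real_difference(r2_first, r2_second))          # True: far outside noise
```

On these numbers the inline means differ by roughly 5%, while R2 differs by well over 100%, which is the "little resemblance" in question.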
jj2007 wrote: With "naked", that crap is gone, and of course it increases speed dramatically.
It does if no compiler optimisation is employed. I invariably use -gen gcc -Wc -O3 and in this case Naked has no effect.
For -gen gcc -Wc -O3 fans, the following uses just that with jj2007's code, unaltered, which produced the first list above. Bear in mind that I am using an Intel i7, so we cannot just multiply the first list by a factor and get my list; there will be variations.
elapsed Asm: 143ms 24250000 leap years found (inline)
elapsed Asm: 145ms 24250000 leap years found (inline)
elapsed Asm: 144ms 24250000 leap years found (inline)
elapsed Asm: 144ms 24250000 leap years found (inline)
elapsed Asm: 143ms 24250000 leap years found (inline)
elapsed Asm: 261ms 24250000 leap years found (function)
elapsed D: 123ms 24250000 leap years found
elapsed C: 71ms 24250000 leap years found
elapsed B: 71ms 24250000 leap years found
elapsed R2: 70ms 24250000 leap years found (Roland Chastain 2)
elapsed AL: 199ms 24250000 leap years found (ULong)
elapsed AI: 1176ms 24250000 leap years found (ULongInt)
What we have here is proof, on my machine anyway, that compiler-optimised FreeBASIC code will sometimes be faster than our attempts to replace BASIC with asm. I found this when I was developing my fast random number generators. With PowerBASIC, the BASIC I replaced with asm saw a speed improvement in each case. With FreeBASIC I had to check that this was still the case and, in some cases, it was not, so I reverted to BASIC. In theory, asm should be faster than BASIC, even when optimised, but if it is not then it is not. In this case, do we choose asm because it should be faster, or optimised BASIC because it is faster? The choice is clear to me. <smile>
From my list, Roland Chastain's second contribution, as posted, takes the crown.
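As a side check on the figure every row reports: 24250000 is exactly what the Gregorian rule gives over 100,000,000 consecutive years. The loop bounds are not shown in this post, so the range 1 to 100,000,000 is an assumption. Python is used only for illustration, and the function names are hypothetical.

```python
def is_leap(year):
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def count_leap_years(last_year):
    """Closed-form count of leap years in 1..last_year (inclusive)."""
    return last_year // 4 - last_year // 100 + last_year // 400


# Cross-check the closed form against the brute-force rule on a small range.
assert count_leap_years(4000) == sum(is_leap(y) for y in range(1, 4001))

# Assumed benchmark range: 1..100,000,000.
print(count_leap_years(100_000_000))  # 24250000
```

Since every variant in all three lists reports the same count, the timings are at least comparing code that agrees on the answer.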