LeapYear Function (Boolean eval.)

Source-code only - please, don't post questions here.
paul doe
Posts: 151
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: LeapYear Function (Boolean eval.)

Postby paul doe » Nov 13, 2017 15:49

Munair wrote:BTW, speaking of optimization, GCC (or any other) shouldn't be trusted blind-fold with aggressive optimization. Here is a great example I accidentally found: https://oli4444.wordpress.com/2013/11/0 ... -for-loop/

Conclusion reads: "aggressive loop optimization that the [latest] gcc has (which then seems to remove the condition in the loop...)"

;-)

I cannot agree with your conclusion, sorry. The very first comment that the author of that blog got was:

Thomas W. wrote:It seems (from looking at the source code) that “characters” has 127 elements. You loop reaches 127 which accesses the 128th element which is invalid and results in undefined behavior.
There have been some changes to gcc (I think version 4.7) which result in weird behavior when it detects undefined cases.

If the code worked with a backwards loop, that should tell you that the problem lies in the iterator, not inside the loop. I have encountered this situation before of course (I work on 64-bit GCC with full optimization *always on*), but was usually able to track it to me just being stupid. If you do something like that, your code deserves to fail. In fact, that code didn't fail hard enough, in my opinion =D
Munair
Posts: 130
Joined: Oct 19, 2017 15:00
Location: 't Zand, The Netherlands
Contact:

Re: LeapYear Function (Boolean eval.)

Postby Munair » Nov 13, 2017 18:37

paul doe wrote:
Munair wrote:BTW, speaking of optimization, GCC (or any other) shouldn't be trusted blind-fold with aggressive optimization. Here is a great example I accidentally found: https://oli4444.wordpress.com/2013/11/0 ... -for-loop/

Conclusion reads: "aggressive loop optimization that the [latest] gcc has (which then seems to remove the condition in the loop...)"

;-)

I cannot agree with your conclusion, sorry. The very first comment that the author of that blog got was:

Thomas W. wrote:It seems (from looking at the source code) that “characters” has 127 elements. You loop reaches 127 which accesses the 128th element which is invalid and results in undefined behavior.
There have been some changes to gcc (I think version 4.7) which result in weird behavior when it detects undefined cases.

If the code worked with a backwards loop, that should tell you that the problem lies in the iterator, not inside the loop. I have encountered this situation before of course (I work on 64-bit GCC with full optimization *always on*), but was usually able to track it to me just being stupid. If you do something like that, your code deserves to fail. In fact, that code didn't fail hard enough, in my opinion =D

Actually I didn't look at the comments. Thanks for clearing that up. But the question is, should full optimization allow the iterator to go beyond array bounds (I know checking slows down)? I think every programmer has made that mistake once or twice. In any case, know what you do when going for full optimization. For example, Lazarus warns the programmer when setting optimization to -O4 (beware).
paul doe
Posts: 151
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: LeapYear Function (Boolean eval.)

Postby paul doe » Nov 13, 2017 20:24

Munair wrote:Actually I didn't look at the comments. Thanks for clearing that up. But the question is, should full optimization allow the iterator to go beyond array bounds (I know checking slows down)? I think every programmer has made that mistake once or twice. In any case, know what you do when going for full optimization. For example, Lazarus warns the programmer when setting optimization to -O4 (beware).

In this particular case, if the code crashed, it would've very easy to track the error during integration. As the faulty code was executed silently, the programmer had NO idea what went wrong. That's what I meant when I said that the code didn't fail hard enough. Off-by-one errors are among the most common mistakes you can make, along with typos. One of the things that I hate the most when programming in C/C++ like languages, for example, is case sensitiveness. This is especially outrageous when you work with code by others, or when porting it to another language (I port quite a lot of C++ code to FB).

Another example are divisions by zero. Your code can run perfectly fine with one or two per second, but not with a couple thousand (this is very easy to do, for example, when doing 3D or color manipulation). It's another of those very hard to track bugs that normally pass by silently at run-time, when in fact just a single little zero div should send your app crashing down to the floor engulfed in flames =D
jj2007
Posts: 106
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: LeapYear Function (Boolean eval.)

Postby jj2007 » Nov 13, 2017 21:08

paul doe wrote:the code didn't fail hard enough.
That's an old debate among assembler programmers, too. I like code that crashes, and use a just in time debugger to find such bugs. The C/C++ brigade puts structured exception handlers (SEH) everywhere, the result is that bugs go unnoticed, and that Adobe Flash needs a "security update" every other day. SEH is an ugly habit.

Windows itself plays foul in this respect: At least in Win7-64, the wndproc suppresses exceptions unless you explicitly ask for them (which I do, of course). Re div zero, you can enable crashes for FPU code - just change the FPU control word.
paul doe
Posts: 151
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: LeapYear Function (Boolean eval.)

Postby paul doe » Nov 13, 2017 21:15

jj2007 wrote:
paul doe wrote:the code didn't fail hard enough.
That's an old debate among assembler programmers, too. I like code that crashes, and use a just in time debugger to find such bugs. The C/C++ brigade puts structured exception handlers (SEH) everywhere, the result is that bugs go unnoticed, and that Adobe Flash needs a "security update" every other day. SEH is an ugly habit.

Yeah, I think so too. Exception code IS a code smell ;)
jj2007 wrote:Windows itself plays foul in this respect: At least in Win7-64, the wndproc suppresses exceptions unless you explicitly ask for them (which I do, of course). Re div zero, you can enable crashes for FPU code - just change the FPU control word.

Yes, but I'm talking generic, platform-independent code, not ASM specific here. That is something that the compiler should do (and indeed, it does) when you ask him politely, but not on the higher optimization levels.
deltarho[1859]
Posts: 656
Joined: Jan 02, 2017 0:34
Location: UK

Re: LeapYear Function (Boolean eval.)

Postby deltarho[1859] » Nov 13, 2017 22:08

Sorry to interrupt the current conversation but, just out of interest, I ported the ASM function to PowerBASIC and was surprised at the difference.

FreeBASIC (-gen gcc -Wc -O3): elapsed Asm: 266ms 24250000 leap years found
FreeBASIC (gas): elapsed Asm: 439ms 24250000 leap years found
PowerBASIC: elapsed Asm: 547ms 24250000 leap years found

In this particular instance optimised FB is twice a fast as PB and gas is 20% faster than PB.
jj2007
Posts: 106
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: LeapYear Function (Boolean eval.)

Postby jj2007 » Nov 14, 2017 7:33

deltarho[1859] wrote:I ported the ASM function to PowerBASIC and was surprised at the difference.

FreeBASIC (-gen gcc -Wc -O3): elapsed Asm: 266ms 24250000 leap years found
FreeBASIC (gas): elapsed Asm: 439ms 24250000 leap years found
PowerBASIC: elapsed Asm: 547ms 24250000 leap years found
Indeed, there shouldn't be any difference, if the compilers inline the asm code "as is". But I saw in my tests that gcc, for example, inserts calls to some functions, maybe bound checking, no idea. Which IMHO is an issue for the developers: If a coder decides to use inline assembler, s/he should know exactly what s/he is doing, and therefore doesn't need a helping hand. We inline that to get a speed gain, not because we love assembler so much, right?
MrSwiss
Posts: 2022
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: LeapYear Function (Boolean eval.)

Postby MrSwiss » Nov 14, 2017 11:05

@jj2007,

I think that, to suppress "framing" you'll have to use the "naked" key-word in the Function.
(For details, on pros and cons, see FB-Manual.)
jj2007
Posts: 106
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: LeapYear Function (Boolean eval.)

Postby jj2007 » Nov 14, 2017 14:35

Thanks, MrSwiss. I've checked the manual but

Code: Select all

Function LeapYearJ Naked (ByVal yea As ULong) As ULong
Asm
  mov ecx, dword ptr [yea]
  ...
gives me error 17: Syntax error, found 'function' ... any idea what's wrong?
deltarho[1859]
Posts: 656
Joined: Jan 02, 2017 0:34
Location: UK

Re: LeapYear Function (Boolean eval.)

Postby deltarho[1859] » Nov 14, 2017 14:51

@jj2007

Following the Naked keyword should be cdecl, stacall or pascal. I mentioned on page 5 that I try it but there was no change in speed.
dodicat
Posts: 4410
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: LeapYear Function (Boolean eval.)

Postby dodicat » Nov 14, 2017 14:59

Here is cdecl naked.
But I am not an asm member.
Didn't test for speed though.

Code: Select all




Function LeapYearN naked cdecl(ByVal yea As ULong) As ULong
Asm
  mov ecx, dword ptr [esp + 4]
  test ecx, 3
  setz al
  jne 0f
  mov eax, ecx
  cdq
  push 100
  idiv dword ptr [esp]   ' 0, 4, 8, ..., 96, 0
  test edx, edx   ' mod 100=0 should set zero flag
  jne 1f
  mov eax, ecx
  cdq
  push 400
  idiv dword ptr [esp]
  pop eax
  dec edx   '*** invert the zero flag ***
  sets dl   ' sign set if edx was zero
  test dl, dl
1:
  pop eax
  setnz al
0:
  movsx eax, al
  ret
  'mov [function], eax   ' MichaelW
'   leave
'   ret 4
'   pop edi
'   pop esi
'   pop ebx
'   mov esp, ebp
'   pop ebp
'   ret 4
End Asm
End Function

Function LeapYearJ(ByVal yea As ULong) As ULong
Asm
  mov ecx, dword ptr [yea]
  test ecx, 3
  setz al
  jne 0f
  mov eax, ecx
  cdq
  push 100
  idiv dword ptr [esp]   ' 0, 4, 8, ..., 96, 0
  test edx, edx   ' mod 100=0 should set zero flag
  jne 1f
  mov eax, ecx
  cdq
  push 400
  idiv dword ptr [esp]
  pop eax
  dec edx   '*** invert the zero flag ***
  sets dl   ' sign set if edx was zero
  test dl, dl
1:
  pop eax
  setnz al
0:
  movsx eax, al
  mov [function], eax   ' MichaelW
'   leave
'   ret 4
'   pop edi
'   pop esi
'   pop ebx
'   mov esp, ebp
'   pop ebp
'   ret 4
End Asm
End Function
for n as long=2000 to 3000
    print n, LeapYearN(n),LeapyearJ(n), LeapYearN(n)=LeapyearJ(n)
next
sleep


 
deltarho[1859]
Posts: 656
Joined: Jan 02, 2017 0:34
Location: UK

Re: LeapYear Function (Boolean eval.)

Postby deltarho[1859] » Nov 14, 2017 15:41

I mentioned that I got no speed increase using Naked. I just tried it again and got 266ms whether I used Naked or not.

However, I was using -gen gcc -Wc -O3.

With gas I went from 439ms to 281ms using Naked which is amazing.

Perhaps -O3 is doing a Naked like tweak so using Naked here is redundant.
deltarho[1859]
Posts: 656
Joined: Jan 02, 2017 0:34
Location: UK

Re: LeapYear Function (Boolean eval.)

Postby deltarho[1859] » Nov 14, 2017 16:48

All this reminded me of PowerBASIC's Fastproc which is a function that does not use a stack frame. It has some restrictions but the asm function we are using here can converted to a Fastproc.

PowerBASIC went from 547ms to 331ms, getting closer to FreeBASIC with compiler optimisation.
jj2007
Posts: 106
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: LeapYear Function (Boolean eval.)

Postby jj2007 » Nov 14, 2017 17:35

Thanks a lot! I like it:

Code: Select all

elapsed Asm: 184 ms          24250000 leap years found (inline)
elapsed Asm: 187 ms          24250000 leap years found (inline)
elapsed Asm: 178 ms          24250000 leap years found (inline)
elapsed Asm: 358 ms          24250000 leap years found (function)  <<< naked cdecl
elapsed Asm: 361 ms          24250000 leap years found (function)
elapsed Asm: 361 ms          24250000 leap years found (function)
elapsed D:   1737 ms         24250000 leap years found
elapsed C:   1627 ms         24250000 leap years found
elapsed B:   1615 ms         24250000 leap years found
elapsed R2:  1618 ms         24250000 leap years found (Roland Chastain 2)
elapsed AL:  1782 ms         24250000 leap years found (ULong)
elapsed AI:  3317 ms         24250000 leap years found (ULongInt)
deltarho[1859]
Posts: 656
Joined: Jan 02, 2017 0:34
Location: UK

Re: LeapYear Function (Boolean eval.)

Postby deltarho[1859] » Nov 14, 2017 17:55

@jj2007

Your latest results bear little resemblance to earlier results where R2 was faster than Asm(function).

Return to “Tips and Tricks”

Who is online

Users browsing this forum: No registered users and 3 guests