gcc and asm

deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK

Post by deltarho[1859] »

I have some code which uses two Naked subroutines and one Naked function. It works with gas, with gcc, and with gcc using -O1. With gcc -O2 and -O3, however, my asm logic gets corrupted.

srvaldez mentions two possible solutions here. However, neither of them is available with FreeBASIC. The #pragma method has been available in gcc since, wait for it, version 4.4.

Many is the time that I have commented out some BASIC code, replaced it with some asm and kept my fingers crossed during testing. If it works, then all well and good, but if not, then I revert to the BASIC code because I do not want the lion's share of the code running at -O1.

I have found that I use the inline assembler much less often than I did with PowerBASIC because -O2 and -O3 are so powerful. However, there are times when asm does give us an 'edge', when it works.
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Post by srvaldez »

Could you post one of the subs/functions that fails?
Try this (not sure if it will make a difference): add .text at the end of your asm sub or function.

Post by deltarho[1859] »

@srvaldez

The problem is that sometimes it works, and sometimes it doesn't. By that, I mean I can close WinFBE, reopen it, and all is well, which makes me suspect code alignment.

I was not going to post this as I doubted that anyone would be interested - I wanted to get acquainted with Naked and passing a pointer to a UDT more than anything.

So, here is the whole code - not many lines.

The problem seems to be with SFCSetSnapshot toward the bottom of the code. If I comment that call out, I can use -O3 but, obviously, the snapshot is not being used. With -O2 and -O3, printing belts on past six outputs.

Code:

'#Console On
 
#Include Once "windows.bi"
#Inclib "bcrypt"
#Include Once "win/wincrypt.bi"

TYPE SFC32
  dw_a AS Ulong
  dw_b AS Ulong
  dw_c AS Ulong
  dw_counter AS Ulong = 1
  snapa As Ulong
  snapb As Ulong
  snapc As Ulong
  snapcounter As Ulong
END TYPE

Sub SFCGetSnapshot Naked cdecl ( ByVal ptrRangeN As Any Ptr )
Asm
  mov esi, Dword Ptr [esp+4]  ' dw_a
  mov ecx, esi
  add ecx, 16
  mov edi, ecx ' snapa
  mov ecx, 4
  rep movsd
  ret
End Asm
End Sub

Sub SFCSetSnapshot Naked cdecl ( ByVal ptrRangeN As Any Ptr )
Asm
  mov edi, Dword Ptr [esp+4]  ' dw_a
  mov ecx, edi
  add ecx, 16
  mov esi, ecx ' snapa
  mov ecx, 4
  rep movsd
  ret
End Asm
End Sub

Function SFCRange Naked cdecl ( ByVal ptrRangeN As Any Ptr,  ByVal First As Long, ByVal Last As long ) As Long
Asm
  ' Generator
  mov esi, Dword Ptr [esp+4]  'ptrRangeN
  mov ecx, Dword Ptr [esi + 12]          ' counter.
  inc ecx
  mov Dword Ptr [esi + 12], ecx           ' write back the inc'd counter.
  add ecx, Dword Ptr [esi]
  mov edx, Dword Ptr [esi + 4]
  add ecx, edx                   ' tmp = ecx = counter + a + b  :  edx = b
  mov eax, edx
  shr edx, 9
  xor eax, edx
  mov Dword Ptr [esi], eax                 ' a = b XOR (SHIFT RIGHT b, 9)
  mov eax, Dword Ptr [esi + 8]
  mov edx, eax                   ' eax = edx = c.
  shl edx, 3
  add edx, eax
  mov Dword Ptr [esi + 4], edx             ' b = c + (SHIFT LEFT c, 3) : eax = c
  rol eax, 21
  add eax, ecx
  mov Dword Ptr [esi + 8], eax             ' c = (ROTATE LEFT c, 21) + tmp = eax
  ' End of generator
  ' Determine random value within First and Last inclusive
  mov edx, Dword Ptr [esp+12] ' Last
  sub edx, Dword Ptr [esp+8]   ' - First
  inc edx                         ' Range
  mul edx                         ' multiply by eax and put into edx:eax
  add edx, Dword Ptr [esp+8]
  mov eax, edx
  ret
End Asm
End Function

Sub SFCInitialize( ByVal ptrRangeN As Any Ptr )
Dim hRand As BCRYPT_ALG_HANDLE Ptr
  BCryptOpenAlgorithmProvider( @hRand, BCRYPT_RNG_ALGORITHM, 0, 0 )
  BCryptGenRandom( hRand, ptrRangeN, 12, 0 ) ' dw_a, dw_b and dw_c
  BCryptCloseAlgorithmProvider( hRand, 0 )
  ' Warm up
  For i As Ulong = 1 To 100
    SFCRange( ptrRangeN, 1, 10 )
  Next
  ' Copy initial state
  SFCGetSnapshot( ptrRangeN )
End Sub

Dim Range0 As SFC32
Dim ptrRange0 As SFC32 Ptr : ptrRange0 = @Range0

SFCInitialize( ptrRange0 )

Dim As Ulong i

Print "Random initial stage"
For i = 1 to 4
  Print SFCRange( ptrRange0, 0, 255 )
Next
Print
SFCGetSnapshot ptrRange0
Print "Random second stage"
For i = 1 to 6
  Print SFCRange( ptrRange0, 0, 255 )
Next
Print
SFCSetSnapshot ptrRange0
Print "Repeat second stage"
For i = 1 to 6
  Print SFCRange( ptrRange0, 0, 255 )
Next

Sleep

Post by srvaldez »

I can't get it to fail, what gcc version are you using?

Post by deltarho[1859] »

srvaldez wrote: I can't get it to fail

Oh, dear!

To keep life simple: gcc 5.2

I tried fbc 1.07.1 and 1.07.3

This is what I get with -O1

Code:

Random initial stage
 17
 9
 189
 141

Random second stage
 25
 209
 31
 73
 72
 133

Repeat second stage
 25
 209
 31
 73
 72
 133

With -O2 the Repeat stage just keeps printing.

If I loop for 5 or 7 then printing stops at 5 or 7 but the output is not a repeat of the second stage.

Post by srvaldez »

I did not know what output to expect; with optimization, the Repeat stage keeps printing here as well.

Post by srvaldez »

It looks like the variable i gets clobbered, hence the infinite loop. It appears that the For i loop keeps i in a register that is also used in your asm code.
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands

Post by marcov »

Hooking onto srvaldez' train of thought: try to push and pop esi and edi before/after your asm code, and see what happens.

From https://en.wikibooks.org/wiki/X86_Assem ... _Languages :


CDECL calling convention specifies a number of different requirements:

.....
The volatile registers are: EAX, ECX, EDX, ST0 - ST7, ES and GS
The non-volatile registers are: EBX, EBP, ESP, EDI, ESI, CS and DS

Post by srvaldez »

@marcov
yes that fixed the problem

Code:

'#Console On
 
#Include Once "windows.bi"
#Inclib "bcrypt"
#Include Once "win/wincrypt.bi"

TYPE SFC32
  dw_a AS Ulong
  dw_b AS Ulong
  dw_c AS Ulong
  dw_counter AS Ulong = 1
  snapa As Ulong
  snapb As Ulong
  snapc As Ulong
  snapcounter As Ulong
END TYPE

Sub SFCGetSnapshot Naked cdecl ( ByVal ptrRangeN As Any Ptr )
Asm
  push esi                      ' preserve esi and edi for the caller
  push edi
  mov esi, Dword Ptr [esp+8+4]  ' dw_a (the two pushes add 8 to the offset)
  mov ecx, esi
  add ecx, 16
  mov edi, ecx ' snapa
  mov ecx, 4
  rep movsd
  pop edi                       ' restore in reverse order
  pop esi
  ret
End Asm
End Sub

Sub SFCSetSnapshot Naked cdecl ( ByVal ptrRangeN As Any Ptr )
Asm
  push esi                      ' preserve esi and edi for the caller
  push edi
  mov edi, Dword Ptr [esp+8+4]  ' dw_a (the two pushes add 8 to the offset)
  mov ecx, edi
  add ecx, 16
  mov esi, ecx ' snapa
  mov ecx, 4
  rep movsd
  pop edi                       ' restore in reverse order
  pop esi
  ret
End Asm
End Sub

Function SFCRange Naked cdecl ( ByVal ptrRangeN As Any Ptr,  ByVal First As Long, ByVal Last As Long ) As Long
Asm
  push esi
  push edi
  ' Generator
  mov esi, Dword Ptr [esp+8+4]  'ptrRangeN
  mov ecx, Dword Ptr [esi + 12]          ' counter.
  inc ecx
  mov Dword Ptr [esi + 12], ecx           ' write back the inc'd counter.
  add ecx, Dword Ptr [esi]
  mov edx, Dword Ptr [esi + 4]
  add ecx, edx                   ' tmp = ecx = counter + a + b  :  edx = b
  mov eax, edx
  shr edx, 9
  xor eax, edx
  mov Dword Ptr [esi], eax                 ' a = b XOR (SHIFT RIGHT b, 9)
  mov eax, Dword Ptr [esi + 8]
  mov edx, eax                   ' eax = edx = c.
  shl edx, 3
  add edx, eax
  mov Dword Ptr [esi + 4], edx             ' b = c + (SHIFT LEFT c, 3) : eax = c
  rol eax, 21
  add eax, ecx
  mov Dword Ptr [esi + 8], eax             ' c = (ROTATE LEFT c, 21) + tmp = eax
  ' End of generator
  ' Determine random value within First and Last inclusive
  mov edx, Dword Ptr [esp+8+12] ' Last
  sub edx, Dword Ptr [esp+8+8]   ' - First
  inc edx                         ' Range
  mul edx                         ' multiply by eax and put into edx:eax
  add edx, Dword Ptr [esp+8+8]
  mov eax, edx
  pop edi
  pop esi
  ret
End Asm
End Function

Sub SFCInitialize( ByVal ptrRangeN As Any Ptr )
Dim hRand As BCRYPT_ALG_HANDLE Ptr
  BCryptOpenAlgorithmProvider( @hRand, BCRYPT_RNG_ALGORITHM, 0, 0 )
  BCryptGenRandom( hRand, ptrRangeN, 12, 0 ) ' dw_a, dw_b and dw_c
  BCryptCloseAlgorithmProvider( hRand, 0 )
  ' Warm up
  For i As Ulong = 1 To 100
    SFCRange( ptrRangeN, 1, 10 )
  Next
  ' Copy initial state
  SFCGetSnapshot( ptrRangeN )
End Sub

Dim Range0 As SFC32
Dim ptrRange0 As SFC32 Ptr : ptrRange0 = @Range0

SFCInitialize( ptrRange0 )

Dim As ulong iii, i

Print "Random initial stage"
For i = 1 to 4
  iii=SFCRange( ptrRange0, 0, 255 )
  Print iii 'SFCRange( ptrRange0, 0, 255 )
Next
Print
SFCGetSnapshot ptrRange0
Print "Random second stage"
For i = 1 to 6
    iii=SFCRange( ptrRange0, 0, 255 )
  Print iii 'SFCRange( ptrRange0, 0, 255 )
Next
Print
SFCSetSnapshot ptrRange0
Print "Repeat second stage"
For i = 1 to 6
   iii=SFCRange( ptrRange0, 0, 255 )
  Print iii 'SFCRange( ptrRange0, 0, 255 )
Next

Sleep

Post by deltarho[1859] »

@marcov

You have the volatile and non-volatile registers the wrong way around. The link you provided is wrong.

With PowerBASIC EBX, ESI and EDI are automatically pushed at the beginning of a procedure and automatically popped before exiting the procedure. This is to conform to the Windows programming conventions. FreeBASIC does not mention anything about preserving registers.

To play it safe then we should also PUSH EBX.
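As a rough sketch of what that would look like, here is a made-up Naked function (AddViaEbx is purely illustrative and not part of the code above) that uses EBX, ESI and EDI and so pushes all three. It assumes 32-bit x86 and cdecl, as in the rest of this thread. With three pushes the first argument moves from [esp+4] to [esp+12+4].

```freebasic
' Hypothetical example: 32-bit x86 only, cdecl, Intel syntax as used in this thread.
Function AddViaEbx Naked cdecl ( ByVal a As Long, ByVal b As Long ) As Long
Asm
  push ebx
  push esi
  push edi
  mov esi, Dword Ptr [esp+12+4]   ' a: 12 bytes of pushes, then 4 for the return address
  mov edi, Dword Ptr [esp+12+8]   ' b
  mov ebx, esi
  add ebx, edi
  mov eax, ebx                    ' a Long result is returned in eax
  pop edi                         ' pop in the reverse order of the pushes
  pop esi
  pop ebx
  ret
End Asm
End Function

Print AddViaEbx( 2, 3 )
```

Every extra push shifts the argument offsets by another 4 bytes, so writing the offset as [esp+pushed+n] keeps the arithmetic visible.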

@srvaldez

You have committed the cardinal sin of making two changes at once, so that it is not possible to tell which change worked.

There is no need for:

Code:

iii=SFCRange( ptrRange0, 0, 255 )
Print iii 'SFCRange( ptrRange0, 0, 255 )

The original

Code:

Print SFCRange( ptrRange0, 0, 255 )

is fine.

So thanks marcov.

I should mention that SFCRange is an adaptation of code written by Daniel Penny at the PowerBASIC forum. He wanted to produce fast bounded integers. I pointed him to my PCG32 dll but he wanted thread safety as well. PCG32II is thread safe but the dll uses the original PCG32 which is not thread safe. He found a generator by Chris Doty-Humphrey, the author of PractRand, and uses a UDT to store the state vector to enable thread safety; as I do with PCG32II. He also used PowerBASIC's FastProc which reduces the function overhead. No stack frame is created, so he had to put the integer bounds in the UDT and FastProc returns Longs. FastProc only allows two Long parameters, and they are passed via ESI and EDI. I adapted the code to use FreeBASIC's Naked.

The FastProc range beat PCG32's range on speed. PCG32 loses many grunts when put into a dll. PCG32II's range, on the other hand, beats the Naked approach, so the Naked approach is rendered an interesting exercise.

Chris Doty-Humphrey's generator is good. I converted the Longs into Dwords, and it passed PractRand to 2TB with just one small anomaly. However, it is not as fast as PCG32II.
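For anyone who would rather read the generator in BASIC, the following is a minimal sketch of what the asm in SFCRange does, written from the asm comments above. SFCRangeBasic and the fixed seed values are made up for illustration; the real code seeds from BCryptGenRandom and warms up first. The bounded value is the high 32 bits of the 64-bit product of the output and the range, which is what mul edx leaves in edx.

```freebasic
' Illustrative sketch only: a plain-BASIC reading of the SFCRange asm.
Type SFC32
  dw_a As Ulong
  dw_b As Ulong
  dw_c As Ulong
  dw_counter As Ulong = 1
End Type

Function SFCRangeBasic( ByRef s As SFC32, ByVal First As Long, ByVal Last As Long ) As Long
  s.dw_counter += 1                                   ' inc counter and write it back
  Dim As Ulong tmp = s.dw_counter + s.dw_a + s.dw_b   ' tmp = counter + a + b
  Dim As Ulong b = s.dw_b
  Dim As Ulong c = s.dw_c
  s.dw_a = b Xor (b Shr 9)                            ' a = b XOR (b SHR 9)
  s.dw_b = c + (c Shl 3)                              ' b = c + (c SHL 3)
  s.dw_c = CULng( (c Shl 21) Or (c Shr 11) ) + tmp    ' c = (c ROL 21) + tmp
  ' Scale the 32-bit output into First..Last inclusive
  Dim As Ulong span = Last - First + 1
  Dim As ULongInt product = CULngInt( s.dw_c ) * span ' the edx:eax pair after mul
  Return First + CLng( product Shr 32 )               ' high dword + First
End Function

' Fixed seed purely for demonstration
Dim state As SFC32
state.dw_a = 1 : state.dw_b = 2 : state.dw_c = 3
For i As Long = 1 To 4
  Print SFCRangeBasic( state, 0, 255 )
Next
```

Because the state lives in the UDT that is passed in, each thread can own its own SFC32 variable.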

This thread has been worthwhile because I now know that if we use any of the volatile registers in Naked, then they should be preserved.

Perhaps fxm can mention that in the documentation.

Post by srvaldez »

arghhh, I thought I had undone that
fxm
Moderator
Posts: 12106
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Post by fxm »

deltarho[1859] wrote:This thread has been worthwhile because I now know that if we use any of the volatile registers in Naked, then they should be preserved.

Perhaps fxm can mention that in the documentation.

Not only in naked procedure, but in any asm block.

Paragraph already existing on the ASM documentation page:
Register Preservation
  • When an Asm block is opened, the registers ebx, esi, and edi are pushed to the stack, when the block is closed, these registers are popped back from the stack. This is because these registers are required to be preserved by most or all OS's using the x86 CPU. You can therefore use these registers without explicitly preserving them yourself. You should not change esp and ebp, since they are usually used to address local variables.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Post by jj2007 »

fxm wrote:
deltarho[1859] wrote:This thread has been worthwhile because I now know that if we use any of the volatile registers in Naked, then they should be preserved.

Perhaps fxm can mention that in the documentation.
Not only in naked procedure, but in any asm block.

There is no need to preserve esi, edi and ebx in a normal (non-naked) function.

Post by deltarho[1859] »

fxm wrote: Paragraph already existing on the ASM documentation page:

I missed that - I was expecting a comment in the Naked description, although at the bottom it does include Asm, among others, in the See also.

However, I opened an Asm block and closed it, so why do we have to preserve esi and edi? The giveaway is what Naked does or, more to the point, what it doesn't do: "Write functions without prolog/epilog code". Pushing and popping, among other things, is prolog/epilog code.

So in the 'Register Preservation' paragraph we should be warned that, although an Asm block is opened in Naked, ebx, esi and edi are not pushed to the stack, and it is our responsibility to preserve a volatile register if it is changed in the Asm block. I think that a warning should also be given in the Naked description for anyone who, like me, did not check out the Asm description.

PowerBASIC does not have 'Asm/End Asm' but has 'Prefix "ASM "/End Prefix'. Effectively, we have then an Asm block but no registers are preserved; that only occurs with procedures including FastProc.

So FreeBASIC is a safer environment except for Naked, which may bite us, as it did me.

Post by marcov »

deltarho[1859] wrote:@marcov

You have the volatile and non-volatile registers the wrong way around. The link you provided is wrong.

With PowerBASIC EBX, ESI and EDI are automatically pushed at the beginning of a procedure and automatically popped before exiting the procedure. This is to conform to the Windows programming conventions. FreeBASIC does not mention anything about preserving registers.

This (the PB thing) is not normal. Normally the callee saves non-volatile registers only if it uses them, not always. Saving them unconditionally would slow things down unnecessarily and prevent assembler users from achieving maximum performance.

But some compilers do this out of self-compatibility/consistency with some older, more primitive(*) version. FPC also had something similar until version 2.0, which totally rewrote register allocation. Now asm blocks can signal to the compiler which registers are modified.

(*) 16-bit hosted compilers have special memory tradeoffs, and usually chose to keep things simpler and use the memory to compile/link larger programs.
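To illustrate the "only if used" point with the code from this thread: a snapshot copy that touches only EAX, ECX and EDX, the registers the cdecl list quoted earlier calls volatile, would need no pushes at all. This is a hedged sketch, not a tested replacement. SFCGetSnapshotVolatileOnly is a made-up name, and it assumes 32-bit x86 and the SFC32 layout posted above.

```freebasic
' Hypothetical variant: 32-bit x86 only, cdecl. Offsets assume the SFC32 layout
' from this thread: dw_a..dw_counter at 0..12, snapa..snapcounter at 16..28.
Type SFC32
  dw_a As Ulong
  dw_b As Ulong
  dw_c As Ulong
  dw_counter As Ulong = 1
  snapa As Ulong
  snapb As Ulong
  snapc As Ulong
  snapcounter As Ulong
End Type

Sub SFCGetSnapshotVolatileOnly Naked cdecl ( ByVal ptrRangeN As Any Ptr )
Asm
  mov ecx, Dword Ptr [esp+4]      ' ptrRangeN; no pushes, so the plain offset
  mov eax, Dword Ptr [ecx]        ' dw_a
  mov edx, Dword Ptr [ecx+4]      ' dw_b
  mov Dword Ptr [ecx+16], eax     ' snapa
  mov Dword Ptr [ecx+20], edx     ' snapb
  mov eax, Dword Ptr [ecx+8]      ' dw_c
  mov edx, Dword Ptr [ecx+12]     ' dw_counter
  mov Dword Ptr [ecx+24], eax     ' snapc
  mov Dword Ptr [ecx+28], edx     ' snapcounter
  ret
End Asm
End Sub

Dim demo As SFC32
demo.dw_a = 11 : demo.dw_b = 22 : demo.dw_c = 33
SFCGetSnapshotVolatileOnly( @demo )
Print demo.snapa, demo.snapb, demo.snapc, demo.snapcounter
```

Whether this is any faster than rep movsd with two push/pop pairs would need measuring; the point is only that nothing has to be saved when ESI, EDI and EBX are left alone.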