A fast CPRNG

Windows specific questions.
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

A bug has crept into CryptoRndII. If you were using CryptoRnd then go back to it.
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

The FillBuffer Sub should get used four times during the InitializeCryptoBuffers Sub but it does not always happen. I doubt that is a bug in my code. I have seen a report, 17 Aug 2017, about Threadpool corrupting the stack. This needs further investigation but if Threadpool is an issue then goodbye CryptoRndII. Oh, dear - come back CryptoRnd, all is forgiven. <smile>
integer
Posts: 408
Joined: Feb 01, 2007 16:54
Location: usa

Re: A fast CPRNG

Post by integer »

deltarho[1859] wrote:The FillBuffer Sub should get used four times during the InitializeCryptoBuffers Sub but it does not always happen. I doubt that is a bug in my code. I have seen a report, 17 Aug 2017, about Threadpool corrupting the stack. This needs further investigation but if Threadpool is an issue then goodbye CryptoRndII. Oh, dear - come back CryptoRnd, all is forgiven. <smile>
Since today is July 4 and you saw the report on Aug 17 ...

I like what you have done with this topic.
My attempt is to shorten the code (your code) and have it run completely independent of other possible threads.

Even though this is random numbers, it is necessary (for my application) to generate the exact (identical)
sequence for any given specific seed value. Why my routine fails, it may thus be possible to replicate the cause.

thanks
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

Since today is July 4 and you saw the report on Aug 17 ...
integer, you did make me laugh. That should have read 17 Aug 2016, the date of publication - which I saw earlier today. I mentioned the date because that puts it into recent times. Had it been 2010, say, then Microsoft should have gotten on top of it by now, one hopes.
it is necessary (for my application) to generate the exact (identical) sequence for any given specific seed value.
Well, CryptoRnd won't work for you - it is not seeded by the user and it is anybody's guess what the sequence would be, being cryptographic.

PCG32II on the other hand can be seeded and is ridiculously short on code. Have a look at the code section 'Here is PCG32II.bas ( 2 July 2017 )' in this post. On my machine I am getting close to half a billion random singles per second in 32 bit mode and it passes PractRand testing to one terabyte.

You can use:

Code: Select all

Dim pcg as pcg32
Dim Seed as Ulongint = 123456789
Dim __ as UlongInt ' Noticed dodicat using this the other day - really neat!

pcg.MyRandomize( Seed, __ )
The dummy '__' is a sequence index but you don't need that - change it and the sequence will change.
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

<Much embarrassment>

It was a bug in my code. It crept in when I gave it the same treatment as PCG32II.

There is a small amount of extra work involved but at this level of speed it counts.

CryptoRndII is now 460 MHz in 32 bit mode and 473 MHz in 64 bit mode. This is a similar relationship to CryptoRnd where 64 bit mode wasn't much faster than 32 bit mode.

A quick and dirty 32 bit test gave 0.4999900108985607 for an average of 10^8 singles and 50.00087293 for an average of 10^8 range values (0,100). Similar results with 64 bit.

This now puts PCG32II as the fastest generator on my machine for both 32 bit and 64 bit.

I will now do a PractRand one terabyte test before declaring all is well, or not. <smile>

I should have done a PractRand test as the last job - it would have found something amiss. I did so with PCG32II.
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

It passed a PractRand one terabyte test with three small anomalies so we are back in business. Rather than update the code above which is now on a different page it is posted here.

CryptoRndII comparison with PCG32II

Code size: PCG32II is very much smaller than CryptoRndII.
Speed: Comparable in 32 bit, PCG32II is faster in 64 bit.
Thread Safety: PCG32II is thread safe, CryptoRndII is not.
PractRand: Comparable with a one terabyte test.
Seeding: CryptoRndII is a seedless generator, PCG32II may be seeded, random or fixed, and have different sequences, random or fixed.

'A fast CPRNG' has been an interesting exercise - twin buffering and the use of a threadpool - but I will now be using PCG32II as my default generator.

CryptpRndII.bas (13 Decg 2017) Some 'Sleep' statements, for test purposes, had been left in - now removed.

Code: Select all

#include once "windows.bi"
#include once "win/bcrypt.bi"
#inclib "bcrypt"

#ifndef ALGO
  #define ALGO 1
#endif
 
#if (ALGO = 2)
  Declare Function RtlGenRandom Lib "Advapi32.dll" Alias "SystemFunction036" _
  ( RandomBuffer As Any Ptr, RandomBufferLength As Ulong ) As Byte
#endif

Dim Shared As UByte Buffer0(), Buffer1()
Dim Shared As Long BufferSize
Dim Shared As Any Ptr ptrBuffer, ptrBaseBuffer0, ptrBaseBuffer1
Dim Shared As Any Ptr ptrBaseBuffer0plus, ptrBaseBuffer1plus
Dim Shared As Long SwitchBufferCriteria
Dim Shared Pool As PTP_POOL
Dim Shared As PTP_WORK Work0, Work0plus, Work1, Work1plus
Dim Shared hRand As BCRYPT_ALG_HANDLE

Declare Sub SwitchBuffer
Declare Sub FillBuffer( As PTP_CALLBACK_INSTANCE, As PVOID Ptr, As PTP_WORK )
Declare Sub ResetBufferPointer
Declare Sub InitializeCryptoBuffers( As Long )

#If (ALGO = 1)
  BufferSize = 128*1024
#Else
  BufferSize = 32*1024
#Endif
 
Pool = CreateThreadpool(Null)

InitializeCryptoBuffers( BufferSize )
 
Private Function CryptoDW As Ulong
 
  If ptrBuffer >= SwitchBufferCriteria Then
    SwitchBuffer
  End If
 
  Asm
    mov eax, dword Ptr [ptrBuffer]
    mov eax, [eax]
    mov [Function], eax
  End Asm
 
  ptrBuffer += 4
 
End Function
 
Private Function CryptoS As Double ' [0,1)
Dim As Ulong TempVar
 
  If ptrBuffer >= SwitchBufferCriteria Then
    SwitchBuffer
  End If
  Asm
  	mov eax, dword Ptr [ptrBuffer]
  	mov eax, [eax]
  	mov dword Ptr [TempVar], eax
  End Asm
  ptrBuffer += 4
  Return TempVar/4294967296.0

End Function
 
Private Function CryptoSX As Double ' [-1,1]
Dim As Ulong TempVar
 
  If ptrBuffer >= SwitchBufferCriteria Then
    SwitchBuffer
  End If
 
  Asm
    mov eax, dword Ptr [ptrBuffer]
    mov eax, [eax]
    mov dword Ptr [TempVar], eax  
  End Asm
  ptrBuffer += 4
  Return TempVar/2147483648.0 - 1
 
End Function
 
Private Function CryptoD As Double  ' [0,1)
Dim As Ulongint uld
 
  If ptrBuffer >= SwitchBufferCriteria Then
    SwitchBuffer
  End If
 
  ' ASM by Wilbert at PureBasic forums
  Asm
    mov eax, dword Ptr [ptrBuffer]
    movd xmm0, [eax]
    movd xmm1, [eax + 4]
    punpckldq xmm0, xmm1
    psrlq xmm0, 12
    mov eax, 1
    cvtsi2sd xmm1, eax
    por xmm0, xmm1
    subsd xmm0, xmm1
    movq [Function], xmm0
  End Asm
 
  ptrBuffer += 8
 
End Function
 
Private Function CryptoDX As Double  ' [-1,1]
 
  If ptrBuffer >= SwitchBufferCriteria Then
    SwitchBuffer
  End If
 
  ' ASM adapted from CryptoD by author
  Asm
    mov eax, dword Ptr [ptrBuffer]
    movd xmm0, [eax]
    movd xmm1, [eax + 4]
    punpckldq xmm0, xmm1
    psrlq xmm0, 12
    mov eax, 2
    cvtsi2sd xmm1, eax
    por xmm0, xmm1
    subsd xmm0, xmm1
    mov eax, 1
    cvtsi2sd xmm1, eax
    subsd xmm0, xmm1
    movq [Function], xmm0
  End Asm
 
  ptrBuffer += 8
 
End Function
 
Private Function CryptoR( Byval One As Long, Byval Two As Long ) As Long
Dim As Ulong TempVar
 
  If ptrBuffer >= SwitchBufferCriteria Then
    SwitchBuffer
  End If
 
' ASM by John Gleason @ PowerBASIC forums
	Asm
  mov edx, dword Ptr [ptrBuffer]
  mov edx, [edx]
  mov ecx, [One]
  mov eax, [Two]
  cmp ecx, eax
  jl 0f
  xchg eax, ecx
0:
  Sub eax, ecx
  inc eax
  jz 1f
  mul edx
  Add edx, ecx
1:
  mov [Function], edx
	End Asm
	
  ptrBuffer += 4
 
End Function
 
Private Function Gauss As Single
Static As Long u2_cached
Static As Single u1, u2, x1, x2, w
 
  If u2_cached = -1 Then
    u2_cached = 0
    Function = u2
  Else
    Do
      x1 = CryptoS
      x2 = CryptoS
      w = x1 * x1 + x2 * x2
    Loop While w >= 1
    w = Sqr( -2 * Log(w)/w )
    u1 = x1 * w
    u2 = x2 * w
    u2_cached = -1
    Function = u1
  End If
 
End Function
 
Private Sub InitializeCryptoBuffers( Byval Buffer As Long )
  #If (ALGO = 1)
    BCryptOpenALGOrithmProvider Varptr(hRand), BCRYPT_RNG_ALGORITHM, 0, 0
  #endif
  If Buffer < 1024 Then
    BufferSize = 1024
  Else
    BufferSize = Buffer - Buffer Mod 8
  End If
  Redim Buffer0( 1 To BufferSize) As UByte
  ptrBaseBuffer0 = Varptr( Buffer0(1) )
  ptrBuffer = ptrBaseBuffer0
  #ifdef __FB_64BIT__
    SwitchBufferCriteria = Cast(Longint, ptrBuffer) + BufferSize
  #Else
    SwitchBufferCriteria = Cast(Long, ptrBuffer) + BufferSize
  #endif
  Work0 = CreateThreadpoolWork(Cast(PTP_WORK_CALLBACK, @FillBuffer), @ptrBaseBuffer0, Null)
  ptrBaseBuffer0plus = ptrBaseBuffer0 + BufferSize\2
  Work0plus = CreateThreadpoolWork(Cast(PTP_WORK_CALLBACK,@FillBuffer), @ptrBaseBuffer0plus, Null)
  SubmitThreadpoolWork(Work0)
  SubmitThreadpoolWork(Work0plus)
  Redim Buffer1( 1 To BufferSize) As UByte
  ptrBaseBuffer1 = Varptr( Buffer1(1) )
  Work1 = CreateThreadpoolWork(Cast(PTP_WORK_CALLBACK,@FillBuffer), @ptrBaseBuffer1, Null)
  ptrBaseBuffer1plus = ptrBaseBuffer1 + BufferSize\2
  Work1plus = CreateThreadpoolWork(Cast(PTP_WORK_CALLBACK,@FillBuffer), @ptrBaseBuffer1plus, Null)
  SubmitThreadpoolWork(Work1)
  SubmitThreadpoolWork(Work1plus)
  WaitForThreadpoolWorkCallbacks(Work0,FALSE)
  WaitForThreadpoolWorkCallbacks(Work0plus,FALSE)
  ' We don't need Work0 related objects again.
  CloseThreadpoolWork(Work0)
  CloseThreadpoolWork(Work0plus)
End Sub
 
#if (ALGO = 1)
  Private Sub CleanUpCryptoRndIIBuffer
    BCryptCloseALGOrithmProvider( hRand, 0  )
    CloseThreadpoolWork(Work1)
    CloseThreadpoolWork(Work1plus)
    CloseThreadpool(Pool)
  End Sub
#endif
 
#If (ALGO = 1)
Private Sub FillBuffer( Instance As PTP_CALLBACK_INSTANCE, Context As PVOID Ptr, Work As PTP_WORK)
Dim BaseBuffer As Any Ptr

  BaseBuffer = *Context
  BCryptGenRandom( hRand, BaseBuffer, BufferSize\2, 0)
End Sub
#Else
Private Sub FillBuffer(Byval Instance As PTP_CALLBACK_INSTANCE, Byval Context As PVOID Ptr, Byval Work As PTP_WORK)
Dim BaseBuffer As Any Ptr
Dim As Long HalfBuffer
Dim As Ulong RecoverBuffer
Dim As Any Ptr ptrRecoverBuffer
 
  BaseBuffer = *Context
  ptrRecoverBuffer = Varptr(RecoverBuffer)
 
  HalfBuffer = BufferSize\2
  Asm
    mov edi, dword Ptr [HalfBuffer]
    mov esi, 0
    mov ebx, dword Ptr [BaseBuffer]
  rptRdRand:
    mov ecx, 10 ' Max number Of tries before going into a recovery
  queryAgain:
  #ifdef __FB_64BIT__
    RdRand rax
  #Else
    RdRand eax
  #endif
    jc OK ' A Random value was available
    dec ecx
    jnz queryAgain
    Call Recover ' Use RtlGenRandom For This ULong
  OK:
    #ifdef __FB_64BIT__
      mov qword Ptr [ebx + esi], rax ' Store RdRand
      Add esi, 8
    #Else
      mov dword Ptr [ebx + esi], eax ' Store RdRand
      Add esi, 4
    #endif
      cmp edi, esi
      jne rptRdRand
      jmp Done
  Recover:
  #ifndef __FB_64BIT__
    pushad ' I am playing it safe here
  #endif
  End Asm
  #ifdef __FB_64BIT__
    RtlGenRandom(ptrRecoverBuffer, 8)  ' Populate buffer
  #Else
    RtlGenRandom(ptrRecoverBuffer, 4)
  #endif
  Asm
  #ifndef __FB_64BIT__
    popad
  #endif
  #ifdef __FB_64BIT__
    mov rax, qword Ptr [ptrRecoverBuffer]
  #Else
    mov eax, dword Ptr [ptrRecoverBuffer]
  #endif
    ret
  Done:
  End Asm
 
End Sub
#Endif
 
Private Sub SwitchBuffer
  WaitForThreadpoolWorkCallbacks(Work1,FALSE)
  WaitForThreadpoolWorkCallbacks(Work1plus,FALSE)
  Swap ptrBaseBuffer0, ptrBaseBuffer1
  Swap ptrBaseBuffer0plus, ptrBaseBuffer1plus
  ptrBuffer = ptrBaseBuffer0
  #ifdef __FB_64BIT__
    SwitchBufferCriteria = Cast(Longint, ptrBuffer) + BufferSize
  #Else
    SwitchBufferCriteria = Cast(Long, ptrBuffer) + BufferSize
  #endif
  SubmitThreadpoolWork(Work1)
  SubmitThreadpoolWork(Work1plus)
End Sub
 
Private Sub ResetBufferPointer
  ptrBuffer = ptrBaseBuffer0
End Sub
Last edited by deltarho[1859] on Dec 13, 2017 23:47, edited 5 times in total.
srvaldez
Posts: 3364
Joined: Sep 25, 2005 21:54

Re: A fast CPRNG

Post by srvaldez »

will not compile on my PC, Windows 10, FB latest git build, tried both 32 & 64-bit versions
CreateThreadpool not found plus too many more errors to mention
Josep Roca
Posts: 564
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

Re: A fast CPRNG

Post by Josep Roca »

> CreateThreadpool not found

You must add

#define _WIN32_WINNT &h0602

Also don't compile with the -w pedantic option because BYREF has been ommited from the parameters.
Last edited by Josep Roca on Jul 05, 2017 16:00, edited 2 times in total.
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

You probably did not include the first line. The CleanUp... is also a requirement else we will get a memory leak -it cleans up BCrypt usage and threadpool usage.

Added: Jose beat me to it<smile>

Code: Select all

Const _WIN32_WINNT = &h0602
#Include "CryptoRndII.bas"

Dim As ULong i

For i = 1 To 10
  Print CryptoS
Next
Print
For i = 1 To 10
  Print CryptoR(0,255)
Next

CleanUpCryptoRndIIBuffer

Sleep
srvaldez
Posts: 3364
Joined: Sep 25, 2005 21:54

Re: A fast CPRNG

Post by srvaldez »

that you Josep Roca and deltarho[1859] :-)
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

@integer

pcg.MyRandomize( Seed, __ ) was bad advice. Both Seed and seq are optional with a default value of zero. So, if we do not define or use 0, which '__ '' will be, then a random number will be used. To guarantee a repeat sequence then we should define both Seed and Seq: 0< Seed <= 2^64-1 and 0 < Seq <= 2^63-1. I cannot say I misread the notes - I wrote the code. It may be a result of senile decay. "I knew if I lived long enough something like this would happen". George Bernard Shaw on his death bed.
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

BCryptGenRandom is in the Windows file Bcrypt.dll and introduced in Windows Vista which was made generally available in January 2007. It is declared in FB's bcrypt.bi which has a copyright dated 2009, 2010. The underlying code of BCryptGenRandom was updated in Windows 8 which was made generally available in October 2012.

So, if we are using CryptRndII on Windows 8 or later then we will be using the Vista version of BCryptGenRandom.

I think that the above will have no bearing on the use of BCryptGenRandom in CryptoRndII. However, I cannot prove that and who knows what, if any, security tweaks have been applied over the years. What we can do is ensure that the version of BCryptGenRandom is of the host machine by loading the library ourselves and not use FB's library.

José Roca provided an eloquent means of using the host's CryptProtectMemory in this post. I used that as a template for BCryptGenRandom.

Updated CryptoRndII above and dated 5 Aug 2017.

Added: I have just done an average of 10^8 Singles and got 0.5000000831604103. I did not get it again but thought that i'd mention it because I have never seen an average close to 7 significant figures before.
deltarho[1859]
Posts: 4281
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A fast CPRNG

Post by deltarho[1859] »

In the 'Windows CryptProtectMemory API' thread there is a discussion between St_W and José Roca (Josep Roca) about libraries. PowerBASIC users rarely need to get involved in libraries and I came to FreeBASIC pretty much ignorant of such matters. I am having to read their comments several times to understand what they are talking about. José made a comment which caught my eye: "When a function is supported by the headers and libraries provided with the compiler, I simply call it." This made me question my last post.

On second thoughts I came to the conclusion that if the previous version of CryptoRndII was used with Windows Vista, 7, 8, 8.1 or 10 the fact that the FreeBASIC library was 7 years old is neither here nor there. My reasoning for dynamically loading Bcrypt.dll is then flawed. I have reverted CryptoRndII to the previous version.

I know of an API which uses a function in Kernel32.dll and introduced in Windows 8. That API will not be found in the FreeBASIC libraries and is not in the PowerBASIC include files either. In this case, Windows 8 or later users will have to dynamically load Kernel32.dll. The API in question?: GetSystemTimePreciseAsFileTime.
Josep Roca
Posts: 564
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

Re: A fast CPRNG

Post by Josep Roca »

"When a function is supported by the headers and libraries provided with the compiler, I simply call it."
"by the headers and libraries provided with the compiler", that is, BOTH. Because sometimes (not often), a function is declared in the header but the linker fails if you call it. And sometimes calling a function works using the 32 bit compiler but not the 64 bit compiler, or viceversa.

For example, these GDI+ functions

GdipGetMetafileHeaderFromWmf
GdipMeasureCharacterRanges

fail when called with the 64 bit compiler

I also have had problems with some exported GUIDs, e.g. DIID_HTMLDocumentEvents2, and had to use my own declare:

DIM DIID_HTMLDocumentEvents2_ AS GUID = (&h3050F613, &h98B5, &h11CF, {&hBB, &h82, &h00, &hAA, &h00, &hBD, &hCE, &h0B})

And I had so many problems trying to create an import library for propsys.dll (it was the first time that tried to make an import library), that I ended using dynamic loading with DyLibLoad. The tool to create the .def file that comes with the compiler crashed, another tool worked, but the resulting library failed; a tool called reimp worked, but I had problems to get versions for both 32 and 64 bit because the author provided the source code but not binaries, and I'm not a C programer. I lost two days and I was thinking in switching to another compiler.

I would like that the compilers for Windows will behave like other Windows compilers such VB6 or PowerBasic, that don't use import libraries, but guess that it is only a dream.
St_W
Posts: 1617
Joined: Feb 11, 2009 14:24
Location: Austria
Contact:

Re: A fast CPRNG

Post by St_W »

Josep Roca wrote:And I had so many problems trying to create an import library for propsys.dll (it was the first time that tried to make an import library), that I ended using dynamic loading with DyLibLoad. The tool to create the .def file that comes with the compiler crashed, another tool worked, but the resulting library failed; a tool called reimp worked, but I had problems to get versions for both 32 and 64 bit because the author provided the source code but not binaries, and I'm not a C programer. I lost two days and I was thinking in switching to another compiler.
Creating import libraries can be quite a hassle sometimes. Especially for Windows libraries I would suggest to use the ones provided by Microsoft (in the Windows SDK) instead of trying to create them oneself. Usually import libraries are built and provided with the DLLs - having to create an import library for an existing DLL shouldn't be the regular case.
Josep Roca wrote:I would like that the compilers for Windows will behave like other Windows compilers such VB6 or PowerBasic, that don't use import libraries, but guess that it is only a dream.
A feature we should think about. Two solutions come to mind: either replace the currently used static import completely by dynamic loading at runtime (which may cause some backward-compatibility issues) or create some syntax to use that feature optionally (e.g. EXTERN { ... } [[DYLOAD] LIB "libname"]). The preprocessor could be used to make this adjustable for the headers shipped with FB (like the Windows headers). Only ideas, though. Don't expect that to come soon :-)
I agree that many, if not all, high(er) level languages, including VB6 or C#, dynamically load libraries at runtime while system programming languages like C or Pascal typically use import libraries. I can't really prefer one method over the other, but I see that dynamic loading allows more flexibility and is probably also easier to use.

//edit: a issue for the dynamic loading feature are the different naming conventions. Ideally there would be a naming scheme specifier for both a import-library-name and a dynamic-library-name so that it is possible to easily switch between the methods. For example a Windows DLL typically exports names like "function" while the import library will export it as "_function@n". However, there do exist multiple kinds of combinations and different variants of decorations.

//edit2: such a "DYLOAD" feature could also allow to trigger an exception if the DLL/method is not available. The question is when to do the loading (at startup? when the method is first used?) and whether to raise an error for a missing DLL or for each missing method instead (and ignore the DLL?). Of course that are definitely dreams of the future as currently not even a concept for something like exceptions exists, but it doesn't hurt to think early and in the long term about future language design.
Post Reply