On FreeBASIC's random number generators.

Post your FreeBASIC tips and tricks here. Please don’t post your code without including an explanation.
counting_pine
Site Admin
Posts: 6236
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: On FreeBASIC's random number generators.

Postby counting_pine » Feb 23, 2017 21:13

How is an "anomaly" defined?

I would welcome patches to Randomize that improve the seeding. I think the rest of the pure Mersenne Twister is probably sound.
Note that MD5 is not entirely secure - it is possible to create two files with the same MD5sum. Although I don't know if that causes any security risks here. Any attacker who can control the seed can already predict the sequence.
Also, we don't have any hash implementations in the runtime library, although I dare say that ones compatible with the code's licence would be welcome.
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 24, 2017 12:02

How is an "anomaly" defined?


PractRand classifies them according to the opening post. They occur when a particular test gives a result in the tail of a normal distribution. Marsaglia, if not folk before him, called them 'p-values' and PractRand uses the same term.

From Marsaglia's DieHarder suite:
NOTE: Most of the tests in DIEHARD return a p-value, which
should be uniform on [0,1) if the input file contains truly
independent random bits. Those p-values are obtained by
p=F(X), where F is the assumed distribution of the sample
random variable X---often normal. But that assumed F is just
an asymptotic approximation, for which the fit will be worst
in the tails. Thus you should not be surprised with
occasional p-values near 0 or 1, such as .0012 or .9983.
When a bit stream really FAILS BIG, you will get p's of 0 or
1 to six or more places. By all means, do not, as a
Statistician might, think that a p < .025 or p> .975 means
that the RNG has "failed the test at the .05 level". Such
p's happen among the hundreds that DIEHARD produces, even
with good RNG's. So keep in mind that " p happens".

Note that MD5 is not entirely secure - it is possible to create two files with the same MD5sum.

Agreed. It is possible then that two different strings give the same entry point to the Twister sequence. That does not concern me - all that I am interested in is getting the same entry point. The fact that may be possible via two different strings is neither here nor there. Mersenne Twister is not cryptographically secure anyway. It is worth noting that the creation of two messages giving the same MD5 hash, a collision, was contrived 'in the lab', so to speak. As far as I know a contrived preimage attack has not been done. However, even a contrived collision is enough for us to move away from MD5 with regard to cryptographic uses. If MD5 does worry anyone then half of a SHA256 will do the job.
I would welcome patches to Randomize that improve the seeding.

This is what I am using for a 4 x Ulong seeding in Greg's code; after removing his entry point.

Code: Select all

' Entry point
 
Declare Function MyRandomInt Lib "Advapi32.dll" Alias "SystemFunction036" _
 ( RandomBuffer As Any Ptr, RandomBufferLength As ULong ) As Byte
 
'Next two statements will be executed
Dim Shared init(4) As uInteger => {&H123, &H234, &H345, &H456}
init_by_array(init(), 4) ' Seed Twister
' and provide a default if RandomizeMT() is not employed
 
Sub RandomizeMT( seedString As String )
Dim As Long i
Dim binHash As String
Dim arrayPtr as Byte Ptr
 
  arrayPtr = Cast( Byte Ptr, @init(1) )
 
  If seedString <> "" Then ' Fixed seed
    BinHash = Hash( wstr("MD5"), seedString, 1, 0, -1 ) ' Last 3 x parameters: Index, Binary output, Only one pass
    For i = 1 To 16
      *arrayPtr = BinHash[i-1]
      arrayPtr += 1
    Next
  Else ' Random seed
    MyRandomInt( arrayPtr, 16 )
  End If
 
  init_by_array(init(), 4) ' Seed Twister
 
End Sub
 
Function RndMT() As Double
  Function = genrand_real2()
End Function

Hash() is from my thread here but that can be stripped down to its bare bones. Hash() uses Microsoft's CNG but we can use the earlier CryptoAPI to go back before Windows Vista. SystemFunction036 requires Windows XP and above but if we use the CryptoAPI for the hash then we can use CryptGenRandom; as used in the FreeBASIC C code. So, since the FreeBASIC C code is using the CrytpoAPI for CryptGenRandom it is a simple matter to use it for the hash.

Here is a test code:

Code: Select all

#Include "MT.bas"
 
Dim As Long i
 
For i = 1 to 5
  Print RndMT
Next
Print
RandomizeMT( "" )
For i = 1 to 5
  Print RndMT
Next
Print
RandomizeMT( "FreeBASIC" )
For i = 1 to 5
  Print RndMT
Next
Print
RandomizeMT( "FreeBASIC" )
For i = 1 to 5
  Print RndMT
Next
 
Sleep


with a typical output.

Code: Select all

 0.2485689006280154
 0.2225734812673181
 0.1111276280134916
 0.9562863928731531
 0.9846353149041534
 
 0.1271206198725849
 0.3026482213754207
 0.4521580399014056
 0.3267434446606785
 0.7410531162749976
 
 0.2051182913128287
 0.9361569825559855
 0.5373712498694658
 0.3031890040729195
 0.5847245713230223
 
 0.2051182913128287
 0.9361569825559855
 0.5373712498694658
 0.3031890040729195
 0.5847245713230223


The first block uses the default seeding as RandomizeMT was not employed. The second block uses random seeding and with a 128 bit input the chances are that sequence will never be seen again. The third and fourth blocks use fixed seeding.

Added: With regard to PractRand and Greg's 4 x ULong seeding I went to bed last night and left it running. There was a small anomaly during the 128MB test, as I reported earlier, but no more right up to 2TB.
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 24, 2017 13:00

It takes a while for the penny to drop for me, sometimes.

The facility to initialise via an array is part of the original Twister code and Greg used 4 x ULongs as an example.

I have just used 8 x ULongs and SHA256 (256/32 = 8 ) ==> 2^256 = 2^224 x 2^32 entry points to the Twister sequence.

RandomizeMT( "" ) took 20 micro-seconds to execute - not enough time to make a cup of tea. <smile>

<grin>

16 x ULongs and SHA512. 2^480 x 2^32 entry points.

RandomizeMT( "" ) took 24 microseconds to execute - still not enough time to make a cup of tea.

RandomizeMT( "FreeBASIC" ) took the same time.

RandomizeMT( "The time has come, the Walrus said, to talk of many things" ) also took the same time. With all the overheads involved doubling the length of a string, for example, will not take RandomizeMT() twice as long to do it's job.

A possible 2^512 entry points is enough to shut even me up.
Last edited by deltarho[1859] on Feb 24, 2017 15:46, edited 1 time in total.
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 24, 2017 14:04

Just for the record this is what I am now using:

Code: Select all

' Entry point to MT.bas

'Next two statements will be executed
Dim Shared init(16) As uInteger => {&H012, &H123, &H234, &H345, &H456, &H567, &H678, &H789, &H89A, &H9AB, &HABC, &HBCD, &HCDE, &HDEF, &HEF0, &HF00}
init_by_array(init(), 16) ' Seed Twister
' and provide a default if RandomizeMT() is not employed

Sub RandomizeMT( seedString As String )
Dim As Long i
Dim binHash As String
Dim arrayPtr as Byte Ptr

  arrayPtr = Cast( Byte Ptr, @init(1) )

  If seedString <> "" Then ' Fixed seed
    BinHash = Hash( wstr("SHA512"), seedString, 1, 0, -1 ) ' Last 3 x parameters: Index, Binary output, Only one pass
    For i = 1 To 64
      *arrayPtr = BinHash[i-1]
      arrayPtr += 1
    Next
  Else ' Random seed
    MyRandomInt( arrayPtr, 64 )
  End If
 
  init_by_array(init(), 16) ' Seed Twister
 
End Sub

Function RndMT() As Double
  Function = genrand_real2()
End Function
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 24, 2017 18:04

I have just remembered something. I no longer write for Windows XP and any crypto work that I do uses Windows CNG which was introduced in Windows Vista. At the same time the CryptoAPI was extended and saw AES and the SHA2 family of hash functions being added. Surprisingly, this extended CryptoAPI was added to Windows XP in SP3.

So, although the CryptoAPI has been around for nearly 25 years, including CryptGenRandom, the biggest hash function was at SHA1( 160 bit ) until Windows XP SP3.

No big deal: 4 x ULongs for <= XP SP2; 16 x ULongs for >= XP SP3.
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 25, 2017 14:32

Whilst 'out and about' today it dawned on me that my ramblings have been platform specific and someone should have picked me up on that. <smile> Coming from PowerBASIC that shouldn't be a surprise because PowerBASIC do not have a Linux compiler. I have no idea what Linux has with regard hash functions, for example. I see that the FreeBASIC C code mentions HOST_WIN32 and HOST_LINUX.

counting_pine's "Also, we don't have any hash implementations in the runtime library," did not 'sink in' as it should have done.

Of course, what I have proposed is not insurmountable but it is not as easy as I thought it would be.

I have also noticed that the facility to seed via an array does not appear to be in FreeBASIC C code; unlike Greg's code and the original C code by Matsumoto and Nishimura; more hurdles. <gulp>

In future I will try and remember that I am not talking to a Window's only audience. Having said that if you are a Window's only guy and have Windows Vista and above you can knock out your own Mersenne Twister with 16 x 32 bit seeds on what has been posted.
MrSwiss
Posts: 3657
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: On FreeBASIC's random number generators.

Postby MrSwiss » Feb 25, 2017 17:07

Having had a look at your code above ...

Just a tip on array's sizing in FB (it's not the same, as in C):
    either: ... array(0 To 15), 16 Elements 'zero based', short form: array(15) -- only upper bound specified (lower bound = 0, implicit!)
    or: ........ array(1 To 16), 16 Elements 'one based'
Secondly: don't use 'var size' variables (U-/Integer), when fixed sizes are 'called for':
    you're talking about ULong, but are using UInteger in the array (4 Byte in 32bit, 8 Byte in 64bit), use ULong instead (always 32bit fixed).
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 25, 2017 19:31

Thank you, MrSwiss.

That particular code is Greg Lyon's - mine starts at Sub RandomizeMT - I haven't used UInteger since I landed here.

However, both points taken on board and the code has been corrected accordingly.

Fortunately I got away with it - even though I was entering the Twister's sequence at different locations.
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 26, 2017 4:27

Secondly:

I am now going to question that.

If I change my code to ULong it will not compile in 64 bit unless I change it to ULongInt.

Alternatively I could change UInteger to ULong throughout the code, Greg's and mine, and that works in 32 bit. I could then change ULong to ULongInt throughout the code, Greg's and mine, and that works in 64 bit.

However, if I go back to Greg's code using UInteger and my code using UInteger the code works when I switch from 32 bit to 64 bit without my changing anything.

Added: I am aware that running in 64 bit does not give us a 64 bit Mersenne Twister - that is MT19937-64. All we are getting is a 32 bit Mersenne Twister running in 64 bit mode.
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 26, 2017 13:45

However, the random numbers differ between 32 bit and 64 bit for the same fixed seed. This is because, in 64 bit mode, the hash bytes are not put into the array's 64 bit elements correctly.

Here is a revised RandomizeMT() such that the random numbers generated are the same for 32 bit and 64 bit for the same fixed seed.

Code: Select all

Sub RandomizeMT( seedString As String )
Dim As Long i, j
Dim BinHash As String * 64
Dim As Ulong Ptr BinHashPtr
Dim arrayPtr as Byte Ptr

  If seedString <> "" Then ' Fixed seed
    BinHashPtr = Cptr( ULong Ptr, StrPtr(BinHash) )
    BinHash = Hash( wstr("SHA512"), seedString, 1, 0, -1 ) ' Last 3 x parameters: Index, Binary output, Only one pass
    For i = 1 to 16
      init(i) = PeeK( Ulong, BinHashPtr )
      BinhashPtr += 1
    Next
  Else ' Random seed
    arrayPtr = Cast( Byte Ptr, @init(1) )
    MyRandomInt( arrayPtr, 64 )
  End If
 
  init_by_array(init(), 16) ' Seed Twister
 
End Sub


Added: I was using VarPtr for BinHashPtr and changed it before St_W put a contract out on me. <laugh>
Last edited by deltarho[1859] on Feb 26, 2017 19:24, edited 1 time in total.
Provoni
Posts: 393
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: On FreeBASIC's random number generators.

Postby Provoni » Feb 26, 2017 18:53

I've tested these random number generators using my substitution solver, a stochastic hill climber.

Option 1: time in seconds: 163.95, average score: 23905.71
Option 2: time in seconds: 157.91, average score: 23860.62
Option 3: time in seconds: 176.69, average score: 23923.33
Option 4: time in seconds: 158.05, average score: 23899.65
Option 5: time in seconds: 690.11, average score: 23871.95

Option 4 seems to have the most practical value for my solver, hitting a sweet spot between performance and the average score. Option 5 is very slow.
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 26, 2017 19:39

@Provoni

If you do not need a fixed seed enviroment then give my A fast CPRNG a bash - you could see your timings down to about 55 seconds. Five posts from the end of that thread you will find CryptoRndBuffer.bas for inclusion in your source code. Use '#define Algo 1' as used in the post following the CryptoRndBuffer.bas listing.

Tch, tch - there I go again. OS requirement: Windows Vista or later.
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 27, 2017 0:33

I am now using this:

Code: Select all

' Entry point to MT.bas

'Next few statements will be executed
Dim Shared init(1 To 624) As UInteger
Dim As Long il
For il = 1 to 624
  init(il) = il
Next
init_by_array(init(), 624) ' Seed Twister
' and provide a default if RandomizeMT() is not employed

Sub RandomizeMT( seedString As String )

Dim BinHash As String * 64
Dim As Ulong Ptr BinHashPtr, BaseBinHashPtr
Dim arrayPtr as Byte Ptr
Dim As Long i, j

  If seedString <> "" Then ' Fixed seed
    BinHashPtr = Cptr(Ulong Ptr, StrPtr( BinHash ))
    BaseBinHashPtr = BinHashPtr
    BinHash = Hash( wstr("SHA512"), seedString, 1, 0, -1 ) ' Last 3 x parameters: Index, Binary output, Only one pass
    For j = 0 to 38
      BinHash = Hash( wstr("SHA512"), BinHash, 1, 0, -1 ) ' Hash the hash
      BinHashPtr = BaseBinHashPtr
      For i = 1 to 16
        init(j*16 + i) = PeeK( Ulong, BinHashPtr )
        BinhashPtr += 1
      Next
    Next
  Else ' Random seed
    arrayPtr = Cast( Byte Ptr, @init(1) )
    MyRandomInt( arrayPtr, 624*4 )
  End If
 
  init_by_array(init(), 624) ' Seed Twister
 
End Sub

Function RndMT() As Double
  Function = genrand_real2()
End Function

This will saturate the state space. This gives us access to every possible sequence ie 2^19937−1 entry points. So, instead of a single 32 bit seed, or 16 as my last effort, we are now using 624 x 32 bit seeds.

RandomizeMT( "Some string" ) now takes about 175 micro-seconds. RandomizeMT ( "" ) still takes about 24 micro_seconds.

RandomizeMT ( "Some string" ) simply hashes a hash 39 times - 39 x 512 / 32 = 624 x 32 bit seeds. Randomize ( "" ) uses RtlGenRandom for 624 x 4 = 2496 bytes.

Here is a typical test using the above code:

Code: Select all

 0.2055544629693031
 0.9001737011130899
 0.5330331262666732
 0.62346843467094
 0.4192499285563827

 0.5931593908462673
 0.5981001688633114
 0.2941236007027328
 0.9376221681013703
 0.3947651595808566

 0.3276005650404841
 0.2944225850515068
 0.9359331077430397
 0.9560786564834416
 0.1014269459992647

 0.3276005650404841
 0.2944225850515068
 0.9359331077430397
 0.9560786564834416
 0.1014269459992647
 174.1841542184375

A 64 bit run gave the same results except for the second block; which uses a random seeding. I used "HousesOfParliament" for "Some string".

Most PRNGs benefit from a warm up, a period of requesting random numbers to be discarded, to allow a smoothing of any wrinkles in the initial state. We do not need to do that - an initial state saturated by a CPRNG does not have any wrinkles.

With 'RtlGenRandom' or "/dev/urandom" all we need is an OS independent SHA512 and we are in business. I have a asm SHA512 inc file on my hard drive at the moment. I cannot do it - I couldn't write C code to save my life. <smile>
Provoni
Posts: 393
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: On FreeBASIC's random number generators.

Postby Provoni » Feb 27, 2017 19:33

deltarho[1859] wrote:@Provoni

If you do not need a fixed seed enviroment then give my A fast CPRNG a bash - you could see your timings down to about 55 seconds. Five posts from the end of that thread you will find CryptoRndBuffer.bas for inclusion in your source code. Use '#define Algo 1' as used in the post following the CryptoRndBuffer.bas listing.

Tch, tch - there I go again. OS requirement: Windows Vista or later.

Thanks deltarho[1859],

I'd very much like to try it but can't get it to work. Compiling with FreeBASIC-1.05.0-win64 under Windows 7.

Gives the following errors:

C:\AZdecrypt\CryptoRndBuffer.bas(214) error 20: Type mismatch, before ')' in 'SwitchBufferCriteria = Cast(Long, ptrBuffer) + BufferSize'
C:\AZdecrypt\CryptoRndBuffer.bas(282) error 20: Type mismatch, before ')' in 'SwitchBufferCriteria = Cast(Long, ptrBuffer) + BufferSize'
deltarho[1859]
Posts: 2748
Joined: Jan 02, 2017 0:34
Location: UK

Re: On FreeBASIC's random number generators.

Postby deltarho[1859] » Feb 27, 2017 20:07

Apologies, Provoni. Those two statements required an edit when compiling in 64 bit. I did mention what was needed but quite a few posts before the CryptoRndBuffer.bas listing. I should not expect anyone being referred to a particular post to read what has been posted previously.

The CryptoRndBuffer.bas listing has been replaced by the very latest version and I have put a date at the head of the listing.

The latest version conditionally compiles according to either 32 bit or 64 bit in addition to defining Algo. If not defined 'Algo = 1' will be used - my suggestion for you.

In your code replace Rnd with CryptoS.

At the end of your source code include the following:

Code: Select all

#if (Algo = 1)
  CleanUpCryptoRndBufferCNG
#endif

This closes the algorithm provider - BCRYPT_RNG_ALGORITHM had been opened.

The 'Type mismatch', by the way, required LongInt as opposed to Long when compiling in 64 bit mode - now done automatically.

Return to “Tips and Tricks”

Who is online

Users browsing this forum: No registered users and 2 guests