Blimey, this thread got over 4940 views. It is just another PRNG.
Anyway, the seeding protocol has been radically changed.
Originally, we had 'Sub MyRandomize( seed As String, seed1 As String )'. If both seeds were empty then the state vector, s(0) and s(1), both Ulongint, were populated with cryptographic random numbers from FB's generator #5 plus Get64Bit otherwise ValULng( seed<n> ) was used to provide a fixed seed environment.
In either case it was reckoned that a generator warm-up was essential; some reckoning that the larger the state vector the longer the warm-up should be. I chose a warm-up of 'burning' 8192 64-bit values. I have no idea whether that is more than required or not enough. I suspect that it was 'over the top'. Ideally, we should determine the best warm-up for each generator we use, but it is not an exact science, and we should employ a safety margin.
A generator is deemed to be warmed up when the seeds reach a Hamming weight of 50% as discussed in
Seeding FB generators. The new seeding protocol ensures that both seeds have a Hamming weight of 50%. There is then no need for a warm-up because we are at the position a correct warm-up should take us to, and the random numbers generated should have a good quality of randomness from the very first request.
In the link above the seed is simply a 32-bit value, but I mentioned that I would like a 64-bit variant. dodicat came up with a method much better than the method I looked at for producing Hamming weights of 50%, 32 from 64, at a faster rate. The code uses dodicat's functions HammingSeed64 and popcount64, a 64-bit variant of the asm instruction popcnt, 32-bit. Some older machiness do not have popcnt so I use my HammingSeed for that. He also produced a 64-bit equivalent of the asm instruction 'bswap eax' with 'reverseuLongint(x As Ulongint) As Ulongint'. I have replaced that with a function to do a Knuth shuffle on the eight bytes of a 64-bit value. This is considerably slower than dodicat's method but it only takes about 100 nanoseconds for the Knuth shuffle method so the fact that it is slower than dodicat's method is academic. The point of using 'bswap eax' or reverseuLongint() is to break the sequence because the rdtsc values are sequential. A Knuth shuffle increases the obfuscation, and we will have no idea what the 64-bit shuffled value will look like. The Hamming weight, of course, will not change.
ShuffleUlongint needs random numbers, so we use the generator's integral range function. However, we cannot shuffle during the HammingSeed64 function - we must generate four HammingSeed64's first and then shuffle each one.
MyRandomize now has no parameters - we are no longer involved in seeding. Repeating a sequence is easy because GetSnapshot is invoked immediatlely after MyRandomize in the Constructor; as was the case with the original version of the generator.
Ten tests were carried out to determine 10 million HammingSeed64 followed by a shuffle and the maximum time taken for each test was noted. The average time was 63.4 microseconds in 32-bit mode and 75.8 microseconds in 64-bit mode. However, the maximum time was 107.6 microseconds in 32-bit mode and 244 microseconds in 64-bit mode. Assuming the worst case for each seed gives us the state vector being populated in about half a millisecond. It is therefore highly unlikely that this approach to seeding will have any impact on our applications other than not needing a warmup and the random numbers generated should have a good quality of randomness from the very first request.
Here is a new version of xoroshiro128.bas and a slightly revised version of Usage example.
xoroshiro128.bas
Code: Select all
' http://prng.di.unimi.it/xoroshiro128starstar.c
/'
static inline uint64_t rotl(const uint64_t x, int k) {
return (x << k) | (x >> (64 - k));
}
static uint64_t s[2];
uint64_t next(void) {
const uint64_t s0 = s[0];
uint64_t s1 = s[1];
const uint64_t result = rotl(s0 * 5, 7) * 9;
s1 ^= s0;
s[0] = rotl(s0, 24) ^ s1 ^ (s1 << 16); // a, b
s[1] = rotl(s1, 37); // c
return result;
}
'/
#define rotl(x,k) ( (x Shl k) Or (x Shr(64-k)) ) ' Note the extra parentheses
#macro Engine
s0 = s(0)
s1 = s(1)
result = rotl(s0*5,7)*9
s1 Xor= s0
s(0) = rotl(s0,24) xor s1 xor (s1 shl 16)
s(1) = rotl(s1,37)
#endmacro
' Generator is Sebastian Vigna's xoroshiro128**
Type xoroshiro128
Public:
Declare Constructor
Declare Sub MyRandomize()
Declare Function rand() As Ulongint
Declare Function randD() As Double
Declare Function range overload( Byval One As Long, Byval Two As Long ) As Long
declare function range overload ( byval One as double, Byval Two as double ) as double
declare sub GetSnapshot()
Declare Sub SetSnapshot()
Declare Function Gauss() As Double
Private:
As Ulongint s(0 To 1)
As Ulongint snapshot(0 to 1)
End Type
dim Shared as xoroshiro128 x128
Function popcount64(x As Ulongint) As Ulong ' By dodicat
If x=&hffffffffffffffffull Then Return 64
x -= ((x Shr 1) And &h5555555555555555ull)
x = (((x Shr 2) And &h3333333333333333ull) + (x And &h3333333333333333ull))
x = (((x Shr 4) + x) And &hf0f0f0f0f0f0f0full)
x += (x Shr 8)
x += (x Shr 16)
x+= (x Shr 32)
Return x And &h0000003full
End function
Sub ShuffleUlongint( ByRef x As Ulongint )
Union localUDT
As Ulongint ul
As Byte b(7)
End Union
Dim As localUDT Ptr p = Cptr(localUDT Ptr, @x)
For i As Long = 0 to 6
Swap p->b(i), p->b(x128.range(i,7))
Next
End Sub
function HammingSeed64() As Ulongint ' By dodicat
Dim As Ulong seed,numbits
Dim As Ulongint CopySeed
While numBits <> 32
Asm
rdtsc ' Get a fast changing number from the processor, only 32 bits
mov dword Ptr [Seed], eax
End Asm
CopySeed = seed * 4294967295ull ' thrust into a 64 bit range
numbits=popcount64( copyseed )
Wend
Return copyseed
End function
Private Sub xoroshiro128.MyRandomize()
s(0) = HammingSeed64()
s(1) = HammingSeed64()
ShuffleUlongint( s(0) )
ShuffleUlongint( s(1) )
End Sub
Private Function xoroshiro128.rand() As ulongint
dim as ulongint s0, s1, result
Engine
Return result
End Function
Private Function xoroshiro128.randD() As Double
dim as ulongint s0, s1, result
Engine
Return result/2^64
End Function
Private Function xoroshiro128.range( Byval One As Long, Byval Two As Long ) As Long
dim as ulongint s0, s1, result
Engine
return Clng(CULng(result) Mod ( Two-One+1 )) + One ' By dodicat
End Function
Private Function xoroshiro128.range( Byval One As Double, Byval Two As Double ) As Double
dim as ulongint s0, s1, result
Engine
Return result/2^64*( Two - One ) + One
End Function
Private sub xoroshiro128.GetSnapshot()
snapshot(0) = s(0)
snapshot(1) = s(1)
end sub
Private sub xoroshiro128.SetSnapshot()
s(0) = snapshot(0)
s(1) = snapshot(1)
end sub
constructor xoroshiro128
MyRandomize()
GetSnapshot()
end constructor
Private Function xoroshiro128.gauss As double
Static As Long u2_cached
Static As double u1, u2, x1, x2, w
If u2_cached = -1 Then
u2_cached = 0
Function = u2
Else
Do
x1 = RandD
x2 = RandD
w = x1 * x1 + x2 * x2
Loop While w >= 1
w = Sqr( -2 * Log(w)/w )
u1 = x1 * w
u2 = x2 * w
u2_cached = -1
Function = u1
End If
End Function
#undef Rnd
#define Rnd x128.randD
#define xRange x128.range
Usage example:
Code: Select all
'#Console On
#include "xoroshiro128.bas"
dim as ulong i
dim as Double tot
print "Random floats [0,1) (53-bit granularity)"
print
For i = 1 to 6
Print Rnd
Next
Print
x128.SetSnapshot
Print "SetSnapshot executed - no GetSnapshot executed by user"
Print
For i = 1 to 6
Print Rnd
Next
Print
for i = 1 to 10^8
tot += Rnd
next
print "Average [0,1)";
print tot/10^8
Print
print "Random Ulongints"
print
For i = 1 to 6
Print x128.rand
Next
Print
print "Discrete range, 5 to 20"
Print
For i = 1 to 6
print xRange(5,20)
next
print
print "Continuous range, 1. to 10."
Print
For i = 1 to 6
print xRange(1.,10.)
next
Print
Print "GetSnapshot executed"
print
x128.GetSnapshot
for i = 1 to 6
print Rnd
next
Print
x128.SetSnapshot
print "SetSnapshot executed"
print
for i = 1 to 6
print Rnd
next
print
'Dim As Double t,t1,t2,z
'Dim As Double tall
'Dim As Ulongint x
'Dim As Ulong lim=10000000, y
'for k As Long=1 To 10
' z=0
' tall=Timer
' for n As Long=1 To lim
' t1=Timer
' x=hammingseed64()
' x = ShuffleUlongint( x )
' t2=Timer
' t=t2-t1
' If z<t Then z=t
' Next n
' tall = Timer - tall
' Print "max ";z*1000000,"Total time ";tall;" seed = ";x;", bits(1) = ";popcount64(x)
'Next k
Print " Done"
sleep
Oh, look Ma - 2TB and no anomalies!
Yes, dear - now put the kettle on - your mother is parched.
Yes, Ma.
Code: Select all
Microsoft Windows [Version 10.0.19042.1052]
(c) Microsoft Corporation. All rights reserved.
C:\Users\deltarho>E:
E:\>cd pr64
E:\PR64>MyRng | rng_test stdin64 -multithreaded
'MyRng' is not recognized as an internal or external command,
operable program or batch file.
E:\PR64>My_Rng | rng_test stdin64 -multithreaded
RNG_test using PractRand version 0.94
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)
rng=RNG_stdin64, seed=unknown
length= 512 megabytes (2^29 bytes), time= 2.3 seconds
no anomalies in 226 test result(s)
rng=RNG_stdin64, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 4.8 seconds
no anomalies in 243 test result(s)
rng=RNG_stdin64, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 9.4 seconds
no anomalies in 261 test result(s)
rng=RNG_stdin64, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 18.0 seconds
no anomalies in 277 test result(s)
rng=RNG_stdin64, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 35.7 seconds
no anomalies in 294 test result(s)
rng=RNG_stdin64, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 70.5 seconds
no anomalies in 310 test result(s)
rng=RNG_stdin64, seed=unknown
length= 32 gigabytes (2^35 bytes), time= 136 seconds
no anomalies in 325 test result(s)
rng=RNG_stdin64, seed=unknown
length= 64 gigabytes (2^36 bytes), time= 276 seconds
no anomalies in 340 test result(s)
rng=RNG_stdin64, seed=unknown
length= 128 gigabytes (2^37 bytes), time= 548 seconds
no anomalies in 355 test result(s)
rng=RNG_stdin64, seed=unknown
length= 256 gigabytes (2^38 bytes), time= 1061 seconds
no anomalies in 369 test result(s)
rng=RNG_stdin64, seed=unknown
length= 512 gigabytes (2^39 bytes), time= 2176 seconds
no anomalies in 383 test result(s)
rng=RNG_stdin64, seed=unknown
length= 1 terabyte (2^40 bytes), time= 4351 seconds
no anomalies in 397 test result(s)
rng=RNG_stdin64, seed=unknown
length= 2 terabytes (2^41 bytes), time= 8545 seconds
no anomalies in 409 test result(s)