Seeding FB generators

General FreeBASIC programming questions.
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Seeding FB generators

Post by deltarho[1859] »

When seeding the FB PRNGs some may think that any old seed will do. Unfortunately, life is not that simple.

Many, if not all, generators need to be warmed up. What happens is that during the early stages of generation the random numbers are not that random and should be avoided. The easiest way to avoid them is to 'burn' a sufficient number until we reach a stage where the random numbers are better quality. With some generators, we don't need to 'burn' that many - with others we need to 'burn' quite a few.

A lot of code on the forum uses random numbers at the beginning of the code for initialization purposes only. Many folk will not realize that the numbers generated may not be of good quality even though the generator they are using is known for good quality random numbers.

With PractRand it examines blocks of data and the early numbers may not be a large enough sample for PractRand to reject the generator.

So, how do we know when we have reached a stage where the random numbers are better quality. This occurs when the original seed, after updating, tends to have a balance of zeros and ones. What we have is a Hamming weighting of 50%.

Instead of warming up we can ensure that the original seed has a good Hamming weight to start with.

When the generator is running the Hamming weight will drift in and out of a good state. There is nothing we can do about that - that is random numbers. However, according to the normal distribution, we shouldn't find ourselves in a poor state for long. If that happened PractRand would jump in and reject the generator.

The following code is one way of ensuring that a seed has a good Hamming weight.

Code: Select all

Function RdtscSeed() As Ulong
Dim Seed As Ulong
Asm
0:
  rdtsc              ' Get a fast changing number from the processor
  bswap eax          ' Reverse byte order to break sequence
  popcnt edx, eax    ' Count the number of set bits in eax
  cmp edx, 14        ' Force a good entropy with a Hamming Weight of 14 or more
  jl 0b
  cmp edx, 18        ' Force a good entropy with a Hamming Weight of 18 or less
  jg 0b
  mov Dword Ptr [Seed], eax
End Asm
Return Seed
End Function
Your CPU may not have popcnt but it probably has - Intel introduced it about 12 years ago.

rdtsc gives us 2^32 possible values. Staying in the code until we are in a good entropy window gives us 2^27*5 possible values, 671,088,640. That plus reversing the byte order will ensure that no one will be able to guess the seed generated anytime soon.

The time to come up with a suitable seed will vary. I tested RdtscSeed 100 million times and the longest time taken was 175 microseconds, so we shouldn't find ourselves twiddling our thumbs waiting. That test took over three minutes to complete.

Now for a graphics program using a dozen or so random ranges during initialization may see them as having exceptionally poor quality. This may not be an issue. On the other hand it may be. So if you use the FB generators using RdtscSeed should ensure that you get decent quality random numbers from the word go.

Not a great deal of work has been done on warming up generators. Here is a graph published by Sebastiano Vigna a few years back. As can be seen from this small sample warming up does not take that long and generators with large state vectors tend to take longer. We cannot second guess how long a particular generator will take to warm up but RdtscSeed solves that problem for us.

Image
D.J.Peters
Posts: 8586
Joined: May 28, 2005 3:28
Contact:

Re: Seeding FB generators

Post by D.J.Peters »

For older CPU's without the popcnt ASM instruction !

Joshy

Code: Select all

Function ReadTSC() As Ulong
  asm
  rdtsc
  bswap eax
  mov [function], eax
  end asm
end function
Function Seed32() As Ulong
  dim as ulong b=any,seed=any,i=any,c=0
  while c<14 orelse c>18
    c=0:b=1:seed=ReadTSC()
    for i = 0 to 31
      if seed and b then c+=1
      ' if >18 get new seed
      if c=19 then exit for
      b shl=1 ' test next bit
    next
  wend  
  return seed
end function  

for i as integer=1 to 20
  print Seed32()
next  
sleep
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Seeding FB generators

Post by deltarho[1859] »

That's great. Thanks, Joshy.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Seeding FB generators

Post by jj2007 »

D.J.Peters wrote:For older CPU's without the popcnt ASM instruction !
PopCount() works without the native instruction, and is very fast. If there is interest, I can look for the source.
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Seeding FB generators

Post by deltarho[1859] »

It is worth noting that if we do not use Randomize, which I have seen many not do, then the default seed of 327680 is used. The binary representation of that is 1010000000000000000 which is bad from a Hamming weight perspective. The early random numbers generated may be heavily biased and not what the author expects or wants.

Now an author may want to use a fixed seed, for some reason, for their code in which case hard-wire a fixed seed in the code but use one with a Hamming weight of 50%.

Anyone of these will do:

1321891474 1001110110010100111011010010010
3881145490 11100111010101011001010010010010
1368894354 1010001100101111010101110010010
628870290 100101011110111100110010010010
650109074 100110101111111110000010010010
4286972307 11111111100001100000000110010011
233909651 1101111100010010110110010011
4247077523 11111101001001010100001010010011
3378140819 11001001010110100101011010010011
1115581075 1000010011111100110101010010011

For example, we can use: Randomize 1321891474, 2
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Seeding FB generators

Post by deltarho[1859] »

I have been using a Hamming weight of between 14 and 18 figuring it would take too long to force 16. Testing RdtscSeed 100 million times gave the longest time to find a seed with a Hamming weight of 16 to be 205 microseconds, so I am going to suggest this.

Code: Select all

Function RdtscSeed() as Ulong
Dim Seed As Ulong
Asm
  0:
  rdtsc              ' Get a fast changing number from the processor
  bswap eax          ' Reverse byte order to break sequence
  popcnt edx, eax    ' Count the number of set bits in eax
  cmp edx, 16
  jne 0b             ' Force a 50% Hamming weight
  mov Dword Ptr [Seed], eax
End Asm
Return Seed
End Function
Of course, slower CPUs will take longer, but we will probably still be talking of a swift return.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Seeding FB generators

Post by jj2007 »

A faster variant:

Code: Select all

Function RdtscSeed() as Ulong
Dim Seed As Ulong
Asm
  0:
  rdtsc              ' Get a fast changing number from the processor
  popcnt edx, eax    ' Count the number of set bits in eax
  cmp edx, 16
  jne 0b             ' Force a 50% Hamming weight
  bswap eax          ' Reverse byte order to break sequence
  mov Dword Ptr [Seed], eax
End Asm
Return Seed
End Function
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Seeding FB generators

Post by deltarho[1859] »

Image

Of course!

Since I don't write asm as much as you do, Jochen, it is not as obvious to me as a statement in a For/Loop which should be outside the loop.
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Seeding FB generators

Post by deltarho[1859] »

jj2007 wrote:PopCount() works without the native instruction, and is very fast. If there is interest, I can look for the source.
I'm interested. Image
adeyblue
Posts: 300
Joined: Nov 07, 2019 20:08

Re: Seeding FB generators

Post by adeyblue »

Probably

Code: Select all

Function PopCount(ByVal val As ULong) As ULong
    dim numBits As ULong = 0
    While val <> 0
        val = val And val - 1
        numBits = numBits + 1
    Wend
    Return numBits
End Function
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Seeding FB generators

Post by deltarho[1859] »

@adeyblue

Oh boy, is that neat or what?

I had to work through an example with pen and paper to see it working.

It is even more neat by working on any number of bits.

Incorporating your code I get this:

Code: Select all

Function HammingSeed() As Ulong
Dim As Ulong Seed, CopySeed, numBits = 0
While numBits <> 16
  Asm
    rdtsc    ' Get a fast changing number from the processor
    mov Dword Ptr [Seed], eax
  End Asm
  CopySeed = Seed : numBits = 0
  While CopySeed <> 0
    CopySeed = CopySeed And CopySeed - 1
    numBits += 1
  Wend
Wend
Asm
  mov eax, Dword Ptr [Seed]
  bswap eax    ' Reverse byte order to break sequence
  mov [Function], eax
End Asm
End Function
For 100 millon tests the longest time to return a 50% Hamming weight was 256 microseconds. This is not that much longer than the popcnt method. OK it could take longer but even it took four times longer, which I doubt, that is still only one millisecond.

Thanks, adeyblue, another gem from your stable. Image
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Seeding FB generators

Post by jj2007 »

deltarho[1859] wrote:
jj2007 wrote:PopCount() works without the native instruction, and is very fast. If there is interest, I can look for the source.
I'm interested. Image
Here it is:

Code: Select all

include \masm32\include\masm32rt.inc
.data
pcSource	dd 11111000001111100000111110000010y  ; just a test - expected value is 16 out of 32
.data?
popctTable	db 256 dup(?)

buffer db 36 dup(?)

.code
start:
  mov edi, offset popctTable	; *** phase 1: initialise PopCount table
  xor ebx, ebx
  .Repeat
	mov edx, ebx	; 0...255
	xor eax, eax	; bit counter
	push 32	; 32 shifts
	pop ecx
@@:
	shr edx, 1	; get carry flag
	adc al, 0	; inc al if carry set
	loop @B	; shorter than dec cl, jnz
	stosb	; al to table
	inc bl
  .Until Zero?		; byte turnaround at 256

  mov ebx, [pcSource]	; *** phase 2: use it (pcSource is the DWORD for which you need the popcount)
  movzx ecx, bh
  movzx eax, byte ptr popctTable[ecx]
  movzx ecx, bl
  movzx edx, byte ptr popctTable[ecx]
  bswap ebx
  movzx ecx, bh
  add eax, edx
  movzx edx, byte ptr popctTable[ecx]
  add eax, edx
  movzx ecx, bl
  movzx edx, byte ptr popctTable[ecx]
  add eax, edx		; *** done ***

  print str$(eax)
  print " is the popcount of "
  invoke dw2bin_ex, pcSource, offset buffer
  print offset buffer
  exit
end start
As you can see, it uses a table that has to be initiated once. Note this is part of the MasmBasic source, and therefore cannot be copyrighted or sold etc etc...
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Seeding FB generators

Post by deltarho[1859] »

I am not experienced with masm32 so I may be gone some time with that.

Thanks, Jochen, but if it is all the same to you, I'll stay with adeyblue's tip.

I need to go past HammingSeed three times so on my machine I am looking at about three quarters of a millisecond. That will do for me.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Seeding FB generators

Post by jj2007 »

deltarho[1859] wrote:I'll stay with adeyblue's tip.
His code is fine, absolutely. Speed is rarely a problem if you stay below the one Million iterations mark ;-)
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Seeding FB generators

Post by deltarho[1859] »

@jj2007

I'd like a Ulongint version of HammingSeed. I could go past HammingSeed twice but would rather have a single function. However, I want to avoid reading the timestamp twice in close succession. Is there an asm instruction which would pause the asm for a small amount of time to allow the timestamp to 'spin' a little or should I use a simple loop?
Post Reply