On FreeBASIC's random number generators.

Post your FreeBASIC source, examples, tips and tricks here. Please don’t post code without including an explanation.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

On FreeBASIC's random number generators.

Post by deltarho[1859] »

I have been using PractRand for a few years now to examine the output of pseudo-random number generators. Unlike most compilers FreeBASIC offers a selection of generators so I subjected them to PractRand. PractRand is a very unforgiving suite of tests.

Way back in 1995 George Marsaglia published Diehard tests. In those days tests were carried on 10MB to 20MB of data. Nowadays, tests are carried out on data of one TB or more. Many generators which pass the Diehard tests are now not passing muster when tested will the latest testing software including PractRand.

PractRand classifies anomalies from unusual, suspicious, very suspicious, VERY SUSPICIOUS, FAIL, FAIL!, FAIL!!, ..., FAIL!!!!!!!! <smile>. If any tests give a FAIL or upwards then no further tests are carried out.

The FreeBASIC algorithms 1, 2, and 4 all fail at the first size of test with many anomalies of varying degrees and several levels of FAILs.

Surprisingly, algorithm 5 also failed at the first size of test. This was repeated a few times and although only two or three anomalies were reported they each had a FAIL. This made me suspicious of my own code since algorithm 5 uses 'Win32 Crypto API' which is CryptGenRandom. In my 'A fast CPRNG' thread I use BCryptGenRandom which was introduced in Windows Vista. RtlGenRandom was introduced in Windows XP which allowed us to access the CrytoAPI random number generator engine without the need to invoke a Cryptographic Service Provider. I replaced BCryptGenRandom with RtlGenRandom and it passed PractRand. This leaves me questioning algorithm 5.

Algorithm 3 is Merseene Twister, the default generator for -lang fb.

We are in a very different place with FreeBASIC's implementation of Merseene Twister. Tests were carried out up to 64GB of data and only five tests did not have no anomalies. These only had one anomaly, each evaluated as unusual- the lowest ranking anomaly.

Given that the more tests we do increases the likelihood of anomalies, by virtue of random numbers, then PractRand is telling us, from a degree of randomness perspective, FreeBASIC's implementation of Merseen Twister is a class act.

I am not suggesting that algorithms 1, 2, 4 and 5 should not be used as they are probably OK for a small number of requests and, of course, for a small number it would be almost impossible to determine their degree of randomness; unless obvious with the same value, for example.

It is worth noting that PowerBASIC's random number generator also fails PractRand at the first size of test with many anomalies of varying degrees and several levels of FAILs. It does pass the Diehard tests, so it is OK for a small number of requests. Having said that if I need random numbers in PowerBASIC code I use Blackman and Vigna's Xoroshiro128+.

David Roberts
greenink
Posts: 200
Joined: Jan 28, 2016 15:45

Re: On FreeBASIC's random number generators.

Post by greenink »

You can tune a RNG to a test system's blind spots, and still (literally) see structure in RNG's output in practical applications. That happened to me with one of Vigna's earlier generators. With these short algorithms there is always structure in the output. It only really matters if it causes problems in your application. I do usually use Xoroshiro128+ for most things, however I also use the higher bits of simple 64 bit LCG's a lot. Even with constants that don't give the maximum cycle. The higher bits don't correlate with anything much. A single multiply is as powerful as a lot of shifts and xor's. I also like to use the bswap instruction with RNG's.

Additive recurrence is another interesting topic: https://en.wikipedia.org/wiki/Low-discr ... recurrence
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: On FreeBASIC's random number generators.

Post by counting_pine »

Just to lend a little context to your thread, KeyPgRandomize lists all the different methods:

0 - Default for current language dialect. This is algorithm 3 in the -lang fb dialect, 4 in the -lang qb dialect and 1 in the -lang fblite dialect.
1 - Uses the C runtime library's rand() function. This will give different results depending on the platform.
2 - Uses a fast implementation. This should be stable across all platforms, and provides 32-bit granularity, reasonable degree of randomness.
3 - Uses the Mersenne Twister. This should be stable across all platforms, provides 32-bit granularity, and gives a high degree of randomness.
4 - Uses a function that is designed to give the same random number sequences as QBASIC. This should be stable across all platforms, and provides 24-bit precision, with a low degree of randomness.
5 - Available on Win32 and Linux, using system features (Win32 Crypto API, Linux /dev/urandom) to provide cryptographically random numbers. If those system APIs are unavailable, algorithm 3 will be used instead.

My own comments:
0: This will be equivalent to 1, 3 or 4, depending on the dialect (usually 3 probably).
1: This was originally FB's only algorithm. The randomness and number of bits may vary depending on the platform. I think the Windows CRT uses a Multiply-with-carry algorithm, but I seem to recall it might only return the upper 16 bits, which would look more random but provide fewer bits of "entropy". I can't remember now.
2: A simple, 32-bit Multiply-with-carry, although I've never really used it
3: If there are any weaknesses in this one, then it would either be a weakness in the Mersenne Twister algorithm, our implementation, or something to do with the floating-point conversion.
4: I reverse-engineered this one myself analytically from QB (although I wasn't the first), and provided FB's implementation. It's another MWC, sadly only using 24 bits and returning all of them - the least significant bit merely alternates between 0 and 1 with each call.
5: I don't have much to say about this, only that something must be wrong if the results are ever less random than the Mersenne Twister. If some platforms are returning less than random results, it would be great to have a fix for that, even if it just means it falls back on the MT in more cases.

EDIT: Fixed above wiki link
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: On FreeBASIC's random number generators.

Post by deltarho[1859] »

Thanks counting_pine, it is good to have a broader description of FreeBASIC's random number generators.

The result with 5: bugged me and I had to go back to it. I started to question my logic.

With the CryptoRnd tests I used *Sptr = CryptoDW ... (1)

With the FreeBASIC tests I used *Sptr = Int( Rnd * 256 ) ) ... (2)

If any non-random behaviour existed in the generation of the Ulongs then Int( Rnd * 256 ) may digest that behaviour and give unpredictable results. On the one hand favourable results may occur when they should not, effectively giving a false negative. On the other hand poor results may occur when they should not leading to a false rejection.

From the Mersenne Twister paper we have 'return genrand_int32()*(1.0/4294967296.0);'
and it seems used in FreeBasic as 'Function = genrand_int32() * (1.0# / 4294967296.0#)'

to give us Rnd within [0,1)

So, to my mind we should be using Int( Rnd * 4294967296# ) ... (3)
to return the Rnd to its 32 bit origin.

Being in the Double precision domain we should be OK as in this example.

Code: Select all

x = 3221225472 * 1# / 4294967296# ' Inside generator
Print Int( x * 4294967296# ) => 3221225472
I then did tests using (3) but found that PractRand stopped at 64GB.

At the command prompt this was used: 'My_RNGFB | RNG_test stdin32' where My_RNGFB is my code and RNG_test is PractRand's author's code.

During the CryptoRnd tests I noticed that My_RNGFB had a CPU usage of about 2% whereas RNG_test had a CPU usage of about 12%. With FreeBASIC tests My_RNGFB was at about 10% and RNG_test at about 8%. This is probably because (1) is 'easier' than (3).

From PractRand: "If your program stops sending out random data before the PractRand RNG testing tool is done testing then it may print "error reading input". The time to build a string to send down the pipe will, obviously, be much greater with (3) than (1). Perhaps we have a bottleneck such that RNG_test thinks My_RNGFB has stopped sending data. At this point My_RNGFB had a CPU usage of 0.01% and RNG_test showed no activity at all. I tried a much shorter string for piping but that did not help.

We need then the ULong output of the FreeBASIC generators so that we can use something like (1) and not (3).

Anyway, I did five tests using (3).

1: 1 X unusual anomaly at 256Mb, 1 X unusual anomaly at 64GB
2: 1 X unusual anomaly at 4GB
3: 1 X unusual anomaly at 512MB
4: 1 X unusual anomaly at 64Mb, 1 X unusual anomaly at 8GB
5: 2 X unusual anomaly at 512MB, 1 X unusual anomaly at 2GB, 1 X unusual anomaly at 32GB

Testing suites assume that a set of random numbers is good and report if a contradiction is found or not. With the results above PractRand is telling us that there is no evidence that any of the FreeBasic generators has produced poor quality random numbers. If we dropped the results onto the floor and picked them up we would not be able to tell which results belonged to which generator.

This is the exact opposite to my opening post except for Mersenne Twister. Had the Mesenne Twister results been poor then I would have questioned my logic straight away. You will be pleased to learn that I do not work for a pharmaceutical company or intend to get involved in writing AV software anytime soon. <smile>

Testing beyond 64GB is a must but, on the face of it, all is well in the FreeBASIC camp.

Of the five generators my money is still on Mersenne Twister but my CryptoRnd isn't looking too shabby.
MOD
Posts: 555
Joined: Jun 11, 2009 20:15

Re: On FreeBASIC's random number generators.

Post by MOD »

I'll try to explain algo 5, but it's quite some years ago, so it might be not 100% correct:

Algo 5 was introduced to get a "better" random number. Basically, it uses the default algo 3 Mersenne Twister but uses CryptApi or /dev/urandom as additional initial value. Both APIs are a bit slow and can run out of random numbers. So theses APIs are only triggert once every 256 calls. So this algo will not simply and just get random numbers from the system and return them, but use them as additional value. I'm not sure, why the english documentation is so unspecific about it, but the german documentation (Google will help translating it) explains it very well.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: On FreeBASIC's random number generators.

Post by deltarho[1859] »

and can run out of random numbers.
I know nothing of '/dev/urandom' but the CryptoAPI has seen the entropy pool get larger with each version of Windows. However, it is finite so we may think that, at some point, we will "run out of random numbers". My understanding is that encryption and other conditioning is employed to generate more numbers with the pool being refreshed at some regular interval. I don't think that Microsoft has published the exact 'goings on' with their cryptographic random number generators.
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: On FreeBASIC's random number generators.

Post by counting_pine »

(Just to say, I fixed my wiki link above.)
The internal Randomize code can be found in https://sourceforge.net/p/fbc/code/ci/m ... math_rnd.c. Here's an excerpt:

Code: Select all

	switch( algorithm ) {
[...]

#if defined HOST_WIN32 || defined HOST_LINUX
	case RND_REAL:
		rnd_func = hRnd_REAL;
		state[0] = (unsigned int)seed;
		for( i = 1; i < MAX_STATE; i++ )
			state[i] = ( state[i - 1] * 1664525 ) + 1013904223;
		p = state + MAX_STATE;
		break;
#endif

	default:
	case RND_MTWIST:
		rnd_func = hRnd_MTWIST;
		state[0] = (unsigned int)seed;
		for( i = 1; i < MAX_STATE; i++ )
			state[i] = ( state[i - 1] * 1664525 ) + 1013904223;
		p = state + MAX_STATE;
		break;
	}
}
So it only works on Windows and Linux.

The hRnd_REAL() function is defined below:

Code: Select all

static double hRnd_REAL( float n )
{
	static unsigned int count = 0;
	static unsigned int v;
	double mtwist;

	mtwist = hRnd_MTWIST(n);
	if( (count % 256) == 0 ) {
		count = 1;

		/* get new random number */
		v = hGetRealRndNumber( );
	} else {
		count++;
	}

	if( v == 0 ) {
		return mtwist;
	}
	v *= mtwist;

	v ^= (v >> 11);
	v ^= ((v << 7) & 0x9D2C5680);
	v ^= ((v << 15) & 0xEFC60000);
	v ^= (v >> 18);

	return (double)v / (double)4294967296ULL;
}
#endif
It fetches a random 'unsigned int' number every 256 times it's called (so fairly low-traffic), and uses it to seed the algorithm somehow.

hGetRealRndNumber() uses Windows-specific or LInux-specific code to get a random value:

Code: Select all

#if defined HOST_WIN32 || defined HOST_LINUX
static unsigned int hGetRealRndNumber( )
{
	union {
		unsigned int i;
		unsigned char b[sizeof(unsigned int)];
	} number = { 0 };

#if defined HOST_WIN32
	HCRYPTPROV provider = 0;
	if( CryptAcquireContext( &provider, NULL, 0, PROV_RSA_FULL, 0 ) == TRUE ) {
		if( CryptGenRandom( provider, sizeof(number), &number.b[0] ) == FALSE ) {
			number.i = 0;
		}
		CryptReleaseContext( provider, 0 );
	}

#elif defined HOST_LINUX
	int urandom = open( "/dev/urandom", O_RDONLY );
	if( urandom != -1 ) {
		if( read( urandom, &number.b[0], sizeof(number) ) != sizeof(number) ) {
			number.i = 0;
		}
		close( urandom );
	}
#endif

	return number.i;
}
There are obvious similarities between hRnd_REAL and hRnd_MTWIST, which can be found in the same file.
But because I don't understand the Mersenne Twister, and the code doesn't have much commenting, I can't really say more than that.
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: On FreeBASIC's random number generators.

Post by dodicat »

freebasic can use the five algorithms as mentioned, but also has access to the one in msvcrt.dll.
I don't think it utilises any Win crypt stuff, it seems too fast for that.
It seems up to speed with the default freebasic rnd (slight difference due to using MOD)
Also the C rand seems to be in the range SHORT, 32 and 64 bit.

Code: Select all


#include "crt.bi"

screen 19
dim as double t=timer
for n as long=1 to 10000000
   ' if rand > 32767 then beep 'short range
    pset (rand mod 800,rand mod 600),15
next
print timer-t;"   Press a key"
sleep

 t=timer
for n as long=1 to 10000000
    pset (rnd*800,rnd*600),0
next
print timer-t,"   Done"
sleep 
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: On FreeBASIC's random number generators.

Post by deltarho[1859] »

Interesting stuff, guys. Just got around to going through your posts - I've been doing more work for the 'A fast CPRNG' thread.

The original fast CPRNG was based upon BCryptGenRandom, the follow on to CryptGenRandom. It is now a conditional compile where we can compile to either BCryptGenRandom or Intel's RdRand. Anyway, enough of that - this thread is about FreeBASIC's offerings.
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: On FreeBASIC's random number generators.

Post by counting_pine »

Actually, I am a little concerned about the hRnd_REAL code. Firstly, it doesn't use pure integer math, so it seems like it would be hard to prove any properties about it.

Secondly, looking at an abridged version of it:

Code: Select all

   static unsigned int v;
...
   v *= mtwist;

   v ^= (v >> 11);
   v ^= ((v << 7) & 0x9D2C5680);
   v ^= ((v << 15) & 0xEFC60000);
   v ^= (v >> 18);

   return (double)v / (double)4294967296ULL;
So the 'v ^= ' lines look like the ones in the original Mersenne Twister code, and look like they do a bijective transformation on v.
(This is good, because it means no data is lost in this step - each 32-bit value is possible, and adjacent numbers will presumably be transformed to completely different values.)

But what is very concerning is: 'v *= mtwist'.

It takes a random-ish number between 0 and 2^32, and multiplies it by one between 0 and 1 (with granularity 2^-32).
This means the result is going to heavily skew 'v' towards lower values.

It means that low values (e.g. 0) will be more common; higher numbers will be rarer; and at the top end, 2^32-1 will only occur when both v and mtwist are at their highest possible values.

Thanks to the subsequent 'v^=' lines, the final average value won't appear very skewed, but the overall distribution across the 2^32 numbers will be very uneven, with some being vanishingly rare.

It's difficult to say how to ensure a perfect distribution, but I'd venture that the algorithm could probably be improved by doing 'v ^= (mtwist * 4294967296ULL)'.
Overall though, I still don't like it, and wouldn't trust it over the original MT, and I'd say a different algorithm should be found. Even if it's just seeding MT with a cryptographically random value (or the timer).
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: On FreeBASIC's random number generators.

Post by deltarho[1859] »

It seems to be a slightly different implementation to Greg Lyon's code (April 2006). What I liked about Greg's code is that we can seed via four 32 bit values. I am not happy with just one 32 bit seed as that restricts us to a small entry window to the Twister's sequence; 2^128 is much better than 2^32.

Just found C++ Seeding Surprises which looks into the perils of just using one 32 bit seed: "You really shouldn't try to initialize a random number generator whose state is 624 32-bit ints with a single 32-bit int."

The FreeBASIC Twister did OK for 64GB examined by PractRand. I couldn't get beyond 64GB because I could not get access to the ULong output. However, I can with Greg's code. I will see if I can get past 64GB by using a single 32 bit seed to see how that fairs. If there is any bias toward zero PractRand will spot it.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: On FreeBASIC's random number generators.

Post by deltarho[1859] »

Greg's code has just gone past 64GB and according to Process Explorer the piping is still working with my feeder using very little CPU time and the receiver using a lot more; just what I want to see.

It takes a long time to test to 1TB. So far only one small unusual anomaly.

With regard seeding via four 32 bit values, 2^128 can be represented as 2^96 x 2^32 which is roughly 8x10^28 x 2^32 which is a very different 'kettle of fish' to 1 x 2^32.

I am going to write some stuff such that Greg's code will only accept 4 x ULongs. We will have two seeding methods.

RandomizeMT( "Some string" ). "Some string" will be subject to MD5 giving then 128 bits for the 4 x ULongs. This is analogous to 'Randomize x' where x is a fixed seed.

RandomizeMT( "" ). The empty string will invoke RtlGenRandom to supply 16 bytes giving 128 bits for the 4 x ULongs. This is analogous to Randomize without reference to Timer.

Just hit 128GB; still only one anomaly.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: On FreeBASIC's random number generators.

Post by deltarho[1859] »

I pulled out at 512GB - after running for five hours I was looking at another eight hours - that is some serious number crunching.

So, there does not appear to be any bias in Greg Lyon's code. You may think that PractRand may not spot it.

Here is a quote with regard to xoroshiro128+ by David Blackman and Sebastiano Vigna.
Beside passing BigCrush, this generator passes the PractRand test suite up to (and included) 16TB, with the exception of binary rank tests, which fail due to the lowest bit being an LFSR; all other bits pass all tests. We suggest to use a sign test to extract a random Boolean value.
With my implementation it failed at 64GB on a binary rank test - which I have never seen with other generators.

This does not mean that counting_pine's concerns with 'v *= mtwist' in the FreeBASIC implementation are unwarranted.

The above results do not tell us that using a single 32 bit seed is a safe tack. It simply tells us that we got away with it this time.

I have RandomizeMT( "Some string" ) and Randomize( "" ) working. It was so easy to do that I do not think that it is worth publishing. The idea is in my last post - most folk will be able to code it.
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: On FreeBASIC's random number generators.

Post by counting_pine »

Thanks for the read about 32-bit seeds, pretty disturbing..

Just to say for now that I looked up the Greg Lyon code and here's a link for reference: http://www.freebasic.net/forum/viewtopic.php?f=7&t=3969
Also, in case it's not clear, the 'v *= mtwist' code comes from the "real" random generator, not from Greg's or FB's Mersenne Twister code.

EDIT: I've created a bug report for the distribution problem at https://sourceforge.net/p/fbc/bugs/849/.
I've not created a bug for the 32-bit integer seed problem at this point..
I think it would be considered a problem specifically for the case where 'randomize , 4' is used: practically unpredictable sequences is a given (or should be) for real random numbers; it's unattainable in the simpler algorithms; and is the opposite of what is wanted when a seed is passed.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: On FreeBASIC's random number generators.

Post by deltarho[1859] »

I've got Greg's 4 x Ulong 'entry system' to 128GB with PractRand and only one small anomaly so far.

The downside, if it could be called that, is that running flat out it is coming in at about 50% of the inbuilt Twister and in a 'real world' environment it is coming in at about 2/3rd of the inbuilt Twister. Obviously, this has nothing to do with the entry system but to do with C, on this occasion, having an edge on FreeBASIC.

A rewrite of the inbuilt Twister would not come amiss - "pretty disturbing.." indeed.
Post Reply