Fast random integers

deltarho[1859]
Posts: 4318
Joined: Jan 02, 2017 0:34
Location: UK

Re: Fast random integers

Post by deltarho[1859] »

Thanks, Provoni, and dodicat.

For some weird reason, the dummy loop is running slightly faster than the rnd loop, and my dummy loop is the same as Provoni's dummy loop. I have other surrounding code. I must be using an incorrect variable name or something daft like that: something that still compiles but that I have a blind spot for. It won't be the first time, and when the error is eventually spotted it is hard to believe I read it many times before. I may be wrong, but I reckon other folks have done this.
badidea wrote: Check, PCG32II with your test program (above) works fine.
That is good - thanks for letting me know.

Post by deltarho[1859] »

@Provoni

With your last code I get:

Code: Select all

0.1596740999652866
0.1935974999477139
However, if I swap the two loops I get:

Code: Select all

0.8822731999641178
0.1539957000504728
The 'dummy loop' is pretty much the same for both runs but the 'cryptos loop' in the second run takes over four times longer.

Would you do a swap on your machine to see if this happens?

BTW, my figures are also from FBC official build (32-bit).

Added: This odd behavior is not seen with 64-bit.
dodicat
Posts: 7995
Joined: Jan 10, 2006 20:30
Location: Scotland

Post by dodicat »

Here are my 32-bit -O3 results over 5 runs.

Code: Select all

 0.3266782851811172  x=0
 1.98001706168418  x=cryptos

 1.958181500206706  x=cryptos
 0.2424342385153295  x=0
__________________________

 0.242853764900417  x=0
 1.968862997889971  x=cryptos

 1.961515811106153  x=cryptos
 0.2419527755798754  x=0
__________________________

 0.2422829900924626  x=0
 1.96680403364519  x=cryptos

 1.955502143301544  x=cryptos
 0.2423551923654941  x=0
__________________________

 0.2424807765763717  x=0
 1.956622135074724  x=cryptos

 1.963039588386664  x=cryptos
 0.2424284212783618  x=0
__________________________

 0.2421741732044893  x=0
 1.954143644319231  x=cryptos

 1.966766050441606  x=cryptos
 0.2442663301548578  x=0
__________________________

  
and 64-bit -O3:

Code: Select all

  0.04190644284244627  x=0
 1.700490325805731  x=cryptos

 1.662760326173157  x=cryptos
 0.0337773491628468  x=0
__________________________

 0.03403091279324144  x=0
 1.651825949200429  x=cryptos

 1.641291251522489  x=cryptos
 0.03397137171123177  x=0
__________________________

 0.03390977717936039  x=0
 1.648388296598569  x=cryptos

 1.657925849198364  x=cryptos
 0.0336394461337477  x=0
__________________________

 0.03387350484263152  x=0
 1.646430620923638  x=cryptos

 1.649227007292211  x=cryptos
 0.03385879076085985  x=0
__________________________

 0.03390464431140572  x=0
 1.64388403412886  x=cryptos

 1.647235112381168  x=cryptos
 0.03396179026458412  x=0
__________________________

 
Using asm nop in all loops.

Post by deltarho[1859] »

dodicat's results are what we would expect - swapping the loop blocks sees little difference in both 32-bit and 64-bit.

I am now thinking computer differences, so I decided to check the Performance Counter frequency. I got 10000000. Where the blazes did that come from? I have seen a variety of values but never that one. I have the HPET enabled in the BIOS and have had it activated for as long as I can remember. Anyway, I reactivated it from the command prompt and got 14318180 - the value I have used since Windows 7. Clearly a Windows 10 update has stuck its nose in and changed it.

"bcdedit /enum" in admin mode will tell you whether HPET is on or not. If you see the line "useplatformclock Yes" then it is on. If it is off you won't see "No"; there will simply be no line starting with useplatformclock.

Just thought that I would mention that - my issue has not changed.

Added: Been doing a spot of 'up to date' reading. With an invariant TSC, HPET is best left off. I have just deactivated it again. <laugh> It is still enabled in the BIOS: if Windows reckons on booting that it should use HPET, it will not be able to if HPET is disabled in the BIOS.
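
For anyone wanting to repeat the Performance Counter frequency check, here is a minimal sketch. It is Windows only and assumes the WinAPI headers that ship with FreeBASIC (windows.bi); it is not the code used for the figures above. It simply prints whatever QueryPerformanceFrequency reports, which on the machine described above was 10000000 with HPET off and 14318180 with it on.

```freebasic
' Windows only: print the frequency reported by QueryPerformanceFrequency.
#include once "windows.bi"

Dim As LARGE_INTEGER freq

If QueryPerformanceFrequency(@freq) Then
  ' QuadPart holds the 64-bit tick rate in counts per second.
  Print "Performance counter frequency: "; freq.QuadPart; " Hz"
Else
  Print "No high-resolution performance counter available."
End If
```

Running it before and after changing the useplatformclock setting (and rebooting) should show whether the change took effect.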

Post by deltarho[1859] »

Provoni's code is using 'Dim As Ulongint i,n=10^8'

I am using 'Dim As Ulong i, j, n = 10^8'

The 'j' is for an outer loop for multiple tests.

Since I don't need an unsigned integer in excess of four Gigs I used Ulong.

When I changed to Ulongint I got a much faster 'dummy loop' timing. Changing Provoni's code to Ulong slows the 'dummy loop' timing down, getting close to the 'rnd loop' timing.

I am no longer getting a negative MHz.

The timings still change when I swap the loop blocks.

Try this in 32-bit mode

Code: Select all

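' Loop 1: 32-bit unsigned loop counter (Ulong)
Dim As Ulong i, m = 10^8
Dim As Double t, x

t = Timer
For i = 1 To m
  x = 0
  Asm nop   ' inline no-op in the loop body
Next
t = Timer - t
Print t

' Loop 2: identical body, 64-bit unsigned loop counter (Ulongint)
Dim As Ulongint j, n = 10^8

t = Timer
For j = 1 To n
  x = 0
  Asm nop
Next
t = Timer - t
Print t

Sleep   ' wait for a key press before the console closes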
The Ulongint loop is running faster than the Ulong loop. In 64-bit mode the timings are pretty much the same.

After nearly two years with FreeBASIC why did I not know this? This would not arise with PowerBASIC since it is bereft of 64-bit mode.

Post by deltarho[1859] »

OK, this is the state of the parties.

I have Provoni's code, copied and pasted with no tinkering. I am using the FBC official release, again with no tinkering.

If I swap the loop blocks the timing for the 'rnd loop' is over four times slower than with the 'dummy loop' executed first.

If I comment out the 'rnd loop' I get the 'dummy loop' running the same as with the blocks as is or swapped.

If I comment out the 'dummy loop' I get the same 'rnd loop' timing as when the 'rnd loop' is executed first.

So, the true 'rnd loop' timing is the one seen with the loop blocks swapped.

If I test PCG32II I don't have any issues.

This tells me that there is something wrong with CryptoRNDII.

However, both Provoni and dodicat are using the same CryptoRNDII that I am using and they are not having any issues.

As is, the code compiles to an exe of 112640 bytes. With the loop blocks swapped, the code compiles to 113152 bytes; a difference of 512 bytes.

With PCG32II the exes compile to exactly the same size with the code as is or with the loop blocks swapped.

Needless to say, with the 'rnd loop' having different timings, I am getting different MHz values for CryptoS. 'Real life' timings put PCG32II faster than CryptoRNDII, suggesting that the run with the loop blocks swapped is telling the truth.

What a can of worms. <smile>

Added: In 64-bit mode as is or loop blocks swapped I get the corresponding exes of the same size.

More added: The exes are the same size if 'Screenres 640,480,32' is commented out in Provoni's code.
Last edited by deltarho[1859] on Oct 19, 2018 11:30, edited 2 times in total.
badidea
Posts: 2594
Joined: May 24, 2007 22:10
Location: The Netherlands

Post by badidea »

I think the NSA noticed your crypto activities and remotely changed your Intel cpu with their backdoor in Windows.

Post by deltarho[1859] »

A few years ago I would have asked a friend to check that out <laugh> - she worked for MI6. I remember when she went for her first interview and I asked her what the company did. She said she didn't know but it was called Government Communications Bureau. I said, "In Waterloo?". She said "Yes, you've heard of them". I said "Yes, it is MI6". She burst out laughing. After the interview, she 'phoned me and confirmed MI6. The 'company' moved to Nine Elms just upriver ( the Thames ) from MI5, on the south bank which, in turn, is just upriver, on the same side as the Houses Of Parliament. The tenant of the new building is called Government Communications Bureau but you won't find much information Googling that name. The Secret Intelligence Service MI6 are at the new building and have their own website. My friend retired a few years ago.

Post by deltarho[1859] »

In Provoni's code if we add 'y' to 'dim as double t,x' and use 'y = 0' as opposed to 'x = 0' then 'rnd loop' takes the slower result whether we swap loop blocks or not.

How did I miss that? Blimey!

Why my setup and not Provoni's or dodicat's?

I get the same behavior, bad and good, with gcc 8.1.

Oh, only with -O3 and -Ofast; -O1 and -O2 are OK. Perhaps I should simply give -O3 and -Ofast a wide berth, but then PCG32II works OK.

Added: If I keep 'x' but use 'Dim as Double x = 0' in the 'dummy loop' the problem clears with -O3. The mind boggles.
More added: BUT for some reason, this slows the 'rnd loop' down distorting '100/(dt1m - dt0m)' giving a false MHz.
Last edited by deltarho[1859] on Oct 19, 2018 21:23, edited 1 time in total.
Provoni
Posts: 514
Joined: Jan 05, 2014 12:33
Location: Belgium

Post by Provoni »

I use the official FB 1.05 64-bit with these switches: -gen GCC -O max -Wc -march=native,-funroll-loops,-ffast-math

No issues with the code. I refuse to compile with 32-bit, even for testing. Sorry :D

Post by deltarho[1859] »

Provoni wrote: I refuse to compile with 32-bit
Perhaps I should too.

What is the 'max' in '-gen GCC -O max'?

Post by deltarho[1859] »

Here is a ranking based upon badidea's loop overhead filter technique. The figures are for 64-bit ( [0,1) Double with 32-bit granularity ) in MHz and the bracketed figures are without filtering. Nine tests per generator were undertaken and the median chosen. Tests used official FreeBASIC release FBC 1.05/gcc 5.2. The '*' indicates a seal of approval from PractRand to 1TB or more of data.

Code: Select all

* CryptoRNDII   604 (523) Windows only and not cryptographically secure.
* PCG32II       486 (432) Thread safe. Comes with a free Help file and a reduction in the author's life expectancy.
  Knuth64       486 (432) 53-bit granularity.
* MsWs          433 (389) Found by Paul Doe. 53-bit granularity.
  FB #2         389 (354)
  FB #4         354 (324)
* CMWC4096      294 (273) Period 2^131086. Can shuffle up to 10,940 elements.
  RndMT         226 (213) Digitally remastered FB #3 <smile>
  FB #3         212 (201) Period 2^19937 − 1. Can shuffle up to 2080 elements.
* Intel RdRand  116 (113) Uses RdRand 64-bit
  FB #1          74 ( 73)
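
The 'Can shuffle up to ... elements' notes can be checked with a little arithmetic. This sketch assumes they come from the usual argument that a generator with period about 2^b cannot reach every ordering of n elements once n! exceeds 2^b. MaxShuffle is a hypothetical helper name, not something from the thread.

```freebasic
' Largest n with n! <= 2^periodBits, found by summing log2(k).
' Assumption: this is how the "can shuffle up to" figures are derived.
Function MaxShuffle(periodBits As Double) As Integer
  Dim As Double log2fact = 0          ' running value of log2(n!)
  Dim As Integer n = 1
  Do
    log2fact += Log(n + 1) / Log(2)   ' now holds log2((n + 1)!)
    If log2fact > periodBits Then Return n
    n += 1
  Loop
End Function

' FB #3 is quoted with period 2^19937 - 1.
Print "Period 2^19937: up to"; MaxShuffle(19937); " elements"
```

The same function can be called with the exponent of any other period in the table.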
Last edited by deltarho[1859] on Oct 23, 2018 3:36, edited 2 times in total.
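
As a rough sketch of how figures like those in the table could be produced: time a dummy loop and a generator loop several times, take the median of each, and divide the loop count in millions by the difference. This is not the code used for the table. The built-in Rnd stands in for the generator under test, N is reduced from the 10^8 used in the thread to keep the run short, and Median is a hypothetical helper.

```freebasic
' Sketch only: Rnd stands in for the generator under test (assumption).
Const RUNS = 9            ' odd, so the median is a single element
Const N = 10000000        ' iterations per loop; the thread uses 10^8

' Median of an odd-length array, found with an in-place insertion sort.
Function Median(a() As Double) As Double
  Dim As Integer first = LBound(a), last = UBound(a)
  For i As Integer = first + 1 To last
    Dim As Double v = a(i)
    Dim As Integer j = i - 1
    Do While j >= first AndAlso a(j) > v
      a(j + 1) = a(j)
      j -= 1
    Loop
    a(j + 1) = v
  Next
  Return a((first + last) \ 2)
End Function

Dim As Double dt0(1 To RUNS), dt1(1 To RUNS)
Dim As Double t, x
Dim As ULongInt i

For r As Integer = 1 To RUNS
  ' Dummy loop: measures the loop overhead only.
  t = Timer
  For i = 1 To N
    x = 0
    Asm nop
  Next
  dt0(r) = Timer - t

  ' Generator loop: same overhead plus one call per iteration.
  t = Timer
  For i = 1 To N
    x = Rnd
    Asm nop
  Next
  dt1(r) = Timer - t
Next

Dim As Double dt0m = Median(dt0()), dt1m = Median(dt1())

Print "Unfiltered MHz: "; (N / 1e6) / dt1m
If dt1m > dt0m Then
  Print "Filtered MHz:   "; (N / 1e6) / (dt1m - dt0m)
Else
  Print "Filtered MHz:   n/a (dummy loop was not faster)"
End If
```

The guard on the last division is there because, as seen earlier in the thread, a dummy loop that times slower than the generator loop would otherwise give a negative MHz.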

Post by Provoni »

Thank you for the table deltarho[1859].

CryptoRNDII is bad ass; PCG32II is also very good.

Post by Provoni »

deltarho[1859] wrote: What is the 'max' in '-gen GCC -O max'?
It is the same as -O 3 since that is currently the maximum optimization level.

Post by deltarho[1859] »

@Provoni

Not only is CryptoRNDII bad ass, its quality of randomness is second only to Intel's RdRand. PCG32II comes into its own if thread safety is an issue.
Provoni wrote: It is the same as -O 3 since that is currently the maximum optimization level.
Ah, thank you.

64-bit is now my default in WinFBE. <smile>