some CPU info

General FreeBASIC programming questions.
adeyblue
Posts: 299
Joined: Nov 07, 2019 20:08

Re: some CPU info

Post by adeyblue »

dodicat wrote: AdeyBlue has a loop
Guys, I have feelings you know. I can't help it if my nose hooks around like that, have a heart
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK

Re: some CPU info

Post by deltarho[1859] »

adeyblue wrote: have a heart

As with many of these exercises, it takes a million passes to separate methods. In practice with only a handful of passes, it doesn't matter which method we use. With RomuTrio with gcc in 64-bit mode, its best environment, I see no point in switching to the library.

And then someone comes along and says: "I'm using gas, and you won't believe how many passes I need".
adeyblue
Posts: 299
Joined: Nov 07, 2019 20:08

Re: some CPU info

Post by adeyblue »

dodicat wrote: the libgcc.a has goodness knows what!
It's obscured by macros to generate the constants for various type sizes, but it's /almost/ the same as the method in post #6
https://github.com/gcc-mirror/gcc/blob/ ... cc2.c#L842

It's the method under the 'The best method for counting bits...' line in the 'Counting bits set, in parallel' section here.
https://graphics.stanford.edu/~seander/bithacks.html

Basically all the various fancy methods of popcount, and other various funky bit twiddling things are on that page.
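
For reference, the method under that line on the bithacks page can be sketched in FreeBASIC roughly as follows. This is a hand transcription specialised to 64 bits, not the libgcc source, and the name popcount64_mul is made up for the illustration.

```freebasic
' Sketch: parallel bit count for a 64-bit value.
' popcount64_mul is a hypothetical name, not a library function.
Function popcount64_mul(ByVal x As ULongInt) As ULong
    ' Count the set bits in each 2-bit pair.
    x -= (x Shr 1) And &h5555555555555555ull
    ' Sum the pairs into 4-bit nibbles.
    x = (x And &h3333333333333333ull) + ((x Shr 2) And &h3333333333333333ull)
    ' Sum the nibbles into bytes (each byte now holds 0..8).
    x = (x + (x Shr 4)) And &h0F0F0F0F0F0F0F0Full
    ' The multiply accumulates all eight bytes in the top byte;
    ' the bits that overflow past bit 63 are simply discarded.
    Return CULng((x * &h0101010101010101ull) Shr 56)
End Function

Print popcount64_mul(0)                        ' 0
Print popcount64_mul(&hFFull)                  ' 8
Print popcount64_mul(18446744073709551615ull)  ' 64
```

The final multiply replaces the three shift-and-add folds of the post #6 version, which is presumably the "almost" in "almost the same".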
dafhi
Posts: 1641
Joined: Jun 04, 2005 9:51

Re: some CPU info

Post by dafhi »

Wow, dodicat, that is quite the sub, full of things I do not understand. Could you remove the branching slowness (the IF)?

I'm not entirely sure my logic is correct:

Code: Select all

Function popcount64(x As Ulongint) As Ulong
  const as ulongint   mask_hi = 1 shl 63
  const as ulongint   mask_lo = -1 xor mask_hi

  static as ulongint y:  y = x and mask_hi
  'If x=18446744073709551615 Then Return 64
  x and= mask_lo
  x -= ((x Shr 1) And &h5555555555555555ull)
  x = (((x Shr 2) And &h3333333333333333ull) + (x And &h3333333333333333ull))
  x = (((x Shr 4) + x) And &hf0f0f0f0f0f0f0full)
  x += (x Shr 8)
  x += (x Shr 16)
  x += (x Shr 32)
  return (x And &h0000003full) + (y shr 63)
End function
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: some CPU info

Post by dodicat »

Hi dafhi.
The IF only catches the case of 64 ones, i.e. the top of the ulongint range, 18446744073709551615.
If you never go there then you can scrap the IF.
With your adjustment I get the warning:
Shift value greater than or equal to number of bits in data type
and a popcount of only 32 for 18446744073709551615.
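
One possible reading of that warning, offered as an assumption rather than a diagnosis: on a 32-bit fbc the bare literal 1 in `1 shl 63` may be treated as a 32-bit Integer, so the shift is out of range and mask_hi is not the intended top bit; writing `1ull shl 63` would avoid that. Separately, the byte sum left in the low byte after the last fold is at most 64, which fits in 7 bits, so a sketch that needs neither the IF nor the split can simply widen the final mask from &h3f to &h7f. The name popcount64_nobranch is made up for this illustration.

```freebasic
' Sketch: same folding scheme as the posted popcount64, no IF and no top-bit split.
' popcount64_nobranch is a hypothetical name.
Function popcount64_nobranch(ByVal x As ULongInt) As ULong
    x -= (x Shr 1) And &h5555555555555555ull
    x = ((x Shr 2) And &h3333333333333333ull) + (x And &h3333333333333333ull)
    x = ((x Shr 4) + x) And &h0F0F0F0F0F0F0F0Full
    x += x Shr 8
    x += x Shr 16
    x += x Shr 32
    ' The low byte holds the total (0..64); &h7F keeps all 7 bits needed for 64.
    Return CULng(x And &h7Full)
End Function

Print popcount64_nobranch(1ull Shl 63)                ' 1
Print popcount64_nobranch(18446744073709551615ull)    ' 64
```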

...
Here is a convolution (concatenation) of fb rnd using the default algorithm to achieve 1 terabyte with practrand.
Only 2 unusuals on the way up
...
rng=RNG_stdin32, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 103 seconds
Test Name Raw Processed Evaluation
[Low1/32]BCFN(2+2,13-3,T) R= -7.0 p =1-8.5e-4 unusual
...and 215 test result(s) without anomalies

rng=RNG_stdin32, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 204 seconds
no anomalies in 229 test result(s)

rng=RNG_stdin32, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 404 seconds
Test Name Raw Processed Evaluation
DC6-9x1Bytes-1 R= -4.9 p =1-4.3e-3 unusual
...and 239 test result(s) without anomalies
...
after that no more.
It isn't the fastest of course, but it brings fb into practrand terabyte range with its own algorithms.
I might try randomize ,2 later.
My sincere apologies AdeyBlue (nose), and thanks for looking into the library.
The concatenation:

Code: Select all


Const u32max = &hFFFFFFFFul             ' ULong maximum value
#define rnd64 (culngint(rnd*u32max)+(culngint(rnd*u32max) shl 32))
#define u64rng(f,l)  (culngint(rnd64) mod (((l)-(f))+1)) + (f)

'used this in practrand
'Do
    'For j = 1 To 262144
       ' *SPtr =  u64rng(0,4294967294)
        'SPtr += 1
   ' Next
    'Print S;
   ' SPtr = BasePtr
'Loop

dim as ulong count,v
screen 17
do
      count+=1
      v+=u64rng(1,99)
      locate 2,2,0
      print v/count,count
loop
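
The two macros can also be read as a pair of functions, which makes the order of the two rnd draws explicit. This is only a sketch of the same idea: the names low32, high32, u64range, lo_bound and hi_bound are made up here, and the Mod step carries the usual slight modulo bias.

```freebasic
Const u32max = &hFFFFFFFFul             ' ULong maximum value

' Two successive Rnd draws, each scaled to 32 bits, glued into one 64-bit value.
Function rnd64() As ULongInt
    Dim As ULongInt low32  = CULngInt(Rnd * u32max)
    Dim As ULongInt high32 = CULngInt(Rnd * u32max)
    Return low32 + (high32 Shl 32)
End Function

' Map a 64-bit draw into [lo_bound, hi_bound].
' Assumes hi_bound - lo_bound + 1 does not wrap around to 0.
Function u64range(ByVal lo_bound As ULongInt, ByVal hi_bound As ULongInt) As ULongInt
    Return (rnd64() Mod (hi_bound - lo_bound + 1)) + lo_bound
End Function

Dim As ULongInt total
For n As Long = 1 To 100000
    total += u64range(1, 99)
Next
Print total / 100000                    ' should hover around 50
```

Using a 64-bit accumulator for the running total also sidesteps the wrap-around a ulong sum would eventually hit in a long-running loop.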


 
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK

Re: some CPU info

Post by deltarho[1859] »

@dodicat

Your 16 gigabytes are taking a long time. Are you using PractRand's '-multithreaded'? Using all available threads speeds things up quite a bit. If testing Ulongint, then use stdin64.

With regard to 'If x=18446744073709551615 Then Return 64', we are testing for a highly unlikely event on every pass of popcount64.
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: some CPU info

Post by dodicat »

deltarho.
randomize ,2 gets nowhere; the only one that does is the twister (default), and randomize ,5 is flawless.
Here is the code I use (command at the top to paste to the console).

Code: Select all

'fbrnds.exe | rng_test stdin32 -multithreaded 


'fbrnds.bas
randomize ,5
Const u32max = &hFFFFFFFFul             ' ULong maximum value
#define rnd64 (culngint(rnd*u32max)+(culngint(rnd*u32max) shl 32))
#define u64rng(f,l)  (culngint(rnd64) mod (((l)-(f))+1)) + (f)


Dim Shared S As String * 1048576
Dim As Ulong Ptr SPtr, BasePtr
Dim As Long j

SPtr = Cptr(Ulong Ptr, Strptr( S ))
BasePtr = SPtr

Do
    For j = 1 To 262144
        *SPtr =  u64rng(0,4294967294)
        SPtr += 1
    Next
    Print S;
    SPtr = BasePtr
Loop

 
As you see I have given algo 5 a shot.
results:

Code: Select all

Microsoft Windows [Version 10.0.19042.1165]
(c) Microsoft Corporation. All rights reserved.

C:\Users\Computer\Desktop\fb\test\practrand\msvc12_64bit>fbrnds.exe | rng_test stdin32 -multithreaded
RNG_test using PractRand version 0.94
RNG = RNG_stdin32, seed = unknown
test set = core, folding = standard (32 bit)

rng=RNG_stdin32, seed=unknown
length= 128 megabytes (2^27 bytes), time= 3.5 seconds
  no anomalies in 154 test result(s)

rng=RNG_stdin32, seed=unknown
length= 256 megabytes (2^28 bytes), time= 7.4 seconds
  no anomalies in 165 test result(s)

rng=RNG_stdin32, seed=unknown
length= 512 megabytes (2^29 bytes), time= 14.2 seconds
  no anomalies in 178 test result(s)

rng=RNG_stdin32, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 27.5 seconds
  no anomalies in 192 test result(s)

rng=RNG_stdin32, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 53.1 seconds
  no anomalies in 204 test result(s)

rng=RNG_stdin32, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 104 seconds
  no anomalies in 216 test result(s)

rng=RNG_stdin32, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 206 seconds
  no anomalies in 229 test result(s)

rng=RNG_stdin32, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 409 seconds
  no anomalies in 240 test result(s)

rng=RNG_stdin32, seed=unknown
length= 32 gigabytes (2^35 bytes), time= 813 seconds
  no anomalies in 251 test result(s)

rng=RNG_stdin32, seed=unknown
length= 64 gigabytes (2^36 bytes), time= 1630 seconds
  no anomalies in 263 test result(s)

rng=RNG_stdin32, seed=unknown
length= 128 gigabytes (2^37 bytes), time= 3261 seconds
  no anomalies in 273 test result(s)

rng=RNG_stdin32, seed=unknown
length= 256 gigabytes (2^38 bytes), time= 6523 seconds
  no anomalies in 284 test result(s)

rng=RNG_stdin32, seed=unknown
length= 512 gigabytes (2^39 bytes), time= 13141 seconds
  no anomalies in 295 test result(s)

rng=RNG_stdin32, seed=unknown
length= 1 terabyte (2^40 bytes), time= 26293 seconds
  no anomalies in 304 test result(s)

 
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK

Re: some CPU info

Post by deltarho[1859] »

@dodicat

I thought that your latest PC's spec was similar to mine, but obviously not because with 'Randomize ,5' at 16 gigabytes you have 409 seconds to my 66.8 seconds.

There must be something else going on - that is a factor of 6.12.

I think that your macros are overkill - I'm using '*SPtr = Int(Rnd*2^32)' - but that will only get you another few per cent in speed.
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: some CPU info

Post by dodicat »

My macro is a general ranger over the ulongint range, but I am only using a ulong range for practrand.
I don't think you need concatenation for function 5, it is very random and it is ulong range anyway, so yours will do there.
But for the default twister without concatenation it fails.
My main theme is a 64 bit simulator using fb rnd and giving a good practrand.
As I said earlier, the speed is poor.
It has to call rnd twice, and rnd itself returns a double, which then has to be converted back to ulong (twice).
So there is no practical use for it really, just an exercise in messing around and being off topic for so long.
Sorry srvaldez.
dafhi
Posts: 1641
Joined: Jun 04, 2005 9:51

Re: some CPU info

Post by dafhi »

dodicat, i've been messing with quicksort again. I found yours to be fastest. I will see if i can tweak further

[update] also made x shared, swapped the order of last 2 calls for later experiments

Code: Select all

'' before invoking: #define direction as < or >, and Dim Shared x and tmp with the element type
#macro SetQsort(datatype,fname,dot)
    Sub fname(a() As datatype,begin As Long,Finish As Long)
    Dim As Long i=begin,j=finish
    'Dim As datatype x =a((I+J)\2)
    x =a((I+J)\2) '' shared var
    While  I <= J
      While a(I)dot direction X dot:I+=1:Wend
      While x dot direction a(J)dot:J-=1:Wend
      
      '' shared tmp var
      If I<=J Then tmp=a(i) : a(i)=a(j) : a(j)=tmp: I+=1:J-=1
    wend
    If I < Finish Then fname(a(),I,Finish)
    If J > begin Then fname(a(),begin,J)
End Sub
#endmacro
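
A sketch of how the macro might be instantiated, since it relies on three things defined outside it: the `direction` symbol and the shared `x` and `tmp`. The type Item, its fields, and the name sortItems are made up for this example; the macro body is repeated only so the snippet compiles on its own.

```freebasic
' Hypothetical element type, sorted on its score field.
Type Item
    As Long score
    As Long id
End Type

' "<" gives ascending order, ">" would give descending.
#define direction <
' Shared pivot and swap variables that the macro body expects.
Dim Shared As Item x, tmp

#macro SetQsort(datatype,fname,dot)
    Sub fname(a() As datatype,begin As Long,Finish As Long)
    Dim As Long i=begin,j=finish
    x =a((I+J)\2) '' shared var
    While  I <= J
      While a(I)dot direction X dot:I+=1:Wend
      While x dot direction a(J)dot:J-=1:Wend
      If I<=J Then tmp=a(i) : a(i)=a(j) : a(j)=tmp: I+=1:J-=1
    wend
    If I < Finish Then fname(a(),I,Finish)
    If J > begin Then fname(a(),begin,J)
End Sub
#endmacro

' Generates Sub sortItems, comparing elements by .score.
SetQsort(Item, sortItems, .score)

Dim As Item items(0 To 5)
Dim As Long scores(0 To 5) = {42, 7, 19, 7, 88, 1}
For n As Long = 0 To 5
    items(n).score = scores(n)
    items(n).id = n
Next

sortItems(items(), LBound(items), UBound(items))

' Should list the scores in ascending order.
For n As Long = 0 To 5
    Print items(n).score;
Next
Print
```

Sharing x is safe here only because the pivot is no longer needed once the partition loop ends and the recursive calls begin.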