A brain teaser

General FreeBASIC programming questions.
fxm
Moderator
Posts: 12081
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: A brain teaser

Post by fxm »

deltarho[1859] wrote:fxm and Lost Zergling may have spotted a weakness in gcc's optimization, but that is beyond my pay grade to get involved in that.
I did an elementary test with gcc and didn't see a parameter to pass 'ByRef' transformed into a parameter passed 'ByVal' (with only '@parameter' used in procedure body and not assignment) even with the option '-O 3'.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A brain teaser

Post by deltarho[1859] »

That is great news, fxm, thanks for that. Image
coderJeff
Site Admin
Posts: 4313
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: A brain teaser

Post by coderJeff »

I thought maybe this could be due to gcc's '-fstrict-aliasing' which is turned on for -O levels 2 and higher.
This option turns on optimizations that assume pointers of different types never point to the same memory location.

With a byref ulongint (aka ulongint ptr) referenced through a ubyte ptr, it's possible that gcc could optimize out needed bits. However, fbc is already passing -fno-strict-aliasing by default to gcc so maybe not the cause. Unless there is something else going on, for example passing options directly to gcc through fbc's -Wc command line option, I dunno.

However, when same memory location is accessed through a union, the aliasing optimizations do not apply and should get fool proof code generation as dodicat points out.
fxm
Moderator
Posts: 12081
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: A brain teaser

Post by fxm »

coderJeff wrote:However, when same memory location is accessed through a union, the aliasing optimizations do not apply and should get fool proof code generation as dodicat points out.
And such UDT can be defined locally in the procedure:

Code: Select all

Sub ShuffleUlongint( ByRef x As Ulongint )
  Union localUDT
    As Ulongint ul
    As Byte b(7)
  End Union
  
  Dim As localUDT Ptr p = Cptr(localUDT Ptr, @x)
  For i As Long = 0 to 6
    Swap p->b(i), p->b(Int(rnd_range(i,8)))
  Next
End Sub
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: A brain teaser

Post by dodicat »

deltarho[]
Regarding the shuffle method
If I use this code:

Code: Select all


#define irange(f,l) Int(Rnd*(((l)+1)-(f))+(f))
Function popcount64(x As Ulongint) As Ubyte
      If x=&hffffffffffffffffull Then Return 64
      x -= ((x Shr 1) And &h5555555555555555ull)
      x = (((x Shr 2) And &h3333333333333333ull) + (x And &h3333333333333333ull))
      x = (((x Shr 4) + x) And &hf0f0f0f0f0f0f0full)
      x += (x Shr 8)
      x += (x Shr 16)
      x += (x Shr 32)
      Return x And &h0000003full
End Function

#macro shuffle(x)
For i As Long = 0 to sizeof(x)-2
    Swap Cast(uByte Ptr,@x)[i], Cast(uByte Ptr,@x)[iRange((i+1),(sizeof(x)-1))] '<<<-------------------
  Next
#endmacro

dim as ulongint u
dim as ulong l
dim as ushort s

randomize
do
u=irange(0,18446744073709551614)
print "ulongint"
print u,popcount64(u)
shuffle(u)
print u,popcount64(u)
print
l=irange(0,4294967294)
print "ulong"
print l,popcount64(l)
shuffle(l)
print l,popcount64(l)
print
s=irange(0,65534)
print "ushort"
print s,popcount64(s)
shuffle(s)
print s,popcount64(s)
print
print "_______________________________"
sleep
loop until inkey=chr(27)

 
Then I am guaranteed no overlaps, i.e. I get 2 different values each time.
(using Swap Cast(uByte Ptr,@x), Cast(uByte Ptr,@x)[iRange((i+1),(sizeof(x)-1))])
If I use
Swap Cast(uByte Ptr,@x), Cast(uByte Ptr,@x)[iRange((i),(sizeof(x)-1))]
then I have a chance of repeats
ulongint .07%
ulong 4.3%
ushort 50%

@All.
I notice that if I compile with -pp option (to expand any macros) then the ull prefix is lost.
So I get an warning with the above code;
warning 35(1): Mixing signed/unsigned operands
Probably not very important generally, but for de bugging it could get in the way maybe.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A brain teaser

Post by deltarho[1859] »

@dodicat

Code: Select all

Swap Cast(uByte Ptr,@x)[i], Cast(uByte Ptr,@x)[iRange((i+1),(sizeof(x)-1))])
is not a Knuth shuffle.

The element i should be allowed to shuffle with itself. So, we should use 'iRange((i),' and not 'iRange((i+1),'. This has been subject to much discussion since the Fisher-Yates shuffle was introduced in 1938.

I need to reread codeJeff's and fxm's posts.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A brain teaser

Post by deltarho[1859] »

It seems uncertain that with coderJeff's comment "a byref ulongint (aka ulongint ptr) referenced through a ubyte ptr" whether that causes an issue or not. However, there is no issue with the union method, so expediency suggests that the union method should be preferred.

fxm's union code replaces dodicat's shuffle Sub and mixup Function and I will use that.

Thanks, guys.
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: A brain teaser

Post by dodicat »

Still, deltarho[], there are many repeats with the Knuth shuffle for only 8 elements.
Here is a comparison, mine, fxm.
I have made fxm's rnd_range into a macro to speed it up,, but you can use the function if you tweak the names.

Code: Select all

Function rnd_range2 (first As Double, last As Double) As Double
    Function = Rnd * (last - first) + first
End Function

#define rnd_range(first,last) Rnd * (last - first) + first

Sub shuffle(a() As Ubyte)
      #define range(f,l) Int(Rnd*(((l)+1)-(f))+(f))
      For n As Integer = Lbound(a) To Ubound(a)-2
            Swap a(n), a(range((n+1),Ubound(a)))
      Next n
End Sub

Function mixup(byval u As Ulongint) As Ulongint
      Union mix
            u As Ulongint
            Type
            As Ubyte b(1 To 8)
      End Type
End Union
Dim As mix m:m.u=u
shuffle(m.b())
Return m.u
End Function



Sub ShuffleUlongint( ByRef x As Ulongint )
  Union localUDT
    As Ulongint ul
    As Byte b(7)
  End Union
 
  Dim As localUDT Ptr p = Cptr(localUDT Ptr, @x)
  For i As Long = 0 to 6
    Swap p->b(i), p->b(Int(rnd_range(i,8)))
  Next
End Sub




dim as ulongint u,tmp,limit=10000000
dim as long counter
dim as double t
print "please wait . . ."
t=timer
for n as ulongint=1 to limit
      u=n
      tmp=u
      shuffleulongint(u)
      if tmp=u then counter+=1
next
print "repeats ";counter;tab(40);timer-t;"  seconds for fxm"

counter=0
t=timer
for n as ulongint=1 to limit
       u=n
      tmp=u
      u=mixup(u)
      if tmp=u then counter+=1
next
print "repeats ";counter;tab(40);timer-t;"  seconds for dodicat"
print "Done"
sleep


 
 
my method will also have some repeats, 256 I can think of at the moment in the ulongint range
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A brain teaser

Post by deltarho[1859] »

@dodicat

If we change fxm's code to
Swap p->b(i), p->b(Int(rnd_range(i+1,8)))
we also get zero repeats.

That is not a Knuth shuffle - it is Sattolo's algorithm.

It introduces a bias where no element can ever end up in its original position. In other words, it will not produce each possible permutation with equal probability.

There may be a good reason to use Sattolo's algorithm. If we were looking at shuffling a deck of cards, for example, then a Knuth shuffle should be used. If an online casino used Sattolo's algorithm they could find themselves in trouble.

I am using shuffling with seeds for obfuscation purposes, so it probably doesn't matter which we use. I chose to use a Knuth shuffle because I am a fan of equal probability. Image
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: A brain teaser

Post by dodicat »

Old Sattolo's doesn't look too bad probability wise.
Plus it is efficient, no swapping something with itself.
Knuth of course shares similar probabilities.
score 0 0 footie ??.

Code: Select all

screen 20
Sub shuffle(a() As Ubyte)
      #define range(f,l) Int(Rnd*(((l)+1)-(f))+(f))
      For n As Integer = Lbound(a) To Ubound(a)-2
            Swap a(n), a(range((n+1),Ubound(a)))
      Next n
End Sub

dim as ubyte b(1 to 8)={1,2,3,4,5,6,7,8}' average=4.5
redim as double test(1 to 8)
redim as double average(1 to 8)
dim as long counter
randomize
for k as long=1 to 20
    counter=0
do
    counter+=1
    shuffle(b())
    for n as long=1 to 8
        test(n) += b(n)
        next
    loop until counter=1000000
    
    
    for n as long=1 to 8
        average(n)+=test(n)/counter
        print test(n)/counter;" ";
    next
    
    'print
    redim test(1 to 8)
    print
next k
print "averages"
for n as long=1 to 8
       print  csng(average(n)/20);" ";
    next
    dim as double acc
    for n as long=1 to 8
        acc+=average(n)/20
        next
    print
    print "overall average ";acc/8
    sleep
 
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A brain teaser

Post by deltarho[1859] »

@dodicat

Um, not sure what you are trying to prove.

Sattolo introduced his variation in 1986, 48 years after Fisher-Yates hit the streets.

Many websites explain Sattolo's algorithm, and some compare it with the Knuth shuffle. However, none of them give any reason to employ Sattolo rather than Knuth.

With Knuth if we have n elements then we have a potential of n! permutations. With n = 8 we have then 8! (=40,320). With Sattolo if we have n elements then we have a potential of (n-1)! permutations. With n = 8 we have then 7! (=5040). Sattolo is then rejecting 87.5%.

Neither of them is cryptographic. However, if we were under attack, an attacker would be looking at brute forcing 40,320 permutations with Knuth and 5040 permutations with Sattolo. I'd put my money on Knuth.

The worldwide consensus is that Knuth is a shuffle - Sattolo is not. Disallowing an element to swap with itself is not a proper shuffle - it is called a derangement.

At the moment I cannot think of a reason why we should disallow an element swapping with itself, and I cannot find anyone advocating that in certain circumstances. Suppose we have six pointers, and we want to shuffle them such that no pointer remains unmoved. Sattolo will do the job. Why would anyone want to do that? I have no idea - but they have. Image

I don't know if Sattolo wrote a paper on his algorithm because if he did, I'd like to see it assuming that he gives a reason for using it.

With a deck of cards, we get 52! permutations with Knuth and 51! permutations with Sattolo. 51! is still a large number, but that gives a 98.07% rejection. That is many permutations being 'thrown out'.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: A brain teaser

Post by deltarho[1859] »

Yours truly wrote:Sattolo introduced his variation in 1986
In fact, Sattolo is a lady - Sandra Sattolo. Her paper is "An algorithm to generate a random cyclic permutation" published 30 May 1986 in 'Information Processing Letters' by Elsevier B.V. It is copyrighted and requires rights to read it. However, it is supposed to be available online 6 March 2003, but I cannot find it.

Anyway, I am not going to keep searching. Basically, Knuth gives all possible permutations and Sattolo gives a subset where elements are not allowed to swap with themselves. We have to decide which is the more appropriate - a shuffle or a derangement. It is not subject to an opinion - they are mutually exclusive. Image
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: A brain teaser

Post by dodicat »

deltarho[]
The Sattolo is a tiny bit different, first loop is to end-1, I had end-2
https://danluu.com/sattolo/
I note the property of sattolo is that no number ends up in the same place after a shuffle.
VIZ:

Code: Select all


Sub shuffle(a() As Ubyte)
      #define range(f,l) Int(Rnd*(((l)+1)-(f))+(f))
      For n As Integer = Lbound(a) To Ubound(a)-1
            Swap a(n), a(range((n+1),Ubound(a)))
      Next n
End Sub

dim as byte c(1 to 8)={1,2,3,4,5,6,7,8}
dim as long counter
do
      counter+=1
      locate 1
      print counter
dim as ubyte b(1 to 8)={1,2,3,4,5,6,7,8}

shuffle(b())
for n as long=1 to 8
      if b(n)=c(n) then print c(n)
      next
loop 
Also no repeats in the ulongint range when shuffling a ulongint, except for those 255 exceptions where all the bytes have the same value in the union.
Anyway, I'll move on anew now.
Post Reply