Switch Net 4 neural network
I programmed the Switch Net 4 neural network with backpropagation training in FreeBasic.
There are 2 versions in the .zip file.
A: A generic FreeBasic version.
B: A Linux AMD64 version with assembly language speed-ups.
https://archive.org/details/switch-net-4-bpfb
There is a blog post about it here:
https://ai462qqq.blogspot.com/2023/04/s ... mbine.html
And there is also Switch Net:
https://ai462qqq.blogspot.com/2023/04/switch-net.html
Re: Switch Net 4 neural network
Tried adding my bmp handler .. works ok.
Segfault after I press T.
Got bmp images loading. Not really sure what SwitchNet does, but I'm sure it's cool.
Back to my run length blit xD
Re: Switch Net 4 neural network
I think it was you who suggested randomizing, comparing, and resetting if unfavorable.
I just thought of this method for sparse randomization. It's good for someone like me who doesn't grok backprop.
It's also fast:
Code: Select all
'#include "dsi/boilerplate.bas"

union suspicion_suppressor ' suspicious ptr tango
    as any ptr a
    as ubyte ptr b
    as ushort ptr sho
    as ulong ptr l
    as ulongint ptr li
    As Single Ptr s
    as double ptr d
End Union

dim shared as suspicion_suppressor gp

function sbin(p as any ptr, cBytes as long = 1) as string
    var s = ""
    gp.a = p + cbytes - 1 '' "most-significant" first
    for j as long = 1 to cBytes
        for i as long = 7 to 0 step -1
            s += str((*gp.b shr i) and 1)
        next: gp.a -= 1
    next
    return s
end function

'#cmdline "-gen gcc -O 2"
Function popcnt(x As Ulongint) As Ulong
    ' https://freebasic.net/forum/viewtopic.php?p=294606#p294606
    If x=18446744073709551615 Then Return 64
    x -= ((x Shr 1) And &h5555555555555555ull)
    x = (((x Shr 2) And &h3333333333333333ull) + (x And &h3333333333333333ull))
    x = (((x Shr 4) + x) And &hf0f0f0f0f0f0f0full)
    x += (x Shr 8)
    x += (x Shr 16)
    x += (x Shr 32)
    return x And &h0000003full
End function

#macro SetQsort(datatype,fname,dot) '' dodicat's qsort, dafhi mod
Sub fname(array() As datatype,begin as longint,Finish as longint)
    static as typeof( datatype dot ) pivot
    static as longint j
    Dim as longint i = begin
    j=finish: pivot = array((I+j)\2)dot
    While I < j '' i <= j
        While array(I)dot direction pivot:I+=1:Wend
        While pivot direction array(j)dot:j-=1:Wend
        If I<=j Then Swap array(I),array(j): I+=1:j-=1
    wend
    If j > begin Then fname(array(),begin,j)
    If I < Finish Then fname(array(),I,Finish)
End Sub
#endmacro
' ------------------- boilerplate

type tUBYTEPOP field = 1
    as ubyte A
    as ubyte C
end type

dim as tUBYTEPOP xormix(255)
for i as long = 0 to ubound(xormix)
    xormix(i).A = i
    xormix(i).C = popcnt(i)
next

'' sort according to bit population
#define direction <
setqsort( tUBYTEPOP, qsort, .C )
qsort xormix(), 0, 255

'' verify sort
#if 0
for i as long = 0 to 255
    ? xormix(i).C; " ";
next
?
#endif

'' xor some data
dim as ubyte a(40)
for i as long = 0 to ubound(a)
    dim as single s = rnd
    a(i) xor= xormix(255.499*s^5).A
next

? sbin( @a(0), ubound(a) + 1)
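And here's the bare bones of the randomize / compare / reset idea from the top of this post, as I picture it. cost() and params() are just stand-ins I made up, not anything from the Switch Net archive:
Code: Select all
'' bare-bones hill climb: randomize a weight, compare cost, reset if unfavorable.
'' params() and cost() are made-up placeholders, not from the Switch Net code.
dim shared as single params(255)

function cost() as double
    '' placeholder error measure: drive every weight toward 0.5
    dim as double c
    for i as long = 0 to ubound(params)
        c += (params(i) - .5) ^ 2
    next
    return c
end function

randomize
dim as double best = cost()
for iter as long = 1 to 10000
    dim as long idx = rnd * ubound(params)   '' pick one weight (sparse mutation)
    dim as single old = params(idx)
    params(idx) += (rnd - .5) * .1           '' randomize
    dim as double c = cost()                 '' compare
    if c < best then
        best = c                             '' keep the improvement
    else
        params(idx) = old                    '' reset if unfavorable
    end if
next
print "final cost "; best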
Re: Switch Net 4 neural network
The example is set up to do autoassociation (same input and output) and you look at some validation inputs to see if the net produces novel outputs based on what it has learned. Obviously there is a scaling issue because no network can generalize unless it has seen enough data during training. I've run it on 125 images of mountains and it was starting to generalize a bit. Probably if the images were more similar, like mugshots, it would give quicker results. If I could leave a computer running for a week or so I could try it with a few thousand images, but that isn't going to happen. This is where it would be good to have a few Raspberry Pis kicking around.
I eventually converted to using backpropagation, but I just implemented it my way, whatever looked like the simplest way that would work. I didn't use the chain rule or whatever, I just intuited.
One aspect of the net is using sub-random projections and for that you need sub-random patterns of sign flips.
I'm still doing experiments with that.
For example using sub-random additive recurrence and shift with xor.
That compiles with -O 3 to some very fast assembly language when you look with the -RR option:
Code: Select all
' 3.879634e+12
sub subrandomxor1(result as single ptr,x as single ptr,n as ulongint)
    dim as ulong ptr resultptr=cast(ulong ptr,result)
    dim as ulong ptr xptr=cast(ulong ptr,x)
    dim as ulong r=&hE81D90AE,rs,shift=18
    for i as ulongint=0 to n-1 step 4
        rs+=r
        resultptr[i]=xptr[i] xor ((rs xor (rs shl shift)) and &h80000000)
        rs+=r
        resultptr[i+1]=xptr[i+1] xor ((rs xor (rs shl shift)) and &h80000000)
        rs+=r
        resultptr[i+2]=xptr[i+2] xor ((rs xor (rs shl shift)) and &h80000000)
        rs+=r
        resultptr[i+3]=xptr[i+3] xor ((rs xor (rs shl shift)) and &h80000000)
    next
end sub

'3.808423e+12
sub subrandomxor2(result as single ptr,x as single ptr,n as ulongint)
    dim as ulong ptr resultptr=cast(ulong ptr,result)
    dim as ulong ptr xptr=cast(ulong ptr,x)
    dim as ulong r=&h105D7152,rs,shift=18
    for i as ulongint=0 to n-1 step 4
        rs+=r
        resultptr[i]=xptr[i] xor ((rs xor (rs shl shift)) and &h80000000)
        rs+=r
        resultptr[i+1]=xptr[i+1] xor ((rs xor (rs shl shift)) and &h80000000)
        rs+=r
        resultptr[i+2]=xptr[i+2] xor ((rs xor (rs shl shift)) and &h80000000)
        rs+=r
        resultptr[i+3]=xptr[i+3] xor ((rs xor (rs shl shift)) and &h80000000)
    next
end sub
dim shared as single x(65535),y(65535)

for i as ulong=0 to 65535
    y(i)=1
next

for k as ulong=0 to 20
    var t1=timer
    for i as ulong=0 to 999
        subrandomxor2(@x(0),@y(0),65536)
    next
    var t2=timer
    print "per second",1000/(t2-t1)
next

for i as ulong=0 to 65535
    y(i)=1
next

screenres 400,400,32
subrandomxor1(@x(0),@y(0),65536)
subrandomxor2(@y(0),@y(0),65536)
for i as ulong=0 to 65535
    dim as ulong xi=i and 255
    dim as ulong yi=i shr 8
    dim as ulong r=iif(x(i)>0,255,0)
    dim as ulong b=iif(y(i)>0,255,0)
    pset (xi,yi),rgb(r,(r+b) shr 1,b)
next
getkey
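The xor with the masked &h80000000 bit just flips the IEEE 754 sign bit of the Single whenever the generator's top bit is set, i.e. it negates the value. A tiny stand-alone check, not part of the net:
Code: Select all
'' flipping the sign bit of a 32-bit float negates it
dim as single f = 1.5
dim as ulong ptr p = cast(ulong ptr, @f)
*p xor= &h80000000
print f  '' -1.5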
Re: Switch Net 4 neural network
This version may work with Windows, unless there is a folder separator issue, like "/" versus "\". According to the documentation it should be OK. The code uses the .bmp image format.
https://archive.org/details/sw-net-4-bpgenericbmp
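If the separator does turn out to be a problem, one fix (sketch only, not what the archive code currently does) is to select it at compile time:
Code: Select all
'' compile-time folder separator (illustration only)
#ifdef __FB_WIN32__
    #define PATH_SEP "\"
#else
    #define PATH_SEP "/"
#endif

dim as string fname = "images" & PATH_SEP & "sample.bmp"
print fname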
Re: Switch Net 4 neural network
Might the enduring question of how to accurately generate signed magnitude from the FFT be answered through the application of Switch Net to a large number of samples?
Re: Switch Net 4 neural network
Forgot to mention.
On Ubuntu 22.04, after installing libsdl2-dev, you need to change
Code: Select all
#include "SDL/SDL.bi"
#include "SDL/SDL_image.bi"
to
Code: Select all
#include "SDL2/SDL.bi"
#include "SDL2/SDL_image.bi"
in the file image.bas.
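If you would rather not edit the file each time, a compile-time switch is another option (sketch only; it assumes the rest of image.bas uses only calls present in both bindings):
Code: Select all
'' pick the SDL binding when compiling, e.g.  fbc -d USE_SDL2 image.bas
#ifdef USE_SDL2
    #include "SDL2/SDL.bi"
    #include "SDL2/SDL_image.bi"
#else
    #include "SDL/SDL.bi"
    #include "SDL/SDL_image.bi"
#endif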
Re: Switch Net 4 neural network
The link you gave to the Switch Net WHT is interesting; however, comments within your code would be appreciated.
Some other examples might elucidate what's involved; in particular, a set of vector inputs for training and testing rather than the imposing images.
With vectors and arrays of the appropriate dimension you can print out values, possibly in a dynamic way.
(Nov 4) Switch Net 4 neural network
I converted one of his JS demos:
Code: Select all
/'
switchnet 4 demo - translation by dafhi - 2023 Nov 15
update:
commented out 5 Lines
plans:
1. line 265 i will probably be able to eliminate
2. line 268 - document my process for discovering optimal values
'/
#define sng as single
#define dbl as double
function round( f sng, places as byte = 0) as string '' dafhi 2023 Nov 4
dim as long pow10 = 10^(places+0)
dim as string s = str( int(f*pow10+.5)/pow10 )
for i as long = 0 to len(s)-1
if s[ i ] = 46 then return left(s,i+places+1)
next
return str(f)
end function
type SwNet4
/'
original
https://editor.p5js.org/congchuatocmaydangyeu7/sketches/IIZ9L5fzS
'/
'' vecLen must be 4,8,16,32.....
declare sub setup( as long, as long )
declare sub recall( () as single, byref as single ptr ) '' () = array
declare sub _abcd( () as single, as long )
as long depth
as single scale
as byte flips( any ) '' Nov 4
as single params( any )
as single _a,_b,_c,_d
as long _paramIdx, _j_base
end type
sub SwNet4.setup( vecLen as long, _depth as long)
depth = _depth
scale = 1 / sqr( vecLen shr 2 )
redim flips( vecLen - 1 )
for i as long=0 to vecLen-1
this.flips(i)= iif(rnd < .5, -1, 1 ) '' Nov 4
next
redim params ( 8 * vecLen * depth - 1 )
var j = 0
for i as long = 0 to ubound(this.params) step 8
this.params(i+j) = this.scale
this.params(i+4+j) = this.scale
j=(j+1) and 3
next
end sub
'' Fast Walsh Hadamard Transform
sub wht( vec() as single, hs_pow2_shl as long = 0 ) '' 1 or 2 for Partial Fast W.H.T.
dim as long n = ubound(vec)+1
var hs = 1 shl hs_pow2_shl
while (hs < n)
var i = 0
while (i < n)
var j = i + hs
while (i < j)
var a = vec(i)
var b = vec(i + hs)
vec(i) = a + b
vec(i + hs) = a - b
i += 1
wend
i += hs
wend
hs += hs
wend
end sub
sub SwNet4._abcd( result() as single, j as long )
_paramIdx += 8
dim sng x=result( j+_j_base )
if(x<0)then
_a+=x*this.params(_paramIdx)
_b+=x*this.params(_paramIdx+1)
_c+=x*this.params(_paramIdx+2)
_d+=x*this.params(_paramIdx+3)
else
_a+=x*this.params(_paramIdx+4)
_b+=x*this.params(_paramIdx+5)
_c+=x*this.params(_paramIdx+6)
_d+=x*this.params(_paramIdx+7)
endif
_j_base += 1
end sub
sub SwNet4.recall( result() as single, byref inVec as single ptr )
for i as long = 0 to ubound(result)'this.vecLen-1
'' Nov 4
result(i) = inVec[i] * (scale/9) * this.flips(i) 'iif( this.flips(i), 1, -1 )
next
wht( result() )
_paramIdx = -8
for i as long = 0 to this.depth - 1
for j as long = 0 to ubound(result) step 4
_j_base = 0
_a=0:_b=0:_c=0:_d=0
_abcd result(), j
_abcd result(), j
_abcd result(), j
_abcd result(), j
result(j)=_a
result(j+1)=_b
result(j+2)=_c
result(j+3)=_d
next
const pow2_shl = 2 '' August 1
wht( result(), pow2_shl )
next
end sub
function costL2(vec() as single, byref tar as single ptr) as double
dim dbl cost
for i as long= 0 to ubound(vec)-1
var e = vec(i) - tar[i]
cost += e*e
next
return cost
end function
type hyperparam
declare operator cast dbl
declare sub advance
' declare sub reset
as double f
as double _v
as double _rate
end type
sub hyperparam.advance
_v += _rate
f = 1 / _v
end sub
'sub hyperparam.reset
' _v = .15 - _rate
' advance
'end sub
operator hyperparam.cast dbl
return 1+f
end operator
sub val_and_rate( byref h as hyperparam, f dbl, r dbl )
h._v = 1 / f - r
h._rate = r
h.advance
end sub
dim shared as hyperparam muta_size
dim shared as hyperparam flip_chance
type mutator
declare constructor( as long, as long, as single )
declare sub mutate( byref net as SwNet4 )
declare sub undo( byref net as SwNet4 )
as long size, precision
as single limit
as single previous( any)
as long pIdx( any)
end type
constructor mutator( size as long, precis as long, limit as single )
redim this.previous( size-1 )
redim this.pIdx( size-1 )
this.precision = precis
this.limit = limit
end constructor
sub mutator.mutate( byref net as SwNet4 )
dim sng sc = ubound(net.params) + .499 '' c++ .999
dim as long rpos, rpos2, c
dim sng vm, m
'' previous() and pIdx() detail a small set for mutation
for i as long= 0 to ubound( this.pidx ) ' relatively small array
rpos = rnd * sc ' random elem
this.pIdx(i) = rpos '' muta location
this.previous(i) = net.params(rpos) '' save pre-mutate
m = 2 * this.limit * exp(rnd*-this.precision)
vm = net.params(rpos) + iif(rnd<.5,m,-m)
if (vm > this.limit)orelse(vm < -this.limit) then continue for
net.params(rpos) = vm
next
end sub
sub mutator.undo(byref net as SwNet4)
for i as long = ubound(previous) to 0 step -1
net.params( pIdx(i) ) = previous(i)
next
end sub
namespace demo
const iter_max = 2499
dim sng ex(8,255) '' namespace globals
dim sng work(255)
dim dbl parentCost = 1/0
dim as long w,h
dim as ulong c1,c2 = rgb(255,255,0)
dim as SwNet4 parentNet
sub setup()
w = 400
h = 400
screenres w,h,32
parentNet.setup 256, 2
const tau = 8*atn(1)
for i as long = 0 to 127
'' Training data
dim sng t = (i * tau) / 127
ex(0,2 * i) = sin(t)
ex(0,2 * i + 1) = sin(2 * t)
ex(1,2 * i) = sin(2 * t)
ex(1,2 * i + 1) = sin(t)
ex(2,2 * i) = sin(2 * t)
ex(2,2 * i + 1) = sin(3 * t)
ex(3,2 * i) = sin(3 * t)
ex(3,2 * i + 1) = sin(2 * t)
ex(4,2 * i) = sin(3 * t)
ex(4,2 * i + 1) = sin(4 * t)
ex(5,2 * i) = sin(4 * t)
ex(5,2 * i + 1) = sin(3 * t)
ex(6,2 * i) = sin(2 * t)
ex(6,2 * i + 1) = sin(5 * t)
ex(7,2 * i) = sin(5 * t)
ex(7,2 * i + 1) = sin(2 * t)
next
'' idea based on run time (iter_max)
const sng f = .99 * (2499 / iter_max) ^ .25
'' hyperparams 2023 Nov 4
val_and_rate muta_size, 255 * .075*f, .0000025*f
end sub
dim as long frame
sub draw()
var precision = 35
var limit = 2 * parentNet.scale
'' mutator with hyperparams. 2023 Nov 4 - by dafhi
dim as Mutator mut = type( muta_size, precision, limit )
for i as long = 0 to 24 '' 100 originally. reduced for some cpu sleep
mut.mutate( parentNet )
dim dbl cost = 0
for j as long = 0 to 7
parentNet.recall( work(), @ex(j,0) )
cost += costL2( work(), @ex(j,0) )
next
if (cost < parentCost) then
parentCost = cost
else
mut.undo( parentNet )
endif
next
muta_size.advance
cls
locate 2,1
? "Training Data"
for i as long = 0 to 7
for j as long= 0 to 255 step 2
var y=44 + 18 * ex(i,j + 1)
pset (25 + i * 40 + 18 * ex(i,j), y), c2
next
next
locate 10,1
? "Recall"
for i as long = 0 to 7
parentNet.recall( work(), @ex(i,0) )
for j as long = 0 to 255 step 2
pset(25 + i * 40 + 18 * work(j), 104 + 18 * work(j + 1)), c2
next
next
frame += 1
locate 30,1
?"Iterations: "; frame; " of "; iter_max
?"Cost: "; round( parentCost, 3 )
?
?"mutator size " + round( muta_size, 3)
end sub
end namespace
randomize
demo.setup
for i as long = 1 to demo.iter_max
demo.draw
sleep 1
var kstr = lcase(inkey)
select case kstr
case ""
case else
exit for
end select
next
?
? "done!"
sleep
Re: Switch Net 4 neural network
I ran the JavaScript version online a few days ago; after a significant amount of time had passed, the cost had reached a value of 0.002 and didn't go lower than that.
I then wondered if this had resulted in a network that was overfitted for the particular training data that was used.
The translation you've provided runs at a good pace on my computer and appears to return meaningful results.
Re: Switch Net 4 neural network
[edit 2] - his undo arrays record everything
[edit] .. interesting that you ran it that long. maybe it got 'stuck' because of rounding
Re: Switch Net 4 neural network
Eventually your code reached a cost of 0.001; it may even go lower.
After scrutinising the diagram that appears at:
https://ai462qqq.blogspot.com/2023/04/s ... mbine.html
I've written a FreeBasic program that implements many of the features illustrated, including random sign flips, WHT(), switched 4-element-per-bank weights, and assignment of signed random weights to the arrays Wp() and Wn().
Assuming this is functioning as intended, what other routines are required?
Increase the width [size?] of the SW4 net?
Zero-extend the input array?
Adjust weights?
Re: Switch Net 4 neural network
The only thing I can think of, after having briefly looked at the blog (again), is the suggestion of making the net a few times wider than the input.
Have you coded a weights adjuster? GreenInk's mutator concept is really great.
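For the "wider than input" part, what I picture is just zero-extending the input into a wider power-of-two work vector before the sign flips and WHT. Sketch only; the sub and variable names are mine, not from the archive code:
Code: Select all
'' sketch: pad a shorter input into the wider power-of-two vector the net works on.
'' zeroExtend, inVec() and work() are made-up names.
sub zeroExtend( work() as single, inVec() as single )
    for i as long = 0 to ubound(work)
        if i <= ubound(inVec) then
            work(i) = inVec(i)
        else
            work(i) = 0
        end if
    next
end sub

'' e.g. a 100-element input fed to a 256-wide net
dim as single inVec(99), work(255)
for i as long = 0 to ubound(inVec): inVec(i) = rnd: next
zeroExtend work(), inVec()
print ubound(work) + 1; " element work vector,"; ubound(inVec) + 1; " of them copied from the input"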
Re: Switch Net 4 neural network
I also read that about increasing the width; of what exactly, the weights between the WHTs? In the diagram this looks like groups of vectors rather than something like a matrix.
I haven't implemented a weight adjuster yet; I noted what the author said about 'intuiting' that aspect of the network, therefore I'm seeking some guidance. The mutator that you mention might be one approach.
My code is different in style to the code thus far referenced or presented. I haven't thoroughly tested the routines; I suppose a comparison of some sort is possible.
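For instance, one comparison that suggests itself is checking a fast WHT routine against the direct definition on random vectors. The wht() below is the same routine as in the translation above; whtDirect() and the other names are my own sketch:
Code: Select all
'' wht() as in the translation above; whtDirect() is the O(n^2) reference
'' built from the definition H(k,j) = (-1)^popcount(k and j)
sub wht( vec() as single )
    dim as long n = ubound(vec) + 1
    dim as long hs = 1
    while hs < n
        dim as long i = 0
        while i < n
            dim as long j = i + hs
            while i < j
                dim as single a = vec(i)
                dim as single b = vec(i + hs)
                vec(i) = a + b
                vec(i + hs) = a - b
                i += 1
            wend
            i += hs
        wend
        hs += hs
    wend
end sub

sub whtDirect( outv() as single, inv() as single )
    dim as long n = ubound(inv) + 1
    for k as long = 0 to n - 1
        dim as single acc = 0
        for j as long = 0 to n - 1
            dim as long bits = j and k, parity = 0
            while bits <> 0
                parity xor= (bits and 1)
                bits shr= 1
            wend
            if parity = 0 then acc += inv(j) else acc -= inv(j)
        next
        outv(k) = acc
    next
end sub

randomize
const N = 16
dim as single a(N-1), fast(N-1), slow(N-1)
for i as long = 0 to N - 1
    a(i) = rnd - .5
    fast(i) = a(i)
next
wht fast()
whtDirect slow(), a()

dim as single maxerr
for i as long = 0 to N - 1
    if abs(fast(i) - slow(i)) > maxerr then maxerr = abs(fast(i) - slow(i))
next
print "max difference:"; maxerr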