FB_ML_libs project (formerly known as libLINREG)

User projects written in or related to FreeBASIC.
ron77
Posts: 197
Joined: Feb 21, 2019 19:24
Location: Israel
Contact:

FB_ML_libs project (formerly known as libLINREG)

Postby ron77 » Oct 21, 2020 10:15

hello :)

I am trying to convert a simple linear regression program from Python to FB, as part of my effort to create an ML linear regression library for FreeBASIC.
The intention is to make an ML linear regression library in FB: once we (hopefully) have working linear regression program code in FB, the next step is to turn it into a library (libLINREG), with a .bi header file and a libLINREG.a file, so that no one in FB has to code this headache over and over when adding linear regression calculations to their data / programs.
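
For reference, the algorithm the library is meant to wrap boils down to two coefficients: b1 = covariance(x, y) / variance(x) and b0 = mean(y) - b1 * mean(x). Here is a minimal, illustrative Python sketch of that calculation (my own function names, mirroring the tutorial linked below):

```python
# Simple linear regression "from scratch":
# b1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
# b0 = mean_y - b1 * mean_x

def mean(values):
    return sum(values) / len(values)

def coefficients(xs, ys):
    mx, my = mean(xs), mean(ys)
    covar = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    b1 = covar / var
    b0 = my - b1 * mx
    return b0, b1

# The 5-point test.csv used later in this thread:
xs = [1, 2, 4, 3, 5]
ys = [1, 3, 3, 2, 5]
b0, b1 = coefficients(xs, ys)  # b0 ~ 0.4, b1 ~ 0.8
```

The planned libLINREG would expose the FB equivalent of `coefficients` (plus prediction and error metrics) behind the .bi header.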

The source of the Python code is this site: https://machinelearningmastery.com/implement-simple-linear-regression-scratch-python/

For more details, please visit this post: https://www.freebasic.net/forum/viewtopic.php?f=14&t=28901

Project repository on GitHub: https://github.com/ronblue/libLINREG

Any help or contribution is welcome; this is an open-source community project...

Here is the forum thread where we are working on the conversion/translation of the code from Python to FB: https://www.freebasic.net/forum/viewtopic.php?f=2&t=28908


UPDATE 09/12/2020: we have another ML library in FreeBASIC, a KNN algorithm for analysis of CSV datasets, thanks to the community's help...
The repository (open source, MIT license) is here: https://github.com/ronblue/FB_KNN_lib


UPDATE 16/11/2020: THE PROJECT HAS BEEN SUCCESSFULLY COMPLETED. IN THE PROJECT'S REPOSITORY YOU WILL FIND A WORKING LIBRARY WITH DOCUMENTATION. THANKS FOR ALL OF YOUR HELP <3


update (15/11/2020):
- My teacher and I succeeded in combining dodicat's linear regression code example with Paul doe's "load csv" code, which reads dataset CSV files, into one working program.
- The next step is learning how to turn the code into an FB library, with a .bi header file and a lib*.a library file.
- The resulting working code, with an example dataset.csv, is posted here.
update (31/10/2020):
Here is an update: I'm taking private FreeBASIC programming lessons with my old teacher, who taught and helped me in the past when I first started learning to program. We are covering FB OOP and UDTs with the aim of completing this project, that is: a) a program that performs a simple linear regression, and b) a linear regression library in FreeBASIC. The lessons cover advanced FB programming topics so that I will know what I'm doing and be able to complete such projects without asking for help or relying on someone else... to become more independent and self-reliant as I progress in FB coding.
As for Tourist Trap's attempt to convert the Python code, I guess it didn't work out in the end; I really don't know where that stands. In any case, I will attempt to code such a program as soon as possible, and if I succeed I'll post it in the forum...
update (23/10/2020), version 0.0.2.1:
- Paul doe posted working code that reproduces the results of the Python code (including presenting the results as a graph), proving that an FB linear regression program is possible. I invited Paul doe to become a collaborator on the project; I hope he will accept. The most important part of Paul doe's code is the math equations function, which I hope will be the core of the library in the end...
- The old code from the scrapped attempts to convert the Python code is now in the "source/ARCHIVE" folder.
- It looks like there is no option but to use UDTs and OOP in order to get a result equivalent to the Python linear regression code.
- Ownership of the project's repository has been transferred to Tourist Trap.
update (22/10/2020), version 0.0.2:
- Tourist Trap joined the project as a collaborator.
- We decided to start from scratch and ditch the main_program.bas code that used OOP to load the CSV file...
- Now we are converting/translating the Python code one to one; the code version is now 0.0.2.
version 0.0.1 (21/10/2020):
- Checked that the Python code example works, and it does (see "py_example/py_linear_example").
- A dataset CSV of Swedish insurance data was made in "datasets\dataset.csv"; this will be the test and train data.
- Source code is in "source\Load_csv.bas": an attempt to convert the Python program to working FB code (so far, the CSV data is loaded into the program, and the Python code for finding the means, variances, and covariance of the data has been converted).
update 21/10/2020:
- Today I tried to convert more of the Python code to FB and add it to the .bas source, with partial success. Help from others is needed.
- It seems that the Python code is more accurate than the FB code when using the simple test.csv file.
- Found a VB code example of simple linear regression here: https://www.centerspace.net/examples/nm ... xample.php ; however, I believe I should continue converting the Python code...
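
A note on terminology in the code below: like the tutorial, it uses "variance" and "covariance" for the raw sums of squared / cross deviations, without dividing by n; the factor of n cancels when forming b1 = covariance / variance. A quick illustrative Python check of these quantities on the test.csv data:

```python
def mean(values):
    return sum(values) / len(values)

# As in the tutorial, these are raw sums of deviations (not divided by n);
# the factor of n cancels when forming b1 = covariance / variance.
def variance(values, m):
    return sum((v - m) ** 2 for v in values)

def covariance(xs, mx, ys, my):
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

xs = [1, 2, 4, 3, 5]   # test.csv, x column
ys = [1, 3, 3, 2, 5]   # test.csv, y column
mx, my = mean(xs), mean(ys)        # 3.0 and 2.8
print(variance(xs, mx))            # 10.0
print(covariance(xs, mx, ys, my))  # ~8.0
```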


test.csv file:

Code:

1, 1
2, 3
4, 3
3, 2
5, 5



dataset.csv

Code:

108,392.5
19,46.2
13,15.7
124,422.2
40,119.4
57,170.9
23,56.9
14,77.5
45,214
10,65.3
5,20.9
48,248.1
11,23.5
23,39.6
7,48.8
2,6.6
24,134.9
6,50.9
3,4.4
23,113
6,14.8
9,48.7
9,52.1
3,13.2
29,103.9
7,77.5
4,11.8
20,98.1
7,27.9
4,38.1
0,0
25,69.2
6,14.6
5,40.3
22,161.5
11,57.2
61,217.6
12,58.1
4,12.6
16,59.6
13,89.9
60,202.4
41,181.3
37,152.8
55,162.8
41,73.4
11,21.3
27,92.6
8,76.1
3,39.9
17,142.1
13,93
13,31.9
15,32.1
8,55.6
29,133.3
30,194.5
24,137.9
9,87.4
31,209.8
14,95.5
53,244.6
26,187.5


Working code that reads dataset.csv, performs a simple linear regression, and shows the result in a graph. We hope to make a library out of this working code:

Code:

#include once "file.bi"

TYPE ListPair
      As Double x,y
End TYPE

'APPEND TO the listpair array the ListPair item
SUB pAPPEND(arr() AS ListPair , Item AS ListPair)
   REDIM PRESERVE arr(LBOUND(arr) TO UBOUND(arr) + 1) AS ListPair
   arr(UBOUND(arr)) = Item
END SUB

SUB loadDataset( byref path as const string , p() AS ListPair)
  'dim as ListPairTable t
   'Dim As ListPair p()
 
 
  if( fileExists( path ) ) then
    dim as long f = freeFile()
   
    open path for input as f
   
    do while( not eof( f ) )
      dim as ListPair d
     
      input #f, d.x
      input #f, d.y
      PAPPEND p(), d           
    LOOP
    CLOSE #f
  end if
 
end SUB


Function mean(p() As ListPair) As ListPair
      Dim As ListPair pt
      For n As Long=Lbound(p) To Ubound(p)
            pt.x+=p(n).x
            pt.y+=p(n).y
      Next n
      Var sz=(Ubound(p)-Lbound(p)+1)
      Return Type(pt.x/sz,pt.y/sz)
End Function

Function Gradient(p() As ListPair) As Double
      Dim As Double CoVariance,Variance
      Dim As ListPair m=mean(p())
      For n As Long=Lbound(p) To Ubound(p)
            CoVariance+=(p(n).x-m.x)*(p(n).y-m.y)
            Variance+=(p(n).x-m.x)^2           
      Next n
      Return CoVariance/Variance
End Function

Function intercept(p() As ListPair,grad As Double) As Double
      Var m=mean(p())
      Return  m.y-grad*m.x
End Function

Function RMSerror(p() As ListPair,m As Double,c As Double,res() As Double) As Double
      Dim As Double acc
      Redim res(Lbound(p) To Ubound(p))
      For n As Long=Lbound(p) To Ubound(p)
            res(n)=m*p(n).x+c
            acc+=(p(n).y-res(n))^2
      Next n
      acc/=(Ubound(p)-Lbound(p)+1)
      Return Sqr(acc)
End Function


Function minmax(p() As ListPair,flag As String="x") As ListPair 'for plotting
      Dim As ListPair result
      Dim As Double d(Lbound(p) To Ubound(p))
      For n As Long=Lbound(d) To Ubound(d)
            If flag="x" Then  d(n)=p(n).x Else d(n)=p(n).y
      Next
      For n1 As Long=Lbound(d) To Ubound(d)-1
            For n2 As Long=n1+1 To Ubound(d)
                  If d(n1)>d(n2) Then Swap d(n1),d(n2)
            Next
      Next
      Return Type(d(Lbound(d)),d(Ubound(d)))
End Function

Sub plot(p() As ListPair,pred() As Double,xres As Integer,yres As Integer)
      #define map(a,b,x,c,d) ((d)-(c))*((x)-(a))/((b)-(a))+(c)
      #define xmap(z) map(minx,maxx,z,k,(xres-k))
      #define ymap(z) map(miny,maxy,z,k,(yres-k))
      Var minx=minmax(p(),"x").x,maxx=minmax(p(),"x").y
      Var miny=minmax(p(),"y").x,maxy=minmax(p(),"y").y
      Var k=100
      Line(k,k)-(xres-k,yres-k),8,b
      Dim As Double lxpos,lypos
      For n As Long=Lbound(p) To Ubound(p)
            Circle(xmap(p(n).x),ymap(p(n).y)),5,15,,,,f
            Circle(xmap(p(n).x),ymap(pred(n))),5,5,,,,f
            If n>Lbound(p) Then Line(xmap(p(n).x),ymap(pred(n)))-(lxpos,lypos),5
            Line(xmap(p(n).x),ymap(p(n).y))-(xmap(p(n).x),ymap(pred(n))) ,8
            lxpos=xmap(p(n).x)
            lypos=ymap(pred(n))
      Next n
End Sub

Sub GetRegressionLineAndShow(p() As ListPair,xres As Integer,yres As Integer)
      Var M= Gradient(p())  'get the gradient and intercept
      Var C=intercept(p(),M)
      Redim As Double predictions()
      'get the regression line points (predictions) and root mean square error
      Dim As Double e=RMSerror(p(),M,C,predictions())
      COLOR 5
      'y=Mx+C
      Print "Regression line:   y = ";M;"*x";Iif(Sgn(C)=1," +","");C
      PRINT
      PRINT "Predictions"
      For n As Long=Lbound(predictions) To Ubound(predictions)
            Print predictions(n);" ";
      Next
      Print
      Color 8
      Print "RMSE: ";e
      SLEEP
      CLS
      PLOT(p(),predictions(),xres,yres)
End Sub


SCREEN 20
'SCREENRES 1000,950

Dim As Integer xres,yres
Screeninfo xres,yres
Window(0,0)-(xres,yres)

REDIM p(any) AS ListPair
loadDataset( "D:\repo\FB_libLINREG\datasets\dataset.csv", p() )
GetRegressionLineAndShow(p(),xres,yres)


Sleep









Paul doe's code, which mimics the results of the Python code:

Code:

#include once "fbgfx.bi"

const as double _
  MIN_DBL = 4.940656458412465E-324, _
  MAX_DBL = 1.797693134862316E+308

enum Colors
  White = rgba( 255, 255, 255, 255 )
  Black = rgba( 0, 0, 0, 255 )
  Red = rgba( 205, 80, 80, 255 )
  LightBlue = rgba( 130, 182, 208, 255 )
  LightGray = rgba( 214, 214, 214, 0 )
end enum

type Rect
  declare constructor()
  declare constructor( _
    byval as double, byval as double, byval as double, byval as double )
 
  as double x, y, w, h
end type

constructor Rect() : end constructor
constructor Rect( _
  byval nX as double, byval nY as double, _
  byval nW as double, byval nH as double )
 
  x = nX : y = nY
  w = nW : h = nH
end constructor

''' Linear regression stuff

type Values
  declare operator cast() as string
  declare operator []( byval as integer ) byref as double
 
  as double _values( any )
  as integer count
end type

operator Values.cast() as string
  dim as string s
 
  for i as integer = 0 to count - 1
    s += str( _values( i ) ) + iif( i < count - 1, ",", chr( 13, 10 ) )
  next
 
  return( s )
end operator

operator Values.[]( byval index as integer ) byref as double
  return( _values( index ) )
end operator

type Dataset
  declare operator cast() as string
  declare operator []( byval as integer ) byref as Values
 
  as Values _values( any )
  as integer count
end type

operator Dataset.cast() as string
  dim as string s = ""
 
  for i as integer = 0 to count - 1
    s += _values( i )
  next
 
  return( s )
end operator

operator Dataset.[]( byval index as integer ) byref as Values
  return( _values( index ) )
end operator

function add overload( byref ds as Values, byval v as double ) byref as Values
  ds.count += 1
  redim preserve ds._values( 0 to ds.count - 1 )
  ds._values( ds.count - 1 ) = v
 
  return( ds )
end function

function add( byref ds as Dataset, byref v as Values ) byref as Dataset
  ds.count += 1
  redim preserve ds._values( 0 to ds.count - 1 )
  ds._values( ds.count - 1 ) = v
 
  return( ds )
end function

type Coefs
  declare constructor()
  declare constructor( byval as double, byval as double )
 
  declare operator cast() as string
 
  as double b0, b1
end type

constructor Coefs() : end constructor
constructor Coefs( byval cB0 as double, byval cB1 as double )
  b0 = cB0 : b1 = cB1
end constructor

operator Coefs.cast() as string
  return( "B0=" & b0 & ",B1=" & b1 )
end operator

private function max overload( byval a as double, byval b as double ) as double
  return( iif( a > b, a, b ) )
end function

function max( byref ds as Values ) as double
  dim as double value = -MAX_DBL ' MIN_DBL is the smallest *positive* double; -MAX_DBL also handles all-negative data
 
  for i as integer = 0 to ds.count - 1
    value = iif( ds[ i ] > value, ds[ i ], value )
  next
 
  return( value )
end function

private function min overload( byval a as double, byval b as double ) as double
  return( iif( a < b, a, b ) )
end function

function min( byref ds as Values ) as double
  dim as double value = MAX_DBL
 
  for i as integer = 0 to ds.count - 1
    value = iif( ds[ i ] < value, ds[ i ], value )
  next
 
  return( value )
end function

function mean( byref x as Values ) as double
  dim as double sum = 0.0d
 
  for i as integer = 0 to x.count - 1
    sum += x[ i ]
  next
 
  return( sum / x.count )
end function

function variance( byref x as Values, byval mean_x as double ) as double
  dim as double sum = 0.0d
 
  for i as integer = 0 to x.count - 1
    sum += ( x[ i ] - mean_x ) ^ 2
  next
 
  return( sum )
end function

function covariance( _
  byref x as Values, byval mean_x as double, _
  byref y as Values, byval mean_y as double ) as double
 
  dim as double covar = 0.0d
 
  for i as integer = 0 to min( x.count, y.count ) - 1
    covar += ( x[ i ] - mean_x ) * ( y[ i ] - mean_y )
  next
 
  return( covar )
end function

function coefficients( byref x as Values, byref y as Values ) as Coefs
  dim as double _
    mean_x = mean( x ), mean_y = mean( y ), _
    b1 = covariance( x, mean_x, y, mean_y ) / variance( x, mean_x ), _
    b0 = mean_y - b1 * mean_x
 
  return( Coefs( b0, b1 ) )
end function

function rmse_metric( byref actual as Values, byref predicted as Values ) as double
  dim as double sum_error = 0.0d
 
  for i as integer = 0 to actual.count - 1
    sum_error += ( predicted[ i ] - actual[ i ] ) ^ 2
  next
 
  return( sqr( sum_error / actual.count ) )
end function

type as function( byref as Dataset, byref as Values ) as Values _
  Algorithm

function evaluate_algorithm( _
  byref ds as Dataset, byval algorithm_func as Algorithm ) as Values
 
  dim as Values test_set = ds[ 0 ]
 
  return( algorithm_func( ds, test_set ) )
end function

function simple_linear_regression( _
  byref train as Dataset, byref test as Values ) as Values
 
  dim as Values predictions
  var c = coefficients( train[ 0 ], train[ 1 ] )
 
  for i as integer = 0 to test.count - 1
    add( predictions, c.b0 + c.b1 * test[ i ] )
  next
 
  return( predictions )
end function

'''

''' Visualization stuff
private function remap( _
    byval x as double, _
    byval start1 as double, _
    byval end1 as double, _
    byval start2 as double, _
    byval end2 as double ) _
  as double
 
  return( ( x - start1 ) * _
    ( end2 - start2 ) / ( end1 - start1 ) + start2 )
end function

sub drawRect( byref r as Rect, byval c as ulong )
  line( r.x, r.y ) - ( r.x + r.w - 1, r.y + r.h - 1 ), c, b
end sub

sub plot overload( _
  byref r as Rect, _
  byref xA as Values, byref yA as Values, _
  byval minX as double, byval maxX as double, _
  byval minY as double, byval maxY as double, _
  byval c as ulong )
 
  for i as integer = 0 to min( xA.count, yA.count ) - 1
    dim as double _
      x = remap( xA[ i ], minX, maxX, r.x, r.x + r.w - 1 ), _
      y = remap( yA[ i ], minY, maxY, r.y + r.h - 1, r.y )
   
    line( x - 5, y - 5 ) - ( x + 5, y + 5 ), c, bf
  next
end sub

sub plotLine( _
  byref r as Rect, _
  byref xA as Values, byref yA as Values, _
  byval minX as double, byval maxX as double, _
  byval minY as double, byval maxY as double, _
  byval c as ulong )
 
  for i as integer = 0 to min( xA.count, yA.count ) - 1
    if( i > 0 ) then
      dim as double _
        x1 = remap( xA[ i - 1 ], minX, maxX, r.x, r.x + r.w - 1 ), _
        y1 = remap( yA[ i - 1 ], minY, maxY, r.y + r.h - 1, r.y ), _
        x2 = remap( xA[ i ], minX, maxX, r.x, r.x + r.w - 1 ), _
        y2 = remap( yA[ i ], minY, maxY, r.y + r.h - 1, r.y )
     
      line( x1, y1 ) - ( x2, y2 ), c
    end if
  next
end sub

/'
  Test code
'/
dim as Values x, y

x = add( add( add( add( add( x, 1 ), 2 ), 3 ), 4 ), 5 )
y = add( add( add( add( add( y, 1 ), 3 ), 2 ), 3 ), 5 )

/'
  The dataset used for this example assumes x values in the 0 index, and
  y values in the 1 index.
'/
dim as Dataset ds

ds = add( add( ds, x ), y )

dim as integer _
  wW = 800, wH = 600, margin = 30

screenRes( 800, 600, 32, Fb.GFX_ALPHA_PRIMITIVES )
windowTitle( "Linear regression tutorial" )
color( Black, White )
cls()

var r = Rect( margin, margin, wW - margin * 2, wH - margin * 2 )
var predicted = evaluate_algorithm( ds, @simple_linear_regression )

dim as double _
  minX = 0, maxX = max( x ) + 1, _
  minY = 0, maxY = max( y ) + 1

minY = min( minY, min( predicted ) )
maxY = max( maxY, max( predicted ) )

drawRect( r, LightGray )

plot( r, x, y, minX, maxX, minY, maxY, LightBlue )
plot( r, x, predicted, minX, maxX, minY, maxY, Red )
plotLine( r, x, predicted, minX, maxX, minY, maxY, Black )

sleep()




Update: this code has been moved to the source\ARCHIVE folder. It was my first attempt to convert the Python code to FB; its development is halted and we are starting from scratch:

Code:

#include once "file.bi"
#include "string.bi"
/'
  Number of claims
  Total payment for all the claims in thousands of Swedish Kronor
  for geographical zones in Sweden
'/
type InsuranceData
  as single _
    numberOfClaims, _
    totalPayment
end type

type InsuranceTable
  declare operator []( byval as uinteger ) byref as InsuranceData
  as InsuranceData row( any )
  as uinteger count
end type

TYPE COEFFICI
   AS SINGLE _
   b0, _
   b1
END TYPE


operator InsuranceTable.[]( byval index as uinteger ) byref as InsuranceData
  return( row( index ) )
end operator

sub add overload( byref t as InsuranceTable, byref d as InsuranceData )
  t.count += 1
  redim preserve t.row( 0 to t.count - 1 )
 
  t.row( t.count - 1 ) = d
end sub

function loadDataset( byref path as const string ) as InsuranceTable
  dim as InsuranceTable t
 
  if( fileExists( path ) ) then
    dim as long f = freeFile()
   
    open path for input as f
   
    do while( not eof( f ) )
      dim as InsuranceData d
     
      input #f, d.numberOfClaims
      input #f, d.totalPayment
     
      add( t, d )
    loop
  end if
 
  return( t )
end function

'SUB iAppend(arr() AS DOUBLE, item AS DOUBLE)
'   REDIM PRESERVE arr(LBOUND(arr) TO UBOUND(arr) +1)
'   arr(UBOUND(arr)) = item
'END SUB

SUB iAppend(arr() AS DOUBLE, item AS DOUBLE)
    dim as integer lbnd = LBOUND(arr), ubnd =  UBOUND(arr)
    REDIM PRESERVE arr(lbnd TO ubnd+1)
    arr(ubnd+1) = item
END SUB

' Sum the values in a list of numbers (helper for mean)
function sum(x() as double) as double
  dim as single result
  for i as integer = 0 to ubound(x) - 1
    result = result + x(i)
  next i
  return result
end FUNCTION

function sum2(x() as DOUBLE, mean2 AS DOUBLE) as double
  dim as single result
  for i as integer = 0 to ubound(x) - 1
    result = result + x(i) - mean2
  next i
  return result
end FUNCTION

function mean(x() as double) as double
  return sum(x()) / cdbl(ubound(x) + 1)
end FUNCTION


' Calculate the variance of a list of numbers
function variance(values() AS double, BYVAL means AS DOUBLE) AS DOUBLE
   DIM resalt AS DOUBLE = 0
   FOR i AS INTEGER = LBOUND(values) TO UBOUND(values)
      resalt = resalt + (values(i) - means) * (values(i) - means)
   NEXT i
   Return resalt
END FUNCTION

FUNCTION covariance(x()as double, mean_x as double, y() as double, mean_y as double) as Double
    dim covar as Double
    for i as integer = 0 to UBOUND(x) - 1
        covar += (x(i) - mean_x) * (y(i) - mean_y)
    next
    return covar
end FUNCTION
' calculate coefficients

FUNCTION COEFFICIENTSb0 (x() AS DOUBLE,mean_x AS DOUBLE, y() AS DOUBLE, mean_y AS DOUBLE) AS DOUBLE
   DIM coeffici AS COEFFICI
   mean_x = MEAN(x())
   mean_y = MEAN(y())
   WITH coeffici
      .b1 = COVARIANCE(x(), mean_x, y(), mean_y) / VARIANCE(x(), mean_x)
      .b0 = mean_y - .b1 * mean_x
   RETURN .b0
   END WITH
   
END FUNCTION

FUNCTION COEFFICIENTSb1 (x() AS DOUBLE,mean_x AS DOUBLE, y() AS DOUBLE, mean_y AS DOUBLE) AS DOUBLE
   DIM coeffici AS COEFFICI
   mean_x = MEAN(x())
   mean_y = MEAN(y())
   WITH coeffici
      .b1 = COVARIANCE(x(), mean_x, y(), mean_y) / VARIANCE(x(), mean_x)
      .b0 = mean_y - .b1 * mean_x
   RETURN .b1
   END WITH
   
END FUNCTION
 
FUNCTION rmse_meteric(actual() AS DOUBLE, predicted AS DOUBLE) AS DOUBLE
   DIM sum_error AS DOUBLE = 0.0
   
END FUNCTION
 
 
 REDIM SHARED test_set_x(0) AS DOUBLE
 REDIM SHARED test_set_y(0) AS DOUBLE

TYPE function_type AS FUNCTION(() As DOUBLE, () AS DOUBLE, () AS DOUBLE, () AS DOUBLE) As DOUBLE

FUNCTION elvaluate_algo(x() AS DOUBLE, y() AS DOUBLE, BYVAL algorithem AS function_type) AS DOUBLE
   
   FOR i AS INTEGER = 0 TO UBOUND(x) - 1
      IAPPEND test_set_x(), x(i)
      IAPPEND test_set_y(), y(i)
   NEXT
    REDIM actual(0) AS DOUBLE
    DIM AS DOUBLE predicted = algorithem(x(),y(), test_set_x(), test_set_y())
   FOR i AS INTEGER = 0 TO UBOUND(y) - 1
     
   NEXT
END FUNCTION

FUNCTION simple_linear_regression(train() AS DOUBLE, test() AS DOUBLE) AS DOUBLE
   REDIM prediction(0) AS DOUBLE
   
END FUNCTION


var t = loadDataset( "D:\repo\FB_libLINREG\datasets\test.csv" )

REDIM SHARED x(0) AS DOUBLE
REDIM SHARED y(0) AS DOUBLE

DIM AS DOUBLE mean_x, mean_y, covar

for i as integer = 0 to t.count - 1
 
  with t[ i ]
     
     IAPPEND x(), CDBL(.numberOfClaims)
     IAPPEND y(),  CDBL(.totalPayment)
   
  end WITH
  WITH t [ i ]
     
      ? .numberOfClaims, "means:", format(MEAN(x()), "0.00"), .totalPayment,  "means: ", format(MEAN(y()), "0.00")
 
  END WITH
NEXT


   ? "covariance: ", format(COVARIANCE(x(), mean(x()), y(), mean(y())), "0.00")
   
      mean_x = MEAN(x())
      mean_y = MEAN(y())
      covar = COVARIANCE(x(), mean_x, y(), mean_y)
     
   ? "X column:", FORMAT(mean_x,"0.00"), "Y column:", FORMAT(mean_y, "0.00"), "COVARIANCE:", FORMAT(covar, "0.00")

   ? "variance x:", FORMAT(VARIANCE(x(),mean_x), "0.00"), "variance y:",  FORMAT(VARIANCE(y(), mean_y), "0.00")
   
   ? "COEFFICIENTS:", "b0: " & FORMAT(COEFFICIENTSB0(x(), mean_x, y(), mean_y), "0.00"), "b1: " & FORMAT(COEFFICIENTSB1(x(), mean_x, y(), mean_y), "0.00")
sleep()


NEXT BIG STEP: CONVERTING THIS PYTHON CODE TO FB:

Code:

def simple_linear_regression(train, test):
   predictions = list()
   b0, b1 = coefficients(train)
   for row in test:
      yhat = b0 + b1 * row[0]
      predictions.append(yhat)
   return predictions



Code:

# Standalone simple linear regression example
from math import sqrt

# Calculate root mean squared error
def rmse_metric(actual, predicted):
   sum_error = 0.0
   for i in range(len(actual)):
      prediction_error = predicted[i] - actual[i]
      sum_error += (prediction_error ** 2)
   mean_error = sum_error / float(len(actual))
   return sqrt(mean_error)

# Evaluate regression algorithm on training dataset
def evaluate_algorithm(dataset, algorithm):
   test_set = list()
   for row in dataset:
      row_copy = list(row)
      row_copy[-1] = None
      test_set.append(row_copy)
   predicted = algorithm(dataset, test_set)
   print(predicted)
   actual = [row[-1] for row in dataset]
   rmse = rmse_metric(actual, predicted)
   return rmse

# Calculate the mean value of a list of numbers
def mean(values):
   return sum(values) / float(len(values))

# Calculate covariance between x and y
def covariance(x, mean_x, y, mean_y):
   covar = 0.0
   for i in range(len(x)):
      covar += (x[i] - mean_x) * (y[i] - mean_y)
   return covar

# Calculate the variance of a list of numbers
def variance(values, mean):
   return sum([(x-mean)**2 for x in values])

# Calculate coefficients
def coefficients(dataset):
   x = [row[0] for row in dataset]
   y = [row[1] for row in dataset]
   x_mean, y_mean = mean(x), mean(y)
   b1 = covariance(x, x_mean, y, y_mean) / variance(x, x_mean)
   b0 = y_mean - b1 * x_mean
   return [b0, b1]

# Simple linear regression algorithm
def simple_linear_regression(train, test):
   predictions = list()
   b0, b1 = coefficients(train)
   for row in test:
      yhat = b0 + b1 * row[0]
      predictions.append(yhat)
   return predictions

# Test simple linear regression
dataset = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]
rmse = evaluate_algorithm(dataset, simple_linear_regression)
print('RMSE: %.3f' % (rmse))
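
For anyone checking an FB translation against this Python original: on the 5-point dataset the coefficients come out as b0 = 0.4 and b1 = 0.8, so the predictions and RMSE can be verified with this small self-contained check (my own recomputation, not part of the tutorial code):

```python
from math import sqrt

# Expected results for dataset = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]:
# b1 = covariance / variance = 8 / 10 = 0.8, b0 = 2.8 - 0.8 * 3 = 0.4
b0, b1 = 0.4, 0.8
xs = [1, 2, 4, 3, 5]
ys = [1, 3, 3, 2, 5]
predicted = [b0 + b1 * x for x in xs]  # ~[1.2, 2.0, 3.6, 2.8, 4.4]
rmse = sqrt(sum((p - a) ** 2 for p, a in zip(predicted, ys)) / len(ys))
print(round(rmse, 3))  # 0.693
```

This is the same value the program above prints as its RMSE.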




ron77
Last edited by ron77 on Dec 09, 2020 18:45, edited 23 times in total.
Lost Zergling
Posts: 453
Joined: Dec 02, 2011 22:51
Location: France

Re: libLINREG project

Postby Lost Zergling » Oct 21, 2020 15:24

The purpose of this code example is to show that it is possible to implement algorithms in FreeBASIC in a relatively detailed way using an easy syntax (so the objective is neither efficiency nor speed, but to show the possibilities of the syntax using the lzle tool). This is only an example, not the translation, which remains to be done. Consequently, the possible codings are very varied and will depend on the priorities of the programmer.

Code:

#Include once "F:\Basic\LZLE_.bi"

Function Col_Mean(My_List As List, Col As uByte) As Double
    Dim KeyCount As Integer  ' Required for indexed option : KeyCount
    Dim d_sum As Double
    My_List.Aside 'Might be optional, depending on whether list context must be preserved or not
    My_List.Root
    While My_List.KeyStep
        KeyCount+=1 ' Required for indexed option : KeyCount
        d_sum+=cdbl(My_List.Tag(Col))
    Wend
    My_List.Recover 'Might be optional
    Return d_sum/KeyCount          ' Indexed option
  '  Return d_sum/My_List.Count    ' Non Indexed option
End Function

Function Row_Mean(My_List As List, RowNum As Integer, TotTags As uByte=2 ) As Double
    Dim d_sum As Double : Dim i As Integer
    If My_List.HasKey(Str(RowNum)) Then
        For i=1 to TotTags
            d_sum += cdbl(My_List.Tag(i))
        Next i
        Return d_sum/TotTags
    Else
        Return 0
    End If   
End Function

Dim MyList As List
Dim As Integer i

'--------------------- dataset = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]
Dim i_dataset(5,2) As Integer
i_dataset(1,1)=1 : i_dataset(1,2)=1
i_dataset(2,1)=2 : i_dataset(2,2)=3
i_dataset(3,1)=4 : i_dataset(3,2)=3
i_dataset(4,1)=3 : i_dataset(4,2)=2
i_dataset(5,1)=5 : i_dataset(5,2)=5
'--------------------- Of course not optimal compared to C-like array management using zstring ptr, or FB-like arrays managed using lzae (whenever it matures one day)

'Loading dataset to a string list, of course not optimal because number to string, and string to number conversions
For i=1 To 5   
    MyList.HashTag(Str(i) )      ' Indexed option
   ' MyList.BlindTag(Str(i) )      ' Non Indexed option
    MyList.RwTag1( Str(i_dataset(i,1) ) )
    MyList.RwTag2( Str(i_dataset(i,2) ) )
Next i

'Of course not optimal because duplicates iterations / best way would be to compute means on all columns in one pass
? "???"
Print Col_Mean(MyList, 1)
Print Col_Mean(MyList, 2)
Print "??"
'Of course not optimal because duplicates row search / best way would be to compute means on all rows in one pass & store results in Tag(3) or other list or in an array
Print Row_Mean(MyList, 1)
Print Row_Mean(MyList, 2)
? "?"
sleep
?
MyList.Destroy
system
ron77
Posts: 197
Joined: Feb 21, 2019 19:24
Location: Israel
Contact:

Re: libLINREG project

Postby ron77 » Oct 21, 2020 15:28

thank you lost zergling for the example code...
Lost Zergling
Posts: 453
Joined: Dec 02, 2011 22:51
Location: France

Re: libLINREG project

Postby Lost Zergling » Oct 21, 2020 15:38

With pleasure. There are no vector math operations; you still have to code the loops in a proper way. And lzae doesn't offer this stuff either... it is oriented toward sorting, databases, background interfacing, and multi-dimensional array support. (The lzle licence is LGPL-compliant, not GPL; it can be considered very close to GPL when using the official FB compiler.)
TJF
Posts: 3623
Joined: Dec 06, 2009 22:27
Location: N47°, E15°
Contact:

Re: libLINREG project

Postby TJF » Oct 26, 2020 7:42

Carlos Herrera
Posts: 82
Joined: Nov 28, 2011 13:29
Location: Dictatorship

Re: libLINREG project

Postby Carlos Herrera » Oct 27, 2020 8:32

3D: X / Y regression analysis

Interesting, can you provide a reference to a paper describing the method, if one exists?

Did you check how the results depend on small changes in the input parameters?
This might be a so-called ill-posed problem, which requires some kind of regularization method.

Moreover, if the degree is too small, you may miss some extrema; if it is too high, the fit will be perfect but with many false valleys and hills. For weather forecasting and pressure maps I would rather use a B-spline interpolation.
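
To illustrate what a regularization method does in the simplest (one-dimensional least squares) setting: Tikhonov ("ridge") regularization adds a damping term to the denominator of the slope, trading a little bias for stability against small changes in the input. This is my own Python sketch, not code from the project:

```python
def ridge_slope(xs, ys, lam=0.0):
    """Least-squares slope with Tikhonov (ridge) damping 'lam';
    lam = 0 gives the ordinary least-squares slope."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    covar = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return covar / (var + lam)  # damping shrinks the slope toward 0

xs = [1, 2, 4, 3, 5]
ys = [1, 3, 3, 2, 5]
print(ridge_slope(xs, ys))        # ~0.8 (ordinary least squares)
print(ridge_slope(xs, ys, 10.0))  # ~0.4 (heavily damped)
```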
