[SOLVED] turning working KNN code into a static library

New to FreeBASIC? Post your questions here.
Post Reply
Tourist Trap
Posts: 2958
Joined: Jun 02, 2015 16:24

Re: [HELP] converting KNN code from python to FB

Post by Tourist Trap »

dodicat wrote:If it is just nearest neighbours to a chosen point in a 2D space array, then it is not difficult.
Very well done :)

As far as I understand it's meant to do some prediction. If I understand what they say your NN algorithm will first divide the whole array in meaningful clusters (those who gather more people close to each other given a distance set?).
Then you'll add a new point, and this one guy would fall or not it this or this class -> one of the previous clusters you'd found before I guess.

This is what I think they do over the whole program. Not 100% sure yet atm.
Tourist Trap
Posts: 2958
Joined: Jun 02, 2015 16:24

Re: [HELP] converting KNN code from python to FB

Post by Tourist Trap »

Hi,

so thanks to some clipboard functions provided by dodicat, I made this little grabber to list quickly the functions from the python code:

Code: Select all

#include "windows.bi"

Function Get_clipboard() As Const String
        If IsClipboardFormatAvailable(CF_TEXT) = 0 Then Return "Error"
        If OpenClipboard(0) = 0 Then Return "Error"
        Function = *Cast(zstring Ptr,GetClipboardData(CF_TEXT))
        CloseClipboard()
End Function

Sub Set_clipBoard(Text As Const String)
    Var Num=GlobalAlloc(GMEM_MOVEABLE,Len(Text)+1)
    memcpy(GlobalLock(num), @text[0], Len(Text)+1)
    Var Chars=GlobalLock(Num)
    GlobalUnlock(Num)
    OpenClipboard(0)
    EmptyClipboard()
    SetClipboardData(CF_TEXT,Chars)
    CloseClipboard()
End Sub

'---------------------------------
var w => width()

'1- get python as text from the clipboard
var originaltxt => get_clipboard()

'2- get the capture expressed as an array of lines
var endlinechar => chr(10)
dim as string temptxt => originaltxt
dim as string txtlines()
dim as integer txtlinescount
do    
    color 10, 0
    var endlinecharpos => instr( temptxt, endlinechar )
    ? "l="; endlinecharpos, 
    
    redim preserve txtlines( ubound(txtlines) + 1 )
    txtlines( ubound(txtlines) ) = trim(left( temptxt, endlinecharpos ), any chr(32) & chr(9))
    txtlinescount = ubound(txtlines) + 1
    
    temptxt = right( temptxt, len(temptxt) - endlinecharpos )

    color 11
    ? txtlines( ubound(txtlines) )

    ? space(14); 
    color 14, 5
    ? txtlinescount; space( loword(w) - len( trim(str(txtlinescount)) ) - 1 - 14 )

loop until not instr( temptxt, endlinechar )>0
color 10, 0

'3- retrieve the functions signatures in a dedicated array
dim as string functionnames() 

for index as integer = lbound(txtlines) to ubound(txtlines)
    var defpos => instr( txtlines(index), "def " )
    
    if defpos=0 then continue for
    
    ?"def found @"; defpos, txtlines(index)
    
    redim preserve functionnames( ubound(functionnames) + 1 )
    functionnames( ubound(functionnames) ) = "function " & right( txtlines(index), len(txtlines(index)) - (defpos + 3) ) + chr(10)
next index

'4- send functions names to clipboard
var toclipboard => ""
for index as integer = lbound(functionnames) to ubound(functionnames)
    toclipboard &= functionnames(index)
next index
Set_clipBoard(toclipboard)

color 0, 14
var concludetxt => "FB FUNCTIONS HEADERS SENT TO CLIPBOARD"
? space(loword(w) - 1)
? (space((loword(w) - 1 - len(concludetxt))\2)); concludetxt; (space((loword(w) - 1 - len(concludetxt))\2))
? space(loword(w) - 1)

color 7, 0
? "end"

sleep
function load_csv(filename):

function str_column_to_float(dataset, column):

function str_column_to_int(dataset, column):

function dataset_minmax(dataset):

function normalize_dataset(dataset, minmax):

function euclidean_distance(row1, row2):

function get_neighbors(train, test_row, num_neighbors):

function predict_classification(train, test_row, num_neighbors):
So there the 8 functions to be implemented accoding to the tutorial.
ron77
Posts: 212
Joined: Feb 21, 2019 19:24

Re: [HELP] converting KNN code from python to FB

Post by ron77 »

hello tourist trap...

i got a question or favor to ask you... could you transfer the libLINREG library repository ownership back to me? my nickname on giuhub is ronblue

thank you...

as for the KNN code well i have a working code based on dodicat code example yet it can run only on 2D data so the next step would be to make it run on a multi-dimensional data (more then just 2 columns in a csv file)

since it was hard to find a 2D dataset csv file for the KNN i coded a small code that produces a pseudo 2D dataset csv file to use as an example...
here is the pseudo dataset csv generator code:

Code: Select all

type pt
    as single x,y
end TYPE


screen 20
dim as integer xres,yres
screeninfo xres,yres

dim as pt p(1 to 500)
'set up the array
for n as long=1 to ubound(p)
    p(n)=type(rnd*xres,rnd*yres)
next n

DIM f AS INTEGER = FREEFILE()
OPEN "knn-dataset.csv" FOR APPEND AS #f
FOR i AS LONG = 1 TO UBOUND (p)
   PRINT #f, p(i).x & ", " & p(i).y
NEXT
CLOSE #f


now for the KNN 2D code:

Code: Select all

#include once "file.bi"

type pt
    as single x,y
end type

'APPEND TO the pt array the pttem
SUB PTAPPEND(arr() AS pt, Item AS pt)
	REDIM PRESERVE arr(LBOUND(arr) TO UBOUND(arr) + 1) AS pt
	arr(UBOUND(arr)) = Item
END SUB

SUB loadDataset( byref path as const string , p() AS pt)
    
  if( fileExists( path ) ) then
    dim as long f = freeFile()
    
    open path for input as f
    
    do while( not eof( f ) )
      dim as pt d
      
      input #f, d.x
      input #f, d.y
       PTAPPEND p(), d            
    LOOP
    CLOSE #f
  end if
  
end SUB


sub GetClosest(a() as pt,ans() as pt,v as long,num as long)
     #define incircle(cx,cy,radius,x,y) (cx-x)*(cx-x) +(cy-y)*(cy-y)<= radius*radius
     dim as single r=.5,ctr
    do
        r+=.5
        ctr=0
    for n as long=lbound(a) to ubound(a)
        if incircle(a(v).x,a(v).y,r,a(n).x,a(n).y) then
            ctr+=1
            redim preserve ans(1 to ctr)
            ans(ctr)=a(n)
            end if
    next n
loop until ubound(ans)>=num
redim preserve ans(1 to num)
end sub

'screen 20
'dim as integer xres,yres
'screeninfo xres,yres
'DIM neighbors AS integer
'dim as pt p(ANY)
'redim as pt res()
'set up the array
'for n as long=1 to ubound(p)
'    p(n)=type(rnd*xres,rnd*yres)
'next n
'LOADDATASET("knn-dataset.csv", p())
'CLS
'INPUT "how many maxinum neighbors points to search for?: ", neighbors
 #define range(f,l) Int(Rnd*(((l)+1)-(f))+(f))
 randomize
SUB knn_search(p() AS pt, xres AS INTEGER, yres AS INTEGER, res() AS pt, neighbors AS integer)
do
    dim as long psn=range(lbound(p),ubound(p)),num=neighbors
    for n as long=1 to ubound(p)        'draw the points
        circle(p(n).x,p(n).y),2,7,,,,f
    next n
   
    GetClosest(p(),res(),psn,num)          'get closest num points centered on p(psn)
   
    for n as long=1 to ubound(res)
        circle(res(n).x,res(n).y),7,2   'draw these closest points
    next n
    circle(p(psn).x,p(psn).y),7,6
    draw string(50,50),"cluster of "&num & " centered on p("&psn &")"
    draw string (50,70),"Press a key, <esc> to end"
    sleep
    cls
loop until inkey=chr(27)
END SUB


screen 20

DIM AS integer xres,yres
screeninfo xres,yres
DIM neighbors AS integer

DIM AS pt p(ANY)
REDIM AS pt res()

LOADDATASET("knn-dataset.csv", p())
CLS
INPUT "how many maxinum neighbors points to search for?: ", neighbors

KNN_SEARCH(p(), xres, yres, res(), neighbors)
if anyone wishes to help me make this a MULTI-DIMENTIONAL KNN code then your help is welcomed :)

ron77
Tourist Trap
Posts: 2958
Joined: Jun 02, 2015 16:24

Re: [HELP] converting KNN code from python to FB

Post by Tourist Trap »

ron77 wrote: could you transfer the libLINREG library repository ownership back
Hi ron, no problem you should have it back home and safe right now. Tell me if it didn't work anyway.

About the code you posted above. It doesn't work, it seems to be stuck after I answer the 1st question. It seems not related to the lack of data file, because I added a ELSE to the IF FILEEXISTS stuff in the function that loads it.
For more consistency in the case of no data loaded I changed the program this way:

Code: Select all

#include once "file.bi"

type pt
    as single x,y
end type

'APPEND TO the pt array the pttem
SUB PTAPPEND(arr() AS pt, Item AS pt)
   REDIM PRESERVE arr(LBOUND(arr) TO UBOUND(arr) + 1) AS pt
   arr(UBOUND(arr)) = Item
END SUB

SUB loadDataset( byref path as const string , p() AS pt)
  if( fileExists( path ) ) then
    dim as long f = freeFile() 
    open path for input as f
    do while( not eof( f ) )
      dim as pt d
      input #f, d.x
      input #f, d.y
       PTAPPEND p(), d           
    LOOP
    CLOSE #f
  else
    print "! no file !" 
  end if
end SUB

sub GetClosest(a() as pt,ans() as pt,v as long,num as long)
     #define incircle(cx,cy,radius,x,y) (cx-x)*(cx-x) +(cy-y)*(cy-y)<= radius*radius
     dim as single r=.5,ctr
    do
        r+=.5
        ctr=0
    for n as long=lbound(a) to ubound(a)
        if incircle(a(v).x,a(v).y,r,a(n).x,a(n).y) then
            ctr+=1
            redim preserve ans(1 to ctr)
            ans(ctr)=a(n)
            end if
    next n
loop until ubound(ans)>=num
redim preserve ans(1 to num)
end sub

 #define range(f,l) Int(Rnd*(((l)+1)-(f))+(f))
 randomize
SUB knn_search(p() AS pt, xres AS INTEGER, yres AS INTEGER, res() AS pt, neighbors AS integer)
'do
    if ubound(p)<lbound(p) then
        print "! pointless !"
        exit sub
    end if
    
    dim as long psn=range(lbound(p),ubound(p)),num=neighbors
    for n as long=1 to ubound(p)        'draw the points
        circle(p(n).x,p(n).y),2,7,,,,f
    next n
   
    GetClosest(p(),res(),psn,num)          'get closest num points centered on p(psn)
   
    for n as long=1 to ubound(res)
        circle(res(n).x,res(n).y),7,2   'draw these closest points
    next n
    circle(p(psn).x,p(psn).y),7,6
    draw string(50,50),"cluster of "&num & " centered on p("&psn &")"
    draw string (50,70),"Press a key, <esc> to end"
     
'loop until inkey=chr(27)
END SUB

'------------------------------------------------MAIN
screen 20

DIM AS integer xres,yres
screeninfo xres,yres
DIM neighbors AS integer

DIM AS pt p(ANY)
REDIM AS pt res()

LOADDATASET("knn-dataset.csv", p())
INPUT "how many maxinum neighbors points to search for?: ", neighbors
KNN_SEARCH(p(), xres, yres, res(), neighbors)

sleep
Now if no file or p() is void, it will be caught. That's only a detail, but it's best not putting any CLS before the program is totally ready to be delivered, and to have some checkpoints all over the code. Also as a pure general and formal advice, I would put all the loops related to the display in the main section, not inside the subroutines/functions. That makes also things easier. Good job anyway.
ron77
Posts: 212
Joined: Feb 21, 2019 19:24

Re: [HELP] converting KNN code from python to FB

Post by ron77 »

hello and hi tourist trap :)

thanks for transferring back the ownership of the repository to me and thanks for agreeing to accept ownership in the first place :) i appreciate it a lot!

as for your code i agree it works a lot better and safer from crashing :) tomorrow i hope to begin working on a multi - dimensional KNN i guess i'll be studying "linked lists" or maybe some other type of solution :)

once again thank you for your help :)

ron77
dodicat
Posts: 7987
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: [HELP] converting KNN code from python to FB

Post by dodicat »

By multidimensional you mean 3D?
I cannot visualize 4D space.
3D is only different in that to view the points properly the array should be sorted in the z direction (into the screen).
And the visual size should be mapped closest big, furthest small.
Also a perspective should be applied.
The function GetClosest is the same as 2D except that an insphere is used, not incircle.
In other words the method is identical to 2D, but showing the results uses sorting, perspective, sizing and colouring.
In this demo some of the cluster lie behind other points and thus are not seen.
Because I use perspective, if the cluster is centered on a point close to the viewer then it will have a large spread visually, and some could be off screen.

Code: Select all

  

Dim Shared As Integer xres,yres

#define range(f,l) Int(Rnd*(((l)+1)-(f))+(f))
#define map(a,b,x,c,d) ((d)-(c))*((x)-(a))/((b)-(a))+(c)    'for radius
Type pt
    As Single x,y,z
    As Ulong colour
End Type

Operator =(a As pt,b As pt) As Long  'for showing
Return a.x=b.x And a.y=b.y And a.z=b.z
End Operator

Sub GetClosest(a() As pt,ans() As pt,v As Long,num As Long)
    #define insphere(cx,cy,cz,radius,x,y,z) (cx-x)*(cx-x) +(cy-y)*(cy-y)+(cz-z)*(cz-z)<= radius*radius
    Dim As Single r=.5,ctr
    Do
        r+=.5
        ctr=0
        For n As Long=Lbound(a) To Ubound(a)
            If insphere(a(v).x,a(v).y,a(v).z,r,a(n).x,a(n).y,a(n).z) Then
                ctr+=1
                Redim Preserve ans(1 To ctr)
                ans(ctr)=a(n)
            End If
        Next n
    Loop Until Ubound(ans)>=num
    Redim Preserve ans(1 To num)
End Sub

Sub QsortZ(array() As pt,begin As Long,Finish As Ulong)
    Dim As Long i=begin,j=finish
    Dim As pt x =array(((I+J)\2))
    While I <= J
        While array(I).z > X .z:I+=1:Wend
            While array(J).z < X .z:J-=1:Wend
                If I<=J Then Swap array(I),array(J): I+=1:J-=1
            Wend
            If J >begin Then QsortZ(array(),begin,J)
            If I <Finish Then QsortZ(array(),I,Finish)
    End Sub

Function perspective(p As pt,eyepoint As pt) As pt
    Dim As Single w=1+(p.z/eyepoint.z)
    Return Type((p.x-eyepoint.x)/w+eyepoint.x, (p.y-eyepoint.y)/w+eyepoint.y, (p.z-eyepoint.z)/w+eyepoint.z,p.colour)
End Function

'set up the array
Sub setup(p() As pt)
    For n As Long=1 To Ubound(p)
        'the z value 500 out of screen to 500 into screen
        p(n)=Type(range(150,(xres-150)),Range(150,(yres-150)),range(-500,500),Rgb(50+Rnd*50,50+Rnd*50,50+Rnd*50))
        
    Next n
End Sub


Screen 20,32,,64

Screeninfo xres,yres

Dim As pt p(1 To 700)
Redim As pt res()
Randomize

setup(p())

Qsortz(p(),Lbound(p),Ubound(p))'sort in z direction

Dim As pt eyepoint=Type(xres/2,yres/2,800)' for perspective
Do
    
    Dim As Long psn=range(Lbound(p),Ubound(p)),num=range(3,15)
    
    GetClosest(p(),res(),psn,num)          'get closest num points centered on p(psn)
    'display
    For n As Long=1 To Ubound(p)
        Var flag=0
        For m As Long=Lbound(res) To Ubound(res)
            If p(n)=res(m) Then flag=n    'check out cluster members.
        Next m
        
        Var rad=map(-500,500,p(n).z,15,5) 'close, radius big, far radius small
        Var np=perspective(p(n),eyepoint) ' set 3D perspective
        Circle(np.x,np.y),rad,p(n).colour,,,,f            'show every point
        'do the rims
        If n=flag Then Circle(np.x,np.y),rad,Rgb(0,255,0)  ' a cluster member
        If n=psn Then  Circle(np.x,np.y),rad,Rgb(255,0,0)   'the cluster centre
    Next n
    
    Draw String(50,50),"cluster of "&num & " (green rimmed)  centered on p("&psn &")  red rim",Rgb(150,150,150)
    Draw String(50,70),"Cluster centre (" & Int(p(psn).x) & ","& Int(p(psn).y) & "," & Int(p(psn).z) &")",Rgb(150,150,150)
    Draw String (50,90),"Press a key, <esc> to end",Rgb(150,150,150)
    Sleep
    Cls
    Loop Until Inkey=Chr(27) 
ron77
Posts: 212
Joined: Feb 21, 2019 19:24

Re: [HELP] KNN code doesn't work

Post by ron77 »

hello all :)

i need help again. me and my teacher worked very hard in changing dodicat 3D KNN code into a multi-value KNN that loads the values from a csv file...

we worked hard and the result is a code that does complie and run however doesn't work - you get a black screen and the program is stuck and you need to terminate it.

i guess this is due that we tried to do a multi value - multi dimensional KNN and freebasic is not python :(

here is the csv file and KNN code we worked on in the hope that someone can help - any help will be greatly appreciated! and welcomed!

the "iris.csv" dataset file:

Code: Select all

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
4.6,3.6,1.0,0.2,Iris-setosa
5.1,3.3,1.7,0.5,Iris-setosa
4.8,3.4,1.9,0.2,Iris-setosa
5.0,3.0,1.6,0.2,Iris-setosa
5.0,3.4,1.6,0.4,Iris-setosa
5.2,3.5,1.5,0.2,Iris-setosa
5.2,3.4,1.4,0.2,Iris-setosa
4.7,3.2,1.6,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
5.2,4.1,1.5,0.1,Iris-setosa
5.5,4.2,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,0.2,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
5.1,3.4,1.5,0.2,Iris-setosa
5.0,3.5,1.3,0.3,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
4.4,3.2,1.3,0.2,Iris-setosa
5.0,3.5,1.6,0.6,Iris-setosa
5.1,3.8,1.9,0.4,Iris-setosa
4.8,3.0,1.4,0.3,Iris-setosa
5.1,3.8,1.6,0.2,Iris-setosa
4.6,3.2,1.4,0.2,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
6.5,2.8,4.6,1.5,Iris-versicolor
5.7,2.8,4.5,1.3,Iris-versicolor
6.3,3.3,4.7,1.6,Iris-versicolor
4.9,2.4,3.3,1.0,Iris-versicolor
6.6,2.9,4.6,1.3,Iris-versicolor
5.2,2.7,3.9,1.4,Iris-versicolor
5.0,2.0,3.5,1.0,Iris-versicolor
5.9,3.0,4.2,1.5,Iris-versicolor
6.0,2.2,4.0,1.0,Iris-versicolor
6.1,2.9,4.7,1.4,Iris-versicolor
5.6,2.9,3.6,1.3,Iris-versicolor
6.7,3.1,4.4,1.4,Iris-versicolor
5.6,3.0,4.5,1.5,Iris-versicolor
5.8,2.7,4.1,1.0,Iris-versicolor
6.2,2.2,4.5,1.5,Iris-versicolor
5.6,2.5,3.9,1.1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
6.1,2.8,4.0,1.3,Iris-versicolor
6.3,2.5,4.9,1.5,Iris-versicolor
6.1,2.8,4.7,1.2,Iris-versicolor
6.4,2.9,4.3,1.3,Iris-versicolor
6.6,3.0,4.4,1.4,Iris-versicolor
6.8,2.8,4.8,1.4,Iris-versicolor
6.7,3.0,5.0,1.7,Iris-versicolor
6.0,2.9,4.5,1.5,Iris-versicolor
5.7,2.6,3.5,1.0,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.4,3.7,1.0,Iris-versicolor
5.8,2.7,3.9,1.2,Iris-versicolor
6.0,2.7,5.1,1.6,Iris-versicolor
5.4,3.0,4.5,1.5,Iris-versicolor
6.0,3.4,4.5,1.6,Iris-versicolor
6.7,3.1,4.7,1.5,Iris-versicolor
6.3,2.3,4.4,1.3,Iris-versicolor
5.6,3.0,4.1,1.3,Iris-versicolor
5.5,2.5,4.0,1.3,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.1,3.0,4.6,1.4,Iris-versicolor
5.8,2.6,4.0,1.2,Iris-versicolor
5.0,2.3,3.3,1.0,Iris-versicolor
5.6,2.7,4.2,1.3,Iris-versicolor
5.7,3.0,4.2,1.2,Iris-versicolor
5.7,2.9,4.2,1.3,Iris-versicolor
6.2,2.9,4.3,1.3,Iris-versicolor
5.1,2.5,3.0,1.1,Iris-versicolor
5.7,2.8,4.1,1.3,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
6.3,2.9,5.6,1.8,Iris-virginica
6.5,3.0,5.8,2.2,Iris-virginica
7.6,3.0,6.6,2.1,Iris-virginica
4.9,2.5,4.5,1.7,Iris-virginica
7.3,2.9,6.3,1.8,Iris-virginica
6.7,2.5,5.8,1.8,Iris-virginica
7.2,3.6,6.1,2.5,Iris-virginica
6.5,3.2,5.1,2.0,Iris-virginica
6.4,2.7,5.3,1.9,Iris-virginica
6.8,3.0,5.5,2.1,Iris-virginica
5.7,2.5,5.0,2.0,Iris-virginica
5.8,2.8,5.1,2.4,Iris-virginica
6.4,3.2,5.3,2.3,Iris-virginica
6.5,3.0,5.5,1.8,Iris-virginica
7.7,3.8,6.7,2.2,Iris-virginica
7.7,2.6,6.9,2.3,Iris-virginica
6.0,2.2,5.0,1.5,Iris-virginica
6.9,3.2,5.7,2.3,Iris-virginica
5.6,2.8,4.9,2.0,Iris-virginica
7.7,2.8,6.7,2.0,Iris-virginica
6.3,2.7,4.9,1.8,Iris-virginica
6.7,3.3,5.7,2.1,Iris-virginica
7.2,3.2,6.0,1.8,Iris-virginica
6.2,2.8,4.8,1.8,Iris-virginica
6.1,3.0,4.9,1.8,Iris-virginica
6.4,2.8,5.6,2.1,Iris-virginica
7.2,3.0,5.8,1.6,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
7.9,3.8,6.4,2.0,Iris-virginica
6.4,2.8,5.6,2.2,Iris-virginica
6.3,2.8,5.1,1.5,Iris-virginica
6.1,2.6,5.6,1.4,Iris-virginica
7.7,3.0,6.1,2.3,Iris-virginica
6.3,3.4,5.6,2.4,Iris-virginica
6.4,3.1,5.5,1.8,Iris-virginica
6.0,3.0,4.8,1.8,Iris-virginica
6.9,3.1,5.4,2.1,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
6.8,3.2,5.9,2.3,Iris-virginica
6.7,3.3,5.7,2.5,Iris-virginica
6.7,3.0,5.2,2.3,Iris-virginica
6.3,2.5,5.0,1.9,Iris-virginica
6.5,3.0,5.2,2.0,Iris-virginica
6.2,3.4,5.4,2.3,Iris-virginica
5.9,3.0,5.1,1.8,Iris-virginica
the multi-value multi dimensional KNN code that loads it's values from the csv file "iris.csv":

Code: Select all

Dim Shared As Integer xres,yres
DIM SHARED AS LONG psn, num
#define range(f,l) Int(Rnd*(((l)+1)-(f))+(f))
#define map(a,b,x,c,d) ((d)-(c))*((x)-(a))/((b)-(a))+(c) 


TYPE pt
  dim AS DOUBLE l(any)
   DIM AS Ulong colour
END TYPE

Operator =(a As pt,b As pt) As Long  'for showing
   FOR i AS INTEGER = LBOUND(a.l) TO UBOUND(a.l) + 1
      IF a.l(i) <> b.l(i) THEN
         RETURN 0
      ENDIF
   NEXT
   RETURN -1
END Operator


dim as string token
DIM p(any) AS pt

dim as integer pos1 = 1, pos2   
DIM fline AS STRING

DIM f AS INTEGER = FREEFILE()
OPEN "iris.csv" FOR INPUT AS f
WHILE NOT EOF(f)
DIM temp AS pt
DO
    LINE INPUT #f, fline
    
    pos2 = instr(pos1, fline, ",")
    
    if pos2 > 0 Then
        token = mid(fline, pos1, pos2 - pos1)    
    Else
        token = Mid(fline, pos1)
    end if
   
    
    redim preserve temp.l(0 to ubound(temp.l) + 1)
    temp.l(ubound(temp.l)) = CAST(DOUBLE,token)
   
    pos1 = pos2 + 1 
loop until pos2 = 0
redim preserve p(0 to ubound(p) + 1)
    p(ubound(p)) = temp
WEND
CLOSE #f

FUNCTION insphere(cl AS pt, radius AS Integer, l AS pt) AS LONG
   DIM sum AS DOUBLE
   FOR i AS integer = LBOUND(cl.l) TO UBOUND(cl.l)
      sum = (cl.l(i) - l.l(i)) * (cl.l(i)- l.l(i))
   NEXT
   IF sum <= radius*radius THEN
         RETURN -1
   ENDIF
   RETURN 0
END FUNCTION


Sub GetClosest(a() As pt,ans() As pt,v As Long,num As Long)
    Dim As Single r=.5,ctr
    Do
        r+=.5
        ctr=0
        For n As Long=Lbound(a) To Ubound(a)
            IF insphere(a(v), r ,a(n)) THEN
                ctr+=1
                Redim Preserve ans(1 To ctr)
                ans(ctr)=a(n)
            End If
        Next n
    Loop Until Ubound(ans)>=num
    Redim Preserve ans(1 To num)
End SUB

Sub Qsort(array() As pt, index AS INTEGER, begin As Long,Finish As Ulong)
    Dim As Long i=begin,j=finish
    Dim As pt x =array(((I+J)\2))
    While I <= J
        While array(I).l(index) > X .l(index):I+=1:Wend
            While array(J).l(index) < X .l(index):J-=1:Wend
                If I<=J Then Swap array(I),array(J): I+=1:J-=1
            Wend
            If J >begin Then QSORT(array(),index,begin,J)
            If I <Finish Then Qsort(array(),index,I,Finish)
End SUB

Function perspective(p As pt,eyepoint As pt) As pt
    DIM temp AS pt
    Dim As Single w=1+(p.l(UBOUND(p.l))/eyepoint.l(UBOUND(eyepoint.l)))
    
    FOR i AS INTEGER = lbound(p.l) TO UBOUND(p.l)
       redim preserve temp.l(0 to ubound(temp.l) + 1)
    temp.l(ubound(temp.l)) = p.l(i)/ w + eyepoint.l(i)
    NEXT
    
    temp.colour = p.colour
    
    RETURN temp
End FUNCTION

'set up the array
Sub setup(p() As pt)
    FOR n As Long=1 To Ubound(p)
        'the z value 500 out of screen to 500 into screen
        
       DIM temp AS pt
       REDIM temp.l(0 to ubound(temp.l) + 2)
       temp.l(0) = range(150,(xres-150))
       temp.l(1) = Range(150,(yres-150))
       temp.colour = Rgb(50+Rnd*50,50+Rnd*50,50+Rnd*50)
       IF UBOUND(p(n).l) > 1 THEN
           FOR i AS LONG = LBOUND(p) TO UBOUND(p)
               redim preserve p(0 to ubound(p) + 1)
               temp.l(ubound(temp.l)) = range(-500,500)
           NEXT
       ENDIF
        
        'p(n)=Type(range(150,(xres-150)),Range(150,(yres-150)),range(-500,500),Rgb(50+Rnd*50,50+Rnd*50,50+Rnd*50))
       p(n) = temp
    Next n
End SUB

FUNCTION return_eyepoint(p() AS pt) AS pt
   DIM arryLength AS INTEGER = UBOUND(p(0).l)
   DIM temp AS pt
   REDIM temp.l(lbound(temp.l) TO UBOUND(temp.l) + arryLength)
   temp.l(0) = xres/2
   temp.l(1) = yres/2
   IF UBOUND(temp.l) > 1 THEN
      FOR i AS LONG = 2 TO UBOUND(temp.l)
         temp.l(i) = 800
      NEXT
   END IF
   RETURN temp
END FUNCTION


Screen 20,32,,64

Screeninfo xres,yres

Redim As pt res()

setup(p())

FOR i AS INTEGER = LBOUND(p) TO UBOUND(p)
   Qsort(p(), i ,Lbound(p),Ubound(p))
NEXT

Dim As pt eyepoint=RETURN_EYEPOINT(p())     'TYPE(xres/2,yres/2,800)

FUNCTION print_k(p() AS pt, psn AS SINGLE) AS STRING
   DIM result AS STRING
   FOR i AS INTEGER = LBOUND(p) TO UBOUND(p)
      result = result & " " & INT(p(psn).l(i)) 
   NEXT        
   RETURN result 
END FUNCTION

Do
   CLS
   DIM AS STRING ans
   INPUT "do you wish to manualy input number of cluster range and number of neighbors? y/n: "; ans
   IF ans = LCASE("y") THEN
      DIM psn AS SINGLE, num AS SINGLE
      INPUT "how much cluster range?: "; psn
      INPUT "how many neightbors?: "; num
   ELSE
     psn=range(Lbound(p),Ubound(p))
     num=range(3,15)
   ENDIF
   CLS
     
    GetClosest(p(),res(),psn,num)          'get closest num points centered on p(psn)
    'display
    For n As Long=1 To Ubound(p)
        Var flag=0
        For m As Long=Lbound(res) To Ubound(res)
            If p(n)=res(m) Then flag=n    'check out cluster members.
        Next m
       
        Var rad=map(-500,500,p(n).l(UBOUND(p(n).l)),15,5) 'close, radius big, far radius small
        Var np=perspective(p(n),eyepoint) ' set 3D perspective
        Circle(np.l(0),np.l(1)),rad,p(n).colour,,,,f            'show every point
        'do the rims
        If n=flag Then Circle(np.l(0),np.l(1)),rad,Rgb(0,255,0)  ' a cluster member
        If n=psn Then  Circle(np.l(0),np.l(1)),rad,Rgb(255,0,0)   'the cluster centre
    Next n
   
    Draw String(50,50),"cluster of "&num & " (green rimmed)  centered on p("&psn &")  red rim",Rgb(150,150,150)
    Draw String(50,70),"Cluster centre (" & PRINT_K(p(), psn) & ")",Rgb(150,150,150)
    Draw String (50,90),"Press a key, <esc> to end",Rgb(150,150,150)
    Sleep
    Cls
Loop Until Inkey=Chr(27)


SLEEP


ron77
paul doe
Moderator
Posts: 1742
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: [HELP] KNN code doesn't work

Post by paul doe »

Have you checked the validity of the dataset? Doesn't seems like so to me...

Code: Select all

TYPE pt
  dim AS DOUBLE l(any)
   DIM AS Ulong colour
END TYPE

dim as string token
DIM p(any) AS pt

dim as integer pos1 = 1, pos2   
DIM fline AS STRING

DIM f AS INTEGER = FREEFILE()
OPEN "iris.csv" FOR INPUT AS f

? "Loading data..."
do WHILE NOT EOF(f)
  DIM temp AS pt
  DO
      LINE INPUT #f, fline
     
      pos2 = instr(pos1, fline, ",")
     
      if pos2 > 0 Then
          token = mid(fline, pos1, pos2 - pos1)   
      Else
          token = Mid(fline, pos1)
      end if
     
      redim preserve temp.l(0 to ubound(temp.l) + 1)
      temp.l(ubound(temp.l)) = CAST(DOUBLE,token)
     
      pos1 = pos2 + 1
  loop until pos2 = 0
  redim preserve p(0 to ubound(p) + 1)
  p(ubound(p)) = temp
loop

CLOSE #f

? "Done."

for i as integer = 0 to ubound( p )
  for j as integer = 0 to ubound( p( i ).l )
    ? p( i ).l( j ),
  next
  
  /'
    The dataset is not compatible with what the program uses to
    display. As is it, not only the coordinates are wrong,
    it's just drawing everything with the color zero (black).
  '/
  ? p( i ).colour
next

sleep()
Also, organize the code better, mate. It's a mess:

Code: Select all

Dim Shared As Integer xres,yres
DIM SHARED AS LONG psn, num

#define range(f,l) Int(Rnd*(((l)+1)-(f))+(f))
#define map(a,b,x,c,d) ((d)-(c))*((x)-(a))/((b)-(a))+(c)

TYPE pt
  dim AS DOUBLE l(any)
   DIM AS Ulong colour
END TYPE

Operator =(a As pt,b As pt) As Long  'for showing
   FOR i AS INTEGER = LBOUND(a.l) TO UBOUND(a.l) + 1
      IF a.l(i) <> b.l(i) THEN
         RETURN 0
      ENDIF
   NEXT
   RETURN -1
END Operator

FUNCTION insphere(cl AS pt, radius AS Integer, l AS pt) AS LONG
   DIM sum AS DOUBLE
   FOR i AS integer = LBOUND(cl.l) TO UBOUND(cl.l)
      sum = (cl.l(i) - l.l(i)) * (cl.l(i)- l.l(i))
   NEXT
   IF sum <= radius*radius THEN
         RETURN -1
   ENDIF
   RETURN 0
END FUNCTION

Sub GetClosest(a() As pt,ans() As pt,v As Long,num As Long)
    Dim As Single r=.5,ctr
    Do
        r+=.5
        ctr=0
        For n As Long=Lbound(a) To Ubound(a)
            IF insphere(a(v), r ,a(n)) THEN
                ctr+=1
                Redim Preserve ans(1 To ctr)
                ans(ctr)=a(n)
            End If
        Next n
    Loop Until Ubound(ans)>=num
    Redim Preserve ans(1 To num)
End SUB

Sub Qsort(array() As pt, index AS INTEGER, begin As Long,Finish As Ulong)
    Dim As Long i=begin,j=finish
    Dim As pt x =array(((I+J)\2))
    While I <= J
        While array(I).l(index) > X .l(index):I+=1:Wend
            While array(J).l(index) < X .l(index):J-=1:Wend
                If I<=J Then Swap array(I),array(J): I+=1:J-=1
            Wend
            If J >begin Then QSORT(array(),index,begin,J)
            If I <Finish Then Qsort(array(),index,I,Finish)
End SUB

Function perspective(p As pt,eyepoint As pt) As pt
    DIM temp AS pt
    Dim As Single w=1+(p.l(UBOUND(p.l))/eyepoint.l(UBOUND(eyepoint.l)))
   
    FOR i AS INTEGER = lbound(p.l) TO UBOUND(p.l)
       redim preserve temp.l(0 to ubound(temp.l) + 1)
    temp.l(ubound(temp.l)) = p.l(i)/ w + eyepoint.l(i)
    NEXT
   
    temp.colour = p.colour
   
    RETURN temp
End FUNCTION

'set up the array
Sub setup(p() As pt)
    FOR n As Long=1 To Ubound(p)
        'the z value 500 out of screen to 500 into screen
       
       DIM temp AS pt
       REDIM temp.l(0 to ubound(temp.l) + 2)
       temp.l(0) = range(150,(xres-150))
       temp.l(1) = Range(150,(yres-150))
       temp.colour = Rgb(50+Rnd*50,50+Rnd*50,50+Rnd*50)
       IF UBOUND(p(n).l) > 1 THEN
           FOR i AS LONG = LBOUND(p) TO UBOUND(p)
               redim preserve p(0 to ubound(p) + 1)
               temp.l(ubound(temp.l)) = range(-500,500)
           NEXT
       ENDIF
       
        'p(n)=Type(range(150,(xres-150)),Range(150,(yres-150)),range(-500,500),Rgb(50+Rnd*50,50+Rnd*50,50+Rnd*50))
       p(n) = temp
    Next n
End SUB

FUNCTION return_eyepoint(p() AS pt) AS pt
   DIM arryLength AS INTEGER = UBOUND(p(0).l)
   DIM temp AS pt
   REDIM temp.l(lbound(temp.l) TO UBOUND(temp.l) + arryLength)
   temp.l(0) = xres/2
   temp.l(1) = yres/2
   IF UBOUND(temp.l) > 1 THEN
      FOR i AS LONG = 2 TO UBOUND(temp.l)
         temp.l(i) = 800
      NEXT
   END IF
   RETURN temp
END FUNCTION

FUNCTION print_k(p() AS pt, psn AS SINGLE) AS STRING
   DIM result AS STRING
   FOR i AS INTEGER = LBOUND(p) TO UBOUND(p)
      result = result & " " & INT(p(psn).l(i))
   NEXT       
   RETURN result
END FUNCTION

/'
  Main code
'/
dim as string token
DIM p(any) AS pt

dim as integer pos1 = 1, pos2   
DIM fline AS STRING

DIM f AS INTEGER = FREEFILE()
OPEN "iris.csv" FOR INPUT AS f
WHILE NOT EOF(f)
DIM temp AS pt
DO
    LINE INPUT #f, fline
   
    pos2 = instr(pos1, fline, ",")
   
    if pos2 > 0 Then
        token = mid(fline, pos1, pos2 - pos1)   
    Else
        token = Mid(fline, pos1)
    end if
   
   
    redim preserve temp.l(0 to ubound(temp.l) + 1)
    temp.l(ubound(temp.l)) = CAST(DOUBLE,token)
   
    pos1 = pos2 + 1
loop until pos2 = 0

redim preserve p(0 to ubound(p) + 1)
    p(ubound(p)) = temp
WEND

CLOSE #f

Redim As pt res()

ScreenRes 800,600,,64
Screeninfo xres,yres

? "Setting up..."
setup(p())
? "Done."

? "Sorting..."
FOR i AS INTEGER = LBOUND(p) TO UBOUND(p)
   Qsort(p(), i ,Lbound(p),Ubound(p))
NEXT
? "Done."

? "Computing eyepoint..."
Dim As pt eyepoint=RETURN_EYEPOINT(p())     'TYPE(xres/2,yres/2,800)
? "Done."

Do
   CLS
   DIM AS STRING ans
   INPUT "do you wish to manualy input number of cluster range and number of neighbors? y/n: "; ans
   IF ans = LCASE("y") THEN
      DIM psn AS SINGLE, num AS SINGLE
      INPUT "how much cluster range?: "; psn
      INPUT "how many neightbors?: "; num
   ELSE
     psn=range(Lbound(p),Ubound(p))
     num=range(3,15)
   ENDIF
   CLS
     
    GetClosest(p(),res(),psn,num)          'get closest num points centered on p(psn)
    'display
    For n As Long=1 To Ubound(p)
        Var flag=0
        For m As Long=Lbound(res) To Ubound(res)
            If p(n)=res(m) Then flag=n    'check out cluster members.
        Next m
       
        Var rad=map(-500,500,p(n).l(UBOUND(p(n).l)),15,5) 'close, radius big, far radius small
        Var np=perspective(p(n),eyepoint) ' set 3D perspective
        Circle(np.l(0),np.l(1)),rad,p(n).colour,,,,f            'show every point
        'do the rims
        If n=flag Then Circle(np.l(0),np.l(1)),rad,Rgb(0,255,0)  ' a cluster member
        If n=psn Then  Circle(np.l(0),np.l(1)),rad,Rgb(255,0,0)   'the cluster centre
    Next n
   
    Draw String(50,50),"cluster of "&num & " (green rimmed)  centered on p("&psn &")  red rim",Rgb(150,150,150)
    Draw String(50,70),"Cluster centre (" & PRINT_K(p(), psn) & ")",Rgb(150,150,150)
    Draw String (50,90),"Press a key, <esc> to end",Rgb(150,150,150)
    Sleep
    Cls
Loop Until Inkey=Chr(27)


SLEEP
It's failing in setup(). Most likely because you're passing it invalid data. Not that it matters much, since calling the function will overwrite your dataset anyway:

Code: Select all

'set up the array
Sub setup(p() As pt)
    FOR n As Long=1 To Ubound(p)
        'the z value 500 out of screen to 500 into screen
       
       DIM temp AS pt
       REDIM temp.l(0 to ubound(temp.l) + 2)
       temp.l(0) = range(150,(xres-150))
       temp.l(1) = Range(150,(yres-150))
       temp.colour = Rgb(50+Rnd*50,50+Rnd*50,50+Rnd*50)
       IF UBOUND(p(n).l) > 1 THEN
           FOR i AS LONG = LBOUND(p) TO UBOUND(p)
               redim preserve p(0 to ubound(p) + 1)
               temp.l(ubound(temp.l)) = range(-500,500)
           NEXT
       ENDIF
       
        'p(n)=Type(range(150,(xres-150)),Range(150,(yres-150)),range(-500,500),Rgb(50+Rnd*50,50+Rnd*50,50+Rnd*50))
       p(n) = temp
    Next n
End SUB
So you can prescind of the call. However:

Code: Select all

Aborting due to runtime error 6 (out of bounds array access) at line 52 of knn.bas::QSORT()
Look at the original definition of pt:

Code: Select all

Type pt
    As Single x,y,z
    As Ulong colour
End Type
Three coords and a color, whereas your definition can contain an arbitrary amount of data, and the colour member is essentially zero (it will draw nothing). You can't possibly expect this to work as it is...
dodicat
Posts: 7987
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: [HELP] KNN code doesn't work

Post by dodicat »

My closest was Euclidean (goemetric) distance.
Your csv file has four floats and a string for the plant name.
So the four floats are a vector, and the closest points will use this vector

iris.csv

Code: Select all

 
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
4.6,3.6,1.0,0.2,Iris-setosa
5.1,3.3,1.7,0.5,Iris-setosa
4.8,3.4,1.9,0.2,Iris-setosa
5.0,3.0,1.6,0.2,Iris-setosa
5.0,3.4,1.6,0.4,Iris-setosa
5.2,3.5,1.5,0.2,Iris-setosa
5.2,3.4,1.4,0.2,Iris-setosa
4.7,3.2,1.6,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
5.2,4.1,1.5,0.1,Iris-setosa
5.5,4.2,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,0.2,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
5.1,3.4,1.5,0.2,Iris-setosa
5.0,3.5,1.3,0.3,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
4.4,3.2,1.3,0.2,Iris-setosa
5.0,3.5,1.6,0.6,Iris-setosa
5.1,3.8,1.9,0.4,Iris-setosa
4.8,3.0,1.4,0.3,Iris-setosa
5.1,3.8,1.6,0.2,Iris-setosa
4.6,3.2,1.4,0.2,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
6.5,2.8,4.6,1.5,Iris-versicolor
5.7,2.8,4.5,1.3,Iris-versicolor
6.3,3.3,4.7,1.6,Iris-versicolor
4.9,2.4,3.3,1.0,Iris-versicolor
6.6,2.9,4.6,1.3,Iris-versicolor
5.2,2.7,3.9,1.4,Iris-versicolor
5.0,2.0,3.5,1.0,Iris-versicolor
5.9,3.0,4.2,1.5,Iris-versicolor
6.0,2.2,4.0,1.0,Iris-versicolor
6.1,2.9,4.7,1.4,Iris-versicolor
5.6,2.9,3.6,1.3,Iris-versicolor
6.7,3.1,4.4,1.4,Iris-versicolor
5.6,3.0,4.5,1.5,Iris-versicolor
5.8,2.7,4.1,1.0,Iris-versicolor
6.2,2.2,4.5,1.5,Iris-versicolor
5.6,2.5,3.9,1.1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
6.1,2.8,4.0,1.3,Iris-versicolor
6.3,2.5,4.9,1.5,Iris-versicolor
6.1,2.8,4.7,1.2,Iris-versicolor
6.4,2.9,4.3,1.3,Iris-versicolor
6.6,3.0,4.4,1.4,Iris-versicolor
6.8,2.8,4.8,1.4,Iris-versicolor
6.7,3.0,5.0,1.7,Iris-versicolor
6.0,2.9,4.5,1.5,Iris-versicolor
5.7,2.6,3.5,1.0,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.4,3.7,1.0,Iris-versicolor
5.8,2.7,3.9,1.2,Iris-versicolor
6.0,2.7,5.1,1.6,Iris-versicolor
5.4,3.0,4.5,1.5,Iris-versicolor
6.0,3.4,4.5,1.6,Iris-versicolor
6.7,3.1,4.7,1.5,Iris-versicolor
6.3,2.3,4.4,1.3,Iris-versicolor
5.6,3.0,4.1,1.3,Iris-versicolor
5.5,2.5,4.0,1.3,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.1,3.0,4.6,1.4,Iris-versicolor
5.8,2.6,4.0,1.2,Iris-versicolor
5.0,2.3,3.3,1.0,Iris-versicolor
5.6,2.7,4.2,1.3,Iris-versicolor
5.7,3.0,4.2,1.2,Iris-versicolor
5.7,2.9,4.2,1.3,Iris-versicolor
6.2,2.9,4.3,1.3,Iris-versicolor
5.1,2.5,3.0,1.1,Iris-versicolor
5.7,2.8,4.1,1.3,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
6.3,2.9,5.6,1.8,Iris-virginica
6.5,3.0,5.8,2.2,Iris-virginica
7.6,3.0,6.6,2.1,Iris-virginica
4.9,2.5,4.5,1.7,Iris-virginica
7.3,2.9,6.3,1.8,Iris-virginica
6.7,2.5,5.8,1.8,Iris-virginica
7.2,3.6,6.1,2.5,Iris-virginica
6.5,3.2,5.1,2.0,Iris-virginica
6.4,2.7,5.3,1.9,Iris-virginica
6.8,3.0,5.5,2.1,Iris-virginica
5.7,2.5,5.0,2.0,Iris-virginica
5.8,2.8,5.1,2.4,Iris-virginica
6.4,3.2,5.3,2.3,Iris-virginica
6.5,3.0,5.5,1.8,Iris-virginica
7.7,3.8,6.7,2.2,Iris-virginica
7.7,2.6,6.9,2.3,Iris-virginica
6.0,2.2,5.0,1.5,Iris-virginica
6.9,3.2,5.7,2.3,Iris-virginica
5.6,2.8,4.9,2.0,Iris-virginica
7.7,2.8,6.7,2.0,Iris-virginica
6.3,2.7,4.9,1.8,Iris-virginica
6.7,3.3,5.7,2.1,Iris-virginica
7.2,3.2,6.0,1.8,Iris-virginica
6.2,2.8,4.8,1.8,Iris-virginica
6.1,3.0,4.9,1.8,Iris-virginica
6.4,2.8,5.6,2.1,Iris-virginica
7.2,3.0,5.8,1.6,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
7.9,3.8,6.4,2.0,Iris-virginica
6.4,2.8,5.6,2.2,Iris-virginica
6.3,2.8,5.1,1.5,Iris-virginica
6.1,2.6,5.6,1.4,Iris-virginica
7.7,3.0,6.1,2.3,Iris-virginica
6.3,3.4,5.6,2.4,Iris-virginica
6.4,3.1,5.5,1.8,Iris-virginica
6.0,3.0,4.8,1.8,Iris-virginica
6.9,3.1,5.4,2.1,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
6.8,3.2,5.9,2.3,Iris-virginica
6.7,3.3,5.7,2.5,Iris-virginica
6.7,3.0,5.2,2.3,Iris-virginica
6.3,2.5,5.0,1.9,Iris-virginica
6.5,3.0,5.2,2.0,Iris-virginica
6.2,3.4,5.4,2.3,Iris-virginica
5.9,3.0,5.1,1.8,Iris-virginica
And here is a non geometric treatment to get closest values.

Code: Select all



#include "file.bi"
Type pt
    Dim As Double l(1 To 4)
    As String nm
    As Long done
End Type



Sub printout(p As pt)
    For n As Long=1 To 4
        Print p.l(n);",";
    Next
    Print p.nm
End Sub


Function Splitstring(s_in As String,chars As String,result() As String) As Long
    Dim As Long ctr,ctr2,k,n,LC=Len(chars)
    Dim As boolean tally(Len(s_in))
    #macro check_instring()
    n=0
    While n<Lc
        If chars[n]=s_in[k] Then 
            tally(k)=true
            If (ctr2-1) Then ctr+=1
            ctr2=0
            Exit While
        End If
        n+=1
    Wend
    #endmacro
    
    #macro splice()
    If tally(k) Then
        If (ctr2-1) Then ctr+=1:result(ctr)=Mid(s_in,k+2-ctr2,ctr2-1)
        ctr2=0
    End If
    #endmacro
    '==================  LOOP TWICE =======================
    For k  =0 To Len(s_in)-1
        ctr2+=1:check_instring()
    Next k
    If ctr=0 Then
        If Len(s_in) Andalso Instr(chars,Chr(s_in[0])) Then ctr=1':beep
    End If
    If ctr Then Redim result(1 To ctr): ctr=0:ctr2=0 Else  Return 0
    For k  =0 To Len(s_in)-1
        ctr2+=1:splice()
    Next k
    '===================== Last one ========================
    If ctr2>0 Then
        Redim Preserve result(1 To ctr+1)
        result(ctr+1)=Mid(s_in,k+1-ctr2,ctr2)
    End If
    
    Return Ubound(result)
End Function

Function loadfiletostring(file As String) As String
    Var  f=Freefile
    if fileexists(file)=0 then print file;"  not found":sleep:end
    Open file For Binary Access Read As #f
    Dim As String text
    If Lof(f) > 0 Then
        text = String(Lof(f), 0)
        Get #f, , text
    End If
    Close #f
    Return text
End Function

Function vdist(p1 As pt,p2 As pt) As Double
    Dim As Double acc
    For n As Long=Lbound(p1.l) To Ubound(p1.l)
        acc+=(p1.l(n)-p2.l(n))^2
    Next
    Return Sqr(acc)
End Function

Function GetClosest(a() As pt,ans() As Long,v As pt,num As Long) As Long
    Dim As Double d,i
    Dim As Long ctr
    Do
        d=2e8
        For n As Long=Lbound(a) To Ubound(a)
            Var dst=vdist(a(n),v)
            If d>dst And a(n).done=0 Then d=dst:i=n:a(n).done=1
        Next n
        ctr+=1
        Redim Preserve ans(1 To ctr)
        ans(ctr)=i
    Loop Until Ubound(ans)>=num
    Return Ubound(ans)
End Function



Dim As String s=loadfiletostring("iris.csv")
Redim As String g()
splitstring(s,Chr(10),g()) 'load the file into a string array g()

Dim As pt array(Lbound(g) To Ubound(g)) '

For n As Long=Lbound(array) To Ubound(array)
    Redim As String tmp()
    splitstring(g(n),",",tmp())''load into temp with the , as seperator.
    For m As Long=1 To 4
        array(n).l(m)=Val(tmp(m)) 'pick out the values to suit the udt
    Next m
    array(n).nm=tmp(5)
    print n,
    printout(array(n))
Next

print



Redim As Long near()


GetClosest(array(),near(),array(100),8)'' <<--------  here, get 8 closest to array(100)
print "using Getclosest()"

Print "index",,"values"
For n As Long=Lbound(near) To Ubound(near)
    Print near(n),
    printout array(near(n))
Next

'reset array
for n as long=lbound(array) to ubound(array)
    array(n).done=0
next n


GetClosest(array(),near(),array(10),5)'' <<--------  here, get 5 closest to array(10)
print "using Getclosest()"

Print "index",,"values"
For n As Long=Lbound(near) To Ubound(near)
    Print near(n),
    printout array(near(n))
Next
for n as long=lbound(array) to ubound(array)
    array(n).done=0
    next n

Sleep


 
ron77
Posts: 212
Joined: Feb 21, 2019 19:24

Re: [HELP] KNN code doesn't work

Post by ron77 »

hello all. hi paul doe and dodicat.

i am hoping to take dodicat code example and convert it into a "generic - multi-value KNN" program that will be able to do analysis on as many csv datasets as possible. today i will have a lesson with my teacher and we'll examine this possibility

ron77
ron77
Posts: 212
Joined: Feb 21, 2019 19:24

Re: [SOLVED] KNN code work

Post by ron77 »

hello all :)

i'm excited to announce that me and my teacher have succeeded to modify dodicat KNN code to a flexible generic KNN code that can read and analyze many datasets types from simple 2 values to multi values AND also can take new values and make a prediction based on the dataset.

here for example are two types of datasets 1. "iris.csv" dataset with 4 numeric values and 2. a pseudo-dataset called "knn-dataset-plant.csv"...

1."iris.csv" dataset

Code: Select all

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
4.6,3.6,1.0,0.2,Iris-setosa
5.1,3.3,1.7,0.5,Iris-setosa
4.8,3.4,1.9,0.2,Iris-setosa
5.0,3.0,1.6,0.2,Iris-setosa
5.0,3.4,1.6,0.4,Iris-setosa
5.2,3.5,1.5,0.2,Iris-setosa
5.2,3.4,1.4,0.2,Iris-setosa
4.7,3.2,1.6,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
5.2,4.1,1.5,0.1,Iris-setosa
5.5,4.2,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,0.2,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
5.1,3.4,1.5,0.2,Iris-setosa
5.0,3.5,1.3,0.3,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
4.4,3.2,1.3,0.2,Iris-setosa
5.0,3.5,1.6,0.6,Iris-setosa
5.1,3.8,1.9,0.4,Iris-setosa
4.8,3.0,1.4,0.3,Iris-setosa
5.1,3.8,1.6,0.2,Iris-setosa
4.6,3.2,1.4,0.2,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
6.5,2.8,4.6,1.5,Iris-versicolor
5.7,2.8,4.5,1.3,Iris-versicolor
6.3,3.3,4.7,1.6,Iris-versicolor
4.9,2.4,3.3,1.0,Iris-versicolor
6.6,2.9,4.6,1.3,Iris-versicolor
5.2,2.7,3.9,1.4,Iris-versicolor
5.0,2.0,3.5,1.0,Iris-versicolor
5.9,3.0,4.2,1.5,Iris-versicolor
6.0,2.2,4.0,1.0,Iris-versicolor
6.1,2.9,4.7,1.4,Iris-versicolor
5.6,2.9,3.6,1.3,Iris-versicolor
6.7,3.1,4.4,1.4,Iris-versicolor
5.6,3.0,4.5,1.5,Iris-versicolor
5.8,2.7,4.1,1.0,Iris-versicolor
6.2,2.2,4.5,1.5,Iris-versicolor
5.6,2.5,3.9,1.1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
6.1,2.8,4.0,1.3,Iris-versicolor
6.3,2.5,4.9,1.5,Iris-versicolor
6.1,2.8,4.7,1.2,Iris-versicolor
6.4,2.9,4.3,1.3,Iris-versicolor
6.6,3.0,4.4,1.4,Iris-versicolor
6.8,2.8,4.8,1.4,Iris-versicolor
6.7,3.0,5.0,1.7,Iris-versicolor
6.0,2.9,4.5,1.5,Iris-versicolor
5.7,2.6,3.5,1.0,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.4,3.7,1.0,Iris-versicolor
5.8,2.7,3.9,1.2,Iris-versicolor
6.0,2.7,5.1,1.6,Iris-versicolor
5.4,3.0,4.5,1.5,Iris-versicolor
6.0,3.4,4.5,1.6,Iris-versicolor
6.7,3.1,4.7,1.5,Iris-versicolor
6.3,2.3,4.4,1.3,Iris-versicolor
5.6,3.0,4.1,1.3,Iris-versicolor
5.5,2.5,4.0,1.3,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.1,3.0,4.6,1.4,Iris-versicolor
5.8,2.6,4.0,1.2,Iris-versicolor
5.0,2.3,3.3,1.0,Iris-versicolor
5.6,2.7,4.2,1.3,Iris-versicolor
5.7,3.0,4.2,1.2,Iris-versicolor
5.7,2.9,4.2,1.3,Iris-versicolor
6.2,2.9,4.3,1.3,Iris-versicolor
5.1,2.5,3.0,1.1,Iris-versicolor
5.7,2.8,4.1,1.3,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
6.3,2.9,5.6,1.8,Iris-virginica
6.5,3.0,5.8,2.2,Iris-virginica
7.6,3.0,6.6,2.1,Iris-virginica
4.9,2.5,4.5,1.7,Iris-virginica
7.3,2.9,6.3,1.8,Iris-virginica
6.7,2.5,5.8,1.8,Iris-virginica
7.2,3.6,6.1,2.5,Iris-virginica
6.5,3.2,5.1,2.0,Iris-virginica
6.4,2.7,5.3,1.9,Iris-virginica
6.8,3.0,5.5,2.1,Iris-virginica
5.7,2.5,5.0,2.0,Iris-virginica
5.8,2.8,5.1,2.4,Iris-virginica
6.4,3.2,5.3,2.3,Iris-virginica
6.5,3.0,5.5,1.8,Iris-virginica
7.7,3.8,6.7,2.2,Iris-virginica
7.7,2.6,6.9,2.3,Iris-virginica
6.0,2.2,5.0,1.5,Iris-virginica
6.9,3.2,5.7,2.3,Iris-virginica
5.6,2.8,4.9,2.0,Iris-virginica
7.7,2.8,6.7,2.0,Iris-virginica
6.3,2.7,4.9,1.8,Iris-virginica
6.7,3.3,5.7,2.1,Iris-virginica
7.2,3.2,6.0,1.8,Iris-virginica
6.2,2.8,4.8,1.8,Iris-virginica
6.1,3.0,4.9,1.8,Iris-virginica
6.4,2.8,5.6,2.1,Iris-virginica
7.2,3.0,5.8,1.6,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
7.9,3.8,6.4,2.0,Iris-virginica
6.4,2.8,5.6,2.2,Iris-virginica
6.3,2.8,5.1,1.5,Iris-virginica
6.1,2.6,5.6,1.4,Iris-virginica
7.7,3.0,6.1,2.3,Iris-virginica
6.3,3.4,5.6,2.4,Iris-virginica
6.4,3.1,5.5,1.8,Iris-virginica
6.0,3.0,4.8,1.8,Iris-virginica
6.9,3.1,5.4,2.1,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
6.8,3.2,5.9,2.3,Iris-virginica
6.7,3.3,5.7,2.5,Iris-virginica
6.7,3.0,5.2,2.3,Iris-virginica
6.3,2.5,5.0,1.9,Iris-virginica
6.5,3.0,5.2,2.0,Iris-virginica
6.2,3.4,5.4,2.3,Iris-virginica
5.9,3.0,5.1,1.8,Iris-virginica
2. "knn-dataset-plant.csv"

Code: Select all

839.763, 533.6277, plant num.2
87.47839, 438.9235, plant num.1
869.1406, 693.8194, plant num.1
398.068, 739.2241, plant num.1
900.3099, 714.3109, plant num.1
38.52216, 186.3096, plant num.2
666.7241, 246.339, plant num.1
540.0168, 385.1998, plant num.2
942.0279, 189.435, plant num.2
1000.57, 196.747, plant num.3
174.3616, 146.0663, plant num.3
480.4058, 311.0027, plant num.3
994.8592, 2.625352, plant num.1
737.84, 118.9717, plant num.3
898.6907, 127.8865, plant num.1
657.5497, 196.0993, plant num.3
914.6758, 644.7253, plant num.3
748.7549, 568.1848, plant num.2
395.9584, 86.06844, plant num.2
155.6262, 398.4563, plant num.3
164.1085, 75.89746, plant num.1
218.2886, 522.6999, plant num.3
171.0408, 684.1329, plant num.3
682.1821, 349.0497, plant num.1
340.0001, 176.8915, plant num.3
641.1478, 694.9907, plant num.3
407.0589, 292.1173, plant num.2
999.2043, 251.2751, plant num.1
683.9269, 697.4973, plant num.3
639.5421, 623.4372, plant num.2
806.2257, 362.1215, plant num.3
325.5341, 209.2259, plant num.2
258.0703, 431.5316, plant num.1
546.2432, 585.4984, plant num.3
826.0145, 397.9962, plant num.3
921.4373, 596.2986, plant num.2
767.2444, 372.5981, plant num.3
840.975, 228.0614, plant num.1
325.9069, 342.6331, plant num.1
810.6459, 514.7084, plant num.2
632.4595, 409.6365, plant num.2
943.6869, 522.8565, plant num.1
827.6112, 181.5865, plant num.3
337.257, 530.5561, plant num.1
444.4773, 124.6036, plant num.1
757.4684, 376.4569, plant num.2
486.4684, 767.4945, plant num.1
848.9123, 308.1569, plant num.2
337.3087, 522.3027, plant num.1
43.18093, 336.0134, plant num.2
416.4825, 26.3317, plant num.3
879.1071, 144.165, plant num.1
554.0399, 753.3649, plant num.2
670.7303, 234.55, plant num.2
247.115, 566.0507, plant num.1
413.818, 367.7557, plant num.3
492.2917, 235.5425, plant num.1
112.6892, 194.0009, plant num.1
111.9127, 48.79112, plant num.1
738.8285, 288.8619, plant num.2
530.6846, 173.4339, plant num.2
250.8573, 0.6196359, plant num.2
468.0082, 207.6705, plant num.1
353.6275, 578.3349, plant num.2
436.7659, 441.8147, plant num.3
319.5284, 159.2238, plant num.3
362.4163, 584.6849, plant num.3
746.8157, 312.1161, plant num.2
461.2215, 540.4637, plant num.1
828.8101, 188.1416, plant num.1
915.6528, 571.3168, plant num.1
869.2048, 473.4064, plant num.3
248.2429, 346.4263, plant num.1
615.1219, 242.2403, plant num.2
839.6243, 33.88754, plant num.2
257.0939, 410.6909, plant num.2
517.286, 563.5756, plant num.1
422.6416, 570.4261, plant num.3
534.619, 732.4036, plant num.2
13.92606, 680.5698, plant num.1
700.7261, 440.1666, plant num.2
13.40573, 417.5752, plant num.3
772.497, 530.7556, plant num.3
847.9851, 424.5849, plant num.1
782.1179, 499.8281, plant num.1
983.0454, 598.0959, plant num.2
319.6597, 631.7553, plant num.1
692.5876, 512.6766, plant num.2
297.2598, 45.30761, plant num.1
732.7919, 382.5357, plant num.2
764.0788, 747.5126, plant num.2
574.5899, 410.6271, plant num.1
956.7115, 461.3054, plant num.2
1015.797, 269.7585, plant num.3
50.945, 577.9126, plant num.1
672.517, 109.4288, plant num.1
794.2375, 469.534, plant num.1
546.4384, 127.4397, plant num.1
539.947, 595.5008, plant num.1
883.6649, 105.8518, plant num.3
292.0938, 68.74947, plant num.3
450.9285, 410.7156, plant num.3
726.9075, 762.2878, plant num.2
214.9403, 762.587, plant num.1
133.9723, 649.9973, plant num.1
710.482, 657.7135, plant num.2
156.9566, 32.05522, plant num.2
686.8109, 194.4269, plant num.3
880.1242, 506.8491, plant num.3
219.9346, 680.6885, plant num.1
315.347, 460.6167, plant num.3
200.5164, 249.3748, plant num.1
520.9279, 627.285, plant num.2
605.5869, 383.9005, plant num.2
964.1627, 216.9309, plant num.3
741.0781, 530.4372, plant num.3
132.7084, 245.7739, plant num.3
373.2134, 346.4675, plant num.3
310.8484, 714.7153, plant num.2
434.4096, 248.883, plant num.1
731.2008, 362.0758, plant num.3
802.8395, 48.98589, plant num.3
67.51826, 30.34593, plant num.1
513.0661, 92.88588, plant num.3
343.6482, 24.01361, plant num.2
785.9011, 542.2375, plant num.3
681.9315, 191.8754, plant num.1
474.2036, 397.483, plant num.3
442.2979, 339.7865, plant num.3
757.4759, 126.4488, plant num.2
614.3705, 38.27979, plant num.3
284.4861, 639.895, plant num.1
778.7795, 646.638, plant num.1
20.34034, 289.8951, plant num.3
40.89967, 308.0763, plant num.1
616.3001, 165.48, plant num.1
235.5017, 29.06763, plant num.3
90.43571, 204.0504, plant num.2
324.2846, 442.11, plant num.2
93.96637, 209.3346, plant num.1
626.473, 557.2963, plant num.3
191.6159, 288.6758, plant num.3
430.0779, 732.5209, plant num.2
207.1581, 40.6993, plant num.1
280.9835, 283.1524, plant num.3
300.3931, 146.7829, plant num.1
961.2844, 426.3438, plant num.2
316.2873, 375.641, plant num.2
898.9312, 556.0547, plant num.2
428.5216, 408.2652, plant num.1
628.0039, 612.3703, plant num.3
949.1718, 375.1642, plant num.1
530.7438, 458.062, plant num.1
200.7454, 763.7803, plant num.1
648.1556, 256.233, plant num.1
198.7949, 731.3918, plant num.1
934.6631, 322.4878, plant num.2
541.7086, 664.2141, plant num.1
394.9478, 595.1163, plant num.3
117.4586, 592.5765, plant num.2
502.2957, 337.6709, plant num.3
567.2498, 457.8775, plant num.3
389.9026, 192.7876, plant num.2
563.4474, 637.5876, plant num.3
166.9886, 709.3369, plant num.1
953.5059, 212.9809, plant num.3
160.4703, 386.8483, plant num.2
385.353, 194.6259, plant num.3
281.856, 380.8631, plant num.3
491.7416, 2.503178, plant num.1
874.0471, 420.7957, plant num.1
581.449, 681.485, plant num.3
306.8539, 269.4364, plant num.2
640.6052, 262.9069, plant num.2
921.6228, 720.8293, plant num.1
242.7166, 199.3442, plant num.1
641.933, 605.6088, plant num.1
957.9228, 653.6485, plant num.1
733.4725, 517.1112, plant num.2
621.8931, 614.5082, plant num.3
541.3489, 645.144, plant num.2
125.8706, 282.814, plant num.1
638.3887, 31.75971, plant num.2
926.7067, 122.4137, plant num.2
600.1289, 399.6485, plant num.2
197.2466, 234.6442, plant num.1
723.4439, 605.9676, plant num.1
477.3592, 227.4747, plant num.3
356.0958, 257.2688, plant num.3
498.4721, 254.3261, plant num.3
655.793, 240.8912, plant num.3
863.5218, 566.6914, plant num.3
1010.726, 350.4812, plant num.2
502.8075, 100.7497, plant num.3
199.8911, 677.0789, plant num.3
38.96012, 358.3307, plant num.1
766.0051, 249.1684, plant num.2
139.6805, 177.1335, plant num.3
475.4207, 537.1297, plant num.1
901.5436, 444.4365, plant num.1
478.4989, 748.1678, plant num.3
444.2291, 267.2949, plant num.1
1001.314, 392.2352, plant num.2
40.39112, 131.9307, plant num.3
976.6196, 272.1337, plant num.3
543.2847, 702.0201, plant num.2
268.3406, 0.7956254, plant num.1
429.5689, 228.3265, plant num.2
648.1038, 712.0261, plant num.3
77.25901, 224.1226, plant num.1
847.4438, 211.4168, plant num.2
638.0681, 719.0822, plant num.1
157.6568, 530.1169, plant num.3
339.8197, 84.93613, plant num.1
518.3911, 602.4036, plant num.1
866.8544, 74.78979, plant num.1
968.9441, 125.431, plant num.1
570.7975, 704.2735, plant num.1
242.3893, 255.9464, plant num.2
960.5748, 100.0783, plant num.2
765.5321, 389.8643, plant num.2
289.3477, 445.8726, plant num.2
450.9991, 468.1948, plant num.3
17.71869, 508.1407, plant num.1
937.0787, 166.6956, plant num.3
762.3339, 469.3661, plant num.1
143.0164, 59.94797, plant num.3
839.0485, 15.09337, plant num.2
409.9516, 450.2187, plant num.2
932.5964, 467.2746, plant num.3
AND HERE IS THE MODIFIED CODE it will ask you some question like "how many numeric columns does the dataset in use has?" and later to insert new values for predictions

Code: Select all

#include "file.bi"
DIM SHARED col AS INTEGER

Type pt
    Dim As Double l(1 To col)
    As String nm
    As Long done
    Declare Constructor
End Type

Constructor pt
done=0
End Constructor

Sub printout(p As pt)
    For n As Long=1 To col
        Print p.l(n);",";
    Next
    Print p.nm
End Sub


Function Splitstring(s_in As String,chars As String,result() As String) As Long
    Dim As Long ctr,ctr2,k,n,LC=Len(chars)
    Dim As boolean tally(Len(s_in))
    #macro check_instring()
    n=0
    While n<Lc
        If chars[n]=s_in[k] Then
            tally(k)=true
            If (ctr2-1) Then ctr+=1
            ctr2=0
            Exit While
        End If
        n+=1
    Wend
    #endmacro
   
    #macro splice()
    If tally(k) Then
        If (ctr2-1) Then ctr+=1:result(ctr)=Mid(s_in,k+2-ctr2,ctr2-1)
        ctr2=0
    End If
    #endmacro
    '==================  LOOP TWICE =======================
    For k  =0 To Len(s_in)-1
        ctr2+=1:check_instring()
    Next k
    If ctr=0 Then
        If Len(s_in) Andalso Instr(chars,Chr(s_in[0])) Then ctr=1':beep
    End If
    If ctr Then Redim result(1 To ctr): ctr=0:ctr2=0 Else  Return 0
    For k  =0 To Len(s_in)-1
        ctr2+=1:splice()
    Next k
    '===================== Last one ========================
    If ctr2>0 Then
		Redim Preserve result(1 To ctr+1)
		result(ctr+1)=Mid(s_in,k+1-ctr2,ctr2)
	End If
   
    Return Ubound(result)
End Function

Function loadfiletostring(file As String) As String
    Var  f=Freefile
    Open file For Binary Access Read As #f
    Dim As String text
    If Lof(1) > 0 Then
        text = String(Lof(f), 0)
        Get #f, , text
    End If
    Close #f
    Return text
End Function

Function vdist(p1 As pt,p2 As pt) As Double
    Dim As Double acc
    For n As Long=Lbound(p1.l) To Ubound(p1.l)
        acc+=(p1.l(n)-p2.l(n))^2
    Next
    Return Sqr(acc)
End Function

Function GetClosest(a() As pt,ans() As Long,v As pt,num As Long) As Long
    Dim As Double d,i
    Dim As Long ctr
    Do
        d=2e8
        For n As Long=Lbound(a) To Ubound(a)
            Var dst=vdist(a(n),v)
            If d>dst And a(n).done=0 Then d=dst:i=n:a(n).done=1
        Next n
        ctr+=1
        Redim Preserve ans(1 To ctr)
        ans(ctr)=i
    Loop Until Ubound(ans)>=num
    Return Ubound(ans)
End Function


INPUT "how many numeric columons does your csv file has?: "; col
Dim As String s=loadfiletostring("..\iris.csv")'("knn-dataset-plant.csv")
Redim As String g()
splitstring(s,Chr(10),g()) 'load the file into a string array g()

Dim As pt array(Lbound(g) To Ubound(g)) '

For n As Long=Lbound(array) To Ubound(array)
    Redim As String tmp()
    splitstring(g(n),",",tmp())''load into temp with the , as seperator.
    For m As Long=1 To col
        array(n).l(m)=Val(tmp(m)) 'pick out the values to suit the udt
    Next m
    array(n).nm=tmp(col+1)
    print n,
    printout(array(n))
Next

print



Redim As Long near()


GetClosest(array(),near(),array(100),8)'' <<--------  here, get 8 closest to array(100)
print "using Getclosest()"

Print "index",,"values"
For n As Long=Lbound(near) To Ubound(near)
    Print near(n),
    printout array(near(n))
Next


'============= new predictions!!!=================
DIM answer AS STRING
DO
DIM predict AS pt
DIM AS DOUBLE x1
DIM neighbors AS LONG
DIM counter AS INTEGER = 0

DO UNTIL counter >= col

INPUT "insert NEW value TO prediction: "; x1
predict.l(counter) = x1
counter += 1
LOOP
INPUT "how many neighbors TO search?: "; neighbors

Redim As Long near()
GetClosest(array(),near(),predict,neighbors) ' <- new predictions for new values
print "using Getclosest()"

Print "index",,"values"
For n As Long=Lbound(near) To Ubound(near)
    Print near(n),
    printout array(near(n))
NEXT

INPUT "another prediction? y/n: "; answer
LOOP UNTIL answer = "n"

SLEEP()

 


please inspect the code and give feebbacks and notes if you can - THE NEXT STEP IS to make from this working code a KNN library for the use of the freeBASIC community...

:)

thanks for all your help
ron77...
ron77
Posts: 212
Joined: Feb 21, 2019 19:24

Re: [HELP] turning working KNN code into a static library

Post by ron77 »

hello i need help again...

i'm trying to create an FB static library from the working KNN code and it doesn't seem to be working when compiled the test bas file doesn't do ant calculation the lib.a file and bi file seems to be compiling well but i get this weird message:
Make done
D:\FreeBASIC-1.07.1-win32\fbc -g -s console "KNN_test1.bas"
./libfb_knn_lib.a(fb_knn_lib.o):fake:(.text+0x0): multiple definition of `PT::PT()'
KNN_test1.o:D:\REPO\FREEBASIC_PROJECTS\LESSONS\FB_KNN.BI:12: first defined here

Make done


here are the bi file code and the lib bas file code and the test bas file code - any help would be greatly appreciated...

"fb_knn.bi" file code:

Code: Select all

#include "file.bi"

DIM SHARED col AS INTEGER

Type pt
    Dim As Double l(1 To col)
    As String nm
    As Long done
    Declare Constructor
End Type

Constructor pt
done=0
End CONSTRUCTOR

DECLARE Sub printout(p As pt)
DECLARE Function Splitstring(s_in As String,chars As String,result() As String) As Long
DECLARE Function loadfiletostring(file As String) As STRING
DECLARE Function vdist(p1 As pt,p2 As pt) As DOUBLE
DECLARE Function GetClosest(a() As pt,ans() As Long,v As pt,num As Long) As LONG
'DECLARE SUB dataset(file AS STRING)

#INCLIB "fb_knn_lib"
"fb_knn_lib.bas" file (the library code)

Code: Select all

#INCLUDE "file.bi"

DIM SHARED col AS INTEGER

Type pt
    Dim As Double l(1 To col)
    As String nm
    As Long done
    Declare Constructor
End Type

Constructor pt
done=0
End Constructor

Sub printout(p As pt)
    For n As Long=1 To col
        Print p.l(n);",";
    Next
    Print p.nm
End Sub


Function Splitstring(s_in As String,chars As String,result() As String) As Long
    Dim As Long ctr,ctr2,k,n,LC=Len(chars)
    Dim As boolean tally(Len(s_in))
    #macro check_instring()
    n=0
    While n<Lc
        If chars[n]=s_in[k] Then
            tally(k)=true
            If (ctr2-1) Then ctr+=1
            ctr2=0
            Exit While
        End If
        n+=1
    Wend
    #endmacro
   
    #macro splice()
    If tally(k) Then
        If (ctr2-1) Then ctr+=1:result(ctr)=Mid(s_in,k+2-ctr2,ctr2-1)
        ctr2=0
    End If
    #endmacro
    '==================  LOOP TWICE =======================
    For k  =0 To Len(s_in)-1
        ctr2+=1:check_instring()
    Next k
    If ctr=0 Then
        If Len(s_in) Andalso Instr(chars,Chr(s_in[0])) Then ctr=1':beep
    End If
    If ctr Then Redim result(1 To ctr): ctr=0:ctr2=0 Else  Return 0
    For k  =0 To Len(s_in)-1
        ctr2+=1:splice()
    Next k
    '===================== Last one ========================
    If ctr2>0 Then
		Redim Preserve result(1 To ctr+1)
		result(ctr+1)=Mid(s_in,k+1-ctr2,ctr2)
	End If
   
    Return Ubound(result)
End Function

Function loadfiletostring(file As String) As String
    Var  f=Freefile
    Open file For Binary Access Read As #f
    Dim As String text
    If Lof(1) > 0 Then
        text = String(Lof(f), 0)
        Get #f, , text
    End If
    Close #f
    Return text
End Function

Function vdist(p1 As pt,p2 As pt) As Double
    Dim As Double acc
    For n As Long=Lbound(p1.l) To Ubound(p1.l)
        acc+=(p1.l(n)-p2.l(n))^2
    Next
    Return Sqr(acc)
End Function

Function GetClosest(a() As pt,ans() As Long,v As pt,num As Long) As Long
    Dim As Double d,i
    Dim As Long ctr
    Do
        d=2e8
        For n As Long=Lbound(a) To Ubound(a)
            Var dst=vdist(a(n),v)
            If d>dst And a(n).done=0 Then d=dst:i=n:a(n).done=1
        Next n
        ctr+=1
        Redim Preserve ans(1 To ctr)
        ans(ctr)=i
    Loop Until Ubound(ans)>=num
    Return Ubound(ans)
End Function


"knn_test1.bas" code (the test file)

Code: Select all

#INCLUDE "fb_knn.bi"

INPUT "how many numeric columons does your csv file has?: "; col
Dim As String s=loadfiletostring("iris.csv")'("knn-dataset-plant.csv")
Redim As String g()
splitstring(s,Chr(10),g()) 'load the file into a string array g()

Dim As pt array(Lbound(g) To Ubound(g)) '

For n As Long=Lbound(array) To Ubound(array)
    Redim As String tmp()
    splitstring(g(n),",",tmp())''load into temp with the , as seperator.
    For m As Long=1 To col
        array(n).l(m)=Val(tmp(m)) 'pick out the values to suit the udt
    Next m
    array(n).nm=tmp(col+1)
    print n,
    printout(array(n))
Next

print



Redim As Long near()


GetClosest(array(),near(),array(100),8)'' <<--------  here, get 8 closest to array(100)
print "using Getclosest()"

Print "index",,"values"
For n As Long=Lbound(near) To Ubound(near)
    Print near(n),
    printout array(near(n))
Next


'============= new predictions!!!=================
DIM answer AS STRING
DO
DIM predict AS pt
DIM AS DOUBLE x1
DIM neighbors AS LONG
DIM counter AS INTEGER = 0

DO UNTIL counter >= col

INPUT "insert NEW value TO prediction: "; x1
predict.l(counter) = x1
counter += 1
LOOP
INPUT "how many neighbors TO search?: "; neighbors

Redim As Long near()
GetClosest(array(),near(),predict,neighbors) ' <- new predictions for new values
print "using Getclosest()"

Print "index",,"values"
For n As Long=Lbound(near) To Ubound(near)
    Print near(n),
    printout array(near(n))
NEXT

INPUT "another prediction? y/n: "; answer
LOOP UNTIL answer = "n"

SLEEP()

 
fxm
Moderator
Posts: 12161
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: [HELP] turning working KNN code into a static library

Post by fxm »

`PT::PT()' designates the constructor of 'PT'.

The constructor body:
Constructor pt
done=0
End Constructor

which is not a declaration must not be put in a *.bi file (only declarations inside).
ron77
Posts: 212
Joined: Feb 21, 2019 19:24

Re: [HELP] turning working KNN code into a static library

Post by ron77 »

thank you fxm for the correct advice...

now the program crashes right after asking for how many numeric values the csv has... i've change the code a bit... still need help in understanding how to fix this... here are the updated codes:

"fb_knn.bi":

Code: Select all

#include "file.bi"

DIM SHARED col AS INTEGER

Type pt
    Dim As Double l(1 To col)
    As String nm
    As Long done
    Declare Constructor
End Type


DECLARE Sub printout(p As pt)
DECLARE Function Splitstring(s_in As String,chars As String,result() As String) As Long
DECLARE Function loadfiletostring(file As String) As STRING
DECLARE Function vdist(p1 As pt,p2 As pt) As DOUBLE
DECLARE Function GetClosest(a() As pt,ans() As Long,v As pt,num As Long) As LONG
DECLARE SUB knn_main(f AS STRING)

#INCLIB "fb_knn_lib"
"fb_knn_lib.bas" (library bas file):

Code: Select all

#INCLUDE "file.bi"

DIM SHARED col AS INTEGER

Type pt
    Dim As Double l(1 To col)
    As String nm
    As Long done
    Declare Constructor
End Type

Constructor pt
done=0
End Constructor

Sub printout(p As pt)
    For n As Long=1 To col
        Print p.l(n);",";
    Next
    Print p.nm
End Sub


Function Splitstring(s_in As String,chars As String,result() As String) As Long
    Dim As Long ctr,ctr2,k,n,LC=Len(chars)
    Dim As boolean tally(Len(s_in))
    #macro check_instring()
    n=0
    While n<Lc
        If chars[n]=s_in[k] Then
            tally(k)=true
            If (ctr2-1) Then ctr+=1
            ctr2=0
            Exit While
        End If
        n+=1
    Wend
    #endmacro
   
    #macro splice()
    If tally(k) Then
        If (ctr2-1) Then ctr+=1:result(ctr)=Mid(s_in,k+2-ctr2,ctr2-1)
        ctr2=0
    End If
    #endmacro
    '==================  LOOP TWICE =======================
    For k  =0 To Len(s_in)-1
        ctr2+=1:check_instring()
    Next k
    If ctr=0 Then
        If Len(s_in) Andalso Instr(chars,Chr(s_in[0])) Then ctr=1':beep
    End If
    If ctr Then Redim result(1 To ctr): ctr=0:ctr2=0 Else  Return 0
    For k  =0 To Len(s_in)-1
        ctr2+=1:splice()
    Next k
    '===================== Last one ========================
    If ctr2>0 Then
		Redim Preserve result(1 To ctr+1)
		result(ctr+1)=Mid(s_in,k+1-ctr2,ctr2)
	End If
   
    Return Ubound(result)
End Function

Function loadfiletostring(file As String) As String
    Var  f=Freefile
    Open file For Binary Access Read As #f
    Dim As String text
    If Lof(1) > 0 Then
        text = String(Lof(f), 0)
        Get #f, , text
    End If
    Close #f
    Return text
End Function

Function vdist(p1 As pt,p2 As pt) As Double
    Dim As Double acc
    For n As Long=Lbound(p1.l) To Ubound(p1.l)
        acc+=(p1.l(n)-p2.l(n))^2
    Next
    Return Sqr(acc)
End Function

Function GetClosest(a() As pt,ans() As Long,v As pt,num As Long) As Long
    Dim As Double d,i
    Dim As Long ctr
    Do
        d=2e8
        For n As Long=Lbound(a) To Ubound(a)
            Var dst=vdist(a(n),v)
            If d>dst And a(n).done=0 Then d=dst:i=n:a(n).done=1
        Next n
        ctr+=1
        Redim Preserve ans(1 To ctr)
        ans(ctr)=i
    Loop Until Ubound(ans)>=num
    Return Ubound(ans)
End Function

SUB knn_main(f AS STRING)
INPUT "how many numeric columons does your csv file has?: "; col
Dim As String s=loadfiletostring(f)'("knn-dataset-plant.csv")
Redim As String g()
splitstring(s,Chr(10),g()) 'load the file into a string array g()

Dim As pt array(Lbound(g) To Ubound(g)) '

For n As Long=Lbound(array) To Ubound(array)
    Redim As String tmp()
    splitstring(g(n),",",tmp())''load into temp with the , as seperator.
    For m As Long=0 To col
        array(n).l(m)=Val(tmp(m)) 'pick out the values to suit the udt
    Next m
    array(n).nm=tmp(col+1)
    print n,
    printout(array(n))
Next

print



Redim As Long near()


GetClosest(array(),near(),array(100),8)'' <<--------  here, get 8 closest to array(100)
print "using Getclosest()"

Print "index",,"values"
For n As Long=Lbound(near) To Ubound(near)
    Print near(n),
    printout array(near(n))
Next


'============= new predictions!!!=================
DIM answer AS STRING
DO
DIM predict AS pt
DIM AS DOUBLE x1
DIM neighbors AS LONG
DIM counter AS INTEGER = 0

DO UNTIL counter >= col

INPUT "insert NEW value TO prediction: "; x1
predict.l(counter) = x1
counter += 1
LOOP
INPUT "how many neighbors TO search?: "; neighbors

Redim As Long near()
GetClosest(array(),near(),predict,neighbors) ' <- new predictions for new values
print "using Getclosest()"

Print "index",,"values"
For n As Long=Lbound(near) To Ubound(near)
    Print near(n),
    printout array(near(n))
NEXT

INPUT "another prediction? y/n: "; answer
LOOP UNTIL answer = "n"
END SUB
and test file bas :

Code: Select all

#INCLUDE "fb_knn.bi"
KNN_MAIN("iris.csv")

SLEEP()
any help is welcomed and greatly appreciated
ron77...
fxm
Moderator
Posts: 12161
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: [HELP] turning working KNN code into a static library

Post by fxm »

In:

Code: Select all

For n As Long=Lbound(array) To Ubound(array)
    Redim As String tmp()
    splitstring(g(n),",",tmp())''load into temp with the , as seperator.
    For m As Long=0 To col
        array(n).l(m)=Val(tmp(m)) 'pick out the values to suit the udt
    Next m
    array(n).nm=tmp(col+1)
    print n,
    printout(array(n))
Next
Two points to correct (before testing the rest of the code):
- The string array 'tmp()' is sized from the index=1 to the index=col, while the 'FOR m' loop iterator runs for m=0 to m=col.
- In addition, at the line 'array(n).nm=tmp(col+1)', 'tmp()' is accessed at the index=col+1.
=> A "out of array bounds" error therefore occurs twice for 'tmp()'.
Post Reply