How would you find the shortest distance in a set of points?

General FreeBASIC programming questions.
Tourist Trap
Posts: 2958
Joined: Jun 02, 2015 16:24

How would you find the shortest distance in a set of points?

Post by Tourist Trap »

Hello,

so as said in the tittle, how would you find the shortest distance in a set of points, in a fast way.

So far here is my try:

Code: Select all

type POINT2D
    as single   x
    as single   y
end type

dim as POINT2D   xy(1 to 1000)

'set up 1000 points
randomize TIMER()
for index as integer = lBound(XY) to uBound(XY)
    xy(index).x = 50 + 500*rnd()
    xy(index).y = 50 + 500*rnd()
next index

#macro _dist(xy1, xy2)
    sqr(((xy1.x) - (xy2.x))*((xy1.x) - (xy2.x)) + ((xy1.y) - (xy2.y))*((xy1.y) - (xy2.y)))
#endMacro

screenRes 600,600,32

dim as single   dist
dim as single   minDistanceSofar = 1e+9
for index as integer = lBound(XY) to uBound(XY) - 1
    line (xy(index).x, xy(index).y)-(xy(index + 1).x, xy(index + 1).y), rgb(100,200,200)
    if (xy(index + 1).x - xy(index).x)>minDistanceSofar then
        continue for
    end if
    if (xy(index + 1).y - xy(index).y)>minDistanceSofar then
        continue for
    end if
    '
    dist = _dist(xy(index + 1), xy(index))
    if dist<minDistanceSofar then
        minDistanceSofar = dist
    end if
next index

? "min distance found = ", minDistanceSofar

getKey()
Any other idea?

edit: Hum, yes there is something weird, missing a loop or something in the above example. Every couple of points should be tested, not only adjacent ones in index order of course.
Last edited by Tourist Trap on Jun 28, 2020 16:17, edited 2 times in total.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: How would you find the shortest distance in a set of points?

Post by jj2007 »

- avoid the square roots, they are very slow
- check to what extent you can use the simple sum dx+dy as a proxy, e.g. to identify the best 10 matches
- with these 10 matches, use the exact method
Tourist Trap
Posts: 2958
Joined: Jun 02, 2015 16:24

Re: How would you find the shortest distance in a set of points?

Post by Tourist Trap »

jj2007 wrote:- avoid the square roots, they are very slow
- check to what extent you can use the simple sum dx+dy as a proxy, e.g. to identify the best 10 matches
- with these 10 matches, use the exact method
I see, first try another distance formula than the pythagorean stuff if I follow you well. Of course if all is positive number, why not something like dx + dy? I have to give it a try anyway. Thanks.
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: How would you find the shortest distance in a set of points?

Post by counting_pine »

You could try partitioning the points into a grid of N*N cells. Then you’d only have to check the points in each cell against points in the same cell and neighbouring cells.

Note, depending on the distribution of points, you might find that most points are clustered together in the same cell. In that case you’d have to be more creative about how the cells are partitioned.
paul doe
Moderator
Posts: 1733
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: How would you find the shortest distance in a set of points?

Post by paul doe »

Tourist Trap wrote: I see, first try another distance formula than the pythagorean stuff if I follow you well. Of course if all is positive number, why not something like dx + dy? I have to give it a try anyway. Thanks.
Just use the squared distance( dx * dx + dy * dy ) to sort them. And you could use some fast data structure (like a Binary Heap, aka Priority Queues) to get the points desired:
  • add all points to the Binary Heap/Priority Queue, using the squared distance as the priority.
  • after all are added, popping the first will give you the nearest, popping again will give you the second nearest, and so on and so forth.
This, combined with a spatial partitioning scheme like counting_pine above suggests, should be pretty fast. Binary heaps are pretty easy to implement efficiently to boot, and are used extensively in eg. path finding algorithms (A*)
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: How would you find the shortest distance in a set of points?

Post by dodicat »

Using the squared distance can yield wrong results, or so it seems here.
I have it set up for the squared distance (line 68 or so).
For the true distance the parameter is 0
You can see the optimised distance increasing after iteration 572 using the the squared distance comparison.

Code: Select all

 

#include "crt.bi" 

Type pt
    As double x,y
End Type

Sub setup(points() As pt)
    #define Intrange(f,l) Int(Rnd*(((l)+1)-(f)))+(f)
    For n As long=Lbound(points) To Ubound(points)
        points(n)=Type<pt>(IntRange(10,790),IntRange(10,590))
    Next n
End Sub


Function distance(points() As pt,f as long=0) As double
    #define rawlength(a,b) ((a.x-b.x)*(a.x-b.x)+(a.y-b.y)*(a.y-b.y))
    Dim As double total,L
    For n As long=Lbound(points) To Ubound(points)-1
        L= rawlength(points(n),points(n+1))
       if f=0 then total+=sqr(L) else total+=L
    Next n
    Return total
End Function

Sub shuffle(a() As pt)
    #define range(f,l) Int(Rnd*((l+1)-(f)))+(f)
    For n As long = Lbound(a) To Ubound(a)-1
        Swap a(n), a(range(n,Ubound(a)))
    Next n
End Sub


Sub show(c() As pt,clr As Ulong=Rgb(255,255,255))
    pset(c(1).x,c(1).y),clr
    For n As Integer=Lbound(c)+1 To Ubound(c)
        Line -(c(n).x,c(n).y),clr
    Next n
End Sub

sub copymem(toa() as pt,froma() as pt)
    for n as long=lbound(toa) to ubound(toa)
        toa(n)=froma(n)
    next
    end sub


'===================================
redim as pt p(1 to 1000)
redim as pt opt(1 to ubound(p))'an optimised array
redim as pt orig(1 to ubound(p))'a copy of the original to view/compare the lines (not used here)

dim as long size=ubound(p)*sizeof(pt)
dim as double d=1e20,dd

setup(p())
'memcpy(@orig(1),@p(1),size) 'to see the lines (not used here)
copymem(orig(),p())
dim as double t=timer
dim as double start=(distance(p()))

print "original distance ";start
print "pseudo dist","array dist"
print d
for n as long=1 to 20000
    shuffle(p())
    dd=distance(p(),1) '<<<  flag for distance method 0 (true)or 1(pseudo)
    if d > dd then 
    d=dd
    'memcpy(@opt(1),@p(1),size)
    copymem(opt(),p())
    print d,distance(opt(),0),n
    end if
next n
print
print "optimized distance";
print distance(opt())
print
print timer-t; " seconds"
    

sleep
end
screen 19,32
show(orig(),rgb(50,50,50))
show(opt(),rgb(255,255,255))
sleep


 
To be honest I was surprised at this.
The problem was not memcpy, I changed it to a direct copy.
badidea
Posts: 2591
Joined: May 24, 2007 22:10
Location: The Netherlands

Re: How would you find the shortest distance in a set of points?

Post by badidea »

Tourist Trap wrote:edit: Hum, yes there is something weird, missing a loop or something in the above example. Every couple of points should be tested, not only adjacent ones in index order of course.
Yes, that is weird, why? I would expect a nested for loop. And be careful not to test each pair twice and don't test against self.
dodicat wrote:Using the squared distance can yield wrong results, or so it seems here
That is even weirder. Mathematically not possible. And a doubt some floating point rounding is the cause.
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: How would you find the shortest distance in a set of points?

Post by dodicat »

does the same with ulongint.
dim as ulongint d=1e18,dd (and all doubles ulongint except the timer)
1e20 is too big for ulongint.
paul doe
Moderator
Posts: 1733
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: How would you find the shortest distance in a set of points?

Post by paul doe »

dodicat wrote:Using the squared distance can yield wrong results, or so it seems here.
...
That would depend on how it's used. If they're used just for comparisons (is this closer/farther than that?) it's exactly the same. The problem with using Manhattan Distance for this sort of comparisons is that the distance will frequently overshoot. This is seen more clearly in, say, an A* implementation: using squared distance yields better results (it's a better heuristic).

Code: Select all

#define rndRange( aMin, aMax ) _
  ( rnd() * ( ( aMax + 1 ) - aMin ) + aMin )
#define sqDistance( p1, p2 ) _
  ( ( p2.x - p1.x ) ^ 2 + ( p2.y - p1.y ) ^ 2 )
#define manhattanDistance( p1, p2 ) _
  ( abs( p2.x - p1.x ) + abs( p2.y - p1.y ) )

type _
  Point
 
  as integer _
    x, y
end type

type _
  Bounds
 
  as integer _
    minX, maxX, _
    minY, maxY
end type

function _
  randomPoint( _
    byref aBound as Bounds ) _
  as Point
 
  return( type <Point>( _
    rndRange( aBound.minX, aBound.maxX ), _
    rndRange( aBound.minY, aBound.maxY ) ) )
end function

sub _
  withRandomPoints( _
    p() as Point, _
    byref aBound as Bounds )
 
  for _
    i as integer => lbound( p ) _
    to ubound( p )
   
    p( i ) => randomPoint( aBound )
  next
end sub

sub _
  render( _
    p() as Point )
 
  for _
    i as integer => lbound( p ) _
    to ubound( p )
   
    circle _
      ( p( i ).x, p( i ).y ), _
      3, rgba( 255, 255, 255, 255 ), , , , f
  next
end sub

sub _
  renderClosestTo( _
    byref p0 as Point, _
    p() as Point )
   
  dim as integer _
    indexSq, indexMn, _
    minSq => 999999, minMn => 999999
  
  for _
    i as integer => lbound( p ) _
    to ubound( p )
    
    dim as integer _
      distSq => sqDistance( p0, p( i ) ), _
      distMn => manhattanDistance( p0, p( i ) )
    
    if( distSq < minSq ) then
      minSq => distSq
      indexSq => i
    end if
    
    if( distMn < minMn ) then
      minMn => distMn
      indexMn => i
    end if
  next
  
  if( indexSq = indexMn ) then
    circle _
      ( p( indexSq ).x, p( indexSq ).y ), _
      6, rgba( 0, 192, 0, 255 )
    circle _
      ( p( indexMn ).x, p( indexMn ).y ), _
      3, rgba( 255, 0, 0, 255 ), , , , f
  else
    circle _
      ( p( indexSq ).x, p( indexSq ).y ), _
      3, rgba( 0, 192, 0, 255 ), , , , f
    circle _
      ( p( indexMn ).x, p( indexMn ).y ), _
      3, rgba( 255, 0, 0, 255 ), , , , f
  end if
end sub

var _
  bounds => type <Bounds> ( 0, 800, 0, 600 )

screenRes( bounds.maxX, bounds.maxY, 32 )
randomize()

dim as integer _
  count => 100

dim as Point _
  points( 0 to count - 1 )

withRandomPoints( points(), bounds )

dim as integer _
  mx, my

do
  getMouse( mx, my )
 
  screenLock()
    cls()
    render( points() )
    renderClosestTo( _
      type <Point>( mx, my ), _
      points() )
  screenUnlock()
 
  sleep( 1, 1 )
loop until( len( inkey() ) )
Last edited by paul doe on Jun 29, 2020 1:53, edited 2 times in total.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: How would you find the shortest distance in a set of points?

Post by jj2007 »

I couldn't resist: a graphical implementation of the problem. With 800 points, it takes 3 milliseconds on my trusty Core i5 to generate the random points and find the closest pair.
Image
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: How would you find the shortest distance in a set of points?

Post by dodicat »

Thanks jj2007, paul doe
I have lost all faith in squared distance now, but I use it here and it seems OK.

Code: Select all

 

type pt
    as double x,y
end type

#define rawlength(a,b) ((a.x-b.x)*(a.x-b.x)+(a.y-b.y)*(a.y-b.y))


function getpair(p() as pt) as pt
dim as pt result
dim as double min=1000000
dim as double d
for n1 as long=lbound(p) to ubound(p)-1
    for n2 as long=n1+1 to ubound(p)
        d=rawlength(p(n1),p(n2))
        if min>d then min=d:result=type(n1,n2)
    next n2
next n1
return result
end function

Sub setup(points() As pt)
    #define range(f,l) Rnd*((l)-(f))+(f)
    For n As long=Lbound(points) To Ubound(points)
        points(n)=Type<pt>(Range(10,790),Range(10,590))
    Next n
End Sub

sub show(p() as pt)
    for n as long=lbound(p) to ubound(p)
        circle(p(n).x,p(n).y),2,rgb(255,255,255)
    next n
    end sub

dim as pt z(1 to 800),res
randomize 
screen 19,32,,64
do
    cls
    dim as double t=timer
setup(z())
show(z())
res=getpair(z())
print "time (secs)",,"seperation (pixels)"
var seperation=sqr(rawlength(type<pt>(z(res.x).x,z(res.x).y),type<pt>(z(res.y).x,z(res.y).y)))
print timer-t,seperation

dim as pt half=type((z(res.x).x+z(res.y).x)/2,(z(res.x).y+z(res.y).y)/2)

circle(half.x,half.y),10,rgba(200,0,0,100),,,,f
sleep
loop until inkey=chr(27)

sleep


 
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: How would you find the shortest distance in a set of points?

Post by jj2007 »

I see no problem with squared distance - what could possibly go wrong??
My new version does 1,000 points in slightly less than 2 milliseconds.
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: How would you find the shortest distance in a set of points?

Post by dodicat »

Hi jj2007
I found a problem (with squared distance) in my first snippet in this thread.
I am bamboozled.
I will never trust it again, from now on I am a square rooter.
I'll maybe come back to squared distance in the near future, .. depends ..
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: How would you find the shortest distance in a set of points?

Post by jj2007 »

dodicat wrote:I found a problem (with squared distance) in my first snippet in this thread.
I saw that one but didn't understand at all what you are doing, or what the problem is. Some comments might be helpful ;-)
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: How would you find the shortest distance in a set of points?

Post by fxm »

@dodicat,

We can not compare:
SUMi(Xi^2)
with:
SUMi(SQR(Xi^2))
because the SQR function induces a non-linearity in the sum.

Examples where that fails:

X1 = 1, X2 = 6
SUMi(Xi^2) = 37
SUMi(SQR(Xi^2)) = 7

X1 = 3, X2 = 5
SUMi(Xi^2) = 34
SUMi(SQR(Xi^2)) = 8

Between these two examples, SUMi(Xi^2) decreases while SUMi(SQR(Xi^2)) increases.
Post Reply