Applying functions for speed optimization

General FreeBASIC programming questions.
Provoni
Posts: 513
Joined: Jan 05, 2014 12:33
Location: Belgium

Applying functions for speed optimization

Post by Provoni »

Hey all,

I'm looking to further speed optimize my function "m_ngrams". Especially initializing and clearing of the arrays "gram" and "id" are slowing down the function I believe.

Thanks

Code: Select all

function m_ngrams(array()as integer,byval l as integer)as integer
	
	dim as integer i,j,e,a,b,n,r
	dim as integer ngrams
	dim as short gram(l,l)
	dim as short id(l,l)
	dim as short arraycopy(l)
	for i=1 to l
		arraycopy(i)=array(i)
	next i
	for n=2 to l-1
		for i=1 to l-(n-1)
			a=arraycopy(i)
			b=arraycopy(i+1)
			gram(a,b)+=1
			if id(a,b)=0 then 
				e+=1
				id(a,b)=e
			end if
			arraycopy(i)=id(a,b)
			if gram(a,b)>1 then r+=1
		next i
		if r>0 then ngrams+=r*(2^(n-2))
		if r<2 then return ngrams
		e=0
		r=0
		for i=1 to l 'clear arrays
			for j=1 to l
				gram(i,j)=0
				id(i,j)=0
			next j
		next i
	next n
	return ngrams
	
end function

dim as integer i,j,score
dim as integer array(1000)
dim as double timero

screenres 500,500,32

timero=timer

for i=1 to 1000
	for j=1 to 400 'create random array
		array(j)=int(rnd*26)+1
	next j
	score=m_ngrams(array(),400)	
next i

print timer-timero

beep
sleep
Last edited by Provoni on Feb 16, 2017 20:26, edited 1 time in total.
fxm
Moderator
Posts: 12082
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Applying function for speed optimization

Post by fxm »

Replace:

Code: Select all

      for i=1 to l 'clear arrays
         for j=1 to l
            gram(i,j)=0
            id(i,j)=0
         next j
      next i
by:

Code: Select all

      clear gram(0,0),0,(l+1)*(l+1)*sizeof(gram(0,0)) 'clear array
      clear id(0,0),0,(l+1)*(l+1)*sizeof(id(0,0)) 'clear array
Tourist Trap
Posts: 2958
Joined: Jun 02, 2015 16:24

Re: Applying function for speed optimization

Post by Tourist Trap »

fxm wrote:

Code: Select all

      clear gram(0,0),0,(l+1)*(l+1)*sizeof(gram(0,0)) 
Hi,

don't blame me if I took bad habit with the syntax challenges! But I've found this effectively cleaning an array when one dimension is involved. Is it correct? and if yes, it's a little tricky for me to reach dimension 2, but I'm still reading the doc.

Code: Select all

'should check that @array(lbound(array))  is not NULL first

array(0) = *new (@array(lbound(array))) typeOf(array)[ubound(array) - lbound(array) + 1] 
By the way you see why dkl is right when he tries not breaking @array(lbound(array)) (with the bug we discussed already and reported http://www.freebasic.net/forum/viewtopi ... 15#p229048).

[edit] dimension 2 array reinitialized by placement new method seems ok too.

Code: Select all

 
redim as integer    array(1, 2)
? "array(.,.)"
for index1 as integer = lBound(array) to uBound(array)
    for index2 as integer = lBound(array, 2) to uBound(array, 2)
        array(index1, index2) = (index1 + 1)*(index2 + 10)
        ? array(index1, index2)
    next index2
next index1


array(0, 0) =   *new (@array(lBound(array), lBound(array, 2))) _ 
                typeOf(array)[(uBound(array) - lBound(array) + 1)* _ 
                                (uBound(array, 2) - lBound(array, 2) + 1)]

? "array(.,.)"
for index1 as integer = lBound(array) to uBound(array)
    for index2 as integer = lBound(array, 2) to uBound(array, 2)
        ? array(index1, index2)
    next index2
next index1
fxm
Moderator
Posts: 12082
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Applying function for speed optimization

Post by fxm »

1) For testing if the array is sized:

The safer syntax is to not use any array index but rather:
If ubound(array) >= lbound(array) then '' verified only if array is sized


2) For clearing the array:

Your syntax uses the New (placement) operator which allows to construct object(s) (no allocate memory) at an existing memory location:
New(address) datatype[ count ]
where:
address = @array(lbound(array))
address of the first element of array
datatype = typeOf(array)
datatype of array (Integer)
count = ubound(array) - lbound(array) + 1
number of elements of the array

Constructing Integers at existing memory location consists only to clear the corresponding memory.
The New (placement) instruction returns then a typed pointer to the first constructed memory address which is the address of the first array element.

As optimization, one can use the dereferenced version of New (placement):
*New(address) datatype[ count ]
which always constructs the Integers and then which returns the value of the first constructed element (0 in this case).
This can allow not to construct the last element of the array but to assign it to the value of the first array element (0 after construction), instead of an useless assignment like your solution (in addition, that suppresses adding '+1' in the expression of number of elements).


3) Example of safe and optimized code (for 1-dimension array and number of array elements > 1):

Code: Select all

Dim As Integer array(10)
For I As Integer = 0 to 10
  array(I) = (I+1) * 10
Next I
For I As Integer = 0 to 10
  Print array(I)
Next I
Print

If Ubound(array) > Lbound(array) Then
  array(Ubound(array)) = *New (@array(Lbound(array))) Typeof(array)[Ubound(array) - Lbound(array)]
  '' last array element not constructed but only assigned (to 0 = value of first element after construction)
Elseif Ubound(array) = Lbound(array) Then
  array(Lbound(array)) = 0
End If
For I As Integer = 0 to 10
  Print array(I)
Next I

Sleep
4) Clearing N-dimension array (with N>1):
The full optimization as above is not useful because it would add a subtracting '-1' in the expression of the number of elements.
Your code is correct if 'array(0, 0)' is a valid array element, otherwise use for example 'array(lBound(array), lBound(array, 2)) = *New .....'.


5) Final remark:
For clearing a numeric array, 'Clear' appears to be a little faster than 'New (placement)'.
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Applying function for speed optimization

Post by dodicat »

If you declare all arrays static and use redim to clear them you have a better speed,

Code: Select all

function m_ngrams(array()as integer,byval l as integer)as integer
   
   dim as integer i,j,e,a,b,n,r
   dim as integer ngrams
   static as short gram()
   static as short id()
   redim gram(l,l)
   redim id(l,l)
   static as short arraycopy()
   redim arraycopy(l)
   for i=1 to l
      arraycopy(i)=array(i)
   next i
   for n=2 to l-1
      for i=1 to l-(n-1)
         a=arraycopy(i)
         b=arraycopy(i+1)
         gram(a,b)+=1
         if id(a,b)=0 then 
            e+=1
            id(a,b)=e
         end if
         arraycopy(i)=id(a,b)
         if gram(a,b)>1 then r+=1
      next i
      if r>0 then ngrams+=r*(2^(n-2))
      if r<2 then return ngrams
      e=0
      r=0
      redim gram(lbound(gram,1) to ubound(gram,1),lbound(gram,2) to ubound(gram,2))
      redim id(lbound(id,1) to ubound(id,1),lbound(id,2) to ubound(id,2))
     'for i=1 to l 'clear arrays
         'for j=1 to l
           ' gram(i,j)=0
            'id(i,j)=0
        ' next j
      'next i
   next n
   return ngrams
   
end function

dim as integer i,j,score
dim as integer array(1000)
dim as double timero

screenres 500,500,32

timero=timer

for i=1 to 1000
   for j=1 to 400 'create random array
      array(j)=int(rnd*26)+1
   next j
   score=m_ngrams(array(),400)   
next i

print timer-timero

beep
sleep 
fxm
Moderator
Posts: 12082
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Applying function for speed optimization

Post by fxm »

Yes declare arrays static improves access times, but use Redim is less fast than Clear.
The fastest IMHO:

Code: Select all

function m_ngrams(array()as integer,byval l as integer)as integer
   
   dim as integer i,j,e,a,b,n,r
   dim as integer ngrams
   static as short gram(l,l)
   static as short id(l,l)
   static as short arraycopy(l)
   for i=1 to l
      arraycopy(i)=array(i)
   next i
   for n=2 to l-1
      for i=1 to l-(n-1)
         a=arraycopy(i)
         b=arraycopy(i+1)
         gram(a,b)+=1
         if id(a,b)=0 then
            e+=1
            id(a,b)=e
         end if
         arraycopy(i)=id(a,b)
         if gram(a,b)>1 then r+=1
      next i
      if r>0 then ngrams+=r*(2^(n-2))
      if r<2 then return ngrams
      e=0
      r=0
      'for i=1 to l 'clear arrays
         'for j=1 to l
            'gram(i,j)=0
            'id(i,j)=0
         'next j
      'next i
      clear gram(0,0),0,(l+1)*(l+1)*sizeof(gram(0,0)) 'clear array
      clear id(0,0),0,(l+1)*(l+1)*sizeof(id(0,0)) 'clear array
   next n
   return ngrams
   
end function

dim as integer i,j,score
dim as integer array(1000)
dim as double timero

screenres 500,500,32

timero=timer

for i=1 to 1000
   for j=1 to 400 'create random array
      array(j)=int(rnd*26)+1
   next j
   score=m_ngrams(array(),400)   
next i

print timer-timero

beep
sleep
Provoni
Posts: 513
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: Applying function for speed optimization

Post by Provoni »

Thanks allot guys, this is a huge and very welcome speed increase. I will be able to apply the same principles to other parts of my program. Great stuff.
Provoni
Posts: 513
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: Applying function for speed optimization

Post by Provoni »

Submitting another similar function (m_fastbigrams) for optimization. It needs to be rock solid as it is called many billions of times on unique arrays with different arguments.

Code: Select all

function m_fastbigrams(array()as integer,byval l as integer,byval s as integer)as integer

	dim as short i,score,id(1,s,s)
	for i=1 to l-1
		s=array(i)
		l=array(i+1)
		id(0,s,l)=1
		if id(0,s,l)=1 then
			if id(1,s,l)=0 then 
				id(1,s,l)=1 
			else 
				score+=1
			end if		
		end if
	next i
	return score
	
end function

dim as integer i,j,l,s,score
dim as integer array(1000)
dim as double timera

screenres 500,500,32

timera=timer

for i=1 to 100000
	l=int(rnd*1000)+1
	s=int(rnd*99)+1
	for j=1 to l 'create random array to simulate input
		array(j)=int(rnd*s)+1
	next j
	score=m_fastbigrams(array(),l,s)
next i

print timer-timera

beep
sleep
SARG
Posts: 1756
Joined: May 27, 2005 7:15
Location: FRANCE

Re: Applying functions for speed optimization

Post by SARG »

Hi,

Strange test

id(0,s,l)=1
if id(0,s,l)=1 then <-- always true or I'm missing something ?
Provoni
Posts: 513
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: Applying functions for speed optimization

Post by Provoni »

Thanks SARG for spotting that. I will not try to come up with an excuse.

Here is the new code, it is already twice as fast now thanks to your keen observation.

Code: Select all

function m_fastbigrams(array()as integer,byval l as integer,byval s as integer)as integer

	dim as short i,score,id(s,s)
	for i=1 to l-1
		s=array(i)
		l=array(i+1)
		if id(s,l)=0 then
			id(s,l)=1
		else 
			score+=1
		end if
	next i
	return score
	
end function

dim as integer i,j,l,s,score
dim as integer array(1000)
dim as double timera

screenres 500,500,32

timera=timer

for i=1 to 100000
	l=int(rnd*1000)+1
	s=int(rnd*99)+1
	for j=1 to l 'create random array to simulate input
		array(j)=int(rnd*s)+1
	next j
	score=m_fastbigrams(array(),l,s)
next i

print timer-timera

beep
sleep
SARG
Posts: 1756
Joined: May 27, 2005 7:15
Location: FRANCE

Re: Applying functions for speed optimization

Post by SARG »

Hi,

Use a pointer to save a few milliseconds.

Btw each value of array id(,) is always equal to zero, local var defined in the function. Maybe just for the test.

Code: Select all

function m_fastbigrams(array()as integer,byval l as integer,byval s as integer)as integer

   dim as short i,score,id(s,s)
   Dim As UShort Ptr pt
   for i=1 to l-1
      ''s=array(i)
      ''l=array(i+1)
      ''if id(s,l)=0 then
      pt=@id(array(i),array(i+1))
      if *pt=0 Then
         *pt=1
      else 
         score+=1
      end if
   next i
   return score
   
end function
Provoni
Posts: 513
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: Applying functions for speed optimization

Post by Provoni »

Thanks SARG.
Provoni
Posts: 513
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: Applying functions for speed optimization

Post by Provoni »

I have set up a pseudo random in the following way and wonder if it is possible to optimize away the MOD operation.

Code: Select all

new_random=(integer_value1+integer_value2) MOD random_range
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Applying functions for speed optimization

Post by dodicat »

The only shortcut I know is if your random_range is a power of 2 then AND can be used instead of mod.
Like this:

Code: Select all

randomize 1

dim as integer random_range,integer_value1,integer_value2,power,new_random

for z as long =  1 to 30
    integer_value1=1000+rnd*1000
    integer_value2=1000*rnd*1000
    power=int(1+rnd*7)
    random_range=2^power
print (integer_value1+integer_value2) MOD random_range,
print (integer_value1+integer_value2) and (random_range-1)

next

print "speed test"
dim as double t
t=timer
for n as long=1 to 100000000
    new_random=(integer_value1+integer_value2) MOD random_range
next
print new_random,timer-t,"MOD"
print
t=timer
for n as long=1 to 100000000
    new_random= (integer_value1+integer_value2) and (random_range-1)
next
print new_random,timer-t,"AND"
sleep


    
 
If you use optimized -gcc
for instance
-gen gcc -Wc -O3
then both results are very close.
Provoni
Posts: 513
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: Applying functions for speed optimization

Post by Provoni »

Thank you dodicat.
Post Reply