Code timing ( Part I )

General FreeBASIC programming questions.
deltarho[1859]
Posts: 2092
Joined: Jan 02, 2017 0:34
Location: UK

Code timing ( Part I )

Postby deltarho[1859] » Nov 27, 2017 9:41

When members time code, we often find something like this:-

Code: Select all

t1 = Timer
For i = 1 To 10000000
  ' Code1 to time
Next
t1 = Timer - t1
Print t1
 
t2 = Timer
For i = 1 To 10000000
  ' Code2 to time
Next
t2 = Timer - t2
Print t2

Let's look at t1. It includes the For/Next overhead.

So, t1 = Code1Time + overhead. By the same token t2 = Code2Time + overhead ... [1]

Since our machines are of the asynchronous variety, there may be a difference between those two overheads, but it should not be large enough to affect the proceedings.

Now if t2 > t1 then we would conclude that Code1 is the faster of the two.

Substituting for t2 and t1 from [1] we get

Code2Time + overhead > Code1Time + overhead
i.e. Code2Time > Code1Time, since the overheads cancel. This is an absolute comparison.

Which is what we want to see, so why am I writing this post?

There is a temptation to consider t2/t1 and then declare that Code1 was x%, or whatever, faster than Code2.

This would be wrong because

t2/t1 = (Code2Time + overhead)/(Code1Time + overhead) ... [2]

and we should be comparing Code2Time/Code1Time (a relative comparison), which is what we would get if the overhead were zero.

As the overhead increases relative to the code being timed, t2/t1 tends to 1 and our conclusion will tend towards there being little difference between the codes being timed.

For a relative comparison we need to remove the overhead in [2].
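This ratio compression is easy to put into numbers. Here is a small Python sketch (editorial illustration, not part of the thread; the timings are made up, but chosen to be of the same order as the runs below):

```python
# Illustrative (made-up) timings in seconds, for the same iteration count.
code1_time = 0.002   # time actually spent in Code1
code2_time = 0.020   # time actually spent in Code2
overhead   = 0.019   # empty For/Next loop overhead

t1 = code1_time + overhead   # what Timer measures for Code1
t2 = code2_time + overhead   # what Timer measures for Code2

naive_ratio = t2 / t1                  # overhead included - misleading
true_ratio  = code2_time / code1_time  # overhead removed - the real ratio

print(round(naive_ratio, 2))  # about 1.86: "not even twice as fast"
print(round(true_ratio, 2))   # 10.0: Code1 is really ten times faster
```

The naive ratio suggests Code1 is less than twice as fast; with the overhead removed, it is ten times as fast.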

Here is some code by dodicat, from here, followed by typical output.

Code: Select all

Dim Shared As Integer a(1000000) '0 to 1000000 = 1000001 elements
Dim As Integer Ptr ap=New Integer[1000001] '1000001 elements
 
a(2017)=131:a(96)=30
ap[2017]=131:ap[96]=30
 
Dim As Double t
Dim As longint i
 
For z As Long=1 To 10
 
  i=0
  t=Timer
  For n As Long=1 To 10000000
    i+=a(2017)-a(96)
  Next
  t = Timer - t
  Print t,i,"array element"
  Sleep 50
 
  i=0
  t=Timer
  For n As Long=1 To 10000000
    i+=ap[2017] -ap[96]
  Next
  t = Timer - t
  Print t ,i,"pointer element"
  Print
 
Next z
 
Delete[] ap
Sleep

Code: Select all

 0.02051838990728871        1010000000    array element
 0.03817670961015596        1010000000    pointer element
 
 0.02051161530336687        1010000000    array element
 0.03819158580192017        1010000000    pointer element
 
 0.02050253593680296        1010000000    array element
 0.03634987128285294        1010000000    pointer element
 
 0.02366844110090627        1010000000    array element
 0.04738011395321218        1010000000    pointer element
 
 0.02050225657169635        1010000000    array element
 0.03817126199032428        1010000000    pointer element
 
 0.02050449149238665        1010000000    array element
 0.03817265881519782        1010000000    pointer element
 
 0.02050630736596792        1010000000    array element
 0.03828335724271481        1010000000    pointer element
 
 0.02050120895266039        1010000000    array element
 0.03824166199914814        1010000000    pointer element
 
 0.0205050502225852         1010000000    array element
 0.04750086952434662        1010000000    pointer element
 
 0.02051804070114427        1010000000    array element
 0.04738185998475641        1010000000    pointer element

Here is the same code taking into account the For/Next overhead, followed by typical output. It would be unwise to rely on a single estimate of the overhead, so we take an average of six.

Code: Select all

Dim Shared As Integer a(1000000) '0 to 1000000 = 1000001 elements
Dim As Integer Ptr ap=New Integer[1000001] '1000001 elements
 
a(2017)=131:a(96)=30
ap[2017]=131:ap[96]=30
 
Dim As Double t, tover, toverAve
Dim As Ulongint i
 
For i = 1 To 6
  tover = Timer
  For n As Long=1 To 10000000
  Next
  tover = Timer - tover
  toverAve += tover
  Sleep 1
Next
toverAve = toverAve/6
 
For z As Long=1 To 10
 
  i=0
  t=Timer
  For n As Long=1 To 10000000
    i+=a(2017)-a(96)
  Next
  t = Timer - t- toverAve
  Print t,i,"array element"
  Sleep 50
 
  i=0
  t=Timer
  For n As Long=1 To 10000000
    i+=ap[2017] -ap[96]
  Next
  t = Timer - t - toverAve
  Print t ,i,"pointer element"
  Print
 
Next z
 
Delete[] ap
Sleep

Code: Select all

 0.008781551379824171       1010000000    array element
 0.02866473485653676        1010000000    pointer element
 
 0.001710750481704215       1010000000    array element
 0.02858197294160614        1010000000    pointer element
 
 0.001690426669996858       1010000000    array element
 0.01939477177482973        1010000000    pointer element
 
 0.001709563180239959       1010000000    array element
 0.01936620669187827        1010000000    pointer element
 
 0.001690217146319784       1010000000    array element
 0.01934846700763004        1010000000    pointer element
 
 0.001707747307155172       1010000000    array element
 0.02857917929063691        1010000000    pointer element
 
 0.001693220321212108       1010000000    array element
 0.02861382056503925        1010000000    pointer element
 
 0.001693150479386452       1010000000    array element
 0.02860062056277646        1010000000    pointer element
 
 0.001684769525771346       1010000000    array element
 0.01948514639013399        1010000000    pointer element
 
 0.001719829847752541       1010000000    array element
 0.01935908288192523        1010000000    pointer element

If we subtract array from pointer in the first run we get the absolute difference, since the overheads 'cancel'. There are no overheads in the second run, but we get similar absolute differences on subtraction.

The difference between the runs shows up when we consider relative differences. With the first run it looks like array is about twice as fast as pointer. However, that is not true. The truth is to be found in the second run. Now we see that array is about ten times faster.

If all we want to know is whether Code A is faster than Code B then the first method is fine. However, if we want to decide whether to use Code A or Code B in our code then the first method may lead us astray.

Remember my writing "As the overhead increases relative to the code being timed t2/t1 tends to 1 and our conclusion will tend to there being little difference between the codes being timed." That is what is happening in the above test code. The overhead is closing the ratio.
Last edited by deltarho[1859] on Dec 01, 2017 10:03, edited 1 time in total.
Munair
Posts: 836
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL
Contact:

Re: Code timing

Postby Munair » Nov 27, 2017 9:48

When doing the speed test on the four LeapYear algorithms, it was obvious that the order of the algorithms had an influence. One would be faster when run second, another when run third, and yet another when run fourth. Therefore, the best results are obtained by running them in all possible orders, preferably two or three times.

See also the LeapYear benchmarks: https://freebasic.net/forum/viewtopic.p ... 15#p239108
deltarho[1859]

Re: Code timing

Postby deltarho[1859] » Nov 27, 2017 10:08

That is because our machines are, for the most part, asynchronous. Probably the best tack would be to consider the algorithms as separate instances, and not together in one run, with each instance being tested several times and averaged. We could then look at the instances at different times. We will not get perfection with a PC - they are not designed for definitive timing.
Munair

Re: Code timing

Postby Munair » Nov 27, 2017 10:32

deltarho[1859] wrote:That is because our machines are, for the most part, asynchronous. Probably the best tack would be to consider the algorithms as separate instances, and not together in one run, with each instance being tested several times and averaged. We could then look at the instances at different times. We will not get perfection with a PC - they are not designed for definitive timing.

They were each run with their own loop. Perhaps by instance you mean by executable.
paul doe
Posts: 960
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Code timing

Postby paul doe » Nov 27, 2017 11:31

deltarho[1859] wrote:That is because our machines are, for the most part, asynchronous. Probably the best tack would be to consider the algorithms as separate instances, and not together in one run, with each instance being tested several times and averaged. We could then look at the instances at different times. We will not get perfection with a PC - they are not designed for definitive timing.

Indeed. However, as you correctly state, algorithms should be tested with a real use case, not just completely standalone. There are many things that can (unpredictably) change the throughput of the code.
The EMA function that I posted before can be slightly modified to give the average of the session (or rather, from the point where you initialize it):

Code: Select all

function emaT( byval value as double = 0.0, byval zero as boolean = false ) as double
   /'
      EMA (Exponential Moving Average) implementation
   '/
   static as double avg
   static as uinteger times
   
   if( zero = true ) then
      avg = 0.0
      times = 0
   end if
   
   dim as double smoothFactor = 2.0 / ( 1.0 + times )   
   avg = avg * ( 1.0 - smoothFactor ) + value * smoothFactor
   
   times += 1
   
   return( avg )
end function

As you can see, it doesn't take a 'period' parameter any more; it simply accumulates the results and gives you the EMA of that. The longer you probe the code, the closer the measurement gets to the 'real' average. No measurement is perfect, of course, but you know that a little too well ;)
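For anyone wanting to experiment with the idea outside FreeBASIC, here is a rough Python transliteration of emaT (an editorial sketch, not paul doe's code; the mutable default dict stands in for the Static locals):

```python
def ema_t(value=0.0, zero=False, _state={"avg": 0.0, "times": 0}):
    """Rough Python transliteration of the FreeBASIC emaT above.

    The mutable default dict plays the role of the Static locals.
    """
    if zero:
        _state["avg"] = 0.0
        _state["times"] = 0
    smooth = 2.0 / (1.0 + _state["times"])
    _state["avg"] = _state["avg"] * (1.0 - smooth) + value * smooth
    _state["times"] += 1
    return _state["avg"]

# Feed it a few timings; it settles towards the average of the session.
ema_t(zero=True)                 # reset (note: this call also counts as a sample)
for t in (0.020, 0.022, 0.021, 0.019):
    avg = ema_t(t)
print(round(avg, 4))             # running EMA after four timings
```

One quirk worth knowing: on the very first call times is 0, so the smoothing factor is 2 and that sample is double-weighted; from the second call onwards the average behaves as described.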
deltarho[1859]

Re: Code timing

Postby deltarho[1859] » Nov 27, 2017 11:33

Munair wrote:Perhaps by instance you mean by executable.

Yes.

Most of my code timing involves procedures where I am in the millisecond or upper microsecond domain, with loops in the thousands. With this type of timing the For/Next overhead has very little impact. The LeapYear tests, even though they employed large loops, had the code being tested very much in the millisecond domain and, there too, the For/Next overhead has little impact. With the dodicat code above, where we are examining single statements which are themselves in the nanosecond domain, the For/Next overhead impacts heavily on relative comparisons.

I suppose there is an argument for using the second method above in all timing, so that we do not have to question whether or not the overhead has any impact.
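That second method generalises into a small reusable helper. As an editorial sketch (Python for illustration; the function name and structure are my own, not from the thread): estimate the empty-loop overhead as an average of six runs, then deduct it from every measurement.

```python
import time

def time_minus_overhead(func, iterations=1_000_000, calib_runs=6):
    """Time func() in a loop, subtracting the averaged empty-loop overhead.

    A Python analogue of the second FreeBASIC listing above: time the bare
    loop calib_runs times, average that overhead, then deduct it from the
    measured time of the real loop.
    """
    overhead = 0.0
    for _ in range(calib_runs):
        t0 = time.perf_counter()
        for _ in range(iterations):
            pass
        overhead += time.perf_counter() - t0
    overhead /= calib_runs

    t0 = time.perf_counter()
    for _ in range(iterations):
        func()
    return (time.perf_counter() - t0) - overhead

# Example: net cost of a small tuple sum over one hundred thousand iterations.
net = time_minus_overhead(lambda: sum((1, 2, 3)), iterations=100_000)
print(f"net time: {net:.6f} s")
```

With the overhead removed, dividing one net time by another gives the relative comparison directly.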
deltarho[1859]

Re: Code timing

Postby deltarho[1859] » Nov 27, 2017 11:58

Of course, we have grounds for questioning why we are comparing code which resides in the nanosecond domain anyway.

The absolute difference in either of the tables above is about 2 nanoseconds. Suppose Code1 took 4ns and Code2 took 2ns; then we can say that Code2 was twice as fast as Code1. However, we would need to execute the faster of the two 100 million times in our application session to get a reward of 200 milliseconds.

If my application did not use anything like 100 million of the faster code and Code1 was more readable than Code2 then I would go for, yes, Code1. There is not much that I can do in 200, or less, milliseconds. It takes me longer than that to get out of my chair, so putting the kettle on is out. Quite frankly, if I have not got time to put the kettle on then any speed improvement would not qualify as significant. <laugh>
paul doe

Re: Code timing

Postby paul doe » Nov 27, 2017 12:24

deltarho[1859] wrote:If my application did not use anything like 100 million of the faster code and Code1 was more readable than Code2 then I would go for, yes, Code1. There is not much that I can do in 200, or less, milliseconds. It takes me longer than that to get out of my chair, so putting the kettle on is out. Quite frankly, if I have not got time to put the kettle on then any speed improvement would not qualify as significant. <laugh>

Hahaha, indeed. There are some areas (games, for example) where such things are significant, but for the rest I agree: the difference is immaterial. I also share the preference for clean code over uber-optimized-but-ugly code.

It is worth noting, however, that these seemingly 'immaterial' things do impact one area where code should be as tight as is humanly possible, no matter the cost: data structure implementation. Data structures have a determining impact on the speed of your code, especially on the algorithms used. A cleverly implemented algorithm cannot truly shine if the data structures it uses are mediocre at best. It is a blessing that FreeBASIC actually allows you to implement these in an efficient way.
dodicat
Posts: 5991
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Code timing

Postby dodicat » Nov 27, 2017 12:59

Using only stack variables.
Randomly choosing which to test first.
Comparing totals with Paul Doe's function.
And, for the moment, not completely omitting the 'Heisenberg' effect ... deltarho[1859], which looks like an indexed pointer anyway.
(the testing code bloats)
-gen gas no optimisations.

Code: Select all


'Paul Doe
function emaT( byval value as double = 0.0, byval zero as boolean = false ) as double
   /'
      EMA (Exponential Moving Average) implementation
   '/
   static as double avg
   static as uinteger times
   
   if( zero = true ) then
      avg = 0.0
      times = 0
   end if
   
   dim as double smoothFactor = 2.0 / ( 1.0 + times )   
   avg = avg * ( 1.0 - smoothFactor ) + value * smoothFactor
   
   times += 1
   
   return( avg )
end function

#macro external(pd)
pd+=emat(t)
#endmacro

'==================================================

'On the stack
Dim As Integer a(10000)
Dim As Integer Ptr ap=@a(0)
 
a(2017)=131:a(96)=30
'ap[2017]=131:ap[96]=30
 
Dim As Double t
Dim As longint i
dim as double accA,accP 'time accumulators
dim as double PDA,PDP   '      ""           for Paul Doe's function
 
For z As Long=1 To 20
 
  if rnd >.5 then 'which one first?
     
    i=0 
   t=timer   
  For n As Long=1 To 10000000
   
    i+=a(2017)-a(96)
  Next
  t=timer-t
  accA+=t
  external(PDA)
  Print t,i,"array element"
  Sleep 50
 
  i=0
  t=Timer
  For n As Long=1 To 10000000
     
    i+=ap[2017] -ap[96]
  Next
  t=timer-t
  accP+=t
   external(PDP)
  Print t ,i,"pointer element"
  Print
  sleep 50
 
else
 
  i=0
  t=Timer
  For n As Long=1 To 10000000
     
    i+=ap[2017] -ap[96]
  Next
  t=timer-t
  accP+=t
   external(PDP)
  Print t ,i,"pointer element"
  sleep 50
  i=0
  t=timer
   For n As Long=1 To 10000000
     
    i+=a(2017)-a(96)
  Next
  t=timer-t
  accA+=t
   external(PDA)
  Print t,i,"array element"
  sleep 50
  Print
  end if
 
Next z
 
'Delete[] ap
print
print "Total array   time ";accA,"  PAUL DOE'S=";PDA
print "Total pointer time ";accP,"  PAUL DOE'S=";PDP
Sleep
deltarho[1859]

Re: Code timing

Postby deltarho[1859] » Nov 27, 2017 14:34

EMA is usually employed on two-dimensional functions where the x-axis is temporal; it should not be used where the x-axis is a run number.

Ideally, for 20 tests, we should use the t-distribution with confidence limits where we would conclude with something like:-

95% CL: t ± Δt

As above, we cannot divide one total by another for a relative comparison, because the For/Next overhead is still significant.
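To make such a conclusion concrete, here is a small Python sketch (an editorial illustration with made-up timings, not from the thread; 2.093 is the standard two-sided 95% critical value of Student's t with 19 degrees of freedom, taken from tables):

```python
import math

# Twenty illustrative timing measurements, in seconds.
times = [0.0205, 0.0206, 0.0204, 0.0237, 0.0205, 0.0205, 0.0205,
         0.0205, 0.0205, 0.0205, 0.0206, 0.0204, 0.0205, 0.0206,
         0.0205, 0.0205, 0.0204, 0.0205, 0.0206, 0.0205]

n = len(times)
mean = sum(times) / n
# Sample standard deviation (n - 1 in the denominator).
s = math.sqrt(sum((x - mean) ** 2 for x in times) / (n - 1))

t_crit = 2.093                      # t(0.975, df = 19), from a standard table
delta = t_crit * s / math.sqrt(n)   # the half-width, i.e. the Δt above

print(f"95% CL: {mean:.5f} +/- {delta:.5f} seconds")
```

The reported interval is the "t ± Δt" form of conclusion described above.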

@paul doe
data structure implementation - haven't been there so cannot comment.
dodicat

Re: Code timing

Postby dodicat » Nov 27, 2017 15:25

Silly.
What was I thinking of ??
Last edited by dodicat on Nov 27, 2017 19:25, edited 1 time in total.
deltarho[1859]

Re: Code timing

Postby deltarho[1859] » Nov 27, 2017 16:30

I have no idea what dodicat is up to now so either he has been at the sauce or I have, and I haven't. Something tells me that if I did know what he was doing then it would be of no use to me and since I am scratching my head big time on something else at the moment I will pass. If anyone would like to enlighten me then feel free to do so. <smile>
paul doe

Re: Code timing

Postby paul doe » Nov 27, 2017 16:55

deltarho[1859] wrote:EMA is usually employed on two-dimensional functions where the x-axis is temporal; it should not be used where the x-axis is a run number.

Ideally, for 20 tests, we should use the t-distribution with confidence limits where we would conclude with something like:-

95% CL: t ± Δt

As above, we cannot divide one total by another for a relative comparison, because the For/Next overhead is still significant.

?

The EMA is simply another average (and isn't even the best of the bunch), although of a different kind than the simple average: the former is an Infinite Impulse Response type of filter, the latter isn't. It is usually used with time series, which doesn't mean it cannot be used otherwise. Doing 20 tests isn't nearly enough for me (there's too much 'randomness' between measurements), unless those 20 tests were each composed of, say, 10000 samples. I can't tell if you're assuming this or not, but I suppose you are.
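The weighting difference between the two averages is easy to demonstrate (an editorial Python sketch, not part of the thread): after a level shift in the data, the simple mean sits between the two levels, while an EMA leans towards the recent level.

```python
def ema(xs, period=5):
    """Fixed-factor EMA: an IIR filter, so every output folds in all history,
    but with geometrically decaying weight on older samples."""
    k = 2.0 / (1.0 + period)
    avg = xs[0]
    for x in xs[1:]:
        avg = avg * (1.0 - k) + x * k
    return avg

xs = [1.0] * 10 + [2.0] * 10      # measurements that jump halfway through

simple = sum(xs) / len(xs)        # every sample weighted equally
recent = ema(xs)                  # recent samples weighted more heavily

print(simple)            # 1.5
print(round(recent, 3))  # close to 2.0
```

For timing runs whose order carries no meaning, that recency bias is exactly the property being objected to above.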
deltarho[1859]

Re: Code timing

Postby deltarho[1859] » Nov 27, 2017 17:28

Applying a weighting factor to our problem is nonsense. If we dropped our twenty tests onto the floor and then picked them up in any order, it should not make a blind bit of difference to our analysis. Our problem is not a function of time - it is time. We are actually taking 10^7 samples for each test.

I think maybe the first post has not been fully read otherwise moving averages and the like would not have been mentioned. <smile>
fxm
Posts: 9260
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Code timing

Postby fxm » Nov 27, 2017 19:24

dodicat wrote:Maybe keep any looping dead time buried

Code: Select all

dim as integer a=48
dim as integer ptr ip=@a
dim shared as string s
dim as double t

dim as double accI,accP

for z as long=1 to 20
  t=timer
  s=string(100000000,a)
  t=timer-t
  accI+=t
  print t,s[2017],"integer"
  sleep 50

  t=timer
  s=string(100000000,ip[0])
  t=timer-t
  accP+=t
  print t,s[2017],"pointer index"
  print
  sleep 50
next
print "Straight integer time ";accI
print "Integer pointer  time ";accP
sleep

But since it is buried, maybe the method is flawed.

From your code, I think there is only one read of 'a' and of 'ip[0]', before the 100000000 characters of each string are filled with that read value.
