## integer operations benchmark (or interpreter killer)

For other topics related to the FreeBASIC project or its community.
Ed Davis
Posts: 10
Joined: Jul 28, 2008 23:24

### integer operations benchmark (or interpreter killer)

Here is the code in FreeBasic for an ASCII Mandelbrot, using integers only.

Code: Select all

`    dim as integer left_edge   = -420    dim as integer right_edge  =  300    dim as integer top_edge    =  300    dim as integer bottom_edge = -300    dim as integer x_step      =  7    dim as integer y_step      =  15    dim as integer max_iter    =  200    dim as integer y0 = top_edge    while y0 > bottom_edge        dim as integer x0 = left_edge        while x0 < right_edge            dim as integer y = 0            dim as integer x = 0            dim as integer x_x = 0            dim as integer y_y = 0            dim as integer i = 0            dim as string the_char = " "            while i < max_iter and x_x + y_y <= 800                x_x =    (x * x) \ 200                y_y =    (y * y) \ 200                if x_x + y_y > 800 then                    the_char = chr(48 + i)                    if i > 9 then the_char = "@"                    exit while                end if                dim as integer temp = x_x - y_y + x0                if (x < 0 and y > 0) or (x > 0 and y < 0) then                    y = (-1 * ((-1 * x * y) \ 100)) + y0                else                    y = (x * y \ 100) + y0                end if                x = temp                i = i + 1            wend            x0 = x0 + x_step            print the_char ;        wend        y0 = y0 - y_step        print    wend`

This is based on code I found at http://rosettacode.org/wiki/Mandelbrot_set

I got to thinking, this might make a nice integer benchmark, if instead of printing the Mandelbrot, we just accumulate a value (the sum of the generated characters) and looped for a while.

So, I modified the program, and after testing, found that a C version of the above, compiled with gcc -O3, took 1 second to perform 1545 iterations on my machine. The code is very similar to the above ascii integer Mandelbrot, except that it performs multiple iterations, accumulates the character to be printed, and the 'exit loop' is removed - not all simple languages support early-loop-exits, and I wanted the code to be as common between languages as possible.

I figured I could use the C version as a standard, and finding a count that took 1 second would give me an easy estimate as to how much slower other language processors are, compared to C, and of course, only in regards to this benchmark.

And (ta dah), the FreeBasic version compares very favorably with the C version, speed-wise. In fact, it was the second fastest processor that I tested.

Here is the FreeBasic version:

Code: Select all

`    dim as integer accum = 0    dim as integer count = 0    while count < 1545        dim as integer left_edge   = -420        dim as integer right_edge  =  300        dim as integer top_edge    =  300        dim as integer bottom_edge = -300        dim as integer x_step      =  7        dim as integer y_step      =  15        dim as integer max_iter    =  200        dim as integer y0 = top_edge        while y0 > bottom_edge            dim as integer x0 = left_edge            while x0 < right_edge                dim as integer y = 0                dim as integer x = 0                dim as integer x_x = 0                dim as integer y_y = 0                dim as integer i = 0                dim as integer the_char = 32                while i < max_iter and x_x + y_y <= 800                    x_x =    ((x * x) \ 200)                    y_y =    ((y * y) \ 200)                    if x_x + y_y > 800 then                        the_char = 48 + i                        if i > 9 then the_char = 64                    else                        dim as integer temp = x_x - y_y + x0                        if (x < 0 and y > 0) or (x > 0 and y < 0) then                            y = (-1 * ((-1 * x * y) \ 100)) + y0                        else                            y = (x * y \ 100) + y0                        end if                        x = temp                    end if                    i = i + 1                wend                accum = accum + the_char                x0 = x0 + x_step            wend            y0 = y0 - y_step        wend        if count mod 300 = 0 then print accum        count = count + 1    wendprint accum`

This is the output:

200574 60372774 120544974 180717174 240889374 301061574 309886830

This is a pretty intense integer benchmark!

Note this funky code:

Code: Select all

`    if (x < 0 and y > 0) or (x > 0 and y < 0) then        y = (-1 * ((-1 * x * y) \ 100)) + y0    else        y = (x * y \ 100) + y0    end if`

It turns out that there is no consensus as to whether integer division, when one of the operands is negative, should round towards zero, negative infinity, or positive infinity.

Python and Ruby both round towards negative infinity. C rounds towards zero.

In order to get the same output from each language (to verify that each language was essentially computing the same thing and doing similar work), I had to figure out how to preclude one of the operands from being negative.

The code checks, and if either x or y is < 0 (but not both), then it multiplies by minus one to force positive division, and then by minus one again at the end to restore the sign.

Whats the purpose of this?

I enjoy fiddling with compilers/interpreters, especially those that are simple enough that I can understand. I also enjoy writing my own. I wanted to find out what makes some interpreters so much faster than others, and why are others so slow. Examining the source to some of these interpreters has helped me learn some of the reasons, and has helped to improve the speed of those I'm working on.

Here are the specs for my machine:

Windows 7, Service Pack 1, 64-bit
Intel Core i7-3720QM CPU @2.60GHz
16.0 GB (15.9 usable)

Below are the tests that I have run.

Integer operations benchmark (based on integer ascii Mandelbrot): (1545 iterations)

Must generate the following output using 1545 iterations of the code above/below:

200574 60372774 120544974 180717174 240889374 301061574 309886830

All times are in seconds.

Compilers or JIT

Code: Select all

`gcc: C -O3 or -O2    1.00FreeBasic v1.01x64   1.09  freebasic.netBCX                  1.12  bcxbasic.comNim                  1.12  nim-lang.orgJava                 1.23C#                   1.34VisualBasic.Net      1.52FreePascal -O3       1.92gcc: C, no options   1.96gcc: C -Os           2.38Borland C            2.80TinyC                3.23  bellard.org/tcceuc -gcc -con        3.58  openeuphoria.orgOxygen basic         3.77  oxygenbasic.orgPowerBasic           4.03NodeJS              10.23  nodejs.orgQB64                32.43  qb64.net`

Interpreters that compile to byte code or an AST

Code: Select all

`pe             14.26  github.com/peberlein/interpretertoy.bas        33.35  updated version of: QDepartment/Files/Ed-Davis-stuff/toy.basEuphoria       39.90  openeuphoria.orgPike           44.39  pike.lysator.liu.setoy4.c         57.24  updated version of: QDepartment/Files/Ed-Davis-stuff/toy.cTinypas.c      58.97  Pascal-S converted to CRuby           60.25  ruby-lang.orgtiny-c.c       65.96  updated version of: iro.umontreal.ca/~felipe/IFT2030-Automne2002/Complements/tinyc.ccalc3a         70.01  updated version of: epaperpress.com/lexandyacc/vspl           68.76  cl.cam.ac.uk/~mr10/VSPL.htmlLua            80.14  lua.orgphp            81.23  php.netWren           84.57  github.com/munificent/wrenC4             97.80  github.com/rswier/c4UnderC        103     home.mweb.co.za/sd/sdonovan/underc.htmljwillia basic 109     github.com/jwillia3/BASIChoc           140     en.wikipedia.org/wiki/Hoc_%28programming_language%29NaaLaa        168     naalaa.comCInt          201     root.cern.ch/drupal/content/cintPython        274     python.orgYabasic       278     yabasic.dehanson-calc   321     github.com/drh/drhanson/tree/master/calcSpecBAS       358     sites.google.com/site/pauldunn`

Interpreters (not using a VM or AST)

Code: Select all

`ChipMunkBasic  442  nicholson.com/rhn/basicSmallBASIC     514  smallbasic.sourceforge.netBBCBasic       531  bbcbasic.co.ukThinBasic      543  thinbasic.comFBSL           551  fbsl.netCH             582  softintegration.comScriba         618  scriptbasic.orgSI            1020  drdobbs.com/cpp/si-a-c-like-script-interpreter/184408141LBB           1502  bbcbasic.co.uk/lbbLittleC       2160  books.mcgraw-hill.com/downloads/products/0072121246/0072121246_code.zipPicoC         2478  github.com/zsaleeba/picocDDS           3033  modified version of: grail.cba.csuohio.edu/~somos/ddsbasic.cmy-basic      3302  github.com/paladin-t/my_basicMiniBASIC     5471  sourceforge.net/p/minibasic/wiki/HomenuBASIC       8132  nubasic.eu`

Selected notes

Code: Select all

`FreeBasic     Screamingly fast!  Impressive!TinyC         Very fast compiling compilereuc -gcc -con Euphoria rocks!Oxygen basic  Oxyben Basic rocks!NodeJs        Server/desktop Javascriptpe            Micro Euphoria.  Fastest interpreter tested.toy.bas       Updated version of a simple interpreter I wrote several years ago.Pike          Neat C-like scripting languagetoy4.c        Updated version of a simple interpreter I wrote several years ago.Tinypas.c     I converted Pascal-S (famous pascal compiler/interpreter) to C.Ruby          Ruby rocks!tiny-c.c      I updated this, adding what was needed to run the benchmark.calc3a        I updated this, adding what was needed to run the benchmark.vspl          B-like programming languageLua           Lua rocks!php           php rocks!c4            Neat C subset interpreterUnderC        Almost full C and lots of C++ interpreterjwillia basic Neat tiny Basic interpreter.NaaLaa        NaaLaa rocks!CInt          Full C interpreterPython        Python rocks!CH            Full C interpreterSI            I updated this, adding what was needed to run the benchmark.LBB           Liberty Basic BoosterLittleC       I updated this, adding what was needed to run the benchmark.PicoC         Almost full C interpreterDDS           I updated this, adding what was needed to run the benchmark.`

Some take-aways:

• FreeBasic is really fast.
• If you are going to write an interpreter, and speed matters, compile to byte code or an AST.
• Several well-known interpreters are surprisingly slow, in this case at least. In there defense, there may be other features that they offer that make up for the slowness in this instance.

Source code for selected languages

The C version:

Code: Select all

`    #include <stdio.h>    int main() {      int left_edge, right_edge, top_edge, bottom_edge, max_iter,        x_step, y_step, y0, x0, x, y, i, x_x, y_y, temp,        the_char, accum, count;      accum = 0;      count = 0;      while (count < 1545) {        left_edge = -420;        right_edge  =  300;        top_edge  =  300;        bottom_edge = -300;        x_step    =  7;        y_step    =  15;        max_iter  =  200;        y0 = top_edge;        while (y0 > bottom_edge) {          x0 = left_edge;          while (x0 < right_edge) {            y = 0;            x = 0;            the_char = 32;            x_x = 0;            y_y = 0;            i = 0;            while (i < max_iter && x_x + y_y <= 800) {              x_x = (x * x) / 200;              y_y = (y * y) / 200;              if (x_x + y_y > 800 ) {                the_char = 48 + i;                if (i > 9) {                  the_char = 64;                }              } else {                temp = x_x - y_y + x0;                if ((x < 0 && y > 0) || (x > 0 && y < 0)) {                  y = (-1 * ((-1 * (x * y)) / 100)) + y0;                } else {                  y = x * y / 100 + y0;                }                x = temp;              }              i = i + 1;            }            accum = accum + the_char;            x0 = x0 + x_step;          }          y0 = y0 - y_step;        }        if (count % 300 == 0) {          printf("%d\n", accum);        }        count = count + 1;      }      printf("%d\n", accum);      return 0;    }`

The Java version:

Code: Select all

`    class mandelbench {      public static void main(String[] args) {        int left_edge, right_edge, top_edge, bottom_edge, max_iter,        x_step, y_step, y0, x0, x, y, i, x_x, y_y, temp, the_char,        accum, count;        accum = 0;        count = 0;        while (count < 1545) {          left_edge = -420;          right_edge  =  300;          top_edge  =  300;          bottom_edge = -300;          x_step    =  7;          y_step    =  15;          max_iter  =  200;          y0 = top_edge;          while (y0 > bottom_edge) {            x0 = left_edge;            while (x0 < right_edge) {              y = 0;              x = 0;              the_char = ' ';              x_x = 0;              y_y = 0;              i = 0;              while (i < max_iter && x_x + y_y <= 800) {                x_x = (x * x) / 200;                y_y = (y * y) / 200;                if (x_x + y_y > 800) {                  the_char = '0' + i;                  if (i > 9) {                    the_char = '@';                  }                } else {                  temp = x_x - y_y + x0;                  if ((x < 0 && y > 0) || (x > 0 && y < 0))                    y = (-1 * (-1 * x * y) / 100) + y0;                  else                    y = x * y / 100 + y0;                  x = temp;                }                i = i + 1;              }              accum = accum + the_char;              x0 = x0 + x_step;            }            y0 = y0 - y_step;          }          if (count % 300 == 0)            System.out.printf("%d ", accum);          count = count + 1;        }        System.out.printf("%d\n", accum);      }    }`

The Euphoria version:

Code: Select all

`    without type_check    integer left_edge, right_edge, top_edge, bottom_edge, max_iter,    x_step, y_step, y0, x0, x, y, i, x_x, y_y, temp, the_char, accum,    count    accum = 0    count = 0    while count < 1545 do      left_edge = -420      right_edge  =  300      top_edge  =  300      bottom_edge = -300      x_step    =  7      y_step    =  15      max_iter  =  200      y0 = top_edge      while y0 > bottom_edge do        x0 = left_edge        while x0 < right_edge do          y = 0          x = 0          the_char = 32          x_x = 0          y_y = 0          i = 0          while i < max_iter and x_x + y_y <= 800 do            x_x = floor((x * x) / 200)            y_y = floor((y * y) / 200)            if x_x + y_y > 800 then              the_char = 48 + i              if i > 9 then                the_char = 64              end if            else              temp = x_x - y_y + x0              if ((x < 0 and y > 0) or (x > 0 and y < 0)) then                y = (-1 * floor((-1 * x * y) / 100)) + y0              else                y = floor(x * y / 100) + y0              end if              x = temp            end if            i = i + 1          end while          accum = accum + the_char          x0 = x0 + x_step        end while        y0 = y0 - y_step      end while      if remainder(count, 300) = 0 then        printf(1, "%d ", accum)      end if      count = count + 1    end while    printf(1, "%d\n", accum)`

The Ruby version:

Code: Select all

`    accum = 0    count = 0    while count < 1545 do      left_edge = -420      right_edge  =  300      top_edge  =  300      bottom_edge = -300      x_step    =  7      y_step    =  15      max_iter  =  200      y0 = top_edge      while y0 > bottom_edge do        x0 = left_edge        while x0 < right_edge do          y = 0          x = 0          the_char = 32          x_x = 0          y_y = 0          i = 0          while i < max_iter and x_x + y_y <= 800 do            x_x = (x * x) / 200            y_y = (y * y) / 200            if x_x + y_y > 800 then              the_char = 48 + i              if i > 9 then                the_char = 64              end            else              temp = x_x - y_y + x0              if ((x < 0 and y > 0) or (x > 0 and y < 0)) then                y = (-1 * ((-1 * x * y) / 100)) + y0              else                y = (x * y / 100) + y0              end              x = temp            end            i = i + 1          end          accum = accum + the_char          x0 = x0 + x_step        end        y0 = y0 - y_step      end      if count % 300 == 0 then        printf "%d ", accum      end      count = count + 1    end    puts accum`

The Pascal version:

Code: Select all

`    program mandel;    var left_edge, right_edge, top_edge, bottom_edge, max_iter,    x_step, y_step, y0, x0, x, y, i, x_x, y_y, temp, the_char, accum,    count:integer;    begin;      accum := 0;      count := 0;      while count < 1545 do begin        left_edge := -420;        { left edge = -2.1 }        right_edge  :=  300;        { right edge = 1.5 }        top_edge  :=  300;        { top edge = 1.5 }        bottom_edge := -300;        { bottom edge = -1.5 }        x_step    :=  7;          { x step size }        y_step    :=  15;         { y step size }        max_iter  := 200;         { max iteration depth }        y0 := top_edge;        while y0 > bottom_edge do begin     { y0 }          x0 := left_edge;          while x0 < right_edge do begin      { x0 }            y := 0;               { y }            x := 0;               { x }            the_char := 32;           { char to be displayed }            x_x := 0;            y_y := 0;            i := 0;            while (i < max_iter) and (x_x + y_y <= 800) do begin  { iteration }              x_x := (x * x) div 200;               { x*x }              y_y := (y * y) div 200;               { y*y }              if x_x + y_y >   800 then begin                the_char := 48 + i;               { print digit 0...9 }                if i > 9 then begin               { if iteration count > 9, }                  the_char := 64;               {  print '@' }                end;              end else begin                temp := x_x - y_y + x0;             { temp = x*x - y*y + x0 }                if (((x < 0) and (y > 0)) or ((x > 0) and (y < 0))) then                  y := (-1 * ((-1 * x * y) div 100)) + y0                else                  y := (x * y div 100) + y0;          { y = 2*x*y + y0 }                x := temp;                    { x = temp }              end;              i := i + 1;            end;            accum := accum + the_char;            x0 := x0 + x_step;          end;          y0 := y0 - y_step;        end;        if count mod 300 = 0 then          write(accum, ' ');        count := count + 1;      end;      writeln(accum);    end.`

The Lua version.

Code: Select all

`    local left_edge, right_edge, top_edge, bottom_edge, max_iter, x_step, y_step, y0, x0, x,    y, i, x_x, y_y, temp, the_char, accum, count    accum = 0    count = 0    while count < 1545 do      left_edge = -420      right_edge  =  300      top_edge  =  300      bottom_edge = -300      x_step    =  7      y_step    =  15      max_iter  =  200      y0 = top_edge      while y0 > bottom_edge do        x0 = left_edge        while x0 < right_edge do          y = 0          x = 0          the_char = 32          x_x = 0          y_y = 0          i = 0          while i < max_iter and x_x + y_y <= 800 do            x_x = math.floor((x * x) / 200)            y_y = math.floor((y * y) / 200)            if x_x + y_y > 800 then              the_char = 48 + i              if i > 9 then                the_char = 64              end            else              temp = x_x - y_y + x0              if ((x < 0 and y > 0) or (x > 0 and y < 0)) then                y = (-1 * math.floor((-1 * x * y) / 100)) + y0              else                y = math.floor(x * y / 100) + y0              end              x = temp            end            i = i + 1          end          accum = accum + the_char          x0 = x0 + x_step        end        y0 = y0 - y_step      end      if count % 300 == 0 then        print(accum)      end      count = count + 1    end    print (pc)`

The Python version.

Code: Select all

`    accum = 0    count = 0    while count < 1545:      left_edge = -420      right_edge  =  300      top_edge  =  300      bottom_edge = -300      x_step    =  7      y_step    =  15      max_iter  =  200      y0 = top_edge      while y0 > bottom_edge:        x0 = left_edge        while x0 < right_edge:          y = 0          x = 0          the_char = 32          x_x = 0          y_y = 0          i = 0          while i < max_iter and x_x + y_y <= 800:            x_x = int((x * x) / 200)            y_y = int((y * y) / 200)            if x_x + y_y > 800:              the_char = 48 + i              if i > 9:                the_char = 64            else:              temp = x_x - y_y + x0              if ((x < 0 and y > 0) or (x > 0 and y < 0)):                y = (-1 * int((-1 * x * y) / 100)) + y0              else:                y = int(x * y / 100) + y0              x = temp            i = i + 1          accum = accum + the_char          x0 = x0 + x_step        y0 = y0 - y_step      if count % 300 == 0:        print(accum)      count = count + 1    print(accum)`
fxm
Posts: 9122
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

### Re: integer operations benchmark (or interpreter killer)

- As you are using Windows 64-bit, I think that INTEGER defines a 64 bits integer number.
- Can you test it by using everywhere LONG (32 bits) instead of INTEGER?
Ed Davis
Posts: 10
Joined: Jul 28, 2008 23:24

### Re: integer operations benchmark (or interpreter killer)

- As you are using Windows 64-bit, I think that INTEGER defines a 64 bits integer number.
- Can you test it by using everywhere LONG (32 bits) instead of INTEGER?

1.37 seconds.

Interesting. I thought it would have been much closer to the 1.09 seconds of the INTEGER version.
D.J.Peters
Posts: 7808
Joined: May 28, 2005 3:28

### Re: integer operations benchmark (or interpreter killer)

For array indices i use in my current FreeBASIC project unsiged integers.

The native CPU data type is always faster than any other data types.

A 64 bit data type on 64 bit CPU can be transferred on the data bus without masking.

Vice versa a 64 bit data type on 32 bit CPU must be masked and/or shifted and transferred in two pieces.

So i would accept ~20% performance differences in both direction

64bit data on 32bit CPU/BUS or 32bit on 64bit CPU/BUS.

But this all, is only true if the compiler, virtual machine or interpreter knows what it does. :-)

bla bla bla.

However nice collection of benchmarks.

Joshy
dkl
Posts: 3209
Joined: Jul 28, 2005 14:45
Location: Germany

### Re: integer operations benchmark (or interpreter killer)

If using the 64bit version of FB, that means it's using the -gen gcc backend, and you can do fbc -O 2 to tell it to use gcc -O2, which should make it similar to the gcc benchmark with the same options.
fxm
Posts: 9122
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

### Re: integer operations benchmark (or interpreter killer)

Because I was surprised by the score of 1.09 for FreeBASIC compared to 1 for C/gcc.
I think that for C/gcc, INT is a 32 bits long.
Thus, 1.37 (for FreeBASIC with LONG) must rather be compared to 1 (for C/gcc with INT).

And what about all the other results?
caseih
Posts: 1366
Joined: Feb 26, 2007 5:32

### Re: integer operations benchmark (or interpreter killer)

dkl wrote:If using the 64bit version of FB, that means it's using the -gen gcc backend, and you can do fbc -O 2 to tell it to use gcc -O2, which should make it similar to the gcc benchmark with the same options.

Yes I would think that the C code would be a nearly 1:1 translation by fbc, so the result should be statistically insignificant from the C-only version.

I'm pretty impressed by Java's results. I bet if you ran the test in a big loop, say 1000 times, the Java speed would come pretty near the C code speed, after the JIT has had a chance to analyze the execution paths.

EDIT:
On my 64-bit Linux machine here, with the -O2 optimization for both FB and C versions, the results are so close as to be identical, 1.3 s each, ran several times.

I also for kicks ran the Python version through PyPy JIT and was pretty impressed. The execution time was 14 seconds vs the interpreter which took at least 60 seconds. Not too shabby for a dynamic language.
Last edited by caseih on Apr 18, 2015 14:45, edited 2 times in total.
marcov
Posts: 2760
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

### Re: integer operations benchmark (or interpreter killer)

caseih wrote:I'm pretty impressed by Java's results. I bet if you ran the test in a big loop, say 1000 times, the Java speed would come pretty near the C code speed, after the JIT has had a chance to analyze the execution paths.

I wonder why that is surprising? It's all value types, and then the basic problem to the (JIT-) compiler is no different, except that the JIT compiler knows the exact CPU it runs on. Java could be FASTER than gcc. Though of course it is, as a JIT compiler, probably more reluctant to spend time in compiling/optimizing.
caseih
Posts: 1366
Joined: Feb 26, 2007 5:32

### Re: integer operations benchmark (or interpreter killer)

Well I never said I was surprised by Java's results (and in fact I was not at all surprised). Rather I was impressed. On my machine, when I turn the benchmark into a method and run it many times, the execution time comes within .1 s of the C version. Anyway, sort of interesting.
dodicat
Posts: 5913
Joined: Jan 10, 2006 20:30
Location: Scotland

### Re: integer operations benchmark (or interpreter killer)

If you use andalso instead of and
Then
instead of or use orelse

Then the Freebasic code is a bit faster.

Or is that perhaps cheating?
caseih
Posts: 1366
Joined: Feb 26, 2007 5:32

### Re: integer operations benchmark (or interpreter killer)

Wait. Why would andalso be cheating, and how would it be faster than the C code? I realize that in normal FB code, the AND is bitwise, not logical, so it's not short-circuited. hence it will be slower. But in C, a logical and is short-circuited, so in the C version it's already as fast as it can be. ANDALSO would simply bring it to C equivalence, no?

I learn something new about FB a lot on this forum. I hadn't known about these operators and I always had found BASIC's lack of true logical operators to be a weak spot, which ANDALSO and ORELSE handily address. This brings up a question. Why would I not want to use ANDALSO or ORELSE in a logical expression? I can't think of any situation, unless I specifically need bitwise operations. Thoughts?
dkl