fbcunit - fbc compiler unit testing component

Post by **coderJeff** » Jan 06, 2018 18:23

I created the pull request. GitHub says "All checks have failed" in big red letters. That's genuine honesty for you :)

The Travis CI logs report the same thing.

Code: Select all

./fbc-tests
Aborting due to runtime error 6 (out of bounds array access) at line 133 of src/fbcunit.bas::HASH_FIND()
make[1]: *** [run_tests] Error 6
make[1]: Leaving directory `/home/travis/build/freebasic/fbc/tests'
make: *** [cunit-tests] Error 2

I am going to guess out-of-order execution of module constructors, and the dynamic shared arrays aren't getting set up first. I should probably explicitly redim the arrays on first use. Dang it.
---
EDIT: yup, that was it. Fixed.
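
For readers following along, here is a minimal sketch of what "redim on first use" could look like. It is not the actual fix from src/fbcunit.bas; `hash_table`, `hash_ready`, `hash_init_once`, `hash_find` and `TABLE_SIZE` are hypothetical names chosen for illustration. The idea is that the lookup routine sizes the array itself, so it does not matter which module's start-up code runs first.

```freebasic
'' Sketch only: names are hypothetical and do not match src/fbcunit.bas.
const TABLE_SIZE = 64

dim shared hash_table() as long      '' dynamic array, no bounds at start-up
dim shared hash_ready as boolean     '' shared variables start zeroed, i.e. FALSE

'' Size the table the first time anyone needs it.
private sub hash_init_once( )
	if( hash_ready = false ) then
		redim hash_table( 0 to TABLE_SIZE - 1 )
		hash_ready = true
	end if
end sub

'' Every access goes through the init check first.
function hash_find( byval key as ulong ) as long
	hash_init_once( )
	return hash_table( key mod TABLE_SIZE )
end function

print hash_find( 5 )    '' slot is still empty, so 0
```

The cost is one boolean check per lookup, which seems a fair trade for not depending on constructor order.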

Post by **St_W** » Jan 06, 2018 23:41

I saw that dkl already merged the PR, but unfortunately all my FB-Test Jenkins jobs are failing now because no XML result file is generated when building & running the unit tests with:

Code: Select all

make cunit-tests auto=1

Could you provide an option in the Makefile to generate a JUnit result XML file? As far as I saw, the only option currently is to execute the unit-test application manually with the appropriate command line arguments; ideally the unit tests would run just once during the execution of the makefile. The workaround below works, but of course the unit tests are executed twice for no real reason:

Code: Select all

make cunit-tests auto=1
./fbc-tests.exe --xml fbc-tests-Results.xml

An option in the Makefile that allows passing arguments to the unit-test executable would do (and should be very simple to implement).

Post by **coderJeff** » Jan 07, 2018 3:41

First, FYI, for anyone reading this who doesn't know what we are talking about:
- the fbc test suite is an important part of FB Dev Team's quality assurance.
- currently there are 3500+ tests that make hundreds of thousands of assertions about how the compiler and run-time library are supposed to work
- there are primarily 2 test methods, compile-time tests and run-time tests
- the fbcunit library we are talking about here is responsible for recording and reporting the results of the run-time tests
- the makefiles and build scripts are responsible for recording and reporting the compile time tests.

@St_W
Yeah, there's a number of items to change and/or clean-up.
1) Is AUTO=1 expected to do other automatic things? If not, then we could name the option for what it is and add XMLOUTPUT=path/file.xml as an option to the makefile that is passed to the `./fbc-tests.exe --xml path/file.xml' invocation.
2) Aesthetically, I want to use "Unit", "UNITTEST", etc. I will keep the "cunit-tests" target for now, but deprecate its use. CUNIT, cunit, CUnit, CUNITTESTS, etc. are used throughout the makefiles and docs.
3) Does anyone use the ALLOW_CUNIT=1 option? I added that because, once upon a time, I was having trouble with DOS builds. It uses a "fake" CUnit include file so that **all** tests get built the same way as the log-tests and don't need the libcunit.a library. It is extremely slow. I am hoping to remove the ALLOW_CUNIT stuff completely.
4) I will do my best to keep from breaking stuff. Do expect some changes though.

Alternative to your work-around (there are multiple ways to do this):

Code: Select all

## special.mk

include cunit-tests.mk

.PHONY: stw_tests
stw_tests: make_fbcu $(CUNIT_TESTS_INC) build_tests stw_run

.PHONY: stw_run
stw_run:
	./$(MAINEXE) --xml fbc-tests-Results.xml

Then invoke with `make -f special.mk stw_tests'.

Post by **St_W** » Jan 07, 2018 4:40

@coderJeff
re 1) CUnit provided automated, basic and console execution modes (see http://cunit.sourceforge.net/doc/running_tests.html), but to be honest I always used automated mode AFAIR and can't tell if there are other implications than those described on the referenced page.
re 2) agree
re 3) I can only speak for myself, but I don't and probably never have. Neither do I run tests on DOS (as there's no Jenkins environment available for that platform and running the tests inside NTVDM probably doesn't make a lot of sense).
re 4) I'm absolutely fine with changing/updating the way things work. As mentioned previously, I think that even this is a reasonable solution:

Code: Select all

make unit-tests UNITTEST_RUN_ARGS="--xml fbc-tests-Results.xml"

Post by **coderJeff** » Jan 07, 2018 22:44

Latest changes are pushed to https://github.com/jayrm/fbc/tree/fbcunit
I haven't made a pull request yet, as I am trying to do a test in a DOS build. So far I have tested latest fbcunit branch in win32 gas/gcc, win64 gcc, and Linux 64 gcc (slackware 14.1 64-bit actually).

---
I added the UNITTEST_RUN_ARGS="..." option. It works, but I think it is the wrong way to solve this. I need more time to figure this one out. For your interest, hopefully I can explain why.
- yes, the old fbcu (+ libcunit.a) had different execution modes, which depended on how the library was built, so we had to pass, for example, AUTO=1 when building the fbcu wrapper to get the tests to run a certain way.
- ideally, the makefile should be targets, prerequisites, & recipes, nothing else. The top-level ./tests/Makefile mostly dispatches commands, which could just as easily be done in a shell script.
- unit-tests.mk is more like a makefile should be. For example, you can build a single target with `make -f unit-tests.mk pp/hello.o' and it will take the needed steps to get there
- currently the only product of `make unit-tests' is an exit code. Yes, it does print some stuff to the console for the user's benefit, but in the end we either get an exitcode=0 for success or exitcode=1 for failure.
- so for the xml file generation, it should be rewritten as a target in the makefile, IMO.

I will leave the UNITTEST_RUN_ARGS="..." option there for now, and I will change and/or remove it later. Also, cunit-tests target will still work, but is deprecated. Use 'unit-tests' instead.

---
fbcu.bi compatibility header is still in use. Next step is to review the .bas test source files, and for consistency, rewrite using the framework's SUITE() and TEST() macros.

Post by **coderJeff** » Jan 09, 2018 5:09

I tested the DOS build in a win98 VM using djgpp-2.05 and "current" packages. I built fbc 1.06 dos version from fbc 1.05 dos version.

building issues:
- 'cd path && make something' commands in a makefile fail, because command.com is being invoked by default, and it does not like '&&' separators.
- other commands then fail as well because wrong 'echo', 'find', etc are being invoked
- solution: use 'make unit-tests SHELL=d:/dj/bin/sh.exe' to set make's shell command.

compiling issues:
- crt/varargs.bas failed to compile with djgpp environment because snprintf() not available
- solution: changed test to compile but fail if snprintf not defined (commit)

unit-tests issues:
- crt/varargs.bas failed (no snprintf defined, expected)
- optimizations\inline-ops.bas(143) : error : FBC_TESTS.OPTIMIZATIONS.INLINE_OPS.TEST CU_ASSERT_DOUBLE_EQUAL(asin( v ),asinf( v ),EPSILON_SNG)
- all other unit-tests passed.

warnings-tests:
- also OK, no diffs for dos version except the EOL endings in the report

log-tests issues:
- FAILED LOG - for log-tests -lang fb
- ./functions/argv.log:functions/argv.bas : RESULT=FAILED
- if( __FB_DOS__ ) defined & #include "windows.bi" ? That can't be correct.

---
So I'm done messing around with the structure of the test-suite and its makefiles.

Post by **counting_pine** » Jan 09, 2018 13:53

I think we are misusing the epsilon parameter of CU_ASSERT_DOUBLE_EQUAL in optimizations\inline-ops.bas, and probably in other places too.
The epsilon should probably be the single-precision ULP of the value in question, rather than a fixed value.

Post by **coderJeff** » Jan 10, 2018 2:35

I think I understand. Like if we were comparing, in decimal, with 5 digits after the decimal point:
6.67408x10^-11 to some other number
EPSILON should be 0.00001x10^-11, or 1x10^-16

Post by **counting_pine** » Jan 10, 2018 13:21

Yeah, that's basically it. So the ULP for single-precision x will be something like 2^(log2(x) - 24). (It might be 23, I forget.)
Or I think it'd be roughly equivalent to (x * EPSILON), for the purposes of comparing abs(x-y).

That said, 1 ulp may not be enough in some cases. There may be some error in the lowest bits on transcendental functions. But we can always start at 1 and increase it if that's too strict.
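
To pin down the "23 or 24" question, here is a small sketch that reads the exponent straight from the bit pattern. `sng_ulp` is a hypothetical helper, not something in fbcunit, and it assumes a finite, normalised IEEE-754 single. With the exponent taken as floor(log2(|x|)), the offset is 23, because a single stores 23 mantissa bits; 24 would be right if the exponent came from a frexp-style normalisation where the mantissa lies in [0.5, 1).

```freebasic
'' Sketch only: sng_ulp is a hypothetical helper, not part of fbcunit.
'' Assumes x is a finite, normalised IEEE-754 single (not 0, subnormal, Inf or NaN).
function sng_ulp( byval x as single ) as double
	dim bits as long = *cast( long ptr, @x )
	'' the biased exponent lives in bits 23..30; the bias is 127
	dim e as long = ((bits shr 23) and &hff) - 127
	'' 23 stored mantissa bits, so one ULP is 2^(e - 23)
	return 2.0 ^ (e - 23)
end function

print sng_ulp( 1.0 )    '' 2^-23, about 1.192093e-07
print sng_ulp( 0.1 )    '' 2^-27, about 7.450581e-09
```

Note that the ULP of 1.0 is the familiar single-precision epsilon, which is why a fixed epsilon only behaves like "1 ulp" for values between 1 and 2.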

Post by **coderJeff** » Jan 12, 2018 2:56

Thanks, counting_pine, I will try it. The assertion macros in the test suite use only expressions and ultimately '=' or '<>' to test the assertion. Before the next fbcunit update I could try to re-implement the CU_ASSERT_DOUBLE_EQUAL macro as a function call that calculates the allowable error.

Post by **counting_pine** » Jan 12, 2018 13:31

A different name might be good for compatibility's sake, in case we're using it correctly anywhere.

An implementation might look something like this, if it doesn't rely too much on the IEEE double representation:

Code: Select all

sub CU_ASSERT_DOUBLE_APPROX(a as double, b as double, ulps as longint = 1)
    CU_ASSERT(abs(cvlongint(a) - cvlongint(b)) <= abs(ulps))
end sub

(I think it's easier to write that way, although it's potentially more confusing mathematically.)

Something similar could work for SINGLE, with cvl instead of cvlongint.

Post by **coderJeff** » Feb 19, 2018 6:30

An update in case anyone is curious: I read several articles about floating point comparisons and ULP. Rather fascinating. This one was pretty good, and not so heavy as this paper. I looked at implementations like the ones in GoogleTest, Boost, etc., and they are doing about the same thing.

After doing the reading & research I looked at the failed test again with this sample code:

Code: Select all

#include once "crt/math.bi"
const EPSILON_SNG as single = 1.19290929e-7
#define SNGEQ( a, e, g ) cbool( (abs(csng(a)-csng(e))) <= abs(csng(g)) )
#define DBLEQ( a, e, g ) cbool( (abs(cdbl(a)-cdbl(e))) <= abs(cdbl(g)) )
for v as single = -1 to 1 step .01
	if( SNGEQ(asin(v),asinf(v),EPSILON_SNG) = false ) then
		print "SNG", v, asin(v), asinf(v), asin(v)-asinf(v)
	end if
	if( DBLEQ(asin(v),asinf(v),EPSILON_SNG) = false ) then
		print "DBL", v, asin(v), asinf(v), asin(v)-asinf(v)
	end if
next

Under a FreeDOS VM and DOSBox, ASM code generation is the same as win32, but the DOS version fails on calls to asin(). Looks like we are off by just a little more than the smallest float (single) step, but only when converting to double. Here is a snippet of the output:

Code: Select all

DBL            0.7399989     0.8330688     0.8330689    -1.192093e-07
DBL            0.7599989     0.8633115     0.8633116    -1.192093e-07
DBL            0.7899989     0.9108072     0.9108073    -1.192093e-07
DBL            0.7999989     0.9272934     0.9272935    -1.192093e-07
DBL            0.8099989     0.9441502     0.9441503    -1.192093e-07

It doesn't fail every assertion. It's right on the edge of pass/fail. Comparison of singles passes always. Maybe cpu rounding mode? Maybe something peculiar about djgpp's libc? I dunno.

Post by **coderJeff** » Feb 19, 2018 14:31

Here's my first try at implementing the ULP calculation. If you explore the values near zero, you can see that the ULP distance becomes huge compared to the epsilon value selected for the test (which comes from fbc's test suite). So, I think I need a better test for testing the ULP calculation.

Code: Select all

''
function sng_isnan( byval a as single ) as boolean

	dim ia as long = *cast( long ptr, @a )

	'' NAN = exponent all 1's and mantissa not zero
	return cbool( ((ia and &h7f800000) = &h7f800000 ) _
	      andalso ((ia and &h007fffff) <> 0) )

end function

''
function sngULP( byval a as single, byval b as single ) as long

	dim ia as long = *cast( long ptr, @a )
	dim ib as long = *cast( long ptr, @b )

	if( sng_isnan(a) ) then
		return &h7fffffff
	end if

	if( sng_isnan(b) ) then
		return &h7fffffff
	end if

	'' signs different?
	if( (ia and &h80000000) <> (ib and &h80000000) ) then

		'' compare -0 and +0
		if( a = b ) then
			return 0
		end if

		'' assume big diff
		return &h7fffffff
	
	end if
		
	'' signs are the same, return |ia-ib|
	ia and= &h7fffffff
	ib and= &h7fffffff
	return iif( ia>ib, ia-ib, ib-ia )

end function

''
function dbl_isnan( byval a as double ) as boolean

	dim ia as longint = *cast( longint ptr, @a )

	'' NAN = exponent all 1's and mantissa not zero
	return cbool( ((ia and &h7ff0000000000000ll) = &h7ff0000000000000ll ) _
	      andalso ((ia and &h000fffffffffffffll) <> 0ll) )

end function

''
function dblULP( byval a as double, byval b as double ) as longint

	dim ia as longint = *cast( longint ptr, @a )
	dim ib as longint = *cast( longint ptr, @b )

	if( dbl_isnan(a) ) then
		return &h7fffffffffffffffll
	end if

	if( dbl_isnan(b) ) then
		return &h7fffffffffffffffll
	end if
	
	'' signs different?
	if( (ia and &h8000000000000000ll) <> (ib and &h8000000000000000ll) ) then

		'' compare -0 and +0
		if( a = b ) then
			return 0
		end if

		'' assume big diff
		return &h7fffffffffffffffll
	
	end if
		
	'' signs are the same, return |ia-ib|
	ia and= &h7fffffffffffffffll
	ib and= &h7fffffffffffffffll
	return iif( ia>ib, ia-ib, ib-ia )

end function

const EPSILON_SNG as single = 1.19290923e-7
const EPSILON_DBL as double = 2.2204460492503131e-016
for v as single = -1 to 1 step 0.01
	print v, sngULP(v,v+EPSILON_SNG), dblULP(cdbl(v),cdbl(v)+EPSILON_DBL)
next

EDIT: fixed a copy/paste error in types
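
One possible shape for a better test is to build inputs from known bit patterns, so the expected distance is known in advance. The sketch below is only an illustration: `ulp_distance` is a condensed stand-in for sngULP above (same sign only, no NaN handling) and `sng_from_bits` is a hypothetical helper. It also shows why the distance explodes near zero: the gap between 0 and 2^-23 covers every smaller single.

```freebasic
'' Sketch only: condensed stand-in for sngULP, for finite values of the same sign.
function ulp_distance( byval a as single, byval b as single ) as long
	dim ia as long = *cast( long ptr, @a ) and &h7fffffff
	dim ib as long = *cast( long ptr, @b ) and &h7fffffff
	return iif( ia > ib, ia - ib, ib - ia )
end function

'' Hypothetical helper: rebuild a single from a raw 32-bit pattern.
function sng_from_bits( byval bits as long ) as single
	return *cast( single ptr, @bits )
end function

dim one as single = 1.0                                '' bit pattern &h3f800000
dim next_up as single = sng_from_bits( &h3f800001 )    '' the very next single

print ulp_distance( one, next_up )       '' adjacent values: 1
print ulp_distance( 0.0, 2.0 ^ (-23) )   '' 0 to 2^-23: &h34000000 = 872415232
```

Stepping the bit pattern rather than adding an epsilon keeps the expected result independent of the magnitude of the value under test.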

Post by **coderJeff** » Feb 21, 2018 0:09

For an old Jack Tar (I think he said old...), dodicat posted a clever idea; more hex. Actually I used binary. There is something weird going on. As interesting as the ULP might be, the conversion and/or actual function getting called might be a cause. I don't know yet. If curious, compare the DOS & WIN32 output listed in this little snippet:

Code: Select all

/'
DOS OUTPUT

sin  : 0 01111111011 1001100011101010111011010001111010100000001000101011
sinf : 0 01111111011 1001100011101010111011010001111001001000110101001010
asin : 0 01111111011 1001101001001001001001111100011100100001111111101101
asinf: 0 01111111011 1001101001001001001010000000000000000000000000000000

WIN32 OUTPUT

sin  : 0 01111111011 1001100011101010111011010001111010100000001000101011
sinf : 0 01111111011 1001100011101010111011100000000000000000000000000000
asin : 0 01111111011 1001101001001001001001111100011100100001111111101101
asinf: 0 01111111011 1001101001001001001001111100011100100001111111101101

'/

#include once "crt/math.bi"

function Dbl2Bin( byval a as double ) as string
  dim x as string
  dim ia as longint = *cast( longint ptr, @a )
  x =           bin( (ia and &h8000000000000000ll) shr 63, 1 )
  x = x & " " & bin( (ia and &h7ff0000000000000ll) shr 52, 11 )
  x = x & " " & bin( (ia and &h000fffffffffffffll)       , 52 ) 
  function = x        
end function

dim a as single
dim b as double
a = .1

b = sin(a)
print "sin  : " & Dbl2Bin( b )
b = sinf(a)
print "sinf : " & Dbl2Bin( b )

b = asin(a)
print "asin : " & Dbl2Bin( b )
b = asinf(a)
print "asinf: " & Dbl2Bin( b )

Post by **TeeEmCee** » Feb 21, 2018 3:04

Other CPUs are another thing to consider.
I'm afraid I haven't tried out your cunit replacement or tested FB on an ARM in a long time, but just want to draw your attention to this relevant commit:

commit 06cee30277a9a925e43af41e6d5a81c3410f52b1
Author: Ralph Versteegen <teeemcee@gmail.com>
Date: Mon Oct 24 16:00:27 2016 +1300

Update many testcases to work on ARM and DOS/8-bit-wstring platforms

Tests now check sizeof(wstring) instead of the platform, except for
quirk/len-sizeof.bas, which checks sizeof(wstring) is as expected on
each platform. Likewise cleanup alignment checking.

The following testcases still fail on ARM (when emulated in QEMU):

Suite fbc_tests.optimizations.consteval, Test uop double had failures:
1. optimizations/consteval.bas:338 - CU_ASSERT_DOUBLE_EQUAL(exp( 1.0 ),exp( d ),EPSILON_DBL)
2. optimizations/consteval.bas:338 - CU_ASSERT_DOUBLE_EQUAL(exp( d ),exp( 1.0 ),EPSILON_DBL)
Suite fbc_tests.optimizations.consteval, Test math bop double had failures:
1. optimizations/consteval.bas:422 - CU_ASSERT_DOUBLE_EQUAL(d3,d4,EPSILON_DBL)
Suite fbc_tests.string.print_using, Test inf/nan/ind printing test had failures:
1. string/print_using.bas:403 - CU_ASSERT_EQUAL(sResult,sExpected)
(etc)

The exp tests fail because the error is two epsilons. Line 422 tests pow(),
which is 8 epsilons out.
The print_using tests fail because of differences in whether the CPU produces
positive or negative NaN (which is not standardised by IEEE) and some INDs
being NANs instead. It doesn't appear practical to do anything about it.

So I guess the error margins for those tests simply need increasing.

As for the tests for NaN, etc, I take back what I wrote before. If the behaviour of those operations is not standardised, and differs in practice between different CPUs and emulated CPUs, then it seems that FB shouldn't try to standardise it, so those tests shouldn't be there. Instead they could just test that the result is non-finite. But maybe this is just a QEMU bug?
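
A minimal sketch of such a "non-finite" check, assuming IEEE-754 doubles; `dbl_is_nonfinite` is a hypothetical name, not an existing fbcunit macro. It looks only at the exponent bits, so it gives the same answer for +NaN, -NaN, +Inf and -Inf, whichever one a given CPU or emulator produces.

```freebasic
'' Sketch only: dbl_is_nonfinite is a hypothetical helper, not an fbcunit macro.
'' Inf and NaN both have all 11 exponent bits set; sign and mantissa are ignored.
function dbl_is_nonfinite( byval x as double ) as boolean
	dim bits as ulongint = *cast( ulongint ptr, @x )
	return cbool( (bits and &h7ff0000000000000ull) = &h7ff0000000000000ull )
end function

dim big as double = 1e308
dim pos_inf as double = big * 10             '' overflows to +Inf at run time
dim some_nan as double = pos_inf - pos_inf   '' NaN; its sign may differ between CPUs

'' expect true for the first two values and false for the third
print dbl_is_nonfinite( pos_inf ), dbl_is_nonfinite( some_nan ), dbl_is_nonfinite( 1.0 )
```

A test written this way would no longer depend on whether the platform reports "nan", "-nan" or "ind" when printing.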

IIRC, I also ran the testcases on a real ARM, not just QEMU, but it was a while ago... or maybe I had trouble running the testcases on Android 2.2's horribly deficient commandline. Well I finally recently bought a modern Android phone which is more POSIX-compliant.
So I should do some testing for you.

Semi-related: an x86 CPU may or may not store intermediate results in extended rather than double precision. Different OSes have different defaults. Linux and MinGW turn on extended precision, while Windows, the BSDs and OSX default to double precision. Also, I have seen different floating point results for FB code calculating "CINT(30 * 1.5)" (written so it's not evaluated at compile time) when using -gen gcc and -gen gas. I fixed this by forcing double precision everywhere, but unfortunately libm on x86/Linux assumes extended precision intermediate results, so it makes pow(), etc. less accurate.
