Signed/unsigned equality. Bug?

xlucas · Post by **xlucas** » Nov 21, 2017 5:57

I have just found something that maybe is not currently considered a bug, but it cost me many hours trying to locate an error I had committed, which wouldn't have happened otherwise, so just in case, here it goes...

It looks like, if you have a negative Long and its binary ULong twin and you compare them, this is seen as "equal" when compiling in 32bit, but as non-equal if compiling in 64bit. Let me say it with code:

Code:

Dim a As Short, b As UShort

a = -1234   'Change this to any negative number
b = a
Print b
Print Hex(a); " = "; Hex(b)
If a = b Then Print "equal" Else Print "non-equal"
GetKey

When I compile and run this in Xubuntu 32 bit, it returns "equal". In Xubuntu 64 bit, it returns non-equal. Please verify and if anybody can comment what happens on other platforms, I'm curious to know.

The problem with this is that you may be working on a project on one system and not realise that you'll get a different result in another. When you see the difference, it's very difficult to find the bug. In my case, the program would just hang because a certain variable would never reach a certain value. Lucky that I just switched to 64bit or I would never have seen this. Is this a known thing?

Munair · Post by **Munair** » Nov 21, 2017 6:16

xlucas wrote: a negative Long and its binary ULong twin and you compare them

From a decimal perspective -1234 <> 4294966062, which is what the compiler wants to tell us. Changing it to HEX changes the interpretation.

I would never compare signed and unsigned. Even though in this case it is technically the same value, the interpretation is different because one has the sign bit set. So in HEX you see the same value, but not in decimal. The code is flawed and should never be used, unless you specifically want to convert signed to unsigned, i.e. cast to positive only, which makes further comparison between the two pointless from a decimal point of view.

xlucas · Post by **xlucas** » Nov 21, 2017 6:29

I don't think it is a bug that the numbers give non-equal. I think it could be a bug that it gives a different result if compiled in 32bit.

I'm trying to replicate it now and realised the code as I posted it is giving non-equal on both, so I'll have to check back the original code (much longer) to see exactly why it was different. Anyway... here's the code that actually caused the problem:

Code:

Sub TargaSave(filename As String, image As Any Ptr)
	Dim As Integer iwidth, iheight, bypp, linelength
	Dim As ULong Ptr imagestart  '<---- Notice that imagestart points to unsigned
		
	'Reset error information
	TargaError = 0 : TargaErrorMessage = ""
	
	'Make sure it's a valid image
	If image = 0 Then
		TargaError = 101
		TargaErrorMessage = "No image in buffer"
		Exit Sub
	End If
	ImageInfo image, iwidth, iheight, bypp, linelength, imagestart

	If bypp <> 4 Then
		TargaError = 102
		TargaErrorMessage = "Not a 32bit image. Unsupported"
		Exit Sub
	End If
	
	'See if the image contains any alpha information
	Dim alphachannel As Byte = 0
	For i As Long = 0 To iwidth * iheight - 1
		If imagestart[i] ShR 24 <> 255 Then
			alphachannel = -1
			Exit For
		End If
	Next i
	
	'Set up image header
	Dim h As TargaHeader, f As Short
	
	If alphachannel Then
		h.PixelDepth = 32
		h.ImageDescriptor = 8
	Else
		h.PixelDepth = 24
		h.ImageDescriptor = 0
	End If
	
	h.ImageType = 10
	h.ImageWidth = iwidth
	h.ImageHeight = iheight

	'Open file
	f = FreeFile
	If Open(filename For Output As f) Then
		TargaError = 103
		TargaErrorMessage = "Failed to create image file"
		Exit Sub
	Else
		Close f
		Open filename For Binary Access Write As f
	End If
	
	'Put header
	Put #f, 1, h
	
	'Compress image row by row
	Dim rp As Long, column As Long, buffer As String
	Dim count As Short, status As Byte, sample As Long   '<----------- Notice that sample is signed
	
	For i As Long = 0 To iheight - 1	'For every row...
		'Calculate where to read the row from
		rp = (iheight - i - 1) * linelength \ 4
		
		buffer = ""
		column = 0
		status = 0	'Still don't know if RLE or not
		count = 0	'Nothing pending
		Do
			Select Case status
				Case 0 'Undefined
					sample = imagestart[rp + column]
					
					'If it's the last pixel, just push it
					If column = iwidth - 1 Then
						If alphachannel Then
							buffer &= Chr(0) + MkL(sample)
						Else
							buffer &= Chr(0) + Left(MkL(sample), 3)
						End If
						Exit Do
					End If
					
					count = 0
					If sample = imagestart[rp + column + 1] Then   '<---- This comparison gives non-equal in 64 bit and equal in 32 bit (when two pixels in a row are the same colour and alpha channel is a high value, say, &HFF)
						status = 1	'Building an RLE block
					Else
						status = 2	'Building a non-RLE block
					End If
				Case 1	'RLE
					If imagestart[rp + column] = sample Then
						If count = 128 Then	'Block full. Push into the buffer
							If alphachannel Then
								buffer &= Chr(255) + MkL(sample)
							Else
								buffer &= Chr(255) + Left(MkL(sample), 3)
							End If
							count = 0
							status = 0
						Else
							count += 1
							column += 1
						End If
					Else
						'Found an end for the RLE block
						buffer &= Chr(127 + count)
						If alphachannel Then
							buffer &= MkL(sample)
						Else
							buffer &= Left(MkL(sample), 3)
						End If
						count = 0
						status = 0
					End If
				Case Else	'Non-RLE
					If imagestart[rp + column] = imagestart[rp + column + 1]  Then
						'End of non-RLE block
						buffer &= Chr(count - 1)
						For j As Short = column - count To column - 1
							If alphachannel Then
								buffer &= MkL(imagestart[rp + j])
							Else
								buffer &= Left(MkL(imagestart[rp + j]), 3)
							End If
						Next j
						count = 0
						status = 0
					Else
						If count = 128 Then	'Block full. Push into the buffer
							buffer &= Chr(127)
							For j As Short = column - 128 To column - 1
								If alphachannel Then
									buffer &= MkL(imagestart[rp + j])
								Else
									buffer &= Left(MkL(imagestart[rp + j]), 3)
								End If
							Next j
							count = 0
							status = 0
						Else
							count += 1
							column += 1
						End If
					End If
			End Select
		Loop Until column = iwidth
		
		If column = iwidth Then
			If status = 1 Then
				buffer &= Chr(127 + count)
				If alphachannel Then
					buffer &= MkL(sample)
				Else
					buffer &= Left(MkL(sample), 3)
				End If
			Else
				buffer &= Chr(count - 1)
				For j As Short = column - count To column - 1
					If alphachannel Then
						buffer &= MkL(imagestart[rp + j])
					Else
						buffer &= Left(MkL(imagestart[rp + j]), 3)
					End If
				Next j
			End If
		End If
		
		Put #f, , buffer
	Next i
	
	'Targa v2.0 with no extensions
	buffer = String(8, 0) + "TRUEVISION-XFILE." + Chr(0)
	Put #f, , buffer
	
	Close f
End Sub

This code, when compiled in 32bit, will produce a Targa image. When compiled in 64bit, it will hang because the variable column never increases. The problem is that I wrote this code on a 32bit system, so I didn't detect my error until I tried it in 64bit.

Munair · Post by **Munair** » Nov 21, 2017 6:39

A quick look at your code tells me that you point to elements of an image using signed subscript types:

Code:

imagestart[i] ' i as long should be ulong

Pointers are always unsigned, so you have to make sure addressing data is also unsigned. Changing the subscript types to ulong may solve the problem (they are supposed to be positive anyway).

Furthermore, you compare signed to unsigned: sample As Long should be sample As ULong.

Munair · Post by **Munair** » Nov 21, 2017 6:58

BTW, the reason LONG and ULONG HEX in your example give equal on 32bit systems and unequal on 64 bit systems, is because it is a 32bit data type. It is not portable.

xlucas · Post by **xlucas** » Nov 21, 2017 7:16

Munair, yes, indices are always unsigned and I used a signed index variable, but it's still always positive, so it shouldn't make a difference.

Munair wrote: [...] is because it is a 32bit data type. It is not portable.

All types are portable (except possibly Integer if you use it as a 64bit integer number). What sense would it make if they weren't? And when you say "your problem", you mean this particular piece of code. Yes, of course, I was able to fix it easily by making sample ULong, but that's not the point. The matter is that, if the 32bit compiler will sometimes give equal where the 64bit compiler gives non-equal, then you won't see that you're creating a bug while testing in 32bit, and the bug will only become apparent in 64bit.

I am not complaining that my code didn't work. I'm saying I've noticed something that may be inconsistent between the 32bit and 64bit compilers.

Munair · Post by **Munair** » Nov 21, 2017 7:44

xlucas wrote:
Munair wrote: [...] is because it is a 32bit data type. It is not portable.

All types are portable (except possibly Integer if you use it as a 64bit integer number).

Not if you convert between signed and unsigned. Look at the following output of your example on a 64bit system:

Code:

SHORT/USHORT (16bit):
64302
FB2E = FB2E
non-equal

LONG/ULONG (32bit):
4294966062
FFFFFB2E = FFFFFB2E
non-equal

LONGINT/ULONGINT (64 bit):
18446744073709550382
FFFFFFFFFFFFFB2E = FFFFFFFFFFFFFB2E
equal

You get non-equal where on 16 or 32bit systems you get equal because they are 16 or 32bit data types. Note that this is the interpretation of the HEX function. When a value is negative, the MSB is set. But on 32 bit systems (bit 32) this is not the same bit as on 64 bit systems (bit 64). That's why conversion from signed to unsigned isn't portable.
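
The program behind this output is not shown in the post. A minimal sketch that would print the same three sections, assuming it simply repeats the original test once per pair of types, could look like this (the variable names are made up for illustration):

```freebasic
' Hypothetical reconstruction: the post shows only the output, not the source.
' Each section repeats the original test for one signed/unsigned pair.

Dim s16 As Short = -1234
Dim u16 As UShort = s16
Print "SHORT/USHORT (16bit):"
Print u16
Print Hex(s16); " = "; Hex(u16)
If s16 = u16 Then Print "equal" Else Print "non-equal"
Print

Dim s32 As Long = -1234
Dim u32 As ULong = s32
Print "LONG/ULONG (32bit):"
Print u32
Print Hex(s32); " = "; Hex(u32)
If s32 = u32 Then Print "equal" Else Print "non-equal"
Print

Dim s64 As LongInt = -1234
Dim u64 As ULongInt = s64
Print "LONGINT/ULONGINT (64 bit):"
Print u64
Print Hex(s64); " = "; Hex(u64)
If s64 = u64 Then Print "equal" Else Print "non-equal"
```

The equal/non-equal lines are exactly the part reported to differ between 32bit and 64bit builds, so no expected result is given for them here.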

DamageX · Post by **DamageX** » Nov 21, 2017 8:41

xlucas wrote:
Code:

Dim a As Short, b As UShort

a = -1234   'Change this to any negative number
b = a
Print b
Print Hex(a); " = "; Hex(b)
If a = b Then Print "equal" Else Print "non-equal"
GetKey

xlucas wrote: I'm trying to replicate it now and realised the code as I posted it is giving non-equal on both,

In your example code you have short and ushort. I would expect those to come out unequal for both 32 and 64bit. The variables, having different types, will get extended to integer before the comparison. Ushort will be zero extended and short will be sign extended so the binary data changes. At least, that is the explanation that I read somewhere on the forum. I could be wrong or the compiler could have changed since then.

Comparison between ulong and long on a 64bit platform would be similar. They would be zero extended and sign extended, respectively, to 64 bits before the comparison, producing an unequal result.

Comparing ulong and long with 32 bits is a different story. Because they are already 32 bits, they won't be extended. They have to be compared as they are, and the bits are the same so the result is equal.
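
The extension described above can be made visible by doing the widening by hand. This is only a sketch, under the assumption that a 64bit build widens both operands to a 64bit integer as described; it does not show what the compiler actually emits, and the variable names are illustrative.

```freebasic
Dim a As Long = -1234
Dim b As ULong = a        ' same 32 bits, now read as 4294966062

' Widen by hand to 64 bits:
' the signed value is sign extended, the unsigned one zero extended
Dim wa As LongInt = a     ' still -1234
Dim wb As LongInt = b     ' 4294966062

Print Hex(a); " / "; Hex(b)      ' FFFFFB2E / FFFFFB2E
Print Hex(wa); " / "; Hex(wb)    ' FFFFFFFFFFFFFB2E / FFFFFB2E

' Both operands are LongInt here, so this comparison does not
' depend on whether the program is built as 32bit or 64bit
If wa = wb Then Print "equal" Else Print "non-equal"   ' non-equal
```

Once both values are held in a 64bit type the bit patterns no longer match, which is consistent with the non-equal result reported for 64bit builds.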

Munair · Post by **Munair** » Nov 21, 2017 9:01

People often confuse values and interpretations of the values. Again, when assigning a = -1234 on a 32 bit platform, bit 32 is set to indicate it is negative. Converting it to unsigned on the same platform doesn't change the value, only the interpretation of the value. So when you output in decimal, you see the different interpretations of the same value. But the HEX function simply outputs the value, which is always the same.

Apparently, the HEX function goes by the OS, not by the datatype being fed to it. So on a 64 bit system a 32 bit negative value will be interpreted differently, having bit 64 set, rather than bit 32.

xlucas · Post by **xlucas** » Nov 22, 2017 0:35

Thanks, DamageX. I know why this happens. But if it happens, it is still a problem. Munair, maybe you're not getting my point. I am not confused.

Let me explain again in more detail:
- I understand that, while the binary representation of two numbers may be equal, this does not mean the numbers themselves are equal. This is obvious and it does make sense that FreeBasic try to realise two numbers are non-equal even when the binary representation is. This is not a bug.
- I understand that Hex is a function that does not, in general, return the hexadecimal representation of a number, but instead, returns the hexadecimal representation of the unsigned integer that has a given binary representation. For example, -26 decimal, in hexadecimal is -1A, but Hex(-26) is E6 or FFE6 or FFFFFFE6, etc., depending on the type of the parameter. If two numbers have the same hexadecimal representation (of the number), then the numbers ARE equal, but having the same Hex() is a different thing.
- I understand that the most reasonable explanation for the fact that FB is able to determine that two numbers are non-equal even though their binary representation in memory is, is that it passes these numbers to some type (likely Integer) before comparing. Of course, if the numbers are of a type that's already the size of the CPU word, there will be no room to extend the sign and it will be impossible for FB to tell the difference, so I am not surprised that this happens.
- Understanding all this, I am concerned that, sometimes, a program will behave completely differently when compiled in 32bit from how it behaves when compiled in 64bit. Besides, you get no warning when this may be happening. Perhaps it'd suffice if FB warned when comparing two integers of the same width, one signed and the other unsigned. Perhaps that would trigger warnings too often. I don't know, but it could be a problem.
- Finally, I believe Munair and I have a different opinion on the definition of a "portable type", so I'll provide my definition. A type is "portable" if it behaves the same way and has the same features in every destination platform considered. With this definition, only Integer can be considered non-portable in FreeBasic. This is just to explain myself, not to contradict Munair.
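
One way to keep such a comparison from depending on the target is to convert both operands to a single type explicitly before comparing. The following is a minimal sketch of that idea, not a recommendation from the thread beyond the "make sample ULong" fix already mentioned; the pixel value and variable names are invented for illustration.

```freebasic
Dim pixel As ULong = &HFF102030   ' illustrative opaque pixel, top bit set
Dim sample As Long = pixel        ' same bits in a signed variable, as in TargaSave

' Mixed signed/unsigned comparison: this is the form reported to give
' different results in 32bit and 64bit builds, so no result is claimed here
If sample = pixel Then Print "mixed: equal" Else Print "mixed: non-equal"

' Explicit conversion to one common 32-bit type: compares the bit patterns,
' independently of the word size of the target
If CULng(sample) = pixel Then Print "CULng: equal" Else Print "CULng: non-equal"   ' equal
```

With both operands of type ULong, no sign extension is involved, so the second comparison tests the stored 32 bits only.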

Munair · Post by **Munair** » Nov 22, 2017 7:29

xlucas wrote: I understand that, while the binary representation of two numbers may be equal, this does not mean the numbers themselves are equal.

Only when they are complements (unsigned vs signed), like 255 and -1, which in binary would be: 1111 1111.

You may not agree with what I say, but it's the foundation of computing. For a better understanding, a good read would be the article on two's complement: https://en.wikipedia.org/wiki/Two%27s_complement.

So when you port your program to a different architecture, you will have to look carefully at the datatypes. It should not be underestimated.

marcov · Post by **marcov** » Nov 22, 2017 14:34

Afaik this has only sideways to do with two's complement, and everything with type promotion rules.

If x <operator> y is encountered and x and y are of different types, to perform the operation the types have to be synchronized.

In the case of signed vs unsigned there are two schools of thought:

- cast/convert both to the base type (usually int) and perform the operation. Since the actual operation is performed in a CPU register, the precision might be higher if size(register)>size(int), as on x86_64.

_or_

- cast/convert to the integer type that is big enough to hold them all. This has a potential performance hit in loops, because it more often uses larger types that are often implemented using helper routines (in the rts/libgcc) rather than native CPU instructions. It does fix the problem of two different values sharing one bit pattern (signed -1 and unsigned $FFFF in 16-bit, for ease), since both -1 and 65535 can be represented in the next larger int (int32 in this case).

Even if the encoding is not two's complement but something else, bit patterns shared between signed and unsigned values are still possible, just different ones, and the conversion of unsigned to signed is usually more involved. But the principles are no different for non-two's-complement encodings.
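
The second approach (promote to a type big enough to hold both) can be sketched by hand. SameValue below is a hypothetical helper written for this illustration, not something FreeBASIC provides, and it says nothing about which of the two rules the compiler itself follows.

```freebasic
' Hypothetical helper: compares by numeric value by promoting both
' operands to LongInt, which can hold every Long and every ULong.
' Returns -1 (true) or 0 (false).
Function SameValue(ByVal s As Long, ByVal u As ULong) As Long
    Return (CLngInt(s) = CLngInt(u))
End Function

Dim a As Short = -1
Dim b As UShort = 65535        ' $FFFF, same 16 bits as a

Dim c As Long = -1
Dim d As ULong = &HFFFFFFFF    ' same 32 bits as c

If SameValue(a, b) Then Print "equal" Else Print "non-equal"   ' non-equal
If SameValue(c, d) Then Print "equal" Else Print "non-equal"   ' non-equal
```

This only works while a wider type exists: for LongInt against ULongInt there is no larger built-in integer to promote to, so a value comparison would need an extra sign check instead.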

Munair · Post by **Munair** » Nov 22, 2017 16:02

marcov wrote: Afaik this has only sideways to do with two's complement, and everything with type promotion rules. [...]

The problem the OP encountered has everything to do with the binary range of a specific architecture. He wondered about the fact that 32bit complements on a 32bit system return true when compared, while the same comparison returns false on a 64 bit system. Negative and positive values can only be equal (from a binary point of view) when they are complements of the same type, which is what the OP demonstrated in his example. So it is important to know that the MSB can be bit 15, 31 or 63 depending on architecture. Hence my remark that bit-comparison (not type comparison) can yield different results on different platforms.

In old QuickBASIC complements didn't exist because the compiler threw an exception when trying to: A AS INTEGER = 32768 while other compilers like C and Pascal shifted bits (-32768) without complaint. ;)

I agree, promoting the type would solve the problem. But this is up to the programmer who shouldn't expect a 32bit program to run the same on a 64 bit platform with a 64 bit compiler.

marcov · Post by **marcov** » Nov 22, 2017 17:27

Munair wrote: The problem the OP encountered has everything to do with the binary range of a specific architecture. He wondered about the fact that 32bit complements on a 32bit system return true when compared, while the same comparison returns false on a 64 bit system.

Well, obviously it is not binary then. Since binary the values are equal. As soon as the results are different, SOMEWHERE there is a conversion.

Munair wrote: Negative and positive values can only be equal (from a binary point of view) when they are complements of the same type, which is what the OP demonstrated in his example.

See above and previous post. Compilers don't compare binaries. They create a tree node for the comparison, and the semantic phase of the compiler will insert needed conversions to make the types match a pair that the compiler has a comparison for.

Munair wrote: So it is important to know that the MSB can be bit 15, 31 or 63 depending on architecture. Hence my remark that bit-comparison (not type comparison) can yield different results on different platforms.

A 64-bit system is perfectly capable of doing a strict 32-bit comparison.

Munair wrote: In old QuickBASIC complements didn't exist because the compiler threw an exception when trying to: A AS INTEGER = 32768 while other compilers like C and Pascal shifted bits (-32768) without complaint. ;)

(Borland-derived) Pascal treats literals differently: it usually keeps them in a wider representation and only scales them back to the target type when they are assigned later. The reason is that reasonably modern Pascals support both the signed and the unsigned version of the largest integer, which puts a literal in the unenviable position of needing a range of -2^(x-1) ... (2^x)-1, or 1.5 times the range of an x-bit type.

Munair wrote: I agree, promoting the type would solve the problem. But this is up to the programmer who shouldn't expect a 32bit program to run the same on a 64 bit platform with a 64 bit compiler.

A language has rules. I merely pointed out that there are two commonly accepted rules for dealing with this case. And both are fully governed by the type system, and not by 2-complements math or binary representation (*). That only comes into play in implementing those rules.

Another way of explaining it: it is a matter of which kind of extension (sign or zero) from 32-bit to 64-bit is chosen while loading the values.

(*) as the type-promoted binary representation is obviously different from the original in the case that both are promoted to a larger integer type.

Munair · Post by **Munair** » Nov 22, 2017 18:27

marcov wrote: A 64-bit system is perfectly capable of doing a strict 32-bit comparison.

I didn't say anything to the contrary. But if your program behaves differently on each platform, then there is something in the code that needs adjustment.
