Signed/unsigned equality. Bug?

xlucas
Posts: 256
Joined: May 09, 2014 21:19
Location: Argentina

Signed/unsigned equality. Bug?

Postby xlucas » Nov 21, 2017 5:57

I have just found something that maybe is not currently considered a bug, but it cost me many hours trying to locate an error I had committed, which wouldn't have happened otherwise, so just in case, here it goes...

It looks like, if you have a negative Long and its binary ULong twin and you compare them, this is seen as "equal" when compiling in 32bit, but as non-equal if compiling in 64bit. Let me say it with code:

Code: Select all

Dim a As Short, b As UShort

a = -1234   'Change this to any negative number
b = a
Print b
Print Hex(a); " = "; Hex(b)
If a = b Then Print "equal" Else Print "non-equal"
GetKey


When I compile and run this in Xubuntu 32 bit, it returns "equal". In Xubuntu 64 bit, it returns non-equal. Please verify and if anybody can comment what happens on other platforms, I'm curious to know.
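
The description above talks about Long and ULong, while the posted snippet declares Short and UShort. A variant using the 32-bit types might look like the sketch below; it only swaps the declared types and assumes nothing else about the original program.

```freebasic
' Variant of the snippet above with the 32-bit types named in the description.
Dim a As Long, b As ULong

a = -1234   ' any negative number
b = a       ' same bit pattern, now read as unsigned
Print b
Print Hex(a); " = "; Hex(b)
If a = b Then Print "equal" Else Print "non-equal"
GetKey
```

Compiling this once for a 32-bit target and once for a 64-bit target is the direct way to check whether the two builds disagree.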

The problem with this is that you may be working on a project on one system and not realise that you'll get a different result in another. When you see the difference, it's very difficult to find the bug. In my case, the program would just hang because a certain variable would never reach a certain value. Lucky that I just switched to 64bit or I would never have seen this. Is this a known thing?
Munair
Posts: 358
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL

Re: Signed/unsigned equality. Bug?

Postby Munair » Nov 21, 2017 6:16

xlucas wrote:a negative Long and its binary ULong twin and you compare them

From a decimal perspective -1234 <> 4294966062, which is what the compiler wants to tell us. Changing it to HEX changes the interpretation.

I would never compare signed and unsigned. Even though in this case it is technically the same bit pattern, the interpretation is different because one has the sign bit set. So in HEX you see the same value, but not in decimal. The code is flawed and should never be used, unless you specifically want to convert signed to unsigned, i.e. cast to positive only, which makes further comparison between the two pointless from a decimal point of view.
Last edited by Munair on Nov 21, 2017 7:01, edited 3 times in total.
xlucas
Posts: 256
Joined: May 09, 2014 21:19
Location: Argentina

Re: Signed/unsigned equality. Bug?

Postby xlucas » Nov 21, 2017 6:29

I don't think it is a bug that the numbers give non-equal. I think it could be a bug that it gives a different result if compiled in 32bit.

I'm trying to replicate it now and realised the code as I posted it is giving non-equal on both, so I'll have to check back the original code (much longer) to see exactly why it was different. Anyway... here's the code that actually caused the problem:

Code: Select all

Sub TargaSave(filename As String, image As Any Ptr)
   Dim As Integer iwidth, iheight, bypp, linelength
   Dim As ULong Ptr imagestart  '<---- Notice that imagestart points to unsigned
      
   'Reset error information
   TargaError = 0 : TargaErrorMessage = ""
   
   'Make sure it's a valid image
   If image = 0 Then
      TargaError = 101
      TargaErrorMessage = "No image in buffer"
      Exit Sub
   End If
   ImageInfo image, iwidth, iheight, bypp, linelength, imagestart

   If bypp <> 4 Then
      TargaError = 102
      TargaErrorMessage = "Not a 32bit image. Unsupported"
      Exit Sub
   End If
   
   'See if the image contains any alpha information
   Dim alphachannel As Byte = 0
   For i As Long = 0 To iwidth * iheight - 1
      If imagestart[i] ShR 24 <> 255 Then
         alphachannel = -1
         Exit For
      End If
   Next i
   
   'Set up image header
   Dim h As TargaHeader, f As Short
   
   If alphachannel Then
      h.PixelDepth = 32
      h.ImageDescriptor = 8
   Else
      h.PixelDepth = 24
      h.ImageDescriptor = 0
   End If
   
   h.ImageType = 10
   h.ImageWidth = iwidth
   h.ImageHeight = iheight

   'Open file
   f = FreeFile
   If Open(filename For Output As f) Then
      TargaError = 103
      TargaErrorMessage = "Failed to create image file"
      Exit Sub
   Else
      Close f
      Open filename For Binary Access Write As f
   End If
   
   'Put header
   Put #f, 1, h
   
   'Compress image row by row
   Dim rp As Long, column As Long, buffer As String
   Dim count As Short, status As Byte, sample As Long   '<----------- Notice that sample is signed
   
   For i As Long = 0 To iheight - 1   'For every row...
      'Calculate where to read the row from
      rp = (iheight - i - 1) * linelength \ 4
      
      buffer = ""
      column = 0
      status = 0   'Still don't know if RLE or not
      count = 0   'Nothing pending
      Do
         Select Case status
            Case 0 'Undefined
               sample = imagestart[rp + column]
               
               'If it's the last pixel, just push it
               If column = iwidth - 1 Then
                  If alphachannel Then
                     buffer &= Chr(0) + MkL(sample)
                  Else
                     buffer &= Chr(0) + Left(MkL(sample), 3)
                  End If
                  Exit Do
               End If
               
               count = 0
               If sample = imagestart[rp + column + 1] Then   '<---- This comparison gives non-equal in 64 bit and equal in 32 bit (when two pixels in a row are the same colour and alpha channel is a high value, say, &HFF)
                  status = 1   'Building an RLE block
               Else
                  status = 2   'Building a non-RLE block
               End If
            Case 1   'RLE
               If imagestart[rp + column] = sample Then
                  If count = 128 Then   'Block full. Push into the buffer
                     If alphachannel Then
                        buffer &= Chr(255) + MkL(sample)
                     Else
                        buffer &= Chr(255) + Left(MkL(sample), 3)
                     End If
                     count = 0
                     status = 0
                  Else
                     count += 1
                     column += 1
                  End If
               Else
                  'Found an end for the RLE block
                  buffer &= Chr(127 + count)
                  If alphachannel Then
                     buffer &= MkL(sample)
                  Else
                     buffer &= Left(MkL(sample), 3)
                  End If
                  count = 0
                  status = 0
               End If
            Case Else   'Non-RLE
               If imagestart[rp + column] = imagestart[rp + column + 1]  Then
                  'End of non-RLE block
                  buffer &= Chr(count - 1)
                  For j As Short = column - count To column - 1
                     If alphachannel Then
                        buffer &= MkL(imagestart[rp + j])
                     Else
                        buffer &= Left(MkL(imagestart[rp + j]), 3)
                     End If
                  Next j
                  count = 0
                  status = 0
               Else
                  If count = 128 Then   'Block full. Push into the buffer
                     buffer &= Chr(127)
                     For j As Short = column - 128 To column - 1
                        If alphachannel Then
                           buffer &= MkL(imagestart[rp + j])
                        Else
                           buffer &= Left(MkL(imagestart[rp + j]), 3)
                        End If
                     Next j
                     count = 0
                     status = 0
                  Else
                     count += 1
                     column += 1
                  End If
               End If
         End Select
      Loop Until column = iwidth
      
      If column = iwidth Then
         If status = 1 Then
            buffer &= Chr(127 + count)
            If alphachannel Then
               buffer &= MkL(sample)
            Else
               buffer &= Left(MkL(sample), 3)
            End If
         Else
            buffer &= Chr(count - 1)
            For j As Short = column - count To column - 1
               If alphachannel Then
                  buffer &= MkL(imagestart[rp + j])
               Else
                  buffer &= Left(MkL(imagestart[rp + j]), 3)
               End If
            Next j
         End If
      End If
      
      Put #f, , buffer
   Next i
   
   'Targa v2.0 with no extensions
   buffer = String(8, 0) + "TRUEVISION-XFILE." + Chr(0)
   Put #f, , buffer
   
   Close f
End Sub


This code, when compiled in 32bit, will produce a Targa image. When compiled in 64bit, it will hang because the variable column never increases. The problem is that I wrote this code on a 32bit system, so I didn't detect my error until I tried it in 64bit.
Munair
Posts: 358
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL

Re: Signed/unsigned equality. Bug?

Postby Munair » Nov 21, 2017 6:39

A quick look at your code tells me that you point to elements of an image using signed subscript types:

Code: Select all

imagestart[i] ' i as long should be ulong

Pointers are always unsigned, so you have to make sure addressing data is also unsigned. Changing the subscript types to ulong may solve the problem (they are supposed to be positive anyway).

Furthermore, you compare signed to unsigned: sample As Long should be sample As ULong.
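
A minimal sketch of that second suggestion, using a made-up two-pixel array (pixels is hypothetical and only stands in for the real image buffer). With both operands declared ULong there is no signed/unsigned mix left in the comparison:

```freebasic
' Hypothetical stand-in for the image data: two identical opaque pixels.
Dim pixels(0 To 1) As ULong = {&HFF336699, &HFF336699}
Dim imagestart As ULong Ptr = @pixels(0)

Dim sample As ULong          ' unsigned, same type as the pixel data
sample = imagestart[0]

' Both sides are ULong, so no signed/unsigned conversion is involved.
If sample = imagestart[1] Then
    Print "same colour"
Else
    Print "different colour"
End If
```

The high alpha byte (&HFF) is what made the signed version of sample negative; keeping it unsigned avoids that case entirely.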
Munair
Posts: 358
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL

Re: Signed/unsigned equality. Bug?

Postby Munair » Nov 21, 2017 6:58

BTW, the reason LONG and ULONG HEX in your example give equal on 32bit systems and unequal on 64 bit systems, is because it is a 32bit data type. It is not portable.
xlucas
Posts: 256
Joined: May 09, 2014 21:19
Location: Argentina

Re: Signed/unsigned equality. Bug?

Postby xlucas » Nov 21, 2017 7:16

Munair, yes, indices are always unsigned and I used a signed index variable, but it's still always positive, so it shouldn't make a difference.

Munair wrote:[...]is because it is a 32bit data type. It is not portable.


All types are portable (except possibly Integer if you use it as a 64bit integer number). What sense would it make if they weren't? And when you say "your problem", you mean this particular piece of code. Yes, of course, I was able to fix it easily by making sample ULong, but that's not the point. The point is that, if the 32bit compiler sometimes gives equal where the 64bit compiler gives non-equal, then you won't see that you're creating a bug while testing in 32bit, and the bug will only become apparent in 64bit.

I am not complaining that my code didn't work. I'm saying I've noticed something that may be inconsistent between the 32bit and 64bit compilers.
Munair
Posts: 358
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL

Re: Signed/unsigned equality. Bug?

Postby Munair » Nov 21, 2017 7:44

xlucas wrote:
Munair wrote:[...]is because it is a 32bit data type. It is not portable.
All types are portable (except possibly Integer if you use it as a 64bit integer number).

Not if you convert between signed and unsigned. Look at the following output of your example on a 64bit system:

Code: Select all

SHORT/USHORT (16bit):
64302
FB2E = FB2E
non-equal

LONG/ULONG (32bit):
4294966062
FFFFFB2E = FFFFFB2E
non-equal

LONGINT/ULONGINT (64 bit):
18446744073709550382
FFFFFFFFFFFFFB2E = FFFFFFFFFFFFFB2E
equal

You get non-equal where on 16 or 32bit systems you get equal because they are 16 or 32bit data types. Note that this is the interpretation of the HEX function. When a value is negative, the MSB is set. But on 32 bit systems (bit 32) this is not the same bit as on 64 bit systems (bit 64). That's why conversion from signed to unsigned isn't portable.
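
The program that printed this output is not shown in the post. One way it could have been produced, as a sketch that simply repeats the original test for the three type pairs (the labels are copied from the output above):

```freebasic
' Each block repeats the original test with a different signed/unsigned pair.
Scope
    Dim a As Short = -1234
    Dim b As UShort = a
    Print "SHORT/USHORT (16bit):"
    Print b
    Print Hex(a); " = "; Hex(b)
    If a = b Then Print "equal" Else Print "non-equal"
    Print
End Scope

Scope
    Dim a As Long = -1234
    Dim b As ULong = a
    Print "LONG/ULONG (32bit):"
    Print b
    Print Hex(a); " = "; Hex(b)
    If a = b Then Print "equal" Else Print "non-equal"
    Print
End Scope

Scope
    Dim a As LongInt = -1234
    Dim b As ULongInt = a
    Print "LONGINT/ULONGINT (64 bit):"
    Print b
    Print Hex(a); " = "; Hex(b)
    If a = b Then Print "equal" Else Print "non-equal"
    Print
End Scope

GetKey
```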
DamageX
Posts: 106
Joined: Nov 21, 2009 8:42

Re: Signed/unsigned equality. Bug?

Postby DamageX » Nov 21, 2017 8:41

Code: Select all

Dim a As Short, b As UShort

a = -1234   'Change this to any negative number
b = a
Print b
Print Hex(a); " = "; Hex(b)
If a = b Then Print "equal" Else Print "non-equal"
GetKey

xlucas wrote:I'm trying to replicate it now and realised the code as I posted it is giving non-equal on both,

In your example code you have short and ushort. I would expect those to come out unequal for both 32 and 64bit. The variables, having different types, will get extended to integer before the comparison. Ushort will be zero extended and short will be sign extended so the binary data changes. At least, that is the explanation that I read somewhere on the forum. I could be wrong or the compiler could have changed since then.

Comparison between ulong and long on a 64bit platform would be similar. They would be zero extended and sign extended, respectively, to 64 bits before the comparison, producing unequal result.

Comparing ulong and long with 32 bits is a different story. Because they are already 32 bits, they won't be extended. They have to be compared as they are, and the bits are the same so the result is equal.
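
The widening described here can be made visible by converting both variables to LongInt explicitly and printing the result in hexadecimal. This is only an illustration of sign extension versus zero extension, not a statement about what the compiler emits internally:

```freebasic
Dim a As Long = -1234
Dim b As ULong = a

' The same 32 bits are stored in both variables.
Print Hex(a, 8); " / "; Hex(b, 8)   ' FFFFFB2E / FFFFFB2E

' Widened to 64 bits: the signed value is sign extended,
' the unsigned value is zero extended.
Print Hex(CLngInt(a), 16)           ' FFFFFFFFFFFFFB2E
Print Hex(CLngInt(b), 16)           ' 00000000FFFFFB2E

' After explicit widening the two values differ on any target.
If CLngInt(a) = CLngInt(b) Then Print "equal" Else Print "non-equal"
```

Because the conversion to LongInt is written out, this version does not depend on whether the program is built for 32 or 64 bits.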
Munair
Posts: 358
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL

Re: Signed/unsigned equality. Bug?

Postby Munair » Nov 21, 2017 9:01

People often confuse values and interpretations of the values. Again, when assigning a = -1234 on a 32 bit platform, bit 32 is set to indicate it is negative. Converting it to unsigned on the same platform doesn't change the value, only the interpretation of the value. So when you output in decimal, you see the different interpretations of the same value. But the HEX function simply outputs the value which is always the same.

Apparently, the HEX function goes by the OS, not by the datatype being fed to it. So on a 64 bit system a 32 bit negative value will be interpreted differently, having bit 64 set, rather than bit 32.
xlucas
Posts: 256
Joined: May 09, 2014 21:19
Location: Argentina

Re: Signed/unsigned equality. Bug?

Postby xlucas » Nov 22, 2017 0:35

Thanks, DamageX. I know why this happens. But if it happens, it is still a problem. Munair, maybe you're not getting my point. I am not confused.

Let me explain again in more detail:
- I understand that, while the binary representation of two numbers may be equal, this does not mean the numbers themselves are equal. This is obvious, and it makes sense for FreeBASIC to recognise two numbers as non-equal even when their binary representation is the same. This is not a bug.
- I understand that Hex is a function that does not, in general, return the hexadecimal representation of a number, but instead, returns the hexadecimal representation of the unsigned integer that has a given binary representation. For example, -26 decimal, in hexadecimal is -1A, but Hex(-26) is E6 or FFE6 or FFFFFFE6, etc., depending on the type of the parameter. If two numbers have the same hexadecimal representation (of the number), then the numbers ARE equal, but having the same Hex() is a different thing.
- I understand that the most reasonable explanation for the fact that FB is able to determine that two numbers are non-equal even though their binary representation in memory is, is that it passes these numbers to some type (likely Integer) before comparing. Of course, if the numbers are of a type that's already the size of the CPU word, there will be no room to extend the sign and it will be impossible for FB to tell the difference, so I am not surprised that this happens.
- Understanding all this, I am concerned that, sometimes, a program will result in a completely different effect if compiled in 32bit from what you get when it's compiled in 64bit. Besides, you get no warning when this may be happening. Perhaps, it'd suffice if FB warned when comparing two integers of the same width, but one being signed and the other unsigned. Perhaps that'd be warnings too often. I don't know, but it could be a problem.
- Finally, I believe Munair and I have a different opinion on the definition of a "portable type", so I'll provide my definition. A type is "portable" if it behaves the same way and has the same features in every destination platform considered. With this definition, only Integer can be considered non-portable in FreeBasic. This is just to explain myself, not to contradict Munair.
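
The last point can be checked on any target with a short sketch that prints the width of each integer type. It relies on SizeOf and on the __FB_64BIT__ intrinsic define, which is defined only when compiling for a 64-bit target:

```freebasic
' Widths of the fixed-size integer types and of Integer on this target.
Print "Short   :"; SizeOf(Short) * 8; " bits"
Print "Long    :"; SizeOf(Long) * 8; " bits"
Print "LongInt :"; SizeOf(LongInt) * 8; " bits"
Print "Integer :"; SizeOf(Integer) * 8; " bits"

#Ifdef __FB_64BIT__
    Print "compiled for a 64-bit target"
#Else
    Print "compiled for a 32-bit target"
#EndIf
```

Only the Integer line is expected to change between the two builds; the other three types have a fixed width.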
Munair
Posts: 358
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL

Re: Signed/unsigned equality. Bug?

Postby Munair » Nov 22, 2017 7:29

xlucas wrote:I understand that, while the binary representation of two numbers may be equal, this does not mean the numbers themselves are equal.
Only when they are complements (unsigned vs signed), like 255 and -1, which in binary would be: 1111 1111.

You may not agree with what I say, but it's the foundation of computing. For a better understanding, a good read would be about two's complement: https://en.wikipedia.org/wiki/Two%27s_complement.

So when you port your program to a different architecture, you will have to look carefully at the datatypes. It should not be underestimated.
marcov
Posts: 2404
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL

Re: Signed/unsigned equality. Bug?

Postby marcov » Nov 22, 2017 14:34

Afaik this has only tangentially to do with two's complement, and everything to do with type promotion rules.

If x <operator> y is encountered and x and y are of different types, the types have to be reconciled before the operation can be performed.

In the case of signed vs unsigned there are two schools of thought:

- cast/convert both to the base type (usually int) and perform the operation. Since the actual operation is performed using CPU registers, the precision might be higher if size(register)>size(int), like on x86_64.

_or_

- cast/convert to the integer type that is big enough to hold them all. This has a potential performance hit in loops because it more often uses larger types, which are often implemented using helper routines (in the rts/libgcc) rather than native CPU instructions. It does fix the problem of two different values (signed -1 and unsigned $FFFF in 16-bit for ease) sharing one bit pattern, since both -1 and 65535 can be represented in the next larger int (int32 in this case).

Even if the encoding is not two's complement but something else, values that collide between signed and unsigned are possible, just different ones, and the conversion of unsigned to signed is usually more involved. But the principles are no different for encodings other than two's complement.
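
The two rules can be imitated with explicit conversions on the 16-bit pair used as the example above. This is a sketch that forces each rule by hand (as if the base type were 16 bits wide); it does not show which rule a given compiler actually applies:

```freebasic
Dim s As Short = -1
Dim u As UShort = &HFFFF

' Rule 1, imitated: bring both operands to one 16-bit base type.
' CShort(u) reinterprets the pattern &HFFFF as -1, so the values collide.
If s = CShort(u) Then Print "base type: equal" Else Print "base type: non-equal"

' Rule 2, imitated: widen both to a type that can hold -1 and 65535.
Print CLng(s); " vs"; CLng(u)
If CLng(s) = CLng(u) Then Print "widened: equal" Else Print "widened: non-equal"
```

With the first rule the comparison reports equal, with the second it reports non-equal, which is the same split the thread describes between the 32-bit and 64-bit builds.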
Munair
Posts: 358
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL

Re: Signed/unsigned equality. Bug?

Postby Munair » Nov 22, 2017 16:02

marcov wrote:Afaik this has only tangentially to do with two's complement, and everything to do with type promotion rules.

If x <operator> y is encountered and x and y are of different types, the types have to be reconciled before the operation can be performed.

In the case of signed vs unsigned there are two schools of thought:

- cast/convert both to the base type (usually int) and perform the operation. Since the actual operation is performed using CPU registers, the precision might be higher if size(register)>size(int), like on x86_64.

_or_

- cast/convert to the integer type that is big enough to hold them all. This has a potential performance hit in loops because it more often uses larger types, which are often implemented using helper routines (in the rts/libgcc) rather than native CPU instructions. It does fix the problem of two different values (signed -1 and unsigned $FFFF in 16-bit for ease) sharing one bit pattern, since both -1 and 65535 can be represented in the next larger int (int32 in this case).

Even if the encoding is not two's complement but something else, values that collide between signed and unsigned are possible, just different ones, and the conversion of unsigned to signed is usually more involved. But the principles are no different for encodings other than two's complement.

The problem the OP encountered has everything to do with the binary range of a specific architecture. He wondered about the fact that 32bit complements on a 32bit system return true when compared, while the same comparison returns false on a 64 bit system. Negative and positive values can only be equal (from a binary point of view) when they are complements of the same type, which is what the OP demonstrated in his example. So it is important to know that the MSB can be bit 15, 31 or 63 depending on architecture. Hence my remark that bit-comparison (not type comparison) can yield different results on different platforms.

In old QuickBASIC complements didn't exist because the compiler threw an exception when trying to: A AS INTEGER = 32768 while other compilers like C and Pascal shifted bits (-32768) without complaint. ;)

I agree, promoting the type would solve the problem. But this is up to the programmer who shouldn't expect a 32bit program to run the same on a 64 bit platform with a 64 bit compiler.
marcov
Posts: 2404
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL

Re: Signed/unsigned equality. Bug?

Postby marcov » Nov 22, 2017 17:27

Munair wrote: The problem the OP encountered has everything to do with the binary range of a specific architecture. He wondered about the fact that 32bit complements on a 32bit system return true when compared, while the same comparison returns false on a 64 bit system.


Well, obviously it is not binary then, since in binary the values are equal. As soon as the results are different, SOMEWHERE there is a conversion.

Munair wrote:Negative and positive values can only be equal (from a binary point of view) when they are complements of the same type, which is what the OP demonstrated in his example.


See above and previous post. Compilers don't compare binaries. They create a tree node for the comparison, and the semantic phase of the compiler will insert needed conversions to make the types match a pair that the compiler has a comparison for.

Munair wrote:So it is important to know that the MSB can be bit 15, 31 or 63 depending on architecture. Hence my remark that bit-comparison (not type comparison) can yield different results on different platforms.


A 64-bit system is perfectly capable of doing a strict 32-bit comparison.

Munair wrote:In old QuickBASIC complements didn't exist because the compiler threw an exception when trying to: A AS INTEGER = 32768 while other compilers like C and Pascal shifted bits (-32768) without complaint. ;)


(Borland-derived) Pascal treats literals differently (usually with a higher bit width) and scales them back to the type they are assigned to later. The reason is that somewhat modern Pascals support both the signed and the unsigned version of the largest integer, putting the literal in the unenviable position of having a range of -2^(x-1) ... (2^x)-1, or 1.5 times the range of an x-bit type.

Munair wrote:I agree, promoting the type would solve the problem. But this is up to the programmer who shouldn't expect a 32bit program to run the same on a 64 bit platform with a 64 bit compiler.


A language has rules. I merely pointed out that there are two commonly accepted rules for dealing with this case. Both are fully governed by the type system, and not by two's complement math or binary representation (*). That only comes into play in implementing those rules.

Another way of explaining it: it is a matter of which extension (sign or zero) from 32-bit to 64-bit is chosen while loading the values.

(*) as the type-promoted binary representation is obviously different from the original in the case that both are promoted to a larger integer type.
Munair
Posts: 358
Joined: Oct 19, 2017 15:00
Location: 't Zand, NL

Re: Signed/unsigned equality. Bug?

Postby Munair » Nov 22, 2017 18:27

marcov wrote:A 64-bit system is perfectly capable of doing a strict 32-bit comparison.
I didn't say anything to the contrary. But if your program behaves differently on each platform, then there is something in the code that needs adjustment.
