INTEGER and LONG Variable Specification

Post by **RBARAN** » Mar 09, 2013 16:09

Please consider making LONG and ULONG always be 64-bit integers (as LONGINT and ULONGINT currently are) in future versions of FreeBASIC across all platforms, except in a compatibility mode. As currently implemented, LONG is ambiguous across platforms, redundant, and should not be used.

Post by **counting_pine** » Mar 09, 2013 17:21

In FreeBASIC, regardless of dialect, Long has always been 32 bits wide, so in that sense it's not ambiguous. It's also consistent with QBasic and old versions of Visual Basic.
The current plans are to keep that consistency in the 64-bit version, and change Integer to match the native word size on the chosen architecture, i.e. 64 bits wide.

That way, whether in 32 or 64 bit mode, Byte, Short, Long and Longint will each have a consistent, well-defined size. This will just have the unfortunate side effect of Long being shorter than Integer in 64-bit mode.

We are also adding the ability to choose the desired bit width using Integer<bits>, where bits can be 8, 16, 32 or 64. This will make it possible to choose an integer type of a specific size without worrying about what its name might be.
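As a rough illustration of the syntax described above (a sketch only, assuming a compiler build that already supports `Integer<bits>`; the variable names are made up for the example):

```freebasic
' Sketch: fixed-width integers via Integer<bits>.
' Assumes a compiler version that implements the Integer<bits> syntax.
Dim a As Integer<8>    ' 1 byte, same size as Byte
Dim b As Integer<16>   ' 2 bytes, same size as Short
Dim c As Integer<32>   ' 4 bytes, same size as Long
Dim d As Integer<64>   ' 8 bytes, same size as LongInt

Print SizeOf(a), SizeOf(b), SizeOf(c), SizeOf(d)
Print SizeOf(Integer)  ' 4 or 8, depending on the target
```

Only the plain `Integer` line is expected to differ between a 32-bit and a 64-bit build.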

Post by **pestery** » Mar 11, 2013 4:24

That's an interesting idea, switching Long and Integer. Although it might cause problems because in the documentation Integer is listed as always 32 bits, so if someone has done something like this somewhere in their code then they're in for a lot of trouble.

Code:

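Dim As Any Ptr buffer = Allocate(10 * SizeOf(Integer)) ' Some buffer that for some reason doesn't have a type
Dim As Integer Ptr i
For j As Integer = 0 To 9
   i = buffer + (j * 4) ' This will break if Integer is not 4 bytes wide
   i = buffer + (j * SizeOf(Integer)) ' This will be ok: follows the real size of Integer
   *i = j
Next
Deallocate(buffer)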

counting_pine wrote:We are also adding the ability to choose the desired bit width using Integer<bits>, where bits can be 8, 16, 32 or 64....

Awesome, that will be really useful for things like networking where you're sending data to another computer that may have a different configuration.

counting_pine wrote:...change Integer to match the native word size on the chosen architecture...

I'm guessing this is in preparation for making FB able to produce 64-bit programs? I've been hoping for this for ages, and recently I have actually been learning more about C and C++ and trying to find a compiler to make 64-bit programs on Windows, an IDE I like, and everything else associated. It would be a lot easier to just stick with FB, considering I only program because I enjoy it and not for a career. Any clues on when FB might be 64-bit ready? Also, more importantly, any clues as to when the changes to Integer and Long are expected?

Edit: I just found the answers to some of my questions in this topic :P

Post by **TJF** » Mar 11, 2013 5:50

pestery wrote:That's an interesting idea, switching Long and Integer. Although it might cause problems because in the documentation Integer is listed as always 32 bits, so if someone has done something like this somewhere in their code then they're in for a lot of trouble.

I agree!

And think about the use of the MKI / CVI function family to convert numbers into STRINGs. A lot of code will be affected when a basic TYPE gets changed.
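To make the concern concrete, here is a small sketch (not taken from any real project, and assuming `Mki` produces a string as long as `Integer` is wide on the target, while `Mkl` always writes a 4-byte `Long`):

```freebasic
' Sketch: Mki/Cvi follow the size of Integer, Mkl/Cvl stay at 32 bits.
Dim n As Long = 123456

Dim si As String = Mki(n)   ' length follows SizeOf(Integer): 4 or 8 bytes
Dim sl As String = Mkl(n)   ' always 4 bytes

Print Len(si), Len(sl)
Print Cvl(sl)               ' reads the 4-byte string back as a Long
```

Code that writes records with `Mki` and reads them back on a build with a different `Integer` size would see different string lengths; `Mkl`/`Cvl` keep the on-disk format fixed.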

IMHO it's better to introduce new TYPEs for 64-bit coding.

counting_pine wrote:We are also adding the ability to choose the desired bit width using Integer<bits>, where bits can be 8, 16, 32 or 64. This will make it possible to choose an integer type of a specific size without worrying about what its name might be.

This is how it's done in GLib (gint8, guint64, ...) and I'm pretty happy with this solution.

Post by **pestery** » Mar 11, 2013 8:27

One quick extra point. Either Long or Integer will have to be 64bit on a 64bit build (a native size datatype) and the current documentation suggests that it should be Long, but because FB has been 32bit only it probably doesn't matter which is picked. Previously Integer = Long = Pointer was a valid assumption so old code may be broken whichever one is picked, but only if compiled as 64bit.

Personally I do kind of like the idea of using Integer as the native datatype. We don't need extra datatypes for 64bit (extra unneeded complexity), we just need to pick the one to use as the native size datatype. And remember that code intended for 64bit cannot assume that Integer = Long = Pointer.

Code:

For i As some_datatype_that_i_dont_care_might_as_well_be_native_Integer = 1 to 10
   Do Stuff
Next

Post by **marcov** » Mar 11, 2013 15:26

I would let LONG match C's long, INT match C's int, and define a separate type that is guaranteed to scale with pointer size (like intptr_t in C99). Anything else means that you may have to explain, every time again, why FB deviates from C to any new header translator.

Keep in mind that on 64-bit *nix LONG=64-bit and on Windows (32 AND 64-bit) LONG = 32-bit.
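For header translation, the difference could be captured with a conditional alias. This is only a sketch: `c_long_t` is a hypothetical name, and it assumes a compiler that defines `__FB_64BIT__` on 64-bit targets and `__FB_WIN32__` on Windows targets.

```freebasic
' Sketch: a hypothetical alias that tracks the size of C's long.
' c_long_t is an illustrative name, not an existing FreeBASIC type.
#If Defined(__FB_64BIT__) And (Not Defined(__FB_WIN32__))
   Type c_long_t As LongInt   ' 64-bit Unix-like targets: C long is 64 bits
#Else
   Type c_long_t As Long      ' 32-bit targets and 64-bit Windows: C long is 32 bits
#EndIf

Print SizeOf(c_long_t)
```

A translated header would then declare C `long` parameters as `c_long_t` instead of picking `Long` or `Integer` directly.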

Post by **counting_pine** » Mar 12, 2013 0:48

I think the changes now mean that Integer will always be the same size as a Ptr. But we can now be sure with Integer<sizeof(any ptr)*8>.
Or typeof(cast(any ptr,0)-cast(any ptr, 0)) should work as well.
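A simpler way to check the assumption on a given build is to compare sizes directly. The following is a minimal sketch; it does not rely on the `Integer<...>` or `typeof` forms above, and the variable names are made up.

```freebasic
' Sketch: check whether Integer, Long and pointers share a size on this target.
Print "Integer: "; SizeOf(Integer); " bytes"
Print "Long:    "; SizeOf(Long); " bytes"
Print "Any Ptr: "; SizeOf(Any Ptr); " bytes"

Dim x As Long
Dim p As Any Ptr = @x

' Storing a pointer in an integer is only safe when the sizes match.
If SizeOf(UInteger) = SizeOf(Any Ptr) Then
   Dim addr As UInteger = Cast(UInteger, p)
   If Cast(Any Ptr, addr) = p Then Print "pointer round trip ok"
End If
```

On a build where `Integer` follows the pointer size, the round trip succeeds; code that used `Long` for the same purpose would truncate addresses on a 64-bit target.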

Post by **marcov** » Mar 12, 2013 9:58

counting_pine wrote:I think the changes now mean that Integer will always be the same size as a Ptr. But we can now be sure with Integer<sizeof(any ptr)*8>.
Or typeof(cast(any ptr,0)-cast(any ptr, 0)) should work as well.

True. But it also has downsides:
- already said, header conversion issues where people have to factor in that Integer usually isn't equal to int.
- With a larger average integer size on 64-bit, you blow up data structures and stack space (for locals/temps) a lot, cause cache pollution, etc.
- On some older 64-bit architectures it might also be slower (because loads/stores of 64-bit values might take longer). I don't know how this is on Intel.
- Literal sizes might also expand, unless you implement a proper peephole optimization to load values <32-bit with the relevant sign-extending instructions. On non-x86 you will have to, since most non-x86(_64) targets can't load a full 64-bit literal in one instruction. This kind of stuff makes the optimization a necessity rather than a blessing (since loading a 64-bit literal in parts really blows up the code).
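The data structure point can be seen with a small sketch (the type names are invented for the example, and the 64-bit figure assumes `Integer` becomes 8 bytes there as planned):

```freebasic
' Sketch: the same record declared with Integer fields and with Long fields.
Type RecI
   x As Integer
   y As Integer
   z As Integer
   w As Integer
End Type

Type RecL
   x As Long
   y As Long
   z As Long
   w As Long
End Type

Print SizeOf(RecI)   ' 4 * SizeOf(Integer): doubles if Integer becomes 64-bit
Print SizeOf(RecL)   ' 16 bytes on every target
```

An array of a million `RecI` elements would grow from roughly 16 MB to 32 MB under that assumption, while the `RecL` version stays the same.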

And of course you will be different from everybody else, and encounter issues I can't quickly think of now.

Of course you can avoid a lot of the above by writing new code with an always-32-bit type and not caring that old cruft gets slower.

Post by **RBARAN** » Mar 27, 2013 0:40

"LONG

--------------------------------------------------------------------------------
Standard data type: signed integer having same size as Sizeof(Any Ptr)

Syntax:

dim variable as Long

Description:

Depending on the platform, same as Integer or Longint. A 32-bit or 64-bit signed whole-number data type."

This is directly out of the FBWiki, so no, LONG is not always 32 bit.

Post by **RBARAN** » Mar 27, 2013 0:52

Continued...

Or the FBWiki is wrong (regarding LONG). I don't have a problem with INTEGER not being the same across all platforms. It is logical for it to be 32 bit in a 32 bit FreeBasic version, just as it is 16 bits in QB compatibility mode, but LONG implies double the native integer size, at least it does to me.

Post by **RBARAN** » Mar 27, 2013 1:08

Is there any interest in implementing 128-bit or larger integer arithmetic in FreeBASIC?

Post by **counting_pine** » Mar 27, 2013 1:26

The original plan was to have Long and Ulong be the same size as a pointer.
Those pages should really be updated, as long as we're committed to this change.

Post by **dkl** » Mar 28, 2013 2:05

I've updated TblVarTypes and the Integer/Long keyword pages now. Cint() & co to go...

I still think going with Integer as the only variable-size type will be the best, because Integer is the default type used for math operations, bitwise operations, pointer/string indexing, counters, string lengths, and so on. On a 64bit platform I think these should all work at 64bit precision. Besides that, Integer already is the variable-size type -- it changes to 16bit in -lang qb already. It's unfortunate that the naming will be different than with GCC, but in exchange FB will be consistent across 64bit operating systems.

Post by **marcov** » Mar 28, 2013 10:24

dkl wrote:I've updated TblVarTypes and the Integer/Long keyword pages now. Cint() & co to go...

I still think going with Integer as the only variable-size type will be the best, because Integer is the default type used for math operations, bitwise operations, pointer/string indexing, counters, string lengths, and so on.

(I updated my answer above, stackframe size will of course also inflate)

I don't see why those operations SHOULD become all 64-bit so strongly, and none of the reasons you name seem to be worth inflating your working set with double digit percentages. At least in the long term view.

The only non-legacy benefit of this choice is the corner case where people work purely in 32-bit (without mixing signs) but address more than 32 bits' worth of memory. But that is rare, very rare, and very bad programming.

Yes, in theory it allows some old QB program to be recompiled with FB for 64-bit and scale its central array mindlessly to >32-bit dimensions without changes, but the chance of that is not that high, since most OS allocators might not even give you such large blocks.

Anyway, what I fundamentally doubt is the logic that integer is the root of everything and thus should be 64-bit.
Only the final effective address(EA) (and increments) needs to be done in 64-bit, and the rest of the EA only needs to be upscaled to 64-bit when needed. (if one of the inputs of the calculation is 64-bit or if there is a mix of signed and unsigned 32-bit). In fact normal type promotion rules apply. This is not very different from indexing an array on 32-bit systems with 16-bit values, is that allowed now?

The only corner case of this is the all signed 32-bit case yields an

All architectures have special instructions for loading smaller values sign and number, so you can easily upgrade existing codegeneration by inserting an extending instruction to load a 32-bit type.

One possible reason I can think of is a fundamental problem in FB's type system that precludes effective type promotion?

Or even more probable, an attempt to recycle a too rigid codegenerator stage (which e.g. doesn't easily allow to allocate an extra register in case an instruction is now done directly, and needs an extra reg for the load then) for something it was never designed for, and the current devels rather work on the c backend, but politically can't kill the asm backend, and/or can't confine it to 32-bit only?

I'm all for the native codegenerator, but if it triggers decisions like this, just ditch it. Maybe the base support for it in the development community is simply not there.

Anyway, the decision is yours of course, but IMHO it is a bad one, and one you might have to live with for a long, long time, even when the current cg is long forgotten.

Post by **MichaelW** » Mar 28, 2013 22:43

I can’t see how limiting the size of the working set is worth complicating anything.
