Pure FB Runtime Library (in progress)


Post by SARG » Dec 18, 2017 17:32

I don't know whether you need to change anything. I guess you could keep them as they are.

_ZN10fb_Object$C1Ev( FB_OBJECT* ) could be parsed like this:
_Z : marks a mangled name
N : begin of a nested (qualified) name
10 : number of characters in the name (here fb_Object$)
C1 : complete object constructor
E : end of the nested name
v : datatype of the parameter (here void)
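
As a rough illustration of the breakdown above, here is a minimal Python sketch that splits a simple nested name into its components. The helper name split_nested and its return layout are my own assumptions; it only understands plain length-prefixed names and the C1/C2/C3/D0/D1/D2 codes (no templates, substitutions or operators), and it expects a well-formed input.

```python
import re

CTOR_DTOR = {"C1", "C2", "C3", "D0", "D1", "D2"}

def split_nested(mangled):
    """Split '_ZN<len><id>...E<params>' into (components, parameter codes).

    Hypothetical helper: handles only source names and ctor/dtor codes.
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested mangled name")
    pos = 3
    parts = []
    while mangled[pos] != "E":
        m = re.match(r"[1-9][0-9]*", mangled[pos:])
        if m:
            # <source-name>: decimal length followed by that many characters
            length = int(m.group())
            start = pos + len(m.group())
            parts.append(mangled[start:start + length])
            pos = start + length
        elif mangled[pos:pos + 2] in CTOR_DTOR:
            # constructor / destructor code
            parts.append(mangled[pos:pos + 2])
            pos += 2
        else:
            raise ValueError("unsupported component at offset %d" % pos)
    # everything after the closing 'E' is the parameter type list
    return parts, mangled[pos + 1:]

print(split_nested("_ZN10fb_Object$C1Ev"))
```

Skipping by the decoded length (rather than searching for the next 'E') matters because an identifier may itself contain the letter E.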

Below is some information I used for fbdebugger to recover the original names from mangled ones. Anyway, it's very painful to 'play' with them.


===================================

Chapter 5: Linkage and Object Files


--------------------------------------------------------------------------------


5.1 External Names (a.k.a. Mangling)

5.1.1 General
This section specifies the mangling, i.e. encoding, of external names (external in the sense of being visible outside the object file where they occur). The encoding is formalized as a derivation grammar along with the explanatory text, in a modified BNF with the following conventions:

•Non-terminals are delimited by diamond braces: "<>".
•Italics in non-terminals are modifiers to be ignored, e.g. <function name> is the same as <name>.
•Spaces are to be ignored.
•Text beginning with '#' is comments, to be ignored.
•Tokens in square brackets "[]" are optional.
•Tokens are placed in parentheses "()" for grouping purposes.
•'*' repeats the preceding item 0 or more times.
•'+' repeats the preceding item 1 or more times.
•All other characters are terminals, representing themselves.
See the separate table summarizing the encoding characters used as terminals. Also see additional mangling examples in the separate ABI examples document.

In the various explanatory examples, we use Ret? for an unknown function return type (i.e. that is not given by the mangling), or Type? for an unknown data type.


5.1.2 General Structure
Entities with C linkage and global namespace variables are not mangled. Mangled names have the general structure:


    <mangled-name> ::= _Z <encoding>
    <encoding> ::= <function name> <bare-function-type>
          ::= <data name>
          ::= <special-name>
Thus, a name is mangled by prefixing "_Z" to an encoding of its name, and in the case of functions its type (to support overloading). At this top level, function types do not have the special delimiter characters required when nested (see below). The type is omitted for variables and static data members.
For the purposes of mangling, the name of an anonymous union is considered to be the name of the first named data member found by a pre-order, depth-first, declaration-order walk of the data members of the anonymous union. If there is no such data member (i.e., if all of the data members in the union are unnamed), then there is no way for a program to refer to the anonymous union, and there is therefore no need to mangle its name.

All of these examples:

union { int i; int j; };
union { union { int : 7; }; union { int i; }; };
union { union { int j; } i; };
are considered to have the name i for the purposes of mangling.


    <name> ::= <nested-name>
      ::= <unscoped-name>
      ::= <unscoped-template-name> <template-args>
      ::= <local-name>   # See Scope Encoding below

    <unscoped-name> ::= <unqualified-name>
          ::= St <unqualified-name>   # ::std::

    <unscoped-template-name> ::= <unscoped-name>
              ::= <substitution>
Names of objects nested in namespaces or classes are identified as a delimited sequence of names identifying the enclosing scopes. In addition, when naming a class member function, CV-qualifiers may be prefixed to the compound name, encoding the this attributes. Note that if member function CV-qualifiers are required, the delimited form must be used even if the remainder of the name is a single substitution.

    <nested-name> ::= N [<CV-qualifiers>] <prefix> <unqualified-name> E
        ::= N [<CV-qualifiers>] <template-prefix> <template-args> E

    <prefix> ::= <prefix> <unqualified-name>
        ::= <template-prefix> <template-args>
             ::= <template-param>
        ::= # empty
        ::= <substitution>
             ::= <prefix> <data-member-prefix>

    <template-prefix> ::= <prefix> <template unqualified-name>
                      ::= <template-param>
                      ::= <substitution>
    <unqualified-name> ::= <operator-name>
                       ::= <ctor-dtor-name> 
                       ::= <source-name>   
                       ::= <unnamed-type-name>   

    <source-name> ::= <positive length number> <identifier>
    <number> ::= [n] <non-negative decimal integer>
    <identifier> ::= <unqualified source code identifier>

<number> is a pseudo-terminal representing a decimal integer, with a leading 'n' for negative integers. It is used in <source-name> to provide the byte length of the following identifier. <number>s appearing in mangled names never have leading zeroes, except for the value zero, represented as '0'. <identifier> is a pseudo-terminal representing the unqualified identifier for the entity in the source code.

Note that <source-name> in the productions for <unqualified-name> may be either a function or data object name when derived from <name>, or a class or enum name when derived from <type>.
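
The <number> and <source-name> rules above can be sketched in a few lines of Python. The function names encode_number and source_name are illustrative assumptions, and the sketch assumes the identifier is stored as UTF-8 when counting bytes.

```python
def encode_number(value):
    """<number>: decimal digits, with a leading 'n' for negative values."""
    return ("n" if value < 0 else "") + str(abs(value))

def source_name(identifier):
    """<source-name>: byte length of the identifier, then the identifier."""
    return encode_number(len(identifier.encode("utf-8"))) + identifier

print(source_name("fb_Object$"))
print(encode_number(-42))
```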


5.1.3 Operator Encodings
Operators appear as function names, and in nontype template argument expressions. Unlike Cfront, unary and binary operators using the same symbol have different encodings. All operators are encoded using exactly two letters, the first of which is lowercase.


  <operator-name> ::= nw   # new           
        ::= na   # new[]
        ::= dl   # delete       
        ::= da   # delete[]     
        ::= ps   # + (unary)
        ::= ng   # - (unary)     
        ::= ad   # & (unary)     
        ::= de   # * (unary)     
        ::= co   # ~             
        ::= pl   # +             
        ::= mi   # -             
        ::= ml   # *             
        ::= dv   # /             
        ::= rm   # %             
        ::= an   # &             
        ::= or   # |             
        ::= eo   # ^             
        ::= aS   # =             
        ::= pL   # +=           
        ::= mI   # -=           
        ::= mL   # *=           
        ::= dV   # /=           
        ::= rM   # %=           
        ::= aN   # &=           
        ::= oR   # |=           
        ::= eO   # ^=           
        ::= ls   # <<           
        ::= rs   # >>           
        ::= lS   # <<=           
        ::= rS   # >>=           
        ::= eq   # ==           
        ::= ne   # !=           
        ::= lt   # <             
        ::= gt   # >             
        ::= le   # <=           
        ::= ge   # >=           
        ::= nt   # !             
        ::= aa   # &&           
        ::= oo   # ||           
        ::= pp   # ++           
        ::= mm   # --           
        ::= cm   # ,             
        ::= pm   # ->*           
        ::= pt   # ->           
        ::= cl   # ()           
        ::= ix   # []           
        ::= qu   # ?             
        ::= st   # sizeof (a type)
        ::= sz   # sizeof (an expression)
        ::= at   # alignof (a type)
        ::= az   # alignof (an expression)
        ::= cv <type>   # (cast)       
        ::= v <digit> <source-name>   # vendor extended operator

Vendors who define builtin extended operators (e.g. __imag) shall encode them as a 'v' prefix followed by the operand count as a single decimal digit, and the name in <length,ID> form.
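
To illustrate that unary and binary operators get different codes, here is a small Python sketch using a subset of the table above, plus the vendor extended operator rule. The dictionary layout and the function names operator_code and vendor_operator are assumptions made for this example only.

```python
# Subset of the operator table, keyed by (symbol, number of operands).
OPERATOR_CODES = {
    ("+", 1): "ps", ("-", 1): "ng", ("&", 1): "ad", ("*", 1): "de",
    ("+", 2): "pl", ("-", 2): "mi", ("&", 2): "an", ("*", 2): "ml",
    ("/", 2): "dv", ("%", 2): "rm", ("==", 2): "eq", ("<<", 2): "ls",
}

def operator_code(symbol, operands):
    """Return the two-letter code for an operator with the given arity."""
    return OPERATOR_CODES[(symbol, operands)]

def vendor_operator(name, operand_count):
    """'v', a single-digit operand count, then the name in <length,ID> form."""
    if not 0 <= operand_count <= 9:
        raise ValueError("operand count must be a single decimal digit")
    return "v" + str(operand_count) + str(len(name)) + name

print(operator_code("-", 1), operator_code("-", 2))
print(vendor_operator("__imag", 1))
```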

 For a user-defined conversion operator the result type (i.e., the type to which the operator converts) is part of the mangled name of the function. If the conversion operator is a member template, the result type will appear before the template parameters. There may be forward references in the result type to the template parameters.


5.1.4 Other Special Functions and Entities
Associated with a virtual table are several entities with mangled external names: the virtual table itself, the VTT for construction, the typeinfo structure, and the name it references. Each has a <special-name> encoding that is a simple two-character code, prefixed to the type encoding for the class to which it applies.


  <special-name> ::= TV <type>   # virtual table
       ::= TT <type>   # VTT structure (construction vtable index)
       ::= TI <type>   # typeinfo structure
       ::= TS <type>   # typeinfo name (null-terminated byte string)
Initialization of certain objects with static storage duration requires a guard variable to prevent multiple initialization. The mangled name of a guard variable is the name of the guarded variable prefixed with GV.


  <special-name> ::= GV <object name>   # Guard variable for one-time initialization
         # No <type>

Virtual function override thunks come in two forms. Those overriding from a non-virtual base, with fixed this adjustments, use a "Th" prefix and encode the required adjustment offset (usually negative, which is indicated by an 'n' prefix) and the encoding of the target function. Those overriding from a virtual base must encode two offsets after a "Tv" prefix. The first is the constant adjustment to the nearest virtual base (of the full object), of which the defining object is a non-virtual base. It is coded like the non-virtual case, with an 'n' prefix if negative. The second offset identifies the vcall offset in the nearest virtual base, which will be used to finish adjusting this to the full object. After these two offsets comes the encoding of the target function. The target function encodings of both thunks incorporate the function type; no additional type is encoded for the thunk itself.


  <special-name> ::= T <call-offset> <base encoding>
            # base is the nominal target function of thunk
  <call-offset> ::= h <nv-offset> _
      ::= v <v-offset> _
  <nv-offset> ::= <offset number>
            # non-virtual base override
  <v-offset>  ::= <offset number> _ <virtual offset number>
            # virtual base override, with vcall offset
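
A minimal Python sketch of the two <call-offset> forms above. The function thunk_name and the sample target encodings (N1B1fEv, N1C1fEv) are hypothetical, chosen only to show the shape of the result; the target is passed as an <encoding> without its "_Z" prefix.

```python
def encode_offset(value):
    """Offset number: decimal digits, 'n' prefix when negative."""
    return ("n" if value < 0 else "") + str(abs(value))

def thunk_name(target_encoding, this_adjust, vcall_offset=None):
    """Build a thunk name for a target <encoding>.

    Without vcall_offset the non-virtual form (h) is used,
    otherwise the virtual form (v) with both offsets.
    """
    if vcall_offset is None:
        call_offset = "h" + encode_offset(this_adjust) + "_"
    else:
        call_offset = ("v" + encode_offset(this_adjust) + "_"
                       + encode_offset(vcall_offset) + "_")
    return "_ZT" + call_offset + target_encoding

print(thunk_name("N1B1fEv", -16))
print(thunk_name("N1C1fEv", 0, -24))
```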

Virtual function override thunks with covariant returns are twice as complex. Just as normal virtual function override thunks must adjust the this pointer before calling the base function, those with covariant returns must adjust the return pointer after they return from the base function. So the mangling must also encode a fixed offset to a non-virtual base, and possibly an offset to a vbase offset in the vtable to get to the virtual base containing the result subobject. We achieve this by encoding two <call-offset> components, either of which may be either virtual or non-virtual.


  <special-name> ::= Tc <call-offset> <call-offset> <base encoding>
            # base is the nominal target function of thunk
            # first call-offset is 'this' adjustment
            # second call-offset is result adjustment

Constructors and destructors are simply special cases of <unqualified-name>, where the final <unqualified-name> of a nested name is replaced by one of the following:


  <ctor-dtor-name> ::= C1   # complete object constructor
         ::= C2   # base object constructor
         ::= C3   # complete object allocating constructor
         ::= D0   # deleting destructor
         ::= D1   # complete object destructor
         ::= D2   # base object destructor


5.1.5 Type encodings
Types are encoded as follows:


  <type> ::= <builtin-type>
    ::= <function-type>
    ::= <class-enum-type>
    ::= <array-type>
    ::= <pointer-to-member-type>
    ::= <template-param>
    ::= <template-template-param> <template-args>
    ::= <substitution> # See Compression below

Types are qualified (optionally) by single-character prefixes encoding cv-qualifiers and/or pointer, reference, complex, or imaginary types:


  <type> ::= <CV-qualifiers> <type>
    ::= P <type>   # pointer-to
    ::= R <type>   # reference-to
    ::= O <type>   # rvalue reference-to (C++0x)
    ::= C <type>   # complex pair (C 2000)
    ::= G <type>   # imaginary (C 2000)
    ::= U <source-name> <type>   # vendor extended type qualifier

  <CV-qualifiers> ::= [r] [V] [K]    # restrict (C99), volatile, const
Vendors who define extended type qualifiers (e.g. _near, _far for pointers) shall encode them as a 'U' prefix followed by the name in <length,ID> form.

In cases where multiple order-insensitive qualifiers are present, they should be ordered 'K' (closest to the base type), 'V', 'r', and 'U' (farthest from the base type), with the 'U' qualifiers in alphabetical order by the vendor name (with alphabetically earlier names closer to the base type). For example, int* volatile const restrict _far p has mangled type name U4_farrVKPi.

Vendors must therefore specify which of their extended qualifiers are considered order-insensitive, not necessarily on the basis of whether their language translators impose an order in source code. They are encouraged to resolve questionable cases as being order-insensitive to maximize consistency in mangling.

For purposes of substitution, given a CV-qualified type, the base type is substitutable, and the type with all the const (K), volatile (V), and restrict (r) qualifiers plus any vendor extended types in the same order-insensitive set is substitutable; any type with a subset of those qualifiers is not. That is, given a type const volatile foo, the fully qualified type or foo may be substituted, but not volatile foo nor const foo. Also, note that the grammar above is written with the assumption that vendor extended type qualifiers will be in the order-sensitive (not CV) set. An appropriate grammar modification would be necessitated by an order-insensitive vendor extended type qualifier like const or volatile.

 The restrict qualifier is part of the C99 standard, but is strictly an extension to C++ at this time. There is no standard specification of whether the restrict attribute is part of the type for overloading purposes. An implementation should include its encoding in the mangled name if and only if it also treats it as a distinguishing attribute for overloading purposes. This ABI does not specify that choice.

C++0x pack expansions are prefixed with Dp. The C++0x decltype type is prefixed with either Dt or DT, depending on how the decltype type was parsed.


 <type>  ::= Dp <type>          # pack expansion (C++0x)
         ::= Dt <expression> E  # decltype of an id-expression or class member access (C++0x)
         ::= DT <expression> E  # decltype of an expression (C++0x)
Builtin types are represented by single-letter codes:


  <builtin-type> ::= v   # void
       ::= w   # wchar_t
       ::= b   # bool
       ::= c   # char
       ::= a   # signed char
       ::= h   # unsigned char
       ::= s   # short
       ::= t   # unsigned short
       ::= i   # int
       ::= j   # unsigned int
       ::= l   # long
       ::= m   # unsigned long
       ::= x   # long long, __int64
       ::= y   # unsigned long long, unsigned __int64
       ::= n   # __int128
       ::= o   # unsigned __int128
       ::= f   # float
       ::= d   # double
       ::= e   # long double, __float80
       ::= g   # __float128
       ::= z   # ellipsis
       ::= Dd  # IEEE 754r decimal floating point (64 bits)
       ::= De  # IEEE 754r decimal floating point (128 bits)
       ::= Df  # IEEE 754r decimal floating point (32 bits)
       ::= Dh  # IEEE 754r half-precision floating point (16 bits)
       ::= Di  # char32_t
       ::= Ds  # char16_t
       ::= u <source-name>   # vendor extended type

Vendors who define builtin extended types shall encode them as a 'u' prefix followed by the name in <length,ID> form.
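
As a sketch of how the single-letter codes combine with the P, R, K and V prefixes, the following Python function decodes a small subset of type encodings. The name decode_type is an assumption, and the sketch deliberately ignores class names, substitutions, function types and the two-letter D codes.

```python
BUILTIN = {
    "v": "void", "w": "wchar_t", "b": "bool", "c": "char",
    "a": "signed char", "h": "unsigned char", "s": "short",
    "t": "unsigned short", "i": "int", "j": "unsigned int",
    "l": "long", "m": "unsigned long", "x": "long long",
    "y": "unsigned long long", "f": "float", "d": "double",
    "e": "long double", "z": "...",
}

# Prefix letter -> text appended after the decoded inner type.
PREFIX = {"P": "*", "R": "&", "K": " const", "V": " volatile"}

def decode_type(code):
    """Decode a type made only of P/R/K/V prefixes and one builtin letter."""
    if not code:
        raise ValueError("empty type encoding")
    if code[0] in PREFIX:
        return decode_type(code[1:]) + PREFIX[code[0]]
    if len(code) == 1 and code in BUILTIN:
        return BUILTIN[code]
    raise ValueError("unsupported type encoding: " + code)

print(decode_type("PKc"))
```

Because each prefix applies to the type that follows it, the decoder recurses first and appends the qualifier afterwards, so PKc reads as "pointer to const char".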

Function types are composed from their parameter types and possibly the result type. Except at the outer level type of an <encoding>, or in the <encoding> of an otherwise delimited external name in a <template-parameter> or <local-name> function encoding, these types are delimited by an "F..E" pair. For purposes of substitution (see Compression below), delimited and undelimited function types are considered the same.

Whether the mangling of a function type includes the return type depends on the context and the nature of the function. The rules for deciding whether the return type is included are:

1.Template functions (names or types) have return types encoded, with the exceptions listed below.
2.Function types not appearing as part of a function name mangling, e.g. parameters, pointer types, etc., have return type encoded, with the exceptions listed below.
3.Non-template function names do not have return types encoded.
The exceptions mentioned in (1) and (2) above, for which the return type is never included, are
•Constructors.
•Destructors.
•Conversion operator functions, e.g. operator int.
Empty parameter lists, whether declared as () or conventionally as (void), are encoded with a void parameter specifier (v). Therefore function types always encode at least one parameter type, and function manglings can always be distinguished from data manglings by the presence of the type. Member functions do not encode the types of implicit parameters, either this or the VTT parameter.
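
The rule about empty parameter lists can be shown with a small Python sketch for global, non-template functions. The helper mangle_function is hypothetical and takes parameter types that are already encoded (for example "c" for char).

```python
def mangle_function(name, param_codes):
    """Mangle a global, non-template function.

    param_codes is a list of already-encoded parameter types;
    an empty list is encoded as a single 'v' (void).
    """
    params = "".join(param_codes) if param_codes else "v"
    return "_Z" + str(len(name)) + name + params

print(mangle_function("foo", ["c"]))
print(mangle_function("f", []))
```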

A "Y" prefix for the bare function type encodes extern "C". If there are any cv-qualifiers of this, they are encoded at the beginning of the <qualified-name> as described above. This affects only type mangling, since extern "C" function objects have unmangled names.


  <function-type> ::= F [Y] <bare-function-type> E
  <bare-function-type> ::= <signature type>+
   # types are possible return type, then parameter types

When a function parameter is a C++0x function parameter pack, its type is mangled with Dp <type>, i.e., its type is a pack expansion.

A class, union, or enum type is simply a name. It may be a simple <unqualified-name>, with or without a template argument list, or a more complex <nested-name>. Thus, it is encoded like a function name, except that no CV-qualifiers are present in a nested name specification.


  <class-enum-type> ::= <name>
An exception, however, is that class std::decimal::decimal32, std::decimal::decimal64, or std::decimal::decimal128 as defined in TR 24733 uses the same encoding as the corresponding native decimal-floating point scalar type.

Unnamed class, union, and enum types that aren't closure types, that haven't acquired a "name for linkage purposes" (through a typedef), and that aren't anonymous union types, follow the same rule when they are defined in class scopes, with the underlying <unqualified-name> an <unnamed-type-name> of the form

  <unnamed-type-name> ::= Ut [ <nonnegative number> ] _
The number is omitted for the first unnamed type in the class; it is n-2 for the nth unnamed type (in lexical order) otherwise.
(The mangling of such unnamed types defined in namespace scope is generally unspecified because they do not have to match across translation units. An implementation must only ensure that naming collisions are avoided. The mangling of such unnamed types in local scopes is described in Scope Encoding. The encoding of closure types is described in Closure Types (Lambdas).)

For example:
   struct S { static struct {} x; };
   typedef decltype(S::x) TX;  // Type mangled as N1SUt_E
   TX S::x;                    // _ZN1S1xE
   void f(TX) {}               // _Z1fN1SUt_E
Array types encode the dimension (number of elements) and the element type. Note that "array" parameters to functions are encoded as pointer types. For variable length arrays (C99 VLAs), the dimension (but not the '_' separator) is omitted.


  <array-type> ::= A <positive dimension number> _ <element type>
          ::= A [<dimension expression>] _ <element type>

When the dimension is an expression involving template parameters, the second production is used. Thus, the declarations:

    template<int I> void foo (int (&)[I + 1]) { }
    template void foo<2> (int (&)[3]);
produce the mangled name "_Z3fooILi2EEvRAplT_Li1E_i".
Pointer-to-member types encode the class and member types.


  <pointer-to-member-type> ::= M <class type> <member type>

Note that for a pointer to cv-qualified member function, the qualifiers are attached to the function type, so


    struct A;
    void f (void (A::*)() const) {}
produces the mangled name "_Z1fM1AKFvvE".
When function and member function template instantiations reference the template parameters in their parameter/result types, the template parameter number is encoded, with the sequence T_, T0_, ... Class template parameter references are mangled using the standard mangling for the actual parameter type, typically a substitution. Note that a template parameter reference is a substitution candidate, distinct from the type (or other substitutable entity) that is the actual parameter.


  <template-param> ::= T_   # first template parameter
         ::= T <parameter-2 non-negative number> _
  <template-template-param> ::= <template-param>
             ::= <substitution>
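
The T_, T0_, T1_, ... sequence can be sketched as follows. The function name template_param is an assumption, and it takes a 0-based index (so the grammar's "parameter-2" becomes index - 1 here).

```python
def template_param(index):
    """Encode a template parameter reference from its 0-based position.

    0 -> T_, 1 -> T0_, 2 -> T1_, and so on.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    return "T_" if index == 0 else "T%d_" % (index - 1)

print([template_param(i) for i in range(4)])
```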

Template argument lists appear after the unqualified template name, and are bracketed by I/E. This is used in names for specializations in particular, but also in types and scope identification. Template argument packs are bracketed by additional I/E to distinguish them from other arguments.


  <template-args> ::= I <template-arg>+ E
  <template-arg> ::= <type>                   # type or template
                 ::= X <expression> E         # expression
                 ::= <expr-primary>           # simple expressions
                 ::= I <template-arg>* E      # argument pack
                 ::= sp <expression>          # pack expansion (C++0x)

  <expression> ::= <unary operator-name> <expression>
               ::= <binary operator-name> <expression> <expression>
               ::= <trinary operator-name> <expression> <expression> <expression>
               ::= cl <expression>* E                             # call
               ::= cv <type> <expression>                         # conversion with one argument
               ::= cv <type> _ <expression>* E                    # conversion with a different number of arguments
               ::= st <type>                                      # sizeof (a type)
               ::= at <type>                                      # alignof (a type)
               ::= <template-param>
               ::= <function-param>
               ::= sr <type> <unqualified-name>                   # dependent name
               ::= sr <type> <unqualified-name> <template-args>   # dependent template-id
               ::= sZ <template-param>                            # size of a parameter pack
               ::= <expr-primary>

  <expr-primary> ::= L <type> <value number> E                   # integer literal
                 ::= L <type> <value float> E                    # floating literal
                 ::= L <mangled-name> E                          # external name

Type arguments appear using their regular encoding. For example, the template class "A<char, float>" is encoded as "1AIcfE". A slightly more involved example is a dependent function parameter type "A<T2>::X" (T2 is the second template parameter) which is encoded as "N1AIT0_E1XE", where the "N...E" construct is used to describe a qualified name.

Literal arguments, e.g. "A<42L>", are encoded with their type and value. Negative integer values are preceded with "n"; for example, "A<-42L>" becomes "1AILln42EE". The bool value false is encoded as 0, true as 1.
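
A short Python sketch of the literal and template-argument encodings described above. The helpers int_literal and template_id are illustrative names, and the sketch only covers unqualified template names with already-encoded arguments.

```python
def int_literal(type_code, value):
    """L <type> <value number> E, with 'n' marking a negative value."""
    digits = ("n" if value < 0 else "") + str(abs(value))
    return "L" + type_code + digits + "E"

def template_id(name, args):
    """<source-name> followed by I <template-arg>+ E."""
    return str(len(name)) + name + "I" + "".join(args) + "E"

print(template_id("A", ["c", "f"]))               # A<char, float>
print(template_id("A", [int_literal("l", -42)]))  # A<-42L>
print(int_literal("b", int(True)))                # bool literal true
```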

Floating-point literals are encoded using a fixed-length lowercase hexadecimal string corresponding to the internal representation (IEEE on Itanium), high-order bytes first, without leading zeroes. For example: "Lf bf800000 E" is -1.0f on Itanium.
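
The hexadecimal form of a floating-point literal can be reproduced with Python's struct module. This is only a sketch assuming IEEE 754 single and double precision, packed big-endian; the function name float_literal is an assumption.

```python
import struct

# Type code -> struct format for a big-endian IEEE 754 value.
FORMATS = {"f": ">f", "d": ">d"}

def float_literal(value, type_code="f"):
    """L <type> <hex of the internal representation> E."""
    raw = struct.pack(FORMATS[type_code], value)
    return "L" + type_code + raw.hex() + "E"

print(float_literal(-1.0))
```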

The encoding for a literal of an enumerated type is the encoding of the type name followed by the encoding of the numeric value of the literal in its base integral type (which deals with values that don't have names declared in the type).

A reference to an entity with external linkage is encoded with "L<mangled name>E". For example:
          void foo(char); // mangled as _Z3fooc
          template<void (&)(char)> struct CB;
          // CB<foo> is mangled as "2CBIL_Z3foocEE"

The <encoding> of an extern "C" function is treated like global-scope data, i.e. as its <source-name> without a type. For example:
          extern "C" bool IsEmpty(char *); // (un)mangled as IsEmpty
          template<void (&)(char *)> struct CB;
          // CB<IsEmpty> is mangled as "2CBIL_Z7IsEmptyEE"

An expression, e.g., "B<(J+1)/2>", is encoded with a prefix traversal of the operators involved. The operators are encoded using their two letter mangled names. For example, "B<(J+1)/2>", if J is the third template parameter, becomes "1BI Xdv pl T1_ Li1E Li2E E E" (the blanks are present only to visualize the decomposition). Note that the expression is mangled without constant folding or other simplification, and without parentheses, which are implicit in the prefix representation. Except for the parentheses, therefore, it represents the source token stream. (C++ Standard reference 14.5.5.1 p. 5.) An expression used as a template argument is delimited by "X...E".
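
The prefix traversal can be sketched with a tiny recursive encoder in Python. The tuple-based tree representation and the name encode_expr are assumptions for this example; only int literals, template parameters and a few binary operators are handled.

```python
OPS = {"+": "pl", "-": "mi", "*": "ml", "/": "dv", "%": "rm"}

def encode_expr(node):
    """Encode an expression tree in prefix order.

    node is an int literal, ('T', zero_based_index) for a template
    parameter, or (operator, left, right) for a binary operator.
    """
    if isinstance(node, int):
        return "Li" + ("n" if node < 0 else "") + str(abs(node)) + "E"
    if node[0] == "T":
        index = node[1]
        return "T_" if index == 0 else "T%d_" % (index - 1)
    op, left, right = node
    return OPS[op] + encode_expr(left) + encode_expr(right)

# (J+1)/2 where J is the third template parameter (index 2)
expr = ("/", ("+", ("T", 2), 1), 2)
print("1BIX" + encode_expr(expr) + "EE")
```

The operator code is emitted before its operands, which is why no parentheses are needed in the mangled form.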

If an expression is a qualified-name, and the qualifying scope is a dependent type, one of the sr productions is used, rather than the <mangled-name> production. If the qualified name refers to an operator for which both unary and binary manglings are available, the mangling chosen is the mangling for the binary version.

5.1.6 Scope Encoding
A nonlocal scope is encoded as the qualifier of a qualified name: it can be the top-level name qualification or it can appear inside <type> to denote dependent types or bind specific names as arguments. Qualified names are encoded as:
   N <qual 1> ... <qual N> <unqual name> E
where each <qual K> is the encoding of a namespace name or a class name (with the latter possibly including a template argument list).
Occasionally entities in local scopes must be mangled too (e.g. because inlining or template compilation causes multiple translation units to require access to that entity). The encoding for such entities is as follows:
  <local-name> ::= Z <function encoding> E <entity name> [<discriminator>]
               ::= Z <function encoding> E s [<discriminator>]

  <discriminator> ::= _ <non-negative number>      # when number < 10
                  ::= __ <non-negative number> _   # when number >= 10
The first production is used for named local static objects and classes, which are identified by their "names" as encoded relative to the closest enclosing function. In case of unnamed local types (excluding unnamed types that have acquired a "name for linkage purposes"), the unqualified name is encoded as an <unnamed-type-name> of the form

  <unnamed-type-name> ::= Ut [ <nonnegative number> ] _
where the number is omitted for the first unnamed type in the function, and n-2 for the nth unnamed type (in lexical order) otherwise. The <entity name> may itself be a compound name, but it is relative to the closest enclosing function, i.e. none of the components of the function encoding appear in the entity name. It is possible to have nested function scopes, e.g. when dealing with a member function in a local class. In such cases, the function encoding will itself have <local-name> structure.
The discriminator is used only for the second and later occurrences of the same "top-level" name within a single function (since "unnamed types" are distinctly numbered, they never include a discriminator). In this case <number> is n - 2, if this is the nth occurrence, in lexical order, of the given name. "top-level" here means that if there are e.g. three classes named X in a given function g, and only the third has a member function f, the encoding of X::f in g will still include a discriminator of the form "_1" (n-2 == 1).

For example:
   inline void g(int) {
     { struct S {}; }
     { struct S {}; }
     { struct S {}; }
     struct S {        // Fourth occurrence: _2
       void f(int) {   // _ZZ1giEN1S1fE_2i
         struct {} x1;
         struct {} x2;
         struct {      // Third occurrence: 1_, i.e.
                       // _ZZZ1giEN1S1fE_2iEUt1_
           int fx() {  // _ZZZ1giEN1S1fE_2iENUt1_2fxEv
              return 3;
                }
         } x3;
         x3.fx();
       }
     } s;
     s.f(1);
   }
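
The "n - 2" numbering used in the example above, for both unnamed types and discriminators, can be sketched like this. The function names are assumptions, and n is the 1-based occurrence count in lexical order.

```python
def unnamed_type_name(n):
    """Ut [number] _ : number omitted for the first, n-2 afterwards."""
    if n < 1:
        raise ValueError("n is 1-based")
    return "Ut" + ("" if n == 1 else str(n - 2)) + "_"

def discriminator(n):
    """Empty for the first occurrence, then _<d> or __<d>_ with d = n-2."""
    if n < 1:
        raise ValueError("n is 1-based")
    if n == 1:
        return ""
    number = n - 2
    return "_%d" % number if number < 10 else "__%d_" % number

print(unnamed_type_name(3))   # third unnamed type in a function
print(discriminator(4))       # fourth occurrence of the same name
```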
The second production is used for string literals. The discriminator is used only if there is more than one, for the second and subsequent ones. In this case <number> is n - 2, if this is the nth distinct string literal, in lexical order, appearing in the function. Multiple references to the same string literal produce one string object with one name in the sequence. Note that this assumes that the same string literal occurring twice in a given function in fact represents a single entity, i.e. has a unique address.
In all cases the numbering order is strictly lexical order based on the original token sequence. All entities occurring in that sequence are to be numbered, even if subsequent optimization makes some of them unnecessary. The ordering of literals appearing in a mem-initializer-list shall be the order that the literals appear in the source, which may be different from the order in which the initializers will be executed when the program runs. It is expected that this will be the 'natural' order in most compilers. In any case, conflicts would arise only if different compilation units including the same code were compiled by different compilers, and multiple entities requiring mangling had the same name.

For entities in constructors and destructors, the mangling of the complete object constructor or destructor is used as the base function name, i.e. the C1 or D1 version. This yields mangled names that are consistent across the versions.

Example:
   inline char const* g() {
     "str1";                   // First string in g()
     struct B {};
     struct S: B {
       S()                     // Complete object ctor: _ZZ1gvEN1SC1Ev
         : msg("str2") {}      // First string in g()::S::S():
                               //      _ZZZ1gvEN1SC1EvEs
       char const *msg;
     } s;
     "str3";                   // Second string in g()
     static char const *str4a  // _ZZ1gvE5str4a
        = "str4";              // Third string in g() (n-2 == 1):
                               //      _ZZ1gvEs_1
     static char const *str4b  // _ZZ1gvE5str4b
        = "str4";              // Still the third string (_ZZ1gvEs_1)
     return str4b;
   }
See additional examples in the ABI examples document.

5.1.7 Closure Types (Lambdas)
A C++0x lambda expression introduces a unique class type called closure type. In some contexts, such closure types are unique to the translation unit: This ABI therefore does not specify an encoding for such cases (but an implementation must ensure that any internal encoding does not conflict with this ABI).

For example:
namespace N {
  int n = []{ return 1; }();  // Closure type internal to
}                             // the translation unit.
In the following contexts, however, the one-definition rule requires closure types in different translation units to "correspond":
•default arguments appearing in class definitions
•the in-class initializers of class members (a C++0x feature)
•the bodies of inline functions
•the bodies of non-exported nonspecialized template functions
•the initializers of nonspecialized static members of template classes
In all these contexts, the encoding of the closure types builds on an underlying <unqualified-name> that is an <unnamed-type-name> of the form
  <unnamed-type-name> ::= <closure-type-name>

  <closure-type-name> ::= Ul <lambda-sig> E [ <nonnegative number> ] _
with
  <lambda-sig> ::= <parameter type>+  # Parameter types or "v" if the lambda has no parameters
The number is omitted for the first closure type with a given <lambda-sig> in a given context; it is n-2 for the nth closure type (in lexical order) with that same <lambda-sig> and context.
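
The numbering rule above can be sketched in C. This is only an illustration: closure_type_name is a hypothetical helper, not part of any ABI library, and it assumes the caller already has the <lambda-sig> string (for example "v" for a lambda with no parameters).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build a <closure-type-name>: "Ul" <lambda-sig> "E" [ n-2 ] "_".
 * nth is the 1-based lexical position among closure types that share
 * the same <lambda-sig> and context.
 * Returns 0 on success, -1 on bad input or if the buffer is too small. */
static int closure_type_name(const char *lambda_sig, unsigned nth,
                             char *out, size_t out_size)
{
    int written;

    assert(lambda_sig != NULL && out != NULL);
    if (nth == 0)
        return -1;

    if (nth == 1)   /* first closure type: the number is omitted */
        written = snprintf(out, out_size, "Ul%sE_", lambda_sig);
    else            /* nth closure type: the number is n-2, in decimal */
        written = snprintf(out, out_size, "Ul%sE%u_", lambda_sig, nth - 2);

    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

With "v" as the signature, the first and second closure types come out as UlvE_ and UlvE0_, which are the forms used in the g(int) example that follows.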

If the context is the body of a function (inline and/or template), the closure type is encoded like any other local entity (see Scope Encoding above). For example:
   template<typename F> int algo(F fn) { return fn(); }
   inline void g(int n) {
     int bef(int i = []{ return 1; }());
       // Default arguments of block-extern function declarations
       // remain in the context of the enclosing function body.
       // The closure type is encoded as Z1giEUlvE_.
       // The call operator of that type is _ZZ1giENKUlvE_clEv.

     algo([=]{return n+bef();});
       // The captured entities do not participate in <lambda-sig>
       // and so this closure type has the same <lambda-sig> as
       // the previous one.  Its encoding is therefore Z1giEUlvE0_
       // and the call operator is _ZZ1giENKUlvE0_clEv.  The
       // instance of "algo" being called is then
       // _Z4algoIZ1giEUlvE0_EiT_.
   }
If the context is a default argument (of a member function parameter) appearing in a class definition, the closure class and its members are encoded as follows:
  <local-name> := Z <function encoding> Ed [ <parameter number> ] _ <entity name>
The parameter number is omitted for the last parameter, 0 for the second-to-last parameter, 1 for the third-to-last parameter, etc. The <entity name> will of course contain a <closure-type-name>: Its numbering will be local to the particular argument in which it appears -- other default arguments do not affect its encoding. For example:
   struct S {
     void f(int = []{return 1;}()
              // Type: ZN1S1fEiiEd0_UlvE_
              // Operator: _ZZN1S1fEiiEd0_NKUlvE_clEv
                + []{return 2;}(),
              // Type: ZN1S1fEiiEd0_UlvE0_
              // Operator: _ZZN1S1fEiiEd0_NKUlvE0_clEv
            int = []{return 3;}());
              // Type: ZN1S1fEiiEd_UlvE_
              // Operator: _ZZN1S1fEiiEd_NKUlvE_clEv
   } s;
Finally, if the context of a closure type is an initializer for a class member (static or nonstatic), it is encoded in a qualified name with a final <prefix> of the form:
  <data-member-prefix> := <member source-name> M
For example:
   template<typename T> struct S {
     static int x;
   };
   template<typename T> int S<T>::x = []{return 1;}();
   template int S<int>::x;
     // Type of lambda in initializer of S<int>::x: N1SIiE1xMUlvE_E
     // Corresponding operator(): _ZNK1SIiE1xMUlvE_clEv

5.1.8 Compression
To minimize the length of external names, we use two mechanisms, a substitution encoding to eliminate repetition of name components, and abbreviations for certain common names. Each non-terminal in the grammar above for which <substitution> appears on the right-hand side is both a source of future substitutions and a candidate for being substituted. There are two exceptions that appear to be substitution candidates from the grammar, but are explicitly excluded:

•<builtin-type> other than vendor extended types, and
•function and operator names other than extern "C" functions.
All substitutions are for entities that would appear in a symbol table. In particular, we make substitutions for prefixes of qualified names, but not for arbitrary components of them. Thus, the components ::n1::foo() and ::n2::foo() appearing in the same name would not result in substituting for the second "foo." Similarly, we do not substitute for expressions, though names appearing in them might be substituted. The reason for this is to facilitate implementations that use the symbol table to keep track of components that might be substitutable.

Note that the above exclusion of function and operator names from consideration for substitution does not exclude the full function entity, i.e. its name plus its signature encoding.

Logically, the substitutable components of a mangled name are considered left-to-right, components before the composite structure of which they are a part. If a component has been encountered before, it is substituted as described below. This decision is independent of whether its components have been substituted, so an implementation may optimize by considering large structures for substitution before their components. If a component has not been encountered before, its mangling is identified, and it is added to a dictionary of substitution candidates. No entity is added to the dictionary twice.

The type of a non-static member function is considered to be different, for the purposes of substitution, from the type of a namespace-scope or static member function whose type appears similar. The types of two non-static member functions are considered to be different, for the purposes of substitution, if the functions are members of different classes. In other words, for the purposes of substitution, the class of which the function is a member is considered part of the type of function.

 Therefore, in the following example:

typedef void T();
struct S {};
void f(T*, T (S::*)) {}
the function f is mangled as _Z1fPFvvEM1SFvvE; the type of the member function pointed to by the second parameter is not considered the same as the type of the function pointed to by the first parameter. Both function types are, however, entered into the substitution table; subsequent references to either variant of the function type will result in the use of substitutions.

Substitution is according to the production:


  <substitution> ::= S <seq-id> _
       ::= S_

The <seq-id> is a sequence number in base 36, using digits and upper case letters, and identifies the <seq-id>-th encoded component, in left-to-right order, starting at "0". As a special case, the first substitutable entity is encoded as "S_", i.e. with no number, so the numbered entities are the second one as "S0_", the third as "S1_", the twelfth as "SA_", the thirty-eighth as "S10_", etc. All substitutable components are so numbered, except those that have already been numbered for substitution. A component is earlier in the substitution dictionary than the structure of which it is a part. For example:
   "_ZN1N1TIiiE2mfES0_IddE": Ret? N::T<int, int>::mf(N::T<double, double>)
since the substitutions generated for this name are:
   "S_" == N (qualifier is less recent than qualified entity)
   "S0_" == N::T (template-id comes before template)
   (int is builtin, and isn't considered)
   "S1_" == N::T<int, int>
   "S2_" == N::T<double, double>
Note that substitutable components are the represented symbolic constructs, not their associated mangling character strings. Thus, a substituted object matches its unsubstituted form, and a delimited <function-type> matches its <bare-function-type>.
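
The <seq-id> numbering is easy to get off by one, so here is a small sketch in C. encode_substitution is a hypothetical helper written for this note, not something taken from fbc or from a demangler library; it only turns a 1-based dictionary position into the "S_" / "S<seq-id>_" text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode the nth (1-based) substitution candidate.
 * nth == 1 gives "S_"; for nth >= 2 the <seq-id> is (nth - 2) written
 * in base 36 with digits 0-9 followed by upper case A-Z.
 * Returns 0 on success, -1 on bad input or if the buffer is too small. */
static int encode_substitution(unsigned long nth, char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char tmp[32];            /* base-36 digits, least significant first */
    size_t len = 0;
    size_t pos = 0;

    assert(out != NULL);
    if (nth == 0)
        return -1;

    if (nth >= 2) {
        unsigned long id = nth - 2;
        do {
            tmp[len++] = digits[id % 36];
            id /= 36;
        } while (id != 0);
    }

    if (len + 3 > out_size)  /* 'S' + digits + '_' + terminating NUL */
        return -1;

    out[pos++] = 'S';
    while (len > 0)          /* emit most significant digit first */
        out[pos++] = tmp[--len];
    out[pos++] = '_';
    out[pos] = '\0';
    return 0;
}
```

Positions 1, 2, 3, 12 and 38 give S_, S0_, S1_, SA_ and S10_, matching the values quoted in the text.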

In addition, the following catalog of abbreviations of the form "Sx" are used:


   <substitution> ::= St # ::std::
   <substitution> ::= Sa # ::std::allocator
   <substitution> ::= Sb # ::std::basic_string
   <substitution> ::= Ss # ::std::basic_string < char,
                   ::std::char_traits<char>,
                   ::std::allocator<char> >
   <substitution> ::= Si # ::std::basic_istream<char,  std::char_traits<char> >
   <substitution> ::= So # ::std::basic_ostream<char,  std::char_traits<char> >
   <substitution> ::= Sd # ::std::basic_iostream<char, std::char_traits<char> >

The abbreviation St is always an initial qualifier, i.e. appearing as the first element of a compound name. It does not require N...E delimiters unless either followed by more than one additional composite name component, or preceded by CV-qualifiers for a member function. This adds the case:


   <name> ::= St <unqualified-name> # ::std::

For example:
   "_ZSt5state": ::std::state
   "_ZNSt3_In4wardE": ::std::_In::ward

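The two St examples above are simple enough to decode by hand, and the same steps can be sketched in C. This is a deliberately tiny subset, assuming only the shape _Z [N] [St] (<length><identifier>)+ [E]. decode_simple_name is a hypothetical helper; it gives up on templates, substitutions, constructors (C1) and everything else, and it ignores whatever follows the name (such as parameter types).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Decode _Z [N] [St] (<length><identifier>)+ [E] into "::a::b" form.
 * Returns 0 on success, -1 for anything outside that subset. */
static int decode_simple_name(const char *mangled, char *out, size_t out_size)
{
    const char *p = mangled;
    size_t used = 0;
    int nested = 0;
    int parts = 0;

    assert(mangled != NULL && out != NULL);
    if (out_size == 0 || strncmp(p, "_Z", 2) != 0)
        return -1;
    p += 2;
    out[0] = '\0';

    if (*p == 'N') {                  /* N ... E wraps a nested name */
        nested = 1;
        p++;
    }
    if (strncmp(p, "St", 2) == 0) {   /* St abbreviates ::std:: */
        if (out_size < sizeof "::std")
            return -1;
        strcpy(out, "::std");
        used = strlen(out);
        p += 2;
    }

    while (isdigit((unsigned char)*p)) {
        size_t len = 0;

        /* <length> is a decimal count of identifier characters */
        while (isdigit((unsigned char)*p)) {
            len = len * 10 + (size_t)(*p++ - '0');
            if (len > 4096)           /* arbitrary sanity limit */
                return -1;
        }
        if (len == 0 || strlen(p) < len || used + 2 + len + 1 > out_size)
            return -1;

        memcpy(out + used, "::", 2);
        memcpy(out + used + 2, p, len);
        used += 2 + len;
        out[used] = '\0';
        p += len;
        parts++;
    }

    if (parts == 0)
        return -1;
    if (nested && *p != 'E')          /* nested names must be closed */
        return -1;
    return 0;
}
```

Fed "_ZSt5state" and "_ZNSt3_In4wardE" it produces ::std::state and ::std::_In::ward. A real demangler has to handle the whole grammar, so treat this only as a way to see how the length prefixes and the N...E delimiters fit together.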

Imortis
Posts: 1579
Joined: Jun 02, 2005 15:10
Location: USA
Contact:

Re: Pure FB Runtime Library (in progress)

Postby Imortis » Dec 19, 2017 21:28

FB does not like the $ in the names... Any other suggestions?
coderJeff
Site Admin
Posts: 2717
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: Pure FB Runtime Library (in progress)

Postby coderJeff » Dec 19, 2017 22:36

in my own stabs branch of fbc
see https://github.com/jayrm/fbc/blob/stabs ... b/fb_oop.h
see https://github.com/jayrm/fbc/blob/stabs ... copyctor.c
see https://github.com/jayrm/fbc/blob/stabs ... p_object.c

These are without the $ in the name. I think it works better with gdb/stabs debugging, but it's been so long I don't remember all the reasons why I did it that way. Using these names requires changes in the compiler, and it will break everyone's code. I also don't know what changes SARG would need to make to fbdebugger. I think it's a worthwhile change, but I don't have the proof why.

Otherwise, can use fbc keyword ALIAS with the mangled name, for example:

Code: Select all

/* a.c: compile with gcc -c a.c */
int add$2( int x, int y)
{
   return x + y;
}

Code: Select all

'' b.bas: compile with fbc b.bas a.o
declare function add2 cdecl alias "add$2" ( x as integer, y as integer ) as integer
print add2( 3, 7 )
Dr_D
Posts: 2357
Joined: May 27, 2005 4:59
Contact:

Re: Pure FB Runtime Library (in progress)

Postby Dr_D » Dec 20, 2017 0:28

Whoa...
Imortis
Posts: 1579
Joined: Jun 02, 2005 15:10
Location: USA
Contact:

Re: Pure FB Runtime Library (in progress)

Postby Imortis » Dec 20, 2017 16:16

For now I have used the Alias. I did not know you could put $ in an Alias. I am currently trying to fix an issue related to the fb_ctx type and a macro. I think I just need to get the right combo of @ and (0) in there to make it compile, but I will let you guys know if I am incapable.

Dr_D wrote:Whoa...


?
Imortis
Posts: 1579
Joined: Jun 02, 2005 15:10
Location: USA
Contact:

Re: Pure FB Runtime Library (in progress)

Postby Imortis » Jan 03, 2018 19:48

Okay, so a new issue. FB's EOF functions use a crt function called "ftello64". Not only does FB's CRT include not have "ftello64", it doesn't even have "ftello". It only has "ftell". The function of all these is the same, except for the return type. The 64 version needs to be used so that a file pointer can be returned for files larger than 2 GB.

Can I just define that in a crt_extra file like St_W did for other functions?
marcov
Posts: 2645
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

Re: Pure FB Runtime Library (in progress)

Postby marcov » Jan 03, 2018 20:30

What is "crt" in this context? glibc? msvcrt ? Usually files io functions (off_t parameter or returning functions like mmap stat,lseek,ftruncate etc) with -64 bits suffix are Linuxisms.

Most other systems simply transition to a 64-bit off_t on some major version transition (Like FreeBSD2 to FreeBSD3), and then be done with it. (Solaris already 128-bit?)

Anyway I suggest to avoid using the -64 call thing as a general mould but treat it as an exception, it is not POSIX. LINUX_LARGE_FILE or LINUX_FILE64 would be a good name for a define.
Imortis
Posts: 1579
Joined: Jun 02, 2005 15:10
Location: USA
Contact:

Re: Pure FB Runtime Library (in progress)

Postby Imortis » Jan 03, 2018 20:39

marcov wrote:What is "crt" in this context? glibc? msvcrt ? Usually files io functions (off_t parameter or returning functions like mmap stat,lseek,ftruncate etc) with -64 bits suffix are Linuxisms.

Most other systems simply transition to a 64-bit off_t on some major version transition (Like FreeBSD2 to FreeBSD3), and then be done with it. (Solaris already 128-bit?)

Anyway I suggest to avoid using the -64 call thing as a general mould but treat it as an exception, it is not POSIX. LINUX_LARGE_FILE or LINUX_FILE64 would be a good name for a define.


On windows it is using msvcrt. I cannot see what is being used on the linux/dos ports readily.
The code is actually calling ftello, but in the rtlib fb_win32.bi that is just a macro to the ftello64 function.
marcov
Posts: 2645
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

Re: Pure FB Runtime Library (in progress)

Postby marcov » Jan 03, 2018 20:50

Imortis wrote:On windows it is using msvcrt. I cannot see what is being used on the linux/dos ports readily.
The code is actually calling ftello, but in the rtlib fb_win32.bi that is just a macro to the ftello64 function.


Ftello is very old. Afaik it was introduced to tell the POSIX variant ( (*) return type off_t) apart from ANSI ftell (return type long). On open source Unix that was already ancient history in 2000-2002 times, but it was kept sometimes for no longer updated commercial unix versions.

So it seems that Windows is an exception too, probably for the same reason (to support both functions at the same time, never remove or change one, just add the next one).

But then ftell64 is already to connect OS specific code and no -64 is used in the generic code, which sounds right. The continued use of the -o is maybe just to avoid nameclashes for aliases.

Just define ftello to ftell(o)(64) whatever returns an 64-bit off_t on a per target basis.

(*) http://pubs.opengroup.org/onlinepubs/96 ... ftell.html
Imortis
Posts: 1579
Joined: Jun 02, 2005 15:10
Location: USA
Contact:

Re: Pure FB Runtime Library (in progress)

Postby Imortis » Jan 04, 2018 17:07

marcov wrote:
Imortis wrote:On windows it is using msvcrt. I cannot see what is being used on the linux/dos ports readily.
The code is actually calling ftello, but in the rtlib fb_win32.bi that is just a macro to the ftello64 function.


Ftello is very old. Afaik it was introduced to tell the POSIX variant ( (*) return type off_t) apart from ANSI ftell (return type long). On open source Unix that was already ancient history in 2000-2002 times, but it was kept sometimes for no longer updated commercial unix versions.

So it seems that Windows is an exception too, probably for the same reason (to support both functions at the same time, never remove or change one, just add the next one).

But then ftell64 is already to connect OS specific code and no -64 is used in the generic code, which sounds right. The continued use of the -o is maybe just to avoid nameclashes for aliases.

Just define ftello to ftell(o)(64) whatever returns an 64-bit off_t on a per target basis.

(*) http://pubs.opengroup.org/onlinepubs/96 ... ftell.html


I feel like you are missing the point. Here is the code that is causing the problem:

Code: Select all

eof__ = (ftello( fp ) >= handle->size)

ftello is just a macro, seen here:

Code: Select all

#define ftello(stream)                 ftello64(stream)


However, ftello64 is not defined ANYWHERE in the /crt/stdio.bi file, which is where it SHOULD be. The only thing defined there is ftell, which is functionally the same, but it returns a long. I need one that returns some 64bit type so that it can handle file location pointers over 2gb.

I see this as a failing in the crt bindings we are using. I could be wrong, as at this point I don't know if the msvcrt lib that we included even has that function in it. I was asking if anyone knew if it was in there (google is not helping me much on this) and if so, if I could just make a new BI file with the declaration in it. If I can, great. I will do that. If I can't, then I need a way to do this differently so the function of the code is the same.
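
For reference, the per-target idea can be sketched on the C side. This is only a sketch under assumptions: tell64 and at_eof are made-up names, the Windows branch assumes the CRT in use exports _ftelli64, and the other branch assumes a POSIX system where ftello returns a 64-bit off_t once _FILE_OFFSET_BITS is set. It is not the rtlib's actual code.

```c
/* Feature-test macros have to come before every #include. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64        /* ask for a 64-bit off_t */
#endif
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L     /* make ftello() visible */
#endif

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#if defined(_WIN32)
/* Assumption: the Microsoft CRT spelling is available. */
static int64_t tell64(FILE *fp)
{
    return (int64_t)_ftelli64(fp);
}
#else
#include <sys/types.h>
/* POSIX spelling; off_t is 64-bit because of _FILE_OFFSET_BITS. */
static int64_t tell64(FILE *fp)
{
    return (int64_t)ftello(fp);
}
#endif

/* Position-based EOF test in the spirit of the rtlib line quoted above:
 * size is the known length of the file in bytes. */
static int at_eof(FILE *fp, int64_t size)
{
    assert(fp != NULL);
    return tell64(fp) >= size;
}
```

The generic code only ever calls tell64, so the -64 and -i64 spellings stay inside the per-target block.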
marcov
Posts: 2645
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

Re: Pure FB Runtime Library (in progress)

Postby marcov » Jan 04, 2018 17:36

Imortis wrote: However, ftello64 is not defined ANYWHERE in the /crt/stdio.bi file, which is where it SHOULD be. The only thing defined there is ftell, which is functionally the same, but it returns a long. I need one that returns some 64bit type so that it can handle file location pointers over 2gb.


I got most of it but understood that you expected that ftello64 to be defined in OS specific additional headers. I don't know much about msvcrt, we use winapi, not POSIX on Windows.

Seems it isn't, but there is a ftelli64
Then use ftelli64: https://stackoverflow.com/questions/117 ... quivalents

I could be wrong, as at this point I don't know if the msvcrt lib that we included even has that function in it.


If not, how does the C RTS compile? Maybe there is a macro in the msvcrt headers though, that's not in the BI.
coderJeff
Site Admin
Posts: 2717
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: Pure FB Runtime Library (in progress)

Postby coderJeff » Jan 04, 2018 17:55

Imortis wrote:I see this as a failing in the crt bindings we are using.

Yeah, with the dependency on c-runtime, for whatever build system you are in, need to know that crt/*.bi is mapping things correctly.

Code: Select all

'' in ./crt/stdio.bi
declare function ftell (byval as FILE ptr) as clong

notice the "clong" so have to look in ./crt/long.bi

Code: Select all

#pragma once

#if defined( __FB_64BIT__ ) and (not defined( __FB_WIN32__))
   '' On 64bit Linux/BSD systems (but not 64bit Windows), C's long is
   '' 64bit like FB's integer.
   '' Note: Using 64bit Integer here instead of LongInt, to match fbc's
   '' mangling: on 64bit Linux/BSD, Integer is mangled to C's long.
   type clong as integer
   type culong as uinteger
#else
   '' On 32bit systems and 64bit Windows, C's long is 32bit like FB's long.
   '' Note: Using 32bit Long here instead of 32bit/64bit Integer, because
   '' this is also used for 64bit Windows where Integer isn't 32bit.
   type clong as long
   type culong as ulong
#endif
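
The same point can be checked from C. A minimal sketch, with long_bits and fits_in_ftell as made-up helper names: since ftell returns a long, the largest offset it can report depends on how wide long is on the target, which is exactly what the clong mapping above encodes.

```c
#include <assert.h>
#include <limits.h>

/* Width of C's long in bits on the platform this is compiled for. */
static int long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Can ftell(), which returns long, report this file offset? */
static int fits_in_ftell(long long offset)
{
    assert(offset >= 0);
    return offset <= (long long)LONG_MAX;
}
```

On targets where long is 32-bit (32-bit systems and 64-bit Windows), fits_in_ftell reports that a 3 GiB offset does not fit, which is why a 64-bit variant is needed there.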

On win, I went to mingw headers to see what's going on there and ftello64 is actually an inline function that calls fgetpos() with typedef long long fpos_t; fgetpos(FILE*, fpos_t*)

That's as far as I got, so far. So looks like need to add an ftello64() to stdio.bi
marcov
Posts: 2645
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

Re: Pure FB Runtime Library (in progress)

Postby marcov » Jan 04, 2018 18:09

I checked the header of crt 10.0.10240.0 (comes with VS2015), and there is no ftell, but there is a ftelli64 that maps to _ftelli64_nolock

The fgetpos declaration is there too though, so that would be fine.
jj2007
Posts: 887
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Pure FB Runtime Library (in progress)

Postby jj2007 » Jan 04, 2018 18:26

marcov wrote:Just define ftello to ftell(o)(64) whatever returns an 64-bit off_t on a per target basis.

Can anybody show how to do that for Windows? This works, but how would one define size=ftello64(handle)?

Code: Select all

#include "windows.bi"

Dim as ulonglong fSize
Dim as handle fHandle

fHandle=CreateFile("fbc.exe", GENERIC_READ, 0, 0, OPEN_EXISTING, 0, 0)
GetFileSizeEx(fHandle, @fSize)
CloseHandle(fHandle)
print fSize
Sleep
Imortis
Posts: 1579
Joined: Jun 02, 2005 15:10
Location: USA
Contact:

Re: Pure FB Runtime Library (in progress)

Postby Imortis » Jan 04, 2018 18:28

coderJeff wrote:
Imortis wrote:I see this as a failing in the crt bindings we are using.

Yeah, with the dependency on c-runtime, for whatever build system you are in, need to know that crt/*.bi is mapping things correctly.

Code: Select all

'' in ./crt/stdio.bi
declare function ftell (byval as FILE ptr) as clong

notice the "clong" so have to look in ./crt/long.bi

Code: Select all

#pragma once

#if defined( __FB_64BIT__ ) and (not defined( __FB_WIN32__))
   '' On 64bit Linux/BSD systems (but not 64bit Windows), C's long is
   '' 64bit like FB's integer.
   '' Note: Using 64bit Integer here instead of LongInt, to match fbc's
   '' mangling: on 64bit Linux/BSD, Integer is mangled to C's long.
   type clong as integer
   type culong as uinteger
#else
   '' On 32bit systems and 64bit Windows, C's long is 32bit like FB's long.
   '' Note: Using 32bit Long here instead of 32bit/64bit Integer, because
   '' this is also used for 64bit Windows where Integer isn't 32bit.
   type clong as long
   type culong as ulong
#endif

On win, I went to mingw headers to see what's going on there and ftello64 is actually an inline function that calls fgetpos() with typedef long long fpos_t; fgetpos(FILE*, fpos_t*)

That's as far as I got, so far. So looks like need to add an ftello64() to stdio.bi


I gotcha. I was able to find the function and just convert it into an FB function. I added it as a "crt_extra/stdio.bi" in my source code:

Code: Select all

function ftello64 cdecl (stream as FILE ptr) as off64_t
   dim as fpos_t _pos
   if ( fgetpos(stream, @_pos) ) then
      return  -1
   else
      return (cast(off64_t, _pos))
   end if
end function


Thanks to both you and marcov for your help.
