About FreeBASIC headers

Stueber
Posts: 46
Joined: Nov 22, 2009 16:20

Post by Stueber »

AGS wrote: I think I have a solution for the problem of translating examples from C to FreeBASIC. As most libraries come with examples written in C, it would be nice to have a c2fb. I already have a C parser, but it does not do preprocessing. So I've got to figure out how to deal with #include and stuff like that.
Hi AGS, I had a lot of free time in the information science lesson today. I had the same idea. A c2fb converter would be the best solution. I already have a lexer and a parser for C, like you. Maybe we have suggestions for each other. My plan was:

[X] Write a lexer with flex for C
[X] Write a parser with bison for C
[O] Create a tree structure of the parsed input << hardest part
[ ] Unparse the tree structure and emit FB code

But the hardest part is the standard C library: there isn't an implementation of it for FB, yet every C program uses it.
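
As a rough illustration of the four steps in the plan above (lex, parse, build a tree, unparse to FB), here is a minimal sketch. It is written in Python for brevity rather than with flex/bison, it only covers a tiny expression subset, and every name in it (tokenize, parse, emit_fb, c_expr_to_fb) is made up for this example. The operator mapping (== to =, != to <>, && to AndAlso, || to OrElse) is an assumption about what a converter would want to emit; note that C comparisons yield 1 while FB comparisons yield -1, so the mapping is only safe where the result is used as a condition.

```python
import re

# One token per match: integer constant, identifier, or operator.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(==|!=|&&|\|\||[-+*()]))")
PRECEDENCE = {"||": 1, "&&": 2, "==": 3, "!=": 3, "+": 4, "-": 4, "*": 5}
FB_OPS = {"==": "=", "!=": "<>", "&&": "AndAlso", "||": "OrElse"}


def tokenize(src):
    """Step 1: turn the C source text into (kind, text) tokens."""
    src, pos, tokens = src.strip(), 0, []
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"cannot tokenize {src[pos:]!r}")
        number, ident, op = m.groups()
        if number is not None:
            tokens.append(("num", number))
        elif ident is not None:
            tokens.append(("id", ident))
        else:
            tokens.append(("op", op))
        pos = m.end()
    return tokens


def parse(tokens):
    """Steps 2 and 3: precedence-climbing parser that returns a small tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", "")

    def primary():
        nonlocal pos
        kind, text = peek()
        pos += 1
        if kind in ("num", "id"):
            return (kind, text)
        if (kind, text) == ("op", "("):
            inner = binary(1)
            if peek() != ("op", ")"):
                raise SyntaxError("expected ')'")
            pos += 1
            return inner
        raise SyntaxError(f"unexpected token {text!r}")

    def binary(min_prec):
        nonlocal pos
        left = primary()
        while peek()[0] == "op" and PRECEDENCE.get(peek()[1], 0) >= min_prec:
            op = peek()[1]
            pos += 1
            left = ("bin", op, left, binary(PRECEDENCE[op] + 1))
        return left

    tree = binary(1)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree


def emit_fb(node, nested=False):
    """Step 4: unparse the tree; nested operations are always parenthesised
    so the output does not rely on FB's precedence rules matching C's."""
    if node[0] in ("num", "id"):
        return node[1]
    _, op, left, right = node
    text = f"{emit_fb(left, True)} {FB_OPS.get(op, op)} {emit_fb(right, True)}"
    return f"({text})" if nested else text


def c_expr_to_fb(src):
    return emit_fb(parse(tokenize(src)))


print(c_expr_to_fb("code == 0"))
print(c_expr_to_fb("a == 0 && b != 1"))
```

Parenthesising every nested operation is a deliberately blunt design choice: it makes the output noisier but avoids having to prove that FB and C agree on operator precedence.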
Last edited by Stueber on Dec 08, 2010 16:22, edited 1 time in total.
Galeon
Posts: 563
Joined: Apr 08, 2009 5:30
Location: Philippines

Post by Galeon »

I want to work mainly for Linux because I don't like Windows. Planning to install other Linuxes here...
I want GTK+ as first priority. I recommend zlib, libzip, winapi (some fixes needed), glib and friends, sdl, allegro, etc. GTK+ and friends are very old and have a bad header tree. zlib is a small and well-used library that we could start with.
Seems that sir_mud already has a project like this? We could help him, or he could help us.
Also, headers should check for __FB_UNIX__ rather than __FB_LINUX__, so FreeBSD, NetBSD, etc. will be able to use them; they are all Unix-like.
I think FMOD is obsolete and has now been replaced by FModEx.
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

Post by AGS »

Galeon wrote:I want to work mainly for Linux because I don't like Windows. Planning to install other Linuxes here...
Great! You and others do Linux, me and others do Windows.
Galeon wrote: I want GTK+ as first priority. I recommend zlib, libzip, winapi (some fixes needed), glib and friends, sdl, allegro, etc. GTK+ and friends are very old and have a bad header tree. zlib is a small and well-used library that we could start with.
GTK+ as first priority: agreed.
zlib will be easy to do (small API).
Galeon wrote: Seems that sir_mud already has a project like this? We could help him, or he could help us.
A certain Galeon is already a member of the project ;) I think it'll be fine to post code at the place where sir_mud is at (http://code.google.com/p/freebasic-headers/).
That is, if I can join of course.
Galeon wrote: Also, headers should check for __FB_UNIX__ rather than __FB_LINUX__, so FreeBSD, NetBSD, etc. will be able to use them; they are all Unix-like.
I think FMOD is obsolete and has now been replaced by FModEx.
FModEx it will be then. Will there be one set of header files for all platforms? I can see in the win32 header files that there is some use of #if defined (__FB_LINUX__), suggesting that there is only one set of header files used for all OSes.

So to get priorities straight we:
- start with zlib and then
- continue with GTK+/GLIB

I found plenty of GTK+ examples in the Debian repositories. Which brings me to another point we might need to decide upon. How many examples do we consider 'enough'? I think it depends upon the size of the API.

With an API as big as GTK+, 38 examples is good (that's the number of examples in the Debian package). For a smaller library fewer would be okay (with a minimum of, say, 3 examples).

glib does not come with any examples, but luckily the GNOME developer's guide has some.

And a package like sqlite3 comes with a small API, but I would still write a lot of examples for it. Using a library like sqlite3 is not as straightforward as using some other libraries due to the API that's on offer (loadsa pointer stuff going on there).
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

Post by AGS »

Stueber wrote:
AGS wrote: I think I have a solution for the problem of translating examples from C to FreeBASIC. As most libraries come with examples written in C, it would be nice to have a c2fb. I already have a C parser, but it does not do preprocessing. So I've got to figure out how to deal with #include and stuff like that.
Hi AGS, I had a lot of free time in the information science lesson today. I had the same idea. A c2fb converter would be the best solution. I already have a lexer and a parser for C, like you. Maybe we have suggestions for each other. My plan was:

[X] Write a lexer with flex for C
[X] Write a parser with bison for C
[O] Create a tree structure of the parsed input << hardest part
[ ] Unparse the tree structure and emit FB code

But the hardest part is the standard C library: there isn't an implementation of it for FB, yet every C program uses it.
So you keep the calls to the C library. I'd mostly want to use c2fb for translating examples anyway (replacing calls to functions inside the standard C library would be great to have in some future version).

My plan was to get a C grammar from somewhere. I used the following one: http://www.lysator.liu.se/c/ANSI-C-grammar-y.html

And I fed that to dparser
http://dparser.sourceforge.net/

and used the output to parse C.

The parser built by dparser creates a parse tree for you. It's a concrete parse tree which retains all source info (including info on whitespace).

The parse tree is fairly ugly as the yacc grammar is one that does not use precedence/associativity for expressions. So you get the whole 17? levels of precedence when parsing an expression which looks something like this....
expression FALSE.
assignment_expression FALSE.
conditional_expression FALSE.
logical_or_expression FALSE.
logical_and_expression FALSE.
inclusive_or_expression FALSE.
exclusive_or_expression FALSE.
and_expression FALSE.
equality_expression FALSE.
relational_expression FALSE.
shift_expression FALSE.
additive_expression FALSE.
multiplicative_expression FALSE.
cast_expression FALSE.
unary_expression FALSE.
postfix_expression FALSE.
primary_expression FALSE.
IDENTIFIER FALSE.
[a-zA-Z0-9_]+ FALSE.
And something like code == 0 becomes
assignment_expression code == 0.
conditional_expression code == 0.
logical_or_expression code == 0.
logical_and_expression code == 0.
inclusive_or_expression code == 0.
exclusive_or_expression code == 0.
and_expression code == 0.
equality_expression code == 0.
equality_expression code.
relational_expression code.
shift_expression code.
additive_expression code.
multiplicative_expression code.
cast_expression code.
unary_expression code.
postfix_expression code.
primary_expression code.
IDENTIFIER code.
[a-zA-Z0-9_]+ code.
== ==.
relational_expression 0.
shift_expression 0.
additive_expression 0.
multiplicative_expression 0.
cast_expression 0.
unary_expression 0.
postfix_expression 0.
primary_expression 0.
CONSTANT 0.
[0-9]+((u|U|l|L|)*)? 0.
The thing now is to create a tree walker that visits all the nodes. The printouts above are just the output of 'printtree' which is a bit verbose. What's lacking is a symbol table and dparser gives you some tools to create one as well. You can create scopes and add variables to it, look up variables etc...
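
To show what such a tree walker might do with those long single-child chains, here is a minimal sketch in Python. It does not use dparser's actual data structures; the Node class, the chain helper and the collapse function are all invented for this illustration. The idea is simply to skip every node that has exactly one non-leaf child, so the 'code == 0' dump above shrinks to one equality_expression with three children.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                      # grammar symbol, e.g. "equality_expression"
    text: str                      # source text covered by the node
    children: list = field(default_factory=list)


def collapse(node):
    """Skip nodes whose only child is itself an inner node (unit productions)."""
    while len(node.children) == 1 and node.children[0].children:
        node = node.children[0]
    return Node(node.name, node.text, [collapse(c) for c in node.children])


def chain(names, text, leaf_pattern):
    """Build a single-child chain ending in a terminal (regex) leaf."""
    node = Node(leaf_pattern, text)
    for name in reversed(names):
        node = Node(name, text, [node])
    return node


UPPER = ["assignment_expression", "conditional_expression",
         "logical_or_expression", "logical_and_expression",
         "inclusive_or_expression", "exclusive_or_expression",
         "and_expression"]
LOWER = ["relational_expression", "shift_expression", "additive_expression",
         "multiplicative_expression", "cast_expression", "unary_expression",
         "postfix_expression", "primary_expression"]

# Rebuild the shape of the 'code == 0' printout by hand.
left = chain(["equality_expression"] + LOWER + ["IDENTIFIER"],
             "code", "[a-zA-Z0-9_]+")
right = chain(LOWER + ["CONSTANT"], "0", "[0-9]+((u|U|l|L|)*)?")
tree = Node("equality_expression", "code == 0",
            [left, Node("==", "=="), right])
for name in reversed(UPPER):
    tree = Node(name, "code == 0", [tree])

small = collapse(tree)
print(small.name, [(c.name, c.text) for c in small.children])
```

A real walker would probably want to keep some of the skipped nodes (a cast_expression with an actual cast, for instance, has more than one child and survives anyway), so this is only the simplest possible pruning rule.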

Some things to consider:
- switch statements: fallthrough cannot be simulated in FB with Select Case, so such a switch will have to become a series of if () then ... end if if () then ... end if blocks.
- macros: simple #include and #define do not pose a problem. It's the function-like #define name(arg1,arg2) macros with multi-line bodies that are hard to translate.
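
One possible shape for the switch translation, sketched in Python as a generator of FB text. The flag variable (fall_through), the enclosing Do ... Loop While 0 used so that break can become Exit Do, and the function name switch_to_fb are all assumptions made for this example, not something c2fb is known to do. The emitted FB has not been run through fbc, and the sketch only handles a 'default' label in last position.

```python
def switch_to_fb(selector, cases, flag="fall_through"):
    """Emit FB text for a C switch.

    cases is a list of (value, body_lines, has_break); value None means
    'default'. Once one case has matched, the flag keeps later If blocks
    firing until a break is reached, which mimics C fallthrough.
    """
    out = [f"Dim {flag} As Integer = 0", "Do"]
    for index, (value, body, has_break) in enumerate(cases):
        if value is None:
            if index != len(cases) - 1:
                raise NotImplementedError("sketch expects 'default' last")
            # Reached either by fallthrough or because nothing matched.
            out += [f"    {line}" for line in body]
            continue
        out.append(f"    If {flag} OrElse ({selector} = {value}) Then")
        out.append(f"        {flag} = -1")
        out += [f"        {line}" for line in body]
        if has_break:
            out.append("        Exit Do")
        out.append("    End If")
    out.append("Loop While 0")
    return "\n".join(out)


fb = switch_to_fb("n", [
    ("1", ['Print "one"'], False),    # case 1: falls through into case 2
    ("2", ['Print "two"'], True),     # case 2: ... break;
    (None, ['Print "other"'], True),  # default:
])
print(fb)
```

A switch without any fallthrough could still go to a plain Select Case; the flag scheme is only needed when at least one case lacks a break.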

I have yet to resolve the preprocessor issue. I tried adding #include et al. to the yacc grammar but dparser would not let me. I think it would be easier to preprocess all files without actually expanding macros (saving macro definitions instead), save the results of preprocessing in a database and then go ahead and parse the C files. Perhaps do some simple replacements (#define variable 0x100 --> #define variable &h100, #include <stdio.h> --> #include "crt/stdio.bi" etc...).
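
The simple replacements mentioned above could look roughly like this. It is a line-based Python sketch with invented names (translate_line, CRT_HEADERS); only the stdio.h mapping comes from the post, the other two table entries are added on the assumption that the matching crt/*.bi files exist in the FB distribution. It does not look inside string literals or comments, and it leaves suffixed constants such as 0x100UL alone.

```python
import re

# C header -> FB header; extend as needed (only stdio.h is from the thread).
CRT_HEADERS = {
    "stdio.h": "crt/stdio.bi",
    "stdlib.h": "crt/stdlib.bi",
    "string.h": "crt/string.bi",
}

INCLUDE_RE = re.compile(r"^\s*#\s*include\s*<([^>]+)>")
HEX_RE = re.compile(r"\b0[xX]([0-9a-fA-F]+)\b")


def translate_line(line):
    """Rewrite one preprocessor line; unknown includes are left untouched."""
    m = INCLUDE_RE.match(line)
    if m:
        target = CRT_HEADERS.get(m.group(1))
        return f'#include "{target}"' if target else line
    # 0x100 -> &h100 (FB hexadecimal literal syntax)
    return HEX_RE.sub(r"&h\1", line)


for c_line in ("#define variable 0x100",
               "#include <stdio.h>",
               "#include <gtk/gtk.h>"):
    print(translate_line(c_line))
```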

During parsing, when an identifier comes along, a lookup would be needed to see if it's a macro. I've got a scheme in mind to take care of more elaborate macro processing but I'm not sure if it'll work. Macro translation is going to be important. h2bi (by TJF) resolves some of the issues.

Or use the portable c compiler
http://pcc.zentus.com/cgi-bin/cvsweb.cgi/cc/ccom/

It's a 'real' C compiler that builds a tree, has a symbol table and has a clear front end/back end structure. PCC was created using flex/bison and works on windows/linux.

And then there is SDCC (uses flex/bison as well) http://sdcc.svn.sourceforge.net/viewvc/ ... /sdcc/src/
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

Post by AGS »

TJF wrote: For sure: examples are very important for newcomers. They will help them a lot to get started. Installing the binaries is often an issue as well.

From my point of view it's best to have experts for each operating system: one taking care of Linux issues, one of Windows issues and maybe another one of DOS. When a header is translated on one system, it should get tested by the other experts (or groups) on the other systems for fine-tuning. I.e. I'm not able to do thorough testing on Windows.
Yes, I agree. Headers should be tested on both Windows and Linux. I don't actually have a Linux box so testing on Linux is going to be a bit of a problem for me.

TJF wrote: Regarding the header updates, I think the situation isn't as bad as some may think after the posts above. I.e.:
* PDFLib (commercial lib) -> cairo-pdf (only password encryption is missing).
cairo-pdf is not comparable to PDFlib. PDFlib will let you do more things than cairo-pdf will (bookmarks, opening PDF files, etc.). Luckily PDFlib uses few macros.
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

Post by AGS »

I got the C parser to (partially) parse preprocessing symbols too.

I simply put the parser generator in the right 'mode' (= scannerless GLR) and all the problems went away. Using scannerless GLR does mean the parser accepts more than valid C (it will parse some invalid C programs as well), but as c2fb will only be given valid C programs that isn't a problem.

Parsing speed: 2000 lines/second (includes printing parsetree).

C grammar used
${declare longest_match pre_if_line}

translation_unit
: external_declaration
| translation_unit external_declaration
;

primary_expression
: IDENTIFIER
| CONSTANT
| STRING_LITERAL
| '(' expression ')'
;

postfix_expression
: primary_expression
| postfix_expression '[' expression ']'
| postfix_expression '(' ')'
| postfix_expression '(' argument_expression_list ')'
| postfix_expression '.' IDENTIFIER
| postfix_expression PTR_OP IDENTIFIER
| postfix_expression INC_OP
| postfix_expression DEC_OP
;

argument_expression_list
: assignment_expression
| argument_expression_list ',' assignment_expression
;

unary_expression
: postfix_expression
| INC_OP unary_expression
| DEC_OP unary_expression
| unary_operator cast_expression
| SIZEOF unary_expression
| SIZEOF '(' type_name ')'
;

unary_operator
: '&'
| '*'
| '+'
| '-'
| '~'
| '!'
;

cast_expression
: unary_expression
| '(' type_name ')' cast_expression
;

multiplicative_expression
: cast_expression
| multiplicative_expression '*' cast_expression
| multiplicative_expression '/' cast_expression
| multiplicative_expression '%' cast_expression
;

additive_expression
: multiplicative_expression
| additive_expression '+' multiplicative_expression
| additive_expression '-' multiplicative_expression
;

shift_expression
: additive_expression
| shift_expression '<<' additive_expression
| shift_expression '>>' additive_expression
;

relational_expression
: shift_expression
| relational_expression '<' shift_expression
| relational_expression '>' shift_expression
| relational_expression '<=' shift_expression
| relational_expression '>=' shift_expression
;

equality_expression
: relational_expression
| equality_expression '==' relational_expression
| equality_expression '!=' relational_expression
;

and_expression
: equality_expression
| and_expression '&' equality_expression
;

exclusive_or_expression
: and_expression
| exclusive_or_expression '^' and_expression
;

inclusive_or_expression
: exclusive_or_expression
| inclusive_or_expression '|' exclusive_or_expression
;

logical_and_expression
: inclusive_or_expression
| logical_and_expression '&&' inclusive_or_expression
;

logical_or_expression
: logical_and_expression
| logical_or_expression '||' logical_and_expression
;

conditional_expression
: logical_or_expression
| logical_or_expression '?' expression ':' conditional_expression
;

assignment_expression
: conditional_expression
| unary_expression assignment_operator assignment_expression
;

assignment_operator
: '='
| '*='
| '/='
| '%='
| '+='
| '-='
| '<<='
| '>>='
| '&='
| '^='
| '|='
;

expression
: assignment_expression
| expression ',' assignment_expression
;

constant_expression
: conditional_expression
;

declaration
: declaration_specifiers ';'
| declaration_specifiers init_declarator_list ';'
;

declaration_specifiers
: storage_class_specifier
| storage_class_specifier declaration_specifiers
| type_specifier
| type_specifier declaration_specifiers
| type_qualifier
| type_qualifier declaration_specifiers
;

init_declarator_list
: init_declarator
| init_declarator_list ',' init_declarator
;

init_declarator
: declarator
| declarator '=' initializer
;

storage_class_specifier
: TYPEDEF
| EXTERN
| STATIC
| AUTO
| REGISTER
;

type_specifier
: VOID
| CHAR
| SHORT
| INT
| LONG
| FLOAT
| DOUBLE
| SIGNED
| UNSIGNED
| struct_or_union_specifier
| enum_specifier
| IDENTIFIER
;

struct_or_union_specifier
: struct_or_union IDENTIFIER '{' struct_declaration_list '}'
| struct_or_union '{' struct_declaration_list '}'
| struct_or_union IDENTIFIER
;

struct_or_union
: STRUCT
| UNION
;

struct_declaration_list
: struct_declaration
| struct_declaration_list struct_declaration
;

struct_declaration
: specifier_qualifier_list struct_declarator_list ';'
;

specifier_qualifier_list
: type_specifier specifier_qualifier_list
| type_specifier
| type_qualifier specifier_qualifier_list
| type_qualifier
;

struct_declarator_list
: struct_declarator
| struct_declarator_list ',' struct_declarator
;

struct_declarator
: declarator
| ':' constant_expression
| declarator ':' constant_expression
;

enum_specifier
: ENUM '{' enumerator_list '}'
| ENUM IDENTIFIER '{' enumerator_list '}'
| ENUM IDENTIFIER
;

enumerator_list
: enumerator
| enumerator_list ',' enumerator
;

enumerator
: IDENTIFIER
| IDENTIFIER '=' constant_expression
;

type_qualifier
: CONST
| VOLATILE
;

declarator
: pointer direct_declarator
| direct_declarator
;

direct_declarator
: IDENTIFIER
| '(' declarator ')'
| direct_declarator '[' constant_expression ']'
| direct_declarator '[' ']'
| direct_declarator '(' parameter_type_list ')'
| direct_declarator '(' identifier_list ')'
| direct_declarator '(' ')'
;

pointer
: '*'
| '*' type_qualifier_list
| '*' pointer
| '*' type_qualifier_list pointer
;

type_qualifier_list
: type_qualifier
| type_qualifier_list type_qualifier
;


parameter_type_list
: parameter_list
| parameter_list ',' ELLIPSIS
;

parameter_list
: parameter_declaration
| parameter_list ',' parameter_declaration
;

parameter_declaration
: declaration_specifiers declarator
| declaration_specifiers abstract_declarator
| declaration_specifiers
;

identifier_list
: IDENTIFIER
| identifier_list ',' IDENTIFIER
;

type_name
: specifier_qualifier_list
| specifier_qualifier_list abstract_declarator
;

abstract_declarator
: pointer
| direct_abstract_declarator
| pointer direct_abstract_declarator
;

direct_abstract_declarator
: '(' abstract_declarator ')'
| '[' ']'
| '[' constant_expression ']'
| direct_abstract_declarator '[' ']'
| direct_abstract_declarator '[' constant_expression ']'
| '(' ')'
| '(' parameter_type_list ')'
| direct_abstract_declarator '(' ')'
| direct_abstract_declarator '(' parameter_type_list ')'
;

initializer
: assignment_expression
| '{' initializer_list '}'
| '{' initializer_list ',' '}'
;

initializer_list
: initializer
| initializer_list ',' initializer
;

statement
: labeled_statement
| compound_statement
| expression_statement
| selection_statement
| iteration_statement
| jump_statement
;

labeled_statement
: IDENTIFIER ':' statement
| CASE constant_expression ':' statement
| DEFAULT ':' statement
;

compound_statement
: '{' '}'
| '{' statement_list '}'
| '{' declaration_list '}'
| '{' declaration_list statement_list '}'
;

declaration_list
: declaration
| declaration_list declaration
;

statement_list
: statement
| statement_list statement
;

expression_statement
: ';'
| expression ';'
;

selection_statement
: IF '(' expression ')' statement
| IF '(' expression ')' statement ELSE statement
| SWITCH '(' expression ')' statement
;

iteration_statement
: WHILE '(' expression ')' statement
| DO statement WHILE '(' expression ')' ';'
| FOR '(' expression_statement expression_statement ')' statement
| FOR '(' expression_statement expression_statement expression ')' statement
;

jump_statement
: GOTO IDENTIFIER ';'
| CONTINUE ';'
| BREAK ';'
| RETURN ';'
| RETURN expression ';'
;

external_declaration
: function_definition
| declaration
| preprocessor
;

function_definition
: declaration_specifiers declarator declaration_list compound_statement
| declaration_specifiers declarator compound_statement
| declarator declaration_list compound_statement
| declarator compound_statement
;

preprocessor
: '#'
( pre_include_
| pre_define_
| pre_if_line
| pre_else_
| pre_elif_
| pre_endif_
| pre_line_
| pre_undef_
| pre_pragma_
)
;

pre_include_ : 'include' ("<[^>]+>" | "\"[^\"]+\"");
pre_define_
: 'define' IDENTIFIER '('identifier_list')' "[^\n]*\n"
| 'define' IDENTIFIER CONSTANT "[^\n]*\n"
;

pre_else_ : 'else' "[^\n]*\n";
pre_elif_ : 'elif' "[^\n]*\n";
pre_endif_ : 'endif' "[^\n]*\n";
pre_if_line : pre_if_ | pre_ifdef_ | pre_ifndef_;
pre_line_ : 'line' CONSTANT;
pre_undef_ : 'undef' IDENTIFIER;
pre_pragma_ : 'pragma' "[^\n]*\n";
pre_if_ : 'if' "[^\n]+\n";
pre_ifdef_ : 'ifdef' "[^\n]+\n";
pre_ifndef_ : 'ifndef' "[^\n]+\n";

GOTO : 'goto';
CONTINUE : 'continue';
BREAK : 'break';
RETURN : 'return';
WHILE : 'while';
PTR_OP : '->';
SWITCH : 'switch';
CASE : 'case';
IF : 'if';
ELSE : 'else';
FOR : 'for';
DO : 'do';
INC_OP : '++';
DEC_OP : '--';
DEFAULT : 'default';
ELLIPSIS : '...';
CONST : 'const';
VOLATILE : 'volatile';
ENUM : 'enum';
TYPEDEF: 'typedef';
EXTERN : 'extern';
STATIC : 'static';
AUTO : 'auto';
REGISTER : 'register';
CHAR : 'char';
SHORT : 'short';
INT : 'int';
FLOAT : 'float';
VOID : 'void';
DOUBLE : 'double';
UNSIGNED : 'unsigned';
SIGNED : 'signed';
INLINE : 'inline';
LONG : 'long';
STRUCT : 'struct';
UNION : 'union';
SIZEOF : 'sizeof';

STRING_LITERAL : "[a-zA-Z]?\"[^\"]*\"" $term -1;
CONSTANT : "0[xX][a-fA-F0-9]+((u|U|l|L)*)?"
| "0[0-9]+((u|U|l|L)*)?"
| "[0-9]+((u|U|l|L|)*)?"
| "[a-zA-Z_]?\'(\\.|[^\\'])+\'"
| "[0-9]+[Ee][+-]?[0-9]+[fFlL]?"
| "[0-9]*\.[0-9]+([Ee][+-]?[0-9]+)?(f|F|l|L)?"
| "[0-9]+\.[0-9]*([Ee][+-]?[0-9]+)?(f|F|l|L)?"
$term -2;
IDENTIFIER : "[a-zA-Z0-9_]+" $term -6;
Galeon
Posts: 563
Joined: Apr 08, 2009 5:30
Location: Philippines

Post by Galeon »

Wow, seems like this project will come true :).
I'm the Galeon who is a member of http://code.google.com/p/freebasic-headers/. I registered to translate the GTK+ headers and friends (that is in the Libraries forum).
Sebastian
Posts: 131
Joined: Jun 18, 2005 14:01
Location: Europe / Germany

Post by Sebastian »

Of course I didn't want to urge you to consider a project like the one I suggested. It was just an idea, my two cents. ;-)
TJF wrote: Still, I think hosting headers on some web-site (i.e. on the AGS web-site) will be an intermediate solution. The final solution should be a portal (platform) for the headers, like Sebastian proposed. I'll support him as well as possible to put his idea into practice.
Thanks a lot for your support and the kind words!

But actually I can't spend dozens of hours on developing such a platform if most of the potential users refuse to use it later. ;-)
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Post by TJF »

Sebastian wrote:Thanks a lot for your support and the kind words!
No support and no kind words yet. Your proposal is good!
Sebastian wrote: But actually I can't spend dozens of hours on developing such a platform if most of the potential users refuse to use it later. ;-)
10 months ago, I needed a GTK+ headers update and some bugfixes. I found out that I'm too stupid to use SWIG, so I made my personal tool for header translations. Now it's called h_2_bi.bas; some users know and use it, and some additional headers have been translated in the meantime.

Kind words: keep on thinking about your header-portal idea; you'll find some time to realise it in the future. Users will find and really appreciate your portal once it is realised!
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Post by TJF »

I thought about this D.J.Peters topic and I remembered an issue that we should discuss here. I assume DJP got in trouble with the following:

To prevent the same header from being included twice, most C headers use guard symbols. The code looks like
#ifndef __GTK_H__
#define __GTK_H__

/* ... */

#endif
In the original FB-GTK headers (gtk/gtk.bi) the symbols are renamed and the translation is:

#IFNDEF __GTK_BI__
#DEFINE __GTK_BI__

/' ... '/

#ENDIF
I see no advantage in renaming __GTK_H__ to __GTK_BI__. Instead, you'll have more work at translation time and more trouble later on when you translate C examples that check these symbols. You'll have to keep this renaming in mind.

From my point of view, it's best to change as little as possible. That means staying with the original symbol name, which is what I did in GTK-2.22.0_TJF.bi.

But this has the disadvantage that you cannot mix the ..._TJF.bi headers with the original FB headers. (You'll get duplicated definition errors when compiling.)


Conclusion:

A rule is needed on how to translate these symbols in the next header generation:
a) renaming (compatible, but more work for every new header and every new example)
b) no renaming (incompatible in the interim phase, less work further on, easier to use)
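
Whichever rule is chosen, a translation tool can apply it mechanically. The Python sketch below detects the guard symbol (the first #ifndef X that is directly followed by #define X) and either keeps it or renames a trailing _H to _BI. The function names and the rename flag are hypothetical; they only illustrate options a) and b) and are not how h_2_bi works.

```python
import re

# First '#ifndef X' directly followed by '#define X' is taken as the guard.
GUARD_RE = re.compile(
    r"^\s*#\s*ifndef\s+(\w+)\s*\n\s*#\s*define\s+(\w+)\s*$", re.MULTILINE)


def find_guard(header_text):
    """Return the include-guard symbol of a C header, or None."""
    m = GUARD_RE.search(header_text)
    if m and m.group(1) == m.group(2):
        return m.group(1)
    return None


def fb_guard_name(c_name, rename):
    """Option a) rename=True: __GTK_H__ -> __GTK_BI__; option b): keep it."""
    if not rename:
        return c_name
    return re.sub(r"_H(_*)$", r"_BI\1", c_name)


header = "#ifndef __GTK_H__\n#define __GTK_H__\n\n/* ... */\n\n#endif\n"
guard = find_guard(header)
print(guard, fb_guard_name(guard, rename=True), fb_guard_name(guard, rename=False))
```

If option a) is chosen, the same mapping has to be applied when translating C examples that test the guard symbol, which is exactly the extra bookkeeping the post warns about.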
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

Post by AGS »

You'll have to detect what comes after the #ifdef (#ifndef). If it's an intrinsic define, a program could replace it with something else.

For example WIN32
#ifndef WIN32
could be translated like so
#ifndef __FB_WIN32__
Same thing with #ifdef
#ifdef __cplusplus
extern "C" {
#endif
// lots of stuff
#ifdef __cplusplus
}
#endif
The number of 'standard' intrinsic defines is (fortunately) not infinite.

The C99 standard does not include that many intrinsic defines. Standard (or non - standard) C defines might (or might not), after possible translation, make sense in a FB header file.

__cplusplus does not make that much sense in FB (could use extern "C" .. end extern) but something like WIN32 (__FB_WIN32__) might make sense.
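
Because the list of intrinsic defines is finite, a lookup table is probably enough. A minimal Python sketch follows; the table and the function name translate_conditional are invented here. Only the WIN32 entry comes from this post; _WIN32, __linux__ and __unix__ are common C compiler defines added as assumptions, mapped to the FB intrinsics mentioned elsewhere in the thread. __cplusplus blocks are not handled.

```python
import re

# C intrinsic define -> FB intrinsic define (only WIN32 is from the thread).
INTRINSICS = {
    "WIN32": "__FB_WIN32__",
    "_WIN32": "__FB_WIN32__",
    "__linux__": "__FB_LINUX__",
    "__unix__": "__FB_UNIX__",
}

COND_RE = re.compile(r"^(\s*#\s*(?:ifdef|ifndef)\s+)(\w+)(.*)$")


def translate_conditional(line):
    """Replace a known intrinsic after #ifdef/#ifndef; leave the rest alone."""
    m = COND_RE.match(line)
    if not m:
        return line
    head, name, tail = m.groups()
    return head + INTRINSICS.get(name, name) + tail


for c_line in ("#ifndef WIN32", "#ifdef __linux__", "#ifndef __GTK_H__"):
    print(translate_conditional(c_line))
```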

Translating c header files to freebasic header files is big fun :)
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Post by TJF »

@AGS:

Some misunderstanding here, sorry:
AGS wrote: You'll have to detect what's after the #ifdef (#ifndef). If it's an intrinsic define a program could replace it with something else.
I mean the symbol at the start of the headers. Between the #ifndef and the #endif is the complete header content (i.e. have a look at gtk/gtk.bi).
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands

Post by marcov »

Stueber wrote:
AGS wrote: I think I have a solution for the problem of translating examples from C to FreeBASIC. As most libraries come with examples written in C, it would be nice to have a c2fb. I already have a C parser, but it does not do preprocessing. So I've got to figure out how to deal with #include and stuff like that.
Hi AGS, I had a lot of free time in the information science lesson today. I had the same idea. A c2fb converter would be the best solution. I already have a lexer and a parser for C, like you. Maybe we have suggestions for each other. My plan was:
[X] run the preprocessor over the sources
[X] Write a lexer with flex for C
[X] Write a parser with bison for C
[O] Create a tree structure of the parsed input << hardest part
[ ] Unparse the tree structure and emit FB code

But the hardest part is the standard C library: there isn't an implementation of it for FB, yet every C program uses it.
The hardest part is the preprocessor constructs (macros, but also e.g. structured constants without a type attached): many of these expressions need (usage) context to translate.

The problem is C programs are digested in two steps (cpp + cc). A header translation works in principle in one step (.h/.c - > .xxx). But information added during the second step (the C compiler, the usage) is not available.

Another way of stating this is that a C (.h) header is never really compiled without usage. It is kept as preprocessor state till it can be substituted in the source that uses it.

This makes writing a translator that understands nearly everything you throw at it impossible (especially for packages that use macros to deal with e.g. complex datatypes or pseudo-OOP).

You can always try heuristic approaches, and in theory you could go through the full process (compile GTK apps in C), obtain usage parameters, and then use the found info to generate a translation. Example: assume a parameterised macro is called once with X as the type of the first parameter, and once with Y. You can then generate two functions for it under the hood, one with X and one with Y.

You can even try to convert it to a generic, or to whatever macro facility of your own you do have (even if it is not 1:1 C), and retain the ability to also use the translated macro with your own types.

Long story short: fully automated translation is nearly impossible, unless you effectively support C as a subset, preprocessor and all (give or take some token substitution). I do not know of any freely accessible codebases that manage this on arbitrary substantial headers.

For these reasons I would go in the direction of a customizable kit (OOP with virtual methods to override, or even scripting), so that when translating a large header set you can leverage whatever special information you have (if a macro matches this and that, then transform it to a function/generic/struct/class or whatever, or leave it for manual intervention). If you are really cool, you can record the decisions that are prompted for and store them for later retranslations of the same header.
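
To make the "customizable kit" idea a bit more concrete, here is a minimal Python sketch: pluggable rules are tried in order, anything unmatched is handed to a prompt callback, and the answers are recorded so a later retranslation does not ask again. The class name MacroTranslator, its methods and the JSON storage format are all invented for this illustration.

```python
import json


class MacroTranslator:
    """Rule-based macro translation with recorded manual decisions."""

    def __init__(self, decisions=None):
        self.rules = []                         # (predicate, handler) pairs
        self.decisions = dict(decisions or {})  # macro name -> FB text

    def add_rule(self, predicate, handler):
        self.rules.append((predicate, handler))

    def translate(self, name, body, ask=None):
        # 1. A decision recorded in an earlier run wins.
        if name in self.decisions:
            return self.decisions[name]
        # 2. Otherwise the first matching rule translates the macro.
        for predicate, handler in self.rules:
            if predicate(name, body):
                return handler(name, body)
        # 3. No rule matched: prompt if possible, else leave a marker.
        if ask is None:
            return f"' TODO: translate macro {name} by hand"
        answer = ask(name, body)
        self.decisions[name] = answer
        return answer

    def dump_decisions(self):
        """Serialise the recorded decisions for the next translation run."""
        return json.dumps(self.decisions, indent=2, sort_keys=True)


kit = MacroTranslator()
# Example rule: a macro whose body is a plain decimal number becomes a Const.
kit.add_rule(lambda name, body: body.strip().isdigit(),
             lambda name, body: f"Const {name} = {body.strip()}")

print(kit.translate("MAX_ITEMS", "64"))
print(kit.translate("SWAP", "do { ... } while (0)",
                    ask=lambda name, body: "' SWAP: written by hand"))
print(kit.dump_decisions())
```

Checking recorded decisions before the rules is a design choice: it lets a human override a rule that guessed wrong for one particular macro.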

A random example: (from FreeBSD headers)

#define CTL_P1003_1B_NAMES { \
        { 0, 0 }, \
        { "asynchronous_io", CTLTYPE_INT }, \
        { "mapped_files", CTLTYPE_INT }, \
        { "memlock", CTLTYPE_INT }, \
        { "memlock_range", CTLTYPE_INT }, \
        { "memory_protection", CTLTYPE_INT }, \
        { "message_passing", CTLTYPE_INT }, \
        { "prioritized_io", CTLTYPE_INT }, \
        { "priority_scheduling", CTLTYPE_INT }, \
        { "realtime_signals", CTLTYPE_INT }, \
        { "semaphores", CTLTYPE_INT }, \
        { "fsync", CTLTYPE_INT }, \
        { "shared_memory_objects", CTLTYPE_INT }, \
        { "synchronized_io", CTLTYPE_INT }, \
        { "timers", CTLTYPE_INT }, \
        { "aio_listio_max", CTLTYPE_INT }, \
        { "aio_max", CTLTYPE_INT }, \
        { "aio_prio_delta_max", CTLTYPE_INT }, \
        { "delaytimer_max", CTLTYPE_INT }, \
        { "mq_open_max", CTLTYPE_INT }, \
        { "pagesize", CTLTYPE_INT }, \
        { "rtsig_max", CTLTYPE_INT }, \
        { "nsems_max", CTLTYPE_INT }, \
        { "sem_value_max", CTLTYPE_INT }, \
        { "sigqueue_max", CTLTYPE_INT }, \
        { "timer_max", CTLTYPE_INT }, \
}
How to translate this without context info?
sir_mud
Posts: 1401
Joined: Jul 29, 2006 3:00
Location: US

Post by sir_mud »

marcov wrote: How to translate this without context info?
Without context I can tell it is a macro for defining an array of type( char *, integer? ); with context and human intervention this macro would be very similar in the FB version. I welcome all to the freebasic-headers project on googlecode; information on applying for the project is here: http://freebasic-headers.googlecode.com/wiki/HelpWanted. It's been a while since I've updated anything (I've been moving and changing careers), but I'm back and will be starting to work on it again very soon. I'm also going to go back over this thread with a fine-toothed comb so I can gather the great ideas here. You can find me on IRC at night (US) or you can always send me an email (I get them on my phone).
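
For what it's worth, the purely textual part of that translation is small. The Python sketch below joins the continued lines and turns each inner { a, b } into ( a, b ), on the assumption that the FB side would initialise an array of a two-field type with { (a, b), ... }. The function name is made up, the result is emitted on one line, and this is not a description of what h_2_bi produces. What it cannot decide is the context question: depending on how the element type is declared in FB (a ZString Ptr field, for instance), the string literals may need an @ prefix.

```python
import re

ENTRY_RE = re.compile(r"\{\s*([^{}]*?)\s*\}")


def translate_table_macro(c_macro):
    """Turn '#define NAME { {a, b}, ... }' into '#define NAME { (a, b), ... }'.

    Assumes no braces inside string literals and exactly one nesting level.
    """
    text = c_macro.replace("\\\n", " ")          # join continued lines
    m = re.match(r"\s*#\s*define\s+(\w+)\s+\{(.*)\}\s*$", text, re.DOTALL)
    if m is None:
        raise ValueError("not a brace-initialiser macro")
    name, body = m.groups()
    entries = ", ".join(f"({entry})" for entry in ENTRY_RE.findall(body))
    return f"#define {name} {{ {entries} }}"


sample = ('#define CTL_P1003_1B_NAMES { \\\n'
          '        { 0, 0 }, \\\n'
          '        { "asynchronous_io", CTLTYPE_INT }, \\\n'
          '        { "mapped_files", CTLTYPE_INT }, \\\n'
          '}')
print(translate_table_macro(sample))
```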
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Post by TJF »

sir_mud wrote:
marcov wrote:How to translate this without context info?
Without context I can tell it is a macro for defining an array of type( char *, integer? ); with context and human intervention this macro would be very similar in the FB version.
I agree. The macro can be translated with h_2_bi without any manual fine-tuning and without any entry in the config file.
sir_mud wrote: I welcome all to the freebasic-headers project on googlecode; information on applying for the project is here: http://freebasic-headers.googlecode.com/wiki/HelpWanted. It's been a while since I've updated anything (I've been moving and changing careers), but I'm back and will be starting to work on it again very soon. I'm also going to go back over this thread with a fine-toothed comb so I can gather the great ideas here. You can find me on IRC at night (US) or you can always send me an email (I get them on my phone).
I had a look at your project. To me it looks like it's at a very early stage. It needs much more than providing the name of one file to download.

I'm looking for a portal that
- serves more than one header,
- shows some text description for each header,
- shows a logo (and maybe more pictures) to give users an idea of the download's possible use,
- serves the latest version and older ones as well ...
Have a look at http://www.freebasic-portal.de/downloads/bibliotheken/ and you'll get an idea of what I mean.

Additionally we need a place to collect comments on each header and another place for general discussions (as Sebastian wrote). And some installers for the different platforms may be useful, including header, binary and maybe some examples as well.

When you tell us that you have realised 50 % of these needs, I'll have a look at freebasic-headers.googlecode.com again.