Stueber wrote:I think I have a solution for the problem of translating examples from C to freebasic. As most libraries come with examples written in C it would be nice to have a c2fb. I already have a C parser but it does not do preprocessing. So I've got to figure out how to deal with #include and stuff like that.
Hi AGS, I had a lot of free time in the information science lesson today. I had the same idea. A c2fb converter would be the best solution. I already have a lexer and a parser for C, like you. Maybe we have suggestions for each other. My plan was:
[X] run the preprocessor over the sources
[X] Write a lexer with flex for C
[X] Write a parser with bison for C
[O] Create a tree structure of the parsed input << hardest part
[ ] Unparse the tree structure and emit FB code
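To make the last two steps of the plan concrete, here is a minimal sketch in C of a parse-tree node and an unparser that emits FreeBASIC text for expressions. The Node layout and the names fb_operator and unparse are my own assumptions, not part of any existing c2fb. Only the spelling of a few operators is mapped; semantic differences between the two languages are ignored.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical expression node: a leaf (identifier or literal) or a binary operation. */
typedef struct Node {
    const char *text;          /* leaf text, or the C operator for inner nodes */
    const struct Node *left;   /* NULL for leaves */
    const struct Node *right;  /* NULL for leaves */
} Node;

/* Map a C binary operator to its FreeBASIC spelling; unknown operators pass through. */
const char *fb_operator(const char *c_op)
{
    static const char *const map[][2] = {
        { "%", "Mod" }, { "!=", "<>" }, { "==", "=" },
        { "<<", "Shl" }, { ">>", "Shr" },
    };
    for (size_t i = 0; i < sizeof map / sizeof map[0]; i++)
        if (strcmp(c_op, map[i][0]) == 0)
            return map[i][1];
    return c_op;
}

/* Append the FreeBASIC text for n to out (a NUL-terminated buffer of cap bytes).
   Output is truncated rather than overflowing the buffer. */
void unparse(const Node *n, char *out, size_t cap)
{
    size_t used = strlen(out);
    if (n->left == NULL) {                       /* leaf */
        snprintf(out + used, cap - used, "%s", n->text);
        return;
    }
    snprintf(out + used, cap - used, "(");
    unparse(n->left, out, cap);
    used = strlen(out);
    snprintf(out + used, cap - used, " %s ", fb_operator(n->text));
    unparse(n->right, out, cap);
    used = strlen(out);
    snprintf(out + used, cap - used, ")");
}
```

Every binary node is parenthesised, so the emitter never has to reason about operator precedence in the target language. Start with an empty buffer (char buf[64] = "";) and call unparse on the root node.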
But the hardest part is the standard C library: there isn't an implementation of it for FB, yet every C program uses it.
The hardest part is the preprocessor constructs (macros, but also e.g. structured constants without a type attached), because many of these expressions need usage context to translate.
The problem is that C programs are digested in two steps (cpp + cc). A header translation works in principle in one step (.h/.c -> .xxx), so the information added during the second step (the C compiler, the usage) is not available.
Another way of stating this is that a C (.h) header is never really compiled without usage. It is kept as preprocessor state until it can be substituted into the source that uses it.
This makes writing a translator that understands nearly everything you throw at it impossible (especially for packages that use macros to deal with e.g. complex datatypes or pseudo-OOP).
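A small C sketch of the "no type until used" problem. The macro ORIGIN and both struct types are made up for illustration. The same token sequence initialises two unrelated types, and only the declaration at the point of use says which one is meant.

```c
#include <stddef.h>

/* A structured constant with no type attached (hypothetical example). */
#define ORIGIN { 0, 0 }

struct point      { int x, y; };
struct name_entry { const char *name; int type; };

/* Here ORIGIN is a pair of integers. */
int origin_x(void)
{
    struct point p = ORIGIN;
    return p.x;
}

/* Here the first 0 in ORIGIN becomes a null pointer instead. */
int name_is_null(void)
{
    struct name_entry e = ORIGIN;
    return e.name == NULL;
}
```

A header translator that sees only the #define line cannot tell whether to emit a point, a name_entry, or an integer array. That information exists only in the .c file that uses the macro.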
You can always try heuristic approaches, and in theory you could go through the full process (compile GTK apps in C), obtain usage parameters, and then use that information to generate a translation. Example: assume a parameterised macro is called once with X as the type of its first parameter, and once with Y. You can then generate two functions for it under the hood, one for X and one for Y.
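As a sketch of that idea, written in C for comparison: suppose the translator has seen a MAX macro used with int arguments and with double arguments. The macro and the generated names max_int and max_double are hypothetical.

```c
/* Original C macro: works for any operand type that supports '>'. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* What a translator could emit after observing MAX used with int
   and with double: one typed function per observed usage. */
int max_int(int a, int b)
{
    return a > b ? a : b;
}

double max_double(double a, double b)
{
    return a > b ? a : b;
}
```

Note one behavioural difference. The functions evaluate each argument exactly once, while the macro may evaluate one of them twice, so code that relies on side effects in the arguments would change meaning.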
You can even try to convert it to a generic, or to whatever macro facility of your own you have (even if it is not 1:1 C), and retain the ability to use the translated macro with your own types.
Long story short: fully automated translation is nearly impossible unless you effectively support C as a subset, preprocessor and all (give or take some token substitution). I do not know of any freely accessible codebases that manage this on random substantial headers.
For these reasons I would go in the direction of a customizable kit (OOP with virtual methods to override, or even scripting), so that when translating a large header set you can leverage whatever special information you know about it (if a macro matches this and that, then transform it to a function/generic/struct/class or whatever, or leave it for manual intervention). If you are really cool, you can record the decisions that are prompted and store them for later retranslations of the same header.
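A rough sketch of what such a kit could look like in plain C, with function pointers standing in for virtual methods. Everything here (Macro, Rule, Decision, decide and the three predicates) is invented for illustration. A real kit would need far richer macro information than a name and a body string.

```c
#include <stddef.h>

typedef enum { AS_CONSTANT, AS_FUNCTION, AS_MANUAL } Decision;

typedef struct {
    const char *name;    /* macro name */
    int has_params;      /* non-zero for function-like macros */
    const char *body;    /* replacement text */
} Macro;

/* A rule is a predicate plus the decision to take when it matches. */
typedef struct {
    int (*matches)(const Macro *m);
    Decision decision;
} Rule;

int is_function_like(const Macro *m)     { return m->has_params; }
int is_brace_initializer(const Macro *m) { return m->body[0] == '{'; }
int is_anything(const Macro *m)          { (void)m; return 1; }

/* Rules are tried in order; project-specific rules would go at the front. */
const Rule default_rules[] = {
    { is_function_like,     AS_FUNCTION },
    { is_brace_initializer, AS_MANUAL   },  /* needs usage context */
    { is_anything,          AS_CONSTANT },
};

/* Return the decision of the first matching rule, or AS_MANUAL if none match. */
Decision decide(const Macro *m, const Rule *rules, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (rules[i].matches(m))
            return rules[i].decision;
    return AS_MANUAL;
}
```

Overriding behaviour for one header set then means passing a different rule table to decide. Recording the decisions for later retranslations is not shown; it could be as simple as writing each macro name and chosen Decision to a file and consulting that file before the rules.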
A random example (from the FreeBSD headers):
#define CTL_P1003_1B_NAMES { \
{ 0, 0 }, \
{ "asynchronous_io", CTLTYPE_INT }, \
{ "mapped_files", CTLTYPE_INT }, \
{ "memlock", CTLTYPE_INT }, \
{ "memlock_range", CTLTYPE_INT }, \
{ "memory_protection", CTLTYPE_INT }, \
{ "message_passing", CTLTYPE_INT }, \
{ "prioritized_io", CTLTYPE_INT }, \
{ "priority_scheduling", CTLTYPE_INT }, \
{ "realtime_signals", CTLTYPE_INT }, \
{ "semaphores", CTLTYPE_INT }, \
{ "fsync", CTLTYPE_INT }, \
{ "shared_memory_objects", CTLTYPE_INT }, \
{ "synchronized_io", CTLTYPE_INT }, \
{ "timers", CTLTYPE_INT }, \
{ "aio_listio_max", CTLTYPE_INT }, \
{ "aio_max", CTLTYPE_INT }, \
{ "aio_prio_delta_max", CTLTYPE_INT }, \
{ "delaytimer_max", CTLTYPE_INT }, \
{ "mq_open_max", CTLTYPE_INT }, \
{ "pagesize", CTLTYPE_INT }, \
{ "rtsig_max", CTLTYPE_INT }, \
{ "nsems_max", CTLTYPE_INT }, \
{ "sem_value_max", CTLTYPE_INT }, \
{ "sigqueue_max", CTLTYPE_INT }, \
{ "timer_max", CTLTYPE_INT }, \
}
How to translate this without context info?