Language Extension Through Preprocessing

rdc
Posts: 1741
Joined: May 27, 2005 17:22
Location: Texas, USA

Post by rdc »

I have been reading some material about the history of programming languages and have come across some interesting tidbits. One of them is language extension through macro languages. Fortran and C were both extended this way, and the m4 macro language was originally developed for this purpose:

http://en.wikipedia.org/wiki/M4_(computer_language)

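A preprocessor has also been used to add classes to ANSI C (as far as I can tell, the book linked here uses one written in awk rather than m4):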

http://www.planetpdf.com/codecuts/pdfs/ooc.pdf (pdf)

And of course early C++ (C with Classes) was implemented via a preprocessor.

We should be able to use a macro processor to implement things like inheritance and templates using standard FB code. It would require an additional compiling step, but might be worth the trouble.

For example, inheritance could be implemented using containment by just adding in the parent methods to the target class using the extended type syntax. It would not be the same as having it at the compiler level, but might simplify the coding process.
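
A minimal sketch of what such a tool might emit for containment-based inheritance, using only the extended type syntax. The names Parent, Child, parentPart and getId are made up for illustration, and the "class ... extends" input syntax is hypothetical, not something any existing tool accepts:

```freebasic
'' Hypothetical preprocessor input (not valid FB, shown only as a comment):
''   class Child extends Parent ... end class

type Parent
    declare function getId() as integer
    dim as integer id
end type

function Parent.getId() as integer
    return this.id
end function

'' What the tool might emit for Child: the parent is contained as a field,
'' and each parent method gets a generated forwarding stub.
type Child
    declare function getId() as integer   '' generated stub
    dim as Parent parentPart              '' contained parent (made-up field name)
    dim as integer extra
end type

function Child.getId() as integer
    '' forward the call to the contained parent
    return this.parentPart.getId()
end function

dim c as Child
c.parentPart.id = 7
print c.getId()
```

The stub only forwards the call. Nothing here makes a Child usable where a Parent is expected, so this covers code reuse but not polymorphism.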

Of course, how would this be implemented, since the compiler would choke on seeing the word "class"? Maybe use individual files and generate .bi files for the project? Wrap it in comments? I'm not sure about this aspect of it.
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands

Re: Language Extension Through Preprocessing

Post by marcov »

rdc wrote:
And of course early C++ (C with Classes) was implemented via a preprocessor.
Yes, but Cfront was a different kind of tool than m4. As far as I know it was also not meant as a timesaver, but mainly as a way to reuse the existing C compilers on the many different Unixes as much as possible.

I think it would actually have been easier to integrate it into just one C compiler than the route they took (but Stroustrup had his reasons, and I don't think these apply to FB). Moreover, C++ only really took off for production use after the Cfront period.
rdc wrote:
We should be able to use a macro processor to implement things like inheritance and templates using standard FB code. It would require an additional compiling step, but might be worth the trouble.

For example, inheritance could be implemented using containment by just adding in the parent methods to the target class using the extended type syntax. It would not be the same as having it at the compiler level, but might simplify the coding process.
There are multiple problems with that. No ability for metaclasses, no ability to test at runtime whether a class is a descendant (IS). No type checking whatsoever. And even basic operations (class instantiation, virtual methods and calling them, polymorphism so to say) become a problem.

Polymorphic use and class instantiation especially are a problem. Getting your types right will be hard, since the real compiler might complain that a *UDTDescendant is not compatible with a *UDTBaseclass, forcing you to rewrite and insert casts.

e.g. in OOP jargon it is perfectly fine to say

dim base1 as *UDTBaseClass
dim base2 as *UDTDescendant

base1=new(UDTDescendant)
base2=new(UDTDescendant)

new will probably be returning a UDTDescendant, so in the first case some cast will probably have to be inserted. In OOP languages the compiler knows this, but in the "front" case it does not.
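
To make that concrete, here is a small sketch in actual FB syntax, assuming the tool emulates inheritance by containment. The field name parentPart and the variable seen are made up for illustration:

```freebasic
type UDTBaseClass
    dim as integer id
end type

type UDTDescendant
    dim as UDTBaseClass parentPart   '' first field, so both pointers share one address
    dim as integer extra
end type

dim base1 as UDTBaseClass ptr
dim base2 as UDTDescendant ptr = new UDTDescendant

'' base1 = base2    '' fbc flags this: the two pointer types are unrelated to it
base1 = cast(UDTBaseClass ptr, base2)   '' the cast the tool would have to insert itself

base1->id = 42
dim seen as integer = base2->parentPart.id   '' same memory, seen through the other type
print seen

delete base2
```

To insert that cast automatically, the tool has to know the declared type of both sides of every assignment, which is exactly the type tracking a plain macro processor does not have.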

Therefore I have some doubts whether you can really pull this off without the preprocessor having a serious grasp of typing and semantics, which really complicates the preprocessor (turning it into a second compiler; it is easier to fix it in the primary one in the first place and save yourself the duplication of all the infrastructural work).

And without somewhat automated polymorphism, inheritance doesn't make sense at all.

Another problem is error reporting. Without the preprocessor fully understanding what it processes, error reporting is hard. Trying to add that to the preprocessor again pulls in the complexity of a compiler, and you are better off doing the work in the compiler itself.
rdc wrote:
Of course how would this be implemented since the compiler would choke on seeing the word class. Maybe use individual files and generate bi files for the project? Wrap it in comments? Not sure about this aspect of it.
That's no problem. The preprocessor doesn't have to output the word class; that is what preprocessors do. They process their own commands and only output stuff the compiler understands. I'm just afraid that in the case of inheritance the preprocessor needs to understand too much for it to be a shortcut.

=====================================

Limited forms of generics are a better fit though. A preprocessor approach is not entirely the same as the real thing, but it can be usable, and it was commonly done before generics became commonplace. Primitive generics/template implementations are often based on token replay.

It does mean though that if you have, say, a general vector datatype that you want to instantiate 5 times, you'll probably link the implementation 5 times. Not a problem for small-potatoes use, but with heavy use (STL) it really would blow up the binary.
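
A minimal sketch of token-replay generics with FB's existing #macro and ## token pasting. DEFINE_PAIR, the Pair types and PairMax are made-up names for illustration, not part of any library:

```freebasic
'' Each use of DEFINE_PAIR replays the same tokens with T substituted,
'' producing a separate type and a separate function per instantiation.
#macro DEFINE_PAIR(T)
type Pair##T
    dim as T x
    dim as T y
end type

function PairMax##T(byref p as Pair##T) as T
    if p.x > p.y then return p.x
    return p.y
end function
#endmacro

DEFINE_PAIR(integer)   '' emits Pairinteger and PairMaxinteger
DEFINE_PAIR(double)    '' emits Pairdouble and PairMaxdouble

dim pInt as Pairinteger
pInt.x = 3 : pInt.y = 9

dim pDbl as Pairdouble
pDbl.x = 2.5 : pDbl.y = 1.5

print PairMaxinteger(pInt)
print PairMaxdouble(pDbl)
```

Both instantiations compile their own copy of PairMax, which is the duplication described above. The compiler only sees the expanded text, so a mistake inside the macro body is reported against the expansion rather than against the "template".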

More modern generic implementations (C#, Delphi, Java) do type analysis (and provide constraints) to avoid this. I don't know how C++ handles this. (It is an older implementation, and a C++ class type is more like a UDT, while in the above languages it is a pointer to a UDT.)
agamemnus
Posts: 1842
Joined: Jun 02, 2005 4:48

Post by agamemnus »

Choke on the word "class"? I'm not sure why you would not use TYPE...END TYPE syntax for inheritance. This is Basic, not C++!
anonymous1337
Posts: 5494
Joined: Sep 12, 2005 20:06
Location: California

Post by anonymous1337 »

@agamemnus: In C++, the only difference between structs and classes is that classes have a different default member access level.

This is from v1ctor's TODO for FreeBASIC: http://fbc.svn.sourceforge.net/viewvc/f ... iew=markup

Java/Php5-ish syntax: CLASS INTERFACE EXTENDS IMPLEMENTS THROWS ABSTRACT
I'm all for a "class" keyword.

Post by agamemnus »

Why, though? Why not just continue to use TYPE? Why complicate things with 2 keywords when 1 will do?

Post by rdc »

agamemnus wrote:Choke on the word "class"? I'm not sure why you would not use TYPE...END TYPE syntax for inheritance. This is Basic, not C++!
No, you would have to output Type. I meant the source file. If you embed it into the source, you have to mask it out somehow. I guess a better approach would be to use individual files.

Post by rdc »

@marcov

Yeah, I have to agree with your analysis. As you say, the way to go though would be to just add it in the compiler. However, I have no real hope that we will ever see proper classes in FB, which is a shame as it would make many things much easier.
jevans4949
Posts: 1186
Joined: May 08, 2006 21:58
Location: Crewe, England

Post by jevans4949 »

rdc wrote:... If you embed it into the source, you have to mask it out somehow. I guess a better approach would be to use individual files.
But that's the whole point of (any) preprocessor; it turns your made-up bells & whistles into stuff the existing compiler can handle.

OTOH, as marcov remarks, you would not have the capacity for any run-time functionality relating to classes, beyond what is already in the compiler.

That said, an upgrade to the existing preprocessor would be cool - though probably a lot of hard work.

It might be easier to write a preprocessor in a script-processing language. For example, Mr Spikowski's wonderful ScriptBasic?

Post by marcov »

jevans4949 wrote:
That said, an upgrade to the existing preprocessor would be cool - though probably a lot of hard work.
To do what? I think the same time spent on the compiler is more worthwhile. Preprocessors are fundamentally limited, and thus more laborious.
jevans4949 wrote:
It might be easier to write a preprocessor in a script-processing language. For example, Mr Spikowski's wonderful ScriptBasic?
I don't see how writing a preprocessor in a script language would actually change anything.
VonGodric
Posts: 997
Joined: May 27, 2005 9:06
Location: London

Post by VonGodric »

I have been working on a precompiler for FB, and frankly it's not a simple task if done properly.

For one thing, you need to track scope to know when variables go out of scope, as well as symbol visibility.

You need to track all types, variable sizes, functions, files, etc. For example, what if an ordinary Type object is included in a class, or the other way around? You need to allocate and deallocate it properly.

You need to process the preprocessor and macros. That can't be done before compiling, for one because of FB's support for type checking in macros.

There are a number of downsides too:
- slow. double compilation, generates lot of code
- inevitable differences between two front ends (syntax deviations, ...)
- debugging is nearly impossible (true, #line and #file directives help)

On the whole I've decided it's a bad idea, and I am working on my own compiler.

Post by agamemnus »

Why not just work on the FB compiler instead?

Post by VonGodric »

Because I don't understand it. I've tried to modify it, and frankly either I am not smart enough or the source code stinks (sorry). There are few comments and next to no documentation about the internals... And secondly, a much bigger problem is that I use a Mac and FB doesn't work on OS X...
v1ctor
Site Admin
Posts: 3804
Joined: May 27, 2005 8:08
Location: SP / Bra[s]il

Post by v1ctor »

I added some support for inheritance. The changes are in the v0_22-inheritance branch.

It's WIP and I can't promise anything.

For now the implementation is really simple, but things like this already work:

type TObject
	public:
	declare sub thing()
	
	private:
	dim as byte unused
end type

	sub TObject.thing()
		print "TObject.thing()"
	end sub

type TBase extends TObject
	public:
	declare constructor(value as integer)
	
	public:
	declare sub thing()
	
	protected:
	dim as integer value
end type

	constructor TBase (value as integer)
		this.value = value
	end constructor

	sub TBase.thing()
		print "  TBase.thing()"
	end sub
	
type TDerived extends TBase
	public:
	declare constructor (value as integer)
	declare sub thing(value as integer)
end type

	constructor TDerived(v as integer)
		''base(value) --- can't call base ctor yet
		base.value = v 
		value = v  '' --- also works, base symbols were imported
	end constructor

	sub TDerived.thing(value as integer)
		''thing() --- shouldn't work, same as in C++
		base.thing()
		base.base.thing()
		print "    TDerived.thing(integer)"
	end sub

sub main
	var d = TDerived(1234)
	
	d.thing 1234  '' ok, accessing the overload
	
	''d.thing()  --- not allowed, can't access thing() because it was redefined in derived, same as in C++
	
	''d.base.thing() --- not allowed, "base" can't be accessed outside the derived class
	
	cast( TBase, d ).thing()
end sub


	main
Dr_D
Posts: 2451
Joined: May 27, 2005 4:59

Post by Dr_D »

oh, you... you had to come back and mess things up. :p

Post by rdc »

Wow. Completely unexpected but awesome.