Text Transformation Language

dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Text Transformation Language

Post by dustinian »

Summary
  • Purpose: Text Transformation Language is a scripting language that uses interpreted commands to transform text files.
  • Author: Dustinian Camburides
  • Revision: 2.7
  • Updated: 7/24/2015
Details / Instructions
Downloads
Special thanks to counting_pine for help in the "Beginners" forum.
Last edited by dustinian on Aug 01, 2015 3:28, edited 3 times in total.
Rens
Posts: 256
Joined: Jul 06, 2005 21:09

Re: Text Transformation Language

Post by Rens »

Nice program, but some things are wrong.

If I use this:

test.ttl
REPLACE ALL WITH "*" FROM "start" TO "end"

testin.txt
start > < end

ttl.exe test.ttl testin.txt testout.txt

I get this error message:

LOADING FILE>>>
PARSING FILE>>>
Error 0000: Unrecognised Command REPLACE ALL WITH "*" FROM "start" to "end"

(I expected to see start * * end)

If I use this:

test.ttl
REPLACE EVERYTHING AFTER "b" to "f" WITH "x"

testin.txt
before

I get this:
bexore

Is this the right behavior? (I expected to get bxfore)

And if I use multiple replaces in the .ttl command file, things go very wrong (exception error raised by Windows).
For example, if I first replace a string and then use it in the next command, like this:
REPLACE "before" with "before_"
REPLACE EVERYTHING AFTER "b" TO "f" WITH "x"


I have not tested the other commands extensively.
I hope you can fix the errors, because this program could be very useful.
dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Re: Text Transformation Language

Post by dustinian »

Rens,

First of all, thank you for downloading, puzzling through my documentation, and giving TTL a shot! I'm pumped to get a reply here.

Secondly, let me confirm that none of what you described is expected behavior. Definitely defects. I don't have time to start debugging until tonight, but here's my resolution plan:
  • Attempt to re-create your issues as described below. Perhaps TTL has an issue with single-line or single-word input files. I use TTL several times a week, but I'm always working with fairly large files. If I cannot recreate...
  • Then it must be either: A) An environmental factor (i.e. the .EXE is too sensitive to OS or something), or B) an issue with the way it reads input files (perhaps the .EXE is too sensitive to charset). Either way, to figure this out, I would post a sample (working, on my machine) input and script file, and ask you to run them. If they work, it's something to do with the input file, and I'd ask for that input file so I can resolve those bugs. If they don't work, it's something to do with the environment and... I'm not sure how I'd tackle that.
I look forward to digging into this later today. Thanks again for taking the time to work with TTL, and double-thanks for taking the time to document the bugs so well!
dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Re: Text Transformation Language

Post by dustinian »

Rens,

After spending some time on this, I've discovered the issue: a few careless mistakes on my part. I'd made a few tweaks to the source code here and there over the last few weeks... and the tweaks always compiled, so I assumed all was well without testing.

I didn't notice the breaks, because I've been using an old version of the .EXE.

The issues you raised have been fixed. Fixes include:
  • Fixed parsing errors in REPLACE SUBSEQUENT and REPLACE BETWEEN.
  • Changed "REPLACE SUBSEQUENT" syntax to "REPLACE FIRST X AFTER Y WITH Z," better aligning the syntax with what actually happens.
I've uploaded version 1.9 of ttl.exe at http://www.dustinian.com/text.transform ... guage.html.

To facilitate my own regression testing in the future, and to better demonstrate an actual TTL script, I also added two sample files to that page. Scroll all the way to the bottom.

Please let me know if you see any additional errors. The only issue you mentioned previously that I didn't pin down was the crash when multiple "replace" statements were run. I couldn't re-create it. I'm hoping the above fixes addressed that, especially since the script I used to test had multiple replace statements.

And thanks again for reporting and documenting those defects!
Rens
Posts: 256
Joined: Jul 06, 2005 21:09

Re: Text Transformation Language

Post by Rens »

Hello Dustinian,

I'm glad to give you some feedback.

1) I noticed that this isn't working:

example.ttl
REPLACE "before" with "before_"

in.txt
before

After running:
ttl example.ttl in.txt out.txt

Output:
LOADING FILE>>>
PARSING FILE>>
EXECUTING SCRIPT>>>
Windows error message

After compiling as follows:
fbc -exx -s console "ttl.bas" "String Manipulation.bas" "String Array.bas" "Text File Input Output.bas"

After running:
ttl example.ttl in.txt out.txt

Error message:
Aborting due to runtime error 12 ("segmentation violation") in ttl.bas::EXECUTE_COMMAND()

And a stupid replacement:

example.ttl
replace "before" with "before"

in.txt
before

After running:
ttl example.ttl in.txt out.txt

LOADING FILE>>>
PARSING FILE>>
EXECUTING SCRIPT>>>
Nothing more happens!?

2) Why not use the common C-style parsing of command-line arguments (no commas)?

ttl example.ttl in.txt out.txt

The changes needed in your program would be:
strScriptPath = Command$(1)
strInputPath = Command$(2)
strOutputPath = Command$(3)

3) Why are you using -lang "QB" for compiling instead of the default fb compiling? You only have to change Text Transformation Language.fbp. It compiles fine as -lang fb without any changes to the files.

4) Maybe a stupid question

You have changed things:
Fixed parsing errors in REPLACE SUBSEQUENT and REPLACE BETWEEN.
Changed "REPLACE SUBSEQUENT" syntax to "REPLACE FIRST X AFTER Y WITH Z," better aligning the syntax with what actually happens.
What is the new syntax for this?

REPLACE ALL WITH "*" FROM "start" TO "end"
dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Re: Text Transformation Language

Post by dustinian »

Rens wrote: 1) I noticed that this isn't working:

example.ttl
REPLACE "before" with "before_"

in.txt
before

After running:
ttl example.ttl in.txt out.txt

Output:
LOADING FILE>>>
PARSING FILE>>
EXECUTING SCRIPT>>>
Windows error message

After compiling as follows:
fbc -exx -s console "ttl.bas" "String Manipulation.bas" "String Array.bas" "Text File Input Output.bas"

After running:
ttl example.ttl in.txt out.txt

Error message:
Aborting due to runtime error 12 ("segmentation violation") in ttl.bas::EXECUTE_COMMAND()
The "replace" command is recursive, so when your "replace" string (before_) contains your "find" string (before), it starts an infinite loop. I have a "planned enhancement" listed that would allow a user to make a "replace" that isn't recursive, but I'll add a "minor enhancement" to add an error message for now, something like "ERROR: The 'replace' sub-string contains the 'find' sub-string."
Rens wrote: And a stupid replacement:

example.ttl
replace "before" with "before"

in.txt
before

After running:
ttl example.ttl in.txt out.txt

LOADING FILE>>>
PARSING FILE>>
EXECUTING SCRIPT>>>
Nothing more happens!?
Same thing as above: your 'find' and 'replace' are the same, so it starts an infinite loop. The previous one causes an error because it adds 1 character (_) to the string on every pass until it exceeds the limits of the string. This one has the same problem, but doesn't error, because the string never gets any bigger... it just keeps going. I'll definitely add that error message.
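For anyone following along, the looping described above can be avoided by scanning forward and resuming the search after the text that was just inserted. The following is only a minimal sketch of that idea, not code from the TTL source: ReplaceForward is a hypothetical name, and unlike a recursive replace it never re-scans its own output.

```freebasic
' Hypothetical helper, not taken from the TTL source.
' Replaces every occurrence of find in text, resuming the search after
' each match so the loop terminates even when repl contains find.
Function ReplaceForward(ByVal text As String, ByVal find As String, ByVal repl As String) As String
    If Len(find) = 0 Then Return text   ' nothing to search for

    Dim result As String
    Dim startPos As Integer = 1
    Dim hit As Integer = InStr(startPos, text, find)

    Do While hit > 0
        ' Copy the untouched text before the match, then the replacement.
        result &= Mid(text, startPos, hit - startPos) & repl
        startPos = hit + Len(find)      ' continue after the matched text
        hit = InStr(startPos, text, find)
    Loop

    ' Append whatever follows the last match.
    Return result & Mid(text, startPos)
End Function

Print ReplaceForward("before", "before", "before_")
```

The trade-off is that a single pass will not collapse repeated patterns the way a recursive replace does, so the two behaviors are probably best kept as separate commands.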
Rens wrote: 2) Why not use the common C-style parsing of command-line arguments (no commas)?

ttl example.ttl in.txt out.txt

The changes needed in your program would be:
strScriptPath = Command$(1)
strInputPath = Command$(2)
strOutputPath = Command$(3)
Because I had no idea that was possible. That's awesome; I will test that. Thanks!
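A slightly fuller sketch of that suggestion, assuming the three variable names from Rens's post. The usage message and the exit code are illustrative assumptions, not TTL's actual behavior.

```freebasic
' Sketch only: variable names follow the suggestion quoted above;
' the usage text and exit code are assumptions.
Dim As String strScriptPath = Command(1)
Dim As String strInputPath  = Command(2)
Dim As String strOutputPath = Command(3)

' Command(n) returns an empty string when argument n is missing,
' so checking the last one is enough to detect a short command line.
If Len(strOutputPath) = 0 Then
    Print "Usage: ttl script.ttl input.txt output.txt"
    End 1
End If

Print "Script: "; strScriptPath
Print "Input:  "; strInputPath
Print "Output: "; strOutputPath
```

Invoked as `ttl example.ttl in.txt out.txt`, this would pick up the three paths without any comma-separated parsing.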
Rens wrote: 3) Why are you using -lang "QB" for compiling instead of the default fb compiling? You only have to change Text Transformation Language.fbp. It compiles fine as -lang fb without any changes to the files.
Because I started Text Transformation Language in QB64 quite some time ago. When I first converted, I thought perhaps -lang "QB" would get me to a state that would compile in FB more quickly, and then once I started compiling in FB, I forgot all about that. Does -lang "QB" hurt the executable? What are its implications?
Rens wrote: 4) Maybe a stupid question

You have changed things:
Fixed parsing errors in REPLACE SUBSEQUENT and REPLACE BETWEEN.
Changed "REPLACE SUBSEQUENT" syntax to "REPLACE FIRST X AFTER Y WITH Z," better aligning the syntax with what actually happens.
What is the new syntax for this?

REPLACE ALL WITH "*" FROM "start" TO "end"
That syntax remained the same. The syntax that went away was:
  • REPLACE EVERYTHING AFTER "precedant" TO "antecedant" WITH "add"
That syntax went away because:
  • Functionally, it would have done the same thing that 'REPLACE EVERYTHING BETWEEN "precedant" AND "antecedant"' already does.
  • I had that old syntax calling a function (Replace_Subsequent$) that actually did "REPLACE FIRST X AFTER Y WITH Z", so I just had to point the new syntax at the old function.
dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Re: Text Transformation Language

Post by dustinian »

I just released V2.0 of TTL, which includes the following changes:
  • Input validation to prevent infinite "Replace" commands where the "Find" sub-string appears in the "Replace" sub-string.
  • Command-line arguments are now read in the "C" style
Thanks to Rens; the above changes are a direct result of Rens' feedback.

Note that I have not migrated TTL away from the current -lang "QB". I attempted this, but I ran into quite a few errors when I created a new "Windows Console" project in FBEdit and imported my .BI and .BAS files.

The "build" statement resulted in numerous "Sub/Function was declared twice" warnings. At the end of these warnings, I get a "make successful" message, but no .EXE is created...

I'm still looking into this. In the meantime, I hope folks find this updated version useful.
dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Re: Text Transformation Language

Post by dustinian »

I just updated to version 2.2 of TTL. I added:
  • Token support
  • Compiled in -lang FB
I have what I think is a really, really basic question:
  • If you look at the source code, you'll see that I declare my subs/functions in .bi files, but I have the code for these subs/functions in separate .bas files. Is there a way I can get rid of one or the other? I'd love to have the declarations AND the code for those subs/functions in the same file, and I'd love for that file to be the .bi file, so that I could think of .bi files as almost .h files.
Thanks in advance for any help anyone can offer. I hope folks out there find TTL useful. I routinely use it to convert text to HTML, or to convert really terrible HTML into something more decent.
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: Text Transformation Language

Post by MrSwiss »

dustinian wrote:I declare my subs/functions in .bi files, but I have the code for these subs/functions in separate .bas files. Is there a way I can get rid of one or the other?
Yesss ... very easily at that: just add the code (in your .bas file) below the declarations in the .bi file. That's it ...
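As a rough illustration of that layout, here is a minimal sketch of a .bi file that carries both the declaration and the body. The file name greeting.bi, the guard macro, and the Greeting function are all hypothetical and not part of TTL.

```freebasic
' greeting.bi (hypothetical file): declaration and body in one place.
#ifndef GREETING_BI
#define GREETING_BI

Declare Function Greeting(ByVal who As String) As String

Function Greeting(ByVal who As String) As String
    Return "Hello, " & who & "!"
End Function

#endif

' In the main module this would be pulled in with:
'   #include once "greeting.bi"
Print Greeting("TTL")
```

One caveat worth keeping in mind: if such a file is included from several .bas modules that are compiled separately and then linked, each module gets its own copy of the function body, so duplicate-definition errors should be expected. The approach fits best when a single main module includes everything.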
D.J.Peters
Posts: 8586
Joined: May 28, 2005 3:28

Re: Text Transformation Language

Post by D.J.Peters »

@dustinian short suggestion

Joshy

Code:

#if defined(__FB_WIN32__) or defined(__FB_DOS__)
	Case "LINEBREAK"
		'Replace the word with a quoted CR+LF pair (Windows/DOS line ending):
		Command_Words(intWord) = Chr$(34) + Chr$(13) + Chr$(10) + Chr$(34)
#else ' __FB_LINUX__, __FB_UNIX__, __FB_FREEBSD__
	Case "LINEBREAK"
		'Replace the word with a quoted LF (Unix line ending):
		Command_Words(intWord) = Chr$(34) + Chr$(10) + Chr$(34)
#endif
dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Re: Text Transformation Language

Post by dustinian »

MrSwiss wrote:Yesss ... very easily at that: just add the code (in your .bas file) below the declarations in the .bi file. That's it ...
Well, that's... embarrassing! Thank you, MrSwiss! I feel like I tried that in the past, and it didn't work. Perhaps a remnant of -lang QB? Or perhaps a figment of my imagination. This has been fixed at the download link above. I have not yet committed the changes to GitHub, but that will be done by tonight. *EDIT: GitHub repository is up-to-date.
D.J.Peters wrote:@dustinian short suggestion
Thank you, D.J.Peters. I'm actually debating whether to keep the carriage return (13) in there at all, even for Windows. Aside from good, old typewriter tradition, is there any reason to keep the carriage return character under any circumstances?
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: Text Transformation Language

Post by MrSwiss »

I'm not able to comment on anything QB related, since I've never used it.

After coding in (IBM) ROM BASIC and GW-BASIC, I switched very quickly to the PowerBASIC compiler (successor of Borland TurboBASIC) because of the limits imposed by interpreted BASIC (early '90s), mainly because it made direct linking of ASM libraries possible (some 16/32-bit ASM hacking on DOS).
Later on: some VBA in Excel ... and similar stuff.
In FreeBASIC I've never attempted anything other than -lang "FB" (the default).

I love FreeBASIC because the 'old style' type identifiers are 'gone' ...

As for a reason to keep LF (line feed, Unix-style) or CRLF (DOS/Windows): I'd keep it. Many files are made human-readable that way, and even some hardware communication sometimes relies on these characters, e.g. RS232 end of transmission. I'm certain I've missed plenty of other issues ...
dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Re: Text Transformation Language

Post by dustinian »

I've updated the first post to reflect the recent update to 2.3. I've eliminated every issue that I know about, and I added a "Replace Once" command to allow for those scenarios where you don't need a recursive replace.

If you look at GitHub, you'll see I had a devil of a time messing around with trying to pull files in differently. Right now, I grab text out of the files with an inelegant (and slow) "While Not EOF Line Input." I tried to learn BOM and encoding to use WInput, but it just created too many issues for me downstream. I wound up reverting the whole thing back to my inelegant Line Inputs.
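For comparison with a Line Input loop, one common FreeBASIC idiom is to open the file in binary mode and read it into a single string with one Get. The sketch below is an assumption about how that could look, not what TTL does: ReadWholeFile is a hypothetical name, the bytes are returned unchanged (so BOM and line-ending handling are left to the caller), and any speed gain is untested against TTL.

```freebasic
' Hypothetical helper, not from the TTL source: reads an entire file
' into one string with a single Get instead of looping over Line Input.
Function ReadWholeFile(ByVal filePath As String) As String
    Dim buffer As String
    Dim f As Integer = FreeFile

    ' Open returns 0 on success; give back an empty string on failure.
    If Open(filePath For Binary Access Read As #f) <> 0 Then Return ""

    If LOF(f) > 0 Then
        buffer = Space(LOF(f))   ' size the buffer to the file length
        Get #f, , buffer         ' fill the whole buffer in one read
    End If

    Close #f
    Return buffer
End Function

' Print the number of bytes read from the file named on the command line.
Print Len(ReadWholeFile(Command(1)))
```

The returned string could then be split on line breaks in memory, which keeps the file handling in one place.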

I'll be extensively testing this in the coming weeks, as I have a few projects that really need some TTL TLC.

If you encounter any issues, please let me know! Feel free to reply here or raise the issues in GitHub!
dustinian
Posts: 14
Joined: Nov 12, 2005 5:02

Re: Text Transformation Language

Post by dustinian »

Updated the top post to reflect V2.7.

TTL is now 25% faster, and I added quite a few new commands. You can see TTL here: https://github.com/dustinian/ttl