Regular expressions

dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Regular expressions

Post by dodicat »

Put libtre.a
https://github.com/jkitchin/emacs-win/b ... b/libtre.a
into your lib folder (I used 64-bit FB here).
Run the examples.
Here are my results for match.bas from the examples
\examples\regex\TRE\match.bas

Code: Select all

<foo>
<_bar>
<foo123>
<BAR>
<Foo__>
  
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: Regular expressions

Post by jj2007 »

newbieforever wrote:So I am searching for three-character substrings beginning with d and ending with f. In string 'abcdefgh' this would be the substring 'def'; the same for the string 'addddbcdefgh'.
Regex is really overkill here. A simple instr() combined with a check for the character "f" at position n+2 will do the job:

Code: Select all

Dim as string cont="addddbcdefgh"
Dim as integer vp, posLeft, TheChar

Do
	posLeft=instr(posLeft+1, cont, "d")	' get next d
	if posLeft=0 Then Exit Do
	vp=Peek(Integer, Varptr(cont))
	TheChar=Peek(ubyte, vp+posLeft)
	' print "The char: ", TheChar
Loop Until TheChar=101	' f

print "PosL=";posLeft, TheChar
sleep()
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Regular expressions

Post by fxm »

@jj2007,
We search for "d.f" and not "de" (101 = Asc("e")).
The right code could be:

Code: Select all

Dim As String cont="addddbcdefgh"
Dim As Integer posLeft

Do
   posLeft=Instr(posLeft+1, cont, "d")   ' get next d
   If posLeft=0 Then Exit Do
   If posLeft > Len(cont)-2 Then posLeft= 0 : Exit Do
Loop Until cont[posLeft+1]=Asc("f")

Print "PosL=";posLeft, Mid(cont, posLeft, 3)

Sleep
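As a sketch only, the same loop can be wrapped in a reusable function. FindTriple is a hypothetical name introduced here for illustration; it returns the 1-based position of the first three-character match, or 0 when there is none, and keeps the length test that guards the [] access.

```freebasic
' Hypothetical helper (not from the thread): 1-based position of the first
' three-character substring that starts with first_ch and ends with last_ch,
' or 0 if there is none.
Function FindTriple(ByRef s As String, ByRef first_ch As String, ByRef last_ch As String) As Integer
    Dim As Integer p = 0
    Do
        p = InStr(p + 1, s, first_ch)           ' next candidate start (1-based)
        If p = 0 Then Return 0                  ' no more candidates
        If p > Len(s) - 2 Then Return 0         ' no room for two more characters
        If s[p + 1] = Asc(last_ch) Then Return p   ' [] is 0-based: third character
    Loop
End Function

Print FindTriple("abcdefgh", "d", "f")        ' 4
Print FindTriple("addddbcdefgh", "d", "f")    ' 8
Print FindTriple("abcd", "d", "f")            ' 0
```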
Last edited by fxm on Jun 30, 2018 8:11, edited 1 time in total.
newbieforever
Posts: 117
Joined: Jun 21, 2018 11:14

Re: Regular expressions

Post by newbieforever »

Merci, grazie, thank you, fxm, jj2007, dodicat, MrSwiss and others!

I understand now how solutions can be found without regex, and I will use these tips extensively.

But there will be situations where I have to search for e.g. '(.*)' (a substring starting and ending with brackets with any number of characters in the middle). Without regex this would become too complicated, right?

MrSwiss: Thank you, I finally found it. But at the moment I cannot work out how the library has to be prepared for use (which seems to be necessary).

dodicat: I tried to make it work with libtre.a, but so far without success...

Code: Select all

#Include "regex.bi"
#Include "libtre.a"
...
>>>
C:\Test\FreeBASIC\libtre.a(1) error 3: Expected End-of-Line, found '!' in '!<arch>'
C:\Test\FreeBASIC\libtre.a(2) error 3: Expected End-of-Line, found '/' in '/ 1435578605 0
0 0 1026 `'
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Regular expressions

Post by fxm »

newbieforever wrote:But there will be situations where I have to search for e.g. '(.*)' (a substring starting and ending with brackets with any number of characters in the middle).

Code: Select all

Dim As String cont="addddbc(def)gh"
Dim As Integer posLeft, posRight

posLeft=Instr(cont, "(")   ' get first (
If posLeft>0 Then
   posRight=Instr(posLeft+1, cont, ")")   ' get next )
   If posRight=0 Then posLeft = 0
End If

Print "PosL=";posLeft, "PosR=";posRight, Mid(cont, posLeft,posRight-posLeft+1)

Sleep
  • Nested brackets are not supported (the same holds for any other delimiter pair).
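If nesting matters, one possible approach (a sketch, not part of the code above) is to count bracket depth while scanning. MatchingParen is a hypothetical helper name; it assumes single-byte characters and only the "(" and ")" pair.

```freebasic
' Hypothetical helper (not from the thread): given the 1-based position of an
' opening bracket in s, return the 1-based position of its matching closing
' bracket, or 0 if the brackets are unbalanced.
Function MatchingParen(ByRef s As String, ByVal open_pos As Integer) As Integer
    If open_pos < 1 Then Return 0
    Dim As Integer depth = 0
    For i As Integer = open_pos - 1 To Len(s) - 1   ' [] is 0-based
        Select Case s[i]
        Case Asc("(")
            depth += 1
        Case Asc(")")
            depth -= 1
            If depth = 0 Then Return i + 1          ' back to a 1-based position
        End Select
    Next
    Return 0
End Function

Dim As String nested = "ab(c(de)f)gh"
Dim As Integer pOpen = InStr(nested, "(")
Dim As Integer pClose = 0
If pOpen > 0 Then pClose = MatchingParen(nested, pOpen)
If pClose > 0 Then Print Mid(nested, pOpen, pClose - pOpen + 1)   ' (c(de)f)
```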
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: Regular expressions

Post by jj2007 »

> fxm: We search for "d.f" and not "de" (101 = Asc("e"))

Yep, that was a little glitch. Your version is better. The exit could be shorter:

Code: Select all

   If posLeft=0 Then Exit Do
... assuming that Instr() is clever enough to return zero if the startpos is near the end and no match is found.

I didn't know that cont[posLeft+1]=Asc("f") is possible; cont is a string, Asc() is a ubyte. Oh well...
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Regular expressions

Post by fxm »

jj2007 wrote:The exit could be shorter:

Code: Select all

   If posLeft=0 Then Exit Do
No, the second test

Code: Select all

If posLeft > Len(cont)-2 Then posLeft= 0 : Exit Do
is mandatory because of testing "cont[posLeft+1]"

See documentation at page Operator [] (String index):
the user must ensure that the index does not exceed the range "[0, Len(lhs) - 1]". Outside this range, results are undefined.
  • (otherwise, access outside the allocated memory for the string characters)
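When the index cannot be guaranteed by the surrounding logic, the range check can be put into a small wrapper. A minimal sketch; CharAt is a hypothetical name, and returning -1 for "out of range" is an arbitrary convention chosen here.

```freebasic
' Hypothetical helper (not from the thread): character code at 0-based index
' idx, or -1 when idx is outside [0, Len(s) - 1].
Function CharAt(ByRef s As String, ByVal idx As Integer) As Integer
    If idx < 0 OrElse idx >= Len(s) Then Return -1
    Return s[idx]
End Function

Dim As String sample = "abcd"
Print CharAt(sample, 3)    ' 100, the code of "d"
Print CharAt(sample, 4)    ' -1: one past the last character
```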
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: Regular expressions

Post by MrSwiss »

A practical example, that uses string indexing ... (no wide strings!)

Code: Select all

' replaces given 'search' char, with a given 'replace' char, in Source String (src_str)
Function ReplaceChar( _
    ByRef src_str   As String, _        ' the string: to be modified
    ByVal search    As UByte, _         ' ASCII number of: char (to search for)
    ByVal replace   As UByte _          ' ASCII number of: char (to replace with)
    ) ByRef As String                   ' modified string (as above specified)
    For i As Integer = 0 To Len(src_str) - 1   ' signed counter: the loop is skipped for an empty string
        If src_str[i] = search Then src_str[i] = replace
    Next
    Return src_str
End Function
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Regular expressions

Post by dodicat »

newbieforever

You wrote

#Include "regex.bi"
#Include "libtre.a"

You cannot #include a binary file; an #include file must be valid FreeBASIC source code.
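A library archive such as libtre.a is handed to the linker instead. A minimal sketch, assuming the archive sits in a directory that fbc searches (for example its own lib folder); if regex.bi on your installation already names the library itself, the #inclib line is redundant:

```freebasic
#include "regex.bi"     ' declarations only: this is FreeBASIC source
#inclib "tre"           ' assumption: asks the linker for libtre.a by name
```

The same request can be made on the command line with fbc's -l option, and -p adds a library search path.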
Example from freebasic

Code: Select all

'' PHP-like regex_replace() function, by MisterD

#include "regex.bi"

#ifndef regexmatch
#define regexmatch(match,zeile,n) mid(zeile,1+match(n).rm_so, match(n).rm_eo-match(n).rm_so)
#endif

function regex_replace(byref regex as string, byref replace_pattern as string, byref subject as string) as string
    dim replaced as string, rest as string
    rest=subject
    dim re as regex_t
    if regcomp( @re, regex, REG_EXTENDED or REG_ICASE )<>0 then return ""
    dim match(re.re_nsub) as regmatch_t, n as integer
    while regexec( @re, strptr(rest), re.re_nsub+1, @match(0), 0 )=0
        replaced+=left(rest,match(0).rm_so)
        for n = 1 to len(replace_pattern)
            if mid(replace_pattern,n,1) = "$" and _
               mid(replace_pattern,n-1,1)<>"\" and _
               val(mid(replace_pattern,n+1,1)) > 0 and _
               val(mid(replace_pattern,n+1,1)) <= re.re_nsub _
            then
                replaced+=regexmatch(match,rest,val(mid(replace_pattern,n+1,1)))
                n+=1
            else
                replaced+=mid(replace_pattern,n,1)
            end if
        next n
        if match(0).rm_eo=len(rest) then return replaced
        rest=mid(rest,match(0).rm_eo+1)
    wend
    return replaced+rest
end function

print regex_replace("-(.+?)-", "*1*", "Hi -you- strange -user- :D")
sleep

   
result

Code: Select all

Hi *1* strange *1* :D   
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: Regular expressions

Post by jj2007 »

fxm wrote:
jj2007 wrote:The exit could be shorter:

Code: Select all

   If posLeft=0 Then Exit Do
No, the second test

Code: Select all

If posLeft > Len(cont)-2 Then posLeft= 0 : Exit Do
is mandatory because of testing "cont[posLeft+1]"
You are right, I had forgotten that case.
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Regular expressions

Post by fxm »

The Operator [] (String index) also works with WString, and returns the UShort datatype:

Code: Select all

Dim As Wstring * 4 w = "abc"

#Print Typeof(W[0])

For I As Integer = 0 To 2
  Print @W[I], W[I]
Next I

Sleep
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: Regular expressions

Post by MrSwiss »

ReplaceChar Function, overloaded for both: String & WString:

Code: Select all

' ReplaceChar_Func-overload.bas -- 2018-07-01, MrSwiss
'
' compile: -s console
'

' replaces given 'search' char, with a given 'replace' char, in Source String (src_str)
Function ReplaceChar OverLoad( _
    ByRef src_str   As String, _        ' the string: to be modified
    ByVal search    As UByte, _         ' ASCII number of: char (to search for)
    ByVal replace   As UByte _          ' ASCII number of: char (to replace with)
    ) ByRef As String                   ' modified string (as above specified)
    For i As Integer = 0 To Len(src_str) - 1   ' signed counter: the loop is skipped for an empty string
        If src_str[i] = search Then src_str[i] = replace
    Next
    Return src_str
End Function
' as above, but for wstring, char = UShort here (instead of: UByte)
Function ReplaceChar OverLoad( _
    ByRef src_str   As WString, _       ' the wstring: to be modified
    ByVal search    As UShort, _        ' number of: char (to search for)
    ByVal replace   As UShort _         ' number of: char (to replace with)
    ) ByRef As WString                  ' modified wstring (as above specified)
    For i As Integer = 0 To Len(src_str) - 1   ' signed counter: the loop is skipped for an empty string
        If src_str[i] = search Then src_str[i] = replace
    Next
    Return src_str
End Function


' demo code: the compiler chooses, which of the overloaded Functions, to use!
Dim As WString * 10 wstrg = "FreeBASIC" ' length must be defined (no dynamic allocation!)
Dim As String       strg  = "FreeBASIC"

Print "Test WString, original: ", wstrg
Print "modified: ",, ReplaceChar(wstrg, CUShort(Asc("F")), CUShort(Asc("T")))  ' wstring version
Print
Print "Test String, original: ", strg
Print "modified: ",, ReplaceChar(strg, Asc("F"), Asc("T"))   ' string version

Sleep
' end demo code     ' ----- EOF -----
lizard
Posts: 440
Joined: Oct 17, 2017 11:35
Location: Germany

Re: Regular expressions

Post by lizard »

On Linux Mint the libraries are often much easier to find. Just do

Code: Select all

sudo apt-get install libtre-dev
I have posted a script for many libs in the Linux forum.
viewtopic.php?f=5&t=26838&p=249012#p249012
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Regular expressions

Post by dodicat »

Perhaps a macro

Code: Select all



#macro Replace(subject,find ,replacement)
  scope
    dim as long position=Instr((subject),(find)),LR=Len((replacement)),LF=len((find))
    While position>0
        (subject)=Mid((subject),1,position-1) & (replacement) & Mid((subject),position+LF)
        position=Instr(position+LR,(subject),(find))
    Wend
    end scope
#endmacro

dim as string g="123"
replace(g,"2"," <-->Hello<--> ")
print g
print
dim as wstring * 30 w =wchr(700,850,1000,1001,1002,1003)

replace(w,wchr(1001)," Goodbye"+chr(13,10)+"Good luck.")
print w
sleep

 