substrings

jimg
Posts: 24
Joined: Jan 16, 2020 19:43
Location: Oregon

substrings

Post by jimg »

I often use the [] string index operator.
A long time ago I used a dialect of BASIC which had something similar, but it also accepted a range.

e.g. a$[2:5] would be characters 2-5, counting from zero (= mid$(a$,3,4)).

Would this be a useful notation to add to FreeBASIC?

There have been many times when it would have been useful and clearer. Say I wanted the substring starting at character c and ending at character d: it would be a$[c:d] rather than mid$(a$,c+1,d-c+1), especially in a more complex expression.
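
For reference, the mapping described above can already be wrapped in a small helper. This is only a sketch: the name Slice is hypothetical (it is not part of FreeBASIC), indexes are assumed to be zero-based like the [] operator, and the end index is assumed to be inclusive.

```freebasic
' Slice: hypothetical helper, not a FreeBASIC built-in.
' Returns characters first..last (zero-based, inclusive), like the proposed a$[first:last].
Function Slice(ByRef s As Const String, ByVal first As Integer, ByVal last As Integer) As String
    ' Reject a negative start or a reversed range.
    If first < 0 OrElse last < first Then Return ""
    ' Mid() is one-based and takes a length, so shift the start and convert the end.
    Return Mid(s, first + 1, last - first + 1)
End Function

Dim As String a = "FreeBASIC"
Print Chr(a[2])        ' single character at zero-based index 2
Print Slice(a, 2, 5)   ' same result as Mid(a, 3, 4)
```

Because Mid() truncates at the end of the string, an end index past the last character simply returns the rest of the string.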
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: substrings

Post by MrSwiss »

jimg wrote:Would this be a useful notation to add to freebasic?
I don't think so, because FB lets you easily define something of your liking yourself:

Code: Select all

' SubStr_Proc.bas -- (c) 2020-01-29, MrSwiss
'
' compile: -s console
'

Declare Function SubStr(ByRef As Const String, ByVal As UInteger, ByVal As UInteger) As String

' ===== DEMO code =====
Dim As String   tst = "FreeBASIC is great for DIY procedure writing!"

Print "testing SubStr() Procedure" : Print
Print "original String: "; tst
Print SubStr(tst, 0, 8)
Print "testing ECC"
Print SubStr(tst, Len(tst) - 1, 10)
Print "empty source test ["; SubStr("", 0, 8); "]"
Print "index 0/0 test ["; SubStr(tst, 0, 0); "] minimal return = 1 character!"
Print : Print
Print "... done ... ";

Sleep
' ===== end DEMO code =====

' implement declared Function
Private Function SubStr( _              ' similar to mid() but, BASE 0 indexed
    ByRef src   As Const String, _      ' source string (read only)
    ByVal strt  As UInteger,     _      ' start index
    ByVal fini  As UInteger      _      ' finish (end) index
    ) As String                         ' sub-String
    ' ERROR checks:
    If Len(src) = 0 Then Return ""      ' if src is empty --> return ""
    ' if an index is 'out of range' --> return "" (valid indexes are 0 to Len(src) - 1)
    If strt >= Len(src) OrElse fini >= Len(src) Then Return ""
    ' ECC (error correction code) corrects user mistake
    If strt > fini Then Swap strt, fini
    
    Dim As String   ret
    
    For i As UInteger = strt To fini
        ret += Chr(src[i])
    Next
    
    Return ret
End Function
' ----- EOF -----
Here I'm going "all the way", which means it includes:
- ERROR checking (on all arguments/parameters)
- ERROR correction code (exchanging the indexes if they are "the wrong way around")
and some DEMO code ...
paul doe
Moderator
Posts: 1732
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: substrings

Post by paul doe »

jimg wrote:...
Would this be a useful notation to add to freebasic?

There have been many times when it would have been useful and clearer. Say I wanted the substring starting at character c and ending at character d: it would be a$[c:d] rather than mid$(a$,c+1,d-c+1), especially in a more complex expression.
Indeed. Parsing text was always a weak point of most BASIC dialects IMO (and FreeBasic was designed from the ground up to be syntactically compatible with QB). I always liked the In operator in Pascal:

Code: Select all

function IsAlpha(c: char): boolean;
begin
   IsAlpha := UpCase(c) in ['A'..'Z'];
end;
Or the Like operator of .Net, which provides simple but very useful pattern matching functionality:

Code: Select all

' The following statement returns True (does "F" occur in the set of
'    characters from "A" through "Z"?)
testCheck = "F" Like "[A-Z]"
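
For comparison, a rough FreeBASIC counterpart of the Pascal IsAlpha above could use Select Case with ranges. This is only a sketch: the function is a hypothetical helper, it checks ASCII letters only, and it assumes a compiler version that has the Boolean type.

```freebasic
' IsAlpha: hypothetical helper mirroring the Pascal example (ASCII letters only).
Function IsAlpha(ByRef c As Const String) As Boolean
    If Len(c) = 0 Then Return False
    Select Case c[0]                     ' character code of the first character
    Case Asc("A") To Asc("Z"), Asc("a") To Asc("z")
        Return True
    Case Else
        Return False
    End Select
End Function

Print IsAlpha("F")   ' a letter
Print IsAlpha("7")   ' not a letter
```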
Vortex
Posts: 118
Joined: Sep 19, 2005 9:50

Re: substrings

Post by Vortex »

SubString function using inline assembly code. No string boundary checking.

Code: Select all

Function SubString( Byval s As String,Byval StrStart As Uinteger,Byval StrEnd As Uinteger ) As String
    
    Dim As Zstring Ptr z
    
    Asm
        .data
        
        .lcomm LastCharPos,4
        .lcomm LastChar,4
        
        .text
        
        mov edx,DWORD PTR [s]
        mov esi,DWORD PTR [edx]
        mov edi,esi
        add esi,DWORD PTR [StrEnd]
        inc esi
        mov LastCharPos,esi
        
        mov al,BYTE PTR [esi]
        mov Byte PTR [esi],0
        
        add edi,DWORD PTR [StrStart]
        mov DWORD PTR [z],edi
        mov LastChar,al
    
    End Asm
    
    Function=*z
    
    Asm
        
        mov edx,LastCharPos
        mov al,LastChar
        mov Byte PTR [edx],al
        
    End Asm
    
    
End Function


Dim As String MyStr

MyStr="This is a test string."

Print SubString(MyStr,10,20)
Print MyStr
Vortex
Posts: 118
Joined: Sep 19, 2005 9:50

Re: substrings

Post by Vortex »

The 64-bit version :

Code: Select all

Function SubString( Byval s As String,Byval StrStart As Uinteger,Byval StrEnd As Uinteger ) As String
    
    Dim As Zstring Ptr z
    
    Asm
        .data
        
        .lcomm LastCharPos,8
        .lcomm LastChar,4
        
        .text
        
        mov rdx,QWORD PTR [s]
        mov rsi,QWORD PTR [rdx]
        mov rdi,rsi
        add rsi,QWORD PTR [StrEnd]
        inc rsi
        mov LastCharPos,rsi
        
        mov al,BYTE PTR [rsi]
        mov Byte PTR [rsi],0
        
        add rdi,QWORD PTR [StrStart]
        mov QWORD PTR [z],rdi
        mov LastChar,al
    
    End Asm
    
    Function=*z
    
    Asm
        
        mov rdx,LastCharPos
        mov al,LastChar
        mov Byte PTR [rdx],al
        
    End Asm
    
    
End Function


Dim As String MyStr

MyStr="This is a test string."

Print SubString(MyStr,10,20)
Print MyStr
caseih
Posts: 2157
Joined: Feb 26, 2007 5:32

Re: substrings

Post by caseih »

If there were a notation as you suggest, would it be zero-indexed? Would the end index be inclusive or exclusive? Lots of languages do this differently. Perl's slicing notation is interesting and fairly clear. Python's syntax is a bit more complicated but still alright. Negative indexes can do powerful things but are unintuitive: I have to test them out in an interactive session every time I use them.

An interesting overview of various slicing syntaxes and methods can be found here: https://en.wikipedia.org/wiki/Array_slicing
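
To make those choices concrete, here is a rough sketch of one possible answer: zero-based start, exclusive end, and negative indexes counting back from the end of the string. The name PySlice and its parameters are hypothetical, and this is just one convention among the several listed on that page.

```freebasic
' PySlice: hypothetical helper, not a FreeBASIC built-in.
' first is zero-based, past is exclusive; negative values count from the end of the string.
Function PySlice(ByRef s As Const String, ByVal first As Integer, ByVal past As Integer) As String
    Dim As Integer n = Len(s)
    If first < 0 Then first += n         ' -1 refers to the last character
    If past < 0 Then past += n
    ' Clamp both ends into 0..n
    If first < 0 Then first = 0
    If past > n Then past = n
    If first >= past Then Return ""      ' empty or reversed range
    Return Mid(s, first + 1, past - first)
End Function

Dim As String a = "FreeBASIC"
Print PySlice(a, 2, 6)     ' characters 2..5
Print PySlice(a, -5, -1)   ' starts five from the end, stops before the last character
Print PySlice(a, 4, 100)   ' an end index past the string is clamped
```

With an exclusive end, the length is simply past - first, which is one reason some languages prefer it over the inclusive form.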
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: substrings

Post by jj2007 »

caseih wrote:An interesting overview of various slicing syntaxes and methods can be found here: https://en.wikipedia.org/wiki/Array_slicing
Strange that the only Basic dialect mentioned is Sinclair Basic. Which was cute in a way, but shortly after you had e.g. GfaBasic, which played in a different league, syntax- and speed-wise.
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: substrings

Post by dodicat »

ASM versus pointers, five runs

Code: Select all



function substring2(_in as string, x as long,y as long) as string
    #macro memcopy(dest,src,size)
    For n As Long=0 To size-1
        (dest)[n]=(src)[n]
    Next
    #endmacro
  static as zstring * 5000 g=""
      memcopy(cast(ubyte ptr,@g),Cast(Ubyte Ptr,Strptr(_in)) + x, y-x+1)
      g[y-x+1]=0  ' terminate here, so a shorter result in a later call keeps no stale characters
      return g
end function


#if sizeof(integer)=8  '64 bits

Function SubString( Byval s As String,Byval StrStart As Uinteger,Byval StrEnd As Uinteger ) As String
   
    Dim As Zstring Ptr z
   
    Asm
        .data
       
        .lcomm LastCharPos,8
        .lcomm LastChar,4
       
        .text
       
        mov rdx,QWORD PTR [s]
        mov rsi,QWORD PTR [rdx]
        mov rdi,rsi
        add rsi,QWORD PTR [StrEnd]
        inc rsi
        mov LastCharPos,rsi
       
        mov al,BYTE PTR [rsi]
        mov Byte PTR [rsi],0
       
        add rdi,QWORD PTR [StrStart]
        mov QWORD PTR [z],rdi
        mov LastChar,al
   
    End Asm
   
    Function=*z
   
    Asm
       
        mov rdx,LastCharPos
        mov al,LastChar
        mov Byte PTR [rdx],al
       
    End Asm
   
End Function
#else  '32 bits

Function SubString( Byval s As String,Byval StrStart As Uinteger,Byval StrEnd As Uinteger ) As String
   
    Dim As Zstring Ptr z
   
    Asm
        .data
       
        .lcomm LastCharPos,4
        .lcomm LastChar,4
       
        .text
       
        mov edx,DWORD PTR [s]
        mov esi,DWORD PTR [edx]
        mov edi,esi
        add esi,DWORD PTR [StrEnd]
        inc esi
        mov LastCharPos,esi
       
        mov al,BYTE PTR [esi]
        mov Byte PTR [esi],0
       
        add edi,DWORD PTR [StrStart]
        mov DWORD PTR [z],edi
        mov LastChar,al
   
    End Asm
   
    Function=*z
   
    Asm
       
        mov edx,LastCharPos
        mov al,LastChar
        mov Byte PTR [edx],al
       
    End Asm
   
   
End Function
#endif

Dim As String MyStr,ans

MyStr="This is a test string."
dim as double t,t2,tallya,tallyp
dim as long lim=10000000\2

for n as long=1 to lim 'WARMUP
    rnd
    next n

for k as long=1 to 5
    
t=timer

for n as long=1 to lim
ans= SubString(MyStr,10,15)
next n
t2=timer
tallya+=t2-t
print t2 -t,"'";ans;"'","ASM"

Print MyStr


t=timer
for n as long=1 to lim
ans= SubString2(MyStr,10,15)
next n
t2=timer
tallyp+=t2-t
print t2-t,"'";ans;"'","Pointer"
print
next k
print
print "total ASM time ";tallya
print "total PTR time ";tallyp
sleep

 
Last edited by dodicat on Feb 06, 2020 9:42, edited 1 time in total.
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Re: substrings

Post by srvaldez »

Hi dodicat
which was faster for you?
for me the pointer version was a tiny-bit faster
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: substrings

Post by dodicat »

Hi srvaldez.
The pointer is a bit faster here.
Maybe
static as zstring * 5000 g=""
Would be better (same speed)

But MID is just as fast as either of the methods.
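
For anyone who wants to check this, a Mid loop along the following lines can be timed next to the two functions. It is only a sketch reusing the test string and iteration count from the benchmark above; actual timings will depend on the machine and compiler options.

```freebasic
Dim As String MyStr = "This is a test string."
Dim As String ans
Dim As Long lim = 10000000 \ 2           ' same iteration count as the benchmark above

Dim As Double t = Timer
For n As Long = 1 To lim
    ans = Mid(MyStr, 11, 6)              ' one-based start 11, length 6 = zero-based 10 to 15
Next
Print Timer - t, "'"; ans; "'", "Mid"
```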
Vortex
Posts: 118
Joined: Sep 19, 2005 9:50

Re: substrings

Post by Vortex »

I will try to improve my code. The second version is removed for the moment.
Last edited by Vortex on Feb 06, 2020 17:11, edited 1 time in total.
Vortex
Posts: 118
Joined: Sep 19, 2005 9:50

Re: substrings

Post by Vortex »

Hi dodicat,

Thanks for the testing. I guess you should correct a small typo here :

Substring -> Substring2

Code: Select all

for n as long=1 to lim
ans= SubString2(MyStr,10,15)
next n
t2=timer
tallyp+=t2-t
print t2-t,"'";ans;"'","Pointer"
print
next k
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Re: substrings

Post by srvaldez »

@dodicat
I just realized that your benchmark code only calls the asm function. After replacing the call in line 125 with SubString2,
the function substring2 is 1.8 times faster than the asm version.
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: substrings

Post by dodicat »

srvaldez
Er . . . hmm, how right you are, I didn't call my own function at all.
I think it is time (2308) to take the dogs for a walk.
It is worrying of course at 71 years old, not to call your own function.
Thank you.
Vortex
Posts: 118
Joined: Sep 19, 2005 9:50

Re: substrings

Post by Vortex »

Simplified 32-bit version :

Code: Select all

Function SubString stdcall alias "SubString" ( s As String,StrStart As Uinteger,StrEnd As Uinteger ) As String

Dim As Zstring Ptr z
    
    Asm
        mov  ecx,DWORD PTR [s]
        mov  esi,DWORD PTR [ecx]
        mov  edi,esi

        add  esi,DWORD PTR [StrEnd]
        inc  esi
        
        mov  bl,BYTE PTR [esi]
        mov  Byte PTR [esi],0
        
        add  edi,DWORD PTR [StrStart]
        mov  DWORD PTR [z],edi
    
    End Asm
    
    Function=*z
    
    Asm
    
        mov BYTE PTR [esi],bl

    End Asm
    
    
End Function


Dim As String MyStr,a

MyStr="This is a test string."

a=SubString(MyStr,10,20)
Print a
Print MyStr
Print a
64-bit version :

Code: Select all

Function SubString alias "SubString" ( s As String,StrStart As Uinteger,StrEnd As Uinteger ) As String

Dim As Zstring Ptr z
    
    Asm
        mov  rcx,QWORD PTR [s]
        mov  rsi,QWORD PTR [rcx]
        mov  rdi,rsi

        add  rsi,QWORD PTR [StrEnd]
        inc  rsi
        
        mov  bl,BYTE PTR [rsi]
        mov  BYTE PTR [rsi],0
        
        add  rdi,QWORD PTR [StrStart]
        mov  QWORD PTR [z],rdi
    
    End Asm
    
    Function=*z
    
    Asm
    
        mov BYTE PTR [rsi],bl

    End Asm
    
    
End Function


Dim As String MyStr,a

MyStr="This is a test string."

a=SubString(MyStr,10,20)
Print a
Print MyStr
Print a