substrings
I often use the [] string index operator.
A long time ago I used a dialect of basic which had something similar, but it also included a range.
e.g. a$[2:5] would be characters 2-5. (=mid$(a$,3,4)).
Would this be a useful notation to add to freebasic?
There have been many times when it would have been useful and clearer. For example, if I wanted the substring starting at character c and ending at character d, it would be a$[c:d] rather than mid$(a$,c+1,d-c+1), especially in a more complex expression.
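A minimal sketch of how the proposed a$[c:d] range could be emulated today, assuming zero-based indexes that are inclusive at both ends, as in the example above. The function name Slice is hypothetical and not part of FreeBASIC:

```freebasic
' Slice is a hypothetical helper, not an FB built-in.
' Returns characters c..d of s (zero-based, both ends inclusive).
Function Slice(ByRef s As Const String, ByVal c As Integer, ByVal d As Integer) As String
    ' Reject a negative start or a reversed range
    If c < 0 OrElse d < c Then Return ""
    ' Mid is one-based and takes a length, hence c + 1 and d - c + 1
    Return Mid(s, c + 1, d - c + 1)
End Function

Dim As String a = "0123456789"
Print Slice(a, 2, 5)   ' characters 2-5
Print Mid(a, 3, 4)     ' the same substring written with Mid
```

Both Print lines should show the same four characters, which keeps the index arithmetic in one place instead of repeating it in every expression.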
Re: substrings
jimg wrote:Would this be a useful notation to add to freebasic?
I don't think so, because FB lets you easily define something to your liking yourself:
Code: Select all
' SubStr_Proc.bas -- (c) 2020-01-29, MrSwiss
'
' compile: -s console
'
Declare Function SubStr(ByRef As Const String, ByVal As UInteger, ByVal As UInteger) As String
' ===== DEMO code =====
Dim As String tst = "FreeBASIC is great for DIY procedure writing!"
Print "testing SubStr() Procedure" : Print
Print "original String: "; tst
Print SubStr(tst, 0, 8)
Print "testing ECC"
Print SubStr(tst, Len(tst) - 1, 10)
Print "empty source test ["; SubStr("", 0, 8); "]"
Print "index 0/0 test ["; SubStr(tst, 0, 0); "] minimal return = 1 character!"
Print : Print
Print "... done ... ";
Sleep
' ===== end DEMO code =====
' implement declared Function
Private Function SubStr( _ ' similar to mid() but, BASE 0 indexed
ByRef src As Const String, _ ' source string (read only)
ByVal strt As UInteger, _ ' start index
ByVal fini As UInteger _ ' finish (end) index
) As String ' sub-String
' ERROR checks:
If Len(src) = 0 Then Return "" ' if src is empty --> return ""
' if index 'out of range' --> return "" (valid BASE 0 indexes are 0 .. Len(src) - 1)
If strt >= Len(src) OrElse fini >= Len(src) Then Return ""
' ECC (error correction code) corrects user mistake
If strt > fini Then Swap strt, fini
Dim As String ret
For i As UInteger = strt To fini
ret += Chr(src[i])
Next
Return ret
End Function
' ----- EOF -----
- ERROR checking (on all arguments/parameters)
- ERROR correction code (exchanging indexes if they are: "the wrong way around")
and some DEMO code ...
Re: substrings
jimg wrote:...
Would this be a useful notation to add to freebasic?
There has been many times it would have been useful, and clearer than say if I wanted the substring starting at character c and ending at character d, it would be a$(c:d) rather than mid$(a$,c+1,d-c+1) especially in a more complex expression.
Indeed. Parsing text was always a weak point of most BASIC dialects IMO (and FreeBasic was designed from the ground up to be syntactically compatible with QB). I always liked the In operator in Pascal:
Code: Select all
function IsAlpha(c: char): boolean;
begin
IsAlpha := UpCase(c) in ['A'..'Z'];
end;
Code: Select all
' The following statement returns True (does "F" occur in the set of
' characters from "A" through "Z"?)
testCheck = "F" Like "[A-Z]"
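For comparison, a rough FreeBASIC sketch of the same membership test, written with Select Case and a range instead of a set operator. The name IsAlpha is borrowed from the Pascal snippet above, and this is only one possible way to write it:

```freebasic
' Sketch only: tests whether the first character of c is a letter A-Z,
' ignoring case, by checking its character code against a range.
Function IsAlpha(ByRef c As Const String) As Boolean
    Select Case Asc(UCase(c))
    Case Asc("A") To Asc("Z")
        Return True
    End Select
    Return False
End Function

Print IsAlpha("f")
Print IsAlpha("7")
```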
Re: substrings
SubString function using inline assembly code. No string boundary checking.
Code: Select all
Function SubString( Byval s As String,Byval StrStart As Uinteger,Byval StrEnd As Uinteger ) As String
Dim As Zstring Ptr z
Asm
.data
.lcomm LastCharPos,4
.lcomm LastChar,4
.text
mov edx,DWORD PTR [s]
mov esi,DWORD PTR [edx]
mov edi,esi
add esi,DWORD PTR [StrEnd]
inc esi
mov LastCharPos,esi
mov al,BYTE PTR [esi]
mov Byte PTR [esi],0
add edi,DWORD PTR [StrStart]
mov DWORD PTR [z],edi
mov LastChar,al
End Asm
Function=*z
Asm
mov edx,LastCharPos
mov al,LastChar
mov Byte PTR [edx],al
End Asm
End Function
Dim As String MyStr
MyStr="This is a test string."
Print SubString(MyStr,10,20)
Print MyStr
Re: substrings
The 64-bit version :
Code: Select all
Function SubString( Byval s As String,Byval StrStart As Uinteger,Byval StrEnd As Uinteger ) As String
Dim As Zstring Ptr z
Asm
.data
.lcomm LastCharPos,8
.lcomm LastChar,4
.text
mov rdx,QWORD PTR [s]
mov rsi,QWORD PTR [rdx]
mov rdi,rsi
add rsi,QWORD PTR [StrEnd]
inc rsi
mov LastCharPos,rsi
mov al,BYTE PTR [rsi]
mov Byte PTR [rsi],0
add rdi,QWORD PTR [StrStart]
mov QWORD PTR [z],rdi
mov LastChar,al
End Asm
Function=*z
Asm
mov rdx,LastCharPos
mov al,LastChar
mov Byte PTR [rdx],al
End Asm
End Function
Dim As String MyStr
MyStr="This is a test string."
Print SubString(MyStr,10,20)
Print MyStr
Re: substrings
If there were a notation like the one you suggest, would it be zero-indexed? Would the end index be inclusive or exclusive? Lots of languages do this differently. Perl's slicing notation is interesting and fairly clear. Python's syntax is a bit more complicated but still alright. Negative indexes can do powerful things but are unintuitive; I have to test them out in an interactive session every time I use them.
An interesting overview of various slicing syntaxes and methods can be found here: https://en.wikipedia.org/wiki/Array_slicing
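To make those design questions concrete, here is a small sketch of one possible answer: zero-based start, exclusive end, and negative indexes counted from the end of the string, loosely modelled on Python's s[start:stop]. The name PySlice and its exact clamping behaviour are assumptions for illustration, not an existing FreeBASIC feature:

```freebasic
' PySlice is a hypothetical helper. start is inclusive, stop_ is exclusive,
' and negative values count back from the end of the string.
' (stop_ has a trailing underscore because Stop is an FB keyword.)
Function PySlice(ByRef s As Const String, ByVal start As Integer, ByVal stop_ As Integer) As String
    Dim As Integer n = Len(s)
    If start < 0 Then start += n
    If stop_ < 0 Then stop_ += n
    ' Clamp to the string bounds
    If start < 0 Then start = 0
    If stop_ > n Then stop_ = n
    If start >= stop_ Then Return ""
    Return Mid(s, start + 1, stop_ - start)
End Function

Dim As String t = "This is a test string."
Print PySlice(t, 10, 14)   ' test
Print PySlice(t, -7, -1)   ' string
```

An inclusive end index, as in the original proposal, would change only the length passed to Mid, which is exactly the kind of difference that is easy to get wrong.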
Re: substrings
caseih wrote:An interesting overview of various slicing syntaxes and methods can be found here: https://en.wikipedia.org/wiki/Array_slicing
Strange that the only Basic dialect mentioned is Sinclair Basic. Which was cute in a way, but shortly after you had e.g. GfaBasic, which played in a different league, syntax- and speed-wise.
Re: substrings
ASM versus pointers, five runs
Code: Select all
function substring2(_in as string, x as long,y as long) as string
#macro memcopy(dest,src,size)
For n As Long=0 To size-1
(dest)[n]=(src)[n]
Next
#endmacro
static as zstring * 5000 g=""
memcopy(cast(ubyte ptr,@g),Cast(Ubyte Ptr,Strptr(_in)) + x, y-x+1)
g[y-x+1]=0 'terminate, so a shorter result does not keep the tail of an earlier call
return g
end function
#if sizeof(integer)=8 '64 bits
Function SubString( Byval s As String,Byval StrStart As Uinteger,Byval StrEnd As Uinteger ) As String
Dim As Zstring Ptr z
Asm
.data
.lcomm LastCharPos,8
.lcomm LastChar,4
.text
mov rdx,QWORD PTR [s]
mov rsi,QWORD PTR [rdx]
mov rdi,rsi
add rsi,QWORD PTR [StrEnd]
inc rsi
mov LastCharPos,rsi
mov al,BYTE PTR [rsi]
mov Byte PTR [rsi],0
add rdi,QWORD PTR [StrStart]
mov QWORD PTR [z],rdi
mov LastChar,al
End Asm
Function=*z
Asm
mov rdx,LastCharPos
mov al,LastChar
mov Byte PTR [rdx],al
End Asm
End Function
#else '32 bits
Function SubString( Byval s As String,Byval StrStart As Uinteger,Byval StrEnd As Uinteger ) As String
Dim As Zstring Ptr z
Asm
.data
.lcomm LastCharPos,4
.lcomm LastChar,4
.text
mov edx,DWORD PTR [s]
mov esi,DWORD PTR [edx]
mov edi,esi
add esi,DWORD PTR [StrEnd]
inc esi
mov LastCharPos,esi
mov al,BYTE PTR [esi]
mov Byte PTR [esi],0
add edi,DWORD PTR [StrStart]
mov DWORD PTR [z],edi
mov LastChar,al
End Asm
Function=*z
Asm
mov edx,LastCharPos
mov al,LastChar
mov Byte PTR [edx],al
End Asm
End Function
#endif
Dim As String MyStr,ans
MyStr="This is a test string."
dim as double t,t2,tallya,tallyp
dim as long lim=10000000\2
for n as long=1 to lim 'WARMUP
rnd
next n
for k as long=1 to 5
t=timer
for n as long=1 to lim
ans= SubString(MyStr,10,15)
next n
t2=timer
tallya+=t2-t
print t2 -t,"'";ans;"'","ASM"
Print MyStr
t=timer
for n as long=1 to lim
ans= SubString2(MyStr,10,15)
next n
t2=timer
tallyp+=t2-t
print t2-t,"'";ans;"'","Pointer"
print
next k
print
print "total ASM time ";tallya
print "total PTR time ";tallyp
sleep
Last edited by dodicat on Feb 06, 2020 9:42, edited 1 time in total.
Re: substrings
Hi dodicat
Which was faster for you?
For me the pointer version was a tiny bit faster.
Re: substrings
Hi srvaldez.
The pointer is a bit faster here.
Maybe
static as zstring * 5000 g=""
Would be better (same speed)
But MID is just as fast as either of the methods.
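For anyone who wants to check the Mid claim themselves, a minimal timing sketch along the lines of the benchmark above. It assumes the same test string, the same loop count, and the same zero-based range 10..15, which corresponds to Mid(MyStr, 11, 6). Timings will vary by machine and compiler options, so no figures are given:

```freebasic
' Sketch: time the built-in Mid for the same substring the benchmark extracts.
Dim As String MyStr = "This is a test string."
Dim As String ans
Dim As Long lim = 10000000 \ 2

Dim As Double t = Timer
For n As Long = 1 To lim
    ans = Mid(MyStr, 11, 6)   ' zero-based characters 10..15
Next n
Print Timer - t, "'"; ans; "'", "Mid"
```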
Re: substrings
I will try to improve my code. The second version is removed for the moment.
Last edited by Vortex on Feb 06, 2020 17:11, edited 1 time in total.
Re: substrings
Hi dodicat,
Thanks for the testing. I guess you should correct a small typo here :
Substring -> Substring2
Code: Select all
for n as long=1 to lim
ans= SubString2(MyStr,10,15)
next n
t2=timer
tallyp+=t2-t
print t2-t,"'";ans;"'","Pointer"
print
next k
Re: substrings
@dodicat
I just realized that your benchmark code only calls the asm function. After replacing the call in line 125 with SubString2,
the function substring2 is 1.8 times faster than the asm version.
Re: substrings
srvaldez
Er . . . hmm, how right you are, I didn't call my own function at all.
I think it is time (2308) to take the dogs for a walk.
It is worrying of course at 71 years old, not to call your own function.
Thank you.
Re: substrings
Simplified 32-bit version :
Code: Select all
Function SubString stdcall alias "SubString" ( s As String,StrStart As Uinteger,StrEnd As Uinteger ) As String
Dim As Zstring Ptr z
Asm
mov ecx,DWORD PTR [s]
mov esi,DWORD PTR [ecx]
mov edi,esi
add esi,DWORD PTR [StrEnd]
inc esi
mov bl,BYTE PTR [esi]
mov Byte PTR [esi],0
add edi,DWORD PTR [StrStart]
mov DWORD PTR [z],edi
End Asm
Function=*z
Asm
mov BYTE PTR [esi],bl
End Asm
End Function
Dim As String MyStr,a
MyStr="This is a test string."
a=SubString(MyStr,10,20)
Print a
Print MyStr
Print a
64-bit version :
Code: Select all
Function SubString alias "SubString" ( s As String,StrStart As Uinteger,StrEnd As Uinteger ) As String
Dim As Zstring Ptr z
Asm
mov rcx,QWORD PTR [s]
mov rsi,QWORD PTR [rcx]
mov rdi,rsi
add rsi,QWORD PTR [StrEnd]
inc rsi
mov bl,BYTE PTR [rsi]
mov BYTE PTR [rsi],0
add rdi,QWORD PTR [StrStart]
mov QWORD PTR [z],rdi
End Asm
Function=*z
Asm
mov BYTE PTR [rsi],bl
End Asm
End Function
Dim As String MyStr,a
MyStr="This is a test string."
a=SubString(MyStr,10,20)
Print a
Print MyStr
Print a