DIM StrData AS STRING
DIM SearchPat AS STRING
DECLARE FUNCTION instrrev2(BYREF searchstr AS STRING, BYREF searchpat AS STRING) AS INTEGER
SearchPat = "TEST"
StrData = SearchPat+SearchPat+SearchPat+CHR(0)
PRINT InstrRev(StrData, SearchPat, -1)
PRINT InstrRev2(StrData, SearchPat)
PRINT MID(StrData, InstrRev2(StrData, SearchPat), LEN(SearchPat))
SLEEP
FUNCTION instrrev2(BYREF searchstr AS STRING, BYREF searchpat AS STRING) AS INTEGER
    ' Walk backwards from the last position where the pattern could still fit.
    FOR q AS INTEGER = (LEN(searchstr) - LEN(searchpat) + 1) TO 1 STEP -1
        IF MID(searchstr,q,LEN(searchpat)) = searchpat THEN RETURN q
    NEXT
    RETURN 0
END FUNCTION
Shouldn't InStrRev return the same result as the other function? The string "TEST" appears in "TESTTESTTEST"+chr(0) at positions 1, 5 and 9. Shouldn't InStrRev return 9?
Don't append chr(0) to a "normal" FreeBASIC string. It is unnecessary (there is an internal null terminating the string) and confuses all other string functions. With two nulls, the end of the string and the string length don't match.
Yes, without the trailing chr(0) InStrRev returns 9. But all is not right, because InStr will search over a leading chr(0) and return the correct position, whereas InStrRev appears to search over a trailing chr(0) but does not return the correct position. I think both functions should return 0 if the first character they encounter in the string is a chr(0).
The source code above is only an example. Usually I don't append a chr(0) to strings, of course. I encountered the problem while reading data from various files that contain binary data.
It's also possible (but uncommon) that the files end with a chr(0) as the last character.
I think a null character at the end of a STRING shouldn't affect the behaviour of the string functions, because STRINGs can contain null characters at any position, in contrast to ZSTRINGs, which may not contain any chr(0) characters.
I may be wrong. InStrRev may be the only string function/statement affected by appending a null to an ordinary string. It's the only function that "reads" right to left, all others "read" left to right to the first null.
But I consider this issue trivial. The point, one impressed upon me at times, is that ordinary strings should not contain nulls. If I'm going to use a string for arbitrary binary data (which may include nulls) storage then I'm "hacking" and I'm on my own (I shouldn't expect any string functions to succeed upon said hacked string, and I don't).
If you mix nulls in your strings you may very well have unexpected behavior if you then want to treat them as ordinary (untainted) strings. Don't mix nulls. Don't drive on the left side of the road in Kansas.
I agree with the no nulls in STRINGs. If you place nulls in a STRING then they will not work correctly if passed to a function that expects a null-terminated string, probably the most common string format.
St_W wrote: The source code above is only an example. Usually I don't append a chr(0) to strings, of course. I encountered the problem while reading data from various files that contain binary data.
It's also possible (but uncommon) that the files end with a chr(0) as the last character.
I think a null character at the end of a STRING shouldn't affect the behaviour of the string functions, because STRINGs can contain null characters at any position, in contrast to ZSTRINGs, which may not contain any chr(0) characters.
Ah, no - or yes depending on your perspective. A null is considered to be a terminator in both ordinary strings and zstrings, in this realm. You are making what I consider to be an arbitrary distinction between ordinary.. strings.. and zstrings. Both can "contain" nulls in that the memory allocated to either can "contain" anything, but your ability to manipulate either as string data with inherent string functions is affected by any nulls present.
You can file a bug report if you like. See this thread:
Yes, it should return 9 in this instance. So it is a bug - do please file a report.
But the null-char has nothing to do with it. Try appending a different character - like chr(1), or "x".
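A minimal sketch of that experiment, reusing the "TEST" pattern from the first post. The variable names are only illustrative, and no expected output is shown because the result depends on whether the runtime in use still has the bug:

```freebasic
' Sketch: compare InStrRev with different single characters appended.
Dim pat As String = "TEST"
Dim body As String = pat + pat + pat

Print "no suffix: "; InStrRev(body, pat)
Print "Chr(0):    "; InStrRev(body + Chr(0), pat)
Print "Chr(1):    "; InStrRev(body + Chr(1), pat)
Print "x:         "; InStrRev(body + "x", pat)
Sleep
```

If the last three lines all disagree with the first, the trailing character itself is not the cause, only the fact that something follows the last match.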
There shouldn't be a problem with nulls in normal FB string functions. Although currently this isn't generally true for literals like !"foo\0bar", because string literals are treated more like zstrings.
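As a rough illustration of the first point, the sketch below builds a string with an embedded null at run time and asks the ordinary string functions about it. The null is produced from a variable rather than written in a literal, on the assumption that this keeps the literal handling mentioned above out of the picture:

```freebasic
' Sketch: a variable-length STRING with an embedded null, built at run time.
Dim n As Integer = 0                 ' run-time value, so no "\0" literal is involved
Dim a As String = "foo"
Dim b As String = "bar"
Dim s As String = a + Chr(n) + b

Print Len(s)                         ' length as recorded for the string
Print InStr(s, b)                    ' forward search across the embedded null
Print Asc(Mid(s, 4, 1))              ' character code at position 4
Sleep
```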
I'd like to read some data from a file and search for a pattern. The files contain null characters. I store the whole content of each file in one big string, so those null characters end up in the string too, and some of the string functions don't work correctly. What else should I do?
OPEN "TEST.XYZ" FOR BINARY AS #1
DIM FileDat AS STRING
FileDat = SPACE(LOF(1))
GET #1, 1, FileDat
PRINT "Last Occurrence at: "; InStrRev(FileDat, "TESTSTRING")
CLOSE
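One possible stopgap, sketched under the assumption that a purely length-based search is acceptable: read the file as above, but look for the pattern with a small hand-written loop instead of InStrRev. `LastPos` is a hypothetical helper name, and "TEST.XYZ" and "TESTSTRING" are just the placeholders from the snippet above.

```freebasic
' Sketch only: LastPos is a hypothetical helper, not part of FreeBASIC.
Function LastPos(ByRef s As String, ByRef pat As String) As Integer
    ' Reverse search driven by Len(), never by a terminating null.
    If Len(pat) = 0 Then Return 0
    For q As Integer = Len(s) - Len(pat) + 1 To 1 Step -1
        If Mid(s, q, Len(pat)) = pat Then Return q
    Next
    Return 0
End Function

Dim f As Integer = FreeFile
If Open("TEST.XYZ" For Binary Access Read As #f) <> 0 Then
    Print "Could not open file"
Else
    Dim FileDat As String = Space(LOF(f))    ' buffer sized to the whole file
    If Len(FileDat) > 0 Then Get #f, 1, FileDat
    Close #f
    Print "Last occurrence at: "; LastPos(FileDat, "TESTSTRING")
End If
Sleep
```

This is slow for large files because Mid creates a temporary string at every position, but it makes no assumption about which bytes the data contains.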
/EDIT: I've just read your answer, counting_pine. I'll submit a bug report at SourceForge soon; first I have to register there. Until the problem is fixed I'll use my own InStrRev function, written in assembly.
As mentioned above, I've written an InStrRev function in FreeBASIC with inline assembler that should work correctly, can handle larger strings, and is several times faster than the original.
If somebody needs such a function - here's the source:
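function InStrRev2(byref searchstr as string, byref searchpat as string, byval startpos as uinteger = 0) as uinteger
    ' 32-bit x86 only: pointers and lengths are loaded into 32-bit registers below.
    dim SStrPtr as integer = cint(strptr(searchstr))
    dim SPatPtr as integer = cint(strptr(searchpat))
    dim SStrLen as integer = len(searchstr)
    dim SPatLen as integer = len(searchpat)
    if startpos > 0 then SStrLen = startpos + SPatLen - 1
    if SStrLen > len(searchstr) then SStrLen = len(searchstr)
    ' An empty pattern may have a null StrPtr, so return before dereferencing it.
    if SPatLen = 0 or SPatLen > SStrLen then return 0
    asm
        std                         ' string instructions now step backwards
        mov ecx, [SStrLen]
        mov ebx, [SPatLen]
        mov edi, [SStrPtr]
        mov esi, [SPatPtr]
        add edi, ecx
        sub edi, ebx                ' edi -> last possible match start
        sub ecx, ebx
        inc ecx                     ' ecx = number of candidate start positions
        mov al, [esi]               ' al = first character of the pattern
    instrrev_continue:
        repne scasb                 ' scan backwards for the first character
        jz instrrev_foundfirst
        jcxz instrrev_notfound
    instrrev_foundfirst:
        push ecx                    ' save the scan state
        push edi
        mov ecx, ebx
        add edi, ebx                ' edi -> last character of the candidate
        mov esi, [SPatPtr]
        add esi, ebx
        dec esi                     ' esi -> last character of the pattern
        repe cmpsb                  ' compare candidate and pattern back to front
        jz instrrev_found
        pop edi
        pop ecx
        jmp instrrev_continue
    instrrev_found:
        pop edi
        pop ecx
        mov edx, edi
        sub edx, [SStrPtr]
        add edx, 2                  ' edi sits one byte before the match; make it 1-based
        mov [Function], edx
    instrrev_notfound:
        cld                         ' restore the default direction flag
    end asm
end function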
I'm not very good at assembly programming, and I'm sure the function could be optimized...