I found a bug in the current regular expression matcher. Matching works as expected, but replacing does not. The routine that does the replacing calls wxStyledTextCtrl_ReplaceSelection to replace the selected text.
That is fine when not using a regular expression. But when using a regular expression, \1 .. \9 are not handled correctly (these should evaluate to the text matched by the corresponding group).
Example
input = "hello world"
regex = "\(hello\)"
replacement = "\1 freebasic"
A find-and-replace using this regular expression should replace the match with "hello freebasic". Instead the match is replaced with the literal text \1 freebasic, because the \1 is never evaluated (it is not replaced with the matched word "hello").
When using a regular expression, the function wxStyledTextCtrl_ReplaceTargetRE should be used instead of wxStyledTextCtrl_ReplaceSelection. A possible solution would be to replace every occurrence of
wxStyledTextCtrl_ReplaceSelection( stcPTR, wxComboBox_GetValue( searchAndRepDialog.comboReplace ) ) with:
If wxCheckBox_GetValue( searchAndRepDialog.checkboxRegExp ) Then
    '' get target from selection (needed because selection, not target, is used by wxfbe)
    wxStyledTextCtrl_TargetFromSelection( stcPTR )
    wxStyledTextCtrl_ReplaceTargetRE( stcPTR, wxComboBox_GetValue( searchAndRepDialog.comboReplace ) )
Else
    wxStyledTextCtrl_ReplaceSelection( stcPTR, wxComboBox_GetValue( searchAndRepDialog.comboReplace ) )
End If
There is no ReplaceSelectionRE, so ReplaceTargetRE must be used.
I've got one idea for regular expression find/replace that can be implemented without having to include a different regex engine. Regex engines commonly offer shorthands for some frequently used character classes. These are:
\d -> [:digit:] -> [0-9]
\w -> [:alnum:] plus "_" -> [a-zA-Z0-9_]
\s -> [:blank:] -> [ \t]
and their opposites \D, \W and \S (which match the negation of the character classes above).
Strictly speaking, \s is not shorthand for [:blank:] but for [:space:], which also includes CR, LF, vertical tab and form feed.
These shorthands can be translated into full character classes in a straightforward manner:
\d -> [0-9]
\D -> [^0-9]
\w -> [a-zA-Z0-9_]
\W -> [^a-zA-Z0-9_]
\s -> [ \t]
\S -> [^ \t]
\s should really match a lot more than space and tab (it should match space, \t, \r, \n, \f and \v), but the built-in regex library has no special character for line endings, form feeds or vertical tabs. Space and tab are good enough, though (multi-line regex selection/replacement is not possible with the built-in regex engine anyway).
wxfbe can change the shorthand character classes into full character classes (\d into [0-9], \w into [a-zA-Z0-9_] etc.). When a regular expression contains any of the shorthands mentioned above, they get expanded into full character classes, and the expanded form is passed to the regex engine in place of the value typed into the combo box.
The regular expression
function\s+
is transformed into
function[ \t]+
which is something the Scintilla regex engine understands.
This replacement scheme should be fairly easy to implement (replacing shorthand charclass with full charclass).
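The expansion described above could be sketched roughly as follows. This is only a minimal illustration, not code taken from wxfbe: the function name ExpandShorthands is made up for this example, and it assumes the default fb dialect, where "\" in a string literal is a plain backslash. It also does not treat shorthands inside an existing [...] set specially, so a pattern such as [\d_] would need extra handling.

```freebasic
'' Hypothetical helper (not part of wxfbe): expands the shorthand classes
'' \d \D \w \W \s \S into full character classes.
'' Any other escape (for example \( or \\) is copied through unchanged.
Function ExpandShorthands( ByRef pattern As Const String ) As String
    Dim expanded As String
    Dim i As Integer = 1

    While i <= Len( pattern )
        Dim c As String = Mid( pattern, i, 1 )
        If c = "\" AndAlso i < Len( pattern ) Then
            '' A backslash followed by one more character: look at the pair.
            Dim nxt As String = Mid( pattern, i + 1, 1 )
            Select Case nxt
            Case "d" : expanded &= "[0-9]"
            Case "D" : expanded &= "[^0-9]"
            Case "w" : expanded &= "[a-zA-Z0-9_]"
            Case "W" : expanded &= "[^a-zA-Z0-9_]"
            Case "s" : expanded &= "[ \t]"
            Case "S" : expanded &= "[^ \t]"
            Case Else : expanded &= c & nxt
            End Select
            i += 2
        Else
            '' Ordinary character (or a trailing backslash): copy as-is.
            expanded &= c
            i += 1
        End If
    Wend

    Return expanded
End Function

Print ExpandShorthands( "function\s+" )    '' prints function[ \t]+
Print ExpandShorthands( "prefix\(\w+\)" )  '' prints prefix\([a-zA-Z0-9_]+\)
```

Consuming the backslash and the following character as a pair means an escaped backslash followed by a letter (\\d) is left alone instead of being expanded by mistake.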
A possible use of shorthand character classes:
Replacing some common prefix of identifiers with another prefix. This can then be done by using the following regular expressions:
find -> prefix\(\w+\)
replace -> newprefix\1
Other regular expression extensions (the | operator), the more sophisticated PCRE features (assertions), let alone recursion in a regular expression, cannot be implemented using the existing regex engine. Ultimately something more needs to be done to get proper regex support. But in the meantime shorthand character classes are a nice thing to have (and are easily implemented as an extension to the current regex engine).
I was able to make wxfbe crash by using the regex $ and then performing a search-all ($ denotes 'end-of-line'). Could that be because of a bug in wxfbe, or is the Scintilla regex engine to blame?