DWSTRING.bi - Dynamic null terminated unicode string data type

User projects written in or related to FreeBASIC.
Josep Roca
Posts: 376
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

DWSTRING.bi - Dynamic null terminated unicode string data type

Postby Josep Roca » Jul 06, 2018 10:27

The DWSTRING class implements a dynamic unicode null terminated string. Free Basic has a dynamic string data type (STRING) and a fixed length unicode data type (WSTRING), but it lacks a dynamic unicode string.

DWSTRING behaves as if it was a native data type, working directly with the intrinsic Free Basic string functions and operators.

It works transparently with literals and Free Basic native strings, e.g.

Code: Select all

#include once "DWSTRING.bi"

DIM dws AS DWSTRING = "One"
DIM s AS STRING = "Three"
dws = dws & " Two " & s
PRINT dws


It can be used with Windows API functions, e.g.

Code: Select all

DIM nLen AS LONG = SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, 0)
DIM dwsText AS DWSTRING = DWSTRING(nLen + 1, FALSE)
SendMessageW(hwnd, WM_GETTEXT, nLen + 1, cast(LPARAM, *dwsText))


Arrays

We can use arrays of DWSTRING strings transparently, e.g.

Code: Select all

DIM rg(1 TO 10) AS DWSTRING
FOR i AS LONG = 1 TO 10
   rg(i) = "string " & i
NEXT

FOR i AS LONG = 1 TO 10
   print rg(i)
NEXT


A two-dimensional array

Code: Select all

DIM rg2 (1 TO 2, 1 TO 2) AS DWSTRING
rg2(1, 1) = "string 1 1"
rg2(1, 2) = "string 1 2"
rg2(2, 1) = "string 2 1"
rg2(2, 2) = "string 2 2"
print rg2(2, 1)


REDIM PRESERVE / ERASE

Code: Select all

REDIM rg(0) AS DWSTRING
rg(0) = "string 0"
REDIM PRESERVE rg(0 TO 2) AS DWSTRING
rg(1) = "string 1"
rg(2) = "string 2"
print rg(0)
print rg(1)
print rg(2)
ERASE rg


And we can also sort one-dimensional arrays calling the DWStringSort or DWStringArraySort procedures:

Code: Select all

DIM rg(1 TO 10) AS DWSTRING
FOR i AS LONG = 1 TO 10
   rg(i) = "string " & i
NEXT
FOR i AS LONG = 1 TO 10
  print rg(i)
NEXT
print "---- after sorting ----"
DWstringArraySort rg()
FOR i AS LONG = 1 TO 10
   print rg(i)
NEXT


It can also be used with files:

Using FreeBasic intrinsic statements:

Code: Select all

DIM dws AS DWSTRING = "Дмитрий Дмитриевич Шостакович"
DIM f AS LONG = FREEFILE
OPEN "test.txt" FOR OUTPUT ENCODING "utf16" AS #f
PRINT #f, dws
CLOSE #f


Using the Windows API:

Code: Select all

' // Writing to a file
DIM dwsFilename AS DWSTRING = "тест.txt"
DIM dwsText AS DWSTRING = "Дмитрий Дмитриевич Шостакович"
DIM hFile AS HANDLE = CreateFileW(dwsFilename, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL)
IF hFile THEN
   DIM dwBytesWritten AS DWORD
   DIM bSuccess AS LONG = WriteFile(hFile, dwsText, LEN(dwsText) * 2, @dwBytesWritten, NULL)
   CloseHandle(hFile)
END IF


Code: Select all

' // Read the file
hFile = CreateFileW(dwsFilename, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL)
IF hFile THEN
   DIM dwFileSize AS DWORD = GetFileSize(hFile, NULL)
   IF dwFileSize THEN
      DIM dwsOut AS DWSTRING = WSPACE(dwFileSize \ 2)
      DIM bSuccess AS LONG = ReadFile(hFile, *dwsOut, dwFileSize, NULL, NULL)
      CloseHandle(hFile)
      PRINT dwsOut
   END IF
END IF


Notice that, contrarily to CreateFileW, FreeBasic's OPEN statemente doesn't allow to use unicode for the file name.

Quirks:

- MID as a statement: Something like MID(cws, 2, 1) = "x" compiles but does not change the contents of the dynamic unicode string. MID(cws.wstr, 2, 1) = "x" or MID(**cws, 2, 1) = "x" works.
- SELECT CASE: Something like SELECT CASE LEFT(dws, 2) does not compile; we have to use SELECT CASE LEFT(**dws, 2). Same problem with RIGHT, but not with MID.
- Operator []: cwsText[0] = ASC("x") does not compile; we have to use (*cwsText)[0] = ASC("x").
dodicat
Posts: 5086
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby dodicat » Jul 06, 2018 12:59

Excuse me if I am going stupid.
I cannot test your stuff because I cannot find "DWSTRING.bi"
And I don't understand the concept of CLASS in freebasic.
I say again:
Excuse me if I am going stupid.
Josep Roca
Posts: 376
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby Josep Roca » Jul 06, 2018 15:27

Sorry, but an error happened in the forum when I tried to post the code saying that it could not save the post at this time.

DWSTRING.bi

Code: Select all

' ########################################################################################
' Platform: Microsoft Windows
' Filename: DWSTRING.bi
' Purpose: Implements a data type for dynamic null terminated unicode strings.
' Compiler: Free Basic 32 & 64 bit
' Copyright (c) 2018 José Roca
'
' License: Same license as the FreeBasic include files provided with the compiler.
'
'   This library is free software; you can redistribute it and/or
'   modify it under the terms of the GNU Lesser General Public
'   License as published by the Free Software Foundation; either
'   version 2.1 of the License, or (at your option) any later version.
'
'   This library is distributed in the hope that it will be useful,
'   but WITHOUT ANY WARRANTY; without even the implied warranty of
'   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
'   Lesser General Public License for more details.
'
'   You should have received a copy of the GNU Lesser General Public
'   License along with this library; if not, write to the Free Software
'   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
'
' ########################################################################################

#pragma ONCE
#INCLUDE ONCE "windows.bi"
#INCLUDE ONCE "/crt/string.bi"
#INCLUDE ONCE "/crt/wchar.bi"
#INCLUDE ONCE "utf_conv.bi"

' ========================================================================================
' Macro for debug
' To allow debugging, define _DWSTRING_DEBUG_ 1 in your application before including this file.
' ========================================================================================
#ifndef _DWSTRING_DEBUG_
   #define _DWSTRING_DEBUG_ 0
#ENDIF
#ifndef _DWSTRING_DP_
   #define _DWSTRING_DP_ 1
   #MACRO DWSTRING_DP(st)
      #IF (_DWSTRING_DEBUG_ = 1)
         OutputDebugStringW(st)
      #ENDIF
   #ENDMACRO
#ENDIF
' ========================================================================================

' ########################################################################################
'                                *** DWSTRING CLASS ***
' ########################################################################################
TYPE DWSTRING

   Private:
      m_Capacity AS UINT         ' // The total size of the buffer in UTF16 characters
      m_GrowSize AS LONG = 260   ' // How much to grow the buffer by when required

   Public:
      m_pBuffer AS WSTRING PTR   ' // Pointer to the buffer
      m_BufferLen AS UINT        ' // Length in characters of the string

      DECLARE CONSTRUCTOR
      DECLARE CONSTRUCTOR (BYVAL nChars AS UINT, BYVAL bClear AS BOOLEAN)
      DECLARE CONSTRUCTOR (BYVAL pwszStr AS WSTRING PTR)
      DECLARE CONSTRUCTOR (BYREF ansiStr AS STRING, BYVAL nCodePage AS UINT = 0)
      DECLARE CONSTRUCTOR (BYREF dws AS DWSTRING)
      DECLARE CONSTRUCTOR (BYVAL n AS LONGINT)
      DECLARE CONSTRUCTOR (BYVAL n AS DOUBLE)
      DECLARE DESTRUCTOR
      DECLARE PROPERTY GrowSize () AS LONG
      DECLARE PROPERTY GrowSize (BYVAL nValue AS LONG)
      DECLARE PROPERTY Capacity () AS UINT
      DECLARE PROPERTY Capacity (BYVAL nValue AS UINT)
      DECLARE PROPERTY SizeAlloc (BYVAL nChars AS UINT)
      DECLARE PROPERTY SizeOf () AS UINT
      DECLARE FUNCTION ResizeBuffer (BYVAL nChars AS UINT, BYVAL bClear AS BOOLEAN = FALSE) AS WSTRING PTR
      DECLARE FUNCTION AppendBuffer (BYVAL memAddr AS ANY PTR, BYVAL nChars AS UINT) AS BOOLEAN
      DECLARE FUNCTION InsertBuffer (BYVAL memAddr AS ANY PTR, BYVAL nIndex AS UINT, BYVAL nChars AS UINT) AS BOOLEAN
      DECLARE SUB Clear
      DECLARE FUNCTION Add (BYVAL pwszStr AS WSTRING PTR) AS BOOLEAN
      DECLARE FUNCTION Add (BYREF ansiStr AS STRING, BYVAL nCodePage AS UINT = 0) AS BOOLEAN
      DECLARE FUNCTION Add (BYREF dws AS DWSTRING) AS BOOLEAN
      DECLARE PROPERTY Char(BYVAL nIndex AS UINT) AS USHORT
      DECLARE PROPERTY Char(BYVAL nIndex AS UINT, BYVAL nValue AS USHORT)
      DECLARE OPERATOR [] (BYVAL nIndex AS UINT) AS USHORT
      DECLARE FUNCTION DelChars (BYVAL nIndex AS UINT, BYVAL nCount AS UINT) AS BOOLEAN
      DECLARE FUNCTION Insert (BYVAL pwszStr AS WSTRING PTR, BYVAL nIndex AS UINT) AS BOOLEAN
      DECLARE FUNCTION Insert (BYREF ansiStr AS STRING, BYVAL nIndex AS UINT, BYVAL nCodePage AS UINT = 0) AS BOOLEAN
      DECLARE FUNCTION Insert (BYREF dws AS DWSTRING, BYVAL nIndex AS UINT) AS BOOLEAN
      DECLARE OPERATOR CAST () BYREF AS CONST WSTRING
      DECLARE OPERATOR CAST () AS ANY PTR
      DECLARE OPERATOR LET (BYVAL pwszStr AS WSTRING PTR)
      DECLARE OPERATOR LET (BYREF ansiStr AS STRING)
      DECLARE OPERATOR LET (BYREF dws AS DWSTRING)
      DECLARE OPERATOR LET (BYVAL n AS LONGINT)
      DECLARE OPERATOR LET (BYVAL n AS DOUBLE)
      DECLARE OPERATOR += (BYREF ansiStr AS STRING)
      DECLARE OPERATOR += (BYVAL pwszStr AS WSTRING PTR)
      DECLARE OPERATOR += (BYREF dws AS DWSTRING)
      DECLARE OPERATOR += (BYVAL n AS LONGINT)
      DECLARE OPERATOR += (BYVAL n AS DOUBLE)
      DECLARE OPERATOR &= (BYVAL pwszStr AS WSTRING PTR)
      DECLARE OPERATOR &= (BYREF ansiStr AS STRING)
      DECLARE OPERATOR &= (BYREF dws AS DWSTRING)
      DECLARE OPERATOR &= (BYVAL n AS LONGINT)
      DECLARE OPERATOR &= (BYVAL n AS DOUBLE)
      DECLARE FUNCTION vptr () AS WSTRING PTR
      DECLARE FUNCTION sptr () AS WSTRING PTR
      DECLARE FUNCTION wstr () BYREF AS CONST WSTRING
      DECLARE PROPERTY utf8 () AS STRING
      DECLARE PROPERTY utf8 (BYREF ansiStr AS STRING)

END TYPE
' ########################################################################################

' ========================================================================================
' DWSTRING constructors
' ========================================================================================
' ========================================================================================
' Attempts to allocate, or reserve, m_GrowSize number of bytes from the free store (heap).
' The newly allocated memory is initialized.
' ========================================================================================
PRIVATE CONSTRUCTOR DWSTRING
   DWSTRING_DP("DWSTRING CONSTRUCTOR - Default")
   this.ResizeBuffer(m_GrowSize, TRUE)   ' Create the initial buffer
END CONSTRUCTOR
' ========================================================================================
' ========================================================================================
' Attempts to allocate, or reserve, nChars number of bytes from the free store (heap).
' bClear = FALSE: The newly allocated memory is not initialized.
' bClear = TRUE: The newly allocated memory is initialized.
' ========================================================================================
PRIVATE CONSTRUCTOR DWSTRING (BYVAL nChars AS UINT, BYVAL bClear AS BOOLEAN)
   DWSTRING_DP("DWSTRING CONSTRUCTOR - nChars, bClear")
   IF nChars < 1 THEN nChars = m_GrowSize
   this.ResizeBuffer(nChars, bClear)
END CONSTRUCTOR
' ========================================================================================
' ========================================================================================
' Initializes the DWSTRING with the contents of the passed WSTRING.
' ========================================================================================
PRIVATE CONSTRUCTOR DWSTRING (BYVAL pwszStr AS WSTRING PTR)
   DWSTRING_DP("DWSTRING CONSTRUCTOR - WSTRING PTR")
   IF pwszStr = NULL THEN this.ResizeBuffer(m_GrowSize) : EXIT CONSTRUCTOR
   this.Add(pwszStr)
END CONSTRUCTOR
' ========================================================================================
' ========================================================================================
' Initializes the DWSTRING with the contents of the passed STRING.
' ========================================================================================
PRIVATE CONSTRUCTOR DWSTRING (BYREF ansiStr AS STRING, BYVAL nCodePage AS UINT = 0)
   DWSTRING_DP("DWSTRING CONSTRUCTOR - STRING")
   IF .LEN(ansiStr) = 0 THEN this.ResizeBuffer(m_GrowSize) : EXIT CONSTRUCTOR
   IF .LEN(ansiStr) = 0 THEN this.ResizeBuffer(m_GrowSize) : EXIT CONSTRUCTOR
   this.Add(ansiStr, nCodePage)
END CONSTRUCTOR
' ========================================================================================
' ========================================================================================
' Initializes the DWSTRING with the contents of the passed DWSTRING.
' ========================================================================================
PRIVATE CONSTRUCTOR DWSTRING (BYREF dws AS DWSTRING)
   DWSTRING_DP("DWSTRING CONSTRUCTOR - DWSTRING")
   IF dws.m_BufferLen = 0 THEN this.ResizeBuffer(m_GrowSize) : EXIT CONSTRUCTOR
   this.Add(dws)
END CONSTRUCTOR
' ========================================================================================

' ========================================================================================
' Initializes the DWSTRING with a number.
' These two constructors are needed to allow to use a number with the & operator.
' ========================================================================================
PRIVATE CONSTRUCTOR DWSTRING (BYVAL n AS LONGINT)
   DWSTRING_DP("DWSTRING CONSTRUCTOR LONGINT")
   DIM wsz AS WSTRING * 260 = .WSTR(n)
   this.Add(wsz)
END CONSTRUCTOR
' ========================================================================================
' ========================================================================================
PRIVATE CONSTRUCTOR DWSTRING (BYVAL n AS DOUBLE)
   DWSTRING_DP("DWSTRING CONSTRUCTOR DOUBLE")
   DIM wsz AS WSTRING * 260 = .WSTR(n)
   this.Add(wsz)
END CONSTRUCTOR
' ========================================================================================

' ========================================================================================
' Destructor
' ========================================================================================
PRIVATE DESTRUCTOR DWSTRING
   DWSTRING_DP("DWSTRING DESTRUCTOR - buffer: " & .WSTR(m_pBuffer))
   IF m_pBuffer THEN Deallocate(m_pBuffer)
END DESTRUCTOR
' ========================================================================================

' ========================================================================================
' Number of characters to preallocate to minimize multiple allocations when doing multiple
' concatenations. A value of less than 0 indicates that it must double the capacity each
' time that the buffer needs to be resized.
' ========================================================================================
PRIVATE PROPERTY DWSTRING.GrowSize() AS LONG
   DWSTRING_DP("DWSTRING PROPERTY GET GrowSize")
   PROPERTY = m_GrowSize
END PROPERTY
' ========================================================================================
' ========================================================================================
PRIVATE PROPERTY DWSTRING.GrowSize (BYVAL nChars AS LONG)
   DWSTRING_DP("DWSTRING PROPERTY SET Growsize")
   m_GrowSize = nChars
END PROPERTY
' ========================================================================================

' ========================================================================================
' The size of the internal string buffer is retrieved and returned to the caller. The size
' is the number of characters which can be stored without further expansion.
' ========================================================================================
PRIVATE PROPERTY DWSTRING.Capacity() AS UINT
   DWSTRING_DP("DWSTRING PROPERTY GET Capacity")
   PROPERTY = m_Capacity
END PROPERTY
' ========================================================================================
' ========================================================================================
PRIVATE PROPERTY DWSTRING.SizeOf() AS UINT
   DWSTRING_DP("DWSTRING PROPERTY GET SizeOf")
   PROPERTY = m_Capacity
END PROPERTY
' ========================================================================================
' ========================================================================================
' The internal string buffer is expanded to the specified number of characters. If the new
' capacity is smaller than the current capacity, the buffer is shortened and the contents
' that exceed the new capacity are lost.
' ========================================================================================
PRIVATE PROPERTY DWSTRING.Capacity (BYVAL nValue AS UINT)
   DWSTRING_DP("DWSTRING PROPERTY SET Capacity")
   IF nValue = m_Capacity THEN EXIT PROPERTY
   this.ResizeBuffer(nValue)
END PROPERTY
' ========================================================================================
' ========================================================================================
PRIVATE PROPERTY DWSTRING.SizeAlloc (BYVAL nValue AS UINT)
   DWSTRING_DP("DWSTRING PROPERTY SET SizeAlloc")
   IF nValue = m_Capacity THEN EXIT PROPERTY
   this.ResizeBuffer(nValue)
END PROPERTY
' ========================================================================================

' ========================================================================================
' Returns a pointer to the DWSTRING buffer.
' ========================================================================================
PRIVATE OPERATOR DWSTRING.CAST () AS ANY PTR
   DWSTRING_DP("DWSTRING CAST ANY PTR - buffer: " & .WSTR(m_pBuffer))
   OPERATOR = m_pBuffer
END OPERATOR
' ========================================================================================
' ========================================================================================
' Returns the string data (same as **).
' ========================================================================================
PRIVATE OPERATOR DWSTRING.CAST () BYREF AS CONST WSTRING
   DWSTRING_DP("DWSTRING CAST BYREF AS WSTRING - buffer: " & .WSTR(m_pBuffer))
   OPERATOR = *m_pBuffer
END OPERATOR
' ========================================================================================
' ========================================================================================
' Returns the string data (same as **).
' ========================================================================================
PRIVATE FUNCTION DWSTRING.wstr () BYREF AS CONST WSTRING
   DWSTRING_DP("DWSTRING StrAddr - buffer: " & .WSTR(m_pBuffer))
   RETURN *m_pBuffer
END FUNCTION
' ========================================================================================
' ========================================================================================
' Returns the address of the DWSTRING buffer (same as *)
' ========================================================================================
PRIVATE FUNCTION DWSTRING.vptr () AS WSTRING PTR
   DWSTRING_DP("DWSTRING vptr - buffer: " & .WSTR(m_pBuffer))
   RETURN m_pBuffer
END FUNCTION
' ========================================================================================
' ========================================================================================
' Returns the address of the DWSTRING buffer.
' Same as vptr for this kind of data type.
' ========================================================================================
PRIVATE FUNCTION DWSTRING.sptr () AS WSTRING PTR
   DWSTRING_DP("DWSTRING sptr - buffer: " & .WSTR(m_pBuffer))
   RETURN m_pBuffer
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns the corresponding unicode integer representation of the character at the position
' specified by the nIndex parameter (1 for the first character, 2 for the second, etc.).
' If nIndex is beyond the current length of the string, a 0 is returned.
' ========================================================================================
PRIVATE PROPERTY DWSTRING.Char (BYVAL nIndex AS UINT) AS USHORT
   DWSTRING_DP("DWSTRING PROPERTY GET Char")
   IF nIndex < 1 OR nIndex > m_BufferLen THEN EXIT PROPERTY
   ' Get the numeric character code at position nIndex
   nIndex -= 1
   PROPERTY = PEEK(USHORT, m_pBuffer + nIndex)
END PROPERTY
' ========================================================================================
' ========================================================================================
' Changes the corresponding unicode integer representation of the character at the position
' specified by the nIndex parameter (1 for the first character, 2 for the second, etc.).
' If nIndex is beyond the current length of the string, nothing is changed.
' ========================================================================================
PRIVATE PROPERTY DWSTRING.Char (BYVAL nIndex AS UINT, BYVAL nValue AS USHORT)
   DWSTRING_DP("DWSTRING PROPERTY SET Char")
   IF nIndex < 1 OR nIndex > m_BufferLen THEN EXIT PROPERTY
   ' Set the numeric character code at position nIndex (zero based)
   nIndex -= 1
   POKE USHORT, m_pBuffer + nIndex, nValue
END PROPERTY
' ========================================================================================

' ========================================================================================
' Returns the corresponding ASCII or Unicode integer representation of the character at
' the zero-based position specified by the nIndex parameter (0 for the first character,
' 1 for the second, etc.), e.g. value = dws[1].
' ========================================================================================
PRIVATE OPERATOR DWSTRING.[] (BYVAL nIndex AS UINT) AS USHORT
   IF nIndex < 0 OR nIndex > m_BufferLen - 1 THEN EXIT OPERATOR
   ' Get the numeric character code at position nIndex
   OPERATOR = PEEK(USHORT, m_pBuffer + nIndex)
END OPERATOR
' ========================================================================================
' Remarks: To change a value using pointer arithmetic we need to use an intermediate
' pointer variable:
'   DIM dwsText AS DWSTRING = "This is my text."
'   DIM p AS WSTRING PTR = *dwsText
'   p[1] = ASC("x")
' or use casting:
'   (*dwsText)[1] = ASC("x")
' --or--
'   CAST(WSTRING PTR, *dwsText)[1] = ASC("x")
' When using the [] operator this way, there is not range checking. Therefore, make sure
' that the string is not empty and that the index does not exceed the range "[0, Len(dwsText) - 1]".
' Outside this range, results are undefined.
' ========================================================================================

' ========================================================================================
' Converts the DWSTRING to UTF8.
' ========================================================================================
PRIVATE PROPERTY DWSTRING.Utf8 () AS STRING
   DWSTRING_DP("DWSTRING Utf8 GET PROPERTY")
   DIM cbLen AS INTEGER
   IF m_BufferLen = 0 THEN RETURN ""
   DIM buffer AS STRING = STRING(m_BufferLen * 5 + 1, 0)
   PROPERTY = *cast(ZSTRING PTR, WCharToUTF(1, m_pBuffer, m_BufferLen * 2, STRPTR(buffer), @cbLen))
END PROPERTY
' ========================================================================================

' ========================================================================================
' Converts UTF8 to unicode and assigns it to the DWSTRING.
' ========================================================================================
PRIVATE PROPERTY DWSTRING.Utf8 (BYREF utf8String AS STRING)
   DWSTRING_DP("DWSTRING Utf8 SET PROPERTY")
   this.Clear
   this.Add(utf8String, CP_UTF8)
END PROPERTY
' ========================================================================================

' ========================================================================================
' Resizes the internal buffer capacity
' ========================================================================================
PRIVATE FUNCTION DWSTRING.ResizeBuffer (BYVAL nChars AS UINT, BYVAL bClear AS BOOLEAN = FALSE) AS WSTRING PTR
   DWSTRING_DP("DWSTRING ResizeBuffer")
   ' // Allocate a buffer of nChars utf16 characters + 1 for the terminating null
   DIM pNewBuffer AS WSTRING PTR = IIF(bClear, CAllocate((nChars + 1) * 2), Allocate((nChars + 1) * 2))
   ' // Copy the old buffer in the new one
   IF nChars < m_BufferLen THEN m_BufferLen = nChars
   IF m_pBuffer THEN
      ' // Copy the old buffer in the new
      wmemmove(pNewBuffer, m_pBuffer, m_BufferLen)
      ' // Deallocate the old buffer
      Deallocate m_pBuffer
   END IF
   ' // Update the capacity
   m_Capacity = nChars
   ' // Store the new pointer
   m_pBuffer = pNewBuffer
   ' // Mark the end of the string with a double null
   m_pBuffer[m_BufferLen] = 0
   RETURN m_pBuffer
END FUNCTION
' ========================================================================================

' ========================================================================================
' Appends the specified number of characters from the specified memory address to the end of the buffer.
' ========================================================================================
PRIVATE FUNCTION DWSTRING.AppendBuffer (BYVAL memAddr AS ANY PTR, BYVAL nChars AS UINT) AS BOOLEAN
   DWSTRING_DP("DWSTRING AppendBuffer")
   IF memAddr = NULL OR nChars = 0 THEN RETURN FALSE
   ' // Number of characters + extra space to avoid multiple memory allocations
   DIM nSize AS UINT = m_BufferLen + nChars + m_GrowSize
   ' // If m_GrowSize = -1 THEN double the current capacity
   IF m_GrowSize < 0 THEN nSize = (m_BufferLen + nChars) * 2
   ' // If there is not enough capacity, resize the buffer
   IF m_BufferLen + nChars > m_Capacity THEN this.ResizeBuffer(nSize)
   ' // Copy the passed buffer
   IF m_pBuffer = NULL THEN RETURN FALSE
   wmemmove(m_pBuffer + m_BufferLen, memAddr, nChars)
   ' // Update the length of the buffer
   m_BufferLen += nChars
   ' // Mark the end of the string with a double null
   m_pBuffer[m_BufferLen] = 0
   RETURN TRUE
END FUNCTION
' ========================================================================================

' ========================================================================================
' nCount characters are removed starting at the position given by nIndex.
' nIndex = 1 for the first character, 2 for the second, etc.
' Return value: If the function succeeds, it returns TRUE; otherwise, FALSE.
' Remarks: If nCount is bigger that the number of characters available to delete, the
' function deletes all the characters from nIndex to the end of the string.
' ========================================================================================
PRIVATE FUNCTION DWSTRING.DelChars (BYVAL nIndex AS UINT, BYVAL nCount AS UINT) AS BOOLEAN
   DWSTRING_DP("DWSTRING DelChars")
   IF nIndex < 1 OR nIndex > m_BufferLen OR nCount < 1 THEN RETURN FALSE
   DIM numChars AS UINT = m_BufferLen
   IF nCount > m_BufferLen - nIndex + 1 THEN nCount = m_BufferLen - nIndex + 1
   wmemmove(m_pBuffer + nIndex - 1, m_pBuffer + (nIndex + nCount) - 1, (m_BufferLen - nIndex - nCount + 1))
   m_BufferLen -= nCount
   m_pBuffer[m_BufferLen] = 0
   RETURN TRUE
END FUNCTION
' ========================================================================================

' ========================================================================================
' Inserts the specified number of characters from the specified memory address into the buffer.
' ========================================================================================
PRIVATE FUNCTION DWSTRING.InsertBuffer (BYVAL memAddr AS ANY PTR, BYVAL nIndex AS UINT, BYVAL nChars AS UINT) AS BOOLEAN
   DWSTRING_DP("DWSTRING InsertBuffer")
   IF nIndex < 1 OR nIndex > m_BufferLen THEN RETURN FALSE
   ' // Determine the size of the new buffer
   IF m_BufferLen + nChars > m_Capacity THEN m_Capacity = m_BufferLen + nChars
   DIM pNewBuffer AS WSTRING PTR = Allocate((m_Capacity + 1) * 2)
   IF m_pBuffer THEN
      ' // Copy the existing data into the new buffer up to nIndex
      nIndex -= 1   ' // Buffer memory is zero-based
      wmemmove(pNewBuffer, m_pBuffer, nIndex)
      ' // Insert characters
      wmemmove(pNewBuffer + nIndex, memAddr, nChars)
      ' // Copy characters from nIndex
      wmemmove(pNewBuffer + nIndex + nChars, m_pBuffer + nIndex, m_BufferLen - nIndex)
      Deallocate m_pBuffer
   END IF
   m_pBuffer = pNewBuffer
   m_BufferLen += nChars
   m_pBuffer[m_BufferLen] = 0
   RETURN TRUE
END FUNCTION
' ========================================================================================

' ========================================================================================
' The incoming string parameter is inserted in the string starting at the position
' given by nIndex. nIndex = 1 for the first character, 2 For the second, etc.
' If nIndex is beyond the current length of the string + 1, no operation is performed.
' ========================================================================================
PRIVATE FUNCTION DWSTRING.Insert (BYREF ansiStr AS STRING, BYVAL nIndex AS UINT, BYVAL nCodePage AS UINT = 0) AS BOOLEAN
   DWSTRING_DP("DWSTRING Insert STRING")
   IF nIndex < 1 OR nIndex > m_BufferLen OR .LEN(ansiStr) = 0 THEN RETURN FALSE
   ' // Create the wide string from the incoming ansi string
   DIM dwLen AS UINT, pbuffer AS ANY PTR
   IF nCodePage = CP_UTF8 THEN
      dwLen = MultiByteToWideChar(CP_UTF8, 0, STRPTR(ansiStr), LEN(ansiStr), NULL, 0)
      IF dwLen = 0 THEN RETURN FALSE
      dwLen *= 2
      pbuffer = Allocate(dwLen)
      dwLen = MultiByteToWideChar(CP_UTF8, 0, STRPTR(ansiStr), LEN(ansiStr), pbuffer, dwLen)
      IF dwLen = 0 THEN RETURN FALSE
   ELSE
      dwLen = .LEN(ansiStr)
      dwLen *= 2
      pbuffer = Allocate(dwLen)
      dwLen = MultiByteToWideChar(nCodePage, MB_PRECOMPOSED, STRPTR(ansiStr), .LEN(ansiStr), pbuffer, dwLen)
      IF dwLen = 0 THEN RETURN FALSE
   END IF
   ' // Copy the string into the buffer
   DIM bRes AS BOOLEAN = this.InsertBuffer(pbuffer, nIndex, dwLen)
   Deallocate(pbuffer)
   RETURN bRes
END FUNCTION
' ========================================================================================
' ========================================================================================
PRIVATE FUNCTION DWSTRING.Insert (BYVAL pwszStr AS WSTRING PTR, BYVAL nIndex AS UINT) AS BOOLEAN
   DWSTRING_DP("DWSTRING Insert WSTRING PTR")
   IF nIndex < 1 OR nIndex > m_BufferLen THEN RETURN FALSE
   RETURN this.InsertBuffer(pwszStr, nIndex, .LEN(*pwszStr))
END FUNCTION
' ========================================================================================
' ========================================================================================
PRIVATE FUNCTION DWSTRING.Insert (BYREF dws AS DWSTRING, BYVAL nIndex AS UINT) AS BOOLEAN
   DWSTRING_DP("DWSTRING Insert DWSTRING")
   IF (nIndex < 1) OR nIndex > m_BufferLen THEN RETURN FALSE
   IF dws.m_BufferLen = 0 THEN RETURN FALSE
   RETURN this.InsertBuffer(dws.m_pBuffer, nIndex, dws.m_BufferLen)
END FUNCTION
' ========================================================================================

' ========================================================================================
' All data in the class object is erased. Actually, we only set the buffer length to zero,
' indicating no string in the buffer. The allocated memory for the buffer is deallocated
' when the class is destroyed.
' ========================================================================================
PRIVATE SUB DWSTRING.Clear
   DWSTRING_DP("DWSTRING Clear")
   m_BufferLen = 0
   m_pBuffer[m_BufferLen] = 0
END SUB
' ========================================================================================

' ========================================================================================
' The string parameter is appended to the string held in the class. If the internal string
' buffer overflows, the class will automatically extend it to an appropriate size.
' ========================================================================================
PRIVATE FUNCTION DWSTRING.Add (BYVAL pwszStr AS WSTRING PTR) AS BOOLEAN
   DWSTRING_DP("DWSTRING Add - WSTRING PTR")
   RETURN this.AppendBuffer(pwszStr, .LEN(*pwszStr))
END FUNCTION
' ========================================================================================
' ========================================================================================
PRIVATE FUNCTION DWSTRING.Add (BYREF ansiStr AS STRING, BYVAL nCodePage AS UINT = 0) AS BOOLEAN
   DWSTRING_DP("DWSTRING Add - STRING")
   IF LEN(ansiStr) = 0 THEN RETURN FALSE
   ' // Create the wide string from the incoming ansi string
   DIM dwLen AS UINT, pbuffer AS ANY PTR
   IF nCodePage = CP_UTF8 THEN
      dwLen = MultiByteToWideChar(CP_UTF8, 0, STRPTR(ansiStr), LEN(ansiStr), NULL, 0)
      IF dwLen = 0 THEN RETURN FALSE
      dwLen *= 2
      pbuffer = Allocate(dwLen)
      dwLen = MultiByteToWideChar(CP_UTF8, 0, STRPTR(ansiStr), LEN(ansiStr), pbuffer, dwLen)
      IF dwLen = 0 THEN RETURN FALSE
   ELSE
      dwLen = .LEN(ansiStr)
      dwLen *= 2
      pbuffer = Allocate(dwLen)
      dwLen = MultiByteToWideChar(nCodePage, MB_PRECOMPOSED, STRPTR(ansiStr), .LEN(ansiStr), pbuffer, dwLen)
      IF dwLen = 0 THEN RETURN FALSE
   END IF
   ' // Copy the string into the buffer
   DIM bRes AS BOOLEAN = this.AppendBuffer(pbuffer, dwLen)
   Deallocate(pbuffer)
   RETURN bRes
END FUNCTION
' ========================================================================================
' ========================================================================================
PRIVATE FUNCTION DWSTRING.Add (BYREF dws AS DWSTRING) AS BOOLEAN
   DWSTRING_DP("DWSTRING Add - DWSTRING")
   RETURN this.AppendBuffer(dws.m_pBuffer, dws.m_BufferLen)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Appends a WSTRING to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.+= (BYVAL pwszStr AS WSTRING PTR)
   DWSTRING_DP("DWSTRING OPERATOR &= WSTRING PTR")
   this.Add(pwszStr)
END OPERATOR
' ========================================================================================
' ========================================================================================
' Appends a STRING to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.+= (BYREF ansiStr AS STRING)
   DWSTRING_DP("DWSTRING OPERATOR += STRING")
   this.Add(ansiStr)
END OPERATOR
' ========================================================================================
' ========================================================================================
' Appends a DWSTRING to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.+= (BYREF dws AS DWSTRING)
   DWSTRING_DP("DWSTRING OPERATOR += DWSTRING")
   this.Add(dws)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Appends a number to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.+= (BYVAL n AS LONGINT)
   DWSTRING_DP("DWSTRING OPERATOR += LONGINT")
   DIM wsz AS WSTRING * 260 = .WSTR(n) : this.Add(wsz)
END OPERATOR
' ========================================================================================
' ========================================================================================
PRIVATE OPERATOR DWSTRING.+= (BYVAL n AS DOUBLE)
   DWSTRING_DP("DWSTRING OPERATOR += DOUBLE")
   DIM wsz AS WSTRING * 260 = .WSTR(n) : this.Add(wsz)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Appends a WSTRING to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.&= (BYVAL pwszStr AS WSTRING PTR)
   DWSTRING_DP("DWSTRING OPERATOR &= WSTRING PTR")
   this.Add(pwszStr)
END OPERATOR
' ========================================================================================
' ========================================================================================
' Appends a STRING to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.&= (BYREF ansiStr AS STRING)
   DWSTRING_DP("DWSTRING OPERATOR &= STRING")
   this.Add(ansiStr)
END OPERATOR
' ========================================================================================
' ========================================================================================
' Appends a DWSTRING to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.&= (BYREF dws AS DWSTRING)
   DWSTRING_DP("DWSTRING OPERATOR &= DWSTRING")
   this.Add(dws)
END OPERATOR
' ========================================================================================
' ========================================================================================
' Appends a number to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.&= (BYVAL n AS LONGINT)
   DWSTRING_DP("DWSTRING OPERATOR &= LONGINT")
   DIM wsz AS WSTRING * 260 = .WSTR(n) : this.Add(wsz)
END OPERATOR
' ========================================================================================
' ========================================================================================
PRIVATE OPERATOR DWSTRING.&= (BYVAL n AS DOUBLE)
   DWSTRING_DP("DWSTRING OPERATOR &= DOUBLE")
   DIM wsz AS WSTRING * 260 = .WSTR(n) : this.Add(wsz)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Assigns new text to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.Let (BYVAL pwszStr AS WSTRING PTR)
   DWSTRING_DP("DWSTRING Add - WSTRING PTR")
   this.Clear : this.Add(*pwszStr)
END OPERATOR
' ========================================================================================
' ========================================================================================
PRIVATE OPERATOR DWSTRING.Let (BYREF ansiStr AS STRING)
   DWSTRING_DP("DWSTRING LET STRING")
   this.Clear : this.Add(ansiStr)
END OPERATOR
' ========================================================================================
' ========================================================================================
PRIVATE OPERATOR DWSTRING.Let (BYREF dws AS DWSTRING)
   DWSTRING_DP("DWSTRING LET DWSTRING")
   IF m_pBuffer = dws.m_pBuffer THEN EXIT OPERATOR   ' // Ignore dws = dws
   this.Clear : this.Add(dws)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Assigns a number to the DWSTRING
' ========================================================================================
PRIVATE OPERATOR DWSTRING.Let (BYVAL n AS LONGINT)
   DWSTRING_DP("DWSTRING OPERATOR Let LONGINT")
   this.Clear : DIM wsz AS WSTRING * 260 = .WSTR(n) : this.Add(wsz)
END OPERATOR
' ========================================================================================
' ========================================================================================
PRIVATE OPERATOR DWSTRING.Let (BYVAL n AS DOUBLE)
   DWSTRING_DP("DWSTRING OPERATOR Let DOUBLE")
   this.Clear : DIM wsz AS WSTRING * 260 = .WSTR(n) : this.Add(wsz)
END OPERATOR
' ========================================================================================

' ########################################################################################
'                         *** GLOBAL OPERATORS AND FUNCTIONS ***
' ########################################################################################

' ========================================================================================
' Returns the length, in characters, of the DWSTRING.
' ========================================================================================
PRIVATE OPERATOR LEN (BYREF dws AS DWSTRING) AS UINT
   OPERATOR = dws.m_BufferLen
END OPERATOR
' ========================================================================================

' ========================================================================================
' One * returns the address of the DWSTRING buffer.
' Two ** deferences the string data.
' ========================================================================================
PRIVATE OPERATOR * (BYREF dws AS DWSTRING) AS WSTRING PTR
   OPERATOR = dws.m_pBuffer
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates two strings, converting non-strings to strings as needed
' ========================================================================================
PRIVATE OPERATOR & (BYREF dws1 AS DWSTRING, BYREF dws2 AS DWSTRING) AS DWSTRING
   DIM dwsRes AS DWSTRING = dws1
   dwsRes.Add(dws2)
   OPERATOR = dwsRes
END OPERATOR
' ========================================================================================

' ========================================================================================
' Returns the leftmost substring of a string
' ========================================================================================
PRIVATE FUNCTION Left OVERLOAD (BYREF dws AS DWSTRING, BYVAL nChars AS INTEGER) AS DWSTRING
   RETURN LEFT(*dws.m_pBuffer, nChars)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns the rightmost substring of a string
' ========================================================================================
PRIVATE FUNCTION Right OVERLOAD (BYREF dws AS DWSTRING, BYVAL nChars AS INTEGER) AS DWSTRING
   RETURN RIGHT(*dws.m_pBuffer, nChars)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Converts a string to a floating point number
' ========================================================================================
PRIVATE FUNCTION Val OVERLOAD (BYREF dws AS DWSTRING) AS DOUBLE
   RETURN .VAL(*dws.m_pBuffer)
END FUNCTION
' ========================================================================================

' =====================================================================================
' Converts the string to a 32bit integer
' =====================================================================================
PRIVATE FUNCTION ValLng OVERLOAD (BYREF dws AS DWSTRING) AS LONG
   RETURN .ValLng(*dws.m_pBuffer)
END FUNCTION
' =====================================================================================
' =====================================================================================
PRIVATE FUNCTION ValInt OVERLOAD (BYREF dws AS DWSTRING) AS LONG
   RETURN .ValInt(*dws.m_pBuffer)
END FUNCTION
' =====================================================================================

' =====================================================================================
' Converts the string to an unsigned 32bit integer
' =====================================================================================
PRIVATE FUNCTION ValULng OVERLOAD (BYREF dws AS DWSTRING) AS ULONG
   RETURN .ValULng(*dws.m_pBuffer)
END FUNCTION
' =====================================================================================
' =====================================================================================
PRIVATE FUNCTION ValUInt OVERLOAD (BYREF dws AS DWSTRING) AS ULONG
   RETURN .ValUInt(*dws.m_pBuffer)
END FUNCTION
' =====================================================================================

' =====================================================================================
' Converts the string to a 64bit integer
' =====================================================================================
PRIVATE FUNCTION ValLongInt OVERLOAD (BYREF dws AS DWSTRING) AS LONGINT
   RETURN .ValLng(*dws.m_pBuffer)
END FUNCTION
' =====================================================================================

' =====================================================================================
' Converts the string to an unsigned 64bit integer
' =====================================================================================
PRIVATE FUNCTION ValULongInt OVERLOAD (BYREF dws AS DWSTRING) AS ULONGINT
   RETURN .ValULng(*dws.m_pBuffer)
END FUNCTION
' =====================================================================================

' ########################################################################################
'                                *** HELPER FUNCTIONS ***
' ########################################################################################

' ========================================================================================
' qsort DWSTRING comparison function
' ========================================================================================
PRIVATE FUNCTION DWStringArrayCompare CDECL (BYVAL a AS DWSTRING PTR, BYVAL b AS DWSTRING PTR) AS LONG
   FUNCTION = wcscmp(a->m_pBuffer, b->m_pBuffer)
END FUNCTION
' ========================================================================================
' ========================================================================================
' Reverse qsort DWSTRING comparison function
' ========================================================================================
PRIVATE FUNCTION DWStringArrayReverseCompare CDECL (BYVAL a AS DWSTRING PTR, BYVAL b AS DWSTRING PTR) AS LONG
   DIM r AS LONG = wcscmp(a->m_pBuffer, b->m_pBuffer)
   IF r = 1 THEN r = -1 ELSE IF r = -1 THEN r = 1
   RETURN r
END FUNCTION
' ========================================================================================

' ========================================================================================
' Sorts a one-dimensional DWSTRING array calling the C qsort function.
' Parameters:
' - rgwstr : Start of target array.
' - numElm : Number of elements in the array.
' - bAscend: TRUE for sorting in ascending order; FALSE for sorting in descending order.
' Example:
' DIM rg(1 TO 10) AS DWSTRING
' FOR i AS LONG = 1 TO 10
'    rg(i) = "string " & i
' NEXT
' FOR i AS LONG = 1 TO 10
'   print rg(i)
' NEXT
' print "---- after sorting ----"
' DWStringArraySort rg()
' FOR i AS LONG = 1 TO 10
'    print rg(i)
' NEXT
' ========================================================================================
PRIVATE SUB DWStringSort (BYREF rgwstr AS ANY PTR, BYVAL numElm AS LONG, BYVAL bAscend AS BOOLEAN = TRUE)
   IF rgwstr = NULL OR numElm < 2 THEN EXIT SUB
   IF bAscend THEN
      qsort rgwstr, numElm, SIZEOF(DWSTRING), CPTR(ANY PTR, @DWStringArrayCompare)
   ELSE
      qsort rgwstr, numElm, SIZEOF(DWSTRING) , CPTR(ANY PTR, @DWStringArrayReverseCompare)
   END IF
END SUB
' ========================================================================================
' ========================================================================================
PRIVATE SUB DWStringArraySort (rgwstr() AS DWSTRING, BYVAL bAscend AS BOOLEAN = TRUE)
   DIM numElm AS LONG = UBOUND(rgwstr) - LBOUND(rgwstr) + 1
   DWStringSort @rgwstr(LBOUND(rgwstr)), numElm, bAscend
END SUB
' ========================================================================================


7 Jul 2018: Small modifications in the comments and removal of one unused line of code.
7 Jul 2018: Changed the position of the line m_pBuffer[m_BufferLen] = 0 in the ResizeBuffer method.
Last edited by Josep Roca on Jul 07, 2018 17:36, edited 9 times in total.
deltarho[1859]
Posts: 1440
Joined: Jan 02, 2017 0:34
Location: UK

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby deltarho[1859] » Jul 06, 2018 17:51

WOW!

Code: Select all

#Include Once "DWSTRING.bi"

? "Using DWstring" : ?
Dim dws As DWstring = "FreeBASIC"
Dim As String fin = " the roost"

? dws + " rules" + fin;" ";"=> ";Str(Len(dws + " rules" + fin));" chars"

? : ? "Using Wstring" : Print
Dim dws1 As Wstring*10 = "FreeBASIC"

? dws1 + " rules" + fin;" ";"=> ";Str(Len(dws1 + " rules" + fin));" chars"

Sleep
jj2007
Posts: 749
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby jj2007 » Jul 06, 2018 18:00

It works ;-)

Code: Select all

Dim dws As DWSTRING = "БесплатныйBASIC"
deltarho[1859]
Posts: 1440
Joined: Jan 02, 2017 0:34
Location: UK

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby deltarho[1859] » Jul 06, 2018 18:15

Have a look at the bi file - that is something else.
StringEpsilon
Posts: 37
Joined: Apr 09, 2015 20:49

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby StringEpsilon » Jul 06, 2018 19:19

Looks like the opposite of what I tried to do with ustring. Nice job!

A few notes though:

1) 260 byte growsize is a bit excessive. I'd set the default to 32 or 64 byte. I think normal freebasic strings use 32.

2) You depend on windows.bi, I think because you rely on MultiByteToWideChar() for conversions. (more on that below)

3) The utf8 getter won't work for all strings. At least judging from my tests of WCharToUTF. That function doesn't seem to cover all corners. At least surrogate-pairs fail in my tests.

I've already implemented functions to convert UTF-16 (including surrogate pairs) to proper UTF-8 for fbJSON. Should not be much trouble to make a UTF-8 to UTF-16 converter too, so you can drop windows.bi and make your lib cross-plattform. If you need help, shoot me a message.

(I can also cross-license the relevant fbJSON bits if you don't want to mix MPL and LGPL)

Edit: I'd add an example for the WCharToUTF error here, but the forum rejects the offending characters. But here's an escaped example: \uD800\uDBFF
Josep Roca
Posts: 376
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby Josep Roca » Jul 06, 2018 20:40

What I have tried is to achieve the maximum integration possible with FreeBasic intrinsic functions and operators, so we can use the same syntax as with the native FB string data types. Speed is also very important to me when dealing with strings and DWString is very fast. I have chosen 260 characters because it is the size of MAX_PATH and it is the size that I most often use when dealing with null terminated strings, but it can be changed if FBer's find it excesive.

If you think that you can improve it to make it cross platform, please feel free to do it. Unfortunately, I can't help you with this task because I never have used Linux and utf-8. I'm a Windows guy, sorry.

Regarding the license, any one that the FreeBasic users will find acceptable will be fine with me.

BTW I have many useful string functions that I can easily adapt to work with DWSTRINGs.
Josep Roca
Posts: 376
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby Josep Roca » Jul 07, 2018 18:02

So far, the only bug that I have found was that the line m_pBuffer[m_BufferLen] = 0 in the ResizeBuffer method was misplaced. I have updated the code in the original post.

DWStrProcs.bi - Part I

String functions. They work with STRING, ZSTRING, WSTRING, DWSTRING and string literals.

Code: Select all

' ########################################################################################
' Platform: Microsoft Windows
' Filename: DWStrProcs.bi
' Contents: String wrapper functions.
' Compiler: FreeBasic 32 & 64-bit, Unicode.
' Copyright (c) 2018 José Roca. Freeware. Use at your own risk.
' ########################################################################################

#pragma once
#include once "DWString.bi"

' ========================================================================================
' * Returns a copy of a string with substrings removed.
' If wszMatchStr is not present in wszMainStr, all of wszMainStr is returned intact.
' This function is case sensitive.
' Example: DWStrRemove("Hello World. Welcome to the Freebasic World", "World")
' ========================================================================================
PRIVATE FUNCTION DWStrRemove OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = wszMainStr
   DIM nLen AS LONG = LEN(wszMatchStr)
   DO
      DIM nPos AS LONG = INSTR(**dws, wszMatchStr)
      IF nPos = 0 THEN EXIT DO
      dws.DelChars nPos, nLen
   LOOP
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Case insensitive version of DWStrRemove.
' Example: DWStrRemoveI("Hello World. Welcome to the Freebasic World", "world")
' ========================================================================================
PRIVATE FUNCTION DWStrRemoveI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   DIM dwsMatchStr AS DWSTRING = UCASE(wszMatchStr)
   DIM nLen AS LONG = LEN(wszMatchStr)
   DO
      DIM nPos AS LONG = INSTR(UCASE(**dwsMainStr), **dwsMatchStr)
      IF nPos = 0 THEN EXIT DO
      dwsMainStr.DelChars nPos, nLen
   LOOP
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Returns a copy of a string with a substring enclosed between the specified delimiters removed.
' Parameters:
' nStart: [Optional]. The one-based starting position where to start the search
' wszMainStr: The main string
' wszDelim1: The first delimiter
' wszDelim2: The second delimiter
' fRemoveAll: TRUE or FALSE. TRUE = Recursively remove all the occurrences.
' This function is case-sensitive.
' Example:
' DIM dwsText AS DWSTRING = "blah blah (text beween parentheses) blah blah"
' print DWStrRemove(dwsText, "(", ")")   ' Returns "blah blah  blah blah"
' Example:
' DIM dwsText AS DWSTRING = "As Long var1(34), var2(  73 ), var3(any)"
' print DWStrRemove(dwsText, "(", ")", TRUE)   ' Returns "As Long var1, var2, var3"
' ========================================================================================
PRIVATE FUNCTION DWStrRemove OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszDelim1 AS CONST WSTRING, BYREF wszDelim2 AS CONST WSTRING, BYVAL fRemoveAll AS BOOLEAN = FALSE) AS DWSTRING
   DIM nPos1 AS LONG = INSTR(wszMainStr, wszDelim1)
   IF nPos1 = 0 THEN RETURN wszMainStr
   DIM nPos2 AS LONG = INSTR(nPos1 + LEN(wszDelim1), wszMainStr, wszDelim2)
   IF nPos2 = 0 THEN RETURN wszMainStr
   nPos2 += LEN(wszDelim2)
   DIM nLen AS LONG = nPos2 - nPos1
   IF fRemoveAll = FALSE THEN RETURN MID(wszMainStr, 1, nPos1 - 1) & MID(wszMainStr, nPos2)
   RETURN DWStrRemove(MID(wszMainStr, 1, nPos1 - 1) & MID(wszMainStr, nPos2), wszDelim1, wszDelim2, fRemoveAll)
END FUNCTION
' ========================================================================================
' ========================================================================================
PRIVATE FUNCTION DWStrRemove OVERLOAD (BYVAL nStart AS LONG = 1, BYREF wszMainStr AS CONST WSTRING, BYREF wszDelim1 AS CONST WSTRING, BYREF wszDelim2 AS CONST WSTRING, BYVAL fRemoveAll AS BOOLEAN = FALSE) AS DWSTRING
   DIM nLen AS LONG = LEN(wszMainStr)
   IF (nStart = 0) OR (nStart > nLen) THEN RETURN ""
   IF nStart < 0 THEN nStart = nLen + nStart + 1
   DIM nPos1 AS LONG = INSTR(nStart, wszMainStr, wszDelim1)
   IF nPos1 = 0 THEN RETURN wszMainStr
   DIM nPos2 AS LONG = INSTR(nPos1, wszMainStr, wszDelim2)
   IF nPos2 = 0 THEN RETURN wszMainStr
   nPos2 += LEN(wszDelim2)
   nLen = nPos2 - nPos1
   IF fRemoveAll = FALSE THEN RETURN MID(wszMainStr, 1, nPos1 - 1) & MID(wszMainStr, nPos2)
   RETURN DWStrRemove(nStart, MID(wszMainStr, 1, nPos1 - 1) & MID(wszMainStr, nPos2), wszDelim1, wszDelim2, fRemoveAll)
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Returns a copy of a string with characters removed.
' If wszMatchStr is not present in wszMainStr, all of wszMainStr is returned intact.
' wszMatchStr specifies a list of single characters to be searched for individually,
' a match on any one of which will cause that character to be removed from the result.
' This function is case sensitive.
' Example: DWStrRemoveAny("abacadabra", "bac")   ' -> "dr"
' ========================================================================================
PRIVATE FUNCTION DWStrRemoveAny (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   FOR i AS LONG = 1 TO LEN(wszMatchStr)
      DO
         DIM nPos AS LONG = INSTR(**dwsMainStr, MID(wszMatchStr, i, 1))
         IF nPos = 0 THEN EXIT DO
         dwsMainStr.DelChars nPos, 1
      LOOP
   NEXT
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrRemoveAny.
' Example: DWStrRemoveAnyI("abacadabra", "BaC")   ' -> "dr"
' ========================================================================================
PRIVATE FUNCTION DWStrRemoveAnyI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   DIM dwsMatchStr AS DWSTRING = UCASE(wszMatchStr)
   FOR i AS LONG = 1 TO LEN(wszMatchStr)
      DO
         DIM nPos AS LONG = INSTR(UCASE(**dwsMainStr), MID(**dwsMatchStr, i, 1))
         IF nPos = 0 THEN EXIT DO
         dwsMainStr.DelChars nPos, 1
      LOOP
   NEXT
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Within a specified string, replace all occurrences of one string with another string.
' Replaces all occurrences of wszMatchStr in wszMainStr with wszReplaceWith
' The replacement can cause wszMainStr to grow or condense in size.
' When a match is found, the scan for the next match begins at the position immediately
' following the prior match.
' This function is case sensitive.
' Example: DWStrReplace("Hello World", "World", "Earth")   ' -> "Hello Earth"
' ========================================================================================
PRIVATE FUNCTION DWStrReplace OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING, BYREF wszReplaceWith AS WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   DIM nLenReplaceWith AS LONG = LEN(wszReplaceWith)
   DIM nLenMatchStr AS LONG = LEN(wszMatchStr)
   DIM nPos AS LONG = 1
   DO
      nPos = INSTR(nPos, **dwsMainStr, wszMatchStr)
      IF nPos = 0 THEN EXIT DO
      dwsMainStr = MID(**dwsMainStr, 1, nPos - 1) + wszReplaceWith + MID(**dwsMainStr, nPos + nLenMatchStr)
      nPos += nLenReplaceWith
   LOOP
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrReplace.
' Example: DwStrReplaceI("Hello world", "World", "Earth")   ' -> "Hello Earth"
' ========================================================================================
PRIVATE FUNCTION DWStrReplaceI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING, BYREF wszReplaceWith AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   DIM dwsMatchStr AS DWSTRING = UCASE(wszMatchStr)
   DIM nLenReplaceWith AS LONG = LEN(wszReplaceWith)
   DIM nLenMatchStr AS LONG = LEN(wszMatchStr)
   DIM nPos AS LONG = 1
   DO
      nPos = INSTR(nPos, UCASE(**dwsMainStr), **dwsMatchStr)
      IF nPos = 0 THEN EXIT DO
      dwsMainStr = MID(**dwsMainStr, 1, nPos - 1) + wszReplaceWith + MID(**dwsMainStr, nPos + nLenMatchStr)
      nPos += nLenReplaceWith
   LOOP
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Within a specified string, replace all occurrences of any of the individual characters
' specified in the wszMainStr string.
' wszReplaceWith must be a single character. This function does not replace words therefore
' wszMatchStr will be the same size - it will not shrink or grow.
' This function is case-sensitive.
' Example: DWStrReplaceAny("abacadabra", "bac", "*")   ' -> *****d**r*
' ========================================================================================
PRIVATE FUNCTION DWStrReplaceAny (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING, BYREF wszReplaceWith AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   IF LEN(wszMatchStr) = 0 THEN RETURN dwsMainStr
   IF LEN(wszReplaceWith) = 0 THEN RETURN dwsMainStr
   FOR x AS LONG = 1 TO LEN(wszMatchStr)
      FOR i AS LONG = 1 TO LEN(wszMainStr)
         IF MID(wszMatchStr, x, 1) = MID(wszMainStr, i, 1) THEN
            MID(**dwsMainStr, i, 1) = wszReplaceWith
         END IF
      NEXT
   NEXT
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrReplaceAny.
' Example: DWStrReplaceAnyI("abacadabra", "BaC", "*")   ' -> *****d**r*
' ========================================================================================
PRIVATE FUNCTION DWStrReplaceAnyI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING, BYREF wszReplaceWith AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   IF LEN(wszMatchStr) = 0 THEN RETURN dwsMainStr
   IF LEN(wszReplaceWith) = 0 THEN RETURN dwsMainStr
   FOR x AS LONG = 1 TO LEN(wszMatchStr)
      FOR i AS LONG = 1 TO LEN(wszMainStr)
         IF MID(UCASE(wszMatchStr), x, 1) = MID(UCASE(wszMainStr), i, 1) THEN
            MID(**dwsMainStr, i, 1) = wszReplaceWith
         END IF
      NEXT
   NEXT
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Reverses the contents of a string expression.
' Usage example: DIM dws AS DWSTRING = DWStrReverse("garden")
' ========================================================================================
PRIVATE FUNCTION DWStrReverse (BYREF wszMainStr AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   DIM wszChar AS WSTRING * 2
   DIM nLen AS LONG = LEN(wszMainStr)
   FOR i AS LONG = 1 TO nLen \ 2
      wszChar = MID(**dwsMainStr, i, 1)
      MID(**dwsMainStr, i, 1) = MID(**dwsMainStr, nLen - i + 1, 1)
      MID(**dwsMainStr, nLen - i + 1, 1) = wszChar
   NEXT
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Extracts characters from a string up to a character or group of characters.
' Complement function to DWStrRemain.
' Returns a substring of wszMainStr starting with its first character (or the character
' specified by nStart) and up to (but not including) the first occurrence of wszMatchStr
' If wszMatchStr is not present in wszMainStr (or is null) then all of wszMainStr is
' returned from the nStart position.
' This function is case-sensitive.
' The following line returns "aba" (match on "cad")
' DIM dws AS DWSTRING = DWStrExtract(1, "abacadabra","cad")
' ========================================================================================
PRIVATE FUNCTION DWStrExtract OVERLOAD (BYVAL nStart AS LONG = 1, BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM nLen AS LONG = LEN(wszMainStr)
   IF (nStart = 0) OR (nStart > nLen) THEN RETURN ""
   IF nStart < 0 THEN nStart = nLen + nStart + 1
   DIM nPos AS LONG = INSTR(nStart, wszMainStr, wszMatchStr)
   IF nPos THEN RETURN MID(wszMainStr, nStart, nPos - nStart)
   RETURN MID(wszMainStr, nStart)
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrExtract.
' DIM dws AS DWSTRING = DWStrExtractI(1, "abacadabra","CaD")
' ========================================================================================
PRIVATE FUNCTION DWStrExtractI (BYVAL nStart AS LONG = 1, BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = wszMainStr
   DIM nLen AS LONG = LEN(wszMainStr)
   IF (nStart = 0) OR (nStart > nLen) THEN RETURN ""
   IF nStart < 0 THEN nStart = nLen + nStart + 1
   DIM nPos AS LONG = INSTR(nStart, UCASE(wszMainStr), UCASE(wszMatchStr))
   IF nPos THEN RETURN MID(wszMainStr, nStart, nPos - nStart )
   RETURN dws = MID(wszMainStr, nStart)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns the portion of a string following the occurrence of a specified delimiter up to
' the second delimiter. If one of the delimiters isn't found, it returns an empty string.
' Parameters:
' nStart: [Optional]. The one-based starting position where to start the search
' wszMainStr: The main string
' wszDelim1: The first delimiter
' wszDelim2: The second delimiter
' This function is case-sensitive.
' Example:
' DIM dwsText AS DWSTRING = "blah blah (text beween parentheses) blah blah"
' print DWStrExtract(dwsText, "(", ")")
' ========================================================================================
PRIVATE FUNCTION DWStrExtract OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszDelim1 AS CONST WSTRING, BYREF wszDelim2 AS CONST WSTRING) AS DWSTRING
   DIM nPos1 AS LONG = INSTR(wszMainStr, wszDelim1)
   IF nPos1 = 0 THEN RETURN ""
   nPos1 += LEN(wszDelim1)
   DIM nPos2 AS LONG = INSTR(nPos1, wszMainStr, wszDelim2)
   IF nPos2 = 0 THEN RETURN ""
   DIM nLen AS LONG = nPos2 - nPos1
   RETURN MID(wszMainStr, nPos1, nLen)
END FUNCTION
' ========================================================================================
' ========================================================================================
PRIVATE FUNCTION DWStrExtract OVERLOAD (BYVAL nStart AS LONG = 1, BYREF wszMainStr AS CONST WSTRING, BYREF wszDelim1 AS CONST WSTRING, BYREF wszDelim2 AS CONST WSTRING) AS DWSTRING
   DIM nLen AS LONG = LEN(wszMainStr)
   IF (nStart = 0) OR (nStart > nLen) THEN RETURN ""
   IF nStart < 0 THEN nStart = nLen + nStart + 1
   DIM nPos1 AS LONG = INSTR(nStart, wszMainStr, wszDelim1)
   IF nPos1 = 0 THEN RETURN ""
   nPos1 += LEN(wszDelim1)
   DIM nPos2 AS LONG = INSTR(nPos1, wszMainStr, wszDelim2)
   IF nPos2 = 0 THEN RETURN ""
   nLen = nPos2 - nPos1
   RETURN MID(wszMainStr, nPos1, nLen)
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Extract characters from a string up to a specific character.
' Returns a substring of wszMainStr starting with its first character (or the character
' specified by nStart) and up to (but not including) the first occurrence of wszMatchStr.
' wszMatchStr specifies a list of single characters to be searched for individually, a
' match on any one of which will cause the extract operation to be performed up to that character.
' If wszMatchStr is not present in wszMainStr (or is null) then all of wszMainStr is returned.
' This function is case-sensitive.
' The following line returns "aba" (match on "c")
' Example: DWStrExtractAny(1, "abacadabra","cd")
' ========================================================================================
PRIVATE FUNCTION DWStrExtractAny (BYVAL nStart AS LONG = 1, BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   DIM nLen AS LONG = LEN(wszMainStr)
   IF (nStart = 0) OR (nStart > nLen) THEN RETURN ""
   IF nStart < 0 THEN nStart = nLen + nStart + 1
   FOR i AS LONG = nStart TO nLen
      FOR x AS LONG = 1 TO LEN(wszMatchStr)
         IF MID(wszMainStr, i, 1) = MID(wszMatchStr, x, 1) THEN
            dwsMainStr = MID(wszMainStr, nStart, i - nStart)
            RETURN dwsMainStr
         END IF
      NEXT
   NEXT
   RETURN ""
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrExtractAny.
' Example: DWStrExtractAnyI(1, "abacadabra","CD")
' ========================================================================================
PRIVATE FUNCTION DWStrExtractAnyI (BYVAL nStart AS LONG = 1, BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dwsMainStr AS DWSTRING = wszMainStr
   DIM nLen AS LONG = LEN(wszMainStr)
   IF (nStart = 0) OR (nStart > nLen) THEN RETURN ""
   IF nStart < 0 THEN nStart = nLen + nStart + 1
   FOR i AS LONG = nStart TO nLen
      FOR x AS LONG = 1 TO LEN(wszMatchStr)
         IF MID(UCASE(wszMainStr), i, 1) = MID(UCASE(wszMatchStr), x, 1) THEN
            dwsMainStr = MID(wszMainStr, nStart, i - nStart)
            RETURN dwsMainStr
         END IF
      NEXT
   NEXT
   RETURN ""
END FUNCTION
' ========================================================================================


' ========================================================================================
' * Complement to the DWStrExtract function.
' Returns the portion of a string following the first occurrence of a substring.
' wszMainStr is searched for the string specified in wszMatchStr If found, all characters
' after wszMatchStr are returned. If wszMatchStr is not present in wszMainStr (or is null) then
' a zero-length empty string is returned.
' nStart is an optional starting position to begin searching. If nStart is not specified,
' position 1 will be used. If nStart is zero, a nul string is returned. If nStart is negative,
' the starting position is counted from right to left: if -1, the search begins at the last
' character; if -2, the second to last, and so forth.
' This function is case-sensitive.
' Example: DWStrRemain("Brevity is the soul of wit", "is ")   ' -> "the soul of wit"
' ========================================================================================
PRIVATE FUNCTION DWStrRemain (BYREF wszMainStr AS WSTRING, BYREF wszMatchStr AS CONST WSTRING, BYVAL nStart AS LONG = 1) AS DWSTRING
   IF LEN(wszMainStr) = 0 OR LEN(wszMatchStr) = 0 THEN RETURN ""
   IF nStart = 0 OR nStart > LEN(wszMainStr) THEN RETURN ""
   IF nStart < 0 THEN nStart = LEN(wszMainStr) + nStart + 1
   DIM nPos AS LONG = INSTR(nStart, wszMainStr, wszMatchStr)
   IF nPos = 0 THEN RETURN ""
   DIM dwsMainStr AS DWSTRING = wszMainStr
   dwsMainStr = MID(**dwsMainStr, nPos + LEN(wszMatchStr))
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrRemain.
' Example: DWStrRemainI("Brevity is the soul of wit", "Is ")   ' -> "the soul of wit"
' ========================================================================================
PRIVATE FUNCTION DWStrRemainI (BYREF wszMainStr AS WSTRING, BYREF wszMatchStr AS CONST WSTRING, BYVAL nStart AS LONG = 1) AS DWSTRING
   IF LEN(wszMainStr) = 0 OR LEN(wszMatchStr) = 0 THEN RETURN ""
   IF nStart = 0 OR nStart > LEN(wszMainStr) THEN RETURN ""
   IF nStart < 0 THEN nStart = LEN(wszMainStr) + nStart + 1
   DIM nPos AS LONG = INSTR(nStart, UCASE(wszMainStr), UCASE(wszMatchStr))
   IF nPos = 0 THEN RETURN ""
   DIM dwsMainStr AS DWSTRING = wszMainStr
   dwsMainStr = MID(**dwsMainStr, nPos + LEN(wszMatchStr))
   RETURN dwsMainStr
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Complement to the DWStrExtract function. Returns the portion of a string following the
' first occurrence of a character or group of characters.
' wszMainStr is searched for the string specified in wszMatchStr If found, all characters
' after wszMatchStr are returned. If wszMatchStr is not present in wszMainStr (or is null) then
' a zero-length empty string is returned.
' wszMatchStr specifies a list of single characters to be searched for individually. A match
' on any one of which will cause the extract operation be performed after that character.
' nStart is an optional starting position to begin searching. If nStart is not specified,
' position 1 will be used. If nStart is zero, a nul string is returned. If nStart is negative,
' the starting position is counted from right to left: if -1, the search begins at the last
' character; if -2, the second to last, and so forth.
' This function is case-sensitive.
' Example: DWStrRemainAny("I think, therefore I am", ",")   ' -> " therefore I am"
' ========================================================================================
PRIVATE FUNCTION DWStrRemainAny (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS WSTRING, BYVAL nStart AS LONG = 1) AS DWSTRING
   IF LEN(wszMainStr) = 0 OR LEN(wszMatchStr) = 0 THEN RETURN ""
   IF nStart = 0 OR nStart > LEN(wszMainStr) THEN RETURN ""
   IF nStart < 0 THEN nStart = LEN(wszMainStr) + nStart + 1
   DIM dwsMainStr AS DWSTRING
   FOR i AS LONG = nStart TO LEN(wszMainStr)
      FOR x AS LONG = 1 TO LEN(wszMatchStr)
         IF MID(wszMainStr, i, 1) = MID(wszMatchStr, x, 1) THEN
            dwsMainStr = MID(wszMainStr, i + 1)
            RETURN dwsMainStr
         END IF
      NEXT
   NEXT
   RETURN ""
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrRemainAny.
' Example: DWStrRemainAnyI("I think, therefore I am", "E")   ' -> "refore I am"
' ========================================================================================
PRIVATE FUNCTION DWStrRemainAnyI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING, BYVAL nStart AS LONG = 1) AS DWSTRING
   IF LEN(wszMainStr) = 0 OR LEN(wszMatchStr) = 0 THEN RETURN ""
   IF nStart = 0 OR nStart > LEN(wszMainStr) THEN RETURN ""
   IF nStart < 0 THEN nStart = LEN(wszMainStr) + nStart + 1
   DIM dwsMainStr AS DWSTRING
   FOR i AS LONG = nStart TO LEN(wszMainStr)
      FOR x AS LONG = 1 TO LEN(wszMatchStr)
         IF MID(UCASE(wszMainStr), i, 1) = MID(UCASE(wszMatchStr), x, 1) THEN
            dwsMainStr = MID(wszMainStr, i + 1)
            RETURN dwsMainStr
         END IF
      NEXT
   NEXT
   RETURN ""
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Count the number of occurrences of strings within a string.
' wszMainStr is the string expression in which to count characters.
' wszMatchStr is the string expression to count all occurrences of.
' If cbMatchStr is not present in wszMainStr, zero is returned.
' When a match is found, the scan for the next match begins at the position immediately
' following the prior match.
' This function is case-sensitive.
' Example: DIM nCount AS LONG = DWStrTally("abacadabra", "ab")   ' -> 2
' ========================================================================================
PRIVATE FUNCTION DWStrTally (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS LONG
   DIM nCount AS LONG, nPos AS LONG = 1
   DIM nLen AS LONG = LEN(wszMatchStr)
   DO
      nPos = INSTR(nPos, wszMainStr, wszMatchStr)
      IF nPos = 0 THEN EXIT DO
      nCount += 1
      nPos += nLen
   LOOP
   RETURN nCount
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrTally.
' Example: DIM nCount AS LONG = DWStrTallyI("abacadabra", "Ab")   ' -> 2
' ========================================================================================
PRIVATE FUNCTION DWStrTallyI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS LONG
   DIM nCount AS LONG, nPos AS LONG = 1
   DIM dwsMainStr AS DWSTRING = UCASE(wszMainStr)
   DIM dwsMatchStr AS DWSTRING = UCASE (wszMatchStr)
   DIM nLen AS LONG = LEN(wszMatchStr)
   DO
      nPos = INSTR(nPos, **dwsMainStr, **dwsMatchStr)
      IF nPos = 0 THEN EXIT DO
      nCount += 1
      nPos += nLen
   LOOP
   RETURN nCount
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Count the number of occurrences of specified characters strings within a string.
' wszMainStr is the string expression in which to count characters.
' wszMatchStr is a list of single characters to be searched for individually. A match on
' any one of which will cause the count to be incremented for each occurrence of that
' character. Note that repeated characters in wszMatchStr will not increase the count.
' This function is case-sensitive.
' Example: DIM nCount AS LONG = DWStrTallyAny("abacadabra", "bac")   ' -> 8
' ========================================================================================
PRIVATE FUNCTION DWStrTallyAny (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS LONG
   IF LEN(wszMainStr) = 0 OR LEN(wszMatchStr) = 0 THEN EXIT FUNCTION
   ' // Remove possible duplicates in the matches string
   DIM nPos AS LONG
   DIM dwsMatchStr AS DWSTRING = wszMatchStr
   FOR i AS LONG = 1 TO LEN(dwsMatchStr)
      nPos = INSTR(**dwsMatchStr, MID(wszMatchStr, i, 1))
      IF nPos = 0 THEN dwsMatchStr += MID(wszMatchStr, i, 1)
   NEXT
   ' // Do the count
   DIM nCount AS LONG
   FOR i AS LONG = 1 TO LEN(dwsMatchStr)
      nPos = 1
      DO
         nPos = INSTR(nPos, wszMainStr, MID(**dwsMatchStr, i, 1))
         IF nPos = 0 THEN EXIT DO
         IF nPos THEN
            nCount += 1
            nPos += 1
         END IF
      LOOP
   NEXT
   RETURN nCount
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrTallyAny.
' Example: DIM nCount AS LONG = DWStrTallyAnyI("abacadabra", "bAc")
' ========================================================================================
PRIVATE FUNCTION DWStrTallyAnyI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS LONG
   IF LEN(wszMainStr) = 0 OR LEN(wszMatchStr) = 0 THEN EXIT FUNCTION
   ' // Remove possible duplicates in the matches string
   DIM nPos AS LONG
   DIM dwsMainStr AS DWSTRING = UCASE(wszMainStr)
   DIM dwsMatchStr AS DWSTRING = UCASE(wszMatchStr)
   FOR i AS LONG = 1 TO LEN(dwsMatchStr)
      nPos = INSTR(**dwsMatchStr, MID(wszMatchStr, i, 1))
      IF nPos = 0 THEN dwsMatchStr += MID(wszMatchStr, i, 1)
   NEXT
   ' // Do the count
   DIM nCount AS LONG
   FOR i AS LONG = 1 TO LEN(dwsMatchStr)
      nPos = 1
      DO
         nPos = INSTR(nPos, **dwsMainStr, MID(**dwsMatchStr, i, 1))
         IF nPos = 0 THEN EXIT DO
         IF nPos THEN
            nCount += 1
            nPos += 1
         END IF
      LOOP
   NEXT
   RETURN nCount
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Determine whether each character of a string is present in another string.
' Returns zero if each character in wszMainStr is present in wszMatchStr
' If not, it returns the position of the first non-matching character in wszMainStr.
' This function is very useful for determining if a string contains only numeric digits, for example.
' This function is case-sensitive.
' If nStart evaluates to a position outside of the string, or if nStart is zero, then the
' function returns zero.
' Example: DIM nCount AS LONG = DWStrVerify(5, "123.65,22.5", "0123456789")   ' -> 7
' Rreturns 7 since 5 starts it past the first non-digit ("." at position 4)
' ========================================================================================
PRIVATE FUNCTION DWStrVerify (BYVAL nStart AS LONG, BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS LONG
   IF nStart <= 0 OR nStart > LEN(wszMainStr) THEN RETURN 0
   ' // Get each character in wszMainStr and look for it in wszMatchStr
   DIM AS LONG nPos, idx
   FOR i AS LONG = nStart TO LEN(wszMainStr)
      nPos = INSTR(wszMatchStr, MID(wszMainStr, i, 1))
      IF nPos = 0 THEN
         idx = i
         EXIT FOR
      END IF
   NEXT
   RETURN  idx
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case sensintive version of DWStrVerify.
' Example: DWStrVerifyI(5, "123.65abcx22.5", "0123456789ABC")   ' -> 10
' ========================================================================================
PRIVATE FUNCTION DWStrVerifyI (BYVAL nStart AS LONG, BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS LONG
   IF nStart <= 0 OR nStart > LEN(wszMainStr) THEN RETURN 0
   ' // Get each character in wszMainStr and look for it in wszMatchStr
   DIM dwsMainStr AS DWSTRING = UCASE(wszMainStr)
   DIM dwsMatchStr AS DWSTRING = UCASE(wszMatchStr)
   DIM AS LONG nPos, idx
   FOR i AS LONG = nStart TO LEN(dwsMainStr)
      nPos = INSTR(**dwsMatchStr, MID(**dwsMainStr, i, 1))
      IF nPos = 0 THEN
         idx = i
         EXIT FOR
      END IF
   NEXT
   RETURN  idx
END FUNCTION
' ========================================================================================


10 Jul 2018: Changed some AfxStr... to DWStr... in the comments.
Last edited by Josep Roca on Jul 10, 2018 20:49, edited 4 times in total.
Josep Roca
Posts: 376
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby Josep Roca » Jul 07, 2018 18:04

DWStrProcs.bi - Part II

Code: Select all

' ========================================================================================
' Returns a string containing a left-justified (padded) string.
' If the optional parameter wszPadCharacter not specified, the function pads the string with
' space characters to the left. Otherwise, the function pads the string with the first
' character of wszPadCharacter
' Example: DIM dws AS DWSTRING = DWStrLSet("FreeBasic", 20, "*")
' ========================================================================================
PRIVATE FUNCTION DWStrLSet (BYREF wszMainStr AS CONST WSTRING, BYVAL nStringLength AS LONG, BYREF wszPadCharacter AS CONST WSTRING = " ") AS DWSTRING
   DIM dws AS DWSTRING = WSTRING(nStringLength, wszPadCharacter)
   MID(**dws, 1, LEN(wszMainStr)) = wszMainStr
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns a string containing a right-justified (padded) string.
' If the optional parameter wszPadCharacter not specified, the function pads the string with
' space characters to the left. Otherwise, the function pads the string with the first
' character of wszPadCharacter.
' Example: DIM dws AS DWSTRING = DWStrRSet("FreeBasic", 20, "*")
' ========================================================================================
PRIVATE FUNCTION DWStrRSet (BYREF wszMainStr AS CONST WSTRING, BYVAL nStringLength AS LONG, BYREF wszPadCharacter AS CONST WSTRING = " ") AS DWSTRING
   IF LEN(wszMainStr) > nStringLength THEN RETURN LEFT(wszMainStr, nStringLength)
   DIM dws AS DWSTRING = WSTRING(nStringLength, wszPadCharacter)
   MID(**dws, nStringLength - LEN(wszMainStr) + 1, LEN(wszMainStr)) = wszMainStr
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns a string containing a centered (padded) string.
' If the optional parameter wszPadCharacter not specified, the function pads the string with
' space characters to the left. Otherwise, the function pads the string with the first
' character of wszPadCharacter.
' Example: DIM dws AS DWSTRING = DWStrCSet("FreeBasic", 20, "*")
' ========================================================================================
PRIVATE FUNCTION DWStrCSet (BYREF wszMainStr AS CONST WSTRING, BYVAL nStringLength AS LONG, BYREF wszPadCharacter AS CONST WSTRING = " ") AS DWSTRING
   IF LEN(wszMainStr) > nStringLength THEN RETURN LEFT(wszMainStr, nStringLength)
   DIM dws AS DWSTRING = WSTRING(nStringLength, wszPadCharacter)
   MID(**dws, (nStringLength - LEN(wszMainStr)) \ 2 + 1, LEN(wszMainStr)) = wszMainStr
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
'  Parses a path/file name to extract component parts.
'  This function evaluates a text path/file text name, and returns a requested part of the
'  name. The functionality is strictly one of string parsing alone.
'  wszOption is one of the following words which is used to specify the requested part:
'  PATH
'        Returns the path portion of the path/file Name. That is the text up to and
'        including the last backslash (\) or colon (:).
'  NAME
'        Returns the name portion of the path/file Name. That is the text to the right
'        of the last backslash (\) or colon (:), ending just before the last period (.).
'  EXTN
'        Returns the extension portion of the path/file name. That is the last
'        period (.) in the string plus the text to the right of it.
'  NAMEX
'        Returns the name and the EXTN parts combined.
' Example:
' DIM dwsPath AS DWSTRING = ExePath
' PRINT DWStrPathName("Name", dwsPath)
' ========================================================================================
PRIVATE FUNCTION DWStrPathName (BYREF wszOption AS CONST WSTRING, BYREF wszFileSpec AS WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = ""
   IF LEN(wszFileSpec) = 0 THEN RETURN dws
   SELECT CASE UCASE(wszOption)
      CASE "PATH"
         ' // Returns the path portion of file spec
         DIM nPos AS LONG = InstrRev(wszFileSpec, ANY ":/\")
         IF nPos THEN dws = MID(wszFileSpec, 1, nPos)
      CASE "NAME"
         ' // Retrieve the full filename
         dws = wszFileSpec
         DIM nPos AS LONG = InstrRev(wszFileSpec, ANY ":/\")
         IF nPos THEN dws = MID(wszFileSpec, nPos + 1)
         ' // Retrieve the filename
         nPos = InstrRev(dws, ".")
         IF nPos THEN dws = MID(dws, 1, nPos - 1)
      CASE "NAMEX"
         ' // Retrieve the name and extension combined
         DIM nPos AS LONG = InStrRev(wszFileSpec, ANY ":/\")
         IF nPos THEN dws = MID(wszFileSpec, nPos + 1) ELSE dws = wszFileSpec
      CASE "EXTN"
         ' // Retrieve the name and extension combined
         DIM nPos AS LONG = InstrRev(wszFileSpec, ANY ":/\")
         IF nPos THEN dws = MID(wszFileSpec, nPos + 1) ELSE dws = wszFileSpec
         ' // Retrieve the extension
         nPos = InStrRev(dws, ".")
         IF nPos THEN dws = MID(dws, nPos) ELSE dws = ""
   END SELECT
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns a string consisting of multiple copies of the specified string.
' This function is very similar to STRING (which makes multiple copies of a single character).
' Example: DIM dws AS DWSTRING = DWStrRepeat(5, "Paul")
' ========================================================================================
PRIVATE FUNCTION DWStrRepeat (BYVAL nCount AS LONG, BYREF wszMainStr AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = ""
   IF nCount <= 0 THEN RETURN dws
   ' // Create the final full buffer and insert the strings into it
   ' // in order to avoid nCount concatenations.
   DIM nLen AS LONG = LEN(wszMainStr)
   dws = SPACE(nCount * nLen)
   FOR i AS LONG = 0 TO nCount - 1
      MID(**dws, (i * nLen) + 1, nLen) = wszMainStr
   NEXT
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns a string with nCount characters removed from the left side of the string.
' If nCount is less than one then the entire string is returned.
' Example: DIM dws AS DWSTRING = DWStrClipLeft("1234567890", 3)
' ========================================================================================
PRIVATE FUNCTION DWStrClipLeft (BYREF wszMainStr AS CONST WSTRING, BYVAL nCount AS LONG) AS DWSTRING
   DIM dws AS DWSTRING = wszMainStr
   IF nCount <= 0 THEN RETURN dws
   DIM nLen AS LONG = LEN(wszMainStr)
   nCount = IIF(nLen < nCount, nLen, nCount)
   dws = MID(wszMainStr, nCount + 1)
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns a string with nCount characters removed from the right side of the string.
' If nCount is less than one then the entire string is returned.
' DIM dws AS DWSTRING = DWStrClipRight("1234567890", 3)
' ========================================================================================
PRIVATE FUNCTION DWStrClipRight (BYREF wszMainStr AS CONST WSTRING, BYVAL nCount AS LONG) AS DWSTRING
   DIM dws AS DWSTRING = wszMainStr
   IF nCount <= 0 THEN RETURN dws
   DIM nLen AS LONG = LEN(wszMainStr)
   nCount = nLen - nCount
   nCount = IIF(nLen < nCount, nLen, nCount)
   dws = LEFT(wszMainStr, nCount)
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns a string with nCount characters removed starting at position nStart. The first
' character is considered position 1, the second position 2, etc...
' If nCount or nStart is less than one then the entire string is returned.
' Usage example:
' DIM dws AS DWSTRING = DWStrClipMid("1234567890", 3, 4)
' ========================================================================================
PRIVATE FUNCTION DWStrClipMid (BYREF wszMainStr AS CONST WSTRING, BYVAL nStart AS LONG, BYVAL nCount AS LONG) AS DWSTRING
   DIM dws AS DWSTRING = wszMainStr
   IF (nCount <= 0) OR (nStart <= 0) THEN RETURN dws
   DIM nLen AS LONG = LEN(wszMainStr)
   dws = LEFT(wszMainStr, nStart - 1) + MID(wszMainStr, nStart + nCount)
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
'  Adds paired characters to the beginning and end of a string.
'  It is particularly useful for enclosing text with parenthesess, quotes, brackets, etc.
'  For example: DWStrWrap("Paul", "<", ">") results in <Paul>
'  If only one wrap character/string is specified then that character or string is used
'  for both sides.
'  For example: DWStrWrap("Paul", "'") results in 'Paul'
'  If no wrap character/string is specified then double quotes are used.
'  For example: DWStrWrap("Paul") results in "Paul"
' ========================================================================================
PRIVATE FUNCTION DWStrWrap OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszLeftChar AS CONST WSTRING, BYREF wszRightChar AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = wszLeftChar + wszMainStr & wszRightChar
   RETURN dws
END FUNCTION
' ========================================================================================
' ========================================================================================
PRIVATE FUNCTION DWStrWrap OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszChar AS CONST WSTRING = CHR(34)) AS DWSTRING
   DIM dws AS DWSTRING = wszChar + wszMainStr + wszChar
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' Removes paired characters to the beginning and end of a string.
' It is particularly useful for removing text with parenthesess, quotes, brackets, etc.
' For example: DWStrUnWrap("<Paul>", "<", ">") results in Paul
' If only one unwrap character/string is specified then that character or string is used for both sides.
' For example: DWStrUnWrap("'Paul'", "'") results in Paul
' If no wrap character/string is specified then double quotes are used.
' For example: DWStrUnWrap("""Paul""") results in Paul
' ========================================================================================
PRIVATE FUNCTION DWStrUnWrap OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszLeftChar AS CONST WSTRING, BYREF wszRightChar AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = LTRIM(wszMainStr, wszLeftChar)
   dws = RTRIM(dws, wszRightChar)
   RETURN dws
END FUNCTION
' ========================================================================================
' ========================================================================================
PRIVATE FUNCTION DWStrUnWrap OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszChar AS CONST WSTRING = CHR(34)) AS DWSTRING
   DIM dws AS DWSTRING = LTRIM(wszMainStr, wszChar)
   dws = RTRIM(**dws, wszChar)
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Deletes a specified number of characters from a string expression.
' Returns a string based on wszMainStr but with nCount characters deleted
' starting at position nStart. The first character in the string is position 1, etc.
' Usage example:
' DIM dws AS DWSTRING = DWStrDelete("1234567890", 4, 3)
' ========================================================================================
PRIVATE FUNCTION DWStrDelete (BYREF wszMainStr AS CONST WSTRING, BYVAL nStart AS LONG, BYVAL nCount AS LONG) AS DWSTRING
   DIM dws AS DWSTRING = wszMainStr
   DIM nLen AS LONG = LEN(wszMainStr)
   IF nLen = 0 OR nStart < 0 OR nCount <= 0 OR nStart > nLen THEN RETURN dws
   dws.DelChars nStart, nCount
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Inserts a string at a specified position within another string expression.
' Returns a string consisting of wszMainStr with the string wszInsertString inserted
' at nPosition. If nPosition is greater than the length of wszMainStr or <= zero then
' wszInsertString is appended to wszMainStr. The first character in the string is position 1, etc.
' DIM dws AS DWSTRING = DWStrInsert("1234567890", "--", 6)
' ========================================================================================
PRIVATE FUNCTION DWStrInsert (BYREF wszMainStr AS CONST WSTRING, BYREF wszInsertString AS WSTRING, BYVAL nPosition AS LONG) AS DWSTRING
   DIM dws AS DWSTRING = wszMainStr
   IF nPosition <= 0 THEN RETURN dws
   IF nPosition > LEN(wszMainStr) THEN
      dws += wszInsertString
   ELSEIF nPosition = 1 THEN
      dws = wszInsertString + MID(wszMainStr, 1)
   ELSE
      dws = MID(wszMainStr, 1, nPosition - 1) + wszInsertString + MID(wszMainStr, nPosition)
   END IF
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Returns a string containing only the characters contained in a specified match string.
' All other characters are removed. If wszMatchStr is an empty string the function returns
' an empty string. This function is case-sensitive.
' Example: DIM dws AS DWSTRING = DWStrRetain("abacadabra","b")   ' -> "bb"
' ========================================================================================
PRIVATE FUNCTION DWStrRetain (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = ""
   IF LEN(wszMainStr) = 0 OR LEN(wszMatchStr) = 0 THEN RETURN dws
   DIM nLen AS LONG = LEN(wszMatchStr)
   DIM nPos AS LONG = 1
   DO
      nPos = INSTR(nPos, wszMainStr, wszMatchStr)
      IF nPos = 0 THEN EXIT DO
      dws += MID(wszMainStr, nPos, nLen)
      nPos += nLen
   LOOP
   RETURN dws
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DEStrRetain.
' Example: DIM dws AS DWSTRING = DWStrRetainI("abacadabra","B")   ' -> "bb"
' ========================================================================================
PRIVATE FUNCTION DWStrRetainI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = ""
   IF LEN(wszMainStr) = 0 OR LEN(wszMatchStr) = 0 THEN RETURN dws
   DIM dwsMainStr AS DWSTRING = UCASE(wszMainStr)
   DIM dwsMatchStr AS DWSTRING = UCASE(wszMatchStr)
   DIM nLen AS LONG = LEN(wszMatchStr)
   DIM nPos AS LONG = 1
   DO
      nPos = INSTR(nPos, **dwsMainStr, **dwsMatchStr)
      IF nPos = 0 THEN EXIT DO
      dws += MID(wszMainStr, nPos, nLen)
      nPos += nLen
   LOOP
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Returns a string containing only the characters contained in a specified match string.
' All other characters are removed.
' If wszMatchStr is an empty string the function returns an empty string.
' wszMatchStr specifies a list of single characters to be searched for individually.
' A match on any one of which will cause that character to be removed from the result.
' This function is case-sensitive.
' Example: DWStrRetainAny("<p>1234567890<ak;lk;l>1234567890</p>", "<;/p>")
' ========================================================================================
PRIVATE FUNCTION DWStrRetainAny (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = ""
   DIM nLen AS LONG = LEN(wszMainStr)
   IF nLen = 0 OR LEN(wszMatchStr) = 0 THEN RETURN dws
   DIM nPos AS LONG
   FOR i AS LONG = 1 TO nLen
      nPos = INSTR(wszMatchStr, MID(wszMainStr, i, 1))
      IF nPos THEN dws += MID(wszMainStr, i, 1)
   NEXT
   RETURN dws
END FUNCTION
' ========================================================================================
' ========================================================================================
' * Case insensitive version of DWStrRetainAny.
' Example: DWStrRetainAnyI("<p>1234567890<ak;lk;l>1234567890</p>", "<;/P>")
' ========================================================================================
PRIVATE FUNCTION DWStrRetainAnyI (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS CONST WSTRING) AS DWSTRING
   DIM dws AS DWSTRING = ""
   DIM nLen AS LONG = LEN(wszMainStr)
   IF nLen = 0 OR LEN(wszMatchStr) = 0 THEN RETURN dws
   DIM dwsMainStr AS DWSTRING = UCASE(wszMainStr)
   DIM dwsMatchStr AS DWSTRING = UCASE(wszMatchStr)
   DIM nPos AS LONG
   FOR i AS LONG = 1 TO nLen
      nPos = INSTR(**dwsMatchStr, MID(**dwsMainStr, i, 1))
      IF nPos THEN dws += MID(wszMainStr, i, 1)
   NEXT
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Shrinks a string to use a consistent single character delimiter.
' The purpose of this function is to create a string with consecutive data items (words)
' separated by a consistent single character. This makes it very straightforward to parse
' the results as needed.
' If wszMask is not defined then all leading spaces and trailing spaces are removed entirely.
' All occurrences of two or more spaces are changed to a single space. Therefore, the new
' string returned consists of zero or more words, each separated by a single space character.
' If wszMask is specified, it defines one or more delimiter characters to shrink. All leading
' and trailing mask characters are removed entirely. All occurrences of one or more mask
' characters are replaced with the first character of wszMask The new string returned consists
' of zero or more words, each separated by the character found in the first position of wszMask.
' WhiteSpace is generally defined as the four common non-printing characters:
' Space, Tab, Carriage-Return, and Line-Feed. wszbMask = Chr(32,9,13,10)
' Example: DIM dws AS DWSTRING = DWStrShrink(",,, one , two     three, four,", " ,")
' ========================================================================================
PRIVATE FUNCTION DWStrShrink (BYREF wszMainStr AS CONST WSTRING, BYREF wszMask AS CONST WSTRING = " ") AS DWSTRING
   DIM dws AS DWSTRING = ""
   IF LEN(wszMainStr) = 0 OR LEN(wszMask) = 0 THEN RETURN dws
   ' // Eliminate all leading and trailing cbMask characters
   dws = TRIM(wszMainStr, ANY wszMask)
   ' // Eliminate all duplicate wszMask characters within the string
   DIM wszReplace AS WSTRING * 2 = MID(wszMask, 1, 1)
   DIM wszDuplicate AS WSTRING * 3
   DIM nMaskLen AS LONG = LEN(wszMask)
   DIM nPos AS LONG
   FOR i AS LONG = 1 TO nMaskLen
      wszDuplicate = MID(wszMask, i, 1) + MID(wszMask, i, 1)   ' usually double spaces
      nPos = 1
      DO
         nPos = INSTR(**dws, wszDuplicate)
         IF nPos = 0 THEN EXIT DO
         dws = MID(**dws, 1, nPos - 1) + wszReplace + MID(**dws, nPos + LEN(wszDuplicate))
      LOOP
   NEXT
   ' // Replace all single characters in the mask with the first character of the mask.
   nPos = 1
   DO
      nPos = INSTR(nPos, **dws, ANY wszMask)
      IF nPos = 0 THEN EXIT DO
      ' Only do the replace if the character at the position found is
      ' different than the character we need to replace it with. This saves
      ' us from having to do an unneeded string concatenation.
      IF MID(**dws, nPos, 1) <> wszReplace  THEN
         dws = MID(**dws, 1, nPos - 1) + wszReplace + MID(**dws, nPos + 1)
      END IF
      nPos += 1
   LOOP
   ' Finally, do a pass to ensure that there are no duplicates of the
   ' first mask character because of the replacements in the step above.
   wszDuplicate = MID(wszMask, 1, 1) + MID(wszMask, 1, 1)
   nPos = 1
   DO
      nPos = INSTR(**dws, wszDuplicate)
      IF nPos = 0 THEN EXIT DO
      dws = MID(**dws, 1, nPos - 1) + wszReplace + MID(**dws, nPos + LEN(wszDuplicate))
   LOOP
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Returns the count of delimited fields from a string expression.
' If wszMainStr is empty (a null string) or contains no delimiter character(s), the string
' is considered to contain exactly one sub-field. In this case, DWStrParseCount returns the value 1.
' Delimiter contains a string (one or more characters) that must be fully matched.
' Delimiters are case-sensitive.
' Example: DIM nCount AS LONG = DWStrParseCount("one,two,three", ",")
' ========================================================================================
PRIVATE FUNCTION DWStrParseCount (BYREF wszMainStr AS CONST WSTRING, BYREF wszDelimiter AS CONST WSTRING = ",") AS LONG
   DIM nCount AS LONG = 1
   DIM nPos AS LONG = 1
   DO
      nPos = INSTR(nPos, wszMainStr, wszDelimiter)
      IF nPos = 0 THEN EXIT DO
      nCount += 1
      nPos += LEN(wszDelimiter)
   LOOP
   RETURN nCount
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Return the count of delimited fields from a string expression.
' If wszMainStr is empty (a null string) or contains no delimiter character(s), the string
' is considered to contain exactly one sub-field. In this case, DWStrParseCountAny returns the value 1.
' Delimiter contains a set of characters (one or more), any of which may act as a delimiter character.
' Delimiters are case-sensitive.
' Example: DIM nCount AS LONG = DWStrParseCountAny("1;2,3", ",;")
' ========================================================================================
PRIVATE FUNCTION DWStrParseCountAny (BYREF wszMainStr AS CONST WSTRING, BYREF wszDelimiter AS CONST WSTRING = ",") AS LONG
   DIM nCount AS LONG = 1
   FOR i AS LONG = 1 TO LEN(wszDelimiter)
      nCount += DWStrParseCount(wszMainStr, MID(wszDelimiter, i, 1))
   NEXT
   RETURN nCount
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Returns the nPosition-th substring in a string wszMainStr with separations wszDelimiter
' (one or more characters), beginning with nPosition = 1.
' ========================================================================================
PRIVATE FUNCTION DWStrParse OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYREF wszDelimiter AS CONST WSTRING, BYVAL nPosition AS LONG, BYVAL bIsAny AS BOOLEAN, BYVAL nLenDelimiter AS LONG) AS DWSTRING
   DIM nCount AS LONG, nStart AS LONG
   nPosition = ABS(nPosition)
   DIM nPos AS LONG = 1
   DIM fReverse AS BOOLEAN = IIF(nPosition < 0, TRUE, FALSE)
   DIM dws AS DWSTRING = ""
   IF fReverse THEN
      ' Reverse search
      ' Get the start of the token (j) by searching in reverse
      IF bIsAny THEN
         nPos = InstrRev(wszMainStr, ANY wszDelimiter)
      ELSE
         nPos = InstrRev(wszMainStr, wszDelimiter)
      END IF
      DO WHILE nPos > 0        ' if not found loop will be skipped
         nStart = nPos + nLenDelimiter
         nCount += 1
         nPos = nPos - nLenDelimiter
         IF nCount = nPosition THEN EXIT DO
         IF bIsAny THEN
            nPos = InStrRev(wszMainStr, ANY wszDelimiter, nPos)
         ELSE
             nPos = InStrRev(wszMainStr, wszDelimiter, nPos)
         END IF
      LOOP
      IF nPos = 0 THEN nStart = 1
      ' Now continue forward to get the end of the token
      IF bIsAny THEN
         nPos = INSTR(nStart, wszMainStr, ANY wszDelimiter)
      ELSE
         nPos = INSTR(nStart, wszMainStr, wszDelimiter)
      END IF
      IF nPos > 0 OR nCount = nPosition THEN
         IF nPos = 0 THEN
            dws = MID(wszMainStr, nStart)
         ELSE
            dws = MID(wszMainStr, nStart, nPos - nStart)
         END IF
      END IF
   ELSE
      ' Forward search
      DO
         nStart = nPos
         IF bIsAny THEN
            nPos = INSTR(nPos, wszMainStr, ANY wszDelimiter)
         ELSE
            nPos = INSTR(nPos, wszMainStr, wszDelimiter)
         END IF
         IF nPos THEN
            nCount += 1
            nPos += nLenDelimiter
         END IF
      LOOP UNTIL nPos = 0 OR nCount = nPosition
      IF nPos > 0 OR nCount = nPosition - 1 THEN
         IF nPos = 0 THEN
            dws = MID(wszMainStr, nStart)
         ELSE
            dws = MID(wszMainStr, nStart, nPos - nLenDelimiter - nStart)
         END IF
      END IF
   END IF
   RETURN dws
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Returns a delimited field from a string expression.
' wszDelimiter contains a string of one or more characters that must be fully matched to be successful.
' If nPosition evaluates to zero or is outside of the actual field count, an empty string is returned.
' If nPosition is negative then fields are searched from the right to left of the wszMainStr
' Delimiters are case-sensitive.
' Example: DIM dws AS DWSTRING = DWStrParse("one,two,three", 2)
' Example: DIM dws AS DWSTRING = DWStrParse("one;two,three", 1, ";")
' ========================================================================================
PRIVATE FUNCTION DWStrParse OVERLOAD (BYREF wszMainStr AS CONST WSTRING, BYVAL nPosition AS LONG = 1, BYREF wszDelimiter AS CONST WSTRING = ",") AS DWSTRING
   ' The parse must match the entire deliminter string
   RETURN DWStrParse(wszMainStr, wszDelimiter, nPosition, FALSE, Len(wszDelimiter))
END FUNCTION
' ========================================================================================

' ========================================================================================
' * Return a delimited field from a string expression.
' Delimiter contains a set of characters (one or more), any of which may act as a delimiter character.
' If nPosition evaluates to zero or is outside of the actual field count, an empty string is returned.
' If nPosition is negative then fields are searched from the right to left of the MainString.
' Delimiters are case-sensitive.
' Example: DIM dws AS DWSTRING = DWStrParseAny("1;2,3", 2, ",;")
' ========================================================================================
PRIVATE FUNCTION DWStrParseAny (BYREF wszMainStr AS CONST WSTRING, BYVAL nPosition AS LONG = 1, BYREF wszDelimiter AS CONST WSTRING = ",") AS DWSTRING
   ' The parse must match one character (len = 1) in the delimiter string
   RETURN DWStrParse(wszMainStr, wszDelimiter, nPosition, TRUE, 1)
END FUNCTION
' ========================================================================================
Last edited by Josep Roca on Jul 10, 2018 20:48, edited 1 time in total.
deltarho[1859]
Posts: 1440
Joined: Jan 02, 2017 0:34
Location: UK

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby deltarho[1859] » Jul 10, 2018 20:13

I think that José Roca should have been given a FreeBASIC MVP award for his WinFBX contribution.

After kicking his latest contribution around for a few days I reckon that DWSTRING qualifies for a 'and bar'. PowerBASIC has had a dynamic unicode string for some time.

José, thank you very much for this FreeBASIC enhancement.
Josep Roca
Posts: 376
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby Josep Roca » Jul 10, 2018 21:02

If you're using WinFBX, you can use CWSTR instead of DWSTRING and the AfxStr functions instead of the DWStr ones. I have posted these variations to make them independent of my framework so that FBer's not wanting to use it for wathever reason can at least use unicode with Windows easily. I even have used the .bi extension instead of .inc to avoid rejections :)
James Klutho
Posts: 12
Joined: Nov 11, 2009 23:44

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby James Klutho » Jul 10, 2018 21:16

Thank you Jose for your excellent contributions.
deltarho[1859]
Posts: 1440
Joined: Jan 02, 2017 0:34
Location: UK

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby deltarho[1859] » Jul 10, 2018 21:24

Josep Roca wrote:I even have used the .bi extension instead of .inc to avoid rejections :)

<smile>

and more uses of Private than we can shake a stick at.
Josep Roca
Posts: 376
Joined: Sep 27, 2016 18:20
Location: Valencia, Spain

Re: DWSTRING.bi - Dynamic null terminated unicode string data type

Postby Josep Roca » Jul 10, 2018 22:17

I use source code and PRIVATE all the time. I know that this is unusual in this forum, so if anybody wants to build libraries with the class and functions, I don't care. Just don't ask me to do it.

Return to “Projects”

Who is online

Users browsing this forum: No registered users and 2 guests