functions utf_conv.bi?

Eric-S
Posts: 18
Joined: Aug 29, 2008 3:42
Location: Russian, Sankt-Peterburg

Postby Eric-S » Dec 23, 2008 13:08

Please tell us more about the functions:
CharToUTF
WCharToUTF
UTFToChar
UTFToWChar
in the header file "%programfiles%\freebasic\inc\utf_conv.bi"
Jojo
Posts: 107
Joined: Jan 04, 2006 20:02

Postby Jojo » Dec 23, 2008 13:53

What's your exact problem with them? They're doing charset conversion.
Eric-S
Posts: 18
Joined: Aug 29, 2008 3:42
Location: Russian, Sankt-Peterburg

Postby Eric-S » Dec 23, 2008 15:36

I found these functions, but I don't understand what they do. There are no examples or descriptions.

I need to load text from a Unicode file:
http://www.freebasic.net/forum/viewtopi ... 720#112720
I also need to convert text from various ASCII code pages to Unicode.
counting_pine
Site Admin
Posts: 6225
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Postby counting_pine » Dec 23, 2008 16:35

The functions convert between UTF strings and [z]string/wstring data.

I believe this is how it works, although I had to delve into the rtlib source code to figure it all out:

Code:

declare function CharToUTF cdecl alias "fb_CharToUTF"       ( _
    byval encod as UTF_ENCOD, _
    byval src as zstring ptr, _
    byval chars as integer, _
    byval dst as any ptr, _
    byval bytes as integer ptr ) as any ptr

declare function WCharToUTF cdecl alias "fb_WCharToUTF"    ( _
    byval encod as UTF_ENCOD, _
    byval src as wstring ptr, _
    byval chars as integer, _
    byval dst as any ptr, _
    byval bytes as integer ptr ) as any ptr

[W]CharToUTF convert ZString/WString data to (little-endian) UTF.
encod selects the encoding, and src and dst are the input and output buffers. chars is the number of characters in the src buffer. bytes receives the number of bytes written to the dst buffer.

If dst is NULL, then a destination buffer is allocated with enough memory to hold the worst-case number of bytes. In either case, the destination buffer is given as the return value.
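
To illustrate the non-NULL dst case, here is a minimal sketch that passes a caller-allocated buffer instead of letting the routine allocate one. The names buf and result are just for illustration, and the 4-bytes-per-character buffer size is my own conservative assumption for UTF-8 output, not a figure taken from the rtlib.

```freebasic
#include once "utf_conv.bi"

dim as zstring ptr src = @"hello"
dim as integer chars = len( *src ) + 1   '' include the terminating null

'' Assumption: 4 bytes per character is a safe upper bound for UTF-8 output
dim as ubyte ptr buf = callocate( chars * 4 )
dim as integer bytes

'' dst is not NULL here, so the routine should write into buf and return it
dim as any ptr result = CharToUTF( UTF_ENCOD_UTF8, src, chars, buf, @bytes )

print "bytes written: "; bytes
print "same buffer returned: "; ( result = buf )

deallocate( buf )
```

Because the buffer belongs to the caller in this case, it is freed once, through buf, rather than through the returned pointer.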

Code:

declare function UTFToChar cdecl alias "fb_UTFToChar"       ( _
    byval encod as UTF_ENCOD, _
    byval src as any ptr, _
    byval dst as zstring ptr, _
    byval chars as integer ptr ) as zstring ptr

declare function UTFToWChar cdecl alias "fb_UTFToWChar"    ( _
    byval encod as UTF_ENCOD, _
    byval src as any ptr, _
    byval dst as wstring ptr, _
    byval chars as integer ptr ) as wstring ptr

UTFTo[W]Char do the reverse, converting UTF to ZString/WString data. They work in a similar way, except that the length of the input buffer can't be given; the routine simply stops when it reaches the terminating character.
Again, the output buffer pointer is returned, and passing a NULL dst pointer makes the routine allocate its own output buffer. chars receives the number of characters written to the output buffer.
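
Since the input length can't be given, the UTF data has to carry its own terminator. A small sketch of that, using a hand-built UTF-8 byte array (the array name utf8 and its contents are only an illustration) that ends in a zero byte:

```freebasic
#include once "utf_conv.bi"

'' UTF-8 bytes for "A" and "é" (C3 A9), followed by the terminating zero byte
dim as ubyte utf8(0 to 3) = { &h41, &hC3, &hA9, 0 }
dim as integer chars

'' dst is 0 (NULL), so the routine allocates the output buffer itself
dim as wstring ptr w = UTFToWChar( UTF_ENCOD_UTF8, @utf8(0), 0, @chars )

if w <> 0 then
   print "characters written: "; chars
   print "first two code units: "; hex( w[0] ); " "; hex( w[1] )
   deallocate( w )   '' the buffer was allocated by the routine
end if
```

Without the trailing zero byte the routine would have no way to know where the input ends, so it would read past the end of the array.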


Here are a couple of examples based on the unit test, using ASCII-friendly example strings:

Code:

#include once "utf_conv.bi"

const NULL as any ptr = 0

scope '' UTF <-> zstring

   dim as zstring ptr srcstr = @"abc"
   dim as byte ptr utfstr
   dim as integer bytes

   utfstr = CharToUTF( UTF_ENCOD_UTF8, srcstr, len( *srcstr ) + 1, NULL, @bytes )

   dim as zstring ptr newstr
   newstr = UTFToChar( UTF_ENCOD_UTF8, utfstr, NULL, @bytes )

   print bytes, *newstr
   assert( *newstr = *srcstr )

   deallocate( newstr )
   deallocate( utfstr )

end scope

scope '' UTF <-> wstring

   dim as wstring ptr srcstr = @wstr("defg")
   dim as byte ptr utfstr
   dim as integer chars

   utfstr = WCharToUTF( UTF_ENCOD_UTF8, srcstr, len( *srcstr ) + 1, NULL, @chars )

   dim as wstring ptr newstr
   newstr = UTFToWChar( UTF_ENCOD_UTF8, utfstr, NULL, @chars )


   print chars, *newstr
   assert( *newstr = *srcstr )

   deallocate( newstr )
   deallocate( utfstr )

end scope
bcohio2001
Posts: 553
Joined: Mar 10, 2007 15:44
Location: Ohio, USA

Re: functions utf_conv.bi?

Postby bcohio2001 » Apr 22, 2014 21:13

Yeah, I know. OLD topic!

Am I on the right track?

Code:

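#Include Once "utf_conv.bi"
#Ifndef NULL
#Define NULL 0
#EndIf

'' UTF is ByRef so the caller receives the newly allocated buffer
Sub StrToUTF(ByRef AsciiStr As String, ByRef UTF As UByte Ptr)
   '' StrPtr gives the character data; @AsciiStr would point at the string descriptor
   Dim As ZString Ptr Z = StrPtr(AsciiStr)
   Dim As Integer B
   If Z = 0 Then Z = @""   '' an empty String has no data pointer
   '' the function allocates the buffer and the caller must DeAllocate it after use
   If UTF <> 0 Then
      DeAllocate(UTF)
      UTF = 0
   End If
   UTF = CharToUTF( UTF_ENCOD_UTF8, Z, Len(AsciiStr) + 1, NULL, @B)
End Sub

Sub UTFToStr(ByVal UTF As UByte Ptr, ByRef AsciiStr As String)
   Dim As ZString Ptr Z
   Dim As Integer B
   Z = UTFToChar( UTF_ENCOD_UTF8, UTF, NULL, @B)
   If Z <> 0 Then
      AsciiStr = *Z
      DeAllocate(Z)   '' free the buffer that UTFToChar allocated
   End If
End Sub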