Extended lib xml file problem

bojan.dosen
Posts: 166
Joined: May 14, 2007 12:20
Location: Zagreb, Croatia

Post by bojan.dosen »

I have a problem reading a "unicode" hex code from an XML attribute. Could this be a bug?

#include "ext/xml.bi"
Dim As ext.xml.tree file
file.load("file.xml")

? file.root->child("characters")->child("char", 0)->attribute("unicode")
...
file.xml:

<characters>
<char unicode="&#xf041;" />
<char unicode="&#xf042;" />
<char unicode="&#xf043;" />
<char unicode="&#xf044;" />
</characters>
The result is a string of three strange characters. All I want is either the Unicode character or the literal string "&#xf041;". When I change the attribute in the XML file to something else I get a normal result, but when it is in the form &#xf041; I get those strange characters.
How can I fix that? Thanks!
sir_mud
Posts: 1401
Joined: Jul 29, 2006 3:00
Location: US

Re: Extended lib xml file problem

Post by sir_mud »

The Unicode handling in the xml module looks rather hackish to me. I know it supports only a common subset of what the standard actually requires. I'm looking into proper handling.
I've filed this as Issue 26: https://code.google.com/p/fb-extended-l ... tail?id=26 If you have a Google account, you can star it to be notified when it is updated.
A preview of some of the fun reading material for this: http://www.hackcraft.net/xmlUnicode/

I have been wanting to refactor the xml module (along with other modules), so if you want to help with anything, you can contact me through this form: http://ext.freebasic.net/contact/form/contact-us

Thank you for using the Extended Library!
bojan.dosen
Posts: 166
Joined: May 14, 2007 12:20
Location: Zagreb, Croatia

Re: Extended lib xml file problem

Post by bojan.dosen »

OK, thank you. I will wait for the update.
sir_mud
Posts: 1401
Joined: Jul 29, 2006 3:00
Location: US

Re: Extended lib xml file problem

Post by sir_mud »

Thank you for waiting, bojan.dosen. I have determined that this is not actually an error per se; the result is just not documented. The code point is returned encoded as UTF-8, which is why U+F041 shows up as three bytes in the string. You can convert a string to a wstring using the wstr function; however, I am not sure whether it will convert UTF-8 to the platform's wstring format, as that is not documented. wstr uses the mbstowcs function internally, which MSDN describes as follows:
"Converts a sequence of multibyte characters to a corresponding sequence of wide characters."
So it needs to be tested.
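
Until the wstr behaviour is confirmed, the UTF-8 bytes can be decoded by hand, which avoids the locale question entirely. The following is only a minimal sketch: utf8_first_codepoint is a hypothetical helper and not part of ext/xml.bi, and raw is a hard-coded stand-in for the attribute value (EF 81 81 is the UTF-8 encoding of U+F041). It does not reject overlong encodings or surrogates.

```freebasic
' Sketch: decode the first UTF-8 sequence of a byte string into a code point.
' utf8_first_codepoint is a hypothetical helper, NOT part of the Extended Library.
' Returns -1 for empty, truncated, or malformed input.
Function utf8_first_codepoint(ByRef s As Const String) As Long
    Dim As Integer n = Len(s)
    If n = 0 Then Return -1

    Dim As Long b0 = s[0]
    Dim As Integer extra   ' number of continuation bytes expected
    Dim As Long cp         ' code point being assembled

    If b0 < &h80 Then
        Return b0                              ' plain ASCII, one byte
    ElseIf (b0 And &hE0) = &hC0 Then
        extra = 1 : cp = b0 And &h1F           ' 110xxxxx
    ElseIf (b0 And &hF0) = &hE0 Then
        extra = 2 : cp = b0 And &h0F           ' 1110xxxx
    ElseIf (b0 And &hF8) = &hF0 Then
        extra = 3 : cp = b0 And &h07           ' 11110xxx
    Else
        Return -1                              ' stray continuation or invalid lead byte
    End If

    If n < extra + 1 Then Return -1            ' sequence is cut short

    For i As Integer = 1 To extra
        Dim As Long b = s[i]
        If (b And &hC0) <> &h80 Then Return -1 ' must look like 10xxxxxx
        cp = (cp Shl 6) Or (b And &h3F)
    Next

    Return cp
End Function

' Stand-in for the string returned by attribute("unicode") (assumption).
Dim As String raw = Chr(&hEF, &h81, &h81)

' Dump the raw bytes, then the decoded code point.
For i As Integer = 0 To Len(raw) - 1
    Print Hex(raw[i], 2); " ";
Next
Print
Print "U+"; Hex(utf8_first_codepoint(raw), 4)
```

With the stand-in bytes this prints EF 81 81 followed by U+F041, so the "3 strange characters" are consistent with the attribute value being returned as UTF-8.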
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands

Re: Extended lib xml file problem

Post by marcov »

Judging from the need to call setlocale in the example here (http://msdn.microsoft.com/en-us/library ... 80%29.aspx), I don't think it works without first calling setlocale to switch to UTF-8, and that setting is application-wide.

This is a common problem with MS string conversions: they usually work only for the active encoding.