My JSON Parser

KristopherWindsor
Posts: 2428
Joined: Jul 19, 2006 19:17
Location: Sunnyvale, CA

Post by KristopherWindsor »

Today I wrote a JSON parser, since I need one, and someone else was recently looking for one.
The parser is hopefully easy to set up and use, although I haven't documented it.

If you haven't heard of JSON, read this.

Three example programs and output:

Code: Select all

#include once "JSONParser.bas"

dim as string jsonstring = _
"{" + chr(10) + _
"     ""firstName"": ""John""," + chr(10) + _
"     ""lastName"": ""Smith""," + chr(10) + _
"     ""age"": 25," + chr(10) + _
"     ""address"":" + chr(10) + _
"     {" + chr(10) + _
"         ""streetAddress"": ""21 2nd Street""," + chr(10) + _
"         ""city"": ""New York""," + chr(10) + _
"         ""state"": ""NY""," + chr(10) + _
"         ""postalCode"": ""10021""" + chr(10) + _
"     }," + chr(10) + _
"     ""phoneNumber"":" + chr(10) + _
"     [" + chr(10) + _
"         {" + chr(10) + _
"           ""type"": ""home""," + chr(10) + _
"           ""number"": ""212 555-1234""" + chr(10) + _
"         }," + chr(10) + _
"         {" + chr(10) + _
"           ""type"": ""fax""," + chr(10) + _
"           ""number"": ""646 555-4567""" + chr(10) + _
"         }" + chr(10) + _
"     ]" + chr(10) + _
"}"

dim as json.variable variable = json.variable(jsonstring)

print "Parsed this:"
print jsonstring
print
print "Variable type: " & variable.getType()
print "firstName: " & variable.getObject()->get("firstName")->toString()
sleep()

Code: Select all

Parsed this:
{
     "firstName": "John",
     "lastName": "Smith",
     "age": 25,
     "address":
     {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": "10021"
     },
     "phoneNumber":
     [
         {
           "type": "home",
           "number": "212 555-1234"
         },
         {
           "type": "fax",
           "number": "646 555-4567"
         }
     ]
}

Variable type: Object
firstName: John

Code: Select all

#include once "JSONParser.bas"

dim as string numberstring = "-1.23"
dim as string boolstring = "false"
dim as string invalidstring = "invalid"
dim as string arraystring = "[]"
dim as string objectstring = "{}"

dim as json.variable num = json.variable(numberstring)
dim as json.variable boo = json.variable(boolstring)
dim as json.variable inv = json.variable(invalidstring)
dim as json.variable arr = json.variable(arraystring)
dim as json.variable obj = json.variable(objectstring)

print "For """ & numberstring & """:"
print "< " & num.getType() & " > < " & num.toString() & " >"
print

print "For """ & boolstring & """:"
print "< " & boo.getType() & " > < " & boo.toString() & " >"
print

print "For """ & invalidstring & """:"
print "< " & inv.getType() & " > < " & inv.toString() & " >"
print

print "For """ & arraystring & """:"
print "< " & arr.getType() & " > < " & arr.toString() & " >"
print

print "For """ & objectstring & """:"
print "< " & obj.getType() & " > < " & obj.toString() & " >"
print

sleep()

Code: Select all

For "-1.23":
< Number > < -1.23 >

For "false":
< Boolean > < false >

For "invalid":
<  > <  >

For "[]":
< Array > < ARRAY >

For "{}":
< Object > < OBJECT >

Code: Select all

#include once "JSONParser.bas"

dim as string jsonstring = "[[5, {}], null]"
dim as json.variable variable = json.variable(jsonstring)

#macro show(var)
  print "< " & (var).getType() & " > < " & (var).toString() & " >"
#endmacro

print "For """ & jsonstring & """:"
show(variable)
print "Total children: " & variable.getArray()->getLength()
print
print "First child:"
show(*variable.getArray()->get(0))
print "Second child:"
show(*variable.getArray()->get(1))
print
print "First child's first child:"
show(*variable.getArray()->get(0)->getArray()->get(0))
print "First child's second child:"
show(*variable.getArray()->get(0)->getArray()->get(1))
sleep()

Code: Select all

For "[[5, {}], null]":
< Array > < ARRAY >
Total children: 2

First child:
< Array > < ARRAY >
Second child:
< null > < null >

First child's first child:
< Number > < 5 >
First child's second child:
< Object > < OBJECT >

The parser is made for string -> FreeBASIC UDT conversion, not the other way around, but that functionality could be added.

Project notes / status:
- This can parse a JSON string. Unless the Wikipedia article forgot to mention something, the only limitation here is the lack of Unicode support.
- The ability to convert a JSON object to a string could easily be added. Right now variables have a toString() function, but for arrays it just returns "ARRAY" instead of showing the array contents.
- The ability to build a JSON object procedurally (instead of from a string) could also be added. Currently you can add items to arrays and fields to objects, but you cannot mutate instantiated JSON objects in any other way.
- Everything is implemented with linked lists, so speed could be improved, but there is no limit to how much JSON this can handle.
- Most errors in the JSON format will result in the parsed JSON variable having type vartype.malformed, which is easy to catch in code. The only thing I can think of that I'm not catching is multiple fields in an object with the same key.

Source and the example .exes:
http://kristopher.jafile.com/jsonparser.zip
agamemnus
Posts: 1842
Joined: Jun 02, 2005 4:48

Post by agamemnus »

Great. That will be useful for PHP noobs willing to try FreeBASIC. :P
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

Post by AGS »

Why did you leave out Unicode? According to the specification on json.org:
"The character encoding of JSON text is always Unicode."
You could add support for UTF-8, as it's the most widely used Unicode encoding (at least on the internet; it's also recommended by the W3C).

Assuming you know about UTF-8, the next bit of text is a bit superfluous...

UTF-8 is a byte-oriented Unicode encoding, so the text could be held in a buffer accessed through a byte ptr (you could try a string, but 0 is a valid Unicode char, so...).

The high bits of the first byte mark the start and the length of a multi-byte UTF-8 code point (110xxxxx => 2 byte code point, 1110xxxx => 3 byte code point, 11110xxx => 4 byte code point). The bytes that follow it have the form 10xxxxxx.

Assuming ch is the current char in the string getting processed:

Code: Select all

if (ch <= &h7F) then
  'regular ASCII
elseif ((ch and &b11100000) = &hc0) then
  '110xxxxx: 2 byte code point => read one more byte
elseif ((ch and &b11110000) = &he0) then
  '1110xxxx: 3 byte code point => read two more bytes
elseif ((ch and &b11111000) = &hf0) then
  '11110xxx: 4 byte code point => read three more bytes
end if
'note: "=" binds tighter than "and" in FreeBASIC, so the parentheses are required

The individual bytes of the code point can be copied verbatim to the resulting string. Escape sequences are all in the ASCII range (don't forget \uxxxx, where xxxx is a four-digit hex number).
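
The lead-byte rule described above can be sketched as a small helper. This is only an illustration: the names utf8SeqLen and utf8CountCodePoints are made up here and are not part of JSONParser.bas, and the sketch does not check that the continuation bytes really have the form 10xxxxxx.

```freebasic
' Hypothetical helper: total number of bytes in the UTF-8 sequence that
' starts with lead byte ch, or 0 if ch cannot start a sequence.
function utf8SeqLen(byval ch as ubyte) as integer
  if ch <= &h7F then return 1                          ' 0xxxxxxx: ASCII
  if (ch and &b11100000) = &b11000000 then return 2    ' 110xxxxx
  if (ch and &b11110000) = &b11100000 then return 3    ' 1110xxxx
  if (ch and &b11111000) = &b11110000 then return 4    ' 11110xxx
  return 0                                             ' 10xxxxxx or invalid
end function

' Hypothetical helper: counts the code points in a UTF-8 encoded string.
' Returns -1 on an invalid lead byte or a sequence cut off by the end of s.
function utf8CountCodePoints(byref s as const string) as integer
  dim as integer i = 0, count = 0
  while i < len(s)
    dim as integer n = utf8SeqLen(s[i])
    if n = 0 orelse i + n > len(s) then return -1
    i += n          ' the continuation bytes are skipped, not validated
    count += 1
  wend
  return count
end function

' "e-acute, t, e-acute" is 5 bytes but 3 code points
print utf8CountCodePoints(chr(&hC3, &hA9) & "t" & chr(&hC3, &hA9))
```

A parser that only copies bytes through does not strictly need this, but it is handy when a character count rather than a byte count is wanted.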

If you're willing to gamble that no 0 will occur in the JSON input then you could use the string type.
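
For the \uxxxx escapes mentioned above, the parser would have to turn the four hex digits into UTF-8 bytes before appending them to the result string. A minimal sketch, assuming the digits have already been extracted; utf8Encode and decodeUnicodeEscape are hypothetical names, not functions of JSONParser.bas. JSON writes characters above U+FFFF as two consecutive \u escapes (a surrogate pair), which this sketch does not combine.

```freebasic
' Hypothetical helper: encodes a code point (0 to &h10FFFF) as UTF-8 bytes.
' Surrogates and out-of-range values are not rejected.
function utf8Encode(byval cp as ulong) as string
  if cp < &h80 then
    return chr(cp)
  elseif cp < &h800 then
    return chr(&hC0 or (cp shr 6), &h80 or (cp and &h3F))
  elseif cp < &h10000 then
    return chr(&hE0 or (cp shr 12), _
               &h80 or ((cp shr 6) and &h3F), _
               &h80 or (cp and &h3F))
  else
    return chr(&hF0 or (cp shr 18), _
               &h80 or ((cp shr 12) and &h3F), _
               &h80 or ((cp shr 6) and &h3F), _
               &h80 or (cp and &h3F))
  end if
end function

' Hypothetical helper: hex4 holds the four hex digits that follow "\u".
' The digits are not validated here.
function decodeUnicodeEscape(byref hex4 as const string) as string
  return utf8Encode(valuint("&h" & hex4))
end function

' \u20AC is the euro sign; show the resulting bytes in hex
dim as string euro = decodeUnicodeEscape("20AC")
for i as integer = 0 to len(euro) - 1
  print hex(euro[i]); " ";
next i
print
```

The loop at the end should show the three bytes E2 82 AC.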

Print will (of course) not display the code points as expected (the individual bytes of the code point will get displayed, not some fancy glyph). But that's not a JSON issue.

What adt would you suggest instead of the used linked list (just asking)?
KristopherWindsor
Posts: 2428
Joined: Jul 19, 2006 19:17
Location: Sunnyvale, CA

Post by KristopherWindsor »

I could add Unicode support, but TBH I didn't need it. ;p
AGS wrote:What adt would you suggest instead of the used linked list (just asking)?
A JSON object is a set of key/value pairs, i.e. a map. The LinkedList implementation has O(n) lookup; a TreeMap would be O(lg(n)) and a HashMap would be O(1) on average, but they are harder to implement.
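
To make the comparison concrete, here is a rough sketch of a tree-based map with string keys. It is not taken from JSONParser.bas; MapNode, mapPut, mapGet and mapFree are names invented for this example, and the values are plain strings instead of JSON variables. The tree is not balanced, so lookups are O(lg(n)) only when keys arrive in a reasonably random order; keys inserted in sorted order degrade it to O(n), like the linked list.

```freebasic
' Hypothetical node of an unbalanced binary search tree keyed by string
type MapNode
  as string fieldName
  as string fieldValue
  as MapNode ptr lowerChild, higherChild
end type

' Inserts or overwrites a key; returns the (possibly new) subtree root
function mapPut(byval node as MapNode ptr, byref k as const string, _
                byref v as const string) as MapNode ptr
  if node = 0 then
    node = new MapNode
    node->fieldName = k
    node->fieldValue = v
  elseif k < node->fieldName then
    node->lowerChild = mapPut(node->lowerChild, k, v)
  elseif k > node->fieldName then
    node->higherChild = mapPut(node->higherChild, k, v)
  else
    node->fieldValue = v
  end if
  return node
end function

' Returns the node holding key k, or 0 if the key is absent
function mapGet(byval node as MapNode ptr, byref k as const string) as MapNode ptr
  while node <> 0
    if k = node->fieldName then return node
    if k < node->fieldName then
      node = node->lowerChild
    else
      node = node->higherChild
    end if
  wend
  return 0
end function

' Frees the whole tree
sub mapFree(byval node as MapNode ptr)
  if node = 0 then exit sub
  mapFree(node->lowerChild)
  mapFree(node->higherChild)
  delete node
end sub

dim as MapNode ptr root = 0
root = mapPut(root, "firstName", "John")
root = mapPut(root, "lastName", "Smith")
root = mapPut(root, "age", "25")
dim as MapNode ptr hit = mapGet(root, "lastName")
if hit <> 0 then print "lastName: " & hit->fieldValue
mapFree(root)
```

Overwriting on a repeated key is one possible answer to the duplicate-key case; rejecting the input instead would be just as easy at that branch.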
anonymous1337
Posts: 5494
Joined: Sep 12, 2005 20:06
Location: California

Post by anonymous1337 »

TreeMaps <3

Am I right to assume that even though TreeMaps are O( log n ), trees with more than two members per branch will suffer from more comparisons per branch?

JSON > XML

IMO
KristopherWindsor
Posts: 2428
Joined: Jul 19, 2006 19:17
Location: Sunnyvale, CA

Post by KristopherWindsor »

anonymous1337 wrote:Am I right to assume that even though TreeMaps are O( log n ), trees with more than two members per branch will suffer from more comparisons per branch?
The performance is still O(lg(n)) regardless of the number of children per node (that only changes the constant factor), but AFAIK the binary tree will be the easiest to implement and at least as fast as anything else. :)
Oz
Posts: 586
Joined: Jul 02, 2005 14:21
Location: Waterloo, Ontario, Canada

Post by Oz »

Have you thought about posting this on google code so if people want to implement different ways to store the data, it's easier to patch/update?

-Oz
KristopherWindsor
Posts: 2428
Joined: Jul 19, 2006 19:17
Location: Sunnyvale, CA

Post by KristopherWindsor »

Oz wrote:Have you thought about posting this on google code so if people want to implement different ways to store the data, it's easier to patch/update?

-Oz
Nope.
kiyotewolf
Posts: 1009
Joined: Oct 11, 2008 7:42
Location: ABQ, NM

Post by kiyotewolf »

@AGS

How do you finish adding UNICODE support to programs?
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

Post by AGS »

kiyotewolf wrote:@AGS

How do you finish adding UNICODE support to programs?
Different programs support UNICODE in different ways.

A couple of examples to clarify.

The PCRE library

PCRE (the Perl Compatible Regular Expressions library) is a regular expression library that can process Unicode text: it can read input encoded in UTF-8, and the regular expressions themselves can contain UTF-8 code points.

Gedit
http://projects.gnome.org/gedit/screenshots.html

gedit is part of GNOME. Like other editors that display messages in different languages, it uses gettext. gedit comes with a bunch of .po files that contain translations of all the messages used in the program.

http://en.wikipedia.org/wiki/GNU_gettext

Pango

http://www.pango.org/

You can use Pango to render Unicode encoded text to the screen. Pango 'understands' Unicode (and it's cross platform (Linux/Windows/other?)).


It's a question of what your program needs in terms of Unicode processing. Does it need to
- render Unicode to the screen
- display messages in a bunch of different languages (of which the user can choose one?)
- compare Unicode encoded strings (do the strings need to be in a certain normalized form?).
- etc... etc...

Unicode support is a big subject.
kiyotewolf
Posts: 1009
Joined: Oct 11, 2008 7:42
Location: ABQ, NM

Post by kiyotewolf »

I really have a low level use for it.

I just want to be able to open a text file with UNICODE characters and be able to strip them out selectively, then write a file with some UNICODE characters in it.

A lot of what I do with fancy UNICODE characters borders on old text art.

I would have to be able to visualize the characters, so I could muck around with them as well.



~Kiyote!
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Post by TJF »

pango, gettext, gedit, gnome, ...

All are based on glib, see
http://library.gnome.org/devel/glib/sta ... ation.html
for details.

To visualize the characters I recommend using a GtkTextView. For lower-level work, cairo or pango can be used.
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

Post by AGS »

I don't think gettext is based on glib. glib has a gettext dependency, not vice versa.

Anyway, I think using cairo would be a good idea.

http://www.cairographics.org/manual/cai ... -show-text

Cairo can render graphics to PDF, PNG, SVG, the screen (hwnd), printers, etc.

cairo expects text to be in UTF-8. Here is some simple code, without much error checking, that reads a UTF-8 file (replace t3.utf8 with a UTF-8 encoded file of your own).

Code: Select all

' the three bytes of the UTF-8 byte order mark (BOM)
dim bom(0 to 2) as ubyte = {&hef,&hbb,&hbf}
dim fh as integer = freefile()
open "t3.utf8" for binary access read as #fh
var err_ = err()
if (err_) then
  print "file not found (t3.utf8)"
  end 1   ' stop here instead of reading from an unopened file
end if
var flen = LOF(fh)
dim bom_(0 to 2) as ubyte
get #fh,,bom_()
for i as integer = 0 to 2
  if (bom_(i) <> bom(i)) then
    print "not a UTF8 file (no BOM found)"
    close #fh
    end 1
  end if
next i
dim bytesread as uinteger
' callocate zero-fills the buffer; only flen - 3 bytes are read into it,
' so the last 3 bytes stay 0
dim utf8_content as ubyte ptr = callocate(flen, sizeof(ubyte))
get #fh,,*utf8_content,flen - 3,bytesread
for i as integer = 0 to cint(bytesread) - 1
  print hex(utf8_content[i]);" ";
next i
deallocate(utf8_content)
close #fh

The cairo function that does the rendering does not take a length parameter, so I'm guessing it will stop parsing the UTF-8 at the first 0 byte it finds (there are a few at the end of utf8_content, because the buffer is zero-filled and 3 bytes larger than the data read into it).
Roland Chastain
Posts: 992
Joined: Nov 24, 2011 19:49
Location: France

Re: My JSON Parser

Post by Roland Chastain »

In case someone is interested, here is an alternative link to download the JSON Parser:

jsonparser.zip
Last edited by Roland Chastain on Sep 24, 2020 6:57, edited 3 times in total.
Roland Chastain
Posts: 992
Joined: Nov 24, 2011 19:49
Location: France

Re: My JSON Parser

Post by Roland Chastain »

There is a problem with string values ending with "\\".
[
  {
    "command" : "mosquito.exe",
    "name" : "Mosquito",
    "protocol" : "uci",
    "workingDirectory" : "\\engines\\mosquito\\-"
  },

  {
    "command" : "Ruffian_105.exe",
    "name" : "Ruffian",
    "protocol" : "uci",
    "workingDirectory" : "\\engines\\ruffian\\-"
  },

  {
    "command" : "Rybka v1.0 Beta.w32.exe",
    "name" : "Rybka",
    "protocol" : "uci",
    "workingDirectory" : "\\engines\\rybka\\-"
  }
]
If I remove the "-", my test program crashes.

Code: Select all

#include once "source\jsonparser\JSONParser.bas"

declare function LoadFileAsString(byval filename as string) as string

dim text as string = LoadFileAsString("engines.json")

dim as json.variable variable = json.variable(text)
print variable.getType()
print variable.getArray()->getLength()
print variable.getArray()->get(0)->getObject()->get("name")->toString()
print variable.getArray()->get(0)->getObject()->get("workingDirectory")->toString()

#macro show(var)
  print "< " & (var).getType() & " > < " & (var).toString() & " >"
#endmacro

show(*variable.getArray()->get(0)->getObject()->get("name"))



function LoadFileAsString(byval filename as string) as string
  dim s as string
  if open(filename for input access read as #1) = 0 then
    close #1
    if open(filename for binary access read as #1) = 0 then
      s = space(lof(1))
      get #1,, s
      close #1
    else
      print "Unable to open '" & filename & "'"
    end if 
  else
    print "File not found '" & filename & "'"
  end if
  return s
end function
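
The cause inside JSONParser.bas is not shown in this thread. One common source of exactly this symptom, though, is a string scanner that treats any quote preceded by a backslash as escaped, so that in "...\\" the closing quote is skipped and the scan runs past the end of the value. Whether that is what happens here is an assumption. A sketch of a scan that consumes escapes in pairs instead (findClosingQuote is a hypothetical name, not a function of the parser):

```freebasic
const BACKSLASH = 92   ' ASCII code of \
const DQUOTE = 34      ' ASCII code of "

' Hypothetical helper: returns the 0-based index of the quote that closes
' the JSON string whose opening quote is at index openPos, or -1 if the
' string is not terminated.
function findClosingQuote(byref jsonText as const string, _
                          byval openPos as integer) as integer
  dim as integer i = openPos + 1
  while i < len(jsonText)
    select case jsonText[i]
    case BACKSLASH
      i += 2        ' skip the backslash and the character it escapes
    case DQUOTE
      return i      ' an unescaped quote ends the string
    case else
      i += 1
    end select
  wend
  return -1
end function

' the value "a\\" : quote, a, backslash, backslash, quote
dim as string sample = chr(DQUOTE) & "a" & chr(BACKSLASH, BACKSLASH) & chr(DQUOTE)
print findClosingQuote(sample, 0)
```

Because each backslash is consumed together with the character after it, the second backslash of \\ can never be mistaken for the start of an escaped quote.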