mathematical expression parser (operator precedence parser)

Post your FreeBASIC source, examples, tips and tricks here. Please don’t post code without including an explanation.
Post Reply
AGS
Posts: 1284
Joined: Sep 25, 2007 0:26
Location: the Netherlands

mathematical expression parser (operator precedence parser)

Post by AGS »

The FreeBASIC compiler (and several examples on this forum) parse expressions in a top down fashion. But it can also be done in a bottom up fashion without having to resort to
a compiler compiler.

This example is an example of an operator precedence parser. Operator precedence parsers are a kind of bottom - up (shift - reduce) parsers you can implement by hand.
http://en.wikipedia.org/wiki/Operator-precedence_parser

An operator precedence parser (or at least the one I'm presenting here) uses two stacks, an operator precedence table and a couple of routines to parse an expression.
One stack is used for storing operands while the other stack is used for storing operators. The precedence table is used by the parsing routine to figure out (given the next token and the token at the top of the operator stack) what action to undertake (either SHIFT, REDUCE, ACCEPT or print a syntax error).


The precedence table is a two dimensional array sized (number_of_operators,number_of_operators)
A short description of possible table entries (x and y are operators):
table(x,y) = S and x unequal to y ==> precedence(y) > precedence(x)
table(x,y) = S and x = y ==> x is right associative
table(x,y) = R and x unequal to y ==> precedence(x) > precedence(y)
table(x,y) = R and x = y ==> x is left associative
table(x,y) = En ==> SYNTAX ERROR
table(x,y) = A ==> ACCEPT (parsing completed succesfully)

S stands for SHIFT
R stands for REDUCE
A stands for ACCEPT

At each step during the parsing of an expression the parsing routine (parse) uses the precedence table to decide what action to undertake. Depending on what's on top of the operator stack and the next input token the action can be:
- SHIFT (S) ==> push input token on the operator stack;
- SHIFT (S) ==> push input token on the value stack;
- REDUCE (R) ==>l remove one or more tokens from the operator stack. The operator that's removed from the operator stack can be applied to the topmost entrie(s) of the
value stack. This will lead to the removal of tokens from the value stack. The result of the operation is pushed back onto the value stack.
- ACCEPT (A) ==> end of parsing (finished evaluating expression)
- display an error message (En) as there is some error in the expression.

operators defined:
+ - * / ^ unary minus , ( ) f(actorial) p(ermutation) c(ombination)

format of a number (notated as a regular expression): [0-9]+(\.[0-9]+)?

Code: Select all

'uncommenting the DEBUGGING_ macro makes subroutine split print tokens on stdout

'#define DEBUGGING_

' actions
enum ActionEnum
    S                  ' shift
    R                  ' reduce
    A                  ' accept
    E1                 ' error: missing right parenthesis
    E2                 ' error: missing operator
    E3                 ' error: unbalanced right parenthesis
    E4                 ' error: invalid function argument
end enum

' tokens
enum tokEnum
    ' operators
    tAdd                ' +
    tSub                ' -
    tMul                ' *
    tDiv                ' /
    tPow                ' ^ (power)
    tUmi                ' - (unary minus)
    tFact               ' f(x): factorial
    tPerm               ' p(n,r): permutations, n objects, r at a time
    tComb               ' c(n,r): combinations, n objects, r at a time
    tComa               ' comma
    tLpr                ' (
    tRpr                ' )
    tEof                ' end of string
    tMaxOp              ' maximum number of operators
    ' non-operators
    tVal                ' value
end enum

dim shared Tok as tokEnum          ' token
dim shared Tokval as double        ' token value

const MaxOpr as integer = 50
const MaxV as integer = 50

dim shared Opr(MaxOpr) as integer      ' operator stack
dim shared V(MaxV) as double           ' value stack
dim shared OprTop as integer           ' top of operator stack
dim shared VTop as integer             ' top of value stack
redim shared Term(0 to 0) as string    ' array of terms
dim shared TermIndex as integer        ' current term
dim shared failure as integer = 0      ' set to 1 when evaluation of expression fails

dim shared parseTbl(tMaxOp,tMaxOp) as byte = { _
    /' stk      ------------------ input ------------------------ '/ _
    /'         +   -   *   /   ^   M   f   p   c   ,   (   )   $  '/ _
    /'         --  --  --  --  --  --  --  --  --  --  --  --  -- '/ _
    /' + '/  { R,  R,  S,  S,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
    /' - '/  { R,  R,  S,  S,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
    /' * '/  { R,  R,  R,  R,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
    /' / '/  { R,  R,  R,  R,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
    /' ^ '/  { R,  R,  R,  R,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
    /' M '/  { R,  R,  R,  R,  R,  S,  S,  S,  S,  R,  S,  R,  R },_
    /' f '/  { R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  S,  R,  R },_
    /' p '/  { R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  S,  R,  R },_
    /' c '/  { R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  S,  R,  R },_
    /' , '/  { R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  E4},_
    /' ( '/  { S,  S,  S,  S,  S,  S,  S,  S,  S,  S,  S,  S,  E1},_
    /' ) '/  { R,  R,  R,  R,  R,  R,  E3, E3, E3, R, E2,  R,  R },_
    /' $ '/  { S,  S,  S,  S,  S,  S,  S,  S,  S,  E4, S,  E3, A }_
}

declare function ErrorMsg(byref msg as string) as integer

'splits expression into tokens
'resulting tokens are placed in dynamic array terms
'returns: 0 on succes
'         failure is set to 1 when an illegal token
'         is found (failure is a global variable)
function split(byref expression as string) as integer

'To make the parser a bit more solid split wraps numbers in parens
'That way the parser will report an error when 
'two consecutive numbers without an operator in between are found.
'An example of how this works ($ = top of stack at start of parse)
'expression ==> 8 5 +  
'parse:
'input = 8    
'input = 5    action: shift 8 on the operand stack    operand  stack: 58$
'input = +    action: reduce
'The reason for this erroneous behaviour of the parser has to do with the
'way binary operators are reduced. On a reduce a binary operator simply looks 
'at the value stack. if there are 2 values on the stack it applies the operator
'to those values and puts the result on the value stack. 
'Wrapping a number into parens leads to different parser behaviour as ( and ) 
'are operators that lead to a reduction. So parsing of 8 5 + (after adding parens)
'now looks like
'expression ==> (8) (5) +
'parse:
'input = (  action: push ( on the operator stack operator stack: ($
'input = 8  action: push 8 on the value stack    value stack   : 8$
'input = )  error as parsetabl('(',')' equals E2 (missing operator).
'expression ==> (8) f((5))
'parse:
'input = ( action: push ( on the operator stack operator stack: ($
'input = 8  action: push 8 on the value stack    value stack   : 8$
'input = ) error as parsetbl('(','f') equals E3 (unbalanced right parens)
'E3 is a 'bad' error message as there is no problem with the parens.
'Error handling could be better at the expense of a more complicated lexer/parser.

  dim mark as integer
  dim ct as integer
  dim lexp as integer = len(expression)  

  function = 0
 
  if (len(expression) = 0) then
    errormsg("empty expression")
    failure = 1
    exit function
  end if
  
  while (expression[ct] = asc(" ") orelse expression[ct] = asc(!" \t") andalso ct < lexp)
    ct += 1
  wend
  'expression contained nothing but space
  if (ct = lexp) then
    errormsg("empty expression")
    failure = 1
    exit function
  end if
 
  var upper = 0
  var lower = 0  

  mark = ct
  while (ct < lexp)
    select case as const expression[ct] 
    'operator or single char function
    case asc("+"),asc("-"),asc("*"),asc("/"),asc("^"),asc("("),asc(")"),asc(","),_
         asc("f"),asc("p"),asc("c")    
         if (lower = upper andalso term(0) = "") then           
             term(0) = chr(expression[ct])
             #ifdef DEBUGGING_
               print chr(expression[ct])
             #endif
         else
            redim preserve term(0 to upper+1)
            upper += 1
            term(upper) = chr(expression[ct])
            #ifdef DEBUGGING_
            print chr(expression[ct])
            #endif
         end if
         ct += 1
    'number
    case asc("0"),asc("1"),asc("2"),asc("3"),asc("4"),_
         asc("5"),asc("6"),asc("7"),asc("8"),asc("9")
    number_loop:
        'at end of expression?
        if (ct = lexp - 1) then
          if (lower = upper andalso term(0) = "") then           
             redim preserve term(0 to upper+2)
             'parenthesize numbers to evoke errors from the parser
             upper += 2
             term(upper-2) = "("
             term(upper-1) = mid(expression,mark+1,iif(ct - mark=0,1,ct+1 - mark))
             term(upper) = ")"
            #ifdef DEBUGGING_
              print mid(expression,mark+1,iif(ct - mark=0,1,ct+1 - mark))
            #endif             
          else
            redim preserve term(0 to upper + 3)      
             'parenthesize numbers to evoke errors from the parser
            upper += 3
            term(upper-2) = "("
            term(upper-1) = mid(expression,mark+1,iif(ct - mark=0,1,ct+1 - mark))
            term(upper) = ")"
            #ifdef DEBUGGING_
              print mid(expression,mark+1,iif(ct - mark=0,1,ct+1 - mark))
            #endif
          end if
          return 0
        end if        
        ct += 1
        select case as const expression[ct]
        case asc("0"),asc("1"),asc("2"),asc("3"),asc("4"),_
             asc("5"),asc("6"),asc("7"),asc("8"),asc("9"),asc(".")
          goto number_loop        
        case else'asc(" "),asc(!"\t")
          if (lower = upper andalso term(0) = "") then           
             redim preserve term(0 to upper + 2)
             upper += 2
             'parenthesize numbers to evoke errors from the parser
             term(upper-2) = "("
             term(upper-1) = mid(expression,mark+1,ct - mark)
             term(upper) = ")"
            #ifdef DEBUGGING_
              print mid(expression,mark+1,ct - mark)
            #endif             
          else
            redim preserve term(0 to upper + 3)      
            upper += 3
            'parenthesize numbers to evoke errors from the parser
            term(upper-2) = "("
            term(upper-1) = mid(expression,mark+1,ct - mark)
            term(upper) = ")"
            #ifdef DEBUGGING_
              print mid(expression,mark+1,ct - mark)
            #endif
          end if          
        end select    
    'illegal character in expression (expression will evaluate to 0)
    case else
      if (expression[ct] <> Asc(" ") andalso expression[ct] <> asc(!"\t")) then
        errormsg("syntax error at character " & ct + 1 & " (" & chr(expression[ct]) & ")")
        print expression
        print space(ct);"^" 
        failure = 1
        end
      end if
    end select    
    'skip whitespace
    while (expression[ct] = asc(" ") orelse expression[ct] = asc(!"\t") andalso ct < lexp)
      ct += 1
    wend
    mark = ct
  wend 

end function

function ErrorMsg(byref msg as string) as integer

    print "Error: " & msg
    return 1

end function

function GetTok() as integer

    dim TokStr as string
    static PrevTok as tokEnum
      
    ' get next token
    GetTok = 0
    TermIndex = TermIndex + 1
    if TermIndex > UBound(Term) then
        Tok = tEof
        exit function
    end if
        
    TokStr = Term(TermIndex)

    ' convert symbol to token
    select case TokStr
    case "+": Tok = tAdd
    case "-": Tok = tSub
    case "*": Tok = tMul
    case "/": Tok = tDiv
    case "^": Tok = tPow
    case "(": Tok = tLpr
    case ")": Tok = tRpr
    case ",": Tok = tComa
    case "f": Tok = tFact
    case "p": Tok = tPerm
    case "c": Tok = tComb
    case else
        'check whether token is a valid (possibly floating point) number
        var dot = 0
        for i as integer = 0 to len(TokStr) - 1
          select case as const TokStr[i]
          case asc("0"),asc("1"),asc("2"),asc("3"),_
               asc("4"),asc("5"),asc("6"),asc("7"),_
               asc("8"),asc("9")
          case asc(".")
            dot += 1
            if (dot > 1) then
              dot = 0
              goto case_else
            end if
          case else
            case_else:
            print "token not numeric (" & TokStr & "), use spaces as separators"
            failure = 1
            function = 1
          end select
        next i    
        Tokval = Val(TokStr)
        Tok = tVal
    end select

    ' check for unary minus
    if Tok = tSub then
      if TermIndex > 0 then
        if PrevTok <> tVal And PrevTok <> tRpr then
          Tok = tUmi
        end if
      'uminus at start of expression
      ElseIf TermIndex = 0 then
          Tok = tUmi
      end if
    end if

    PrevTok = Tok

end function

function Shift() as integer

    if Tok = tVal then
        VTop = VTop + 1
        if VTop >= MaxV then
            Shift = ErrorMsg("V stack exhausted")
            exit function
        end if
        V(VTop) = Tokval        
    else
        OprTop = OprTop + 1
        if OprTop >= MaxOpr then
            Shift = ErrorMsg("Opr stack exhausted")
            failure = 1
            exit function
        end if
        Opr(OprTop) = Tok
    end if
    if GetTok() <> 0 then
        Shift = 1
        exit function
    end if
    Shift = 0

end function

function Fact(byval N as double) as double

    dim i as double
    dim t as double
    
    t = 1:i=1
    while (i <= N)    
      t *= i
      i += 1
    wend
    return t
    
end function

'comb is used for debugging purposes (to check whether
'parse yields a mathematically correct result)
function comb(byval N as double,byval M as double) as double

    return  Fact(n) / (Fact(m) * Fact(n - m))

end function

'perm is used for debugging purposes (to check whether
'parse yields a mathematically correct result)
function perm(byval n as double,byval m as double) as double

  return Fact(n) / Fact(n - m)

end function

function Reduce() as integer

    select case Opr(OprTop)
    case tAdd
        ' apply E := E + E
        if VTop < 1 then
            Reduce = ErrorMsg("Syntax error")
            failure = 1
            exit function
        end if
        V(VTop - 1) = V(VTop - 1) + V(VTop)
        VTop = VTop - 1
    case tSub
        ' apply E := E - E
        if VTop < 1 then
            failure = 1
            Reduce = ErrorMsg("Syntax error")
            exit function
        end if
        V(VTop - 1) = V(VTop - 1) - V(VTop)
        VTop = VTop - 1
    case tMul
        ' apply E := E * E
        if VTop < 1 then
            failure = 1
            Reduce = ErrorMsg("Syntax error")
            exit function
        end if
        V(VTop - 1) = V(VTop - 1) * V(VTop)
        VTop = VTop - 1
    case tDiv
        ' apply E := E / E
        if VTop < 1 then
            failure = 1
            Reduce = ErrorMsg("Syntax error")
            exit function
        end if
        V(VTop - 1) = V(VTop - 1) / V(VTop)
        VTop = VTop - 1
    case tUmi
        ' apply E := -E
        if VTop < 0 then
            failure = 1
            Reduce = ErrorMsg("Syntax error")
            exit function
        end if
        V(VTop) = -V(VTop)
    case tPow
        ' apply E := E ^ E
        if VTop < 1 then
            failure = 1
            Reduce = ErrorMsg("Syntax error")
            exit function
        end if
        if (V(VTop - 1) < 0) then
          V(VTop - 1) = -(V(VTop - 1) ^ V(VTop))
        else
          V(VTop - 1) = V(VTop - 1) ^ V(VTop)
        end if
        VTop = VTop - 1
    case tFact
        ' apply E := f(E)
        if VTop < 0 then
            failure = 1
            Reduce = ErrorMsg("Syntax error")
            exit function
        end if
        V(VTop) = Fact(V(VTop))
    case tPerm
        ' apply E := p(N,R)
        if VTop < 1 then
            failure = 1
            Reduce = ErrorMsg("Syntax error")
            exit function
        end if
        V(VTop - 1) = Fact(V(VTop - 1)) / Fact(V(VTop - 1) - V(VTop))
        VTop = VTop - 1
    case tComb
        ' apply E := c(N,R)
        if VTop < 1 then
            failure = 1
            Reduce = ErrorMsg("Syntax error")
            exit function
        end if
        V(VTop - 1) = Fact(V(VTop - 1)) / _
            (Fact(V(VTop)) * Fact(V(VTop - 1) - V(VTop)))
        VTop = VTop - 1
    case tRpr
        ' pop () off stack
        OprTop = OprTop - 1
    end select
    OprTop = OprTop - 1
    Reduce = 0

end function

function Parse(byref Expr as string) as double

    ' initialize for next expression
    OprTop = 0
    VTop = -1
    Opr(OprTop) = tEof
    TermIndex = -1
    
    'split expression into terms (put terms in global array term)
    Split(Expr)
    
    if GetTok() <> 0 then exit function

    do
        ' input is Value (operand)
        if Tok = tVal then
            ' shift token to value stack
            if Shift() <> 0 then exit function
        else
    
            ' input is operator
            select case ParseTbl(Opr(OprTop), Tok)
            case R
                if Reduce() <> 0 then exit function
            case S
                if Shift() <> 0 then exit function
            case A
                ' accept
                if VTop = 0 then
                    'print "value = " & V(0)
                    return V(0)
                else
                    failure = 1                   
                    ErrorMsg("Syntax error")
                end if
                exit function
            case E1
                ErrorMsg("Missing right parenthesis")
                failure = 1
                exit function
            case E2
                failure = 1
                ErrorMsg("Missing operator")
                exit function
            case E3
                failure = 1
                ErrorMsg("Unbalanced right parenthesis")
                exit function
            case E4
                failure = 1
                ErrorMsg("Invalid function argument")
                exit function                
            end select
        end if
    loop

end function

function Test(byref Expr as string) as double

    return(Parse(Expr))

end function

'some testing code
dim t as double
dim s as string = "f(8) + f(6) + f(5) + c(f(2),4)"

var t1 = timer()
t = test(s)
var t2 = timer()
print using "#######.#######";t2 - t1
if (failure) then
  print "eval ==> ";s;" ==> error"
else
  print "eval ==> ";s;" = ";t
end if
t1 = timer()
t = fact(8) + fact(6) + fact(5) + comb(fact(2),4)
t2 = timer()
print using "#######.#######";t2 - t1
print "fb   ==> ";s;" = ";t
redim term(0 to 0)
agamemnus
Posts: 1842
Joined: Jun 02, 2005 4:48

Post by agamemnus »

Omnomnom!
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands
Contact:

Post by marcov »

Very nice.
agamemnus
Posts: 1842
Joined: Jun 02, 2005 4:48

Re: mathematical expression parser (operator precedence pars

Post by agamemnus »

I am going to add this to my Capitalopoly tick engine. Thank you again for this excellent code.
Edit: I've slightly reformulated this so that it fits inside a namespace and removed the globals:

Code: Select all

namespace math_parser_namespace
 ' Uncommenting the DEBUGGING_ macro makes subroutine split print tokens on stdout
 ' #define DEBUGGING_
 ' Actions.
 enum ActionEnum
  S      ' shift
  R      ' reduce
  A      ' accept
  E1     ' error: missing right parenthesis
  E2     ' error: missing operator
  E3     ' error: unbalanced right parenthesis
  E4     ' error: invalid function argument
 end enum
 
 ' Tokens.
 enum tokEnum
  ' Operators.
  tAdd   ' +
  tSub   ' -
  tMul   ' *
  tDiv   ' /
  tPow   ' ^ (power)
  tUmi   ' - (unary minus)
  tFact  ' f(x): factorial
  tPerm  ' p(n,r): permutations, n objects, r at a time
  tComb  ' c(n,r): combinations, n objects, r at a time
  tComa  ' comma
  tLpr   ' (
  tRpr   ' )
  tEOF   ' end of string
  tMaxOp ' maximum number of operators
  ' Non-operators.
  tval   ' value
 end enum
 const maxOpr as integer = 50
 const maxV   as integer = 50

 type initUDT
  dim tok         as tokEnum     ' token
  dim tokVal      as double      ' token value
  dim opr(maxOpr) as integer     ' Operator stack.
  dim v(maxV)     as double      ' value stack.
  dim oprTop      as integer     ' Top of operator stack.
  dim vTop        as integer     ' Top of value stack.
  dim term        as string ptr
  dim term_size   as uinteger
  dim termIndex   as integer     ' Current term.
  dim failure     as integer = 0 ' Set to 1 when evaluation of expression fails.
  
  dim parseTbl(tMaxOp,tMaxOp) as byte = { _
   /' stk   ------------------ input ------------------------ '/ _
   /'   +   -   *   /   ^   M   f   p   c   ,   (   )   $  '/ _
   /'   --  --  --  --  --  --  --  --  --  --  --  --  -- '/ _
   /' + '/  { R,  R,  S,  S,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
   /' - '/  { R,  R,  S,  S,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
   /' * '/  { R,  R,  R,  R,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
   /' / '/  { R,  R,  R,  R,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
   /' ^ '/  { R,  R,  R,  R,  S,  S,  S,  S,  S,  R,  S,  R,  R },_
   /' M '/  { R,  R,  R,  R,  R,  S,  S,  S,  S,  R,  S,  R,  R },_
   /' f '/  { R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  S,  R,  R },_
   /' p '/  { R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  S,  R,  R },_
   /' c '/  { R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  S,  R,  R },_
   /' , '/  { R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  R,  E4},_
   /' ( '/  { S,  S,  S,  S,  S,  S,  S,  S,  S,  S,  S,  S,  E1},_
   /' ) '/  { R,  R,  R,  R,  R,  R,  E3, E3, E3, R, E2,  R,  R },_
   /' $ '/  { S,  S,  S,  S,  S,  S,  S,  S,  S,  E4, S,  E3, A }_
  }
  
  declare function errormsg (byref msg as string) as integer
  declare function split    (byref expression as string) as integer
  declare function getTok   () as integer
  declare function shift    () as integer
  declare function reduce   () as integer
  declare function parse    (byref expr as string) as double
 end type


 ' Splits expression into tokens.
 ' Resulting tokens are placed in dynamic array terms.
 ' Returns: 0 on success.
 '   failure is set to 1 when an illegal token
 '   is found (failure is a global variable)
 function initUDT.split (byref expression as string) as integer
  ' To make the parser a bit more solid split wraps numbers in parens.
  ' That way the parser will report an error when
  ' two consecutive numbers without an operator in between are found.
  ' An example of how this works ($ = top of stack at start of parse)
  ' expression ==> 8 5 + 
  ' parse:
  ' input = 8   
  ' input = 5 action: shift 8 on the operand stack operand  stack: 58$
  ' input = + action: reduce
  ' The reason for this erroneous behaviour of the parser has to do with the
  ' way binary operators are reduced. On a reduce a binary operator simply looks
  ' at the value stack. if there are 2 values on the stack it applies the operator
  ' to those values and puts the result on the value stack.
  ' Wrapping a number into parens leads to different parser behaviour as ( and )
  ' are operators that lead to a reduction. So parsing of 8 5 + (after adding parens)
  ' now looks like
  ' expression ==> (8) (5) +
  ' parse:
  ' input = (  action: push ( on the operator stack operator stack: ($
  ' input = 8  action: push 8 on the value stack value stack   : 8$
  ' input = )  error as parsetabl('(',')' equals E2 (missing operator).
  ' expression ==> (8) f((5))
  ' parse:
  ' input = ( action: push ( on the operator stack operator stack: ($
  ' input = 8  action: push 8 on the value stack value stack   : 8$
  ' input = ) error as parsetbl('(','f') equals E3 (unbalanced right parens)
  ' E3 is a 'bad' error message as there is no problem with the parens.
  ' Error handling could be better at the expense of a more complicated lexer/parser.
  
  dim mark as integer
  dim ct as integer
  dim lexp as integer = len(expression) 
  function = 0
  if (len(expression) = 0) then
   errormsg ("empty expression")
   failure = 1
   exit function
  end if
  
  while (expression[ct] = asc(" ") orelse expression[ct] = asc(!" \t") andalso ct < lexp)
   ct += 1
  wend
  ' expression contains nothing but space.
  if (ct = lexp) then
   errormsg ("empty expression")
   failure = 1
   exit function
  end if
  
  var upper = 0
  var lower = 0 
  
  mark = ct
  while (ct < lexp)
   select case as const expression[ct]
    ' Operator or single char function.
    case asc("+"),asc("-"),asc("*"),asc("/"),asc("^"),asc("("),asc(")"),asc(","),_
    asc("f"),asc("p"),asc("c")   
    if (lower = upper andalso term[0] = "") then
     term[0] = chr(expression[ct])
     #ifdef DEBUGGING_
     print chr(expression[ct])
     #endif
    else
     term = reallocate (term, (upper+2)*sizeof(string)): term_size = upper+2
     upper += 1
     term[upper] = chr(expression[ct])
     #ifdef DEBUGGING_
     print chr(expression[ct])
     #endif
    end if
    ct += 1
    ' Mumber.
    case asc("0"),asc("1"),asc("2"),asc("3"),asc("4"),_
    asc("5"),asc("6"),asc("7"),asc("8"),asc("9")
    number_loop:
    ' At end of expression?
    if (ct = lexp - 1) then
     if (lower = upper andalso term[0] = "") then
      term = reallocate (term, (upper+3)*sizeof(string)): term_size = upper+3
      ' Parenthesize numbers to evoke errors from the parser.
      upper += 2
      term[upper-2] = "("
      term[upper-1] = mid(expression,mark+1,iif(ct - mark=0,1,ct+1 - mark))
      term[upper] = ")"
      #ifdef DEBUGGING_
      print mid(expression,mark+1,iif(ct - mark=0,1,ct+1 - mark))
      #endif    
     else
      term = reallocate (term, (upper+4)*sizeof(string)): term_size = upper+4
      ' Parenthesize numbers to evoke errors from the parser.
      upper += 3
      term[upper-2] = "("
      term[upper-1] = mid(expression,mark+1,iif(ct - mark=0,1,ct+1 - mark))
      term[upper] = ")"
      #ifdef DEBUGGING_
      print mid(expression,mark+1,iif(ct - mark=0,1,ct+1 - mark))
      #endif
     end if
     return 0
    end if    
    ct += 1
    select case as const expression[ct]
    case asc("0"),asc("1"),asc("2"),asc("3"),asc("4"),_
     asc("5"),asc("6"),asc("7"),asc("8"),asc("9"),asc(".")
     goto number_loop    
    case else'asc(" "),asc(!"\t")
     if (lower = upper andalso term[0] = "") then     
      term = reallocate (term, (upper+3)*sizeof(string)): term_size = upper+3
      upper += 2
      ' Parenthesize numbers to evoke errors from the parser.
      term[upper-2] = "("
      term[upper-1] = mid(expression,mark+1,ct - mark)
      term[upper] = ")"
      #ifdef DEBUGGING_
      print mid(expression,mark+1,ct - mark)
      #endif    
     else
      term = reallocate (term, (upper+4)*sizeof(string)): term_size = upper+4
      upper += 3
      ' Parenthesize numbers to evoke errors from the parser.
      term[upper-2] = "("
      term[upper-1] = mid(expression,mark+1,ct - mark)
      term[upper] = ")"
      #ifdef DEBUGGING_
      print mid(expression,mark+1,ct - mark)
      #endif
     end if   
    end select   
    ' Illegal character in expression (expression will evaluate to 0)
   case else
    if (expression[ct] <> Asc(" ") andalso expression[ct] <> asc(!"\t")) then
     errormsg ("syntax error at character " & ct + 1 & " (" & chr(expression[ct]) & ")")
     failure = 1
     exit function
    end if
   end select   
   ' Skip whitespace.
   while (expression[ct] = asc(" ") orelse expression[ct] = asc(!"\t") andalso ct < lexp)
    ct += 1
   wend
   mark = ct
  wend
 end function
 
 function initUDT.errormsg (byref msg as string) as integer
  'print "Error: " & msg
  return 1
 end function
 
 function initUDT.getTok() as integer
  dim tokStr as string
  static prevtok as tokEnum
  ' Get next token.
  getTok = 0
  termIndex = termIndex + 1
  if termIndex >= term_size then
   tok = tEOF
   exit function
  end if
  
  tokStr = Term[termIndex]
  
  ' convert symbol to token
  select case tokStr
  case "+": tok = tAdd
  case "-": tok = tSub
  case "*": tok = tMul
  case "/": tok = tDiv
  case "^": tok = tPow
  case "(": tok = tLpr
  case ")": tok = tRpr
  case ",": tok = tComa
  case "f": tok = tFact
  case "p": tok = tPerm
  case "c": tok = tComb
  case else
   ' Check whether token is a valid (possibly floating point) number.
   var dot = 0
   for i as integer = 0 to len(tokStr) - 1
     select case as const tokStr[i]
     case asc("0"),asc("1"),asc("2"),asc("3"),_
       asc("4"),asc("5"),asc("6"),asc("7"),_
       asc("8"),asc("9")
     case asc(".")
    dot += 1
    if (dot > 1) then
      dot = 0
      goto case_else
    end if
     case else
    case_else:
    errormsg ("token not numeric (" & tokStr & "), use spaces as separators")
    failure = 1
    function = 1
     end select
   next i   
   tokVal = val(tokStr)
   tok = tval
  end select

  ' Check for unary minus.
  if tok = tSub then
    if termIndex > 0 then
   if prevtok <> tval And prevtok <> tRpr then
     tok = tUmi
   end if
    'uminus at start of expression
    ElseIf termIndex = 0 then
     tok = tUmi
    end if
  end if
  prevtok = tok
 end function
 
 function initUDT.shift() as integer
  if tok = tval then
   vTop = vTop + 1
   if vTop >= maxV then
    shift = errormsg ("v stack exhausted")
    exit function
   end if
   v(vTop) = tokVal    
  else
   oprTop = oprTop + 1
   if oprTop >= maxOpr then
    shift = errormsg ("opr stack exhausted")
    failure = 1
    exit function
   end if
   opr(oprTop) = tok
  end if
  if getTok() <> 0 then
   shift = 1
   exit function
  end if
  shift = 0
 end function

 function Fact(byval N as double) as double
  dim i as double
  dim t as double
  t = 1:i=1
  while (i <= N)   
    t *= i
    i += 1
  wend
  return t
 end function
 
 ' Comb is used for debugging purposes -- to check whether
 ' parse yields a mathematically correct result.
 function comb(byval N as double,byval M as double) as double
  return  Fact(n) / (Fact(m) * Fact(n - m))
 end function
 
 ' Perm is used for debugging purposes -- to check whether
 ' parse yields a mathematically correct result.
 function perm(byval n as double,byval m as double) as double
   return Fact(n) / Fact(n - m)
 end function
 
 function initUDT.reduce() as integer
  select case opr(oprTop)
  case tAdd
   ' Apply E := E + E.
   if vTop < 1 then
    reduce = errormsg ("Syntax error")
    failure = 1
    exit function
   end if
   v(vTop - 1) = v(vTop - 1) + v(vTop)
   vTop = vTop - 1
  case tSub
   ' Apply E := E - E.
   if vTop < 1 then
    failure = 1
    reduce = errormsg ("Syntax error")
    exit function
   end if
   v(vTop - 1) = v(vTop - 1) - v(vTop)
   vTop = vTop - 1
  case tMul
   ' Apply E := E * E.
   if vTop < 1 then
    failure = 1
    reduce = errormsg ("Syntax error")
    exit function
   end if
   v(vTop - 1) = v(vTop - 1) * v(vTop)
   vTop = vTop - 1
  case tDiv
   ' Apply E := E / E.
   if vTop < 1 then
    failure = 1
    reduce = errormsg ("Syntax error")
    exit function
   end if
   v(vTop - 1) = v(vTop - 1) / v(vTop)
   vTop = vTop - 1
  case tUmi
   ' Apply E := -E.
   if vTop < 0 then
    failure = 1
    reduce = errormsg ("Syntax error")
    exit function
   end if
   v(vTop) = -v(vTop)
  case tPow
   ' Apply E := E ^ E.
   if vTop < 1 then
    failure = 1
    reduce = errormsg ("Syntax error")
    exit function
   end if
   if (v(vTop - 1) < 0) then
     v(vTop - 1) = -(v(vTop - 1) ^ v(vTop))
   else
     v(vTop - 1) = v(vTop - 1) ^ v(vTop)
   end if
   vTop = vTop - 1
  case tFact
   ' Apply E := f(E).
   if vTop < 0 then
    failure = 1
    reduce = errormsg ("Syntax error")
    exit function
   end if
   v(vTop) = Fact(v(vTop))
  case tPerm
   ' Apply E := p(N,R).
   if vTop < 1 then
    failure = 1
    reduce = errormsg ("Syntax error")
    exit function
   end if
   v(vTop - 1) = Fact(v(vTop - 1)) / Fact(v(vTop - 1) - v(vTop))
   vTop = vTop - 1
  case tComb
   ' Apply E := c(N,R).
   if vTop < 1 then
    failure = 1
    reduce = errormsg ("Syntax error")
    exit function
   end if
   v(vTop - 1) = Fact(v(vTop - 1)) / _
    (Fact(v(vTop)) * Fact(v(vTop - 1) - v(vTop)))
   vTop = vTop - 1
  case tRpr
   ' Pop () off stack.
   oprTop = oprTop - 1
  end select
  oprTop = oprTop - 1
  reduce = 0
 end function
 
 function initUDT.parse (byref expr as string) as double
  ' initUDTialize for next expression.
  oprTop = 0
  vTop = -1
  opr(oprTop) = tEOF
  termIndex = -1
  term = callocate (1, sizeof(string)): term_size = 1
  
  ' split expression into terms (put terms in global array term).
  split(expr)
  
  if getTok() <> 0 then exit function
  do
   ' Input is value (operand).
   if tok = tval then
    ' shift token to value stack.
    if shift() <> 0 then exit function
   else
    ' Input is operator.
    select case parseTbl(opr(oprTop), tok)
    case R
     if reduce() <> 0 then exit function
    case S
     if shift() <> 0 then exit function
    case A
     ' Accept.
     if vTop = 0 then
      return v(0)
     else
      failure = 1       
      errormsg ("Syntax error")
     end if
     exit function
    case E1
     errormsg ("Missing right parenthesis")
     failure = 1
     exit function
    case E2
     failure = 1
     errormsg ("Missing operator")
     exit function
    case E3
     failure = 1
     errormsg ("Unbalanced right parenthesis")
     exit function
    case E4
     failure = 1
     errormsg ("Invalid function argument")
     exit function      
    end select
   end if
  loop
 end function
end namespace

' Some testing code.
dim t as double
dim s as string = "5*(3*2)^(2+1)"
dim math_parser as math_parser_namespace.initUDT
t = math_parser.parse(s)
print t
sleep
Post Reply