gren-lang/parser - Parser.Advanced

type Parser context problem value

An advanced Parser gives two ways to improve your error messages:

problem — Instead of all errors being a String, you can create a custom type like type Problem = BadIndent | BadKeyword String and track problems much more precisely.
context — Error messages can be further improved when precise problems are paired with information about where you ran into trouble. By tracking the context, instead of saying “I found a bad keyword” you can say “I found a bad keyword when parsing a list” and give folks a better idea of what the parser thinks it is doing.

I recommend starting with the simpler Parser module though, and when you feel comfortable and want better error messages, you can create a type alias like this:

import Parser.Advanced

type alias MyParser a =
  Parser.Advanced.Parser Context Problem a

type Context = Definition String | List | Record

type Problem = BadIndent | BadKeyword String

All of the functions from Parser should exist in Parser.Advanced in some form, allowing you to switch over pretty easily.

run : Parser c x a -> String -> Result (Array (DeadEnd c x)) a

This works just like Parser.run. The only difference is that when it fails, it has much more precise information for each dead end.

type alias DeadEnd context problem =
  { row : Int
  , col : Int
  , problem : problem
  , contextStack : Array { row : Int, col : Int, context : context }
  }

Say you are parsing a function named viewHealthData that contains a list. You might get a DeadEnd like this:

{ row = 18
, col = 22
, problem = UnexpectedComma
, contextStack =
    [ { row = 14
      , col = 1
      , context = Definition "viewHealthData"
      }
    , { row = 15
      , col = 4
      , context = List
      }
    ]
}

We have a ton of information here! So in the error message, we can say that “I ran into an issue when parsing a list in the definition of viewHealthData. It looks like there is an extra comma.” Or maybe something even better!
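As a rough illustration of turning that information into a message, here is a minimal sketch. The name deadEndToString and its two helper arguments are hypothetical and not part of the package. It assumes String.fromInt, String.join, and Array.map from the core library, and it lists the context stack in whatever order the parser stored it.

```gren
import Parser.Advanced exposing (DeadEnd)

-- Hypothetical helper: render one dead end as a single line of text.
-- The caller supplies how to describe its own Context and Problem types.
deadEndToString : (c -> String) -> (x -> String) -> DeadEnd c x -> String
deadEndToString contextToString problemToString deadEnd =
  let
    position =
      String.fromInt deadEnd.row ++ ":" ++ String.fromInt deadEnd.col

    contexts =
      Array.map (\frame -> contextToString frame.context) deadEnd.contextStack
  in
  position
    ++ ": "
    ++ problemToString deadEnd.problem
    ++ " (while parsing: "
    ++ String.join ", then " contexts
    ++ ")"
```

Mapping a function like this over the array of dead ends returned by a failed run would give one line per dead end.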

Furthermore, many parsers just put a mark where the problem manifested. By tracking the row and col of the context, we can show a much larger region as a way of indicating “I thought I was parsing this thing that starts over here.” Otherwise a missing ] or } or ) can produce a very confusing error message, such as “I need more indentation” reported on something unrelated.

Note: Rows and columns are counted like a text editor. The beginning is row=1 and col=1. The col increments as characters are chomped. When a \n is chomped, row is incremented and col starts over again at 1.
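To make the counting rule concrete, here is a small model of it. This is a sketch for illustration only, not the package's own code: Position, positionAfter, and advance are hypothetical names, it assumes String.foldl from the core library, and it assumes one column per Char.

```gren
-- Hypothetical model of the counting rule described above.
type alias Position =
  { row : Int, col : Int }

-- Position of the parser after it has chomped every character of the input.
positionAfter : String -> Position
positionAfter chomped =
  String.foldl advance { row = 1, col = 1 } chomped

-- A newline moves to the start of the next row; anything else moves one column right.
advance : Char -> Position -> Position
advance char position =
  if char == '\n' then
    { row = position.row + 1, col = 1 }
  else
    { position | col = position.col + 1 }
```

Under this model, chomping "ab\ncd" leaves the position at row 2, col 3.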

inContext : context -> Parser context x a -> Parser context x a

This is how you mark that you are in a certain context. For example, here is a rough outline of some code that uses inContext to mark when you are parsing a specific definition:

import Char
import Parser.Advanced exposing (..)
import Set

type Context
  = Definition String
  | List

definition : Parser Context Problem Expr
definition =
  functionName
    |> andThen definitionBody

definitionBody : String -> Parser Context Problem Expr
definitionBody name =
  inContext (Definition name) <|
    succeed (Function name)
      |= arguments
      |. symbol (Token "=" ExpectingEquals)
      |= expression

functionName : Parser c Problem String
functionName =
  variable
    { start = Char.isLower
    , inner = Char.isAlphaNum
    , reserved = Set.fromArray ["let","in"]
    , expecting = ExpectingFunctionName
    }

First we parse the function name, and then we parse the rest of the definition. Importantly, we call inContext so that any dead end that occurs in definitionBody will get this extra context information. That way you can say things like, “I was expecting an equals sign in the view definition.” Context!

type Token x
  = Token String x

With the simpler Parser module, you could just say symbol "," and parse all the commas you wanted. But now that we have a custom type for our problems, we actually have to specify that as well. So anywhere you just used a String in the simpler module, you now use a Token Problem in the advanced module:

type Problem
  = ExpectingComma
  | ExpectingListEnd

comma : Token Problem
comma =
  Token "," ExpectingComma

listEnd : Token Problem
listEnd =
  Token "]" ExpectingListEnd

You can be creative with your custom type. Maybe you want a lot of detail. Maybe you want looser categories. It is a custom type. Do what makes sense for you!

int : x -> x -> Parser c x Int

Just like Parser.int, where you have to handle negation yourself. The only difference is that you provide two potential problems:

int : x -> x -> Parser c x Int
int expecting invalid =
  number
    { int = Ok identity
    , hex = Err invalid
    , octal = Err invalid
    , binary = Err invalid
    , float = Err invalid
    , invalid = invalid
    , expecting = expecting
    }

You can use problems like ExpectingInt and InvalidNumber.
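Since negation is left to you, one way to handle it is sketched below using only functions documented on this page. The Problem type and the name signedInt are hypothetical and exist only for this example.

```gren
import Parser.Advanced exposing (..)

-- Hypothetical problem type for this sketch.
type Problem
  = ExpectingInt
  | InvalidNumber
  | ExpectingMinus

-- Tries a minus sign followed by an integer first, then a plain integer.
signedInt : Parser c Problem Int
signedInt =
  oneOf
    [ succeed negate
        |. symbol (Token "-" ExpectingMinus)
        |= int ExpectingInt InvalidNumber
    , int ExpectingInt InvalidNumber
    ]
```

Giving the minus sign its own problem lets an error message distinguish "expected a sign" from "expected digits".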

float : x -> x -> Parser c x Float

Just like Parser.float, where you have to handle negation yourself. The only difference is that you provide two potential problems:

float : x -> x -> Parser c x Float
float expecting invalid =
  number
    { int = Ok toFloat
    , hex = Err invalid
    , octal = Err invalid
    , binary = Err invalid
    , float = Ok identity
    , invalid = invalid
    , expecting = expecting
    }

You can use problems like ExpectingFloat and InvalidNumber.

number :
  { int : Result x (Int -> a)
  , hex : Result x (Int -> a)
  , octal : Result x (Int -> a)
  , binary : Result x (Int -> a)
  , float : Result x (Float -> a)
  , invalid : x
  , expecting : x
  }
  -> Parser c x a

Just like Parser.number where you have to handle negation yourself. The only difference is that you provide all the potential problems.
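As an illustration, here is a sketch of a number parser that accepts decimal integers, hex integers, and floats while rejecting octal and binary. The Literal and Problem types and the name literal are hypothetical; only number itself comes from this module.

```gren
import Parser.Advanced exposing (..)

-- Hypothetical types for this sketch.
type Literal
  = IntLiteral Int
  | FloatLiteral Float

type Problem
  = ExpectingNumber
  | InvalidNumber

-- Each field is either Ok with a conversion function (format accepted)
-- or Err with the problem to report (format rejected).
literal : Parser c Problem Literal
literal =
  number
    { int = Ok IntLiteral
    , hex = Ok IntLiteral
    , octal = Err InvalidNumber
    , binary = Err InvalidNumber
    , float = Ok FloatLiteral
    , invalid = InvalidNumber
    , expecting = ExpectingNumber
    }
```

If you want a different message per rejected format, give each Err its own constructor instead of reusing InvalidNumber.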

symbol : Token x -> Parser c x {}

Just like Parser.symbol except you provide a Token to clearly indicate your custom type of problems:

comma : Parser Context Problem {}
comma =
  symbol (Token "," ExpectingComma)

keyword : Token x -> Parser c x {}

Just like Parser.keyword except you provide a Token to clearly indicate your custom type of problems:

let_ : Parser Context Problem {}
let_ =
  keyword (Token "let" ExpectingLet)

Note that this parser would fail on the input letter, because the characters following let show that it is not a standalone keyword. Use token if you do not want that check on the following character.

variable :
  { start : Char -> Bool
  , inner : Char -> Bool
  , reserved : Set String
  , expecting : x
  }
  -> Parser c x String

Just like Parser.variable except you specify the problem yourself.

end : x -> Parser c x {}

Just like Parser.end except you provide the problem that arises when the parser is not at the end of the input.
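For example, a parser that should consume the whole input might be sketched like this. The Problem type and the name wholeInt are hypothetical.

```gren
import Parser.Advanced exposing (..)

-- Hypothetical problem type for this sketch.
type Problem
  = ExpectingInt
  | InvalidNumber
  | ExpectingEnd

-- Parses an integer, then requires that nothing is left over.
wholeInt : Parser c Problem Int
wholeInt =
  succeed identity
    |= int ExpectingInt InvalidNumber
    |. end ExpectingEnd
```

Leftover input after the integer would be reported with the ExpectingEnd problem.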

succeed : a -> Parser c x a

Just like Parser.succeed.

(|=) : Parser c x (a -> b) -> Parser c x a -> Parser c x b

Just like the (|=) from the Parser module.

(|.) : Parser c x keep -> Parser c x ignore -> Parser c x keep

Just like the (|.) from the Parser module.

lazy : ({} -> Parser c x a) -> Parser c x a

Just like Parser.lazy.

andThen : (a -> Parser c x b) -> Parser c x a -> Parser c x b

Just like Parser.andThen.

problem : x -> Parser c x a

Just like Parser.problem except you provide a custom type for your problem.
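A common use is validating a value after it has been parsed. The sketch below combines andThen with problem; the Problem type and the names byte and checkByte are hypothetical.

```gren
import Parser.Advanced exposing (..)

-- Hypothetical problem type; TooLarge carries the offending value.
type Problem
  = ExpectingInt
  | InvalidNumber
  | TooLarge Int

-- Parses an integer, then checks that it is at most 255.
byte : Parser c Problem Int
byte =
  int ExpectingInt InvalidNumber
    |> andThen checkByte

checkByte : Int -> Parser c Problem Int
checkByte n =
  if n <= 255 then
    succeed n
  else
    problem (TooLarge n)
```

Because the problem is a custom type, the error can carry the value that was rejected rather than a preformatted string.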

oneOf : Array (Parser c x a) -> Parser c x a

Just like Parser.oneOf.

map : (a -> b) -> Parser c x a -> Parser c x b

Just like Parser.map.

backtrackable : Parser c x a -> Parser c x a

Just like Parser.backtrackable.

commit : a -> Parser c x a

Just like Parser.commit.

token : Token x -> Parser c x {}

Just like Parser.token except you provide a Token specifying your custom type of problems.
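To contrast token with keyword, here is a minimal side-by-side sketch. The Problem type and the names letToken and letKeyword are hypothetical.

```gren
import Parser.Advanced exposing (..)

-- Hypothetical problem type for this sketch.
type Problem
  = ExpectingLet

-- Matches the three characters "let" and does not look at what follows.
letToken : Parser c Problem {}
letToken =
  token (Token "let" ExpectingLet)

-- Also matches "let", but additionally checks the character that follows.
letKeyword : Parser c Problem {}
letKeyword =
  keyword (Token "let" ExpectingLet)
```

Going by the note in the keyword section, letKeyword would reject the input letter, while letToken would accept its first three characters.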

sequence :
  { start : Token x
  , separator : Token x
  , end : Token x
  , spaces : Parser c x {}
  , item : Parser c x a
  , trailing : Trailing
  }
  -> Parser c x (Array a)

Just like Parser.sequence except with a Token for the start, separator, and end. That way you can specify your custom type of problem for when something is not found.

type Trailing
  = Forbidden
  | Optional
  | Mandatory

What’s the deal with trailing commas? Are they Forbidden? Are they Optional? Are they Mandatory? Welcome to shapes club!
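Putting sequence and Trailing together, a bracketed array of integers might be sketched as follows. The Problem type and the name intArray are hypothetical. To stay within the functions documented above, spaces is set to succeed {}, which skips no whitespace at all; a real parser would likely use a whitespace parser there.

```gren
import Parser.Advanced exposing (..)

-- Hypothetical problem type: one constructor per thing that can be missing.
type Problem
  = ExpectingListStart
  | ExpectingComma
  | ExpectingListEnd
  | ExpectingInt
  | InvalidNumber

-- Intended for input such as [1,2,3] with no spaces and no trailing comma.
intArray : Parser c Problem (Array Int)
intArray =
  sequence
    { start = Token "[" ExpectingListStart
    , separator = Token "," ExpectingComma
    , end = Token "]" ExpectingListEnd
    , spaces = succeed {}
    , item = int ExpectingInt InvalidNumber
    , trailing = Forbidden
    }
```

Switching trailing to Optional or Mandatory changes only how a comma before the closing bracket is treated.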

loop :
  state
  -> (state -> Parser c x (Step state a))
  -> Parser c x a

Just like Parser.loop.

type Step state a
  = Loop state