Parser.Advanced
Parsers
An advanced Parser gives two ways to improve your error messages:
- problem: Instead of all errors being a String, you can create a custom type like type Problem = BadIndent | BadKeyword String and track problems much more precisely.
- context: Error messages can be further improved when precise problems are paired with information about where you ran into trouble. By tracking the context, instead of saying “I found a bad keyword” you can say “I found a bad keyword when parsing a list” and give folks a better idea of what the parser thinks it is doing.
I recommend starting with the simpler Parser module though, and when you feel comfortable and want better error messages, you can create a type alias like this:

    import Parser.Advanced

    type alias MyParser a =
        Parser.Advanced.Parser Context Problem a

    type Context = Definition String | List | Record

    type Problem = BadIndent | BadKeyword String

All of the functions from Parser should exist in Parser.Advanced in some form, allowing you to switch over pretty easily.
run works just like Parser.run. The only difference is that when it fails, it has much more precise information for each dead end.
Say you are parsing a function named viewHealthData that contains a list. You might get a DeadEnd like this:

    { row = 18
    , col = 22
    , problem = UnexpectedComma
    , contextStack =
        [ { row = 14
          , col = 1
          , context = Definition "viewHealthData"
          }
        , { row = 15
          , col = 4
          , context = List
          }
        ]
    }

We have a ton of information here! So in the error message, we can say that “I ran into an issue when parsing a list in the definition of viewHealthData. It looks like there is an extra comma.” Or maybe something even better!
Furthermore, many parsers just put a mark where the problem manifested. By tracking the row and col of the context, we can show a much larger region as a way of indicating “I thought I was parsing this thing that starts over here.” Otherwise a missing ] or } or ) can produce a very confusing error message, such as “I need more indentation” on something unrelated.
Note: Rows and columns are counted like a text editor. The beginning is row=1 and col=1. The col increments as characters are chomped. When a \n is chomped, row is incremented and col starts over again at 1.
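To make that counting rule concrete, here is a rough sketch of it as a fold over the chomped characters. This is not how the library tracks positions internally. Position, advance, and positionAfter are names made up for this illustration, and I am assuming String.foldl is available in your core library.

```gren
-- Hypothetical helper type, not part of Parser.Advanced.
type alias Position =
    { row : Int, col : Int }


-- Move the position past one chomped character.
advance : Char -> Position -> Position
advance char pos =
    if char == '\n' then
        -- A newline moves to the next row and resets the column.
        { row = pos.row + 1, col = 1 }

    else
        -- Any other character only moves the column.
        { pos | col = pos.col + 1 }


-- Position reached after chomping the whole string from the beginning.
positionAfter : String -> Position
positionAfter chomped =
    String.foldl advance { row = 1, col = 1 } chomped
```

Under this rule, chomping "ab\ncd" from the beginning leaves you at row 2 and col 3.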
inContext is how you mark that you are in a certain context. For example, here is a rough outline of some code that uses inContext to mark when you are parsing a specific definition:

    import Char
    import Parser.Advanced exposing (..)
    import Set

    type Context
        = Definition String
        | List

    definition : Parser Context Problem Expr
    definition =
        functionName
            |> andThen definitionBody

    definitionBody : String -> Parser Context Problem Expr
    definitionBody name =
        inContext (Definition name) <|
            succeed (Function name)
                |= arguments
                |. symbol (Token "=" ExpectingEquals)
                |= expression

    functionName : Parser c Problem String
    functionName =
        variable
            { start = Char.isLower
            , inner = Char.isAlphaNum
            , reserved = Set.fromList [ "let", "in" ]
            , expecting = ExpectingFunctionName
            }

First we parse the function name, and then we parse the rest of the definition. Importantly, we call inContext so that any dead end that occurs in definitionBody will get this extra context information. That way you can say things like, “I was expecting an equals sign in the view definition.” Context!
With the simpler Parser module, you could just say symbol "," and parse all the commas you wanted. But now that we have a custom type for our problems, we actually have to specify that as well. So anywhere you just used a String in the simpler module, you now use a Token Problem in the advanced module:

    type Problem
        = ExpectingComma
        | ExpectingListEnd

    comma : Token Problem
    comma =
        Token "," ExpectingComma

    listEnd : Token Problem
    listEnd =
        Token "]" ExpectingListEnd

You can be creative with your custom type. Maybe you want a lot of detail. Maybe you want looser categories. It is a custom type. Do what makes sense for you!
Everything past here works just like in the Parser module, except that String arguments become Token arguments, and you need to provide a Problem for certain scenarios.
Building Blocks
Just like Parser.int where you have to handle negation yourself. The only difference is that you provide two potential problems:

    int : x -> x -> Parser c x Int
    int expecting invalid =
        number
            { int = Ok identity
            , hex = Err invalid
            , octal = Err invalid
            , binary = Err invalid
            , float = Err invalid
            , invalid = invalid
            , expecting = expecting
            }

You can use problems like ExpectingInt and InvalidNumber.
Just like Parser.float where you have to handle negation yourself. The only difference is that you provide two potential problems:

    float : x -> x -> Parser c x Float
    float expecting invalid =
        number
            { int = Ok toFloat
            , hex = Err invalid
            , octal = Err invalid
            , binary = Err invalid
            , float = Ok identity
            , invalid = invalid
            , expecting = expecting
            }

You can use problems like ExpectingFloat and InvalidNumber.
Just like Parser.number where you have to handle negation yourself. The only difference is that you provide all the potential problems.
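As a rough sketch of providing all the potential problems yourself, here is a parser that accepts decimal and hex integers and rejects everything else. The record fields follow the shape used by the int and float definitions on this page. The Problem type and the intOrHex name are made up for this example.

```gren
import Parser.Advanced exposing (..)


-- Hypothetical problem type for this sketch.
type Problem
    = ExpectingNumber
    | InvalidNumber


-- Accepts decimal integers like 42 and hex literals like 0x2A.
-- Octal, binary, and float literals are reported as InvalidNumber.
intOrHex : Parser c Problem Int
intOrHex =
    number
        { int = Ok identity
        , hex = Ok identity
        , octal = Err InvalidNumber
        , binary = Err InvalidNumber
        , float = Err InvalidNumber
        , invalid = InvalidNumber
        , expecting = ExpectingNumber
        }
```

Each Err field can carry a different problem if you want to tell folks exactly which number format you do not support.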
Just like Parser.symbol except you provide a Token to clearly indicate your custom type of problems:

    comma : Parser Context Problem {}
    comma =
        symbol (Token "," ExpectingComma)

Just like Parser.keyword except you provide a Token to clearly indicate your custom type of problems:

    let_ : Parser Context Problem {}
    let_ =
        keyword (Token "let" ExpectingLet)

Note that this would fail to chomp letter because of the subsequent characters. Use token if you do not want that last letter check.
Just like Parser.variable except you specify the problem yourself.
Just like Parser.end except you provide the problem that arises when the parser is not at the end of the input.
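Here is a small sketch of end in a pipeline, using only functions described on this page. The Problem type and the wholeInt name are made up for this example.

```gren
import Parser.Advanced exposing (..)


-- Hypothetical problem type for this sketch.
type Problem
    = ExpectingInt
    | InvalidNumber
    | ExpectingEnd


-- Parses an integer and then insists that nothing is left over.
wholeInt : Parser c Problem Int
wholeInt =
    succeed identity
        |= int ExpectingInt InvalidNumber
        |. end ExpectingEnd
```

With this parser, leftover input after the number (a trailing space, for instance) should come back as a dead end carrying ExpectingEnd rather than being silently ignored.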
Pipelines
Just like Parser.succeed
Just like the (|=) from the Parser module.
Just like the (|.) from the Parser module.
Just like Parser.lazy
Just like Parser.andThen
Just like Parser.problem except you provide a custom type for your problem.
Branches
Just like Parser.oneOf
Just like Parser.map
Just like Parser.backtrackable
Just like Parser.commit
Just like Parser.token except you provide a Token specifying your custom type of problems.
Loops
Just like Parser.sequence except with a Token for the start, separator, and end. That way you can specify your custom type of problem for when something is not found.
What’s the deal with trailing commas? Are they Forbidden? Are they Optional? Are they Mandatory? Welcome to shapes club!
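As a rough sketch of how the Token arguments and the trailing setting fit together, here is a parser for a bracketed list of integers. I am assuming the record fields match the ones Parser.sequence takes (start, separator, end, spaces, item, trailing). The Problem type and the intList name are made up for this example.

```gren
import Parser.Advanced exposing (..)


-- Hypothetical problem type for this sketch.
type Problem
    = ExpectingListStart
    | ExpectingComma
    | ExpectingListEnd
    | ExpectingInt
    | InvalidNumber


-- Parses input like [1,2,3]. With Forbidden, [1,2,3,] is rejected.
-- The type annotation is left out on purpose: the collection type of
-- the result is whatever Parser.sequence produces in your version.
intList =
    sequence
        { start = Token "[" ExpectingListStart
        , separator = Token "," ExpectingComma
        , end = Token "]" ExpectingListEnd
        , spaces = spaces
        , item = int ExpectingInt InvalidNumber
        , trailing = Forbidden
        }
```

Because each Token carries its own problem, a missing ] and a missing , show up as different dead ends, which is the whole point of the advanced module.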
Just like Parser.loop
Just like Parser.Step
Whitespace
Just like Parser.spaces
Just like Parser.lineComment except you provide a Token describing the starting symbol.
Just like Parser.multiComment except with a Token for the open and close symbols.
Works just like Parser.Nestable to help distinguish between unnestable /* */ comments like in JS and nestable {- -} comments like in Gren.
Chompers
Just like Parser.getChompedString
Just like Parser.chompIf except you provide a problem in case a character cannot be chomped.
Just like Parser.chompWhile
Just like Parser.chompUntil except you provide a Token in case you chomp all the way to the end of the input without finding what you need.
Just like Parser.chompUntilEndOr
Just like Parser.mapChompedString
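Here is a rough sketch that combines a few of these chompers. It shows that chompIf takes a problem because it can fail, while chompWhile takes none because chomping zero characters is still a success. The Problem type and the lowerName name are made up for this example, and I am assuming {} is the value to pass to succeed when there is nothing to keep.

```gren
import Char
import Parser.Advanced exposing (..)


-- Hypothetical problem type for this sketch.
type Problem
    = ExpectingLowercase


-- Chomps one lowercase letter, then any number of letters or digits,
-- and returns everything that was chomped as a String.
lowerName : Parser c Problem String
lowerName =
    getChompedString <|
        succeed {}
            |. chompIf Char.isLower ExpectingLowercase
            |. chompWhile Char.isAlphaNum
```

If the first character is not a lowercase letter, the dead end carries ExpectingLowercase.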
Indentation
Just like Parser.withIndent
Just like Parser.getIndent
Positions
Just like Parser.getPosition
Just like Parser.getRow
Just like Parser.getCol
Just like Parser.getOffset
Just like Parser.getSource