instead of a closed set of languages. I also removed the offsets:
I simply use the current region to determine whether the
preprocessing directive starts at the beginning of a line. I also
removed scanning line indicators, to make the lexer simpler.
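A minimal sketch of the "beginning of a line" test described above. It assumes the start of the current region can be viewed as a position from the standard library's Lexing module; the helper names [column] and [at_line_start] are hypothetical and not taken from the preprocessor.

```ocaml
(* Sketch only: Lexing.position stands in for the start of a region.
   [column] and [at_line_start] are illustrative names. *)

(* Horizontal offset of a position from the start of its line. *)
let column (pos : Lexing.position) : int =
  pos.Lexing.pos_cnum - pos.Lexing.pos_bol

(* A directive starts a line when the position of its first character
   has column zero. *)
let at_line_start (pos : Lexing.position) : bool = column pos = 0
```

No separately tracked offset is needed: the check only compares two fields already carried by the position.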
LexToken.mll: Moved the function [check_right_context] that
checks stylistic constraints from Lexer.mll to
LexToken.mll. While this triplicates code (as CameLIGO, PascaLIGO
and ReasonLIGO share the same constraints), the benefit is that
Lexer.mll becomes more generic and the signature for the TOKEN
module is simpler (no more exporting predicates, except for
EOF). In accordance with the change of the preprocessor, the
lexers and parsers for LIGO now depend on the kind of comments,
not a fixed set of syntaxes. This gives more versatility when
adding a new language: only the kinds of its comments are needed,
although Lexer.mll and Preproc.mll may have to be modified if
they do not already know the comment delimiters, for example line
comments starting with #.
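To illustrate what depending on the kind of comments can look like, here is a hedged sketch. The types [block_comment] and [comments], the three sample values and the function [opens_line_comment] are all hypothetical names chosen for this example; they are not the interface of Lexer.mll or Preproc.mll.

```ocaml
(* Hypothetical types: the names are illustrative, not the LIGO API. *)
type block_comment = { opening : string; closing : string }

type comments = {
  block : block_comment option;  (* delimiters of block comments, if any *)
  line  : string option          (* opening marker of line comments, if any *)
}

(* Three sample comment kinds, named after their look, not after a syntax. *)
let ml_style   = { block = Some { opening = "(*"; closing = "*)" }; line = Some "//" }
let c_style    = { block = Some { opening = "/*"; closing = "*/" }; line = Some "//" }
let hash_style = { block = None; line = Some "#" }

(* [has_prefix ~prefix s] holds when [s] starts with [prefix]. *)
let has_prefix ~prefix s =
  let n = String.length prefix in
  String.length s >= n && String.sub s 0 n = prefix

(* Does [text] start with the line-comment marker of [c]? *)
let opens_line_comment (c : comments) (text : string) : bool =
  match c.line with
  | None -> false
  | Some marker -> has_prefix ~prefix:marker text
```

With such a description, supporting a new language would amount to supplying one more value of type [comments], provided the scanners already recognise its delimiters.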
****************************************************************
BUG: The exceptions coming from LexToken.mll when a stylistic
constraint is broken in [LexToken.check_right_context] are not
caught yet.
****************************************************************
Lexer.mll: I moved out as much as I could from the header into a
new module LexerLib. The aim is to make it easy to reuse as much
as possible of the lexer machinery, when it cannot be used as
is.
like the absence of an input filename. (This simplifies all the
client code.) Fixed the dune file for the preprocessor. Fixed
the build of PreprocMain.exe and PreprocMain.byte. Restricted
preprocessing errors [Preproc.Newline_in_string] and
[Preproc.Open_string] to the argument of the #include
directive (instead of general strings: this is for the LIGO lexer
to report the error). I removed the error [Preproc.Open_comment]
as this is for the LIGO lexer to report. The preprocessor scanner
[Preproc.lex] does not take a parameter [is_file:bool] now: the
source file (if any) is determined from the lexing
buffer. Accordingly, the field [is_file] of the state of the
preprocessing lexer has been removed: the lexing buffer is now
the reference for the input source (bug fix and interface
improvement). Fixed the comments of the test contract
pledge.religo. I removed the data constructor [Lexer.Stdin], as it is
redundant with [Lexer.Channel].
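The two interface changes above can be sketched as follows, under stated assumptions: the type [input] and the functions [source_file] and [lexbuf_of_input] are hypothetical and only mirror the idea; the real definitions in Lexer.mll and Preproc.mll may differ. Only the calls into the standard Lexing module are real.

```ocaml
(* Hypothetical input type. Standard input needs no constructor of its
   own, because it is simply [Channel stdin]. *)
type input =
  | File of string
  | String of string
  | Channel of in_channel

(* The lexing buffer is the reference for the input source: an empty
   file name means that the input is not a file. *)
let source_file (lexbuf : Lexing.lexbuf) : string option =
  match lexbuf.Lexing.lex_curr_p.Lexing.pos_fname with
  | "" -> None
  | name -> Some name

(* Build a lexing buffer, recording the file name in it when there is one. *)
let lexbuf_of_input : input -> Lexing.lexbuf = function
  | String s -> Lexing.from_string s
  | Channel c -> Lexing.from_channel c
  | File path ->
      let lexbuf = Lexing.from_channel (open_in path) in
      lexbuf.Lexing.lex_curr_p <-
        { lexbuf.Lexing.lex_curr_p with Lexing.pos_fname = path };
      lexbuf
```

Because the file name travels with the buffer, a caller does not need to pass a separate boolean such as [is_file].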
LIGO lexer later). Added field [is_file] to the state of the
lexer to know if the input is a file or not (that is, whether or
not to insert a first line directive). Fixed ReasonLIGO comments in
entrypoints-contracts.md and website2.religo. WIP on the LIGO
lexer to properly handle comments for all the syntaxes.
* The parameter for logging the lexer is now mandatory.
* The ParserAPI now threads the logging of the lexer.
* LexerMain.ml now calls the logging of the lexers (CameLIGO, ReasonLIGO).
* Fixed bug in lexer when a line comment ends with EOF.
I removed the last top-level effect (the execution of cpp).
The idea is that ParserUnit.ml and each ParserMain.ml get closer
to pascaligo.ml, cameligo.ml and reasonligo.ml, respectively.
* I added CLI option "--mono" to select the monolithic API of Menhir.
* I added a field "win" to the state of the lexer (a two-token
window for error reporting).
* I escaped LIGO strings before making them OCaml strings (for
example for printing).
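A hedged sketch of the last two items. The type [window] and the functions [slide] and [show_string] are hypothetical names for this example, and the use of [String.escaped] from the standard library is an assumption about one way to escape a string, not a statement about what the lexer actually does.

```ocaml
(* Hypothetical two-token window: it remembers at most the last two
   tokens, most recent first, so an error message can show them. *)
type 'tok window =
  | Nil
  | One of 'tok
  | Two of 'tok * 'tok

(* Slide the window by one token, dropping the oldest if needed. *)
let slide (tok : 'tok) : 'tok window -> 'tok window = function
  | Nil -> One tok
  | One prev | Two (prev, _) -> Two (tok, prev)

(* Escape a string before printing it as an OCaml string literal.
   [String.escaped] handles quotes, backslashes and control characters. *)
let show_string (s : string) : string =
  "\"" ^ String.escaped s ^ "\""
```

For instance, sliding the tokens 1, 2 and 3 through an empty window leaves Two (3, 2).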
I also had to remove the keywords [Down], [Fail] and [Step] in
PascaLIGO that made a mysterious and unwanted comeback. (I did not
bother with [git blame].)
LexToken, AST: Tiny refactoring.
Bug: Added the construction of the AST node PBytes.
Parser: The rule "pattern" was not properly stratified: the
constructor "PCons" was always produced, even when no consing was
done. It now falls through to "core_pattern" in that case.
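The intent of the stratification can be sketched as a semantic action written in plain OCaml. The type below is a simplified, hypothetical stand-in for the AST (the real "PCons" node carries more information), and [mk_pattern] is an illustrative name.

```ocaml
(* Simplified, hypothetical pattern type. *)
type pattern =
  | PVar  of string
  | PCons of pattern list   (* p1 :: p2 :: ... *)

(* [items] holds the sub-patterns separated by the consing operator. *)
let mk_pattern (items : pattern list) : pattern =
  match items with
  | [] -> invalid_arg "mk_pattern: empty pattern"
  | [core] -> core          (* no consing: fall through to the core pattern *)
  | _ -> PCons items        (* at least one consing operator was parsed *)
```

A lone variable pattern thus stays a PVar instead of being wrapped in a one-element PCons.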
Bug: When sharing the lexers between Ligodity and Pascaligo, a
regression was introduced with the lexing of symbols. Indeed,
symbols specific to Ligodity (like "<>") and
Pascaligo (like "=/=") were scanned, but the
function "LexToken.mk_sym" for each only accepted their own,
causing an assertion to fail. Fix: I created an
error "sym_err" to handle that situation gracefully and
provide a hint to the programmer (to wit, to check the LIGO
syntax in use).
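A minimal sketch of the fix, assuming that "mk_sym" returns a result instead of asserting. The token subset, the constructor [Invalid_symbol], the function [hint] and the wording of the message are hypothetical; only the names "mk_sym" and "sym_err" and the two sample symbols come from the text above.

```ocaml
(* Hypothetical subset of the tokens of a Pascaligo-like lexer. *)
type token = NE | EQ

(* The error raised for a symbol that belongs to another syntax. *)
type sym_err = Invalid_symbol

(* Accept only the symbols of this syntax; report the others as an
   error instead of failing an assertion. *)
let mk_sym (lexeme : string) : (token, sym_err) result =
  match lexeme with
  | "=/=" -> Ok NE
  | "="   -> Ok EQ
  | _     -> Error Invalid_symbol

(* Illustrative hint for the programmer; the wording is an assumption. *)
let hint (Invalid_symbol : sym_err) : string =
  "Invalid symbol. Hint: check the LIGO syntax in use."
```

Here "<>" is scanned by the shared lexer but rejected by this "mk_sym", so the client can print the hint rather than crash.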
WIP: Started to write pretty-printing functions for the nodes of
the AST.
CLI: The option "--verbose=ast" now calls the pretty-printer instead
of printing the tokens from the AST. When the pretty-printer is
finished, the option for printing the tokens will likely
be "--verbose=ast-tokens".