ocaml - Retrieve a part of parsing by making separate .mly and .mll
I am writing a front-end to parse a set of txt files. Each file contains a set of procedures; for instance, one txt file looks like:
    sub procedure1
    ...
    end sub
    sub procedure2
    ...
    end sub
    ...
syntax.ml contains:
    (* Each type is defined before it is used. *)
    type procedure_body = ...
    type procedure_declaration =
      { procedure_name : string; procedure_body : procedure_body }
    type ev = procedure_declaration list
    ...
parser.mly looks like:
    %start main
    %type <Syntax.ev> main
    %%
    main:
      procedure_declarations EOF { List.rev $1 }

    procedure_declarations:
      /* empty */ { [] }
    | procedure_declarations procedure_declaration { $2 :: $1 }

    procedure_declaration:
      SUB name = procedure_name EOS
      body = procedure_body
      END SUB EOS
        { { procedure_name = name; procedure_body = body } }
    ...
Now I would like to separate out the parsing of procedure_declaration (for the purpose of exception handling). That means I want to create parser_pd.mly and lexer_pd.mll, and let parser.mly call Parser_pd.main. Therefore, parser_pd.mly looks like:
    %start main
    %type <Syntax.procedure_declaration> main
    %%
    main:
      procedure_declaration EOF { $1 };
    ...
Since most of the content of the previous parser.mly should be moved to parser_pd.mly, parser.mly should be much lighter than before and look like:
    %start main
    %type <Syntax.ev> main
    %%
    main:
      procedure_declarations EOF { List.rev $1 }

    procedure_declarations:
      /* empty */ { [] }
    | procedure_declarations procedure_declaration { $2 :: $1 }

    procedure_declaration:
      SUB name = procedure_name EOS
      ??????
      END SUB EOS
        { { procedure_name = name;
            procedure_body = Parser_pd.main (Lexer_pd.token ??????) } }
The question is that I don't know how to write the ?????? part, or lexer.mll, which should be light (it only reads the tokens END, SUB, and EOS, and lets the other contents be handled by lexer_pd.mll). Maybe some functions from the Lexing module are needed?
I hope my question is clear. Can anyone help?
You write that you want to separate out the parsing of procedure_declaration, but in your code you only separate out procedure_body, so I'm assuming that's what you want.
To put it in my own words, you want to compose grammars without telling the embedding grammar which grammar is embedded. The problem (not a problem in your case, because luckily you have a friendly grammar) is that in LALR(1), you need one token of lookahead to decide which rule to take. Your grammar looks like this:
    procedure_declaration:
      SUB procedure_name EOS procedure_body END SUB EOS
You can combine procedure_name and procedure_body, so that your rule and semantic action look like:
    procedure_declaration:
      SUB combined = procedure_name EOS /* nothing here */ EOS
        { { procedure_name = fst combined; procedure_body = snd combined; } }

    procedure_name:
      id = IDENT
        { let lexbuf = _menhir_env._menhir_lexbuf in
          (id, Parser_pd.main Lexer_pd.token lexbuf) }
Parser_pd will contain the rule:
    main:
      procedure_body END SUB { $1 }
You will want END SUB in Parser_pd, because procedure_body is probably not self-delimiting.
Note that you call the sub-parser before parsing the first EOS after the procedure name identifier, because that EOS will be your lookahead. If you call it at EOS, it is too late: the parser will already have pulled a token from the body. The second EOS is the one after END SUB.
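
The timing described above can be illustrated with a toy model in plain OCaml, without Menhir or ocamllex. This is only a sketch under stated assumptions: the `token` type, the `stream` record (standing in for the shared `Lexing.lexbuf`), and the functions `next`, `parse_body`, and `parse_declaration` are hypothetical names invented for the illustration, and the one-token lookahead is modelled by hand rather than by a real LR automaton.

```ocaml
(* Hypothetical tokens; the real ones come from the generated parsers. *)
type token = SUB | END | EOS | EOF | IDENT of string | WORD of string

(* Shared, stateful input standing in for the common Lexing.lexbuf. *)
type stream = { mutable rest : token list }

let next s =
  match s.rest with
  | [] -> EOF
  | t :: tl -> s.rest <- tl; t

(* Stand-in for Parser_pd.main: reads the body up to and including END SUB. *)
let parse_body s =
  let rec loop acc =
    match next s with
    | WORD w -> loop (w :: acc)
    | EOS -> loop acc
    | END ->
      (match next s with
       | SUB -> List.rev acc
       | _ -> failwith "expected SUB after END")
    | _ -> failwith "unexpected token in body"
  in
  loop []

(* Stand-in for the outer parser, with one token of lookahead. *)
let parse_declaration s =
  (match next s with SUB -> () | _ -> failwith "expected SUB");
  let name =
    match next s with IDENT n -> n | _ -> failwith "expected a name" in
  (* The first EOS is pulled as lookahead before the name is reduced. *)
  let lookahead = next s in
  (* The sub-parser runs in the name's action, so it sees the body first. *)
  let body = parse_body s in
  (match lookahead with EOS -> () | _ -> failwith "expected EOS after name");
  (* The next token the outer parser reads is the EOS after END SUB. *)
  (match next s with EOS -> () | _ -> failwith "expected EOS after END SUB");
  (name, body)

let () =
  let input =
    { rest = [SUB; IDENT "procedure1"; EOS; WORD "a"; EOS; WORD "b"; EOS;
              END; SUB; EOS] } in
  assert (parse_declaration input = ("procedure1", ["a"; "b"]))
```

If `parse_body` were called only after a further `next s`, that call would already have consumed the first body token, which is the "too late" case.
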
The _menhir_env thing is a hack that works only with menhir. You may need another hack to make menhir --infer work (if you use that), because it doesn't expect the user to refer to _menhir_env, so the symbol won't be in scope. That hack would be:
    %{
      type menhir_env_hack = { _menhir_lexbuf : Lexing.lexbuf }
      let _menhir_env = {
        _menhir_lexbuf =
          Lexing.from_function
            (* Make sure this lexbuf is never actually used. *)
            (fun _ _ -> assert false)
      }
    %}
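
As a small check of why this placeholder is harmless, the sketch below (plain OCaml, outside any grammar file) relies on `Lexing.from_function` not calling its argument when the lexbuf is created, so the `assert false` only fires if something actually tries to read from it. The names `created_ok` and `refill_fails` are hypothetical, and forcing a refill through the `refill_buff` field is just a way to imitate what a generated lexer would do.

```ocaml
(* Same shape as the record in the hack; it exists only for type checking. *)
type menhir_env_hack = { _menhir_lexbuf : Lexing.lexbuf }

let _menhir_env =
  { _menhir_lexbuf = Lexing.from_function (fun _ _ -> assert false) }

(* Building the record did not call the refill function. *)
let created_ok = Lexing.lexeme_start _menhir_env._menhir_lexbuf = 0

(* Forcing a refill, as a real lexer would, trips the assertion. *)
let refill_fails =
  let lb = _menhir_env._menhir_lexbuf in
  try lb.Lexing.refill_buff lb; false
  with Assert_failure _ -> true
```
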