OCaml - Retrieve a part of parsing by making separate .mly and .mll


I am writing a front-end to parse a set of txt files. Each file contains a set of procedures; for instance, one txt file looks like:

    sub procedure1
    ...
    end sub

    sub procedure2
    ...
    end sub

    ...

syntax.ml contains:

    type ev = procedure_declaration list

    (* "and" lets ev refer to types that are defined after it *)
    and procedure_declaration =
      { procedure_name : string; procedure_body : procedure_body }

    and procedure_body = ...
    ...

parser.mly looks like:

    %start main
    %type <Syntax.ev> main
    %%
    main: procedure_declarations EOF { List.rev $1 }

    procedure_declarations:
      /* empty */ { [] }
    | procedure_declarations procedure_declaration { $2 :: $1 }

    procedure_declaration:
      SUB name = procedure_name EOS
      body = procedure_body
      END SUB EOS
      { { procedure_name = name; procedure_body = body } }
    ...

Now, I would like to retrieve the parsing of procedure_declaration separately (for the purpose of exception handling). That means I want to create parser_pd.mly and lexer_pd.mll, and let parser.mly call Parser_pd.main. Therefore, parser_pd.mly looks like:

    %start main
    %type <Syntax.procedure_declaration> main
    %%
    main: procedure_declaration EOF { $1 };
    ...

As most of the content of the previous parser.mly should be moved into parser_pd.mly, parser.mly should be much lighter than before, and look like:

    %start main
    %type <Syntax.ev> main
    %%
    main: procedure_declarations EOF { List.rev $1 }

    procedure_declarations:
      /* empty */ { [] }
    | procedure_declarations procedure_declaration { $2 :: $1 }

    procedure_declaration:
      SUB name = procedure_name EOS
      ??????
      END SUB EOS
      { { procedure_name = name;
          procedure_body = Parser_pd.main (Lexer_pd.token ??????) } }

The problem is that I don't know how to write the ?????? part, nor lexer.mll, which should be light (it only reads the tokens END, SUB and EOS, and lets the other contents be treated by lexer_pd.mll). Maybe some functions from the Lexing module are needed?

I hope my question is clear... Could anyone help?

You write that you want to retrieve the parsing of procedure_declaration, but in your code you want to retrieve only a procedure_body, so I'm assuming that's what you want.

To put it in my own words, you want to compose grammars without telling the embedding grammar which grammar is embedded. The problem (no problem in your case, because you luckily have a very friendly grammar) is that in LALR(1), you need one token of lookahead to decide which rule to take. Your grammar looks like this:

    procedure_declaration:
      SUB procedure_name EOS
      procedure_body
      END SUB EOS

You can combine procedure_name and procedure_body, so your rule and semantic action will look like:

    procedure_declaration:
      SUB combined = procedure_name EOS /* nothing here */ EOS
      { { procedure_name = fst combined; procedure_body = snd combined; } }

    procedure_name:
      id = IDENT {
        (* hand the lexbuf of the running parser over to the sub-parser *)
        let lexbuf = _menhir_env._menhir_lexbuf in
        (id, Parser_pd.main Lexer_pd.token lexbuf)
      }

parser_pd would contain this rule:

    main: procedure_body END SUB { $1 }

You will most likely want END SUB in parser_pd, because procedure_body is probably not self-delimiting.

Note that you call the sub-parser before parsing the first EOS after the procedure name identifier, because that EOS will be the lookahead. If you call it in EOS, it is too late: the parser will already have pulled a token from the body. The second EOS is the one after END SUB.
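The timing issue can be illustrated without menhir. The following is only a toy sketch: the token type, `make_source`, `parse_body`, `parse_early` and `parse_late` are all hypothetical stand-ins (a closure over a token list plays the role of the shared lexbuf, and `parse_body` plays the role of Parser_pd.main). It shows that handing over while the first EOS is the lookahead gives the sub-parser the whole body, whereas handing over one step later loses the first body token.

```ocaml
(* Hypothetical token type; not the real lexer's tokens. *)
type token = SUB | END | EOS | EOF | IDENT of string | WORD of string

(* A stateful token source shared by both parsers, standing in for a lexbuf. *)
let make_source tokens =
  let rest = ref tokens in
  fun () ->
    match !rest with
    | [] -> EOF
    | t :: ts -> rest := ts; t

(* Stand-in for the sub-parser: body words followed by END SUB. *)
let parse_body next =
  let rec loop acc =
    match next () with
    | WORD w -> loop (w :: acc)
    | END ->
      if next () = SUB then List.rev acc
      else failwith "expected SUB after END"
    | _ -> failwith "unexpected token in body"
  in
  loop []

(* Hand over while the first EOS is read as lookahead but not yet shifted. *)
let parse_early next =
  if next () <> SUB then failwith "expected SUB";
  let name = match next () with IDENT s -> s | _ -> failwith "expected name" in
  let lookahead = next () in
  let body = parse_body next in
  if lookahead <> EOS then failwith "expected EOS after name";
  if next () <> EOS then failwith "expected EOS after END SUB";
  (name, body)

(* Hand over one step later: the lookahead is now a token of the body. *)
let parse_late next =
  if next () <> SUB then failwith "expected SUB";
  let name = match next () with IDENT s -> s | _ -> failwith "expected name" in
  if next () <> EOS then failwith "expected EOS after name";
  let _lost = next () in
  let body = parse_body next in
  if next () <> EOS then failwith "expected EOS after END SUB";
  (name, body)

let input = [SUB; IDENT "p1"; EOS; WORD "a"; WORD "b"; END; SUB; EOS]

let early = parse_early (make_source input)  (* body is ["a"; "b"] *)
let late = parse_late (make_source input)    (* body is ["b"]: "a" was lost *)
```

In a real generated parser the lookahead is managed internally, so the only control you have is which semantic action makes the call.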

The _menhir_env thing is obviously a hack that only works with menhir. You may need another hack to make menhir --infer work (if you use that), because it doesn't expect the user to refer to _menhir_env, so the symbol won't be in scope. That hack would be:

    %{
      type menhir_env_hack = { _menhir_lexbuf : Lexing.lexbuf }
      let _menhir_env = { _menhir_lexbuf = Lexing.from_function
        (* Make sure this lexbuf is never actually used. *)
        (fun _ _ -> assert false) }
    %}
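As a small sanity check on why such a placeholder is harmless, here is a sketch in plain OCaml (the names `refill_called` and `dummy_lexbuf` are made up for illustration). `Lexing.from_function` only stores the refill function; it does not call it when the lexbuf is created, so a placeholder lexbuf that no lexer ever reads from never triggers the function.

```ocaml
(* Records whether the refill function has been invoked. *)
let refill_called = ref false

(* Placeholder lexbuf; the refill function sets the flag instead of failing. *)
let dummy_lexbuf : Lexing.lexbuf =
  Lexing.from_function (fun _buf _len -> refill_called := true; 0)

(* Creating the lexbuf did not call the refill function. *)
let created_without_refill = not !refill_called
```

With `assert false` in place of the flag, any accidental use of the placeholder by a lexer would fail loudly rather than silently return end of input.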
