Friday, January 21, 2011

Some enhancements for pa_monad's "unit binder"

In OCaml monadic programming, I am completely happy writing the binary bind operator (>>=) by hand. Here is a piece of code from my LLVM thing, which uses a builder monad:

let run clos lty_ret =
  B.cast clos lty_generic_clos_ptr >>= fun clos ->
  B.const_load clos [0; pos_code_ptr] "code" >>= fun loaded ->
  B.cast loaded (L.Type.pointer (L.Type.function_list lty_ret [ L.Type.void_pointer; L.Type.void_pointer ])) >>= fun code_ptr ->
  B.const_load clos [0; pos_env_ptr] "env" >>= fun env_ptr ->
  get_arg_ptr clos (L.Const.int 0) >>= fun args_ptr ->
  B.check_call code_ptr [ env_ptr; args_ptr ] >>= fun () ->
  B.call code_ptr [ env_ptr; args_ptr ] "called"

But some people cannot live with this conservative style and want something like Haskell's do notation. For OCaml, we have pa_monad's perform notation (http://www.cas.mcmaster.ca/~carette/pa_monad/):

let run clos lty_ret = perform
  clos <-- B.cast clos lty_generic_clos_ptr;
  loaded <-- B.const_load clos [0; pos_code_ptr] "code";
  code_ptr <-- B.cast loaded (L.Type.pointer (L.Type.function_list lty_ret [ L.Type.void_pointer; L.Type.void_pointer ]));
  env_ptr <-- B.const_load clos [0; pos_env_ptr] "env";
  args_ptr <-- get_arg_ptr clos (L.Const.int 0);
  B.check_call code_ptr [ env_ptr; args_ptr ]; (* It is "unit binder" *)
  B.call code_ptr [ env_ptr; args_ptr ] "called"

Nice. But after playing with pa_monad for a while, I found two shortcomings in its "unit binder", that is, a bind expression without <-- (I am not sure what it is officially called, so I call it a "unit binder").
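
For readers who want to try the bind-operator style without the LLVM builder, here is a minimal sketch that uses the option monad as a stand-in. The helpers safe_div and check_positive are hypothetical names invented for this illustration, and the perform version in the comment is only roughly what one would write with pa_monad.

```ocaml
(* A tiny option monad standing in for the builder monad. *)
let bind x f = match x with
  | Some v -> f v
  | None -> None

let return x = Some x

let ( >>= ) = bind

(* Hypothetical helper: division that fails on a zero divisor. *)
let safe_div a b = if b = 0 then None else Some (a / b)

(* Hypothetical helper: a monadic check that carries only unit. *)
let check_positive n = if n > 0 then Some () else None

(* With pa_monad this would read roughly:
     perform
       q <-- safe_div a b;
       check_positive q;        (* the "unit binder" *)
       return (q + 1)
*)
let compute a b =
  safe_div a b >>= fun q ->
  check_positive q >>= fun () ->
  return (q + 1)
```

Here compute 10 2 evaluates to Some 6, while compute 10 0 and compute (-10) 2 both evaluate to None. Note that the hand-written version binds the check with fun () ->, so the type checker already insists that the discarded value is unit.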


OCaml sequence expressions are forbidden in perform notation

The perform notation has the form perform e; e; e; e; ..., which it obtains by changing how OCaml's sequence expressions are parsed under the perform keyword. Therefore we cannot write a normal OCaml sequence expression e; e; e; ... inside it. Instead, we have to use let () = e; e; e in:

let run clos lty_ret = perform
  clos <-- B.cast clos lty_generic_clos_ptr;
  let () = prerr_endline "clos done" in
  loaded <-- B.const_load clos [0; pos_code_ptr] "code";
  let () = prerr_endline "loaded done" in
  code_ptr <-- B.cast loaded (L.Type.pointer (L.Type.function_list lty_ret [ L.Type.void_pointer; L.Type.void_pointer ]));
  env_ptr <-- B.const_load clos [0; pos_env_ptr] "env";
  args_ptr <-- get_arg_ptr clos (L.Const.int 0);
  let () = prerr_endline "ptrs done"; prerr_endline "all things are prepared. Now call!" in
  B.check_call code_ptr [ env_ptr; args_ptr ]; (* It is "unit binder" *)
  B.call code_ptr [ env_ptr; args_ptr ] "called"

Oh, BTW, I do not recommend using let _ = e in, since it throws away the result of e whatever its type is, which is unsafe. Use let () = e in instead, to make sure that e has type unit. Anyway, this workaround requires a lot of typing, so I have introduced a \ escape to make side-effecting expressions easier to write in perform:

let run clos lty_ret = perform
  clos <-- B.cast clos lty_generic_clos_ptr;
  \ prerr_endline "clos done";
  loaded <-- B.const_load clos [0; pos_code_ptr] "code";
  \ prerr_endline "loaded done";
  code_ptr <-- B.cast loaded (L.Type.pointer (L.Type.function_list lty_ret [ L.Type.void_pointer; L.Type.void_pointer ]));
  env_ptr <-- B.const_load clos [0; pos_env_ptr] "env";
  args_ptr <-- get_arg_ptr clos (L.Const.int 0);
  \ prerr_endline "ptrs done";
  \ prerr_endline "all things are prepared. Now call!";
  B.check_call code_ptr [ env_ptr; args_ptr ]; (* It is "unit binder" *)
  B.call code_ptr [ env_ptr; args_ptr ] "called"

Simple, isn't it? With the \ sign, normal OCaml sequences and unit binders are clearly distinguished.
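
For comparison, the let () = ... in workaround can be tried in plain OCaml, without the syntax extension. The following is only a sketch: it uses the option monad instead of the builder monad, and the log buffer and the note function are hypothetical names added so the side effects can be inspected.

```ocaml
let bind x f = match x with
  | Some v -> f v
  | None -> None

let return x = Some x

let ( >>= ) = bind

(* Hypothetical log buffer, used instead of prerr_endline so the
   order of the side effects can be checked afterwards. *)
let log = Buffer.create 16

let note s =
  Buffer.add_string log s;
  Buffer.add_char log ';'

let run x =
  return x >>= fun v ->
  (* let () = e in forces e to have type unit. *)
  let () = note "bound v" in
  return (v * 2) >>= fun w ->
  (* A whole sequence e; e is allowed on the right-hand side. *)
  let () = note "bound w"; note "done" in
  return (v + w)

(* By contrast, let _ = note in ... (argument forgotten) would still
   type-check, because the wildcard accepts the function value itself.
   let () = note in ... is rejected by the type checker. *)
```

Evaluating run 3 gives Some 9, and the buffer then holds "bound v;bound w;done;", so the side effects run in the order they are written between the binds.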


Unit binders should really bind unit. Otherwise, a warning should be issued.

I also noticed that a unit binder accepts a monadic value with any contents, and those contents are simply discarded without any warning. This is very dangerous. If your expression has type t monad, where t <> unit, then the t value is presumably meaningful and should not be thrown away silently. Here is an example with the option monad:

let bind x f = match x with
  | Some v -> f v
  | None -> None

let return x = Some x

let the_answer = return 42

perform
  the_answer; (* 42 is gone! *)
  return 666

This is because the unit binder e; is translated to the expression bind e (fun _ -> ...), with a wildcard _. The example above becomes bind the_answer (fun _ -> return 666). I have changed this translation a little so that a warning is printed if a non-unit value is thrown away:

let bind x f = match x with
  | Some v -> f v
  | None -> None

let return x = Some x

let the_answer = return 42

perform
  the_answer; (* Warning S: this expression should have type unit. *)
  return 666

Now the expression is translated to bind the_answer (fun __should_be_unit -> __should_be_unit; return 666), and if __should_be_unit does not have type unit, the OCaml compiler complains about it.
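
The two translations can be written out by hand in plain OCaml to see the difference. This is a sketch of the expanded forms described above, not the actual output of the extension. The names original, modified and fine are made up for the illustration. I would expect current compilers to report the non-unit statement as warning 10 rather than "Warning S", but the exact number and wording depend on the compiler version.

```ocaml
let bind x f = match x with
  | Some v -> f v
  | None -> None

let return x = Some x

let the_answer = return 42

(* Original translation: the wildcard accepts a value of any type,
   so 42 is dropped without any diagnostic. *)
let original = bind the_answer (fun _ -> return 666)

(* Modified translation: the bound value is used as a statement on
   the left of a sequence, so the compiler checks it against unit.
   It is an int here, which should trigger the
   "this expression should have type unit" warning. *)
let modified =
  bind the_answer (fun __should_be_unit -> __should_be_unit; return 666)

(* When the monadic value really carries unit, the same translation
   compiles without the warning. *)
let fine =
  bind (return ()) (fun __should_be_unit -> __should_be_unit; return 666)
```

All three values evaluate to Some 666. The only difference is at compile time, where the second definition draws the warning.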


pa_monad_custom

The modified version is available at https://bitbucket.org/camlspotter/pa_monad_custom . Help yourself.