Caml Spotting

Friday, March 24, 2017

A Glitch of Lwt.cancel

In lwt.2.7.0, Lwt.cancel cannot cancel the thread in which it is itself running. The following code demonstrates what I mean:

(* ocamlfind ocamlopt -linkpkg -package lwt,lwt.unix x.ml *)

open Lwt

module Test1 = struct
  let self = ref None

  let from_some = function
    | Some x -> x
    | None -> assert false

  let s =
    pause () >>= fun () ->
    cancel (from_some !self);
    prerr_endline "cancel called";
    pause () >>= fun () ->
    prerr_endline "direct cancel cannot cancel itself";
    return ()

  let () =
    self := Some s;
    prerr_endline "set";
    Lwt_main.run @@ s
end

module Test2 = struct
  let self = ref None

  let from_some = function
    | Some x -> x
    | None -> assert false

  let s =
    pause () >>= fun () ->
    async (fun () ->
      pause () >>= fun () ->
      cancel (from_some !self); (* should give an exception Canceled *)
      prerr_endline "cancel called";
      return ());
    pause () >>= fun () ->
    prerr_endline "not dead";
    return ()

  let () =
    self := Some s;
    prerr_endline "set";
    Lwt_main.run @@ s
end

Test1 cannot cancel itself: cancel has no effect. Test2, however, can. The only difference is the presence of async and pause:

async (fun () -> pause () >>= fun () -> cancel <self>)

I do not fully understand why there is this difference, but the key seems to be calling Lwt.cancel from another thread.

The updated code is found at: https://bitbucket.org/camlspotter/lwt_cancel_glitch

Thursday, April 23, 2015

Recover the good old C-x C-b (list-buffers) behaviour around Emacs 24.4 and later

Something changed around Emacs 24.4: my favorite C-x C-b (list-buffers) no longer works as before. It displays the buffer menu in a seemingly random window. Which window is chosen is almost nondeterministic, and there is no way to predict it. Sometimes even the current window is selected. :-(

This change is awful for me, since I have built a dedicated neural circuit for Emacs. I often did:

C-x C-b to display the buffer list in a window other than the current one,
then immediately C-x o to move the cursor from the current window to the buffer list.

I can type these strokes in 0.5 seconds, and this worked fine because what happened with C-x C-b was almost predictable. 24.x ruined it.

After long investigation, finally I found a way to recover the good old behaviour of C-x C-b (or at least something pretty similar):

(defun good-old-list-buffers ()
  (interactive)
  (display-buffer (list-buffers-noselect)))
(global-set-key (kbd "C-x C-b") 'good-old-list-buffers)

This is it. I hope this helps some other Emacs users.

The following version is much closer to the old behaviour:

(defun good-old-list-buffers ()
  (interactive)
  (if (not (string-equal (buffer-name (current-buffer)) "*Buffer List*"))
      (save-selected-window (buffer-menu-other-window))))
(global-set-key (kbd "C-x C-b") 'good-old-list-buffers)

Thursday, July 4, 2013

OCaml◎Scope is now an OCaml heroku app!

OCaml◎Scope, a new OCaml API search, is now a service running at http://ocamloscope.herokuapp.com!

Change list:

Now it no longer uses CamlGI but Eliom as the web engine. Eliom is much safer and easier to write!
DB carries 245302 entries from 76 OCamlFind packages.
Search algorithm tweak to get better results in shorter time
More query forms (see Examples)

Friday, June 7, 2013

OCaml◎Scope : a new OCaml API search by names and types

The first public preview version of OCaml◎Scope is now available at http://ocamloscope.herokuapp.com.

It supports:

Fast search: in-memory DB.
Friendly with OCamlFind packages: names are prefixed with the OCamlFind package name it belongs to.
Friendly with OPAM: each OCamlFind package knows which OPAM package installed it.
Auto extraction of OCamlDoc comments.
Edit distance based path and type search.

Currently, OCaml◎Scope is still at the proof-of-concept level. Many things remain to be done (search result tweaks, UI, tools, etc.), but so far I am happy with its search speed and rather small memory consumption. It currently has nearly 150k entries (100 OCamlFind packages including lablgtk, core, batteries, ocamlnet and eliom) and takes 2 seconds at most per search.

P.S. It has finally been migrated to Heroku!

Tuesday, September 4, 2012

A safe but strange way of modifying OCaml compiler

The updated article and code are available at https://bitbucket.org/camlspotter/compiler-libs-hack

OCaml 4.00.0 is out! Have you tried it? GADT? Better first class modules? More warnings?

I am talking about something different, something more exciting, at least for me: compiler-libs.

Compiler-libs are the modules of the compiler itself, now available to everyone as a library, even to binary package users. This means that we can deliver compiler-ish hacks to everyone much more easily. If a hack is reasonably small, we can publish it not as a compiler patch, which requires a boring source download, patching and a full recompilation of the compiler, but as a stand-alone tool which compiles in a much shorter time. Here, I am going to demonstrate such a small compiler hack: SML style overloading, my favorite compiler mod.

The safest compiler mod ever

What is great about 4.00.0 is that it also has an untyper and an AST printer. They are not part of compiler-libs, but are found in the tools directory. (So binary package users must copy them, but they are very small, and I hope they will soon be in compiler-libs, in 4.00.1 or 4.01.0.)

The untyper takes a type-checked source tree (Typedtree), strips away its attached type information, then returns the corresponding untyped tree (Parsetree). The AST printer prints out a Parsetree as a reparseable OCaml source code.

Using them we can create safe compiler mods: our modified compiler can do whatever it wants, then it converts the result back to a Parsetree and feeds it again to the original typechecker and compiler. If the mod does something wrong, the original compiler part should find it. If a user is paranoid about what our mod does, we can always print out the result as vanilla OCaml code. Cool.

Preparation

All the code is available here:

hg clone https://bitbucket.org/camlspotter/compiler-libs-hack

It contains the full source tree of the official OCaml 4.00.0, but it is included only to satisfy the copyright requirements. We only need a few files from it. And of course, you must have OCaml 4.00.0 installed.

Vanilla compiler

First of all, let's start by cloning a vanilla compiler from compiler-libs. It is very easy:

$ cd vanilla
$ make

cp ../ocaml/driver/main.ml main.ml
ocamlc -I +compiler-libs -I +unix -c main.ml
ocamlc -o vanilla -I +compiler-libs ocamlcommon.cma ocamlbytecomp.cma main.cmo

cp ../ocaml/driver/optmain.ml optmain.ml
ocamlc -I +compiler-libs -I +unix -c optmain.ml
ocamlc -o vanillaopt -I +compiler-libs ocamlcommon.cma ocamloptcomp.cma optmain.cmo

To build a vanilla ocamlc, we need the original main.ml and link it with ocamlcommon.cma and ocamlbytecomp.cma. main.ml must be copied from the original source tree, since it is not included in the compiler-libs.

For the native code compiler, instead of main.ml and ocamlbytecomp.cma, we use optmain.ml and ocamloptcomp.cma.

Now you have two executables vanilla and vanillaopt, which are actually clones of ocamlc and ocamlopt. Try using them to compile some simple modules to see they are really working.

Now you know how to use compiler-libs. Let's do something more interesting.

Compiler with untype+retyping

The next thing is to use the untyper and the AST printer. Here we modify the bytecode compiler workflow a bit, so that once the original compiler has type-checked the source code, we untype it, print it as readable OCaml source, then retype it. The workflow is implemented in ocaml/driver/compile.ml:

Pparse.file ppf inputfile Parse.implementation ast_impl_magic_number
++ print_if ppf Clflags.dump_parsetree Printast.implementation
++ Typemod.type_implementation sourcefile outputprefix modulename env
++ Translmod.transl_implementation modulename
++ print_if ppf Clflags.dump_rawlambda Printlambda.lambda
++ Simplif.simplify_lambda
++ print_if ppf Clflags.dump_lambda Printlambda.lambda
++ Bytegen.compile_implementation modulename
++ print_if ppf Clflags.dump_instr Printinstr.instrlist
++ Emitcode.to_file oc modulename;

Simple. The source file is first parsed by Pparse.file, then the result is sent to the next line of the parsetree dumper, then sent to the type checker, and so on... The source is pipelined from the top line to the bottom.
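To make the top-to-bottom reading concrete, here is a minimal sketch of such a pipeline in plain OCaml. It assumes that ++ behaves as reverse application (x ++ f means f x), which is what the listing above suggests. The stages parse, check, emit and dump_if are made-up stand-ins for illustration, not compiler-libs functions.

```ocaml
(* Reverse application: [x ++ f] is [f x], so stages read top to bottom. *)
let ( ++ ) x f = f x

(* Hypothetical stand-ins for compiler stages (not compiler-libs functions). *)
let parse (src : string) : string list =
  String.split_on_char ' ' src

let check (tokens : string list) : string list =
  List.filter (fun t -> t <> "") tokens

let emit (tokens : string list) : string =
  String.concat "," tokens

(* An optional dump step that passes its input through unchanged,
   in the spirit of print_if. *)
let dump_if (flag : bool) (tokens : string list) : string list =
  if flag then List.iter prerr_endline tokens;
  tokens

(* The value flows from the first line to the last. *)
let compile (src : string) : string =
  parse src
  ++ dump_if false
  ++ check
  ++ emit

let () = print_endline (compile "let  x = 1")
```

Because each stage is just a function, inserting an extra step is a matter of adding one more `++ (fun x -> ...)` line at the right place.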

Here we insert a few extra steps into this pipeline to untype and print:

Pparse.file ppf inputfile Parse.implementation ast_impl_magic_number
++ print_if ppf Clflags.dump_parsetree Printast.implementation
++ Typemod.type_implementation sourcefile outputprefix modulename env
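++ (fun (str, _) ->  (* Inserting an additional step! *)
  let ptree =  Untypeast.untype_structure str in
  Format.eprintf "%a@." Pprintast.structure ptree;
  ptree
)
++ Typemod.type_implementation sourcefile outputprefix modulename env  (* Retyping the untyped tree *)
++ Translmod.transl_implementation modulename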
++ print_if ppf Clflags.dump_rawlambda Printlambda.lambda
++ Simplif.simplify_lambda
++ print_if ppf Clflags.dump_lambda Printlambda.lambda
++ Bytegen.compile_implementation modulename
++ print_if ppf Clflags.dump_instr Printinstr.instrlist
++ Emitcode.to_file oc modulename;

The typed structure str from Typemod.type_implementation is untyped back to ptree by Untypeast.untype_structure, then printed out by Pprintast.structure. The untyped tree is sent to the type checker again, and then to the later steps.
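The safety argument behind this round trip can be sketched with a toy language. Everything below (the types expr and texpr, the functions type_expr, untype, buggy_mod and compile_with) is invented for illustration and is far simpler than OCaml's real Parsetree and Typedtree. The point is only that a mod which corrupts the typed tree is caught when the untyped result is checked again.

```ocaml
(* A toy language standing in for OCaml's real Parsetree and Typedtree. *)
type ty = Ty_int | Ty_bool

(* Untyped tree, as a parser would produce it. *)
type expr =
  | Int of int
  | Bool of bool
  | Add of expr * expr
  | If of expr * expr * expr

(* Typed tree: every node carries the type assigned by the checker. *)
type texpr = { desc : tdesc; ty : ty }
and tdesc =
  | T_int of int
  | T_bool of bool
  | T_add of texpr * texpr
  | T_if of texpr * texpr * texpr

exception Type_error of string

(* The "original type checker". *)
let rec type_expr (e : expr) : texpr =
  match e with
  | Int n -> { desc = T_int n; ty = Ty_int }
  | Bool b -> { desc = T_bool b; ty = Ty_bool }
  | Add (a, b) ->
      let ta = type_expr a and tb = type_expr b in
      if ta.ty = Ty_int && tb.ty = Ty_int
      then { desc = T_add (ta, tb); ty = Ty_int }
      else raise (Type_error "Add expects two ints")
  | If (c, a, b) ->
      let tc = type_expr c and ta = type_expr a and tb = type_expr b in
      if tc.ty = Ty_bool && ta.ty = tb.ty
      then { desc = T_if (tc, ta, tb); ty = ta.ty }
      else raise (Type_error "ill-typed If")

(* The "untyper": drop the type annotations. *)
let rec untype (t : texpr) : expr =
  match t.desc with
  | T_int n -> Int n
  | T_bool b -> Bool b
  | T_add (a, b) -> Add (untype a, untype b)
  | T_if (c, a, b) -> If (untype c, untype a, untype b)

(* A deliberately buggy mod: it swaps in a boolean operand while
   claiming that the operand is still an int. *)
let buggy_mod (t : texpr) : texpr =
  match t.desc with
  | T_add (a, _) ->
      { t with desc = T_add (a, { desc = T_bool true; ty = Ty_int }) }
  | _ -> t

(* Type check, modify, untype, then type check again. *)
let compile_with (modify : texpr -> texpr) (e : expr) : texpr =
  type_expr (untype (modify (type_expr e)))
```

With the identity function as the mod, compile_with accepts Add (Int 1, Int 2). With buggy_mod, the second call to type_expr raises Type_error, because the lie in the type annotation does not survive untyping.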

Does it really work? Yes!:

$ cd retype
$ make

It creates a bytecode compiler retype. It works just like ocamlc, but it also prints out the source code. Try compiling some files with it.

Compiler mod!

Now you should get the idea of compiler modification with compiler-libs: your compiler mod somehow creates an untyped AST, then feeds it to the original typechecker and the rest of the compiler pipeline. The original type-checker assures the safety of the output of your mod. The output can be printed as normal OCaml code by the AST printer, too.

This way, you can even have your own parser and your own type-checker in order to implement a completely different language which uses OCaml as a backend! (Besides, beware of the license terms if you want to distribute your hack!)

But this time, I would like to demonstrate something much simpler: we use the original parser and type-checker, then modify the resulting typedtree by adding another pipeline step after the first type checking of the retype compiler:

(* See overload/compile.ml *)
...
++ Typemod.type_implementation sourcefile outputprefix modulename env
++ (fun (str, _) -> Mod.structure str)   (* We modify the tree! *)
++ (fun str ->
  let ptree =  Untypeast.untype_structure str in
  Format.eprintf "%a@." Pprintast.structure ptree;
  ptree)
++ Typemod.type_implementation sourcefile outputprefix modulename env
++ ...

Mod.structure : Typedtree.structure -> Typedtree.structure does something fancy: in this article, SML style overloading resolution!

SML style overloading

SML style overloading is a very simple way to overload things. It is much simpler than Haskell type classes, and the price is that you cannot derive overloading from overloaded values. You can get the idea from my past article http://camlspotter.blogspot.sg/2011/09/small-patch-for-bizarre-but-user.html. Let's try to overload (+) here too.

The design of the mod of this time is as follows. We need a seed of an overloaded value, with a polymorphic type, but without any actual definition. Fortunately, we have a way for this in OCaml: primitive declaration:

module Loaded = struct
  external (+) : 'a -> 'a -> 'a = "OVERLOADED"
end

Here we declare Loaded.(+) to be a polymorphic function whose implementation is a C function named OVERLOADED. But we do not give any C code. The name OVERLOADED is just a mark for our overloading. Very luckily, we can have such a fake polymorphic value in OCaml as long as the value is never actually used.

In this Loaded module, we stack sub-modules which provide overloaded instances for this (+):

module Loaded = struct
  external (+) : 'a -> 'a -> 'a = "OVERLOADED"
  module Int = struct
    let (+) = Pervasives.(+)
  end
  module Float = struct
    let (+) = Pervasives.(+.)
  end
end

Here we have pluses for int and float. Now the preparation is done! Let's use Loaded.(+) as if it is overloaded by these two instances!:

open Loaded
let _ =
  assert (1 + 2 = 3);
  assert (1.2 + 3.4 = 4.6) (* See it is not +. but + !!! *)

Hey, I used Loaded.(+), which is actually a C primitive without C code! Is it ok? It is NOT, without our compiler mod. The mod must replace the use of Loaded.(+) by Loaded.Int.(+) or Loaded.Float.(+) appropriately depending on its type from the context: the first + is int -> int -> int and the second is float -> float -> float:

(* See overload/mod.ml *)
let resolve_overloading e lidloc path = ...

class map = object (self)
  inherit Ttmap.map as super

  method! expression = function
    | ({ exp_desc= Texp_ident (path, lidloc, vdesc) } as e)->
        begin match vdesc.val_kind with
        | Val_prim { Primitive.prim_name = "OVERLOADED" } ->
            self, resolve_overloading e lidloc path
        | _ -> super#expression e
        end
    | e -> super#expression e
end

let structure str =
  let o = new map in
  let _, str =  o#structure str in
  str

Here is (part of) the code of the mod. It is a function of type Typedtree.structure -> Typedtree.structure, but we are only interested in the uses of identifiers defined by the primitive OVERLOADED. For the boilerplate code that digs into the AST data types, I used a generic map class Ttmap, created by a CamlP4 hack. Each identifier whose definition is OVERLOADED is converted by the function resolve_overloading.
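The pattern of inheriting a generic map and overriding a single case can be sketched on a toy AST. This is only an illustration under simplifying assumptions: the expr type and the class names are invented, and unlike the code above the methods here return just the new tree rather than a (self, tree) pair.

```ocaml
(* A toy AST, far smaller than Typedtree. *)
type expr =
  | Ident of string
  | Const of int
  | Apply of expr * expr list

(* Generic traversal: visits every node and rebuilds the tree unchanged. *)
class map = object (self)
  method expression (e : expr) : expr =
    match e with
    | Ident _ | Const _ -> e
    | Apply (f, args) ->
        Apply (self#expression f, List.map self#expression args)
end

(* Override the one interesting case and delegate the rest to the
   generic traversal. The replacement name is hard-coded here, whereas
   the real mod chooses it from the type of the identifier. *)
class rename_plus = object
  inherit map as super

  method! expression e =
    match e with
    | Ident "+" -> Ident "Loaded.Int.+"
    | e -> super#expression e
end

let rewrite (e : expr) : expr = (new rename_plus)#expression e
```

Since super#expression calls self#expression on the children, the override is applied at every depth of the tree, not only at the root.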

The actual overload resolution is quite simple, if you know the internals of the OCaml type-checker. But if you don't, it is just too painful to read. So it is skipped :^) (see mod.ml if you are really interested). The big picture is: traverse the module which defines the primitive to find the values with the same name, then filter out those which do not match the context type. If there is none left, error. If more than one matches, error. If there is only one candidate, replace the primitive use by the candidate variable.
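The big picture can be written down as a small standalone model. This is a sketch, not the code in mod.ml: the types ty, candidate and resolution are invented, the instance table is hard-coded, and candidates are matched by plain equality where the real type-checker would use unification.

```ocaml
(* A tiny model of types; the real mod works on the type-checker's types. *)
type ty = Int | Float | Bool | Arrow of ty * ty

(* A candidate instance: its qualified name and its type. *)
type candidate = { path : string; ty : ty }

type resolution =
  | Resolved of string         (* exactly one candidate matches *)
  | No_instance                (* nothing matches: error *)
  | Ambiguous of string list   (* several match: error *)

let binop (t : ty) : ty = Arrow (t, Arrow (t, t))

(* Hypothetical result of traversing the module that defines the primitive. *)
let instances = [
  { path = "Loaded.Int.+"; ty = binop Int };
  { path = "Loaded.Float.+"; ty = binop Float };
]

(* [expected] is the type demanded by the context. [None] models a use
   whose context gives no information. Equality stands in for unification. *)
let resolve (candidates : candidate list) (expected : ty option) : resolution =
  let matching =
    match expected with
    | None -> candidates
    | Some t -> List.filter (fun c -> c.ty = t) candidates
  in
  match matching with
  | [] -> No_instance
  | [c] -> Resolved c.path
  | cs -> Ambiguous (List.map (fun c -> c.path) cs)
```

For example, resolve instances (Some (binop Int)) picks Loaded.Int.+, while a use with no context information (None) leaves both candidates and is reported as ambiguous.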

Anyway, building and playing with this mod is very easy:

$ cd overload
$ make

It creates a bytecode compiler poorman. Well, compared to full overloading by type classes, this is a very simple, poor man's overloading solution. There is test code at test/test.ml, so you can try compiling it with poorman:

$ ./poorman -o test/test test/test.ml
$ ./test/test  # Well, it just tests some assertions

Do you see how the overloaded instances are declared in test/test.ml? They are separately defined in modules and then gathered under Loaded with the OVERLOADED primitive by module aliases. Actually it is a very powerful mechanism to tailor overloading!

That's all, folks!

This kind of compiler modification was of course possible even with previous versions of the OCaml compiler, but it had to be distributed as a patch against the original compiler, and users needed to recompile the whole compiler set, which took about 10 minutes. But now, with compiler-libs, it takes less than one minute. Compiler-libs are not just for strange compiler mods, but are also good for developing compiler related tools. It is really encouraging for us, OCaml mutators, since we can deliver our compiler prototypes very easily to end users!

Wednesday, September 21, 2011

A Small Patch for Bizarre but User Controllable Limited Overloading

Yes, it is bizarre. Yes, it is limited. Yes, it is nothing new at all. But yes, it is simple and useful.

I have written a small patch to the OCaml compiler to provide SML style limited overloading. Limited means that you cannot derive overloading using overloaded values. For example, in SML (+) is (limitedly) overloaded:

1 + 2;
1.2 + 3.4;

But you cannot define overloaded double using (+):

(* In SML syntax *)
fun double x = x + x; (* It is not overloaded. *)
                      (* Defaulted to int -> int *)

The patch provides this "poorman's overloading" to OCaml, additionally with controllability: you can choose what is overloaded. And this is the most bizarre part of this patch.

Let's overload plus operators. First of all, list the overloaded instances:

module Plus = struct

  module Int = struct
    let (+) = (+)
  end

  module Float = struct
    let (+) = (+.)
  end

end

That's all. Simple. What I did here is just list the definitions of (+) for different types (int and float). Since one module cannot export more than one value with the same name, those (+) are defined in separate modules. I named them Int and Float, but you can use whatever names you like. The preparation is done.

Now, make those (+) overloaded. It is surprisingly simple:

open* Plus   (* Strange open with a star *)

Done. Now if you use (+), it is overloaded!:

let _ = assert (1 + 2 = 3); assert (1.2 + 3.4 = 4.6);; (* It works! *)

What does open* Plus do? It traverses its signature recursively and opens all the sub-modules. So in this case it is equivalent to:

open Plus
open Plus.Int
open Plus.Float

but in ADDITION, if open* finds more than one definition with the same name, here (+), they are registered as overloaded. So, open* Plus overloads the two (+)!
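For comparison, here is what the same program looks like in vanilla OCaml when each use of (+) picks its instance by hand. This is only a sketch of the effect from the user's point of view, not of how the patch works internally. It uses Stdlib, the current name of the standard module, and the names three and four are just for illustration.

```ocaml
(* The instances, declared exactly as for open*. *)
module Plus = struct
  module Int = struct
    let ( + ) = Stdlib.( + )
  end

  module Float = struct
    let ( + ) = Stdlib.( +. )
  end
end

(* Without the patch, each use has to name its instance explicitly,
   here with a local open. *)
let three = Plus.Int.(1 + 2)
let four = Plus.Float.(1.5 + 2.5)

let () =
  assert (three = 3);
  assert (four = 4.0)
```

With open* Plus, the qualifications Plus.Int and Plus.Float would be chosen from the types of the operands instead of being written out.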

The overloading by open* can be repeated and the overloaded instances can be easily accumulated. This provides simple and powerful "open" world overload control:

module Num = struct

  let (+) = Num.(+/)
  module Big_int = struct let (+) = Big_int.add_big_int end
  module Nat = struct let (+) = Nat.add_nat end
  module Ratio = struct let (+) = Ratio.add_ratio end

end

open* Plus (* overload (+) for int and float *)
open* Num  (* overload (+) for additional 4 num types! *)
open* Int32 (* defines (+) for int32, omitted *)
open* Int64 (* defines (+) for int64, trivial *)
open* Natint (* defines (+) for natint *)

(* Now you can use (+) for 9 num types! *)

Or more simply, once you define the following:

module SuperPlus = struct
  include Plus
  let (+) = Num.(+/)
  module Big_int = struct let (+) = Big_int.add_big_int end
  module Nat = struct let (+) = Nat.add_nat end
  module Ratio = struct let (+) = Ratio.add_ratio end
  include Int32
  include Int64
  include Natint
end

You can just say open* SuperPlus to enjoy the overloading in your current module.

It is limited.

The overloading is limited. Any local ambiguity is reported as a type error immediately. For example, let double x = x + x is rejected since its context does not carry enough type information to resolve the overloading of (+). There is no defaulting, and no fancy real polymorphic overloading.

Each overloaded use must be locally resolvable by itself. The following example has, in principle, no ambiguity: (+) is clearly int -> int -> int from its context, so the type of one could be fixed as int:

module One = struct
  module Int = struct let one = 1 end
  module Float = struct let one = 1.0 end
end

open* Plus
open* One

let _ = assert (one + 2 = 3)

But, for simplicity, this propagation of overloading resolution is not implemented, and this example is rejected because the use of one is falsely judged to be ambiguous. You need an explicit type annotation.

The source

The source is available from my μ-tated OCaml ranch, poormans_overloading branch: https://bitbucket.org/camlspotter/mutated_ocaml/src/f4aeda4f648a

Thursday, September 15, 2011

Redundant open module warning for OCaml

GHC has a warning for imports that are never used; such imports are redundant and cause unexpected name space contamination. The warning is useful for keeping your import list as small as possible.

OCaml's open has the same name space contamination issue, and unnecessary opens should be warned about, too. So I have added a new warning for it.

You can obtain the latest diff for OCaml 3.12.1 from my repo at https://bitbucket.org/camlspotter/mutated_ocaml . After cloning, get it by:

hg diff -r ocaml-3.12.1-11110 -r redundant_open_warning

After patching, building is as usual:

make core coreboot world opt opt.opt install # Beware what you are doing!

I have found nearly 150 redundant opens in OCaml source code! You should check your OCaml code with it, too!

P.S. The first patch contained some garbage. Now the above command creates a clean patch for OCaml 3.12.1.