New Lazy Encoding in BuckleScript
Introduction
Recently we made some significant improvements with our new encoding for lazy values, and we find it so exciting that we want to highlight the changes. The new encoding generates very idiomatic JS output like hand-written code.
For people who are not familiar with lazy evaluation, it is documented here.
Comparison between the old and new lazy encoding
Let's take an example, and see how the old encoding of a lazy value would look like:
RElet lazy1 = lazy {
"Hello, lazy" -> Js.log;
1
}; // create a lazy value
let lazy2 = lazy 3 ; // artifical lazy values for demo purpose
Js.log2 (lazy1, lazy2); // logging the lazy values
let (lazy la, lazy lb) = (lazy1, lazy2); // pattern match to force evaluation
Js.log2 (la, lb); // logging forced values
When compiled and run in node
, the runtime representation of our lazy values will look something like:
lazy_demo$node src/lazy_demo.bs.js [ [Function], tag: 246 ] 3 # logging the output of two lazy blocks Hello, lazy # lazy1, laz2 evaluated forced by pattern match, hence logging 1 3 #logging the evaluated lazy block
With the new encoding, the output of the same example code would look like this:
{ RE_LAZY_DONE: false, value: [Function: value] } { RE_LAZY_DONE: true, value: 3 } # logging block one with new encoding Hello, lazy 1 3
As you can see, with the new encoding, no magic tags like 246 appear, and the lazy status is clearly marked via RE_LAZY_DONE: (true | false)
.
In fact, the code quality of our generated bs.js
files has also improved. Going back to our old version, the generated JS would look like this:
JSvar lazy1 = Caml_obj.caml_lazy_make((function (param) {
console.log("Hello, lazy");
return 1;
}));
console.log(lazy1, 3);
var la = CamlinternalLazy.force(lazy1);
var lb = CamlinternalLazy.force(3);
console.log(la, lb);
var lazy2 = 3;
In our new version with all the new changes to the lazy encoding, the output is way more simplified:
JSvar lazy1 = {
RE_LAZY_DONE: false,
value: (function () { // closure now is uncurried arity-0 function
console.log("Hello, lazy");
return 1;
})
};
var lazy2 = {
RE_LAZY_DONE: true,
value: 3
};
console.log(lazy1, lazy2);
var la = CamlinternalLazy.force(lazy1);
var lb = CamlinternalLazy.force(lazy2);
console.log(la, lb);
What changes did we make?
In the native runtime environment, the encoding of lazy values is rather complicated:
It is an array, which is not friendly for debugging in JS context.
It has some special tags which are not meaningful, for example, magic number 246, in JS context.
It tries to unbox lazy values with the help of the native garbage collector (GC). However, this behavior does not make sense in a JS runtime environment since the JSVM does not expose its GC semantics. Keeping that behavior would only introduce more complexity for the JS side.
So in our current master branch, we drastically simplified our lazy encoding scheme to optimize for the JS runtime as much as possible:
The encoding is uniform; it is always an object of two key value pairs. One is
RE_LAZY_DONE
to mark its status, the other is either a closure or an evaluated value.The compiler optimization still kicks in at compile time: if it knows a lazy value is already evaluated or does not need to be evaluated, it will promote its status to be 'done'. However, unlike in a native environment, unboxing is not happening. This makes sense since the most interesting unboxing scenarios only happen during runtime and not during compile time (a scenario which is impossible in the JSVM).
With the new encoding, lazy
is now way more viable for JS usage, so we encourage our users to use it whenever it is convenient!
Caveats:
Don't rely on the special name RE_LAZY_DONE
for JS interop; we may change it to a symbol in the future.