Binary addon design

04 Sep 2013

Ataraxia Consulting


When writing your application in Node, you may need to communicate with an external library or device in a way that is impractical or impossible from pure JavaScript. It may be that the overhead added by FFI is also too much for your application. Or (the least likely scenario) you have demonstrable proof that your compiler can generate a faster version of your algorithm than the V8 JIT can. While this might be easy to confirm in a micro-benchmark, it’s quite unlikely to hold for everyone, given the varying CPU and compiler combinations in use. If you find yourself in one of these rare cases, you will need to write a binary addon.

If you need to write a binary addon, here are some basic guidelines to keep in mind:

(“Native layer” here means work performed on the C/C++ side, as opposed to what happens in JavaScript.)

Opt Out

Just to reinforce what I said before: by writing this module in C/C++ you’re explicitly opting out of any runtime benefit the JIT may provide you. Remember, the JIT watches your functions as they execute and optimizes and inlines them on the fly.

You will need to strike a good balance between what is done in JavaScript and what is done in C/C++. I would vote to do as little as possible on the C/C++ side, just enough to unpack and pack the values you are passing back and forth between the library and JavaScript.

JavaScript for consumers

To help enforce the previous rule, it’s best to wrap any method defined in C/C++ that’s meant to be consumed by other users in a JavaScript shim. That is, if you NODE_SET_METHOD(target, "foo", Foo), you shouldn’t then do exports.foo = binding.foo. Exporting the native function directly puts the burden of argument checking and validation on the native layer. Instead consider:

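exports.foo = function(bar, baz) {
  // Validate in JavaScript so the native layer can trust its arguments.
  if (isNaN(baz) || baz > 10)
    throw new Error("u did it wrong");
  // Coerce to a string and a number before crossing into C/C++.
  return binding.foo(''+bar, +baz);
};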

Since we’re coercing bar to a string and validating baz before the call, we can make more assumptions about the safety of certain actions in the native layer.

We’re here to write JavaScript, so let’s actually do that whenever possible.

Cheap boundary

It is pretty darn cheap to cross the boundary between JavaScript and C/C++; this is unlikely to be the bottleneck in your application. Write and design what feels comfortable and is easy for you to understand and maintain, with the following caveat:

Primitives Please

You will get the best bang for your buck if you interact only with “primitives”, like:

  • v8::Number or v8::Integer
  • v8::String
  • v8::Function
  • node::Buffer
  • v8::External
  • v8::Array

The methods you export from your binding should avoid (like the plague) creating, inspecting, or mutating complex objects from the native layer. It’s extraordinarily slow. It’s fast in everyday JavaScript usage because the JIT gets to do all sorts of fancy caching and inlining, which you won’t get because you explicitly opted out.

Just in case there’s any confusion: if the method defined in the native layer will be called often, make sure it does not call ->Get and especially not ->Set on an object passed to it. Doing so will only make you sad. Actually, it’s ok to call these methods on relatively small v8::Arrays, because you are just doing normal pointer arithmetic to get the values. It’s the v8::String based lookups and sets you need to be wary of.
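
One way to stay on the right side of this rule is to let the JavaScript shim take an options object apart, so the native method only ever sees positional primitives. The following is a minimal sketch, not a real addon: the binding object is a plain JavaScript stand-in so the example runs by itself, and the method name foo and its two arguments are assumptions made up for illustration.

```javascript
// Stand-in for the compiled addon so this sketch runs on its own.
// The method name and signature are hypothetical.
var binding = {
  foo: function(bar, baz) {
    // A real native method would receive a v8::String and a v8::Number
    // here and never need to call ->Get on an object.
    return bar + ':' + baz;
  }
};

// Consumers pass a convenient options object...
function foo(opts) {
  if (opts === null || typeof opts !== 'object')
    throw new TypeError('opts must be an object');

  // ...and the shim unpacks it into primitives before crossing the boundary.
  var bar = '' + opts.bar;
  var baz = +opts.baz;
  if (isNaN(baz))
    throw new TypeError('opts.baz must be a number');

  return binding.foo(bar, baz);
}

console.log(foo({ bar: 'hello', baz: '7' })); // prints "hello:7"
```

The property lookups happen in JavaScript, where the JIT can cache them, and the native side is left with two cheap values to unpack.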

Remember, it’s cheap to cross the boundary. So if you’d like a method to return a complex object, pass the method a factory function. The native layer calls that function with the representative pieces, the object is created in JavaScript, and it is handed back to the native layer, which is then free to do whatever it needs to with it.

function createObj(a, b, c) {
  return {
    foo: a,
    bar: b,
    baz: c,
  };
}

var myObj = binding.do_something('foobarbaz', createObj);
//myObj.foo
//myObj.bar
//myObj.baz
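
To make the data flow concrete, here is a pure-JavaScript stand-in for the native method. It is only a sketch under stated assumptions: the name doSomething and the way it slices its input into three pieces are invented for illustration. A real binding would compute the pieces in C/C++ and then invoke the v8::Function it was handed.

```javascript
// Hypothetical stand-in for the native method. A real addon would compute
// the pieces in C/C++ and then call the factory it was given.
function doSomething(input, factory) {
  // The "native" side only deals in primitives: three strings.
  var a = input.slice(0, 3);
  var b = input.slice(3, 6);
  var c = input.slice(6, 9);

  // One call back across the boundary builds the object in JavaScript,
  // so the native layer never has to call ->Set.
  return factory(a, b, c);
}

function createObj(a, b, c) {
  return { foo: a, bar: b, baz: c };
}

var myObj = doSomething('foobarbaz', createObj);
console.log(myObj.foo, myObj.bar, myObj.baz); // prints "foo bar baz"
```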

Don’t throw

First and foremost, do NOT use C++ exceptions, ever, at all, in your code. Node is compiled with -fno-exceptions, which doesn’t prevent you from using exceptions but does change how things are cleaned up when a C++ exception is hit. Do NOT use them. Just don’t.

Try to avoid throwing JavaScript exceptions from the native layer. While it’s certainly possible to do so (and Node does so in places), it won’t be as helpful as you might want it to be: mostly you won’t know what threw or where on the native side, just that something did. (You can always use very explicit messages and things like __func__, __FILE__, and __LINE__ to help in that regard … eww)

Chances are, if something that is wrong for the native side made it through the JavaScript side, you want to assert and die a horrible death instead of soldiering on only to have arbitrary memory corruption later.

Remember, we’re here to write JavaScript, don’t be afraid of it.
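
As a rough sketch of what that looks like, the shim below rejects bad input with a descriptive JavaScript error, so the stack trace points at the caller, and the native layer is left to assert on whatever slips through. The binding object is a plain JavaScript stand-in, and the write method and its handle and data arguments are hypothetical names chosen for the example.

```javascript
// Hypothetical stand-in for a compiled addon. A real native method would
// assert on bad input rather than throw.
var binding = {
  write: function(handle, buf) {
    return buf.length;
  }
};

function write(handle, data) {
  // Throw here, in JavaScript, where the message and stack are useful.
  if (typeof handle !== 'number')
    throw new TypeError('write: handle must be a number, got ' + typeof handle);
  if (!Buffer.isBuffer(data))
    throw new TypeError('write: data must be a Buffer, got ' + typeof data);

  return binding.write(handle, data);
}

try {
  write(3, 'not a buffer');
} catch (err) {
  // err.stack names the JavaScript call site, which a throw from the
  // native layer would not give you in as much detail.
  console.log(err.message); // prints "write: data must be a Buffer, got string"
}
```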

Handle wrapping

As an extension of the “JavaScript for consumers” section, consider the following pattern:

var assert = require('assert');
var binding = require('bindings')('mylib');

function Wrapper(opts) {
  if (!(this instanceof Wrapper))
    return new Wrapper(opts);

  assert(opts.foo);
  assert(opts.bar);

  this._handle = binding.libraryInit(opts.foo, opts.bar);
}

Wrapper.prototype.doSomething = function(baz) {
  assert(baz);

  binding.doSomething(this._handle, baz);
};

module.exports = Wrapper;

The idea here is that you have some sort of resource handle that needs to be reused in subsequent calls to your binding. Again, this is perfectly doable in C/C++ by following the node::ObjectWrap pattern, or by implementing something similar yourself, but that sacrifices most of what the JIT can provide you.

In this model we can do more of the validation in JavaScript before sending it to C/C++ and potentially crashing.

If you need to handle finalization of the handle after it goes out of scope in JavaScript, you should make the handle a v8::Persistent and then call MakeWeak to define a callback that will be triggered when the GC is about to collect your object.

Use bindings

Please oh please, use bindings. This simple library takes care of the naming and pathing changes that may occur across various configurations and versions of Node. Most importantly, if I’ve compiled a debug version of Node, I’ll actually build and run a debug version of your module and may be able to help figure out what is going wrong.

Know your tools

Do not be afraid to compile Node with ./configure --debug and get your hands dirty with mdb or gdb to figure out just what is causing your module to crash. There are lots of resources out there to help you with that, but often just getting the stack trace from a core file will tell you quite a lot about what you’ve done wrong.

TL;DR

Write more JavaScript and less C/C++

TL;DR P2

  • Native code is only as optimized as your compiler made it; the JIT can’t improve it at runtime
  • Make sure consumers are getting JavaScript functions
  • It’s cheaper than you think to call between JavaScript and C/C++
  • Do NOT mutate objects in C/C++
  • Avoid exceptions in C/C++ whenever possible
  • Define your wrapper classes in JavaScript
  • Use the bindings module
  • Don’t be afraid to debug