You're going to have to rewrite it anyway

06 Jul 2013

Ataraxia Consulting


Node v1.0 is approaching, and v0.12 is imminent (as far as that goes for FOSS projects). As we work towards getting v0.12 out the door, there have been a lot of changes happening in node’s primary dependency, v8. Ben is working on moving us to the 3.20 branch; follow his progress here.

As you can tell, this is a significant change to the API, one that requires touching virtually every file in our src/. It has been a huge headache for him, and will ultimately be a huge headache for developers of binary addons.

You’re going to have to #ifdef around significant portions of the API to keep your module working across different versions of node. This is going to cause endless pain for node and for developers who have, for the most part, been accepting of the churn in our underspecified addon API.

This one is going to hurt.

A lot.

TL;DR – A modest proposal

Since you’re going to have to rewrite your module anyway, it’s time for node to specify and export the API we are going to “bless” for addons. That means stating exactly which API we will support and keep working across minor and major releases, along with a deprecation policy.

More specifically, I think we should export a separate (and not equal) wrapper around, at the very least, javascript object creation, get/set, and function calling.

Additionally, we should package and distribute (in npm, if possible) a transitional library and headers that module authors can target today, allowing their modules to compile and work from v0.8 through v1.0.

The Platform Problem

We currently allow platforms/distributors to build against shared (their own) versions of many of our dependencies, including but not limited to:

  • v8
    • Holy crap, we’re about as tightly coupled to the version of v8 we ship as chromium itself is.
  • libuv
    • Even if we weren’t strictly coupled to v8, we certainly are to libuv; there would be no (useful) node without libuv.
  • openssl
    • This is a must for linux distributions, who like to break DSA keys and then make every dependency vulnerable as a result (sorry Debian, I keed I keed).
    • This actually allows distributors who know specific things about their platform to enable/disable the features that allow it to run best.
  • zlib
    • Meh, this isn’t such a big deal, it doesn’t really change all that often.
  • http_parser
    • Really? People ship this as a separate library?

This functionality was added to appease platform builders, the likes of Debian, Fedora, and even SmartOS. However, doing so has complicated and muddled the scenario of building and linking binary addons.

Currently node-gyp downloads the sourceball, extracts the headers from it, and makes some assumptions from process.config about how to build your addon. In practice this has been working reasonably well.

However, I’m very concerned about this as a long-term strategy. Someone may have tweaked or twisted the build of node (or of one of its dependencies), which could lead to unintended consequences. In the “best” case, you’ll get a compiler error from a changed API or a clashing symbol. In the worst case, they have modified the ABI, which will manifest itself in unexpected and often subtle ways.

Not to mention that we have no good answer for how to build and link addon modules against the proper version of a shared dependency. What if the system has multiple openssl installs? What if they compiled against it in one place, but now run against it in another?

And last but not least, how do modules consume symbols from our dependencies that node itself doesn’t consume? Consider a specific crypto routine from openssl that you want to provide as an addon module because node doesn’t currently have an interface for it.

Enemies without, and enemies within

As if it weren’t bad enough that platforms may ship against a version of v8 that we haven’t blessed, we (and addon developers) have to fight against the beast that is the v8 API churn.

I don’t really fault Google and the chromium or v8 team for how they are handling this. More often than not we just end up with ugly compile-time deprecation warnings, letting us know the world is about to break.

However, there have been times – like right now – where node can’t paper over the drastic change in the v8 API for module developers. And as a result we begrudgingly pass the API change to module authors.

To paraphrase, don’t forget that excrement will inevitably lose its battle with gravity.

So what are we going to do?

Meat and Potatoes

This is where I don’t particularly have everything fleshed out, and I’m sure I will take a considerable amount of heat from people on API decisions that haven’t been made.

I want to export the following interfaces:

  • node/js.h
    • Object creation and manipulation.
    • Function calling and Error throwing.
  • node/platform.h
    • IO and event loop abstraction.
  • node/ssl.h
  • node/zlib.h
  • node/http.h

While I am not particularly attached to the names of these headers, each represents an interface that I think module authors would opt to target. I only feel strongly that we export js and platform as soon as possible, as they are the primary interactions for every module.

Basic Principles

There are only a few principles:

  • Avoid (like the plague) any scenario where we expose an ABI to module authors.
    • Where possible use opaque handles and getter/setter functions.
  • The exported API should be a reliable interface which authors can depend on working across releases.
  • While a dependency may change its API, we have committed to our external API and need to provide a transitional interface in accordance with our deprecation policy.
  • The API should never expose an implementation detail to module authors (A spidermonkey backed node one day?).
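
The opaque-handle principle can be sketched in C as follows. This is only an illustration under assumptions: node_value, node_number_new, node_number_get, node_number_set, and node_value_free are hypothetical names, not an existing node API. The point is that addon code only ever holds a pointer to an incomplete type, so the struct layout can change between releases without breaking compiled addons.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- What a public header might declare (hypothetical names) ---- */

/* Incomplete type: addon code can hold a pointer, but cannot see or
 * depend on the layout behind it. */
typedef struct node_value node_value;

node_value *node_number_new(double n);
double      node_number_get(const node_value *v);
void        node_number_set(node_value *v, double n);
void        node_value_free(node_value *v);

/* ---- What only the implementation would see ---- */

/* In a real split this definition would live in a private .c file;
 * it is shown here only so the sketch compiles as a single unit. */
struct node_value {
    double number;
};

node_value *node_number_new(double n) {
    node_value *v = malloc(sizeof *v);
    if (v != NULL)
        v->number = n;
    return v; /* NULL on allocation failure */
}

double node_number_get(const node_value *v) {
    assert(v != NULL);
    return v->number;
}

void node_number_set(node_value *v, double n) {
    assert(v != NULL);
    v->number = n;
}

void node_value_free(node_value *v) {
    free(v); /* free(NULL) is a no-op */
}
```

An addon written against the four declarations alone would keep working even if the struct gained fields or was replaced by a reference into a different engine.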

Platform

The platform interface is the easiest to discuss, but the pattern would follow for ssl, zlib, and http.

This would just re-export the existing uv API, but under a C-style namespace prefix of node_. Any struct passing should be avoided, and libuv would need to be updated to reflect that.
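
As a rough sketch of what a node_-prefixed wrapper without struct passing could look like, consider a timer. Everything here is an assumption for illustration: backend_timer is a stand-in and not the real libuv API, and the node_timer_* names and the node_timer_fire helper (which simulates the event loop firing the timer) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the backend's own timer state (NOT the real libuv type). */
struct backend_timer {
    uint64_t timeout_ms;
    int      active;
};

/* ---- Public, node_-prefixed surface (hypothetical) ---- */
typedef struct node_timer node_timer;
typedef void (*node_timer_cb)(node_timer *timer, void *data);

/* ---- Private definition; addons would never see this layout ---- */
struct node_timer {
    struct backend_timer impl;
    node_timer_cb        cb;
    void                *data;
};

node_timer *node_timer_create(void) {
    return calloc(1, sizeof(node_timer)); /* zeroed: inactive, no callback */
}

int node_timer_start(node_timer *t, node_timer_cb cb, void *data,
                     uint64_t timeout_ms) {
    if (t == NULL || cb == NULL)
        return -1;
    t->cb = cb;
    t->data = data;
    t->impl.timeout_ms = timeout_ms;
    t->impl.active = 1;
    return 0;
}

/* Accessors replace direct field access on the handle. */
uint64_t node_timer_get_timeout(const node_timer *t) {
    assert(t != NULL);
    return t->impl.timeout_ms;
}

int node_timer_is_active(const node_timer *t) {
    assert(t != NULL);
    return t->impl.active;
}

/* Simulates the event loop expiring a one-shot timer. */
void node_timer_fire(node_timer *t) {
    assert(t != NULL);
    if (t->impl.active && t->cb != NULL) {
        t->impl.active = 0;
        t->cb(t, t->data);
    }
}

void node_timer_destroy(node_timer *t) {
    free(t);
}

/* Example callback: counts how often the timer fired. */
void count_fire(node_timer *timer, void *data) {
    (void)timer;
    ++*(int *)data;
}
```

The addon passes and receives only pointers, integers, and callbacks, so the backend struct can change size or shape without affecting compiled modules.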

JS

I expect the js interface to be the most contentious, and also fraught with peril.

The interface for addon authors should be C. I don’t want to forsake the C++ folk, but I think the binding for them should be built on our C interface.

I was going to describe my ideal interface, and frame it in context of my ruby and python experience. However, after a brief investigation, the JSAPI for spidermonkey exports almost exactly the API I had in mind. So read about that here.

Would it make sense, and would it be worth the effort, for node to export a JSAPI compatible interface?

Would it make more sense to export a JSAPI-influenced API currently targeted at v8 which could be trivially extended to also support spidermonkey?

UPDATE 2013-07-08:

It’s interesting and worthy to have a conversation about being able to provide a backend neutral object model, though our current coupling to v8 and its usage in existing addons may not make it possible to entirely hide away the eccentricities of the v8 API. But what we can provide is an interface that is viable to target against from release to release regardless of how the public v8 API changes.

Prior Art

A lot of these ideas came from a discussion I had with Joshua Clulow while en route to NodeConf.

Part of that conversation was about v8+, which was written by a particularly talented coworker who had a rather nasty experience writing for the existing C++ API (such as it is).

There’s some overlap in how it works and how I envisioned the new API. However, I’m not sure I’m particularly fond of automatically converting objects into nvlists, though that does solve some of the release and retain issues.

In general I would advocate opaque handles and getter and setter functions, with a helper API which could do that wholesale conversion for you.

Really though this matters less in a world where addon authors are following some defined “Best Practices”.

  • Only pass and return “primitives” to/from the javascript/C boundary
    • Primitives would be things like: String, Number, Buffer.
  • Only perform object manipulation in javascript, where the JIT can work its magic.
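
To make the first practice concrete, here is a minimal sketch of the C side of such a boundary. The function name addon_sum_bytes is hypothetical. It takes what would arrive as a Buffer (a pointer and a length) and returns what would become a Number; any object construction or traversal is assumed to stay in the javascript wrapper that calls it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical addon entry point: Buffer bytes in, Number out.
 * No javascript objects cross the boundary in either direction. */
uint32_t addon_sum_bytes(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}
```

Because the signature involves only primitives, nothing in it depends on how the underlying engine represents objects.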

Dessert

Work on this needs to begin as soon as possible. We should be able to distribute it in npm, and authors should be able to target it by including a few headers in their source and adding a dependency stanza to their binding.gyp. By doing so, their module will work from v0.8 through v1.0.

I mean, you’re going to have to rewrite it anyway.

Discussion should happen on the mailing list, in this thread: https://groups.google.com/d/msg/nodejs/VlUJ68n6QBg/fPsuArtR0roJ