Bridging the module gap between Node.js and browsers

[2011-11-19] dev, nodejs, javascript, jsmodules

Update 2012-07-04: amdefine: use AMD modules on Node.js

One of the advantages of Node.js is that you can use the same programming language – JavaScript – on both server and client. When it comes to modularizing code that is portable between the two platforms, one is presented with a major challenge: they approach modularity differently. This post examines four solutions for writing cross-platform modules.

Synchronous versus asynchronous modules

Node.js – synchronous modules. Node.js loads its modules synchronously – program execution stops until loading is finished. The following code is a Node.js module.

    var module1 = require("./module1");
    module1.foo();
    var module2 = require("./module2");
    exports.bar = function () {
        module2.baz();
    };

The module uses the function require() to import the modules module1 and module2. It exports the function bar. While require() is doing its work of loading a file from disk and evaluating it, the code waits for the result.

Browsers – asynchronous modules. In browsers, things work differently. Loading can take a long time, so modules (scripts) are always loaded asynchronously: you give the order for loading a file plus a callback that is used to inform you when the file has been loaded. Hence, there is no (synchronous) waiting. That also means that you cannot load an import in the middle of your module, as you can on Node.js. The Asynchronous Module Definition (AMD) standard has been developed with asynchronicity in mind: the module becomes a callback that is invoked once all imports have loaded. Written as an AMD, the above Node.js module looks as follows:

    define([ "./module1", "./module2" ],
        function (module1, module2) {
            module1.foo();
            return { // export
                bar: function () {
                    module2.baz();
                }
            };
        }
    );

Yes, there is a little more syntactic noise, but that is necessary to make things easy to parse. For development, having many little modules is great because it provides structure. For deployment, you want to have as few files as possible, because each download request costs time and bandwidth. Hence, the AMD tool RequireJS comes with an optimizer that lets you compile several AMDs into a single minified file. Such a file can be loaded via almond, a simplified AMD module loader that is only 750 bytes when minified and gzipped.

Note that the above example demonstrates the core AMD syntax. The complete AMD standard has more features.

Crossing platforms. At their core, the two module formats are not dramatically different: specify a path to a file, assign the result of evaluating it to a variable. The following sections examine two approaches for using either module standard on both platforms:

Use boilerplate to ensure compatibility.
Transform “pure” modules on the fly or via compilation.

Compatibility via boilerplate

The idea of boilerplate code is to wrap a module with code that runs it directly on its native platform and acts as an intermediary on the non-native platform. If the module calls the function define() then the boilerplate code has to switch between two implementations of that function, depending on which platform is active. For this section, we assume that define() already exists as a global function on browsers and has to be defined on Node.js. We distinguish three boilerplate patterns, by how they provide an implementation:

via a variable.

    var define; // does nothing if `define` has already been declared.
    if (typeof define === "undefined") {
        define = function (...) {
            ...
        }
    }
    define(...);

This pattern relies on the peculiarities of var in a brittle manner. For example, it will cease to work if the code is wrapped in a function. Thus, it should be avoided.
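The following minimal sketch illustrates the failure mode. It assumes an environment that has globalThis (which is newer than this post) and uses it to simulate a loader-provided define(). The names calls, loaderDefine, fallbackDefine and wrappedModule are hypothetical and only serve the illustration.

```javascript
// Simulate a loader-provided global define() (hypothetical stand-in for an AMD loader).
var calls = [];
globalThis.define = function loaderDefine(body) {
    calls.push("loader");
    return body();
};

// The "via a variable" pattern, wrapped in a function (as a build tool might do).
function wrappedModule() {
    var define; // now declares a NEW local variable that shadows the global one
    if (typeof define === "undefined") { // always true inside the wrapper
        define = function fallbackDefine(body) {
            calls.push("fallback");
            return body();
        };
    }
    return define(function () { return "module value"; });
}

wrappedModule();
// calls is now ["fallback"]: the loader's define() was never used
```

At the top level of a script, the same var declaration would be a no-op if define already existed. Inside the wrapper, it always creates a fresh, undefined variable.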

via a function parameter.

    (function (define) {
        define(...);
    }(typeof define === "function" ? define : function (...) { ... }));

Approach: Wrap all of the code into an immediately-invoked function expression, check whether define() already exists and if not, provide a value for it.

via a method definition.

    ({ define: typeof define === "function" ? define : function (...) { ... } }).
    define(...);

Approach: turn define() into a method call by prepending an object with a suitable method inside. Advantage: shorter and only a prefix (easier to remove, easier to add via copy/paste).

All boilerplate below uses the method definition approach. Apart from putting redundant code into each module, boilerplate code has another disadvantage: tools that combine modules into single files for browsers might not work with it.

Boilerplate to browser-enable a Node.js module.

    ({ define: typeof define === "function"
        ? define  // browser
        : function(F) { F(require,exports,module) } }).  // Node.js
    define(function (require, exports, module) {
        // Node.js module code goes here
    });

Approach: Wrap the Node.js code in a function whose parameters provide the module API. That function is called the module body. On Node.js, the boilerplate can use require, exports and module directly. On browsers, you use advanced AMD features:

The pseudo-modules "require", "exports" and "module" provide a Node.js-style API.
The default for a missing module name array is ["require", "exports", "module"].

Those features go beyond the core functionality introduced previously, but every complete AMD implementation (such as RequireJS [1]) supports them. The challenge is to make the synchronous code inside the module body work in the asynchronous browser environment. That is achieved by the following steps: First, get the source code of the function. Most JavaScript engines let you do that by invoking toString() on it. Second, collect the arguments of all require() calls, e.g. via a regular expression. Those are the modules that are to be loaded. Third, load the modules. Fourth, call the module body with the modules.
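The first two steps can be sketched as follows. This is a simplified illustration, not how any particular AMD loader does it: the helper collectRequires is hypothetical, and its regular expression only recognizes string-literal arguments and would also match require() calls inside comments or strings. Steps three and four (loading the modules and calling the module body) are omitted.

```javascript
// Hypothetical helper: list the module IDs that a module body requires.
function collectRequires(moduleBody) {
    var source = moduleBody.toString(); // step 1: source code of the function
    // Matches require("...") and require('...') with a string-literal argument.
    var pattern = /\brequire\s*\(\s*(["'])(.*?)\1\s*\)/g;
    var ids = [];
    var match;
    while ((match = pattern.exec(source)) !== null) { // step 2: collect arguments
        if (ids.indexOf(match[2]) === -1) { // skip duplicates
            ids.push(match[2]);
        }
    }
    return ids;
}

var deps = collectRequires(function (require, exports, module) {
    var module1 = require("./module1");
    var module2 = require('./module2');
    exports.bar = function () { module2.baz(); };
});
// deps is ["./module1", "./module2"]
```

The module body is never executed here. Only its source text is inspected, which is why the require() calls must be statically visible.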

Disadvantages: boilerplate, source code must be parsed in browsers.
Advantage: works with script tags.

Boilerplate to Node.js-enable an AMD.

    ({ define: typeof define === "function"
        ? define
        : function(A,F) { module.exports = F.apply(null, A.map(require)) } }).
    define([ "./module1", "./module2" ],
        function (module1, module2) {
            return ...
        }
    );

Approach: On Node.js, use require() to compute the arguments for the module body (the second argument of define()). The result of evaluating the body is assigned to module.exports. On browsers, an implementation of define() is provided by an AMD-compatible loader such as RequireJS [1].
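To make the Node.js branch easier to follow, here is a minimal sketch that runs the same logic outside of a real module. The names fakeModules, fakeRequire, fakeModule and nodeDefine are hypothetical stand-ins for Node's module registry, require and module, so that the example needs no files on disk.

```javascript
// Hypothetical stand-ins for Node's per-module `require` and `module`.
var fakeModules = {
    "./module1": { foo: function () { return "foo"; } },
    "./module2": { baz: function () { return "baz"; } }
};
function fakeRequire(id) {
    return fakeModules[id];
}
var fakeModule = { exports: {} };

// Same logic as the Node.js branch of the boilerplate above.
function nodeDefine(A, F) {
    // A: array of module IDs, F: module body.
    // Each ID is resolved synchronously; the results become the arguments of F.
    fakeModule.exports = F.apply(null, A.map(fakeRequire));
}

nodeDefine([ "./module1", "./module2" ],
    function (module1, module2) {
        return {
            bar: function () {
                return module1.foo() + module2.baz();
            }
        };
    }
);

var result = fakeModule.exports.bar(); // "foobaz"
```

Because loading is synchronous on Node.js, the module body can be called immediately. No callback bookkeeping is needed.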

Disadvantage: boilerplate.
Advantage: no parsing on browsers, works with script tags.

Pure Node.js or AMD modules

Node.js modules on browsers. On Node.js, you can use the modules directly. On browsers, you need to distinguish between development and deployment. During development, you can take a simpler, but slower approach and load modules via the lobrow tool [2]. For deployment, you can compile several modules into a single file via a tool such as browserify [3] or webmake [4].

Disadvantage: lobrow needs to parse the code and is limited by its use of XMLHttpRequest (compared to script tags).
Advantage: no boilerplate.

AMDs on Node.js. The tool node-amd-loader [5] supports AMDs natively under Node.js. However, you must import it to activate it. Obviously, it is a hassle if you have to perform that import each time you use Node’s REPL as an interactive JavaScript command line. Thankfully, there is a trick [6] that lets you avoid that – by automatically executing code when you start the REPL. On browsers, you use an AMD-compatible script loader.

Disadvantage: need for adapter on Node.js.
Advantage: can use script tags, no boilerplate.

Structuring modules

This section presents two ways to structure modules. Many other ways exist, some of them a mixture of what is shown here.

Separate exports.

Node.js

    function foo() { } // public
    function bar() {  // private
        foo(); // call public function
    }
    
    // Exports are separate:
    exports.foo = foo;

AMDs

    function foo() { } // public
    function bar() { // private
        foo(); // call public function
    }

    // Exports are separate:
    return {
        foo: foo
    };

Disadvantage: you have to mention each exported identifier a total of three times (once when defining it and twice when exporting it).

Inline exports.

Node.js

    var e = exports;

    e.foo = function () { }; // public
    function bar() { // private
        e.foo(); // call public function
    }

AMDs

    var e = {};

    e.foo = function () { }; // public
    function bar() { // private
        e.foo(); // call public function
    }

    return e;

Disadvantages: you always have to prefix exported identifiers with “e.”, and accessing them that way is minimally less efficient than accessing the exported values directly.

Alternatively, one could put the exported values inside the object literal that is initially assigned to e:

    var e = {
        foo: function () { } // public
    };

    function bar() { // private
        e.foo(); // call public function
    }

This avoids the redundant “e.” when defining an exported value, but then you can’t freely mix private and public identifiers.

Conclusion

My current favorite is AMD + boilerplate for Node.js. That way, Node.js programs don’t have to be aware that they are using a “foreign” module format and browser code can use a proven AMD loader. A possible problem is AMD tools choking on the boilerplate, but that can be solved by either teaching them to ignore boilerplate (e.g. via special comments) or by stripping out the boilerplate during an extra build step. I would love for Node.js to support AMDs natively, but that is not going to happen. The best we can hope for is support for module loader plugins. Lastly, ECMAScript.next will have modules built in, so all of our modularity woes will go away eventually.