Introduction

This file contains information only of interest to library maintainers, and possibly to "power" users of the library. It is presumed that you've read the entire Programmer's Guide already.


Building

Build Targets

Run "make" with no arguments in the "buildscripts" directory to get a list of the available targets.

Required Tools

No tools are necessary to simply use the already-built JavaScript library.

Different tools are necessary for maintainers, depending on the make target desired.

We only bundle the tools we author; you must download and install the others. You must also edit your copy of buildscripts/externalconfig.mak to indicate the location of these tools.

In the case of the JsUnit tools, you currently have no choice in installation location: they must be in top-level subdirectories jsunit_hieatt and jsunit_schaible (those might be symbolic links of course).

To use the Makefile at all, it is necessary to have these:

  gnu make:  http://www.gnu.org/software/make/make.html
  perl:      http://www.perl.org/
  standard unix shell commands like "ls" and "cp":  for Windows, try http://www.cygwin.com/

Tools for Generating API documentation

To generate API documentation (make target "apidoc") it is necessary to have:

  doxygen:     http://www.stack.nl/~dimitri/doxygen/
  dot:         http://www.research.att.com/sw/tools/graphviz/ 
               http://www.graphviz.org/pub/graphviz/graphviz-current.tar.gz
               http://www.phil.uu.nl/~js/graphviz/
  src2doc.pl:  (our tool, provided)

Tools for Command-Line Unit Testing

To run unit tests at the command line (make target "testbuild" or "testsrc") it is necessary to have:

  a command-line ECMAScript interpreter
  Schaible JsUnit:    http://jsunit.berlios.de
  BUFakeDom.js:       (our tool, provided)
  jsunit_wrap.js:     (our tool, provided)

For a command-line ECMAScript interpreter, see shells for some choices. You will need to set the JS variable in buildscripts/externalconfig.mak accordingly.

Note that Schaible JsUnit must currently be installed in the top-level directory as "jsunit_schaible" (or a symbolic link placed there).

Tools for Browser Unit Testing

To run unit tests in a browser (via testRunner.html) it is necessary to have:

  Hieatt JsUnit: http://jsunit.net

Note that this must currently be installed in the top-level directory as "jsunit_hieatt" (or a symbolic link placed there).

Other Tools

For other output types from doxygen, you may need:

  tetex
  ghostscript

For xslt processing of xml output from doxygen, you'll need an xslt processor such as:

  xsltproc
  saxon


Style Guide

Comments

Some general guidelines for writing "javadoc" comments that may be worth reading:

   http://www.nbirn.net/ForDevelopers/Conventions/Commenting/C_Comments.htm
   http://java.sun.com/j2se/javadoc/writingdoccomments/

Code Formatting

Formatting is among the *least* important issues in coding standards, yet one which many programmers feel strongly about. Decent editors will automatically reformat an entire file to comply with a new convention. We therefore do not dictate formatting conventions here. The principal author/maintainer of any particular file gets to choose.

Note that programmers should feel no compulsion to reduce white space for reasons of file download time, as we post-process the files under source control.

Some basic things I do:

   Usually, do not rely on automatic semicolon insertion.
   Usually, use an explicit {} block in "if", "else", "for", etc.
   Do not use tabs for indentation.
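
As an illustration of the first two points, here is a minimal sketch. The function names are made up for this example and are not part of the library.

```javascript
// Hypothetical example: sumPositive is not part of the library.
function sumPositive(values) {
  var total = 0;                      // explicit semicolon on every statement
  for (var i = 0; i < values.length; i++) {
    if (values[i] > 0) {              // explicit block, even for one statement
      total += values[i];
    }
  }
  return total;
}

// Why relying on automatic semicolon insertion is risky: the line break
// after "return" ends the statement, so this function returns undefined
// and the braces below are parsed as an unreachable block.
function brokenReturn() {
  return
  {
    ok: true
  };
}
```

Calling brokenReturn() yields undefined rather than an object, which is the kind of surprise the first guideline is meant to avoid.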

Symbol Naming

Symbol naming is more important than code formatting because symbol naming cuts across source files, is externally visible, and is not trivial to change later.

We specify here some naming conventions for namespaces, classes, constants, variables, methods, and functions, and how these conventions relate to file and directory names.

Note that because the ECMAScript language does not have native support for all these constructs, any naming conventions have to be considered in light of a programming convention for emulating these constructs. For more discussion of that emulation, see "Emulating Programming Constructs in ECMAScript 262-3".

Comparison to Java and C++

These naming guidelines are a bit of a cross between those of Java and C++. Java has packages, classes, and interfaces, and has strict enforcement of a relationship between source directories and packages. C++ has namespaces and classes, and has no enforcement of a relationship between source files and namespaces.

The tendency in modern C++ is towards all lower-case for practically all symbols (see for example www.boost.org). While this may be appealing, for the sake of consistency with ECMAScript tradition (as represented by the builtin objects and methods) the conventions presented here are more Java-like.

Symbol Naming Guidelines

What | Example | Like | Not Like
classes | UpperLower | Java and .NET | C++
packages (parent namespace names and any matching directories) | burst.io | Java and C++ | .NET (UpperLower)
namespaces with non-namespace members (like a class with only static members) | burst.Text | Java | C++ (all lower_case), .NET (all UpperLower)
public methods | fooBar | Java | .NET (Object.GetHashCode()), C++ (get_hash_code)
public static functions | fooBar or foo_bar | C++ | .NET (System.Math.BigMul); not possible in Java
non-public methods and functions | fooBar_ (recommended, not required) | many conventions | -
public variables | foo_bar | C++ | -
non-public variables | foo_bar_ (recommended, not required; could also be _foo_bar) | many conventions | -
constants (and enums, to the extent emulated) | FOO_BAR | everything | -

Note that this convention means there is no lexical distinction between "terminal" namespace names and class names. This is appropriate, since at least with our namespace emulation, qualification by namespace and qualification of a static class member are done identically with '.'. (This is similar to C++ qualification by class or namespace with '::').

In general, each level in a naming hierarchy may be either a namespace or class (as with C++).
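
To make the guidelines above concrete, here is a small sketch that uses each naming form once. Only the "burst" root comes from this document; the "demo" package, the Counter class, and the Text namespace shown here are hypothetical and exist only for illustration.

```javascript
// Hypothetical symbols throughout; only the "burst" root is from the text.
var burst = {};
burst.demo = {};                          // package: all lower-case

// class: UpperLower
burst.demo.Counter = function(start_value) {
  this.step_size = 1;                     // public variable: foo_bar
  this.count_ = start_value || 0;         // non-public variable: foo_bar_
};

burst.demo.Counter.MAX_COUNT = 1000;      // constant: FOO_BAR

// public method: fooBar
burst.demo.Counter.prototype.addStep = function() {
  this.count_ = this.clamp_(this.count_ + this.step_size);
  return this;
};

// public method: fooBar
burst.demo.Counter.prototype.getCount = function() {
  return this.count_;
};

// non-public method: fooBar_
burst.demo.Counter.prototype.clamp_ = function(n) {
  return Math.min(n, burst.demo.Counter.MAX_COUNT);
};

// terminal namespace with only static members: UpperLower, like a class
burst.demo.Text = {};

// public static function: fooBar (foo_bar would also be allowed)
burst.demo.Text.trimEnds = function(s) {
  return s.replace(/^\s+|\s+$/g, '');
};
```

Note that nothing in the spelling distinguishes burst.demo.Counter (a class) from burst.demo.Text (a terminal namespace), which matches the remark above about there being no lexical distinction.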

Namespace Pollution

All symbols are below a namespace "burst", with a few special exceptions. Except for the special cases, there are no top-level global symbols, only ones scoped by a class or namespace.

These are the exception cases for top-level variable symbols outside of the "burst" namespace:

  - Widely used convenience functions for logging and errors: 
    bu_debug, bu_info, bu_warn, bu_throw
  - Functions commonly used to emulate language extensions:
    bu_inherits, bu_loaded, bu_require
  - Functions which for technical reasons must be top-level:
    bu_eval
  - Variables which must be set by programmers prior to library loading: 
    bu_AppConfig

Except for the last case, these top-level symbols are simply aliases for fully qualified namespace functions or class methods.

There may also be global symbols for private internal use; these should start with "bu" or "BU".
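
A minimal sketch of what such an alias amounts to. The target name burst.logging.Log.debug and its body are assumptions made for this illustration; consult the source for the functions the real aliases point to.

```javascript
// Stand-in namespace objects so the sketch runs on its own.
var burst = { logging: { Log: {} } };

// Hypothetical fully qualified function; here it just records messages.
// It does not use "this", so it behaves the same when called via an alias.
burst.logging.Log.messages_ = [];
burst.logging.Log.debug = function(msg) {
  burst.logging.Log.messages_.push('DEBUG: ' + msg);
};

// The top-level symbol is only another name for the namespaced function.
var bu_debug = burst.logging.Log.debug;

bu_debug('loaded');
```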

File Naming and Organization Guidelines

- All source files are below a directory "burst/".

- For each directory level, there *must* be a matching descent in namespace level (for example, "burst/io/" and "burst.io.").

- The reverse is not true: terminal namespaces (and classes) can appear in symbols without there being a matching directory, when for example there are nested classes or nested namespaces within a single source file.

- A file may introduce symbols at any level below its parent directory's namespace:

 -- It might add symbols directly within its parent's namespace.
    For example, a file "io_init.js" that defines no classes or namespaces
 -- It might declare a child namespace, and symbols within that child namespace.
    For example, a file "Ioctl.js" that defines a namespace "burst.io.Ioctl"
 -- It might declare a class within the parent's namespace.
    For example, a file "File.js" that defines a class "burst.io.File".
 -- It might declare multiple classes within the parent's namespace.
    For example, a file "Stream.js" that defines an abstract class "burst.io.Stream" and
    several subclasses such as "burst.io.BufferedStream"
 -- It might declare nested classes/namespaces, to an arbitrary degree.
    For example, a file "Queue.js" might define a class "burst.Queue" and a nested
    class "burst.Queue.Iterator".
 -- Any combination of the above (though only with good reason).

- Files should generally define at most one namespace or class. However, unlike the case with Java, we do not enforce this. In particular, it is sometimes unwieldy to have a distinct file for every subclass of some base class, and so it is convenient to define all those classes in a single file.

- When a file defines a single namespace or class, it is named after it (e.g. "burst/XPath.js").

- When a file defines multiple classes, the file is named after the common base class when possible, or otherwise after whichever class the author considers the most important.

- When a file defines no child class or child namespace, but only symbols in the parent (or no symbols at all), it is given a lower-case name (e.g. "burst/fix_ecma.js")
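
As a sketch of the nested-class case mentioned above, a file "burst/Queue.js" might look roughly like this. The method names and the internal array are hypothetical; only the names burst.Queue and burst.Queue.Iterator come from the example in the list.

```javascript
// Sketch of a file "burst/Queue.js". Member names are hypothetical.
var burst = burst || {};   // reuse the root namespace if already defined

// Class burst.Queue: the file is named after it.
burst.Queue = function() {
  this.items_ = [];
};

burst.Queue.prototype.add = function(item) {
  this.items_.push(item);
  return this;
};

burst.Queue.prototype.iterator = function() {
  return new burst.Queue.Iterator(this);
};

// Nested class burst.Queue.Iterator: a deeper namespace level with
// no matching directory or file of its own.
burst.Queue.Iterator = function(queue) {
  this.queue_ = queue;
  this.index_ = 0;
};

burst.Queue.Iterator.prototype.hasNext = function() {
  return this.index_ < this.queue_.items_.length;
};

burst.Queue.Iterator.prototype.next = function() {
  return this.queue_.items_[this.index_++];
};
```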

Singletons

There are multiple conventions for how singleton instances are accessed:
  1. A namespace or class method that returns a singleton instance.
    Some java examples are:
          static java.lang.Runtime java.lang.Runtime.getRuntime()
          static java.text.Collator java.text.Collator.getInstance()
          static java.util.prefs.Preferences java.util.prefs.Preferences.systemRoot()
          static java.lang.ClassLoader java.lang.ClassLoader.getSystemClassLoader()
    
  2. A namespace or class variable which is a unique instance of some class.
    For example, java's java.lang.System.out.
  3. No variable, just use a set of class methods and class variables directly.
    For example, the static methods in java's java.util.prefs.Preferences.*

We generally favor the first approach. We generally frown upon the last approach, except in those rare cases where the hidden state is confidently something that will always be unique within the whole runtime environment (the use of this approach is, incidentally, a defect in the design of Java's java.util.prefs).
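
The three conventions can be sketched in ECMAScript as follows. The Registry and Settings names are hypothetical and are not library classes.

```javascript
var burst = {};

// Hypothetical class used only for this illustration.
burst.Registry = function() {
  this.entries_ = {};
};

// 1. A class method that returns the singleton instance (the favored form).
burst.Registry.instance_ = null;
burst.Registry.getInstance = function() {
  if (burst.Registry.instance_ === null) {
    burst.Registry.instance_ = new burst.Registry();   // created on first use
  }
  return burst.Registry.instance_;
};

// 2. A namespace variable that holds a unique instance.
burst.registry = new burst.Registry();

// 3. No instance at all: static functions over hidden shared state.
burst.Settings = {};
burst.Settings.values_ = {};
burst.Settings.set = function(key, val) {
  burst.Settings.values_[key] = val;
};
burst.Settings.get = function(key) {
  return burst.Settings.values_[key];
};
```

One practical difference: with the first form the instance can be created lazily and callers never depend on when that happens, while the third form leaves no object that could later be replaced or instantiated a second time.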

Return Values

Some conventions:

   Constructors should return "this". 
       In general, functions that otherwise might return void should also return "this".
   TBD: use of null vs. undefined?
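
A short sketch of why returning "this" from otherwise-void methods is convenient: calls can be chained. TextBuffer is a hypothetical class used only here.

```javascript
// Hypothetical class; the names are for illustration only.
function TextBuffer() {
  this.parts_ = [];
}

// These would naturally return nothing, so they return "this" instead.
TextBuffer.prototype.append = function(s) {
  this.parts_.push(s);
  return this;
};

TextBuffer.prototype.clear = function() {
  this.parts_.length = 0;
  return this;
};

// A method with a real result returns that result as usual.
TextBuffer.prototype.toString = function() {
  return this.parts_.join('');
};

// Chained use, made possible by the convention.
var text = new TextBuffer().append('a').append('b').toString();
```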


Writing Documentation

We have two kinds of documentation:
  1. textual content to appear in a manual
  2. API reference documentation

We approach these differently, as far as authoring tools are concerned.

Writing Manual Documentation

For now, we write documentation directly as html and perl pod files.

We do not expect anyone else to author using perl pod :). If someone wants to use some other non-xml markup (say, textile) that can generate html, that is fine too.

We do definitely prefer authoring in xhtml over some xml vocabulary such as DocBook. It is possible of course to author manual documentation directly in DocBook (XML 4.2). However, authoring tools for HTML still surpass those for XML (let alone DocBook XML), and HTML has the *huge* benefit of being directly viewable in a browser.

It is not hard to convert well-structured html to DocBook, if DocBook is desired for some reason. Two tools to consider for converting HTML to DocBook are:

  Michael Fuchs' dbdoclet: http://www.michael-a-fuchs.de/projects/dbdoclet/en/
   (it will process html, not just javadoc)
  
  Pradeep Padala html2db: http://www.cise.ufl.edu/~ppadala/tidy/

Writing API Documentation

The API reference documentation is authored inline with the source code. This is then processed with doxygen using a perl filter we wrote.

Doxygen and ECMAScript

Doxygen does not have native support for ECMAScript as a source language. This is not too surprising, because ECMAScript 3.x does not have native support for classes, namespaces, and so on. So an automatic documentation generator would have to understand particular programming conventions for how those constructs are implemented. That is not very robust, because so many tricks are possible in the language. Furthermore, that typically does not result in knowing the types of any function arguments.

Instead, we have implemented a perl filter (src2doc.pl) which produces Java or C++ code for doxygen to work on. The perl filter does two things:

Here is an example excerpt from the source code:

   /**
   * A class whose only purpose is to be used to create the fixed constants such as burst.logging.Log.DEBUG
   */
   //:CLBEGIN burst.logging.LogLevel
   /**
   @param name Name of the log level.
   @param val Value of the log level
   @param is_fatal (optional) whether messages to this level will cause an exception
   */
   //:CLCONSTRUCT burst.logging.LogLevel(String name, Number val, Boolean is_fatal)
   burst.logging.LogLevel = function(name, val, is_fatal) {
     this.name_ = name;
     this.value_ = val;
     this.is_fatal_ = typeof is_fatal == 'undefined' ? false : is_fatal;
   }
   //:CLEND

In our approach, Doxygen never sees any of our actual ECMAScript source code. Rather, it sees our javadoc-style comments, and it sees generated Java or C++ made from the special comments like "//:CLCONSTRUCT ...".

At the moment, we are generating Java rather than C++, because Java more closely matches the symbol naming and programming constructs we have adopted, and is more likely to be familiar to users of this library. Note that this means we have to generate actual intermediary java files, because doxygen has some language-specific behavior turned on only based on file suffix (sigh).

NOTE: This is definitely ugly. A cleaner approach would be possible if:

In that alternative universe, we could maintain our source files in ECMAScript 4, and we would not need the special comments.

Note that there are javadoc-like tools available for "ActionScript 2.0", which is a Draft ECMAScript 4 dialect:

  AS2Doc     http://www.as2doc.com/  
     commercial
  ActionDoc  http://www.jellyvision.com/actiondoc/
     no source, gratis, Windows
  AsDocGen   http://www.asdocgen.org/blog/
     GPL, C#
  BLDocs     http://www.rewindlife.com/archives/000124.cfm
     announced

Perhaps some day someone will write a parser to plug in to http://synopsis.fresco.org/ .

API Comment Conventions

Doxygen has several nice "do what I mean" conveniences when processing source, such as automatic paragraph breaks and automatic bulleted lists.

We write javadoc comments in html4, not xhtml.

We may rely on the "auto bullet" feature in doxygen (hyphens converted to ul/li).
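
For instance, a comment that relies on the auto-bullet feature might look like the sketch below. The function is hypothetical, and any src2doc directive comments that a real source file would carry are omitted here.

```javascript
/**
 * Parse a size string.
 *
 * Accepted forms:
 * - a bare number, taken as pixels
 * - a number followed by "px"
 *
 * @param s The string to parse.
 * @return The size in pixels, or null if the string is not recognized.
 */
function parseSize(s) {
  var m = /^(\d+)(px)?$/.exec(s);
  return m ? parseInt(m[1], 10) : null;
}
```

The two hyphenated lines are the part doxygen would turn into a bulleted list; the blank comment lines separate paragraphs.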

At this time, we only use standard javadoc tags, none of the extra doxygen tags.

For more information on src2doc, see tools.html.

(As an aside, note that there is an Ant task for doxygen: http://ant-doxygen.sourceforge.net/ Though all it really does is generate a doxconf file in the user's home directory and invoke the external doxygen command.)

Getting API Documentation into DocBook

DocBook is too impoverished to be considered as a source format for documenting APIs. It is far from being a complete semantic markup language for programming language constructs, even if recent changes have moved it beyond mere manual pages.

Some people may still desire DocBook as a derived format.

(There is an effort as part of the boost C++ project to extend DocBook to make it more viable for library documentation: http://www.boost.org/doc/html/boostbook.html However, it is oriented around an output format; the input format is still via Doxygen or Synopsis.)

Doxygen can be used to produce a variety of outputs, including HTML, XML, and PDF. Doxygen does not currently produce DocBook directly (it produces a different XML vocabulary, which preserves finer API semantics than DocBook).

Note that a potential alternative approach to using doxygen would be to use Michael Fuchs' DocBookDoclet. It produces DocBook directly via javadoc. However, it must receive real java sources, because it is implemented as a javadoc doclet.

I think a better approach than DocBookDoclet for producing API reference documentation in DocBook may be to transform Doxygen's XML output into DocBook XML. See for example doxygen2boostbook.xsl at http://www.boost.org/doc/html/boostbook.getting.started.html . (Yet another potential direction would be to transform Doxygen XML into .NET XML comments and use http://ndoc.sourceforge.net/ .)


Testing

The tests in the "tests/testscripts" directory may be run from the command line (with a shell ECMAScript interpreter) or from within a browser. Most if not all of the tests will pass at the command line, thanks to our JsFakeDom tool.

To run the tests at the command line, run:

   # cd buildscripts
   # make testsrc

You may run the unit tests in your browser from the Test Pages.


Architecture

The Single Core File burstlib.js

All the core script files are combined into the single file "burstlib.js". This process also eliminates comments and extra white space. Combining them into a single file has several benefits:

Module dependencies

All modules can assume that:

All other dependencies should be declared via bu_require (but this is not enforced, since a specific load order is enforced in creating burstlib.js).

Catch-22

There are two substantial mutual dependency problems:
  1. Each module may call bu_require() to declare what it requires. But that function has to be defined and have the functionality to load scripts, and that functionality may not be ready yet (it may in fact be in the module calling bu_require).
  2. Any module may want to call bu_debug, even before logging is defined, and maybe before logging is configured.

Our solutions are:

  1. There is a placeholder implementation of bu_require() in burst_first.js, so that modules can use it without having to worry about ordering.
  2. It is illegal to use bu_debug(), bu_info(), etc., at load time, until after the logging module is loaded. (Of course any module can have a forward dependency in a function body; the trickiness ensues if you want such a forward dependency at file load time.) It is legal to use those logging functions at load time in modules being loaded after the logging module, and prior to full configuration of the logging module. In that intervening period, between when the logging module is loaded and when it is configured, there is a default logger that accumulates calls.
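
The two solutions can be sketched as follows. This is not the code in burst_first.js. The helper names (installRealRequire, configureLogging), the queueing of early requests, and the replay of buffered messages once logging is configured are all assumptions made for the illustration.

```javascript
// Hypothetical sketch; none of this is the actual burst_first.js code.

// 1. Placeholder bu_require: just remember what was asked for.
var pending_requires = [];
var bu_require = function(module_name) {
  pending_requires.push(module_name);
};

// An early module can declare a dependency before any loader exists.
bu_require('burst.io');

// Later, the real loader replaces the placeholder and handles the backlog.
var loaded = [];
function installRealRequire(load_fn) {
  bu_require = load_fn;
  while (pending_requires.length > 0) {
    bu_require(pending_requires.shift());
  }
}
installRealRequire(function(module_name) { loaded.push(module_name); });

// 2. Default logger: accumulate calls until logging is configured.
var buffered = [];
var output = [];
var bu_debug = function(msg) { buffered.push(msg); };

bu_debug('early message');            // logging loaded but not yet configured

function configureLogging(write_fn) {
  bu_debug = write_fn;                // later calls go straight to the writer
  while (buffered.length > 0) {
    write_fn(buffered.shift());       // assumed: replay what was accumulated
  }
}
configureLogging(function(msg) { output.push('DEBUG ' + msg); });

bu_debug('late message');
```

In both cases the global name stays the same while the function behind it is swapped, so modules never need to know which phase of startup they are in.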

Description of initialization

Here is a simplified summary of the startup sequence:
  1. Load files for core objects (either individually or all combined in burstlib.js). Any ECMAScript interpreter semantically acts in two phases: parse phase and execution phase; in the execution phase function bodies can refer to later definitions.
  2. At some point the core module responsible for processing application configuration (bu_AppConfig, etc.) is loaded. Then any "onConfig" callback methods for previously loaded modules are run. Modules loaded in the future will have their "onConfig" callback called immediately. (At this point, in a browser, it might be the case that only the document "head" element exists, so no DOM operations should be done.)
  3. When in a browser, after document 'onload', run any "onDocumentLoad" callback functions from any modules. Some special ones are:

For more information, examine the source code of burst_first.js and the documentation for burst.ScriptLoader.
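
The "onConfig" behavior described in step 2 can be sketched like this. The function names registerOnConfig and configReady are hypothetical; the sketch only shows the rule that callbacks registered before configuration are queued and those registered afterwards run immediately.

```javascript
// Hypothetical sketch of the "onConfig" rule; names are not library API.
var config_loaded = false;
var config_callbacks = [];
var call_log = [];

// A module registers its callback when it is loaded.
function registerOnConfig(callback) {
  if (config_loaded) {
    callback();                        // configuration already processed
  } else {
    config_callbacks.push(callback);   // run later, in registration order
  }
}

// Called once the module that processes application configuration has loaded.
function configReady() {
  config_loaded = true;
  while (config_callbacks.length > 0) {
    config_callbacks.shift()();
  }
}

registerOnConfig(function() { call_log.push('early module'); });
configReady();
registerOnConfig(function() { call_log.push('late module'); });
```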