Wednesday, September 07, 2011

Packaging extension steps for Calabash

In my previous blog post, I showed how to develop an XProc extension step in Java for the Calabash processor. Even though writing such an extension is quite easy once you know what to do, configuring it is quite tricky for the end user. That complexity could be reason enough for a potential user to give up before even running an example that uses your extension step. See the previous blog entry for details, but basically the user has to configure the classpath for Calabash with your JAR and all its dependencies, point to your config file when launching Calabash, and import your library into the main pipeline (after having decided where to install your extension step).

At the end of the previous post, I introduced the idea of having such extension steps, written in Java for Calabash, supported out of the box by the repository implementing the Packaging System. I played a little bit with the idea and came up with the following design (and implementation). Of course you still have to provide the same information (the step interface, its implementation, and the link between its type and the class implementing it), but the goal is to let the author do it once and for all, so that the user can simply use the following commands to install the package and run a pipeline using it:

> xrepo install
> calabash pipeline.xproc

The only constraint on the user is to import the XProc library you wrote, with the step interface declarations, through the absolute URI you defined for it. This absolute URI is resolved automatically in the user's local repository, and the repository system automatically configures Calabash with the Java code. To achieve that goal, you, as an extension step author, have to provide a package with the following structure:


This structure looks familiar to anyone who knows the structure of a standard package: you have the package descriptor, namely expath-pkg.xml, containing meta-information about the package and its content, then within the package directory you have the components, that is, the content of the package itself. In addition, there is a second descriptor, specific to Calabash: calabash.xml. In this case, the content of the package is an XProc library containing the step declarations, the JAR file with the compiled Java implementation of your extension steps, and all its dependencies (the other Java libraries it uses). Let's see how the two descriptors carry all the information needed to use the extension steps. First the standard package descriptor, expath-pkg.xml:

<package xmlns=""

   <title>Your XProc steps for Calabash</title>

   <dependency processor=""/>



Besides the usual information about the package (its name, textual description, version number, etc.), we state that this package is specific to Calabash (by depending on that processor). We also declare a public component, a standard XProc library, by assigning a public, absolute URI to it, and by linking to its file by name, within the package content. Keep in mind that this library declares the step interfaces and is standard XProc: it remains the same even if there are several implementations. The library itself is:

<p:library xmlns:p=""

   <p:declare-step type="y:some-of-your-steps">
      <p:input  port="source" primary="true"/>
      <p:output port="result" primary="true"/>
      <p:option name="username"/>
   </p:declare-step>

   <p:declare-step type="y:another-one">
      <p:output port="result" primary="true"/>
   </p:declare-step>

</p:library>

Finally, the second descriptor, specific to Calabash and named calabash.xml, describes the Java implementation: the JAR files to add to the classpath, and the Java class implementing each of the extension step types:

<package xmlns="">





The JAR files are referenced by filename (relative to the package content dir), the step types are identified by their QName (using Clark notation, to represent both the namespace URI and the local name as one single string), and the implementation class is referenced by its fully qualified name.
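
To make Clark notation concrete: it writes an expanded name as {namespace-uri}local-name. Here is a minimal sketch, independent of the packaging system itself, using the standard javax.xml.namespace.QName class, which produces and parses exactly that form. The namespace URI and the class name ClarkNotation are made up for the illustration; use the namespace your own step library binds to its prefix.

```java
import javax.xml.namespace.QName;

class ClarkNotation {

    /** Formats a namespace URI and a local name as "{uri}local". */
    static String toClark(String namespaceUri, String localName) {
        return new QName(namespaceUri, localName).toString();
    }

    /** Parses a "{uri}local" string back into a QName. */
    static QName fromClark(String clark) {
        return QName.valueOf(clark);
    }

    public static void main(String[] args) {
        // Hypothetical namespace, standing in for the one bound to the "y" prefix.
        String ns = "http://example.org/ns/your-steps";

        String clark = toClark(ns, "some-of-your-steps");
        System.out.println(clark); // prints {http://example.org/ns/your-steps}some-of-your-steps

        QName parsed = fromClark(clark);
        System.out.println(parsed.getNamespaceURI()); // prints http://example.org/ns/your-steps
        System.out.println(parsed.getLocalPart());    // prints some-of-your-steps
    }
}
```

The point of the single-string form is that a step type can be written in a descriptor without relying on any prefix binding being in scope.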

The package author just has to follow those conventions and provide those two descriptors. He/she can then package everything up by zipping it into one single ZIP file (usually using the extension *.xar, for XML ARchive), and publish and distribute the package to users. If the users have packaging support, the only piece of documentation to provide is the public URI of the XProc library, so they can import it into their own pipelines.
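
Any ZIP tool will do for that last step. For those who prefer to script it, here is a minimal sketch in plain Java that zips a package directory into a .xar file using only java.util.zip. The class name XarPackager and the directory layout are assumptions for the illustration; the code simply stores every regular file under the given directory, with paths relative to it and forward slashes as separators.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class XarPackager {

    /** Zips every regular file under packageDir into xarFile. */
    static void pack(Path packageDir, Path xarFile) throws IOException {
        // Collect the files first, so the archive never includes itself
        // if it happens to be created inside packageDir.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(packageDir)) {
            files = walk.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(xarFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names are relative and always use '/' separators.
                String name = packageDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java XarPackager <package-dir> <output.xar>
        pack(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The descriptors end up at the root of the archive as long as they sit at the root of the directory you pass in.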

An interesting point is that this strategy works for private extensions as well. Take the set of XSLT 2.0 stylesheets for DocBook for instance. A pipeline, or even a set of pipelines, might make perfect sense to drive some processing using this large application. If that processing needs some extensions to the standard languages, then it is possible to write extension steps for Calabash, integrate them into the package alongside the standard XSLT stylesheets and XProc pipelines, and use them internally. If the XProc library declaring the steps is not publicly exposed in the package descriptor, then only the other components in the package itself can use it.

In that case, a user using Calabash just installs the package like any other package, and does not have to, you know, configure the extensions...
