Thursday, June 25, 2009

Divide and Conquer, or XPath, XSLT, XQuery and XProc packaging

Packaging of the various X* technologies seems to be of interest to a lot of people right now. And of course it is to me. But it seems everyone comes with their own idea of packaging, as well as a different scope. So, to add yet more complexity, I will present my own ideas on the matter here. I will try to tidy up the different concepts and to identify the different needs. And as always, I like to talk about concrete things, if only to ease further discussion. So I will introduce a prototype of a packaging system for X* libraries and extensions for Saxon.

Packaging is nothing in itself. It is always related to something else (a language, a technology, a framework...). Packaging is just a means to ease sharing and delivering something in the scope of that "something else." The several files in an ODF document are packaged in a single ZIP file, with a pre-defined structure, to make it possible for an application to use its content. The important point is not the structure in itself, but rather the information it gathers.

I have followed some very interesting discussions about X* packaging during the last few weeks, with very interesting people. I quickly realised that everyone was talking about slightly (or not so slightly) different things. The most important point on which people have different views, IMHO, is the scope of packaging.

As with most modern languages, an XML developer may have to deliver different pieces of software, depending on the project: libraries, standalone applications, or web applications built for a specific framework. If you look at Java for instance, this is reflected quite clearly in its various packaging formats: JAR files for libraries and applications, WAR files for web applications, EAR files for entire enterprise applications...

WAR files contain Java classes, as JAR files do. But the structure is quite different, and there are a few other files, describing what is in the package: "that class is a servlet class, conforming to the definition of servlet and coded to live in a servlet container, with a precise lifecycle," or "the package depends on this JAR file."

In the same way, you can package XSLT libraries or XQuery modules, telling a processor that when a stylesheet or a module imports a specific URI, some functions are available (provided as plain XSLT stylesheets, XQuery modules, or extension functions). Or you can package an entire web application using XProc to control the overall processes, XQuery to query XML databases and XSLT for the presentation layer (sounds very MVC, doesn't it?) But those packages are really different beasts: whereas the first example just needs to package some XSLT, XQuery, Java, or whatever code, alongside a simple cataloging system, the second example requires defining a complete web framework, its lifecycle, and how scripts can plug into it and exchange information with it ("this XProc pipeline has to be evaluated on an HTTP GET on http://www.example/app/theuri, it knows you will provide it with request information as a wa:http-request element, as we agreed upon, and that XSLT stylesheet has to be applied to its result; by the way it will access runtime information by using the extension functions you provide.")

There has been some work on XRX frameworks, and clearly it would be beneficial for everybody (users, but also implementors) to have such a standard packaging format for entire applications following their rules (as WAR and EAR files are for Java). They would also benefit from a lower-level packaging format dedicated to packaging X* libraries, and would build upon it. But the two really are at different levels, and I think it is fundamental to distinguish between both concepts.

As part of the EXPath project, and because I think this is the first step X* technologies have needed for several years to enable the delivery of libraries, I am particularly interested in a library packaging format.

To illustrate that, I've built a very simple prototype of a package manager for Saxon. On the one hand you have a simple GUI to install and delete packages in a repository, and on the other hand you have a shell script to launch Saxon (setting the classpath for extension functions and setting catalogs to resolve XSLT imports referring to libraries). If those tools are built around a well-defined, open package format, other implementations could be written (for eXist, for MarkLogic, XQilla, Zorba... but also for oXygen, providing a one-click way to install a package and then enable it in some scenarios).

You can find the manager at http://www.fgeorges.org/purl/20090624/. You should be able to run it simply by clicking on one of the links on the launch.html page (through Java Web Start), but you can also download the JAR file (look also in the lib/ sub-directory), put both JAR files in the classpath and run Java the usual way, with the main class org.expath.pkg.saxon.PackageManagerGUI (there is also a text interface, org.expath.pkg.saxon.PackageManagerTextUI). You first have to set an environment variable, EXPATH_REPO, pointing to a directory that will be your EXPath Packaging repository (just create an empty directory). The interface is very simple: choose the install item in the file menu, and select the package file you want to install. To remove a package, select it in the list of installed modules and select delete in the menu.

Once a module is installed, you can use it via Saxon by adding the additional JARs to the classpath as needed (for extension functions) and by setting up the XML Catalogs support. The following script does that for you: http://www.fgeorges.org/purl/20090624/saxon. It needs a few environment variables: EXPATH_REPO as explained above, APACHE_XML_RESOLVER_JAR must point to the Apache XML Commons Resolver (see http://xml.apache.org/commons/, and be sure to pick the resolver JAR) and SAXON_HOME must point to the directory containing the Saxon JARs.
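The classpath half of that job can be sketched in Java. This is a hypothetical helper, not the code of the script linked above: the class name RepoClasspath is invented for the illustration, and it simply assumes that every .jar file found anywhere under the repository directory belongs on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: builds a classpath string from the JARs found in a repository. */
final class RepoClasspath {

    /** Returns every *.jar under repo, sorted, joined with the platform path separator. */
    static String of(Path repo) {
        try (Stream<Path> files = Files.walk(repo)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".jar"))
                .map(p -> p.toAbsolutePath().toString())
                .sorted()
                .collect(Collectors.joining(File.pathSeparator));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // EXPATH_REPO is the environment variable described above.
        String repo = System.getenv("EXPATH_REPO");
        if (repo == null) {
            System.err.println("EXPATH_REPO is not set");
            return;
        }
        System.out.println(of(Paths.get(repo)));
    }
}
```

The resulting string can be appended to the Saxon JARs from SAXON_HOME and the resolver JAR before launching Java.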

But what about the package format itself? In this prototype, this is a simple ZIP file, with the following structure:

expath-pkg.xml
expath-http-client/
   saxon/
      xsl/
         expath-http-client-saxon.xsl
      jar/
         expath-http-client-saxon.jar
      lib/
         commons-codec-1.3.jar
         ...jar

where expath-pkg.xml is the package descriptor, and expath-http-client is the directory containing one module (here the EXPath HTTP Client module). This module is implemented as a Java extension, alongside a frontend XSLT stylesheet that takes care of the Saxon specifics to bind to the Java functions. During the install, an XML Catalogs file is created, to resolve the URI http://www.expath.org/mod/http-client.xsl to that stylesheet, in the local repository. A stylesheet can then simply import that URI and use the functions of the module. The real package for the HTTP Client can be downloaded at the same place: http://www.fgeorges.org/purl/20090624/expath-http-client-saxon-0.3.zip.

There is of course still a lot of work to do: defining the package format exactly, handling dependencies, improving the implementation... But I think that gives the big picture. If you are interested, here is what the package descriptor looks like:

<package xmlns="http://expath.org/mod/expath-pkg">
   <module version="0.3" name="expath-http-client">
      <title>EXPath HTTP Client</title>
      <xsl>
         <import-uri>http://www.expath.org/mod/http-client.xsl</import-uri>
         <file>saxon/xsl/expath-http-client-saxon.xsl</file>
      </xsl>
   </module>
</package>

We can see the package contains one module, namely "EXPath HTTP Client," version 0.3. The URIs are used to create an XML catalog. This version of the package contains all the dependencies (the JARs used by the Java implementation of the extension functions), but they can also be left out, and configured with the following element:

<saxon>
   <dep type="jar">
      <title>Apache Commons Codec 1.3</title>
      <home>http://jakarta.apache.org/commons/codec/</home>
   </dep>
   <dep type="jar">
      <title>Apache Commons Logging 1.1.1</title>
      <home>http://commons.apache.org/logging/</home>
   </dep>
   <dep type="jar">
      <title>Apache HTTP Client 4.0-beta2</title>
      <home>http://hc.apache.org/</home>
   </dep>
   <dep type="jar">
      <title>Apache HTTP Core 4.0</title>
      <home>http://hc.apache.org/</home>
   </dep>
   <dep type="jar">
      <title>Tagsoup 1.2</title>
      <home>http://home.ccil.org/~cowan/XML/tagsoup/</home>
      <href>http://home.ccil.org/~cowan/XML/tagsoup/tagsoup-1.2.jar</href>
   </dep>
</saxon>

The GUI does not take them into account yet, but it should offer to download JARs automatically when possible, and give the user a list of libraries and their homepages when a manual download is required. But of course, the same format can be used to package standard XSLT stylesheets, without any Java features, just by mapping the main entry point files to their public URIs.
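That mapping from public URIs to files is all the catalog needs. Here is a minimal Java sketch of how a descriptor like the one above could be turned into an OASIS XML catalog. It is not the prototype's actual code: the class name CatalogSketch is hypothetical, and it assumes that the file path in the descriptor is relative to a directory named after the module's name attribute, as the ZIP layout shown earlier suggests.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: turns a package descriptor into an OASIS XML catalog. */
final class CatalogSketch {

    static final String PKG_NS = "http://expath.org/mod/expath-pkg";

    /** Returns the catalog as a string; paths are relative to the repository root. */
    static String catalogFor(String descriptorXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(descriptorXml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            out.append("<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n");
            NodeList modules = doc.getElementsByTagNameNS(PKG_NS, "module");
            for (int i = 0; i < modules.getLength(); i++) {
                Element module = (Element) modules.item(i);
                // Assumption: the module directory is named after @name.
                String dir = module.getAttribute("name");
                NodeList entries = module.getElementsByTagNameNS(PKG_NS, "xsl");
                for (int j = 0; j < entries.getLength(); j++) {
                    Element xsl = (Element) entries.item(j);
                    // No attribute escaping here: this is a sketch, not a serializer.
                    out.append("   <uri name=\"").append(text(xsl, "import-uri"))
                       .append("\" uri=\"").append(dir).append('/').append(text(xsl, "file"))
                       .append("\"/>\n");
                }
            }
            return out.append("</catalog>\n").toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read descriptor", e);
        }
    }

    /** Text content of the first descendant with the given local name. */
    private static String text(Element parent, String localName) {
        return parent.getElementsByTagNameNS(PKG_NS, localName).item(0).getTextContent().trim();
    }
}
```

For the HTTP Client descriptor, this yields a single uri entry mapping http://www.expath.org/mod/http-client.xsl to expath-http-client/saxon/xsl/expath-http-client-saxon.xsl.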

Of course, this format will be particularly useful once precisely defined in an open spec, and if several processors support it (either natively, or through external managers.)

To end this post, I would like to introduce an idea from Jim Fuller: CXAN. I am sure most of you know CTAN for TeX, or CPAN for Perl. They are central, organized repositories of libraries for those languages, accessible through HTTP. With a proper packaging format, it would be possible to set up such a web repository gathering XPath, XSLT, XQuery and XProc libraries and applications, installable automatically with a manager that would install a package from its name, handling dependencies and the like. But for sure, that is a further step.


Saturday, March 21, 2009

SOA Design Patterns and Web Service Contract Design & Versioning for SOA

A few weeks ago, I received the final, paper version of the book "SOA Design Patterns" that I contributed to. I was used to the drafts, and I am glad to say the final layout is really nice. Same for the previous book, "Web Service Contract Design & Versioning for SOA." More info at http://www.soapatterns.com/ and http://www.soabooks.com/.

Friday, February 27, 2009

XSLStyle and oXygen

On almost every XSLT project I have worked on, I have used Ken Holman's XSLStyle™. It enables one to document each stylesheet component (template, function, variable, module, etc.) using an XML vocabulary (DocBook and DITA are supported out of the box). For instance, the following excerpt shows how to document a simple named template with a parameter, assuming the vocabulary has been set to DocBook:

<doc:template>
   <para>Create a paragraph with a greetings message.</para>
   <doc:param name="who">
      <para>The name of the person to address the greetings to.</para>
   </doc:param>
</doc:template>
<xsl:template name="greetings">
   <xsl:param name="who" as="xs:string"/>
   <h:p>
      <xsl:value-of select="concat('Hello, ', $who, '!')"/>
   </h:p>
</xsl:template>

XSLStyle™ is a set of stylesheets that extract this information and format it as an HTML page. Besides this formatting task, it also checks for best practices (did you declare the type of parameters? etc.) I think this is a very important piece in the XSLT writer's toolbox.

Setting a project up to use XSLStyle is as easy as adding a transform task that uses XSLStyle's stylesheets to transform the project's stylesheets to HTML pages. And of course adding documentation to the stylesheets. All that is very easy, but you always have to remember where you installed XSLStyle, and check the namespace URIs and the exact element names to use to document your code. Once again, that's very easy, but repetitive and time-consuming.

I have been using oXygen more and more over the last few months. And I really enjoy it. It helps a lot in automating repetitive, time-consuming tasks: creating a new stylesheet with the right namespace URIs, code completion for known vocabularies, transform scenarios, etc. And for schemas, XQuery modules and WSDL definitions, it provides actions to generate documentation, the same way XSLStyle generates documentation for XSLT modules. Unfortunately, it does not support XSLStyle yet. Actually, the scenario attached to almost all of my stylesheets is the scenario using XSLStyle.

But having XSLStyle integrated within oXygen would bring several advantages. It would always be installed alongside oXygen. I am a freelance consultant and move a lot between different companies, using several computers, with different system administration policies. Thanks to the per-person license model of oXygen, and its platform independence, I can take it with me everywhere, and be quickly productive, without having to separately install a JRE, Saxon, FOP, Xerces, a DocBook environment, RELAX NG tools, Schematron skeletons, Emacs with nXML and other modes, and even Cygwin to be able to create shell scripts and Makefiles to automate tasks. But I still have to download XSLStyle separately, and open one of its stylesheets to copy and paste a documentation sample and the correct namespace bindings. It would clearly save time to have an XSLStyle framework directly in oXygen.

Besides that point, oXygen could offer editing facilities for stylesheet component documentation as well. When you want to document a named template with several parameters, you have to create the documentation structure as shown above, and create a doc:param element for each parameter, with the correct name. This could be done automatically with a specific action, adding an empty documentation structure to a particular component or even to every component in the stylesheet that is not documented yet.

Nor would there be any need to attach the stylesheet to an XSLStyle scenario anymore. When I work on a stylesheet, I like its current scenario to match what I am working on in particular. Switching back and forth between this scenario and the XSLStyle scenario does not really make sense. It would be more productive to have a Tools > Generate Documentation > XSLT Documentation... action as for the other file types, in my humble opinion.

And finally, oXygen would even be able to report some documentation errors directly in the editor pane, such as wrong parameter names in the documentation.

I hope the oXygen team will be interested in adding support for XSLStyle, as this would be convenient, but would also spread the XSLT writing best practices advocated by XSLStyle. Anyway, thanks to the team for listening to user requests (this is too rarely the case in other companies), and of course to Ken for XSLStyle.


Tuesday, December 09, 2008

FXSL currying and nestable sequences

After an interesting discussion on the FXSL Help forum, the problem of currying and nested sequences showed up again. The FXSL project provides, among other things, first-class citizen functions. Basically, it represents a function as an element. When executing such a function, the dispatching to the code is done by applying templates on that element.

An interesting feature of FXSL is the ability to curry parameters to a function, to create another function taking fewer parameters. The principle is to attach parameters to the function. This new function can then be used as any other function, with the specified parameters bound to the specified values.
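For readers more familiar with Java than with FXSL, the principle can be sketched with closures. This is only an analogy: FXSL represents functions as elements, not closures, and the class name CurrySketch and its curry method are invented for the illustration.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical illustration of binding one argument of a two-argument function. */
final class CurrySketch {

    /** Attaches the first argument to f, returning a function of the remaining one. */
    static <A, B, R> Function<B, R> curry(BiFunction<A, B, R> f, A first) {
        return second -> f.apply(first, second);
    }

    public static void main(String[] args) {
        BiFunction<Integer, Integer, Integer> add = (lhs, rhs) -> lhs + rhs;
        // The new function can be passed around like any other value.
        Function<Integer, Integer> add1024 = curry(add, 1024);
        System.out.println(add1024.apply(512)); // 1024 + 512
    }
}
```

The program prints 1536. In Java the bound value lives in the closure; the rest of this post is about where FXSL can store it in XSLT 2.0, which has no closures.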

The result of currying is then another first-class citizen function. So it has to be a node, because f:apply() applies templates on it to find the code to execute. And it has to be a single item, in order to be used as any other item (in particular regarding its behaviour in sequence handling and atomization). The latter point makes it impossible to use a sequence as the result of currying.

The approach taken by FXSL for now is to create an XML element as the result of f:curry(). This element contains several pieces of information: the child fun holds the curried function (which may itself be a currying), cnArgs is the cardinality of the curried function, and the arg children hold the curried values. For instance, the expression f:curry(my:add(), 2, 1024) will return the following element:

<f-curry:f-curry xmlns:f-curry="http://fxsl.sf.net/curry">
   <fun>
      <my:add xmlns:my="urn:X-FGeorges.org:tests:curry-sref.xsl"/>
   </fun>
   <cnArgs>2</cnArgs>
   <arg t="xs:integer">1024</arg>
</f-curry:f-curry>

This approach is convenient because we can use any structure we need to represent currying. Unfortunately, the semantics of adding items to an XML tree implies copying nodes and making nodes from atomic values. That means that if the curried argument is an XML element that is part of a whole document, it will be copied into the element representing the currying. For example, if the curried function uses the ancestor axis on this curried element, it will see the f-curry:f-curry element, instead of the ancestors in the original document. That was actually the problem reported by Christoph Lange on the FXSL forum.
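The same loss of context can be reproduced outside XSLT. The following Java sketch uses the standard DOM API as a stand-in for the XSLT data model: it copies an element into a wrapper built in another document and compares what each node sees as its parent. The element names (book, chapter, para, and the un-namespaced f-curry and arg) are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical illustration: a copied node no longer sees its original ancestors. */
final class CopyLosesContext {

    /** Parent element name as seen from the original node [0] and from its copy [1]. */
    static String[] parentNames() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document source = builder.parse(new ByteArrayInputStream(
                "<book><chapter><para/></chapter></book>".getBytes(StandardCharsets.UTF_8)));
            Element para = (Element) source.getElementsByTagName("para").item(0);

            // Build the wrapper in a fresh document, as a temporary tree would be.
            Document target = builder.newDocument();
            Element wrapper = target.createElement("f-curry");
            target.appendChild(wrapper);
            Element arg = target.createElement("arg");
            wrapper.appendChild(arg);

            // Deep copy into the wrapper: the original stays where it was.
            Node copy = target.importNode(para, true);
            arg.appendChild(copy);

            return new String[] {
                para.getParentNode().getNodeName(),
                copy.getParentNode().getNodeName()
            };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = parentNames();
        System.out.println("original parent: " + names[0]);
        System.out.println("copy parent:     " + names[1]);
    }
}
```

The original para still has chapter as its parent, while the copy only sees arg (and, above it, f-curry): any navigation from the copy stays inside the currying structure.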

And this leads to other problems related to identity. For instance, atomic items are transformed to nodes. FXSL resolves that problem by recording the initial type in the currying structure, and converting the node back to that type. While this is OK for standard simple types, it can't be applied to user-defined simple types. Another example is validated nodes; they lose their type annotations when added to the currying structure, which can be a problem for the curried function. You can find more about this topic in "Type-preserving copy in XSLT 2.0".

Actually, all those problems could be solved with a simple feature that does not exist in standard XPath: the ability to nest sequences. If we could nest sequences, or if we had a special type of sequence that wouldn't be flattened when added to another sequence, we could use it as the result of currying. Even if that's not a node anymore, we could adapt f:apply() to handle those particular sequences and use, say, the first item as the node to apply templates on.
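To make the difference concrete, here is a toy model in plain Java, with no Saxon involved. The seq() method imitates the XPath comma operator, which dissolves inner sequences, while a Ref wrapper survives as a single item. Both names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of flattening sequences versus an opaque sequence reference. */
final class NestingSketch {

    /** Opaque holder: always counts as a single item, whatever it wraps. */
    static final class Ref {
        final List<Object> items;
        Ref(List<Object> items) { this.items = items; }
    }

    /** Mimics the XPath comma operator: inner lists are flattened, Refs are kept whole. */
    static List<Object> seq(Object... parts) {
        List<Object> result = new ArrayList<>();
        for (Object part : parts) {
            if (part instanceof List) {
                // An inner sequence dissolves into its items, recursively.
                result.addAll(seq(((List<?>) part).toArray()));
            } else {
                result.add(part);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(seq(1, seq(2, 3), 4).size());          // inner sequence dissolves
        System.out.println(seq(1, new Ref(seq(2, 3)), 4).size()); // Ref stays one item
    }
}
```

The first line prints 4 and the second prints 3: wrapped in a Ref, the inner sequence keeps its boundaries, and nothing has been copied, so the items it holds keep their identity.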

The good news is that it is simple to implement such a sequence as a Java extension in Saxon. Here is a very simple implementation. I have called it SRef, for Sequence Reference. I guess we would need something more elaborate to be efficient and general-purpose, but this is just a proof of concept:

package org.fgeorges.saxon;

import java.util.ArrayList;
import java.util.List;
import net.sf.saxon.om.ArrayIterator;
import net.sf.saxon.om.Item;
import net.sf.saxon.om.SequenceIterator;
import net.sf.saxon.trans.XPathException;

/**
 * XPath sequence reference, or non-atomizable XPath sequence.
 *
 * @author Florent Georges - fgeorges.org
 * @date 2006-12-01
 */
public class SequenceRef
{
    public SequenceRef(SequenceIterator seq) throws XPathException
    {
        myIter = seq.getAnother();
    }

    public SequenceIterator getSequence()
    {
        return myIter;
    }

    static public boolean isSequenceRef(Object obj)
    {
        return obj instanceof SequenceRef;
    }

    @Override
    public String toString()
    {
        throw new RuntimeException("toString not supported, cannot be added to a tree!");
    }

    private SequenceIterator myIter = null;
}

This Java implementation is coupled with a simple API in XPath. Three core functions are created: sref:make-sref() takes a sequence and returns an sref for this sequence, sref:sequence() takes an sref and returns the original sequence, and sref:is-sref() takes an item and returns true if it is an sref. Two helpers, sref:atomize() and sref:deep-atomize(), expand the srefs found in a sequence, one level deep and recursively, respectively. The following XSLT module defines those functions:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:sref="http://www.fgeorges.org/xslt/sref"
                xmlns:impl="java:org.fgeorges.saxon.SequenceRef"
                exclude-result-prefixes="xs sref impl"
                version="2.0">

   <xsl:function name="sref:make-sref" as="item()">
      <xsl:param name="seq" as="item()*"/>
      <xsl:sequence select="impl:new($seq)"/>
   </xsl:function>

   <xsl:function name="sref:sequence" as="item()*">
      <xsl:param name="ref" as="item()"/>
      <xsl:sequence select="impl:getSequence($ref)"/>
   </xsl:function>

   <xsl:function name="sref:is-sref" as="xs:boolean">
      <xsl:param name="ref" as="item()"/>
      <xsl:sequence select="impl:isSequenceRef($ref)"/>
   </xsl:function>

   <xsl:function name="sref:atomize" as="item()*">
      <xsl:param name="seq" as="item()*"/>
      <xsl:sequence select="
          for $item in $seq return
            if ( sref:is-sref($item) ) then
              sref:sequence($item)
            else
              $item"/>
   </xsl:function>

   <xsl:function name="sref:deep-atomize" as="item()*">
      <xsl:param name="seq" as="item()*"/>
      <xsl:sequence select="
          for $item in $seq return
            if ( sref:is-sref($item) ) then
              sref:deep-atomize(sref:sequence($item))
            else
              $item"/>
   </xsl:function>

</xsl:stylesheet>

With those simple functions, it is then possible to modify f:curry() and f:apply() to support (and take advantage of) SRefs. The following is a simple example (supporting only currying a function of cardinality 2 with a single argument). I create a first-class citizen function my:add() that takes two integers and returns their sum, I write new versions of f:apply() and f:curry(), then I call my:add() both directly and with currying:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:f="http://fxsl.sf.net/"
                xmlns:my="urn:X-FGeorges.org:tests:curry-sref.xsl"
                xmlns:sref="http://www.fgeorges.org/xslt/sref"
                xmlns:impl="java:org.fgeorges.saxon.SequenceRef"
                exclude-result-prefixes="xs f my sref impl"
                version="2.0">

   <xsl:import href="sref.xsl"/>

   <xsl:output indent="yes"/>

   <!--
      The my:add() first class function.
   -->
   <xsl:variable name="my:add" as="element()">
      <my:add/>
   </xsl:variable>

   <xsl:function name="my:add" as="node()">
      <xsl:sequence select="$my:add"/>
   </xsl:function>

   <xsl:function name="my:add" as="xs:integer">
      <xsl:param name="lhs" as="xs:integer"/>
      <xsl:param name="rhs" as="xs:integer"/>
      <xsl:sequence select="$lhs + $rhs"/>
   </xsl:function>

   <xsl:template match="my:add" mode="f:FXSL">
      <xsl:param name="arg1"/>
      <xsl:param name="arg2"/>
      <xsl:sequence select="my:add($arg1, $arg2)"/>
   </xsl:template>

   <!--
      Apply on SRefs.
   -->
   <xsl:function name="f:apply-sref">
      <xsl:param name="pFunc" as="item()"/>
      <xsl:param name="arg1" as="item()*"/>
      <xsl:variable name="seq" select="sref:sequence($pFunc)"/>
      <xsl:apply-templates select="$seq[1]" mode="f:FXSL">
         <xsl:with-param name="seq" select="$seq"/>
         <xsl:with-param name="arg1" select="$arg1"/>
      </xsl:apply-templates>
   </xsl:function>

   <!--
      Currying using SRefs.
   -->
   <xsl:function name="f:curry-sref" xmlns:f-c-s="http://fxsl.sf.net/curry-sref">
      <xsl:param name="pFun" as="node()"/>
      <xsl:param name="pNargs" as="xs:integer"/>
      <xsl:param name="arg1"/>
      <xsl:variable name="curry-fun" as="element()">
         <f-c-s:f-c-s/>
      </xsl:variable>
      <xsl:sequence select="
          sref:make-sref(($curry-fun, $pFun, $pNargs, sref:make-sref($arg1)))"/>
   </xsl:function>

   <xsl:template match="f-c-s:*" mode="f:FXSL"
       xmlns:f-c-s="http://fxsl.sf.net/curry-sref">
      <xsl:param name="seq" as="item()*"/>
      <xsl:param name="arg1" as="item()*"/>
      <xsl:apply-templates select="$seq[2]" mode="f:FXSL">
         <xsl:with-param name="arg1" select="sref:sequence($seq[position() gt 3])"/>
         <xsl:with-param name="arg2" select="$arg1"/>
      </xsl:apply-templates>      
   </xsl:template>

   <!--
      The testing template.
   -->
   <xsl:template match="/">
      <root>
         <test-1>
            <xsl:sequence select="my:add(512, 1024)"/>
         </test-1>
         <test-2>
            <xsl:variable name="fun" select="f:curry-sref(my:add(), 2, 1024)"/>
            <xsl:sequence select="f:apply-sref($fun, 512)"/>
         </test-2>
      </root>
   </xsl:template>

</xsl:stylesheet>

Thanks to Christoph Lange for the original problem and to Dimitre for his ideas.


Sunday, November 09, 2008

XProc with XSLT completion in oXygen

After having played a little bit with XProc, and having written a few simple XProc definitions with oXygen, I was tired of always checking the spelling of step names and of using copy & paste intensively. So I decided to add support for the XProc document type in oXygen.

Thanks to the XProc WG, who have published a schema as part of the current WD (and have done so in various schema languages), the first step was quite straightforward. Download the two RNC modules from the current WD, in appendix "D Pipeline Language Summary" (direct links: xproc.rnc and steps.rnc.) While editing an XProc definition, click the Associate Schema... button (see the screenshot below.) In the dialog box, choose RelaxNG Schema, choose the option Compact syntax and select the xproc.rnc file you have just downloaded. The only configuration change needed is in Preferences / XML / XML Parser / RELAX NG: unselect the option Check ID/IDREF (thanks, George.)

Now, you can validate your XProc definition while editing it, as well as enjoy the completion from oXygen. So far, so good. But while editing XProc definitions, you will often use small inline XSLT stylesheets (at least, I do.) And it would be great to have validation and completion for those stylesheets as well. So you have to combine the XProc schemas with XSLT schemas. And thanks to Norman Walsh, there is an RNC schema that validates both XSLT 1.0 and 2.0. You can download its modules from his blog (direct links: xslt.rnc, xslt10.rnc and xslt20.rnc.)

So now you have the schemas for XSLT, and the schemas for XProc without XSLT. You have to plug the former into the latter. Unfortunately, the XProc RNC schema uses the same pattern for p:inline in all steps (that pattern simply accepts anything.) The simple approach here is to redefine that pattern to accept either anything except elements in the XSLT namespace, or the xsl:stylesheet element defined in the XSLT schemas. The drawback is that this redefinition applies to all steps; but so far, it hasn't been a restriction. The custom RNC file is just:

default namespace p = "http://www.w3.org/ns/xproc"
namespace xsl = "http://www.w3.org/1999/XSL/Transform"

include "xproc.rnc" {
   Inline =
      element inline {
         exclude-inline-prefixes.attr?,
         common.attributes,
         ( xslt | AnyButXSLT )
         # I am not sure which one is better...
         # ( xslt | AnyButStylesheet )
      }
}

xslt = external "xslt.rnc"

AnyButXSLT =
   element (* - xsl:*) {
      (_any.attr | text | Any)*
   }
AnyButStylesheet =
   element (* - (xsl:stylesheet|xsl:transform)) {
      (_any.attr | text | Any)*
   }

Just copy & paste this code to a file, for instance xproc-with-xslt.rnc (take care to adapt the two paths to the other RNC schemas as needed.) Then remove the <?oxygen RNGSchema...?> processing instruction previously added by oXygen to your XProc definition, and associate your new RNC grammar instead. That's all!


Thursday, October 30, 2008

Poor man's Calabash integration into oXygen

XML Calabash, the XProc processor from Norman Walsh, is becoming more mature day by day. Here is a very simple (but very limited too) way to integrate it into the great oXygen XML IDE. Well, the word integrate is maybe too strong for this simple trick, which will just add a button in the toolbar to execute the currently edited XProc definition file. But at least that will save you from switching between your IDE and a console.

You have to register Calabash as an external tool within oXygen. Go to Tools > External Tools > Preferences > New, and fill the various fields. The point is to correctly set the working directory to ${cfd} and the command line to something like:

java -cp ".../calabash.jar:.../saxon9.jar:.../saxon9-s9api.jar"
    com.xmlcalabash.drivers.Main ${cfne}

Of course, you have to set the absolute path to the JAR files on your machine. Be sure to use ":" as the path separator on Linux and ";" on Windows. You can also set additional options like -Dcom.xmlcalabash.phonehome.email=your@email.com. In other words, just use the command line you usually use to launch Calabash.
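If you would rather not hard-code the separator, the command line can be assembled programmatically. The sketch below is only an illustration of that idea, not something oXygen requires: the class name CalabashCommand and the /opt/... paths are hypothetical, while the main class name is the one from the command line above.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: assembles the Calabash command line for the current platform. */
final class CalabashCommand {

    /** Builds the argument list for launching Calabash on one pipeline file. */
    static List<String> build(List<String> jars, String pipeline) {
        return Arrays.asList(
            "java",
            // File.pathSeparator is ":" on Linux and ";" on Windows.
            "-cp", String.join(File.pathSeparator, jars),
            "com.xmlcalabash.drivers.Main",
            pipeline);
    }

    public static void main(String[] args) {
        // Hypothetical install locations; replace with the real paths on your machine.
        List<String> jars = Arrays.asList(
            "/opt/calabash/calabash.jar",
            "/opt/saxon/saxon9.jar",
            "/opt/saxon/saxon9-s9api.jar");
        System.out.println(String.join(" ", build(jars, "pipeline.xpl")));
    }
}
```

The returned list can also be handed directly to java.lang.ProcessBuilder to run the pipeline from another Java program.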

When this is done, you will have a new button on your toolbar, called Calabash (be sure to have selected the External Tools toolbar.) When you are editing an XProc definition, you can press that button to execute it with Calabash, viewing the output in the result panel.


Wednesday, April 02, 2008

Simple SVG chart generation with XSLT

This week, for my job, I have to create a report generator for a financial company. The reports must be in PDF, so I naturally decided to use XSL-FO. Among other things, the reports contain graphical charts with, you know, financial stuff. The client wants its developers to be able to generate JPEG files themselves, so for the charts I just have to include external graphic files.

But I was curious to see if SVG was suited to this scenario. So this evening, after my working hours, I created a sample input file and started to learn a little bit about SVG. I was amazed at how quickly I got the result I wanted for a static SVG document.

Then the fun part started: create the XSLT stylesheet to transform the input document to the final SVG chart. The goal is of course to have a simple stylesheet that is generic enough to not be bound to specific lengths or other magic values.

And I think the result is quite interesting, considering it was written in a few hours, without any knowledge of SVG at the beginning. Of course the kind of chart is fixed, as is the input format. But it is flexible enough to adapt to various lengths, various Y axis scales, and such.

This stylesheet is not at all intended to be used as such, but I think it can be a good starting point for similar SVG chart generation with XSLT. If I have the time, and if I feel like it, I will try to make it more configurable, especially in the way the input is provided (I think FXSL can be of great help here to provide adapters for any input document type.)

Basically the input looks like the following. Those numbers are the number of posts to XSL List per month, for 2007 (stolen from MarkMail.org):

<input min-value="500" step-value="100" step-number="5">
   <val v="784">Jan-07</val>
   <val v="765">Feb-07</val>
   <val v="910">Mar-07</val>
   <val v="734">Apr-07</val>
   <val v="907">May-07</val>
   <val v="626">Jun-07</val>
   <val v="865">Jul-07</val>
   <val v="682">Aug-07</val>
   <val v="790">Sep-07</val>
   <val v="725">Oct-07</val>
   <val v="649">Nov-07</val>
   <val v="577">Dec-07</val>
</input>
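The three attributes on the input element define the Y axis: it starts at @min-value and has @step-number steps of @step-value each, so here it runs from 500 to 1000. One plausible way to turn a value into an SVG y coordinate is sketched here in Java, using the default values of the $init-y and $height parameters from the stylesheet below (10 and 250). This is only a sketch of the arithmetic, not code taken from the stylesheet, and the class name YAxis is invented.

```java
/** Hypothetical sketch of the value-to-coordinate arithmetic for the Y axis. */
final class YAxis {

    final double min;     // value at the bottom of the chart (@min-value)
    final double max;     // value at the top of the chart
    final double initY;   // top margin, like $init-y
    final double height;  // height of the chart rectangle, like $height

    YAxis(double minValue, double stepValue, int stepNumber, double initY, double height) {
        this.min = minValue;
        this.max = minValue + stepValue * stepNumber;
        this.initY = initY;
        this.height = height;
    }

    /** SVG y grows downwards, so larger values get smaller coordinates. */
    double toY(double value) {
        double baseline = initY + height;
        return baseline - (value - min) * height / (max - min);
    }

    public static void main(String[] args) {
        YAxis axis = new YAxis(500, 100, 5, 10, 250);
        System.out.println(axis.toY(500));   // bottom of the chart
        System.out.println(axis.toY(1000));  // top of the chart
        System.out.println(axis.toY(784));   // the January value
    }
}
```

With those numbers the bottom of the chart is at y = 260, the top at y = 10, and the January value (784) lands at y = 118.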

The result of the transformation looks like this (screenshot of the Firefox rendering of the SVG document):

And finally here is the stylesheet itself:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:svg="http://www.w3.org/2000/svg"
                xmlns:my="http://www.fgeorges.org/TMP/svg/charts#internals"
                version="2.0">

   <xsl:output indent="yes"/>

   <!--
       +- - - - - - - - - - - - - - - - - - - - - - - -+
       |  +- - - - - - - - - - - - - - - - - - - -+ nn |
       |  |                                       |    |
       |  |                                   o   |    |
       |  | . . . . . . . . . . .o. . . . . oo. . | nn |
       |  |                    oo oo      oo      |    |
       |  |         o        oo     o   oo        |    |
       |  | . . . oo oo . ooo . . . .ooo. . . . . | nn |
       |  |    ooo     ooo                        |    |
       |  |   o                                   |    |
       |  +- -|- -|- -|- -|- -|- -|- -|- -|- -|- -+ nn |
       |      x   x   x   x   x   x   x   x   x        |
       +- - - - - - - - - - - - - - - - - - - - - - - -+
       
       The outer box is the whole space that the diagram will occupy.
       The length between that imaginary box and the top-left corner
       of the diagram is represented by '$init-x' and '$init-y'.
       
       The length of the rectangle of the diagram itself (the inner
       box in the picture) is represented by '$width' and '$height'.
       
       The number of steps on the right-hand Y axis (in the picture
       there are 3 steps, that is 3 steps "between" the various "nn"s)
       is represented by @step-number in the input document.  The
       numeric value between two steps is @step-value.
       
       The diagram's plot (the "o"s) and the X axis (the "x"s) are
       represented by the input document.  Example of input:
       
           <input min-value="100" step-value="5" step-number="5">
              <val v="110">Jan-08</val>
              <val v="107">Feb-08</val>
              <val v="123">Mar-08</val>
           </input>
       
       In this example, the Y axis will run from 100 to 125, with a
       line (and a label) every 5 units.  The X axis will have 3 labels
       (from Jan to Mar) and the plot will be computed from 3 values:
       110, 107 and finally 123.
   -->
   <xsl:param name="init-x" as="xs:double" select="10"/>
   <xsl:param name="init-y" as="xs:double" select="10"/>
   <xsl:param name="width"  as="xs:double" select="500"/>
   <xsl:param name="height" as="xs:double" select="250"/>

   <xsl:variable name="baseline"  select="$height + $init-y"/>

   <!--
       For testing purposes: a root SVG element that should be ok for the
       default values of the global parameters, provided an input of
       10 or 20 values.
   -->
   <xsl:template match="/">
      <svg:svg width="540" height="300">
         <svg:g>
            <xsl:apply-templates select="*"/>
         </svg:g>
      </svg:svg>
   </xsl:template>

   <!--
       The Y axis, the X axis and the plot line.
   -->
   <xsl:template match="input">
      <!-- the diagram's box -->
      <svg:rect x="{ $init-x }" y="{ $init-y }"
                width="{ $width }" height="{ $height }"
                fill="#fff" stroke="#000"/>
      <!-- the Y axis's labels and their lines -->
      <xsl:sequence select="my:lines(@min-value, @step-value, @step-number)"/>
      <xsl:variable name="len" select="$width div count(*)"/>
      <!-- the X axis's labels -->
      <xsl:apply-templates select="*">
         <xsl:with-param name="len" select="$len"/>
      </xsl:apply-templates>
      <!-- the plot line -->
      <svg:path stroke="blue" stroke-width="1" fill="none">
         <xsl:attribute name="d">
            <xsl:apply-templates select="*" mode="path">
               <xsl:with-param name="len"       select="$len"/>
               <xsl:with-param name="min"       select="@min-value"/>
               <xsl:with-param name="one-y-len" select="
                   ( $height div @step-number ) div @step-value"/>
            </xsl:apply-templates>
         </xsl:attribute>
      </svg:path>
   </xsl:template>

   <!--
       Draw a label on the X axis.
   -->
   <xsl:template match="val">
      <xsl:param name="len" as="xs:double"/>
      <xsl:variable name="x" select="my:x-pos($len, position())"/>
      <svg:path d="M { $x },{ $baseline } L { $x },{ $baseline + 5 }" stroke="#000"/>
      <svg:text x="{ $x + 10 }" y="{ $baseline + 15 }"
                transform="rotate(-45 { $x + 10 } { $baseline + 15 })"
                font-size="10px" text-anchor="end">
         <xsl:value-of select="."/>
      </svg:text>
   </xsl:template>

   <!--
       Compute one single step of an SVG path's @d, to draw the plot.
   -->
   <xsl:template match="val" mode="path">
      <xsl:param name="len"       as="xs:double"/>
      <xsl:param name="min"       as="xs:double"/>
      <xsl:param name="one-y-len" as="xs:double"/>
      <xsl:value-of select="if ( position() eq 1 ) then 'M' else 'L'"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="my:x-pos($len, position())"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="my:y-pos($one-y-len, @v, $min)"/>
      <xsl:text> </xsl:text>
   </xsl:template>

   <!--
       The lines for each Y step, as well as the label for each Y
       step.
   -->
   <xsl:function name="my:lines" as="element()+">
      <xsl:param name="min"      as="xs:double"/>
      <xsl:param name="step-val" as="xs:double"/>
      <xsl:param name="step-num" as="xs:integer"/>
      <xsl:variable name="step-len" select="$height div $step-num"/>
      <!-- the N - 1 lines -->
      <xsl:for-each select="1 to ($step-num - 1)">
         <xsl:variable name="y" select="(. * $step-len) + $init-y"/>
         <svg:path d="M { $init-x },{ $y } L { $width + $init-x },{ $y }" stroke="#AAA"/>
      </xsl:for-each>
      <!-- the N + 1 labels -->
      <xsl:for-each select="0 to $step-num">
         <xsl:variable name="y" select="(. * $step-len) + $init-y"/>
         <svg:text x="{ $width + $init-x + 25 }" y="{ $y + 4 }" font-size="10px"
                   text-align="end" text-anchor="end" font-family="Helvetica Condensed">
            <xsl:value-of select="$min + ($step-num - .) * $step-val"/>
         </svg:text>
      </xsl:for-each>
   </xsl:function>

   <!--
       Compute the absolute X position from the ordinal position and
       the length of one X step.
   -->
   <xsl:function name="my:x-pos" as="xs:double">
      <xsl:param name="step-len" as="xs:double"/>
      <xsl:param name="position" as="xs:integer"/>
      <xsl:sequence select="($step-len * $position) - ($step-len div 2) + $init-x"/>
   </xsl:function>

   <!--
       Compute the absolute Y position for one point of the diagram's
       plot.  $one-len is the length of 1 on the Y axis, $value is the
       Y value of the plot's point, and $min is the minimal value on
       the Y axis.
   -->
   <xsl:function name="my:y-pos" as="xs:double">
      <xsl:param name="one-len" as="xs:double"/>
      <xsl:param name="value"   as="xs:double"/>
      <xsl:param name="min"     as="xs:double"/>
      <xsl:variable name="mid" select="$height div 2"/>
      <!-- scale the value to the scale [min - max] -->
      <xsl:variable name="val" select="($value - $min) * $one-len"/>
      <!-- reverse 0->$height and $height->0, 'cause in SVG y=0 is at top -->
      <xsl:variable name="rev" select="(- ($val - $mid)) + $mid"/>
      <!-- slide because our graph begins at y=$init-y -->
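      <!-- xsl:sequence returns the number itself; xsl:value-of would build a
           text node that then has to be converted back to xs:double -->
      <xsl:sequence select="$rev + $init-y"/>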
   </xsl:function>

</xsl:stylesheet>
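
The coordinate arithmetic in my:x-pos and my:y-pos is easy to check outside of XSLT. Here is a minimal Java sketch of the same formulas, assuming the default values of the global parameters and the 2007 input shown above. The class and method names are made up for the illustration; they are not part of the stylesheet.

```java
/** Mirrors the arithmetic of my:x-pos and my:y-pos (illustrative sketch only). */
final class ChartCoords {
    // Default values of the stylesheet's global parameters.
    static final double INIT_X = 10, INIT_Y = 10, WIDTH = 500, HEIGHT = 250;

    /** Centre of the horizontal slot of the item at the given 1-based position. */
    static double xPos(double stepLen, int position) {
        return stepLen * position - stepLen / 2 + INIT_X;
    }

    /** Y coordinate of a value; y=0 is at the top in SVG, hence the flip. */
    static double yPos(double oneLen, double value, double min) {
        double scaled = (value - min) * oneLen;   // distance above the X axis
        // HEIGHT - scaled is the same as (-(scaled - mid)) + mid, with mid = HEIGHT / 2
        return (HEIGHT - scaled) + INIT_Y;
    }

    public static void main(String[] args) {
        double min = 500, stepValue = 100;        // @min-value, @step-value
        int stepNumber = 5, count = 12;           // @step-number, number of val elements
        double len = WIDTH / count;               // width of one X slot
        double oneYLen = (HEIGHT / stepNumber) / stepValue;
        System.out.println(xPos(len, 1));
        System.out.println(yPos(oneYLen, 784, min));   // prints 118.0
    }
}
```

With those defaults, a value equal to @min-value lands on the baseline (y = 260) and the maximum of the scale lands on the top edge of the box (y = 10).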


Monday, March 31, 2008

HTTP extension for Saxon

I have just finished a little extension function for Saxon, to be able to send HTTP requests from XSLT 2.0 (and get the result back). The idea is based on the SOAP extension from Andrew Welch, but it is less restricted, as it can perform other HTTP requests (besides SOAP requests over HTTP.)

The function takes two parameters: a URI and an element that describes the request (the payload, the headers, the HTTP method, etc.) The latter looks like:

<http-request method="post" mime-type="text/xml" charset="utf-8">
   <header name="Header-Name">...</header>
   <header name="Header2-Name">...</header>
   <body>
      The textual value of body will be the payload of the HTTP request...
   </body>
</http-request>

Let's say such an element is bound to the variable $request, then you can call ex:http-send($request, 'http://...'), and you will get a result that will look like:

<http-response code="200">
   <message>OK</message>
   <header name="Header-Name">...</header>
   <header name="Header-x-Name">...</header>
   <body>
      The textual value of body was the payload of the HTTP response...
   </body>
</http-response>
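
To make the mapping more concrete, here is a rough Java sketch of how a request element of that shape could be turned into an HTTP request. This is not how the extension is implemented: it uses the java.net.http API (Java 11 and later), the class and method names are invented for the example, and it assumes the method, mime-type and charset attributes are all present. Trimming the body text is also a choice of the sketch.

```java
import java.io.StringReader;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.Charset;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: builds (but does not send) a request from an http-request element. */
final class RequestSketch {
    static HttpRequest toRequest(String requestXml, String uri) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(requestXml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed request element", e);
        }
        String method = root.getAttribute("method").toUpperCase(Locale.ROOT);
        Charset charset = Charset.forName(root.getAttribute("charset"));
        // The textual value of body is the payload (trimmed here, an assumption).
        String body = root.getElementsByTagName("body").item(0).getTextContent().trim();

        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(uri))
                .method(method, HttpRequest.BodyPublishers.ofString(body, charset))
                .header("Content-Type",
                        root.getAttribute("mime-type") + "; charset=" + charset.name());
        // Copy every header element as a request header.
        NodeList headers = root.getElementsByTagName("header");
        for (int i = 0; i < headers.getLength(); i++) {
            Element h = (Element) headers.item(i);
            builder.header(h.getAttribute("name"), h.getTextContent());
        }
        return builder.build();
    }

    public static void main(String[] args) {
        String xml = "<http-request method=\"post\" mime-type=\"text/xml\" charset=\"utf-8\">"
                + "<header name=\"X-Demo\">1</header>"
                + "<body>payload</body>"
                + "</http-request>";
        HttpRequest request = toRequest(xml, "http://example.org/service");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```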

All the info, the javadoc, the JAR file and a sample can be found at http://www.fgeorges.org/xslt/saxon-ext/. This page contains a full sample sending a SOAP message to a Web service and formatting the result as simple text.


Thursday, January 10, 2008

Emacs: favourite directories implementation

Today, I have finally taken a look at one of the simple features I always missed in Emacs: the ability to define a set of "favourite directories." That is, a set of named directories that one can use in the minibuffer when prompted for instance to open a file. Given a set of such dirs:

  emacs-src -> /enter/your/path/to/emacs/sources
  projects  -> /path/to/some/company/projects
  now       -> @projects/the/project/I/am/working/on

one can use the following path in the minibuffer to open a file, for instance using C-x C-f:

  @emacs-src/lisp/files.el
  @emacs-src/src/alloc.c
  @projects/great/README
  @now/src/some/stuff.txt

Doing so, completion is available for both directory names and files under their target directories. For instance, to open the third file above, you only have to type:

  C-x C-f @ p <tab> g <tab> R <tab> <enter>

The implementation I have just written is really simple, yet useful. It implements everything described above (including recursively defined directories, like the '@now' above.) Thanks to Emacs, I am still surprised by how easy it is to implement such a feature!

The code was written on GNU Emacs 22.1 on Windows, but should work on any platform, and I think on Emacs 21 as well.

;; TODO: Make a custom variable.
(defvar drkm-fav:favourite-directories-alist
  '(("saxon-src"  . "y:/Saxon/saxon-resources9-0-0-1/source/net/sf/saxon")
    ("kernow-src" . "~/xslt/kernow/svn-2007-09-29/kernow/trunk/src/net/sf/kernow"))
  "See `drkm-fav:handler'.")

(defvar drkm-fav::fav-dirs-re
  ;; TODO: Is there really no other way (than mapcar) to get the list
  ;; of the keys of an alist?!?
  (concat
   "^@"
   (regexp-opt
    (mapcar 'car drkm-fav:favourite-directories-alist)
    t))
  "Internal variable that stores a regex computed from
`drkm-fav:favourite-directories-alist'.  WARNING: This is not
updated automatically if the latter variable is changed.")

(defun drkm-fav:handler (primitive &rest args)
  "Magic handler for favourite directories.

With this handler installed into `file-name-handler-alist', it is
possible to use shortcuts for often used directories.  It uses
the mapping in the alist `drkm-fav:favourite-directories-alist'.

Once installed, say you have the following alist in the mapping
variable:

    ((\"dir-1\" . \"~/some/real/dir\")
     (\"dir-2\" . \"c:/other/dir/for/windows/users\"))

You can now use \"@dir-1\" while opening a file with C-x C-f for
instance, with completion for the abbreviation names themselves
as well as for files under the target directory."
  (cond
   ;; expand-file-name
   ((and (eq primitive 'expand-file-name)
         (string-match drkm-fav::fav-dirs-re (car args)))
    (replace-match
     (cdr (assoc (match-string 1 (car args))
                 drkm-fav:favourite-directories-alist))
     t t (car args)))
   ;; file-name-completion
   ((and (eq primitive 'file-name-completion)
         (string-match "^@\\([^/]*\\)$" (car args)))
    (let ((compl (try-completion
                  (match-string 1 (car args))
                  drkm-fav:favourite-directories-alist)))
      (cond ((eq t compl)
             (concat "@" (match-string 1 (car args)) "/"))
            ((not compl)
             nil)
            (t
             (concat "@" compl)))))
   ;; file-name-all-completions
   ((and (eq primitive 'file-name-all-completions)
         (string-match "^@\\([^/]*\\)$" (car args)))
    (all-completions
     (match-string 1 (car args))
     drkm-fav:favourite-directories-alist))
   ;; Handle any primitive we don't know about (from the info node
   ;; (info "(elisp)Magic File Names")).
   (t (let ((inhibit-file-name-handlers
             (cons 'drkm-fav:handler
                   (and (eq inhibit-file-name-operation primitive)
                        inhibit-file-name-handlers)))
            (inhibit-file-name-operation primitive))
        (apply primitive args)))))

;; Actually plug the feature into Emacs.
(push '("\\`@" . drkm-fav:handler) file-name-handler-alist)
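
The expansion rule itself is independent of Emacs. As an illustration only, here is a small Java sketch of resolving a leading "@name", including shortcuts defined in terms of other shortcuts such as '@now' above. The class name and the guard against cyclic definitions are additions of the sketch, not something taken from the Lisp code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of the favourite directories mapping. */
final class FavDirs {
    private final Map<String, String> dirs = new LinkedHashMap<>();

    FavDirs add(String name, String target) {
        dirs.put(name, target);
        return this;
    }

    /** Expand a leading "@name" until the path no longer starts with a known shortcut. */
    String expand(String path) {
        // A chain can use each shortcut at most once, so more rounds mean a cycle.
        for (int round = 0; round <= dirs.size(); round++) {
            if (!path.startsWith("@")) {
                return path;
            }
            int slash = path.indexOf('/');
            String name = slash < 0 ? path.substring(1) : path.substring(1, slash);
            String target = dirs.get(name);
            if (target == null) {
                return path;                       // unknown shortcut: leave untouched
            }
            path = target + (slash < 0 ? "" : path.substring(slash));
        }
        throw new IllegalStateException("cyclic favourite directory definition");
    }

    public static void main(String[] args) {
        FavDirs favs = new FavDirs()
                .add("projects", "/path/to/some/company/projects")
                .add("now", "@projects/the/project/I/am/working/on");
        System.out.println(favs.expand("@now/src/some/stuff.txt"));
    }
}
```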


Saturday, November 03, 2007

XSLT stacktrace with Saxon 9

I have played a little bit with Saxon-B's representation of the XSLT stack. When an error occurs while evaluating a stylesheet, you can of course catch the Java exception, and so get the Java stacktrace, but what is really interesting is the XSLT stacktrace. That is, where in the XSLT processing the error occurred. For instance "in the function X, called from the template Y, applied from the external application".

I didn't find comprehensive documentation on the subject, so I experimented a bit with what I had: an XPathException and, from there, an XPathContext. Please note that I used the new version 9.0.0.1 (I know there are a few differences between versions 8 and 9 in this area, but I did not try to catalog them).

Let me first introduce a concrete sample. Here is an example of an XSLT stylesheet that throws an error when applied to itself (there is some boilerplate code to prevent Saxon from optimizing too much away):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:fg="http://www.fgeorges.org/xslt/samples"
                version="2.0">

   <xsl:template match="/">
      <html>
         <head/>
         <body>
            <xsl:apply-templates select="*" mode="y"/>
         </body>
      </html>
   </xsl:template>

   <xsl:template match="*" name="nana" mode="y">
      <b>
         <xsl:value-of select="name(.)"/>
         <xsl:sequence select="fg:fun(*)"/>
      </b>
   </xsl:template>

   <xsl:template match="xsl:function/xsl:choose | aa//bb" mode="y">
      <b>
         <xsl:sequence select="fg:fun(*)"/>
      </b>
   </xsl:template>

   <xsl:function name="fg:fun">
      <xsl:param name="n" as="node()*"/>
      <xsl:choose>
         <xsl:when test="$n[0][name() = 'inexistent']">
            <xsl:apply-templates mode="y" select="
                $n[position() mod 2 eq 0]"/>
         </xsl:when>
         <xsl:otherwise>
            <xsl:apply-templates mode="y" select="$n"/>
         </xsl:otherwise>
      </xsl:choose>
   </xsl:function>

   <xsl:template match="xsl:otherwise" mode="y">
      <xsl:if test="@test">
         <xsl:sequence select="'&#10;'"/>
      </xsl:if>
      <dummy>
         <xsl:sequence select="
             error(xs:QName('fg:ERR007'), 'My error message')"/>
      </dummy>
   </xsl:template>

</xsl:stylesheet>

When applied to itself, the stylesheet should throw an error in the last template rule, the one matching xsl:otherwise. Here is what I expect the XSLT stacktrace to look like:

ERR007: My error message
  in template #y matching "xsl:otherwise" (at style.xsl:47)
    `-> /xsl:stylesheet[1]/xsl:function[1]/xsl:choose[1]/xsl:otherwise[1]
  applied in function fg:fun #1 (at style.xsl:36)
  called in template #y matching "xsl:function/xsl:choose | aa//bb" (at style.xsl:24)
    `-> /xsl:stylesheet[1]/xsl:function[1]/xsl:choose[1]
  applied in function fg:fun #1 (at style.xsl:36)
  called in template nana #y matching "*" (at style.xsl:18)
    `-> /xsl:stylesheet[1]/xsl:function[1]
  applied in function fg:fun #1 (at style.xsl:36)
  called in template nana #y matching "*" (at style.xsl:18)
    `-> /xsl:stylesheet[1]
  applied in template matching "/" (at style.xsl:10)
    `-> /
  applied from external application

First come the local part of the error name and the error message. The stack itself is then made of templates and functions calling or applying each other. For a function, the frame shows its name and its arity: function fg:fun #1. For a template (named template or template rule), it shows its name if any, then its mode if any, and finally its pattern if any (what it matches): template nana #y matching "*". In addition, each frame gives the location within the stylesheet (the file name of the stylesheet module and the line number), as well as the path to the current node within its document, if any.
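
The format of one frame can be summed up in a few lines of Java. This is only a sketch of the textual format described here, with hypothetical names; the actual classes are given further down.

```java
/** Illustrative formatter for the frame labels used in the stacktrace. */
final class FrameLabels {
    /** For instance: function fg:fun #1 */
    static String function(String name, int arity) {
        return "function " + name + " #" + arity;
    }

    /** Name, mode and pattern are each optional (null when absent). */
    static String template(String name, String mode, String pattern) {
        StringBuilder sb = new StringBuilder("template");
        if (name != null) {
            sb.append(' ').append(name);
        }
        if (mode != null) {
            sb.append(" #").append(mode);
        }
        if (pattern != null) {
            sb.append(" matching \"").append(pattern).append('"');
        }
        return sb.toString();
    }

    /** For instance: (at style.xsl:36) */
    static String location(String file, int line) {
        return "(at " + file + ":" + line + ")";
    }

    public static void main(String[] args) {
        System.out.println(template("nana", "y", "*") + " " + location("style.xsl", 18));
        System.out.println(function("fg:fun", 1) + " " + location("style.xsl", 36));
    }
}
```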

This is what I actually get now, except for the pattern. The original text of the pattern is not always kept in memory by Saxon; it is sometimes reconstructed from its internal (compiled) form. I'm sure I could improve this situation, but I haven't yet. In the above example, the only difference is that the pattern for the first template is not: template #y matching "xsl:otherwise" but instead: template #y matching "element({http://www.w3.org/1999/XSL/Transform}otherwise, xs:anyType)".

Instead of writing this information out directly as I walk Saxon's stack, I build an alternative Java representation of the XSLT stack, and then output it to the console. That way it is easier to reuse the code for other purposes (for instance to build a graphical view of the stack within a GUI for Saxon ;-)).

You can find the Java source files there:

  saxon9-xslt-stacktrace.zip: the ZIP archive with all the files.
  StackFrame.java: this class represents an abstract XSLT stack frame.
  FunctionFrame.java: this class represents an XSLT stack frame for a function.
  TemplateFrame.java: this class represents an XSLT stack frame for a template.
  StackVisitor.java: the stack visitor interface.
  Main.java: a sample program that runs a transform that throws an error, and displays the resulting XSLT stack.
  style.xsl: the sample transform.

Below are the complete sources inline, for completeness.

StackFrame.java

/*
 * StackFrame.java
 * 
 * Created on Oct 27, 2007, 2:57:04 PM
 */

package saxon.xslt.stacktrace;

import javax.xml.XMLConstants;
import javax.xml.namespace.QName;
import javax.xml.transform.SourceLocator;
import net.sf.saxon.expr.XPathContext;
import net.sf.saxon.instruct.Template;
import net.sf.saxon.instruct.UserFunction;
import net.sf.saxon.om.Axis;
import net.sf.saxon.om.AxisIterator;
import net.sf.saxon.om.Item;
import net.sf.saxon.om.NodeInfo;
import net.sf.saxon.om.SequenceIterator;
import net.sf.saxon.om.StandardNames;
import net.sf.saxon.om.StructuredQName;
import net.sf.saxon.pattern.NameTest;
import net.sf.saxon.pattern.NodeKindTest;
import net.sf.saxon.pattern.NodeTest;
import net.sf.saxon.pattern.NodeTestPattern;
import net.sf.saxon.pattern.Pattern;
import net.sf.saxon.trace.InstructionInfo;
import net.sf.saxon.trace.Location;
import net.sf.saxon.trans.Mode;
import net.sf.saxon.trans.Rule;
import net.sf.saxon.trans.XPathException;
import net.sf.saxon.type.Type;
import org.xml.sax.Locator;
import org.xml.sax.helpers.LocatorImpl;

/**
 * The base class for XSLT stack frames.
 * 
 * <p>The main entry points are {@link #makeStack(XPathContext,Locator)} and
 * {@link #makeStack(XPathException)}.</p>
 *
 * @author Florent Georges
 */
public abstract class StackFrame
{
    /**
     * Get the locator of the XSLT instruction this frame stands for.
     */
    public Locator getLocator()
    {
        return myLocator;
    }

    /**
     * Get the path of the current node if any.  May be null.
     */
    public String getPath()
    {
        return myPath;
    }

    /**
     * Get the next frame in the stack (that is, the frame that <em>called</em> this one).
     */
    public StackFrame getNext()
    {
        return myNext;
    }

    /**
     * Set the next frame in the stack (that is, the frame that <em>called</em> this one).
     */
    public void setNext(StackFrame next)
    {
        myNext = next;
    }

    /**
     * Accept a visitor.
     * 
     * <p>The visitor will visit this frame, then the rest of the stack (the next
     * frame, then its next frame, etcetera).</p>
     */
    public abstract void acceptVisitor(StackVisitor visitor);

    /**
     * Make a stack representation from the XPath exception.
     * 
     * <p>Just extract the locator from the exception, then call
     * {@link #makeStack(XPathContext,Locator)}.</p>
     */
    public static StackFrame makeStack(XPathException ex)
    {
        Locator locator = jaxpToSaxLocator(ex.getLocator());
        XPathContext ctxt = ex.getXPathContext();
        return makeStack(ctxt, locator);
    }

    /**
     * Make a stack representation from the XPath context.
     * 
     * <p><b>Discussion</b>: Saxon's XPath context <em>seems</em> to be
     * organised as follows.  One context represents either a template, a
     * call-template, an apply-templates or a function call (and a few others
     * like for-each, not relevant here and ignored).  On the one hand, it is
     * important for a template to have all those things in the contexts, because
     * the same template can be called or applied, and the same apply-templates
     * can apply different template rules.  For a function, on the other hand,
     * a call clearly identifies the function.</p>
     * 
     * <p>So you have two different types of context to identify the template or
     * function you are in: template and function call (as there is no function
     * context).  And you have three different ways of knowing where you are in
     * a template or function (where you leave it for another template or
     * function, that is, the line number that interests you): apply-templates,
     * call-template and function call.  Walking the contexts from the error
     * outwards, these location contexts are encountered first, and the
     * identifying contexts after them.</p>
     * 
     * <p>The location of the error (the starting point) is again something
     * different, which the {@link javax.xml.transform.TransformerException}
     * will give you.</p>
     * 
     * <p>So the idea is the following.  We learn the location to put in a
     * frame one (or more) contexts before the context that tells us the
     * template or function name, mode, arity, etc.  The function call does both
     * things: it tells the next context the line number to use, and it tells
     * which function to use with the current line number.</p>
     * 
     * <p>So we start by providing the locator, then recurse over the contexts.
     * If the context is a template, we make a new template frame with the
     * locator, then recurse with a null locator.  If the context is an
     * apply-templates or a call-template, we recurse with the locator it gives
     * us.  If the context is a function call, both things are done: we use the
     * current locator to create a new function frame with the function we have,
     * then we recurse with the locator the function call gives us.</p>
     * 
     * <p>Locators are copied to not keep references to a lot of Saxon objects,
     * directly or indirectly.</p>
     */
    public static StackFrame makeStack(XPathContext ctxt, Locator locator)
    {
        return makeStack(ctxt, locator, null);
    }

    /**
     * Implementation of {@link #makeStack(XPathContext ctxt, Locator locator)}.
     * 
     * <p>To deal with template frames, we need to keep a reference to the
     * <em>called</em> frame.  Thus this method with an extra parameter.</p>
     */
    private static StackFrame makeStack(XPathContext ctxt, Locator locator, StackFrame called)
    {
        // stop recursion
        if ( ctxt == null ) {
            return null;
        }

        InstructionInfo info = ctxt.getOrigin().getInstructionInfo();
        switch ( info.getConstructType() ) {
            case StandardNames.XSL_TEMPLATE: {
                // path
                String path = makePathToCurrentNode(ctxt);
                // pattern
                Rule rule = ctxt.getCurrentTemplateRule();
                String pattern = getPatternText(rule.getPattern());
                // name
                Template t = (Template) rule.getAction();
                QName name = structuredToQName(t.getTemplateName());
                // frame & recurse
                StackFrame frame = new TemplateFrame(path, pattern, name, null, locator);
                frame.setNext(makeStack(ctxt.getCaller(), null, frame));
                return frame;
            }
            case Location.FUNCTION_CALL: {
                // path
                String path = makePathToCurrentNode(ctxt);
                // name
                QName name = structuredToQName(info.getObjectName(ctxt.getNamePool()));
                // arity
                UserFunction fun = (UserFunction) info.getProperty("target");
                int arity = fun.getNumberOfArguments();
                // frame & recurse
                StackFrame frame = new FunctionFrame(path, name, arity, locator);
                frame.setNext(makeStack(ctxt.getCaller(), new LocatorImpl(info), frame));
                return frame;
            }
            case StandardNames.XSL_CALL_TEMPLATE: {
                // Should always be, but is not...?
                if ( called instanceof TemplateFrame ) {
                    TemplateFrame frame = (TemplateFrame) called;
                    frame.setCalled(true);
                }
                return makeStack(ctxt.getCaller(), new LocatorImpl(info), called);
            }
            case StandardNames.XSL_APPLY_TEMPLATES: {
                // Should always be, but is not...?
                if ( called instanceof TemplateFrame ) {
                    TemplateFrame frame = (TemplateFrame) called;
                    // mode
                    Object pmode = info.getProperty("mode");
                    if ( pmode != null ) {
                        StructuredQName smode = ((Mode) pmode).getModeName();
                        frame.setMode(structuredToQName(smode));
                    }
                }
                // recurse with a new locator
                return makeStack(ctxt.getCaller(), new LocatorImpl(info), called);
            }
            default: {
                // just recurse the next context
                return makeStack(ctxt.getCaller(), locator, called);
            }
        }
    }

    /**
     * Internal helper, returning a representation of a QName for human readers.
     */
    protected static String displayQName(QName name)
    {
        if ( XMLConstants.DEFAULT_NS_PREFIX.equals(name.getPrefix()) ) {
            return name.getLocalPart();
        }
        else {
            return name.getPrefix() + ":" + name.getLocalPart();
        }
    }

    /**
     * Return the text view of a <em>compiled</em> pattern.
     * 
     * <p>Could be improved to be more human-friendly in some cases.</p>
     */
    private static String getPatternText(Pattern pattern)
    {
        NodeTest test = pattern.getNodeTest();
        if ( pattern instanceof NodeTestPattern && test instanceof NodeKindTest ) {
            if ( test == NodeKindTest.ATTRIBUTE ) {
                return "@*";
            }
            else if ( test == NodeKindTest.DOCUMENT ) {
                return "/";
            }
            else if ( test == NodeKindTest.ELEMENT ) {
                return "*";
            }
        }
        return pattern.toString();
    }

    /**
     * Build a new SAX {@link Locator} from a JAXP {@link SourceLocator}.
     */
    private static Locator jaxpToSaxLocator(SourceLocator jaxp)
    {
        LocatorImpl sax = new LocatorImpl();
        sax.setColumnNumber(jaxp.getColumnNumber());
        sax.setLineNumber(jaxp.getLineNumber());
        sax.setPublicId(jaxp.getPublicId());
        sax.setSystemId(jaxp.getSystemId());
        return sax;
    }

    /**
     * Build a new JAXP {@link QName} from a Saxon {@link StructuredQName}.
     */
    private static QName structuredToQName(StructuredQName sname)
    {
        if ( sname == null ) {
            return null;
        }
        String uri = sname.getNamespaceURI();
        String local = sname.getLocalName();
        String prefix = sname.getPrefix();
        return new QName(uri, local, prefix);
    }

    /**
     * Return a human-friendly view of the path to the current node if any.
     */
    private static String makePathToCurrentNode(XPathContext ctxt)
    {
        SequenceIterator it = ctxt.getCurrentIterator();
        if ( it == null ) {
            return null;
        }
        Item item = it.current();
        if ( item instanceof NodeInfo ) {
            return makePathTo((NodeInfo) item);
        }
        return null;
    }

    /**
     * Return a human-friendly view of the path to a node within its document.
     */
    private static String makePathTo(NodeInfo node)
    {
        if ( node == null ) {
            return null;
        }
        String path = null;
        switch ( node.getNodeKind() ) {
            case Type.DOCUMENT: {
                return "/";
            }
            case Type.ELEMENT: {
                String name = node.getNamePool().getDisplayName(node.getNameCode());
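                // count preceding siblings only: the position predicate in the
                // path is relative to the siblings, not to the whole document
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, new NameTest(node));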
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = name + "[" + pos + "]";
                break;
            }
            case Type.ATTRIBUTE: {
                String name = node.getNamePool().getDisplayName(node.getNameCode());
                path = "@" + name;
                break;
            }
            case Type.TEXT: {
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, NodeKindTest.TEXT);
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = "text()[" + pos + "]";
                break;
            }
            case Type.COMMENT: {
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, NodeKindTest.COMMENT);
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = "comment()[" + pos + "]";
                break;
            }
            case Type.PROCESSING_INSTRUCTION: {
                String name = node.getNamePool().getDisplayName(node.getNameCode());
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, new NameTest(node));
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = "processing-instruction(" + name + ")[" + pos + "]";
                break;
            }
            case Type.NAMESPACE: {
                int name_code = node.getNameCode();
                String name = name_code < 0 ? "" : node.getNamePool().getDisplayName(name_code);
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING, new NameTest(node));
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = "namespace(" + name + ")[" + pos + "]";
                break;
            }
            default: {
                throw new RuntimeException("FIXME: What to do?!?");
            }
        }

        String parent = makePathTo(node.getParent());
        if ( parent == null ) {
            return path;
        }
        else if ( "/".equals(parent) ) {
            return "/" + path;
        }
        else {
            return parent + "/" + path;
        }
    }

    protected Locator myLocator;
    protected String myPath;
    private StackFrame myNext;
}
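
The locator hand-off described in the javadoc of makeStack can be tried out without Saxon. The following sketch is a simplified simulation, not the real code: contexts are reduced to a kind, a label and a line number (all hypothetical names), and are listed innermost first, as when following getCaller().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified simulation of the locator hand-off (no Saxon classes involved). */
final class LocatorHandOff {
    enum Kind { TEMPLATE, FUNCTION_CALL, APPLY_TEMPLATES, CALL_TEMPLATE, OTHER }

    /** Stand-in for one context: its kind, a display label and its own line. */
    static final class Ctx {
        final Kind kind;
        final String label;
        final int line;

        Ctx(Kind kind, String label, int line) {
            this.kind = kind;
            this.label = label;
            this.line = line;
        }
    }

    static List<String> frames(List<Ctx> contexts, int errorLine) {
        List<String> frames = new ArrayList<>();
        int pending = errorLine;          // line to attach to the next frame, -1 if unknown
        for (Ctx c : contexts) {
            switch (c.kind) {
                case TEMPLATE:            // identifies a frame and consumes the line
                    frames.add("template " + c.label + " (line " + pending + ")");
                    pending = -1;
                    break;
                case FUNCTION_CALL:       // identifies a frame AND provides the next line
                    frames.add("function " + c.label + " (line " + pending + ")");
                    pending = c.line;
                    break;
                case APPLY_TEMPLATES:
                case CALL_TEMPLATE:       // only provides the line for the next frame
                    pending = c.line;
                    break;
                default:                  // for-each and the like: ignored
                    break;
            }
        }
        return frames;
    }

    public static void main(String[] args) {
        List<Ctx> contexts = Arrays.asList(
                new Ctx(Kind.TEMPLATE, "xsl:otherwise", 0),   // line unused for templates
                new Ctx(Kind.APPLY_TEMPLATES, null, 36),
                new Ctx(Kind.FUNCTION_CALL, "fg:fun", 24),
                new Ctx(Kind.TEMPLATE, "xsl:function/xsl:choose", 0));
        for (String frame : frames(contexts, 47)) {
            System.out.println(frame);
        }
    }
}
```

With the line numbers of the sample stacktrace above, the error line 47 goes to the innermost template, line 36 (from the apply-templates) goes to the function, and line 24 (from the function call) goes to the template that called it.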

FunctionFrame.java

/*
 * FunctionFrame.java
 * 
 * Created on Oct 31, 2007, 8:10:37 PM
 */

package saxon.xslt.stacktrace;

import javax.xml.namespace.QName;
import org.xml.sax.Locator;

/**
 * Represent an XSLT stack frame for a function.
 *
 * @author Florent Georges
 */
public class FunctionFrame
        extends StackFrame
{
    /**
     * Build a new function frame.
     * 
     * @param curr_item
     *             The path to the current node if any.
     * 
     * @param name
     *             The name of the function.
     * 
     * @param arity
     *             The arity of the function (its number of parameters).
     * 
     * @param locator
     *             The SAX locator for the function.
     */
    public FunctionFrame(String curr_item, QName name, int arity, Locator locator)
    {
        myPath = curr_item;
        myName = name;
        myArity = arity;
        myLocator = locator;
    }

    /**
     * Return the name of the function.
     */
    public QName getName()
    {
        return myName;
    }

    /**
     * Return the arity of the function (its number of parameters).
     */
    public int getArity()
    {
        return myArity;
    }

    @Override
    public String toString()
    {
        return "function " + displayQName(myName) + " #" + myArity;
    }

    @Override
    public void acceptVisitor(StackVisitor visitor)
    {
        visitor.visitFunctionFrame(this);
        if ( getNext() != null ) {
            getNext().acceptVisitor(visitor);
        }
    }

    private QName myName;
    private int myArity;
}

TemplateFrame.java

/*
 * TemplateFrame.java
 * 
 * Created on Oct 31, 2007, 8:09:11 PM
 */

package saxon.xslt.stacktrace;

import javax.xml.namespace.QName;
import org.xml.sax.Locator;

/**
 * Represent an XSLT stack frame for a template.
 *
 * @author Florent Georges
 */
public class TemplateFrame
        extends StackFrame
{
    /**
     * Build a new template frame.
     * 
     * @param curr_item
     *             The path to the current node if any.
     * 
     * @param pattern
     *             The pattern the template matches, if it is a template rule.
     * 
     * @param name
     *             The name of the template, if it is a named template.
     * 
     * @param mode
     *             The mode of the template, if any.
     * 
     * @param locator
     *             The SAX locator for the template.
     */
    public TemplateFrame(String curr_item, String pattern, QName name, QName mode, Locator locator)
    {
        myPath = curr_item;
        myPattern = pattern;
        myName = name;
        myMode = mode;
        myLocator = locator;
        myCalled = false;
    }

    /**
     * Return the pattern of the template, if it is a template rule.
     */
    public String getPattern()
    {
        return myPattern;
    }

    /**
     * Return the name of the template if it is a named template.
     */
    public QName getName()
    {
        return myName;
    }

    /**
     * Return the name of the mode of the template, if it is a template rule and has a mode.
     */
    public QName getMode()
    {
        return myMode;
    }

    /**
     * Set the mode of the template.
     */
    public void setMode(QName mode)
    {
        myMode = mode;
    }

    /**
     * Return {@code true} if the template was called, {@code false} if it was applied.
     */
    public boolean isCalled()
    {
        return myCalled;
    }

    /**
     * Set the {@code isCalled} property of the template (see {@link #isCalled()}).
     */
    public void setCalled(boolean called)
    {
        myCalled = called;
    }

    @Override
    public String toString()
    {
        StringBuilder buf = new StringBuilder("template ");
        if ( myName != null ) {
            buf.append(displayQName(myName)).append(' ');
        }
        if ( myMode != null ) {
            buf.append('#').append(displayQName(myMode)).append(' ');
        }
        if ( myPattern != null ) {
            buf.append("matching ").append('\"').append(myPattern).append('\"');
        }
        return buf.toString();
    }

    @Override
    public void acceptVisitor(StackVisitor visitor)
    {
        visitor.visitTemplateFrame(this);
        if ( getNext() != null ) {
            getNext().acceptVisitor(visitor);
        }
    }

    private String myPattern;
    private QName myName;
    private QName myMode;
    private boolean myCalled;
}

StackVisitor.java

/*
 * StackVisitor.java
 * 
 * Created on Oct 31, 2007, 8:14:12 PM
 */

package saxon.xslt.stacktrace;

/**
 * Visitor that visits an XSLT stack.
 *
 * @see StackFrame#acceptVisitor(StackVisitor)
 * 
 * @author Florent Georges
 */
public interface StackVisitor
{
    /**
     * Visit a function frame.
     */
    public void visitFunctionFrame(FunctionFrame frame);

    /**
     * Visit a template frame.
     */
    public void visitTemplateFrame(TemplateFrame frame);
}

Main.java

/*
 * Main.java
 * 
 * Created on Oct 27, 2007, 1:21:32 PM
 */

package saxon.xslt.stacktrace;

import java.io.File;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.TransformerFactoryImpl;
import net.sf.saxon.trans.XPathException;

/**
 * Sample application that runs a stylesheet and outputs the XSLT stack trace in case of error.
 *
 * <p>The stylesheet used is {@code src/saxon/xslt/stacktrace/style.xsl}.</p>
 * 
 * @author Florent Georges
 */
public class Main
{
    public static void main(String[] args)
            throws TransformerException
    {
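        // Instantiate Saxon directly: the static newInstance() is inherited from
        // TransformerFactory and goes through the JAXP lookup, so it is not
        // guaranteed to return the Saxon implementation.
        TransformerFactory factory = new TransformerFactoryImpl();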
        Source style = new StreamSource(new File("src/saxon/xslt/stacktrace/style.xsl"));
        Transformer trans = factory.newTransformer(style);
        trans.setErrorListener(new NullErrorListener());
        try {
            trans.transform(style, new StreamResult(System.out));
        }
        catch ( XPathException ex ) {
            System.err.println(ex.getErrorCodeLocalPart() + ": " + ex.getMessage());
            StackFrame stack = StackFrame.makeStack(ex);
            DisplayerStackVisitor visitor = new DisplayerStackVisitor();
            stack.acceptVisitor(visitor);
            visitor.finish();
        }
    }

    /**
     * Does nothing, to suppress error messages from Saxon itself.
     */
    private static class NullErrorListener
            implements ErrorListener
    {
        public void warning(TransformerException ex)
                throws TransformerException {
        }
        public void error(TransformerException ex)
                throws TransformerException {
        }
        public void fatalError(TransformerException ex)
                throws TransformerException {
        }
    }

    /**
     * Stack visitor that displays the stack as text on {@code System.err}.
     */
    private static class DisplayerStackVisitor
            implements StackVisitor
    {
        public void visitFunctionFrame(FunctionFrame frame)
        {
            doVisit(frame);
            myHead = "  called ";
        }

        public void visitTemplateFrame(TemplateFrame frame)
        {
            doVisit(frame);
            myHead = frame.isCalled() ? "  called " : "  applied ";
        }

        public void finish()
        {
            System.err.println(myHead + "from external application");
        }

        private void doVisit(StackFrame frame)
        {
            int line = frame.getLocator().getLineNumber();
            int col = frame.getLocator().getColumnNumber();
            String pos = getFile(frame) + ":" + line + ( col < 0 ? "" : ":" + col );
            System.err.println(myHead + "in " + frame + " (at " + pos + ")");
            String path = frame.getPath();
            if ( path != null ) {
                System.err.println("    `-> " + path);
            }
        }

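        private static String getFile(StackFrame frame)
        {
            String sysid = frame.getLocator().getSystemId();
            // A SAX locator may return null when the system ID is not known.
            if ( sysid == null ) {
                return "<unknown>";
            }
            // Keep only the last step of the URI (the file name).
            int slash = sysid.lastIndexOf('/');
            if ( slash < 0 ) {
                return sysid;
            }
            else {
                return sysid.substring(slash + 1);
            }
        }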

        private String myHead = "  ";
    }
}
