Saturday, November 03, 2007

XSLT stacktrace with Saxon 9

I have played a little with Saxon-B's representation of the XSLT stack. When an error occurs while evaluating a stylesheet, you can of course catch the Java exception, and so get the Java stacktrace, but what is really interesting is the XSLT stacktrace, that is, where in the XSLT processing the error occurred. For instance "in the function X, called from the template Y, applied from the external application".

I didn't find comprehensive documentation on that subject, so I experimented a bit with what I had: an XPathException and, from there, an XPathContext. Please note that I used the new version 9.0.0.1 for this (I know there are a few differences between versions 8 and 9 in the area of interest here, but I didn't try to catalog them).

Let me first introduce a concrete sample. Here is an example of an XSLT stylesheet that will throw an error when applied to itself (there is a bit of boilerplate code to keep Saxon from optimizing too much away):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:fg="http://www.fgeorges.org/xslt/samples"
                version="2.0">

   <xsl:template match="/">
      <html>
         <head/>
         <body>
            <xsl:apply-templates select="*" mode="y"/>
         </body>
      </html>
   </xsl:template>

   <xsl:template match="*" name="nana" mode="y">
      <b>
         <xsl:value-of select="name(.)"/>
         <xsl:sequence select="fg:fun(*)"/>
      </b>
   </xsl:template>

   <xsl:template match="xsl:function/xsl:choose | aa//bb" mode="y">
      <b>
         <xsl:sequence select="fg:fun(*)"/>
      </b>
   </xsl:template>

   <xsl:function name="fg:fun">
      <xsl:param name="n" as="node()*"/>
      <xsl:choose>
         <xsl:when test="$n[0][name() = 'inexistent']">
            <xsl:apply-templates mode="y" select="
                $n[position() mod 2 eq 0]"/>
         </xsl:when>
         <xsl:otherwise>
            <xsl:apply-templates mode="y" select="$n"/>
         </xsl:otherwise>
      </xsl:choose>
   </xsl:function>

   <xsl:template match="xsl:otherwise" mode="y">
      <xsl:if test="@test">
         <xsl:sequence select="'&#10;'"/>
      </xsl:if>
      <dummy>
         <xsl:sequence select="
             error(xs:QName('fg:ERR007'), 'My error message')"/>
      </dummy>
   </xsl:template>

</xsl:stylesheet>

When applied to itself, the stylesheet should throw an error in the last template rule, the one matching xsl:otherwise. Here is what I expect the XSLT stacktrace to look like:

ERR007: My error message
  in template #y matching "xsl:otherwise" (at style.xsl:47)
    `-> /xsl:stylesheet[1]/xsl:function[1]/xsl:choose[1]/xsl:otherwise[1]
  applied in function fg:fun #1 (at style.xsl:36)
  called in template #y matching "xsl:function/xsl:choose | aa//bb" (at style.xsl:24)
    `-> /xsl:stylesheet[1]/xsl:function[1]/xsl:choose[1]
  applied in function fg:fun #1 (at style.xsl:36)
  called in template nana #y matching "*" (at style.xsl:18)
    `-> /xsl:stylesheet[1]/xsl:function[1]
  applied in function fg:fun #1 (at style.xsl:36)
  called in template nana #y matching "*" (at style.xsl:18)
    `-> /xsl:stylesheet[1]
  applied in template matching "/" (at style.xsl:10)
    `-> /
  applied from external application

First come the local part of the error code and the error message. Then the stack itself is composed of templates and functions calling or applying each other. For a function, there is its name and its arity: function fg:fun #1. For a template (both named templates and template rules), there is its name if any, then its mode if any, and finally its pattern if any (what it matches): template nana #y matching "*". In addition, there is the location within the stylesheet (the file name of the stylesheet module and the line number) as well as the path to the current node within its document, if any.

This is what I actually get now, except for the pattern. The original text of the pattern is not always kept in memory by Saxon; it is sometimes reconstructed from its internal (compiled) form. I'm sure I could improve this situation, but I haven't yet. In the above example, the only difference is that the pattern for the first template is not: template #y matching "xsl:otherwise" but instead: template #y matching "element({http://www.w3.org/1999/XSL/Transform}otherwise, xs:anyType)".
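One possible way to make such a reconstructed pattern more readable is to post-process its text. The following is only a minimal sketch under that assumption: it works purely on strings, the class name PatternPrettifier and the prefix map are hypothetical, and it only recognizes the element({uri}local, xs:anyType) shape shown above, leaving anything else untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternPrettifier {

    // Matches the reconstructed form: element({uri}local, xs:anyType)
    private static final Pattern ELEMENT_TEST =
            Pattern.compile("element\\(\\{([^}]*)\\}([^,\\s]+),\\s*xs:anyType\\)");

    /**
     * Rewrite each element({uri}local, xs:anyType) as prefix:local when the
     * URI is bound in the given map; leave everything else untouched.
     */
    public static String prettify(String compiled, Map<String, String> prefixByUri) {
        Matcher m = ELEMENT_TEST.matcher(compiled);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String prefix = prefixByUri.get(m.group(1));
            String replacement = prefix == null
                    ? m.group()                   // unknown namespace: keep as-is
                    : prefix + ":" + m.group(2);  // known namespace: prefix:local
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> ns = new LinkedHashMap<String, String>();
        ns.put("http://www.w3.org/1999/XSL/Transform", "xsl");
        System.out.println(prettify(
                "element({http://www.w3.org/1999/XSL/Transform}otherwise, xs:anyType)", ns));
    }
}
```

The prefix map would have to come from the namespaces in scope in the stylesheet; how to obtain them from Saxon is not covered here.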

Instead of writing this information out directly as I walk Saxon's stack, I build an alternative Java representation of the XSLT stack, and then output it to the console. That way it is easier to reuse the code for other purposes (for instance to build a graphical view of the stack within a GUI for Saxon ;-)).
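As an illustration of that kind of reuse, here is a small Saxon-free sketch of a consumer that collects one line per frame into a list instead of printing it, which is roughly what a GUI model would need. The types Visitor, Frame, Fun, Tpl and Collector are simplified, hypothetical stand-ins written for this example, not the actual classes from the archive.

```java
import java.util.ArrayList;
import java.util.List;

public class CollectingVisitorDemo {

    /** Stand-in for the visitor interface (hypothetical, Saxon-free). */
    interface Visitor {
        void visitFunction(Fun frame);
        void visitTemplate(Tpl frame);
    }

    /** Stand-in for a stack frame: a label plus a link to the calling frame. */
    static abstract class Frame {
        final String label;
        Frame next;
        Frame(String label) { this.label = label; }
        abstract void visitThis(Visitor v);
        void accept(Visitor v) {
            // visit this frame, then the frame that called it, and so on
            for (Frame f = this; f != null; f = f.next) {
                f.visitThis(v);
            }
        }
    }

    static class Fun extends Frame {
        Fun(String label) { super(label); }
        void visitThis(Visitor v) { v.visitFunction(this); }
    }

    static class Tpl extends Frame {
        Tpl(String label) { super(label); }
        void visitThis(Visitor v) { v.visitTemplate(this); }
    }

    /** Collects one line per frame instead of printing it. */
    static class Collector implements Visitor {
        final List<String> lines = new ArrayList<String>();
        public void visitFunction(Fun f) { lines.add("function " + f.label); }
        public void visitTemplate(Tpl t) { lines.add("template " + t.label); }
    }

    public static List<String> demo() {
        Frame inner = new Tpl("#y matching \"xsl:otherwise\"");
        Frame mid = new Fun("fg:fun #1");
        Frame outer = new Tpl("matching \"/\"");
        inner.next = mid;
        mid.next = outer;
        Collector collector = new Collector();
        inner.accept(collector);
        return collector.lines;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```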

You can find the Java source files there:

saxon9-xslt-stacktrace.zip
The ZIP archive with all the files
StackFrame.java
This class represents an abstract XSLT stack frame.
FunctionFrame.java
This class represents an XSLT stack frame for a function.
TemplateFrame.java
This class represents an XSLT stack frame for a template.
StackVisitor.java
The stack visitor interface.
Main.java
A sample program that runs a transform that throws an error, and displays the resulting XSLT stack.
style.xsl
The sample transform.

The complete sources are also reproduced inline below.

StackFrame.java

/*
 * StackFrame.java
 * 
 * Created on Oct 27, 2007, 2:57:04 PM
 */

package saxon.xslt.stacktrace;

import javax.xml.XMLConstants;
import javax.xml.namespace.QName;
import javax.xml.transform.SourceLocator;
import net.sf.saxon.expr.XPathContext;
import net.sf.saxon.instruct.Template;
import net.sf.saxon.instruct.UserFunction;
import net.sf.saxon.om.Axis;
import net.sf.saxon.om.AxisIterator;
import net.sf.saxon.om.Item;
import net.sf.saxon.om.NodeInfo;
import net.sf.saxon.om.SequenceIterator;
import net.sf.saxon.om.StandardNames;
import net.sf.saxon.om.StructuredQName;
import net.sf.saxon.pattern.NameTest;
import net.sf.saxon.pattern.NodeKindTest;
import net.sf.saxon.pattern.NodeTest;
import net.sf.saxon.pattern.NodeTestPattern;
import net.sf.saxon.pattern.Pattern;
import net.sf.saxon.trace.InstructionInfo;
import net.sf.saxon.trace.Location;
import net.sf.saxon.trans.Mode;
import net.sf.saxon.trans.Rule;
import net.sf.saxon.trans.XPathException;
import net.sf.saxon.type.Type;
import org.xml.sax.Locator;
import org.xml.sax.helpers.LocatorImpl;

/**
 * The base class for XSLT stack frames.
 * 
 * <p>The main entry points are {@link #makeStack(XPathContext,Locator)} and
 * {@link #makeStack(XPathException)}.</p>
 *
 * @author Florent Georges
 */
public abstract class StackFrame
{
    /**
     * Get the locator of the XSLT instruction this frame stands for.
     */
    public Locator getLocator()
    {
        return myLocator;
    }

    /**
     * Get the path of the current node if any.  May be null.
     */
    public String getPath()
    {
        return myPath;
    }

    /**
     * Get the next frame in the stack (that is, the frame that <em>called</em> this one).
     */
    public StackFrame getNext()
    {
        return myNext;
    }

    /**
     * Set the next frame in the stack (that is, the frame that <em>called</em> this one).
     */
    public void setNext(StackFrame next)
    {
        myNext = next;
    }

    /**
     * Accept a visitor.
     * 
     * <p>The visitor will visit this frame, then the rest of the stack (the next
     * frame, then its next frame, etcetera).</p>
     */
    public abstract void acceptVisitor(StackVisitor visitor);

    /**
     * Make a stack representation from the XPath exception.
     * 
     * <p>Just extract the locator from the exception, then call
     * {@link #makeStack(XPathContext,Locator)}.</p>
     */
    public static StackFrame makeStack(XPathException ex)
    {
        Locator locator = jaxpToSaxLocator(ex.getLocator());
        XPathContext ctxt = ex.getXPathContext();
        return makeStack(ctxt, locator);
    }

    /**
     * Make a stack representation from the XPath context.
     * 
     * <p><b>Discussion</b>: Saxon's XPath context <em>seems</em> to be
     * organised as follows.  One context represents either a template, a
     * call-template, an apply-templates or a function call (and a few others
     * like for-each, not relevant here and ignored).  On the one hand, it is
     * important for a template to have all those things in the contexts,
     * because the same template can be called or applied, and the same
     * apply-templates can apply different template rules.  For a function, on
     * the other hand, a call clearly identifies the function.</p>
     * 
     * <p>So you have two different types of context to identify the template
     * or function you are in: template and function call (as there is no
     * function context).  And you have three different ways of knowing where
     * you are in a template or function (where you leave it for another
     * template or function, that is, the line number that interests you):
     * apply-templates, call-template and function call.  The latter (the
     * location contexts) are encountered first, the former (the identifying
     * contexts) later.</p>
     * 
     * <p>The location of the error (the starting point) is again something
     * different, which the {@link javax.xml.transform.TransformerException}
     * will give you.</p>
     * 
     * <p>So the idea is the following.  We know the location to put in the
     * stack one (or more) contexts before the context telling the template or
     * function name, mode, arity, etc.  The function call does both things:
     * it tells the next context the line number to use, and it tells which
     * function to use with the current line number.</p>
     * 
     * <p>So we start by providing the locator, then recurse through the
     * contexts.  If the context is a template, we make a new template frame
     * with the locator, then recurse with a null locator.  If the context is
     * an apply-templates or a call-template, we recurse with the locator it
     * gives us.  If the context is a function call, both things are done: we
     * use the current locator to create a new function frame with the function
     * we have, then we recurse with the locator the function call gives us.</p>
     * 
     * <p>Locators are copied so as not to keep references to a lot of Saxon
     * objects, directly or indirectly.</p>
     */
    public static StackFrame makeStack(XPathContext ctxt, Locator locator)
    {
        return makeStack(ctxt, locator, null);
    }

    /**
     * Implementation of {@link #makeStack(XPathContext ctxt, Locator locator)}.
     * 
     * <p>To deal with template frames, we need to keep a reference to the
     * <em>called</em> frame.  Thus this method with an extra parameter.</p>
     */
    private static StackFrame makeStack(XPathContext ctxt, Locator locator, StackFrame called)
    {
        // stop recursion
        if ( ctxt == null ) {
            return null;
        }

        InstructionInfo info = ctxt.getOrigin().getInstructionInfo();
        switch ( info.getConstructType() ) {
            case StandardNames.XSL_TEMPLATE: {
                // path
                String path = makePathToCurrentNode(ctxt);
                // pattern
                Rule rule = ctxt.getCurrentTemplateRule();
                String pattern = getPatternText(rule.getPattern());
                // name
                Template t = (Template) rule.getAction();
                QName name = structuredToQName(t.getTemplateName());
                // frame & recurse
                StackFrame frame = new TemplateFrame(path, pattern, name, null, locator);
                frame.setNext(makeStack(ctxt.getCaller(), null, frame));
                return frame;
            }
            case Location.FUNCTION_CALL: {
                // path
                String path = makePathToCurrentNode(ctxt);
                // name
                QName name = structuredToQName(info.getObjectName(ctxt.getNamePool()));
                // arity
                UserFunction fun = (UserFunction) info.getProperty("target");
                int arity = fun.getNumberOfArguments();
                // frame & recurse
                StackFrame frame = new FunctionFrame(path, name, arity, locator);
                frame.setNext(makeStack(ctxt.getCaller(), new LocatorImpl(info), frame));
                return frame;
            }
            case StandardNames.XSL_CALL_TEMPLATE: {
                // Should always be, but is not...?
                if ( called instanceof TemplateFrame ) {
                    TemplateFrame frame = (TemplateFrame) called;
                    frame.setCalled(true);
                }
                return makeStack(ctxt.getCaller(), new LocatorImpl(info), called);
            }
            case StandardNames.XSL_APPLY_TEMPLATES: {
                // Should always be, but is not...?
                if ( called instanceof TemplateFrame ) {
                    TemplateFrame frame = (TemplateFrame) called;
                    // mode
                    Object pmode = info.getProperty("mode");
                    if ( pmode != null ) {
                        StructuredQName smode = ((Mode) pmode).getModeName();
                        frame.setMode(structuredToQName(smode));
                    }
                }
                // recurse with a new locator
                return makeStack(ctxt.getCaller(), new LocatorImpl(info), called);
            }
            default: {
                // just recurse the next context
                return makeStack(ctxt.getCaller(), locator, called);
            }
        }
    }

    /**
     * Internal helper, returning a representation of a QName for human readers.
     */
    protected static String displayQName(QName name)
    {
        if ( XMLConstants.DEFAULT_NS_PREFIX.equals(name.getPrefix()) ) {
            return name.getLocalPart();
        }
        else {
            return name.getPrefix() + ":" + name.getLocalPart();
        }
    }

    /**
     * Return the text view of a <em>compiled</em> pattern.
     * 
     * <p>Could be improved to be more human-friendly in some cases.</p>
     */
    private static String getPatternText(Pattern pattern)
    {
        NodeTest test = pattern.getNodeTest();
        if ( pattern instanceof NodeTestPattern && test instanceof NodeKindTest ) {
            if ( test == NodeKindTest.ATTRIBUTE ) {
                return "@*";
            }
            else if ( test == NodeKindTest.DOCUMENT ) {
                return "/";
            }
            else if ( test == NodeKindTest.ELEMENT ) {
                return "*";
            }
        }
        return pattern.toString();
    }

    /**
     * Build a new SAX {@link Locator} from a JAXP {@link SourceLocator}.
     */
    private static Locator jaxpToSaxLocator(SourceLocator jaxp)
    {
        LocatorImpl sax = new LocatorImpl();
        sax.setColumnNumber(jaxp.getColumnNumber());
        sax.setLineNumber(jaxp.getLineNumber());
        sax.setPublicId(jaxp.getPublicId());
        sax.setSystemId(jaxp.getSystemId());
        return sax;
    }

    /**
     * Build a new JAXP {@link QName} from a Saxon {@link StructuredQName}.
     */
    private static QName structuredToQName(StructuredQName sname)
    {
        if ( sname == null ) {
            return null;
        }
        String uri = sname.getNamespaceURI();
        String local = sname.getLocalName();
        String prefix = sname.getPrefix();
        return new QName(uri, local, prefix);
    }

    /**
     * Return a human-friendly view of the path to the current node if any.
     */
    private static String makePathToCurrentNode(XPathContext ctxt)
    {
        SequenceIterator it = ctxt.getCurrentIterator();
        if ( it == null ) {
            return null;
        }
        Item item = it.current();
        if ( item instanceof NodeInfo ) {
            return makePathTo((NodeInfo) item);
        }
        return null;
    }

    /**
     * Return a human-friendly view of the path to a node within its document.
     */
    private static String makePathTo(NodeInfo node)
    {
        if ( node == null ) {
            return null;
        }
        String path = null;
        switch ( node.getNodeKind() ) {
            case Type.DOCUMENT: {
                return "/";
            }
            case Type.ELEMENT: {
                String name = node.getNamePool().getDisplayName(node.getNameCode());
                // count same-named preceding siblings, as an XPath step name[n] does
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, new NameTest(node));
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = name + "[" + pos + "]";
                break;
            }
            case Type.ATTRIBUTE: {
                String name = node.getNamePool().getDisplayName(node.getNameCode());
                path = "@" + name;
                break;
            }
            case Type.TEXT: {
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, NodeKindTest.TEXT);
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = "text()[" + pos + "]";
                break;
            }
            case Type.COMMENT: {
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, NodeKindTest.COMMENT);
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = "comment()[" + pos + "]";
                break;
            }
            case Type.PROCESSING_INSTRUCTION: {
                String name = node.getNamePool().getDisplayName(node.getNameCode());
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, new NameTest(node));
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = "processing-instruction(" + name + ")[" + pos + "]";
                break;
            }
            case Type.NAMESPACE: {
                int name_code = node.getNameCode();
                String name = name_code < 0 ? "" : node.getNamePool().getDisplayName(name_code);
                AxisIterator ai = node.iterateAxis(Axis.PRECEDING, new NameTest(node));
                int pos = 1;
                while ( ai.moveNext() ) {
                    ++ pos;
                }
                path = "namespace(" + name + ")[" + pos + "]";
                break;
            }
            default: {
                throw new RuntimeException("FIXME: What to do?!?");
            }
        }

        String parent = makePathTo(node.getParent());
        if ( parent == null ) {
            return path;
        }
        else if ( "/".equals(parent) ) {
            return "/" + path;
        }
        else {
            return parent + "/" + path;
        }
    }

    protected Locator myLocator;
    protected String myPath;
    private StackFrame myNext;
}
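The locator hand-off described in the Javadoc of makeStack can be easier to follow without the Saxon types. Below is a simplified, Saxon-free sketch of the same idea. The Kind enum and the Ctx class are hypothetical stand-ins for Saxon's context chain, line numbers stand in for locators, and the demo chain is a shortened version of the sample stack, not what Saxon actually produces.

```java
import java.util.ArrayList;
import java.util.List;

public class LocatorHandoffDemo {

    enum Kind { TEMPLATE, APPLY_TEMPLATES, CALL_TEMPLATE, FUNCTION_CALL, OTHER }

    /** Stand-in for one context in the chain; 'caller' is the next outer context. */
    static final class Ctx {
        final Kind kind;
        final String name;
        final int line;
        final Ctx caller;
        Ctx(Kind kind, String name, int line, Ctx caller) {
            this.kind = kind;
            this.name = name;
            this.line = line;
            this.caller = caller;
        }
    }

    /** Walk the chain from the innermost context, carrying the pending line number. */
    public static List<String> frames(Ctx innermost, int errorLine) {
        List<String> out = new ArrayList<String>();
        int pending = errorLine;
        for (Ctx c = innermost; c != null; c = c.caller) {
            switch (c.kind) {
                case TEMPLATE:
                    // identifies a frame; its caller's line comes from a later context
                    out.add("template " + c.name + " at " + pending);
                    pending = -1;
                    break;
                case FUNCTION_CALL:
                    // does both: identifies a frame and gives the next line number
                    out.add("function " + c.name + " at " + pending);
                    pending = c.line;
                    break;
                case APPLY_TEMPLATES:
                case CALL_TEMPLATE:
                    // only gives the line number for the next frame
                    pending = c.line;
                    break;
                default:
                    break;
            }
        }
        return out;
    }

    public static List<String> demo() {
        // built from the outermost context inwards, since 'caller' is final
        Ctx root = new Ctx(Kind.TEMPLATE, "/", 0, null);
        Ctx apply1 = new Ctx(Kind.APPLY_TEMPLATES, null, 10, root);
        Ctx nana = new Ctx(Kind.TEMPLATE, "nana", 0, apply1);
        Ctx call = new Ctx(Kind.FUNCTION_CALL, "fg:fun", 18, nana);
        Ctx apply2 = new Ctx(Kind.APPLY_TEMPLATES, null, 36, call);
        Ctx otherwise = new Ctx(Kind.TEMPLATE, "otherwise", 0, apply2);
        return frames(otherwise, 47);
    }

    public static void main(String[] args) {
        for (String frame : demo()) {
            System.out.println(frame);
        }
    }
}
```

Each frame gets the line number handed over by the context visited just before it: the error location for the innermost template, then the apply-templates or function call location for each enclosing frame.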

FunctionFrame.java

/*
 * FunctionFrame.java
 * 
 * Created on Oct 31, 2007, 8:10:37 PM
 */

package saxon.xslt.stacktrace;

import javax.xml.namespace.QName;
import org.xml.sax.Locator;

/**
 * Represents an XSLT stack frame for a function.
 *
 * @author Florent Georges
 */
public class FunctionFrame
        extends StackFrame
{
    /**
     * Build a new function frame.
     * 
     * @param curr_item
     *             The path to the current node if any.
     * 
     * @param name
     *             The name of the function.
     * 
     * @param arity
     *             The arity of the function (its number of parameters).
     * 
     * @param locator
     *             The SAX locator for the function.
     */
    public FunctionFrame(String curr_item, QName name, int arity, Locator locator)
    {
        myPath = curr_item;
        myName = name;
        myArity = arity;
        myLocator = locator;
    }

    /**
     * Return the name of the function.
     */
    public QName getName()
    {
        return myName;
    }

    /**
     * Return the arity of the function (its number of parameters).
     */
    public int getArity()
    {
        return myArity;
    }

    @Override
    public String toString()
    {
        return "function " + displayQName(myName) + " #" + myArity;
    }

    public void acceptVisitor(StackVisitor visitor)
    {
        visitor.visitFunctionFrame(this);
        if ( getNext() != null ) {
            getNext().acceptVisitor(visitor);
        }
    }

    private QName myName;
    private int myArity;
}

TemplateFrame.java

/*
 * TemplateFrame.java
 * 
 * Created on Oct 31, 2007, 8:09:11 PM
 */

package saxon.xslt.stacktrace;

import javax.xml.namespace.QName;
import org.xml.sax.Locator;

/**
 * Represents an XSLT stack frame for a template.
 *
 * @author Florent Georges
 */
public class TemplateFrame
        extends StackFrame
{
    /**
     * Build a new template frame.
     * 
     * @param curr_item
     *             The path to the current node if any.
     * 
     * @param pattern
     *             The pattern the template matches, if it is a template rule.
     * 
     * @param name
     *             The name of the template, if it is a named template.
     * 
     * @param mode
     *             The mode of the template, if any.
     * 
     * @param locator
     *             The SAX locator for the template.
     */
    public TemplateFrame(String curr_item, String pattern, QName name, QName mode, Locator locator)
    {
        myPath = curr_item;
        myPattern = pattern;
        myName = name;
        myMode = mode;
        myLocator = locator;
        myCalled = false;
    }

    /**
     * Return the pattern of the template, if it is a template rule.
     */
    public String getPattern()
    {
        return myPattern;
    }

    /**
     * Return the name of the template if it is a named template.
     */
    public QName getName()
    {
        return myName;
    }

    /**
     * Return the mode's name of the template, if it is a template rule and has a mode.
     */
    public QName getMode()
    {
        return myMode;
    }

    /**
     * Set the mode of the template.
     */
    public void setMode(QName mode)
    {
        myMode = mode;
    }

    /**
     * Return {@code true} if the template was called, {@code false} if it was applied.
     */
    public boolean isCalled()
    {
        return myCalled;
    }

    /**
     * Set the {@code isCalled} property of the template (see {@link #isCalled()}).
     */
    public void setCalled(boolean called)
    {
        myCalled = called;
    }

    @Override
    public String toString()
    {
        StringBuilder buf = new StringBuilder("template ");
        if ( myName != null ) {
            buf.append(displayQName(myName)).append(' ');
        }
        if ( myMode != null ) {
            buf.append('#').append(displayQName(myMode)).append(' ');
        }
        if ( myPattern != null ) {
            buf.append("matching ").append('\"').append(myPattern).append('\"');
        }
        return buf.toString();
    }

    public void acceptVisitor(StackVisitor visitor)
    {
        visitor.visitTemplateFrame(this);
        if ( getNext() != null ) {
            getNext().acceptVisitor(visitor);
        }
    }

    private String myPattern;
    private QName myName;
    private QName myMode;
    private boolean myCalled;
}

StackVisitor.java

/*
 * StackVisitor.java
 * 
 * Created on Oct 31, 2007, 8:14:12 PM
 */

package saxon.xslt.stacktrace;

/**
 * Visitor that visits an XSLT stack.
 *
 * @see StackFrame#acceptVisitor(StackVisitor)
 * 
 * @author Florent Georges
 */
public interface StackVisitor
{
    /**
     * Visit a function frame.
     */
    public void visitFunctionFrame(FunctionFrame frame);

    /**
     * Visit a template frame.
     */
    public void visitTemplateFrame(TemplateFrame frame);
}

Main.java

/*
 * Main.java
 * 
 * Created on Oct 27, 2007, 1:21:32 PM
 */

package saxon.xslt.stacktrace;

import java.io.File;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.TransformerFactoryImpl;
import net.sf.saxon.trans.XPathException;

/**
 * Sample application that runs a stylesheet and outputs the XSLT stacktrace in case of error.
 *
 * <p>The stylesheet used is {@code src/saxon/xslt/stacktrace/style.xsl}.</p>
 * 
 * @author Florent Georges
 */
public class Main
{
    public static void main(String[] args)
            throws TransformerException
    {
        // instantiate Saxon directly: the static newInstance() is the generic JAXP lookup
        TransformerFactory factory = new TransformerFactoryImpl();
        Source style = new StreamSource(new File("src/saxon/xslt/stacktrace/style.xsl"));
        Transformer trans = factory.newTransformer(style);
        trans.setErrorListener(new NullErrorListener());
        try {
            trans.transform(style, new StreamResult(System.out));
        }
        catch ( XPathException ex ) {
            System.err.println(ex.getErrorCodeLocalPart() + ": " + ex.getMessage());
            StackFrame stack = StackFrame.makeStack(ex);
            DisplayerStackVisitor visitor = new DisplayerStackVisitor();
            // makeStack returns null when the exception carries no XPath context
            if ( stack != null ) {
                stack.acceptVisitor(visitor);
            }
            visitor.finish();
        }
    }

    /**
     * Does nothing, to suppress error messages from Saxon itself.
     */
    private static class NullErrorListener
            implements ErrorListener
    {
        public void warning(TransformerException ex)
                throws TransformerException {
        }
        public void error(TransformerException ex)
                throws TransformerException {
        }
        public void fatalError(TransformerException ex)
                throws TransformerException {
        }
    }

    /**
     * Stack visitor that displays the stack as text on {@code System.err}.
     */
    private static class DisplayerStackVisitor
            implements StackVisitor
    {
        public void visitFunctionFrame(FunctionFrame frame)
        {
            doVisit(frame);
            myHead = "  called ";
        }

        public void visitTemplateFrame(TemplateFrame frame)
        {
            doVisit(frame);
            myHead = frame.isCalled() ? "  called " : "  applied ";
        }

        public void finish()
        {
            System.err.println(myHead + "from external application");
        }

        private void doVisit(StackFrame frame)
        {
            int line = frame.getLocator().getLineNumber();
            int col = frame.getLocator().getColumnNumber();
            String pos = getFile(frame) + ":" + line + ( col < 0 ? "" : ":" + col );
            System.err.println(myHead + "in " + frame + " (at " + pos + ")");
            String path = frame.getPath();
            if ( path != null ) {
                System.err.println("    `-> " + path);
            }
        }

        private static String getFile(StackFrame frame)
        {
            String sysid = frame.getLocator().getSystemId();
            int slash = sysid.lastIndexOf('/');
            if ( slash < 0 ) {
                return sysid;
            }
            else {
                return sysid.substring(slash + 1);
            }
        }

        private String myHead = "  ";
    }
}
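A note on the paths printed under each template frame: the makePathTo helper numbers each step, and in an XPath step such as b[2] that number is the position among same-named siblings. The sketch below shows that numbering with the JDK's DOM API rather than Saxon's NodeInfo, so it runs without Saxon. The class DomPathDemo is hypothetical and only handles element and document nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomPathDemo {

    /** Path to an element, numbering each step among its same-named siblings. */
    public static String pathTo(Node node) {
        if (node == null || node.getNodeType() == Node.DOCUMENT_NODE) {
            return "/";
        }
        int pos = 1;
        for (Node s = node.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (s.getNodeType() == Node.ELEMENT_NODE
                    && s.getNodeName().equals(node.getNodeName())) {
                ++pos;
            }
        }
        String parent = pathTo(node.getParentNode());
        String step = node.getNodeName() + "[" + pos + "]";
        return "/".equals(parent) ? "/" + step : parent + "/" + step;
    }

    public static String demo() {
        try {
            String xml = "<a><b/><c><b/></c><b/></a>";
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // the last child of <a>: the second <b> among its siblings,
            // although it is the third <b> in document order
            Node lastB = doc.getDocumentElement().getLastChild();
            return pathTo(lastB);
        }
        catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Counting along the whole preceding axis instead of the preceding siblings would number that last element as the third b, which would not select it when the path is evaluated.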
