Tuesday, December 09, 2008

FXSL currying and nestable sequences

After an interesting discussion on the FXSL Help forum, the problem of currying and nested sequences showed up again. The FXSL project provides, among other things, first-class citizen functions. Basically, it represents a function as an element. When executing such a function, the dispatching to the code is done by applying templates on that element.

An interesting feature of FXSL is the ability to curry parameters to a function, to create an other function of a lesser order. The principle is to attach parameters to the function. This new function can then be used as any other function, with specified parameters bound to specified values.

The result of currying is then another first-class citizen function. So it has to be a node, because f:apply() applies templates on it to find the code to execute. And it has to be a single item, in order to be used as any other items (in particular its behaviour in sequence handling and atomization.) The later point makes it impossible to use a sequence as result of currying.

The approach taken by FXSL for now is to create an XML element as the result of f:curry(). This element contains several information: the child fun holds the curried function (may be itself a currying), cnArgs is the cardinality of the curried function and then the childs arg hold the curried values. For instance, the expression f:curry(my:add(), 2, 1024) will return the following element:

<f-curry:f-curry xmlns:f-curry="http://fxsl.sf.net/curry">
   <fun>
      <my:add xmlns:my="urn:X-FGeorges.org:tests:curry-sref.xsl"/>
   </fun>
   <cnArgs>2</cnArgs>
   <arg t="xs:integer">1024</arg>
</f-curry:f-curry>

This approach is convenient because we can use any structure we need to represent currying. Unfortunately, the semantics of adding items to an XML tree implies to copy nodes and to make nodes from atomic values. That means that if the curried argument is an XML element, piece of a whole document, it will be copied to the element representing currying. For example if the curried function uses the ancestor axis on this curried element, it will see the f-curry:f-curry element, instead of the ancestors in the original document. That was actually the problem reported by Christoph Lange on the FXSL forum.

And this leads to other problems related to identity. For instance, items are transformed to nodes. FXSL resolves that problem by recording the initial type in the currying structure, and convert the node back to that type. While this is ok for standard simple types, that can't be applied to user-defined simple types. Another example is for validated nodes; they loose their type annotations when added to the currying structure, which can be a problem for the curried function. You can find more about this topic in Type-preserving copy in XSLT 2.0.

Actually, all those problem could be solved with a simple feature that does not exist in standard XPath: the ability to nest sequence. If we could nest sequences, or if we had a special type of sequences that wouldn't atomize when added to another sequence, we could use them as the result of currying. Even if that's not a node anymore, we could adapt f:apply() to handle those particular sequences and use its, say, first item as the node to apply templates on.

The good news is that this is simple to implement such a sequence as a Java extension in Saxon. Here is a very simple implementation. I have called it SRef, for Sequence Reference. I guess we would need something more elaborated to be efficient and general-purpose, but this is just a proof-of-concept:

package org.fgeorges.saxon;

import java.util.ArrayList;
import java.util.List;
import net.sf.saxon.om.ArrayIterator;
import net.sf.saxon.om.Item;
import net.sf.saxon.om.SequenceIterator;
import net.sf.saxon.trans.XPathException;

/**
 * XPath sequence reference, or non-atomizable XPath sequence.
 *
 * @author Florent Georges - fgeorges.org
 * @date 2006-12-01
 */
public class SequenceRef
{
    public SequenceRef(SequenceIterator seq) throws XPathException
    {
        myIter = seq.getAnother();
    }

    public SequenceIterator getSequence()
    {
        return myIter;
    }

    static public boolean isSequenceRef(Object obj)
    {
        return obj instanceof SequenceRef;
    }

    @Override
    public String toString()
    {
        throw new RuntimeException("toString not supported, cannot be added to a tree!");
    }

    private SequenceIterator myIter = null;
}

This implementation in Java is coupled to an simple API in XPath. Three functions are created: sref:make-sref() takes a sequence and returns an sref for this sequence, sref:sequence() takes an sref and return the original sequence, and sref:is-sref() get an item and return true if it is an sref. The following XSLT module defines those functions:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:sref="http://www.fgeorges.org/xslt/sref"
                xmlns:impl="java:org.fgeorges.saxon.SequenceRef"
                exclude-result-prefixes="xs sref impl"
                version="2.0">

   <xsl:function name="sref:make-sref" as="item()">
      <xsl:param name="seq" as="item()*"/>
      <xsl:sequence select="impl:new($seq)"/>
   </xsl:function>

   <xsl:function name="sref:sequence" as="item()*">
      <xsl:param name="ref" as="item()"/>
      <xsl:sequence select="impl:getSequence($ref)"/>
   </xsl:function>

   <xsl:function name="sref:is-sref" as="xs:boolean">
      <xsl:param name="ref" as="item()"/>
      <xsl:sequence select="impl:isSequenceRef($ref)"/>
   </xsl:function>

   <xsl:function name="sref:atomize" as="item()*">
      <xsl:param name="seq" as="item()*"/>
      <xsl:sequence select="
          for $item in $seq return
            if ( sref:is-sref($item) ) then
              sref:sequence($item)
            else
              $item"/>
   </xsl:function>

   <xsl:function name="sref:deep-atomize" as="item()*">
      <xsl:param name="seq" as="item()*"/>
      <xsl:sequence select="
          for $item in $seq return
            if ( sref:is-sref($item) ) then
              sref:deep-atomize(sref:sequence($item))
            else
              $item"/>
   </xsl:function>

</xsl:stylesheet>

With those simple functions, it is then possible to modify f:curry() and f:apply() to support (to take advantage of) SRefs. The folowing is a simple example (supporting only currying a function of cardinality 2 with a single argument). I create a first-citizen function my:add() that takes two integers and returns their sum, I write new versions of f:apply() and f:curry(), then I call my:add() both directly and with currying:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:f="http://fxsl.sf.net/"
                xmlns:my="urn:X-FGeorges.org:tests:curry-sref.xsl"
                xmlns:sref="http://www.fgeorges.org/xslt/sref"
                xmlns:impl="java:org.fgeorges.saxon.SequenceRef"
                exclude-result-prefixes="xs f my sref impl"
                version="2.0">

   <xsl:import href="sref.xsl"/>

   <xsl:output indent="yes"/>

   <!--
      The my:add() first class function.
   -->
   <xsl:variable name="my:add" as="element()">
      <my:add/>
   </xsl:variable>

   <xsl:function name="my:add" as="node()">
      <xsl:sequence select="$my:add"/>
   </xsl:function>

   <xsl:function name="my:add" as="xs:integer">
      <xsl:param name="lhs" as="xs:integer"/>
      <xsl:param name="rhs" as="xs:integer"/>
      <xsl:sequence select="$lhs + $rhs"/>
   </xsl:function>

   <xsl:template match="my:add" mode="f:FXSL">
      <xsl:param name="arg1"/>
      <xsl:param name="arg2"/>
      <xsl:sequence select="my:add($arg1, $arg2)"/>
   </xsl:template>

   <!--
      Apply on SRefs.
   -->
   <xsl:function name="f:apply-sref">
      <xsl:param name="pFunc" as="item()"/>
      <xsl:param name="arg1" as="item()*"/>
      <xsl:variable name="seq" select="sref:sequence($pFunc)"/>
      <xsl:apply-templates select="$seq[1]" mode="f:FXSL">
         <xsl:with-param name="seq" select="$seq"/>
         <xsl:with-param name="arg1" select="$arg1"/>
      </xsl:apply-templates>
   </xsl:function>

   <!--
      Currying using SRefs.
   -->
   <xsl:function name="f:curry-sref" xmlns:f-c-s="http://fxsl.sf.net/curry-sref">
      <xsl:param name="pFun" as="node()"/>
      <xsl:param name="pNargs" as="xs:integer"/>
      <xsl:param name="arg1"/>
      <xsl:variable name="curry-fun" as="element()">
         <f-c-s:f-c-s/>
      </xsl:variable>
      <xsl:sequence select="
          sref:make-sref(($curry-fun, $pFun, $pNargs, sref:make-sref($arg1)))"/>
   </xsl:function>

   <xsl:template match="f-c-s:*" mode="f:FXSL"
       xmlns:f-c-s="http://fxsl.sf.net/curry-sref">
      <xsl:param name="seq" as="item()*"/>
      <xsl:param name="arg1" as="item()*"/>
      <xsl:apply-templates select="$seq[2]" mode="f:FXSL">
         <xsl:with-param name="arg1" select="sref:sequence($seq[position() gt 3])"/>
         <xsl:with-param name="arg2" select="$arg1"/>
      </xsl:apply-templates>      
   </xsl:template>

   <!--
      The testing template.
   -->
   <xsl:template match="/">
      <root>
         <test-1>
            <xsl:sequence select="my:add(512, 1024)"/>
         </test-1>
         <test-2>
            <xsl:variable name="fun" select="f:curry-sref(my:add(), 2, 1024)"/>
            <xsl:sequence select="f:apply-sref($fun, 512)"/>
         </test-2>
      </root>
   </xsl:template>

</xsl:stylesheet>

Thanks to Christoph Lange for the original problem and to Dimitre for his ideas.

Labels: , ,