Type-preserving copy in XSLT 2.0
Disclaimer
This post refers to FXSL because its currying functionality was the starting point and the context of the following thoughts. But there is no official link between this post and FXSL, so neither Dimitre nor Colin can be blamed for what is written here. I want to thank them both for all their valuable input; any remaining errors are mine alone.
The problem
A few months ago, I finally had a look at FXSL, a project that provides functions as first-class objects in XSLT. That opens up some very interesting possibilities, in particular a more functional programming style.
An interesting feature is the ability to curry a function: to bind some of its parameters and get another function that takes fewer parameters. The principle is to attach the parameters to the function. The new function can then be used like any other function, with the specified parameters bound to the specified values.
To achieve this goal, we need a complex structure, because we have to be able to retrieve the original function and each curried parameter. The first thing that comes to mind is a sequence of the needed items. But this is not possible: we want to use the resulting function like any other function object, for example to create a sequence of functions. As sequences cannot be nested, we would not be able to retrieve the new function after adding it to a sequence (only the individual items, no longer related to each other).
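The flattening of nested sequences can be seen in a few lines. This is only a minimal sketch: the function name 'my-func' and the bound values are made up for illustration, and the stylesheet is meant to be run with the initial template "main", without a source document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template name="main">
    <!-- A would-be "curried function": a (hypothetical) function name
         followed by two bound values. -->
    <xsl:variable name="curried" as="item()*" select="('my-func', 1, 2)"/>
    <!-- Putting two of them into one sequence flattens everything. -->
    <xsl:variable name="list" as="item()*" select="($curried, $curried)"/>
    <!-- Counts 6 individual items, not 2 nested sequences. -->
    <xsl:value-of select="count($list)"/>
  </xsl:template>

</xsl:stylesheet>
```

Once the items are in $list, nothing says where the first "function" ends and the second begins.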
Instead, FXSL uses a dynamically built element as a complex container. An element is at the same time a unique item and a complex structure, from which we may easily retrieve specific pieces of information.
But unlike sequences, the content of an element cannot reference an item. When we attach an item to a tree in XSLT, it is copied. A lot of properties are copied as is, but some change. The most obvious is that atomic values are no longer atomic values; they become text nodes. So it is not possible to know later whether we attached an atomic value or a text node, for example.
If we do nothing special, the type is changed too. It is always set to xs:untyped. But we want to preserve it, because it can change the result of the evaluation of the new function (with curried parameters).
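To make the loss concrete, here is a small sketch that copies an xs:double into an element and then inspects what can be read back. The element name holder is arbitrary, and the stylesheet assumes it is run with the initial template "main".

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <xsl:template name="main">
    <xsl:variable name="value" as="xs:double" select="xs:double(1.5)"/>
    <!-- Attaching the atomic value to a tree copies it as a text node. -->
    <xsl:variable name="holder" as="element()">
      <holder><xsl:copy-of select="$value"/></holder>
    </xsl:variable>
    <!-- The original value is an xs:double. -->
    <xsl:value-of select="$value instance of xs:double"/>
    <xsl:text> </xsl:text>
    <!-- Inside the element there is only a text node. -->
    <xsl:value-of select="$holder/node() instance of text()"/>
    <xsl:text> </xsl:text>
    <!-- Atomizing the untyped element gives xs:untypedAtomic... -->
    <xsl:value-of select="data($holder) instance of xs:untypedAtomic"/>
    <xsl:text> </xsl:text>
    <!-- ...and no longer an xs:double. -->
    <xsl:value-of select="data($holder) instance of xs:double"/>
  </xsl:template>

</xsl:stylesheet>
```

With a basic processor I would expect the output "true true true false": the value survives as text, but its type does not.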
Solution
The idea is to have two functions: f:copy-with-type, which takes a sequence of zero or more items as its argument and returns a node, and f:get-typed, which takes a node obtained from the former as its argument and returns a sequence of zero or more items:
<xsl:function name="f:copy-with-type" as="node()">
   <xsl:param name="arg" as="item()*"/>
   <copy>
      <!-- Still to implement... -->
   </copy>
</xsl:function>

<xsl:function name="f:get-typed" as="item()*">
   <xsl:param name="arg" as="element(copy)"/>
   <!-- Still to implement... -->
</xsl:function>
The solution differs depending on whether we are in Basic mode or Schema Aware (SA) mode. It also differs between nodes and atomic values.
For nodes in Basic mode, it is simple. A node can never have an annotation other than xs:untyped (xs:untypedAtomic for attributes). So just using xsl:copy-of is enough. In SA mode, XSLT 2.0 also has the solution: just use the [xsl:]validation attribute with the value "preserve". This preserves the type annotation of the copied nodes:
<!-- In Basic mode -->
<xsl:when test="$arg instance of node()">
   <node>
      <xsl:copy-of select="$arg"/>
   </node>
</xsl:when>

<!-- In Schema Aware mode -->
<xsl:when test="$arg instance of node()">
   <node xsl:validation="preserve">
      <xsl:copy-of select="$arg" validation="preserve"/>
   </node>
</xsl:when>
For atomic values, it is more complex. There is actually no way to say "I want to get the type of this atomic value and copy both the value and the type to the tree". The only way to simulate this is an xsl:choose on the type of the item (using instance of). In SA mode, we can use the [xsl:]type attribute to give the container element the same type as the item. But in Basic mode, it is impossible to set the type of a node to anything other than xs:untyped. Instead, we name the container element after the simple type. This name will act as a constructor function later (actually, these constructors are already defined in FXSL).
<!-- In Basic mode -->
<xsl:when test="$arg instance of xs:double">
   <f:double>
      <xsl:copy-of select="$arg"/>
   </f:double>
</xsl:when>

<!-- In Schema Aware mode -->
<xsl:when test="$arg instance of xs:double">
   <atomic xsl:type="xs:double">
      <xsl:copy-of select="$arg" validation="preserve"/>
   </atomic>
</xsl:when>
Below is what the whole solution looks like:
<!-- In Basic mode -->
<xsl:function name="f:get-typed" as="item()*">
   <xsl:param name="arg" as="element(copy)"/>
   <xsl:apply-templates select="$arg/*" mode="f:get-typed"/>
</xsl:function>

<!-- A copied node: return it as is. -->
<xsl:template match="node" mode="f:get-typed" as="node()">
   <xsl:sequence select="@*|node()"/>
</xsl:template>

<!-- A copied atomic value: the element name acts as the constructor. -->
<xsl:template match="f:*" mode="f:get-typed" as="item()">
   <xsl:sequence select="f:apply(., data(.))"/>
</xsl:template>

<xsl:function name="f:copy-with-type" as="node()">
   <xsl:param name="arg" as="item()*"/>
   <copy>
      <xsl:sequence select="for $a in $arg return f:copy-with-type-1($a)"/>
   </copy>
</xsl:function>

<xsl:function name="f:copy-with-type-1" as="node()">
   <xsl:param name="arg" as="item()"/>
   <xsl:choose>
      <xsl:when test="$arg instance of node()">
         <node>
            <xsl:copy-of select="$arg"/>
         </node>
      </xsl:when>
      <xsl:otherwise>
         <!-- xsl:when cannot be a direct child of xsl:otherwise,
              hence the nested xsl:choose. -->
         <xsl:choose>
            <xsl:when test="$arg instance of xs:a-basic-type">
               <f:a-basic-type>
                  <xsl:copy-of select="$arg"/>
               </f:a-basic-type>
            </xsl:when>
            <!-- One xsl:when per simple type here... -->
         </xsl:choose>
      </xsl:otherwise>
   </xsl:choose>
</xsl:function>

<!-- In SA mode -->
<xsl:function name="f:get-typed" as="item()*">
   <xsl:param name="arg" as="element(copy)"/>
   <xsl:apply-templates select="$arg/*" mode="f:get-typed"/>
</xsl:function>

<xsl:template match="node" mode="f:get-typed" as="node()">
   <xsl:sequence select="@*|node()"/>
</xsl:template>

<!-- The type annotation was preserved, so data() returns the typed value. -->
<xsl:template match="atomic" mode="f:get-typed" as="item()">
   <xsl:sequence select="data(.)"/>
</xsl:template>

<xsl:function name="f:copy-with-type" as="node()">
   <xsl:param name="arg" as="item()*"/>
   <copy xsl:validation="preserve">
      <xsl:sequence select="for $a in $arg return f:copy-with-type-1($a)"/>
   </copy>
</xsl:function>

<xsl:function name="f:copy-with-type-1" as="node()">
   <xsl:param name="arg" as="item()"/>
   <xsl:choose>
      <xsl:when test="$arg instance of node()">
         <node xsl:validation="preserve">
            <xsl:copy-of select="$arg" validation="preserve"/>
         </node>
      </xsl:when>
      <xsl:otherwise>
         <xsl:choose>
            <xsl:when test="$arg instance of xs:a-type">
               <atomic xsl:type="xs:a-type">
                  <xsl:copy-of select="$arg" validation="preserve"/>
               </atomic>
            </xsl:when>
            <!-- One xsl:when per simple type here... -->
         </xsl:choose>
      </xsl:otherwise>
   </xsl:choose>
</xsl:function>
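To try the Basic-mode idea without FXSL, here is a self-contained round-trip sketch. It is not the code from the files listed in this post: the my namespace and its URI are hypothetical, only four atomic types are handled, and the built-in xs: constructor functions are called directly instead of going through f:apply. Run it with the initial template "main".

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:typed-copy"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Wrap each item in an element named after its type. -->
  <xsl:function name="my:copy-with-type" as="element(copy)">
    <xsl:param name="arg" as="item()*"/>
    <copy>
      <xsl:for-each select="$arg">
        <xsl:choose>
          <xsl:when test=". instance of node()">
            <node><xsl:copy-of select="."/></node>
          </xsl:when>
          <!-- xs:integer is derived from xs:decimal, so test it first. -->
          <xsl:when test=". instance of xs:integer">
            <my:integer><xsl:value-of select="."/></my:integer>
          </xsl:when>
          <xsl:when test=". instance of xs:decimal">
            <my:decimal><xsl:value-of select="."/></my:decimal>
          </xsl:when>
          <xsl:when test=". instance of xs:double">
            <my:double><xsl:value-of select="."/></my:double>
          </xsl:when>
          <!-- Simplification: every other atomic type comes back as a string. -->
          <xsl:otherwise>
            <my:string><xsl:value-of select="."/></my:string>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </copy>
  </xsl:function>

  <!-- Rebuild the items from the wrapper elements. -->
  <xsl:function name="my:get-typed" as="item()*">
    <xsl:param name="arg" as="element(copy)"/>
    <xsl:apply-templates select="$arg/*" mode="my:get-typed"/>
  </xsl:function>

  <xsl:template match="node" mode="my:get-typed" as="node()*">
    <xsl:sequence select="@*|node()"/>
  </xsl:template>
  <xsl:template match="my:integer" mode="my:get-typed" as="xs:integer">
    <xsl:sequence select="xs:integer(.)"/>
  </xsl:template>
  <xsl:template match="my:decimal" mode="my:get-typed" as="xs:decimal">
    <xsl:sequence select="xs:decimal(.)"/>
  </xsl:template>
  <xsl:template match="my:double" mode="my:get-typed" as="xs:double">
    <xsl:sequence select="xs:double(.)"/>
  </xsl:template>
  <xsl:template match="my:string" mode="my:get-typed" as="xs:string">
    <xsl:sequence select="xs:string(.)"/>
  </xsl:template>

  <xsl:template name="main">
    <xsl:variable name="packed" as="element(copy)"
                  select="my:copy-with-type((42, 1.5, 1.5e0, 'abc'))"/>
    <xsl:variable name="back" as="item()*" select="my:get-typed($packed)"/>
    <!-- Check that each item came back with its original type. -->
    <xsl:value-of select="$back[1] instance of xs:integer,
                          $back[2] instance of xs:decimal,
                          $back[3] instance of xs:double,
                          $back[4] instance of xs:string"/>
  </xsl:template>

</xsl:stylesheet>
```

The order of the xsl:when branches matters: because an xs:integer is also an instance of xs:decimal, derived types have to be tested before their base types, otherwise the more specific type is lost.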
For the actual complete files, you can go to:
- f/func-copy.xsl.xi
- f/func-copy.xsl
- data/func-copy-make-whens.xsl
- data/func-copy-whens.xml
- data/xs-simple-types.xml
Problem & Future
Of course, there is a problem with atomic items in SA mode. Because we use an xsl:choose, we have to know all the possible types statically. For the standard types this is not a problem, but the solution is not usable as is with user-defined types.
Two compatible techniques could help us live with this restriction. The first is to combine the import mechanism of XSLT with the ability to define functions as first-class objects. If we think about facilities to define resolver functions by namespace (i.e. by piece of XML Schema), that could result in a flexible system.
The second technique is to use a generator for pieces of XSLT code. Actually, I use such a simple generator to generate both xsl:choose elements in full (with one xsl:when per atomic type). The input is an ad hoc document that lists the standard simple types an XSLT processor has to know. But we could perhaps write a generator that takes XML Schemas as input.
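As an illustration of the generator idea, here is a rough sketch for the Basic-mode branches. It is not the generator from the files listed above. It assumes a hypothetical input of the form <types><type name="integer"/><type name="decimal"/></types>, which is not necessarily the format of the real type list, and the namespace URI bound to f below is a placeholder.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:out="urn:example:xsl-alias"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:f">

  <!-- Elements written with the out: prefix end up in the XSLT namespace
       in the generated code. -->
  <xsl:namespace-alias stylesheet-prefix="out" result-prefix="xsl"/>
  <xsl:output indent="yes"/>

  <!-- Hypothetical input: <types><type name="..."/>...</types> -->
  <xsl:template match="/types">
    <out:choose>
      <xsl:for-each select="type">
        <!-- One generated xsl:when per listed simple type. -->
        <out:when test="$arg instance of xs:{@name}">
          <!-- The container element is named after the type. -->
          <xsl:element name="f:{@name}">
            <out:copy-of select="$arg"/>
          </xsl:element>
        </out:when>
      </xsl:for-each>
    </out:choose>
  </xsl:template>

</xsl:stylesheet>
```

The generated xsl:when elements follow the order of the input list, so the list should name derived types before their base types (xs:integer before xs:decimal, for example).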
I hope this will be the subject of another post.
Labels: xslt