Introduction
After having released a first implementation of the EXPath
Packaging System for eXist, here is a version for Saxon. You can
read this previous blog entry to get more information on the packaging
system; in particular, it says: "The concept is quite simple:
defining a package format to enable users to install libraries in
their processor with just a few clicks, and to enable library authors
to provide a single package to be installed on every processors,
without the need to document (and maintain) the installation process
for each of them."
The package manager for Saxon is a graphical application (a textual
front-end will be provided soon) and is provided as a single JAR
file. Go to the implementations page, or use the direct link, to get
the JAR. Run it as usual, for instance by double-clicking on it or by
executing the command java -jar expath-pkg-saxon-0.1.jar. That will
launch the package manager window.
Repositories
The implementation for Saxon differs from the one for eXist in a
fundamental way: Saxon does not have a home directory where you can
put the installed packages, and you can invoke Saxon in many
different ways (while the eXist core is always started the same way).
Package management with Saxon therefore has two distinct parts: the
package manager itself, which installs and removes packages, and a
way to configure Saxon to use them, regardless of how you invoke it.
Because Saxon has no home directory, we also need to introduce the
concept of a package repository.
A repository is a directory dedicated to installing packages, and
should only be modified through the package manager. It contains the
packages themselves (in a form usable by Saxon) as well as the
administrative information needed to use them (catalogs, for
instance). The graphical package manager lets you create a new
repository directly from its interface, and switch between
repositories (if you need to maintain several repositories for
different purposes).
Importing stylesheet
But as I said above, having a repository full of packages is not
enough. You have to configure Saxon to use this repository. Because
you can invoke Saxon in plenty of ways, the configuration itself is
implemented as a Java helper class that you can use in your own code
if you invoke Saxon from within Java (for instance in a Java EE web
application). If you use Saxon from the command line, there is a
script that takes care of configuring everything for you.
But before looking in detail at how to configure Saxon to use a
repository, let's have a look at how a stylesheet can use an installed
package. This is the whole point of the packaging system, after all.
The goal is simply to be able to use a public import URI in
an import statement, this URI being automatically resolved to its
local copy in the repository. Just as a namespace URI is only a kind
of identifier (it is used as a plain string; your processor does not
try to access anything at that address), the public import URI is an
identifier for a specific stylesheet. This mechanism also supports
functions implemented in Java. So all you need to do is use this
public URI, as in the following:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:h="http://www.example.org/hello"
                version="2.0">
   <xsl:import href="http://www.example.org/hello.xsl"/>
   <xsl:template ...>
      ...
      <xsl:value-of select="h:hello('world')"/>
      ...
   </xsl:template>
</xsl:stylesheet>
For XQuery, this is a bit different since XQuery has its own module
system, but the principle is very similar. XQuery library modules
are identified by their namespace URI, which can once again be seen
as a public identifier for the module. So let's say we have an
XQuery library module for the namespace URI
http://www.example.org/hello; you can then simply write a module
that imports it as follows:
import module namespace h = "http://www.example.org/hello";
h:hello('world')
And that's it! In the package samples section below, you
can see complete examples of such importing stylesheets and queries,
as well as the packages they use.
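To make the idea of resolving a public import URI to a local copy
more concrete, here is a minimal sketch using the standard JAXP
URIResolver interface. It is not the resolver shipped with EXPath
Pkg (which, as far as this post describes it, relies on catalogs
stored in the repository); the class name PublicUriResolver and its
register method are hypothetical, and a plain map stands in for the
catalog.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical illustration: maps public import URIs to local files. */
public class PublicUriResolver implements URIResolver {

    // Stand-in for the repository catalog: public URI -> local copy.
    private final Map<String, File> localCopies = new HashMap<>();

    /** Declares that the given public URI is served by the given local file. */
    public void register(String publicUri, File localFile) {
        localCopies.put(publicUri, localFile);
    }

    /**
     * Called by the XSLT processor for xsl:import and xsl:include.
     * The href is treated as an opaque identifier; nothing is fetched
     * from the network for a registered URI.
     */
    @Override
    public Source resolve(String href, String base) {
        File local = localCopies.get(href);
        if (local == null) {
            // Unknown URI: returning null lets the processor resolve it as usual.
            return null;
        }
        return new StreamSource(local);
    }
}
```

Such a resolver could be installed with the standard
TransformerFactory.setURIResolver() method before compiling the
importing stylesheet.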
Java configuration
To configure Saxon to use a repository from Java, you need to get a
Configuration object. This is a central class in Saxon, which is used
almost everywhere in the Saxon code base. You can get it from a Saxon
TransformerFactory or from an S9API Processor. With that object on
the one hand, and a File object pointing to the repository directory
on the other hand, you can just call:
File repo = ...;
Configuration config = ...;
ConfigHelper helper = new ConfigHelper(repo);
helper.config(config);
Besides the Java code itself, you have to make sure (1) that there is
an actual repository at the location you pass to the ConfigHelper
constructor, and (2) that the JAR files containing the extension
functions written in Java, as well as the JAR files they use, are on
your classpath. The only exception to this rule is when you register
such an extension function (written in Java) with Saxon 9.2; in this
case EXPath Pkg will try to dynamically add the JAR files from the
repository to the classpath. But playing with the classpath at
runtime is not something I would recommend in Java.
Shell script
When using Saxon from the command line, EXPath Pkg comes with an
alternate class to launch Saxon (this class automatically uses
ConfigHelper to configure Saxon) as well as with a shell script to
launch Saxon with the correct classpath.
To use this shell script (only available on Unix-like systems for
now, including Cygwin under Windows) you have to set the environment
variables SAXON_HOME to the directory where you put the Saxon JAR
files, EXPATH_PKG_JAR to the EXPath Pkg JAR file, and
APACHE_XML_RESOLVER_JAR to the XML Resolver JAR file from Apache.
Additionally, you can set EXPATH_REPO to the repository directory, so
that you do not have to give it explicitly as an option each time you
invoke Saxon. If all the above environment variables have been
correctly set, and the script added to your PATH, you can just invoke
Saxon as usual: saxon -s:source.xml -xsl:stylesheet.xsl.
Use saxon --help to get the usage help of this script. You can set
the EXPath repository (and thus override EXPATH_REPO if it is set)
with the option --repo=. You can add items to the classpath with the
option --add-cp=. You can set the classpath (so overriding SAXON_HOME
and other environment variables) with the option --cp=. The script
detects if Saxon SA is present, and if so will use the SA version.
You can force either the B or the SA version with either --b or --sa.
You can also set any option of the Java Virtual Machine by using
--java=, for instance to set a system property, and --mem= to set the
amount of memory of the virtual machine (shortcut for the Java option
-Xmx). And finally, you can also set the HTTP and HTTPS proxy
information with --proxy=host:port (for instance
--proxy=proxyhost:8080).
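If you invoke Saxon from Java rather than through the script, a
comparable proxy setup can be sketched with the standard Java
networking system properties. That the script's --proxy option ends
up setting these same properties is an assumption (the script is not
shown here), and the class name ProxySettings and its apply method
are hypothetical.

```java
/** Hypothetical helper: applies a "host:port" proxy to the JVM. */
public class ProxySettings {

    /**
     * Sets the standard http.* and https.* proxy system properties
     * from a single "host:port" string.
     */
    public static void apply(String hostAndPort) {
        int colon = hostAndPort.lastIndexOf(':');
        if (colon <= 0 || colon == hostAndPort.length() - 1) {
            throw new IllegalArgumentException(
                    "Expected host:port, got: " + hostAndPort);
        }
        String host = hostAndPort.substring(0, colon);
        String port = hostAndPort.substring(colon + 1);
        // Fail fast (NumberFormatException) if the port is not numeric.
        Integer.parseInt(port);
        for (String scheme : new String[] { "http", "https" }) {
            System.setProperty(scheme + ".proxyHost", host);
            System.setProperty(scheme + ".proxyPort", port);
        }
    }
}
```

For instance, ProxySettings.apply("proxyhost:8080") would be called
once, before any stylesheet or query tries to access the network.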
Package samples
The first example is a packaged version of Priscilla Walmsley's
FunctX. This package contains both the XSLT and the XQuery versions
of this library. Of course, the XQuery module defines a module
namespace, but the XSLT stylesheet does not have any public import
URI (as this concept is outside the scope of the standard). I chose
the URI http://www.functx.com/functx-1.0.xsl, but keep in mind this
is not official by any means; it is just the URI I chose. The intent
is that library authors package their own libraries and choose the
public URIs themselves.
The package itself is a plain ZIP file. If you open it or unzip it
with your preferred tool, you can see that at the top level, there is
a file named expath-pkg.xml. This is the package descriptor, which
defines what the package contains (or at least what is publicly
exported from the package, that is, what can be used from within a
stylesheet or a query). In the case of this FunctX package, this
descriptor looks like:
<package xmlns="http://expath.org/mod/expath-pkg">
   <module version="1.0" name="functx">
      <title>FunctX library for XQuery 1.0 and XSLT 2.0</title>
      <xsl>
         <import-uri>http://www.functx.com/functx-1.0.xsl</import-uri>
         <file>functx-1.0-doc-2007-01.xsl</file>
      </xsl>
      <xquery>
         <namespace>http://www.functx.com</namespace>
         <file>functx-1.0-doc-2007-01.xq</file>
      </xquery>
   </module>
</package>
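If you want to inspect a package before installing it, the
descriptor can be read with nothing but the Java standard library.
The following is only a sketch of that idea, not part of EXPath Pkg:
the class DescriptorReader and its methods are hypothetical, while
the file name expath-pkg.xml, the namespace and the element names
come from the descriptor shown above.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper: lists what a package descriptor exports. */
public class DescriptorReader {

    private static final String PKG_NS = "http://expath.org/mod/expath-pkg";

    /** Returns the text of every descriptor element with this local name. */
    public static List<String> values(InputStream descriptor, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the descriptor uses a default namespace
            Document doc = factory.newDocumentBuilder().parse(descriptor);
            NodeList nodes = doc.getElementsByTagNameNS(PKG_NS, localName);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot read package descriptor", e);
        }
    }

    /** Same as values(), reading expath-pkg.xml from the top level of the ZIP. */
    public static List<String> valuesFromPackage(File pkg, String localName) {
        try (ZipFile zip = new ZipFile(pkg)) {
            ZipEntry entry = zip.getEntry("expath-pkg.xml");
            if (entry == null) {
                throw new IllegalStateException("No expath-pkg.xml in " + pkg);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return values(in, localName);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling valuesFromPackage with "import-uri" or "namespace" would then
list the public URIs that stylesheets and queries can import from
that package.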
To install the package, just download it to a temporary location,
launch the package manager as explained at the beginning of this blog
post, choose "install" in the file menu, and choose the package on
your filesystem. To test if it is correctly installed, write the
following stylesheet:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:f="http://www.functx.com"
                version="2.0">
   <xsl:import href="http://www.functx.com/functx-1.0.xsl"/>
   <xsl:template match="/" name="main">
      <result>
         <xsl:sequence select="f:date(1979, 9, 1)"/>
      </result>
   </xsl:template>
</xsl:stylesheet>
and/or the following XQuery main module (depending on what you want
to test):
import module namespace f = "http://www.functx.com";
<result> {
f:date(1979, 9, 1)
}
</result>
To evaluate them, make sure you configured the shell script
correctly, as explained above, then open a shell and type one of the
following commands (or both), where style.xsl is the file where you
saved the above stylesheet and query.xq is the file where you saved
the above query:
$ saxon -xsl:style.xsl -it:main
<result>1979-09-01</result>
$ saxon --xq query.xq
<result>1979-09-01</result>
$
If you prefer to test from Java, just write a simple main class
that evaluates the above stylesheet and/or query, taking care to use
ConfigHelper to set up the Saxon Configuration object. For instance,
if you want to use the S9API, you can configure the Processor object
as follows (don't forget to add the EXPath Pkg and the Apache XML
Resolver JAR files to your classpath):
File repo = new File("...");
ConfigHelper helper = new ConfigHelper(repo);
Processor proc = new Processor(false);
helper.config(proc.getUnderlyingConfiguration());
The second sample package provides a single function:
ext:hello($who). It is written in Java. Besides other stuff related
to the packaging itself, it contains a JAR file with the
implementation of that extension function. To test it, just follow
the same steps as for the FunctX package, except that you have to add
the installed JAR file (from within the repository) to your classpath
(this is done automatically for you if you use the shell script, but
not if you test it from a Java program).
Conclusion
This is just a prototype implementation of a package manager for
Saxon, which is consistent with the one for eXist. The main issue is
the configuration of the classpath, but I think this is best left to
the user rather than having the package manager deal with it, in
particular within the context of a Java EE application. This issue
also shows up in your IDE configuration. For now, I configure oXygen
by adding the catalogs from the repository to oXygen's main catalog
list, and the extension JAR files to the oXygen classpath, so the
built-in Saxon processors can be used exactly as usual. But such
issues could be resolved by native support directly in the processors
and IDEs.
Besides this classpath issue, I am convinced that package
management will really improve the current situation, and could
be the missing piece for distributing real general-purpose libraries
for XQuery and XSLT, as well as one of the foundations for other
systems, like an implementation-independent XRX system.
Labels: expath, saxon, xquery, xslt