EXPath Packaging System prototype implementation for Saxon
Introduction
After releasing a first implementation of the EXPath Packaging System for eXist, here is a version for Saxon. You can read the previous blog entry to get more information on the packaging system; in particular, it says: "The concept is quite simple: defining a package format to enable users to install libraries in their processor with just a few clicks, and to enable library authors to provide a single package to be installed on every processors, without the need to document (and maintain) the installation process for each of them."
The package manager for Saxon is a graphical application (a textual front-end will be provided soon) and is provided as a single JAR file. Go to the implementations page, or use the direct link, to get the JAR. Run it as usual, for instance by double-clicking on it or by executing the command java -jar expath-pkg-saxon-0.1.jar. That will launch the package manager window.
Repositories
The implementation for Saxon differs from the one for eXist in a fundamental way: Saxon does not have a home directory where you can put the installed packages, and you can invoke Saxon in many different ways (while the eXist core is always started the same way). As a result, package management with Saxon has two distinct aspects: the package manager itself, which installs and removes packages, and a way to configure Saxon, regardless of the way you invoke it. In addition, because Saxon has no home directory, we need to introduce the concept of a package repository.
A repository is a directory dedicated to installing packages, and should only be modified through the package manager. It contains the packages themselves (in a form usable by Saxon) as well as the administrative information needed to use them (like catalogs, etc.). The graphical package manager lets you create a new repository directly from the graphical interface, and switch between different repositories (if you need to maintain several repositories for several purposes).
Importing stylesheet
But as I said above, having a repository full of packages is not enough. You have to configure Saxon to use this repository. Because you can invoke Saxon in plenty of ways, the configuration itself is implemented as a Java helper class that you can use in your own code if you invoke Saxon from within Java (for instance in a Java EE web application). If you use Saxon from the command line, there is a script that takes care of configuring everything for you.
But before looking in detail at how to configure Saxon to use a repository, let's have a look at how a stylesheet can use an installed package. This is the whole point of the packaging system, after all. The goal is simply to be able to use a public import URI in an import statement, this URI being automatically resolved to its local copy in the repository. Just as a namespace URI is only a kind of identifier (it is used as a string, and your processor does not try to actually access anything at that address), the public import URI is an identifier for a specific stylesheet. This mechanism also supports functions implemented in Java. So all you need to do is to use this public URI, like the following:
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:h="http://www.example.org/hello"
                    version="2.0">

       <xsl:import href="http://www.example.org/hello.xsl"/>

       <xsl:template ...>
          ...
          <xsl:value-of select="h:hello('world')"/>
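To make the resolution idea more concrete, here is a minimal Java sketch of mapping a public import URI to a local copy, using the standard JAXP URIResolver interface. This is only an illustration of the principle, not how EXPath Pkg is implemented (the repository relies on catalogs); the class name PublicUriResolver and any local path you register are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration only: maps public import URIs to local copies.
final class PublicUriResolver implements URIResolver {

    // public import URI -> system ID of the local copy (e.g. a file: URI)
    private final Map<String, String> localCopies = new HashMap<>();

    // Declares that 'publicUri' must be served from 'localSystemId'.
    void register(String publicUri, String localSystemId) {
        localCopies.put(publicUri, localSystemId);
    }

    @Override
    public Source resolve(String href, String base) {
        String local = localCopies.get(href);
        // Returning null lets the processor fall back to its default resolution.
        return local == null ? null : new StreamSource(local);
    }
}
```

Such a resolver could be set on a JAXP TransformerFactory with setURIResolver() before compiling the importing stylesheet, so that the xsl:import above is served from the local copy instead of the network.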
For XQuery, this is a bit different, as XQuery does have a module system. But it is actually very similar. XQuery library modules are identified by their namespace URI. Once again, it can be seen as a public identifier for that XQuery module. So let's say we have an XQuery library module for the namespace URI http://www.example.org/hello; you can then simply write a module that imports it as follows:
    import module namespace h = "http://www.example.org/hello";

    h:hello('world')

And that's it! In the package samples section below, you can see complete examples of such importing stylesheets and queries, as well as the packages they use.
Java configuration
To configure Saxon to use a repository from Java, you need to get a Configuration object. This is a central class in Saxon, which is used almost everywhere in the Saxon code base. You can get it from a Saxon TransformerFactory or from an S9API Processor. With that object on the one hand, and a File object pointing to the repository directory on the other hand, you can just call:
    // the repo directory
    File repo = ...;
    // the Saxon config object
    Configuration config = ...;
    // the EXPath Pkg configurer
    ConfigHelper helper = new ConfigHelper(repo);
    // actually configure Saxon
    helper.config(config);
Besides the Java code itself, you have to make sure that 1/ there is an actual repository at the location you pass to the ConfigHelper constructor and 2/ the JAR files containing, or used by, the extension functions written in Java are on your classpath. The only exception to this rule is when you register such an extension function (written in Java) with Saxon 9.2; in this case EXPath Pkg will try to dynamically add the JAR files from the repository to the classpath. But playing with the classpath at runtime is not something I would recommend in Java.
Shell script
When using Saxon from the command line, EXPath Pkg comes with an alternate class to launch Saxon (this class automatically uses ConfigHelper to configure Saxon) as well as with a shell script to launch Saxon with the correct classpath.
To use this shell script (only available on Unix-like systems for now, including Cygwin under Windows) you have to set the environment variables SAXON_HOME to the directory where you put the Saxon JAR files, EXPATH_PKG_JAR to the EXPath Pkg JAR file, and APACHE_XML_RESOLVER_JAR to the XML Resolver JAR file from Apache. Additionally, you can set EXPATH_REPO to the repository directory, so that you do not have to give it explicitly as an option each time you invoke Saxon. If all the above environment variables have been correctly set, and the script added to your PATH, you can just invoke Saxon as usual: saxon -s:source.xml -xsl:stylesheet.xsl.
Use saxon --help to get the usage help of this script.
You can set the EXPath repository (and thus override EXPATH_REPO if it is set) with the option --repo=. You can add items to the classpath with the option --add-cp=. You can set the classpath (thus overriding SAXON_HOME and the other environment variables) with the option --cp=. The script detects whether Saxon SA is present, and if so uses the SA version. You can force either the B or the SA version with --b or --sa, respectively. You can also pass any option to the Java Virtual Machine by using --java=, for instance to set a system property, and use --mem= to set the amount of memory of the virtual machine (a shortcut for the Java option -Xmx). And finally, you can also set the HTTP and HTTPS proxy information with --proxy=host:port (for instance --proxy=proxyhost:8080).
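The precedence between the --repo= option and the EXPATH_REPO environment variable can be sketched in a few lines of Java. This is only an illustration of the rule described above, not the actual logic of the shell script; the class name RepoLocation is hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the precedence described above; not the actual script logic.
final class RepoLocation {

    private static final String REPO_OPTION = "--repo=";

    // Returns the repository directory: a --repo= option on the command line wins,
    // otherwise the EXPATH_REPO environment variable is used, if it is set.
    static Optional<String> resolve(String[] args, Map<String, String> env) {
        for (String arg : args) {
            if (arg.startsWith(REPO_OPTION)) {
                return Optional.of(arg.substring(REPO_OPTION.length()));
            }
        }
        return Optional.ofNullable(env.get("EXPATH_REPO"));
    }

    public static void main(String[] args) {
        // Prints the repository that would be used for this invocation.
        System.out.println(resolve(args, System.getenv()).orElse("(no repository configured)"));
    }
}
```

Passing the environment as a plain map, rather than calling System.getenv() inside resolve(), keeps the rule easy to check in isolation.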
Package samples
The first example is a packaged version of Priscilla Walmsley's FunctX. This package contains both the XSLT and the XQuery versions of this library. Of course, the XQuery module defines a module namespace, but the XSLT stylesheet does not have any public import URI (as this is beyond what the standard defines). I chose the URI http://www.functx.com/functx-1.0.xsl, but keep in mind this is not official by any means; this is just the URI I chose. It is intended that library authors package their own libraries and choose the public URIs themselves.
The package itself is a plain ZIP file. If you open it or unzip it with your preferred tool, you can see that at the top level, there is a file named expath-pkg.xml. This is the package descriptor, which defines what the package contains (or at least what is publicly exported from the package, that is, what can be used from within a stylesheet or a query). In the case of this FunctX package, this descriptor looks like:
    <package xmlns="http://expath.org/mod/expath-pkg">
       <module version="1.0" name="functx">
          <title>FunctX library for XQuery 1.0 and XSLT 2.0</title>
          <xsl>
             <import-uri>http://www.functx.com/functx-1.0.xsl</import-uri>
             <file>functx-1.0-doc-2007-01.xsl</file>
          </xsl>
          <xquery>
             <namespace>http://www.functx.com</namespace>
             <file>functx-1.0-doc-2007-01.xq</file>
          </xquery>
       </module>
    </package>
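To show what a tool can get out of such a descriptor, here is a small Java sketch that reads it with the standard DOM API and lists each public identifier (XSLT import URI or XQuery namespace) with the file it maps to. It assumes the descriptor layout shown above and nothing more; the class name DescriptorReader is hypothetical and this is not the code used by the package manager.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: extracts "public identifier -> file" pairs from a descriptor.
final class DescriptorReader {

    private static final String PKG_NS = "http://expath.org/mod/expath-pkg";

    // Maps each XSLT import URI and each XQuery namespace to its file name.
    static Map<String, String> exportedFiles(String descriptorXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(descriptorXml)));
            Map<String, String> result = new LinkedHashMap<>();
            collect(doc, "xsl", "import-uri", result);
            collect(doc, "xquery", "namespace", result);
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read package descriptor", e);
        }
    }

    // Adds one entry per <component> element that has both an identifier and a file.
    private static void collect(Document doc, String component, String idElement,
                                Map<String, String> result) {
        NodeList components = doc.getElementsByTagNameNS(PKG_NS, component);
        for (int i = 0; i < components.getLength(); i++) {
            Element elem = (Element) components.item(i);
            String id = text(elem, idElement);
            String file = text(elem, "file");
            if (id != null && file != null) {
                result.put(id, file);
            }
        }
    }

    // Returns the trimmed text of the first matching descendant, or null if there is none.
    private static String text(Element parent, String localName) {
        NodeList nodes = parent.getElementsByTagNameNS(PKG_NS, localName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }
}
```

For the FunctX descriptor, the resulting map has two entries: one for the import URI http://www.functx.com/functx-1.0.xsl and one for the namespace http://www.functx.com.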
To install the package, just download it to a temporary location, launch the package manager as explained at the beginning of this blog post, choose "install" in the file menu, and choose the package on your filesystem. To test if it is correctly installed, write the following stylesheet:
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:f="http://www.functx.com"
                    version="2.0">

       <xsl:import href="http://www.functx.com/functx-1.0.xsl"/>

       <xsl:template match="/" name="main">
          <result>
             <xsl:sequence select="f:date(1979, 9, 1)"/>
          </result>
       </xsl:template>

    </xsl:stylesheet>
and/or the following XQuery main module (depending on what you want to test):
    import module namespace f = "http://www.functx.com";

    <result> {
       f:date(1979, 9, 1)
    }
    </result>

To evaluate them, make sure you configured the shell script correctly, as explained above, then open a shell and type one of the following commands (or both), where style.xsl is the file where you saved the above stylesheet and query.xq is the file where you saved the above query:

    $ saxon -xsl:style.xsl -it:main
    <result>1979-09-01</result>
    $ saxon --xq query.xq
    <result>1979-09-01</result>
    $
If you prefer to test from Java, just write a simple main class that evaluates the above stylesheet and/or query, taking care to use ConfigHelper to set up the Saxon Configuration object. For instance, if you want to use the S9API, you can configure the Processor object as follows (don't forget to add the EXPath Pkg and the Apache XML resolver JAR files to your classpath):
    // the repo directory
    File repo = new File("...");
    // the EXPath Pkg configurer
    ConfigHelper helper = new ConfigHelper(repo);
    // the Saxon processor
    Processor proc = new Processor(false);
    // actually configure Saxon
    helper.config(proc.getUnderlyingConfiguration());
    // then use 'proc' as usual...
The second sample package provides a single function: ext:hello($who). It is written in Java. Besides other material related to the packaging itself, the package contains a JAR file with the implementation of that extension function. To test it, just follow the same steps as for the FunctX package, except that you have to add the installed JAR file (from within the repository) to your classpath (this is done automatically for you if you use the shell script, but not if you test it from a Java program).
Conclusion
This is just a prototype implementation of a package manager for Saxon, consistent with the one for eXist. The main issue is the configuration of the classpath, but I think this is best left to the user rather than handled by the package manager, in particular within the context of a Java EE application. This issue also shows up in your IDE configuration. For now, I configure oXygen by adding the catalogs from the repository to oXygen's main catalog list, and the extension JAR files to the oXygen classpath, so the built-in Saxon processors can be used exactly as usual. But such issues could be resolved by native support built right into the processors and IDEs.
Besides this classpath issue, I am convinced that package management will really improve the current situation. It may be the missing piece needed to distribute real general-purpose libraries for XQuery and XSLT, and one of the bases for other systems, like an implementation-independent XRX system.