Kowari Descriptors

Introduction

Descriptors are composed of several components:
A Descriptor is therefore an XSL stylesheet that performs a specific, well defined task.  Useful tasks include
Advantages to working with descriptors over using an API include
Disadvantages when Developing Descriptors are:
Disadvantages when Using Descriptors are:

Namespaces

NOTE all tags must be in a specific namespace in order to be recognized as XSL extensions.  When using descriptors created by the Descriptor Wizard, a prefix of kowariDescriptor should be used, e.g.

<kowariDescriptor:query>
    perform Kowari Query
</kowariDescriptor:query>


<kowariDescriptor:descriptor
    call Kowari Descriptor
/>

The documentation below leaves out these namespace prefixes for clarity.

XSL query Tag

The query tag is an extension to XSL that enables Kowari queries to be made from an XSL stylesheet.  

<query>
<![CDATA[
select $person from <rmi://host/server1#people> where $person <http://foo#hasName> 'James Gosling';
]]>
</query>

The result of a query is XML with the answer to the query, if any.  Answers are normally transformed into something more suitable for the client or into a presentation format such as HTML or PDF.   Other descriptors make calls on a descriptor to perform a task without needing to know how that task is performed, e.g. return the title of a document in between 2 <title> tags.

Most queries will need some parameters in order to make them reusable across servers and models, e.g. most will take the model as a parameter.  The model parameter can be inserted in several ways, as specified in the XSL specification.  However, because the query is in a CDATA section, breaking out of it makes the query hard for a human to read, e.g.

<query>
<![CDATA[
select $person from <]]><xsl:value-of select="$model"/><![CDATA[> where $person <http://foo#hasName> 'James Gosling';
]]>
</query>

The query tag has a workaround to avoid having to do this.  If any string in the CDATA section is surrounded by @@ symbols, the string is replaced with the value of the attribute of the same name on the query tag.  This is best explained by an example:

<query model="rmi://host/server1#people">
<![CDATA[
select $person from <@@model@@> where $person <http://foo#hasName> 'James Gosling';
]]>
</query>

When the descriptor executes this element it substitutes rmi://host/server1#people for @@model@@ before passing the query to Kowari.  By specifying the value as a Descriptor parameter, it can be passed in by the client, e.g.

<query model="{$model}">
<![CDATA[
select $person from <@@model@@> where $person <http://foo#hasName> 'James Gosling';
]]>
</query>

If the descriptor was created with a model parameter defined as a string, this query is now portable across models.  The substitution is unrestricted: it is not limited to models, and any string can be substituted.
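
The @@ substitution described above can be pictured with a short sketch.  This is not Kowari's implementation; it is a minimal Python illustration.  The function name substitute_placeholders is hypothetical, and the sketch assumes that a placeholder with no matching attribute is left untouched, which the text above does not specify.

```python
import re

# Hypothetical helper (not Kowari code): mimics how the query tag is
# described as replacing @@name@@ markers with the value of the
# attribute called "name".
PLACEHOLDER = re.compile(r"@@(\w+)@@")


def substitute_placeholders(query_text, attributes):
    """Return query_text with each @@name@@ replaced by attributes[name].

    Placeholders without a matching attribute are left as-is. That
    fallback is an assumption of this sketch, not documented behaviour.
    """
    def replace(match):
        name = match.group(1)
        return attributes.get(name, match.group(0))

    return PLACEHOLDER.sub(replace, query_text)


query = ("select $person from <@@model@@> "
         "where $person <http://foo#hasName> 'James Gosling';")
resolved = substitute_placeholders(query, {"model": "rmi://host/server1#people"})
print(resolved)
```

Because the replacement is purely textual, the same mechanism works for any attribute value, not only model URLs.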

XSL descriptor Tag

The descriptor tag is an extension to XSL that enables a descriptor (i.e. XSL stylesheet) to call another descriptor and operate on the output of the called descriptor.  This effectively allows subtasking or delegation of tasks to other descriptors whose implementations may evolve over time.

<descriptor _target="http://foo/descriptors/extractPeopleAsHtml.xsl" />

The above tag executes the descriptor located at http://foo/descriptors/extractPeopleAsHtml.xsl and returns the output into the output stream of the XSL transformer, i.e. it is as if the result of the called (target) descriptor was part of the source descriptor.

NOTE any attributes beginning with an underscore '_' are reserved and are internal to the descriptor.

The attribute _target specifies the descriptor to invoke.  If the descriptor's full URL is NOT known but its relative URL is known, then a relative URL can be used if the URL of the source descriptor is included as a _source attribute, e.g.

<descriptor _target="extractPeopleAsHtml.xsl"  _source="http://foo/descriptors/extractPeopleAndCompaniesAsHTML.xsl" />

This allows the descriptor code to build a complete URL for the target descriptor.  However, since descriptors do NOT know their URL until invoked (i.e. they are told), the tag above is more often used in the form below.

<descriptor _target="extractPeopleAsHtml.xsl"  _source="{$_self}" />

_self is a special parameter passed to all descriptors on invocation; it is set to the URL of the descriptor.  This oddity is due to the fact that descriptors have no URL until they are deployed; when they are deployed they assume the URL they were loaded from.  Therefore the target descriptor above will be located as long as it is in the same directory as the calling (source) descriptor.
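
How a relative _target combines with _source can be sketched as follows.  This assumes the descriptor code applies standard relative URL resolution, which the text above implies but does not state; resolve_target is a hypothetical name used only for this illustration.

```python
from urllib.parse import urljoin


def resolve_target(source_url, target):
    """Resolve a possibly relative _target against the _source URL.

    Assumption of this sketch: standard URL resolution rules apply, so an
    absolute target is returned unchanged.
    """
    return urljoin(source_url, target)


source = "http://foo/descriptors/extractPeopleAndCompaniesAsHTML.xsl"
resolved_url = resolve_target(source, "extractPeopleAsHtml.xsl")
print(resolved_url)
```

Under that assumption the relative target resolves to a file in the same directory as the source descriptor, which is why passing {$_self} as _source is enough to locate sibling descriptors.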

Descriptor parameters are specified as attributes on the descriptor tag.  Using our previous example, where we extracted people from a model and the model had to be specified, the tag could look like this.

<descriptor _target="extractPeopleAsHtml.xsl"  _source="{$_self}" model="{$model}"/>

If some attributes are long it is preferable to format the tag like this:

<descriptor
    _target="extractPeopleAsHtml.xsl" 
    _source="{$_self}"
    model="{$model}"/>

Reserved Parameters

Several parameters have special meanings which may be different depending on the context (Element, XSL, HTTP/SOAP)

Parameter | Descriptor Element | XSL | HTTP/SOAP
_self | N/A | Is set to the URL of the descriptor being invoked | Set to the URL of the descriptor to invoke
_target | Set to the URL of the descriptor to invoke | N/A | N/A
_source | Set to the URL of the current descriptor being invoked | N/A | N/A
_cacheClear | see HTTP | see HTTP | DEVELOPMENT ONLY

If _cacheClear is set to any value as an HTTP parameter, all cached Descriptors are cleared.  This is useful to a developer when testing a descriptor.  Ordinarily descriptors are cached for performance and do not know if they have changed unless the cache is cleared.

NOTE depending on context, some variables ARE set (e.g. _self) while others need to BE set (e.g. _target).

FAQ - Frequently Asked Questions

Descriptors depend on a lot of disparate technologies, so a failure can originate in any of several places.  This can make debugging a descriptor frustrating unless you have knowledge of the underlying technologies (XSL, Servlets, SOAP).  Some of the most common pitfalls and solutions are summarized here.

How do I create a Descriptor ?

Use the Descriptor wizard - available as a task from the Descriptor Management page.

How do I execute an iTQL query ?

There are 2 ways to execute an iTQL query:

1) use the iTQL command line client
2) use the built in GUI Kowari viewer

How do I deploy the built in Descriptors ?

Use one of the Deploy built in descriptors tasks from the Descriptor Management page.  There are 2 tasks available to deploy the built in descriptors.  One task completely removes all descriptors before redeploying the built in ones, while the other deploys the built in descriptors while preserving the existing descriptors, such as the ones you may have developed.

How do I deploy my descriptor ?

To use a descriptor it must be deployed in a Kowari database.  Deployment is simple: all that is required is that the RDF embedded in the descriptor XSL file is loaded into a model in a Kowari database.

Start an iTQL client TODO link.

Check if the descriptor model already exists TODO link

To deploy simply load the descriptor XSL file into this model:

iTQL> load <file:/home/joe/work/helloworld.xsl> into <rmi://localhost/server1#descriptors>;

NOTE the location of the descriptor MUST be readable to the Kowari server.

If there were no errors in the XSL then the Descriptor is now deployed and available for use.  If there were errors, check your XML for bad syntax such as unclosed tags.  The Descriptor wizard generates correct XML and XSL.

NOTE
the EXACT URL used to load the descriptor is the URL you MUST use when invoking the descriptor from a client (SOAP, Java, descriptor), i.e. you can not load a descriptor using a file URL and then try to use it from an HTTP URL, even if it is on a web server.  If it is to be used from a web server then it should be loaded with an HTTP URL after it has been put on the web server.  For example, if the file above was on a web server where /home/joe/work/ is equivalent to http://foo/joe/work, then the descriptor should be deployed like this:

iTQL> load <http://foo/joe/work/helloworld.xsl> into <rmi://localhost/server1#descriptors>;

When descriptors are deployed with an HTTP URL they are accessible by any client with HTTP access, including from anywhere on the internet if accessible through a firewall.

I can't invoke my descriptor.

First see if the descriptor model exists on the Kowari server TODO link.  If it does then check if your descriptor is deployed TODO link.  If you have changed any of the required parameters, their types or anything else that you entered in the Descriptor Wizard then you will have to redeploy.  The easiest way to do this is to redeploy the built in descriptors and then redeploy your descriptor.

How do I check if the descriptor model exists on a Kowari server ?

There are 2 ways to check if the descriptor model exists

1) Use the Descriptor List task from the Descriptor Management page.  If it fails, no descriptor model exists and the built in descriptors should be deployed TODO link
2) Use an iTQL query

Execute this query (see *TODO* here for how to execute an iTQL query.)

select $model from <rmi://HOSTNAME/server1#> where $model <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://tucana.org/tucana#Model>;

NOTE replace HOSTNAME with the fully qualified hostname of the Kowari server e.g. foo.bar.com and not just foo.

This query will return all models on the server.  If there is no #descriptors model then you should deploy the built in descriptors and any descriptors you have developed.

How do I check if my descriptor is deployed ?

Descriptors must be deployed before use.  The Kowari code that invokes descriptors looks up the parameters a descriptor uses, as well as other information, before it invokes a descriptor.  If the descriptor has not been deployed then the code does not know how to invoke it and will report errors.

First make sure the built in descriptors are deployed TODO link, then select the See List of Descriptor on this host task from the Descriptor Management page.

What is the life cycle of a Descriptor ?

When a descriptor is first invoked, the descriptor code retrieves the descriptor from its URL and creates a Java object representing that descriptor.  The XSL transformer processes the XSL stylesheet into an internal form, which only happens once.  This descriptor object is placed in a free pool for use; when used it is placed in a busy pool until it has performed its duty, then it is returned to the free pool.  If a client requests a descriptor and an instance is already in the busy pool, a new instance is created and made available; both descriptors will eventually return to the free pool.  Descriptors are NOT currently garbage collected: the pool grows to the size needed and stays that size.  It is not expected that these pools will get very large, especially in comparison to resources in a Kowari server.  This may change in future.  There is a facility to purge all the cached descriptors - see here TODO
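
The free and busy pools described above can be modelled with a small sketch.  This is a toy Python model, not the Kowari Java code; the class DescriptorPool, its method names and the loader callback are all hypothetical, and purging is assumed to drop only idle instances.

```python
class DescriptorPool:
    """Toy model of the free/busy descriptor pools (not Kowari code)."""

    def __init__(self, load_descriptor):
        # Hypothetical loader: takes a URL, returns a compiled descriptor.
        self._load = load_descriptor
        self._free = {}  # url -> list of idle instances
        self._busy = {}  # url -> list of instances currently in use

    def acquire(self, url):
        idle = self._free.setdefault(url, [])
        # Reuse an idle instance if one exists, otherwise create a new one.
        instance = idle.pop() if idle else self._load(url)
        self._busy.setdefault(url, []).append(instance)
        return instance

    def release(self, url, instance):
        # Move the instance from the busy pool back to the free pool.
        self._busy[url].remove(instance)
        self._free.setdefault(url, []).append(instance)

    def purge(self):
        # Assumption: emptying the cache discards idle instances only.
        self._free.clear()

    def size(self, url):
        return len(self._free.get(url, [])) + len(self._busy.get(url, []))


loads = []


def fake_loader(url):
    # Stand-in for fetching and compiling the stylesheet.
    loads.append(url)
    return object()


pool = DescriptorPool(fake_loader)
url = "http://foo/descriptors/extractPeopleAsHtml.xsl"
first = pool.acquire(url)
second = pool.acquire(url)  # first is busy, so a second instance is created
pool.release(url, first)
pool.release(url, second)
third = pool.acquire(url)   # an idle instance is reused, nothing is loaded
print(len(loads), pool.size(url))
```

The pool only grows when every existing instance is busy, which matches the description that it grows to the size needed and then stays that size.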

How do I empty the Descriptor Cache ?

This task is available from the Descriptor Management page.

I've changed my Descriptor XSL code but when invoked it does not notice my changes, what's wrong ?

Descriptors are downloaded and processed into Java Objects which are cached and reused.  If you have changed the logic in a descriptor but NOT the parameters, mime type etc. (i.e. everything specified in the Descriptor Wizard) then you need to clear the Descriptor cache.  This task is available from the Descriptor Management page.  If you have added or removed a parameter, mime type etc. then you will need to redeploy your descriptor TODO link.

If the descriptor is on a web server have you published it to the web server?

My Descriptor doesn't work, are errors logged anywhere ?

A file called descriptor-errors.log is created in the directory Kowari was started from.  It contains the reasons for Descriptors failing, and may also include a stack trace for more detailed error explanations, for example when the XML or XSL parser fails.

Can't have more than one ROOT in a DOM ! Error 

org.w3c.dom.DOMException: DOM006 Hierarchy request Error

This frequently happens when calling one descriptor from another.  The descriptor being called is returning XML with more than one root.  This is fine in itself, but if you try to store it in a variable or pass it to something expecting a single root you will get this error, e.g.

Descriptor A returns

<price>17.55</price>
<album>Is This It</album>
<artist>The Strokes</artist>

Descriptor B calls descriptor A and tries to store the result in a variable.  It can not, since the XML is not a single tree.  Descriptor A should return something like:

<cd>
<price>17.55</price>
<album>Is This It</album>
<artist>The Strokes</artist>
</cd>

The XML now has one root.

Note that this can happen where there is some text after a node e.g.

<price>17.55</price>
The Strokes

The trailing text forms a separate text node in the above example, so it needs outer <cd> tags like the previous example.  If the error is the DOM006 Hierarchy request error then it usually means the calling descriptor needs to have tags around the descriptors it is calling, e.g.

Descriptor A returns

<cd>
<price>17.55</price>
<album>Is This It</album>
<artist>The Strokes</artist>
</cd>

Descriptor B calls descriptor A and descriptor C (similar to A) and simply copies them to the output stream.  There are now 2 root nodes.  As previously mentioned put a tag around where the 2 descriptors are called.
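
The single root rule discussed above can be checked outside Kowari with any XML parser.  The following Python sketch uses the standard library parser rather than the DOM implementation Kowari uses, so the error raised differs from DOM006, but the underlying rule is the same; has_single_root and wrap_in_root are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def has_single_root(xml_text):
    """Return True if xml_text parses as a document with one root element."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for a second root element or for text after the root.
        return False


def wrap_in_root(xml_text, tag="cd"):
    """Surround a fragment with a single enclosing element."""
    return "<{0}>{1}</{0}>".format(tag, xml_text)


fragment = ("<price>17.55</price>"
            "<album>Is This It</album>"
            "<artist>The Strokes</artist>")
trailing_text = "<price>17.55</price>\nThe Strokes"

print(has_single_root(fragment))
print(has_single_root(trailing_text))
print(has_single_root(wrap_in_root(fragment)))
```

Both problem cases from this section fail the check until they are wrapped in an enclosing element.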

Can I see how other Descriptors work ? Can I get their source ?

Yes, since all descriptors have to be downloaded to be invoked they can be downloaded with a web browser or other mechanism.  The Descriptor List tasks available from the Descriptor Management page allow you to list descriptors available on your host OR other hosts.  The page which lists the descriptors includes a link to download each descriptor.
NOTE Some browsers have trouble displaying the descriptor so you may have to download the link directly using a right mouse button click.

NOTE Some links to local descriptors with file URLs such as file:/home/joe/work/helloworld.xsl will NOT work from some browsers.  This is a security feature designed to stop web pages accessing local files.  You should retrieve the file using a local file manager such as Windows Explorer (and not Internet Explorer).

How do I store the result of a Kowari query in a variable ?

<xsl:variable name="answer">
<kowariDescriptor:query model="{$model}" node="{$node}">
<![CDATA[
select $predicate from <@@model@@> where <@@node@@> $predicate $object;
]]>
</kowariDescriptor:query>
</xsl:variable>

How do I store the returned data from a descriptor invoked from another descriptor ?

<xsl:variable name="answer">
<kowariDescriptor:descriptor
    _target="extractPeopleAsHtml.xsl"
    _source="{$_self}"
    model="{$model}"/>
</xsl:variable>

What is the 'Can not convert TREEFRAG into NodeSet' Error ?

The Tree fragment must be converted into a Node set using a XALAN extension, see the next question.

How do I apply the current stylesheets rules to XML returned from a query or another descriptor ?

XSL does not directly allow for this; a XALAN extension must be used.  The XML must be converted from a Tree fragment to a proper DOM Nodeset.  To apply the style sheet to the XML use something like the following (the xalan prefix must be bound to the XALAN namespace, http://xml.apache.org/xalan, in the stylesheet).


<xsl:variable name="answer">
<kowariDescriptor:query model="{$model}" node="{$node}">
<![CDATA[
select $predicate from <@@model@@> where <@@node@@> $predicate $object;
]]>
</kowariDescriptor:query>
</xsl:variable>

<!-- Now apply the templates to the answer -->
<xsl:apply-templates select="xalan:nodeset($answer)/*"/>

How can I see the raw XML response from a Kowari query ?

Simply copy the result to the output stream like this:

<xsl:variable name="answer">
<kowariDescriptor:query model="{$model}" node="{$node}">
<![CDATA[
select $predicate from <@@model@@> where <@@node@@> $predicate $object;
]]>
</kowariDescriptor:query>
</xsl:variable>

<!-- Now copy the answer to the output -->
<xsl:copy-of select="xalan:nodeset($answer)/*"/>

If you are viewing the response in a browser the XML tags may not be visible, wrap them in xmp tags to see them i.e.

<!-- Now copy the answer to the output, wrapped in xmp tags -->
<xmp>
<xsl:copy-of select="xalan:nodeset($answer)/*"/>
</xmp>

Why are some XSL parameters or variables written like $model and some like {$model} ?

When referring to a parameter from an expression attribute (such as select) of an element in the XSL namespace, it is sufficient to refer to it using a $ prefix, e.g.

<h1><xsl:value-of select="$title"/></h1>

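However, when referring to a parameter in an attribute of an element from any other namespace, the reference must be surrounded with curly braces {}.  XSL treats such attributes as attribute value templates and only evaluates the parts inside the braces, e.g.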

<xhtml:a href="{$homepage}">Home Page</xhtml:a>

and even when using no namespace:

<a href="{$homepage}">Home Page</a>
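
The difference between the two forms can be illustrated with a deliberately simplified model of attribute value templates.  This Python sketch handles only the {$name} form, whereas a real XSL processor accepts any expression inside the braces; expand_avt is a hypothetical name.

```python
import re

# Simplified model: only {$name} is recognised. A real XSL processor
# evaluates arbitrary expressions inside the braces.
AVT_REFERENCE = re.compile(r"\{\$(\w+)\}")


def expand_avt(attribute_value, params):
    """Replace each {$name} in a literal attribute value with params[name]."""
    return AVT_REFERENCE.sub(lambda m: params[m.group(1)], attribute_value)


params = {"homepage": "http://foo/index.html"}
with_braces = expand_avt("{$homepage}", params)
without_braces = expand_avt("$homepage", params)
print(with_braces)
print(without_braces)
```

Without the braces the attribute value is treated as literal text, so the link would point at the string $homepage rather than at the parameter's value.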

Common problems