Topic - xquery cacheability
from forum XML
moderators: pjr tab
xquery cacheability
Joined: 25-May-2007
Posts: 30
Posted: 8-May-2008 22:21
I'm fighting with caching once again. I'm using XQuery to scan a folder of XML files, and being lazy about it because I can, or at least I think I should be able to :)

I'm using the ability in Saxon to specify a directory in collection(), but I would like the result to be dependent on the contents of that directory, and in particular to expire when any file is updated, created, or removed.  My code looks like this:
<idoc>
  <instr>
    <type>xquery</type>
    <target>this:response</target>
    <operator>
      <xquery> &lt;titles&gt;{ collection("file:///tmp/xmlfiles?select=[0-9]+.xml")/doc/title }&lt;/titles&gt; </xquery>
    </operator>
  </instr>
</idoc>

I figure that the response has no dependency on /tmp/xmlfiles (since the xquery accessor isn't smart enough to add it), so it will never expire when that directory changes.  The question is, how do I add that dependency?  I don't want to declare the whole result to be intermediate or uncacheable, since the responses will be the same 99% of the time when the files aren't changing.
Saxon has stepped out of NK boundary
Joined: 7-February-2005
Posts: 397
Location: UK
Posted: 9-May-2008 16:31
If you can provide a brief example we can test this. But what I think is happening is that Saxon is going down to the filesystem independently for this operation. We provide it with the NK URI resolver, but it appears to be treating a directory listing as a low-level file access and is not using the URI resolver to treat the files as resources.  In that case NK never sees these resources, since they are not loaded through its file: accessor, so the dependencies cannot be assigned and change detection will never happen.

Not sure how you could trick Saxon into loading the resources via the URIResolver in the XQuery domain - maybe Menzo knows a trick?

At the NK/ROC level you could assign these dependencies as golden threads but that would be a lot of extra code (more than doing this the non-lazy way ;)  and they still wouldn't depend on the underlying file and wouldn't gain NK's change detection on file resources.

Without some digging into the inner workings of Saxon it's not clear if this function can be overridden with a modified (resource oriented) implementation that uses the NK infrastructure.

Let us know how critical this is and we can dig into it further.

P.
attachDependency
Joined: 25-May-2007
Posts: 30
Posted: 12-May-2008 23:31
Somehow I missed this earlier, but it seems like exactly what I want to do.  Since Saxon is bypassing the NK accessors, it can't pick up that dependency.  So instead, I can use attachDependency to explicitly add the dependency on the underlying files.
<instr>
  <type>attachDependency</type>
  <operand>this:response</operand>
  <param>file:///tmp/xmlfiles</param>
  <target>this:response</target>
</instr>

However, this ends up with it being always expired - it looks like NK is unable to determine expiries on file: urls.  Using the request trace, if I enter a file: url, it always comes up as already expired.

My example code above is pretty much complete, the only thing missing is the set of documents to be queried.  For example, file:///tmp/xmlfiles/1.xml is
<doc>
  <title>hello</title>
</doc>
and file:///tmp/xmlfiles/2.xml is
<doc>
  <title>world</title>
</doc>
with the result of the xquery being something like
<result:sequence xmlns:result="saxon">
  <result:element>
    <titles>
      <title>hello</title>
      <title>world</title>
    </titles>
  </result:element>
</result:sequence>
Joined: 14-March-2005
Posts: 60
Location: Amsterdam, The Netherlands
Posted: 13-May-2008 09:20
Recently we resolved another bug by adding an NKCollectionURIResolver which extends net.sf.saxon.functions.StandardCollectionURIResolver (see http://www.1060.org/forum/topic/348/1). In principle I would expect this resolver to be used for the collection as well, but apparently not ... so we may have to take a look in the Saxon codebase ...

Menzo
Re: attachDependency
Joined: 7-February-2005
Posts: 249
Location: Uncharted territory
Posted: 13-May-2008 10:33
Hi Jeff,

I wonder if you are using NetKernel 3.3.1? We recently enhanced the file accessor to support proper expiry checking (just like module resources) for the 3.3.1 release.

If you are using 3.3.1 can you do some digging with the visualizer or the debugger to see what the expiration metadata looks like for the response? I would expect to see the file resource expiry metadata attached and that it wouldn't be causing it to be expired.

Cheers,
Tony
latest version
Joined: 25-May-2007
Posts: 30
Posted: 13-May-2008 19:55
At the moment it appears we're still on 3.0.  Installing ext-layer1-1.3.5 (from the 3.3.1 release) gives me the new file: accessor and helps greatly.  The only remaining problem now is that the dependency on the directory is only expired when the directory changes, not when any file in the directory changes.  Of course, this is the expected and correct thing to do (unless the file accessor could support globbing).  But in any case it is sufficient to get my caching working without extensive additional work; I'll simply need to make my writing program force a directory update.
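
As a rough sketch of what "force a directory update" could look like in the writing program: on POSIX-style filesystems a directory's modification time normally changes when entries are created, removed, or renamed, but not when an existing file is overwritten in place, so the writer can bump the directory's timestamps itself after each write. The example below assumes the writer is a Node.js script, which the thread does not say; `writeAndTouch` is a hypothetical helper name.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: write (or overwrite) one XML file, then set the
// directory's own atime/mtime to "now" so that anything keyed on the
// directory's modification time sees a change.
function writeAndTouch(dir, name, xml) {
  fs.writeFileSync(path.join(dir, name), xml, 'utf8');
  const now = new Date();
  fs.utimesSync(dir, now, now); // touches the directory itself, not the file
}
```

A call such as `writeAndTouch('/tmp/xmlfiles', '1.xml', '<doc><title>hello</title></doc>')` would then rewrite the file and touch the directory in one step. Writing to a temporary name and renaming it into place is another option, since a rename also modifies the directory entry list.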

Secondary quick question on the xquery accessor: is there a way to tell Saxon not to wrap the result in the "result:sequence" elements and just give the raw result?  The Saxon docs indicate a command-line flag for this, but we're not at a command line here.
sx_unwrap.xsl
Joined: 14-March-2005
Posts: 60
Location: Amsterdam, The Netherlands
Posted: 13-May-2008 20:41
As XQuery can return a nodeset instead of an XML document the wrapper makes sure that the accessor always returns an XML document. In my code I use the following XSL sheet to unwrap the XQuery result:


<xsl:stylesheet
   version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:result="http://saxon.sf.net/xquery-results"
>
   <xsl:output method="xml" encoding="utf-8"/>
   <xsl:template match="/result:*">
      <xsl:choose>
         <xsl:when test="count(result:*/*)=1">
            <xsl:copy-of select="result:*/*" copy-namespaces="no"/>
         </xsl:when>
         <xsl:otherwise>
            <xsl:message>WRN: can't strip the Saxon result envelope, it would result in multiple roots</xsl:message>
            <xsl:copy-of select="." copy-namespaces="no"/>
         </xsl:otherwise>
      </xsl:choose>
   </xsl:template>
</xsl:stylesheet>


And I've a simple script to execute the XQuery and execute the above XSLT in one run:


// get the current request
req = context.getThisRequest();

// run the XQuery
subreq = context.createSubRequest();
subreq.setURI("active:xquery");

for(iter = req.getArguments(); iter.hasNext(); ) {
   arg = iter.next();
   nme = arg;
   if (arg.equals("_operator"))
      nme = "operator";
   if (!arg.equals("operator")) {
      argURI = req.getArgument(arg);
      if (argURI != null) {
         argValue = req.getArgumentValue(argURI);
         if (argValue != null) {
            subreq.addArgument(nme,argValue);
         } else {
            subreq.addArgument(nme,argURI);
         }
      }
   }
}

reply = context.issueSubRequest(subreq);

// cleanup the result
subreq = context.createSubRequest();
subreq.setURI("active:xslt2");

subreq.addArgument("operand",reply);
subreq.addArgument("operator","ffcpl:/tools/sx_unwrap.xsl");

clean = context.issueSubRequest(subreq);

// create response, and exit
response = context.createResponseFrom(clean);


The accompanying rewrite rule in the module.xml file is as follows:


   <rewrite>
      <match>active:sloot.xquery(.*)\+operator@([^+]*)(.*)</match>
      <to>active:javascript+operator@ffcpl:/tools/xquery.js+_operator@$2$1$3</to>
   </rewrite>
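
To see what the capture groups in this rule do, the same pattern and replacement can be tried in plain JavaScript. This is only an illustration of the regex, not of NetKernel's own rewrite engine: the `^`/`$` anchors assume the rule is matched against the whole URI, and the request URI passed in is made up.

```javascript
// Pattern and replacement copied from the rewrite rule above.
// Assumption: the rule matches the entire request URI, hence ^ and $.
const match = /^active:sloot.xquery(.*)\+operator@([^+]*)(.*)$/;
const to = 'active:javascript+operator@ffcpl:/tools/xquery.js+_operator@$2$1$3';

// Returns the rewritten URI, or the input unchanged if the rule does not match.
function rewrite(uri) {
  return match.test(uri) ? uri.replace(match, to) : uri;
}

// Hypothetical request: the query and data paths are invented for the example.
console.log(rewrite('active:sloot.xquery+operator@ffcpl:/queries/titles.xq+param@ffcpl:/data/in.xml'));
// → active:javascript+operator@ffcpl:/tools/xquery.js+_operator@ffcpl:/queries/titles.xq+param@ffcpl:/data/in.xml
```

The original operator (the XQuery) is moved to `_operator`, the script becomes the new `operator`, and any other arguments ($1 before the operator, $3 after it) are carried along, which is why the script renames `_operator` back to `operator` and skips its own `operator` argument.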


Hope this helps,

Menzo
© 2003-2006, 1060 Research Limited. 1060 registered trademark, NetKernel trademark of 1060 Research Limited.