August 24, 2003

libxslt Extension Functions in Python

This is just a note about writing extension functions for libxslt. The documentation on this is slim at best and the only example that comes with libxslt is minimal.

I'm working on a project where I wanted to run a piece of text through a textile processor and then insert the output into the result tree of the XSL transformation. I thought this would be pretty simple, but it turned out to be a bit more work than I expected. Here's the code that finally works.

Update: the original version of this code inserted the output directly into the tree. That's probably not what you really want to do, so I updated it to return the content instead.

import sys

import libxml2
import libxslt
import textile

def textile_process(ctx, content):
    """
    An XSL-T extension function to process textile formatting included in a post.
    """
    try:
        # The arguments arrive as raw PyCObject instances, so wrap them first.
        node = libxml2.xmlNode(_obj=content[0])
        parserContext = libxslt.xpathParserContext(_obj=ctx)
        xpathContext = parserContext.context()

        # Only used by the commented-out direct-insert variant below.
        resultContext = xpathContext.transformContext()

        # Wrap the textile output in one root element so it parses as a document.
        source = "<div>" + textile.textile(node.content) + "</div>"

        doc = libxml2.parseDoc(source)

        root = doc.getRootElement()
        root.unlinkNode()

        # If you do this you insert the result directly into the output tree
        # resultContext.insertNode().addChild(root)

        return [root]
    except Exception as err:
        sys.stderr.write("Context error " + str(err) + "\n")

    return ""

The tricky parts of this code are converting all the parameters into the right types and getting to the pieces of the documents that you need. The parameters come in as raw PyCObject instances, so you have to convert them to the more specific object types manually.

For the way I was using this function, the content parameter contains a list with one item, and that item is an xmlNode. So the code

node = libxml2.xmlNode(_obj=content[0])

converts the PyCObject into an xmlNode instance that you can then use as you usually would. You have to do similar things with the xpathParserContext that comes in as the first parameter. From the xpathParserContext you then have to get an xpathContext and from that you get the context for the result document. The xpathContext variable provides a reference into the source document and the resultContext variable provides access to the document being created.
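
To show where the function actually gets called from, here is a rough sketch of registering an extension function and invoking it from a stylesheet with a node-set argument. It is not the setup from my project: the function name shout, the namespace URI http://example.com/ext, and the tiny input document are all made up for illustration. The import is guarded only so the snippet still loads on a machine without the bindings.

```python
try:
    import libxml2
    import libxslt
except ImportError:
    # Bindings not installed; the demo at the bottom is skipped.
    libxml2 = libxslt = None

EXT_NS = "http://example.com/ext"  # hypothetical namespace URI
seen = []  # text content the extension function received

def shout(ctx, nodes):
    """Upper-case the text of the first node in the node-set argument."""
    # Each list item is a raw wrapped pointer; turn it into an xmlNode.
    node = libxml2.xmlNode(_obj=nodes[0])
    seen.append(node.content)
    return node.content.upper()

STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ext="http://example.com/ext"
    exclude-result-prefixes="ext">
  <xsl:template match="/">
    <out><xsl:value-of select="ext:shout(/post/body)"/></out>
  </xsl:template>
</xsl:stylesheet>"""

def run_demo():
    # Register before the stylesheet is applied.
    libxslt.registerExtModuleFunction("shout", EXT_NS, shout)
    style = libxslt.parseStylesheetDoc(libxml2.parseDoc(STYLESHEET))
    doc = libxml2.parseDoc("<post><body>hello</body></post>")
    result = style.applyStylesheet(doc, None)
    output = style.saveResultToString(result)
    # Nothing here is freed automatically.
    style.freeStylesheet()
    doc.freeDoc()
    result.freeDoc()
    return output

output = run_demo() if libxslt is not None else None
```

The XPath expression /post/body selects a node-set, which is why the second parameter shows up as a list. Had the stylesheet passed a string instead, the function would receive a plain Python string and no conversion would be needed.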

Update: this paragraph applies to the commented-out version of the original code. One confusing thing about the resultContext is the insertNode() method. At first I thought it inserted a new node into the result document. What it really does is return the node under which the result of the function will be inserted. It would probably have been clearer if the method were named getInsertionNode() or something like that.
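
A quick way to convince yourself of what insertNode() does is to record the name of the node it returns. The sketch below is modelled loosely on the small example that ships with libxslt. The function name probe, the namespace URI, and the article wrapper element are placeholders I picked, and the import is guarded so the snippet loads even without the bindings.

```python
try:
    import libxml2
    import libxslt
except ImportError:
    # Bindings not installed; the demo at the bottom is skipped.
    libxml2 = libxslt = None

EXT_NS = "http://example.com/ext"  # hypothetical namespace URI
insertion_names = []  # names of the nodes reported by insertNode()

def probe(ctx, label):
    """Record where the function's result is going to land, then echo label."""
    parser_context = libxslt.xpathParserContext(_obj=ctx)
    transform_context = parser_context.context().transformContext()
    # insertNode() hands back an existing result-tree node; it adds nothing.
    insertion_names.append(transform_context.insertNode().name)
    return label

STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ext="http://example.com/ext"
    exclude-result-prefixes="ext">
  <xsl:template match="/">
    <article><xsl:value-of select="ext:probe('demo')"/></article>
  </xsl:template>
</xsl:stylesheet>"""

def run_demo():
    libxslt.registerExtModuleFunction("probe", EXT_NS, probe)
    style = libxslt.parseStylesheetDoc(libxml2.parseDoc(STYLESHEET))
    doc = libxml2.parseDoc("<doc/>")
    result = style.applyStylesheet(doc, None)
    # Clean up by hand; the bindings leave this to the caller.
    style.freeStylesheet()
    doc.freeDoc()
    result.freeDoc()

if libxslt is not None:
    run_demo()
```

With this stylesheet the recorded name should be that of the enclosing literal element, article, since that is the node the value-of output gets attached to.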

Now that this is working it's pretty slick, but it sure took a while to figure out exactly what needed to happen.

Posted by kstaken at August 24, 2003 04:23 AM