Writing libxslt extension modules in Python

A while back I wrote up a quick post on writing libxslt extension functions in Python. Extension functions let you call Python code from within an XSL-T stylesheet. I wanted to add a little more detail to that post and show how you can build extension modules instead of just single functions. As far as I know, the libxslt distribution doesn't include any documentation for this.

Here's an example program.


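#!/usr/bin/env python
#
# Sample libxslt extension functions
import libxml2
import libxslt

def countChars(ctx, content):
    # ctx is the XPath parser context; content is the value passed from the stylesheet
    if isinstance(content, str):
        return str(len(content))

    return str(0)

def firstChar(ctx, content):
    # Check for an empty string so that content[0] cannot raise IndexError
    if isinstance(content, str) and content:
        return content[0]

    return ""

# Functions registered under the same namespace URI belong to the same module
libxslt.registerExtModuleFunction("countChars", "http://www.xmldatabases.org/extension", countChars)
libxslt.registerExtModuleFunction("firstChar", "http://www.xmldatabases.org/extension", firstChar)

data = """
<doc>This contains 27 characters</doc>
"""

styledoc = libxml2.parseFile("style.xsl")
style = libxslt.parseStylesheetDoc(styledoc)
doc = libxml2.parseDoc(data)
result = style.applyStylesheet(doc, None)

# The parenthesized form works as a statement in Python 2 and as a call in Python 3
print(style.saveResultToString(result))

# Free the source document, the result document, and the stylesheet
doc.freeDoc()
result.freeDoc()
style.freeStylesheet()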


  

This example registers two functions under the namespace "http://www.xmldatabases.org/extension". To make a function part of a module, you register it with the libxslt environment by calling libxslt.registerExtModuleFunction. You can register as many functions as you want this way. The first parameter is the name you use to call the function from within the stylesheet, the second is the XML namespace URI that identifies the module, and the third is a Python callable that implements the function. All functions registered with the same namespace URI are considered part of the same module.
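With more than a couple of functions, the repeated registration calls can be collected into one helper. The following is a minimal sketch: register_module and its parameters are hypothetical names that are not part of libxslt, and the registration callable is passed in so the helper can be tried without the bindings installed.

```python
# Namespace URI from the example program above
EXT_NS = "http://www.xmldatabases.org/extension"

def register_module(register, namespace, functions):
    """Register every (name, callable) pair under a single namespace URI.

    `register` is expected to behave like libxslt.registerExtModuleFunction,
    i.e. to accept (name, namespace_uri, callable). This helper is a
    hypothetical convenience wrapper, not a libxslt API.
    """
    registered = []
    for name, func in sorted(functions.items()):
        if not callable(func):
            raise TypeError("%s is not callable" % name)
        register(name, namespace, func)
        registered.append(name)
    return registered

# Intended use with the real bindings (left commented out, since it
# needs libxslt and the functions from the example program):
# import libxslt
# register_module(libxslt.registerExtModuleFunction, EXT_NS,
#                 {"countChars": countChars, "firstChar": firstChar})
```

Because every entry shares the same namespace URI, all of the functions end up in the same module.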

The functions always take at least one argument, the XPath parser context. You can add further arguments to pass other data from the stylesheet to your function. These arguments arrive as different types depending on how they're passed from XSL-T. For instance, an argument could be a libxml2 node, a list, or a string. In this example I convert the arguments to strings within the XSL-T before passing the data to the function. If I didn't do that, I'd have to convert the argument into a node and then get the content of the node as a string.
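If you'd rather do the conversion on the Python side, a small coercion helper can take care of the different argument types. This is only a sketch under stated assumptions: to_text is a hypothetical name, and it assumes node-like objects expose their text through a content attribute, as wrapped libxml2 nodes do. Depending on the bindings, the members of a node list may need to be wrapped as libxml2 nodes before they expose that attribute.

```python
def to_text(value):
    """Coerce an argument passed from the stylesheet into a Python string."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value
    # bool must be tested before int/float because bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        # Drop the trailing ".0" from whole numbers
        return str(int(value)) if float(value).is_integer() else str(value)
    if isinstance(value, (list, tuple)):
        # Mirror XPath string(): use the first node of a node list
        return to_text(value[0]) if value else ""
    # Assumption: node-like objects carry their text in a `content` attribute
    content = getattr(value, "content", None)
    return content if isinstance(content, str) else str(value)

def countChars(ctx, content):
    # Same idea as the example program, but tolerant of non-string arguments
    return str(len(to_text(content)))
```

With a helper like this, the stylesheet could pass a node directly instead of wrapping it in string(), provided the assumption about the content attribute holds for your bindings.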

The same consideration must be given to what you return from the function. In this case, both functions convert their results to strings, but they could also return libxml2 nodes or lists of nodes that will be inserted into the result document.

Here's the associated stylesheet that uses the functions.


<xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:extension="http://www.xmldatabases.org/extension"
        extension-element-prefixes="extension"
        version="1.0">
        
    <xsl:template match="/doc">
        <xsl:value-of select="extension:countChars(string(.))"/>
        <xsl:value-of select="extension:firstChar(string(.))"/>
    </xsl:template>
    
</xsl:stylesheet>


  

There are two things you have to do before you can use the functions.

  1. Declare the namespace URI that you used to define the functions within the libxslt environment. You can bind this namespace URI to any prefix you want.
  2. Declare the prefix that you chose as an extension element prefix by adding it to the extension-element-prefixes attribute.

Once you do both of those things you can call the functions like any other XSL-T functions. Just make sure you pass types that your functions know how to handle. It's obviously a good idea for your functions to explicitly convert the different types if they can.
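The two declarations listed above are easy to forget, so it can help to check a stylesheet before handing it to libxslt. Here's a rough sketch using only the standard library. The function name extension_prefixes is hypothetical, and the check only looks at the extension-element-prefixes attribute on the root element.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URI from the example program above
EXT_NS = "http://www.xmldatabases.org/extension"

def extension_prefixes(stylesheet_text, namespace=EXT_NS):
    """Return (bound, declared) for a stylesheet given as a string.

    bound:    prefixes bound to `namespace` anywhere in the stylesheet
    declared: the subset of `bound` that is also listed in the root
              element's extension-element-prefixes attribute
    """
    bound = []
    root = None
    source = io.BytesIO(stylesheet_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            if uri == namespace:
                bound.append(prefix)
        elif root is None:
            # The first "start" event is the root element
            root = payload
    listed = (root.get("extension-element-prefixes") or "").split()
    declared = [prefix for prefix in bound if prefix in listed]
    return bound, declared
```

If bound comes back empty, the namespace URI was never declared. If declared is empty, the prefix is missing from extension-element-prefixes.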

Posted by Kimbro Staken

Sunday Jan 11, 2004 at 12:39 AM