Python Archives


August 27, 2003

Java on Mac OS X

I mentioned in my rant about window proliferation on Mac OS X that I was looking at using jEdit as a solution. It turns out jEdit is almost a perfectly workable solution. Through various plugins I was able to get it to do everything that I wanted, and it felt close enough to a native app to not drive me crazy. Unfortunately, all is not well in this world.

The JDK on Mac OS X has some serious problems when running jEdit. With the 1.4 JDK the VM crashes far too often. This shouldn't be jEdit's fault, as it's the VM that's crashing: unless jEdit is running some native code, a Java app should never be able to crash the VM. This wouldn't have been a huge deal, as jEdit is really good about saving its current state, so restarting the app is just a minor annoyance. However, the real problem is that sometimes when the VM crashes it takes the whole WindowServer process down with it. This results in instant termination of all your running apps and getting logged out of the system. To say that's a bit of a problem is an understatement.

When you install jEdit it recommends that you use the 1.3 VM. Unfortunately doing this also results in the loss of a number of functions (in particular mouse scroll wheel support) and substantially changes the display of the app (i.e. tab layouts). Also if you turn on the hardware acceleration it has problems with garbled text and incorrect positioning in text areas. Running without mouse scroll wheel support is extremely annoying.

I'm quite disappointed with the current situation of Java on Mac OS X for running GUI apps. I've had no problems with it running server apps. It seems most of the current problems can be traced to the things they're doing with Swing, i.e. hardware acceleration and the switch to the Cocoa toolkit for GUIs in 1.4. This stuff will be great once it's fully stable, but I've heard nothing about progress in this area for some time.

I'm still trying to make do with jEdit running on the 1.3 VM with no hardware acceleration. However, I would definitely prefer to be able to take advantage of the 1.4 features; that scroll wheel thing can really spoil you.

Posted by kstaken at 11:08 PM | TrackBack

August 24, 2003

libxslt Extension Functions in Python

This is just a note about writing extension functions for libxslt. The documentation on this is slim at best and the only example that comes with libxslt is minimal.

I'm working on a project where I wanted to run a piece of text through a textile processor and then insert the result into the result tree of the XSL document. I thought this would be pretty simple, but it turned out to be a bit more work than I expected. Here's the code that finally works.

Update: the original version of this code inserted the output directly into the tree. That's probably not what you really want to do, so I updated it to return the content.

import sys

import libxml2
import libxslt
import textile


def textile_process(ctx, content):
    """
    An XSL-T extension function to process textile formatting included in a post.
    """
    try:
        # Parameters arrive as raw PyCObject handles; wrap them in typed objects.
        node = libxml2.xmlNode(_obj=content[0])
        parserContext = libxslt.xpathParserContext(_obj=ctx)
        xpathContext = parserContext.context()

        resultContext = xpathContext.transformContext()

        # Wrap the textile output in a single root element so it parses as XML.
        source = "<div>" + textile.textile(node.content) + "</div>"

        doc = libxml2.parseDoc(source)

        root = doc.getRootElement()
        root.unlinkNode()

        # If you do this you insert the result directly into the output tree
        # resultContext.insertNode().addChild(root)

        return [root]
    except Exception as err:
        sys.stderr.write("Context error " + str(err) + "\n")

    return ""

The tricky parts about this code are converting all parameters into the right types and getting to the pieces of the documents that you need. The problem is that the parameters come in as raw PyCObject instances so you have to convert to the more specific object types manually.

For the way I was using this function, the content parameter contains a list with one item, and that item is an xmlNode. So the line

node = libxml2.xmlNode(_obj=content[0])

converts the PyCObject into an xmlNode instance that you can then use as you usually would. You have to do similar things with the xpathParserContext that comes in as the first parameter. From the xpathParserContext you then have to get an xpathContext and from that you get the context for the result document. The xpathContext variable provides a reference into the source document and the resultContext variable provides access to the document being created.

Update: this paragraph applies to the commented out version of the original code. One confusing thing about the resultContext is the use of the insertNode() method. At first I thought that was inserting a new node into the result document. What it really does is return the node under which the result of the function will be inserted. It would probably have been clearer if the method was named getInsertionNode() or something like that.

Now that this is working it's pretty slick, but it sure took a while to figure out exactly what needed to happen.

Posted by kstaken at 04:23 AM | TrackBack

August 21, 2003

libxml2 2.5.10 and libxslt 1.0.32 released

Daniel Veillard has released new versions of libxml2 and libxslt, classifying the libxml2 release as a "major bugfix release", and writing that libxml2 2.5.9 and 2.5.10 "include a lot of bugfixes spanning the whole library; upgrading is strongly recommended." The libxslt 1.0.32 release is also significant in that it is the first to include Python bindings for extension elements. [xmlhack]

I've been using libxml2 in Python a lot lately. It's the first XML parser I actually like, mainly because of the very convenient XPath API.

Posted by kstaken at 11:53 AM | TrackBack

August 19, 2003

Java Programmers Unite: Say NO To Python

In his comparison of Java and Python productivity, Steve Ferg notes that: A programmer can be significantly more productive in... Via: Nu Cardboard

This is kind of funny, but I do agree with the assessment that Python is vastly more productive than Java. I spent several years working with Java and in retrospect it was astonishingly unproductive. This included a project where I converted a team development effort from Perl to Java for all the reasons that are commonly stated for making such moves. In retrospect that was a horrible decision. I did it under the belief that we would have a more scalable process, more stable code and ultimately faster development of new applications because of this. It never really worked out that way, and this was with Perl. I really regret making that move even though Perl is a far worse language for team projects than Python is. Over the last few months I've been working more and more with Python and I'm pretty convinced that the difference in raw productivity makes up for any loss from giving up static typing and a compilation phase. I believe the same applies to Perl, although not quite as much.

What's important to understand about running scripting languages in large projects is that you have to have good tests. When we were using Perl we had lots of unit tests; however, when we switched to Java the code itself took too much precedence and the creation of unit tests suffered. Sure, it's nice to have a testing policy, but schedule pressure can have nasty ways of interfering, especially when the overhead of the language is slowing things down. In the end our system became even harder to change than when we started out. The problem, of course, is that a language being statically typed and compiled does not remove the need to write tests. So you're now taking on the burden of a less productive language while not shedding the burden of writing comprehensive tests. In addition, because of the static typing, compilation phase and OO access protection features of languages like Java, your tests have also become much harder to write.

Because of this, I'm really starting to believe that efforts to improve software quality by tightening up the language through stronger typing and more rigid language features are really the wrong approach. I tend to believe a more profitable future will be attained by making languages easier to use and building in mechanisms that improve the testability of code. My feeling is that it isn't really important that you pass the right type to the right method; what's important is that the code does what it is supposed to do. Passing the right type to the right method may be a precondition for this, but it is not sufficient to guarantee it, and that to me just makes it overhead. Tests are still required. If dealing with all the static typing and compilation phases makes test development suffer, then I tend to believe those features are counterproductive and just get in the way.
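To make that concrete, here is a small sketch. The function names and numbers are made up for illustration and don't come from any real project. Both versions take and return exactly the right types, so a type checker would be equally happy with either one, but only a test notices that the first is wrong.

```python
from typing import Callable, List


def average_buggy(values: List[float]) -> float:
    """Type-correct but wrong: divides by a hard-coded count."""
    return sum(values) / 2


def average(values: List[float]) -> float:
    """Type-correct and right: divides by the actual number of values."""
    return sum(values) / len(values)


def passes_test(func: Callable[[List[float]], float]) -> bool:
    """A tiny behavioural test: the average of 1, 2 and 3 must be 2."""
    return func([1.0, 2.0, 3.0]) == 2.0


print("buggy version passes:", passes_test(average_buggy))
print("fixed version passes:", passes_test(average))
```

The annotations cost a little typing in both cases, but the one-line test is the only thing that tells the two functions apart.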

Posted by kstaken at 10:46 AM | TrackBack

August 18, 2003

Still thinking about Blosxom

I'm still thinking about converting this site to run on a Blosxom derivative, in particular Pyblosxom. I spent a fair amount of time over the last week getting everything working in a test setup, and I've worked everything out except that I'm worried about its performance. Even running on my dual 1.25GHz Power Mac there's a noticeable delay when viewing a page. This concerns me, as the server this site runs on is only a dual 266MHz Pentium II, and I haven't tested it on that machine yet. This really shouldn't be a major problem, as this site doesn't get all that much traffic and the network is kind of slow anyway, but CGI scripts always bug me. This is the one good thing about MovableType: it's slow to post, but that's because it creates static pages for everything.

The Perl version of Blosxom can also be used to generate static pages; the Python version can't.

So far I've written four plugins for Pyblosxom to make it as compatible with the current site as possible. I had to add RSS 1.0 support, Textile formatting support and a post body summarize function. Along with these I also created a new plugin that tracks referrers and hit counts on a per-post basis. That one was more me experimenting with Berkeley DB XML than anything, but it's very useful.

Anyway, now I'm stuck trying to decide whether to go with Pyblosxom, go with the original Perl Blosxom or to punt on the whole thing and just stick with MovableType.

Posted by kstaken at 10:11 PM | TrackBack

August 14, 2003

10 Python Pitfalls

A good list of common problems in Python. I like Python, but I'm anything but an expert so I learned a thing or two from this.

Posted by kstaken at 10:00 AM | TrackBack

August 10, 2003

Thinking about Blosxom

I'm thinking about moving this site to Blosxom from MovableType. I'm getting annoyed with a lot of things about MovableType, in particular how slow it is when publishing. Each time I add a post it gets a little slower and it's starting to really bug me. I'm also annoyed by how cumbersome it is to edit templates, especially since I have more than one blog and they all use the same templates. I also like the way NetNewsWire handles Blosxom blogs: it shows you the hierarchy and all posts you've made at all times. With MovableType you only get your recent posts, and if you restart NetNewsWire you'll only get the last 10 or so posts you've made. This makes it really cumbersome to go back and edit your old posts. There are many other little things as well that are bugging me as I use this system more.

I really like the simple file-based mechanism that Blosxom uses. I've always considered MovableType's use of a database as massive overkill (even if it is just MySQL). I also like the idea of being able to build the whole thing locally and then just shove it up on the server with nothing major needing to be installed on the server.

After reading through the documentation it looks like the one problem area with Blosxom may be categories. I tend to add posts into several different categories, but it looks like Blosxom may only support one category for each post. I'll explore this a little more to find out for sure, but it can't be that hard to make it use symlinks or something to do it.
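Here is a rough sketch of what I mean by the symlink idea. The directory names and the post file are made up, and whether Blosxom actually follows symlinked entries is an assumption I haven't checked yet. It just shows one post file living in one category directory and being linked into a second.

```python
import os
import tempfile

# Hypothetical layout: one directory per category under a data root.
data_root = tempfile.mkdtemp()
for category in ("python", "macosx"):
    os.mkdir(os.path.join(data_root, category))

# The post lives in its primary category (title on the first line)...
original = os.path.join(data_root, "python", "pyobjc-notes.txt")
with open(original, "w") as handle:
    handle.write("PyObjC notes\nBody of the post.\n")

# ...and a relative symlink makes it show up in a second category too.
link = os.path.join(data_root, "macosx", "pyobjc-notes.txt")
os.symlink(os.path.join("..", "python", "pyobjc-notes.txt"), link)

# Reading through the link gives the same post.
with open(link) as handle:
    print(handle.readline().strip())
```

Using a relative link target means the whole data directory can be copied to the server without the links breaking.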

I found this site which has some pretty good information on making the move.

There's also a version written in Python that looks interesting. Hmm, that could actually be fun. Blosxom is written in Perl, and even though I'm perfectly comfortable writing Perl code, it's not something I really enjoy anymore. Python, however, I do like. Very, very tempting.

If I do make this move it will be the fourth time I've changed the software I use for this blog. Ugh! Sadly, it probably won't be the last either.

Posted by kstaken at 02:04 AM | TrackBack

August 09, 2003

Installing Berkeley DB XML on Mac OS X with Python and Perl API support

I just wanted to post some notes about installing Sleepycat Berkeley DB XML on Mac OS X 10.2 with Perl and Python support. The builds are relatively straightforward and Sleepycat has posted a simple script to help build Berkeley DB XML itself. However, it isn't clear what is necessary to get Perl and Python working.

The most important thing: before you start compiling anything, make sure you have the latest GCC 3.3 from Apple. This is distributed as a patch to the December 2002 developer tools. This is critical; without it Python and Perl support will not work.

Next, unfortunately, you'll have to build a new Perl and Python. The Mac OS X 10.2 Python should be the right version, but I couldn't get it to work. Building a fresh Python 2.3 does work. For Perl, Mac OS X includes Perl 5.6 and Berkeley DB XML requires 5.6.1 so you have to build a new one. I used Perl 5.8.0 and it seems to work fine. So you have to build a new Python, a new Perl and the Berkeley DB XML distribution. These should all build using the standard instructions and for DB XML you can use their script.

Once you have all that built, you can then build the DB XML Perl and Python libraries.

For Python you first need to build and install bsddb3. Once that's done you can build the Python support for DB XML in the usual Python fashion. Make sure the python you're using is the one you built previously. Unless you specified otherwise, it's installed in /usr/local/bin/python.

cd dbxml-1.1.0/src/python
/usr/local/bin/python setup.py build
sudo /usr/local/bin/python setup.py install

There's an example Python program in dbxml-1.1.0/examples/python/examples.py that you can run to test the build.

For Perl you just build it in the usual Perl manner. Again, make sure you use the perl you compiled.

cd dbxml-1.1.0/src/perl
/usr/local/bin/perl Makefile.PL
make
sudo make install

There are some examples for the Perl API in dbxml-1.1.0/src/perl/examples.

Posted by kstaken at 08:33 PM | TrackBack

XML Document Construction With Python and libxml2

The libxml Python API is very lightly documented, so this is an attempt to fill in some of the holes that exist.

Creating a new document

To create a new document using the libxml2 API you use a document constructor function that returns an empty document instance. This function takes one argument that represents the XML version of the document being created.

import libxml2
doc = libxml2.newDoc("1.0")

Creating elements

Once you have a document instance you then need to add elements to it. First off you need to create the root element.

root = doc.newChild(None, "root-element", None)

The root node is created using the xmlDoc.newChild() method. This method takes three parameters.

  • namespace - The namespace that the element should belong to or None if no namespace.
  • node name - The name of the node with no namespace prefix.
  • element content - The content for the element or None if the element is empty.

In this particular case we're creating an empty element named root-element. If we were to print this out at this point it would look something like this.

<?xml version="1.0"?>
<root-element/>

If we wanted to put the node into a namespace we would write this instead.

root = doc.newChild(None, "root-element", None)
namespace = root.newNs("http://example.com/sample", "sample")
root.setNs(namespace)

The resulting document then becomes.

<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample"/>

Now that we've created the root we can continue adding elements to the document. We can add an element child-node in the http://example.com/sample namespace by adding

child = root.newChild(namespace, "child-node", None)

And our document now looks like

<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample">
    <sample:child-node/>
</sample:root-element>

If we had wanted to include some text within the added child, it's as simple as changing the third parameter to newChild.

child = root.newChild(namespace, "child-node", "Some sample text")

Which generates the document

<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample">
    <sample:child-node>Some sample text</sample:child-node>
</sample:root-element>

Adding an attribute to an element is also very easy.

child = root.newChild(namespace, "child-node", "Some sample text")
child.setProp("an-attribute", "with a value")

Which of course generates a document that looks like this.

<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample">
    <sample:child-node an-attribute="with a value">Some sample text</sample:child-node>
</sample:root-element>

If you wanted the attribute to be part of a namespace, you use setNsProp instead of setProp.

child = root.newChild(namespace, "child-node", "Some sample text")
child.setNsProp(namespace, "an-attribute", "with a value")

And the result

<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample">
    <sample:child-node sample:an-attribute="with a value">Some sample text</sample:child-node>
</sample:root-element>

Besides simple elements and attributes, libxml defines methods to create all the other common XML types. Here's a summary of the methods that are available.

  • xmlDoc.newDocComment(comment) - Creates a comment node.
  • xmlDoc.newCDataBlock(content, length) - Creates a CDATA section.
  • xmlDoc.newDocText(content) - Creates a new text node.

These methods are all node construction methods that are called to create the instance of the required type. Once you have the instance you then need to add it into the document tree wherever you want it. There's also a function available to create processing instructions. This function differs in that it is called on the libxml2 module, rather than on an xmlDoc instance.

  • libxml2.newPI(name, content) - Creates a processing instruction.

Since these functions require you to create the node and then add it to the document in two steps, libxml provides a number of methods to control where the node is placed in the document tree. These methods are available on any instance of an xmlNode.

  • xmlNode.addChild(node) - Appends the new node to the list of children for the node.
  • xmlNode.addChildList(nodeList) - Appends a list of new nodes to the children for the node.
  • xmlNode.addNextSibling(node) - Adds the new node as a sibling after the selected node.
  • xmlNode.addPrevSibling(node) - Adds the new node as a sibling before the selected node.
  • xmlNode.addSibling(node) - Adds the new node as a sibling after the selected node. (similar to addNextSibling)
  • xmlNode.addContent(content) - Appends additional text content to an element.

Here's an example that puts everything together.

#!/usr/local/bin/python 
import libxml2
doc = libxml2.newDoc("1.0")

root = doc.newChild(None, "root-element", None)
namespace = root.newNs("http://example.com/sample", "sample")
root.setNs(namespace)

child = root.newChild(namespace, "child-node", "Some sample text")

child.setNsProp(namespace, "an-attribute", "with a value")

comment = doc.newDocComment("Just commenting")
child.addPrevSibling(comment)

pi = libxml2.newPI("a-sample-pi", "with some useless content")
root.addPrevSibling(pi)

text = doc.newDocText(" This will be added to the existing text.")
child.addChild(text)

child.addContent(" This will also be added to the text")

print(doc.serialize(None, 1))

And a final result.

<?xml version="1.0"?>
<?a-sample-pi with some useless content?>
<sample:root-element xmlns:sample="http://example.com/sample">
<!--Just commenting-->
  <sample:child-node sample:an-attribute="with a value">Some sample text This will be added to the existing text. This will also be added to the text</sample:child-node>
</sample:root-element>
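For comparison, and for anyone without the libxml2 bindings installed, here is a minimal sketch of a similar namespaced document built with xml.etree.ElementTree from the Python standard library. This is a swapped-in technique, not the libxml2 API described above, and it only covers the element, attribute, text and comment parts of the example.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/sample"

# Ask the serializer to use the "sample" prefix for this namespace.
ET.register_namespace("sample", NS)

# ElementTree spells namespaced names as "{namespace-uri}local-name".
root = ET.Element("{%s}root-element" % NS)

child = ET.SubElement(root, "{%s}child-node" % NS)
child.text = "Some sample text"
child.set("{%s}an-attribute" % NS, "with a value")

# Put a comment in front of the child, like addPrevSibling() above.
root.insert(0, ET.Comment("Just commenting"))

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The main difference in feel is that ElementTree has no separate namespace object to create and pass around; the namespace URI is simply part of each element or attribute name.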

Posted by kstaken at 07:42 PM | TrackBack

July 29, 2003

Python 2.3 released

July 29, 2003
Press Release
SOURCE: Python Software Foundation

PYTHON SOFTWARE FOUNDATION (PSF) ANNOUNCES PYTHON VERSION 2.3
New release enhances powerful programming language

FREDERICKSBURG, Va., July 29, 2003 -- The Python Software Foundation (PSF) announces the release of version 2.3 of the Python programming language. This major release introduces performance enhancements, increased robustness, several minor language features, many additions to the extensive standard library, improved support for Mac OS X and several other Unix-based systems, and a large number of other improvements.

... and if the release of Python 2.3 w/first class OS X support were not enough ....

"The combination of the open source Unix-based core of Mac OS X running on PowerBook G4 high-performance portables has attracted a large number of developers using open source scripting languages like Python," said Bud Tribble, Apple's vice president of Software Technology. "Python 2.3 provides greatly improved support for existing Mac OS X users, and with the upcoming release of Panther, Apple will provide Python 2.3 developers direct access to APIs for the PDF-based Quartz graphics engine and QuickTime image formats."

Excellent. And, of course, PyObjC will continue to provide first class support for integrating Python and Objective-C, including full blown Cocoa application development using Python in place of Objective-C.

... and on my son's third birthday and everything. [bbum's rants, code & references]

Posted by kstaken at 11:43 PM | TrackBack

July 09, 2003

PyObjC 1.0b1 Released

The improvements include: Improved performance and stability, Better tutorials and examples, Initial support for MacOS X 10.1, Support for the WebKit framework, Write plugin bundles in Python (requires Python 2.3) [Studio Log]

I've been playing around with the 0.9 release of PyObjC. It's a very promising project badly in need of better documentation. I've really been getting into Python lately and have been using it heavily with my Cocoa projects. I'd love to see Python included by Apple as a full peer with Objective-C and Java for Cocoa applications. It would make Cocoa development even easier than it already is. AppleScript Studio is nice, but AppleScript is still not a language that I feel comfortable with, and the way it is integrated with Cocoa means there are quite a few things you can't do with AppleScript. I've found it easier to just stick with Objective-C. The way PyObjC is being integrated, it is functionally equivalent to Objective-C and should bring all the power along with the ease of a scripting language.

Posted by kstaken at 12:18 PM | TrackBack

June 29, 2003

iTunes Playlist to Blog

While messing around today I wrote a little Python script to post an iTunes playlist to a MetaWeblog API enabled blog (like MovableType). I'm toying with the idea of using it to auto-post a top 25 list of songs once per week or something. The script is available here.
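For a rough idea of what's involved, here is a minimal sketch. It is not the linked script: the sample rows, post title, blog id, username and password are all placeholders, and it only builds the XML-RPC request for metaWeblog.newPost rather than sending it to a server.

```python
import xmlrpc.client
from html import escape

# Placeholder rows: (artist, song, album, play count).
playlist = [
    ("Metallica", "Frantic", "St. Anger", 12),
    ("Massive Attack", "Antistar", "100th Window", 11),
]


def playlist_to_html(rows):
    """Render the playlist rows as a simple HTML table."""
    lines = [
        "<table>",
        "<tr><th>Artist</th><th>Song</th><th>Album</th><th>Play Count</th></tr>",
    ]
    for row in rows:
        cells = "".join("<td>%s</td>" % escape(str(value)) for value in row)
        lines.append("<tr>%s</tr>" % cells)
    lines.append("</table>")
    return "\n".join(lines)


# metaWeblog.newPost takes a struct with the post's title and description.
post = {"title": "Top 25 this week", "description": playlist_to_html(playlist)}

# Build the request body only; nothing is sent over the network here.
request_body = xmlrpc.client.dumps(
    ("1", "username", "password", post, True),  # blogid, user, password, post, publish
    methodname="metaWeblog.newPost",
)
print(request_body)
```

To actually post, the same arguments would go through xmlrpc.client.ServerProxy pointed at the blog's XML-RPC endpoint, which differs from one installation to the next.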

Here's what the top 25 looks like for this week. This is from an iTunes smart playlist that shows the top 25 most played songs that have been added to my library in the last month. iTunes smart playlists are an absolutely great feature that I hope shows up in other places in Mac OS X, like oh maybe in the Finder as a smart list of files.

Artist | Song | Album | Play Count
Metallica | Frantic | St. Anger | 12
Massive Attack | Antistar | 100th Window | 11
Massive Attack | Butterfly Caught | 100th Window | 11
Massive Attack | Everywhen | 100th Window | 11
Massive Attack | Future Proof | 100th Window | 11
Massive Attack | Name Taken | 100th Window | 11
Massive Attack | Hymn Of The Big Wheel (Origin | Hymn Of The Big Wheel | 11
Metallica | Dirty Window | St. Anger | 11
Metallica | My World | St. Anger | 11
Annie Lennox | A Thousand Beautiful Things | Bare | 10
Annie Lennox | Bitter Pill | Bare | 10
Annie Lennox | Erased | Bare | 10
Evanescence | Bring Me To Life (Feat. Paul | Fallen | 10
Massive Attack | Prayer For England | 100th Window | 10
Massive Attack | Small Time Shot Away | 100th Window | 10
Massive Attack | Special Cases | 100th Window | 10
Massive Attack | What Your Soul Sings | 100th Window | 10
Massive Attack | Any Love (Larry Heard Mix) | Hymn Of The Big Wheel | 10
Massive Attack | Home Of The Whale | Hymn Of The Big Wheel | 10
Massive Attack | Hymn Of The Big Wheel (Nellee | Hymn Of The Big Wheel | 10
Annie Lennox | Honestly | Bare | 9
Annie Lennox | Loneliness | Bare | 9
Annie Lennox | Oh God (Prayer) | Bare | 9
Evanescence | Everybody's Fool | Fallen | 9
Evanescence | Going Under | Fallen | 9

What's funny is that I have a tremendous breadth of musical interest, but you sure wouldn't know it from this list. I bought a number of more popular albums a couple weeks ago which skews the results away from the more eclectic mix I usually get from eMusic.

Currently Playing "Marquis Cha-Cha" by "The Fall" from the album "Palace Of Swords Reversed", a little more eclectic bit from eMusic.

Posted by kstaken at 05:01 PM | TrackBack