September 08, 2003
Pathan 2.0 alpha provides XPath 2.0 implementation
DecisionSoft has released an alpha version of Pathan 2.0 that implements XPath 2.0.
At the same time they also released Pathan 1.2 release 2 that updates it to use Xerces 2.3 and fixes a few bugs. This release is important to users of Berkeley DB XML.
September 03, 2003
Ah, now it makes sense
I recently posted about how Sonic Software was challenging vendors to remove their license restrictions against posting benchmark findings. Now I just read this article about how Sonic Software is being sued by Tibco for publishing benchmarks. Hmm, OK, guess Sonic hasn't actually seen the light at all.
Is NeoCore Dead
I've been tracking various native XML database companies for a while and one that I was watching was NeoCore. A couple of weeks ago their site abruptly disappeared. At first I figured it was just a glitch, but now it seems they may have died a rather silent death. So does anyone know what happened to NeoCore? I didn't see any news about an acquisition or name change, so I can only conclude they're dead.
August 30, 2003
Sonic Software challenges vendors to "Open Kimonos"
Now this is an interesting tactic. Sonic Software is claiming they're eliminating their license restrictions that prevented the publishing of benchmarks and are challenging others in the industry to do the same. I say, amen, these kinds of restrictions don't do anybody any good.
August 27, 2003
Updated XQuery drafts
The W3C XML Query working group has released updates to five of the drafts that make up XQuery 1.0 and XPath 2.0.
- XML Path Language (XPath) 2.0
- XQuery 1.0: An XML Query Language
- XQuery 1.0 and XPath 2.0 Formal Semantics
- XML Query Use Cases
- XPath Requirements Version 2.0
If you're having trouble sleeping, reading these should help you out.
August 18, 2003
Still thinking about Blosxom
I'm still thinking about converting this site to running on a Blosxom derivative, in particular on Pyblosxom. I spent a fair amount of time over the last week getting everything working in a test setup and I've worked everything out, except I'm worried about the performance of it. Even running on my dual 1.25GHz Powermac there's a noticeable delay when viewing a page. This concerns me as the server this site runs on is only a dual 266MHz Pentium II. I haven't tested it on this machine yet. This really shouldn't be a major problem as this site doesn't get all that much traffic and the network is kind of slow anyway, but CGI scripts always bug me. This is the one good thing about MovableType: it's slow to post, but that's because it creates static pages for everything.
The Perl version of Blosxom can also be used to generate static pages, the Python version can't.
So far I've written four plugins for Pyblosxom to make it as compatible with the current site as possible. I had to add RSS 1.0 support, Textile formatting support and a post body summarize function. Along with these I also created a new plugin that tracks referrers and hit counts on a per-post basis. That one was more me experimenting with Berkeley DB XML than anything, but it's very useful.
Anyway, now I'm stuck trying to decide whether to go with Pyblosxom, go with the original Perl Blosxom or to punt on the whole thing and just stick with MovableType.
August 15, 2003
PostgreSQL and XML
I just came across another project to add some XML support to PostgreSQL, xpsql. It looks like it's pretty rough at this point, but development is ongoing. There's also an older post about some different support, but I don't know if the code was ever released. I'm kind of surprised this type of thing has been so slow in coming.
August 14, 2003
A peek at X#
Linked from this blog entry is a paper that gives some insight into Microsoft's XML programming language that has been called X# by some people. It looks like they're doing more than just integrating XML into the language; they're also integrating relational database access. The paper's a very interesting read.
Legal Troubles For Hydra
Hydra is in trouble over their name. This naming thing is starting to get really annoying. If you ever wondered why the Xindice project has such a weird name, here's your answer. When I came up with the Xindice name I spent a lot of time using the Babelfish translator and Google searches to come up with something that was somewhat meaningful somewhere, yet didn't show up in Google results. Indice is Spanish and Italian (among others) for index; it's also used in English as a plural form of index. Xindice is a made-up word of course, but at least now when you type it into a search engine you're pretty much sure you'll get information about the Xindice native XML database. It also means the probability of a naming conflict is small. Obviously Google isn't authoritative on global naming, but it's better than nothing when you can't afford a trademark lawyer. I like naming things using made-up words.
August 11, 2003
eXist 0.9.2 Released
Wolfgang Meier just announced the release of eXist 0.9.2. Here's the text of the announcement.
I'm pleased to announce that release 0.9.2 is now available on sourceforge.
For those who have not been able to follow the discussions on this list,
here's a quick summary of changes:
This is the first official release with support for XUpdate. Also, much effort
has been invested to ensure that other character encodings than Latin 1 are
correctly processed by the database as well as the query engine. This applies
in particular to East Asian languages and scripts. Further changes include:
important missing parts of the XPath spec have been implemented, more
synchronization and database corruption issues have been addressed,
interfaces improved, and dozens of bugs fixed.
August 10, 2003
Namespace training wheels
The namespaces in XML debate just never dies, Jon Udell has a new take on his perspective at InfoWorld.com.
August 09, 2003
Installing Berkeley DB XML on Mac OS X with Python and Perl API support
I just wanted to post some notes about installing Sleepycat Berkeley DB XML on Mac OS X 10.2 with Perl and Python support. The builds are relatively straightforward and Sleepycat has posted a simple script to help build Berkeley DB XML itself. However, it isn't clear what is necessary to get Perl and Python working.
The most important thing, before you start compiling anything, make sure you have the latest GCC 3.3 from Apple. This is distributed as a patch to the December 2002 developer tools. This is critical, without it Python and Perl support will not work.
Next, unfortunately, you'll have to build a new Perl and Python. The Mac OS X 10.2 Python should be the right version, but I couldn't get it to work. Building a fresh Python 2.3 does work. For Perl, Mac OS X includes Perl 5.6 and Berkeley DB XML requires 5.6.1 so you have to build a new one. I used Perl 5.8.0 and it seems to work fine. So you have to build a new Python, a new Perl and the Berkeley DB XML distribution. These should all build using the standard instructions and for DB XML you can use their script.
Once you have all that built, you can then build the DB XML Perl and Python libraries.
For Python you first need to build and install bsddb3, once that's done you can build the python support for DB XML in the usual Python fashion. Make sure the python you're using is the one you built previously. Unless you specified otherwise, it's installed in /usr/local/bin/python.
cd dbxml-1.1.0/src/python
/usr/local/bin/python setup.py build
sudo /usr/local/bin/python setup.py install
There's an example Python program in dbxml-1.1.0/examples/python/examples.py that you can run to test the build.
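Since using the wrong interpreter is the easiest mistake to make here, a quick sanity check can help before running the examples. The following is only a minimal sketch, assuming a current Python 3 interpreter (it won't run on the Python 2.3 built above, where simply trying the imports at the prompt does the same job). The module names bsddb3 and dbxml are my assumption about what the two builds install, and check_modules is a hypothetical helper, not part of either package.

```python
import importlib.util
import sys


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be located."""
    found = {}
    for name in names:
        # find_spec locates a top-level module without actually importing it.
        found[name] = importlib.util.find_spec(name) is not None
    return found


if __name__ == "__main__":
    # Confirm which interpreter is actually running this script.
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    # "bsddb3" and "dbxml" are assumed module names for the two builds.
    for name, ok in check_modules(["bsddb3", "dbxml"]).items():
        print(name, "found" if ok else "missing")
```

If either module shows up as missing, the likely cause is that the install step ran under a different python than the one you're testing with.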
For Perl you just build it in the usual Perl manner. Again, make sure you use the perl you compiled.
cd dbxml-1.1.0/src/perl
/usr/local/bin/perl Makefile.PL
make
sudo make install
There are some examples for the Perl API in dbxml-1.1.0/src/perl/examples.
XML Document Construction With Python and libxml2
The libxml Python API is very lightly documented, so this is an attempt to fill in some of the holes that exist.
Creating a new document
To create a new document using the libxml2 API you use a document constructor function that returns an empty document instance. This function takes one argument that represents the XML version of the document being created.
import libxml2
doc = libxml2.newDoc("1.0")
Creating elements
Once you have a document instance you then need to add elements to it. First off you need to create the root element.
root = doc.newChild(None, "root-element", None)
The root node is created using the xmlDoc.newChild() method. This method takes three parameters.
- namespace - The namespace that the element should belong to or None if no namespace.
- node name - The name of the node with no namespace prefix.
- element content - The content for the element or None if the element is empty.
In this particular case we're creating an empty element named root-element. If we were to print this out at this point it would look something like this.
<?xml version="1.0"?>
<root-element/>
If we wanted to put the node into a namespace we would write this instead.
root = doc.newChild(None, "root-element", None)
namespace = root.newNs("http://example.com/sample", "sample")
root.setNs(namespace)
The resulting document then becomes.
<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample"/>
Now that we've created the root we can continue adding elements to the document. We can add an element child-node in the http://example.com/sample namespace by adding.
child = root.newChild(namespace, "child-node", None)
And our document now looks like
<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample">
<sample:child-node/>
</sample:root-element>
If we had wanted to include some text within the added child it's as simple as changing the third parameter to newChild.
child = root.newChild(namespace, "child-node", "Some sample text")
Which generates the document
<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample">
<sample:child-node>Some sample text</sample:child-node>
</sample:root-element>
Adding an attribute to an element is also very easy.
child = root.newChild(namespace, "child-node", "Some sample text")
child.setProp("an-attribute", "with a value")
Which of course generates a document that looks like this.
<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample">
<sample:child-node an-attribute="with a value">Some sample text</sample:child-node>
</sample:root-element>
If you wanted the attribute to be part of a namespace, you use setNsProp instead of setProp.
child = root.newChild(namespace, "child-node", "Some sample text")
child.setNsProp(namespace, "an-attribute", "with a value")
And the result
<?xml version="1.0"?>
<sample:root-element xmlns:sample="http://example.com/sample">
<sample:child-node sample:an-attribute="with a value">Some sample text</sample:child-node>
</sample:root-element>
Besides simple elements and attributes, libxml defines methods to create all the other common XML node types. Here's a summary of the methods that are available.
- xmlDoc.newDocComment(comment) - Creates a comment node.
- xmlDoc.newCDataBlock(content, length) - Creates a CDATA section.
- xmlDoc.newDocText(content) - Creates a new text node.
These methods are all node construction methods that are called to create the instance of the required type. Once you have the instance you then need to add it into the document tree wherever you want it. There's also a function available to create processing instructions. This function differs in that it is called on the libxml2 module, rather than an xmlDoc instance.
- libxml2.newPI(name, content) - Creates a processing instruction.
Since these functions require you to create the node and then add it to the document in two steps, libxml provides a number of methods to control where the node is placed in the document tree. These methods are available on any instance of an xmlNode.
- xmlNode.addChild(node) - Appends the new node to the list of children for the node.
- xmlNode.addChildList(nodeList) - Appends a list of new nodes to the children for the node.
- xmlNode.addNextSibling(node) - Adds the new node as a sibling after the selected node.
- xmlNode.addPrevSibling(node) - Adds the new node as a sibling before the selected node.
- xmlNode.addSibling(node) - Adds the new node as a sibling after the selected node (similar to addNextSibling).
- xmlNode.addContent(content) - Appends additional text content to an element.
Here's an example that puts everything together.
#!/usr/local/bin/python
import libxml2
doc = libxml2.newDoc("1.0")
root = doc.newChild(None, "root-element", None)
namespace = root.newNs("http://example.com/sample", "sample")
root.setNs(namespace)
child = root.newChild(namespace, "child-node", "Some sample text")
child.setNsProp(namespace, "an-attribute", "with a value")
comment = doc.newDocComment("Just commenting")
child.addPrevSibling(comment)
pi = libxml2.newPI("a-sample-pi", "with some useless content")
root.addPrevSibling(pi)
text = doc.newDocText(" This will be added to the existing text.")
child.addChild(text)
child.addContent(" This will also be added to the text")
print(doc.serialize(None, 1))
And a final result.
<?xml version="1.0"?>
<?a-sample-pi with some useless content?>
<sample:root-element xmlns:sample="http://example.com/sample">
<!--Just commenting-->
<sample:child-node sample:an-attribute="with a value">Some sample text This will be added to the existing text. This will also be added to the text</sample:child-node>
</sample:root-element>
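One way to double-check that the namespace handling came out as intended is to read the result back in. The sketch below does that with xml.etree.ElementTree from the Python standard library rather than libxml2 (a deliberate swap so it runs without the bindings installed, and it assumes a current Python 3). The summarize function and the DOCUMENT constant are names made up for this illustration; the XML is the final result from the example above.

```python
import xml.etree.ElementTree as ET

# Namespace URI used throughout the example above.
NS = "http://example.com/sample"

# The serialized document produced by the libxml2 example.
DOCUMENT = """<?xml version="1.0"?>
<?a-sample-pi with some useless content?>
<sample:root-element xmlns:sample="http://example.com/sample">
<!--Just commenting-->
<sample:child-node sample:an-attribute="with a value">Some sample text This will be added to the existing text. This will also be added to the text</sample:child-node>
</sample:root-element>
"""


def summarize(xml_text):
    """Parse the document and report the namespace-qualified names it contains."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as {namespace-uri}local-name.
    child = root.find("{%s}child-node" % NS)
    return {
        "root": root.tag,
        "child": child.tag,
        "attribute": child.get("{%s}an-attribute" % NS),
        "text": child.text,
    }


if __name__ == "__main__":
    for key, value in summarize(DOCUMENT).items():
        print(key, "=", value)
```

The point is that the sample prefix is only shorthand: what a parser actually sees is the namespace URI paired with the local name, for both the elements and the attribute.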
August 06, 2003
Saxon does XQuery
I just discovered that Saxon now includes support for XQuery. Probably old news for people in the XSL community, but news to me. I'm not much of a fan of XQuery, but it looks like Saxon will finally bring a usable implementation to play with. There's a discussion starting about using it to implement XQuery in Xindice. Unfortunately, it's under the Mozilla Public License which will probably kill that idea.
August 04, 2003
Xindice Wiki
Just discovered there's a Wiki set up for the Xindice project. That's a very good thing to see as one of the major problems with the Apache projects is that it's too much trouble to update the web site regularly and this leads to stale sites. The Xindice web site is a prime example of this. I was looking at it the other day and noticed it doesn't even have any links to where you can download Xindice 1.1. The problem is it's a multistep process: you have to write the content in XML using Forrest, build the site, make sure it's right and then copy it up to the server. It doesn't seem like that big of a deal, but for me it's enough that it got in the way of making as many updates as I'd like. It seems this has carried forward with the current committers as well.
The other major benefit of the Wiki is that anyone can edit it and this leads to incremental development of the content. There's no bottleneck waiting for someone to come in and publish things.
July 28, 2003
Xindice Project Web Site Statistics
I just noticed that there's a page showing interesting graphs of statistics for various Apache projects. The page for Xindice is interesting. Looks like Xindice is downloaded roughly 7,000 times per month. I'm sure this was probably posted on the mailing list at some point, but I completely missed it. It's interesting to compare it to the stats I posted shortly after the dbXML project was moved to the ASF to become Xindice.
July 23, 2003
Computerworld on Native XML Databases
Computerworld has an article looking for success stories in using native XML databases.
I'm also interested in this topic. In particular I'm interested in success stories using any of the Open Source native XML databases. Particularly, Sleepycat Berkeley DB XML, Xindice and eXist. If you have any please feel free to contact me. I'm looking at a number of writing projects coming up in the future that revolve around this topic.
July 11, 2003
Xindice Project Needs Help
In order to get a final Xindice 1.1 release out the Xindice project needs more help.
In particular someone is needed to complete the work on building a standalone distribution that uses Jetty as the container. Also extremely important, the documentation needs major updates to account for the significant changes that have been introduced since the 1.0 release.
Xindice 1.1b2 Released
Xindice 1.1 beta 2 has been released.
Just the next step to finally getting a 1.1 stable release out.
July 09, 2003
I've received a great "honor"
Mike Champion was kind enough to let me know that I've received the great honor of being debunked by the great Fabian Pascal. What's funny is that the email he's picking on was written at least two years ago, maybe even longer. I'm sure much of what he says is true, or maybe not, who knows. I just laughed when I read it. I have no idea why he feels that a mailing list posting from a couple of years ago is worthy of his time.
July 04, 2003
LDAP Exhaustion and XQuery Lamenting
Spent the day today in a data center installing a couple LDAP servers for a client. It's been a while since I've spent any amount of time inside a data center and I'd forgotten how exhausting it can be. The noise and the cold air blasting through the floor really takes it out of you.
LDAP is an interesting technology; it's what sparked my interest in semi-structured data, which led me to working on native XML databases. I hadn't worked with it in quite a few years, but it's a good example of a stable standard. In fact the LDAP protocol itself hasn't changed at all in that time period. The products have of course matured, but it's all still the exact same concepts and at the low levels the details are the same. A refreshing change compared to the spec-a-week mess that XML has become.
Fortunately the core specs in XML (XML 1.0, Namespaces, XPath 1.0, XSL-T 1.0) have now proven to be stable and I suspect we'll start to see the weaker (and much more complex) later specs beginning to drop off the radar. It's too bad, but in a lot of ways just about everything after the release of XSL-T 1.0 seems pretty irrelevant. This isn't altogether a good thing. The current XPath 1.0 and XSL-T 1.0 specs definitely have room for improvement. I just wonder if XPath 2.0 and XSL-T 2.0 are going to provide that improvement without drowning under the added complexity that being associated with XQuery has introduced.
I've gotten to the point where I don't even pay much attention to XQuery anymore. Maybe that's not a good thing, but I just don't see any real world interest in it. In particular, it's pretty much non-existent in the Open Source world. None of the big three Open Source XML databases (eXist, Xindice, Sleepycat DbXML) support it, and I kind of doubt that they ever will. The sad reality is that it's just not asked for all that often.
Once upon a time there was a real need for XQuery (or at least there was for XPath with joins and updates), but in the pursuit of academic perfection the complexity has mounted, the number of associated specs has mushroomed and the spec has delayed itself into irrelevance.
Why am I writing about this? I don't know, maybe I just wish that XML databases actually mattered anymore. There was an awful lot of waiting brought out by the presence of XQuery. It's provided a big cloud to hang over the whole XML database arena. Instead of focusing on doing profitable work with XPath and just adding the missing pieces, innovation stopped and everyone delayed all their plans around the development of XQuery. Now, several years later XQuery still isn't finished and products are finally shipping with incomplete XQuery implementations with proprietary extensions for things like updates. This stuff should have been added years ago and now we have delayed products shipping with implementations that aren't going to interoperate anyway. So what have we gained? In my opinion, not much. In fact I think XQuery may end up killing the entire XML database market.
June 28, 2003
New Xindice 1.1 Build Available
I've been virtually silent about Xindice lately, but Kevin Ross has stepped up as a new leader within the project and has just announced the release of a new Xindice build. This is a 1.1 build and Kevin seems intent on getting a 1.1 release out, something I'm very happy to hear. My original plans called for Xindice 1.1 to be released almost 1.5 years ago. The project has struggled a lot since then and my dropping in and out of it hasn't helped much. I did manage to get a 1.1 beta build out a couple months ago, but again had to drop out of the project before making any more progress.
Xindice definitely needs some new blood among its developers. Kevin has stepped up as a new leader, but he needs a lot more help.
May 07, 2003
XML query specs edge closer to completion
They would allow collections of XML files on the Web to be queried like databases. [Computerworld XML News]
TEN! TEN!!!! TEN!!!!!!! Working drafts for XQuery/XPath 2.0. Man it just keeps getting worse. I'm curious does anyone really care about XQuery anymore? It certainly could be useful, but who is going to be able to implement it? Ugh, anyway, I'm glad to see they're adding full text, but it looks like updates are still missing. Oh well, it doesn't matter now.
April 17, 2003
New XML:DB API Implementation
Lars Martin just posted an announcement from Cincom saying that they've implemented the XML:DB API. They're claiming that they support Core and Transaction which is kind of amusing since we never actually finished Transaction, but it's all good I guess. I haven't worked on the XML:DB API in a long time and the project is badly in need of new blood to continue the work. There are many, many things that could be improved greatly with it.
March 27, 2003
Microsoft Yukon
[Sam Gentile's Blog] I'm anything but a fan of Microsoft. However, Yukon is worth watching closely, especially due to its significance in regard to the upcoming WinFS file system in Windows Longhorn. Watch the XML support. Microsoft has the potential to do some very interesting things there if they don't screw it up. If they do what I think they're going to do it will be very cool and I might actually regret not being able to use their software. OK maybe not, it will still be Windows and Microsoft is still not to be trusted, but it has potential.
March 07, 2003
XinCJ - C++ Interface to Xindice
Hauke von Bremen sent me a link to XinCJ which is his C++ interface to Xindice. Looks like he mirrored the XML:DB API into C++.
February 22, 2003
JaxMe 1.53 with XML:DB API support
JaxMe 1.53 released with XML:DB API support. [xmlhack]
Cool, another XML:DB API implementation.
Latent Semantic Indexing
Came across an interesting article on O'Reilly network Building a Vector Space Search Engine in Perl that led me to an even more interesting paper on Latent Semantic Indexing. Fascinating stuff and could be very useful for providing full text indexing of XML data where you could add the XPath to a node as another dimension in the index. Something certainly worth exploring.
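To make the XPath idea a bit more concrete, here is a rough sketch in Python. It is plain vector space matching with cosine similarity, not Latent Semantic Indexing (which would add a singular value decomposition step on top), and it is only one possible reading of "adding the XPath as a dimension": each dimension is an (xpath, word) pair, so the same word counts separately depending on where it appears. The vectorize and cosine functions and the sample paths are all made up for illustration.

```python
import math
from collections import Counter


def vectorize(nodes):
    """Build a term-count vector from (xpath, text) pairs.

    Each dimension is an (xpath, word) pair, so the same word found
    under different paths lands in different dimensions.
    """
    vector = Counter()
    for xpath, text in nodes:
        for word in text.lower().split():
            vector[(xpath, word)] += 1
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Two hypothetical documents with the same words in different places.
    doc_a = vectorize([("/book/title", "XML databases"), ("/book/body", "indexing text")])
    doc_b = vectorize([("/book/title", "indexing text"), ("/book/body", "XML databases")])
    # A query for the word "XML" appearing in a title.
    query = vectorize([("/book/title", "XML")])
    print("doc_a:", cosine(query, doc_a))
    print("doc_b:", cosine(query, doc_b))
```

Both documents contain the word XML, but only the first has it in the title, so only the first scores above zero for this query. A real index would need some way to match partial paths too, which this sketch doesn't attempt.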
February 19, 2003
Xindice 1.1 beta 1
I've just posted builds of Xindice 1.1 Beta 1. Documentation is in a pretty bad state right now, but this is the first downloadable release of the 1.1 tree.
Because of the documentation issue, I'm not particularly happy with this release. We needed to get something out so that people can start migrating away from 1.0 systems. Hopefully someone will step up to write better documentation.
