Conferences on Intelligent Computer Mathematics - Birmingham 2008
CICM '08

Doctoral Programme

Birmingham, UK, 29-31 July 2008

Normen Müller: Fine-Granular Version Control, Redundancy Resolution & Long-range Effect Computation

We propose an abstract theory of collaborative content management and version control for collections of semi-structured documents. Our fs-tree model generalizes version-controlled file systems and XML files alike, which allows us to specify and implement version control algorithms that work seamlessly across the file/file system border. An added value of this model, which incorporates symbolic links and XML inclusions as first-class citizens, is that we can significantly enhance workflow consistency, space efficiency, and processing speed on working copies by eliminating link copy redundancy.

To evaluate our model and algorithms, we have implemented the locutor prototype based on the ideas put forward in this paper. locutor, developed at Jacobs University, is a full reimplementation of the Unix Subversion client that focuses on smart management of change. Whereas SVN compares text files line by line, locutor is able to use semantic knowledge about the document format to distinguish semantically relevant from irrelevant changes. For LaTeX, for example, semantically irrelevant changes include whitespace, comments, newlines, and include/import and URI normalization. The next release of locutor will be able to perform customized operations before every commit and after every update, so that LaTeX files can be filtered before they are made available to locutor and the SVN database for change management. Thus, only semantically relevant changes would trigger conflicts. Furthermore, because files in our fs-tree model may reference other files, when locutor receives a change upon updating a file, it can not only determine the semantic relevance of the change but also compute its effect and determine the set of files that need to be rechecked.
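The two ideas above, filtering semantically irrelevant changes and computing the long-range effect of a change, can be illustrated with a minimal Python sketch. It is not locutor's implementation: the function names (normalize_latex, is_relevant_change, files_to_recheck) and the shape of the references mapping are assumptions made for illustration, and the LaTeX normalization covers only comments, whitespace, and newlines inside a paragraph.

```python
import re
from collections import deque

# Matches a LaTeX comment: an unescaped % and the rest of the line.
# Simplification (assumption): "\\%" (line break followed by a comment)
# is not handled specially.
COMMENT = re.compile(r"(?<!\\)%.*")


def normalize_latex(text: str) -> str:
    """Hypothetical normal form: drop comments, collapse whitespace and
    in-paragraph newlines, keep blank-line paragraph breaks."""
    paragraphs, current = [], []
    for raw in text.splitlines():
        if not raw.strip():
            # A blank line ends the current paragraph.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        words = COMMENT.sub("", raw).split()
        if words:  # comment-only lines contribute nothing
            current.append(" ".join(words))
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)


def is_relevant_change(old: str, new: str) -> bool:
    """A change counts as relevant only if the normal forms differ."""
    return normalize_latex(old) != normalize_latex(new)


def files_to_recheck(changed: str, references: dict[str, set[str]]) -> set[str]:
    """Return every file that directly or transitively references `changed`.

    `references` maps a file to the files it references (an assumed,
    simplified stand-in for links and inclusions in the fs-tree model).
    """
    # Invert the graph: target -> files that reference it.
    referenced_by: dict[str, set[str]] = {}
    for source, targets in references.items():
        for target in targets:
            referenced_by.setdefault(target, set()).add(source)

    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in referenced_by.get(current, ()):
            if dependent != changed and dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

For example, with references = {"main.tex": {"intro.tex", "macros.tex"}, "intro.tex": {"macros.tex"}}, a relevant change to macros.tex would mark both main.tex and intro.tex for rechecking, whereas a change that only adds a comment to macros.tex would be filtered out by is_relevant_change before any effect is computed.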