Modeling Project Releng/SVN Support

Overview

Currently, all Modeling projects store their source code in CVS, and there are a number of tools that take advantage of CVS's logs, in particular:

Affected Tools

These tools query a MySQL database that contains the parsed CVS logs, so they never touch the CVS logs directly. Even so, the information that SVN provides is neither a strict superset nor a strict subset of the information that CVS provides, and mapping it onto the current database schema is infeasible.

The Good

Tracking updates in SVN is much easier than tracking updates in CVS, in particular:

new "branches" and "tags" can be parsed out without needing to parse the entire log
all changes in a particular commit are tied together
files can be formally moved/renamed

Overall, a "Search SVN" tool should be easier to develop from scratch than Search CVS was.
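As a rough illustration of the points above, here is a minimal sketch that reads the XML produced by `svn log --xml --verbose` and keeps every changed path grouped under its commit. The sample log, the class names and the `renames` helper are assumptions made up for this example, not part of any existing tool. It relies on the fact that SVN records a move as an add with copy-from information plus a delete of the source in the same revision.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical sample shaped like `svn log --xml --verbose` output.
SAMPLE_LOG = """
<log>
<logentry revision="42">
<author>alice</author>
<date>2008-03-01T12:00:00.000000Z</date>
<paths>
<path action="M">/trunk/plugin/src/Foo.java</path>
<path action="A" copyfrom-path="/trunk/plugin/src/Bar.java" copyfrom-rev="41">/trunk/plugin/src/Baz.java</path>
<path action="D">/trunk/plugin/src/Bar.java</path>
</paths>
<msg>Rename Bar to Baz and update Foo</msg>
</logentry>
</log>
"""


@dataclass
class ChangedPath:
    action: str                      # "A", "M", "D" or "R"
    path: str
    copyfrom_path: Optional[str] = None
    copyfrom_rev: Optional[int] = None


@dataclass
class Commit:
    revision: int                    # repository-wide revision number
    author: str
    message: str
    paths: List[ChangedPath] = field(default_factory=list)


def parse_svn_log(xml_text: str) -> List[Commit]:
    """Turn the XML log into one Commit per <logentry>."""
    commits = []
    for entry in ET.fromstring(xml_text).iter("logentry"):
        commit = Commit(
            revision=int(entry.get("revision")),
            author=entry.findtext("author", default=""),
            message=entry.findtext("msg", default=""),
        )
        for p in entry.iter("path"):
            rev = p.get("copyfrom-rev")
            commit.paths.append(ChangedPath(
                action=p.get("action"),
                path=p.text,
                copyfrom_path=p.get("copyfrom-path"),
                copyfrom_rev=int(rev) if rev is not None else None,
            ))
        commits.append(commit)
    return commits


def renames(commit: Commit):
    """Pairs (old, new) where the copy source was deleted in the same commit."""
    deleted = {p.path for p in commit.paths if p.action == "D"}
    return [(p.copyfrom_path, p.path) for p in commit.paths
            if p.action == "A" and p.copyfrom_path in deleted]


commits = parse_svn_log(SAMPLE_LOG)
print(commits[0].revision, renames(commits[0]))
```

With CVS, both the grouping of files into one change and the rename would have to be guessed from timestamps and log messages. Here they come straight out of the log.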

The Bad

SVN is very flexible, and in particular:

there is no formal concept of tags
there is no formal concept of branches
SVN revisions are per-repository, not per-file

Tags and Branches

The majority of the complexity in parsing the CVS logs and doing queries on the resulting data comes from handling branches and tags properly. The fact that SVN's approach to both is completely different (and vastly more flexible) makes matters difficult.

SVN has the convention of laying out a repository with three top level directories, like so:

/trunk
/tags
/branches

Active development is done on /trunk. For a release, /trunk is copied to a new directory in /tags (copies are essentially free in SVN). When there is a desire to branch, /trunk is copied to a new directory in /branches, and work on that branch is done on the copy.

Why is this bad? It's bad (for a potential "Search SVN" tool) because none of this is enforced by SVN, and there's nothing special about those directories beyond the convention of the committer. Given n committers, it's inevitable that one will end up with at least n+1 different commit conventions. It should be obvious that supporting an ever-growing number of commit conventions is a losing battle.
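To make the problem concrete, here is a small sketch of a classifier that only understands the conventional /trunk, /tags and /branches layout. The `Location` type, the `classify` function and the example paths are hypothetical names chosen for this illustration. Any path that follows some other convention falls through to "unknown", and every extra convention would need another special case.

```python
from typing import NamedTuple, Optional


class Location(NamedTuple):
    kind: str             # "trunk", "tag", "branch" or "unknown"
    name: Optional[str]   # tag or branch name; None for trunk and unknown
    rest: str             # path below the trunk, tag or branch root


def classify(path: str) -> Location:
    """Classify a repository path, assuming the conventional layout."""
    parts = [p for p in path.split("/") if p]
    if parts and parts[0] == "trunk":
        return Location("trunk", None, "/".join(parts[1:]))
    if len(parts) >= 2 and parts[0] in ("tags", "branches"):
        kind = "tag" if parts[0] == "tags" else "branch"
        return Location(kind, parts[1], "/".join(parts[2:]))
    # Nothing in SVN stops a committer from using any other layout.
    return Location("unknown", None, "/".join(parts))


for example in ("/trunk/plugin/Foo.java",
                "/tags/R1_0/plugin/Foo.java",
                "/releases/1.0/plugin/Foo.java"):
    print(example, "->", classify(example))
```

The third path is a perfectly valid place to keep a release in SVN, but this classifier cannot tell that it is a tag.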

Revisions

The SVN approach to revisions is actually preferable when starting from scratch, but it's so completely different from the CVS approach that mapping one onto the other is a bad idea.
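A minimal sketch of the difference, using made-up class names and file names: in CVS each file carries its own dotted revision number, while in SVN a single repository-wide number identifies the whole change. The records below are illustrative only and do not reflect the current database schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CvsFileRevision:
    path: str
    revision: str        # per-file, dotted, e.g. "1.5" (longer on a branch)


@dataclass(frozen=True)
class SvnRevision:
    number: int          # repository-wide, shared by every path it touches
    paths: List[str]


# One logical change to two files, as each system would record it.
cvs_change = [
    CvsFileRevision("Foo.java", "1.5"),
    CvsFileRevision("Bar.java", "1.12"),
]
svn_change = SvnRevision(42, ["Foo.java", "Bar.java"])

# CVS needs (path, revision) to identify a version; SVN needs only the number.
cvs_keys = {(r.path, r.revision) for r in cvs_change}
print(len(cvs_keys), "CVS keys versus SVN revision", svn_change.number)
```

A schema keyed on (path, revision) has no natural slot for a revision that spans many paths, which is why I would not try to force one model into the other.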

Supporting SVN

I would suggest a fresh start for the above tools in supporting SVN. The approach and the data presented would be fairly different, bolting SVN support onto the above tools would be more work than starting afresh, and the end result wouldn't be as useful as two distinct tools.

Additionally, I think it's crucially important that all projects that want support should follow the same commit conventions, such that branches and tags can be found algorithmically rather than heuristically.
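As a sketch of what "algorithmically" could look like, assume every project agrees that a tag or branch is created only by copying a directory to a direct child of /tags or /branches. Under that assumption, new tags and branches can be picked out of the changed-path records with a single pattern match. The tuple layout, the `find_new_refs` name and the sample data are hypothetical.

```python
import re

# Assumed strict convention: /tags/<name> or /branches/<name>, nothing deeper.
_CREATED = re.compile(r"^/(tags|branches)/([^/]+)$")


def find_new_refs(changes):
    """Find tag and branch creations.

    `changes` is an iterable of (revision, action, path, copyfrom_path)
    tuples, where copyfrom_path is None unless the path was copied.
    """
    found = []
    for revision, action, path, copyfrom_path in changes:
        match = _CREATED.match(path)
        if action == "A" and copyfrom_path is not None and match:
            kind = "tag" if match.group(1) == "tags" else "branch"
            found.append((kind, match.group(2), revision, copyfrom_path))
    return found


# Hypothetical changed-path records.
changes = [
    (10, "M", "/trunk/plugin/Foo.java", None),
    (11, "A", "/tags/R1_0", "/trunk"),
    (12, "A", "/branches/R1_0_maintenance", "/tags/R1_0"),
    (13, "A", "/tags/R1_0/README", None),   # a file added inside a tag
]

for ref in find_new_refs(changes):
    print(ref)
```

Without a shared convention, the pattern above would have to be replaced by per-project guesses about which copies count as tags or branches.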

However, regardless of the approach taken to support SVN, there will be a non-trivial amount of development work to do.

Notice: this Wiki will be going read only early in 2024 and edits will no longer be possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.
