Dag-Erling Smørgrav edited this page 2021-10-20 23:28:58 +02:00

Migration to Subversion and Trac

Background

When I started writing OpenPAM, I was working on PAM integration and other security-related issues in FreeBSD, as a subcontractor for Network Associates (since acquired by McAfee) under the DARPA CHATS program. This involved maintaining a rather large set of patches to the FreeBSD source tree, and the most convenient way to do this was to use the FreeBSD project's Perforce depot. Thus OpenPAM, which grew out of this work, was initially maintained in that depot.

I quickly grew annoyed with Perforce, and especially with how little it improved over time. There was also the issue of control: I did not control the Perforce server, nor the hardware it ran on. Because it was hosted in the FreeBSD project's depot, OpenPAM remained closely tied to FreeBSD, which I believe has been a major obstacle to its adoption by other operating systems.

For a long time, there were no clear alternatives to Perforce, especially because I did not want to lose history when I switched version control systems. I had long been interested in Subversion, though, and made several attempts at migrating to it until I finally succeeded in February 2006.

Converting the repository

I searched for a tool which could extract a set of files from a Perforce depot and add them, with full history, to a Subversion repository, and came across Ray Miller's p42svn.

After much experimentation, I came to the conclusion that I could not use p42svn unmodified, because of a problem with the way it tries to work around a bug in the Perforce client libraries.

The basic problem is that {{{P4::Print()}}} behaves inconsistently for different file types: if the requested file is a text file, it returns its contents as a string; but if it is a binary file, it prints the contents to {{{STDOUT}}} instead. The way {{{p42svn}}} works around this is that for every file it needs to download (which is every file that was changed in every changeset), it forks off a child which opens a server connection, calls {{{P4::Print()}}} and prints the results. The parent simply captures the child's {{{STDOUT}}} and gets the data it needs, regardless of the file type.
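
The fork-and-capture pattern described above can be sketched in a few lines. This is a minimal illustration in Python, not {{{p42svn}}}'s actual Perl code: {{{inconsistent_print()}}} and {{{fetch_via_child()}}} are hypothetical names, and the hard-coded data stands in for whatever the server would return.

```python
import subprocess
import sys

# Source of the child process. inconsistent_print() is a hypothetical
# stand-in for a call that returns text but writes binary data to stdout.
CHILD_SCRIPT = r"""
import sys

def inconsistent_print(kind):
    if kind == "text":
        return b"hello\n"                      # text: contents are returned
    sys.stdout.buffer.write(b"\x00\x01\x02")   # binary: contents go to stdout
    return None

result = inconsistent_print(sys.argv[1])
if result is not None:
    sys.stdout.buffer.write(result)  # normalise: everything ends up on stdout
sys.stdout.buffer.flush()
"""


def fetch_via_child(kind: str) -> bytes:
    """Run the fetch in a child process and capture whatever it writes."""
    done = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT, kind],
        stdout=subprocess.PIPE,
        check=True,
    )
    return done.stdout
```

The parent never needs to know which path the child took: {{{fetch_via_child("text")}}} and {{{fetch_via_child("binary")}}} both return bytes. The price is one new process, and in the real tool one new server connection, per file.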

This workaround was problematic for me for two reasons. First, my access to FreeBSD's Perforce depot was over SSH across 13 hops with a 200 ms RTT, which meant that connection setup and teardown alone took almost a second for every file. Second, the server was fairly heavily congested and apparently implemented some kind of SSH connection rate limiting, so sooner or later {{{P4::Init()}}} would fail and {{{p42svn}}} would immediately quit instead of retrying after a short pause.
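
For illustration, retrying after a short pause could look roughly like the sketch below. It is written in Python with hypothetical names ({{{ConnectError}}}, {{{connect_with_retry()}}}); it is not part of {{{p42svn}}}, and it is not the approach taken in the changes described here.

```python
import time


class ConnectError(Exception):
    """Hypothetical error raised when a connection attempt is refused."""


def connect_with_retry(connect, attempts=5, pause=2.0, sleep=time.sleep):
    """Call connect() until it succeeds, pausing between failed attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectError:
            if attempt == attempts:
                raise      # out of attempts: let the caller see the error
            sleep(pause)   # give the rate limiter time to recover
```

The {{{sleep}}} parameter exists only so that the pause can be replaced in tests. The attempt count and pause length are arbitrary and would have to be tuned to the server's rate limit.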

I ended up making the following changes:

  • {{{p4_init()}}} was modified to always return the same client connection, which is cached in a global variable.
  • All references to {{{P4::Final()}}} were commented out to avoid closing the cached connection.
  • {{{p4_get_file_content()}}} was modified to call {{{P4::Print()}}} directly, without forking. This was safe to do for OpenPAM, because it does not contain any binary files.

I admit that it's a hack, but it works. The result can be found in the repository.
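
The shape of the three changes listed above (one cached connection, never closed, used directly without forking) can be sketched as follows. This is a Python illustration under stated assumptions, not the modified Perl script: {{{FakeConnection}}}, {{{get_connection()}}} and {{{get_file_content()}}} are hypothetical stand-ins for the Perforce client object, {{{p4_init()}}} and {{{p4_get_file_content()}}}.

```python
_cached_connection = None  # module-level cache, playing the role of the global variable


class FakeConnection:
    """Hypothetical stand-in for a server connection; counts how many are opened."""

    opened = 0

    def __init__(self):
        FakeConnection.opened += 1

    def fetch(self, path):
        return f"contents of {path}"


def get_connection():
    """Return the one shared connection, opening it on first use only."""
    global _cached_connection
    if _cached_connection is None:
        _cached_connection = FakeConnection()
    return _cached_connection


def get_file_content(path):
    # Direct call on the shared connection: no child process is forked,
    # and the connection is deliberately never closed here.
    return get_connection().fetch(path)
```

However many files are fetched, only one connection is ever opened, which is what removes the per-file setup and teardown cost. As noted above, calling the fetch directly is only safe when every file is returned rather than printed, that is, when there are no binary files.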