CVS is the de facto VCS for many people. For tiny projects it performs OK, but once you end up with 40k+ files that are branched 4 times per day, year after year, it starts to creak quite loudly and you start to want to migrate to Subversion.
Take a look at this graph, which shows the number of milliseconds taken to "cvs tag -b" 10 files as a function of the number of branches. cvs 1.11.22 (the latest stable release at this time) is labeled vanilla:
You will notice that the vanilla version has a complexity of O(n^2), where n is the number of branches. Clearly this is not something we can live with.
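One way to check a claim like this against timing logs is to fit a straight line to log(time) versus log(branches): a quadratic curve gives a slope near 2, a linear one a slope near 1. The sketch below is only an illustration. The function name `loglog_slope` is made up for this example, and the numbers are synthetic values generated from a formula, not the measurements behind the graph.

```python
import math


def loglog_slope(ns, times):
    """Least-squares slope of log(time) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Synthetic data (NOT real measurements): one quadratic and one linear curve.
    branches = [100, 200, 400, 800]
    quadratic_ms = [0.002 * n ** 2 for n in branches]
    linear_ms = [3.0 * n for n in branches]
    print("quadratic slope:", round(loglog_slope(branches, quadratic_ms), 2))
    print("linear slope:", round(loglog_slope(branches, linear_ms), 2))
```

Real logs are noisy, so the fitted slope will only be approximately 2 even for a truly quadratic operation.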
It turns out that I'm not the first one to notice that branching becomes horrendously expensive after a while: Adrian West noticed and published a fix. His fix produced the result labeled patched in the graph above.
Adrian's patch makes cvs tag -b around 10 times faster when there are 1000 tags on the files, which is fine on its own, but I thought I might be able to do better.
The hard disk does a whole lot of seeking when tagging, so I tried disabling fsync for tagging and branching operations. This is the result labeled fsync in the graph.
The tag operation was also helped quite a bit by disabling fsync (patched-tag vs. fsync-tag):
Disabling fsync might seem dangerous, but it really shouldn't be much of a problem for tagging and branching, because the worst that can happen is that you have to re-issue the cvs tag command if the machine that houses the cvs server crashes.
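To get a feel for what fsync costs on a given disk, here is a minimal sketch that rewrites a small file many times, with and without fsync. It is not the patch itself (that is C code inside cvs). It assumes a write-to-temp-file-then-rename update pattern, which is an assumption made for illustration, and the names `rewrite_file` and `time_rewrites` are invented for this example.

```python
import os
import tempfile
import time


def rewrite_file(path, data, sync):
    """Replace `path` with `data` via a temp file and rename; fsync only if `sync`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            if sync:
                # Force the data to disk before the rename; this is the slow part.
                os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def time_rewrites(count, sync):
    """Return the milliseconds taken to rewrite one 4 KiB file `count` times."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "file,v")  # hypothetical file name
        payload = b"x" * 4096
        start = time.perf_counter()
        for _ in range(count):
            rewrite_file(path, payload, sync)
        return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    print("with fsync:    %.1f ms" % time_rewrites(200, sync=True))
    print("without fsync: %.1f ms" % time_rewrites(200, sync=False))
```

The gap between the two numbers depends heavily on the disk and filesystem, so treat the output as a rough indication only.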
I've put up my tiny fsync hack along with Adrian's fix here: cvs-branch-performance.tar.gz [16K]
The files in the tarball are:
|cvs-1.11.22-adrian-fsync.diff||The holy grail, contains both Adrian's and my fix for cvs 1.11.22.|
|benchmark-branch||Benchmark harness for cvs operations.|
|plot||Plots the logs generated by the benchmark.|
|cvs-1.11.22-fsync.diff||My humble fsync disabling fix.|
|cvs-1.11.22-adrian.diff||Adrian's fix, updated for 1.11.22.|
|original-adrian-rcs.c.diff||Adrian's original patch.|
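If you would rather not unpack the tarball, a measurement of this kind can be sketched in a few lines. This is not the benchmark-branch harness from the tarball, only a rough approximation of the idea: create one branch after another with "cvs tag -b" and record how long each call takes. The branch naming scheme and the function names are made up for this example, and it assumes it is run inside a checked-out working copy with cvs on the PATH.

```python
import subprocess
import time


def branch_command(index, files):
    """Build a `cvs tag -b` command line; the branch name pattern is hypothetical."""
    return ["cvs", "tag", "-b", f"bench_branch_{index}", *files]


def time_command(argv):
    """Run a command and return the elapsed wall-clock time in milliseconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return (time.perf_counter() - start) * 1000.0


def run_benchmark(files, branch_count, make_command=branch_command):
    """Create `branch_count` branches in turn; return (branch number, ms) pairs."""
    return [(i, time_command(make_command(i, files)))
            for i in range(1, branch_count + 1)]


if __name__ == "__main__":
    # Hypothetical file names; substitute files from your own working copy.
    for number, ms in run_benchmark(["a.c", "b.c"], 5):
        print(number, round(ms, 1))
```

The `make_command` parameter exists so that a different operation, such as a plain "cvs tag", can be timed with the same loop.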