Keeping FreeBSD sources up-to-date

This is the process I use to keep FreeBSD sources (and ports) up-to-date so I can track FreeBSD stable (and head) and build current packages, as referenced in Upgrading systems running FreeBSD.

Environment

As noted in that document, I actually build FreeBSD (and ports/packages) on two systems: my laptop and a dedicated "build machine" (called "freebeast"). One of my requirements is that my laptop and freebeast are kept (as close as possible to) "in sync" (except for brief periods when I am not building software).

While freebeast is a fixed part of the infrastructure, my laptop tends to travel with me -- and when I'm away from home, the laptop cannot easily communicate with freebeast. This complicates keeping them in sync. This document shows how I go about it.

Subversion

When this was originally written, the FreeBSD project used Subversion (or "svn") as the software to access its "source-of-truth" repositories. While I could have just set up crontab entries on my laptop and freebeast to perform svn update on the various working copies at the same time, that approach seemed unnecessarily fragile to me, and it would have put about twice the load on the FreeBSD infrastructure that one entity ought to.

Back before FreeBSD used svn, it used CVS; back then, I had set things up so my (then-)build machine had, and updated regularly (every night), a local private mirror of the CVS repositories, and I set up my laptop to then "mirror" that local private mirror. So I decided to do the same thing with svn.

However, there was a small complication: For CVS, I used CVSup on freebeast to mirror the official CVS repo, and then used it on the laptop to update the laptop's mirror (either from freebeast or, if I was away from home, from cvs.freebsd.org). But with svn, using svnsync, the local mirror has a pointer to its "source" repository, as shown by:

g1-55(12.2-S)[1] svnsync info file:///repo/svn/freebsd/ports/
Source URL: svn://svn.freebsd.org/ports
Source Repository UUID: 35697150-7ecd-e111-bb59-0022644237b5
Last Merged Revision: 552400
g1-55(12.2-S)[2] svnsync info file:///repo/svn/freebsd/src/base
Source URL: svn://svn.freebsd.org/base
Source Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Last Merged Revision: 366720
g1-55(12.2-S)[3] 
So merely running svnsync from within the local mirror causes svnsync to contact the source and update the local mirror accordingly. Which is fine for freebeast, but on the laptop, that would only work while I am at home. (As far as I am aware, there was no provision to switch source repositories back when I set this up. There is now such a facility, but what I have has been working....)

So what I did was set up freebeast to (also) run rsyncd and write a small shell script that would be invoked from crontab; the shell script could be run in one mode where it would contact svn.freebsd.org (freebeast would always use this mode) or it could run in "local" mode, where it would use rsync. That way, both the mirror on freebeast and the mirror on my laptop would "point to" freebsd.org as the "source" repository, and my laptop would use "local" mode (a couple of minutes after freebeast runs the script in "freebsd" mode) when I'm at home and "freebsd" mode (at the same time that freebeast does) when I'm not. And, yes, when I'm away from home, I depend on coincident access to provide a semblance of synchronization (which does usually work out fairly well), but once I'm home again, things revert to "lock-step," which appeals to me.
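
A minimal sketch of such a two-mode script might look like the following. It is not the actual script: the host name variable BUILD_HOST, the rsync module name "svn", the rsync options, and the DRY_RUN switch are all assumptions made for illustration. Only the mirror path and the two mode names come from the description above.

```shell
#!/bin/sh
# Sketch of a two-mode svn mirror update (assumptions are marked).
BUILD_HOST="${BUILD_HOST:-freebeast}"            # assumed: rsyncd host
MIRROR_ROOT="${MIRROR_ROOT:-/repo/svn/freebsd}"  # path from the transcript

# Print the command instead of running it when DRY_RUN is set
# (illustrative helper, handy for checking crontab entries).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sync_mirror MODE REPO
#   freebsd: let svnsync pull from the source URL recorded in the mirror.
#   local:   copy the build machine's mirror via rsync, so the recorded
#            source URL still points at freebsd.org.
sync_mirror() {
    mode=$1
    repo=$2
    case "$mode" in
    freebsd)
        run svnsync sync "file://${MIRROR_ROOT}/${repo}"
        ;;
    local)
        # "svn" is a hypothetical rsyncd module exporting MIRROR_ROOT.
        run rsync -a --delete \
            "rsync://${BUILD_HOST}/svn/${repo}/" "${MIRROR_ROOT}/${repo}/"
        ;;
    *)
        echo "usage: sync_mirror freebsd|local repo" >&2
        return 64
        ;;
    esac
}
```

With this sketch, freebeast would call sync_mirror freebsd ports, while the laptop at home would call sync_mirror local ports a couple of minutes later. Because the rsync copy is verbatim, the laptop's mirror keeps the same recorded source as freebeast's.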

That done, I set up invocations of the script to update the local private mirrors every night. (With CVS, when tagging a branch required an update for every single file in that branch, I had set up the nightly updates to run in two stages -- one a bit after (local) 01:00; the other, around local 03:30. While svn didn't have the awful branch-tagging overhead, I left that approach in place, as the general notion of splitting the workload into a "big push" followed a bit later by a "catch-up run" seemed to make sense.)

And now that each of the machines (my laptop and freebeast) has a local private mirror, the working copies derived from them stay in sync as long as the mirrors do (and as long as the svn update is done at a time when the mirrors are in sync -- which is all but a few minutes of each day). I start updating working copies and doing builds shortly after local 03:30.


git

Now that the FreeBSD project has migrated to using git as the software to access its source-of-truth repositories, I needed to determine how I was going to keep things in sync again. Thanks to advice and suggestions from folks who understand git far better than I do, and a bit of experimentation on my part, I found a way. (Though, I hasten to add: any errors or misunderstandings are mine, not theirs.)

In principle, it is quite similar to what I implemented for svn -- in fact, for the script, I copied the one I had written for svn, then hacked it to use git, as (like svn) git maintains a pointer to the "parent" repository:

g1-55(12.2-S)[9] dirs
/repo/git/freebsd 
g1-55(12.2-S)[10] grep -w url */config
doc.git/config: url = https://cgit-beta.freebsd.org/doc.git
ports.git/config:       url = https://cgit-beta.freebsd.org/ports.git
src.git/config: url = https://cgit-beta.freebsd.org/src.git

The "key" is that the local private (git) mirrors need to be "bare" (or, more accurately, "mirror") repositories: instead of creating them via (just) git clone, I used git clone --mirror. They are "bare" in that they do not (also) contain working copies.

Once created, the git local mirrors are updated via git fetch --all (just as the svn mirrors are updated with svnsync sync).
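
The creation and update steps can be sketched as two small shell functions. This is an illustration under stated assumptions, not the script actually in use: the MIRROR_ROOT and UPSTREAM variables simply restate the path and URLs from the listing above, and the run helper with its DRY_RUN switch is a hypothetical addition.

```shell
#!/bin/sh
# Sketch: create and refresh bare mirror clones (assumptions are marked).
MIRROR_ROOT="${MIRROR_ROOT:-/repo/git/freebsd}"         # from the listing
UPSTREAM="${UPSTREAM:-https://cgit-beta.freebsd.org}"   # from the listing

# Print the command instead of running it when DRY_RUN is set
# (illustrative helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_mirror NAME: one-time setup; --mirror implies a bare repository
# and maps every upstream ref (branches, tags, notes) into the clone.
create_mirror() {
    run git clone --mirror "${UPSTREAM}/$1.git" "${MIRROR_ROOT}/$1.git"
}

# update_mirror NAME: refresh the mirror from its recorded remote(s).
update_mirror() {
    run git -C "${MIRROR_ROOT}/$1.git" fetch --all
}
```

For example, create_mirror src would be run once, and update_mirror src from the nightly job thereafter.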

Local working copies are then created via git clone and updated via git pull. In order to also get the "notes" (that indicate the svn revision number that corresponds to the git commit), I augmented each working copy's .git/config:

--- config      2020/10/16 10:06:03     1.1
+++ config      2020/10/16 10:07:10
@@ -6,6 +6,7 @@
 [remote "origin"]
        url = file:///repo/git/freebsd/src.git
        fetch = +refs/heads/*:refs/remotes/origin/*
+       fetch = +refs/notes/*:refs/notes/*
 [branch "stable/12"]
        remote = origin
        merge = refs/heads/stable/12

(There are probably other ways to go about all of this. I do not pretend to be an expert on using git. Indeed, a slightly different approach (also) used a "bare" repo, but instantiated the working copies as git "worktrees." While this does work, I found some of its behavior both annoying and rather counter-intuitive, so I was not eager to implement it.)
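
The extra fetch line in the diff above can also be added with git config rather than by editing the file by hand. The following is a sketch only: the function names, the working-copy path passed as an argument, and the run helper with its DRY_RUN switch are hypothetical; the mirror path, branch name, and refspec are the ones shown in the diff.

```shell
#!/bin/sh
# Sketch: working copy setup with the notes refspec (assumptions marked).

# Print the command instead of running it when DRY_RUN is set
# (illustrative helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_working_copy MIRROR DIR BRANCH: clone from the local bare mirror.
make_working_copy() {
    run git clone -b "$3" "file://$1" "$2"
}

# add_notes_refspec DIR: append a second fetch refspec for origin so that
# refs/notes/* is fetched along with the branches.
add_notes_refspec() {
    run git -C "$1" config --add remote.origin.fetch '+refs/notes/*:refs/notes/*'
}

# update_working_copy DIR: fetch from the local mirror and merge.
update_working_copy() {
    run git -C "$1" pull
}
```

A hypothetical session would call make_working_copy /repo/git/freebsd/src.git /some/dir stable/12 once, then add_notes_refspec /some/dir once, then update_working_copy /some/dir after each mirror refresh.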

One complication that I encountered a few weeks after implementing the above was that after running it for several days in succession, suddenly one day, git reported errors when trying to re-sync my local repositories -- which I thought odd, as I wasn't making any local changes to anything (that git was touching, anyway). After receiving a suggestion to perform git "garbage collection" (and an experimental run that demonstrated that doing so appeared to be effective), I have taken to running a little script that performs garbage collection for each of my git repositories (both the mirrors and those that contain working copies) early every Thursday morning (before the first re-sync starts that morning).

I do this Thursday morning, as I expect that if some problem were to arise, that would allow me a couple of days to get it resolved before the weekend builds start -- and that's when I update the "production" machines.
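
A weekly garbage-collection pass of this kind might be sketched as follows. This assumes a plain git gc with no extra options, which may differ from the invocation actually used; the function name, the repository paths in the usage note, and the run helper with its DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Sketch: run git garbage collection over a list of repositories.

# Print the command instead of running it when DRY_RUN is set
# (illustrative helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# gc_repos DIR...: collect garbage in each repository in turn. A failure
# in one repository is reported on stderr but does not stop the others.
gc_repos() {
    for repo in "$@"; do
        run git -C "$repo" gc || echo "git gc failed in $repo" >&2
    done
}
```

It could be called as gc_repos /repo/git/freebsd/src.git /some/working/copy from a crontab entry that fires early on Thursdays, before the first re-sync. git gc works on bare mirrors and on repositories with working copies alike.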


Logs from the repository "sync" operations

Logs from the above-described svn and git synchronization operations may be found here.

Please recall that I perform the synchronization nightly, in two stages; thus, while the logs will nearly always reflect the status of my local mirrors, they may lag FreeBSD's repositories by as much as 22 hours if all is working as it should be -- and more than that, if there are problems.

Getting cron to use "local" time

I referred to running things via crontab at a certain "local time" up there. While it is possible to set up a machine to run on local time, I do not do that. Rather, I set up machines to run on UTC and set an environment variable in my shell login script for local time. And I do something somewhat similar for cron, by placing the following in /etc/rc.conf:

cron_program="/usr/bin/env"
cron_flags="TZ=America/Los_Angeles /usr/sbin/cron"

This has the property that events scheduled by cron run at times based on local time, but use UTC for such events as logging (absent a deliberate attempt to make them do otherwise). As I have occasion to send excerpts of log files to correspondents in other countries, I thought it easier to ensure that the logs reflect UTC, rather than try to explain which excerpts of the logs have what offsets from UTC.
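
The mechanism at work is simply that a process started with TZ in its environment interprets times in that zone, while everything else on the machine stays on UTC. A small sketch to see the effect (the function name show_times is hypothetical, and the local line assumes the zone database is installed):

```shell
#!/bin/sh
# Sketch: compare the system's UTC clock with the view a process gets
# when TZ is set in its environment, as cron does with the rc.conf
# settings above.

# show_times ZONE: print the current time in UTC and in ZONE.
show_times() {
    zone=$1
    echo "UTC:   $(date -u '+%Y-%m-%d %H:%M %Z')"
    echo "local: $(TZ="$zone" date '+%Y-%m-%d %H:%M %Z')"
}
```

Running show_times America/Los_Angeles prints both views side by side, which is a quick way to work out when a crontab entry scheduled in local time will fire in UTC terms.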


$Id: repo-sync.html,v 1.8 2023/01/26 13:28:50 david Exp $