Friday, July 07, 2006

A Simple Package-Management Script

This is the third in a series of posts about a simple nightly update system for OS X.

In previous posts I've shown scripts to automate the installation of software under OS X, at least for software packaged in several of the common Mac formats (app bundles and pkg & mpkg bundles, possibly packed into a dmg image). In this post I'll talk about a simple package-management script that I'll use to keep packages up to date, and to install new packages (and eventually, uninstall unwanted packages) automatically every day.

The script checks a central repository (located on an anonymous ftp site) for a list of available packages, then looks at a master list of packages that should be installed on all standard computers. If any of the packages on the master list aren't installed locally, or if a newer version is present in the repository, those packages are fetched from the repository and installed. Updated packages can thus be distributed to all standard computers by simply adding updated packages to the repository, and new software can be added by taking the additional step of adding the new packages to the master list. Currently, I haven't implemented any mechanism for removing packages. This system is analogous to one we've been using successfully for many years on our Linux computers.

Note that this system is only intended to supplement the services already provided by Apple's own softwareupdate command. It's intended to install and maintain non-Apple software that won't be updated by softwareupdate.

I call the script "pkgupdate", and it looks like this:
Script pkgupdate:
#!/usr/bin/perl

use strict;

BEGIN {
    # Use YAML if it's installed; otherwise fall back to the simple config format.
    $PkgUpdate::noyaml = 0;
    unless (eval "use YAML; 1") {
        warn "YAML not found, falling back to simple config.\n";
        $PkgUpdate::noyaml = 1;
    }
}

use File::Temp qw/tempdir/;

my $configfile = "/etc/localconfig";
# Package types, in order of preference when several files match:
my @pkgtypes = ('app.tar.gz',
                'mpkg.tar.gz',
                'pkg.tar.gz',
                'dmg');

# Get the master list of actions, plus an optional flag to keep the temp directory:
my $pkglist = shift;
my $keep = shift;
defined $pkglist or die "Usage: pkgupdate <package list file> [keep]\n";

# Load local configuration:
my $config;
my %simpleconfig;
unless ($PkgUpdate::noyaml) {
    $config = LoadFile("$configfile.yml") or die "Cannot load config file $configfile.yml: $!\n";
} else {
    %simpleconfig = simpleconfig($configfile);
    $config = \%simpleconfig;
}

# Specify package URL:
my $pkgurl = "ftp://$config->{UPDATEMASTER}/$config->{UPDATEVERSION}/PKGS";

# Create temp directory:
my $cleanup = 1;
$keep && ($cleanup = 0);
my $tmpdir = tempdir( CLEANUP => $cleanup );
print "Using tmpdir $tmpdir.\n";

# Get list of available packages:
print "Fetching list of available packages... ";
`cd $tmpdir && curl -s -O $pkgurl/.pkglist`;
my %pkglist = ();
open (PKGLIST, "<$tmpdir/.pkglist") or die "Cannot open $tmpdir/.pkglist: $!\n";
while (<PKGLIST>) {
    chomp;
    # Skip lines that aren't "checksum filename":
    /^(\w{32})\s+(.*)$/ or next;
    $pkglist{$2}{sum} = $1;
}
close (PKGLIST);
print "Done.\n";

# Create local package database, if it doesn't already exist:
unless ( -f "/etc/pkglist.sqlite3" ) {
    print "Local package database /etc/pkglist.sqlite3 not found. Creating.\n";
    print `sqlite3 /etc/pkglist.sqlite3 "create table package (sum text, file text)"`;
    $? && die "Error creating /etc/pkglist.sqlite3\n";
}

# Get list of already-installed packages:
my %pkglocal = ();
open (PKGLIST,"sqlite3 /etc/pkglist.sqlite3 \"select * from package\" |" ) or die "Cannot get list of locally-installed packages: $!\n";
while (<PKGLIST>) {
    chomp( my ($sum,$file) = split(/\s*\|\s*/) );
    $pkglocal{$file}{sum} = $sum;
}
close (PKGLIST);

# Process packages to install/update:
open (PKGLIST, "<$pkglist") or die "Cannot open $pkglist: $!\n";
PACKAGE: while (<PKGLIST>) {
    chomp;

    # Skip comments and blanks:
    /^(\#.*|\s*)$/ && next PACKAGE;

    my ($action,$package) = split(/\s+/);
    # Convert from glob syntax to regexp:
    $package =~ s/\*/\.\*/g;
    $package =~ s/\?/\./g;
    # Find any matching packages:
    my @matches = grep (/$package/, reverse sort keys %pkglist);
    unless ( @matches ) {
        warn "No matching package found for $package. Skipping.\n";
        next PACKAGE;
    }
    my $match = $matches[0];
    # If multiple matches are found, choose the one with the most-preferred type:
    if ( @matches > 1 ) {
        TYPE: for my $t (@pkgtypes) {
            for my $m (@matches) {
                if ( $m =~ /.*$t$/ ) {
                    $match = $m;
                    last TYPE;
                }
            }
        }
    }

    # Find package type (the extension, ignoring any trailing .tar.gz):
    my $type = $match;
    $type =~ s/\.tar\.gz$//;
    $type =~ /^.*\.([^\.]+)$/;
    $type = $1;

    # Check to see if we already have this package:
    unless ( $pkglocal{$match}{sum} eq $pkglist{$match}{sum} ) {

        print "Executing action $action for package $match of type $type...\n";

        for ($action) {
            /install/ && do {
                # Escape any spaces:
                my $escmatch = $match;
                $escmatch =~ s/(\s)/\\$1/g;
                # Fetch the package and verify its checksum:
                `cd $tmpdir && curl -s -O $pkgurl/$escmatch`;
                chomp( my $result = `md5 -r $tmpdir/$escmatch` );
                my ($checksum) = $result =~ /^(\w{32})/;
                # Checksums are hex strings, so compare them as strings:
                unless ( defined $checksum && $checksum eq $pkglist{$match}{sum} ) {
                    print STDERR "Error: Checksum mismatch for package $match. Skipping.\n";
                    next PACKAGE;
                }
                # Unpack compressed tar archives:
                if ( $escmatch =~ /^(.*)\.tar\.gz$/ ) {
                    `cd $tmpdir && tar xzf $escmatch && rm $escmatch`;
                    $escmatch = $1;
                }
                # Invoke the appropriate installer for this type:
                unless ( print `cd $tmpdir && /common/manager/$type-install $escmatch`) {
                    print STDERR "Error: $!, $?\n";
                    next PACKAGE;
                }
                if ( $? ) {
                    print STDERR "Error installing package $match. Skipping local package list update.\n";
                    next PACKAGE;
                }
                print "Updating local package database.\n";
                print `sqlite3 /etc/pkglist.sqlite3 "insert into package (file,sum) values ('$match','$pkglist{$match}{sum}')"`;
                if ( $? ) {
                    print STDERR "Error updating local package database.\n";
                    next PACKAGE;
                }
            };
        }
    }
}
close (PKGLIST);

sub simpleconfig {
    my $file = shift;

    my %config = ();
    open (FILE,"<$file") or die "Cannot open simpleconfig file $file: $!\n";
    while (<FILE>) {
        /^\#/ && next;
        # Skip lines that aren't of the form "variable: value":
        /^([^:]+?)\s*:\s*(.*?)\s*$/ or next;
        $config{$1} = $2;
    }
    close (FILE);
    return %config;
}
The script draws local configuration information from either a YAML file (/etc/localconfig.yml, used when the Perl YAML module is available) or, failing that, a simple file (/etc/localconfig) with the format "variable: value". Only two parameters are used from the local configuration: UPDATEMASTER and UPDATEVERSION. From these, an ftp URL of the form
ftp://$UPDATEMASTER/$UPDATEVERSION/PKGS
is constructed. This is the location of the package repository.
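
To make the simple configuration format concrete, here is a minimal standalone sketch of parsing "variable: value" lines and building the repository URL from them. The host name and version string are made-up values for illustration, and the helper names (parse_simpleconfig, repository_url) are my own rather than part of pkgupdate.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse "variable: value" lines into a hash, skipping comments and blanks.
sub parse_simpleconfig {
    my ($text) = @_;
    my %config;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(#|$)/;
        # Key is everything before the first colon; value is the rest, trimmed.
        if ($line =~ /^\s*([^:\s][^:]*?)\s*:\s*(.*?)\s*$/) {
            $config{$1} = $2;
        }
    }
    return %config;
}

# Build the repository URL from the two parameters pkgupdate uses.
sub repository_url {
    my ($config) = @_;
    return "ftp://$config->{UPDATEMASTER}/$config->{UPDATEVERSION}/PKGS";
}

# Hypothetical configuration contents, for illustration only.
my $example = <<'END';
# local configuration
UPDATEMASTER: updates.example.com
UPDATEVERSION: 10.4
END

my %config = parse_simpleconfig($example);
my $url = repository_url(\%config);
print "$url\n";
```

With the example values above, this prints ftp://updates.example.com/10.4/PKGS.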

The repository can contain software packaged as either dmg files or compressed tar archives of application or package bundles. The repository must also contain the file ".pkglist", which is a list of available files and their checksums. This list can be maintained on the server with a command like the following:
(find . -type f -print0 | sed -e 's/\.\///g'| xargs --null md5sum) > .pkglist
which can either be run by hand after dropping a new package into the repository, or run periodically through a cron job. Note that the "-print0" flag on "find" and the "--null" flag on xargs are necessary because many of the file names will contain spaces. I also pipe the "find" output through a sed command to strip off the leading "./" that would otherwise appear in the file names.
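
If the repository server doesn't have md5sum or GNU xargs, the same list could probably be produced with a few lines of Perl using the core File::Find and Digest::MD5 modules. The following is only a sketch under that assumption: the function name pkglist_lines is mine, and the demonstration runs against a throwaway directory containing one made-up file rather than a real repository. Unlike the shell command, it leaves .pkglist itself out of the list.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;
use File::Find;
use File::Temp qw/tempdir/;

# Return "checksum  relative/path" lines for every regular file under $root,
# skipping the .pkglist file itself.
sub pkglist_lines {
    my ($root) = @_;
    my @lines;
    find({
        no_chdir => 1,
        wanted   => sub {
            return unless -f $File::Find::name;
            # Strip the leading "$root/" so names are relative to the repository.
            (my $rel = $File::Find::name) =~ s{^\Q$root\E/}{};
            return if $rel eq '.pkglist';
            open(my $fh, '<', $File::Find::name)
                or die "Cannot open $File::Find::name: $!\n";
            binmode($fh);
            my $sum = Digest::MD5->new->addfile($fh)->hexdigest;
            close($fh);
            push @lines, "$sum  $rel";
        },
    }, $root);
    my @sorted = sort @lines;
    return @sorted;
}

# Demonstration on a temporary directory with one hypothetical package file.
my $repo = tempdir(CLEANUP => 1);
open(my $out, '>', "$repo/Example 1.0.dmg") or die "Cannot create file: $!\n";
print $out "hello";
close($out);

my @lines = pkglist_lines($repo);
print "$_\n" for @lines;
```

On a real repository, the returned lines would be written to .pkglist in the repository's top-level directory.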

The resulting .pkglist file looks like this:
bd2ab919477b545bdc51209ae9ade105  Firefox 1.5.0.1.dmg
0c1920da27ead93b41958afa1c80f2fd  Fugu1.2.0.dmg
504d1e037a639753e307e3d48e3f1f01  II2.dmg
72091614b4656fbdb0d5bc2f74abfe5a  OracleCalendar.dmg
350a369b3ec955537822b4387b17b101  RDC103EN.dmg
5c38e7bbb389d129430c3af169745314  Thunderbird 1.5.dmg
f771d754c01023856444628514865e2e  mozilla-mac-1.7.12.dmg
49369f4e72fe334c23d52489fc61110f  OSXvnc1.71.dmg
The checksums will be used on the local computer to determine whether a given package/version has already been installed. The pkgupdate script maintains a local sqlite database of the packages it installs. After pkgupdate fetches the .pkglist file, it compares the checksums of the available packages with the checksums of the already-installed packages and uses this information to decide whether a package needs to be fetched from the repository and installed. I've used checksums for this to avoid having to figure out what version of a piece of software is packed into a given file. Filenames are unreliable for this purpose, and it would be complicated to unpack each package and try to find out what version number (or numbers, in the case of multi-packages!) was associated with it. Instead, I just assume that if two packages have the same checksum, they are the same version.
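
The comparison can be sketched on its own, without curl or sqlite. This is an illustration rather than the code pkgupdate runs: parse_pkglist and needs_install are names I've made up, the two .pkglist lines come from the listing above, and the "installed" hash is a hypothetical local state.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse .pkglist text ("<32 hex digits><whitespace><file name>") into a hash
# mapping file name to checksum. Malformed lines are ignored.
sub parse_pkglist {
    my ($text) = @_;
    my %sums;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^([0-9a-f]{32})\s+(.+)$/;
        $sums{$2} = $1;
    }
    return %sums;
}

# A file needs installing when there is no local record for it,
# or when the recorded checksum differs from the repository's.
sub needs_install {
    my ($file, $available, $installed) = @_;
    my $local = $installed->{$file};
    return !defined($local) || $local ne $available->{$file};
}

my $available_text = <<'END';
bd2ab919477b545bdc51209ae9ade105  Firefox 1.5.0.1.dmg
0c1920da27ead93b41958afa1c80f2fd  Fugu1.2.0.dmg
END
my %available = parse_pkglist($available_text);

# Hypothetical local state: Fugu is already installed, Firefox is not.
my %installed = ('Fugu1.2.0.dmg' => '0c1920da27ead93b41958afa1c80f2fd');

my @todo = grep { needs_install($_, \%available, \%installed) }
           sort keys %available;
print "Needs install: $_\n" for @todo;
```

Here only Firefox 1.5.0.1.dmg is reported, because the Fugu checksum recorded locally matches the one in the repository.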

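The pkgupdate script takes one required argument: the name of the file containing the master list of packages to be installed on all computers. (If a second argument is given and is true, the temporary download directory is kept instead of being cleaned up.) The master list looks like this: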
install OSXvnc*
install Firefox*
install Thunderbird*
install mozilla-mac*
install OracleCalendar*
install Fugu*
install II2*
install magicolor2430DL
install RDC103EN
with each line consisting of an action followed by a (possibly wild-carded) package name. Currently, the only available action is "install", although I'd like to add "remove" eventually, since we support this under the analogous system used on our Linux computers. The package names should match names in the .pkglist file.
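
Here is a rough sketch of how a wild-carded name can be matched against the file names in .pkglist. It is a variation on what pkgupdate does, not a copy: it uses quotemeta so that dots in a package name are taken literally, and it anchors the pattern at the start of the file name, whereas the script matches anywhere in the name. The function names and the Fugu1.2.0.app.tar.gz entry are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Preferred package types, in the same order as pkgupdate's @pkgtypes.
my @pkgtypes = ('app.tar.gz', 'mpkg.tar.gz', 'pkg.tar.gz', 'dmg');

# Convert a glob-style pattern to a regexp. quotemeta escapes everything,
# then the escaped wildcards are turned back into their regexp equivalents.
sub glob_to_regexp {
    my ($glob) = @_;
    my $re = quotemeta($glob);
    $re =~ s/\\\*/.*/g;
    $re =~ s/\\\?/./g;
    return qr/^$re/;
}

# Pick one file for a pattern: among all matches, prefer the first package
# type in @pkgtypes; within a type, prefer the name that sorts last.
sub choose_package {
    my ($glob, @available) = @_;
    my $re = glob_to_regexp($glob);
    my @matches = grep { $_ =~ $re } reverse sort @available;
    return unless @matches;
    for my $type (@pkgtypes) {
        for my $m (@matches) {
            return $m if $m =~ /\.\Q$type\E$/;
        }
    }
    return $matches[0];
}

# File names as they might appear in .pkglist (the app.tar.gz one is hypothetical).
my @available = ('Firefox 1.5.0.1.dmg', 'Fugu1.2.0.dmg',
                 'Fugu1.2.0.app.tar.gz', 'RDC103EN.dmg');

for my $pattern ('Firefox*', 'Fugu*', 'RDC103EN') {
    my $pick = choose_package($pattern, @available);
    print "$pattern -> ", (defined $pick ? $pick : '(no match)'), "\n";
}
```

Note that "the name that sorts last" is a plain string comparison, so it only approximates "newest version".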

When pkgupdate determines that it needs to install a package, it fetches the appropriate file from the ftp repository using curl, checks the download against the MD5 checksum listed in .pkglist, then invokes an external installer script from /common/manager (dmg-install, app-install or pkg-install), chosen based on the extension of the downloaded file. (Compressed tar archives are first unpacked.)
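
The two decisions in that step, which installer to call and whether the download is intact, can be sketched in isolation. This sketch uses the core Digest::MD5 module instead of shelling out to the md5 command, and works on in-memory data so that it runs without a repository. The function names are mine, and Example.app.tar.gz is a made-up file name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw/md5_hex/;

# Derive the package type from a file name: strip a trailing ".tar.gz",
# then take whatever follows the last remaining dot.
sub package_type {
    my ($file) = @_;
    (my $base = $file) =~ s/\.tar\.gz$//;
    return $base =~ /\.([^.]+)$/ ? $1 : undef;
}

# Path of the external installer for a type, following the
# /common/manager/<type>-install convention used by pkgupdate.
sub installer_for {
    my ($type) = @_;
    return "/common/manager/${type}-install";
}

# Compare downloaded data with the checksum published in .pkglist.
# Checksums are hex strings, so they're compared with eq, not ==.
sub checksum_ok {
    my ($data, $expected) = @_;
    return md5_hex($data) eq lc $expected;
}

for my $file ('Firefox 1.5.0.1.dmg', 'Example.app.tar.gz') {
    my $type = package_type($file);
    print "$file: type $type, installer ", installer_for($type), "\n";
}

# Stand-in for a downloaded file's contents.
my $data = "hello";
print "checksum ok\n" if checksum_ok($data, '5d41402abc4b2a76b9719d911017c592');
```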

If the installation succeeds, pkgupdate updates the sqlite database containing information about currently-installed packages. This lives in /etc/pkglist.sqlite3. Currently, the database only contains filenames and corresponding checksums.
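
One thing to be aware of is that the script only ever inserts rows, so reinstalling a file under the same name with a new checksum leaves two rows for it in the table. A possible refinement, sketched below under my own assumptions, is to delete any earlier row for the file before inserting, and to quote values so that a file name containing a single quote can't break the statement. The function names are hypothetical, and the sketch only builds the SQL text without running it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Quote a string as an SQL literal: wrap it in single quotes
# and double any single quotes inside it.
sub sql_quote {
    my ($s) = @_;
    $s =~ s/'/''/g;
    return "'$s'";
}

# Statements that record one installed package, replacing any earlier row
# for the same file so the table holds one checksum per file name.
sub record_install_sql {
    my ($file, $sum) = @_;
    my $f = sql_quote($file);
    my $c = sql_quote($sum);
    return "delete from package where file = $f; "
         . "insert into package (file, sum) values ($f, $c);";
}

my $sql = record_install_sql('Firefox 1.5.0.1.dmg',
                             'bd2ab919477b545bdc51209ae9ade105');
print "$sql\n";

# The statements could then be handed to the sqlite3 command without going
# through the shell, for example:
#   system('sqlite3', '/etc/pkglist.sqlite3', $sql) == 0
#       or die "Error updating local package database.\n";
```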

In the final post in this series, I'll show how pkgupdate can be used as a component of a general-purpose nightly update system for OS X.
