Keeping Hudson configuration and data in SVN

We all know that keeping important files in version control is critical, as it ensures problematic changes can be reverted and can serve as a backup mechanism as well. Code and resources are often kept in version control, but it can be easy to forget your continuous integration (CI) server itself! If a disk were to die or fall victim to a misplaced rm -rf, you could lose all the history and configuration associated with the jobs your CI server manages.

Mr. HudsonIt’s pretty simple to create a repository, but it isn’t obvious which parts of your $HUDSON_HOME you’ll want to backup. You’ll also want to have some automation so new projects get added to the repository, and deleted ones get removed. Luckily we have a great tool to handle this: Hudson!

We have a Hudson job which runs nightly, performs the appropriate SVN commands, and checks in. The high-level overview of this job is basically:

  1. Add any new jobs, users, plugin configurations, et cetera:
    svn add -q --parents *.xml jobs/*/config.xml users/*/config.xml userContent/*
  2. Remove anything from SVN that no longer exists (such as a deleted job):
    svn status | grep '\!' | awk '{print $2;}' | xargs -r svn rm
  3. Check it in!
    svn ci --non-interactive --username=mrhudson -m "automated commit of Hudson configuration"

    You’ll want to make sure to use the --non-interactive option for any automated svn operations, as this ensures Subversion won’t hang asking a question but instead fail immediately. You may also need to provide your password with the --password option.

To make such a Hudson job, create a new job, tie it to the master (since this is where the configuration files are), set it to build periodically (we use “@midnight”), and add an “Execute shell” build step. Here’s the full script we use, to put into the build step:

# Change into your HUDSON_HOME.
cd /opt/hudson
# Add any new conf files, jobs, users, and content.
svn add -q --parents *.xml jobs/*/config.xml users/*/config.xml userContent/*
# Add the names of plugins so that we know what plugins we have.
ls -1 plugins > plugins.list
svn add -q plugins.list
# Ignore things in the root we don't care about.
echo -e "war\nlog\n*.log\n*.tmp\n*.old\n*.bak\n*.jar\n*.json" > myignores
svn propset svn:ignore -F myignores . && rm myignores
# Ignore things in jobs/* we don't care about.
echo -e "builds\nlast*\nnext*\n*.txt\n*.log\nworkspace*\ncobertura\njavadoc\nhtmlreports\nncover\ndoclinks" > myignores
svn propset svn:ignore -F myignores jobs/* && rm myignores
# Remove anything from SVN that no longer exists in Hudson.
svn status | grep '\!' | awk '{print $2;}' | xargs -r svn rm
# And finally, check in of course, showing status before and after for logging.
svn st && svn ci --non-interactive --username=mrhudson -m "automated commit of Hudson configuration" && svn st

You’ll notice this does some extra things like set the svn:ignores property to provide a relatively clean svn st which it shows before and after the commit for logging purposes. One thing this job doesn’t do is put the build results of your jobs in version control. Because historical build logs and artifacts will never change and are also potentially large, a periodic (daily or weekly) cp or rsync of the jobs directory will still give you restorability while keeping your repository lean.

Now you can sleep well at night knowing that your CI server is safe and sound. If you are doing a similar thing with Hudson or another CI system, let us know about your solution!

Update (2010/07/15): This post has been referenced a few times, so I thought I’d mention the references:

  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • Twitter
  • Google Bookmarks
  • DZone
  • HackerNews
  • LinkedIn
  • Reddit