Puppet External Node Classifier

In Puppet, the default way to hold information about your machines is the site.pp config file. This rapidly becomes tiresome once you have more than 5 servers, and that is where an external node classifier comes in as a handy tool.

The first step in setting up an external node classifier is developing whatever will generate the YAML for the puppetmaster to read. In this case Django was chosen as the web framework because its built-in admin interface saved some development time on the management side. A further reason for using an ORM framework was to cut development time on other projects that might need access to the truth database, since they could simply pull YAML from it. Within the Django application there are classes describing hosts, environments, server classes, Puppet classes, hard drives and filesystem layout. Here is the class describing a host:

from django.contrib import admin
from django.db import models

class Host(models.Model):
    hostname = models.CharField(null=True,blank=False,max_length=128)
    basename = models.CharField(null=True,blank=True,max_length=128)
    maintenance = models.BooleanField(default=False)
    datacenter = models.CharField(max_length=5,choices=(('sjc','san jose'),
        ('smc','server room')),blank=True,null=True)
    environment = models.CharField(max_length=5,choices=(('app','production'),
        ('stg1','staging 1'),('stg2','staging 2'),('dev','development')),blank=True,null=True)
    env = models.ForeignKey(Environment)
    # only the first ostype choice survives in this listing
    ostype = models.CharField(max_length=32,choices=(('linux','linux'),))
    info = models.TextField(blank=True,null=True)
    bit = models.CharField(blank=True,null=True,max_length=5,choices=(('32','32 bit'),('64','64 bit')))
    puppetrun = models.IntegerField(default=1800)
    serverclass = models.ForeignKey(SvrClass)
    classes = models.ManyToManyField(PuppetClass,blank=True,null=True)
    drives = models.ManyToManyField(Drive,blank=True,null=True)
    interfaces = models.ManyToManyField(NetInterface,blank=True,null=True)
    script = models.ManyToManyField(Script,blank=True,null=True)
    osver = models.ManyToManyField(Osversion,blank=True,null=True)
    services = models.ManyToManyField(Service,blank=True,null=True)
    supervisors = models.ManyToManyField(SupervisorProgram,blank=True,null=True)
    def getBranch(self):
        return self.env.branch.name
    def getEnvironment(self):
        return self.env.name
    def __str__(self):
        return self.__repr__()
    def __repr__(self):
        return "hostname: %s environment: %s" % (self.hostname,self.env.name)
class HostAdmin(admin.ModelAdmin):
    list_display    =   ('hostname','environment','bit','ostype','datacenter','maintenance')
    list_filter     =   ('environment','datacenter')
    ordering        =   ('hostname','datacenter','environment')
    search_fields   =   ('hostname',)

And now the Django model for a Puppet class:

class PuppetClass(models.Model):
    name = models.CharField(max_length=128)
    def __unicode__(self):
        return self.name
    class Meta:
        ordering = ('name',)
class PuppetClassAdmin(admin.ModelAdmin):
    list_display = ('name',)

With these two classes, a host and the Puppet classes assigned to it can readily be described. The following code outputs YAML describing the host in a form the puppetmaster understands:

import yaml
from django.shortcuts import get_object_or_404, render_to_response
# Host and DefaultList come from this application's models module

def hostinfo(request,hostname):
    host = get_object_or_404(Host,hostname=hostname)
    classlist = [str(x.name) for x in host.classes.all()]
    # if the host's classlist is empty at this point, pull from the default list
    if not classlist:
        # the default list is simply a list of puppet class objects
        defaultlist = DefaultList.objects.get(name="standard")
        classlist = [str(x.name) for x in defaultlist.classes.all()]
    # add in some extra parameters
    params = {
        'datacenter': str(host.datacenter),
        'machine': str(host.basename),
        'serverclass': str(host.serverclass.name),
        'env': str(host.env.name),
        'branch': str(host.env.branch.name),
        'envmaint': str(host.env.maintenance),
        'hostmaint': str(host.maintenance),
    }
    params['rserver'] = str(host.env.rserver)
    yamlsrc = yaml.dump({'classes':classlist,'parameters':params})
    return render_to_response("sock/hostinfo.yaml",{'hostname':hostname,'yaml':yamlsrc})
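
The fallback logic in the view can be pulled away from Django and exercised on its own. What follows is only a minimal sketch, not the application's actual code: `build_node_data` and its arguments are hypothetical names, and plain lists and dicts stand in for the ORM querysets.

```python
def build_node_data(host_classes, default_classes, params):
    """Return the dict that would be dumped to YAML for one host.

    host_classes and default_classes are plain lists of class names
    (hypothetical stand-ins for the querysets used in the view);
    params is a dict of extra parameters.
    """
    # Fall back to the default list only when the host has no classes.
    classes = list(host_classes) or list(default_classes)
    # Coerce every key and value to str, as the view does before dumping.
    parameters = {str(k): str(v) for k, v in params.items()}
    return {'classes': classes, 'parameters': parameters}


if __name__ == '__main__':
    # A host with no classes of its own picks up the defaults.
    print(build_node_data([], ['puppet-classes', 'repos'],
                          {'datacenter': 'sjc', 'machine': 'repo'}))
```

Keeping this step free of ORM calls means the empty-classlist case can be checked without a database.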

One issue encountered when exporting data from Django to YAML was that Django hands back unicode strings, which yaml.dump marks with Python-specific tags that the puppetmaster cannot read, hence the repeated use of str(). The result of the function is passed through the following incredibly complex template:

{{ yaml }}

At the end of all of this we finally get data that comes out as something similar to the following:

classes: [bots, puppet-classes, puppet-master, repos, yamlsvn]
parameters: {branch: '33', datacenter: sjc, env: app, envmaint: '0', hostmaint: '0',
  machine: repo, rserver: sjc-repo.genops.net, serverclass: none}
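
Before pointing the puppetmaster at the service it can help to sanity-check the shape of that data. The sketch below is an illustration under assumptions: `validate_node_data` is a hypothetical helper, it works on data that has already been parsed into Python objects, and it checks only the two keys used in this post.

```python
def validate_node_data(data):
    """Return a list of problems found in parsed node data (empty if none)."""
    if not isinstance(data, dict):
        return ['top level must be a mapping']
    problems = []
    classes = data.get('classes', [])
    # A list of class names is what this post produces; newer Puppet
    # releases also accept a mapping of class name -> class parameters.
    if isinstance(classes, dict):
        names = list(classes.keys())
    elif isinstance(classes, list):
        names = classes
    else:
        problems.append('classes must be a list or a mapping')
        names = []
    for name in names:
        if not isinstance(name, str) or not name:
            problems.append('bad class name: %r' % (name,))
    if not isinstance(data.get('parameters', {}), dict):
        problems.append('parameters must be a mapping')
    return problems


# The same structure as the sample output above, as Python objects.
SAMPLE = {
    'classes': ['bots', 'puppet-classes', 'puppet-master', 'repos', 'yamlsvn'],
    'parameters': {'branch': '33', 'datacenter': 'sjc', 'env': 'app',
                   'envmaint': '0', 'hostmaint': '0', 'machine': 'repo',
                   'rserver': 'sjc-repo.genops.net', 'serverclass': 'none'},
}
```

Calling `validate_node_data(SAMPLE)` returns an empty list, while a string in place of the class list is reported as a problem.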

Next we need a command that the puppetmaster can run to get the YAML. Here is the current script:

# Django's template auto-escaping turns ' into &#39;, so turn it back
curl http://example.genius.com/dpuppet/sock3/hostinfo/$1/ 2>/dev/null | \
sed "s/&#39;/'/g"
exit 0;
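
The same job can be done in Python, which keeps the whole classifier in one language. This is a rough sketch rather than the script in use: the function names and the 10 second timeout are assumptions, and it undoes the HTML entity escaping that Django templates apply by default instead of calling sed.

```python
import html
import sys
import urllib.parse
import urllib.request

# Same base URL as the shell script above.
BASE_URL = 'http://example.genius.com/dpuppet/sock3/hostinfo/'


def node_url(hostname, base=BASE_URL):
    """Build the per-host URL, quoting the hostname for safety."""
    return '%s%s/' % (base, urllib.parse.quote(hostname, safe=''))


def clean(body):
    """Undo HTML entity escaping (for example &#39; back to ')."""
    return html.unescape(body)


def main(argv):
    if len(argv) != 2:
        sys.stderr.write('usage: %s <hostname>\n' % argv[0])
        return 1
    try:
        with urllib.request.urlopen(node_url(argv[1]), timeout=10) as resp:
            body = resp.read().decode('utf-8')
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        sys.stderr.write('lookup failed: %s\n' % exc)
        return 1
    sys.stdout.write(clean(body))
    return 0

# When saved as an executable script, finish with:
#     sys.exit(main(sys.argv))
```

Unlike the shell version, which always exits 0, this sketch returns a non-zero status when the lookup fails, so a broken web service should show up as a failed node lookup rather than as a node with no classes.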

Finally the puppetmaster needs to be configured to pull from the external node classifier:

    # Where Puppet stores dynamic and growing data.
    # The default value is '/var/puppet'.
    vardir = /var/lib/puppet
    # The Puppet log directory.
    # The default value is '$vardir/log'.
    logdir = /var/log/puppet
    # Where Puppet PID files are kept.
    # The default value is '$vardir/run'.
    rundir = /var/run/puppet
    # Where SSL certificates are kept.
    # The default value is '$confdir/ssl'.
    ssldir = $vardir/ssl
    # use external nodes ftw
    external_nodes = /usr/local/bin/puppetinfo.sh
    node_terminus = exec
    # The file in which puppetd stores a list of the classes
    # associated with the retrieved configuration.  Can be loaded in
    # the separate ``puppet`` executable using the ``--loadclasses``
    # option.
    # The default value is '$confdir/classes.txt'.
    classfile = $vardir/classes.txt
    # Where puppetd caches the local configuration.  An
    # extension indicating the cache format is added automatically.
    # The default value is '$confdir/localconfig'.
    localconfig = $vardir/localconfig
    server =  puppetmaster.genius.com
    runinterval = 300
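
The two settings that matter here are node_terminus and external_nodes. As a quick check that both are in place, something like the following could be run against the config file. It is a sketch under assumptions: `check_enc_settings` is a hypothetical helper, and the `[master]` section name is a guess, since the excerpt above does not show its section header and older Puppet releases may use a different one.

```python
import configparser


def check_enc_settings(conf_text, section='master'):
    """Return a list of problems with the ENC settings in puppet.conf text."""
    # Strip indentation so configparser does not read indented lines
    # as continuations of the previous value.
    text = '\n'.join(line.strip() for line in conf_text.splitlines())
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(section):
        return ['missing [%s] section' % section]
    problems = []
    if parser.get(section, 'node_terminus', fallback=None) != 'exec':
        problems.append('node_terminus must be exec')
    if not parser.get(section, 'external_nodes', fallback=''):
        problems.append('external_nodes is not set')
    return problems


# Illustrative input; the section header is an assumption.
SAMPLE_CONF = """
[master]
    # use external nodes ftw
    external_nodes = /usr/local/bin/puppetinfo.sh
    node_terminus = exec
"""
```

`check_enc_settings(SAMPLE_CONF)` returns an empty list; pass a different `section` argument if the settings live elsewhere in your file.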