Cfengine v2 Tutorial




This manual is for version 2.2.9 of cfengine and was last updated on 24 December 2008.




1 Overview

In this manual the word “host” is used to refer to a single computer system – i.e. a single machine which has a name termed its “hostname”.



1.1 What is cfengine and who can use it?

Cfengine is a tool for setting up and maintaining computer systems. It consists of several components:

cfagent
An autonomous configuration agent (required).
cfservd
A file server and remote activation service (optional).
cfexecd
A scheduling and report service (recommended).
cfenvd
An anomaly detection service (strongly recommended).
cfrun
A way of activating cfagent remotely (use this as you need to).
cfshow
A way of examining the contents of helper databases (helper).
cfenvgraph
Ancillary tool for cfenvd (helper).
cfkey
Key generation tool (run once on every host).

The agent cfagent can be used without the other programs, but not all of the capabilities of cfengine will be available unless the components are installed and used appropriately.

Cfengine incorporates a declarative language—much higher level than Perl or shell: a single statement can result in many hundreds of operations being performed on multiple hosts. Cfengine is good at performing a lot of common system administration tasks, and allows you to build on its strengths with your own scripts. You can also use it as a netwide front-end for cron. Once you have set up cfengine, you'll be free to use your time doing other things instead of manual configuration.

The main purpose of cfengine is to allow you to create a single system configuration which will allow you to define how every host on your network should be configured, and to do so in an intuitive way – either centralized or decentralized as you prefer. An interpreter runs on every host on your network and parses the master file (or file set). The configuration of each host is checked against this file; then, if you request it, any deviations from the defined configuration are fixed automatically. You do not have to mention every host specifically by name in order to configure them: instead you can refer to the properties which distinguish hosts from one another. Cfengine uses a flexible system of “classes” which helps you to single out a specific group of hosts with a single statement.

Cfengine grew out of the need to control the accumulation of complex shell scripts used in the automation of key system maintenance at University College in Oslo. There were a lot of scripts, written in shell and in Perl, performing system administration tasks such as file tidying, find-database updates, process checking and several other tasks. In a mixed environment, shell scripts work very poorly: shell commands have differing syntax across different operating systems, and the locations and names of key files differ. In fact, the non-uniformity of Unix was a major headache. Scripts were filled with tests to determine what kind of operating system they were being run on, to the point where they became so complicated and unreadable that no one was quite sure what they did anymore. Other scripts were placed only on the systems where they were relevant, out of sight and out of mind. It quickly became clear that our dream solution would be to replace this proliferation of scripts by a single file containing everything to be checked on every host on the network. By defining a new language, this file could hide all of the tests by using classes (a generalized `switch/case' syntax) to label operations and improve the readability greatly. The gradual refinement of this idea resulted in the present day cfengine.

As an inexperienced cfengine user, you will probably find yourself trying to do things as you would have tried to do them in shell or Perl. This is probably not the right way to think when using cfengine. You will need to think in a more `cfengine way'. When reading the manual, keep in mind that cfengine's way of working is to think about what the final result should be like, rather than about how to get there (with shell and Perl you specify what to do, rather than what you would like).

The remainder of this manual assumes that you know a little about BSD and UNIX System V systems and have everyday experience in using either the C shell (csh) or the Bourne shell (sh), or their derivatives. If you are experienced in system administration, you might like to skip the earlier chapters and turn straight to the example in the section Example configuration file of the Reference manual. This is probably the quickest way to learn cfengine for the initiated. If you are not so familiar with system administration and would like a more gentle introduction, then we begin here...



1.2 Site configuration

To the system administrator of a small network, with just a few workstations or perhaps even a single mainframe system, it might seem superfluous to create a big fuss about the administration of the system. After all, it's easy to `fix' things manually should any problems arise, making a link here, writing a script there and so on, and it's probably not even worth writing down what you did because you know that it will always be easy to fix next time around too... But networks have a tendency to expand and, before you know it, you have five different types of operating system and each type of system has to be configured in a special way; you have to make patches to each system and you can't remember whether you fixed that host on the other side of the building... Also, you discover fairly quickly that what you thought of as BSD or System V is not as standard as you thought and that none of your simple scripts that worked on one system work on the others without a considerable amount of hacking and testing. You try writing a script to help you automate the task, but end up with an enormous number of `if..then..else..' tests which make it hard to see what is really going on.

To manage a network with many different flavours of operating system in a systematic way, what is needed is a more disciplined way of making changes which is robust against system re-installation. After all, it would be tragic to spend many hours setting up a system by hand only to lose everything in an unfortunate disk crash a week or even a year later when you have forgotten what you had to do. Upgrades of the operating system software might delete your carefully worked out configuration. What is needed is a separate record of all of the patches required on all of the systems on the network; a record which can be compared to the state of each host at any time and which a suitable engine can use to fix any deviations from that reference standard.

The idea behind cfengine is to focus upon a few key areas of basic system administration and provide a language in which the transparency of a configuration program is optimal. It eliminates the need for lots of tests by allowing you to organize your network according to “classes”. From a single configuration file (or set of files) you can specify how your network should be configured — and cfengine will then parse your file and carry out the instructions, warning or fixing errors as it goes.



1.3 Key Concepts

This section describes some of the important issues in system administration which cfengine can help with.



1.3.1 Configuration files and registries

One of the endearing characteristics of BSD and System V systems is that they are configured through human-readable text files. To add a new user to the system you edit /etc/passwd, to add a new disk you must edit /etc/fstab, etc. Many applications are also configured with the help of text files. When installing a new system for the first time, or when changing or updating the setup of an old system, you are faced with having to edit lots of files. In some cases you will have to add precisely the same line to the same file on every system in your network as a change is made, so it is handy to have a way of automating this procedure so that you don't have to load every file into an editor by hand and make the changes yourself. This is one of the tasks which cfagent will automate for you.
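
As a minimal sketch of this kind of automated edit, the fragment below asks cfagent to make sure that one particular line is present in a file on every host. The file name and the line are hypothetical placeholders chosen only for illustration. AppendIfNoSuchLine adds the line only when it is missing, so the edit can safely be repeated on every run.

     control:

       actionsequence = ( editfiles )

     editfiles:

       # Hypothetical example: the file and the line are placeholders
       { /etc/hosts.allow

         AppendIfNoSuchLine "ALL: localhost"
       }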

On Windows systems, configuration data are stored in a system registry. With the right tools, the Windows system registry can also be edited by cfengine, but this requires more care.



1.3.2 Network interface

Each host which you connect to an Ethernet-based network running TCP/IP protocols must have a so-called `net interface'. This network interface must be configured before it will work. Normally, one does this with the help of the ifconfig command. This can also be checked and configured automatically by cfagent.

Network configuration involves telling the interface hardware what the internet (IP) address of your system is, so that it knows which incoming `packets' of data to pay attention to. It involves telling the interface how to interpret the addresses it receives by setting the `netmask' for your network (see below). Finally, you must tell it which dummy address is to be used for messages which are broadcast to all hosts on your network simultaneously (see the Reference Manual).

Cfagent's features are mainly meant for hosts which use static IP addresses; if you are using DHCP clients, then you will not need the net configuration features.
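
For a host with a static address, the settings described above might be declared roughly as follows. This is only a sketch: the netmask, the gateway address and the choice of broadcast convention are illustrative assumptions, not values taken from a real site.

     control:

       actionsequence = ( netconfig )

       # Assumed netmask
       netmask = ( 255.255.255.0 )

     broadcast:

       # Broadcast address ends in all ones
       any::  ones

     defaultroute:

       # Assumed gateway address
       any::  192.0.2.1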



1.3.3 Network File System (NFS) or file distribution?

Probably the first thing you are interested in doing with a network (after you've had your fill of the World-Wide Web) is to make your files available to some or all hosts on the network, no matter where in your corporate empire (or university dungeon) you might be sitting. In other words, if you have a disk which is physically connected to host A, you would like to make the contents of that disk available to hosts B, C, D ..., etc. NFS (the Network FileSystem) makes this possible.

The process works by `filesystems'. A filesystem is one partition of a disk drive – or one unit of disk space which can be accessed by a single `logical device' `/dev/something'.

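To make a filesystem available to other hosts you have to do three things: export the filesystem on the host which is physically connected to the disk, by listing it in the exports file; create a mount point on each host which is to access it; and mount the filesystem onto that mount point.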

Only after all three of these have been done will a filesystem become available across the network. Cfagent will help you with the last two in a very transparent way. You could also use the text-editing facility in cfagent to edit the exports file, but there are other ways to update the exports file using NIS and netgroups, which we shall not go into here. If you are in doubt, look up the manual page on exports(5).

Some sites prefer to minimize the use of NFS filesystems to avoid one machine being dependent on another. They prefer to make a local copy of the files on a remote machine instead. Traditionally, programs like rdist have been used for this purpose. You may also use cfagent to copy files in this way (see Emulating rdist).
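
A file distribution of this kind might look something like the sketch below. The source path, the server name and the domain are hypothetical, and the server is assumed to be running cfservd and to grant access to the file.

     control:

       actionsequence = ( copy )

       # Assumed domain name, needed for remote copy
       domain = ( example.org )

     copy:

       # Hypothetical master file and server
       /usr/local/masterfiles/etc/motd   dest=/etc/motd
                                         mode=644
                                         type=checksum
                                         server=fileserver.example.org

With type=checksum the local copy is replaced only when its contents differ from the master version.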



1.3.4 Name servers (DNS)

There are two ways to specify addresses on the internet (called IP addresses). One is to use the text address like `ftp.uu.net' and the other is to use the numerical form `192.48.96.9'. Alas, there is no direct one-to-one correspondence between the numerical addresses and the textual ones, thus a service (called DNS) is required to map one to the other.

The service is performed by one or more special hosts on the network called name servers. Each host must know how to contact a name server or it will probably hang the first time you give it an IP address. You tell it how to contact a name server by editing the text file /etc/resolv.conf. This file must contain the domain name for your domain and a list of possible name servers which can be contacted, in order of priority. Since this is a special file which every host must have, you don't have to use the general text file editing facilities in cfagent. You can just define the name servers for each host in the cfagent file and cfagent will edit /etc/resolv.conf automatically. If you want to change the priority of name servers later, or even change the list, then a simple change of one or two lines in the configuration file will enable you to reconfigure every host on your network automatically without having to do any editing yourself!
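
A sketch of such a declaration is shown below. The domain name and the name server addresses are assumptions made for the sake of the example; the servers are listed in order of priority.

     control:

       actionsequence = ( resolve )

       # Assumed domain name
       domain = ( example.org )

     resolve:

       any::

         # Assumed name servers, highest priority first
         192.0.2.53
         192.0.2.54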



1.3.5 Monitoring important files

Security is an important issue on any system. In the busy life of a system administrator, it is not always easy to remember to set the correct access rights on every file; this can result in either a security breach or problems in accessing files.

A common scenario is that you, as administrator, fetch a new package using FTP, compile it and install it without thinking too carefully. Since the owner and permissions of the files in an FTP archive remain those of the program author, it often happens that the software is left lying around with the owner and permissions as set by the author of the program rather than any user name on your system. The userid of the author might be anybody on your system, or perhaps nobody at all! The files should clearly be owned by root and made readable but unwritable by normal users.

Simple accidents and careless actions under stress could result in, for example, the password file being writable to ordinary users. If this were the case, the security of the entire system would be compromised. Cfagent therefore allows you to monitor the permissions, ownership, and general existence of files and directories and, if you wish, to either correct them or warn about them automatically.
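
As an illustration, the following sketch repairs the password file and merely warns about files under /usr/local. The paths, modes and owner are choices made for this example rather than recommendations for any particular system.

     control:

       actionsequence = ( files )

     files:

       # Repair the password file if its mode or owner drifts
       /etc/passwd   mode=644 owner=root action=fixall

       # Only warn about files which other users may write to
       /usr/local    mode=o-w recurse=inf action=warnall

The action option selects between correcting and warning, which corresponds to the two behaviours described above.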



1.3.6 Making links

One of the difficulties with having so many different variations on the theme of BSD and System V based operating systems is that similar files are not always where you expect to find them. They have different names or lie in different directories. The usual solution to the problem is to make an alias for these files, or a pointer from one filename to another. The name for such an alias is a symbolic link.

It is often very convenient to make symbolic links. For example, you might want the sendmail configuration file /etc/sendmail.cf to be a link to a global configuration file, say,

     /usr/local/mail/etc/sendmail.cf

on every single host on your network so that there is only one file to edit. If you had to make all of these links yourself, it would take a lifetime. Cfagent will make such a link automatically and check it each time it is run. You can also ask it to tidy up old links which have been left around and no longer point to existing files. If you reinstall your operating system later, it doesn't matter because all your links are defined in your cfagent configuration file, recorded for all time. Cfengine won't forget it, and you won't forget it because the setup is defined in one central place.

Cfagent will also allow you to make hard links to regular files, but not to other kinds of files. A hard link that points to a symbolic link is the same as a hard link to the file the symbolic link points to.
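
The sendmail example might be written as in the sketch below. The second declaration shows a hard link; its file names are hypothetical, and the type=hard option is quoted from memory of the reference manual, so check it there before relying on it.

     control:

       actionsequence = ( links )

     links:

       # Symbolic link from the local name to the shared file
       /etc/sendmail.cf -> /usr/local/mail/etc/sendmail.cf

       # Hard link to a regular file (hypothetical file names)
       /usr/local/bin/view -> /usr/local/bin/vi  type=hard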



1.4 Functionality

The notes above give you a rough idea of what cfengine can be used for. Here is a quick summary of cfagent's capabilities.

How do you run cfagent? You can run it as a cron job, or you can run it manually. You may run cfagent scripts/programs as often as you like. Each time you run a script, cfengine determines whether anything needs to be done — if nothing needs to be done, nothing is done! If you use it to monitor and configure your entire network from a central file-base, then the natural thing is to run cfengine repeatedly with the help of cron and/or cfexecd.
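
If cfexecd is used as the scheduler, the times at which it starts cfagent can be given in the control section of the configuration file. The fragment below is a sketch which assumes that cfexecd is running as a daemon and that a run every half hour is wanted; the schedule variable takes a list of time classes.

     control:

       # Run at the start of every half hour (illustrative choice)
       schedule = ( Min00_05 Min30_35 )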



2 Getting started



2.1 What you must have in a cfagent program

A cfagent configuration file for a large network can become long and complex so, before we get down to details, let's try to strip away the complexity and look only at the essentials.

Each cfagent program or configuration file is a list of declarations of items to be checked and perhaps fixed. You begin by creating a file called cfagent.conf. The simplest meaningful file you can create is something like this:

     
     # Comment...
     
     control:
     
       actionsequence = ( links )
     
     links:
     
       /bin -> /usr/bin
     

The example above checks and makes (if necessary) a link from /bin to /usr/bin. Let's examine this example more closely. In a cfengine program:

This simple example has three of the four types of object described above. The control: section of any program tells cfengine how to behave. In this example it adds the action links to the action sequence. You could substitute some other action for links. The essential point is that, if you don't have an action sequence, your cfengine program will do absolutely nothing! The action sequence is a list which tells cfagent what to do and in which order.

The links: section of the file tells cfagent that what follows is a number of links to be made. If you write this part of the file, but forget to add links to the action sequence, then nothing will be done! You can add any number of links in this part of the file and they will all be dealt with in order when—and only when—you write links in the action sequence.

To summarize, you must have:

Now let's think a bit about how useful this short example program is. On a SunOS (Solaris) system, where the directory /bin is in fact supposed to be a link, such a check could be useful, but on some other system where /bin is not a link but a separate directory, this would result in an error message from cfagent, telling you that /bin exists and is not a link. The lesson is that, if we want to use cfagent to make one single program which can be run on any host of any type, then we need some way of restricting the above link so that it only gets checked on SunOS systems. We can write the following:

     
     # Comment...
     
     control:
     
       actionsequence = ( links  )
     
     links:
     
       sun4::
     
          /bin -> /usr/bin
          # other links
     
       osf::
     
          # other links
     

The names which have double colons after them are called classes and they are used to restrict a particular action so that it only gets performed if the host running the program is a member of that class. If you are familiar with C++, this syntax should make you think of class definitions in C++. Classes work like this: names such as sun4, sun3 and osf are all internally defined by cfagent. If a host running, say, the OSF operating system executes the file, it automatically becomes a member of the class osf. Since it cannot be a member of more than one of these classes, this distinguishes between different types of operating system and creates a hidden if..then..else test.

This is the way in which cfagent makes decisions. The key idea is that actions are only carried out if they are in the same class as the host running the program. Classes are dealt with in detail in the next chapter.
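
Classes can also be combined. The sketch below anticipates the next chapter by using `|' for `or' and `!' for `not'; the links themselves are hypothetical and serve only to show where the class expressions go.

     control:

       actionsequence = ( links )

     links:

       # Checked on hosts which belong to either class
       sun4|osf::

          /usr/local/bin/perl -> /usr/bin/perl

       # Checked on every host which is not in the class sun4
       !sun4::

          /usr/local/bin/gtar -> /usr/bin/tar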

Now let's see how to add another kind of action to the action sequence.

     
     # Comment...
     
     control:
     
       actionsequence = ( tidy links )
     
     links:
     
       /bin -> /usr/bin
     
     tidy:
     
        /tmp  pattern=* age=7 recurse=inf
     

We have now added a new kind of declaration called tidy: which deletes files. In the example above, we are looking at files in the directory /tmp which match the pattern `*' and have not been accessed for more than seven days. The search for these files descends recursively down any number of subdirectories.

To make any of this happen we must add the word tidy to the action sequence. If we don't, the declaration will be ignored. Notice also that, regardless of the fact that links: comes before tidy:, the order in the action sequence tells us that all tidy actions will be performed before links:.

The above structure can be repeated to build up a configuration file or script.



2.2 Program structure

To summarize the previous section, here is a sketch of a typical cfagent configuration program showing a sensible structure. The various sections are listed in a sensible order which you would probably use in the action sequence.

An individual section-declaration in the program looks something like this:

     
     action-type:
     
        class1::
     
            list of things to do...
     
        class2::
     
            list of things to do...
     

action-type is one of the following reserved words:

     
        groups, control, copy, homeservers, binservers, mailserver, mountables,
        import, broadcast, resolve, defaultroute, directories, miscmounts,
        files, ignore, tidy, required, links, disable, shellcommands, strategies,
        editfiles, processes
     

The order in which declarations occur is not important to cfengine from a syntactical point of view, but some of the above actions define information which you will want to refer to later. All variables, classes, groups etc. must be defined before they are used. That means that it is smart to follow the order above for the sections in the first line of the above list.
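
The following sketch illustrates the rule that definitions come before their use. The group name, the host names and the variable maildir are all hypothetical: the group and the variable are defined first, and only then referred to in the links section.

     groups:

       # Hypothetical host names
       mailhosts = ( host1 host2 )

     control:

       actionsequence = ( links )

       # Hypothetical variable
       maildir = ( /usr/local/mail/etc )

     links:

       mailhosts::

          /etc/sendmail.cf -> $(maildir)/sendmail.cf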

The order in which items are declared is not to be confused with the order in which they are executed. The latter is determined by the actionsequence (see the reference manual). You will probably want to coordinate the two so that they match as far as possible. For completeness, here is a summary of the structure of a very general cfagent configuration program. The format is free and use of space is unrestricted, though it is always a good idea to put a space before and after parentheses when defining variables.

     ######################################################################
     #
     # Example of structure
     #
     ######################################################################
     
     groups:
     
        group1 = ( host host ...  )
        group2 = ( host host ...  )
        ...
     
     ######################################################################
     
     control:
     
        class::
     
        site      =  ( mysite )
        domain    =  ( mydomain )
        ...
     
         actionsequence =
           (
           action name
           ....
           )
     
        mountpattern = ( mountpoint )
        homepattern = ( wildcards matching home directories )
     
        addinstallable = ( foo bar )
        addclasses     = ( foo bar )
     
     ######################################################################
     
     homeservers:
     
        class::
                home servers
     
     binservers:
     
        class::
                binary servers
     
     mailserver:
     
        class::
                mail server
     
     mountables:
     
        class::
     
                list of resources
     
     
     ######################################################################
     
     import:
     
        class::    include file
     
        class::    include file
     
     
     ######################################################################
     
     broadcast:
     
       class::  ones   # or zeros / zeroes
     
     defaultroute:
     
        class::  my-gw
     
     
     ######################################################################
     
     resolve:
     
        any::
     
            list of nameservers
     
     
        ...
     



2.3 Building a distributed configuration

If a configuration is to be specified at one central location, how does it get distributed to many hosts? The simple answer is to get cfengine to distribute the configuration to the hosts. To do that, a separate configuration file is used. Why?

Imagine what would happen if you made a mistake in the configuration, i.e. a syntax error which got distributed to every host. Now all the hosts would be unable to run cfengine, and thereafter unable to download a corrected configuration file. The whole setup would be broken. To prevent this kind of accident, a separate configuration file is used to copy the files and binaries to each host. This configuration should be simple, and should almost never be edited: the key word here is reliability.



2.3.1 Startup update.conf

The file update.conf can have more or less the same form for all sites, looking something like this.

     #######
     #
     # BEGIN update.conf
     #
     # This script distributes the configuration, a simple file so that,
     # if there are syntax errors in the main config, we can still
     # distribute a correct configuration to the machines afterwards, even
     # though the main config won't parse. It is read and run just before the
     # main configuration is parsed.
     #
     #######
     
     control:
     
      actionsequence  = ( copy tidy )  # Keep this simple and constant
     
      domain          = ( iu.hio.no )  # Needed for remote copy
     
      #
      # Which host/dir is the master for configuration roll-outs?
      #
     
      policyhost      = ( nexus.iu.hio.no )
      master_cfinput  = ( /masterfiles/inputs )
     
      #
      # Some convenient variables
      #
     
      workdir         = ( /var/cfengine )
      cf_install_dir  = ( /usr/local/sbin )
     
      # Avoid server contention
     
      SplayTime = ( 5 )
     
     ############################################################################
     
      #
      # Make sure there is a local copy of the configuration and
      # the most important binaries in case we have no connectivity
      # e.g. for mobile stations or during DOS attacks
      #
     
     copy:
     
          $(master_cfinput)            dest=$(workdir)/inputs
                                       r=inf
                                       mode=700
                                       type=binary
                                       exclude=*.lst
                                       exclude=*~
                                       exclude=#*
                                       server=$(policyhost)
     
          $(cf_install_dir)/cfagent    dest=$(workdir)/bin/cfagent
                                       mode=755
                                       backup=false
                                       type=checksum
     
          $(cf_install_dir)/cfservd    dest=$(workdir)/bin/cfservd
                                       mode=755
                                       backup=false
                                       type=checksum
     
          $(cf_install_dir)/cfexecd    dest=$(workdir)/bin/cfexecd
                                       mode=755
                                       backup=false
                                       type=checksum
     
     #####################################################################
     
     tidy:
     
          #
          # Cfexecd stores output in this directory.
          # Make sure we don't build up files and choke on our own words!
          #
     
          $(workdir)/outputs pattern=* age=7
     
     #######
     #
     # END update.conf
     #
     #######
     



2.3.2 Startup cfservd.conf

In order to set up remote distribution from a central server, you will need to start the cfservd service on the host from which the configuration is to be copied, and grant access to the hosts which need to download it. Here is a simple get-started file which does this:

     #########################################################
     #
     # This is a cfservd config file - it is used for the server
     # part of cfengine, for remote file transfers and control
     # over cfengine using the cfrun program.
     #
     #########################################################
     
     control:
     
       domain = ( iu.hio.no )
     
       cfrunCommand = ( "/var/cfengine/bin/cfagent" )
     
      any::
     
       IfElapsed = ( 1 )
       ExpireAfter = ( 15 )
       MaxConnections = ( 50 )
       MultipleConnections = ( true )
     
     #########################################################
     
     grant:
     
        # Grant access to all hosts at example.org.
        # Files should be world readable
     
        /masterfiles/inputs        *.example.org
     
        # Make sure there is permission to execute by cfrun
     
        /var/cfengine/bin/cfagent  *.example.org
     
     ########
     #
     # END cfservd.conf
     #
     ########
     



2.3.3 Where should I put the files?

Where should the files be located? To organize your files, you should think of three potential locations, for different purposes:

Modules and methods are normally kept in a separate directory from the one in which inputs files are kept, because they require a directory with special authorizations when executing. This is good practice. As long as the update.conf places the master versions in the correct location (usually /var/cfengine/modules) on the local host, all will be okay.

You should not try to copy files directly from a version controlled repository, as you might end up sending out an incomplete or partially tested version of the files to all your hosts.

     
     # Example update.conf
     
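     control:
     
        actionsequence  = ( copy )
     
        domain          = ( iu.hio.no )        # Needed for remote copy
        policyhost      = ( nexus.iu.hio.no )  # As in the startup update.conf
     
        master_cfinput  = ( /usr/local/masterfiles/cfengine/inputs )
        master_modules  = ( /usr/local/masterfiles/cfengine/modules )  # Assumed location
        workdir         = ( /var/cfengine )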
     
     copy:
     
        # Copy from bullet 2 to bullet 3
     
          $(master_cfinput)            dest=$(workdir)/inputs
                                       r=inf
                                       mode=700
                                       type=binary
                                       exclude=*.lst
                                       exclude=*~
                                       exclude=#*
                                       server=$(policyhost)
                                       trustkey=true
     
          $(master_modules)            dest=$(workdir)/modules
                                       r=inf
                                       mode=700
                                       type=binary
                                       exclude=*.lst
                                       exclude=*~
                                       exclude=#*
                                       server=$(policyhost)
                                       trustkey=true
     



2.4 Optional features in cfagent

Cfagent doesn't do anything unless you ask it to. When you run a cfagent program it generates no output unless it finds something it believes to be wrong. It does not carry out any actions unless they are declared in the action sequence.

If you like, though, you can make cfagent positively chatty. Cfagent can be run with a number of command line options (see the reference manual). If you run the program with the `-v' or `--verbose' option, it will cheerfully supply you with a summary of what it is doing. Certain warning messages also get printed in verbose mode, so it is a useful debugging tool.

You can ask cfagent to check lots of things – the timezone for instance, or the domain name. In order for it to check these things, it needs some information from you. All of the switches and options which change the way in which cfagent behaves get specified either on the command line or in the control: section of the control file. Some special control variables are used for this purpose. Here is a short example:

     
     control:
     
       domain   = ( example.org )
       netmask  = ( 255.255.255.0 )
       timezone = ( MET CET )
     
       mountpattern = ( /mydomain/mountpoint )
     
       actionsequence =
          (
          checktimezone     # check time zone
          netconfig         # includes check netmask
          resolve           # includes domain
          mountinfo         # look for mounted disks under mountpattern
          )
     

To get verbose output you must run cfagent with the appropriate command line option `--verbose' or `-v'.

Notice that setting values has a special kind of syntax: a variable name, an equals sign and a value in parentheses. This tells you that the quantity on the left hand side assumes the value on the right hand side. There are lots of questions you might ask at this point. The answers to these will be covered as we go along and in the next chapter.

Before leaving this brief advertisement for control parameters, it is worth noting the definition of mountpattern above. This declares a directory in which cfagent expects to find mounted disks. It will be explained in detail later. For now, notice that this definition looks rather stupid and inflexible. It would be much better if we could use some kind of variables to define where to look for mounted filesystems. And of course you can...

Having briefly scratched the surface of what cfagent can do, turn to the example and take a look at what a complete program can look like (see the reference manual). If you understand it, you might like to skip through the rest of the manual until you find what you are looking for. If it looks mysterious, then the next chapter should answer some questions in more depth.



2.5 Invoking cfagent

Cfagent may be invoked in a number of ways. Here are some examples:

     host% cfagent
     
     host% cfagent --file myfile
     
     host% cfagent -f myfile -v -n
     
     host% cfagent --help

The first of these (the default command, with no arguments) causes cfagent to look for a file called cfagent.conf in the directory pointed to by the environment variable CFINPUTS, or in /var/cfengine/inputs by default, and execute it silently. The second command reads the file myfile and works silently. The third works in verbose mode, and the -n option means that no actions should actually be carried out; only warnings are printed. The final example causes cfagent to print out a list of its command line options. The complete list of options is given in the summary at the beginning of this manual, or you can see it by giving the -h option (see the reference manual). In addition to running cfagent with a filename, you can also treat cfagent files as scripts by starting your cfagent program with the standard shell line:

     #!/usr/local/sbin/cfagent -f
     #
     # My config script
     #

Here we assume that you have installed cfengine under the directory /usr/local/sbin. By adding a header like this to the first line of your program and making the file executable with the chmod shell command, you can execute the program just by typing its name—i.e. without mentioning cfengine explicitly at all.

As a novice to cfengine, it is advisable to check all programs with the -n option before trusting them to your system, at least until you are familiar with the behaviour of cfengine. This `safe' option allows you to see what cfengine wants to do, without actually committing yourself to doing it.



2.6 Running cfengine permanently, monitoring and restarting cfexecd

Once you are happy using cfengine, you will want it to run at least once per hour on your systems. This is easily achieved by adding the following line to the root crontab file of each system:

     0,30 * * * * /usr/local/sbin/cfexecd -F

This is enough to ensure that cfengine will get run. Any output generated by this job will be stored in /var/cfengine/outputs. In addition, if you add the following to the file cfagent.conf, the system administrator will be emailed a summary of any output:

     
     control:
     
     smtpserver = ( mailhub.example.org ) # site MTA which can talk smtp
     sysadm     = ( mark@example.org )   # mail address of sysadm
     

Fill in suitable values for these variables. An alternative, or additional, way to run cfengine is to run the cfexecd program in daemon mode (without the `-F' option). In this mode, the daemon lives in the background and sleeps, activating only in accordance with a scheduling policy. The default policy is to run once every hour (equivalent to Min00_05). Here is how you would modify cfagent.conf in order to make the daemon execute cfagent every half-hour:

      control:
     
        # When should cfexecd in daemon mode wake up the agent?
     
        schedule   = ( Min00_05 Min30_35 )

Note that the time specifications are the basic cfengine time classes (see Building flexible time classes). Although one of these methods should suffice, no harm will arise from running both cron and cfexecd side-by-side. Cfagent's locking mechanisms ensure that no contention will occur.
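As a rough illustration of how a schedule of time classes relates to the clock, here is a minimal sketch in Python (used only for illustration; it is not part of cfengine). The helper names minute_class and agent_should_wake are hypothetical, and the naming of intervals other than Min00_05 and Min30_35, as well as the treatment of boundary minutes, is assumed to follow the same five-minute pattern.

```python
def minute_class(minute):
    """Return the assumed five-minute time class covering a minute (0-59)."""
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in the range 0..59")
    start = (minute // 5) * 5      # round down to the start of the interval
    end = (start + 5) % 60         # assumption: the last interval wraps to 00
    return f"Min{start:02d}_{end:02d}"


def agent_should_wake(minute, schedule):
    """True if the time class for this minute appears in the schedule list."""
    return minute_class(minute) in schedule


# The half-hourly schedule from the example above.
schedule = ["Min00_05", "Min30_35"]
print(agent_should_wake(2, schedule))    # minute 2 falls in Min00_05
print(agent_should_wake(17, schedule))   # minute 17 is in neither interval
```

With this schedule the agent would be woken during the first five minutes of each half-hour and left asleep otherwise.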

The other components of cfengine can be started by cfagent itself:

     processes:
     
      "cfenvd"  restart "/usr/local/sbin/cfenvd"
      "cfservd" restart "/usr/local/sbin/cfservd"
     

Note that, to start cfexecd from cfengine, one must do this:

     processes:
     
      "bin/cfexecd$"  restart "/usr/local/sbin/cfexecd"
     

It's important to use as specific a regular expression as possible in match statements (the path to the program and the regular expression metacharacter $ meaning "end of line", in this example) because bare strings can often match unexpected processes. For instance, using cfexecd by itself will also match a process spawned by cfexecd -F, which shows up as /var/cfagent/bin/cfagent -Dfrom_cfexecd in the process table!
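The difference between the bare string and the anchored pattern can be checked with a few lines of Python. This is only a sketch: Python's re module stands in for cfengine's own regular expression matching, and the process table entries are made-up examples apart from the cfagent line quoted above.

```python
import re

# Hypothetical process table lines, one command per line.
process_table = [
    "/usr/local/sbin/cfexecd",
    "/var/cfagent/bin/cfagent -Dfrom_cfexecd",
    "/usr/local/sbin/cfenvd",
]


def matching(pattern, lines):
    """Return the lines in which the pattern matches anywhere."""
    return [line for line in lines if re.search(pattern, line)]


loose = matching(r"cfexecd", process_table)        # bare string
strict = matching(r"bin/cfexecd$", process_table)  # path plus end-of-line anchor

print(len(loose))   # 2: the daemon and the cfagent it spawned
print(len(strict))  # 1: only the daemon itself
```

The bare string finds two processes, so a restart rule based on it would think cfexecd is running whenever a spawned cfagent is present.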



2.7 CFINPUTS environment variable

Whenever cfengine looks for a file it asks a question: is the filename an absolute name (that is, a name which begins from / like /usr/file), is it a file in the directory in which you invoke cfengine, or is it a file which should be searched for in a special place? If you use an absolute filename (a name which begins with a slash '/') either on the command line using -f or in the import section of your program, then cfengine trusts the name of the file you have given and treats it literally. If you specify the name of the file as simply `.' or `-' then cfengine reads its input from the standard input. If you run cfengine without arguments (so that the default filename is cfagent.conf) or you specify a file without a leading slash in the import section, then the value of the environment variable CFINPUTS is prepended to the start of the file name. This allows you to keep your configuration in a standard place, pointed to by CFINPUTS. For example:

     
     host# setenv CFINPUTS /usr/local/masterfiles/cfengine/inputs
     
     host# cfagent -f myfile
     

In this example, cfengine tries to open myfile in the directory /usr/local/masterfiles/cfengine/inputs. If no value is set for CFINPUTS, then the default location is the trusted cfengine directory /var/cfengine/inputs.
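The lookup rules described in this section can be summarized in a short Python sketch. It is an illustration of the rules as stated here, not cfengine's actual code; the function name resolve_input and the convention of returning None for standard input are assumptions made for the example.

```python
import os
import posixpath

DEFAULT_INPUTS = "/var/cfengine/inputs"   # default when CFINPUTS is unset


def resolve_input(name, environ=None):
    """Return the path that would be opened, or None for standard input."""
    environ = os.environ if environ is None else environ
    if name in (".", "-"):
        return None                        # read the program from stdin
    if name.startswith("/"):
        return name                        # absolute names are taken literally
    base = environ.get("CFINPUTS", DEFAULT_INPUTS)
    return posixpath.join(base, name)      # relative names get the prefix


env = {"CFINPUTS": "/usr/local/masterfiles/cfengine/inputs"}
print(resolve_input("myfile", env))
print(resolve_input("cfagent.conf", {}))
```

The first call mirrors the setenv example above; the second shows the fallback to the default directory when CFINPUTS is not set.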



2.8 What to aim for

If you are a beginner to cfengine, you might not be certain exactly how you want to use it. Here are some hints from Dr. Daystrom about how to get things working quickly.

When you have set up these components, you can sit back and edit the configuration files and watch things being done.



3 More advanced concepts



3.1 Classes

The idea of classes is central to the operation of cfengine. Saying that cfengine is `class orientated' means that it doesn't make decisions using if...then...else constructions the way other languages do, but only carries out an action if the host running the program is in the same class as the action itself. To understand what this means, imagine sorting through a list of all the hosts at your site. Imagine also that you are looking for the class of hosts which belong to the computing department, which run the GNU/Linux operating system and which have yellow spots! To figure out whether a particular host satisfies all of these criteria you first delete all of the hosts which are not GNU/Linux, then you delete all of the remaining ones which don't belong to the computing department, then you delete all the remaining ones which don't have yellow spots. If you are on the remaining list, then you are in the class of all computer-science-Linux-yellow-spotted hosts and you can carry out the action.

Cfengine works in this way, narrowing things down by asking if a host is in several classes at the same time. Although some information (like the kind of operating system you are running) can be obtained directly, clearly, to make this work we need to have lists of which hosts belong to the computing department and which ones have yellow spots.

So how does this work in a cfengine program? A program or configuration script consists of a set of declarations for what we refer to as actions which are to be carried out only for certain classes of host. Any host can execute a particular program, but only certain actions are extracted, namely those which refer to that particular host. This happens automatically because cfengine builds up a list of the classes to which it belongs as it goes along, so it avoids having to make many decisions over and over again.

By defining classes which classify the hosts on your network in some easy to understand way, you can make a single action apply to many hosts in one go – i.e. just the hosts you need. You can make generic rules for specific types of operating system, you can group together clusters of workstations according to who will be using them and you can paint yellow spots on them – whatever works for you.

A cfengine action looks like this:

     
     action-type:
     
        compound-class::
     
            declaration

A single class can be one of several things:

A compound class is a sequence of simple classes connected by dots or `pipe' symbols (vertical bars). For example:

     
     myclass.sun4.Monday::
     
     sun4|ultrix|osf::
     

A compound class joined by dots evaluates to `true' only if all of the individual classes are separately true. Thus, in the first example above, the actions which follow the compound class are only carried out if the host concerned is in myclass, is of type sun4 and the day is Monday! In the second example, the host parsing the file must be either of type sun4 or ultrix or osf. In other words, compound classes support two operators: AND and OR, written `.' and `|' respectively. From cfengine version 2.1.1, I bit the bullet and added `&' as a synonym for the AND operator. Cfengine doesn't care how many of these operators you use (since it skips over blank class names), so you could write either

     
     solaris|irix::
     

or

     
     solaris||irix::
     

depending on your taste. On the other hand, the order in which cfengine evaluates AND and OR operations does matter, and the rule is that AND takes priority over OR, so that `.' binds classes together tightly and all AND operations are evaluated before ORing the final results together. This is the usual behaviour in programming languages. You can use round parentheses in cfengine classes to override these preferences.

Cfengine allows you to define dummy classes and switch them on and off, so that you can use them to select certain subsets of actions. In particular, note that by defining your own classes, using them to make compound rules of this type, and then switching them on and off, you can also switch on and off the corresponding actions in a controlled way. The command line options -D and -N can be used for this purpose. See also addclasses in the Reference manual. A logical NOT operator has been added to allow you to exclude certain specific hosts in a more flexible way. The logical NOT operator is (as in C and C++) `!'. For instance, the following example would allow all hosts except for myhost:

        action:
     
         !myhost::
     
             command

and similarly, to allow all hosts in a user-defined group mygroup, except for myhost, you would write

        action:
     
         mygroup.!myhost::
     
             command

which reads `mygroup AND NOT myhost'. The NOT operator can also be combined with OR. For instance

     
        class1|!class2

would select hosts which were either in class 1, or those which were not in class 2.

Finally, there are a number of reserved classes. The following are hard classes for various operating system architectures. They do not need to be defined because each host knows what operating system it is running. Thus the appropriate one of these will always be defined on each host. Similarly the day of the week is clearly not open to definition, unless you are running cfengine from outer space. The reserved classes are:

     ultrix, sun4, sun3, hpux, hpux10, aix, solaris, osf, irix4, irix, irix64,
     sco, freebsd, netbsd, openbsd, bsd4_3, newsos, solarisx86, aos,
     nextstep, bsdos, linux, debian, cray, unix_sv, GnU, NT

If these classes are not sufficient to distinguish the hosts on your network, cfengine provides more specific classes which contain the name and release of the operating system. To find out what these look like for your systems you can run cfengine in `parse-only-verbose' mode:

     
       cfagent -p -v
     

and these will be displayed. For example, Solaris 2.4 systems generate the additional classes sunos_5_4, sunos_sun4m and sunos_sun4m_5_4.

Cfengine uses both the unqualified and fully qualified host names as classes. Some sites and operating systems use fully qualified names for their hosts, i.e. uname -n returns the full domain-qualified hostname. This spoils the class matching algorithms for cfengine, so cfengine automatically truncates names which contain a dot `.' at the first `.' it encounters. If your hostnames contain dots (which do not refer to a domain name), then cfengine will be confused. The moral is: don't have dots in your host names! NOTE: in order to ensure that the fully qualified name of the host becomes a class you must define the domain variable. The dots in this string will be replaced by underscores.

In summary, the operator ordering in cfengine classes is as follows:

`()'
Parentheses override everything.
`!'
The NOT operator binds tightest.
`. &'
The AND operator binds more tightly than OR.
`|'
OR is the weakest operator.
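The precedence rules above can be sketched as a small recursive evaluator in Python. This is an illustration of the rules only, not cfengine's parser: the function names are hypothetical, repeated operators such as `||' are collapsed on the assumption that blank class names are skipped, and the generic class any is treated as always true.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[().&|!])")
OPERATORS = {".", "&", "|"}


def tokenize(expression):
    """Split a class expression into names, operators and parentheses."""
    tokens, pos = [], 0
    expression = expression.strip()
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if not match:
            raise ValueError(f"unexpected character {expression[pos]!r}")
        token = match.group(1)
        # Skip blank class names, so that "a||b" reads like "a|b".
        if not (token in OPERATORS and tokens and tokens[-1] == token):
            tokens.append(token)
        pos = match.end()
    return tokens


def evaluate(expression, defined):
    """Evaluate a compound class against the set of defined classes."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or():                     # weakest: a|b
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():                    # tighter: a.b or a&b
        value = parse_not()
        while peek() in (".", "&"):
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():                    # tightest: !a, (a|b), or a name
        token = take()
        if token == "!":
            return not parse_not()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if token is None or token in OPERATORS or token == ")":
            raise ValueError(f"class name expected, got {token!r}")
        return token == "any" or token in defined

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result


print(evaluate("mygroup.!myhost", {"mygroup"}))   # True
print(evaluate("(a|b).c", {"a"}))                 # False
```

Because AND binds more tightly than OR, the expression a|b.c with only a defined is true, while the parenthesized form (a|b).c is false.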



3.2 Variable substitution

When you are building up a configuration file it is very useful to be able to use variables. If you can define your configuration in terms of some key variables, it can be changed more easily later, it is more transparent to the reader of the program and you can also choose to define the variables differently on different types of system. Another way of saying this is that cfengine variables also belong to classes. Cfengine makes use of variables in three ways.

Environment variables are fetched directly from the shell on whatever system is running the program. An example of a special variable is the domain variable from the previous section. Straightforward macro substitution allows you to define a symbol name to be replaced by an arbitrary text string. All these definitions (apart from shell environment variables, of course) are made in the control part of the cfengine program:

     
     control:
     
       myvar = ( /usr/local/mydir/lib/very/long/path )   # define macro
     
     ...
     
     links:
     
       $(myvar) -> /another/directory
     

Here we define a macro called myvar, which is later used to define the creation of a link. As promised we can also define class-dependent variables:

     
     control:
     
       sun4:: myvar = ( sun )
       hpux:: myvar = ( HP )
     

Cfagent gives you access to the shell environment variables and allows you to define variables of your own. It also keeps a few special variables which affect the way in which cfengine works. When cfengine expands a variable it looks first at the name in its list of special variables, then in the list of user-defined macros and finally in the shell environment for a match. If none of these are found, the variable is left unexpanded (see Undefined variables). If you nest macros,

     
     control:
     
       myvar = ( "$(othervar)" )
     

then you must quote the right hand side and ensure that the value is already defined.

You can also import values from the execution of a shell command by prefixing a command with the word exec. This method is deprecated in version 2 and replaced by a function.

     
       control:
     
        # old method
     
        listing = ( "exec /bin/ls -l" )
     
        # new method
     
        listing = ( ExecResult(/bin/ls -l) )
     

This sets the variable `listing' to the output of the command in the quotes. Some other internal functions are

RandomInt(a,b)
Generate a random integer between a and b.
ExecResult(command)
Executes the named shell command and inserts the output into the variable.

For example:

     
     control:
     
      variable2 = ( RandomInt(0,23) )
     
      variable3 = ( ExecResult(/bin/ls -a /opt) )
     

Variables are referred to in either of two different ways, depending on your taste. You can use the forms $(variable) or ${variable}. The variable in braces or parentheses can be the name of any user defined macro, environment variable or one of the following special internal variables.

AllClasses
A long string in the form `CFALLCLASSES=class1:class2...'. This variable is a summary of all the defined classes at any given time. It is always kept up to date so that scripts can make use of cfengine's class data.
arch
The current detailed architecture string—an amalgamation of the information from uname. Non-definable.
binserver
The default server for binary data. See NFS resources. Non-definable.
class
The currently defined system hard-class (e.g. sun4, hpux). Non-definable.
date
The current date string. Note that if you use this in a shell command it might be interpreted as a list variable, since it contains the default separator `:'.
domain
The currently defined domain.
faculty
The faculty or site as defined in control (see site).
fqhost
The fully qualified (DNS/BIND) hostname of the system, which includes the domain name as well.
host
The hostname of the machine running the program.
ipaddress
The numerical form of the internet address of the host currently running cfengine.
MaxCfengines
The maximum number of cfengines which should be allowed to co-exist concurrently on the system. This can prevent excessive load due to unintentional spamming in situations where several cfagents are started independently. The default value is unlimited.
ostype
A short form of $(arch).
OutputPrefix
This quoted string can be used to change the default `cfengine:' prefix on output lines to something else. You might wish to shorten the string, or have a different prefix for different hosts. The value in this variable is appended with the name of the host. The default is equivalent to,
            OutputPrefix = ( "cfengine:$(host):")
     


RepChar
The character value of the string used by the file repository in constructing unique filenames from path names. This is the character which replaces `/' (see the reference manual).
site
This variable is identical to $(faculty) and may be used interchangeably.
split
The character on which list variables are split (see the reference manual).
sysadm
The name or mail address of the system administrator.
timezone
The current timezone as defined in control.
UnderscoreClasses
If this is set to `on' cfengine uses hard-classes which begin with an underscore, so as to avoid name collisions. See also Runtime Options in the Reference manual.
year
The current year.

These variables are kept special because they play a special role in setting up a system configuration. See Global configurations. You are encouraged to use them to define fully generalized rules in your programs. Variables can be used to advantage in defining filenames, directory names and in passing arguments to shell commands. The judicious use of variables can reduce many definitions to a single one if you plan carefully.

NOTE: the above control variables are not case sensitive, unlike user macros, so you should not define your own macros with these names.

The following variables are also reserved and may be used to produce troublesome special characters in strings.

cr
Expands to the carriage-return character.
dblquote
Expands to a double quote "
dollar
Expands to `$'.
lf
Expands to a line-feed character (Unix end of line).
n
Expands to a newline character.
quote
Expands to a single quote '.
spc
Expands simply to a single space. This can be used to place spaces in filenames etc.
tab
Expands to a single tab character.

You can use variables in the following places:

     
     links:
     
       osf::
           /$(site)/${host}/directory -> somefile
     
     shellcommands:
     
       any::
     
        "/bin/echo $(timezone) | /bin/mail $(sysadm)"
        '/bin/echo "double quotes!"'
     

The latter possibility enables cfengine's variables to be passed on to user-defined scripts.

Variables can be defined differently under different classes by preceding the definition with a class name. For example:

     control:
     
        sun4::  my_macro = ( User_string_1 )
        irix::  my_macro = ( User_string_2 )
     

Here the value assigned to $(my_macro) depends on which of the classes evaluates to true. This feature can be used to good effect to define the mail address of a suitable system administrator for different groups of host.

     control:
     
      physics::   sysadm = ( mark,fred )
      chemistry:: sysadm = ( localsys@domain )
     

Note, incidentally, that the `-a' option can be used to print out the mail address of the system administrator for any wrapper scripts.



3.3 Undefined variables

Note that macro-variables which are undefined are not expanded as of version 1.6 of cfengine. In earlier versions, undefined variables would be replaced by an empty string, as in Perl. In versions 1.6.x and later, the variable string remains un-substituted if the variable does not exist. For instance,

     
     control:
     
       actionsequence = ( shellcommands )
     
       myvar = ( "test string " )
     
     shellcommands:
     
      "/bin/echo $(myvar) $(myvar2)"
     

results in:

     
     cfengine:host: Executing script /bin/echo test string  $(myvar2)
     cfengine:host:/bin/echo test : sh: syntax error at line 1: `(' unexpected
     cfengine:host: Finished script /bin/echo test string  $(myvar2)
     

This allows variables to be defined on-the-fly by modules.
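The expansion behaviour can be imitated in a few lines of Python. This is a sketch of the documented rules rather than cfengine's implementation: the function name expand and the three dictionaries are assumptions for the example, names are looked up in the order special variables, user macros, shell environment, and the case-insensitivity of special variables is ignored.

```python
import re

# Matches both $(name) and ${name}.
VARIABLE = re.compile(r"\$\((\w+)\)|\$\{(\w+)\}")


def expand(text, special, macros, environ):
    """Expand variables in text; undefined ones are left untouched."""
    def lookup(match):
        name = match.group(1) or match.group(2)
        for table in (special, macros, environ):
            if name in table:
                return table[name]
        return match.group(0)          # undefined: keep "$(name)" as written
    return VARIABLE.sub(lookup, text)


macros = {"myvar": "test string "}
print(expand("/bin/echo $(myvar) $(myvar2)", {}, macros, {}))
```

The result still contains the literal text $(myvar2), which is why the shell in the example above complains about the unexpected parenthesis.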



3.4 Defining classes and making exceptions

Cfengine communicates with itself by passing messages in the form of classes. When a class becomes switched on or off, cfengine's program effectively becomes modified. There are several ways in which you can switch on and off classes. Learning these fully will take some time, and only then will you harness the full power of cfengine.

Because cfagent works at a very high level, doing very many things for very few lines of code, it might seem that some flexibility is lost. When we restrict certain actions to special classes it is occasionally useful to be able to switch off classes temporarily so as to cancel the special actions.



3.4.1 Command line classes

You can define classes of your own which can be switched on and off, either on the command line or from the action sequence. For example, suppose we define a class include. We use addclasses to do this.

     addclasses = ( include othersymbols )

The purpose of this would be to allow certain `excludable actions' to be defined. Actions defined by

     
     any.include::
                    actions
     

will normally be carried out, because we have defined include to be true using addclasses. But if cfagent is run in a restricted mode, in which include is set to false, we can exclude these actions.

So, by defining the symbol include to be false, you can exclude all of the actions which have include as a member. There are two ways in which this can be done. One is to negate a class globally using

     cfagent -N include

This undefines the class include for the entire duration of the program.



3.4.2 actionsequence classes

Another way to specify actions is to use a class to select only a subset of all the actions defined in the actionsequence. You do this by adding a class name to one of the actions in the action sequence, using a dot `.' to separate the words. In this case the symbol only evaluates to `true' for the duration of the action to which it is attached. Here is an example:

     
       links.onlysome
       shellcommands.othersymbols.onlysome
     

In the first case onlysome is defined to be true while this instance of links is executed. That means that only actions labelled with the class onlysome will be executed as a result of that statement. In the latter case, both onlysome and othersymbols are defined to be true for the duration of shellcommands.

This syntax would normally be used to omit certain time-consuming actions, such as tidying all home directories. Or perhaps to synchronize certain actions which have to happen in a certain order.
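To make the scoping concrete, here is a minimal Python sketch of how an actionsequence entry could be split into an action name and the classes that are true only while that action runs. The function names are hypothetical, and entries such as quoted module calls with arguments are deliberately not handled.

```python
def split_action(entry):
    """Split "action.class1.class2" into the action and its temporary classes."""
    action, *classes = entry.split(".")
    return action, set(classes)


def classes_during(entry, base_classes):
    """Classes that are true while this actionsequence entry is executed."""
    _, temporary = split_action(entry)
    return set(base_classes) | temporary


base = {"any", "linux"}
print(split_action("links.onlysome")[0])   # links
print(sorted(classes_during("shellcommands.othersymbols.onlysome", base)))
```

The base set is not modified, which reflects the point that the extra classes exist only for the duration of the action they are attached to.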



3.4.3 shellcommand classes

For more advanced uses of cfengine you might want to be able to define a class on the basis of the success or failure of a user-program, a shell command or user script. Consider the following example

     
     groups:
     
        have_cc = ( "/bin/test -f /usr/ucb/cc"
                    "/bin/test -f /local/gnu/cc"  )
     

Note that as of version 1.4.0 of cfengine, you may use the word classes as an alias for groups. Whenever cfagent meets an object in a class list or variable which is surrounded by single quotes, double quotes or reversed quotes, it attempts to execute the string as a command passed to the Bourne shell. If the resulting command has return code zero (proper exit) then the class on the left hand side of the assignment (in this case `have_cc') will be true. If the command returns any other value (an error number) the result is false. Since groups are the logical OR of their members (it is sufficient that one of the members matches the current system), the class `have_cc' will be defined above if either /usr/ucb/cc or /local/gnu/cc exists, or both.
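The logical OR over command exit codes can be sketched in Python as follows. This only illustrates the rule described above: the function name class_from_commands is hypothetical, and stopping at the first successful command is an assumption of the sketch, not a documented property of cfagent.

```python
import subprocess


def class_from_commands(commands):
    """True if at least one command exits with status zero (logical OR)."""
    for command in commands:
        completed = subprocess.run(
            ["/bin/sh", "-c", command],      # pass the string to the Bourne shell
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if completed.returncode == 0:
            return True                      # one success is sufficient
    return False


have_cc = class_from_commands([
    "test -f /usr/ucb/cc",
    "test -f /local/gnu/cc",
])
print(have_cc)
```

On a host where neither compiler path exists, have_cc comes out false and any actions guarded by that class would be skipped.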



3.4.4 Feedback classes

Classes may be defined as the result of actions being carried out by cfagent. For example, if a file gets copied, needs to be edited or if diskspace falls under a certain threshold, cfagent can be made to respond by activating classes at runtime. This allows you to create dynamically responsive programs which react to the changing environment. These classes are defined as part of other statements with clauses of the form

     
       define=classlist
     

Classes like these should generally be declared with addinstallable at the start of a program, unless the define statements always precede the actions which use the defined classes.



3.4.5 Writing plugin modules

If the regular mechanisms for setting classes do not produce the results you require for your configuration, you can write your own routines to concoct the classes of your dreams. Plugin modules are added to cfagent programs from within the actionsequence, (see Reference manual). They allow you to write special code for answering questions which are too complex to answer using the other mechanisms above. This allows you to control classes which will be switched on and the moment at which your module attempts to evaluate the condition of the system.

Modules must lie in a special directory defined by the variable moduledirectory. They must have a name of the form module:mymodule and they must follow a simple protocol. Cfagent will only execute a module which is owned either by root or the user who is running cfagent, if it lies in the special directory and has the special name. A plug-in module may be written in any language, it can return any output you like, but lines which begin with a `+' sign are treated as classes to be defined (like -D), while lines which begin with a `-' sign are treated as classes to be undefined (like -N). Lines starting with `=' are variables/macros to be defined. Any other lines of output are cited by cfagent, so you should normally make your module completely silent. Here is an example module written in perl. First we define the module in the cfagent program:

     
      control:
     
        moduledirectory = ( /local/cfagent/modules )
     
        actionsequence = (
                         files
                         module:myplugin
                         "module:argplugin arg1 arg2"
                         copy
                         )
      ...
        AddInstallables = ( specialclass )
     

Note that the classes defined by the module should also be declared in AddInstallables, if this is more convenient. NOTE: you must declare the classes before using them in the cfagent configuration, or else those actions will be ignored. Next we write the plugin itself.

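     #!/usr/bin/perl
     #
     # module:myplugin
     #
     
       # lots of computation....
     
     # placeholder: replace with the result of your own test
     $special_condition = 1;
     
     if ($special_condition)
        {
        # a line beginning with `+' defines a class in cfagent
        print "+specialclass\n";
        }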
     

Modules inherit the environment variables from cfagent and accept arguments, just as a regular shellcommand does.

     #!/bin/sh
     #
     # module:argplugin
     #
     
     /bin/echo $*
     

Cfagent defines the classes as an environment variable so that programs have access to these. E.g. try the following module:

     #!/usr/bin/perl
     
     print "Decoding $ENV{CFALLCLASSES}\n";
     
     @allclasses = split (":","$ENV{CFALLCLASSES}");
     
     while ($c=shift(@allclasses))
       {
       $classes{$c} = 1;
       print "$c is set\n";
       }
     

Modules can define macros in cfagent by outputting strings of the form

     
     =variablename=value
     

When the $(allclasses) variable becomes too large to manipulate conveniently, you can access the complete list of currently defined classes in the file /var/cfengine/state/allclasses.
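The module protocol described in this section (lines starting with `+', `-' and `=') can be summarized by a short Python sketch of the receiving side. It illustrates the rules as stated here and is not cfagent's own code; the function name apply_module_output is hypothetical.

```python
def apply_module_output(output, classes, macros):
    """Apply module output to a class set and macro table.

    Returns the remaining lines, which cfagent would simply report.
    """
    other = []
    for line in output.splitlines():
        if line.startswith("+"):
            classes.add(line[1:].strip())        # define a class, like -D
        elif line.startswith("-"):
            classes.discard(line[1:].strip())    # undefine a class, like -N
        elif line.startswith("="):
            name, separator, value = line[1:].partition("=")
            if separator:
                macros[name.strip()] = value     # =variablename=value
            else:
                other.append(line)               # malformed: treat as plain output
        else:
            other.append(line)
    return other


classes = {"oldclass"}
macros = {}
leftover = apply_module_output(
    "+specialclass\n-oldclass\n=variablename=value\nhello\n", classes, macros
)
print(sorted(classes), macros, leftover)
```

Only the line "hello" is left over, which is why a module should normally stay silent apart from its protocol lines.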



3.5 The generic class any

The generic wildcard any may be used to stand for any class. Thus instead of assigning actions for the class sun4 only you might define actions for any architecture by specifying:

     
       any::
             actions
     

If you don't specify any class at all then cfengine assumes a default value of any for the class.



3.6 Debugging tips

A useful trick when debugging is to eliminate unwanted actions by changing their class name. Since cfengine assumes that any class it does not understand is the name of some host, it will simply ignore entries it does not recognize. For example:

        myclass::

can be changed to

        Xmyclass::

Since Xmyclass no longer matches any defined class, and is not the name of any host, it will simply be ignored. The -N option can also be used to the same effect (see Reference manual).



3.7 Access control

It is sometimes convenient to be able to restrict the access of a program to a handful of users. This can be done by adding an access list to the control: section of your program. For example,

     control:
         ...
         access = ( mark root )
     

would cause cfengine to refuse to run the program for any users other than mark and root. Such a restriction would be useful, for instance, if you intended to make set-user-id scripts but only wished certain users to be able to run them. If the access list is absent, all users can execute the program. Note: if you are running cfagent via the cfrun program then cfagent is always started with the same user identity as the cfservd process on the remote host. Normally this is the root user identity. This means that the access keyword has no effect when cfagent is started by cfrun.



3.8 Wildcards in directory names

In the two actions files and tidy you define directory names at which file checking or tidying searches should start. One economical feature is that you can define a whole group of directories at which identical searches should start in one fell swoop by making use of wildcards. For example, the directory names

          /usr/*/*
          /bla/*/ab?/bla

represent all of the directories (and only directories) which match the above wildcard strings. Cfagent opens each matching directory and iterates the action over all directories which match.

The symbol `?' matches any single character, whereas `*' matches any number of characters, in accordance with shell file-substitution wildcards. When this notation is used in directory names, it always defines the starting point for a search. It does not tell the command how to search, only where to begin. The pattern directive in tidy can be used to specify patterns when tidying files, whereas under files all files are considered (see Reference manual).
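Since these wildcards follow the shell's conventions, the set of starting directories can be previewed with the shell itself. The sketch below is only an approximation of what cfagent does, and the function name list_start_dirs is hypothetical.

```shell
#!/bin/sh

# list_start_dirs PATTERN
# Print every directory (and only directories) matched by a shell wildcard.
list_start_dirs() {
    for path in $1; do            # unquoted on purpose: the shell expands * and ?
        [ -d "$path" ] && echo "$path"
    done
    return 0
}
```

For example, `list_start_dirs '/bla/*/ab?/bla'` prints one line per matching directory and silently skips plain files which happen to match the pattern.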



3.9 Recursive file sweeps/directory traversals

File sweeps are searches through a directory tree in which many files are examined and considered for processing in some way. There are many instances where one uses cfagent to perform a file sweep.

The problem with file sweeps is that they can be too sweeping! Often you are not interested in examining every single file in a file tree. You might wish to perform a search which ignores certain directories, or which includes or excludes files matching particular patterns. The tidy action is slightly different in this respect, since it already always expects to match a specific pattern. One is generally not interested in a search which deletes everything except for a named pattern: this would be too dangerous. For this reason, the syntax of tidy does not allow ignore, include and exclude. It is documented in the section on tidying (see Reference manual).

Items declared under the global ignore section affect files, copy, links and tidy. For file sweeps within files, copy and links, you may provide private ignore lists using ignore=. The difference between exclude and ignore is that ignore can deal with absolute directories. It prunes directories, while exclude only looks at the files within directories. For file sweeps within files and copy you can specify specific search parameters using the keywords include= and exclude= and as of version 1.6.x filter=. For example,

     files:
     
        /usr/local/bin m=0755 exclude=*.ps action=fixall
     

In this example cfagent searches the entire file tree, omitting any directories listed in the ignore-list and omitting any files ending in the extension .ps (see Reference manual).

Specifying the include= keyword is slightly different since it automatically restricts the search to only named patterns (using * and ? wildcards), whenever you have one or more instances of it. If you include patterns in this way, cfagent ignores any files which do not match the given patterns. It also ignores any patterns which you have specified in the global ignore-list as well as patterns excluded with exclude=pattern. In other words, exclusions always override inclusions.
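The precedence just described can be modelled in a few lines of shell. This is a simplified sketch of the rules as stated in the text, not cfagent's own implementation: it looks only at file names, and the function names and the variables IGNORE, EXCLUDE and INCLUDE (space-separated pattern lists) are hypothetical.

```shell
#!/bin/sh

# matches_any NAME PATTERN...
# Succeed if NAME matches at least one of the shell wildcard patterns.
matches_any() {
    name=$1; shift
    for pat in "$@"; do
        case "$name" in
            $pat) return 0 ;;
        esac
    done
    return 1
}

# sweep_decision NAME
# Print "process" or "skip" according to IGNORE, EXCLUDE and INCLUDE.
sweep_decision() {
    name=$1
    set -f                        # stop the patterns expanding against the current directory
    if matches_any "$name" ${IGNORE:-} || matches_any "$name" ${EXCLUDE:-}; then
        decision=skip             # exclusions always override inclusions
    elif [ -z "${INCLUDE:-}" ] || matches_any "$name" ${INCLUDE:-}; then
        decision=process          # no include list, or the name is included
    else
        decision=skip             # an include list exists and the name is not in it
    fi
    set +f
    echo "$decision"
}
```

With EXCLUDE set to `*.ps` and INCLUDE empty, every file except those ending in .ps is processed. Once INCLUDE is set to, say, `*.txt`, only matching files are processed, and a file that is both included and excluded is still skipped.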

If you exclude a pattern or a directory and wish to treat it in some special way, you need to code an explicit check for that pattern as a separate entity. For example, to handle the excluded .ps files above, you would need to code something like this:

     files:
     
        /usr/local/bin m=0644 include=*.ps action=fixall
     

Note: don't be tempted to enclose your wildcards in quotes. The quotes will be treated literally and the pattern might not match the way you would expect. For editfiles the syntax is somewhat different. Here one needs to add lines to the edit stanza:

     
     editfiles:
     
      { /tmp/testdir
     
      Include .*
      Exclude bla.*
      Ignore "."
      Ignore ".."
      Recurse 6
     
      ReplaceAll "search" With "replace"
      }
     



3.10 Security in Recursive file sweeps

Recursively descending into directories and performing a globally `destructive' change is an inherently risky thing to do, unless you are certain of the directory structure.

Suppose, for instance, that a user with write access to the filesystem added a symbolic link to /etc/passwd, and we were doing a recursive deletion. Suddenly, cfengine becomes a destructive weapon. For this reason, the default behaviour is that cfengine does not follow symbolic links in recursive descents. The option travlinks can be set to true in order to change this. However, in general, you should never change this option, especially if untrusted users have access to parts of the filesystem, e.g. if you clear /tmp recursively.

As of version 2.0.3 (and 1.6.4), cfagent checks for link race attacks, in which users try to swap a directory for a link between system calls in order to trick cfagent into believing that a link is a directory. Note that, even if travlinks is set to true, cfagent will not follow symbolic links that are not owned by the agent user ID; this is to minimize the possibility of link race attacks, in which users with write access could divert the agent to another part of the filesystem.
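The default behaviour of never descending through a symbolic link can be illustrated with a small recursive shell function. This is only a sketch under simple assumptions: the name sweep_files is hypothetical, and the function does not attempt the link race checks which cfagent performs.

```shell
#!/bin/sh

# sweep_files DIR
# Print the regular files below DIR without ever following a symbolic link.
sweep_files() {
    for entry in "$1"/* "$1"/.[!.]*; do
        if [ -L "$entry" ]; then
            continue              # a link is neither followed nor reported
        elif [ -d "$entry" ]; then
            sweep_files "$entry"  # a real directory: descend into it
        elif [ -f "$entry" ]; then
            echo "$entry"
        fi
    done
}
```

Testing for a link before testing for a directory matters here, because the directory test alone would follow the link and report its target as a directory.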



3.11 Log files written by cfagent

Cfagent keeps two kinds of log file privately, and it also allows you to log its activity to syslog. Syslog logging may be switched on with the Syslog variable (see Reference manual).

The first log cfagent keeps is for every user (every subdirectory of a home directory filesystem). A file ~/.cfengine.rm keeps a list of all the files which were deleted during the last pass of the tidy function. This is useful for users who want to know which files have been removed without their blessing. This helps to identify what is happening on the system in case of accidents.

Another file is built when cfagent searches through file trees in the files action. This is a list of all programs which are setuid root, or setgid root. Since such files are a potential security risk, cfagent always prints a warning when it encounters a new one (one which is not already in its list). This allows the system administrator to keep a watchful eye over new programs which appear and give users root access. The cfengine log is called /var/cfengine/cfengine.log. The file is not readable for general users.



3.12 Quoted strings

In several cfengine commands, you use quoted strings to define a quantity of text which may contain spaces. For example:

     
     control:
     
       macro = ( "mycommand" )
     
     editfiles:
     
       { $(HOME)/myfile
     
        AppendIfNoSuchLine 'This text contains space'
       }
     

In each case you may use any one of the three types of quote marks in order to delimit strings:

     
       ' or " or `
     

If you choose, say ", then you may not use this symbol within the string itself. The same goes for the other types of string delimiters. Unlike the shell, cfengine treats these three delimiters in precisely the same way. There is no difference between them. Before version 2.0.7, if you need to quote a quoted string, you should choose a delimiter which does not conflict with the substring. From version 2.0.7, you can also escape quotes with a backslash:

     
       qstring = ( "One string\"with substring\" escaped" )
     

Note that you can use special variables for certain symbols in a string (see Variable substitution).



3.13 Regular expressions

Regular expressions can be used in cfagent in connection with editfiles and processes to search for lines matching certain expressions. A regular expression is a generalized wildcard. In cfagent wildcards, you can use the characters '?' and '*' to match any single character or any number of characters, respectively. Regular expressions are more complicated than wildcards, but have far more flexibility.

NOTE: the special characters `*' and `?' used in wildcards do not have the same meanings in regular expressions!

Some regular expressions match only a single string. For example, every string which contains no special characters is a regular expression which matches only a string identical to itself. Thus the regular expression `cfengine' would match only the string "cfengine", not "Cfengine" or "cfengin" etc. Other regular expressions could match more general strings. For instance, the regular expression `c*' matches any number of c's (including none). Thus this expression would match the empty string, "c", "cccc", "ccccccccc", but not "cccx".
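The difference between the two notations can be tried out in the shell. The sketch below uses grep -E as a stand-in for cfagent's regular expression matcher and assumes, as the examples above imply, that the expression must match the whole string; the function names are hypothetical.

```shell
#!/bin/sh

# matches_regex REGEX STRING
# Succeed if the extended regular expression matches the whole string.
matches_regex() {
    printf '%s\n' "$2" | grep -E -x -q -- "$1"
}

# matches_wildcard PATTERN STRING
# Succeed if the shell wildcard pattern matches the whole string.
matches_wildcard() {
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}
```

Here `matches_regex 'c*' cccx` fails, because the expression stands for a run of c's only, whereas `matches_wildcard 'c*' cccx` succeeds, because the wildcard `*' stands for any characters at all.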

Here is a list of regular expression special characters and operators.

`\'
The backslash character normally has a special purpose: either to introduce a special command, or to tell the expression interpreter that the next character is not to be treated as a special character. The backslash character stands for itself only when protected by square brackets [\] or quoted with a backslash itself `\\'.
`\b'
Match a word boundary (operator).
`\B'
Match within a word (operator).
`\<'
Match beginning of word.
`\>'
Match end of word.
`\w'
Match a character which can be part of a word.
`\W'
Match a character which cannot be part of a word.
`any character'
Matches itself.
`.'
Matches any single character.
`*'
Match zero or more instances of the previous object. e.g. `c*'. If no object precedes it, it represents a literal asterisk.
`+'
Match one or more instances of the preceding object.
`?'
Match zero or one instance of the preceding object.
`{ }'
Number of matches operator. `{5}' would match exactly 5 instances of the previous object. `{6,}' would match at least 6 instances of the previous object. `{7,12}' would match at least 7 instances of, but no more than 12 instances of the preceding object. Clearly the first number must be less than the second to make a valid search expression.
`|'
The logical OR operator, OR's any two regular expressions.
`[list]'
Defines a list of characters which are to be considered as a single object (ORed). e.g. `[a-z]' matches any character in the range a to z, `[abcd]' matches either a, b, c or d. Most characters are ordinary inside a list, but there are some exceptions: `]' ends the list unless it is the first item, `\' quotes the next character, `[:' and `:]' define a character class operator (see below), and `-' represents a range of characters unless it is the first or last character in the list.
`[^list]'
Defines a list of characters which are NOT to be matched. i.e. match any character except those in the list.
``