Cfengine configuration directory

cfenvd

Relevant classes: any

As of version 2.0.0 of cfengine, there is an extra daemon, cfenvd, which collects statistical data about the recent history of each host (the past two months). You must have the Berkeley database installed in order to use it. The cf-environment daemon is meant to be trivial to use. The long-term data recorded are:
Number of users
Number of root processes
Number of non-root processes
Percentage disk full for /
Number of incoming and outgoing sockets for
   netbiosns,
   netbiosdgm,
   netbiosssn,
   irc,
   cfengine,
   nfsd,
   smtp,
   www,
   ftp,
   ssh,
   telnet

The results of the daemon will not be reliable until about six to eight weeks after installing and running it, since a suitable training period is required to build up enough data. The daemon automatically adapts to changing conditions, but has a built-in inertia which prevents anomalous signals from being given too much credence. Persistent changes will gradually change the `normal state' of the host over an interval of a few weeks.

The cfenvd daemon does not have to be run with root privileges, but it must be able to write to a database. The database records one week's worth of data, which are iteratively updated in order to give an approximate decaying average based on two months' worth of data. The size of the database is approximately 2MB. Measurements are taken approximately every five minutes. This interval is based on auto-correlation times measured for networked hosts in practice. When changes in the state are observed, cfenvd records the time and state of the changes using a scale:

-2 -1 -0 +0 +1 +2
or
(low) anomalous loaded normal | normal loaded anomalous (high)
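
The weekly database with its iteratively updated, decaying average can be pictured with the following minimal sketch. It is not cfenvd's actual code: the slot layout, the weight `ALPHA` and the function names are assumptions chosen for illustration. Each five-minute slot of the week holds a running mean and variance, and a new measurement is folded into the slot for the same time of week.

```python
# Sketch only: slot layout and ALPHA are assumptions, not cfenvd's real values.
SLOT_MINUTES = 5
SLOTS_PER_WEEK = 7 * 24 * 60 // SLOT_MINUTES  # 2016 five-minute slots per week

ALPHA = 0.3  # hypothetical weight given to the newest measurement


def slot_index(minutes_since_week_start):
    """Map a time offset (in minutes) to its five-minute slot within the week."""
    return (minutes_since_week_start // SLOT_MINUTES) % SLOTS_PER_WEEK


def update(database, minutes_since_week_start, value):
    """Fold one measurement into the decaying mean and variance of its slot."""
    i = slot_index(minutes_since_week_start)
    # The first measurement for a slot seeds the mean with zero variance.
    mean, var = database.get(i, (value, 0.0))
    delta = value - mean
    new_mean = mean + ALPHA * delta
    new_var = (1 - ALPHA) * (var + ALPHA * delta * delta)
    database[i] = (new_mean, new_var)
    return database[i]
```

Because every update keeps a fraction `1 - ALPHA` of the old value, the contribution of older weeks fades geometrically. This is the kind of inertia that stops a single anomalous reading from shifting the stored average very far.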

cfenvd sets a number of classes in cfengine which describe the current state of the host in relation to its recent history. The classes describe whether a parameter is above or below its average value, and how far from the average the current value is, in units of the standard deviation. Note carefully that the distribution of data about the mean is often not Gaussian/normal, but skewed about the mean, depending on the loading of the system. The structure of classes is:

VariableType_[high|normal|low]_[normal|dev1|dev2|dev3|anomaly]
                      ^                      ^
                      |                      |
         Above or below average      over 2 stddevs

The second field measures whether the current value is consistently above the average values. There are two average values to consider, in order to be certain there is no fluke: the stored weekly average, and the local (recent time) average. Both of these must be above or below in order to generate a high or low result.
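
The naming scheme and the two-average rule can be sketched as follows. This is an illustration, not cfenvd's implementation: the function name `classify` and the bucket boundaries (1, 2, 3 and 4 standard deviations) are assumptions, as is measuring the distance from the weekly average.

```python
def classify(variable, value, weekly_avg, local_avg, stddev):
    """Build a class name of the form Variable_[high|normal|low]_[...]."""
    # A high or low result requires agreement with BOTH averages,
    # so that a fluke against only one of them is reported as normal.
    if value > weekly_avg and value > local_avg:
        direction = "high"
    elif value < weekly_avg and value < local_avg:
        direction = "low"
    else:
        direction = "normal"

    # Distance from the weekly average in units of the standard deviation.
    distance = abs(value - weekly_avg) / stddev if stddev > 0 else 0.0

    # Bucket boundaries are assumed for illustration only.
    if distance > 4:
        label = "anomaly"
    elif distance > 3:
        label = "dev3"
    elif distance > 2:
        label = "dev2"
    elif distance > 1:
        label = "dev1"
    else:
        label = "normal"

    return f"{variable}_{direction}_{label}"
```

For instance, under these assumed thresholds a value of 50 against a weekly average of 20, a local average of 25 and a standard deviation of 8 lies above both averages and 3.75 deviations out, so the sketch returns `RootProcs_high_dev3` for the variable `RootProcs`.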

Here is an example script for following the state of anomalies:


 # Action section assumed; the excerpt omits its header
 shellcommands:

 # ROOT PROCS

 RootProcs_high_dev3::   # over 3 std deviations of the average

   "/bin/echo RootProc anomaly high 3 dev on $(host) value $(value_rootprocs) average $(average_rootprocs) pm $(stddev_rootprocs)"

 # USER PROCS

 UserProcs_high_dev3::

   "/bin/echo UserProc anomaly high 3 dev on $(host) value $(value_otherprocs) average $(average_otherprocs) pm $(stddev_otherprocs)"

 UserProcs_high_anomaly::

   "/bin/echo UserProc anomaly high 4 dev!! on $(host)"

 # WWW IN

 www_in_high_dev3::

   "/bin/echo Incoming www anomaly high 3 dev on $(host) - value $(value_www_in) average $(average_www_in) pm $(stddev_www_in)"

 www_in_high_anomaly::

   "/bin/echo Incoming www anomaly high 4 dev!! on $(host) - value $(value_www_in) average $(average_www_in) pm $(stddev_www_in)"

The current and average values of each of these variables are passed on to cfagent and may be used in rules:
$(value_users)                   $(average_users)
$(value_rootprocs)               $(average_rootprocs)
$(value_otherprocs)              $(average_otherprocs)
$(value_diskfree)                $(average_diskfree)
$(value_netbiosns_in)            $(average_netbiosns_in)
$(value_netbiosns_out)           $(average_netbiosns_out)
$(value_netbiosdgm_in)           $(average_netbiosdgm_in)
$(value_netbiosdgm_out)          $(average_netbiosdgm_out)
$(value_netbiosssn_in)           $(average_netbiosssn_in)
$(value_netbiosssn_out)          $(average_netbiosssn_out)
$(value_irc_in)                  $(average_irc_in)
$(value_irc_out)                 $(average_irc_out)
$(value_cfengine_in)             $(average_cfengine_in)
$(value_cfengine_out)            $(average_cfengine_out)
$(value_nfsd_in)                 $(average_nfsd_in)
$(value_nfsd_out)                $(average_nfsd_out)
$(value_smtp_in)                 $(average_smtp_in)
$(value_smtp_out)                $(average_smtp_out)
$(value_www_in)                  $(average_www_in)
$(value_www_out)                 $(average_www_out)
$(value_ftp_in)                  $(average_ftp_in)
$(value_ftp_out)                 $(average_ftp_out)
$(value_ssh_in)                  $(average_ssh_in)
$(value_ssh_out)                 $(average_ssh_out)
$(value_telnet_in)               $(average_telnet_in)
$(value_telnet_out)              $(average_telnet_out)