smokeping_config - Reference for the SmokePing Config File
SmokePing takes its configuration from a single central configuration file.
Its location must be hardcoded in the smokeping script and smokeping.cgi.
The content of this manual is generated directly from the configuration
file parser.
The parser for the configuration file is written using David Schweikert's
ParseConfig module. Read all about it in the ISG::ParseConfig manpage.
The Configuration file has a tree-like structure with section headings at
various levels. It also contains variable assignments and tables.
The text below describes the syntax of the SmokePing configuration file.
The General section holds configuration values valid for the whole SmokePing setup.
The following variables can be set in this section:
- owner (mandatory setting)
-
Name of the person responsible for this smokeping installation.
- imgcache (mandatory setting)
-
A directory which is visible on your webserver where SmokePing can cache graphs.
- imgurl (mandatory setting)
-
Either an absolute URL to the imgcache directory or one relative to the directory where you keep the
SmokePing cgi.
- datadir (mandatory setting)
-
The directory where SmokePing can keep its rrd files.
- pagedir
-
Directory to store static representations of pages.
- piddir (mandatory setting)
-
The directory where SmokePing keeps its pid when daemonised.
- sendmail
-
Path to your sendmail binary. It will be used for sending mails in connection with the support of DYNAMIC addresses.
- offset
-
If you run many instances of smokeping you may want to prevent them from
hitting your network all at the same time. Using the offset parameter you
can change the point in time when the probes are run. Offset is specified
in % of total interval, or alternatively as 'random'. I recommend using
'random'. Note that this does NOT influence the RRDs themselves; it is just a
matter of when data acquisition is initiated. The default offset is 'random'.
- smokemail (mandatory setting)
-
Path to the mail template for DYNAMIC hosts. This mail template
must contain keywords of the form <##keyword##>. There is a sample
template included with SmokePing.
- cgiurl (mandatory setting)
-
Complete URL path of smokeping.cgi.
- mailhost
-
Instead of using sendmail, you can specify the name of an smtp server
and use perl's Net::SMTP module to send mail to DYNAMIC host owners (see below).
- contact (mandatory setting)
-
Mail address of the person responsible for this smokeping installation.
- netsnpp
- syslogfacility
-
The syslog facility to use, e.g. local0...local7.
Note: syslog logging is only used if you specify this.
- syslogpriority
-
The syslog priority to use, e.g. debug, notice or info.
Default is info.
- concurrentprobes
-
If you use multiple probes or multiple instances of the same probe and you
want them to run concurrently in separate processes, set this to 'yes'. This
gives you the possibility to specify probe-specific step and offset parameters
(see the 'Probes' section) for each probe and makes the probes unable to block
each other in cases of service outages. The default is 'yes', but if you for
some reason want the old behaviour you can set this to 'no'.
- changeprocessnames
-
When using 'concurrentprobes' (see above), this controls whether the probe
subprocesses should change their argv string to indicate their probe in
the process name. If set to 'yes' (the default), the probe name will
be appended to the process name as '[probe]', e.g. '/usr/bin/smokeping
[FPing]'. If you don't like this behaviour, set this variable to 'no'.
If 'concurrentprobes' is not set to 'yes', this variable has no effect.
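The offset variable described above can be illustrated with a little arithmetic. The following Python sketch is not SmokePing code; the function name start_delay, the 300 second step and the uniform distribution used for 'random' are assumptions made for illustration only.

```python
import random

def start_delay(step, offset="random"):
    """Seconds after the start of an interval at which polling begins.

    step   -- base interval in seconds (assumed value, see Database section)
    offset -- a percentage such as "50%" (or "50"), or the string "random"
    """
    if offset == "random":
        # Assumption: any point inside the interval is equally likely.
        return random.uniform(0, step)
    percent = float(str(offset).rstrip("%"))
    if not 0 <= percent <= 100:
        raise ValueError("offset must be between 0 and 100 percent")
    return step * percent / 100.0

print(start_delay(300, "50%"))  # 150.0
```

With a step of 300 seconds, an offset of 50% moves the start of each polling round 150 seconds into the interval, while the RRD timestamps stay unaffected.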
The Database section describes the properties of the round robin database for storing the
SmokePing data. Note that it is not possible to edit existing RRDs
by changing the entries in the cfg file.
The following variables can be set in this section:
- step (mandatory setting)
-
Duration of the base operation interval of SmokePing in seconds.
SmokePing will venture out every step seconds to ping your target hosts.
If 'concurrentprobes' is set to 'yes' (see above), this variable can be
overridden by each probe. Note that the step in the RRD files is fixed when
they are originally generated, and if you change the step parameter afterwards,
you'll have to delete the old RRD files or somehow convert them.
- pings (mandatory setting)
-
How many pings should be sent to each target. Suggested: 20 pings.
This can be overridden by each probe. Some probes (those derived from
basefork.pm, i.e. most except the FPing variants) will even let this
be overridden target-specifically in the PROBE_CONF section (see the
basefork documentation for details). Note that the number of pings in
the RRD files is fixed when they are originally generated, and if you
change this parameter afterwards, you'll have to delete the old RRD
files or somehow convert them.
This section also contains a table describing the setup of the
SmokePing database. Below are reasonable defaults. Only change them if
you know rrdtool and its workings. Each row in the table describes one RRA.
# cons xff steps rows
AVERAGE 0.5 1 1008
AVERAGE 0.5 12 4320
MIN 0.5 12 4320
MAX 0.5 12 4320
AVERAGE 0.5 144 720
MAX 0.5 144 720
MIN 0.5 144 720
- column 0
-
Consolidation method.
- column 1
-
What part of the consolidated intervals must be known to warrant a known entry.
- column 2
-
How many steps to consolidate into each RRA entry.
- column 3
-
How many rows this RRA should have.
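To see what the default table means in practice, it helps to work out how much time each RRA covers: one row spans step * steps seconds, and the whole RRA spans that times rows. The Python sketch below does this arithmetic. The step of 300 seconds is an assumption for illustration (it is not given in this section), and rra_span is a hypothetical helper name.

```python
STEP = 300  # assumed base step in seconds; use the value from your own config

# (consolidation, xff, steps, rows) copied from the table above
RRAS = [
    ("AVERAGE", 0.5, 1, 1008),
    ("AVERAGE", 0.5, 12, 4320),
    ("MIN", 0.5, 12, 4320),
    ("MAX", 0.5, 12, 4320),
    ("AVERAGE", 0.5, 144, 720),
    ("MAX", 0.5, 144, 720),
    ("MIN", 0.5, 144, 720),
]

def rra_span(step, steps, rows):
    """Return (seconds per row, total seconds covered) for one RRA."""
    resolution = step * steps
    return resolution, resolution * rows

for cons, xff, steps, rows in RRAS:
    resolution, covered = rra_span(STEP, steps, rows)
    print(f"{cons:8} {resolution:6d} s per row, {covered / 86400:6.1f} days")
```

Under that assumption the three resolutions cover 3.5 days at 5 minutes per row, 180 days at 1 hour per row, and 360 days at 12 hours per row.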
The Presentation section defines how the SmokePing data should be presented.
The following variables can be set in this section:
- template (mandatory setting)
-
The webpage template must contain keywords of the form
<##keyword##>. There is a sample
template included with SmokePing; use it as the basis for your
experiments. Default template contains a pointer to the SmokePing
counter and homepage. I would be glad if you would not remove this as
it gives me an indication as to how widely used the tool is.
- charset
-
By default, SmokePing assumes the 'iso-8859-15' character set. If you use
something else, this is the place to speak up.
The following sections are valid on level 1:
- +overview (mandatory section)
-
The Overview section defines how the Overview graphs should look.
-
The following variables can be set in this section:
- width (mandatory setting)
-
Width of the Overview Graphs.
- height (mandatory setting)
-
Height of the Overview Graphs.
- range
-
How much time should be depicted in the Overview graph. Time must be specified
as a number followed by a letter which specifies the unit of time. Known units are:
s (seconds), m (minutes), h (hours), d (days), w (weeks), y (years).
- max_rtt
-
Any roundtrip time larger than this value will be cropped in the overview graph.
- median_color
-
By default the median line is drawn in red. Override it here with a hex color
in the format rrggbb.
- strftime
-
Use POSIX strftime to format the timestamp in the lower left-hand
corner of the overview graph.
- +detail (mandatory section)
-
The following variables can be set in this section:
- width (mandatory setting)
-
Width of the detail graphs in pixels.
- height (mandatory setting)
-
Height of the detail graphs in pixels.
- logarithmic
-
Whether the graphs should be shown in a logarithmic scale (yes/no).
- unison_tolerance
-
If a graph's 'max' differs from the median 'max' by more than this factor, it drops out of the unison scaling algorithm. A factor of two means that any graph with a max either less than half or more than twice the median 'max' will be dropped from unison scaling.
- max_rtt
-
Any roundtrip time larger than this value will be cropped in the detail graph.
- strftime
-
Use POSIX strftime to format the timestamp in the lower left-hand
corner of the detail graph.
- nodata_color
-
Paint the graph background in a special color when there is no data for this period because smokeping has not been running (#rrggbb)
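The unison_tolerance rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not SmokePing's implementation; the function name unison_members, the sample values, and the choice to treat the boundaries as inclusive are all assumptions.

```python
from statistics import median

def unison_members(max_values, tolerance):
    """Split graphs into those kept in unison scaling and those dropped.

    max_values -- mapping of graph name to its 'max' value
    tolerance  -- the unison_tolerance factor (for example 2)
    """
    mid = median(max_values.values())
    kept, dropped = {}, {}
    for name, value in max_values.items():
        # Assumption: values exactly on the boundary are kept.
        if mid / tolerance <= value <= mid * tolerance:
            kept[name] = value
        else:
            dropped[name] = value
    return kept, dropped

kept, dropped = unison_members(
    {"a": 10, "b": 12, "c": 11, "d": 50, "e": 4}, tolerance=2
)
print(sorted(kept), sorted(dropped))
```

Here the median 'max' is 11, so with a factor of two only values between 5.5 and 22 stay in unison scaling; graphs d and e are dropped.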
The detailed display can contain several graphs of different resolution. In this
table you can specify the resolution of each graph.
Example:
"Last 3 Hours" 3h
"Last 30 Hours" 30h
"Last 10 Days" 10d
"Last 400 Days" 400d
- column 0
-
Description of the particular resolution.
- column 1
-
How much time should be depicted. The format is the same as for the range parameter of the Overview section.
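The time format used in the table (a number followed by a unit letter, as in 3h or 10d) can be converted to seconds as in the following Python sketch. It assumes each unit is written as its first letter, as the example table suggests. The helper name range_to_seconds and the 365-day year are assumptions, not taken from SmokePing.

```python
import re

# Seconds per unit letter. The length of a year is an assumption (365 days).
UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 7 * 86400,
    "y": 365 * 86400,
}

def range_to_seconds(text):
    """Convert a value such as '3h' or '10d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", text.strip())
    if match is None:
        raise ValueError(f"not a number followed by a unit letter: {text!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]

for label, span in [("Last 3 Hours", "3h"), ("Last 30 Hours", "30h"),
                    ("Last 10 Days", "10d"), ("Last 400 Days", "400d")]:
    print(f"{label:14} {range_to_seconds(span):>9d} s")
```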
The following sections are valid on level 2:
- ++loss_colors
-
In the Detail view, the color of the median line depends on
the amount of lost packets. SmokePing comes with a reasonable default setting,
but you may choose to disagree. The table below
lets you specify your own coloring.
-
Example:
-
# Loss Color Legend
1 00ff00 "<1"
3 0000ff "<3"
100 ff0000 ">=3"
- column 0
-
Activate when the loss rate (in percent) is larger than or equal to this number.
- column 1
-
Color for this range.
- column 2
-
Description for this range.
- ++uptime_colors
-
When monitoring a host with DYNAMIC addressing, SmokePing will keep
track of how long the machine is able to keep the same IP
address. This time is plotted as a color in the graph's
background. SmokePing comes with a reasonable default setting, but you
may choose to disagree. The table below lets you specify your own
coloring.
-
Example:
-
# Uptime Color Legend
3600 00ff00 "<1h"
86400 0000ff "<1d"
604800 ff0000 "<1w"
1000000000000 ffff00 ">1w"
-
Uptime is in seconds (3600 is one hour, 86400 one day).
- column 0
-
Activate when uptime in seconds is larger than or equal to this number.
- column 1
-
Color for this uptime range.
- column 2
-
Description for this range.
The Probes Section configures Probe modules. Probe modules integrate an external ping command into SmokePing. Check the documentation of the FPing module for configuration details.
The following sections are valid on level 1:
- +/[-_0-9a-zA-Z]+/
-
Each module can take specific configuration information from this area. The jumble of letters above is a regular expression defining legal module names.
-
The following variables can be set in this section:
- step
-
Duration of the base interval that this probe should use, if different
from the one specified in the 'Database' section. Note that the step in
the RRD files is fixed when they are originally generated, and if you
change the step parameter afterwards, you'll have to delete the old RRD
files or somehow convert them. (This variable is only applicable if
the variable 'concurrentprobes' is set in the 'General' section.)
- offset
-
If you run many probes concurrently you may want to prevent them from
hitting your network all at the same time. Using the probe-specific
offset parameter you can change the point in time when each probe will
be run. Offset is specified in % of total interval, or alternatively as
'random', and the offset from the 'General' section is used if nothing
is specified here. Note that this does NOT influence the RRDs themselves;
it is just a matter of when data acquisition is initiated.
(This variable is only applicable if the variable 'concurrentprobes' is set
in the 'General' section.)
- pings
-
How many pings should be sent to each target, if different from the global
value specified in the Database section. Some probes (those derived from
basefork.pm, i.e. most except the FPing variants) will even let this be
overridden target-specifically in the PROBE_CONF section (see the
basefork documentation for details). Note that the number of pings in
the RRD files is fixed when they are originally generated, and if you
change this parameter afterwards, you'll have to delete the old RRD
files or somehow convert them.
- /[-_0-9a-zA-Z.]+/
-
Each module defines which
variables it wants to accept. So this expression here just defines legal variable names.
The following sections are valid on level 2:
- ++/[-_0-9a-zA-Z]+/
-
You can define multiple instances of the same probe with subsections.
These instances can have different values for their variables, so you
can e.g. have one instance of the FPing probe with packet size 1000 and
step 30 and another instance with packet size 64 and step 300.
The name of the subsection determines what the probe will be called, so
you can write descriptive names for the probes.
-
If there are any subsections defined, the main section for this probe
will just provide default parameter values for the probe instances, ie.
it will not become a probe instance itself.
-
The following variables can be set in this section:
- step
- offset
- pings
- /[-_0-9a-zA-Z.]+/
-
Each module defines which
variables it wants to accept. So this expression here just defines legal variable names.
The Alert section lets you setup loss and RTT pattern detectors. After each
round of polling, SmokePing will examine its data and determine which
detectors match. Detectors are enabled per target and get inherited by
the target's children.
Detectors are not just simple thresholds which go off at first sight
of a problem. They are configurable to detect special loss or RTT
patterns. They let you look at a number of past readings to make a
more educated decision on what kind of alert should be sent, or if an
alert should be sent at all.
The patterns are numbers prefixed with an operator indicating the type
of comparison required for a match.
The following RTT pattern detects if a target's RTT goes from constantly
below 10ms to constantly 100ms and more:
old ------------------------------> new
<10,<10,<10,<10,<10,>10,>100,>100,>100
Loss patterns work in a similar way, except that the loss is defined as the
percentage of the total number of packets sent that were not received.
old ------------------------------> new
==0%,==0%,==0%,==0%,>20%,>20%,>=20%
Apart from normal numbers, patterns can also contain the value *,
which is true for all values regardless of the operator, and the value U,
which is true for unknown data together with the == and != operators.
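The way such a pattern is compared against a series of readings can be sketched in Python. This is a simplified illustration and not SmokePing's own matcher: it handles only plain comparisons, * and U, and does not cover the special values S and *X* described further on. The names parse_element and matches are hypothetical, inequality is written != in the sketch, unknown readings are represented by None, and the rule that unknown data fails every numeric comparison is an assumption.

```python
import operator

# Operators used in this sketch; inequality is written "!=" here.
# Two-character operators come first so that ">=" is tried before ">".
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
}

def parse_element(element):
    """Split one element such as '>20%' into (operator, value).

    The value is a float, or None for U (unknown).
    A '*' value yields ('*', None) whatever the operator.
    """
    element = element.strip().rstrip("%")
    if element == "*":
        return "*", None
    for symbol in OPS:
        if element.startswith(symbol):
            value = element[len(symbol):]
            if value == "*":
                return "*", None
            return symbol, None if value == "U" else float(value)
    raise ValueError(f"unrecognised pattern element: {element!r}")

def matches(pattern, readings):
    """True if the newest readings satisfy the pattern.

    readings is ordered old -> new; None stands for unknown data.
    """
    elements = [parse_element(e) for e in pattern.split(",")]
    if len(readings) < len(elements):
        return False
    for (symbol, wanted), actual in zip(elements, readings[-len(elements):]):
        if symbol == "*":
            continue  # true for every value
        if wanted is None:  # comparing against U
            if symbol not in ("==", "!="):
                return False
            if (actual is None) != (symbol == "=="):
                return False
        elif actual is None:
            return False  # assumption: unknown data fails numeric tests
        elif not OPS[symbol](actual, wanted):
            return False
    return True

rtt = [5, 6, 7, 8, 9, 50, 120, 130, 140]
print(matches("<10,<10,<10,<10,<10,>10,>100,>100,>100", rtt))  # True
```

Only the newest readings are compared, one per pattern element, so a pattern of nine elements needs at least nine samples before it can match.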
Detectors normally act on state changes. This has the disadvantage that
they will fail to find conditions which were already present when launching
smokeping. To cover this case, detectors can begin with the special value
==S, which is inserted whenever smokeping is started up.
You can write
==S,>20%,>20%
to detect lines that have been losing more than 20% of the packets for two
periods after startup.
Sometimes conditions occur at irregular intervals, but you still
only want to throw an alert if they occur several times within a certain
number of samples. The operator *X* will ignore up to X values and still
let the pattern match:
>10%,*10*,>10%
will fire if more than 10% of the packets have been lost twice over the
last 10 samples.
A complete example:
*** Alerts ***
to = admin@company.xy,peter@home.xy
from = smokealert@company.xy
+lossdetect
type = loss
# in percent
pattern = ==0%,==0%,==0%,==0%,>20%,>20%,>20%
comment = suddenly there is packet loss
+miniloss
type = loss
# in percent
pattern = >0%,*12*,>0%,*12*,>0%
comment = detected loss 3 times over the last two hours
+rttdetect
type = rtt
# in milliseconds
pattern = <10,<10,<10,<10,<10,<100,>100,>100,>100
comment = routing messed up again ?
+rttbadstart
type = rtt
# in milliseconds
pattern = ==S,==U
comment = offline at startup
The following variables can be set in this section:
- to (mandatory setting)
- from (mandatory setting)
The following sections are valid on level 1:
- +/[^\s,]+/
-
The following variables can be set in this section:
- type (mandatory setting)
-
Currently the pattern types rtt, loss and matcher are known.
- pattern (mandatory setting)
-
A comma separated list of comparison operators and numbers. rtt patterns are in milliseconds, loss patterns are in percent.
- comment (mandatory setting)
- to
The Target Section defines the actual work of SmokePing. It contains a hierarchical list
of hosts which mark the endpoints of the network connections the system should monitor.
Each section can contain one host as well as other sections.
The following variables can be set in this section:
- probe (mandatory setting)
-
The name of the probe module to be used for this host. The value of
this variable gets propagated to child sections.
- menu (mandatory setting)
-
Menu entry for this section. If not set this will be set to the hostname.
- title (mandatory setting)
-
Title of the page when it is displayed. This will be set to the hostname if
left empty.
- remark
-
An optional remark on the current section. It gets displayed on the webpage.
- alerts
-
A comma separated list of alerts to check for this target. The alerts have
to be setup in the Alerts section. Alerts are inherited by child nodes. Use
an empty alerts definition to remove inherited alerts from the current target
and its children.
The following sections are valid on level 1:
- +PROBE_CONF
-
Probe specific variables.
-
The following variables can be set in this section:
- /[-_0-9a-zA-Z.]+/
-
Should be found in the documentation of the
corresponding probe. The values get propagated to those child
nodes using the same Probe.
- +/[-_0-9a-zA-Z]+/
-
Each target section can contain information about a host to monitor as
well as further target sections. Most variables have already been
described above. The expression above defines legal names for target
sections.
-
The following variables can be set in this section:
- probe
- menu
- title
- alerts
-
Comma separated list of alert names
- note
-
Some information about this entry which does NOT get displayed on the web.
- email
-
This is the contact address for the owner of the current host. In connection with DYNAMIC hosts,
the address will be used for sending the script mentioned below.
- host
-
Can either contain the name of a target host or the string DYNAMIC.
-
In the second case, the target machine has a dynamic IP address and
thus is required to regularly contact the SmokePing server to verify
its IP address. When starting SmokePing with the commandline argument
--email it will add a secret password to each of the DYNAMIC
host lines and send a script to the owner of each host. This script
must be started regularly on the host in question to make sure
SmokePing monitors the right box. If the target machine supports
SNMP, SmokePing will also query the host's
sysContact, sysName and sysLocation properties to make sure it is
still the same host.
- remark
- rawlog
-
Log the raw data gathered for this target, in tab separated format, to a file with the
same basename as the corresponding RRD file. Use POSIX strftime to format the timestamp to be
put into the file name. The filename is built like this:
-
basename.strftime.csv
-
Example:
-
rawlog=%Y-%m-%d
-
this would create a new logfile every day with a name like this:
-
targethost.2004-05-03.csv
- alertee
-
If you want to have alerts for this target and all targets below it go to a particular address
on top of the address already specified in the alert, you can add it here. This can be a comma separated list of items.
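The rawlog file naming described above (basename.strftime.csv) can be reproduced with Python's own strftime. The sketch below only illustrates how the name is put together; rawlog_name is a hypothetical helper, and the date is the one used in the rawlog example.

```python
from datetime import datetime

def rawlog_name(basename, rawlog_format, when):
    """Build 'basename.strftime.csv' for the given point in time."""
    return f"{basename}.{when.strftime(rawlog_format)}.csv"

print(rawlog_name("targethost", "%Y-%m-%d", datetime(2004, 5, 3)))
# targethost.2004-05-03.csv
```

A format without the day, such as %Y-%m, would produce one file per month instead of one per day.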
The following sections are valid on level 2:
- ++PROBE_CONF
-
Probe specific variables.
-
The following variables can be set in this section:
- /[-_0-9a-zA-Z.]+/
-
Should be found in the documentation of the
corresponding probe. The values get propagated to those child
nodes using the same Probe.
- ++/[-_0-9a-zA-Z]+/
-
Each target section can contain information about a host to monitor as
well as further target sections. Most variables have already been
described above. The expression above defines legal names for target
sections.
Copyright (c) 2001-2003 by Tobias Oetiker. All rights reserved.
This program is free software; you can redistribute it
and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE. See the GNU General Public License for more
details.
You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 675 Mass Ave, Cambridge, MA
02139, USA.
Tobias Oetiker <tobi@oetiker.ch>