Logfile Monitors
There are several Hybrid Indication Policy Monitor Types available:
Logfile Monitor
One of the most used features of any monitoring solution is logfile monitoring. This function is implemented as a Hybrid Policy with a Java monitor. The LogFileMonitor implementation offers flexible ways to specify the format of the monitored logfile and a pre-processing filter. The monitor automatically handles truncation of the monitored file. It supports file masks for finding rotating log files.
In the Nagin Trigger/LogFileMonitor Details section the following data needs to be defined: the Java type has to be selected, and the polling interval and the call parameters have to be defined.
Call:
Line1: com.blixx.agent.monitors.LogFileMonitor
Line2: "path to the logfile"
Line3: "pattern matching start of message blocks"
Line4: "general pattern matching entries"
Line5 (optional): "one of: FROM_START | FROM_LAST"
Line6 (optional): "executable call before logfile processing"
Example call:
com.blixx.agent.monitors.LogFileMonitor
/var/log/apache2/error_log
\[\w{3}\s+\w{3}\s\d{2}\s\d{2}:\d{2}:\d{2}\s\d{4}\].*
.*
The most difficult part of the configuration is Line 3. You can easily adjust the pattern using the Pattern Validation Dialog, which is integrated in the boom GUI (see screenshot below). In practice it is best to fetch an existing logfile from the managed node and load it into the dialog. If a logfile has no multi-line entries, a simple ".*" (match all) pattern can be used.
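To make the role of Line 3 more concrete, the following Java sketch groups raw lines into message blocks using the start pattern from the example call. It only illustrates the idea and is not the agent's implementation: the class name, the method `groupBlocks`, the sample lines, and the choice of whole-line matching are assumptions made here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class StartPatternSketch {
    // Line 3 of the example call: marks the first line of an error_log entry.
    static final Pattern START = Pattern.compile(
            "\\[\\w{3}\\s+\\w{3}\\s\\d{2}\\s\\d{2}:\\d{2}:\\d{2}\\s\\d{4}\\].*");

    // Groups lines into message blocks. A line matching START opens a new block,
    // any other line is appended to the current block.
    // Assumption: the whole line has to match; lines before the first match are ignored.
    static List<String> groupBlocks(List<String> lines) {
        List<String> blocks = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (START.matcher(line).matches()) {
                if (current != null) {
                    blocks.add(current.toString());
                }
                current = new StringBuilder(line);
            } else if (current != null) {
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            blocks.add(current.toString());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from a real server.
        List<String> lines = List.of(
                "[Thu Sep 29 14:36:00 2011] [error] first line of an entry",
                "    continuation line without a timestamp",
                "[Thu Sep 29 14:36:05 2011] [notice] single-line entry");
        for (String block : groupBlocks(lines)) {
            System.out.println("--- block ---");
            System.out.println(block);
        }
    }
}
```

With a ".*" start pattern every line opens its own block, which corresponds to the single-line case described above.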
Logfile Transaction Monitor
Another modification of the Logfile Monitor can be used for monitoring transaction-like log entries. This Hybrid Java Monitor detects unclosed or failed transactions by tracking logfile entries and their expected follow-up entries.
Why use the Logfile Transaction Monitor: More than one transaction is writing to the same monitored file.
There are two types of Logfile Transaction Monitor. The "Multiline" type lets you track multiline logfile entries; the delimiter is defined in the "Split Records" field using the pattern notation.
The Logfile Transaction Monitor has the following details:
Monitor Type: Logfile Transaction Monitor / (MP)Logfile Transaction Monitor
Predefined objects provide the possibility to filter the following transactions:
FAIL_END for failed transactions (matches the fail pattern)
The following predefined variables can be used:
<$SUCCESS_END_MSG> includes the finish line
<$FAIL_END_MSG> includes the fail line
Note: The monitored file will always be scanned from the last processed position.
Example 1:
Monitoring request:
All log entries for failed cron jobs should be identified/filtered from /var/adm/cron/log.
Initial position:
- A good-job is executed every minute and succeeds
- A failed-job is executed every minute and fails
Analysis:
- More than one job is writing to the same log file. The entries of the different jobs are mixed up, hence the normal Logfile monitor doesn’t fit.
- Every log entry consists of one single line.
- The START, SUCCESS and FAILED line can be identified via patterns
- Solution: use of the "Logfile Transaction Monitor"
Extract of the cron log file:
root : CMD ( echo "Failjob"; exit 1 ) : PID ( 311444 ) : Thu Sep 29 14:36:00 2011
root : CMD ( echo "Gutjob"; exit 0 ) : PID ( 364788 ) : Thu Sep 29 14:36:00 2011
Cron Job with pid: 364788 Successful
Cron Job with pid: 311444 Failed
The log file contains the following transactions:
Start jobs:
root : CMD ( echo "Failjob"; exit 1 ) : PID ( 311444 ) : Thu Sep 29 14:36:00 2011
root : CMD ( echo "Gutjob"; exit 0 ) : PID ( 364788 ) : Thu Sep 29 14:36:00 2011
Failed job:
Cron Job with pid: 311444 Failed
Success finish job:
Cron Job with pid: 364788 Successful
Setup of the "Logfile Transaction Monitor":
1. Load the cron log file entries (Start/Fail/Success) into the pattern validation dialog.
Define the Start Pattern / Finish Pattern / Fail Pattern for the corresponding log entries.
Add the patterns to the corresponding fields in the hybrid indication policy.
In the example the job PID defines the dependency between the Start Line and the Fail/Finish Line and hence identifies a single transaction.
In the Finish and Fail patterns, use the variables extracted by the start pattern. In our example a single variable in the start pattern needs to be extracted (see OPM Monitor) and used as <$svar3> in the finish and fail pattern (PID = <$svar3>). The PID has to be the same, otherwise the lines don’t belong together.
Note: It is not possible to extract variables from the fail and finish pattern.
Start Line Pattern:
root : CMD ( echo "Failjob"; exit 1 ) : PID ( 311444 ) : Thu Sep 29 14:36:00 2011
root : CMD ( echo "Gutjob"; exit 0 ) : PID ( 364788 ) : Thu Sep 29 14:36:00 2011
Pattern: ^(\w+)\s+:\s+CMD\s+\(\s+(.*)\s+\) : PID\s+\(\s+(\d+)\s+\)\s+:.*
Finish Line Pattern:
Cron Job with pid: 364788 Successful
Pattern: ^\s*Cron\s+Job\s+with\s+pid:\s+<$svar3>\s+Successful\s*$
Fail Line Pattern:
Cron Job with pid: 311444 Failed
Pattern: ^\s*Cron\s+Job\s+with\s+pid:\s+<$svar3>\s+Failed\s*$
2. Define a condition which filters all transaction fail messages (we are not
interested in success messages).
Predefined objects provide the possibility to filter the following messages:
FAIL_END for failed transactions (matches the fail pattern)
SUCCESS_END for successful transactions (matches the finish pattern)
FAIL_TIMEOUT for timed out transactions (timeout in sec)
As mentioned above it’s not possible to extract any variables from the fail and
finish patterns. If you want to extract information from these patterns you have
to extract it with the help of predefined variables in the "Match Variables"
section using simplified pattern matching.
The following predefined variables can be used:
<$SUCCESS_END_MSG> includes the finish line
<$FAIL_END_MSG> includes the fail line
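The interplay of the three patterns in Example 1 can be checked outside the GUI with a short Java sketch. It uses the patterns and log lines from the example; how the agent actually substitutes <$svar3> is not documented here, so the literal replacement via `Pattern.quote` and the names `extractPid`, `bind` and `classify` are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CronTransactionSketch {
    // Start, finish and fail patterns from Example 1.
    static final Pattern START = Pattern.compile(
            "^(\\w+)\\s+:\\s+CMD\\s+\\(\\s+(.*)\\s+\\) : PID\\s+\\(\\s+(\\d+)\\s+\\)\\s+:.*");
    static final String FINISH_TEMPLATE =
            "^\\s*Cron\\s+Job\\s+with\\s+pid:\\s+<$svar3>\\s+Successful\\s*$";
    static final String FAIL_TEMPLATE =
            "^\\s*Cron\\s+Job\\s+with\\s+pid:\\s+<$svar3>\\s+Failed\\s*$";

    // Returns the PID captured by group 3 of the start pattern, or null if the line is no start line.
    static String extractPid(String startLine) {
        Matcher m = START.matcher(startLine);
        return m.matches() ? m.group(3) : null;
    }

    // Assumption: the placeholder is replaced by the literal captured value before compiling.
    static Pattern bind(String template, String pid) {
        return Pattern.compile(template.replace("<$svar3>", Pattern.quote(pid)));
    }

    // Classifies a follow-up line for the given PID, using the names of the predefined objects.
    static String classify(String pid, String line) {
        if (bind(FINISH_TEMPLATE, pid).matcher(line).matches()) {
            return "SUCCESS_END";
        }
        if (bind(FAIL_TEMPLATE, pid).matcher(line).matches()) {
            return "FAIL_END";
        }
        return "NO_MATCH";
    }

    public static void main(String[] args) {
        String start = "root : CMD ( echo \"Failjob\"; exit 1 ) : PID ( 311444 ) : Thu Sep 29 14:36:00 2011";
        String pid = extractPid(start);
        System.out.println(pid + " -> " + classify(pid, "Cron Job with pid: 311444 Failed"));
        System.out.println(pid + " -> " + classify(pid, "Cron Job with pid: 364788 Successful"));
    }
}
```

Because the PID is bound into the finish and fail patterns, the "Successful" line of job 364788 is not attributed to the transaction started with PID 311444.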
Example 2: The LogFileTransactionMonitor Java Monitor has the following call format:
com.blixx.agent.monitors.LogFileTransactionMonitor
<path to the logfile>
<startPatternT1> - should match start of transaction logfile entry
<finishPatternT1> - should match SUCCESS end of transaction
<failPatternT1> - should match FAILED end of transaction
<timeoutT1> - in seconds
<startPatternT2> - should match start of transaction logfile entry
<finishPatternT2> - should match SUCCESS end of transaction
<failPatternT2> - should match FAILED end of transaction
<timeoutT2> - in seconds
...
Usually such logfiles contain many asynchronously started transactions. It is therefore recommended that the start pattern includes one or more capturing groups that extract a unique identifier of the transaction, and that the finish/fail patterns use the resulting variables (<$svarN>, extracted during the first match) to differentiate the transactions.
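The following Java sketch shows one way such transaction tracking could work: open transactions are kept by their identifier, closed by a matching finish or fail line, and reported as timed out otherwise. It is a simplified model under stated assumptions, not the LogFileTransactionMonitor code. The class, the `feed` method, the explicit timestamps, the point at which timeouts are checked, and the START/END sample patterns are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransactionTrackerSketch {
    private final Pattern start;
    private final String finishTemplate;
    private final String failTemplate;
    private final long timeoutSec;
    // Open transactions: identifier captured by group 1 -> start time in seconds.
    private final Map<String, Long> open = new LinkedHashMap<>();

    TransactionTrackerSketch(String start, String finishTemplate, String failTemplate, long timeoutSec) {
        this.start = Pattern.compile(start);
        this.finishTemplate = finishTemplate;
        this.failTemplate = failTemplate;
        this.timeoutSec = timeoutSec;
    }

    // Assumption: <$svar1> is replaced by the literal identifier before matching.
    private static boolean matches(String template, String id, String line) {
        return Pattern.matches(template.replace("<$svar1>", Pattern.quote(id)), line);
    }

    // Processes one line read at time nowSec and returns the resulting events.
    List<String> feed(String line, long nowSec) {
        List<String> events = new ArrayList<>();
        // 1. Report transactions that have been open longer than the timeout.
        for (Iterator<Map.Entry<String, Long>> it = open.entrySet().iterator(); it.hasNext();) {
            Map.Entry<String, Long> e = it.next();
            if (nowSec - e.getValue() > timeoutSec) {
                events.add("FAIL_TIMEOUT:" + e.getKey());
                it.remove();
            }
        }
        // 2. Try to close an open transaction with this line.
        for (Iterator<String> it = open.keySet().iterator(); it.hasNext();) {
            String id = it.next();
            if (matches(finishTemplate, id, line)) {
                events.add("SUCCESS_END:" + id);
                it.remove();
                return events;
            }
            if (matches(failTemplate, id, line)) {
                events.add("FAIL_END:" + id);
                it.remove();
                return events;
            }
        }
        // 3. Otherwise check whether the line starts a new transaction.
        Matcher m = start.matcher(line);
        if (m.matches()) {
            open.put(m.group(1), nowSec);
        }
        return events;
    }

    public static void main(String[] args) {
        // Hypothetical patterns and lines; timeout of 60 seconds.
        TransactionTrackerSketch t = new TransactionTrackerSketch(
                "START (\\w+)", "END <$svar1> OK", "END <$svar1> ERROR", 60);
        System.out.println(t.feed("START t1", 0));
        System.out.println(t.feed("END t1 OK", 5));
        System.out.println(t.feed("START t2", 10));
        System.out.println(t.feed("unrelated line", 100));
    }
}
```

Keying the open transactions by the captured identifier is what allows interleaved entries from several transactions to be told apart.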
Several products use transaction-like logging, where a missing ‘successful finish message’ is just as bad as a logged failure. For example the VMware ESX server logfile:
com.blixx.agent.monitors.LogFileTransactionMonitor
logfilePath: /var/log/vmware/hostd.log
startPattern: .*ha-eventmgr.*Event \d+ : (\S+) on host.*in ha-datacenter is starting.*
finishPattern: .*ha-eventmgr.*Event \d+ : <$svar1>.* in ha-datacenter is powered on.*
failPattern: .*ha-eventmgr.*Event \d+ : Failed to power on <$svar1>.* in ha-datacenter.*
timeout: 60
Example content: /var/log/vmware/hostd.log
Logfile Transaction Monitor (multiline)
Another modification of the Logfile Monitor can be used for monitoring transaction-like log entries. This Hybrid Java Monitor detects unclosed or failed transactions by tracking logfile entries and their expected follow-up entries.
There are two types of Logfile Transaction Monitor. The "Multiline" type lets you track multiline logfile entries; the delimiter is defined in the "Split Records" field using the pattern notation.
Why use the Logfile Transaction Monitor (multiline): More than one transaction is writing to the same monitored file.
The Logfile Transaction Monitor (multiline) has the following details:
Monitor Type: Logfile Transaction Monitor / (MP)Logfile Transaction Monitor
Interval: The interval specifies how often the logfile is checked, in minutes.
Monitor Name: com.blixx.agent.monitors.LogFileTransactionMonitor
Path/File mask: The path of the monitored file; the filename part may contain a mask.
Predefined objects provide the possibility to filter the following transactions:
FAIL_END for failed transactions (matches the fail pattern)
The following predefined variables can be used:
<$SUCCESS_END_MSG> includes the finish line
<$FAIL_END_MSG> includes the fail line
Note: The monitored file will always be scanned from the last processed position.
Example:
Monitoring request:
All log entries for failed cron jobs should be identified/filtered from /var/adm/cron/log.
Initial position:
- A good-job is executed every minute and succeeds
- A failed-job is executed every minute and fails
Analysis:
- More than one job is writing to the same log file. The entries of the different jobs
are mixed up hence the normal Logfile monitor doesn’t fit.
- The START/Fail/SUCCESS entry consists of two lines.
- The START, SUCCESS and FAILED line can be identified via patterns
- Solution: use of the "Logfile Transaction Monitor (multiline)"
Extract of the cron log file:
> CMD: echo "Failjob";exit 1
> root 2202 c Thu Sep 29 10:05:00 2011
> CMD: echo "Gutjob";exit 0
> root 2203 c Thu Sep 29 10:05:00 2011
< root 2202 c Thu Sep 29 10:05:00 2011 rc=1
! could not obtain latest contract from popen(3C): No such process Thu Sep 29 10:05:00 2011
< root 2203 c Thu Sep 29 10:05:00 2011
! could not obtain latest contract from popen(3C): No such process Thu Sep 29 10:05:00 2011
The log file contains the following transactions:
Start jobs:
> CMD: echo "Failjob";exit 1
> root 2202 c Thu Sep 29 10:05:00 2011
> CMD: echo "Gutjob";exit 0
> root 2203 c Thu Sep 29 10:05:00 2011
Failed job:
< root 2202 c Thu Sep 29 10:05:00 2011 rc=1
Success finish job:
< root 2203 c Thu Sep 29 10:05:00 2011
Setup of the "Logfile Transaction Monitor":
1. Load the cron log file entries into the pattern validation dialog.
Define the Start Pattern / Finish Pattern / Fail Pattern for the corresponding log
entries.
Add the patterns to the corresponding fields in the hybrid indication policy.
In the example the job PID and the user define the dependency between the Start Line and
the Fail/Finish Line and hence identify a single transaction.
In the Finish and Fail patterns, use the variables extracted by the start pattern.
In our example two variables in the start pattern need to be extracted
(see OPM Monitor) and used as <$svar3> and <$svar2> in the finish and fail pattern
(PID = <$svar3>, User = <$svar2>). The PID and user have to be the same, otherwise the
lines don't belong together.
Note: It is not possible to extract variables from the fail and finish pattern.
Start Line Pattern:
> CMD: echo "Failjob";exit 1
> root 2202 c Thu Sep 29 10:05:00 2011
Pattern: >\s+CMD:\s+(.*)\n>\s+(\w+)\s+(\d+)\s+.*
Fail Line Pattern:
< root 2202 c Thu Sep 29 10:05:00 2011 rc=1
Pattern: <\s+<$svar2>\s+<$svar3>\s+[^\n]+\s+rc=\d+\s*\n*.*
Finish Line Pattern:
< root 2203 c Thu Sep 29 10:05:00 2011
Pattern: <\s+<$svar2>\s+<$svar3>\s+[^\n]+\s+\d+\s*(|\n.*)
2. Define "Split Records" Java Patterns for the Start/Fail and Success information.
Split Records: (>\s+CMD:|<\s+).*
The pattern ">\s+CMD:.*" matches the Start Transaction which matches the
following lines:
> CMD: echo "Failjob";exit 1
> root 2202 c Thu Sep 29 10:05:00 2011
The pattern "<\s+.*" matches the Fail and Success Transaction which
matches the following lines:
< root 2202 c Thu Sep 29 10:05:00 2011 rc=1
< root 2203 c Thu Sep 29 10:05:00 2011
Use the Java Pattern Validation Dialog to test the "Split Records" pattern definitions.
Use the "Re-split" button to test the recognition of multiline records.
Note: Based on the "Split Records" definitions the monitor matches as many
lines as possible. This has to be taken into account when defining the Fail
and Success patterns.
3. Define a condition which filters all transaction fail messages (we are not
interested in success messages).
Predefined objects provide the possibility to filter the following messages:
FAIL_END for failed transactions (matches the fail pattern)
SUCCESS_END for successful transactions (matches the finish pattern)
FAIL_TIMEOUT for timed out transactions (timeout in sec)
As mentioned above it’s not possible to extract any variables from the fail and
finish patterns. If you want to extract information from these patterns you have
to extract it with the help of predefined variables in the "Match Variables"
section using simplified pattern matching.
The following predefined variables can be used:
<$SUCCESS_END_MSG> includes the finish line
<$FAIL_END_MSG> includes the fail line
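The multiline example can be replayed with a small Java sketch that first splits the cron log extract into records with the "Split Records" pattern and then applies the start, fail and finish patterns to whole records. The patterns and log lines are the ones from the example; the splitting rule (a matching line opens a record, other lines are appended), the literal substitution of <$svar2> and <$svar3>, and all class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineRecordSketch {
    static final Pattern SPLIT = Pattern.compile("(>\\s+CMD:|<\\s+).*");
    static final Pattern START = Pattern.compile(">\\s+CMD:\\s+(.*)\\n>\\s+(\\w+)\\s+(\\d+)\\s+.*");
    static final String FAIL_TEMPLATE =
            "<\\s+<$svar2>\\s+<$svar3>\\s+[^\\n]+\\s+rc=\\d+\\s*\\n*.*";
    static final String FINISH_TEMPLATE =
            "<\\s+<$svar2>\\s+<$svar3>\\s+[^\\n]+\\s+\\d+\\s*(|\\n.*)";

    // Assumption: a line matching SPLIT opens a new record and every following
    // non-matching line is appended to it, joined with '\n'.
    static List<String> splitRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (SPLIT.matcher(line).matches()) {
                if (current != null) {
                    records.add(current.toString());
                }
                current = new StringBuilder(line);
            } else if (current != null) {
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            records.add(current.toString());
        }
        return records;
    }

    // Replaces both placeholders with the literal values captured by the start pattern.
    static Pattern bind(String template, String user, String pid) {
        return Pattern.compile(template
                .replace("<$svar2>", Pattern.quote(user))
                .replace("<$svar3>", Pattern.quote(pid)));
    }

    // Classifies an end record relative to the transaction opened by startRecord.
    static String classify(String startRecord, String endRecord) {
        Matcher m = START.matcher(startRecord);
        if (!m.matches()) {
            return "NO_MATCH";
        }
        String user = m.group(2);
        String pid = m.group(3);
        if (bind(FAIL_TEMPLATE, user, pid).matcher(endRecord).matches()) {
            return "FAIL_END";
        }
        if (bind(FINISH_TEMPLATE, user, pid).matcher(endRecord).matches()) {
            return "SUCCESS_END";
        }
        return "NO_MATCH";
    }

    public static void main(String[] args) {
        List<String> records = splitRecords(List.of(
                "> CMD: echo \"Failjob\";exit 1",
                "> root 2202 c Thu Sep 29 10:05:00 2011",
                "< root 2202 c Thu Sep 29 10:05:00 2011 rc=1",
                "! could not obtain latest contract from popen(3C): No such process Thu Sep 29 10:05:00 2011"));
        System.out.println(records.size() + " records, end record is " + classify(records.get(0), records.get(1)));
    }
}
```

In this model the "! could not obtain latest contract ..." line becomes part of the preceding end record, which is why the fail and finish patterns end with an optional newline and trailing text.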
MultiPath (MP)Logfile Monitor
The Multipath Logfile Monitors are extensions of the Logfile Monitors available in boom. The main difference is that the file/path mask is extended to support not only file masks but also path masks. It is recommended to use the MP logfile family instead of the normal monitors, which are kept for backwards compatibility.
The path mask syntax is specialized for finding rotating log files and consists of the following rules:
• The '*' (star) symbol matches any char sequence within a path element name. Among all matching files, the file with the latest modification time is taken. This is useful for monitoring rotating logfiles: if logging has switched to another file, the monitor will also pick the newest file (after finishing the previous one).
• Several '*' (star) symbols in different path elements (i.e. on different file tree levels) still comprise the same single group for selecting the newest file.
• The '**' (double star) symbol matches any char sequence within a path element name. All matching files are taken as separate monitoring targets, regardless of modification time.
• A path element consisting of exactly a double star (like '/**/') specifies a recursive scan and matches arbitrary path depth, including zero.
• A double star overrides a single star in the same path element, that is, all matching files will be taken.
• A path starting with a wildcard is considered relative to the current directory, unless it is followed by a colon on MS Windows. This is reserved as a special notation for selecting all disk drives on MS Windows: *: (is the same as **:).
• All slashes and backslashes are treated as file (path element) separators; interim double slashes are reduced to a single slash. On MS Windows, network paths starting with a double (back-)slash are also recognized.
• Several independent masks can be specified, separated by the ‘|’ (pipe) symbol.
• The path mask can be replaced with an exec call: “<$LOGFILES(command)>”. In this case the MP LFM expects back a list of log file paths (one or more) separated by the ‘|’ (pipe) or new line ‘\n’ character.
For more information and examples please see Indication policy for Multipath Logfile Monitors.
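The difference between the single-star and double-star rules can be illustrated with a small Java sketch. It is a deliberately reduced model, not the MP monitor's mask parser: it only handles wildcards within one path element and pipe-separated masks, it works on an in-memory map of paths and modification times instead of the file system, and it ignores recursive '/**/' scans, backslashes, Windows drives and <$LOGFILES(...)>. The class name, method names and sample paths are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class PathMaskSketch {
    // Converts a mask into a regex: literal text is quoted and every run of '*'
    // matches any characters except the path separator '/'.
    static Pattern toRegex(String mask) {
        StringBuilder sb = new StringBuilder();
        String[] literals = mask.split("\\*+", -1);
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                sb.append("[^/]*");
            }
            sb.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(sb.toString());
    }

    // candidates: path -> last modification time. Returns the selected monitoring targets.
    // A mask with only single stars selects the newest matching file,
    // a mask containing a double star selects every matching file.
    static List<String> select(String masks, Map<String, Long> candidates) {
        List<String> targets = new ArrayList<>();
        for (String mask : masks.split("\\|")) {
            Pattern p = toRegex(mask);
            boolean takeAll = mask.contains("**");
            String newest = null;
            for (Map.Entry<String, Long> e : candidates.entrySet()) {
                if (!p.matcher(e.getKey()).matches()) {
                    continue;
                }
                if (takeAll) {
                    targets.add(e.getKey());
                } else if (newest == null || e.getValue() > candidates.get(newest)) {
                    newest = e.getKey();
                }
            }
            if (newest != null) {
                targets.add(newest);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Hypothetical rotated logfiles with their modification times.
        Map<String, Long> files = new TreeMap<>();
        files.put("/var/log/app.log.1", 100L);
        files.put("/var/log/app.log.2", 200L);
        System.out.println(select("/var/log/app.log.*", files));
        System.out.println(select("/var/log/app.log.**", files));
    }
}
```

Selecting only the newest match for a single star mirrors the rotating-logfile use case, while the double star turns every match into its own target.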