Logfile Monitors

  • Logfile Monitoring
  • Logfile Transaction Monitor
  • Logfile Transaction Monitor (multiline)
  • Multipath (MP) Logfile Monitor

    The following Hybrid Indication Policy Monitor Types are available:

    JAVA
    NAGIN
    Logfile Monitor
    Logfile Transaction Monitor
    Logfile Transaction Monitor (multiline)
    (MP) Logfile Monitor
    (MP) Logfile Transaction Monitor
    (MP) Logfile Transaction Monitor (multiline)

    Note: It is recommended to use the Multipath (MP) Logfile family instead of the "normal" monitors. The "normal" monitors are kept for backwards compatibility.


    Logfile Monitor

    Logfile monitoring is one of the most frequently used features of a monitoring solution. It is implemented as a Hybrid Policy with a Java monitor. The LogFileMonitor lets you specify the format of the monitored logfile and a pre-processing filter. The monitor handles truncation of the monitored file automatically and supports file masks for finding rotating log files.

    In the Nagin Trigger/LogFileMonitor Details section, select the Java type and define the polling interval and the call parameters.

    Call:

    	  	Line1: com.blixx.agent.monitors.LogFileMonitor
    		Line2: "path to the logfile"
    		Line3: "pattern matching start of message blocks"
    		Line4: "general pattern matching entries"
    		Line5 (optional): "one of: FROM_START | FROM_LAST"
    		Line6 (optional): "executable call before logfile processing"
    	  
    Line 1: Contains the name of the Java Monitor, which is bundled with default installations as the BoomJavaMonitor package.
    Line 2: The path of the logfile that should be monitored.
    Line 3: Many log files contain multi-line entries, so this line must contain a Java Pattern that matches the start of a message. Usually this is a fixed prefix containing the date, severity, log level, etc.
    Line 4: A Java Pattern used as a pre-processing filter. The filter reduces the number of messages that the Agent must process. The pattern ".*" matches all lines, which means that all entries are processed.
    Line 5 (optional): FROM_START indicates that the logfile is monitored from the beginning at every scheduled interval.
    FROM_LAST sets the default processing mode: the logfile is scanned from the last processed position.
    Line 6 (optional): Specifies a trigger that is executed before the logfile is parsed.
    This can be combined with the FROM_START parameter in line 5 to process logfiles that the trigger regenerates at every polling interval.

    Example of the Apache ErrorLogfile Policy:
    	  	com.blixx.agent.monitors.LogFileMonitor
    		/var/log/apache2/error_log
    		\[\w{3}\s+\w{3}\s\d{2}\s\d{2}:\d{2}:\d{2}\s\d{4}\].*
    		.*
    	  
    The most difficult part of the configuration is Line 3. You can adjust the pattern with the Pattern Validation Dialog, which is integrated in the boom GUI (see screenshot below). In practice it is best to fetch an existing logfile from the managed node and load it into the dialog. If a logfile has no multi-line entries, a simple ".*" (match all) pattern can be used.


    If you put parentheses "()" around parts of the pattern, the result area shows which data each part of the pattern matches.
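
    The effect of the Line 3 pattern can also be checked outside the GUI. The following is a minimal sketch, not the monitor's implementation: it assumes that the pattern has to match the whole line, the class and method names are hypothetical, the parentheses around the date are added only to show what a capturing group extracts, and both sample log lines are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: shows which lines the Line 3 pattern treats as the start of a message.
class ApacheStartPatternCheck {
    // Pattern from the Apache error log example; the capturing group around
    // the date is an addition made for this illustration.
    static final Pattern START = Pattern.compile(
            "\\[(\\w{3}\\s+\\w{3}\\s\\d{2}\\s\\d{2}:\\d{2}:\\d{2}\\s\\d{4})\\].*");

    // Returns the captured date if the line starts a new message, otherwise null.
    static String startOfMessage(String line) {
        Matcher m = START.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Both lines are invented sample data.
        String first = "[Thu Sep 29 14:36:00 2011] [error] File does not exist: /srv/www/htdocs/favicon.ico";
        String continuation = "    additional detail that belongs to the entry above";
        System.out.println(startOfMessage(first));        // the date text inside the brackets
        System.out.println(startOfMessage(continuation)); // null: not the start of a message
    }
}
```

    A line for which the method returns null would be treated as a continuation of the previous entry under the assumption stated above.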






    Logfile Transaction Monitor

    A further variant of the Logfile Monitor can be used for monitoring transaction-like log entries. This Hybrid Java Monitor detects transactions that were not closed or that failed by tracking logfile entries and their expected follow-up entries.


    When to use the Logfile Transaction Monitor:

    More than one transaction is writing to the same monitored file.
    The entries of the different transactions are mixed up.
    The START, SUCCESS and FAILED lines can be identified individually via patterns.
    The log entries are single-line entries.


    There are two types of Logfile Transaction Monitors:
  • LogFile Transaction Monitor
  • LogFile Transaction Monitor (Multiline)

    "Multiline" gives you the possibility to track multiline logfile entries. The delimiter is defined in the "Split Records" field using the pattern notation.



    The Logfile Transaction Monitor has the following details:

    Monitor Type: Logfile Transaction Monitor / (MP) Logfile Transaction Monitor
    Interval: Specifies how often the logfile is checked, in minutes.
    Monitor Name: com.blixx.agent.monitors.LogFileTransactionMonitor
    Path/File mask: The path of the monitored file, or a mask for the file name.
    Start Pattern: Should match the start line of the transaction logfile entry.
    Finish Pattern: Should match the SUCCESS line that ends a successful transaction.
    Fail Pattern: Should match the FAIL line that ends a failed transaction.
    Timeout: The transaction timeout in seconds.

    Predefined objects provide the possibility to filter the following transactions:

    FAIL_END        for failed transactions (matches the fail pattern)
    SUCCESS_END     for successful transactions (matches the finish pattern)
    FAIL_TIMEOUT    for timed out transactions (timeout in sec)

    The following predefined variables can be used:
          <$SUCCESS_END_MSG>	includes the finish line
          <$FAIL_END_MSG>		includes the fail line
          

    Note: The monitored file will always be scanned from the last processed position.


    Example 1:
    	Monitoring request:
    	All log entries for failed cron jobs should be identified/filtered from /var/adm/cron/log.
    
    
    	Initial position:
    	-	A good-job is executed every minute and succeeds
    	-	A  failed-job is executed every minute and fails
    
    
    	Analysis:
    	-	More than one job  is writing to the same log file. The entries of the different jobs 
    		are mixed up hence the normal Logfile monitor doesn’t fit.
    	-	Every log entry consists of one single line.
    	-	The START, SUCCESS and FAILED line can be identified via patterns
    	-	Solution: use of the "Logfile Transaction Monitor"
    
    
    	Extract of the cron log file:
    
    	root	: CMD ( echo "Failjob"; exit 1 ) : PID ( 311444 ) : Thu Sep 29 14:36:00 2011
    	root   : CMD ( echo "Gutjob"; exit 0 ) : PID ( 364788 ) : Thu Sep 29 14:36:00 2011
    	Cron Job with pid: 364788 Successful
    	Cron Job with pid: 311444 Failed
    
    
    	The log file contains the following transactions:
    
    	Start jobs:  
    	root      : CMD ( echo "Failjob"; exit 1 ) : PID ( 311444 ) : Thu Sep 29 14:36:00 2011
    	root      : CMD ( echo "Gutjob"; exit 0 ) : PID ( 364788 ) : Thu Sep 29 14:36:00 2011
    
    	Failed job:
    	Cron Job with pid: 311444 Failed
    
    	Successfully finished job:
    	Cron Job with pid: 364788 Successful
    
    
    	Setup of the "Logfile Transaction Monitor":
    
    	1. Load the cron log file entries (Start/Fail/Success) into the Pattern Validation Dialog.
    	
    	   Define the Start Pattern / Finish Pattern / Fail Pattern for the corresponding log
    	   entries.
    	   
    	   Add the patterns to the corresponding fields in the hybrid indication policy.
    	   
    	   In the example the job PID links the Start Line to its Fail/Finish Line and 
    	   hence identifies a single transaction.
    
    	   In the Finish and Fail patterns, use the variables from the start pattern.
    	   In our example a single variable needs to be extracted in the start pattern 
    	   (see OPM Monitor) and used as <$svar3> in the finish and fail patterns 
    	   (PID = <$svar3>). The PID has to be the same, otherwise the lines don't belong 
    	   together.
    	   
    	   Note: It is not possible to extract variables from the fail and finish patterns. 
    
    	   Start Line Pattern:
    	   root      : CMD ( echo "Failjob"; exit 1 ) : PID ( 311444 ) : Thu Sep 29 14:36:00 2011
    	   root      : CMD ( echo "Gutjob"; exit 0 ) : PID ( 364788 ) : Thu Sep 29 14:36:00 2011
    	   Pattern: ^(\w+)\s+:\s+CMD\s+\(\s+(.*)\s+\) : PID\s+\(\s+(\d+)\s+\)\s+:.*
    
    	   Finish Line Pattern:
    	   Cron Job with pid: 364788 Successful
    	   Pattern: ^\s*Cron\s+Job\s+with\s+pid:\s+<$svar3>\s+Successful\s*$
    
    	   Fail Line Pattern:
    	   Cron Job with pid: 311444 Failed
    	   Pattern: ^\s*Cron\s+Job\s+with\s+pid:\s+<$svar3>\s+Failed\s*$
    	

    	
    	2.  Define a condition which filters all transaction fail messages (we are not
    	    interested in success messages).
    
    	   Predefined objects provide the possibility to filter the following messages:
    
    		FAIL_END 	for failed transactions (matches the fail pattern)
    		SUCCESS_END	for successful transactions (matches the finish pattern)
    		FAIL_TIMEOUT	for timed out transactions (timeout in sec)
    
    
    	   As mentioned above, it is not possible to extract any variables from the fail and 
    	   finish patterns. If you want to extract information from the lines they match, you 
    	   have to do so with the help of predefined variables in the "Match Variables" 
    	   section using simplified pattern matching.
     
    	   The following predefined variables can be used:
    
    		<$SUCCESS_END_MSG>	includes the finish line
    		<$FAIL_END_MSG>	includes the fail line
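
    The way the PID from the start line ties a finish or fail line to its transaction in the cron example can be sketched in Java. This is an illustration under assumptions, not the agent's code: the class and method names and the NO_MATCH label are hypothetical, and it is assumed that <$svar3> is replaced by the literal text captured by group 3 of the start pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: resolves <$svarN> placeholders from the start line and classifies an end line.
class CronTransactionPatterns {
    static final Pattern START = Pattern.compile(
            "^(\\w+)\\s+:\\s+CMD\\s+\\(\\s+(.*)\\s+\\) : PID\\s+\\(\\s+(\\d+)\\s+\\)\\s+:.*");
    static final String FINISH_TEMPLATE =
            "^\\s*Cron\\s+Job\\s+with\\s+pid:\\s+<$svar3>\\s+Successful\\s*$";
    static final String FAIL_TEMPLATE =
            "^\\s*Cron\\s+Job\\s+with\\s+pid:\\s+<$svar3>\\s+Failed\\s*$";

    // Replaces every <$svarN> with the quoted text of capturing group N (assumed behaviour).
    static Pattern resolve(String template, Matcher start) {
        String resolved = template;
        for (int i = 1; i <= start.groupCount(); i++) {
            resolved = resolved.replace("<$svar" + i + ">", Pattern.quote(start.group(i)));
        }
        return Pattern.compile(resolved);
    }

    // Returns SUCCESS_END or FAIL_END if endLine closes the transaction opened by
    // startLine; NO_MATCH is a label invented for this sketch.
    static String classify(String startLine, String endLine) {
        Matcher start = START.matcher(startLine);
        if (!start.matches()) return "NO_MATCH";
        if (resolve(FINISH_TEMPLATE, start).matcher(endLine).matches()) return "SUCCESS_END";
        if (resolve(FAIL_TEMPLATE, start).matcher(endLine).matches()) return "FAIL_END";
        return "NO_MATCH";
    }

    public static void main(String[] args) {
        String failJob = "root      : CMD ( echo \"Failjob\"; exit 1 ) : PID ( 311444 ) : Thu Sep 29 14:36:00 2011";
        // Same PID: the line closes the transaction as failed.
        System.out.println(classify(failJob, "Cron Job with pid: 311444 Failed"));
        // Different PID: the line belongs to another transaction.
        System.out.println(classify(failJob, "Cron Job with pid: 364788 Successful"));
    }
}
```

    Because the PID is part of the resolved patterns, interleaved entries of other jobs cannot close the wrong transaction.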
          
          




    Example 2:

    The LogFileTransactionMonitor Java Monitor has the following call format:

    
    com.blixx.agent.monitors.LogFileTransactionMonitor
    <path to the logfile>
    <startPatternT1>	- should match start of transaction logfile entry
    <finishPatternT1>	- should match SUCCESS end of transaction
    <failPatternT1>		- should match FAILED end of transaction
    <timeoutT1>		- in seconds
    <startPatternT2>	- should match start of transaction logfile entry
    <finishPatternT2>	- should match SUCCESS end of transaction
    <failPatternT2>		- should match FAILED end of transaction
    <timeoutT2>		- in seconds
    ...
          

    Such logfiles usually contain many asynchronously started transactions. It is therefore recommended that the start pattern includes one or more capturing groups that extract a unique identifier of the transaction, and that the finish/fail patterns use the corresponding variables (<$svarN>, extracted during the first match) to differentiate the transactions.

    Several products use transaction-like logging, where a missing "successful finish" message is just as bad as a logged failure. One example is the VMware ESX server logfile.
    The Call field in the Hybrid Java Policy can be defined as:

    		com.blixx.agent.monitors.LogFileTransactionMonitor
    logfilePath:	/var/log/vmware/hostd.log
    startPattern:	.*ha-eventmgr.*Event \d+ : (\S+) on host.*in ha-datacenter is starting.*
    finishPattern:	.*ha-eventmgr.*Event \d+ : <$svar1>.* in ha-datacenter is powered on.*
    failPattern:	.*ha-eventmgr.*Event \d+ : Failed to power on <$svar1>.* in ha-datacenter.*
    timeout:		60
    
    Example content of /var/log/vmware/hostd.log:
    

    [... 22:05:13.459 'ha-eventmgr' 54164400 info] Event 56 : vm33 on host ESX1Server in ha-datacenter is starting

    Start pattern matched with line above.
    Extracts <$svar1>=vm33 - enough to uniquely identify transaction.
    Modify finish/fail pattern to:
        .*ha-eventmgr.*Event \d+ : vm33.* in ha-datacenter is powered on.*
        .*ha-eventmgr.*Event \d+ : Failed to power on vm33.* in ha-datacenter.*
    Remember start transaction message (for sending later).
    Start monitoring end of transaction or timeout.

    [... 22:05:13.459 'vm:/vmfs/volumes/49c6/vm33/vm33.vmx' 54164400 info] State Transition \
        (VM_STATE_OFF -> VM_STATE_POWERING_ON)
    [... 22:05:13.555 'BaseLibs' 35318704 info] VMHSVMCbPower: Setting state of VM \
        /vm/#cba6c6929cc15181/ to powerOn with option soft
    [... 22:05:13.556 'BaseLibs' 35318704 info] VMHS: Exec()'ing /usr/lib/vmware/bin/vmkload_app, \
        /vmfs/volumes/49c6/vm33/vm33.vmx
    [... 22:05:13.920 'BaseLibs' 35318704 info] Established a connection. Killing intermediate child: 10472
    [... 22:05:13.924 'BaseLibs' 35318704 info] Mounting virtual machine paths on connection: \
        /db/connection/#c1/, /vmfs/volumes/49c6/vm33/vm33.vmx
    [... 22:05:13.947 'BaseLibs' 35318704 info] Mount VM completion for vm: /vmfs/volumes/49c6/vm33/vm33.vmx
    [... 22:05:13.948 'BaseLibs' 35318704 info] Mount VM Complete: /vmfs/volumes/49c6/vm33/vm33.vmx, \
        Return code: OK
    [... 22:05:29.762 'BaseLibs' 35318704 info] VMX status has been set for vm: /vmfs/volumes/49c6/vm33/vm33.vmx
    [... 22:05:29.762 'BaseLibs' 35318704 info] Disconnect check in progress: /vmfs/volumes/49c6/vm33/vm33.vmx
    [... 22:05:29.762 'BaseLibs' 35318704 info] Disconnect check in progress: /vmfs/volumes/49c6/vm33/vm33.vmx
    [... 22:05:29.784 'BaseLibs' 55413680 info] Connected to /vmfs/volumes/49c6/vm33/vm33.vmx:testAutomation-fd, \
        remote end sent pid: 101091
    [... 22:05:30.104 'BaseLibs' 21207984 info] DISKLIB-VMFS : "/vmfs/volumes/49c6/vm33/vm33-flat.vmdk" : \
        open successful (17) size = 19327352832, hd = 0. Type 3
    [... 22:05:30.123 'BaseLibs' 21207984 info] DISKLIB-VMFS : "/vmfs/volumes/49c6/vm33/vm33-flat.vmdk" : closed.
    [... 22:05:30.196 'ha-eventmgr' 21474224 info] Event 57 : vm33 on ESX1Server in ha-datacenter is powered on

    The line above matched with finishPattern. (SUCCESS)
    -> sends start transaction message to the boom server.
       OBJECT=SUCCESS_END
       SEVERITY=normal
       + optional variable: <$SUCCESS_END_MSG>="success transaction logfile line"

    OR IF ERROR:

    [... 22:05:30.196 'ha-eventmgr' 21474224 info] Event 57 : Failed to power on vm33 on ESX1Server in ha-datacenter: \
        A general system error occurred:

    The line above matched with failPattern. (FAIL)
    -> sends start transaction message to the boom server.
       OBJECT=FAIL_END
       SEVERITY=critical
       + optional variable: <$FAIL_END_MSG>="fail transaction logfile line"

    OR IF TIMEOUT:

    Sends start transaction message to the boom server.
       OBJECT=FAIL_TIMEOUT
       SEVERITY=critical
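
    The three possible outcomes of the walkthrough above (SUCCESS_END, FAIL_END, FAIL_TIMEOUT) can be reproduced with a small tracker. This is only a sketch of the described behaviour: the class, its methods and the explicit time argument are hypothetical, whole-line matching and literal substitution of <$svar1> are assumptions, and the real monitor additionally sends the remembered start message to the boom server.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: tracks open transactions and reports how each one ends.
class TransactionTrackerSketch {
    static final Pattern START = Pattern.compile(
            ".*ha-eventmgr.*Event \\d+ : (\\S+) on host.*in ha-datacenter is starting.*");
    static final String FINISH = ".*ha-eventmgr.*Event \\d+ : <$svar1>.* in ha-datacenter is powered on.*";
    static final String FAIL = ".*ha-eventmgr.*Event \\d+ : Failed to power on <$svar1>.* in ha-datacenter.*";
    static final long TIMEOUT_SECONDS = 60;

    // One open transaction: resolved end patterns plus the time the start line was seen.
    static final class Open {
        final Pattern finish;
        final Pattern fail;
        final long startedAt;

        Open(String id, long startedAt) {
            String quoted = Pattern.quote(id);
            this.finish = Pattern.compile(FINISH.replace("<$svar1>", quoted));
            this.fail = Pattern.compile(FAIL.replace("<$svar1>", quoted));
            this.startedAt = startedAt;
        }
    }

    private final List<Open> open = new ArrayList<>();

    // Feeds one logfile line seen at nowSeconds; returns the objects raised by this call.
    List<String> onLine(String line, long nowSeconds) {
        List<String> raised = expire(nowSeconds);
        Matcher start = START.matcher(line);
        if (start.matches()) {
            open.add(new Open(start.group(1), nowSeconds));
            return raised;
        }
        for (Iterator<Open> it = open.iterator(); it.hasNext(); ) {
            Open t = it.next();
            if (t.finish.matcher(line).matches()) { raised.add("SUCCESS_END"); it.remove(); break; }
            if (t.fail.matcher(line).matches()) { raised.add("FAIL_END"); it.remove(); break; }
        }
        return raised;
    }

    // Closes every transaction that has been open for longer than the timeout.
    List<String> expire(long nowSeconds) {
        List<String> raised = new ArrayList<>();
        for (Iterator<Open> it = open.iterator(); it.hasNext(); ) {
            if (nowSeconds - it.next().startedAt > TIMEOUT_SECONDS) {
                raised.add("FAIL_TIMEOUT");
                it.remove();
            }
        }
        return raised;
    }

    public static void main(String[] args) {
        TransactionTrackerSketch tracker = new TransactionTrackerSketch();
        String start = "[... 22:05:13.459 'ha-eventmgr' 54164400 info] Event 56 : vm33 on host ESX1Server in ha-datacenter is starting";
        String finish = "[... 22:05:30.196 'ha-eventmgr' 21474224 info] Event 57 : vm33 on ESX1Server in ha-datacenter is powered on";
        System.out.println(tracker.onLine(start, 0));   // transaction opened, nothing raised
        System.out.println(tracker.onLine(finish, 17)); // finish pattern matched within the timeout
        tracker.onLine(start, 100);                     // second transaction, never finished
        System.out.println(tracker.expire(200));        // no end line seen within 60 seconds
    }
}
```

    Passing the time in explicitly keeps the sketch deterministic; all the lines between the start and the finish entry in the sample log would simply raise nothing.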




    Logfile Transaction Monitor (multiline)

    A further variant of the Logfile Monitor can be used for monitoring transaction-like log entries. This Hybrid Java Monitor detects transactions that were not closed or that failed by tracking logfile entries and their expected follow-up entries.


    There are two types of Logfile Transaction Monitors:
  • LogFile Transaction Monitor
  • LogFile Transaction Monitor (Multiline)

    "Multiline" gives you the possibility to track multiline logfile entries. The delimiter is defined in the "Split Records" field using the pattern notation.

    When to use the Logfile Transaction Monitor (multiline):

    More than one transaction is writing to the same monitored file.
    The entries of the different transactions are mixed up.
    The START, SUCCESS and FAILED lines can be identified individually via patterns.
    The log entries are single-line or multiline entries.



    The Logfile Transaction Monitor (multiline) has the following details:

    Monitor Type: Logfile Transaction Monitor / (MP) Logfile Transaction Monitor
    Interval: Specifies how often the logfile is checked, in minutes.
    Monitor Name: com.blixx.agent.monitors.LogFileTransactionMonitor
    Path/File mask: The path of the monitored file, or a mask for the file name.
    Split Records: Enables multi-line processing, because many log files contain multiline entries. This field must contain a Java Pattern that matches the first line of the Start/Fail/Success message. Usually this is a fixed prefix containing the date, severity, log level, etc. If a logfile has no multiline entries, a simple ".*" (match all) pattern can be used.
    Start Pattern: Should match the start line of the transaction logfile entry.
    Finish Pattern: Should match the SUCCESS line that ends a successful transaction.
    Fail Pattern: Should match the FAIL line that ends a failed transaction.
    Timeout: The transaction timeout in seconds.

    Predefined objects provide the possibility to filter the following transactions:

    FAIL_END        for failed transactions (matches the fail pattern)
    SUCCESS_END     for successful transactions (matches the finish pattern)
    FAIL_TIMEOUT    for timed out transactions (timeout in sec)

    The following predefined variables can be used:
          <$SUCCESS_END_MSG>	includes the finish line
          <$FAIL_END_MSG>		includes the fail line
          

    Note: The monitored file will always be scanned from the last processed position.


    Example:
    	Monitoring request:
    	All log entries for failed cron jobs should be identified/filtered from /var/adm/cron/log.
    
    
    	Initial position:
    	-	A good-job is executed every minute and succeeds
    	-	A  failed-job is executed every minute and fails
    
    
    	Analysis:
    	-	More than one job  is writing to the same log file. The entries of the different jobs 
    		are mixed up hence the normal Logfile monitor doesn’t fit.
    	-	The START/Fail/SUCCESS entry consists of two lines.
    	-	The START, SUCCESS and FAILED line can be identified via patterns
    	-	Solution: use of the "Logfile Transaction Monitor (multiline)"
    
    
    	Extract of the cron log file:
    
    	> CMD: echo "Failjob";exit 1
    	> root 2202 c Thu Sep 29 10:05:00 2011
    	> CMD: echo "Gutjob";exit 0
    	> root 2203 c Thu Sep 29 10:05:00 2011
    	< root 2202 c Thu Sep 29 10:05:00 2011 rc=1 
    	! could not obtain latest contract from popen(3C): No such process Thu Sep 29 10:05:00 2011
    	< root 2203 c Thu Sep 29 10:05:00 2011
    	! could not obtain latest contract from popen(3C): No such process Thu Sep 29 10:05:00 2011
    	
    
    
    	The log file contains the following transactions:
    
    	Start jobs:  
    	> CMD: echo "Failjob";exit 1
    	> root 2202 c Thu Sep 29 10:05:00 2011
    	> CMD: echo "Gutjob";exit 0
    	> root 2203 c Thu Sep 29 10:05:00 2011
    
    	Failed job:
    	< root 2202 c Thu Sep 29 10:05:00 2011 rc=1
    	
    	Successfully finished job:
    	< root 2203 c Thu Sep 29 10:05:00 2011
    
    
    	Setup of the "Logfile Transaction Monitor (multiline)":
    
    	1. Load the cron log file entries into the Pattern Validation Dialog.
    	
    	   Define the Start Pattern / Finish Pattern / Fail Pattern for the corresponding log
    	   entries.
    	   
    	   Add the patterns to the corresponding fields in the hybrid indication policy.
    
    	   In the example the job PID and the user link the Start Line to its 
    	   Fail/Finish Line and hence identify a single transaction.
    
    	   In the Finish and Fail patterns, use the variables from the start pattern.
    	   In our example two variables need to be extracted in the start pattern 
    	   (see OPM Monitor) and used as <$svar3> and <$svar2> in the finish and fail patterns 
    	   (PID = <$svar3>, User = <$svar2>). The PID and user have to be the same, otherwise the 
    	   lines don't belong together.
    
    	   Note: It is not possible to extract variables from the fail and finish patterns. 
    
    	   Start Line Pattern:
    	   > CMD: echo "Failjob";exit 1
    	   > root 2202 c Thu Sep 29 10:05:00 2011
    	   Pattern: >\s+CMD:\s+(.*)\n>\s+(\w+)\s+(\d+)\s+.*
    
    	   Fail Line Pattern:
    	   < root 2202 c Thu Sep 29 10:05:00 2011 rc=1
    	   Pattern: <\s+<$svar2>\s+<$svar3>\s+[^\n]+\s+rc=\d+\s*\n*.*
    
    	   Finish Line Pattern:
    	   < root 2203 c Thu Sep 29 10:05:00 2011
    	   Pattern: <\s+<$svar2>\s+<$svar3>\s+[^\n]+\s+\d+\s*(|\n.*)
    	
    	
    	2. Define the "Split Records" Java Pattern for the Start/Fail and Success information.
    
    	   Split Records: (>\s+CMD:|<\s+).*
    
    	   The alternative ">\s+CMD:.*" matches the first line of a Start record, which 
    	   consists of the following lines:
    
    	   > CMD: echo "Failjob";exit 1
    	   > root 2202 c Thu Sep 29 10:05:00 2011
    
    	   The alternative "<\s+.*" matches the first line of a Fail or Success record, 
    	   such as the following lines:
    
    	   < root 2202 c Thu Sep 29 10:05:00 2011 rc=1
    	   < root 2203 c Thu Sep 29 10:05:00 2011
    	   
    	   Use the Java Pattern Validation Dialog to test the "Split Records" pattern definitions.
    	   Use the "Re-split" button to test the recognition of multiline records.
    	   
    	   Note: Based on the "Split Records" definition the monitor matches as many 
    		lines as possible. This has to be taken into account in the Fail
    		and Success pattern definitions. 
    		
    
    
    	3. Define a condition which filters all transaction fail messages (we are not
    	   interested in success messages).
    
    	   Predefined objects provide the possibility to filter the following messages:
    
    		FAIL_END 	for failed transactions (matches the fail pattern)
    		SUCCESS_END	for successful transactions (matches the finish pattern)
    		FAIL_TIMEOUT	for timed out transactions (timeout in sec)
    
    
    	   As mentioned above, it is not possible to extract any variables from the fail and 
    	   finish patterns. If you want to extract information from the lines they match, you 
    	   have to do so with the help of predefined variables in the "Match Variables" 
    	   section using simplified pattern matching.
     
    	   The following predefined variables can be used:
    
    		<$SUCCESS_END_MSG>	includes the finish line
    		<$FAIL_END_MSG>	includes the fail line
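
    How "Split Records" groups the cron log lines into multiline records, and why the fail pattern must tolerate trailing lines, can be sketched as follows. The grouping rule used here (a line matching the split pattern opens a new record, every other line is appended to the current record, lines before the first record are ignored) is an assumption about the monitor's behaviour, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: splits a logfile into records and matches multiline patterns against them.
class SplitRecordsSketch {
    static final Pattern SPLIT = Pattern.compile("(>\\s+CMD:|<\\s+).*");
    static final Pattern START = Pattern.compile(">\\s+CMD:\\s+(.*)\\n>\\s+(\\w+)\\s+(\\d+)\\s+.*");
    static final String FAIL = "<\\s+<$svar2>\\s+<$svar3>\\s+[^\\n]+\\s+rc=\\d+\\s*\\n*.*";

    // A line matching SPLIT opens a new record; other lines join the current record.
    static List<String> split(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (SPLIT.matcher(line).matches()) {
                if (current != null) records.add(current.toString());
                current = new StringBuilder(line);
            } else if (current != null) {
                current.append('\n').append(line);
            }
        }
        if (current != null) records.add(current.toString());
        return records;
    }

    // True if endRecord is the fail record of the transaction opened by startRecord.
    static boolean isFailureOf(String startRecord, String endRecord) {
        Matcher start = START.matcher(startRecord);
        if (!start.matches()) return false;
        String fail = FAIL
                .replace("<$svar2>", Pattern.quote(start.group(2)))   // user
                .replace("<$svar3>", Pattern.quote(start.group(3)));  // PID
        return Pattern.matches(fail, endRecord);
    }

    public static void main(String[] args) {
        List<String> records = split(Arrays.asList(
                "> CMD: echo \"Failjob\";exit 1",
                "> root 2202 c Thu Sep 29 10:05:00 2011",
                "> CMD: echo \"Gutjob\";exit 0",
                "> root 2203 c Thu Sep 29 10:05:00 2011",
                "< root 2202 c Thu Sep 29 10:05:00 2011 rc=1 ",
                "! could not obtain latest contract from popen(3C): No such process Thu Sep 29 10:05:00 2011",
                "< root 2203 c Thu Sep 29 10:05:00 2011",
                "! could not obtain latest contract from popen(3C): No such process Thu Sep 29 10:05:00 2011"));
        System.out.println(records.size());                                  // number of records found
        System.out.println(isFailureOf(records.get(0), records.get(2)));     // Failjob start vs. rc=1 record
        System.out.println(isFailureOf(records.get(1), records.get(2)));     // Gutjob start vs. rc=1 record
    }
}
```

    Under this grouping rule the "!" lines become part of the preceding "<" record, which is why the fail and finish patterns end with an optional newline part.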
          
          






    Multipath (MP) Logfile Monitor

    The Multipath Logfile Monitors are extensions of the Logfile Monitors in boom. The main difference is that the file/path mask supports not only file masks but also path masks. It is recommended to use the MP logfile family instead of the normal monitors, which are kept for backwards compatibility.


    The path mask syntax is specialized for finding rotating log files and consists of the following rules:

    • The '*' (star) symbol matches any character sequence within a path element name. Among all matching files, the file with the latest modification time is taken. This is useful for monitoring rotating logfiles: if logging has switched to another file, the monitor also picks up the newest file (after finishing the previous one).

    • Several '*' (star) symbols in different path elements (i.e. on different file tree levels) still form a single group for selecting the newest file.

    • The '**' (double star) symbol matches any character sequence within a path element name. All matching files are taken as separate monitoring targets, regardless of modification time.

    • A path element consisting of exactly a double star (like '/**/') specifies a recursive scan and matches arbitrary path depth, including zero.

    • A double star overrides a single star in the same path element, that is, all matching files are taken.

    • A path starting with a wildcard is considered relative to the current directory, unless the wildcard is followed by a colon on MS Windows. This is reserved as a special notation for selecting all disk drives on MS Windows: *: (the same as **:).

    • All slashes and backslashes are treated as file (path element) separators; interim double slashes are reduced to a single slash. On MS Windows, network paths starting with a double (back-)slash are also recognized.

    • Several independent masks can be specified, separated by the '|' (pipe) symbol.

    • The path mask can be replaced with an exec call: "<$LOGFILES(command)>". In this case the MP LFM expects back a list of log file paths (one or more) separated by the '|' (pipe) or newline '\n' character.
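
    The difference between the single star (newest matching file only) and the double star (all matching files) can be illustrated with a small sketch. It is deliberately limited to one path element, takes the candidate files and their modification times as input instead of scanning a directory, and uses hypothetical class and method names; it does not cover recursive '/**/' elements, the '|' separator or the Windows drive notation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Sketch: selects monitoring targets for a file-name mask within one directory.
class PathMaskSketch {
    // Both '*' and '**' match any character sequence; all other characters are literal.
    static Pattern toRegex(String mask) {
        String[] parts = mask.split("\\*+", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) regex.append(".*");
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString());
    }

    // files maps a file name to its modification time.
    static List<String> select(String mask, Map<String, Long> files) {
        Pattern regex = toRegex(mask);
        boolean takeAll = mask.contains("**"); // double star overrides single star
        List<String> all = new ArrayList<>();
        String newest = null;
        for (Map.Entry<String, Long> f : files.entrySet()) {
            if (!regex.matcher(f.getKey()).matches()) continue;
            all.add(f.getKey());
            if (newest == null || f.getValue() > files.get(newest)) newest = f.getKey();
        }
        if (takeAll) return all;
        List<String> result = new ArrayList<>();
        if (newest != null) result.add(newest);
        return result;
    }

    public static void main(String[] args) {
        // Invented file names and modification times.
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("app.1.log", 100L);
        files.put("app.2.log", 200L);
        files.put("other.log", 300L);
        System.out.println(select("app*.log", files));  // only the newest matching file
        System.out.println(select("app**.log", files)); // every matching file
    }
}
```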

    For more information and examples please see Indication policy for Multipath Logfile Monitors.