SNMP Monitor

From OpenNMS

Purpose

The SNMP monitor enables the OpenNMS poller to request the value of an arbitrary MIB object from an SNMP agent, compare the retrieved value against a flexible set of rules, and based on the evaluation declare a service to be up or down.

Configuration Overview

To use the SNMP monitor to check the values of particular SNMP OIDs, you first need to get the service onto the relevant interfaces. You can either configure a detector in the default foreign-source definition so that the service is discovered, or use a requisition to provision the service directly on one or more of a node's interfaces. You then configure the poller to monitor the service.
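
As a rough sketch of the requisition route, the fragment below provisions a service directly on one interface. The foreign-source name, node identifiers, IP address, and the NumUsers service name are placeholders chosen for illustration, not values from a real system.

```xml
<!-- Hypothetical requisition: names and addresses are placeholders -->
<model-import xmlns="http://xmlns.opennms.org/xsd/config/model-import"
              foreign-source="Servers">
  <node foreign-id="server01" node-label="server01">
    <interface ip-addr="192.0.2.10" snmp-primary="P" status="1">
      <!-- Assigns the service to this interface without relying on a detector -->
      <monitored-service service-name="NumUsers"/>
    </interface>
  </node>
</model-import>
```

The service name used here has to match the name of a service defined in poller-configuration.xml, otherwise the poller has nothing to run for it.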

Configuration Examples

Provisioning via Detectors

The default usage of Provisiond's SNMP detector in the default foreign-source definition looks like this:

Default-snmp-detector.png
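
In XML form, the stock entry should be roughly equivalent to the fragment below. The class name is quoted from memory of the shipped default foreign-source definition, so treat it as an assumption and check it against your own installation.

```xml
<!-- Assumed XML equivalent of the default SNMP detector; no parameters are set -->
<detector name="SNMP"
          class="org.opennms.netmgt.provision.detector.snmp.SnmpDetector"/>
```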

Custom Example: Logged In Users

Suppose you wanted to make sure at least one user was logged into a system at a time. You could use hrSystemNumUsers (.1.3.6.1.2.1.25.1.5.0) to test this:

Numusers-snmp-detector.png
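
A detector along these lines might look as follows in the foreign-source definition. This is a sketch: the detector class name is assumed to be the stock SNMP detector, and NumUsers is simply the service name used throughout this page.

```xml
<!-- Sketch: assigns the NumUsers service when the agent answers for hrSystemNumUsers -->
<detector name="NumUsers"
          class="org.opennms.netmgt.provision.detector.snmp.SnmpDetector">
  <parameter key="oid" value=".1.3.6.1.2.1.25.1.5.0"/>
</detector>
```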

This plugin tests whether the object exists and, if so, assigns the NumUsers service to the IP address. You can also test for a specific value (vbvalue). This is useful when an OID indicates whether a capability is active, such as ipForwarding, which indicates that a device is acting as a router:

Custom Example: IP Forwarding Enabled

Router-snmp-detector.png
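
Expressed as XML, a value-testing detector could be sketched like this. It assumes the stock SNMP detector class and uses the MIB-II ipForwarding object (.1.3.6.1.2.1.4.1.0), whose value 1 means forwarding; the detector name Router is illustrative.

```xml
<!-- Sketch: assigns the Router service only when ipForwarding equals 1 (forwarding) -->
<detector name="Router"
          class="org.opennms.netmgt.provision.detector.snmp.SnmpDetector">
  <parameter key="oid" value=".1.3.6.1.2.1.4.1.0"/>
  <parameter key="vbvalue" value="1"/>
</detector>
```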

Configuring Service Monitoring

Once the service has been provisioned, a monitor can be added to poller-configuration.xml. Again, the default service definition for this monitor looks like this (note that it's turned off by default, which is why the SNMP service shows as Not Monitored in the node details):

                <service name="SNMP" interval="300000" user-defined="false" status="off">
                        <parameter key="retry" value="2"/>
                        <parameter key="timeout" value="3000"/>
                        <parameter key="port" value="161"/>
                        <parameter key="oid" value=".1.3.6.1.2.1.1.2.0"/>
                </service>
...
        <monitor service="SNMP"         class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>

Do Not Forget The Monitor Definition!

Note well the presence of the <monitor> element near the bottom of the file. Each <service> must have a corresponding <monitor> so that the poller will know what poller monitor class to use to monitor that service.

Custom Example: Users Logged In

To have the NumUsers service check that at least one person is logged in:

                <service name="NumUsers" interval="300000" user-defined="false" status="on">
                        <parameter key="retry" value="2"/>
                        <parameter key="timeout" value="3000"/>
                        <parameter key="port" value="161"/>
                        <parameter key="oid" value=".1.3.6.1.2.1.25.1.5.0"/>
                        <parameter key="operator" value="&gt;="/>
                        <parameter key="operand" value="1"/>
                </service>

Note that there are two parameters that allow you to customize the comparison performed on the value returned from the SNMP agent: operator and operand. The operator can be one of:

  • Less than: "<" (you will need to use an entity &lt;)
  • Greater than: ">" (&gt;)
  • Less than or equals: "<=" (&lt;=)
  • Greater than or equals: ">=" (&gt;=)
  • Equals: "="
  • Does not equal: "!="
  • Matches regular expression: "~"

The last operator (~) is used when the OID returns a string.
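
As an illustration of the regular-expression operator, the sketch below checks that sysDescr (.1.3.6.1.2.1.1.1.0) mentions Linux. The service name Linux_SysDescr is hypothetical. Whether the pattern has to match the whole value or only part of it depends on the monitor's implementation, so the pattern is written with .* on both sides to behave the same either way.

```xml
<!-- Hypothetical service: Up when the system description contains "Linux" -->
<service name="Linux_SysDescr" interval="300000" user-defined="false" status="on">
  <parameter key="retry" value="2"/>
  <parameter key="timeout" value="3000"/>
  <parameter key="port" value="161"/>
  <parameter key="oid" value=".1.3.6.1.2.1.1.1.0"/>
  <parameter key="operator" value="~"/>
  <parameter key="operand" value=".*Linux.*"/>
</service>
```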

Be sure to add a corresponding monitor line at the bottom of the poller-configuration.xml file when adding new monitors:

        <monitor service="NumUsers"         class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>

Get Fuzzy: The walk parameter

If you don't know the exact object identifier for the MIB object whose value you wish to compare against your criteria, you can add the walk parameter to your service definition with a value of true. This setting causes the monitor to send a GETNEXT request instead of a directed GET request to the SNMP agent. For example, you could use this setting to check whether a device supports any objects in the ASTERISK-MIB:

<service name="Asterisk_MIB" interval="300000" user-defined="false" status="on">
 <parameter key="oid" value=".1.3.6.1.4.1.22736.1" />
 <parameter key="walk" value="true" />
</service>

Using the match-all parameter

If you want to verify that every row of a conceptual MIB table has a value in a given column that matches your criteria, you can use the walk parameter (described above) along with the match-all parameter set to a value of true. For example, you could use this setting to check that every IP route in a node's routing table is statically configured (ipRouteProto has a value of local(2)):

<service name="All_Static_Routes" interval="300000" user-defined="false" status="on">
 <parameter key="oid" value=".1.3.6.1.2.1.4.21.1.9" />
 <parameter key="operator" value="=" />
 <parameter key="operand" value="2" />
 <parameter key="walk" value="true" />
 <parameter key="match-all" value="true" />
</service>

The match-all parameter defaults to true. If you only want to check that at least one entry matches, set match-all to false; the first value that satisfies your operator and operand then sets the monitor to Up.
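
For comparison, a sketch of the "at least one row" case is shown below. It reuses the ipRouteProto column from the example above; the service name Any_Static_Route is made up for illustration.

```xml
<!-- Hypothetical service: Up as soon as one route has ipRouteProto = local(2) -->
<service name="Any_Static_Route" interval="300000" user-defined="false" status="on">
  <parameter key="oid" value=".1.3.6.1.2.1.4.21.1.9"/>
  <parameter key="operator" value="="/>
  <parameter key="operand" value="2"/>
  <parameter key="walk" value="true"/>
  <parameter key="match-all" value="false"/>
</service>
```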

Matching a Bounded Number of Rows

As a special case of checking the same column for many rows of a conceptual MIB table, you can set the value of the match-all parameter to count and add a minimum and/or maximum parameter to bound the number of rows that must meet the specified criteria. Extending the static routes example above, you could require that at least three and at most ten static routes exist in a node's IP routing table:

<service name="All_Static_Routes" interval="300000" user-defined="false" status="on">
 <parameter key="oid" value=".1.3.6.1.2.1.4.21.1.9" />
 <parameter key="operator" value="=" />
 <parameter key="operand" value="2" />
 <parameter key="walk" value="true" />
 <parameter key="match-all" value="count" />
 <parameter key="minimum" value="3" />
 <parameter key="maximum" value="10" />
</service>

Matching Multiple Rows

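When more than one value should count as "Up" across the matched rows (for example, "1=Ready" and "3=Online"), a regular expression can be used as the operand. Inside a character class a comma is a literal character, so the class lists only the accepted digits, and the anchors keep longer values such as 13 from matching:

<service name="All_Static_Routes" interval="300000" user-defined="false" status="on">
 <parameter key="oid" value=".1.3.6.1.2.1.4.21.1.9" />
 <parameter key="operator" value="~" />
 <parameter key="operand" value="^[13]$" />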
 <parameter key="walk" value="true" />
 <parameter key="match-all" value="true" />
</service>

Customizing Reason Codes

Note: this feature is available in stable releases 1.6.3 and later, development releases 1.7.1 and later

When the SNMP monitor detects that a service is down according to the configured criteria, it tries to create a human-readable reason code. The default reason codes are quite generic, as this monitor can be used to check the status of an endless variety of values. You can specify a template to be used when creating the reason code by adding a reason-template parameter to your service definition. This feature can make the reason codes for outages more immediately understandable in the context in which you're using the SNMP monitor. For our logged in users example, we might use a reason code template like this one:

<parameter key="reason-template" value="Users logged in must be ${operator} ${operand} but actual value was ${observedValue}" />

Assuming the example service definition above were in effect, but the monitor found that hrSystemNumUsers had a value of zero, the resulting reason code would read:

Users logged in must be >= 1 but actual value was 0
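
Put together, the NumUsers service definition with this template would look something like the following sketch, which only combines the parameters already shown on this page:

```xml
<!-- NumUsers service from the earlier example, with a custom reason template added -->
<service name="NumUsers" interval="300000" user-defined="false" status="on">
  <parameter key="retry" value="2"/>
  <parameter key="timeout" value="3000"/>
  <parameter key="port" value="161"/>
  <parameter key="oid" value=".1.3.6.1.2.1.25.1.5.0"/>
  <parameter key="operator" value="&gt;="/>
  <parameter key="operand" value="1"/>
  <parameter key="reason-template"
             value="Users logged in must be ${operator} ${operand} but actual value was ${observedValue}"/>
</service>
```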

The following tokens, if present in the form ${foo} in the reason code template, will be expanded to the values configured for the corresponding parameters in the service definition:

  • oid
  • operator
  • operand
  • walk
  • matchAll (the actual parameter key is match-all; the token name is its camel-case form)
  • minimum
  • maximum
  • timeout
  • retry (retries is a synonym)
  • port

The following additional tokens, if present, will be expanded to their runtime values:

  • ipaddr: The IP address of the interface on which the service is being polled.
  • observedValue: The actual value (as a string) returned by the underlying SNMP operation. Note that this token usually does not make sense to use if the "walk" param is set to "true" or if the "match-all" param is set to "count".
  • matchCount: The number of matching rows found. Set only if the "match-all" param is set to "count".
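
As a sketch of how the runtime tokens might be used with a count-based check, the definition below adds a reason template to the bounded static-routes example. The service name Static_Route_Count and the wording of the template are illustrative only.

```xml
<!-- Hypothetical service: count-based check with a reason template using matchCount -->
<service name="Static_Route_Count" interval="300000" user-defined="false" status="on">
  <parameter key="oid" value=".1.3.6.1.2.1.4.21.1.9"/>
  <parameter key="operator" value="="/>
  <parameter key="operand" value="2"/>
  <parameter key="walk" value="true"/>
  <parameter key="match-all" value="count"/>
  <parameter key="minimum" value="3"/>
  <parameter key="maximum" value="10"/>
  <parameter key="reason-template"
             value="Expected ${minimum} to ${maximum} static routes on ${ipaddr} but found ${matchCount}"/>
</service>
```

As with any new service, it needs a matching monitor line that points at org.opennms.netmgt.poller.monitors.SnmpMonitor.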