SuspendPollingService

To suspend or resume polling of an individual service on a node, send the suspendPollingService and resumePollingService events (UEIs) like this:

send-event.pl uei.opennms.org/internal/poller/suspendPollingService --node 3 --interface 169.254.1.1 --service HTTP
send-event.pl uei.opennms.org/internal/poller/resumePollingService --node 3 --interface 169.254.1.1 --service HTTP
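
When these calls are issued from another program, it can help to build the command line in one place. The following is a minimal Python 3 sketch, not part of OpenNMS: the function name polling_command and its parameters are hypothetical, and only the two UEIs and the --node, --interface and --service options are taken from the commands above.

```python
import subprocess

# UEI prefix shared by the two poller events shown above.
UEI_PREFIX = "uei.opennms.org/internal/poller/"


def polling_command(action, node_id, interface, service,
                    send_event="send-event.pl"):
    """Return the argv list that suspends or resumes polling of one service.

    `action` is 'suspend' or 'resume'. `send_event` is the path to
    send-event.pl (assumed to be on the PATH by default).
    """
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")
    uei = UEI_PREFIX + action + "PollingService"
    return [send_event, uei,
            "--node", str(node_id),
            "--interface", interface,
            "--service", service]


if __name__ == "__main__":
    cmd = polling_command("suspend", 3, "169.254.1.1", "HTTP")
    print(" ".join(cmd))
    # Uncomment on a host where send-event.pl is installed:
    # subprocess.check_call(cmd)
```

Passing the arguments as a list avoids shell quoting problems if a service name ever contains spaces.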


As a possible alternative, use a script to write to the /opt/opennms/etc/poll-outages.xml config file and then send an event that causes OpenNMS to re-read the file.

Use:

/opt/opennms/bin/send-event.pl uei.opennms.org/internal/schedOutagesChanged

Using outages you can only suspend polling of "all" services in a polling package (or turn off notifications, thresholding and data collection). If you need to suspend just one or a few services, you would need to create a separate polling package for each service.
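
For the scripted approach, the entry to add to poll-outages.xml could be generated as in the following Python 3 sketch. The element and attribute names (outage, time, interface, type="specific", and the dd-Mon-yyyy timestamp format) reflect the usual layout of that file but are assumptions here and should be checked against the poll-outages.xml shipped with your version. The function name build_outage and the outage name "reboot-window" are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Fixed month abbreviations so the output does not depend on the locale.
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def _stamp(dt):
    """Format a datetime as dd-Mon-yyyy HH:MM:SS (assumed outage format)."""
    return "%02d-%s-%04d %02d:%02d:%02d" % (
        dt.day, _MONTHS[dt.month - 1], dt.year,
        dt.hour, dt.minute, dt.second)


def build_outage(name, begins, ends, addresses):
    """Build one <outage> element covering the given interface addresses."""
    outage = ET.Element("outage", {"name": name, "type": "specific"})
    ET.SubElement(outage, "time",
                  {"begins": _stamp(begins), "ends": _stamp(ends)})
    for address in addresses:
        ET.SubElement(outage, "interface", {"address": address})
    return outage


if __name__ == "__main__":
    element = build_outage("reboot-window",
                           datetime(2013, 5, 29, 15, 0, 0),
                           datetime(2013, 5, 29, 15, 10, 0),
                           ["169.254.1.1"])
    print(ET.tostring(element, encoding="unicode"))
```

The sketch only builds the element. Merging it into the existing file and sending the schedOutagesChanged event shown above are left to the calling script.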

With the poller events, by contrast, you can suspend polling of just a single service by sending the suspendPollingService event. This works well for us.

send-event.pl uei.opennms.org/internal/poller/suspendPollingService --node 123 --interface 169.254.1.1 --service HTTP 

To start polling again, send the resumePollingService event:

send-event.pl uei.opennms.org/internal/poller/resumePollingService --node 123 --interface 169.254.1.1 --service HTTP 

The "Recent Events" view on the Node View in the web UI then shows:

Polling will be discontinued for HTTP service on interface 169.254.1.1.
Polling will begin/resume for HTTP service on interface 169.254.1.1.

I am not sure that this is the intended use of the UEIs, but it works great.

See also: Configure scheduling outages via RESTful Web Service, http://issues.opennms.org/browse/NMS-4232


wget hack to send suspendPollingService

root@opennms:/home/monitor# wget --tries=1 --read-timeout=1 -d -O - --header='Content-Type: text/xml' --header='Accept: application/xml' --header='Connection: close' --post-data='<log><events><event uuid="1234567890"><uei>uei.opennms.org/internal/poller/suspendPollingService</uei><source>wget_hack</source><nodeid>207</nodeid><time>Tuesday, May 29, 2013 15:57:24 PM GMT</time> <host>servername.domain.com</host><interface>192.168.1.1</interface><service>ICMP</service></event> </events></log>' http://localhost:5817
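
The same event can be assembled without hand-written XML. The Python 3 sketch below mirrors the element layout of the wget payload above (log, events, event, then uei, source, nodeid, time, host, interface, service). It assumes that eventd accepts this XML over a plain TCP connection to port 5817, the port used in the wget command. The function names build_event_xml and send_event, the default source "python_sketch", and the time string in the usage lines are hypothetical.

```python
import socket
import xml.etree.ElementTree as ET


def build_event_xml(uei, nodeid, interface, service, time,
                    source="python_sketch", host="localhost", uuid=None):
    """Return the <log><events><event> document as a string.

    `time` is passed through unchanged, so the caller decides its format.
    """
    log = ET.Element("log")
    events = ET.SubElement(log, "events")
    event = ET.SubElement(events, "event")
    if uuid is not None:
        event.set("uuid", str(uuid))
    # Same element order as in the wget payload.
    for tag, value in (("uei", uei), ("source", source),
                       ("nodeid", str(nodeid)), ("time", time),
                       ("host", host), ("interface", interface),
                       ("service", service)):
        ET.SubElement(event, tag).text = value
    return ET.tostring(log, encoding="unicode")


def send_event(xml_text, host="localhost", port=5817, timeout=5.0):
    """Write the XML to eventd over TCP (assumed to listen on `port`)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(xml_text.encode("utf-8"))


if __name__ == "__main__":
    xml_text = build_event_xml(
        "uei.opennms.org/internal/poller/suspendPollingService",
        207, "192.168.1.1", "ICMP",
        "Wednesday, May 29, 2013 3:57:24 PM GMT")
    print(xml_text)
    # Uncomment on the OpenNMS host itself:
    # send_event(xml_text)
```

Building the document with ElementTree escapes special characters in the values, which the single-quoted shell string in the wget version does not.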

Script to pause service polling while servers restart

Suspend and resume polling for the services of server servername01.domain.com from a Windows batch file:

plink -ssh -pw password monitor@opennms.domain.com python bin/SwitchMonitor.py -u batch -p password -m servername01.domain.com
echo Waiting for reboot...
sleep 240
plink -ssh -pw password monitor@opennms.domain.com python bin/SwitchMonitor.py -u batch -p password +m servername01.domain.com

/home/monitor/bin/SwitchMonitor.py

#!/usr/bin/python
'''
Switch OpenNMS monitoring of a node on or off

Using the ReST interface of OpenNMS, the node with the given label is looked
up together with the services defined on its primary interface. Polling of
each of those services is then suspended or resumed, depending on the
command given.

'''
import argparse
import io
import ConfigParser
import urllib2
import json

import SendEvent  # local helper module (not shown on this page) that builds and sends events

defaultConfig = """
[Monitor]
ReSTNode = http://opennms.domain.com:8980/

username = onmsUser
password = onmsPassword
"""
_mainsection = 'Monitor'
_cfg = ConfigParser.ConfigParser()
_cfg.readfp(io.BytesIO(defaultConfig))
# Optional local overrides of the defaults above; a missing file is ignored.
_cfg.read('sendServiceEvents.cfg')

_getNodeData=   "opennms/rest/nodes?label=%s"
_getIpIfs=      "opennms/rest/nodes/%s/ipinterfaces"
_getSvcList=    "opennms/rest/nodes/%s/ipinterfaces/%s/services"
_forceJson=     {'Accept': 'application/json'}

def _getCommandLineParser():
    clparse = argparse.ArgumentParser(
      description='Switch OpenNMS monitoring of a node on or off',
      epilog='Example: %(prog)s -m server01.domain.com',
      prefix_chars = '-+'
                                      )
    clparse.add_argument('-V', '--version', action='version', version='1.0-beta',
                         help='print version and exit successfully')
    clparse.add_argument('-u', '--user', nargs=1, metavar='text',
                         help='noc username')
    clparse.add_argument('-p', '--password', nargs=1, metavar='text',
                         help='noc password')
    clparse.add_argument('+m', action='store_true', default=None,
                         help='switch monitoring on')
    clparse.add_argument('-m', action='store_false', default=None,
                         help='switch monitoring off')
    clparse.add_argument('nodeName', metavar='nodeName',
                         help='the name of the monitored node')
    return clparse

def getNodeId(nodeName):
    req = urllib2.Request(ReSTNode + _getNodeData % nodeName, None, _forceJson)
    nodeData = json.load(opener.open(req))
    if int(nodeData['@count']) == 1:
        return nodeData['node']['@id']

def getPrimaryIp(nodeId):
    req = urllib2.Request(ReSTNode + _getIpIfs % nodeId, None, _forceJson)
    ifData = json.load(opener.open(req))
    ips = ifData['ipInterface']
    # A node with a single interface yields a bare object instead of a list.
    if not isinstance(ips, list):
        return ips['ipAddress']
    else:
        for ip in ips:
            if ip['@snmpPrimary'] == 'P':
                return ip['ipAddress']

def getServices(nodeId, address):
    req = urllib2.Request(ReSTNode + _getSvcList % (nodeId, address), None, _forceJson)
    lst = json.load(opener.open(req))
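    srv = lst['service']
    # Assumption: like 'ipInterface' in getPrimaryIp, a single service may
    # come back as a bare object instead of a list, so normalize first.
    if not isinstance(srv, list):
        srv = [srv]
    return [service['serviceType']['name'] for service in srv]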

def sendServiceEvents(event, nodeid, address, srvList):
    "For each service in the list, send the given event to given node"
    for srv in srvList:
        tags = argparse.Namespace()
        setattr(tags, 'UEI', event)
        setattr(tags, 'nodeid', [nodeid])
        setattr(tags, 'interface', [address])
        setattr(tags, 'service', [srv])
        uei = SendEvent.create(tags, None)
#        print uei
        SendEvent.send(uei)

if __name__ == '__main__':
    clparse = _getCommandLineParser()
    args = clparse.parse_args()

    if args.user:
        onms_user = args.user[0]
    else:
        onms_user = _cfg.get(_mainsection, 'username')
    if args.password:
        onms_pwd=args.password[0]
    else:
        onms_pwd = _cfg.get(_mainsection, 'password')
    ReSTNode = _cfg.get(_mainsection, 'ReSTNode')

    accessManager = urllib2.HTTPPasswordMgrWithDefaultRealm()
    accessManager.add_password(None, ReSTNode, onms_user, onms_pwd)
    opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(accessManager))

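    Id = getNodeId(args.nodeName)
    if Id is None:
        # getNodeId returns None unless exactly one node matches the label
        raise SystemExit('no unique node found for label %s' % args.nodeName)
    address = getPrimaryIp(Id)
    if address is None:
        raise SystemExit('no primary interface found for node %s' % Id)
    srvList = getServices(Id, address)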

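    # args.m is True for +m, False for -m, and None when neither is given;
    # None falls through to the suspend branch.
    if args.m: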
        event = 'internal/poller/resumePollingService'
    else:
        event = 'internal/poller/suspendPollingService'

    sendServiceEvents(event, Id, address, srvList)
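
The script has to cope with ReST responses in which a single item arrives as a bare object rather than a one-element list, as getPrimaryIp does. The Python 3 sketch below isolates that handling in pure functions so it can be tried without a running OpenNMS. The keys 'ipInterface', '@snmpPrimary', 'ipAddress', 'service' and 'serviceType' are taken from the script; the function names as_list, primary_address and service_names are hypothetical, and the sample data is made up for illustration.

```python
def as_list(value):
    """Return value as a list: [] for None, [value] for a bare object."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def primary_address(if_data):
    """Pick the primary IP address from an ipinterfaces response.

    A single interface is used as-is; otherwise the first interface whose
    '@snmpPrimary' is 'P' wins. Returns None when nothing matches.
    """
    interfaces = as_list(if_data.get("ipInterface"))
    if len(interfaces) == 1:
        return interfaces[0]["ipAddress"]
    for iface in interfaces:
        if iface.get("@snmpPrimary") == "P":
            return iface["ipAddress"]
    return None


def service_names(svc_data):
    """Return the service names from a services response."""
    return [svc["serviceType"]["name"]
            for svc in as_list(svc_data.get("service"))]


if __name__ == "__main__":
    # Made-up sample data in the two shapes the script expects.
    single = {"ipInterface": {"ipAddress": "169.254.1.1",
                              "@snmpPrimary": "N"}}
    several = {"ipInterface": [
        {"ipAddress": "169.254.1.1", "@snmpPrimary": "N"},
        {"ipAddress": "169.254.1.2", "@snmpPrimary": "P"},
    ]}
    print(primary_address(single))
    print(primary_address(several))
    print(service_names({"service": {"serviceType": {"name": "HTTP"}}}))
```

Keeping the response handling separate from the HTTP calls makes it possible to test the selection logic against captured responses.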

[Image: Events Polling will begin resume for services.jpg]