Discussion:
Configure alerts so that none are sent if Internet goes down
kiddbios
2012-01-17 17:41:28 UTC
kiddbios [http://community.zenoss.org/people/kiddbios] created the discussion

"Configure alerts so that none are sent if Internet goes down"

To view the discussion, visit: http://community.zenoss.org/message/63745#63745

--------------------------------------------------------------
Hello,

We monitor multiple remote sites with Zenoss Core. When the Internet connection at our monitoring site goes down and then comes back up, we receive a flood of alerts. This happens because Zenoss cannot reach the remote networks during the outage, so it generates alerts that queue up and are sent once connectivity is restored. Is there a way to have Zenoss suspend alerts when it detects that it has no Internet connection? This would be similar to how a down server produces one alert for the server rather than one for each of its services. It would also be helpful for remote sites: for instance, if Zenoss detected that the gateway for a remote site was unreachable, it would alert you about the gateway, but not about all of the servers behind it.

Any help is greatly appreciated.
--------------------------------------------------------------

Reply to this message by replying to this email -or- go to the discussion on Zenoss Community
[http://community.zenoss.org/message/63745#63745]

Start a new discussion in zenoss-users by email
[discussions-community-forums-zenoss--***@community.zenoss.org] -or- at Zenoss Community
[http://community.zenoss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2003]
jcurry
2012-01-17 20:22:59 UTC
jcurry [http://community.zenoss.org/people/jcurry] created the discussion

"Re: Configure alerts so that none are sent if Internet goes down"

To view the discussion, visit: http://community.zenoss.org/message/63757#63757

--------------------------------------------------------------
That's the way it is supposed to work!

Zenoss tries to build its own internal topology of your enterprise, using interface tables and routing tables gathered via SNMP. With that topology map, if a single point of failure goes down (like your Internet connection), you should get an alert for that element only. Strictly speaking, events are still generated for the stuff behind it, but their state is Suppressed, which isn't normally displayed. If you click the dropdown arrow beside Status in an event console, you can elect to see suppressed events.

Most alerts (emails, pages, ...) have a filter that includes Event State = New. This triggers for your single-point-of-failure event but not for events in the Suppressed state.
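
To make the idea concrete, here is a toy model of that behaviour in plain Python. It is only a sketch and not Zenoss code: the device names, the paths dictionary and both function names are made up for illustration, and the real topology lives inside Zenoss. It marks a down device as Suppressed when any hop between the monitoring server and that device is also down, then applies a filter equivalent to Event State = New.

```python
# Toy model only: NOT Zenoss code. Names and paths are hypothetical.
NEW, SUPPRESSED = "New", "Suppressed"


def classify_down_events(paths, down):
    """Return {device: event state} for every device in `down`.

    `paths` maps each device to the ordered list of hops that sit
    between the monitoring server and that device.
    """
    states = {}
    for device in down:
        upstream = paths.get(device, [])
        # Suppress when any hop on the way to the device is itself down.
        blocked = any(hop in down for hop in upstream)
        states[device] = SUPPRESSED if blocked else NEW
    return states


def alertable(states):
    """Mimic an alert filter of Event State = New."""
    return sorted(d for d, s in states.items() if s == NEW)


# Hypothetical remote site: two servers behind one gateway.
paths = {
    "site-gw": [],
    "web01": ["site-gw"],
    "db01": ["site-gw"],
}
down = set(["site-gw", "web01", "db01"])
states = classify_down_events(paths, down)
```

With the gateway and both servers unreachable, only the gateway keeps the New state, so only it would pass the alert filter. If the gateway is up and a single server is down, that server's event stays New and alerts as usual.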

Sometimes Zenoss doesn't get the internal topology quite right. It depends on getting accurate information from at least all the routing devices in your network. This may not be possible (you may have outsourced "black holes" in your network). If you think that you do have SNMP access to all the devices, you may need to do a spot of deleting and rediscovering to ensure that the internal map is correct. Zenoss has also got better at this over the years, so what version are you on?

I use a little utility called tracepath.py that checks out the internal topology. You need to change the line that sets "source" to the name that Zenoss knows the zenoss server by (zenoss.class.example.org in my case).

#!/usr/bin/env python
import Globals
import sys
import os

from Products.ZenUtils.ZenScriptBase import ZenScriptBase
from transaction import commit

dmd = ZenScriptBase(connect=True).dmd
find = dmd.Devices.findDevice

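# The destination device name is the only command-line argument
if len(sys.argv) < 2:
    print "Usage: %s <destination device>" % sys.argv[0]
    sys.exit(1)
arg = sys.argv[1]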

source = 'localhost'
zenhome = os.environ['ZENHOME']
zpc = open(os.path.join(zenhome, 'etc', 'zenping.conf'), 'r')
for line in zpc:
    line = line.rstrip()
    if line.startswith('name'):
        # Split on any run of whitespace so extra spaces do not break unpacking
        name, source = line.split(None, 1)

zpc.close()

# After all that, hardcode source to avoid hacking $ZENHOME/etc/zenping.conf
source = 'zenoss.class.example.org'
print "Getting path from %s to %s..." % (source, arg)

source = find(source)
if source is None:
    print "Invalid source address."
    print "Add your zenoss server name into %s/etc/zenping.conf" % (zenhome)
    print "eg: "
    print "name yourzenossserver"
    sys.exit(1)

destination = find(arg)
if destination is None:
    print "Invalid destination."
    sys.exit(1)

path = source.os.traceRoute(destination, [])
if len(path) > 1:
    print "zenoss -> " \
    + " -> ".join([ dmd.Networks.findIp(ip).device().id for ip in path[:-1] ]) \
    + " -> " + arg
else:
    print "zenoss -> " + arg


If you do have black-hole parts in the path through the network, there is a way to add pseudo interfaces to the devices that you DO manage and effectively bridge the black-hole part of your network.

Cheers,
Jane
--------------------------------------------------------------
