Discussion:
ValueChangeThreshold - warning
jcurry
2013-05-19 19:50:45 UTC
jcurry [http://community.zenoss.org/people/jcurry] created the discussion

To view the discussion, visit: http://community.zenoss.org/message/73268#73268

--------------------------------------------------------------
The ValueChangeThreshold is a recent development that arrived (I believe) with Zenoss 4.2. Rather than the administrator having to create a threshold that checks a MAX and/or a MIN value, this standard threshold type simply checks whether the value has "changed" since the last sampling interval.

It is deployed as standard on the ethernetCsmacd and ethernetCsmacd_64 component templates as the ifOperStatusChange threshold. It is configured to generate a /Status/Perf event of severity Info when the value changes.


The Zenoss-supplied /Status/Perf event class has a transform that checks that a component attribute exists and that the eventKey field matches the ifOperStatusChange threshold for the ifOperStatus_ifOperStatus datapoint. If these criteria are met then the *event action is set to drop*.
Next, the status value from the event is compared with the status for this interface in the Zope Object Database (ZODB). If the two are different then the ZODB value is set to match the status detected by the event.
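
As a rough sketch of that logic (not the text of the shipped transform), the following stand-alone Python models the two steps. In a real transform Zenoss supplies `evt` and `component`; here the `Event` and `Interface` classes, the `current` field carrying the polled value, the `operStatus` attribute and the exact eventKey string are all assumptions made for illustration.

```python
# Stand-in objects so the sketch runs outside Zenoss; in a real transform
# `evt` and `component` are supplied by the event pipeline.
class Event(object):
    def __init__(self, eventKey, current):
        self.eventKey = eventKey
        self.current = current      # assumed field carrying the polled value
        self._action = "status"     # assumed default action


class Interface(object):
    def __init__(self, operStatus):
        self.operStatus = operStatus  # status as stored in the ZODB


# Assumed key format: "<datasource>_<datapoint>|<threshold id>"
IF_OPER_KEY = "ifOperStatus_ifOperStatus|ifOperStatusChange"


def status_perf_transform(evt, component):
    """Drop the threshold event and sync the stored interface status."""
    if component is None or evt.eventKey != IF_OPER_KEY:
        return False
    evt._action = "drop"
    new_status = int(float(evt.current))
    if component.operStatus != new_status:
        component.operStatus = new_status  # in Zenoss this is a ZODB write
    return True


iface = Interface(operStatus=1)
evt = Event(IF_OPER_KEY, "2.0")
status_perf_transform(evt, iface)
print(evt._action, iface.operStatus)
```

Note that the second step writes to the interface object, which matters for the behaviour of the threshold itself.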


zenperfsnmp polls the SNMP OID ifOperStatus every 5 minutes (by default). Possible values for the operational status (defined in the RFC-1213 MIB and IF-MIB) are:
* 1     Up
* 2     Down
* 3     Testing
* 4     Unknown
* 5     Dormant
* 6     notPresent
* 7     lowerLayerDown
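
When reading threshold events it can help to turn the numeric value back into a name. A minimal lookup, using only the values listed above; the helper name `describe_oper_status` is made up for this example and is not part of Zenoss.

```python
# ifOperStatus values as listed above (RFC 1213 / IF-MIB).
IF_OPER_STATUS = {
    1: "up",
    2: "down",
    3: "testing",
    4: "unknown",
    5: "dormant",
    6: "notPresent",
    7: "lowerLayerDown",
}


def describe_oper_status(value):
    """Return the status name for a polled value; None means no sample yet."""
    if value is None:
        return "None (no previous sample)"
    return IF_OPER_STATUS.get(int(value), "unrecognised (%s)" % value)


print(describe_oper_status(2.0))
```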

Unfortunately the ValueChangeThreshold threshold type is not a reliable source of events for interfaces. This is because other daemons and processes in Zenoss can reset the previous value (strictly, the threshold instance is recreated, so its previous value starts again as None). This happens:
* When zenmodeler changes a relevant object
* When any other script (such as an interface monitor / unmonitor script) changes a relevant object in the ZODB database
* When an event transform changes the status of an interface (as in the transform above)
* When the template or object is manipulated from the zendmd testing environment
* When zenping detects an interface change
* When the performance template is changed
The result is a spurious event that reports a change from a previous value of “None” to the current value. This means that “real” events (e.g. changes from Up to Down) may become masked, and hence this whole ValueChangeThreshold technique is unreliable.

If an interface changes from Up to Down then zenping (which runs every minute) is likely to see the interface down before the zenperfsnmp daemon does. zenping changes the status in the ZODB and a new threshold instance for the ifOperStatusChange threshold is pushed to the relevant collector (typically the Zenoss server in a Core environment). This new instance generates an event saying that the status has changed from None to the new status (2 if the interface has gone down), and the "lastValue" of this threshold is set to the new status of down (2).

At some later stage, the threshold is checked - the value is still down so the ValueChangeThreshold does not fire, even though the last time that zenperfsnmp checked this threshold, the value was 1 (up).
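
The sequence above can be reproduced with a toy model. This is not the Zenoss ValueChangeThreshold code; the class `ToyValueChangeThreshold` is invented here and only assumes the behaviour described in this post: an instance remembers a lastValue, starts with None, and fires when a sample differs from it. It also assumes the recreated instance sees the down value on its first evaluation.

```python
class ToyValueChangeThreshold(object):
    """Toy model: fires whenever the sample differs from lastValue."""

    def __init__(self):
        self.lastValue = None  # a freshly created instance has no history

    def check(self, value):
        event = None
        if value != self.lastValue:
            event = "changed from %s to %s" % (self.lastValue, value)
        self.lastValue = value
        return event


events = []

# 1. zenperfsnmp polls while the interface is up (1).
threshold = ToyValueChangeThreshold()
events.append(threshold.check(1))

# 2. The interface goes down. zenping notices first and updates the ZODB,
#    so a new threshold instance replaces the old one on the collector.
threshold = ToyValueChangeThreshold()
events.append(threshold.check(2))

# 3. Next check: the value is still down, so nothing fires.
events.append(threshold.check(2))

for e in events:
    print(e)
```

In this model no event ever reports a change from 1 to 2. The only record of the outage is a "None to 2" event, which looks the same as the spurious events produced by any other recreation of the instance.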

I have tried dropping or closing events that report a change from None to something - this doesn't help. The other changes (like zenping) still create a new threshold instance and still change the lastValue in that instance. The other issue is that if you drop good news events from None to 1 (up) then the bad news never gets closed by the threshold. The lastValue has been set to 1 (up) so the threshold sees no change. Further, any "good news" events that don't match a "bad news" event are always silently dropped (standard behaviour, irrelevant whether it is a ValueChangeThreshold event).

After struggling for some time with this quirky behaviour (quirky because it is all time-dependent) I have come to the conclusion that, for interfaces at least, you are much safer sticking with the MinMax threshold.
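
One way to see why a min/max check avoids the problem: it keeps no previous value, so recreating the threshold instance loses nothing. The sketch below is a plain Python illustration, not the Zenoss threshold code; the function name and the choice of bounds (min 1, max 1, so that anything other than "up" is a violation) are assumptions for the example.

```python
def min_max_violation(value, minval=None, maxval=None):
    """Stateless check: the result depends only on the current sample."""
    if minval is not None and value < minval:
        return "value %s below minimum %s" % (value, minval)
    if maxval is not None and value > maxval:
        return "value %s above maximum %s" % (value, maxval)
    return None


# Hypothetical bounds: treat anything other than 1 (up) as a violation.
samples = [1, 2, 2, 1]
results = [min_max_violation(s, minval=1, maxval=1) for s in samples]
for sample, result in zip(samples, results):
    print(sample, result)
```

Every sample taken while the interface is down is flagged, whatever happened to the threshold between polls.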

Cheers,
Jane
--------------------------------------------------------------
