Discussion:
Zenoss - Ping Data source problem
Stefan Reinke
2013-09-03 10:21:02 UTC
Stefan Reinke [http://community.zenoss.org/people/Stefan] created the discussion

"Zenoss - Ping Data source problem"

To view the discussion, visit: http://community.zenoss.org/message/74541#74541

--------------------------------------------------------------
Hi,

We are busy with a project where we want to monitor latency for our devices under the Ping Class group using the default ping data source provided in Zenoss V4.

The problem is that most of our devices use a 3G connection, where the first ping (ICMP packet) from Zenoss only opens the connection to the device and gives a very inaccurate latency (>2000 ms). Using tcpdump, we can see three pings to the device from Zenoss itself. We assume the first ping marks the start of a new cycle from Zenhub, where Zenoss sends an ICMP packet to the device to see whether it is up or down (for which it then creates an event). We assume the next two pings come from the ping data source, which probably sends two pings so that it can record an average RTT.

The trouble is that the two pings from the data source are 25-45 seconds apart, so we still get the high, inaccurate latency caused by the connection being reopened from the server to the device. We have looked for ways to shorten this gap, but we haven't been successful. Our assumption (yes, once again) is that the data source pings every device under the /Device/Ping group in turn, from the first device to the last (we have about 1450 devices in that group), and then starts over for the second ping.
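To make the effect concrete, here is a minimal Python sketch of what we are after: treat the first sample in a burst as a warm-up that only opens the 3G link, and average the rest. This is not how the Zenoss ping data source works internally; the function name `steady_state_rtt` and the sample values are made up for illustration, and it only helps if the pings in a burst are sent close enough together that the link stays open.

```python
from statistics import mean


def steady_state_rtt(rtts_ms, warmup=1):
    """Average the RTT samples after dropping the first `warmup` ones.

    Hypothetical helper: the dropped samples stand for the pings that
    only wake up the 3G connection.
    """
    samples = rtts_ms[warmup:]
    if not samples:
        return None  # nothing left to average
    return mean(samples)


# Made-up burst: one slow wake-up ping followed by two normal ones.
burst = [2150.0, 180.0, 200.0]

print(mean(burst))              # naive average, dominated by the first ping
print(steady_state_rtt(burst))  # average of the last two samples only
```

With the pings 25-45 seconds apart, every sample behaves like the warm-up one, which is why shortening the interval matters more than changing the averaging.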

We have tried the following:

1. Changed the number of pings per cycle to 1, 2, 3, 4 and 5 in the ping data source. This doesn't seem to have any effect on the number of pings it sends.
2. Changed the parallel jobs from 10 to 20. This doesn't help; I think these configuration settings don't apply to the ping data source?
3. Changed the pings to send in flight from 75 to 200. Same as above.

We can't find any documentation on the zenping data source, so any input, help or insight would be appreciated.

Server specifications:

CPU: Intel(R) Xeon(R) CPU E5640  @ 2.67GHz (4 Cores, 8 Threads)
RAM: 12GB
HDD: 2 x 250GB 10K HDDs in RAID 1
Zenoss version: 4.2.0

Thanks :)
--------------------------------------------------------------

Reply to this message by replying to this email -or- go to the discussion on Zenoss Community
[http://community.zenoss.org/message/74541#74541]

Start a new discussion in zenoss-users at Zenoss Community
[http://community.zenoss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2003]
hydruid
2013-09-12 18:20:25 UTC
hydruid [http://community.zenoss.org/people/hydruid] created the discussion

"Re: Zenoss - Ping Data source problem"

To view the discussion, visit: http://community.zenoss.org/message/74623#74623

--------------------------------------------------------------
I'm not 100% sure this will help, but it's something you could try.

Modify the zenping daemon and set this value: starttimeout=30

starttimeout is defined as "Wait seconds for initial heartbeat"

Let me know if that helps!
--------------------------------------------------------------

Stefan Reinke
2013-09-19 12:24:35 UTC
Stefan Reinke [http://community.zenoss.org/people/Stefan] created the discussion

"Re: Zenoss - Ping Data source problem"

To view the discussion, visit: http://community.zenoss.org/message/74704#74704

--------------------------------------------------------------
Hi,

Thanks for the response. I have made the change, but I am still seeing these huge gaps (30-50 seconds) between the ICMP packets.

That parameter is used to time out an ICMP packet after a certain number of seconds, right?

This is what I am seeing:

2013-09-19 14:27:01.986555 IP  X.X.X.X >  X.X.X.X: ICMP echo request, id 52595, seq 0, length 16
2013-09-19 14:27:40.662221 IP  X.X.X.X >  X.X.X.X: ICMP echo request, id 58223, seq 0, length 16
2013-09-19 14:28:28.484224 IP  X.X.X.X >  X.X.X.X: ICMP echo request, id 31777, seq 0, length 16
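
For reference, the gaps can be computed directly from the capture. This is only a small Python sketch that parses the three lines above; the function name `request_gaps` is made up, and it assumes every line starts with a date and a time in the format shown.

```python
from datetime import datetime

CAPTURE = """\
2013-09-19 14:27:01.986555 IP  X.X.X.X >  X.X.X.X: ICMP echo request, id 52595, seq 0, length 16
2013-09-19 14:27:40.662221 IP  X.X.X.X >  X.X.X.X: ICMP echo request, id 58223, seq 0, length 16
2013-09-19 14:28:28.484224 IP  X.X.X.X >  X.X.X.X: ICMP echo request, id 31777, seq 0, length 16
"""


def request_gaps(capture_text):
    """Return the seconds between consecutive ICMP echo requests."""
    stamps = []
    for line in capture_text.splitlines():
        if "ICMP echo request" not in line:
            continue
        # The first two whitespace-separated fields are the date and time.
        date_part, time_part = line.split()[:2]
        stamps.append(
            datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S.%f")
        )
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


gaps = request_gaps(CAPTURE)
print([round(gap, 1) for gap in gaps])  # [38.7, 47.8]
```

So the requests in this capture are roughly 39 and 48 seconds apart.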

Any other suggestions?
--------------------------------------------------------------

hydruid
2013-09-19 13:51:12 UTC
hydruid [http://community.zenoss.org/people/hydruid] created the discussion

"Re: Zenoss - Ping Data source problem"

To view the discussion, visit: http://community.zenoss.org/message/74706#74706

--------------------------------------------------------------
Take a look at this thread, it might be able to help:
http://community.zenoss.org/message/74280#74280

Also, jump into the IRC channel between 8am and 5pm CST and ask your question
there. There are tons of people who might have a tip!
--------------------------------------------------------------

Stefan Reinke
2013-09-25 08:21:26 UTC
Stefan Reinke [http://community.zenoss.org/people/Stefan] created the discussion

"Re: Zenoss - Ping Data source problem"

To view the discussion, visit: http://community.zenoss.org/message/74750#74750

--------------------------------------------------------------
Hi,

If I am correct, that thread is about an issue with devices showing as down? That is not the problem we are experiencing at the moment. The interval between the ICMP packets sent from Zenoss itself is too far apart, as seen in my previous post. It would be perfect if we could bring it down to 2-5 seconds instead of 30-45 seconds.

Will have a look at the IRC channel :)

Thanks for the help so far!
--------------------------------------------------------------

Doug Syer
2013-09-26 05:02:26 UTC
Doug Syer [http://community.zenoss.org/people/dsyer%40nwnit.com] created the discussion

"Re: Zenoss - Ping Data source problem"

To view the discussion, visit: http://community.zenoss.org/message/74773#74773

--------------------------------------------------------------
I believe they were going to build this feature into 4.2.4. I had it in 4.1.1, but when I updated to 4.2.3 I forgot to include the feature request patch I had requested, so I'm missing it now.

I'm just not 100% sure they did add it to 4.2.4. If it's there, it should show up when you run zenping --help.

--delay-count=DELAYCOUNT
Delay down events until more than this many ping
downs are collected in a row. Default is 0 (no delay).

Add it to zenping.conf. I use two pings every 60 seconds, and I've found that setting this to 1 (so we have to lose 4 pings across two polling cycles, over 2 minutes) eliminates a ton of random ping noise for us. I'm not sure if it will help your situation because of the 3G stuff.

I'm not sure what it will do for your latency graphs, because I'm not sure exactly how each (or which) ping result is stored in the RRD.
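
Roughly, the behaviour described in the help text can be sketched like this in Python. This is not the zenping code, just an illustration of "more than this many ping downs in a row"; the function name `down_events` and the sample history are made up, and it assumes one "ping down" means a whole polling cycle in which every ping was lost.

```python
def down_events(cycle_results, delay_count=0):
    """Yield the index of each cycle at which a down event would be raised.

    cycle_results: iterable of booleans, True if the device answered in
    that polling cycle. An event is raised once per outage, as soon as
    the run of consecutive down cycles exceeds `delay_count`.
    """
    consecutive_downs = 0
    for index, is_up in enumerate(cycle_results):
        if is_up:
            consecutive_downs = 0  # any answer ends the run
            continue
        consecutive_downs += 1
        if consecutive_downs == delay_count + 1:
            yield index


# Made-up history: one isolated lost cycle, then two lost cycles in a row.
history = [True, False, True, False, False, True]

print(list(down_events(history, delay_count=0)))  # [1, 3]
print(list(down_events(history, delay_count=1)))  # [4]
```

With a delay count of 1, the isolated lost cycle no longer raises an event, which is the kind of noise reduction described above.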
--------------------------------------------------------------

Stefan Reinke
2013-09-26 07:58:37 UTC
Stefan Reinke [http://community.zenoss.org/people/Stefan] created the discussion

"Re: Zenoss - Ping Data source problem"

To view the discussion, visit: http://community.zenoss.org/message/74774#74774

--------------------------------------------------------------
We are still running version 4.2.0, so I don't know if that has been implemented yet. I only see the following information in the zenping.conf.

# Config file written out from GUI
duallog False
allowduplicateclears True
uid zenoss
zenhubpinginterval 30
ping-backend nmap
watchdog False
eventflushseconds 5.0
hubhost localhost
traceroute-interval 5
monitor localhost
hubusername admin
maxqueuelen 15000
starttimeout 10
hubpassword zenoss
duplicateclearinterval 0
initialHubTimeout 30
logseverity 20
maxlogsize 10240
disable-correlator True
maxbackuplogs 3
maxparallel 500
logTaskStats 0
eventflushchunksize 50
hubport 8789
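
A quick way to check whether an option is present is to read the file into a dictionary. This is a minimal Python sketch that assumes the format is what the listing above shows (one "name value" pair per line, with # for comments); the function name `parse_zenping_conf` is made up and the sample is a shortened copy of the listing.

```python
def parse_zenping_conf(text):
    """Parse 'name value' lines into a dict, skipping blanks and # comments."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first space only; the rest of the line is the value.
        key, _, value = line.partition(" ")
        options[key] = value.strip()
    return options


SAMPLE = """\
# Config file written out from GUI
zenhubpinginterval 30
ping-backend nmap
starttimeout 10
maxparallel 500
"""

options = parse_zenping_conf(SAMPLE)
print("delay-count" in options)  # False
print(options["starttimeout"])   # 10
```

All values come back as strings, so convert them if you need numbers.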

I can try to add it to the bottom, but I am not sure if it will have any effect.
--------------------------------------------------------------
