Discussion:
Zenoss 4.2.3 event queue maxing out
Wayne Spangenberg
2013-04-03 11:58:07 UTC
Permalink
Wayne Spangenberg [http://community.zenoss.org/people/waynes] created the discussion

"Zenoss 4.2.3 event queue maxing out"

To view the discussion, visit: http://community.zenoss.org/message/72688#72688

--------------------------------------------------------------
Hi All,

I'm new to Zenoss and Linux, but so far I think it's awesome.

However, I am having an issue with the event queue of the collector maxing out; see attached.
As a quick fix I reboot the server, which is why the event queue graphs drop off.

Any advice on where to start looking for the cause of the high event queue?

Environment:

Zenoss server is a VM in a Hyper-V environment, 8GB RAM, CentOS 6.3
Monitoring 150 Cisco routers (SNMP and SYSLOG)

Many thanks for any assistance

Wayne
--------------------------------------------------------------

Reply to this message by replying to this email -or- go to the discussion on Zenoss Community
[http://community.zenoss.org/message/72688#72688]

Start a new discussion in zenoss-users at Zenoss Community
[http://community.zenoss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2003]
jcurry
2013-04-05 15:23:05 UTC
Permalink
jcurry [http://community.zenoss.org/people/jcurry] created the discussion

"Re: Zenoss 4.2.3 event queue maxing out"

To view the discussion, visit: http://community.zenoss.org/message/72711#72711

--------------------------------------------------------------
Hi Wayne,
Assuming that your "top" screenshot is typical then you seem to have no problem at all on CPU or memory.  One other thing you might check is that you are not running out of disk space, especially in the temporary directory, as this can cause all sorts of funnies.  Try the disk free command:
df -h

If anything is close to 100% full then post it here.
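
If you would rather script that check than eyeball it, here is a rough Python sketch. It assumes you feed it the text printed by df -P (the POSIX output format, which keeps each filesystem on a single line even when the device name is long). The function name nearly_full and the 90% threshold are my own choices for illustration, not anything Zenoss provides.

```python
def nearly_full(df_output, threshold=90):
    """Return (mount_point, percent_used) pairs at or above threshold.

    Expects the text printed by `df -P`; the threshold is an arbitrary
    illustrative default, not a Zenoss recommendation.
    """
    flagged = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue  # not a filesystem row
        percent = int(fields[4].rstrip("%"))
        if percent >= threshold:
            # Re-join in case the mount point itself contains spaces.
            flagged.append((" ".join(fields[5:]), percent))
    return flagged
```

To run it against the live system you could pass in subprocess.check_output(["df", "-P"], text=True); an empty list means nothing is at or above the threshold.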

Your graphs show that you are getting lots of events from zenperfsnmp, zentrap and zensyslog but my suspicion is that they are not being processed.

The events subsystem in Zenoss 4.x depends on having RabbitMQ running (it should have been installed and set up when you installed Zenoss, and it should start automatically).  This is one of the very few checks you need to do as the root user; try:
rabbitmqctl -p /zenoss list_queues

You should see a bunch of queues which typically have 0 events queued (see attachment).
If queues are blocked for some reason or if rabbitmq is not running correctly, then your events won't get processed.
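
For a scripted version of that check, something along these lines might do. This is only a sketch: it assumes the listing has the usual form of one queue name followed by its message count on each line, and backed_up_queues is a made-up helper name, not part of RabbitMQ or Zenoss.

```python
def backed_up_queues(listing, limit=0):
    """Return {queue_name: depth} for queues holding more than `limit` messages.

    `listing` is the text printed by `rabbitmqctl -p /zenoss list_queues`.
    """
    busy = {}
    for line in listing.splitlines():
        parts = line.rsplit(None, 1)  # queue name, then message count
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skips the "Listing queues ..." and "...done." lines
        name, depth = parts[0].strip(), int(parts[1])
        if depth > limit:
            busy[name] = depth
    return busy
```

An empty result means every queue is drained; anything else names the queue where events are piling up.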

Do you see lots of events in the Events Console?

If RabbitMQ is running, the next thing to look at might be $ZENHOME/log/zeneventd.log as it is zeneventd that actually processes the events.
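
A quick way to pull the interesting lines out of that log is sketched below. It assumes zeneventd.log uses the common Zenoss daemon log layout of date, time, level, then logger name and message (for example "2013-04-09 09:29:43,142 INFO zen.maintenance: ..."); I have not checked that against zeneventd specifically, and problem_lines is just an illustrative name.

```python
def problem_lines(lines, levels=("WARNING", "ERROR", "CRITICAL")):
    """Return the log lines whose level field is one of `levels`.

    Assumes each entry starts with: date, time, level (an assumption
    about the zeneventd.log layout, not something confirmed here).
    """
    hits = []
    for line in lines:
        fields = line.split(None, 3)  # date, time, level, rest of line
        if len(fields) >= 3 and fields[2] in levels:
            hits.append(line.rstrip("\n"))
    return hits
```

You could call it with an open file handle on $ZENHOME/log/zeneventd.log, since iterating over a file yields its lines.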

Cheers,
Jane

--------------------------------------------------------------

Wayne Spangenberg
2013-04-08 12:41:14 UTC
Permalink
Wayne Spangenberg [http://community.zenoss.org/people/waynes] created the discussion

"Re: Zenoss 4.2.3 event queue maxing out"

To view the discussion, visit: http://community.zenoss.org/message/72732#72732

--------------------------------------------------------------
Hi,

Thanks for taking the time to help :)

df -h shows disk usage is OK:

Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/vg_divss029-lv_root   30G  4.3G   24G  16% /
tmpfs                            3.9G   56K  3.9G   1% /dev/shm
/dev/sda1                        485M   95M  365M  21% /boot

RabbitMQ queues look OK:

Listing queues ...
celery  0
zenoss.queues.zep.modelchange   0
zenoss.queues.zep.signal        0
zenoss.queues.zep.migrated.summary      0
zenoss.queues.zep.rawevents     0
zenoss.queues.zep.heartbeats    0
zenoss.queues.zep.zenevents     0
DIVSS029.celeryd.pidbox 0
zenoss.queues.zep.migrated.archive      0
...done.

I did find that a Python process was using 99% CPU this morning. I didn't think to look in Zenoss -> Advanced -> Daemons to see what, if anything, matched that PID before I killed it, but I suspect it was ZenHub. After I killed the process I had a look at Daemons and saw ZenHub was down.


I'm having a look at the logs and will post what I don't understand soon.

Regards

Wayne
--------------------------------------------------------------

Wayne Spangenberg
2013-04-09 12:00:48 UTC
Permalink
Wayne Spangenberg [http://community.zenoss.org/people/waynes] created the discussion

"Re: Zenoss 4.2.3 event queue maxing out"

To view the discussion, visit: http://community.zenoss.org/message/72748#72748

--------------------------------------------------------------
I had this issue again this morning. This is the log from zensyslog; it looks like at 09:34 it just stops, and it starts again after 12:34.

See corresponding graphs below.

2013-04-09 09:29:43,142 INFO zen.maintenance: Performing periodic maintenance
2013-04-09 09:29:43,143 INFO zen.zensyslog: Counter eventCount, value 626766
2013-04-09 09:29:43,144 INFO zen.zensyslog: 157 devices processed (0 datapoints)
2013-04-09 09:29:43,146 INFO zen.collector.scheduler: Tasks: 159 Successful_Runs: 266 Failed_Runs: 0 Missed_Runs: 0 Queued_Tasks: 0 Running_Tasks: 0
2013-04-09 09:34:43,164 INFO zen.maintenance: Performing periodic maintenance
2013-04-09 09:34:43,164 INFO zen.zensyslog: Counter eventCount, value 626782
2013-04-09 09:34:43,166 INFO zen.zensyslog: 157 devices processed (0 datapoints)
2013-04-09 09:34:43,168 INFO zen.collector.scheduler: Tasks: 159 Successful_Runs: 294 Failed_Runs: 0 Missed_Runs: 0 Queued_Tasks: 0 Running_Tasks: 0
2013-04-09 09:39:43,183 INFO zen.maintenance: Performing periodic maintenance
2013-04-09 09:39:43,183 INFO zen.zensyslog: Counter eventCount, value 626799
2013-04-09 09:39:43,184 INFO zen.zensyslog: 157 devices processed (0 datapoints)
2013-04-09 09:39:43,186 INFO zen.collector.scheduler: Tasks: 159 Successful_Runs: 311 Failed_Runs: 0 Missed_Runs: 0 Queued_Tasks: 0 Running_Tasks: 0
2013-04-09 12:28:48,143 INFO zen.zensyslog: 0 events processed in 0.00 seconds
2013-04-09 12:28:48,182 INFO zen.zensyslog: Deleting PID file /opt/zenoss/var/zensyslog-localhost.pid ...
2013-04-09 12:28:48,182 INFO zen.zensyslog: Daemon SyslogDaemon shutting down
2013-04-09 12:31:31,304 INFO zen.zensyslog: Connecting to localhost:8789
2013-04-09 12:31:35,524 INFO zen.zensyslog: Connected to ZenHub
2013-04-09 12:31:35,547 INFO zen.maintenance: Performing periodic maintenance
2013-04-09 12:31:35,547 INFO zen.zensyslog: Counter eventCount, value 626800
2013-04-09 12:31:35,555 INFO zen.zensyslog: 0 devices processed (0 datapoints)
2013-04-09 12:31:35,555 INFO zen.collector.scheduler: Tasks: 1 Successful_Runs: 0 Failed_Runs: 0 Missed_Runs: 0 Queued_Tasks: 0 Running_Tasks: 0
2013-04-09 12:32:09,414 INFO zen.zensyslog: Connecting to localhost:8789
2013-04-09 12:32:09,429 INFO zen.zensyslog: Connected to ZenHub
2013-04-09 12:32:09,444 INFO zen.maintenance: Performing periodic maintenance
2013-04-09 12:32:09,444 INFO zen.zensyslog: Counter eventCount, value 626801
2013-04-09 12:32:09,445 INFO zen.zensyslog: 0 devices processed (0 datapoints)
2013-04-09 12:32:09,445 INFO zen.collector.scheduler: Tasks: 1 Successful_Runs: 0 Failed_Runs: 0 Missed_Runs: 0 Queued_Tasks: 0 Running_Tasks: 0
2013-04-09 12:37:25,780 INFO zen.maintenance: Performing periodic maintenance
2013-04-09 12:37:25,780 INFO zen.zensyslog: Counter eventCount, value 626816
2013-04-09 12:37:25,781 INFO zen.zensyslog: 22 devices processed (0 datapoints)
2013-04-09 12:37:25,782 INFO zen.collector.scheduler: Tasks: 24 Successful_Runs: 2 Failed_Runs: 0 Missed_Runs: 0 Queued_Tasks: 0 Running_Tasks: 0
2013-04-09 12:42:25,865 INFO zen.maintenance: Performing periodic maintenance
2013-04-09 12:42:25,866 INFO zen.zensyslog: Counter eventCount, value 626833
2013-04-09 12:42:25,867 INFO zen.zensyslog: 34 devices processed (0 datapoints)
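
To spot silent periods like this without reading the whole log, a small Python sketch along these lines could help. It assumes every entry starts with a timestamp in the format shown above, and the 10-minute threshold is my own guess based on the 5-minute maintenance interval visible in the log; find_gaps is an illustrative name, not a Zenoss tool.

```python
from datetime import datetime, timedelta

STAMP = "%Y-%m-%d %H:%M:%S,%f"  # e.g. 2013-04-09 09:39:43,186


def find_gaps(lines, max_gap=timedelta(minutes=10)):
    """Return (last_seen, next_seen) pairs where the log was silent
    for longer than max_gap."""
    gaps = []
    previous = None
    for line in lines:
        try:
            stamp = datetime.strptime(line[:23], STAMP)
        except ValueError:
            continue  # line without a leading timestamp, e.g. a traceback
        if previous is not None and stamp - previous > max_gap:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps
```

Run over the excerpt above, this would report a single gap, between the entry at 09:39:43 and the shutdown message at 12:28:48.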

I am in the process of stopping all devices from sending syslog messages to Zenoss; maybe that will help.

Warm regards,

Wayne
--------------------------------------------------------------
