Discussion:
Mysql connection errors stop the monitoring
tigerpaws
2012-01-25 18:37:41 UTC
tigerpaws [http://community.zenoss.org/people/tigerpaws] created the discussion

"Mysql connection errors stop the monitoring"

To view the discussion, visit: http://community.zenoss.org/message/63915#63915

--------------------------------------------------------------
On many of my servers monitored by zenoss, I get the "Host '...' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'" alert. Yet, if I reset mysql, everything is fine for a while, so there are no connection errors. What kind of connection errors could I be getting to cause mysql to blacklist the zenoss system? The logon timeouts for the MySQL module are at 15 seconds, which seems more than ample. Yet I get this from nearly all my servers where mysql is being monitored. There is no mention of this in the mysql log. This really messes up the monitoring and leaves large holes in the graphs.

Does anyone have any idea how to test this to find out exactly what is causing these errors?
--------------------------------------------------------------

Reply to this message by replying to this email -or- go to the discussion on Zenoss Community
[http://community.zenoss.org/message/63915#63915]

Start a new discussion in zenoss-users by email
[discussions-community-forums-zenoss--***@community.zenoss.org] -or- at Zenoss Community
[http://community.zenoss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2003]
dpetzel
2012-01-26 02:16:35 UTC
dpetzel [http://community.zenoss.org/people/dpetzel] created the discussion

"Re: Mysql connection errors stop the monitoring"

To view the discussion, visit: http://community.zenoss.org/message/63922#63922

--------------------------------------------------------------
What is your polling interval set to?

I am by no means a mysql expert, but I was reading http://dev.mysql.com/doc/refman/5.0/en/blocked-host.html and wondering if it's possible the connection attempts are being dropped somehow. There are of course countless reasons that *might* happen, but at a very high level your symptom seems related to what that page describes.

Do all your mysql boxes go into error at the same time, or does the host in error vary? Is it possible you have interface errors on any of those boxes or on your zenoss server?

I think the most direct route to getting solid data would be tcpdump, if you can catch the condition as it starts.
--------------------------------------------------------------

tigerpaws
2012-01-26 17:12:47 UTC
tigerpaws [http://community.zenoss.org/people/tigerpaws] created the discussion

"Re: Mysql connection errors stop the monitoring"

To view the discussion, visit: http://community.zenoss.org/message/63929#63929

--------------------------------------------------------------
It seems to be happening with the systems that are off site relative to the zenoss server. We have two sites; the second is at a colo facility. The mysql servers can get pretty busy, and I am now thinking of upping the default timeout zenoss uses for these sites. The only thing I can think of so far is that when the servers get busy, they cannot respond within 15 seconds, and the connection times out. However, it takes 10 failed connections for mysql to blacklist the address, and that makes little sense to me: can 10 consecutive connections really each take more than 15 seconds? I find that hard to believe, but I'll give it a try.
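
To make the "10 failed connections" reasoning concrete, here is a minimal sketch of how I understand the per-host counting to work, based on the blocked-host page linked earlier in the thread. It is a toy model, not mysqld's actual code: the class name `HostBlockModel`, the reset-on-success rule, and the exact threshold comparison are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HostBlockModel:
    """Toy model of per-host connect-error counting (assumed behaviour)."""

    max_connect_errors: int = 10  # the limit of 10 mentioned in the post
    errors: dict = field(default_factory=dict)
    blocked: set = field(default_factory=set)

    def connect(self, host: str, handshake_completed: bool) -> str:
        """Record one connection attempt and return its outcome."""
        if host in self.blocked:
            return "blocked"
        if handshake_completed:
            # Assumption: a successful connection clears the host's counter.
            self.errors[host] = 0
            return "ok"
        self.errors[host] = self.errors.get(host, 0) + 1
        # Assumption: the host is blocked once the counter reaches the limit.
        if self.errors[host] >= self.max_connect_errors:
            self.blocked.add(host)
        return "interrupted"

    def flush_hosts(self) -> None:
        """Model of 'mysqladmin flush-hosts': forget all counters and blocks."""
        self.errors.clear()
        self.blocked.clear()


model = HostBlockModel()
outcomes = [model.connect("zenoss", handshake_completed=False) for _ in range(10)]
next_attempt = model.connect("zenoss", handshake_completed=True)
```

Under this model the host only gets blocked if the interrupted attempts pile up without a successful connection in between, which is why ten consecutive failures look surprising on a link that usually works.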

I have no detectable interface errors or downtime on any of the boxes, and usually I can connect to them very easily, so this many failures is strange. This only started happening a few weeks ago, though.

Thanks for the response. I'll add some scripts to see if I can catch it and see exactly what is going on.
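
A rough sketch of the kind of script I have in mind: open a TCP connection to the mysql port, time it, and look at the first packet the server sends. The function names `probe` and `parse_first_packet` are made up for this example, and the packet layout (3-byte length, 1-byte sequence number, then either a 0xFF error marker with a 2-byte error code or a handshake greeting) is my understanding of the MySQL wire protocol rather than something verified against these servers. One caveat: a probe that connects and hangs up without logging in may itself be counted as an interrupted connection, so it is probably best run sparingly and from a host other than the zenoss server.

```python
import socket
import struct
import time

ER_HOST_IS_BLOCKED = 1129  # error code that accompanies the "is blocked" message


def parse_first_packet(data: bytes):
    """Classify the first packet received after connecting to a MySQL server.

    Returns a (kind, number, text) tuple where kind is "error",
    "handshake" or "incomplete".
    """
    if len(data) < 5:
        return ("incomplete", None, "")
    # Packet header: 3-byte little-endian payload length, then 1-byte sequence id.
    payload_len = data[0] | (data[1] << 8) | (data[2] << 16)
    payload = data[4:4 + payload_len]
    if not payload:
        return ("incomplete", None, "")
    if payload[0] == 0xFF:
        if len(payload) < 3:
            return ("incomplete", None, "")
        # Error packet: marker byte, 2-byte little-endian error code, message.
        (code,) = struct.unpack_from("<H", payload, 1)
        return ("error", code, payload[3:].decode("latin-1"))
    # Otherwise assume a greeting: protocol version byte, then a
    # NUL-terminated server version string.
    version = payload[1:].split(b"\x00", 1)[0].decode("latin-1")
    return ("handshake", payload[0], version)


def probe(host: str, port: int = 3306, timeout: float = 15.0):
    """Connect once and return (elapsed_seconds, parsed_first_packet)."""
    start = time.time()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(4096)
        result = parse_first_packet(data)
    except OSError as exc:  # covers timeouts, refused and reset connections
        result = ("failed", None, str(exc))
    return (time.time() - start, result)
```

Calling `probe()` against one of the colo servers every few minutes and logging the elapsed time next to the result should show whether the greeting is slow, missing, or already an error with code `ER_HOST_IS_BLOCKED`.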
--------------------------------------------------------------
