omeganon
2012-10-29 18:31:02 UTC
omeganon [http://community.zenoss.org/people/omeganon] created the discussion
"Rabbitmq not starting FYI"
To view the discussion, visit: http://community.zenoss.org/message/69590#69590
--------------------------------------------------------------
So, I rebooted my box today and zenhub wouldn't start. I was getting the connection refused error WRT rabbitmq. After a bunch of troubleshooting related to hostname and /etc/hosts (which were unchanged), I decided to zenbackup, blow away everything and re-install via the core autodeploy script again. Even after doing so, I still had the same issue (I had even removed everything under /var/lib/{rabbitmq,mysql} to ensure I was starting fresh).
In the end I discovered that the rabbitmq-server init script wasn't actually starting rabbitmq. The problem is that the first thing the init script does is test whether rabbitmq is already running:
start_rabbitmq () {
   status_rabbitmq quiet
...
status_rabbitmq() {
   set +e
   if [ "$1" != "quiet" ] ; then
       $CONTROL status 2>&1
   else
       $CONTROL status > /dev/null 2>&1
   fi
   if [ $? != 0 ] ; then
       RETVAL=3
   fi
   set -e
}
As you can see, it checks the exit code of rabbitmqctl status: if it's 0, rabbitmq is considered running; anything else means it's not. The problem is that rabbitmqctl seems to exit with 0 even when the node is demonstrably down (only epmd shows up in the 'ps' output, and even its own output reports nodedown):
[***@buzz bin]# rabbitmqctl status
Status of node ***@buzz ...
Error: unable to connect to node ***@buzz: nodedown
DIAGNOSTICS
===========
nodes in question: [***@buzz]
hosts, their running nodes and ports:
- buzz: [{rabbitmqctl7254,40130}]
current node details:
- node name: ***@buzz
- home dir: /var/lib/rabbitmq
- cookie hash: Cpx4pv1fUqIGNrAnkVsdIA==
[***@buzz bin]# echo $?
0
[***@buzz rabbitmq]# ps -ef | grep rabbit
rabbitmq 4082    1 0 12:59 ?       00:00:00 /usr/lib64/erlang/erts-5.8.5/bin/epmd -daemon
I ended up fudging the status_rabbitmq() function to return a non-zero result to ensure it always tried to start it. This is _not ideal_ but suits my needs for the time being.
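A less blunt option than always returning non-zero might be to also look at what rabbitmqctl prints, since the exit code alone can't be trusted here. This is only a rough sketch, not what the packaged init script does: rabbitmq_is_up is a name I made up, and matching on the string "nodedown" is an assumption based on the output shown above.

```shell
# Hypothetical helper, not part of the packaged init script.
# CONTROL is the same variable the init script uses; default it for standalone use.
CONTROL=${CONTROL:-rabbitmqctl}

# Returns 0 only if "$CONTROL status" exits 0 AND its output does not
# contain "nodedown". Returns 3 otherwise (the value the init script
# stores in RETVAL for "not running").
rabbitmq_is_up() {
    # "|| return 3" keeps this safe even when the caller runs with set -e
    out=$($CONTROL status 2>&1) || return 3
    case "$out" in
        *nodedown*) return 3 ;;   # assumption: this string marks a dead node
    esac
    return 0
}
```

status_rabbitmq() could then set RETVAL=3 whenever rabbitmq_is_up fails, rather than relying on $? from rabbitmqctl alone.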
To verify that it's actually running, a full suite of rabbitmq processes looks something like this:
rabbitmq 4844    1 0 13:04 ?       00:00:00 /usr/lib64/erlang/erts-5.8.5/bin/epmd -daemon
root     7384    1 0 13:11 pts/0   00:00:00 runuser rabbitmq --session-command /usr/sbin/rabbitmq-server
rabbitmq 7397 7384 0 13:11 pts/0   00:00:00 /bin/sh /usr/sbin/rabbitmq-server
rabbitmq 7401 7397 0 13:11 pts/0   00:00:03 /usr/lib64/erlang/erts-5.8.5/bin/beam.smp -W w -K true -A30 -P 1048576 -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -noshell -noinput -sname buzz -boot /var/lib/rabbitmq/mnesia/buzz-plugins-expand/rabbit -kernel inet_default_connect_options [{nodelay,true}] -sasl errlog_type error -sasl sasl_error_logger false -rabbit error_logger {file,"/var/log/rabbitmq/buzz.log"} -rabbit sasl_error_logger {file,"/var/log/rabbitmq/buzz-sasl.log"} -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/var/lib/rabbitmq/mnesia/buzz"
rabbitmq 7546 7401 0 13:11 ?       00:00:00 inet_gethost 4
rabbitmq 7548 7546 0 13:11 ?       00:00:00 inet_gethost 4
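If you'd rather script that check than eyeball it, something along these lines could work. It's a sketch under the assumption that a running broker always shows a beam or beam.smp process with "rabbit" somewhere on its command line, as in the listing above; has_rabbit_beam is a hypothetical name.

```shell
# Hypothetical helper: reads `ps -ef`-style lines on stdin and succeeds only
# if one of them is an Erlang VM (beam or beam.smp) whose command line
# mentions rabbit. An epmd-only listing does not count as running.
has_rabbit_beam() {
    grep -E '/beam(\.smp)? ' | grep -q 'rabbit'
}

# Usage:
#   ps -ef | has_rabbit_beam && echo "rabbitmq VM is running"
```

Matching on the beam binary instead of just grepping for "rabbit" avoids being fooled by a leftover epmd, which is exactly what was still running when the node was down.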