Dmitri Minaev | 22 Dec 08:05 2011

Re: RabbitMQ hangs, does not accept connections

Oh...

$ erl -sname qwer
Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4]
[async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.7.4  (abort with ^G)
(qwer@dbx)1> net_adm:names().
{ok,[{"rabbit",60040},{"qwer",58043}]}
(qwer@dbx)2> net_adm:ping(rabbit).
pang
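
For reference, net_adm:names() only asks the local epmd daemon which nodes are registered, so seeing "rabbit" in that list while ping returns pang means the node is registered but the distribution connection to it was not established. One thing worth ruling out is the cookie, since the quoted ps output shows the rabbit node started with -setcookie riak. Below is a rough Python sketch of the same epmd query, assuming epmd listens on its default port 4369; the function names are my own and not part of any library.

```python
import socket
import struct

EPMD_PORT = 4369   # default epmd port (assumption: not overridden on this host)
NAMES_REQ = 110    # request tag 'n' in the epmd protocol


def parse_names_response(payload: bytes) -> dict:
    """Parse an epmd NAMES_RESP payload into {node_name: port}."""
    # The first 4 bytes carry the epmd port number; the rest is text,
    # one "name <node> at port <port>" line per registered node.
    text = payload[4:].decode("latin-1")
    nodes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 5 and parts[0] == "name" and parts[2:4] == ["at", "port"]:
            nodes[parts[1]] = int(parts[4])
    return nodes


def epmd_names(host: str = "127.0.0.1", timeout: float = 3.0) -> dict:
    """Ask epmd on `host` for its registered nodes (like net_adm:names())."""
    with socket.create_connection((host, EPMD_PORT), timeout=timeout) as sock:
        # Request: 2-byte big-endian length, then the 1-byte request tag.
        sock.sendall(struct.pack(">HB", 1, NAMES_REQ))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:   # epmd closes the connection after replying
                break
            chunks.append(chunk)
    return parse_names_response(b"".join(chunks))
```

Calling epmd_names() on the broker host should list the same nodes and ports as the shell session above. It says nothing about whether the node itself still responds, which is what the ping tests.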

On 22 December 2011 10:55, Alvaro Videla <videlalvaro <at> gmail.com> wrote:
> Hi,
>
> A small note,
>
> When connecting to a remote Erlang node, in this case the rabbit node, you have to choose a different
> node name.
>
> For example:
>
> erl -sname foo
>
> Once you are in the Erlang REPL, you can try to connect to the rabbit node remotely using net_adm:ping.
>
> -Alvaro.
>
> Sent from my iFad
>
> On Dec 22, 2011, at 7:32 AM, Dmitri Minaev <minaev <at> gmail.com> wrote:
>
>> Now, I have a hanging Rabbit available for the autopsy.
>>
>> Running processes (ps ax|grep rabbit):
>>
>> -------------
>> 29699 ?        Ss     0:00 sh -c
>> RABBITMQ_PID_FILE=/var/run/rabbitmq/pid /usr/sbin/rabbitmq-server >
>>         /var/log/rabbitmq/startup_log 2>
>> /var/log/rabbitmq/startup_err
>> 29702 ?        S      0:00 /bin/sh /usr/sbin/rabbitmq-server
>> 29708 ?        S      0:00 su rabbitmq -s /bin/sh -c
>> /usr/lib/rabbitmq/bin/rabbitmq-server
>> 29710 ?        S      0:00 sh -c /usr/lib/rabbitmq/bin/rabbitmq-server
>> 29711 ?        Sl   4715:59 /usr/lib/erlang/erts-5.7.4/bin/beam.smp -W
>> w -K true -A30 -P 1048576 -- -root /usr/lib/erlang -progname erl --
>> -home /var/lib/rabbitmq -- -noshell -noinput -sname rabbit@dbx
>> -setcookie riak -boot
>> /var/lib/rabbitmq/mnesia/rabbit@dbx-plugins-expand/rabbit -config
>> /etc/rabbitmq/rabbitmq -kernel inet_default_connect_options
>> [{nodelay,true}] -rabbit tcp_listeners [{"0.0.0.0",5672}] -sasl
>> errlog_type error -kernel error_logger
>> {file,"/var/log/rabbitmq/rabbit@dbx.log"} -sasl sasl_error_logger
>> {file,"/var/log/rabbitmq/rabbit@dbx-sasl.log"} -os_mon start_cpu_sup
>> true -os_mon start_disksup false -os_mon start_memsup false -mnesia
>> dir "/var/lib/rabbitmq/mnesia/rabbit@dbx"
>> -------------
>>
>> Network sockets are available:
>> $ sudo netstat -tunlp|grep beam
>> tcp        0      0 0.0.0.0:5672            0.0.0.0:*
>> LISTEN      29711/beam.smp
>> tcp        0      0 0.0.0.0:60040           0.0.0.0:*
>> LISTEN      29711/beam.smp
>>
>> $ cat /etc/rabbitmq/rabbitmq.config
>> [{rabbit, [{vm_memory_high_watermark, 0.7}]},
>> {rabbit, [{tcp_listeners, [{"0.0.0.0", 5672}]}]}].
>>
>> $ cat /etc/rabbitmq/rabbitmq-env.conf
>> RABBITMQ_NODE_IP_ADDRESS=0.0.0.0
>>
>> strace -p 29711 shows that the process is waiting in select():
>> select(0, NULL, NULL, NULL, NULL
>>
>>
>> Last lines in rabbit@dbx.log:
>> ---------------------------
>> =WARNING REPORT==== 22-Dec-2011::09:55:44 ===
>> exception on TCP connection <0.367.0> from x.x.x.26:43157
>> connection_closed_abruptly
>>
>> =INFO REPORT==== 22-Dec-2011::09:55:44 ===
>> closing TCP connection <0.367.0> from x.x.x.26:43157
>>
>> =WARNING REPORT==== 22-Dec-2011::09:55:44 ===
>> exception on TCP connection <0.379.0> from x.x.x.26:43160
>> connection_closed_abruptly
>>
>> =INFO REPORT==== 22-Dec-2011::09:55:44 ===
>> closing TCP connection <0.379.0> from x.x.x.26:43160
>>
>> =WARNING REPORT==== 22-Dec-2011::09:55:44 ===
>> exception on TCP connection <0.335.0> from x.x.x.26:43154
>> connection_closed_abruptly
>>
>> =INFO REPORT==== 22-Dec-2011::09:55:44 ===
>> closing TCP connection <0.335.0> from x.x.x.26:43154
>>
>> =WARNING REPORT==== 22-Dec-2011::09:55:44 ===
>> exception on TCP connection <0.467.0> from x.x.x.26:43166
>> connection_closed_abruptly
>>
>> =INFO REPORT==== 22-Dec-2011::09:55:44 ===
>> closing TCP connection <0.467.0> from x.x.x.26:43166
>> ---------------------------
>>
>> PHP clients cannot connect to RabbitMQ. When I run my test Python
>> script which uses amqplib.client_0_8, it hangs on
>> amqp.Connection(host, "guest", "guest", ssl=False)
>>
>> strace shows the following:
>>
>> connect(3, {sa_family=AF_INET, sin_port=htons(5672),
>> sin_addr=inet_addr("127.0.0.1")}, 16) = 0
>> fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
>> fcntl(3, F_SETFL, O_RDWR)               = 0
>> sendto(3, "AMQP\1\1\t\1", 8, 0, NULL, 0) = 8
>> brk(0x1461000)                          = 0x1461000
>> recvfrom(3,
>>
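
The strace above shows the symptom: connect() succeeds, the 8-byte protocol header goes out, and the client then blocks in recvfrom() with no timeout. Here is a minimal Python sketch that checks for the same condition without hanging. It uses only the standard socket module and sends the same header bytes as in the strace. The probe_amqp name and its return strings are made up for illustration.

```python
import socket

# Protocol header exactly as it appears in the strace output
# ("AMQP\1\1\t\1", where \t is byte 0x09), as sent by amqplib.
AMQP_HEADER = b"AMQP\x01\x01\x09\x01"


def probe_amqp(host: str, port: int = 5672, timeout: float = 5.0) -> str:
    """Classify how a broker reacts to the AMQP protocol header.

    Returns one of:
      "tcp_failed" - the TCP connection could not be established
      "no_reply"   - TCP is fine, but nothing came back within `timeout`
      "closed"     - the peer closed or reset the connection
      "replied"    - the peer sent some data back
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"
    with sock:
        try:
            sock.sendall(AMQP_HEADER)
            reply = sock.recv(4096)
        except socket.timeout:
            # Must be caught before OSError, which it subclasses.
            return "no_reply"
        except OSError:
            return "closed"
    return "replied" if reply else "closed"
```

Against the hung broker described here I would expect "no_reply", while a healthy broker should return "replied". Running something like this from cron would at least record when the broker stops answering.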
>> Now, I try to connect to the RabbitMQ node using 'erl':
>> $ erl -sname 'rabbit@dbx'
>> {error_logger,{{2011,12,22},{10,26,33}},"Protocol: ~p: register error:
>> ~p~n",["inet_tcp",{{badmatch,{error,duplicate_name}},[{inet_tcp_dist,listen,1},{net_kernel,start_protos,4},{net_kernel,start_protos,3},{net_kernel,init_node,2},{net_kernel,init,1},{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}]}
>> {error_logger,{{2011,12,22},{10,26,33}},crash_report,[[{initial_call,{net_kernel,init,['Argument__1']}},{pid,<0.21.0>},{registered_name,[]},{error_info,{exit,{error,badarg},[{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{ancestors,[net_sup,kernel_sup,<0.9.0>]},{messages,[]},{links,[#Port<0.68>,<0.18.0>]},{dictionary,[{longnames,false}]},{trap_exit,true},{status,running},{heap_size,377},{stack_size,24},{reductions,442}],[]]}
>> {error_logger,{{2011,12,22},{10,26,33}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfa,{net_kernel,start_link,[['rabbit@dbx',shortnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
>> {error_logger,{{2011,12,22},{10,26,33}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,shutdown},{offender,[{pid,undefined},{name,net_sup},{mfa,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
>> {error_logger,{{2011,12,22},{10,26,33}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
>> {"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}
>>
>> Crash dump was written to: erl_crash.dump
>> Kernel pid terminated (application_controller)
>> ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})
>>
>> Is there any other information that might be useful?
>>
>> On 13 December 2011 18:26, Dmitri Minaev <minaev <at> gmail.com> wrote:
>>> Thank you for the reply. Yes, a TCP connection could be established, but
>>> not an AMQP one. We generally use a PHP library, but I also tested RabbitMQ
>>> using Python amqplib. In both cases, the client side cannot get the
>>> connection.
>>>
>>> Besides the common information messages (starting/closing TCP
>>> connection), there's only one type of message in the log files:
>>>
>>> =WARNING REPORT==== 13-Dec-2011::16:56:51 ===
>>> exception on TCP connection <0.14474.173> from x.x.x.x:xxx
>>> connection_closed_abruptly
>>>
>>> But then again, these messages may be found even during normal
>>> operation, which is why I don't think they're relevant.
>>>
>>>
>>> On 13 December 2011 14:42, Simon MacMullen <simon <at> rabbitmq.com> wrote:
>>>> Hmm. I can't really say anything from your description - can you post the
>>>> logs somewhere? It's possible that your definition of "nothing unusual in
>>>> the logs" differs from mine.
>>>>
>>>> And when you say that "the server refused attempts to connect", what exactly
>>>> do you mean? You say that a TCP connection *could* be established - so does
>>>> your client hang during AMQP handshaking? Disconnect? Something else?
>>>>
>>>> Cheers, Simon
>>>>
>>>>
>>>> On 12/12/11 16:24, Dmitri Minaev wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> We have been using RabbitMQ for about a year now. From time to time I
>>>>> upgraded it and switched from one server to another. About a month ago
>>>>> the last such transition took place. I installed a new RabbitMQ (2.7) on
>>>>> a new server and our web application was reconfigured. Quite soon we faced
>>>>> new problems. After some days of stable work clients could not connect
>>>>> to RabbitMQ. I could run rabbitmqctl, list queues, kill
>>>>> connections, but the server refused attempts to connect. That is, TCP
>>>>> socket was available and telnet could connect to port 5672, but the
>>>>> AMQP connection could not be established. There was nothing unusual in
>>>>> the logs. vm_memory_high_watermark is set to 0.7 and there's still
>>>>> plenty of free memory.
>>>>>
>>>>> After a couple of such failures I tried to downgrade to 2.6.1, but the
>>>>> problem remained. The last time I disabled IPv6, but today we hit the
>>>>> same trouble again.
>>>>>
>>>>> I think I must have done something wrong when setting up the
>>>>> environment, but what could that be?
>>>>>
>>>>> OS: Ubuntu 10.04 LTS.
>>>>> 16GB RAM.
>>>>> RabbitMQ 2.6.1
>>>>> Erlang R13B03 (erts-5.7.4) (package erlang-nox from Ubuntu repository)
>>>>> Client: php-amqplib
>>>>>
>>>>
>>>>
>>>> --
>>>> Simon MacMullen
>>>> RabbitMQ, VMware
>>>> _______________________________________________
>>>> rabbitmq-discuss mailing list
>>>> rabbitmq-discuss <at> lists.rabbitmq.com
>>>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>>
>>>
>>>
>>> --
>>> With best regards,
>>> Dmitri Minaev
>>
>>
>>
>> --
>> With best regards,
>> Dmitri Minaev

-- 
With best regards,
Dmitri Minaev
