RabbitMQ connections blocking but memory is below watermark

We use RabbitMQ in our application. Two hours ago, one of our app servers blocked when trying to connect to RabbitMQ. After checking the RabbitMQ server, we found that one node's memory was over the watermark, and a few minutes later this node went down. After restarting the node, the whole cluster seems to work fine, but I notice a lot of connections in the blocking and blocked states in the web management UI, while running rabbitmqctl list_connections pid name peer_address state on all nodes shows no connection in the blocking/blocked state. This really confuses me:

  1. Why can't my application connect to the RabbitMQ cluster when only one node is over the watermark and the other nodes are working fine? PS: we use spring.amqp & spring-rabbit with version 1.1.0.RELEASE
  2. Why would a node go down when it is over the watermark?
  3. Why are there still blocking connections after restarting the node, when rabbitmqctl shows them all in the running state?

Here are some logs from my RabbitMQ server:

=INFO REPORT==== 1-Mar-2013::19:36:21 ===
vm_memory_high_watermark clear. Memory used:1656590680 allowed:1658778419

=INFO REPORT==== 1-Mar-2013::19:36:21 ===
alarm_handler: {clear,{resource_limit,memory,rabbit@cos22}}
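
To make the numbers in this log entry easier to read, here is a minimal Python sketch that extracts the used and allowed byte counts and reports the remaining headroom. The function name `parse_watermark_line` is hypothetical, and the `set` variant of the message is assumed to share the format of the `clear` line shown above.

```python
import re

# Pattern for the payload of the vm_memory_high_watermark log line.
# Assumption: the "set" message uses the same layout as the "clear" one.
WATERMARK_RE = re.compile(
    r"vm_memory_high_watermark (?P<event>set|clear)\. "
    r"Memory used:(?P<used>\d+) allowed:(?P<allowed>\d+)"
)


def parse_watermark_line(line):
    """Return event, used, allowed and headroom (bytes), or None if no match."""
    match = WATERMARK_RE.search(line)
    if match is None:
        return None
    used = int(match.group("used"))
    allowed = int(match.group("allowed"))
    return {
        "event": match.group("event"),
        "used": used,
        "allowed": allowed,
        "headroom": allowed - used,  # bytes left before the alarm would fire
    }


if __name__ == "__main__":
    line = "vm_memory_high_watermark clear. Memory used:1656590680 allowed:1658778419"
    print(parse_watermark_line(line))
```

For the line above, the headroom works out to roughly 2 MB, so the alarm cleared while the node was still very close to its limit.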

When I try to close a blocked connection from the web management UI, it fails with an error:

=INFO REPORT==== 1-Mar-2013::20:55:24 ===
Closing connection <0.17197.115> because "Closed via management plugin"

=ERROR REPORT==== 1-Mar-2013::20:55:24 ===
webmachine error: path="/api/connections/10.64.13.200%3A45891%20-%3E%2010.64.12.226%3A5672"
{throw,
{error,{not_a_connection_pid,<0.17197.115>}},
[{rabbit_networking,close_connection,2,
     [{file,"src/rabbit_networking.erl"},{line,317}]},
 {rabbit_mgmt_wm_connection,delete_resource,2,
     [{file,"rabbitmq-management/src/rabbit_mgmt_wm_connection.erl"},
      {line,52}]},
 {webmachine_resource,resource_call,3,
     [{file,
          "webmachine-wrapper/webmachine-git/src/webmachine_resource.erl"},
      {line,169}]},
 {webmachine_resource,do,3,
     [{file,
          "webmachine-wrapper/webmachine-git/src/webmachine_resource.erl"},
      {line,128}]},
 {webmachine_decision_core,resource_call,1,
     [{file,
          "webmachine-wrapper/webmachine-git/src/webmachine_decision_core.erl"},
      {line,48}]},
 {webmachine_decision_core,decision,1,
     [{file,
          "webmachine-wrapper/webmachine-git/src/webmachine_decision_core.erl"},
      {line,416}]},
 {webmachine_decision_core,handle_request,2,
     [{file,
          "webmachine-wrapper/webmachine-git/src/webmachine_decision_core.erl"},
      {line,33}]},
 {rabbit_webmachine,'-makeloop/1-fun-0-',3,
     [{file,"rabbitmq-mochiweb/src/rabbit_webmachine.erl"},{line,75}]}]}
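
The path in the webmachine error is the connection name, percent-encoded, appended to /api/connections/. As a small sketch (the helper name `connection_api_path` is hypothetical), the same path can be rebuilt in Python, which helps when checking a specific connection against the management HTTP API by hand:

```python
from urllib.parse import quote, unquote


def connection_api_path(name):
    """Build the management API path for a connection name.

    safe="" forces every reserved character (":", " ", ">") to be encoded,
    matching the path seen in the error report.
    """
    return "/api/connections/" + quote(name, safe="")


if __name__ == "__main__":
    name = "10.64.13.200:45891 -> 10.64.12.226:5672"
    path = connection_api_path(name)
    print(path)
    # Decoding the last path segment gives back the original name.
    print(unquote(path.rsplit("/", 1)[1]))
```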

rabbitmqctl shows all connections in the running state:

rabbitmqctl list_connections pid name peer_address state
Listing connections ...
<[email protected]>        10.64.13.197:57321 -> 10.64.12.225:5672 10.64.13.197    running
<[email protected]>        10.64.13.196:57240 -> 10.64.12.225:5672 10.64.13.196    running
<[email protected]>        10.64.12.196:58608 -> 10.64.12.225:5672 10.64.12.196    running
<[email protected]>        10.64.11.235:48962 -> 10.64.12.225:5672 10.64.11.235    running
<[email protected]>        10.64.13.228:49857 -> 10.64.12.225:5672 10.64.13.228    running
<[email protected]>        10.64.11.193:36387 -> 10.64.12.225:5672 10.64.11.193    running
<[email protected]>        10.64.10.123:52017 -> 10.64.12.225:5672 10.64.10.123    running
<[email protected]>       10.64.12.247:38504 -> 10.64.12.225:5672 10.64.12.247    running
<[email protected]>        10.64.10.29:51483 -> 10.64.12.225:5672  10.64.10.29     running
<[email protected]>        10.64.11.234:50244 -> 10.64.12.225:5672 10.64.11.234    running
<[email protected]>        10.64.11.178:33795 -> 10.64.12.225:5672 10.64.11.178    running
<[email protected]>        10.64.10.28:39557 -> 10.64.12.225:5672  10.64.10.28     running
<[email protected]>        10.64.13.233:38766 -> 10.64.12.225:5672 10.64.13.233    running
<[email protected]>        10.64.13.229:50932 -> 10.64.12.225:5672 10.64.13.229    running
<[email protected]>        10.64.13.241:49311 -> 10.64.12.225:5672 10.64.13.241    running
<[email protected]>        10.64.11.195:39455 -> 10.64.12.225:5672 10.64.11.195    running
<[email protected]>        10.64.10.27:58938 -> 10.64.12.225:5672  10.64.10.27     running
<[email protected]>        10.64.13.240:37777 -> 10.64.12.225:5672 10.64.13.240    running
<[email protected]>        10.64.10.130:37251 -> 10.64.12.225:5672 10.64.10.130    running
<[email protected]> 10.64.13.200:54840 -> 10.64.12.226:5672 10.64.13.200    running
...done.
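
With many connections, checking the state column by eye is error-prone. The following is a rough Python sketch that parses output in the shape shown above and counts connections per state. The function names are hypothetical, and it assumes the columns are exactly pid, name, peer_address, state, with the name always of the form `client -> server`.

```python
from collections import Counter


def parse_list_connections(output):
    """Parse `rabbitmqctl list_connections pid name peer_address state` output."""
    rows = []
    for line in output.splitlines():
        tokens = line.split()
        # Skip blank lines and the "Listing connections ..." / "...done." banners;
        # a data row has "->" as the fourth token from the end.
        if len(tokens) < 6 or tokens[-4] != "->":
            continue
        rows.append({
            "pid": " ".join(tokens[:-5]),       # everything before the name
            "name": " ".join(tokens[-5:-2]),    # "client -> server"
            "peer_address": tokens[-2],
            "state": tokens[-1],
        })
    return rows


def count_states(rows):
    """Return a Counter mapping each connection state to its number of rows."""
    return Counter(row["state"] for row in rows)


if __name__ == "__main__":
    # "<pid-a>" is a placeholder pid, not real broker output.
    sample = (
        "Listing connections ...\n"
        "<pid-a>\t10.64.13.197:57321 -> 10.64.12.225:5672\t10.64.13.197\trunning\n"
        "...done.\n"
    )
    print(count_states(parse_list_connections(sample)))
```

Any state other than running in the resulting counts would point at a connection that the broker itself considers blocking or blocked.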

There is also a connection with a lot of channels in the blocked state, but I can't find this connection using rabbitmqctl list_connections:

AMQP 0-9-1  10.64.13.200:45891 -> 10.64.12.226:5672  rabbit@cos22  0B/s (49.2MB total)  0B/s (2.4MB total)  0s  60920

Thanks a lot for any help and suggestions.



Solution 1:[1]

Got an answer from the RabbitMQ mailing list:

These connections / channels do not exist. You're seeing a bug in the management plugin where it will retain information about connections and channels that were alive on a cluster node when it crashed.

This bug was fixed in RabbitMQ 3.0.3.
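
On an affected version, one way to spot such leftover entries is to compare what the management API reports with what the broker itself lists. The sketch below is only an illustration under assumptions: the function name `stale_management_entries` is hypothetical, the JSON text is assumed to be the body of a GET /api/connections response fetched separately, and each entry is assumed to carry `name` and `state` fields. The live names are assumed to come from rabbitmqctl list_connections name run on every node.

```python
import json


def stale_management_entries(api_connections_json, live_names):
    """Return management entries whose name is not among the live connections.

    api_connections_json: JSON text, assumed to be a list of connection objects.
    live_names: connection names reported by rabbitmqctl across all nodes.
    """
    live = set(live_names)
    return [
        entry
        for entry in json.loads(api_connections_json)
        if entry.get("name") not in live
    ]


if __name__ == "__main__":
    # Hand-written sample data using connection names from the question.
    api_body = json.dumps([
        {"name": "10.64.13.200:45891 -> 10.64.12.226:5672", "state": "blocked"},
        {"name": "10.64.13.200:54840 -> 10.64.12.226:5672", "state": "running"},
    ])
    live = ["10.64.13.200:54840 -> 10.64.12.226:5672"]
    for entry in stale_management_entries(api_body, live):
        print(entry["name"], entry["state"])
```

Entries returned by this comparison are candidates for the retained records described in the answer; they are not proof on their own, since a connection can also open or close between the two listings.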

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 j0k