RabbitMQ Monitoring
1. Exchange Metrics
· Messages published in
· Messages published out
· Messages unroutable
2. Node Metrics
· File descriptors used
· Network sockets used
· Disk space used (the default low watermark for free disk space is 50 MB)
· Memory used (the default memory threshold is 40% of installed RAM)
· Disk I/O
NOTE: By default, when the RabbitMQ server uses more than 40% of the available RAM, it raises a memory alarm and blocks all connections that are publishing messages.
If free disk space on any one node drops below the limit, all nodes in the cluster block incoming messages. RabbitMQ nodes also refuse incoming connections when they run out of file descriptors or sockets.
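The alarm rules in the note above can be sketched in Python. This only illustrates the arithmetic and is not how RabbitMQ implements it. The function names are made up for this example, 0.4 and 50 MB are the defaults quoted above (50 MB is assumed here to mean 50,000,000 bytes), and the real readings would come from the node itself or from a monitoring agent.

```python
DEFAULT_MEMORY_WATERMARK = 0.4          # 40% of installed RAM (default quoted above)
DEFAULT_DISK_FREE_LIMIT = 50_000_000    # 50 MB, assumed to be decimal megabytes


def memory_alarm_expected(used_bytes, total_ram_bytes,
                          watermark=DEFAULT_MEMORY_WATERMARK):
    """True when memory use is above watermark * installed RAM."""
    return used_bytes > watermark * total_ram_bytes


def disk_alarm_expected(free_bytes, limit_bytes=DEFAULT_DISK_FREE_LIMIT):
    """True when free disk space has dropped below the low watermark."""
    return free_bytes < limit_bytes


def cluster_blocks_publishers(free_bytes_by_node,
                              limit_bytes=DEFAULT_DISK_FREE_LIMIT):
    """A single node under the disk limit is enough to block publishing
    on every node, so any() is the right aggregate here."""
    return any(disk_alarm_expected(free, limit_bytes)
               for free in free_bytes_by_node.values())
```

For example, on a node with 16 GiB of RAM the memory alarm would be expected once usage passes roughly 6.4 GiB.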
3. Connection Metrics
· Data/message rates
· Number of channels, connections and queues (higher counts reduce performance and eventually lead to high resource utilization)
4. Queue Metrics
· Queue depth
· Messages ready
· Messages unacknowledged (a high count consumes more memory)
· Message rates
· Number of consumers/publishers
· Consumer utilization
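As a rough illustration of how the queue metrics above could be turned into warnings, here is a minimal Python sketch. It assumes each queue is described by a dict using the field names `messages_ready`, `messages_unacknowledged` and `consumers`, as reported by the RabbitMQ management HTTP API. The function name and both thresholds are hypothetical and should be tuned per workload.

```python
def queue_warnings(queue, max_depth=10_000, max_unacked=1_000):
    """Return a list of human-readable warnings for one queue.

    `queue` is assumed to be a dict with management-API style keys;
    the thresholds are illustrative values, not RabbitMQ defaults.
    """
    warnings = []
    ready = queue.get("messages_ready", 0)
    unacked = queue.get("messages_unacknowledged", 0)
    depth = ready + unacked  # queue depth = ready + unacknowledged

    if depth > max_depth:
        warnings.append(f"queue depth {depth} exceeds {max_depth}")
    if unacked > max_unacked:
        # Unacknowledged messages are held in memory until acked.
        warnings.append(f"{unacked} unacknowledged messages exceed {max_unacked}")
    if ready > 0 and queue.get("consumers", 0) == 0:
        warnings.append("messages are ready but the queue has no consumers")
    return warnings
```

A queue with messages waiting and zero consumers is usually the first thing worth alerting on, since its depth can only grow.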
5. RabbitMQ Windows Service
· Ensure the service is up and running
· Ensure the RabbitMQ cluster status shows all 3 nodes (through the CLI)
6. Node Health Check
· Load balancer to RabbitMQ health check (port 5671 and port 5672). A collection interval of 30 or even 60 seconds is recommended.
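A load balancer health check of this kind usually amounts to a plain TCP connect. The sketch below shows the idea in Python using only the standard library. The function name is made up for this example, and any host name passed in is an assumption about your environment. It only proves the port accepts connections, not that the broker behind it is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A scheduler could call `port_is_open(node, 5672)` and `port_is_open(node, 5671)` for each node every 30 to 60 seconds, matching the interval suggested above.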
7. Node Restart
If a RabbitMQ server restart is needed, restart the nodes one at a time and confirm that each node has rejoined the cluster before moving on to the next. In exceptional situations where the servers went down unexpectedly, the last node to stop must be the first node to be started, and so on in reverse order.
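The two restart rules above can be written down as a small Python sketch. Everything here is hypothetical scaffolding: `restart_node` and `cluster_is_healthy` are callables you would supply yourself (for example wrappers around the Windows service and `rabbitmqctl cluster_status`), and the node names in the usage note are invented.

```python
def startup_order(stop_order):
    """After an unexpected outage, start nodes in the reverse of the
    order in which they stopped (last to stop is first to start)."""
    return list(reversed(stop_order))


def rolling_restart(nodes, restart_node, cluster_is_healthy):
    """Restart nodes one at a time, stopping as soon as the cluster
    fails to re-form after a restart."""
    for node in nodes:
        restart_node(node)
        if not cluster_is_healthy():
            raise RuntimeError(
                f"cluster did not re-form after restarting {node}")
```

For instance, if the nodes stopped in the order `rabbit@node1`, `rabbit@node2`, `rabbit@node3`, then `startup_order` returns them with `rabbit@node3` first.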
8. Log Monitoring
Logs are also very important in troubleshooting. Like metrics, logs can provide important clues that will help identify the root cause in case of issues.
RabbitMQ CLI tools
The RabbitMQ CLI tools below can be used for monitoring:
· rabbitmqctl for service management and general operator tasks
· rabbitmq-diagnostics for diagnostics and health checking
rabbitmqctl is the original CLI tool that ships with RabbitMQ. It supports a wide range of operations including access to node status, health checks, listing queues, connections, channels, exchanges, consumers etc.
How to Access rabbitmqctl
Step 1: Open a command prompt as administrator on the RabbitMQ node
Step 2: Navigate to C:\Program Files\RabbitMQ Server\rabbitmq_server-3.8.2\sbin
Step 3: Execute the rabbitmqctl commands from this directory
1. Health Check
✓ To check the ping response of the node
· rabbitmq-diagnostics -q ping
Expected Result: Ping succeeded
✓ To check port connectivity and run the health check
· rabbitmq-diagnostics -q check_port_connectivity && rabbitmq-diagnostics -q node_health_check
Expected Result: Successfully connected to ports <<ports>>
Health check passed
2. Check Any Alarms Raised within RabbitMQ
· rabbitmq-diagnostics -q alarms
Expected Result: Node rabbit@<<hostname>> reported no alarms, local or cluster wide
· rabbitmq-diagnostics -q check_running && rabbitmq-diagnostics -q check_local_alarms
Expected Result: RabbitMQ on node rabbit@<<hostname>> is fully booted and running
Node rabbit@<<hostname>> reported no local alarms
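The checks above can also be scripted. The sketch below is one possible way to do it in Python, relying on the fact that these `rabbitmq-diagnostics` check commands exit with a non-zero code on failure. The `run_checks` helper and its `runner` parameter are inventions for this example. On Windows the tool is a batch file, so the command name may need to be changed to the full path of `rabbitmq-diagnostics.bat` in the sbin directory.

```python
import subprocess

# Commands taken from the checks listed above.
CHECKS = [
    ["rabbitmq-diagnostics", "-q", "ping"],
    ["rabbitmq-diagnostics", "-q", "check_port_connectivity"],
    ["rabbitmq-diagnostics", "-q", "check_running"],
    ["rabbitmq-diagnostics", "-q", "check_local_alarms"],
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run each check and return the commands that failed.

    `runner` defaults to subprocess.run and can be swapped for a
    stand-in when trying the script on a machine without RabbitMQ.
    """
    failed = []
    for cmd in checks:
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode != 0:  # non-zero exit code means the check failed
            failed.append(" ".join(cmd))
    return failed
```

An empty list from `run_checks()` means every check passed. Anything else names the command to rerun by hand for details.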
3. Node status
· rabbitmqctl status
If the node is up and running, it displays node statistics including “uptime” details.
If the node is not running, it displays “Target node is not running”.
Parameters to check: file descriptors, sockets, connection count, free disk space, alarms, Erlang processes
4. To check Cluster Status
· rabbitmqctl cluster_status
If the node is up and running, it displays the cluster details, with all 3 nodes listed as running.
If the node is not running, it displays “Target node is not running”.
Parameters to check: running nodes should be 3, alarms, network partitions
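The three parameters above can be checked automatically if the status is requested as JSON (`rabbitmqctl cluster_status --formatter json`). The Python sketch below assumes the JSON contains the keys `running_nodes`, `alarms` and `partitions`. Verify those key names against the output of your own RabbitMQ version before relying on it. The function name is hypothetical.

```python
import json


def cluster_problems(status_json, expected_nodes=3):
    """Return a list of problems found in cluster_status JSON output.

    The key names used here are assumptions about the JSON layout.
    """
    status = json.loads(status_json)
    problems = []

    running = status.get("running_nodes", [])
    if len(running) != expected_nodes:
        problems.append(
            f"{len(running)} running nodes, expected {expected_nodes}")
    if status.get("alarms"):       # non-empty means at least one alarm
        problems.append("alarms reported")
    if status.get("partitions"):   # non-empty means a network partition
        problems.append("network partitions detected")
    return problems
```

An empty result corresponds to the healthy case described above: 3 running nodes, no alarms and no network partitions.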
5. To check Queue Depth
· rabbitmqctl list_queues
It displays each queue name with the count of messages on the queue (depth).
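To feed these depths into a script, the command output can be parsed. The Python sketch below assumes the output is one queue per line with the name and message count separated by a tab, possibly preceded by informational or header lines, which it skips. The function name and the sample queue names are made up for illustration.

```python
def parse_list_queues(output):
    """Parse `rabbitmqctl list_queues` text into {queue_name: depth}.

    Assumes tab-separated "name<TAB>messages" rows; any line that does
    not match (banners, the column header) is ignored.
    """
    depths = {}
    for line in output.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # not a "name<TAB>count" data row
        depths[parts[0]] = int(parts[1])
    return depths


sample = (
    "Listing queues for vhost / ...\n"
    "name\tmessages\n"
    "orders\t12\n"
    "emails\t0\n"
)
print(parse_list_queues(sample))
```

The resulting dict can then be compared against a depth threshold for each queue.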