Saturday, October 17, 2020

 RABBITMQ Monitoring with Prometheus-Grafana and ELK stack


Prometheus Installation

Step 1 : Download the latest Prometheus release from the URL below

·         https://github.com/prometheus/prometheus/releases/download/v2.16.0/prometheus-2.16.0.windows-amd64.tar.gz

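Step 2 : Configure Prometheus for RabbitMQ. Extract the archive and locate “prometheus.yml”

·         Modify the config parameters in prometheus.yml as below. Port 15692 is served by the rabbitmq_prometheus plugin, so enable it on each RabbitMQ node first (rabbitmq-plugins enable rabbitmq_prometheus)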

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['<IP1>:9090']

  - job_name: 'rabbitmq'
    static_configs:
      - targets: ['<IP1>:15692']
      - targets: ['<IP2>:15692']

·         Start Prometheus by clicking prometheus.exe

·         Access the Prometheus Dashboard using the below link

http://<IP1>:9090/
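Before pointing Prometheus at a node, it can help to confirm that the node really serves metrics on port 15692. The following is a minimal Python sketch, not part of the original setup: it assumes the plugin's default /metrics path, and the function names and the sample metric values are illustrative only.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped. Assumes
    each sample line is "<series> <value>" with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_metrics(host, port=15692, timeout=5):
    """Fetch and parse /metrics from one RabbitMQ node (needs network access)."""
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Illustrative sample, not real output from a node.
SAMPLE = """\
# TYPE rabbitmq_connections gauge
rabbitmq_connections 4
# TYPE rabbitmq_queue_messages_ready gauge
rabbitmq_queue_messages_ready 12
"""

if __name__ == "__main__":
    print(parse_metrics(SAMPLE))
```

Calling fetch_metrics("<IP1>") against a live node should return a non-empty dictionary; an error or an empty result usually means the plugin is not enabled or the port is blocked.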


Grafana Installation

Step 1 : Download the latest Grafana from the URL below

·         https://dl.grafana.com/oss/release/grafana-6.6.2.windows-amd64.zip

Step 2 : Unzip the file and locate “defaults.ini”. Modify defaults.ini file with IP as below

domain = <ip address>

·         Start Grafana with grafana-server.exe

·         Access the Grafana Dashboard using the below link

http://<ip address>:3000/

username:admin

password:admin

·         Add Prometheus as the data source and set its URL to the correct Prometheus IP and port

·         Free Dashboards can be downloaded from

o   https://grafana.com/grafana/dashboards?dataSource=prometheus&category=rabbitmq

 

To Run Grafana and Prometheus as a Service

Step 1 : Stop Grafana and Prometheus

Step 2 : Download NSSM from https://nssm.cc/download

Step 3 : From the zip file, copy “win64/nssm.exe” into both the Prometheus and Grafana folders

For Grafana

Step 1: From the command prompt:

·         nssm install grafana <exe file name with full location>

Step 2: Start the service from "services"

For Prometheus

Step 1: From the command prompt:

·         nssm install prometheus <exe file name with full location>

Step 2: Start the service from "services"

NOTE: Use the below commands to open the NSSM edit interface for a service (pass the service name)

·         nssm edit prometheus

·         nssm edit grafana

 

RABBITMQ with ELK stack Integration

Step 1: Download the latest Metricbeat Windows zip file from the Download page.

https://www.elastic.co/downloads/beats/metricbeat

Step 2: Extract the contents of the zip file into C:\Program Files.

Step 3: Rename the metricbeat-7.6.0-windows directory to Metricbeat.

Step 4: Open a PowerShell prompt as an Administrator and run the following commands to install Metricbeat as a Windows service.

·         cd "C:\Program Files\Metricbeat"

·         .\install-service-metricbeat.ps1

Step 5: Modify the settings under output.elasticsearch in the C:\Program Files\Metricbeat\metricbeat.yml file to point to your Elasticsearch installation.

output.elasticsearch:
  hosts: ["<es_url>"]
  username: "elastic"
  password: "<password>"

setup.kibana:
  host: "<kibana_url>"

 

Step 6: From the C:\Program Files\Metricbeat folder, run:

·         .\metricbeat.exe modules enable rabbitmq

Step 7 : Modify the settings in the modules.d/rabbitmq.yml file.

- module: rabbitmq
  metricsets:
    - node
    - queue
    - connection
  period: 10s
  hosts: ["localhost:15672"]

Step 8 :The setup command loads the Kibana dashboards. If the dashboards are already set up, omit this command.

·         .\metricbeat.exe setup

·         Start-Service metricbeat

Step 9: Check in Kibana that data is being received from the Metricbeat rabbitmq module

http://<ip address>:5601/

NOTE: Use the default dashboard in kibana to visualize the data.


RabbitMQ Monitoring


1. Exchange metrics

❖  Messages published in

❖  Messages published out

❖  Messages unroutable

2. Node Metrics

❖  File descriptors used

❖  Network sockets used

❖  Free disk space (default low watermark is 50MB)

❖  Memory used (default memory threshold is 40% of installed RAM)

❖  Disk I/O

NOTE: By default, when the RabbitMQ server uses more than 40% of the available RAM, it raises a memory alarm and blocks all connections that are publishing messages.

If free disk space on any one node drops below the limit, all nodes in the cluster block incoming messages. RabbitMQ nodes also refuse new incoming connections when they run out of file descriptors or sockets.
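As a worked example of the two defaults above, the sketch below computes the point at which the memory alarm would fire and checks a free-disk figure against the low watermark. It is an illustration only: the function names are made up for this post, 50MB is treated as 50,000,000 bytes, and the defaults are the ones quoted above (in rabbitmq.conf they correspond to the vm_memory_high_watermark and disk_free_limit settings).

```python
GIB = 1024 ** 3


def memory_alarm_threshold(total_ram_bytes, watermark=0.4):
    """Bytes of RAM use above which the memory alarm is raised."""
    return int(total_ram_bytes * watermark)


def disk_alarm(free_disk_bytes, limit_bytes=50_000_000):
    """True when free disk space has fallen below the low watermark."""
    return free_disk_bytes < limit_bytes


if __name__ == "__main__":
    # A node with 16 GiB of RAM and 30 MB of free disk (example figures).
    threshold = memory_alarm_threshold(16 * GIB)
    print(f"memory alarm above {threshold / GIB:.1f} GiB")
    print("disk alarm:", disk_alarm(free_disk_bytes=30_000_000))
```

For a 16 GiB server the threshold works out to 6.4 GiB, which is a useful number to put on a Grafana panel as an alert line.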

3. Connection Metrics

❖  Data/Message Rates

❖  Number of channels, connections and queues (high counts reduce performance and eventually lead to high resource utilization)

 4. Queue Metrics

❖  Queue depth

❖  Messages unacknowledged (a higher count consumes more memory)

❖  Messages ready

❖  Message rates

❖  Number of consumers/publishers

❖  Consumer utilization
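The queue metrics above can be turned into a simple check. The sketch below assumes queue data shaped like the JSON returned by the management API (GET /api/queues on port 15672), using its name, messages_ready, messages_unacknowledged and consumers fields. The thresholds and the function name are hypothetical and should be tuned per application.

```python
def flag_queues(queues, max_ready=1000, max_unacked=500):
    """Return {queue_name: [reasons]} for queues that need attention."""
    flagged = {}
    for queue in queues:
        reasons = []
        if queue.get("messages_ready", 0) > max_ready:
            reasons.append("deep backlog")
        if queue.get("messages_unacknowledged", 0) > max_unacked:
            reasons.append("many unacknowledged messages")
        if queue.get("consumers", 0) == 0:
            reasons.append("no consumers")
        if reasons:
            flagged[queue["name"]] = reasons
    return flagged


# Hand-written sample data, not output from a real broker.
SAMPLE_QUEUES = [
    {"name": "orders", "messages_ready": 5, "messages_unacknowledged": 2, "consumers": 3},
    {"name": "emails", "messages_ready": 4200, "messages_unacknowledged": 0, "consumers": 0},
]

if __name__ == "__main__":
    for name, reasons in flag_queues(SAMPLE_QUEUES).items():
        print(name, "->", ", ".join(reasons))
```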

 5. Rabbitmq windows service

❖  Ensure it is up and running

❖  Ensure rabbitmq cluster status shows all 3 nodes (through the CLI)

6. Node Health Check up

❖  Load balancer to Rabbitmq health checkup (port 5671 and port 5672). A collection interval of 30 or even 60 seconds is recommended.

 

7. Node Restart

If a Rabbitmq server restart is needed, restart the nodes one at a time and confirm after each restart that the node has rejoined the cluster. In exceptional situations where the servers went down unexpectedly, the last node to stop must be the first node to be started, and so on.
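The restart rule can be written down as a tiny helper. This sketch reads "and so on" as "start the nodes in the reverse of the order in which they stopped", which is an interpretation; the node names are placeholders.

```python
def start_order(stop_order):
    """Given node names in the order they stopped, return the start order."""
    return list(reversed(stop_order))


if __name__ == "__main__":
    stopped = ["rabbit@node1", "rabbit@node2", "rabbit@node3"]
    print(start_order(stopped))
```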

8. Log Monitoring

Logs are also very important in troubleshooting. Like metrics, logs can provide important clues that will help identify the root cause in case of issues. 


RabbitMQ CLI tools

The following rabbitmq CLI tools can be used for monitoring

·         rabbitmqctl for service management and general operator tasks

·         rabbitmq-diagnostics for diagnostics and health checking

rabbitmqctl is the original CLI tool that ships with RabbitMQ. It supports a wide range of operations including access to node status, health checks, listing queues, connections, channels, exchanges, consumers etc.

 

How to Access rabbitmqctl

Step 1: Open command prompt as administrator on the rabbitmq node

Step 2: Navigate to C:\Program Files\RabbitMQ Server\rabbitmq_server-3.8.2\sbin

Step 3: Execute rabbitmqctl commands here


1.      Health Check

✓  To check ping response for the node

·         rabbitmq-diagnostics -q ping

Expected Result: Ping succeeded

✓  To check port connectivity and run the node health check

·         rabbitmq-diagnostics -q check_port_connectivity && rabbitmq-diagnostics -q node_health_check

Expected Result: Successfully connected to ports <<ports>>

Health check passed

 

2.      Check Any Alarms Raised within RabbitMQ

·         rabbitmq-diagnostics -q alarms

Expected Result: Node rabbit@<<hostname>> reported no alarms, local or cluster wide

·         rabbitmq-diagnostics -q check_running && rabbitmq-diagnostics -q check_local_alarms

Expected Result: RabbitMQ on node rabbit@<<hostname>> is fully booted and running

Node rabbit@<<hostname>> reported no local alarms

 

3.      Node status

·         rabbitmqctl status

If node is up and running, it will display node statistics with “uptime” details

If node is not running, it will display, “Target node is not running”

Parameters to check: File Descriptors, Sockets, Connection count, free disk space, Alarms, Erlang Processes

 

4.      To check Cluster Status

·         rabbitmqctl cluster_status

If the node is up and running, it will display the cluster details, with all 3 nodes listed under running nodes

If node is not running, it will display, “Target node is not running”

Parameters to check: Running nodes should be 3, Alarms, Network Partitions
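The three parameters above can also be checked by a script. The sketch below assumes a RabbitMQ version whose CLI accepts "rabbitmqctl cluster_status --formatter json" and that the JSON contains running_nodes, partitions and alarms keys; verify both assumptions against the output of your own installation. The sample document is hand-written.

```python
import json


def check_cluster(status_json, expected_nodes=3):
    """Return a list of problems found in a cluster_status JSON document."""
    status = json.loads(status_json)
    problems = []
    running = status.get("running_nodes", [])
    if len(running) != expected_nodes:
        problems.append(f"{len(running)} of {expected_nodes} nodes running")
    if status.get("partitions"):
        problems.append("network partition detected")
    if status.get("alarms"):
        problems.append("alarms raised")
    return problems


# Hand-written sample; key names are assumptions, see the note above.
SAMPLE_STATUS = json.dumps({
    "running_nodes": ["rabbit@node1", "rabbit@node2"],
    "partitions": [],
    "alarms": [],
})

if __name__ == "__main__":
    print(check_cluster(SAMPLE_STATUS) or "cluster looks healthy")
```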

 

5.      To check Queue Depth

·         rabbitmqctl list_queues

It will display queue name with the count of messages on the queue (depth)
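The health checks in this section can be run together from a script, relying on the exit code of each rabbitmq-diagnostics command (non-zero on failure). This is a sketch under assumptions: it expects the sbin directory to be on PATH, uses the Windows batch file name rabbitmq-diagnostics.bat, and the function name and runner parameter are made up for illustration.

```python
import subprocess

# Checks taken from the commands shown above.
CHECKS = ("ping", "check_port_connectivity", "check_running", "check_local_alarms")


def run_checks(checks=CHECKS, tool="rabbitmq-diagnostics.bat", runner=subprocess.run):
    """Run each check and return the names of the ones that failed."""
    failed = []
    for check in checks:
        result = runner([tool, "-q", check], capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(check)
    return failed


if __name__ == "__main__":
    failures = run_checks()
    print("all checks passed" if not failures else f"failed: {failures}")
```

The runner parameter exists only so the function can be exercised without a broker; in normal use the default subprocess.run is what executes the commands.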


 

RABBITMQ CLUSTER(3 NODE) INSTALLATION - COMPLETE SETUP ON WINDOWS



RabbitMQ is a message broker, a tool for implementing a messaging architecture. Some parts of your application publish messages, others consume them, and RabbitMQ routes them between producers and consumers.

• Producer: Application that sends the messages.

• Consumer: Application that receives the messages.

• Queue: Buffer that stores messages.

• Message: Information that is sent from the producer to a consumer through RabbitMQ.

• Connection: A TCP connection between your application and the RabbitMQ broker.

• Channel: A virtual connection inside a connection. When publishing or consuming messages from a queue - it's all done over a channel.

• Exchange: Receives messages from producers and pushes them to queues depending on rules defined by the exchange type. To receive messages, a queue needs to be bound to at least one exchange.

• Binding: A binding is a link between a queue and an exchange.

• Routing key: A key that the exchange looks at to decide how to route the message to queues. Think of the routing key like an address for the message.

• AMQP: Advanced Message Queuing Protocol is the protocol used by RabbitMQ for messaging.

• Users: It is possible to connect to RabbitMQ with a given username and password. Every user can be assigned permissions such as rights to read, write and configure privileges within the instance. Users can also be assigned permissions for specific virtual hosts.

• Vhost, virtual host: Provides a way to segregate applications using the same RabbitMQ instance. Different users can have different permissions on different vhosts, and queues and exchanges are created so that they exist in only one vhost.

• RabbitMQ cluster: Multiple Rabbitmq nodes can join together as a cluster, which can accept requests through a load balancer so that the load is spread across the nodes.
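To make the exchange, binding and routing key terms concrete, here is a toy in-memory model of a direct exchange. It is not the RabbitMQ client API and does not talk to a broker; the class and queue names are invented purely to show how a routing key selects the bound queues.

```python
from collections import defaultdict


class DirectExchange:
    """Toy direct exchange: delivers to queues bound with the exact routing key."""

    def __init__(self):
        self.bindings = defaultdict(list)  # routing key -> list of bound queues

    def bind(self, queue, routing_key):
        """Create a binding between a queue (a plain list here) and this exchange."""
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        """Route a message; returns how many queues received it (0 = unroutable)."""
        queues = self.bindings.get(routing_key, [])
        for queue in queues:
            queue.append(message)
        return len(queues)


if __name__ == "__main__":
    orders_queue = []
    exchange = DirectExchange()
    exchange.bind(orders_queue, "order.created")
    print(exchange.publish("order.created", "order #1"))
    print(exchange.publish("order.cancelled", "order #2"))
    print(orders_queue)
```

A publish that returns 0 here corresponds to what the monitoring section calls an unroutable message: the exchange had no binding for that routing key.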


PREREQUISITES

·         Rabbitmq 3.8.2 (https://github.com/rabbitmq/rabbitmq-server/releases/download/v3.8.2/rabbitmq-server-3.8.2.exe)

·         Erlang OTP 22.2 (https://erlang.org/download/otp_win64_22.2.exe)

·         Handle v4.22 (https://docs.microsoft.com/en-us/sysinternals/downloads/handle)

·         Win64OpenSSL-1_1_1g (https://slproweb.com/download/Win64OpenSSL-1_1_1g.exe)

·         Sample Rabbitmq conf file (https://github.com/rabbitmq/rabbitmq-server/blob/master/docs/rabbitmq.conf.example)

·         Exported SSL certificate files (.pfx format, including the private key) for the corresponding rabbitmq nodes.

INSTALLATION STEPS

I.                   INSTALL RABBITMQ AS STANDALONE ON ALL 3 NODES

Step 1: Install Erlang OTP (run as administrator)







 Step 2 : Set erlang environment variable

a).  Go to Start > Settings > Control Panel > System > Advanced > Environment Variables

b).  Add a new entry under system variables

c).  E.g., set variable name = ERLANG_HOME and variable value = C:\Program Files\erl10.6

d).  You can also set it from the command prompt with “set ERLANG_HOME=C:\Program Files\erl10.6” (this applies to the current command prompt session only)

e).  To check the value from the command prompt, execute “echo %ERLANG_HOME%”

 

Erlang Cookie: The Erlang cookie is a shared secret key used for authentication between RabbitMQ nodes and CLI tools (Command Line Interface). On Windows the Erlang cookie file (.erlang.cookie) can be present in two locations. Make sure the cookie value is the same in both. Otherwise, copy it from the system profile to the user directory.

✓  C:\Windows\System32\config\systemprofile

✓  C:\Users\%USERNAME%\

 

Step 3 : Install Rabbitmq ( Run as Administrator )

 




Step 4 : Reboot machine

Step 5 : After logging in with the same Windows user, go to Start > RabbitMQ Server > RabbitMQ Command Prompt (sbin dir)

✓  Check Rabbitmq status:

o   rabbitmqctl status

✓  Enable the rabbitmq management interface:

o   rabbitmq-plugins enable rabbitmq_management

Step 6 : Once the plugin gets enabled, the management interface can be accessed with the below url using default guest user (u/p:guest/guest)

http://localhost:15672

II.                 GENERATE RABBITMQ CERTIFICATE  ON ALL 3 NODES

Step 1 : Get the exported rabbitmq server ssl certificates in pfx format from the client (ensure to get passphrase also)

Step 2 : Install openssl on the server

 






 Step 3 : Go to “C:\Program Files\OpenSSL-Win64\bin” and execute “openssl.exe” (as administrator)


Step 4 : Create a directory on ‘C’ Drive for certificates as “certs” and move the client certificate over there.

Step 5 : Execute the below commands at the OpenSSL prompt, after modifying the file names, to generate .pem files from the client pfx certificate file.

Only private key:  pkcs12 -in C:\certs\<<cert file name>>.pfx -nocerts -out C:\certs\privatekey.pem

Only certificate:  pkcs12 -in C:\certs\<<cert file name>>.pfx -clcerts -nokeys -out C:\certs\cert.pem

Only CA certificate:  pkcs12 -in C:\certs\<<cert file name>>.pfx -nokeys -cacerts -out C:\certs\cacert.pem

 

III.              RUN RABBITMQ WITH TLS 1.2  ON ALL 3 NODES

Once Rabbitmq is installed, it keeps its data directory under the home directory of the Windows user who installed it. The location will be “C:\Users\<<username>>\AppData\Roaming\RabbitMQ”

Step 1 : Create a conf file named rabbitmq.conf (extension .conf) within "C:\Users\<<username>>\AppData\Roaming\RabbitMQ"

Step 2 : Paste the contents from the sample file: https://github.com/rabbitmq/rabbitmq-server/blob/v3.7.x/docs/rabbitmq.conf.example

Step 3 : Uncomment or modify the entries below within the corresponding sections (for SSL and password validation)

Networking

----------

management.tcp.port       = 15672

listeners.ssl.default = 5671

 

Security, Access Control

------------------------------

ssl_options.verify               = verify_none
ssl_options.fail_if_no_peer_cert = false
ssl_options.cacertfile           = C:\\certs\\cacert.pem
ssl_options.certfile             = C:\\certs\\cert.pem
ssl_options.keyfile              = C:\\certs\\privatekey.pem
ssl_options.password             = <<certificate password>>
ssl_options.versions.1           = tlsv1.2
ssl_options.client_renegotiation = false
ssl_options.secure_renegotiate   = true

 

Default User / VHost

--------------------------

 credential_validator.validation_backend = rabbit_credential_validator_password_regexp

 credential_validator.regexp = ^[a-zA-Z0-9$@]{8,20}
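It is worth checking what the password regexp above actually accepts. The sketch below uses Python's re module as a stand-in for the Erlang regex engine and assumes the validator performs an unanchored search, so treat the results as indicative and confirm them on a test broker. The function name and the sample passwords are illustrative.

```python
import re

# Same pattern as credential_validator.regexp above.
PATTERN = re.compile(r"^[a-zA-Z0-9$@]{8,20}")


def accepted(password):
    """True when the pattern matches at the start of the password."""
    return PATTERN.search(password) is not None


if __name__ == "__main__":
    for candidate in ["Passw0rd", "short1", "bad pass", "Passw0rd!!"]:
        print(candidate, "->", accepted(candidate))
```

Because the pattern has no trailing $ anchor, only the first 8 to 20 characters are constrained, so a value such as Passw0rd!! still matches. Appending $ to the pattern would make it apply to the whole password, if that is the intent.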

 

Step 4 : If the load balancer on the client premises supports the proxy protocol, also uncomment the below entry within the conf file. This will display the application IP address instead of the load balancer IP within the management interface dashboard.

 

Misc/Advanced Options

---------------------

proxy_protocol = true

 

Step 5 : Set the Rabbitmq environment variables for Rabbitmq Base and Config

RABBITMQ_CONFIG_FILE=C:\Users\<<username>>\AppData\Roaming\RabbitMQ\rabbitmq.conf

RABBITMQ_BASE=C:\Users\<<username>>\AppData\Roaming\RabbitMQ

Note (1): username is the Windows user under which rabbitmq is installed and running. Ensure the rabbitmq config file has the .conf file type. RABBITMQ_BASE must point to a directory.

Note (2): Since we added a config file, we need to reinstall the Windows service so that Rabbitmq loads it.

Step 6 : Go to Start > RabbitMQ Server > RabbitMQ Command Prompt (sbin dir)

✓  Stop Rabbitmq service

o   rabbitmq-service stop

✓  Remove Rabbitmq service

o   rabbitmq-service remove

Step 7 : Reboot Machine

Step 8 : After logging in with the same Windows user, go to Start > RabbitMQ Server > RabbitMQ Command Prompt (sbin dir)

✓  Install Rabbitmq service

o   rabbitmq-service install

✓  Start Rabbitmq service

o   rabbitmq-service start

✓  Check Rabbitmq status

o   rabbitmqctl status

NOTE : Repeat the same steps on the other two Rabbitmq nodes. On those nodes, after step 6 and before reinstalling the rabbitmq service, copy the Erlang cookie file from the first server and paste it into the two locations listed in the Erlang Cookie section above.
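Once a node is back up with the TLS listener on port 5671, a quick client-side check can confirm that TLS 1.2 is negotiated. The following is a sketch using Python's standard ssl module: the function names are illustrative, the CA file would be the cacert.pem generated earlier, and it assumes the server certificate matches the host name used to connect (otherwise verification fails).

```python
import socket
import ssl


def make_tls12_context(cafile=None):
    """Client context that verifies the server and only allows TLS 1.2."""
    context = ssl.create_default_context(cafile=cafile)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.maximum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_version(host, port=5671, cafile=None, timeout=5):
    """Connect to the TLS listener and return the protocol version in use."""
    context = make_tls12_context(cafile)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()


if __name__ == "__main__":
    # Placeholder host name and CA path; replace with your own values.
    print(negotiated_version("<<hostname>>", cafile=r"C:\certs\cacert.pem"))
```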

IV.              SET UP RABBITMQ CLUSTER

Once rabbitmq is installed on the other two nodes as well, we can proceed to configure the cluster. At this point all three nodes are running as standalone rabbitmq servers.

Step 1 : Stop rabbitmq service on the first node.

o   rabbitmq-service stop

Step 2 : Start the rabbitmq server in detached mode

o   rabbitmq-server -detached

Step 3 : Do steps 1 and 2 on the other two nodes

Step 4 : Now all 3 nodes are running as detached rabbitmq servers

Step 5 : On the second Rabbitmq node, perform the below steps

✓  Stop rabbitmq application

o   rabbitmqctl stop_app

✓  Reset the rabbitmq node

o   rabbitmqctl reset

✓  Join the node with the first node in the cluster (<<HOSTNAME>> is the host name of the first node)

o   rabbitmqctl join_cluster rabbit@<<HOSTNAME>>

✓  Start rabbitmq application again

o   rabbitmqctl start_app

Step 6 : Check cluster status on the first rabbitmq server

o   rabbitmqctl cluster_status

Step 7 : On the third Rabbitmq node, perform the below steps

✓  Stop rabbitmq application

o   rabbitmqctl stop_app

✓  Reset the rabbitmq node

o   rabbitmqctl reset

✓  Join the node with the first node in the cluster (<<HOSTNAME>> is the host name of the first node)

o   rabbitmqctl join_cluster rabbit@<<HOSTNAME>>

✓  Start rabbitmq application again

o   rabbitmqctl start_app

Step 8 : Check cluster status on the first rabbitmq server

o   rabbitmqctl cluster_status

Note : Normally, the cluster name will be the rabbitmq server name of the first node. The names can be seen either in the output of the rabbitmqctl status command or within the management interface.

Step 9 : If all the rabbitmq servers are up and running as a cluster, we can proceed to start all the nodes as Windows services.

Step 10 : Stop rabbitmq application and rabbitmq server on the first node and start as service.

✓  Stop rabbitmq application

o   rabbitmqctl stop_app

✓  Stop rabbitmq server

o   rabbitmqctl stop

✓  Start rabbitmq service

o   rabbitmq-service start

✓  Check rabbitmq cluster status

o   rabbitmqctl cluster_status

Step 11 : Stop rabbitmq application and rabbitmq server on the second node and start as service.

✓  Stop rabbitmq application

o   rabbitmqctl stop_app

✓  Stop rabbitmq server

o   rabbitmqctl stop

✓  Start rabbitmq service

o   rabbitmq-service start

✓  Check rabbitmq cluster status

o   rabbitmqctl cluster_status

Step 12 : Stop rabbitmq application and rabbitmq server on the third node and start as service.

✓  Stop rabbitmq application

o   rabbitmqctl stop_app

✓  Stop rabbitmq server

o   rabbitmqctl stop

✓  Start rabbitmq service

o   rabbitmq-service start

✓  Check rabbitmq cluster status

o   rabbitmqctl cluster_status

 

NOTE: Make sure the above steps are performed on one node at a time, and check the cluster status between operations.

 



 V.                ENABLE HA SYNC MODE ON CLUSTER AS AUTOMATIC

To enable HA sync mode for mirroring of queues, execute the below command on any one of the nodes

rabbitmqctl set_policy ha-all "" "{""ha-mode"":""all"",""ha-sync-mode"":""automatic""}"
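The doubled quotes in the command above are Windows command prompt escaping around an ordinary JSON policy definition. As a small illustration, the sketch below builds that quoted argument from a Python dictionary, which can help when the policy has more keys. The function name is made up for this example.

```python
import json


def windows_policy_argument(definition):
    """Render a policy definition as a cmd.exe-quoted JSON argument."""
    compact = json.dumps(definition, separators=(",", ":"))
    # Inside a double-quoted cmd.exe argument, each " is written as "".
    return '"' + compact.replace('"', '""') + '"'


POLICY = {"ha-mode": "all", "ha-sync-mode": "automatic"}

if __name__ == "__main__":
    print('rabbitmqctl set_policy ha-all "" ' + windows_policy_argument(POLICY))
```

Running it prints the same command line as shown above, which is a quick way to check that the quoting of a modified policy is still valid.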