We do quite a lot of things through proxies at my place of employment. Mainly so we don't get blacklisted as the behavior of our malware/phish processing systems often times looks like we're doing some pretty shady stuff.
I had a use case where one of our nagios checks was designed to hit a vendor API endpoint from our local on box proxy via specific ports and determine if the endpoint was reachable. This caused some issues as sometimes the endpoint would be down for a very small window of time, or the proxy was being derp for some reason - either way, I was getting up in the middle of the night for false positives. This makes for an angry systems engineer so I thought I'd just rewrite the check.
It may become useful for anyone that needs to check the state of something multiple times before sending off the nagios exit response. Instead of checking for a failure once and alerting it checks for 3 consecutive failures.
The "Nagios Plugin" script
NRPE definition
Nagios definition
I had a use case where one of our nagios checks was designed to hit a vendor API endpoint from our local on box proxy via specific ports and determine if the endpoint was reachable. This caused some issues as sometimes the endpoint would be down for a very small window of time, or the proxy was being derp for some reason - either way, I was getting up in the middle of the night for false positives. This makes for an angry systems engineer so I thought I'd just rewrite the check.
It may become useful for anyone that needs to check the state of something multiple times before sending off the nagios exit response. Instead of checking for a failure once and alerting it checks for 3 consecutive failures.
The "Nagios Plugin" script
#!/bin/bash #Varaible initilization http_response=$(curl -s -o /dev/null -w "%{http_code}" --proxy localhost:$1 'http://API.endpoint.com') frequency=0 #if the http response isn't 200 it will check 3 consecutive times for a change. If no change occurs it will increment a flag for each failure. while [ "$http_response" != 200 ] do echo "$http_response" http_response=$(curl -s -o /dev/null -w "%{http_code}" --proxy localhost:$1 'http://API.endpoint.com') ((frequency++)) if [ "$frequency" -eq 3 ]; then break fi sleep 60 done #Compare the flag value. if it is less than 3 the check corrected itself and prevented false positive, otherwise its probably a real alert. if [ "$frequency" -eq 3 ]; then echo "Port $1 not reachable - $http_response response" exit 2 else echo "Port $1 reachable" exit 0 fi
NRPE definition
command[check_endpoint_through_proxy]=/usr/lib64/nagios/plugins/check_endpoint_proxy "27845"
Nagios definition
define service{
use remote-service,srv-pnp
host_name server.nrpe.response
service_description Endpoint Local Proxy Connection
contact_groups emailadmins
max_check_attempts 3
check_command check_nrpe!check_endpoint_through_proxy