Detecting when Mongod's port is open inside a script - linux

I'm trying to write a bash script that starts a mongod process, waits for it to start (i.e. have its default port open) and then pipes some commands into it through the mongo shell. I'd like some way to wait for the mongod process to be completely up that's more deterministic than just sleep 5.
This is the script so far:
set_up_authorization() {
    echo "Setting up access control"
    /path/to/mongo < configure_access_controls.js
}

wait_for_mongod_to_start() {
    RETRIES=1000
    CONNECTED="false"
    echo "Waiting for mongod to start"
    while [[ $RETRIES -gt 0 && $CONNECTED == "false" ]] ; do
        RESPONSE=$(exec 6<>/dev/tcp/127.0.0.1/27017 || echo "1")
        if [[ $RESPONSE == "" ]] ; then # which should happen if the exec is successful
            CONNECTED="true"
        fi
        RETRIES=$((RETRIES - 1))
    done
    if [[ $RETRIES -eq 0 ]] ; then
        echo "Max retries reached waiting for mongod to start. Exiting."
        exit 1
    fi
    echo "Mongod started"
}

./start_mongod_instance.sh
wait_for_mongod_to_start
set_up_authorization
./start_mongod_instance.sh
wait_for_mongod_to_start
set_up_authorization
While this script works, it produces a ton of output on the terminal while the exec is failing:
./initialize_cluster.sh: connect: Connection refused
./initialize_cluster.sh: line xx: /dev/tcp/127.0.0.1/27017: Connection refused
...which repeats for all ~900 failed attempts.
Neither of the following seems to get rid of the terminal logging either:
exec 6<>/dev/tcp/127.0.0.1/27017 >/dev/null
OR
exec 6<>/dev/tcp/127.0.0.1/27017 2>/dev/null
I've also tried using the following:
ps -aux | grep "mongod" | wc -l
but the process having a PID that ps lists isn't equivalent to its port being open or to it accepting connections.
Any ideas on either front would be appreciated - a more elegant way to wait for the process to start completely or a way to get rid of the excessive logging to the terminal.
Note: I don't have access to nmap or nc to check the port (this is on a client's machine).

exec is a bit special: its redirections apply to the current shell itself. That means you need to redirect the current shell's stderr before running the port check:
host="localhost"
port="9000"

exec 2>/dev/null # redirect error here

while ! exec 3<>"/dev/tcp/${host}/${port}" ; do
    echo "Waiting ..."
    sleep 1
done
Furthermore you might have noticed that I check the exit status of exec rather than some output to decide whether the port is open or not.
If you want to reset it afterwards:
host="localhost"
port="9000"

# Copy fd 2 into fd 4 and redirect fd 2 to /dev/null
# (use fd 4, not fd 3, so the port check below doesn't clobber the backup)
exec 4<&2 2>/dev/null

while ! exec 3<>"/dev/tcp/${host}/${port}" ; do
    echo "Waiting ..."
    sleep 1
done

# Copy fd 4 back into fd 2 and close fd 4
exec 2<&4 4<&-

echo "EE oops!" >&2
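If you would rather not touch the shell's global stderr at all, you can run the exec in a subshell: the redirection and the file descriptor are then confined to that single attempt. A minimal sketch (the host, port, and retry cap are placeholder values):

```shell
host="127.0.0.1"
port="27017"
tries=10   # arbitrary cap so the loop cannot run forever

# The subshell's stderr is redirected, so the "Connection refused"
# messages are discarded without touching the parent shell's fd 2.
while ! (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
    tries=$((tries - 1))
    [ "$tries" -le 0 ] && break
    sleep 1
done
```

Because the descriptor is opened inside a subshell it is closed again as soon as the subshell exits, so this works as a pure reachability probe; if you need the descriptor afterwards, open it once more in the parent shell after the loop finishes.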

Related

Checking if a node.js app has started properly using bash

I wrote a node.js application and a bash script to start it and verify that it's running. My script runs npm start & first, then checks with netstat whether the ports I want are open. The check works fine when I run it after the script has finished, but during the script's run the check fails, because the server has not fully started before the check is run. My code is below:
echo "Starting the server..."
npm start & > /dev/null 2>&1
if [[ -z $(sudo netstat -tulpn | grep :$portNum | grep node) ]] ; then
    echo -e "\tPort $portNum is not in use, something went wrong. Exiting."
else
    echo -e "\tPort $portNum is in use!"
fi
Is there a good way to take the above script and change it so that the check doesn't occur until the server is fully started? I don't want to use sleep if I can help it.
You can use a wait call:
echo "Starting the server..."
npm start & > /dev/null 2>&1
wait
if [[ -z $(sudo netstat -tulpn | grep :$portNum | grep node) ]] ; then
    echo -e "\tPort $portNum is not in use, something went wrong. Exiting."
else
    echo -e "\tPort $portNum is in use!"
fi
The only limitation to this is that if npm daemonizes itself, then it's no longer a child of the script, so the wait command will have no effect (a process daemonizes itself by terminating and spawning a new process that inherits its role).
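If npm does daemonize (or simply takes a while to bind the port), polling the port in a bounded loop avoids both the wait limitation and a guessed sleep. A sketch assuming bash (for /dev/tcp) and a placeholder port number; note also that the redirections belong before the &, because in npm start & > /dev/null 2>&1 the & terminates the command and the redirections apply to an empty command:

```shell
# Wait until something listens on 127.0.0.1:$1, for at most $2 seconds.
wait_for_port() {
    local port=$1 retries=${2:-30}
    until (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; do
        retries=$((retries - 1))
        [ "$retries" -le 0 ] && return 1
        sleep 1
    done
}

portNum=3000                    # placeholder; use your app's port
npm start >/dev/null 2>&1 &     # redirections BEFORE the &

if wait_for_port "$portNum" 10; then
    echo -e "\tPort $portNum is in use!"
else
    echo -e "\tPort $portNum is not in use, something went wrong."
fi
```

This keeps the original netstat check unnecessary for the wait itself; you can still run it afterwards to confirm the listening process is node.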

Linux script to monitor remote port and launch script if not successful

RHEL 7.1 is the OS this will be used on.
I have two servers which are identical (A and B). Server B needs to monitor a port on Server A and if it's down for 30 seconds, launch a script. I read netcat was replaced with ncat on RHEL 7 so this is what I have so far:
#!/bin/bash
Server=10.0.0.1
Port=123
ncat $Server $Port &> /dev/null; echo $?
If the port is up, the output is 0. If the port is down, the output is 1. I'm just not sure on how to do the next part which would be "if down for 30 seconds, then launch x script"
Any help would be appreciated. Thanks in advance.
If you really want to script this rather than using a dedicated tool like Pacemaker as @CharlesDuffy suggested, then you could do something like this:
Run an infinite loop
Check the port
If up, save the timestamp
Otherwise check the difference from the last saved timestamp
If more time has passed than the threshold, then run the script
Sleep a bit
For example:
#!/bin/bash

server=10.0.0.1
port=123
seconds=30

seen=$(date +%s)

while :; do
    now=$(date +%s)
    if ncat $server $port &> /dev/null; then
        seen=$now
    else
        if ((now - seen > seconds)); then
            run-script && exit
        fi
    fi
    sleep 1
done
#!/bin/bash

Server=10.0.0.1
Port=123
port_was_down=0

while true; do
    sleep 30
    if ! ncat $Server $Port &> /dev/null; then
        if [[ $port_was_down == "1" ]]; then
            run-script
            exit
        else
            port_was_down=1
        fi
    else
        port_was_down=0
    fi
done
What about using nmap?
Something like:
TIMEOUT=30s
HOST=10.0.0.1
PORT=123

if nmap --max-rtt-timeout $TIMEOUT --min-rtt-timeout $TIMEOUT -p $PORT $HOST | grep "^$PORT.*open"; then
    echo 'OPEN'
else
    echo 'CLOSED'
fi

Bash: Loop until command exit status equals 0

I have netcat installed on my local machine and a service running on port 25565. Using the command:
nc 127.0.0.1 25565 < /dev/null; echo $?
netcat checks if the port is open, returning 0 if it is open and 1 if it is closed.
I am trying to write a bash script to loop endlessly and execute the above command every second until the output from the command equals 0 (the port opens).
My current script just keeps endlessly looping "...", even after the port opens (the 1 becomes a 0).
until [ "nc 127.0.0.1 25565 < /dev/null; echo $?" = "0" ]; do
    echo "..."
    sleep 1
done
echo "The command output changed!"
What am I doing wrong here?
Keep it Simple
(Your until condition compares the literal string "nc 127.0.0.1 25565 < /dev/null; echo $?" with "0"; the command inside the quotes is never executed, so the comparison never succeeds.)
until nc -z 127.0.0.1 25565
do
    echo ...
    sleep 1
done
Just let the shell deal with the exit status implicitly
The shell can deal with the exit status (recorded in $?) in two ways: explicitly and implicitly.
Explicit: status=$?, which allows for further processing.
Implicit:
For every statement, in your mind, add the word "succeeds" to the command, and then add
if, until or while constructs around them, until the phrase makes sense.
until nc succeeds; do ...; done
The -z option will stop nc from reading stdin, so there's no need for the < /dev/null redirect.
You could try something like
while true; do
    nc 127.0.0.1 25565 < /dev/null
    if [ $? -eq 0 ]; then
        break
    fi
    sleep 1
done
echo "The command output changed!"
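Either loop runs forever if the service never comes up. A bounded variant of the nc -z version (the retry cap is arbitrary; tune it to taste):

```shell
tries=10   # arbitrary cap on attempts, one per second
until nc -z 127.0.0.1 25565 2>/dev/null; do
    tries=$((tries - 1))
    if [ "$tries" -le 0 ]; then
        echo "Timed out waiting for port 25565" >&2
        break
    fi
    sleep 1
done
```

This way a script that wraps the wait can report a failure instead of hanging indefinitely when the service never starts.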

How to set up an automatic (re)start of a background ssh tunnel

I am a beginner user of Linux, and also quite a newbie with ssh and tunnels.
Anyway, my goal is to maintain an ssh tunnel open in the background.
In order to do that, I wrote the following batch, which I then added to crontab (the batch is automatically run every 5 minutes on workdays, from 8am to 9pm).
I read in some other thread on Stack Overflow that one should use autossh, which ensures the ssh connection stays up through a recurrent check. So I did...
#!/bin/bash

LOGFILE="/root/Tunnel/logBatchRestart.log"
NOW="$(date +%d/%m/%Y' - '%H:%M)" # date & time of log

if ! ps ax | grep ssh | grep tunnelToto &> /dev/null
then
    echo "[$NOW] ssh tunnel not running : restarting it" >> $LOGFILE
    autossh -f -N -L pppp:tunnelToto:nnnnn nom-prenom@193.xxx.yyy.zzz -p qqqq
    if ! ps ax | grep ssh | grep toto &> /dev/null
    then
        echo "[$NOW] failed starting tunnel" >> $LOGFILE
    else
        echo "[$NOW] restart successful" >> $LOGFILE
    fi
fi
My problem is that sometimes the tunnel stops working, although everything looks ok (ps ax | grep ssh shows the two expected tasks: the autossh main task and the ssh tunnel itself). I actually know about the problem because the tunnel is used by a third-party software that triggers an error as soon as the tunnel stops responding.
So I am wondering how I should improve my batch so that it can check the tunnel and restart it if it happens to be dead. I saw some ideas there, but they concluded with the "autossh" hint... which I already use. Thus, I am out of ideas; if any of you have some, I'd gladly have a look at them!
Thanks for taking interest in my question, and for your (maybe) suggestions!
Instead of checking the ssh process with ps, you can use the following trick:
Create a script that does the following and add it to your crontab via crontab -e:
#!/bin/bash

REMOTEUSER=username
REMOTEHOST=remotehost
SSH_REMOTEPORT=22
SSH_LOCALPORT=10022
TUNNEL_REMOTEPORT=8080
TUNNEL_LOCALPORT=8080

createTunnel() {
    /usr/bin/ssh -f -N -L$SSH_LOCALPORT:$REMOTEHOST:$SSH_REMOTEPORT -L$TUNNEL_LOCALPORT:$REMOTEHOST:$TUNNEL_REMOTEPORT $REMOTEUSER@$REMOTEHOST
    if [[ $? -eq 0 ]]; then
        echo Tunnel to $REMOTEHOST created successfully
    else
        echo An error occurred creating a tunnel to $REMOTEHOST RC was $?
    fi
}

## Run the 'ls' command remotely. If it returns non-zero, then create a new connection
/usr/bin/ssh -p $SSH_LOCALPORT $REMOTEUSER@localhost ls >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
    echo Creating new tunnel connection
    createTunnel
fi
In fact, this script will open two forwarded ports:
local port 10022, forwarded to ssh port 22, which is used to check if the tunnel is still alive
local port 8080, which is the port you might actually want to use
Please check and send me further questions via comments
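An alternative to probing with a remote ls: if the tunnel is opened as an OpenSSH control master, ssh -O check asks the master process directly whether the connection is still alive. A sketch under that assumption; the socket path, host, and forwarded port are placeholders, and BatchMode/ConnectTimeout are added so a cron run can never hang on a prompt:

```shell
SOCKET="/tmp/tunnel.sock"        # placeholder control-socket path
REMOTE="username@remotehost"     # placeholder user/host

# -M makes this ssh a "control master"; -S names its control socket.
createTunnel() {
    /usr/bin/ssh -f -N -M -S "$SOCKET" \
        -o BatchMode=yes -o ConnectTimeout=5 \
        -L 8080:localhost:8080 "$REMOTE"
}

# "ssh -O check" exits 0 only if the master connection still works.
if ! ssh -S "$SOCKET" -O check "$REMOTE" >/dev/null 2>&1; then
    createTunnel || echo "tunnel creation failed" >&2
fi
```

This avoids running a command on the remote side just to test liveness, and the same socket can later be used for ssh -O exit to tear the tunnel down cleanly.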
(I add this as an answer since there is not enough room for it in a comment.)
Ok, I managed to make the batch launch the ssh tunnel (I had to specify my hostname instead of localhost so that it could be triggered):
#!/bin/bash

LOGFILE="/root/Tunnel/logBatchRedemarrage.log"
NOW="$(date +%d/%m/%Y' - '%H:%M)" # date & time of log

REMOTEUSER=username
REMOTEHOST=remoteHost
SSH_REMOTEPORT=22
SSH_LOCALPORT=10022
TUNNEL_REMOTEPORT=12081
TUNNEL_SPECIFIC_REMOTE_PORT=22223
TUNNEL_LOCALPORT=8082

createTunnel() {
    /usr/bin/ssh -f -N -L$SSH_LOCALPORT:$REMOTEHOST:$SSH_REMOTEPORT -L$TUNNEL_LOCALPORT:$REMOTEHOST:$TUNNEL_REMOTEPORT $REMOTEUSER@193.abc.def.ghi -p $TUNNEL_SPECIFIC_REMOTE_PORT
    if [[ $? -eq 0 ]]; then
        echo [$NOW] Tunnel to $REMOTEHOST created successfully >> $LOGFILE
    else
        echo [$NOW] An error occurred creating a tunnel to $REMOTEHOST RC was $? >> $LOGFILE
    fi
}

## Run the 'ls' command remotely. If it returns non-zero, then create a new connection
/usr/bin/ssh -p $SSH_LOCALPORT $REMOTEUSER@193.abc.def.ghi ls >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
    echo [$NOW] Creating new tunnel connection >> $LOGFILE
    createTunnel
fi
However, I get the message below immediately when the tunnel is already running and cron tries to launch the batch again... it sounds like it cannot listen on the ports. Also, since I need some time to get proof, I can't yet say whether it will successfully restart if the tunnel goes down.
Here's the response to the second start of the batch:
bind: Address already in use
channel_setup_fwd_listener: cannot listen to port: 10022
bind: Address already in use
channel_setup_fwd_listener: cannot listen to port: 8082
Could not request local forwarding.
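Those bind errors mean the old forwards still own ports 10022 and 8082, i.e. a stale ssh process survived even though the ls probe failed. One option is to check whether the local forward port is still bound before recreating the tunnel, and kill the stale process if so. A sketch using bash's /dev/tcp (the same technique as the mongod question above); the pkill pattern is illustrative and should be tightened to match your exact command line:

```shell
SSH_LOCALPORT=10022

# Returns 0 if something already listens on the port (bash /dev/tcp;
# the subshell keeps the redirection and fd out of the parent shell).
port_bound() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

if port_bound "$SSH_LOCALPORT"; then
    # A stale tunnel still owns the forward: kill it before restarting.
    pkill -f "ssh.*-L$SSH_LOCALPORT:" 2>/dev/null
    sleep 1
fi
```

Only after this cleanup does it make sense to call createTunnel again; otherwise every cron run fails with the same "Address already in use" error.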

Continue to grep for traceroute result with bash

Every night I go through the same process of checking the failover systems for our T1s. I essentially go through the following process:
Start the failover process.
traceroute $server;
Once I see it's failed over, I verify that connections work by SSHing into a server.
ssh $server;
Then once I see it works, I take it off of failover.
So what I want to do is to continually run a traceroute until I get a certain result, then run a SSH command.
Put your list of successful messages in a file (omit the variable lines and the variable fractions of each line, and use a ^ to anchor the start of each line), as such:
patterns.list:
^ 7 4.68.63.165
^ 8 4.68.17.133
^ 9 4.79.168.210
^10 216.239.48.108
^11 66.249.94.46
^12 72.14.204.99
Then a simple while loop:
while ! traceroute -n ${TARGET} | grep -f patterns.list
do
    sleep 5 # 5 second delay between traceroutes, for niceness.
done
ssh ${DESTINATION}
Use traceroute -n to generate the output so you don't get an IP address that resolves one time and a name the next, resulting in a false positive.
I think you would be better off using the ping command to verify the server's accessibility than traceroute.
It is easy to check the return status of the ping command without using grep at all:
if ping -c 4 -n -q 10.10.10.10 >/dev/null 2>&1; then
    echo "Server is ok"
else
    echo "Server is down"
fi
If you want to do it continually in a loop, try this:
function check_ssh {
    # do your ssh stuff here
    echo "performing ssh test"
}

while : ; do
    if ping -c 4 -n -q 10.10.10.10 >/dev/null 2>&1; then
        echo "Server is ok"
        check_ssh
    else
        echo "Server is down"
    fi
    sleep 60
done
