ssh connections are spawning hundreds of ssh-agent /bin/bash instances - linux

I have an Arch server running on a VMware VM. I connect to it through a firewall that forwards SSH connections from port X to port 22 on the server. Yesterday I started receiving the error "bash: fork: Resource temporarily unavailable". I can log in as root and manage things without problems, but when I ssh in as my normal user, the session spawns hundreds of ssh-agent /bin/bash processes. This in turn uses up all the threads and file descriptors on the system (as far as I can tell) and makes it unusable. The little information I've found so far suggests I must have some sort of loop, but this only started yesterday, possibly when I ran updates. At this point, I am open to suggestions.

One of your shell initialization files is probably spawning a shell which, when it reads the initialization files, spawns another shell, and so on.
You mentioned ssh-agent /bin/bash. Putting this in .bashrc will definitely cause problems: it instructs ssh-agent to spawn bash, that bash reads .bashrc again and starts another ssh-agent /bin/bash, and so on...
Instead, use something like
if [[ -z "$SSH_AUTH_SOCK" ]]; then
eval $(ssh-agent)
fi
in .bashrc (or .xinitrc or .xsession for systems with graphical logins).
Or possibly (untested):
if [[ -z "$SSH_AUTH_SOCK" ]]; then
ssh-agent /bin/bash
fi
in .bash_profile.
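If you want at most one agent per machine rather than one per login shell, another commonly used pattern (a sketch, untested on your setup; the agent.env path is an assumption) is to persist the agent's environment to a file that every new shell sources:

```shell
# ~/.bashrc fragment (sketch): reuse a single ssh-agent by saving its
# environment to a file instead of starting a new agent per shell
agent_env="$HOME/.ssh/agent.env"
if [ -z "$SSH_AUTH_SOCK" ]; then
    if [ -f "$agent_env" ]; then
        . "$agent_env" >/dev/null
    fi
    # if the saved agent is no longer running, start a fresh one
    if [ -z "$SSH_AGENT_PID" ] || ! kill -0 "$SSH_AGENT_PID" 2>/dev/null; then
        ssh-agent > "$agent_env"
        . "$agent_env" >/dev/null
    fi
fi
```

Crucially, nothing here ever spawns another bash, so there is no way for the initialization files to recurse.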

In my case (Windows) it was because I was not exiting shells when I was done with them, so they were never disposed of. When done, use Ctrl+D or type exit to terminate the shell and its ssh-agent.

Related

SSH tunnel prevents my spawned script from exiting

I'm executing a bash script in Node.js like this:
const child_process = require('child_process');
const fs = require('fs');
const script = child_process.spawn('local_script.sh');
const stdout = fs.createWriteStream('stdout');
const stderr = fs.createWriteStream('stderr');
script.stdout.pipe(stdout);
script.stderr.pipe(stderr);
script.on('close', function(code) {
console.log('Script exited with code', code);
});
My local_script.sh uploads a script to my remote server and executes it:
#!/bin/bash
FILE=/root/remote_script.sh
HOST=123.456.78.9
scp remote_script.sh root@${HOST}:${FILE}
ssh root@${HOST} bash ${FILE}
Finally, my remote_script.sh is supposed to open an SSH tunnel (and perform some other actions that are not relevant for this question):
#!/bin/bash
REDIS_HOST=318.353.31.3
ssh -f -n root@${REDIS_HOST} -L 6379:127.0.0.1:6379 -N &
The problem is that even though I'm opening the SSH tunnel in the background, it seems my remote_script.sh never exits, because the Node.js close event is never called. If I don't open the SSH tunnel, it exits and emits the event as expected.
How can I make sure the script exits cleanly after opening the SSH tunnel? Note that I want the tunnel to persist after the script finishes.
I haven't tested this, but my guess is that the backgrounded ssh session (remote -> REDIS) is keeping the remote tty alive, and thus preventing the local -> remote session from closing. Try changing remote_script.sh to this:
#!/bin/bash
redis_host=318.353.31.3
nohup ssh -f -n "root@${redis_host}" -L 6379:127.0.0.1:6379 -N >/dev/null 2>&1 &
BTW, note that I switched the variable name to lowercase; there are a number of all-caps variables with special meanings (including HOST), and re-using any of them can have weird effects, so lower- or mixed-case variables are preferred for script use. Also, I double-quoted the variable reference, which won't matter in this case but is a good general habit for those cases where it does matter.
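As a tiny, generic illustration of why the quoting habit matters (nothing to do with ssh itself): an unquoted expansion undergoes word splitting, a quoted one does not.

```shell
#!/bin/sh
# word splitting demo: count how many words a variable expands to
value="a b"
set -- $value             # unquoted expansion
unquoted_count=$#         # split into 2 words
set -- "$value"           # quoted expansion
quoted_count=$#           # stays 1 word
echo "unquoted: $unquoted_count, quoted: $quoted_count"
# prints: unquoted: 2, quoted: 1
```

With a hostname this makes no difference, but the first time a value contains a space or a glob character, the unquoted form silently changes meaning.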
The way I managed to solve it is by using ssh -a root@${HOST} bash ${FILE} in my local_script.sh. Note the -a flag, which disables my forwarding ssh agent.
The important clue was that when I ran remote_script.sh on my remote machine directly, it would do everything as expected including a clean exit, but when I would try to logout, that's where it would hang.
Apparently ssh doesn't want to terminate while there are still active connections. When I typed ~#, which shows active ssh connections, it did indeed show my forwarding ssh agent.

Linux script for probing ssh connection in a loop and start log command after connect

I have a host machine that gets rebooted or reconnected quite a few times.
I want to have a script running on my dev machine that continuously tries to log into that machine and if successful runs a specific command (tailing the log data).
Edit: To clarify, the connection needs to stay open. The log command keeps tailing until I stop it manually.
What I have so far
#!/bin/bash
IP=192.168.178.1
if (("$#" >= 1))
then
IP=$1
fi
LOOP=1
trap 'echo "stopping"; LOOP=0' INT
while (( $LOOP==1 ))
do
if ping -c1 $IP
then
echo "Host $IP reached"
sshpass -p 'password' ssh -o ConnectTimeout=10 -q user@$IP '<command would go here>'
else
echo "Host $IP unreachable"
fi
sleep 1
done
The LOOP flag is not really used. The script is ended via CTRL-C.
Now this works if I do NOT add a command to be executed after the ssh and instead start the log output manually. On a disconnect the script keeps probing the connection and logs back in once the host is available again.
Also when I disconnect from the host (CTRL-D) the script will log right back into the host if CTRL-C is not pressed fast enough.
When I add a command to be executed after ssh the loop is broken. So pressing (CTRL-C) does not only stop the log but also disconnects and ends the script on the dev machine.
I guess I have to spawn another shell somewhere or something like that?
1) I want the script to keep probing, log in and run a command completely automatically and fall back to probing when the connection breaks.
2) I want to be able to stop the log on the host (CTRL-C) and thereby fall back to a logged in ssh connection to use it manually.
How do I fix this?
Maybe the best approach to "fixing" this is to fix the requirements.
The problematic part is number "2)".
The problem stems from how SIGINT works.
When triggered, it is sent to the foreground process group of your terminal. Usually that is the shell and any process started from it. With more modern shells (you seem to use bash), the shell manages process groups such that programs started in the background are disconnected (by being assigned to a different process group).
In your case the ssh is started in the foreground (from a script executed in the foreground), so it receives the interrupt, forwards it to the remote side, and terminates as soon as the remote end does. Since by that time the script's shell has run its signal handler (specified by trap), it exits the loop and terminates as well.
So, as you can see, you have overloaded CTRL-C to mean two things:
terminate the monitoring script
terminate the remote command and continue with whatever is specified for the remote side.
You might get closer to what you want if you drop the first effect (or at least make it more explicit). Calling a script on the remote side that does not terminate itself, but only the tail command, would be a step in that direction. In that case you will likely need the -t switch on ssh to get a terminal allocated, so that normal shell operation is possible afterwards.
This will not allow terminating the remote side with just CTRL-C; you will always need to exit the remote shell that is run afterwards.
The essence of such a remote script might look like:
tail -f /path/to/logfile
exec "$SHELL" -i
where of course you would adapt the log path and add whatever parts are necessary for your shell or coding style.
An alternative approach would be to keep the current remote command being terminated, and add another ssh call, for the case of being interrupted, that spawns the shell for interactive use. But in that case, CTRL-C will also not be available for terminating the monitoring altogether.
To achieve that you might try changing the active interrupt handler within your monitoring script, to trigger termination as soon as the remote side returns. However, this causes a race condition between the user recognizing that the remote command terminated (and control returned to the local script) and the proper interrupt handler being put in place. You might be able to lower that risk sufficiently by first activating the new trap handler, then echoing the fact, and maybe adding a sleep to give the user time to react.
Not really sure what you are saying.
Also, you should disable PasswordAuthentication in /etc/ssh/sshd_config and log in by adding the public key of your home computer to ~/.ssh/authorized_keys:
#!/bin/sh
while true
do
RESPONSE=$(ssh -i /home/user/.ssh/id_host "user@$IP" 'tail /home/user/log.txt')
echo "$RESPONSE"
sleep 10
done

Git - How to kill ssh-agent properly on Linux

I am using git on Linux. When pushing to GitLab, it sometimes gets stuck at either:
debug1: Connecting to gitlab.com [52.167.219.168] port 22.
or
debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
debug3: send packet: type 100
Restarting Linux seems to solve it, but nobody likes rebooting machines.
So I am trying to kill the ssh-agent process and then restart it.
But the process is always defunct after the kill, and then I can't use git via ssh at all. So is there any way to restart ssh-agent, or to solve the issue described above, without restarting the machine?
Update:
The ssh keys I use have passphrases, which I enter on first use of each key.
The issue usually occurs after I bring the Linux desktop back from sleep, so the network has been reconnected; I am not sure whether this matters.
Again, does anyone know how to kill or restart an ssh-agent without it becoming defunct?
You can kill ssh-agent by running:
eval "$(ssh-agent -k)"
You can try this bash script to terminate the SSH agent:
#!/bin/bash
## in .bash_profile
SSHAGENT=`which ssh-agent`
SSHAGENTARGS="-s"
if [ -z "$SSH_AUTH_SOCK" -a -x "$SSHAGENT" ]; then
eval `$SSHAGENT $SSHAGENTARGS`
trap "kill $SSH_AGENT_PID" 0
fi
## in .logout
if [ -n "$SSH_AGENT_PID" ]; then
ssh-add -D
ssh-agent -k > /dev/null 2>&1
unset SSH_AGENT_PID
unset SSH_AUTH_SOCK
fi
Many ways:
killall ssh-agent
SSH_AGENT_PID="$(pidof ssh-agent)" ssh-agent -k
kill -9 $(pidof ssh-agent)
pidof is from the procps project; you may be able to find it packaged for your distro.
Yes, ssh-agent might be defunct: [ssh-agent] <defunct>
Trying to kill the agent could help:
eval "$(ssh-agent -k)"
but also try to check your keyring process (e.g. gnome-keyring-daemon), restart it or even remove the ssh socket file:
rm /run/user/$UID/keyring/ssh
It shows as defunct probably because its parent process has not yet reaped it, so the entry is still in the process table. It's not a big deal; the process is dead. Just start a new ssh-agent:
eval $(ssh-agent)
ssh-add
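Putting the two steps together, a minimal kill-and-restart cycle might look like this (a sketch; it assumes ssh-agent is on your PATH):

```shell
#!/bin/sh
# kill any agent this environment knows about; ignore the error that
# ssh-agent -k prints when SSH_AGENT_PID is not set
ssh-agent -k >/dev/null 2>&1 || true
# start a fresh agent and export SSH_AUTH_SOCK / SSH_AGENT_PID
eval "$(ssh-agent -s)"
echo "new agent pid: $SSH_AGENT_PID"
# run ssh-add here to load your keys again
ssh-agent -k >/dev/null 2>&1 || true   # demo only: drop this line to keep the agent
```

The eval is the important part: ssh-agent only prints the environment assignments, so without eval the new agent would be running but invisible to git and ssh.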

Alias <cmd> to "do X then <cmd>" transparently

The title sucks, but I'm not sure of the correct term for what I'm trying to do; if I knew that, I'd probably have found the answer by now!
The problem:
Due to an over-zealous port scanner (customer's network monitor) and an overly simplistic telnet daemon (busybox linux) every time port 23 gets scanned, telnetd launches another instance of /bin/login waiting for user input via telnet.
As the port scanner doesn't actually try to login, there is no session, so there can be no session timeout, so we quickly end up with a squillion zombie copies of /bin/login running.
What I'm trying to do about it:
telnetd gives us the option (-l) of launching some other thing rather than /bin/login so I thought we could replace /bin/login with a bash script that kills old login processes then runs /bin/login as normal:
#!/bin/sh
# First kill off any existing dangling logins
# /bin/login disappears on successful login so
# there should only ever be one
killall -q login
# now run login
/bin/login
But this seems to return immediately (no error, but no login prompt). I also tried just chaining the commands in telnetd's arguments:
telnetd -- -l "killall -q login;/bin/login"
But this doesn't seem to work either (again - no error, but no login prompt). I'm sure there's some obvious wrinkle I'm missing here.
System is embedded Linux 2.6.x running Busybox so keeping it simple is the greatly preferred option.
EDIT: OK, I'm a prat for not making the script executable. With that done I get the login: prompt, but after entering the username I get nothing further.
Check that your script has the execute bit set. Permissions should be the same as for the original binary including ownership.
As for -l: my guess is that it tries to execute killall -q login;/bin/login as a single command name (that's one word, not a shell command line).
Since this is an embedded system, it might not write logs. But you should check /var/log anyway for error messages. If there are none, you should be able to configure it using the documentation: http://wiki.openwrt.org/doc/howto/log.overview
Right, I fixed it, as I suspected there was a wrinkle I was missing:
exec /bin/login
I needed exec to hand control over to /bin/login rather than just call it.
So the telnet daemon is started thusly:
/usr/sbin/telnetd -l /usr/sbin/not_really_login
The contents of the not-really-login script are:
#!/bin/sh
echo -n "Killing old logins..."
killall -q login
echo "...done"
exec /bin/login
And all works as it should, on telnet connect we get this:
**MOTD Etc...**
Killing old logins......done
login: zero_cool
password:
And we can login as usual.
The only thing I haven't figured out is whether we can detect the exit status of /bin/login (i.e. whether we killed it) and print a message saying Too slow, sucker! or similar. TBH though, that's a nicety that can wait for a rainy day; I'm just happy our stuff can't be DDoS'ed over Telnet anymore!
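On that exit-status question: if the wrapper ran /bin/login in the foreground instead of exec'ing it, a kill would be detectable, because a process terminated by a signal exits with status 128 plus the signal number. A standalone sketch of the mechanism (with sleep standing in for login, since the real thing needs a tty):

```shell
#!/bin/sh
# run a stand-in "login" (sleep), kill it the way killall from a newer
# connection would, and inspect the exit status afterwards
sleep 30 &
pid=$!
kill -TERM "$pid"          # what killall -q login does to the old login
status=0
wait "$pid" || status=$?   # wait reports the child's exit status
if [ "$status" -gt 128 ]; then
    echo "Too slow, sucker! (killed by signal $((status - 128)))"
else
    echo "exited normally with status $status"
fi
```

SIGTERM is signal 15, so a killed process reports status 143; the trade-off is that without exec the wrapper shell stays alive as login's parent for the whole session.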

LDAP - SSH script across multiple VM's

So I'm ssh'ing into a router that has several VMs. It is set up using LDAP so that each VM has the same files, settings, etc. However, they have different cores allocated and different libraries and packages installed. Instead of logging into each VM individually and running the command, I want to automate it by putting the script in .bashrc.
So what I have so far:
export LD_LIBRARY_PATH=/lhome/username
# .so files are in ~/ to avoid permission denied problems
output=$(cat /proc/cpuinfo | grep "^cpu cores" | uniq | tail -c 2)
current=server_name
if [[ $(hostname -s) != "$current" ]]; then
ssh $current
fi
/path/to/program --hostname "$(hostname -s)" --threads $((output * 2))
Each VM, upon logging in, will execute this script, so I have to check if the current VM has the hostname to avoid an SSH loop. The idea is to run the program, then exit back out to the origin to resume the script. The problem is of course that the process will die upon logging out.
It's been suggested to me to use TMUX on an array of the hostnames, but I would have no idea on how to approach this.
You could install ClusterSSH, set up a list of hostnames, and execute things from the terminal windows it opens. You may use screen/tmux/nohup to allow started processes to keep running even after logout.
Yet, if you still want to play around with scripting, you may install tmux and use:
while read host; do
scp "script_to_run_remotely" ${host}:~/
ssh ${host} tmux new-session -d '~/script_to_run_remotely'\; detach
done < hostlist
Note: hostlist should be a list of hostnames, one per line.
