Wrong exit status from expect script - linux

I developed this expect script, TELNET_TEST.expect, to test a TELNET connection to a remote machine.
The script should connect via telnet to a target machine, wait for the login prompt, send the password and then exit.
The script does work: you can see in example 1 that it successfully logs in via telnet and then exits, but something very confusing (to me) is going on.
Why do I get exit status 1? I believe I should be getting exit status 0...
Please let me know why I am getting exit status 1. Also, what would I need to change in my script in order to get the exit code I am anticipating?
My expect script:
more TELNET_TEST.expect
#!/usr/bin/expect --
set LOGIN    [lindex $argv 0]
set PASSWORD [lindex $argv 1]
set IP       [lindex $argv 2]
set timeout 20
spawn telnet -l $LOGIN $IP
expect -re "(Password:|word:)"
send "$PASSWORD\r"
expect -re "(#|>)"
send "exit\r"
expect {
    timeout {error "incorrect password"; exit 1}
    eof
}
catch wait result
set STATUS [lindex $result 3]
exit $STATUS
EXAMPLE1
Running the expect script from my Linux machine, I get exit status 1 even though the telnet login succeeds.
./var/TELNET_TEST.expect root pass123 198.23.234.12
spawn telnet -l root pass123
Trying 198.23.234.12...
Connected to 198.23.234.12.
Escape character is '^]'.
Digital UNIX (machine1001) (ttyp0)
login: root
Password:
Last login: Mon Jul 14 16:40:15 from 198.23.234.12
Digital UNIX V4.0F (Rev. 1229); Wed Nov 23 15:08:48 IST 2005
****************************************************************************
Wide Area Networking Support V3.0-2 (ECO 3) for Digital UNIX is installed.
You have new mail.
machine1001> Connection closed by foreign host.
[root@LINUX_XOR]# echo $?
1

In the transcript of your session I see:
machine1001> Connection closed by foreign host.
Exit code 1 is telnet's exit code for "Connection closed by foreign host". That is the "correct" code when the connection is closed by the other side (in this case, in response to your exit command).
As far as I can tell, if you want an exit code of 0, you need to enter command mode in your telnet client and send the quit command. That way the connection is closed by the client, not by the foreign host. But is this really more "normal" than the other way?
From the sources of GNU telnet (inetutils-1.9), in the file commands.c:
int
tn (int argc, char *argv[])
{
  ....
  .... many many lines of code here
  ....
  close (net);
  ExitString ("Connection closed by foreign host.\n", 1);
  return 0;
}
and (utilities.c):
void
ExitString (char *string, int returnCode)
{
  SetForExit ();
  fwrite (string, 1, strlen (string), stderr);
  exit (returnCode);
}
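So telnet itself calls exit(1) via ExitString(), and the expect script's `catch wait result; lindex $result 3` merely forwards that number. A minimal Python stand-in (a hypothetical sh command in place of telnet, just to illustrate the propagation) shows the same path:

```python
import subprocess

# Stand-in for the spawned telnet: a command that exits with status 1,
# just as telnet does after printing "Connection closed by foreign host."
status = subprocess.run(["sh", "-c", "exit 1"]).returncode

# The expect script's `catch wait result; lindex $result 3` recovers this
# same number and passes it to `exit`, so the script also exits 1.
print(status)
```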

Related

ssh port forwarding ("ssh -fNL") doesn't work via expect spawn to automatically provide password

I know that the command for port forwarding is ssh -L. I also use other options to decorate it, so a final full command may look like ssh -fCNL *:10000:127.0.0.1:10001 127.0.0.1, and everything just works after entering the password.
Then, because more than one port needs to be forwarded, I decided to leave the job to a shell script and use expect (tcl) to provide the passwords (all the same).
Without a deep understanding of expect, I managed to write the code with the help of the Internet. The script succeeds in spawning ssh and provides the correct password, but then I find there is no such process when I check with ps -ef | grep ssh and netstat -anp | grep 10000.
I gave the -v option to ssh and the output seems fine.
So where is the problem? I have searched the Internet but most questions are not about port forwarding. I'm not sure whether it is proper to use expect when I just want the script to provide the password automatically.
Here is the script:
#!/bin/sh
# Port Forwarding
# set -x

## function definition
connection ()
{
    ps -ef | grep -v grep | grep ssh | grep $1 | grep $2 > /dev/null
    if [ $? -eq 0 ] ; then
        echo "forward $1 -> $2 done"
        exit 0
    fi
    # ssh-keygen -f "$HOME/.ssh/known_hosts" -R "127.0.0.1"
    /usr/bin/expect << EOF
set timeout 30
spawn /usr/bin/ssh -v -fCNL *:$1:127.0.0.1:$2 127.0.0.1
expect {
    "yes/no" {send "yes\r" ; exp_continue}
    "password:" {send "1234567\r" ; exp_continue}
    eof
}
catch wait result
exit [lindex \$result 3]
EOF
    echo "expect ssh return $?"
    echo "forward $1 -> $2 done"
}

## check expect available
which expect > /dev/null
if [ $? -ne 0 ] ; then
    echo "command expect not available"
    exit 1
fi

login_port="10000"
forward_port="10001"

## check whether the number of elements is equal
login_port_num=$(echo ${login_port} | wc -w)
forward_port_num=$(echo ${forward_port} | wc -w)
if [ ${login_port_num} -ne ${forward_port_num} ] ; then
    echo "The numbers of login ports and forward ports are not equal"
    exit 1
fi
port_num=${login_port_num}

## provide pair of arguments to ssh main function
index=1
while [ ${index} -le ${port_num} ] ; do
    login_p=$(echo ${login_port} | awk '{print $'$index'}')
    forward_p=$(echo ${forward_port} | awk '{print $'$index'}')
    connection ${login_p} ${forward_p}
    index=$((index + 1))
done
Here is the output from the script:
spawn /usr/bin/ssh -v -fCNL *:10000:127.0.0.1:10001 127.0.0.1
OpenSSH_7.2p2 Ubuntu-4ubuntu2.10, OpenSSL 1.0.2g 1 Mar 2016
...
debug1: Next authentication method: password
wang@127.0.0.1's password:
debug1: Enabling compression at level 6.
debug1: Authentication succeeded (password).
Authenticated to 127.0.0.1 ([127.0.0.1]:22).
debug1: Local connections to *:10000 forwarded to remote address 127.0.0.1:10001
debug1: Local forwarding listening on 0.0.0.0 port 10000.
debug1: channel 0: new [port listener]
debug1: Local forwarding listening on :: port 10000.
debug1: channel 1: new [port listener]
debug1: Requesting no-more-sessions@openssh.com
debug1: forking to background
expect ssh return 0
forward 10000 -> 10001 done
This should work for you:
spawn -ignore SIGHUP ssh -f ...
UPDATE:
Another workaround is:
spawn bash -c "ssh -f ...; sleep 1"
UPDATE 2 (a bit explanation):
ssh -f calls daemon() to make itself a daemon. See ssh.c in the source code:
/* Do fork() after authentication. Used by "ssh -f" */
static void
fork_postauth(void)
{
	if (need_controlpersist_detach)
		control_persist_detach();
	debug("forking to background");
	fork_after_authentication_flag = 0;
	if (daemon(1, 1) == -1)
		fatal("daemon() failed: %.200s", strerror(errno));
}
daemon() is implemented like this:
int
daemon(int nochdir, int noclose)
{
	int fd;

	switch (fork()) {
	case -1:
		return (-1);
	case 0:
		break;
	default:
		_exit(0);
	}

	if (setsid() == -1)
		return (-1);
	if (!nochdir)
		(void)chdir("/");
	if (!noclose && (fd = open(_PATH_DEVNULL, O_RDWR, 0)) != -1) {
		(void)dup2(fd, STDIN_FILENO);
		(void)dup2(fd, STDOUT_FILENO);
		(void)dup2(fd, STDERR_FILENO);
		if (fd > 2)
			(void)close (fd);
	}
	return (0);
}
There's a race condition (not sure if that's the correct term here) between _exit() in the parent process and setsid() in the child process. _exit() always completes first, since "the function _exit() terminates the calling process immediately" while setsid() is much more heavyweight. So when the parent process exits, setsid() has not yet taken effect and the child process is still in the same session as the parent. According to the APUE book (I'm referring to the 2005 edition, Chapter 10: Signals), SIGHUP "is also generated if the session leader terminates. In this case, the signal is sent to each process in the foreground process group."
In brief:
Expect allocates a pty and runs ssh on it. ssh therefore runs in a new session and is the session leader.
ssh -f calls daemon(). The parent process (the session leader) calls _exit(). At this time the child process is still in the same session, so it gets SIGHUP, whose default action is to terminate the process.
How the workarounds work:
The nohup-like way (spawn -ignore SIGHUP) explicitly asks the process to ignore SIGHUP so it will not be terminated.
With bash -c 'ssh -f ...; sleep 1', bash is the session leader and the trailing sleep 1 prevents it from exiting too soon. By the time sleep 1 finishes, the child ssh process's setsid() has completed and ssh is already in a new session.
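The first workaround can be sketched in a few lines of Python (a standalone demo of the signal behavior on Linux, not expect itself): a child that ignores SIGHUP survives the signal that would otherwise kill it, which is exactly what spawn -ignore SIGHUP arranges for ssh.

```python
import os
import signal
import time

pid = os.fork()
if pid == 0:
    # Child: ignore SIGHUP, the same effect as `spawn -ignore SIGHUP`.
    signal.signal(signal.SIGHUP, signal.SIG_IGN)
    time.sleep(1)
    os._exit(0)

time.sleep(0.2)              # give the child time to install the handler
os.kill(pid, signal.SIGHUP)  # the default action would terminate the child
_, status = os.waitpid(pid, 0)

# The child exits normally (status 0) instead of dying from SIGHUP.
print(os.WEXITSTATUS(status))
```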
UPDATE 3:
You can compile ssh with the following modification (in ssh.c) and verify:
static int
my_daemon(int nochdir, int noclose)
{
	int fd;

	switch (fork()) {
	case -1:
		return (-1);
	case 0:
		break;
	default:
		// wait a while for child's setsid() to complete
		sleep(1);
		// ^^^^^^^^
		_exit(0);
	}

	if (setsid() == -1)
		return (-1);
	if (!nochdir)
		(void)chdir("/");
	if (!noclose && (fd = open(_PATH_DEVNULL, O_RDWR, 0)) != -1) {
		(void)dup2(fd, STDIN_FILENO);
		(void)dup2(fd, STDOUT_FILENO);
		(void)dup2(fd, STDERR_FILENO);
		if (fd > 2)
			(void)close (fd);
	}
	return (0);
}

/* Do fork() after authentication. Used by "ssh -f" */
static void
fork_postauth(void)
{
	if (need_controlpersist_detach)
		control_persist_detach();
	debug("forking to background");
	fork_after_authentication_flag = 0;
	if (my_daemon(1, 1) == -1)
	//  ^^^^^^^^^
		fatal("my_daemon() failed: %.200s", strerror(errno));
}

how to invoke sudo su - user from python code (without password)? Tried -> stdin, stdout, stderr = ssh.exec_command('sudo su - user')

I am writing a script to execute some operations on multiple servers.
I log in to each server using one account and then need to sudo to switch to another account, but that is not working.
I tried the command below:
stdin, stdout, stderr = ssh.exec_command('sudo su - user')
and got the error below:
stdin, stdout, stderr = ssh.exec_command('sudo su - user')
^
TabError: inconsistent use of tabs and spaces in indentation
.......
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(host,username=uname,password=pwd)
stdin, stdout, stderr = ssh.exec_command("hostname")
stdin, stdout, stderr = ssh.exec_command('sudo su - user')
.....
After the ssh connection is made, my script should switch the user.
For example: server abc.com, uname = abhi, password = pwd, user = xyz.
Then the output will be:
login as: abhi
abhi@abc.com's password:
Last login: Thu Apr 4 01:49:06 2019 from abc.com
[abhi@abc.com ~]$ sudo su - xyz
Last login: Thu Apr 4 06:38:36 CDT 2019 on pts/6
[xyz@abc.com ~]$

Find out whether tcp port is bound (not listening) using bash

I am trying to determine whether a TCP port that was bound by a process, that was recently started, is actually in use by that particular process.
Take this program.cpp:
int daemonport = 11234;
int daemonfd;
struct sockaddr_in loopback;
memset ((char*) &loopback, 0, sizeof (loopback));
socklen_t len = sizeof (loopback);
loopback.sin_family = AF_INET;
loopback.sin_port = htons (daemonport);
loopback.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
daemonfd = socket (AF_INET, SOCK_STREAM, 0);
if (daemonfd < 0)
{
errx (EXIT_FAILURE, "Critical error");
}
if (bind (daemonfd, (struct sockaddr*) &loopback, sizeof (loopback)) != 0)
{
errx (EXIT_FAILURE, "Daemon already running, TCP port: '%d'", daemonport);
}
if (getsockname (daemonfd, (struct sockaddr*) &loopback, &len) != 0)
{
errx (EXIT_FAILURE, "Critical error");
}
printf ("%d\n", ntohs (loopback.sin_port));
if (daemon (1, 0) < 0)
{
close (daemonfd);
errx (EXIT_FAILURE, "Failed to daemonize!");
}
// event loop...
close (daemonfd);
Now with the tcp socket bound (but not listening) to port 11234 I want to check whether the port is bound by the process using a bash script.
I tried various netstat and lsof patterns without success:
netstat -a | grep ':11234' as well as lsof -i :11234.
None of them print a line with the bound port.
But when I try to run the program a 2nd time it errors out with:
Daemon already running, TCP port: '11234'
Assuming Linux, start with this:
netstat --inet -n -a -p | grep ':myport'
and see what you get. The --inet option keeps it from showing IPv6 and Unix domain sockets, -n shows numerical results instead of names translated from the port number, and -p tells you which process is listening on it.
If any of those lines say "LISTEN" then a process is listening on that port. However, any open connections using that port (even in "TIME_WAIT") will prevent the port from being re-opened unless you set the SO_REUSEADDR option every time you bind to it.
If that command shows nothing, then nothing is listening on that port, which means there must be a problem with your program.
You're printing an error message but assuming the problem is that something is already running. Print out the errno value (use perror(...)) so you can see exactly what the problem is.
By way of example, to check to see if port 56789 is available locally:
port=56789
retval=$(python3 -c 'import socket; s=socket.socket(); s.bind(("", '"${port}"')); print(s.getsockname()[1]); s.close()' 2>/dev/null)
echo "$retval"
This will print a blank line if the port is already bound, and will print 56789 if it is not bound. If port 56789 was recently used and closed, but the TIME_WAIT period has not yet elapsed (typically one or two minutes), then the port will not be available and the above code will not echo 56789.
I realize this is a bit of a cheat, because it also uses Python, but it is bash-scriptable if Python 3 is available. No sudo required.
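The same bind-probe idea also demonstrates the question's premise directly: a socket that is bound but not listening is invisible to netstat/lsof, yet a second bind() to the port still fails with EADDRINUSE. A minimal sketch (using an ephemeral port rather than 11234):

```python
import errno
import socket

# Bind without listening, letting the kernel pick a free port.
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind(("127.0.0.1", 0))
port = s1.getsockname()[1]

# A second bind to the same port fails, just like the C program's
# "Daemon already running" branch, even though nothing is listening.
s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s2.bind(("127.0.0.1", port))
    rebind_failed = False
except OSError as e:
    rebind_failed = (e.errno == errno.EADDRINUSE)
print(rebind_failed)
s1.close()
s2.close()
```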

gdb stops in a command file if there is an error. How to continue despite the error?

In my real gdb script, while analyzing a core file, I try to dereference a pointer and get "Error in sourced command file: Cannot access memory at address ", and then my gdb script stops. What I want is just to go on executing my gdb script without stopping. Is that possible?
This is a test program and a test gdb script that demonstrate my problem. In this situation the pointer has a NULL value, but in a real situation the pointer will likely have a non-null invalid value.
This is test C program:
#include <stdio.h>
struct my_struct {
int v1;
int v2;
};
int main()
{
my_struct *p;
printf("%d %d\n", p->v1, p->v2);
return 0;
}
This is a test gdb script:
>cat analyze.gdb
p p->v1
q
And this is a demonstration of the problem (what I want from gdb here is to get this error message and then go on to process the quit command):
>gdb -silent a.out ./core.22384 -x ./analyze.gdb
Reading symbols from /a.out...done.
[New Thread 22384]
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400598 in main () at main.cpp:11
11 printf("%d %d\n", p->v1, p->v2);
./analyze.gdb:1: Error in sourced command file:
Cannot access memory at address 0x0
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6.x86_64
Update
Thanks to Tom. This is a gdb script that handles this problem:
>cat ./analyze.v2.gdb
python
def my_ignore_errors(arg):
try:
gdb.execute("print \"" + "Executing command: " + arg + "\"")
gdb.execute (arg)
except:
gdb.execute("print \"" + "ERROR: " + arg + "\"")
pass
my_ignore_errors("p p")
my_ignore_errors("p p->v1")
gdb.execute("quit")
This is how it works:
>gdb -silent ./a.out -x ./analyze.v2.gdb -c ./core.15045
Reading symbols from /import/home/a.out...done.
[New Thread 15045]
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400598 in main () at main.cpp:11
11 printf("%d %d\n", p->v1, p->v2);
$1 = "Executing command: p p"
$2 = (my_struct *) 0x0
$3 = "Executing command: p p->v1"
$4 = "ERROR: p p->v1"
$5 = "Executing command: quit"
gdb's command language doesn't have a way to ignore an error while processing a command.
This is easily done, though, if your gdb was built with the Python extension. Search for the "ignore-errors" script. With that, you can:
(gdb) ignore-errors print *foo
... and any errors from print will be shown but not abort the rest of your script.
You can also do this:
gdb a.out < analyze.v2.gdb
This will execute the commands in analyze.v2.gdb line by line, even if an error occurs.
If you just want to exit if any error occurs, you can use the -batch gdb option:
Run in batch mode. Exit with status 0 after processing all the command
files specified with '-x' (and all commands from initialization files,
if not inhibited with '-n'). Exit with nonzero status if an error
occurs in executing the GDB commands in the command files. [...]

Resume a stopped process through command line

I've executed the following C code on Linux CentOS to create a process.
#include <stdio.h>
#include <unistd.h>
int main ()
{
int i = 0;
while ( 1 )
{
printf ( "\nhello %d\n", i ++ );
sleep ( 2 );
}
}
I've compiled it to hello_count. When I run ./hello_count, the output is like this:
hello 0
hello 1
hello 2
...
till I kill it.
I've stopped the execution using the following command:
kill -s SIGSTOP 2956
When I do
ps -e
the process 2956 ./hello_count is still listed.
Is there any command or any method to resume (not to restart) the process having process id 2956?
Also, when I stop the process, the command line shows:
[1]+ Stopped ./hello_count
What does the [1]+ in the above line mean?
To continue a stopped process, that is, to resume it, use kill -SIGCONT PID.
Regarding [1]+: that is bash's way of handling jobs. For more information, run help jobs at a bash prompt.
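The whole stop/continue cycle can also be scripted, for instance from Python. This is a sketch that assumes a Linux /proc filesystem; the process state letter is the third field of /proc/PID/stat (T means stopped):

```python
import signal
import subprocess
import time

def proc_state(pid):
    # Third field of /proc/<pid>/stat: R running, S sleeping, T stopped.
    with open(f"/proc/{pid}/stat") as f:
        return f.read().split()[2]

proc = subprocess.Popen(["sleep", "30"])
time.sleep(0.1)

proc.send_signal(signal.SIGSTOP)   # same as: kill -s SIGSTOP <pid>
time.sleep(0.1)
stopped = proc_state(proc.pid)     # 'T': the process is frozen

proc.send_signal(signal.SIGCONT)   # same as: kill -SIGCONT <pid>
time.sleep(0.1)
resumed = proc_state(proc.pid)     # back to 'S' (or briefly 'R')

print(stopped, resumed)
proc.kill()
proc.wait()
```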
