I am trying to run example code from this question: MPI basic example doesn't work but when I do:
$ mpirun -np 2 mpi_test
I get this:
ssh: Could not resolve hostname wvxvw-laptop: Name or service not known
And then the program hangs until interrupted.
wvxvw-laptop is the "host name" of my laptop, which is just that, really, a laptop...
All I want is to try to run the example code, not to set up a network cluster or anything like that.
What did I miss? I'm reading the wiki page http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager but I can't work out what the cause is.
Sorry, I'm very new to this.
Some more verbose output:
/usr/bin/ssh -x wvxvw-laptop "/usr/lib64/mpich/bin/hydra_pmi_proxy" \
--control-port wvxvw-laptop:54320 --debug --rmk user --launcher ssh \
--demux poll --pgid 0 --retries 10 --usize -2 --proxy-id 0
Formatted for readability. I'm not quite sure why this is even supposed to work (I've never used ssh -x, so I'm not sure what it is supposed to do) :/
mpirun executes your program on all nodes registered in your MPI cluster.
MPI uses the computer name, so you can edit your /etc/hosts to add an entry for wvxvw-laptop.
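As a rough sketch of that /etc/hosts edit: the helper below appends an entry only if the name is not already listed. add_host_entry is a hypothetical name, and the loopback address 127.0.1.1 in the usage line is an assumption (127.0.0.1 should work just as well for a single machine).

```shell
# Hypothetical helper: append "<ip> <name>" to a hosts file unless the
# name already appears there as a hostname or alias.
# Usage: add_host_entry <hosts_file> <ip> <name>
add_host_entry() {
    file=$1; ip=$2; name=$3
    # Strip comments, then look for the name in any field after the address.
    if awk -v n="$name" '
        { sub(/#.*/, ""); for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Example (as root; the address is an assumption):
#   add_host_entry /etc/hosts 127.0.1.1 wvxvw-laptop
```

Afterwards, getent hosts wvxvw-laptop should print the new entry, and mpirun -np 2 mpi_test can be retried.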
I have a list of servers in a file I'll call "boxes."
I'm running the following:
bastion:~ # while read i j; do ssh -n $i 'echo $(hostname)'; done < boxes
When I run this, I should be getting the host name of every box I'm supposedly SSHing into, right? But what I'm getting is my own hostname, "bastion", over and over, as if the command is not SSHing to any box.
Any idea why that might be or what I'm doing wrong here?
Thanks.
EDIT: I couldn't get SSH to work for this task. In the end I swapped the ssh for ansible -m shell.
hostname is enough; there is no need to wrap it in echo $(hostname).
My favorite solution uses GNU Parallel (much faster). Install it with: sudo apt-get install parallel
parallel --nonall --slf boxes hostname
You can add --tag and much more: https://www.gnu.org/software/parallel/parallel_tutorial.html
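For comparison, a minimal sketch of the plain-ssh loop, assuming boxes lists one host per line (any extra columns are ignored). run_on_boxes and the SSH_CMD override are hypothetical names, introduced only so the loop can be dry-run without touching the network.

```shell
# SSH_CMD is a hypothetical override; it defaults to the real ssh client.
SSH_CMD="${SSH_CMD:-ssh}"

# Usage: run_on_boxes <file_with_one_host_per_line>
run_on_boxes() {
    # "|| [ -n "$host" ]" keeps a last line that lacks a trailing newline.
    while read -r host _ || [ -n "$host" ]; do
        [ -n "$host" ] || continue    # skip blank lines
        # -n stops ssh from swallowing the rest of the host list on stdin.
        # hostname is passed as the remote command, so it runs on the box.
        "$SSH_CMD" -n "$host" hostname
    done < "$1"
}

# Example:
#   run_on_boxes boxes
```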
I would like to make a shutdown script for my Raspberry Pi that shuts down another Raspberry Pi over ssh.
The script works when it is run by itself, but during the shutdown routine the ssh command is not executed.
This is what I have done so far:
Made the script in /etc/init.d:
#!/bin/sh
# the first thing is to test if the shutdown script is working
echo "bla bla bla " | sudo tee -a /test.txt
ssh pi@10.0.0.98 sudo shutdown -h now
Made it executable
sudo chmod +x /etc/init.d/raspi.sh
Made a symlink to the rc0.d
sudo ln -s /etc/init.d/raspi.sh /etc/rc0.d/S01raspi.sh
So far I know that the script works outside of the shutdown routine when called by itself, and that the shutdown symlink I made is also partially working, because I see the changes in the test.txt file every time I shut down.
Can anyone help me solve my problem?
Have you tried with single quotes?
The first link in Google has it
http://malcontentcomics.com/systemsboy/2006/07/send-remote-commands-via-ssh.html
What about the sudo? How do you deal with entering the password?
https://superuser.com/questions/117870/ssh-execute-sudo-command
Please check this or other links on the web that have useful information.
I would have sent all this in a comment, but I can't yet because of reputation.
I have now got the script running by myself. I do not really know why it is working now, but I'll write it down below and maybe someone else can clarify it.
I don't think the first two changes to my system make a difference, but I'll write them down as well. In the meantime, because I could not get the script working, I made a button to shut down the system manually. I also made a script which backs up the MySQL database (which is on the Raspberry Pi that I would like to switch off with the script) and copies the backup to the Raspberry Pi that should switch off the other one automatically via the shutdown script. This is done with scp, and a key was generated so that no password is needed.
I have also changed my script to write a log message.
#!/bin/sh
ssh -t -t pi@10.0.0.99 'sudo shutdown -h now' >> /home/osmc/shutdown.log 2>&1
To get it into the shutdown-routine I used:
sudo update-rc.d raspi-b stop 01 0
I hope somebody can tell me why my code works now, given that it worked on the first day but not on the following days until now.
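One thing that may be worth ruling out: a script started by init runs as root and without a terminal, so ssh looks for root's keys rather than those of the user who tested the script by hand. The sketch below names the key explicitly and logs the exit status. It is only an illustration under assumptions: remote_shutdown, SSH_CMD, KEY_FILE and the key path are hypothetical, and the 10-second connect timeout is an arbitrary value.

```shell
#!/bin/sh
# Hypothetical settings; adjust to the real key, log file and target.
SSH_CMD="${SSH_CMD:-ssh}"
KEY_FILE="${KEY_FILE:-/home/osmc/.ssh/id_rsa}"
LOG_FILE="${LOG_FILE:-/home/osmc/shutdown.log}"

# Usage: remote_shutdown <user@host>
remote_shutdown() {
    target=$1
    {
        printf '%s: shutting down %s\n' "$(date)" "$target"
        status=0
        # BatchMode=yes makes ssh fail instead of waiting for a password
        # prompt that nobody can answer during shutdown.
        "$SSH_CMD" -i "$KEY_FILE" -o BatchMode=yes -o ConnectTimeout=10 \
            "$target" 'sudo shutdown -h now' || status=$?
        printf 'ssh exit status: %s\n' "$status"
    } >> "$LOG_FILE" 2>&1
}

# Example:
#   remote_shutdown pi@10.0.0.99
```

If the log shows a non-zero status at shutdown but not when run by hand, the difference is likely in the environment (user, key, or network already stopped) rather than in the command itself.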
I structured a command to suspend or shutdown a remote host over ssh. You may find this useful. This may be used to suspend / shutdown a remote computer without an interactive session and yet not keep a terminal busy. You will need to give permissions to the remote user to shutdown / suspend using sudo without a password. Additionally, the local and remote machines should be set up to SSH without an interactive login. The script is more useful for suspending the machine as a suspended machine will not disconnect the terminal.
local_user@hostname:~$ ssh remote_user@remote_host "screen -d -m sudo pm-suspend"
source: कार्यशाला (Kāryaśālā)
SOLVED
Scenario: I am a beginner with bash scripting, Windows Task Scheduler and such. I am able to run a local bash script from my Windows Task Scheduler successfully.
Problem: I need to do this on many computers, thus I think storing just 1 copy of the bash script on a remote server may be of help. What my Task Scheduler needs to do is just to run the script and output a log. However, I can't get the correct syntax for the argument.
The below is what I have currently:
Program/Script: C:\cygwin64\bin\bash.exe
Argument (works successfully):
-l -c "ssh -p 222 ME@ME.com "httpdocs/bashscript.sh" >> /cygdrive/c/Users/ME/Desktop/`date +%Y%m%d`.log 2>&1"
Start in: C:\cygwin64\bin
I also had to make sure that the user account under Properties in Task Scheduler is correct, as mine was incorrect before. Key authentication for ME@ME.com is needed too.
For the password issue, you really should use ssh keys. I think your command would simply be ssh -p 222 ME@ME.com:.... I.e., just get rid of the --rsh stuff. – chrisaycock
I need some help to try and figure out a problem we are experiencing. We had the following bash shell script running in devices on two separate networks (network1 and network2). Both networks go to the same destination server.
while true
do
# do something ...
scp *.zip "$username@$server_ip:$destination_directory"
# do something ...
sleep 30
done
The script worked fine until a recent change to network2, after which the scp command in the script above sometimes hangs for hours before resetting. The same script is still working fine on network1, which did not change. We are not able to identify what the issue is with network2; everything seems to work except scp. The hang does not happen on every try, but when it does hang, it hangs for hours.
So I changed the scp command as follows. It now resets within minutes, and the data delay is bearable but not desirable.
scp -o BatchMode=yes -o ServerAliveCountMax=3 -o ServerAliveInterval=10 -o \
ConnectTimeout=60 *.zip "$username@$server_ip:$destination_directory"
I also tried sftp as follows;
sftp -o ConnectTimeout=60 -b "batchfiles.txt" "$username@$server_ip"
The ConnectTimeout option does not seem to work well with sftp, because it still sometimes hangs for hours. So I am back to using scp.
I even included the -o IdentityFile=path_to_key/id_rsa option in both scp and sftp, thinking it may be an authentication issue. That did not work either.
What is really strange is that it always works when I issue the same commands from a terminal. The shell script runs as a background task. I am running Linux 3.8.0-26-generic #38-Ubuntu and OpenSSH_6.1p1 Debian-4. I don't think it is a local script permissions issue because 1) it worked before network2 changed, and 2) it works some of the time.
I did a network packet capture. I can see that each time the scp command hangs, it is accompanied by [TCP Retransmission] and [RST, ACK] within seconds of the start of an scp conversation.
I am very confused as to whether the issue is network or script related. Based on the sequence of events, I think it is likely due to the recent change in network2. But why does the same command work from a terminal every time I try?
Can someone kindly tell me what my issue is, or tell me how to go about troubleshooting it?
Thank you for reading and helping.
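The ServerAliveInterval and ServerAliveCountMax options used above only take effect once the ssh session is established. One possible way to put a hard upper bound on each transfer, whichever phase stalls, is to run scp under timeout(1) from GNU coreutils. This is only a sketch: run_with_deadline is a hypothetical helper, and the 120-second limit and 3 attempts in the usage line are made-up values, not something from the question.

```shell
# Hypothetical helper: run a command under a wall-clock limit and retry
# it a fixed number of times if it fails or is killed by the limit.
# Usage: run_with_deadline <seconds> <attempts> <command> [args...]
run_with_deadline() {
    limit=$1; attempts=$2; shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # timeout sends TERM to the command once the limit expires
        # and then exits with status 124.
        if timeout "$limit" "$@"; then
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

# Example (limit and attempt count are assumptions):
#   run_with_deadline 120 3 scp -o BatchMode=yes -o ConnectTimeout=60 \
#       *.zip "$username@$server_ip:$destination_directory"
```

This does not explain the retransmissions seen in the capture; it only keeps the loop from stalling for hours while the network issue is investigated.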
I'm using dancer's shell dsh (http://www.netfort.gr.jp/~dancer/software/dsh.html.en) to send a tail -f command to 6 machines. I was hoping to use this to view a merged log from a service which resides in the same directory on each of these machines. The machines are all running RHEL 4. (Not my choice.)
What actually happens, is that I retrieve 4-20 lines from each log and then it just hangs.
Here are my options:
dsh -c -M -r ssh -g services -- /usr/bin/tail -f /var/myservice/my.log
"services" refers to a group of 6 servers.
I've tried several different ssh options in the dsh.conf file, including -n, -t, and -f, but it doesn't seem to make a difference. Also, screen is not installed on the target servers.
What's wrong with my command? How can I make it act like a proper tail -f?
It turns out chepner's comment is right. Those logs just don't create much output. I tried the identical command with a set of more active web applications and it works fine.
I know that command as "distributed shell", but no matter.
I suspect the double-dash in the middle of your command string is asking it to accept stdin input, which would indeed make it appear to hang. Try it without the "--".