Why does my mpirun/mpiexec fail silently, with no feedback, in a multi-machine environment? - openmpi

Open MPI version: 4.0.5
OS: Ubuntu 20.04
Passwordless SSH login is set up between the two machines.
Each machine can run hello_c on its own, as shown in fig. 1 and fig. 2, but they do not work together with "mpirun -np 2 --hostfile xxx". There is no feedback at all (no warnings, errors, etc.). We can see the process on the host, but not on the second computer.

Related

Can you help me access Mac SMB share from Ubuntu using smbclient? (NT_STATUS_ACCESS_DENIED error)

I've been working on a file server product that uses smbclient to transfer files between client computers and the server. It's been working great so far with our LAMP (Ubuntu) server and Windows machines.
I'm currently trying to expand the setup to include Macs, but am having trouble with the server accessing the share on the Mac.
Here's my command and error (bracketed descriptions replace private info):
# smbclient //10.101.0.7/[share-file] -U [username]%[password] -c ls
WARNING: The "syslog" option is deprecated
NTLMSSP packet check failed due to short signature (0 bytes)!
NTLMSSP NTLM2 packet check failed due to invalid signature!
session setup failed: NT_STATUS_ACCESS_DENIED
Things I've tried:
✓ Accessing the share from a Windows machine to ensure the share is set up properly - check! Works fine there.
✓ Invoking -S off or --signing=off in the command - no change.
✓ Just looking at the shares first using smbclient -L 10.101.0.7 -U [username]%[password] - same error.
✓ Googling for an answer - check! Several people with similar problems, but no working solutions so far.
The most promising thing I've seen so far involves compiling smbclient 4.4 from source and running that with no authentication (-U ""%""), but that seems like a temporary solution based on a bug rather than a solid plan that will work for a long time. (But I'll try that next if I can't find any better ideas...)
Thanks for reading and trying to help!
Try adding --option="ntlmssp_client:force_old_spnego = yes" to the smbclient command as suggested on the samba-technical mailing list.
For me, this now lists shares on a Mac OSX server:
smbclient -U$user%$password -L $mac_host --option="ntlmssp_client:force_old_spnego = yes"
For mounting, you may need to add the nounix,sec=ntlmssp options as in
sudo mount -t cifs //$mac_host/$share $mountpoint -o nounix,sec=ntlmssp,username=$user,password=$password
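To keep the password off the command line, mount.cifs also accepts a credentials file through its credentials= option. The following is a minimal sketch under stated assumptions: the helper names write_cifs_credentials and cifs_mount_command are hypothetical, and the second helper only prints the mount command so it can be reviewed before running it with sudo.

```shell
#!/bin/sh
# Hypothetical helper: write a mount.cifs credentials file (username=/password= lines)
# that only its owner can read.
write_cifs_credentials() {
    cred_file=$1
    cred_user=$2
    cred_password=$3
    (
        umask 077   # new file is created with mode 600
        printf 'username=%s\npassword=%s\n' "$cred_user" "$cred_password" > "$cred_file"
    )
}

# Hypothetical helper: print (not run) the mount command from the answer above,
# with username/password replaced by the credentials= option.
# Arguments: host, share, mount point, credentials file.
cifs_mount_command() {
    printf 'mount -t cifs //%s/%s %s -o nounix,sec=ntlmssp,credentials=%s\n' \
        "$1" "$2" "$3" "$4"
}
```

Usage might look like write_cifs_credentials "$HOME/.smbcred" "$user" "$password", followed by running the output of cifs_mount_command "$mac_host" "$share" "$mountpoint" "$HOME/.smbcred" under sudo.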
On recent versions of macOS (e.g. Monterey), several configuration steps are needed to enable SMB access from Linux:
Open System Preferences.
Select Sharing.
Select File Sharing.
Ensure that the directory is listed in Shared Folders.
Right-click/two-finger click on the share directory.
Click on Advanced Options
Ensure Only allow SMB encrypted connections is checked.
Click OK
Click on Options
Click on the checkbox for Share files and folders using SMB.
Under Windows File Sharing ensure the appropriate user is checked.
Type the user's password in the 'Authenticate' dialog box and press 'OK'.
Click 'Done'.
You should now be able to connect from Linux to the macOS share using the commands given by @mivk.

Vagrant "Timed out while waiting for the machine to boot" after deleting /project/.vagrant

Problem
I was working with the bento/centos7.2 box. I ran vagrant up and, while it was booting, noticed that the box had an update, so I instinctively cancelled the operation (which I suggest you never do!). I then ran vagrant destroy and rm -rf .vagrant just to be sure (again, I suggest you never do this!). I removed my box with vagrant box remove bento/centos7.2, ran vagrant up again, and ended up with this:
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.
If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
Environment
Ubuntu 16.04
Vagrant 1.81
Centos 7.2 Box
Things I tried
Following are the threads I have tried:
vagrant + virtualbox Timed out while waiting for the machine to boot
Timed out while waiting for the machine to boot when vagrant up
Vagrant "Timed out while waiting for the machine to boot."
When I enabled the GUI, I realized the box is booting up properly; it's just stuck at the login screen (a bug in the box with ssh?).
Any help is much appreciated.
Several things can cause this issue:
Try running:
vagrant reload
This re-installs the guest-additions on the box.
Try opening VirtualBox (the GUI) and then the VM's console window. There you can
i) check whether the box is waiting for fsck (filesystem check) after an unclean shutdown
ii) log in with the default username/password (typically vagrant/vagrant) and find out whether the ssh server is running on the box.
Run
vagrant ssh-config
and check which port and which ssh key it is trying to use. Then try them manually, e.g.:
ssh -i <identity_key_location> vagrant@localhost -p 2222
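The manual ssh command can also be assembled from the vagrant ssh-config output instead of copying the values by hand. The following is a minimal sketch, assuming the output contains the usual HostName, User, Port and IdentityFile lines and that the key path contains no spaces; ssh_command_from_config is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `vagrant ssh-config` output on stdin and print
# the equivalent manual ssh command (it does not run ssh itself).
# Assumption: the IdentityFile path has no spaces, so it is a single awk field.
ssh_command_from_config() {
    awk '
        $1 == "HostName"     { host = $2 }
        $1 == "User"         { user = $2 }
        $1 == "Port"         { port = $2 }
        $1 == "IdentityFile" { key  = $2 }
        END { printf "ssh -i %s -p %s %s@%s\n", key, port, user, host }
    '
}
```

Typical use would be vagrant ssh-config | ssh_command_from_config, then running the printed command to see the raw ssh error, if any.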

HDP 2.5 Hortonworks ambari-admin-password-reset missing

I downloaded the sandbox from Hortonworks (CentOS), then tried to follow the tutorial. The ambari-admin-password-reset command appears to be missing. I also tried to log in with PuTTY; the console asked me to change the password, so I did.
Now the command seems to be there, but I have two different passwords for the same user: one for the console and one for PuTTY.
I have tried to find out why the same user 'root' has two different passwords (one for the VirtualBox console and one for PuTTY) that I can log in with. I also see different commands on each. On top of that, when I share a folder I can only see it in the VirtualBox console and not in the PuTTY session, which is really frustrating.
How can I make sure that what I see from PuTTY is the same as what I see from the VirtualBox console?
I think it is somehow related to TTY, but I am not sure.
EDIT:
running commands from the virtual box machine output:
grep "^passwd" /etc/nsswitch.conf
OUT: passwd: files sss
grep root /etc/passwd
OUT: root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
getent passwd root
OUT: root:x:0:0:root:/root:/bin/bash
EDIT:
I think this is all about docker containers. It seems that port 2222 on the machine is the ssh port of the HDP 2.5 container, not of the hosting machine.
Now I have another problem. When running
docker exec sandbox ls
it gets stuck. Any help?
Thanks to everyone helping.
So now I had the time to analyze the sandbox vm, and write it up for other users.
As you correctly stated in the edit to your question, it's the docker container setup of the sandbox that causes the confusion, with two separate root users:
via ssh root@127.0.0.1 -p 2222 you get into the docker container called "sandbox". This is a CentOS release 6.8 (Final) system containing all the HDP services, in particular the ambari service. The configuration enforces a password change at first login for the root user. Inside this container you can also run ambari-admin-password-reset and set a password for the ambari admin there.
via console access you reach the docker host, which runs CentOS 7.2; here you can log in with the default root password for the VM as found in the HDP docs.
Coming to your sub-question with the hanging docker exec, it seems to be a bug in that specific docker version. If you google that, you will find issues discussing this or similar problems with docker.
So I thought that it would be a good idea to just update the host via yum update. However this turned out to be a difficult path.
yum tried to update the kernel, but complained that there is not enough space on the boot partition.
So I moved the boot partition to the root partition:
edit /etc/fstab and comment out the boot entry
umount /boot
mv /boot
cp -a /boot.org /boot
grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-install /dev/sda
reboot
After that I have found out that the docker configuration is broken and docker does not start anymore. In the logs it complained about
"Error starting daemon: error initializing graphdriver:
\"/var/lib/docker\" contains other graphdrivers: devicemapper; Please
cleanup or explicitly choose storage driver (-s )"
So I edited /etc/systemd/system/multi-user.target.wants/docker.service and changed the ExecStart setting to:
ExecStart=/usr/bin/dockerd --storage-driver=overlay
After a service docker start and a docker start sandbox, the container worked again and I could log in to the container; after an ambari-server restart, everything worked again.
And now - with the new docker version 1.12.2, docker exec sandbox ls works again.
So to sum up: the docker exec command has a bug in the docker version shipped with the sandbox, but you should think twice before upgrading your sandbox.
I ran into the same issue.
The HDP 2.5 sandbox runs all of its components in a docker container, but commands like docker exec -it sandbox /bin/bash or docker attach sandbox got stuck.
When I ran a simple ps aux, I found several /usr/bin/docker-proxy commands which looked like :
/usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 60000 -container-ip 172.17.0.2 -container-port 60000
They probably forward the HTTP ports of the various UIs of HDP components.
I could ssh into the container ip (here 172.17.0.2) using root/hadoop to authenticate. From there, I could use all "missing" commands like ambari-admin-password-reset.
$ ssh root@172.17.0.2
... # change password
$ ambari-admin-password-reset
NB: I am new to docker, so there's probably a better way to deal with this.
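The container address can also be pulled out of the process list automatically. The following is a minimal sketch, assuming the docker-proxy command lines look like the one shown above; container_ips is a hypothetical helper name, not part of docker.

```shell
#!/bin/sh
# Hypothetical helper: read process listings (e.g. `ps aux` output) on stdin and
# print each distinct address that follows a -container-ip argument.
container_ips() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "-container-ip")
                print $(i + 1)
    }' | sort -u
}
```

For example, ps aux | container_ips should print the address to use with ssh root@<address>.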
I'd like to post the instructions for 3.0.1 here.
I followed the instructions of installing hortonworks version 3.0.1 here: https://youtu.be/5TJMudSNn9c
After starting the docker container, go to your browser and enter "localhost:4200". That takes you to the in-browser terminal of the container that hosts ambari. Enter "root" as the login and "hadoop" as the password, change the root password, and then enter "ambari-admin-password-reset" to reset the ambari password.
To be able to use sandbox-hdp.hortonworks.com, you need to add the line "127.0.0.1 sandbox-hdp.hortonworks.com" at the end of the /private/etc/hosts file on your Mac.
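Adding the hosts line can be scripted so that it is not appended twice. The following is a minimal sketch, assuming a plain-text hosts file; add_hosts_entry is a hypothetical helper name, and the check is a simple whole-word match on the host name, so it also counts commented-out entries as present.

```shell
#!/bin/sh
# Hypothetical helper: append "<ip> <name>" to a hosts file unless the name
# already appears in it. Arguments: hosts file, IP address, host name.
add_hosts_entry() {
    hosts_file=$1
    entry_ip=$2
    entry_name=$3
    # -F: match the name literally (dots are not wildcards); -w: whole word only
    if grep -qwF -- "$entry_name" "$hosts_file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$entry_ip" "$entry_name" >> "$hosts_file"
}
```

On the Mac this would be run with root privileges against /private/etc/hosts, for example add_hosts_entry /private/etc/hosts 127.0.0.1 sandbox-hdp.hortonworks.com.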
If your password is rejected as incorrect, you can reset it from recovery mode:
Click the power button in the top-right corner, open the Power Off drop-down and choose Restart. While the machine boots, press the Esc key to get into the recovery menu.
Select Advanced options and press Enter.
Select Recovery mode and press Enter.
Select Root and press Enter.
Remount the root filesystem read-write and list the home directories:
mount -rw -o remount /
ls /home
Change the password, replacing username with your own user:
passwd username
As the last step, enter the new password twice, pressing Enter after each.
Hopefully your password is now changed (:

Error pulling image (latest) from centos, Authentication is required

I have installed docker.io on CentOS 6.4 64 bit following the steps mentioned here: http://nareshv.blogspot.in/2013/08/installing-dockerio-on-centos-64-64-bit.html
Now I am able to start the docker daemon. When I search for a container as follows, I get results:
[root@test ~]# docker search tutorial
Found 8 results matching your query ("tutorial")
NAME DESCRIPTION
mhubig/echo Simple echo loop from the tutorial.
learn/tutorial
jbarbier/tutorial1
mzdaniel/buildbot-tutorial
kyma/ping Ping image from the tutorial.
ivarvong/redis From the redis tutorial. Just redis-server and telnet on the base image.
amattn/postgresql-9.3.0 precise base, PostgreSQL 9.3.0 installed w/ default configuration. http://amattn.com/2013/09/19/tutorial_postgresql_us...
danlucraft/postgresql Postgresql 9.3, on port 5432, un:docker, pw:docker. From following the Postgresql example tutorial.
But when I try to pull a container, I get the error below:
[root@test ~]# docker pull learn/tutorial
Pulling repository learn/tutorial
8dbd9e392a96: Error pulling image (latest) from learn/tutorial, Authentication is required.
2013/10/08 02:50:01 Internal server error: 404 trying to fetch remote history for learn/tutorial
How and where do I set up the authentication? Please help.
I had the same problem and this answer was the solution for me.
It was a time-zone issue. I ran docker on a VM, and my host and guest clocks had different time zones; the authentication failure was due to clock divergence. Once I set up NTP correctly (with the HW clock set to UTC) on my host, the problem went away.

How to run Jprofiler from Windows machine to Remote Linux JVM

Kindly let me know how to run Jprofiler from Windows machine to Remote Linux JVM.
Thanks a lot in advance.
1) Go to the download page, download the .tar.gz distribution and extract it on the remote Linux machine.
2) On the remote Linux machine, start the command line utility bin/jpintegrate, then follow the steps in the command line wizard.
3) Transfer the generated JProfiler config file to your local Windows machine.
4) On your local Windows machine, start the JProfiler GUI and import the config file with Session->Import Session Settings
5) Start the profiled JVM on the remote Linux machine and the imported session in the JProfiler GUI on the Windows machine.
To connect JProfiler on Windows to a JVM on a remote machine (CentOS 7):
Download the Linux version of JProfiler (.tar.gz) on the CentOS machine. JProfiler on Windows and the profiling agent on the remote machine must be the same version. If they are not the same version, the agent will not connect to JProfiler on Windows.
Untar the .tar.gz file.
tar xvzf folder_name
Go to /bin path.
cd folder_name/bin
Run the following command to enable the profiling agent, which serves JVMTI data on a specific port.
./jpenable
The command lists all running JVM processes. Select the process you want to profile (e.g. to profile the 6th of 8 listed processes, enter 6).
Select GUI mode or offline mode. Enter 1. (This option does not exist in old versions.)
Enter the port on which the agent should listen (e.g. 33668).
Now your VM is ready for a connection from JProfiler on Windows.
Connection settings in JProfiler on Windows
Click on start center.
Select a new Session.
Click on attach and select “Attach to remote machine” radio button.
Set ssh tunnel from the drop down.
Click the edit button and configure the direct ssh tunneling connection.
Click next and provide the VM credential.
Manually configure the profiling port. It should be defined at the time of configuring profiling agent.
Click Finish.
Click the 'OK' button and enter the key you received by email.
If the credentials are correct, a prompt will show up. Click the "Configure" button, select the "CPU data", "Call tracer" and "Allocation stack" check boxes, and click OK.
Click the 'OK' button. Congratulations! Your remote VM is now connected to JProfiler on Windows.
To connect to JProfiler remotely, you can follow these steps:
Download the Linux version of JProfiler.
Install it on the Linux system.
Go to the bin folder and run ./jpenable. Follow the wizard to choose the process ID of the JVM you want to profile. After that, it gives you a port number.
Install JProfiler on the local machine, e.g. Windows.
In the Start Center, choose Quick Attach and select the option for another computer. Enter the host address and the port number from step "3"; then you can connect to the remote JVM from JProfiler.

Resources