Why does gem5 hit a deadlock error when running PARSEC 3.0? - linux

I run gem5 in full-system mode on a multi-core system: I use the atomic CPU to create a checkpoint and then switch to the O3 CPU to continue. I execute a command similar to the following:
./build/ARM_MOESI_hammer/gem5.opt -d fs_results/blackscholes configs/example/fs.py --ruby --num-cpus=64 --caches --l2cache --cpu-type=AtomicSimpleCPU --network=garnet2.0 --disk-image=$M5_PATH/disks/expanded-linaro-minimal-aarch64.img --kernel=/home/GEM5/gem5/2017sys/binaries/vmlinux.vexpress_gem5_v1_64.20170616 --param 'system.realview.gic.gem5_extensions = True'
Next, I create a checkpoint and use the following command to restore it and run PARSEC.
./build/ARM_MOESI_hammer/gem5.opt -d fs_results/blackscholes configs/example/fs.py --ruby --num-cpus=64 --caches --l2cache --cpu-type=AtomicSimpleCPU --network=garnet2.0 --disk-image=$M5_PATH/disks/expanded-linaro-minimal-aarch64.img --kernel=/home/GEM5/gem5/2017sys/binaries/vmlinux.vexpress_gem5_v1_64.20170616 --param 'system.realview.gic.gem5_extensions = True' --restore-with-cpu=DeriveO3CPU --script=../arm-gem5-rsk/parsec_rcs/blackscholes_simsmall_64.rcS -r 1
But I encountered the following problems:
First, the .rcS file is not executed. Does restoring from a checkpoint conflict with the --script option?
Second, when I manually enter the following in the operating system booted by gem5:
parsecmgmt -a run -c gcc-hooks -i simsmall -n 1 -p blackscholes
I got the following error:
panic: Possible Deadlock detected. Aborting!
I searched for a solution online; it seems there used to be a workaround of adding the parameter --garnet-network=flexible, but that option no longer exists in gem5 20.0.
Can someone help me solve this deadlock problem? For what it's worth, when running the facesim program with the 'test' input, I do get correct results.

Related

Why might I get this error on a script that has been running fine for a year? - sudo: sorry, you must have a tty to run sudo

I have a script that runs nightly. The userid is set up in sudoers to perform these functions. I do not intend to disable "Defaults requiretty", particularly without knowing why it's suddenly a problem now.
Here's what it does with sudo:
sudo lvcreate --size 19000M --snapshot --name snap_u /dev/mapper/vg_u-lvu
sudo mount /dev/vg_u/snap_u /snapshot
sudo rsync -av --delete --bwlimit=12000 --exclude usr/spoolhold --exclude email --exclude tempfile /snapshot/ /u1/prev/dir
sudo umount /snapshot
sudo lvremove -f /dev/vg_u/snap_u
For the past few weeks it doesn't work most of the time. Sometimes when I run the commands "manually" it works fine. When it fails I see this message filling the log file:
sudo: sorry, you must have a tty to run sudo
The problem began when I switched some other scripts for a remote backup. The only things I changed in this script were comments. This script is invoked by an application program that uses ‘nohup’ to run it in the background.
During my testing I killed the process to stop it from running in the background when I wanted to run it again immediately. Since then I’ve had this problem. So, my questions are these:
Could this error be related to ‘killing’ those processes (Maybe I killed the wrong one)?
Any ideas for a solution?
1) Could this error be related to ‘killing’ those processes (Maybe I killed the wrong one)?
No
2) Any ideas for a solution?
This is related to the requiretty option in /etc/sudoers. It was probably changed there, or in the shipped defaults, during one of the updates. Turn it off and you should be good.
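For reference, the relevant sudoers directives look roughly like this (a sketch; the user name is a placeholder, and sudoers should only ever be edited with visudo):

```
# /etc/sudoers -- always edit with visudo
# Defaults    requiretty            # the setting causing "you must have a tty"
Defaults    !requiretty             # turn it off globally, or...
# Defaults:backupuser !requiretty   # ...exempt only the script's user (placeholder name)
```

The per-user exemption is the safer choice, since it leaves the tty requirement in place for everyone else.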

How to run command during Docker build which requires a tty?

I have some script I need to run during a Docker build which requires a tty (which Docker does not provide during a build). Under the hood the script uses the read command. With a tty, I can do things like (echo yes; echo no) | myscript.sh.
Without it I get strange errors I don't completely understand. So is there any way to use this script during the build (given that it's not mine to modify)?
EDIT: Here's a more definite example of the error:
FROM ubuntu:14.04
RUN echo yes | read
which fails with:
Step 0 : FROM ubuntu:14.04
---> 826544226fdc
Step 1 : RUN echo yes | read
---> Running in 4d49fd03b38b
/bin/sh: 1: read: arg count
The command '/bin/sh -c echo yes | read' returned a non-zero code: 2
From the RUN <command> section of the Dockerfile reference:
shell form, the command is run in a shell, which by default is /bin/sh -c on Linux or cmd /S /C on Windows
Let's see what exactly /bin/sh is in ubuntu:14.04:
$ docker run -it --rm ubuntu:14.04 bash
root@7bdcaf403396:/# ls -n /bin/sh
lrwxrwxrwx 1 0 0 4 Feb 19 2014 /bin/sh -> dash
/bin/sh is a symbolic link to dash; see the read builtin in dash:
$ man dash
...
read [-p prompt] [-r] variable [...]
The prompt is printed if the -p option is specified and the standard input is a terminal. Then a line
is read from the standard input. The trailing newline is deleted from the line and the line is split as
described in the section on word splitting above, and the pieces are assigned to the variables in order.
At least one variable must be specified. If there are more pieces than variables, the remaining pieces
(along with the characters in IFS that separated them) are assigned to the last variable. If there are
more variables than pieces, the remaining variables are assigned the null string. The read builtin will
indicate success unless EOF is encountered on input, in which case failure is returned.
By default, unless the -r option is specified, the backslash ``\'' acts as an escape character, causing
the following character to be treated literally. If a backslash is followed by a newline, the backslash
and the newline will be deleted.
...
The key point for read in dash:
At least one variable must be specified.
Now let's see read in bash:
$ man bash
...
read [-ers] [-a aname] [-d delim] [-i text] [-n nchars] [-N nchars] [-p prompt] [-t timeout] [-u fd] [name...]
If no names are supplied, the line read is assigned to the variable REPLY. The return code is zero,
unless end-of-file is encountered, read times out (in which case the return code is greater than
128), or an invalid file descriptor is supplied as the argument to -u.
...
So I guess your script myscript.sh starts with #!/bin/bash (or something else), not #!/bin/sh.
Alternatively, you can change your Dockerfile like this:
FROM ubuntu:14.04
RUN echo yes | read ENV_NAME
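The difference is easy to check directly from a shell (assuming, as in ubuntu:14.04, that sh is dash or another POSIX shell; giving read a variable name works in both):

```shell
# bash: read with no name assigns the line to $REPLY and returns 0
bash -c 'echo yes | read'; echo "bash: $?"   # prints "bash: 0"
# portable form: give read a variable name, accepted by both bash and dash
sh -c 'echo yes | read VAR'; echo "sh: $?"   # prints "sh: 0"
```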
Links:
https://docs.docker.com/engine/reference/builder/
http://linux.die.net/man/1/dash
http://linux.die.net/man/1/bash
Short answer: you can't do it directly, because neither docker build nor buildx implements /dev/tty or /dev/console. There is a hacky workaround that achieves what you need, but I strongly discourage it, since it breaks the CI model; that is exactly why Docker doesn't implement it.
Hacky solution
FROM ubuntu:14.04
RUN echo yes | read #tty requirement command
As the Docker reference documentation mentions, RUN consists of two stages: executing the command, and committing the result to the image as a new layer. So you can perform the stages manually yourself, providing a tty for the first stage (execution) and then committing the result.
Code:
cd
cat > tty_wrapper.sh << 'EOF'
echo yes | read ## Your command which needs tty
rm /home/tty_wrapper.sh
EOF
docker run --interactive --tty --detach --privileged --name name1 ubuntu:14.04
docker cp tty_wrapper.sh name1:/home/
docker exec name1 bash -c "cd /home && chmod +x tty_wrapper.sh && ./tty_wrapper.sh "
docker commit name1 your:tag
Your new image is ready.
Here is a description of the code.
First we create a bash script that wraps the tty-requiring command and deletes itself after its first execution. Then we run a container with the --tty option provided (you can drop --privileged if you don't need it). Next we copy the wrapper script into the container and perform the execution and commit stages ourselves.
You don't need a tty to feed data to your script. Doing something like (echo yes; echo no) | myscript.sh, as you suggested, will do. Also, make sure you copy the file into the image before trying to execute it, with something like COPY myscript.sh myscript.sh.
Most likely you don't need a tty. As the comment on the question shows, even the example provided is a situation where the read command was not properly called. A tty would turn the build into an interactive terminal process, which doesn't translate well to automated builds that may be run from tools without terminals.
If you need a tty, then there's the C library call openpty that you would use when forking a process that includes a pseudo-tty. You may be able to solve your problem with a tool like expect, but it's been so long that I don't remember whether it creates a pty or not. Alternatively, if your application can't be built automatically, you can manually perform the steps in a running container and then docker commit the resulting container to make an image.
I'd recommend against any of those and to work out the procedure to build your application and install it in a non-interactive fashion. Depending on the application, it may be easier to modify the installer itself.
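One more option worth knowing about, if a single command genuinely insists on a tty: the script utility from util-linux allocates a pseudo-terminal for the command passed to -c. This is a sketch of the idiom, not Docker-specific, and the flags shown are util-linux's (BSD/macOS script takes different options):

```shell
# script gives the child command a pty even when the caller has none.
# -q: quiet, -e: propagate the child's exit code, /dev/null: discard the typescript.
script -qec "tty" /dev/null   # the child sees a real pty device such as /dev/pts/0
```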

How to reset tty after exec-ed program crashes?

I am writing a Ruby wrapper around Docker and nsenter. One of the command my tool provides is to start a Bash shell within a container. Currently, I am doing it like this:
payload = "sudo nsenter --target #{pid(container_name)} --mount --uts --ipc --net --pid -- env #{env} /bin/bash -i -l;"
Kernel.exec(payload)
In Ruby, Kernel#exec relies on the exec(2) syscall, hence there is no fork.
One issue is that the container sometimes dies prematurely, which effectively kills my newly created Bash prompt. I then get back the prompt originally used to run my Ruby tool, but I cannot see what I am typing anymore; the tty seems broken, and running reset effectively solves the issue.
I'd like to conditionally run reset if the program I exec-ed crashes. I found that the following works well:
$ ./myrubytool || reset
Except I'd like to avoid forcing people using my tool to append || reset every time.
I have tried the following:
payload = "(sudo nsenter --target #{pid(container_name)} --mount --uts --ipc --net --pid -- env #{env} /bin/bash -i -l) || reset;"
But this surprisingly puts reset in the background (i.e. I can run reset by entering fg). One benefit is that the tty is working properly, but it's not really ideal.
Would you have any idea to solve this issue?
If terminal echo has been disabled in a terminal, then you can run the command stty echo to re-enable the terminal echo. (Conversely, stty -echo disables terminal echo, and stty -a displays all terminal settings.)
This is safe to run even if terminal echo is already enabled, so if you want to play it safe, you can do something like ./myrubytool ; stty echo which will re-enable terminal echo if it is disabled regardless of the exit status of your Ruby program. You can put this in a shell script if you want to.
It might be that there is a way to execute a command when the Ruby program exits (often referred to as a "trap"), but I'm not familiar enough with Ruby to know whether such capabilities exist.
However, if you are creating a script for general use, you probably should look into more robust techniques and not rely on workarounds.
How about this? It should do exactly what you want.
It runs the command in a separate process, waits on it, and if, when it finishes, the return value is not 0, it runs the command reset.
payload = "sudo nsenter --target #{pid(container_name)} --mount --uts --ipc --net --pid -- env #{env} /bin/bash -i -l;"
fork { Kernel.exec(payload) }
pid, status = Process.wait2
unless status.exitstatus == 0
  system("reset")
end
EDIT
If all you want to do is turn echo back on, change the system("reset") line to system("stty echo").

Run Hydra (mpiexec) locally gives strange SSH error

I am trying to run example code from this question: MPI basic example doesn't work but when I do:
$ mpirun -np 2 mpi_test
I get this:
ssh: Could not resolve hostname wvxvw-laptop: Name or service not known
And then the program hangs until interrupted.
wvxvw-laptop is the "host name" of my laptop, which is just that, really, a laptop...
All I want is to try to run the example code, not to set up a network cluster or anything like that.
What did I miss? I'm reading the wiki page http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager but I can't work out the reason.
Sorry, I'm very new to this.
Some more verbose output:
/usr/bin/ssh -x wvxvw-laptop "/usr/lib64/mpich/bin/hydra_pmi_proxy" \
--control-port wvxvw-laptop:54320 --debug --rmk user --launcher ssh \
--demux poll --pgid 0 --retries 10 --usize -2 --proxy-id 0
Formatted for readability. I'm not quite sure why this is even supposed to work (I've never used ssh -x, and I'm not sure what it is supposed to do :/).
mpirun executes your program on all nodes registered in your MPI cluster.
MPI resolves the computer's host name, so you can edit your /etc/hosts to add an entry for wvxvw-laptop.
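A minimal /etc/hosts entry for this (the hostname is the one from the error message; 127.0.1.1 is the Debian/Ubuntu convention for the machine's own name) would be:

```
127.0.0.1   localhost
127.0.1.1   wvxvw-laptop
```

With this in place, Hydra's ssh launch of hydra_pmi_proxy can resolve the local hostname and the job runs on the laptop alone.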

Activating a script using udev when power supply is connected/disconnected

I'm trying to get udev to run a couple of small scripts when I connect/disconnect the power supply. I have the following code in /etc/udev/rules.d/50-caff.rules:
SUBSYSTEM=="power_supply", ENV{POWER_SUPPLY_STATUS}=="Charging", RUN+="/home/haukur/rules/off.sh"
SUBSYSTEM=="power_supply", ENV{POWER_SUPPLY_STATUS}=="Discharging", RUN+="/home/haukur/rules/on.sh"
Here is on.sh:
#!/bin/sh
caffeine -a
and off.sh:
#!/bin/sh
caffeine -d
Anyway, I wrote these, ran udevadm control --reload-rules in bash, and... nothing happened. caffeine doesn't appear to activate at all when I plug or unplug the power supply.
According to /var/log/syslog (Ubuntu's replacement for /var/log/messages) udev recognizes when I pull the plug:
Feb 26 08:44:52 (none) udevd[3838]: starting '/home/haukur/rules/off.sh'
but when it tries to run off.sh (which itself tries to run caffeine), it returns this error:
udevd[2719]: '/home/haukur/rules/off.sh'(err) '** (caffeine:3840): WARNING **: Command line `dbus-launch --autolaunch=62907743a139af9b3c86412e00000026 --binary-syntax --close-stderr' exited with non-zero exit status 1: Autolaunch error: X11 initialization failed.\n'
Do you know any way to get around this? Running Ubuntu 12.04 LTS with xmonad WM.
If the application "caffeine" needs to access your desktop, you probably need to export DISPLAY before calling the program:
export DISPLAY=:0
You may simply prepend this to the command invocation:
DISPLAY=:0 caffeine -a
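Putting that together, off.sh might become something like this. Note that the XAUTHORITY path is an assumption (the default per-user location); many setups also need it so that a udev-spawned process can authenticate to the X server:

```shell
#!/bin/sh
# Hypothetical revised off.sh: point the udev-spawned process at the
# running X session before invoking the desktop application.
export DISPLAY=:0
export XAUTHORITY=/home/haukur/.Xauthority  # assumption: default Xauthority location
caffeine -d
```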
