This might seem like a question that has already been asked elsewhere, but I couldn't find anything similar, so I'm asking here.
I have a Go program named abc.go containing two functions, Run() and Stop(), which start and stop the someScript.sh script. Both are called when an API endpoint is hit. I run the program with sudo go run abc.go someFolder/someScript.sh, passing the path to someScript.sh as an argument. In Stop(), I am saving the process group ID and then killing the whole process group.
But when I call Run() and then Stop(), I get this output:
pid=5844 duration=13.667µs err=exec: already started
and the running docker container is not actually stopped (I am checking with docker container ls -a).
The someScript.sh file is:
#!/bin/bash
docker container run --rm --name someContainerName nginx
The abc.go file is:
func Run() {
	someVar = true
	execCMD = exec.Command("/bin/sh", "-c", commandFromTerminal)
	output, err = execCMD.CombinedOutput()
	fmt.Println("Output()=", bp.Output())
	someVar = false
}

func Stop() {
	execCMD.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	start := time.Now()
	syscall.Kill(-execCMD.Process.Pid, syscall.SIGKILL)
	err := execCMD.Run()
	fmt.Printf("pid=%d duration=%s err=%s\n", execCMD.Process.Pid, time.Since(start), err)
}
As far as I understand, the docker command in someScript.sh didn't run the docker container as a child/grandchild of /bin/bash, but rather as a separate process, which the code in my Stop() is unable to actually stop.
Below is the flow diagram as I understand it: calling abc.go internally calls /bin/bash, which runs sudo as its child, and sudo in turn has someScript.sh as its child. Finally there is docker, which I think is not running as a child or grandchild in this hierarchy but as a separate process.
So my question is: how do I stop this docker container when Stop() is called? Or how do I make the docker container run as a descendant in this hierarchy, so that I can kill it using the process group ID method I used above?
PS: I have also tried
err := execCMD.Process.Kill()
if err != nil {
panic(err.Error())
}
execCMD.Process.Release()
but that didn't help either.
docker is just a client for the docker daemon. docker run simply sends a few HTTP requests to the daemon, and the daemon sets up the container and executes it.
So docker run is a grandchild of your Go program, but the nginx processes are descendants of the Docker daemon, and entirely unrelated to your Go program. Mind you, the docker daemon can even be on a different machine, in principle at least.
That being said,
Assigning SysProcAttr after a process has been started has no effect.
You're calling Run in Stop (very suspicious) and you cannot Run a process that has already been started, even after it terminated.
Sending SIGKILL gives docker run no chance to terminate the container. After fixing the other errors, it's possible that the docker daemon takes care of the cleanup due to the --rm flag (I forget how this works, exactly). If not, send SIGTERM instead.
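To make the three points above concrete, here is a minimal sketch (Linux only) that sets SysProcAttr before starting the process, never calls Run a second time, and sends SIGTERM to the process group. The names startInGroup and stopGroup are made up for this example, and "sleep 30" is just a placeholder for the real script:

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// startInGroup starts a shell command in its own process group.
// SysProcAttr has to be set before Start; setting it afterwards has no effect.
func startInGroup(command string) (*exec.Cmd, error) {
	cmd := exec.Command("/bin/sh", "-c", command)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// stopGroup sends SIGTERM to every process in the command's group and then
// reaps the child. The negative PID addresses the whole process group.
func stopGroup(cmd *exec.Cmd) error {
	if err := syscall.Kill(-cmd.Process.Pid, syscall.SIGTERM); err != nil {
		return err
	}
	// Wait returns a non-nil error here because the child died from a signal.
	return cmd.Wait()
}

func main() {
	// "sleep 30" is a placeholder for the real script.
	cmd, err := startInGroup("sleep 30")
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	pid := cmd.Process.Pid
	start := time.Now()
	err = stopGroup(cmd)
	fmt.Printf("pid=%d duration=%s err=%v\n", pid, time.Since(start), err)
}
```

Note that this only stops the docker run client and the shell around it. The container itself still belongs to the daemon, so stopping it reliably would mean asking the daemon, for example with docker stop someContainerName.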
Related
What I'm doing
I am using AWS Batch to run a Docker container for a large compute job. To the best of my knowledge I have configured ECR/ECS correctly, but I am having trouble running the required commands, for reasons beyond my current understanding of Docker (I'm a newbie).
What I need to do is pass the below commands into my application and start my application to perform some heavy computing tasks; all commands listed below must be present.
The Issue(s)
The problem appears when I submit the job to AWS Batch. The service pulls the image from ECR (Amazon Elastic Container Registry) and spins up a compute environment, but the command I pass in fails to run. I go through it below.
"command": [
"mkdir -p logging",
"chmod 777 logging/",
"docker run -t -i -e my-application", # container name
"-e APIKEY",
"-e BASEURI",
"-e APIUSER",
"-v WORKSPACE /logging:/src/log",
"DOCKERIMAGE",
"python my_app.py",
"-t APP_USER",
"-e APP_ENVIRONMENT",
"-u APP_USERNAME",
"-p APP_PASSWORD",
"-i IN_PATH",
"-o OUT_PATH",
"-b tmp/"
]
The command above generates the following error:
container_linux.go:370: starting container process caused: exec: "mkdir -p log": executable file not found in $PATH
I tried passing in a command to echo the $PATH environment variable, but I could not get a response and it resulted in a similar error.
I have successfully run "ls" and was able to see the directory contents of my application.
However, I am not able to run any of the commands I included in the command [] section. I also tried running just python, hoping for a more detailed error, but without success.
Logic in plain English
Create a directory called logging if it doesn't exist
Set the permissions for logging
Run the docker container, passing in the environment variables
Tell docker to run the Python file my_app.py with the expected runtime args
Execute the required logic implemented in the python3 application
Questions
Why can I not create a directory called "logging" here? Where am I?
Am I running these properly as defined by AWS Batch, or by Docker?
What am I missing or where am I going wrong?
AWS Batch high level doc
AWS Batch link specific to what I'm doing
Assuming that you're following the syntax described in the Container Properties section of the AWS docs, you have several problems with the syntax of your command directive.
First
The command directive can only run a single command. You can't mash together a bunch of commands as you're trying to do in your example. If you need to run multiple commands you would need to embed them as an argument to a shell. For example, something like:
command: ["/bin/sh", "-c", "mkdir -p logging; chmod 777 logging; ..."]
Second
You must properly tokenize your command lines. That is, when you type mkdir -p logging at the command prompt, the shell splits this into three parts (or "tokens"): ['mkdir', '-p', 'logging']. You need to do the same thing when building up the list of arguments to command.
This is invalid:
command: ["mkdir -p logging"]
That would look for a command named mkdir -p logging, and of course no such command exists. It would properly be written as:
command: ["mkdir", "-p", "logging"]
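The same lookup can be reproduced outside of AWS Batch. The following is only an illustrative Go sketch (the helper names tokenize and canResolve are made up, and the whitespace splitting ignores shell quoting), but it shows why the untokenized form fails with "executable file not found in $PATH":

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// tokenize splits a command line on whitespace. This is a naive stand-in for
// what a shell does; it does not handle quoting or escapes.
func tokenize(line string) []string {
	return strings.Fields(line)
}

// canResolve reports whether the first element of argv names an executable
// that can be found in $PATH.
func canResolve(argv []string) bool {
	if len(argv) == 0 {
		return false
	}
	_, err := exec.LookPath(argv[0])
	return err == nil
}

func main() {
	untokenized := []string{"mkdir -p logging"}
	tokenized := tokenize("mkdir -p logging")

	// The whole string is treated as the program name, which does not exist.
	fmt.Printf("%q resolvable: %v\n", untokenized, canResolve(untokenized))
	// Only "mkdir" is looked up; "-p" and "logging" are its arguments.
	fmt.Printf("%q resolvable: %v\n", tokenized, canResolve(tokenized))
}
```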
Third
I'm not very familiar with the AWS Batch environment, but it's unlikely you can run a docker command inside a docker container as you're trying to do. It's unclear why you're doing this, though: why not just configure your AWS Batch job with the appropriate image, environment variables, etc.?
Take a look at some of these example job definitions.
I'm not sure whether this could be considered a duplicate, since it's a problem for a specific case.
I have created a docker-outside-of-docker image for my Jenkins agent that restarts the agent automatically without using supervisor (which lacks Python 3.7 support). Since I'm using openjdk:slim as the base image and don't want to install any additional dependencies, I compensate for the lack of tools such as lsof and ps by writing the PID of the started process to a file and then checking whether /proc/<pid>/status exists to validate that the process is still running. This currently works, and it is the main reason I built this solution for automatically starting the agents.
But my question is: is this the best or most appropriate approach?
Please find the following code with the implementation:
#!/bin/bash
set -e
agent_runner() {
    while :
    do
        if [ ! -f "/proc/$(cat /tmp/agent.pid)/status" ]
        then
            curl $JNLP_AGENT_DOWNLOAD_URL -o agent.jar
            java \
                -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=300 \
                -Dhttps.protocols=TLSv1.2 \
                -jar agent.jar \
                -jnlpUrl $JNLP_AGENT_URL \
                -secret $JENKINS_SECRET \
                -workDir "$JENKINS_WORKDIR" &
            echo $! > /tmp/agent.pid
        else
            :
        fi
        sleep 10
    done
}
while :
do
    if [ cat < /dev/tcp/"$TARGET" ]; then
        echo "Starting Agent"
        agent_runner
    else
        echo "Jenkins master is offline, waiting...."
    fi
    sleep 10
done
Link for the repository: https://github.com/thcp/jenkins-agent-dod
If the main process in the container dies, you should let the container die with it.
Docker and the various layers above it have functionality to restart whole containers. There is a docker run --restart option for the basic Docker CLI, an equivalent Docker Compose option, and restarting dying containers after some backoff is the default behavior for Kubernetes pods.
So, if you just let a container die on its own, you’ll have out-of-the-box support for the container engine to restart it, without adding any special support into your image; just set the CMD to the thing you actually need the container to do and go. This approach also has the benefit that if you detect your environment has become unstable (“I depend on a database and it’s unreachable”) the process can choose to abort itself and be restarted later when hopefully the environment has improved.
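As a rough illustration of the "abort and let the engine restart you" idea, here is a small Go sketch. It is not taken from the repository above: checkDependency and the DEPENDENCY_ADDR variable are hypothetical names, and a single TCP connection attempt is only one possible health check.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// checkDependency makes one TCP connection attempt to addr and returns the
// dial error, if any.
func checkDependency(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	// DEPENDENCY_ADDR is a hypothetical variable holding "host:port".
	addr := os.Getenv("DEPENDENCY_ADDR")
	if addr == "" {
		fmt.Println("DEPENDENCY_ADDR not set, nothing to check")
		return
	}
	if err := checkDependency(addr, 3*time.Second); err != nil {
		fmt.Fprintln(os.Stderr, "dependency unreachable, exiting:", err)
		// A non-zero exit ends the container, so a restart policy can take over.
		os.Exit(1)
	}
	fmt.Println("dependency reachable, starting main work")
}
```

The process does no supervision of its own here; whether and when it comes back is left to whatever restart policy the container was started with.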
I am working with Ubuntu 16.04 and I have two shell scripts:
run_roscore.sh : This one fires up a roscore in one terminal.
run_detection_node.sh : This one starts an object detection node in another terminal and should start up once run_roscore.sh has initialized the roscore.
I need both the scripts to execute as soon as the system boots up.
I made both scripts executable and then added the following command to cron:
@reboot /path/to/run_roscore.sh; /path/to/run_detection_node.sh, but it is not running.
I have also tried adding both scripts to the Startup Applications using this command for roscore: sh /path/to/run_roscore.sh and following command for detection node: sh /path/to/run_detection_node.sh. And it still does not work.
How do I get these scripts to run?
EDIT: I used the following command to see the system log for the CRON process: grep CRON /var/log/syslog and got the following output:
CRON[570]: (CRON) info (No MTA installed, discarding output).
So I installed an MTA, and now the system log shows:
CRON[597]: (nvidia) CMD (/path/to/run_roscore.sh; /path/to/run_detection_node.sh)
I am still not able to see the output (which is supposed to be a camera stream with detections, as I see it when I run the scripts directly in a terminal). How should I proceed?
Since I got this working eventually, I am gonna answer my own question here.
I did the following steps to get the script running from startup:
Changed the type of the script from shell to bash (extension .bash).
Changed the shebang statement to be #!/bin/bash.
In Startup Applications, give the command bash path/to/script to run the script.
Basically when I changed the shell type from sh to bash, the script starts running as soon as the system boots up.
Note, in case this helps someone: My intention to have run_roscore.bash as a separate script was to run roscore as a background process. One can run it directly from a single script (which is also running the detection node) by having roscore& as a command before the rosnode starts. This command will fire up the master as a background process and leave the same terminal open for following commands to be executed.
If you can install immortal, you can use its require option to start your services in sequence. For example, this could be the run config in /etc/immortal/script1.yml:
cmd: /path/to/script1
log:
    file: /var/log/script1.log
wait: 1
require:
    - script2
And for /etc/immortal/script2.yml:
cmd: /path/to/script2
log:
    file: /var/log/script2.log
This will try to start both scripts at boot time. The first one, script1, will wait 1 second before starting and will also wait for script2 to be up and running. See more about the wait and require options here: https://immortal.run/post/immortal/
Depending on your operating system, you will need to set up immortaldir; here is how to do it for Linux: https://immortal.run/post/how-to-install/
Going deeper into the topic of supervisors, there are more alternatives; you can find some here: https://en.wikipedia.org/wiki/Process_supervision
If you want to make sure that "Roscore" (whatever it is) gets started when your Ubuntu starts up then you should start it as a service (not via cron).
See this question/answer.
I want to be able to run node inside a docker container, and then be able to run docker stop <container>. This should stop the container on SIGTERM rather than timing out and doing a SIGKILL. Unfortunately, I seem to be missing something, and the information I have found seems to contradict other bits.
Here is a test Dockerfile:
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y curl
RUN curl -sSL http://nodejs.org/dist/v0.11.14/node-v0.11.14-linux-x64.tar.gz | tar -xzf -
ADD test.js /
ENTRYPOINT ["/node-v0.11.14-linux-x64/bin/node", "/test.js"]
Here is the test.js referred to in the Dockerfile:
var http = require('http');
var server = http.createServer(function (req, res) {
console.log('exiting');
process.exit(0);
}).listen(3333, function (err) {
console.log('pid is ' + process.pid)
});
I build it like so:
$ docker build -t test .
I run it like so:
$ docker run --name test -p 3333:3333 -d test
Then I run:
$ docker stop test
Whereupon the SIGTERM apparently doesn't work, causing it to time out 10 seconds later and then die.
I've found that if I start the node task through sh -c then I can kill it with ^C from an interactive (-it) container, but I still can't get docker stop to work. This is contradictory to comments I've read saying sh doesn't pass on the signal, but might agree with other comments I've read saying that PID 1 doesn't get SIGTERM (since it's started via sh, it'll be PID 2).
The end goal is to be able to run docker start -a ... in an upstart job and be able to stop the service and it actually exits the container.
My way to do this is to catch SIGINT (interrupt signal) in my JavaScript.
process.on('SIGINT', () => {
console.info("Interrupted");
process.exit(0);
})
This should do the trick when you press Ctrl+C.
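Keep in mind that docker stop sends SIGTERM rather than SIGINT, so a handler meant to cover docker stop would need to be registered for that signal as well. For comparison, here is the same pattern as a minimal Go sketch. Go is used purely for illustration, the function names are made up, and the program signals itself only so that the demo terminates on its own:

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// notifyShutdown registers for SIGTERM and SIGINT and returns the channel on
// which they are delivered.
func notifyShutdown() <-chan os.Signal {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGTERM, syscall.SIGINT)
	return ch
}

// signalSelf sends SIGTERM to the current process. It stands in for an
// external "docker stop" so the demo does not block forever.
func signalSelf() error {
	return syscall.Kill(os.Getpid(), syscall.SIGTERM)
}

func main() {
	ch := notifyShutdown()
	fmt.Println("pid is", os.Getpid())
	if err := signalSelf(); err != nil {
		fmt.Println("could not send signal:", err)
		return
	}
	sig := <-ch
	fmt.Println("received", sig, "- shutting down")
}
```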
Ok, I figured out a workaround myself, which I'll venture as an answer in the hope it helps others. It doesn't completely answer why the signals weren't working before, but it does give me the behaviour I want.
Using baseimage-docker seems to solve the issue. Here's what I did to get this working with the minimal test example above:
Keep test.js as is.
Modify Dockerfile to look like the following:
FROM phusion/baseimage:0.9.15
# disable SSH
RUN rm -rf /etc/service/sshd /etc/my_init.d/00_regen_ssh_host_keys.sh
# install curl and node as before
RUN apt-get update && apt-get install -y curl
RUN curl -sSL http://nodejs.org/dist/v0.11.14/node-v0.11.14-linux-x64.tar.gz | tar -xzf -
# the baseimage init process
CMD ["/sbin/my_init"]
# create a directory for the runit script and add it
RUN mkdir /etc/service/app
ADD run.sh /etc/service/app/run
# install the application
ADD test.js /
baseimage-docker includes an init process (/sbin/my_init) which handles starting other processes and dealing with zombie processes. It uses runit for service supervision. The Dockerfile therefore sets the my_init process as the command to run on boot, and adds a script under /etc/service for runit to pick up.
The run.sh script is simple:
#!/bin/sh
exec /node-v0.11.14-linux-x64/bin/node /test.js
Don't forget to chmod +x run.sh!
By default, runit will automatically restart the service if it goes down.
Following these steps (and building, running, and stopping as before), the container properly responds to requests for it to shut down, in a timely fashion.
I'm setting up a container with the following Dockerfile
# Start with project/baseline (image with mongo / nodejs / sailsjs)
FROM project/baseline
# Create folder that will contain all the sources
RUN mkdir -p /var/project
# Load the configuration file and the deployment script
ADD init.sh /var/project/init.sh
# src contains a list of folders, each one being a sails app
ADD src/ /var/project/
# Compile the sources / run the services / run mongodb
CMD /var/project/init.sh
The init.sh script is called when the container runs.
It should start a couple of web apps and mongodb.
#!/bin/bash
PROJECT_PATH=/var/project
# Start mongodb
function start_mongo {
    mongod --fork --logpath /var/log/mongodb.log # attempt to have mongo running in daemon
}
# Start services
function start {
    for service in $(ls);do
        cd $PROJECT_PATH/$service
        npm start # Runs sails lift on each service
    done
}
# start mongodb
start_mongo
# start web applications defined in /var/project
start
Basically, there are a couple of nodejs (sailsjs) applications in /var/project.
When I run the container, I get the following message:
$ sudo docker run -t -i projects/test
about to fork child process, waiting until server is ready for connections.
forked process: 10
and then it remains stuck.
How can mongo and the sails processes be started while the container remains in a running state?
UPDATE
I now use this supervisord.conf file
[supervisord]
nodaemon=false
[program:mongodb]
command=/usr/bin/mongod
[program:process1]
command=/bin/bash "cd /var/project/service1 && node app.js"
[program:process2]
command=/bin/bash "cd /var/project/service2 && node app.js"
it is called in the Dockerfile like:
# run the applications (mongodb + project related services)
CMD ["/usr/bin/supervisord"]
As my services depend on mongo having started correctly, and supervisord does not wait for that, the services fail to start. Any idea how to solve that?
By the way, is it good practice to run mongo in the same container?
UPDATE 2
I went back to a service.sh script that is called when the container starts. I know this is not clean (I'll call it temporary until I fix the problem I have with supervisor), but I'm doing the following:
run nohup mongod &
wait 60 sec
run my node (forever) processes
The thing is, the container exits right after the forever processes are started... how can it be kept active?
If you want to cleanly start multiple services inside a container, one option is to use a process supervisor of some sort. One option is documented here, in the official Docker documentation.
I've done something similar using runit. You can see my base runit image here, and a multi-service application image using that here.
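For the start-order problem mentioned in the question's update (services coming up before mongo is ready), one common approach is to poll the database port before launching anything that depends on it, instead of sleeping for a fixed time. Below is a minimal sketch in Go, not something taken from the images linked above: waitForPort is a made-up helper, and 127.0.0.1:27017 assumes mongod listens on its default port on localhost.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// waitForPort polls addr until a TCP connection succeeds or the attempts are
// used up, sleeping for delay between attempts.
func waitForPort(addr string, attempts int, delay time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		conn, err := net.DialTimeout("tcp", addr, time.Second)
		if err == nil {
			return conn.Close()
		}
		lastErr = err
		time.Sleep(delay)
	}
	return fmt.Errorf("%s not reachable after %d attempts: %w", addr, attempts, lastErr)
}

func main() {
	// Assumption: mongod listens on its default port on localhost.
	addr := "127.0.0.1:27017"
	if err := waitForPort(addr, 3, 500*time.Millisecond); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("mongod is accepting connections, dependent services can start")
}
```

An open port only shows that the server accepts connections, not that it is fully ready, so a real check might need to go further than this.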