Long-running process gets SIGINT terminated, help analyze strace - node.js

I have a script that fetches data from the internet.
It uses commander, axios, https-proxy-agent, pg and a bunch of other modules, probably unrelated to the issue.
The logic is not complicated: get a list of URLs from the DB, loop downloading data using a proxy and saving it back to the DB, exit.
It works fine, except sometimes the script exits in the middle of the job.
No errors, just sudden termination. I haven't found a way to replicate it reliably.
When launched through an IDEA debugger, the final message reads:
^C
Process finished with exit code 130 (interrupted by signal 2: SIGINT)
To be clear: I haven't pressed Ctrl+C.
strace final lines are:
--- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
+++ killed by SIGINT +++
There's a SIGINT handler attached to the process. It runs if I interrupt the script manually, but not when the script dies on its own.
Attached are two excerpts from the strace output: one where things work as expected, and one showing the final output before the script is terminated.
success.txt
failure.txt
It looks like things break at these lines:
epoll_wait(13, [{events=EPOLLIN, data={u32=19, u64=19}}], 1024, 8131) = 1
read(19, "", 65536) = 0
19 is a socket descriptor.
Could anyone help to figure out what is going on exactly and why the process is terminated, please?
Kubuntu 21.10, Node.js 16.13.
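One way to narrow down where the signal comes from (shown here as a Python sketch, since the same siginfo fields apply to any process, Node.js included): block SIGINT and collect it synchronously with sigwaitinfo(), which exposes the si_code and si_pid that strace prints. si_code=SI_KERNEL, as in the strace above, means the kernel itself generated the signal (for example the terminal line discipline on Ctrl+C), whereas a signal sent by another process via kill() shows SI_USER together with the sender's PID. The self-kill below only simulates an external sender.

```python
import os
import signal

# Block SIGINT so it queues instead of invoking a handler; sigwaitinfo()
# then returns the siginfo fields that strace prints (si_code, si_pid).
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})

# Simulate an external sender by signalling ourselves with kill();
# a kill()-sent signal carries si_code=SI_USER and the sender's PID.
os.kill(os.getpid(), signal.SIGINT)

info = signal.sigwaitinfo({signal.SIGINT})
print(info.si_signo == signal.SIGINT)  # True
print(info.si_pid == os.getpid())      # True: the sender was this process
```

If the failing run showed SI_USER here instead of SI_KERNEL, the si_pid would point straight at whichever process is sending the stray SIGINT.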

Related

Profiling in Odoo

I am new to Odoo and code profiling. I am using py-spy to profile my Odoo code, as I need a flame graph as the output of the profiling. Everything works fine with py-spy, but the issue is that py-spy needs to be stopped by pressing Ctrl+C in the terminal where it is running, or by shutting the Odoo server down. I can't stop or reset the Odoo server, nor can I press Ctrl+C on the server.
I tried to do this.
To start py-spy
def start_pyflame(self):
    pyflame_started = self.return_py_spy_pid('py-spy')
    error = False
    if not pyflame_started:
        self.start_pyflame()
    else:
        error = 'PyFlame Graph process already created. Use Stop button if needed.'
        _logger.error(error)
which is working fine, the problem is with this one
def stop_pyflame_and_download_graph(self):
    pyflame_running = self.return_py_spy_pid('py-spy')
    if pyflame_running:
        subprocess.run(["sudo", "pkill", "py-spy"])
Now the issue: when I kill the process with pkill or kill, it terminates py-spy abruptly, so the output file is never generated.
Is there any way to stop or soft-kill py-spy so that the output file will be created?
Thanks in advance for any help.
After some research, I learned that these kill commands simply terminate the process, whereas in this case we need to stop it gracefully.
This thing I have achieved by
sudo kill -SIGINT <pid>
Despite its name, this command does not kill/terminate the process outright; it sends an interrupt signal, which py-spy handles by writing its output file before exiting.
This worked for me.
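The difference can be sketched with a stand-in child process (a hypothetical script, not py-spy itself): a blunt kill gives the profiler no chance to write anything, while SIGINT lets its handler flush the output file before exiting.

```python
import os
import signal
import subprocess
import sys
import tempfile
import time

# The child stands in for py-spy: it writes its "flame graph" only when
# asked to stop via SIGINT.
out = os.path.join(tempfile.mkdtemp(), "flame.svg")
child = subprocess.Popen([sys.executable, "-c", (
    "import signal, sys, time\n"
    "def flush(signum, frame):\n"
    f"    open({out!r}, 'w').write('flame graph data')\n"
    "    sys.exit(0)\n"
    "signal.signal(signal.SIGINT, flush)\n"
    "time.sleep(60)\n"
)])
time.sleep(1)                     # give the child time to install its handler
child.send_signal(signal.SIGINT)  # in-Python equivalent of `kill -SIGINT <pid>`
child.wait()
print(os.path.exists(out))        # True: the output survived the stop
```

Had the child been stopped with SIGKILL instead, the handler would never run and the file would not exist.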

Nextflow error in 1 process stops all other processes

When I submit a job with Nextflow, one of the processes fails because of a corrupted file. Obviously I can remove that file from the job list, but I don't want this to happen in the future when I scale it up. By default, the failure stops all the other processes (9) from running and the Nextflow job finishes.
How do I stop this one failed job from affecting the others?
I found the answer after some more digging in the docs (https://www.nextflow.io/docs/latest/process.html#errorstrategy). I needed to add errorStrategy 'finish' to my process:
process ignoreAnyError {
    errorStrategy 'finish'

    script:
    <your command string here>
}

How to completely exit a running asyncio script in python3

I'm working on a server bot in python3 (using asyncio), and I would like to incorporate an update function for collaborators to instantly test their contributions. It is hosted on a VPS that I access via ssh. I run the process in tmux and it is often difficult for other contributors to relaunch the script once they have made a commit, etc. I'm very new to python, and I just use what I can find. So far I have used subprocess.Popen to run git pull, but I have no way for it to automatically restart the script.
Is there any way to terminate a running asyncio loop (ideally without errors) and restart it again?
You cannot restart an event loop that has been stopped by event_loop.stop().
And in order to incorporate the changes, you have to restart the script anyway (some methods might not exist on the objects you have, etc.).
I would recommend something like:
async def git_tracker():
    # check for changes in version control, maybe wait for a sync point and then:
    sys.exit(0)

asyncio.ensure_future(git_tracker())
This raises SystemExit, but despite that the program exits cleanly.
And wrap the python $file.py invocation in a shell loop: while true; do git pull && python $file.py ; done
This is (as far as I know) the simplest approach to solve your problem.
For your use case, to stay on the safe side, you would probably need to kill the process and relaunch it.
See also: Restart process on file change in Linux
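The pattern above can be condensed into a runnable sketch (git_tracker here exits immediately instead of actually polling version control): a SystemExit raised inside a task is not swallowed by asyncio; it propagates out of the loop and terminates the process, which the outer shell loop then relaunches.

```python
import asyncio
import sys

async def git_tracker():
    # stand-in for "check version control, wait for a sync point":
    await asyncio.sleep(0)
    sys.exit(0)             # raises SystemExit; asyncio lets it propagate

async def main():
    asyncio.ensure_future(git_tracker())
    await asyncio.sleep(1)  # the rest of the bot would keep running here

exit_code = None
try:
    asyncio.run(main())
except SystemExit as e:     # in real use, the shell wrapper relaunches us
    exit_code = e.code
print("exit code:", exit_code)
```

In the real script nothing catches the SystemExit, so the interpreter exits with status 0 and the while-true loop pulls and restarts it.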
As a necromancer, I thought I'd give an up-to-date solution, one we use on our UNIX systems.
Using the os.execl function you can tell python to replace the current process with a new one:
These functions all execute a new program, replacing the current process; they do not return. On Unix, the new executable is loaded into the current process, and will have the same process id as the caller. Errors will be reported as OSError exceptions.
In our case, we have a bash script which executes killall python3.7, sending the SIGTERM signal to our Python apps, which in turn listen for it via the signal module and gracefully shut down:
loop = asyncio.get_event_loop()
loop.call_soon_threadsafe(loop.stop)
sys.exit(0)
The script then starts the apps in the background and finishes.
Note that killall python3.7 will send SIGTERM signal to every python3.7 process!
When we need to restart, we just run the following:
os.execl("./restart.sh", 'restart.sh')
The first parameter is the path to the file and the second is the name the new process will see as its argv[0].
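A minimal demonstration of the exec behaviour quoted above (run in a throwaway child process so it doesn't replace this interpreter): os.execl swaps in a new program without returning, and the PID stays the same across the replacement.

```python
import subprocess
import sys

# The child prints its PID, then execs a fresh Python that prints its PID
# again; both lines show the same PID because exec replaces the process
# image in place rather than forking a new process.
code = (
    "import os, sys\n"
    "print(os.getpid())\n"
    "sys.stdout.flush()\n"  # flush before exec: buffered output is discarded
    "os.execl(sys.executable, sys.executable,"
    " '-c', 'import os; print(os.getpid())')\n"
)
result = subprocess.run([sys.executable, "-c", code],
                        capture_output=True, text=True)
first, second = result.stdout.split()
print(first == second)  # True: same PID before and after the exec
```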

CImage::Save caused error Thread 0x**** has exited with code 1 (0x1)

I'm working on a small app on Windows.
It uses CImage to write a PNG file on disk.
It just goes like this:
CImage theImage;
...
theImage.Save("D:\\xxx.png");
After the file was written to disk, I clicked the close button in the top-right corner to exit the program. The console showed me a message like this:
Thread 0x**** has exited with code 1 (0x1)
Program "[*****] xxx.exe" has exited with code 0 (0x0).
Code 0x1 should indicate an error, right? It seems something went wrong while the thread created by CImage::Save was writing the file.
The image file is on the disk and perfectly fine. I also checked the return value of Save; it indicated success too.
I have walked through all my code and I'm sure the message is caused by the invocation of CImage::Save; if I don't call it, it never pops up. That is, the console looks like this:
Program "[*****] xxx.exe" has exited with code 0 (0x0).
I did some searching and found this post, but they didn't work it out either.
Even though the program didn't crash, this message still annoys me.
Any ideas? Thanks a lot.
"Code 0x1 should indicate an error, right?"
Only in the most abstract meaning of 'error'. Whoever authored the creation and destruction of that thread decided what value to return and what that value meant. It could be that some library required cleanup which didn't get done in time, or one of a thousand other possible causes. Not something to spend time on.

python script gets killed by test for stdout

I'm writing a CGI script that is supposed to send data to a user until they disconnect, then run logging tasks afterwards.
THE PROBLEM: instead of the break executing and the logging completing when the client disconnects (detected by the inability to write to the stdout buffer), the script ends or is killed (I cannot find any logs anywhere for how this exit occurs).
Here is a snippet of the code:
for block in r.iter_content(262144):
    if stopRecord == True:
        r.close()
    if not block:
        break
    if not sys.stdout.buffer.write(block):  # the code fails here after a client disconnects
        break
cacheTemp.close()
#### write data to other logs and exit gracefully ####
I have tried using "except:" as well as "except SystemExit:" but to no avail. Has anyone been able to solve this problem? (It is for a CGI script which is supposed to log when the client terminates their connection)
UPDATE: I have now tried using signal to interrupt the kill process in the script, which also didn't work. Where can I see an error log? I know exactly which line fails and under which conditions, but there is no error log or anything like I would get if I ran a script which failed in a terminal.
When you say it kills the program, you mean the main python process exits - and not by some thrown exception? That's kinda weird. A workaround might be to have the task run in a separate Thread or process, and then monitor that until it dies and subsequently execute the second task.
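One hedged guess at the mechanism, with a way to make the failure catchable: if SIGPIPE is not being ignored when the client's side of the connection closes, the write kills the process before Python can raise anything. Ignoring SIGPIPE explicitly turns the dead pipe into a BrokenPipeError that the loop above could catch and use to trigger the logging tasks. The pipe below simulates a disconnected client.

```python
import os
import signal

# Ensure a write to a closed pipe raises BrokenPipeError instead of
# delivering a fatal, unlogged SIGPIPE.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)

r, w = os.pipe()
os.close(r)                   # simulate the client disconnecting
disconnected = False
try:
    os.write(w, b"data")      # stands in for sys.stdout.buffer.write(block)
except BrokenPipeError:
    disconnected = True       # run the logging tasks here, then exit
print(disconnected)  # True
```

CPython normally installs this SIG_IGN itself at startup, so whether this helps depends on how the web server spawns the CGI process; treat it as a diagnostic to try, not a guaranteed fix.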
