Cannot execute JAX code: jax.random.PRNGKey(0)

I have a problem with JAX where execution gets stuck on jax.random.PRNGKey(0). The JAX code executed successfully until I killed some background processes; after that, the problem occurred.
import jax

def main():
    jax.random.PRNGKey(0)
I thought some defunct process might have caused this, so I checked the background processes but didn't find any problem.
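One thing worth checking (this is an assumption on my part: that the hang is in backend/device initialization, e.g. a GPU still held by a half-dead process) is whether the hang goes away when JAX is forced onto the CPU platform before import:

import os
os.environ["JAX_PLATFORM_NAME"] = "cpu"  # must be set before jax is imported

import jax

key = jax.random.PRNGKey(0)  # if this returns promptly, suspect GPU/driver state
print(key)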

Related

Python multiprocessing deadlock when calling logger (issue6721)

I have code running in Python 3.7.4 which forks off multiple processes. I believe I'm hitting a known issue (issue6721: https://github.com/python/cpython/issues/50970). I set up the child process to send a "progress report" through a pipe to the parent process, and noticed that sometimes a log statement doesn't get printed and the code gets stuck in a deadlock.
After reading issue6721, I'm still not sure I understand why the parent might hold a logger Handler lock after a log statement has finished executing (i.e., the line that logs has executed and execution has moved to the next line of code). I totally get that in the context of C++ the compiler might rearrange instructions, but I don't fully understand it in the context of Python. In C++ I can use barrier instructions to stop the compiler from moving instructions past a point. Is there something similar in Python to avoid having a held lock copied to the child process?
I have seen solutions using "atfork", but that library seems unsupported (so I can't really use it).
Does anyone know a reliable and standard solution to this problem?
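For what it's worth, on Python 3.7+ the standard library's os.register_at_fork hook can play the role that atfork used to. A minimal sketch, assuming the relevant handlers hang off the root logger: it re-creates each handler's lock in the child right after fork, so a lock copied in a held state cannot deadlock the child.

import logging
import os

def _reinit_logging_locks():
    # Runs in the child immediately after fork(): replace any handler
    # locks that may have been copied from the parent in a held state.
    for handler in logging.getLogger().handlers:
        handler.createLock()

# Available since Python 3.7; the callback fires in the child after os.fork().
os.register_at_fork(after_in_child=_reinit_logging_locks)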

Python3 subprocess module not reading stdout properly

I am running one process using the subprocess module. After starting the process, I read its output line by line in a while loop driven by the poll method. Stdout becomes empty after some time, but the while loop keeps running because the process's poll() is still None. When the process exits, all the remaining output is read.
I tried changing bufsize. When I used strace to track the process, I found that when the main process starts the child process, stdout stops displaying at the same point.
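For reference, a minimal sketch of the read loop being described (the command name is a placeholder). Note that the stall is often the child's doing: libc switches the child's stdout to block buffering when it writes to a pipe, so the parent sees nothing until a buffer fills or the child exits, which matches the strace observation.

import subprocess

# Placeholder command; text mode so readline() returns decoded lines.
proc = subprocess.Popen(
    ["./child_process"],
    stdout=subprocess.PIPE,
    bufsize=1,
    text=True,
)

while proc.poll() is None:
    line = proc.stdout.readline()  # blocks until a full line or EOF
    if line:
        print(line, end="")

# After exit, drain whatever the child flushed at termination.
for line in proc.stdout:
    print(line, end="")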

Jenkins: After build cancel , handle SIGTERM interruption inside the Python script in order to cleanup before finishing

My Jenkins server (version 2.167) is running a shell build job that executes a script written in Python 3.7.0.
Sometimes users need to cancel the build manually (by clicking the red button with the white cross in the Jenkins GUI), and the Python script needs to handle the interruption in order to perform cleanup tasks before exiting. Sometimes the interruption is handled correctly, but at other times it seems that the parent process gets terminated before the Python script can run the cleanup procedure.
At the beginning of the Python script, I defined the following:
import signal
import sys

def cleanup_after_int(signum, frame):
    # some cleanup code here
    sys.exit(0)

signal.signal(signal.SIGINT, cleanup_after_int)
signal.signal(signal.SIGTERM, cleanup_after_int)

# the rest of the script here
Is the code I'm using sufficient, or should I consider something more?
The Jenkins doc for aborting a build is https://wiki.jenkins.io/display/JENKINS/Aborting+a+build
Found a pretty good document showing how this works: https://gist.github.com/datagrok/dfe9604cb907523f4a2f
You describe a race:
it seems that the parent process [sometimes] gets terminated before the Python script can run the cleanup procedure.
It would be helpful to know how you know that, in terms of the symptoms you observe.
In any event, the Python code you posted looks fine. It should work as expected if SIGTERM is delivered to your Python process. Perhaps Jenkins is just terminating the parent bash. Or perhaps both bash and Python are in the same process group and Jenkins signals the whole group. Pay attention to PGRP in ps -j output.
Perhaps your cleanup code is complex and requires resources that are not always available. For example, perhaps stdout is a pipe to the parent, and cleanup code logs to that open file descriptor, though sometimes a dead parent has closed it.
You might consider having the cleanup code "daemonize" first, using this section 3 library call: http://man7.org/linux/man-pages/man3/daemon.3.html. Then your cleanup would at least be less racy, leading to more reproducible results when you test it and when you use it in production.
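Python's stdlib has no direct wrapper for daemon(3), but the same effect can be sketched with the classic double fork. This is illustrative only; the function name is mine, and the error handling a real script would need is omitted.

import os

def daemonize_then(cleanup):
    # Detach from the (possibly dying) parent so cleanup can finish.
    if os.fork() > 0:
        return                  # original process: continue on its own path
    os.setsid()                 # new session: new process group, no tty
    if os.fork() > 0:
        os._exit(0)             # first child exits; grandchild is re-parented
    # Drop inherited stdio; the parent's pipes may already be closed.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    cleanup()
    os._exit(0)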
You could choose to have the parent bash script orchestrate the cleanup:
trap "python cleanup.py" SIGINT SIGTERM
python doit.py
You could choose not to worry about cleaning up on exit at all. Instead, log whatever you've dirtied, and (synchronously) clean that up just before starting, then begin your regularly scheduled script that does the real work. Suppose you create three temp files and wish to tidy them up: append each of their names to /tmp/temp_files.txt just before creating each one. Be sure to flush buffers and persist the write with fsync() or close(), as in the sketch below.
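A sketch of that record-then-clean idea (the manifest path comes from the example above; the helper names are mine):

import os

MANIFEST = "/tmp/temp_files.txt"

def record_temp_file(path):
    # Append the name before creating the file, then force the record to
    # disk so it survives even if this process is killed mid-run.
    with open(MANIFEST, "a") as f:
        f.write(path + "\n")
        f.flush()
        os.fsync(f.fileno())

def clean_previous_run():
    # Synchronously remove leftovers from the last run before starting work.
    if not os.path.exists(MANIFEST):
        return  # first run, or the previous run left nothing behind
    with open(MANIFEST) as f:
        for path in f.read().splitlines():
            try:
                os.remove(path)
            except FileNotFoundError:
                pass  # recorded but never created: no big deal
    os.remove(MANIFEST)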
Rather than logging, you could choose to clean at startup without a log. For example:
$ rm -f /tmp/{1,2,3}.txt
might suffice. If only the first two were created last time, and the third does not exist, no big deal. Use wildcards where appropriate.

Linux console - start process and wait until finished

I have to write a console app which starts another process (a GUI). Then, with another app or an option of the same one, I have to be able to stop the child process. In addition, if the child process is closed from the GUI, I have to be informed so I can do the final tasks (same if it is killed).
I suppose it is good to keep the first (parent) app running while the child (GUI) is working, and then continue with the final tasks. For example, in .NET this is done with Process.WaitForExit() after Process.Start().
Read the wait(2) and exit(2) system call man pages. wait(2) blocks the calling process until one of its children has exited, and exit(2) does the reciprocal: it exits the program and lets the kernel inform the parent process of that, passing along the supplied exit code.
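In Python, the same pattern looks like the sketch below (the GUI command is a placeholder): Popen starts the child and wait() blocks until it ends, returning a negative code when a signal killed it.

import subprocess

proc = subprocess.Popen(["./gui_app"])  # placeholder GUI command

rc = proc.wait()                        # blocks until the child ends
if rc < 0:
    print(f"child killed by signal {-rc}; running final tasks")
else:
    print(f"child exited with code {rc}; running final tasks")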

Shell script process is getting killed automatically

I am facing a problem with a shell script: I have a script that runs in an infinite loop, say with PID X. The process runs for 4-5 hours, but then it gets killed automatically. This happens only on some long-running systems, and sometimes I observe it getting killed after 2 hours as well.
I am not able to find the reason why it goes down, why it gets killed. No one is using the system other than me, and I am running the process as the root user.
Can anyone explain, or suspect a reason for, who is killing the process?
Below is the sample script
#!/bin/bash
until ./test.tcl; do
    echo "Server 'test.tcl' crashed with exit code $?. Respawning..." >&2
done
In the test.tcl script I run an infinite loop; the script traps signals and does some special operations. But we find that test.tcl is also going down.
So is there any way to capture who kills it, and how?
Enable core dumps on your system; it is the most commonly used method for app crash analysis. I know it is a bit painful to gdb a core file, but more or less you can find something out of it.
Here is a reference link for you: http://www.cyberciti.biz/tips/linux-core-dumps.html
Another way to do this is to trace your script with "strace -p PID-X". Note that it will slow down your system, especially over several hours as in your case, but it can be a last resort.
Hope the above is helpful to you.
It would also be good to check all the signals generated and caught by the OS for that specific script at that time; one of those signals might be what is killing your process.
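One way to leave such a trace (a sketch that mirrors the respawn loop above) is a small supervisor that logs which signal, if any, terminated each run:

import signal
import subprocess
import sys

# Respawn ./test.tcl and record how each run ended, so a kill leaves a trace.
while True:
    rc = subprocess.call(["./test.tcl"])
    if rc == 0:
        break
    if rc < 0:
        name = signal.Signals(-rc).name   # e.g. SIGKILL, SIGTERM
        print(f"test.tcl killed by {name}", file=sys.stderr)
    else:
        print(f"test.tcl crashed with exit code {rc}", file=sys.stderr)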
