How do I use flag within a script? - python-3.x

From the docs I can see that developer can utilize all cores available in the computer to speed up the recognition process using --cpus n flag (https://pypi.org/project/face-recognition/)
which is to use when running face_recognition --cpus 4 ./pictures_of_people_i_know/ ./unknown_pictures/ in a command line.
Is there an equivalent of this flag to be used within a script or do I need to use Python's multiprocessing module ? Docs don't mention this.

Related

Why isn't this command running properly?

So I'm making a better command-line frontend for APT and I'm putting on some finishing touches and when the code below runs.
Command::new("unbuffer")
.arg("apt")
.arg("list")
.arg("|")
.arg("less")
.arg("-r")
.status()
.expect("Something went wrong.");
it spits out:
E: Command line option 'r' [from -r] is not understood in combination with the other options.
but when I just run unbuffer apt list | less -r manually in my terminal it works perfectly. How do I get it to run properly when calling it in Rust?
Spawning a process via Command uses the system's native functionality to create a process. This is a low level feature and has little to do with your shell/terminal that you are used to. In particular, your shell (e.g. bash or zsh, running inside of your terminal) offers a lot more features. For example, piping via | is such a feature. Command does not support these features as the low level system's API doesn't.
Luckily, the low level interface offers other means of achieving a lot of stuff. Piping for example is mostly just redirecting the standard inputs and outputs. You can do that with Command::{stdin, stdout, sterr}. Please see this part of the documentation for more information.
There are a few very similar questions, which are not similar enough to warrent closing this as a dupe though:
Execute a shell command
Why does the compgen command work in the Linux terminal but not with process::Command?: mentions shell built-in commands that do not work with Command.
Executing find using std::process::Command on cygwin does not work

How to run tensorflow in single process mode?

I am trying to run tensorflow 1.13.1 with Python 2.7 on SLF 6 without GPU support. When I start my model, tensorflow appears to be spawning multiple subprocesses and running my model in parallel, trying to load every core in the system. While in most cases this is what one would probably want, this is not my case. I would like to run my model on single core only.
I have tried setting these variables:
export OMP_NUM_THREADS=1
export KMP_BLOCKTIME=0
export KMP_AFFINITY=granularity=fine,verbose,compact,1,0
in different combinations, but was not able to achieve single-core running.
Is there a way to run Tensorflow in "dumb" single-process mode ?
There are two configurable options regarding parallelism inter_op_parallelism_threads and intra_op_parallelism_threads in the tf.ConfigProto protocol buffer. To use a single process, I think you can try:
import tensorflow as tf
config = tf.ConfigProto(intra_op_parallelism_threads=1,
inter_op_parallelism_threads=1,
allow_soft_placement=True)
There are other possible forms of parallelism, see mrry# 's answer is this thread.

Creating a time serie with top and nodeJS

I would like to create a time series and inject it into InfluxDb for a demo. I thought about using the top command (top -pid 1393 -stats cpu), and use the CPU value. And then use NodeJS to extract the data and inject it into an InfluxDB. However, there are a couple of but...:
1- The top command has a display section: can it be removed?
2- In Node, I would call (repeatedly) "top -pid 1393 -stats cpu -l 1" with the "-l 1" option to only get a singe sample. I feel it is a misuse of the fact that top generates data at given intervals (basically, I recreate in Node what top does automatically)
Is there a better way to do this - in the ideal world, I would launch top in node and "pipe" the output stream to a variable in an async way (to execute the insertion into InfluxDB).
Thanks for any hints you may have.
Christian
In actual fact, there is a node module for this: see npm usage.
Note that you may have an issue when installing the usage module, generated by node-gyp rebuild (usage requires this module, which requires XCode in Mac platform - see https://github.com/nodejs/node-gyp). To solve this, look here: xcode-select active developer directory error)
Thanks - C
To monitor the process' resource consuming metrics (correct me if I took your intentions wrong), you don't need your NodeJS stuff at all.
All you need is running Telegraf agent, with this plugin configured, on the machine(s) you're targeting.
Point the output plugin to your Influx - and that's it.
As mention in other comments, there are specific tools for that task. Anyway, if you still want to do it programmatically, I would suggest to either:
run top in non-interactive mode, as explained here: https://unix.stackexchange.com/questions/255100/get-top-output-for-non-interactive-shell
read directly the information that you need from /proc/<pid of your process>/<file of interest>

Limit number of cores used by OMPython

Background
I need to run a blocks simulation. I've used OMEdit to create the system and I call omc to run the simulation using OMPython with zmq for messaging. The simulation works fine but now I need to move it to a server to simulate the system for long times.
Since the server is shared among a team of people, it uses slurm to queue the jobs. The server has 32 cores but they asked me to use only 8 while I tune my script and then 24 when I want to run my final simulation.
I've configured slurm to call my script in the following manner:
#!/bin/bash
#
#SBATCH --job-name=simulation_Test_ADC2_pipe_4096s
#SBATCH --output=simulation_Test_ADC2_pipe_4096s.txt
#
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=10000
source activate /home/jabozzo/conda_envs/py27
#which python
python ./Test_ADC2_pipe_4096s.py
Then I execute the slurm file using sbatch.
Problem
The omc compilation works fine. When it starts to simulate all the 32 cores of the server become loaded, even if it was configured to only use 8.
I've tried
There are compilation and simulation flags that can be passed to omc. I've tried to use --numProcs (a compilation flag) but this only seem to apply during the compilation process and does not affect the final executable. I've scanned the page of simulation flags looking for something related but it seems there is no option to change the cpu usage.
The only thing that we add when doing our OpenModelica testing in parallel is to add the GC_MARKERS=1 environment variable and --numProcs=1; this makes our nightly library testing of 10000 tests all run in serial. But GC_MARKERS shouldn't affect simulations unless they are allocating extreme amounts of memory. Other than that, OpenModelica simulations are serial unless perhaps you use a parallel blas/lapack/sundials library which might use more cores without OpenModelica knowing anything about it; in that case you would need to read the documentation for the library that's consuming all your resources.
What's a bit surprising is also how slurm allows your process to consume more CPUs than you set; it could use the taskset command to make the kernel force the process to only use certain CPUs.
My server administrator was unsure if taskset would interfere with slurm internals. So we found another option. If omc uses openMP for compilation we can also limit the number of cores replacing the last line of the slurm file with:
OMP_NUM_THREADS=8 python ./Test_ADC2_pipe_4096s.py
I'm leaving this anwser here to complement sjoelund.se anwser

Stanford Parser multithread usage

Stanford Parser is now 'thread-safe' as of version 2.0 (02.03.2012). I am currently running the command line tools and cannot figure out how to make use of my multiple cores by threading the program.
In the past, this question has been answered with "Stanford Parser is not thread-safe", as the FAQ still says. I am hoping to find someone who has had success threading the latest version.
I have tried using -t flag (-t10 and -tLLP) since that was all I could find in my searches, but both throw errors.
An example of a command I issue is:
java -cp stanford-parser.jar edu.stanford.nlp.parser.lexparser.LexicalizedParser \
-outputFormat "oneline" ./grammar/englishPCFG.ser.gz ./corpus > corpus.lex
Starting with version 2.0.5, you can now easily use multiple threads with the option -nthreads k. For example, your command can be like this:
java -mx6g edu.stanford.nlp.parser.lexparser.LexicalizedParser -nthreads 4 edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz file.txt > file.stp
(Releases of version 2 prior to 2013 had no way to enable multithreading from the command-line, but only when using the API.)
Internally, you can simultaneously run as many parsing threads inside one JVM process as you want. You can do this either by getting and using multiple LexicalizedParserQuery objects (via the parserQuery() method) or implicitly by calling apply(...) or parseTree(...) off one LexicalizedParser. The -nthreads k option does this for you by sending successive sentences to different parsers using the Executor framework. You can also simultaneously create multiple LexicalizedParser's, e.g., for parsing different languages.
Multiple LexicalizedparserQuery objects share the same grammar (LexicalizedParser), but the memory space savings aren't huge, as most of the memory goes to the transient structures used in chart parsing. So, if you are running lots of parsing threads concurrently, you will need to give a lot of memory to the JVM, as in the example above.
p.s. Sorry, yes, some of the documentation still needs updating. But -tLPP is one flag for specifying language-specific resources. The Stanford Parser has no -t flag.

Resources