I have googled many different sources and cannot determine why the ipython kernel has this behavior on a jupyter notebook, so I'm posting here to figure out why.
I'm using notebooks in order to document commandline analysis, and jupyter notebooks are a very useful format to run commands and have them saved into a pdf.
I want to be able to utilize multiple variable expansions in 1 line of a shell magic command in ipython.
Right now I can accomplish this using multiple %env magic commands like so:
#commented code below for context of commands run in a previous cell:
#ssl_logs are some logs of ssl headers
#ssl_log = f"SOMEDIR/ssl.log"
#!cat {ssl_log} | jq -r '.ja3' | sort | uniq > uniq_ja3.txt
#j3_ua_db = "ja3fingerprint_ua.json" #HTTP user-agent strings associated with each JA3 hash
#j3_hash_db = "ja3fingerprint_hash.json" #before/after hashing
#code I'm running in the specific cell with issues
for j in ja3s:
    %env j {j}
    %env j3_ua_db {j3_ua_db}
    !echo $j $j3_ua_db
    !grep "$j" "$j3_ua_db" | jq -cr "{ ja3fp, useragent }"
    break
Its output is below:
env: j=021d3c3f14b88d57a9ce2d946cabe87f
env: j3_ua_db=ja3fingerprint_ua.json
021d3c3f14b88d57a9ce2d946cabe87f ja3fingerprint_ua.json
{"ja3fp":"021d3c3f14b88d57a9ce2d946cabe87f","useragent":"curl"}
{"ja3fp":"021d3c3f14b88d57a9ce2d946cabe87f","useragent":"curl/7.29.0"}
.....
However, what I want is to expand the j and j3_ua_db variables directly on the grep command and pipe the result into jq. When I run the command below, the second variable does not expand, and I think it's because IPython won't expand multiple Python variables on the same line.
for j in ja3s:
    print(j)
    print(j3_ua_db)
    !echo {j} {j3_ua_db}
    !grep {j} {j3_ua_db} | jq -cr "{ ja3fp, useragent }"
    break
OUTPUT:
021d3c3f14b88d57a9ce2d946cabe87f
ja3fingerprint_ua.json
021d3c3f14b88d57a9ce2d946cabe87f ja3fingerprint_ua.json
grep: {j3_ua_db}: No such file or directory
To be clear, these are all example outputs from simulation data. No internal data is being published here.
TLDR:
I understand I can just do this in a shell like so:
for ja3fp in $( cat uniq_ja3.txt ); do
grep "$ja3fp" "$j3_ua_db" | jq -cr '{ ja3fp, useragent }'
done
But I want to be able to have the same expansion variable functionality in ipython on a jupyter notebook for ease of use, rather than needing to create a %env line for every variable I want to use in multiple lines.
Does anyone know how I can expand multiple ipython variables in 1 line of a "!" magic shell command?
Is there some syntax foo I am missing?
EDIT:
My versions are as follows:
python3 --version
Python 3.8.5
jupyter-lab --version
3.3.3
ipython --version
8.2.0
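A note on the braces, offered as a hedged sketch rather than a statement about IPython's internals: as far as I know, IPython expands `{...}` on a `!` line with a `str.format`-style formatter, and when that formatting raises an error it runs the line unexpanded. If that holds, the `{ ja3fp, useragent }` in the jq filter is what breaks the grep line, not the number of variables, and doubling those braces should let both variables expand. The plain-Python analogy below uses a hypothetical helper, `expand`, to mimic that behaviour; it is not IPython's own code.

```python
# Plain-Python analogy for brace handling on an IPython "!" line.
# Assumption: IPython formats the line roughly like str.format and,
# if formatting fails, passes the line to the shell unchanged.

def expand(line, **namespace):
    """Hypothetical stand-in for the expansion step."""
    try:
        return line.format(**namespace)
    except Exception:
        # Formatting failed: fall back to the original, unexpanded line.
        return line

j = "021d3c3f14b88d57a9ce2d946cabe87f"
j3_ua_db = "ja3fingerprint_ua.json"

# Single braces around the jq filter look like a replacement field,
# so formatting fails and nothing on the line is expanded.
single = 'grep {j} {j3_ua_db} | jq -cr "{ ja3fp, useragent }"'
print(expand(single, j=j, j3_ua_db=j3_ua_db))

# Doubled braces are literal braces, so {j} and {j3_ua_db} both expand.
doubled = 'grep {j} {j3_ua_db} | jq -cr "{{ ja3fp, useragent }}"'
print(expand(doubled, j=j, j3_ua_db=j3_ua_db))
```

In the notebook, the corresponding line would presumably be `!grep {j} {j3_ua_db} | jq -cr "{{ ja3fp, useragent }}"`.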
I have a module which contains the python code and I execute it using the following command:
python script.py \
--eps 12 \
--minpts \
--train \
--predict \
--lower_case \
--input_file data.csv \
--dev_file devdata.csv \
--output_dir /output/
All I want to do is to execute the above command through a python function. Is there any way of doing it?
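One way to do this, sketched here under assumptions, is the standard `subprocess` module: build the argument list and hand it to `subprocess.run`. The function names `build_command` and `run_script` are made up for illustration, and `--minpts` is passed without a value only because that is how it appears in the command above.

```python
import subprocess
import sys

def build_command(script="script.py"):
    """Return the command as a list, one element per argument (no shell quoting needed)."""
    return [
        sys.executable, script,   # run the script with the current interpreter
        "--eps", "12",
        "--minpts",
        "--train",
        "--predict",
        "--lower_case",
        "--input_file", "data.csv",
        "--dev_file", "devdata.csv",
        "--output_dir", "/output/",
    ]

def run_script(script="script.py"):
    """Run the script and raise CalledProcessError if it exits with a non-zero status."""
    return subprocess.run(build_command(script), check=True)
```

Calling `run_script()` from the directory that contains script.py would then behave like typing the command by hand.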
I don't know why everyone is having such difficulty with this question; it's perfectly clear. Unfortunately, it's also awkward to do what you want, because Python is dynamically typed and every command-line argument is passed to the script as a string, so you have to convert the values yourself.
I'd check this out for the details:
https://www.tutorialspoint.com/python/python_command_line_arguments.htm
but the tldr is to have inside your python script:
import sys
eps = int(sys.argv[sys.argv.index('--eps') + 1])  # the value follows the --eps flag
minpts = '--minpts' in sys.argv  # True when the flag is present
…
Obviously this isn't ideal or pretty, but it is quick and easy.
Alternatively, you can use the argparse module:
https://docs.python.org/3/library/argparse.html
for a more robust and user-friendly solution. Hope this helps.
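A minimal argparse sketch for a few of the flags from the question might look like this. Treating `--minpts` and `--train` as on/off switches is an assumption based on how they appear in the command; the real script may expect values.

```python
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Sketch of the script's options")
    parser.add_argument("--eps", type=int, required=True)   # converted to int for us
    parser.add_argument("--minpts", action="store_true")    # assumed to be a plain flag
    parser.add_argument("--train", action="store_true")
    parser.add_argument("--input_file")
    # argv=None makes argparse read sys.argv[1:], as in a normal script run.
    return parser.parse_args(argv)

args = parse_args(["--eps", "12", "--minpts", "--input_file", "data.csv"])
print(args.eps + 1, args.minpts, args.train)  # 13 True False
```

Passing an explicit list to `parse_args` keeps the example runnable outside a real command line.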
A.
Edit:
I was missing the ' around eps
Command-Line Arguments
import sys
print('version is', sys.version)
The first line imports a library called sys, which is short for "system". It defines values such as sys.version, which describes which version of Python we are running.
This command tells the python interpreter installed in your machine to run program sys-version.py from the current directory.
Here's another script that does something more interesting:
import sys
print('sys.argv is', sys.argv)
Here's the problem: I have this script foo.py, and if the user invokes it without the --bar option, I'd like to display the following error message:
Please add the --bar option to your command, like so:
python foo.py --bar
Now, the tricky part is that there are several ways the user might have invoked the command:
They may have used python foo.py like in the example
They may have used /usr/bin/foo.py
They may have a shell alias frob='python foo.py', and actually ran frob
Maybe it's even a git alias flab=!/usr/bin/foo.py, and they used git flab
In every case, I'd like the message to reflect how the user invoked the command, so that the example I'm providing would make sense.
sys.argv always contains foo.py, and /proc/$$/cmdline doesn't know about aliases. It seems to me that the only possible source for this information would be bash itself, but I don't know how to ask it.
Any ideas?
UPDATE How about if we limit possible scenarios to only those listed above?
UPDATE 2: Plenty of people wrote very good explanations of why this is not possible in the general case, so I would like to limit my question to this:
Under the following assumptions:
The script was started interactively, from bash
The script was started in one of these 3 ways:
foo <args> where foo is a symbolic link /usr/bin/foo -> foo.py
git foo where alias.foo=!/usr/bin/foo in ~/.gitconfig
git baz where alias.baz=!/usr/bin/foo in ~/.gitconfig
Is there a way to distinguish between 1 and (2,3) from within the script? Is there a way to distinguish between 2 and 3 from within the script?
I know this is a long shot, so I'm accepting Charles Duffy's answer for now.
UPDATE 3: So far, the most promising angle was suggested by Charles Duffy in the comments below. If I can get my users to have
trap 'export LAST_BASH_COMMAND=$(history 1)' DEBUG
in their .bashrc, then I can use something like this in my code:
import os

like_so = None
cmd = os.environ.get('LAST_BASH_COMMAND')  # None when the trap is not installed
if cmd is not None:
    cmd = cmd[8:]  # Remove the history counter
    if cmd.startswith("foo "):
        like_so = "foo --bar " + cmd[4:]
    elif cmd.startswith(r"git foo "):
        like_so = "git foo --bar " + cmd[8:]
    elif cmd.startswith(r"git baz "):
        like_so = "git baz --bar " + cmd[8:]
if like_so is not None:
    print("Please add the --bar option to your command, like so:")
    print(" " + like_so)
else:
    print("Please add the --bar option to your command.")
This way, I show the general message if I don't manage to get their invocation method. Of course, if I'm going to rely on changing my users' environment I might as well ensure that the various aliases export their own environment variables that I can look at, but at least this way allows me to use the same technique for any other script I might add later.
No, there is no way to see the original text (before aliases/functions/etc).
Starting a program in UNIX is done as follows at the underlying syscall level:
int execve(const char *path, char *const argv[], char *const envp[]);
Notably, there are three arguments:
The path to the executable
An argv array (the first item of which -- argv[0] or $0 -- is passed to that executable to reflect the name under which it was started)
A list of environment variables
Nowhere in here is there a string that provides the original user-entered shell command from which the new process's invocation was requested. This is particularly true since not all programs are started from a shell at all; consider the case where your program is started from another Python script with shell=False.
It's completely conventional on UNIX to assume that your program was started through whatever name is given in argv[0]; this works for symlinks.
You can even see standard UNIX tools doing this:
$ ls '*.txt' # sample command to generate an error message; note "ls:" at the front
ls: *.txt: No such file or directory
$ (exec -a foobar ls '*.txt') # again, but tell it that its name is "foobar"
foobar: *.txt: No such file or directory
$ alias somesuch=ls # this **doesn't** happen with an alias
$ somesuch '*.txt' # ...the program still sees its real name, not the alias!
ls: *.txt: No such file or directory
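The same convention can be followed from Python. The sketch below builds the "like so" hint from `argv[0]` rather than a hard-coded name; `missing_bar_message` is a hypothetical helper and the message text is taken from the question. It reflects symlink names but, as the transcript above shows for ls, not aliases.

```python
import os
import shlex

def missing_bar_message(argv):
    """Build the hint from argv, the way ls prefixes errors with its own argv[0]."""
    prog = os.path.basename(argv[0])
    # Quote each piece so the suggested command can be pasted back into a shell.
    example = " ".join(shlex.quote(a) for a in [prog, "--bar", *argv[1:]])
    return "Please add the --bar option to your command, like so:\n    " + example

print(missing_bar_message(["/usr/bin/foo", "x y"]))
```

In a real script the argument would be `sys.argv`.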
If you do want to generate a UNIX command line, use pipes.quote() (Python 2) or shlex.quote() (Python 3) to do it safely.
try:
    from pipes import quote # Python 2
except ImportError:
    from shlex import quote # Python 3
cmd = ' '.join(quote(s) for s in open('/proc/self/cmdline', 'r').read().split('\0')[:-1])
print("We were called as: {}".format(cmd))
Again, this won't "un-expand" aliases, revert to the code that was invoked to call a function that invoked your command, etc; there is no un-ringing that bell.
On Linux, /proc can also be used to look for a git instance in your parent process tree, and discover its argument list:
import re

def find_cmdline(pid):
    return open('/proc/%d/cmdline' % (pid,), 'r').read().split('\0')[:-1]

def find_ppid(pid):
    stat_data = open('/proc/%d/stat' % (pid,), 'r').read()
    # Replace the "(comm)" field, which may contain spaces, before splitting.
    stat_data_sanitized = re.sub('[(]([^)]+)[)]', '_', stat_data)
    return int(stat_data_sanitized.split(' ')[3])

def all_parent_cmdlines(pid):
    while pid > 0:
        yield find_cmdline(pid)
        pid = find_ppid(pid)

def find_git_parent(pid):
    for cmdline in all_parent_cmdlines(pid):
        if cmdline[0] == 'git':
            return ' '.join(quote(s) for s in cmdline)
    return None
See the Note at the bottom regarding the originally proposed wrapper script.
A new more flexible approach is for the python script to provide a new command line option, permitting users to specify a custom string they would prefer to see in error messages.
For example, if a user prefers to call the python script 'myPyScript.py' via an alias, they can change the alias definition from this:
alias myAlias='myPyScript.py $@'
to this:
alias myAlias='myPyScript.py --caller=myAlias $@'
If they prefer to call the python script from a shell script, it can use the additional command line option like so:
#!/bin/bash
exec myPyScript.py "$@" --caller=${0##*/}
Other possible applications of this approach:
bash -c myPyScript.py --caller="bash -c myPyScript.py"
myPyScript.py --caller=myPyScript.py
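On the Python side, the option could be read with argparse, falling back to `argv[0]` when it is absent. This is only a sketch: `build_parser` and `hint` are hypothetical names, and the `--bar` option and message text are borrowed from the question.

```python
import argparse
import os

def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--caller", default=None,
                        help="name to show in error messages (e.g. an alias)")
    parser.add_argument("--bar", action="store_true")
    return parser

def hint(argv):
    """Return the error hint, preferring --caller over the name in argv[0]."""
    args, _unknown = build_parser().parse_known_args(argv[1:])
    name = args.caller or os.path.basename(argv[0])
    return "Please add the --bar option to your command, like so:\n    %s --bar" % name

print(hint(["/usr/bin/myPyScript.py", "--caller=myAlias"]))
```

`parse_known_args` is used so that unrelated arguments do not abort the parse before the hint can be printed.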
For listing expanded command lines, here's a script 'pytest.py', based on feedback by @CharlesDuffy, that lists cmdline for the running python script, as well as the parent process that spawned it.
If the new --caller argument is used, it will appear in the command line, although aliases will have been expanded, etc.
#!/usr/bin/env python
import os, re
# Fields 1 and 4 of /proc/self/stat are our pid and our parent's pid.
with open ("/proc/self/stat", "r") as myfile:
    data = [x.strip() for x in str.split(myfile.readlines()[0],' ')]
pid = data[0]
ppid = data[3]

def commandLine(pid):
    # /proc/<pid>/cmdline holds the arguments separated by NUL bytes.
    with open ("/proc/"+pid+"/cmdline", "r") as myfile:
        return [x.strip() for x in str.split(myfile.readlines()[0],'\x00')][0:-1]

pid_cmdline = commandLine(pid)
ppid_cmdline = commandLine(ppid)
print("%r" % pid_cmdline)
print("%r" % ppid_cmdline)
After saving this to a file named 'pytest.py', and then calling it from a bash script called 'pytest.sh' with various arguments, here's the output:
$ ./pytest.sh a b "c d" e
['python', './pytest.py']
['/bin/bash', './pytest.sh', 'a', 'b', 'c d', 'e']
NOTE: criticisms of the original wrapper script aliasTest.sh were valid. Although the existence of a pre-defined alias is part of the specification of the question, and may be presumed to exist in the user environment, the proposal defined the alias (creating the misleading impression that it was part of the recommendation rather than a specified part of the user's environment), and it didn't show how the wrapper would communicate with the called python script. In practice, the user would either have to source the wrapper or define the alias within the wrapper, and the python script would have to delegate the printing of error messages to multiple custom calling scripts (where the calling information resided), and clients would have to call the wrapper scripts. Solving those problems led to a simpler approach, that is expandable to any number of additional use cases.
Here's a less confusing version of the original script, for reference:
#!/bin/bash
shopt -s expand_aliases
alias myAlias='myPyScript.py'
# called like this:
set -o history
myAlias $@
_EXITCODE=$?
CALL_HISTORY=( `history` )
_CALLING_MODE=${CALL_HISTORY[1]}
case "$_EXITCODE" in
    0) # no error message required
        ;;
    1)
        echo "customized error message #1 [$_CALLING_MODE]" 1>&2
        ;;
    2)
        echo "customized error message #2 [$_CALLING_MODE]" 1>&2
        ;;
esac
Here's the output:
$ aliasTest.sh 1 2 3
['./myPyScript.py', '1', '2', '3']
customized error message #2 [myAlias]
There is no way to distinguish between when an interpreter for a script is explicitly specified on the command line and when it is deduced by the OS from the hashbang line.
Proof:
$ cat test.sh
#!/usr/bin/env bash
ps -o command $$
$ bash ./test.sh
COMMAND
bash ./test.sh
$ ./test.sh
COMMAND
bash ./test.sh
This prevents you from detecting the difference between the first two cases in your list.
I am also confident that there is no reasonable way of identifying the other (mediated) ways of calling a command.
I can see two ways to do this:
The simplest, as suggested by 3sky, would be to parse the command line from inside the python script. argparse can be used to do so in a reliable way. This only works if you can change that script.
A more complex way, slightly more generic and involved, would be to change the python executable on your system.
Since the first option is well documented, here are a bit more details on the second one:
Regardless of the way your script is called, python is run. The goal here is to replace the python executable with a script that checks whether foo.py is among the arguments and, if it is, whether --bar is as well. If not, print the message and return.
In every other case, execute the real python executable.
Now, hopefully, python is run through the following shebang: #!/usr/bin/env python3, or through python foo.py, rather than a variant of #!/usr/bin/python or /usr/bin/python foo.py. That way, you can change the $PATH variable and prepend a directory where your fake python resides.
In the other case, you can replace the /usr/bin/python executable, at the risk of not playing nice with updates.
A more complex way of doing this would probably be with namespaces and mounts, but the above is probably enough, especially if you have admin rights.
Example of what could work as a script:
#!/usr/bin/env bash
function checkbar
{
    for i in "$@"
    do
        if [ "$i" = "--bar" ]
        then
            echo "Well done, you added --bar!"
            return 0
        fi
    done
    return 1
}
command=$(basename "${1:-none}")
if [ "$command" = "foo.py" ]
then
    if ! checkbar "$@"
    then
        echo "Please add --bar to the command line, like so:"
        printf "%q " "$0"
        printf "%q " "$@"
        printf -- "--bar\n"
        exit 1
    fi
fi
/path/to/real/python "$@"
However, after re-reading your question, it is obvious that I misunderstood it. In my opinion, it is all right to just print either "foo.py must be called like foo.py --bar", "please add bar to your arguments" or "please try (instead of )", regardless of what the user entered:
If that's an (git) alias, this is a one time error, and the user will try their alias after creating it, so they know where to put the --bar part
With either /usr/bin/foo.py or python foo.py:
If the user is not really command line-savvy, they can just paste the working command that is displayed, even if they don't know the difference
If they are, they should be able to understand the message without trouble, and adjust their command line.
I know it's a bash task, but I think the easiest way is to modify 'foo.py'. Of course, it depends on how complicated the script is, but maybe it will fit. Here is some sample code:
#!/usr/bin/python
import sys
if len(sys.argv) > 1 and sys.argv[1] == '--bar':
    print('make magic')
else:
    print('Please add the --bar option to your command, like so:')
    print(' python foo.py --bar')
In this case, it does not matter how the user runs the code.
$ ./a.py
Please add the --bar option to your command, like so:
python foo.py --bar
$ ./a.py -dua
Please add the --bar option to your command, like so:
python foo.py --bar
$ ./a.py --bar
make magic
$ python a.py --t
Please add the --bar option to your command, like so:
python foo.py --bar
$ /home/3sky/test/a.py
Please add the --bar option to your command, like so:
python foo.py --bar
$ alias a='python a.py'
$ a
Please add the --bar option to your command, like so:
python foo.py --bar
$ a --bar
make magic
I was trying to capture output of top command using the following python script:
import os
process = os.popen('top')
preprocessed = process.read()
process.close()
output = 'show_top.txt'
fout = open(output,'w')
fout.write(preprocessed)
fout.close()
However, the script does not work for top: it hangs for a long time, although it works well with commands like 'ls'. I have no clue why this is happening.
Since you're waiting for the process to finish, you need to tell top to only print its output once, and then quit.
You can do that by running:
top -n 1
The -b (batch mode) argument is also required when top's stdout is read from Python:
os.popen('top -b -n 1')
top -b -n 1
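The same idea can be sketched with `subprocess.run`, which waits for the command to exit and returns its output in one step. The helper names `capture` and `save_top` are made up for illustration, and the timeout value is an arbitrary assumption to guard against a command that never exits.

```python
import subprocess

def capture(cmd, timeout=10):
    """Run cmd (a list of arguments) and return its stdout as text."""
    result = subprocess.run(cmd, capture_output=True, text=True,
                            timeout=timeout, check=True)
    return result.stdout

def save_top(path="show_top.txt"):
    """Write one batch-mode snapshot of top to a file."""
    # -b: batch mode (no terminal needed), -n 1: print once and quit.
    with open(path, "w") as fout:
        fout.write(capture(["top", "-b", "-n", "1"]))
```

Calling `save_top()` should produce the same file as the os.popen version in the question, without hanging.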
I want to execute an exe file using Python 3.4.
That is,
C:/crf_test.exe -m input.txt output.txt
When I executed this at the command line, the result was:
Go SEARCH
to O
...
But, when I executed this in Python like this:
import os
os.startfile('crf_test.exe -m model.txt test.txt')
Nothing happened (I mean nothing appeared in the result window).
Using os.popen() you can execute a command and read its output:
import os
cmd = os.popen(r'crf_test.exe -m model.txt test.txt')
result = cmd.read()