How to set/define/use sys.argv - python-3.x

I'm fairly new to Python, so please bear with me.
Currently, I'm using Python 3.5 in an Anaconda environment in PyCharm, and I am trying to understand how to set/define/use sys.argv so that I can automate several processes before uploading my changes to GitHub.
For example:
python function/function.py input_folder/input.txt output_folder/output.txt
This means that function.py will take input.txt from input_folder, apply whatever is written in function.py, and store the results in output.txt in output_folder.
However, when I type this into the terminal, I get the following error:
python: can't open file 'function/function.py': [Errno 2] No such file or directory
Then, typing sys.argv into the Python console, I get the following:
['C:\\Program Files (x86)\\JetBrains\\PyCharm 2016.2\\helpers\\pydev\\pydevconsole.py',
'53465',
'53466']
My guess is that if I were to set sys.argv[0:1] correctly, then I should be able to apply function.py to input.txt and store the results into output.txt.
I've already tried to define these directories, but they wouldn't work. Any help would be awesome!

Your issue is that Python cannot find function/function.py relative to the directory you are running the command from. If you are trying to run a script from a subdirectory laid out like so
function
|_function.py
|
input_folder
|_input.txt
|
output_folder
|_output.txt
you must run the command from the directory that contains the function folder, giving the paths relative to it:
python ./function/function.py ./input_folder/input.txt ./output_folder/output.txt
or
python $PWD/function/function.py $PWD/input_folder/input.txt $PWD/output_folder/output.txt
$PWD is a bash variable that gives the current directory
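As a minimal sketch of how function.py itself could pick up the two paths: sys.argv is filled in by the interpreter from the command line, so a script reads it rather than setting it by hand. The process function and its upper-casing step are placeholders for illustration, not part of the original script.

```python
import sys


def process(input_path, output_path):
    """Read input_path, transform its text, and write the result to output_path."""
    with open(input_path, encoding="utf-8") as src:
        text = src.read()
    # Placeholder transformation (an assumption for illustration only).
    result = text.upper()
    with open(output_path, "w", encoding="utf-8") as dst:
        dst.write(result)


if __name__ == "__main__":
    # sys.argv[0] is the script path; the real arguments start at index 1.
    if len(sys.argv) == 3:
        process(sys.argv[1], sys.argv[2])
    else:
        print("usage: python function.py INPUT OUTPUT", file=sys.stderr)
```

Run from the directory that contains the three folders, the command from the question would then pass input_folder/input.txt as sys.argv[1] and output_folder/output.txt as sys.argv[2].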

Related

Execute a subprocess that takes an input file and write the output to a file

I am using a third-party C++ program to generate intermediate results for the Python program that I am working on. The terminal command that I use is as follows, and it works fine.
./ukb/src/ukb_wsd --ppr_w2w -K ukb/scripts/wn30g.bin -D ukb/scripts/wn30_dict.txt ../data/glass_ukb_input2.txt > ../data/glass_ukb_output2w2.txt
If I break it down into smaller pieces:
./ukb/src/ukb_wsd - executable program
--ppr_w2w - one of the options/switches
-K ukb/scripts/wn30g.bin - parameter K indicates that the next item is a file (network file)
-D ukb/scripts/wn30_dict.txt - parameter D indicates that the next item is a file (dictionary file)
../data/glass_ukb_input2.txt - input file
> - shell command to write the output to a file
../data/glass_ukb_output2w2.txt - output file
The above works fine for one instance. I am trying to do this for around 70000 items (input files), so I found a way to do it using the subprocess module in Python. The body of the Python function that I created looks like this.
with open('../data/glass_ukb_input2.txt', 'r') as input, open('../data/glass_ukb_output2w2w_subproc.txt', 'w') as output:
    subprocess.run(['./ukb/src/ukb_wsd', '--ppr_w2w', '-K', 'ukb/scripts/wn30g.bin', '-D', 'ukb/scripts/wn30_dict.txt'],
                   stdin=input,
                   stdout=output)
When I executed the function, it gave the following error (this error has since been resolved; see the EDIT below):
...
STDOUT = subprocess.STDOUT
AttributeError: module 'subprocess' has no attribute 'STDOUT'
Can anyone shed some light on how to solve this problem?
EDIT
The error was due to a file named subprocess.py in the source directory, which shadowed Python's standard subprocess module. Once it was removed, the error went away.
But the program could not identify the input file given in stdin. I am thinking it has to do with having 3 input files. Is there a way to provide more than one input file?
EDIT 2
This problem is now solved with the current approach:
subprocess.run('./ukb/src/ukb_wsd --ppr_w2w -K ukb/scripts/wn30g.bin -D ukb/scripts/wn30_dict.txt ../data/glass_ukb_input2.txt > ../data/glass_ukb_output2w2w_subproc.txt',shell=True)
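An alternative that avoids shell=True, sketched under the assumption that the program accepts the input file as its last positional argument (as in the terminal command above): pass the arguments as a list and let Python handle the redirection through stdout. The run_to_file helper is a hypothetical name, and the demo command is a stand-in so the sketch runs without ukb_wsd installed.

```python
import subprocess
import sys


def run_to_file(command, output_path):
    """Run command (a list of arguments) and write its standard output to output_path."""
    with open(output_path, "w") as output:
        # check=True raises CalledProcessError if the command exits with a non-zero status.
        subprocess.run(command, stdout=output, check=True)


# Stand-in command so the sketch runs anywhere.
demo_command = [sys.executable, "-c", "print('demo output')"]

# For the case in the question, the list would presumably look like this:
# ['./ukb/src/ukb_wsd', '--ppr_w2w', '-K', 'ukb/scripts/wn30g.bin',
#  '-D', 'ukb/scripts/wn30_dict.txt', '../data/glass_ukb_input2.txt']
```

Calling run_to_file once per input file in a loop would cover the many-files case without building shell strings.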

how to get a variable of a python file from bash script

I have a python file, conf.py which is used to store configuration variables. conf.py is given below:
import os
step_number=100
I have a bash script runner.sh which tries to reach the variables from conf.py:
#! /bin/bash
#get step_number from conf file
step_number_=$(python ./conf.py step_number)
However, if I try to print step_number_ with echo $step_number_, it prints an empty value. Can you please help me fix it?
$(command) is replaced with the standard output of the command. So the Python script needs to print the variable so you can substitute it this way.
import os
step_number = 100
print(step_number)
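Since runner.sh passes the variable name as an argument, conf.py could also print only the variable that was asked for. This is a sketch, not the answer's method; get_setting is a hypothetical helper name.

```python
import sys

step_number = 100


def get_setting(name, namespace):
    """Return the value bound to name in namespace (raises KeyError if missing)."""
    return namespace[name]


if __name__ == "__main__" and len(sys.argv) == 2:
    # e.g. `python ./conf.py step_number` prints the value of step_number.
    requested = sys.argv[1]
    if requested in globals():
        print(get_setting(requested, globals()))
    else:
        print("unknown setting: " + requested, file=sys.stderr)
```

Because the printing sits under the __main__ guard, importing conf from other Python code stays silent.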

finding a file using general location in a python script

I am writing a script in Python 3 that takes an input file. The location of this input file differs depending on who runs the script, but it is always in the same directory as the script itself. Rather than passing the location in, I want the script to find the file on its own. The input file always has the same name (infile.txt). To do so, I am using this in Python 3:
path = os.path.join(os.getcwd())
input = path/infile.txt
But it does not return anything. Do you know how to fix it?
os.getcwd() returns the working directory, which can be different from the directory where the script is. The working directory is the directory from which Python was launched.
To find out where the script is, you should use
import os
# Build the path from the script's own directory, joining with os.path.join
# instead of writing path/infile.txt.
input = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'infile.txt')
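The same idea can be sketched with pathlib, assuming Python 3.6 or later. sibling_path is a hypothetical helper name; it takes the script path as a parameter so it can be tried outside a script as well.

```python
from pathlib import Path


def sibling_path(script_file, name):
    """Return the absolute path of name located next to script_file."""
    # resolve() makes the path absolute and follows symlinks,
    # so the result does not depend on the working directory.
    return Path(script_file).resolve().parent / name


# Inside a script this would be called as:
#     infile = sibling_path(__file__, "infile.txt")
```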
If I understand your question properly:
You have a Python script (sample.py) and an input file (sample_input_file.txt) in the same directory, say D:\stackoverflow\sample.py and D:\stackoverflow\sample_input_file.txt respectively.
import os
stackoverflow_dir = os.getcwd()
sample_txt_file_path = os.path.join(stackoverflow_dir, 'sample_input_file.txt')
print(sample_txt_file_path)
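os.path.join() takes further path components as *args after the first argument; the file name should have been passed as one of them. Note that this version works only when the script is run from that directory, because os.getcwd() returns the working directory.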

An error when loading a 2mb dataset of floating points (python)

Does anyone know why I get the error "FileNotFoundError: [Errno 2] No such file or directory: 'bcs.xlsx'" when loading this file? It is 2 MB in size, with around 60,000 rows and 4 columns.
I tried using CSV instead of XLSX but I get the same error, and I've checked hundreds of times that the script and the file are in the same directory.
This is because Python cannot find your file; the error message is accurate.
But there's a misunderstanding in your question: you checked that the file is in the same directory as your script, but that's not the check you need. You have to check that the file is in the current working directory of your Python process.
To see your current working directory, use:
import os
print(os.getcwd())
While we're at it, you can list this directory:
print(os.listdir())
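To make the difference concrete, here is a small diagnostic sketch that checks both places a relative file name might be expected. locate is a hypothetical helper, and script_file stands for the path of the running script (normally __file__).

```python
import os


def locate(filename, script_file):
    """Report where filename would be looked up and whether it exists there."""
    cwd_candidate = os.path.join(os.getcwd(), filename)
    script_dir = os.path.dirname(os.path.abspath(script_file))
    script_candidate = os.path.join(script_dir, filename)
    return {
        # This is the location open('bcs.xlsx') actually uses.
        "cwd": (cwd_candidate, os.path.exists(cwd_candidate)),
        # This is the location the question assumed.
        "script_dir": (script_candidate, os.path.exists(script_candidate)),
    }
```

If only the "script_dir" entry reports that the file exists, the script is being launched from a different working directory.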
I don't know how you execute your script, but if you're using a terminal emulator, the typical way to give a file name to a program is as an argument, for example using argparse, rather than hardcoding the name. If you do it this way, your shell's completion can help you type the file name correctly, like:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('file', type=argparse.FileType('r'))
args = parser.parse_args()
print(args.file.read())
Now in a shell, if you type:
python3 ./thescript.py ./th[TAB]
your shell will autocomplete "./th" to "./thescript.py" (if and only if it exists), greatly reducing the probability of a typo. Typically, if there's a space in the file name, like "the script.py", your shell should properly autocomplete it to the\ script.py.
Also, if you use argparse with argparse.FileType as I did, you'll get a verbose error if the file does not exist:
thescript.py: error: argument file: can't open 'foo': [Errno 2] No such file or directory: 'foo'
But… you already have a verbose error.

Testing python programs without using python shell

I would like to easily test my Python programs without constantly using the Python shell, since each time the program is modified you have to quit, re-enter the Python shell, and import the program again. I am using a 2012 MacBook Pro with OS X. I have the following code:
import sys
def read_strings(filename):
    with open(filename) as file:
        return file.read().split('>')[1:0]
file1 = sys.argv[1]
filename = read_strings(file1)
Essentially I would like to read into and split a txt file containing:
id1>id2>id3>id4
I am entering this into my command line:
pal-nat184-102-127:python_stuff ceb$ python3 program.py string.txt
However when I try the sys.argv approach on the command line my program returns nothing. Is this a good approach to testing code? Could anyone point me in the correct direction?
This is what I would like to happen:
pal-nat184-102-127:python_stuff ceb$ python3 program.py string.txt
['id1', 'id2', 'id3', 'id4']
Let's take this a piece at a time:
However when I try the sys.argv approach on the command line my
program returns nothing
The final result of your program is that it stores a list in the variable filename. It's a little strange to talk about a program "returning" a value. Generally, you want a program to print something out or save something to a file. I'm guessing it would ease your debugging if you modified your program by adding,
print(filename)
at the end: you'd be able to see the result of your program.
could anyone point me in the correct direction?
One other debugging note: it can be useful to write your .py files so that they can be run both independently at the command line and imported in a Python shell. The way you've currently structured your code, this will work poorly. (Starting a shell and then importing your file will cause an error because sys.argv[1] isn't defined.)
A solution to this is to change the bottom section of your code as follows:
if __name__ == '__main__':
    file1 = sys.argv[1]
    filename = read_strings(file1)
The if guard at the top says, "If running as a standalone script, then run what's below me. If you imported me from some place else, then do not execute what's below me."
Feel free to follow up below if I misinterpreted your question.
You never do anything with the result of read_strings. Try:
print(read_strings(file1))
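Putting both suggestions together, a complete program.py might look like the sketch below. One extra point the answers do not mention: the slice [1:0] in the question always produces an empty list, so even with a print call the output would be []. The sketch returns every field instead, which is an assumption based on the expected output shown in the question.

```python
import sys


def read_strings(filename):
    """Return the '>'-separated fields of the file as a list."""
    with open(filename) as file:
        # strip() drops the trailing newline so the last field is clean.
        return file.read().strip().split(">")


if __name__ == "__main__":
    # Runs only when executed as a script, not when imported in a shell.
    if len(sys.argv) > 1:
        print(read_strings(sys.argv[1]))
    else:
        print("usage: python3 program.py FILE", file=sys.stderr)
```

With this layout, `python3 program.py string.txt` prints the list, and `import program` in a shell gives access to read_strings without touching sys.argv.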
