finding a file using general location in a python script - python-3.x

I am making a script in Python 3. The script takes an input file. Depending on who runs the script, the location of this input file differs, but it is always in the same directory as the script. So instead of giving the script the location of the input file, I want the script to find it itself. The input file always has the same name (infile.txt). To do so, I am doing this in Python 3:
path = os.path.join(os.getcwd())
input = path/infile.txt
But it does not return anything. Do you know how to fix it?

os.getcwd() returns the working directory, which can differ from the directory where the script is located. The working directory is the directory from which Python was launched.
To find out where the script is, you should use:
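`
import os
# Directory containing this script, regardless of where Python was launched from
script_dir = os.path.dirname(os.path.realpath(__file__))
# Build the path with os.path.join rather than with "/"
input_path = os.path.join(script_dir, 'infile.txt')
`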
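
The same lookup can also be written with pathlib. This is a minimal sketch rather than part of the original answer: the helper name sibling_path is made up for illustration, and it assumes the script passes in its own __file__.

```python
from pathlib import Path


def sibling_path(script_file, name):
    """Return the path of `name` in the directory that contains `script_file`."""
    # resolve() makes the path absolute and follows symlinks,
    # much like os.path.realpath does
    return Path(script_file).resolve().parent / name


# Inside the script itself one would call (hypothetical usage):
#     input_path = sibling_path(__file__, "infile.txt")
#     with open(input_path) as infile:
#         ...
```

Because the result is built from the script's own location, it does not change when the script is launched from another directory.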

If I understand your question properly:
You have a Python script (sample.py) and an input file (sample_input_file.txt) in the same directory, say D:\stackoverflow\sample.py and D:\stackoverflow\sample_input_file.txt respectively.
import os
stackoverflow_dir = os.getcwd()
sample_txt_file_path = os.path.join(stackoverflow_dir, 'sample_input_file.txt')
print(sample_txt_file_path)
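os.path.join() takes the directory as its first argument and any number of further path components after it; your file name has to be passed as one of those components, as a string. Note that os.getcwd() only gives the script's directory when the script is launched from D:\stackoverflow.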

Related

Execute a subprocess that takes an input file and write the output to a file

I am using a third-party C++ program to generate intermediate results for the Python program that I am working on. The terminal command that I use looks as follows, and it works fine.
./ukb/src/ukb_wsd --ppr_w2w -K ukb/scripts/wn30g.bin -D ukb/scripts/wn30_dict.txt ../data/glass_ukb_input2.txt > ../data/glass_ukb_output2w2.txt
If I break it down into smaller pieces:
./ukb/src/ukb_wsd - executable program
--ppr_w2w - one of the options/switches
-K ukb/scripts/wn30g.bin - parameter K indicates that the next item is a file (network file)
-D ukb/scripts/wn30_dict.txt - parameter D indicates that the next item is a file (dictionary file)
../data/glass_ukb_input2.txt - input file
> - shell command to write the output to a file
../data/glass_ukb_output2w2.txt - output file
The above works fine for one instance, but I am trying to do this for around 70,000 items (input files), so I found a way to do it using the subprocess module in Python. The body of the Python function that I created looks like this.
with open('../data/glass_ukb_input2.txt', 'r') as input, open('../data/glass_ukb_output2w2w_subproc.txt', 'w') as output:
    subprocess.run(['./ukb/src/ukb_wsd', '--ppr_w2w', '-K', 'ukb/scripts/wn30g.bin', '-D', 'ukb/scripts/wn30_dict.txt'],
                   stdin=input,
                   stdout=output)
This error is no longer there
When I execute the function, it gives an error as follows:
...
STDOUT = subprocess.STDOUT
AttributeError: module 'subprocess' has no attribute 'STDOUT'
Can anyone shed some light on how to solve this problem?
EDIT
The error was due to a file named subprocess.py in the source directory, which shadowed Python's own subprocess module. Once it was removed, the error went away.
But the program could not identify the input file given in stdin. I am thinking it has to do with having 3 input files. Is there a way to provide more than one input file?
EDIT 2
This problem is now solved with the current approach:
subprocess.run('./ukb/src/ukb_wsd --ppr_w2w -K ukb/scripts/wn30g.bin -D ukb/scripts/wn30_dict.txt ../data/glass_ukb_input2.txt > ../data/glass_ukb_output2w2w_subproc.txt', shell=True)
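
For tens of thousands of input files, the same command can probably be run without shell=True by passing the input file as one more argument and letting Python do the redirection. This is only a sketch: run_to_file and ukb_command are hypothetical helper names, and it assumes ukb_wsd reads its input from the last positional argument, as in the terminal command above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def ukb_command(input_path):
    """Build the argument list for one input file (paths taken from the question)."""
    return [
        "./ukb/src/ukb_wsd", "--ppr_w2w",
        "-K", "ukb/scripts/wn30g.bin",
        "-D", "ukb/scripts/wn30_dict.txt",
        str(input_path),  # the input file is an ordinary argument, not stdin
    ]


def run_to_file(command, output_path):
    """Run `command` and write its standard output to `output_path`."""
    with open(output_path, "w") as output:
        # stdout=output plays the role of the shell's ">" redirection;
        # check=True raises CalledProcessError on a non-zero exit status
        return subprocess.run(command, stdout=output, check=True)


if __name__ == "__main__":
    # Stand-in command so the sketch runs without ukb_wsd installed
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "demo_output.txt"
        run_to_file([sys.executable, "-c", "print('demo')"], out)
        print(out.read_text(), end="")
```

With the real program, each call would be run_to_file(ukb_command(infile), outfile) inside a loop over the input files.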

how to get a variable of a python file from bash script

I have a python file, conf.py which is used to store configuration variables. conf.py is given below:
import os
step_number=100
I have a bash script runner.sh which tries to reach the variables from conf.py:
#! /bin/bash
#get step_number from conf file
step_number_=$(python ./conf.py step_number)
However, if I print step_number_ with echo $step_number_, it is empty. Can you please help me fix it?
$(command) is replaced with the standard output of the command. So the Python script needs to print the variable so you can substitute it this way.
import os
step_number = 100
print(step_number)
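
Since runner.sh already passes the variable name as an argument (python ./conf.py step_number), conf.py could also print only the variable that was asked for. A sketch under that assumption; the main function and the error message are illustrative additions, not part of the original answer.

```python
import sys

step_number = 100


def main(argv):
    """Print the value of each simple module-level variable named in argv."""
    for name in argv:
        value = globals().get(name)
        if isinstance(value, (int, float, str)):
            print(value)
        else:
            # Unknown names go to stderr so they never end up in $(...)
            print(f"unknown variable: {name}", file=sys.stderr)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Written this way, importing conf from other Python code prints nothing, while running it as a script with a variable name prints that variable's value for the shell to capture.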

Using Path to check if file exists when running script outside of the directory

So I currently use Path to check if a file exists
from pathlib import Path
if Path("main.conf").is_file():
    pass
else:
    setup_config()
While this works as long as I'm in the directory that contains the script, I'd like it to work from whatever directory I'm in when I run the script. I know it doesn't work because it expects main.conf to be in my current directory, but how do I tell Path to check only the folder where the script is located?
You can resolve the absolute path of the script by using sys.argv[0] and then replace the name of that script with the config file to check, eg:
import sys
import pathlib
path = pathlib.Path(sys.argv[0]).resolve()
if path.with_name('main.conf').is_file():
    # ...
else:
    # ...
Although it seems like you should probably not worry about that check and structure your setup_config so it takes a filename as an argument, eg:
def setup_config(filename):
    # use with to open file here
    with open(filename) as fin:
        # do whatever for config setup
Then wrap your main in a try/except (which will also cover the file not existing or not being openable for other reasons), eg:
path = pathlib.Path(sys.argv[0]).resolve()
try:
    setup_config(path.with_name('main.conf'))
except IOError:
    pass
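
Put together, a runnable version of this idea might look like the sketch below. The key=value file format and the load_config helper are assumptions made for illustration; the original answer leaves the body of setup_config open.

```python
import pathlib


def setup_config(filename):
    """Read key=value lines from `filename` into a dict (format is an assumption)."""
    config = {}
    with open(filename) as fin:
        for line in fin:
            key, sep, value = line.partition("=")
            if sep:  # skip lines that contain no "="
                config[key.strip()] = value.strip()
    return config


def load_config(script_path, name="main.conf"):
    """Load the config file that sits next to `script_path`, or return None."""
    path = pathlib.Path(script_path).resolve().with_name(name)
    try:
        return setup_config(path)
    except OSError:  # missing file, permission problem, ...
        return None
```

A script would call load_config(sys.argv[0]) and fall back to its defaults when the result is None.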

An error when loading a 2mb dataset of floating points (python)

Does anyone know why I get the error "FileNotFoundError: [Errno 2] No such file or directory: 'bcs.xlsx'" when loading this file? It is 2 MB and has around 60,000 rows and 4 columns.
I tried using CSV instead of XLSX but I get the same error, and I've checked hundreds of times that the script and the file are in the same directory.
This is because Python cannot find your file; the error message is not lying.
But there's a misunderstanding in your question: you checked that the file is in the same directory as your script, but that's not the check you need. You have to check that the file is in the current working directory of your Python process.
To see your current working directory, use:
import os
print(os.getcwd())
And while we're at it, you can list this directory:
print(os.listdir())
I don't know how you execute your script, but if you're using a terminal emulator, the usual way to give a file name to a program is as a command-line argument rather than hardcoding it, for example with argparse. If you do it this way, your shell's completion can help you type the file name correctly:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('file', type=argparse.FileType('r'))
args = parser.parse_args()
print(args.file.read())
Now on a shell if you type:
python3 ./thescript.py ./th[TAB]
your shell will autocomplete "./th" to "./thescript.py" (if and only if it exists), greatly reducing the probability of a typo. Typically, if there's a space in the filename, as in "the script.py", your shell should properly autocomplete it to the\ script.py.
Also if you use argparse with the argparse.FileType as I did, you'll have a verbose error in case the file does not exist:
thescript.py: error: argument file: can't open 'foo': [Errno 2] No such file or directory: 'foo'
But… you already have a verbose error.
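
A variant of the argparse idea that also reports where a relative path ends up is sketched below. It uses pathlib.Path as the argument type instead of argparse.FileType; the function names build_parser and main are illustrative choices, not from the original answer.

```python
import argparse
from pathlib import Path


def build_parser():
    """Create a parser with one positional argument: the input file."""
    parser = argparse.ArgumentParser(description="Read a file given on the command line.")
    parser.add_argument("file", type=Path, help="path to the input file")
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and return the file's text."""
    args = build_parser().parse_args(argv)
    # A relative path is interpreted against the current working directory,
    # so printing the resolved path shows where Python is actually looking
    print(f"reading {args.file.resolve()}")
    return args.file.read_text()
```

Calling main() from a script and running it as python3 ./thescript.py ./somefile prints the absolute path before reading, which makes a working-directory mix-up visible immediately.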

Testing python programs without using python shell

I would like to easily test my Python programs without constantly using the Python shell, since each time the program is modified I have to quit, re-enter the Python shell and import the program again. I am using a 2012 MacBook Pro with OS X. I have the following code:
import sys
def read_strings(filename):
    with open(filename) as file:
        return file.read().split('>')[1:0]
file1 = sys.argv[1]
filename = read_strings(file1)
Essentially I would like to read into and split a txt file containing:
id1>id2>id3>id4
I am entering this into my command line:
pal-nat184-102-127:python_stuff ceb$ python3 program.py string.txt
However when I try the sys.argv approach on the command line my program returns nothing. Is this a good approach to testing code, could anyone point me in the correct direction?
This is what I would like to happen:
pal-nat184-102-127:python_stuff ceb$ python3 program.py string.txt
['id1', 'id2', 'id3', 'id4']
Let's take this a piece at a time:
However when I try the sys.argv approach on the command line my
program returns nothing
The final result of your program is that it stores a list in the variable filename. It's a little strange to have a program "return" a value. Generally, you want a program to print something out or save something to a file. I'm guessing it would ease your debugging if you modified your program by adding,
print(filename)
at the end: you'd be able to see the result of your program.
could anyone point me in the correct direction?
One other debugging note: It can be useful to write your .py files so that they can be run both independently at the command line or in a python shell. How you've currently structured your code, this will work semi-poorly. (Starting a shell and then importing your file will cause an error because sys.argv[1] isn't defined.)
A solution to this is to change the bottom section of your code as follows:
if __name__ == '__main__':
    file1 = sys.argv[1]
    filename = read_strings(file1)
The if guard at the top says, "If running as a standalone script, then run what's below me. If you imported me from some place else, then do not execute what's below me."
Feel free to follow up below if I misinterpreted your question.
You never do anything with the result of read_strings. Try:
print(read_strings(file1))
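
Combining both answers, the whole program might look like the sketch below. Two changes go beyond what the answers say: the slice [1:0] in the question always yields an empty list, so it is dropped here, and strip() is added to remove a trailing newline. The usage message is also an illustrative addition.

```python
import sys


def read_strings(filename):
    """Return the '>'-separated fields stored in `filename`."""
    with open(filename) as file:
        return file.read().strip().split(">")


if __name__ == "__main__":
    # Runs only when executed as a script, not when imported in a shell
    if len(sys.argv) == 2:
        print(read_strings(sys.argv[1]))
    else:
        print("usage: python3 program.py FILE", file=sys.stderr)
```

The file can then be run as python3 program.py string.txt, or imported in the Python shell to call read_strings directly without triggering the command-line code.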

Resources