Get filename and arguments from path on Windows system (with Python)

I'm running into an issue when trying to get the filename and arguments from a binary path.
For example, here is the binary path that is giving me trouble:
binaryPath = "C:\Windows\System32\msiexec \V"
Ideally, I would like the result to be:
filename: "msiexec.exe"
arguments: "\V"
Here is what I have tried (this works for 99% of paths with arguments, just not the one above). Evidently the "\V" is messing this up, because os.path treats it as part of the file path.
import os
binaryPath = "C:\Windows\System32\msiexec \V"
fn_with_arguments = os.path.basename(binaryPath)
image = fn_with_arguments[0].replace("'","")
arguments = " ".join(fn_with_arguments[1:])
if image:
    print("Image: {}".format(image))
if arguments:
    print("Arguments: {}".format(arguments))
>>> Image: V
Any ideas? Speed is important here, so I don't really want to split the path into pieces and then iterate to find the piece with a "dot" in it...
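One possible approach, sketched under the assumption that the executable path itself contains no spaces: separate the command from its arguments at the first space, and only then take the basename. The function name split_command is hypothetical, and ntpath is used so the backslash is treated as a separator on any platform. Note that this returns "msiexec" rather than "msiexec.exe", since the input path carries no extension.

```python
import ntpath


def split_command(binary_path):
    """Split a Windows command line into (image name, argument string).

    Assumption: the executable path contains no spaces, so the first
    space marks the start of the arguments.
    """
    command, _, arguments = binary_path.partition(" ")
    # Take the basename only after the arguments have been removed, so a
    # backslash inside the arguments is not mistaken for a path separator.
    return ntpath.basename(command), arguments.strip()


image, arguments = split_command(r"C:\Windows\System32\msiexec \V")
if image:
    print("Image: {}".format(image))
if arguments:
    print("Arguments: {}".format(arguments))
```

This does a single partition and a single basename call, so there is no iteration over the path pieces.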

Related

Python filepaths have double backslashes

Ultimately, I want to loop through every pdf in specified directory ('C:\Users\dude\pdfs_for_parsing') and print the metadata for each pdf. The issue is that when I try to loop through the "directory" I'm receiving the error "FileNotFoundError: [Errno 2] No such file or directory:". I understand this error is occurring because I now have double slashes in my filepaths for some reason.
Example Code
import PyPDF2
import os
path_of_the_directory = r'C:\Users\dude\pdfs_for_parsing'
directory = []
ext = ('.pdf')
def isolate_pdfs():
    for files in os.listdir(path_of_the_directory):
        if files.endswith(ext):
            x = os.path.abspath(files)
            directory.append(x)
    for pdf in directory:
        reader = PyPDF2.PdfReader(pdf)
        information = reader.metadata
        print(information)
isolate_pdfs()
If I print the file paths one at a time, I see that the files have single '/' like I'm expecting:
for pdf in directory:
    print(pdf)
The '//' seems to get added when I try to open each of the PDFs 'PDFFile = open(pdf,'rb')'

Your issue has nothing to do with //, it's here:
os.path.abspath(files)
Say you have C:\Users....\x.pdf. You list that directory, so files will contain x.pdf. You then take the absolute path of x.pdf, which abspath resolves against the current working directory, not against the directory you listed. You should replace it with:
x = os.path.join(path_of_the_directory, files)
Other notes:
PDFFile and PDF shouldn't be in uppercase. Prefer pdf_file and pdf_reader. The latter also avoids the confusion with the for pdf in...
Try to use a debugger rather than print statements. This is how I found your bug. It can be in your IDE or on the command line with python -i. You can step through your code, test a few variations, fiddle with the variables...
Why is ext = ('.pdf') written with parentheses? They don't do anything, but they suggest that it might be a tuple (it isn't).
As an exercise the first for can be written as: directory = [os.path.join(path_of_the_directory, x) for x in os.listdir(path_of_the_directory) if x.endswith(ext)]

Python - Script that copy certain files by file name

I wrote a script to copy files with specific names from one folder to another.
The file name format I want to copy is 2021052444592AKC. However, the script I wrote copies all files ending with AKC, even though the if condition specifies that it should copy only files whose name starts with "202105" and ends with "AKC". The folder also contains other files in the same format, that is, "YYYYMMDD44592threeUpperCaseLetters".
Can anyone help, because I haven't found the answer to this problem, thanks in advance :)
P.S I'm using Python3 in PyCharm
import shutil
import os
os.chdir(r"C:\\")
# without a double backslash and the letter r, the compiler throws an error
dir_src = r"C:\\Users\\Adam\\Desktop\1\\"
dir_dst = r"C:\\Users\\Adam\\Desktop\\2\\"
for filename in os.listdir(dir_src):
    if filename.startswith("202105") and filename.endswith("AKC"):
        shutil.copy(dir_src + filename, dir_dst)
print("End")

I'm not sure exactly why your script is failing, but you might want to try a solution with a regular expression (re).
import os
import re
import shutil
# match only names of the form 202105DD44592AKC
pattern = re.compile(r'^202105(\d{2})44592AKC$')
os.chdir(r"C:\\")
# without a double backslash and the letter r, the compiler throws an error
dir_src = r"C:\\Users\\Adam\\Desktop\\1\\"
dir_dst = r"C:\\Users\\Adam\\Desktop\\2\\"
for filename in os.listdir(dir_src):
    if pattern.match(filename):
        shutil.copy(dir_src + filename, dir_dst)
print("End")
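To check which names such a pattern accepts before copying anything, the filter can be tried on a few sample names first. This is only a sketch: the helper name wanted and the sample names are made up for illustration, and it assumes the file names carry no extension (a name like 2021052444592AKC.txt would not match).

```python
import re

# Same idea as the pattern above: 202105, a two-digit day, then 44592AKC.
PATTERN = re.compile(r"202105\d{2}44592AKC")


def wanted(filename):
    """Return True only if the whole file name has the expected form."""
    return PATTERN.fullmatch(filename) is not None


names = ["2021052444592AKC", "2021042444592AKC", "2021052444592XYZ"]
print([name for name in names if wanted(name)])  # ['2021052444592AKC']
```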

Problem with multivariables in string formatting

I have several files in a folder named t_000.png, t_001.png, t_002.png and so on.
I have made a for-loop to import them using string formatting. But when I use the for-loop I got the error
No such file or directory: '/file/t_0.png'
This is the code that I have used. I think I should use multiple %s, but I do not understand how.
for i in range(file.shape[0]):
    im = Image.open(dir + 't_%s.png' % str(i))
    file[i] = im

You need to pad the string with leading zeroes. With the type of formatting you're currently using, this should work:
im = Image.open(dir + 't_%03d.png' % i)
where the format specifier %03d means "format the integer with a width of at least 3 characters, padding with leading zeroes".
You can also use Python's other (more recent) string formatting syntax, which is somewhat more succinct:
im = Image.open(f"{dir}t_{i:03d}.png")
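As a quick check that the zero padding behaves as described, the three common formatting styles can be compared on a sample index. This is a small standalone sketch; the variable names are chosen only for illustration.

```python
i = 7

old_style = "t_%03d.png" % i              # printf-style formatting
format_method = "t_{:03d}.png".format(i)  # str.format
f_string = f"t_{i:03d}.png"               # formatted string literal (3.6+)

# All three produce the same zero-padded name.
print(old_style, format_method, f_string)  # t_007.png t_007.png t_007.png
```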

You are not padding the number with zeros, thus you get t_0.png instead of t_000.png.
The recommended way of doing this in Python 3 is via the str.format function:
for i in range(file.shape[0]):
    im = Image.open(dir + 't_{:03d}.png'.format(i))
    file[i] = im
You can see more examples in the documentation.
Formatted string literals are also an option if you are using Python 3.6 or a more recent version, see Green Cloak Guy's answer for that.

Try this:
import os
for i in range(file.shape[0]):
    im = Image.open(os.path.join(dir, f't_{i:03d}.png'))
    file[i] = im
(change: f't_{i:03d}.png' to 't_{:03d}.png'.format(i) or 't_%03d.png' % i for versions of Python prior to 3.6).
The trick was to specify a certain number of leading zeros, take a look at the official docs for more info.
Also, you should replace 'dir + file' with the more robust os.path.join(dir, file), which would work regardless of dir ending with a directory separator (i.e. '/' for your platform) or not.
Note also that dir is the name of a built-in function in Python (and file was a built-in in Python 2), so you may want to rename your variables to avoid shadowing them.
Also note that if file is a NumPy array, the assignment file[i] = im may not work as intended.
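To illustrate why joining is more robust than string concatenation, here is a small sketch. The helper name frame_path is hypothetical, and posixpath is used instead of os.path only so the result is the same on every platform; in real code os.path.join would be the natural choice.

```python
import posixpath


def frame_path(directory, index):
    """Build the path of frame `index`, zero-padded to three digits."""
    return posixpath.join(directory, f"t_{index:03d}.png")


# The result is the same with or without a trailing separator.
print(frame_path("/file", 0))   # /file/t_000.png
print(frame_path("/file/", 0))  # /file/t_000.png
```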

Double backslashes for filepath_or_buffer with pd.read_csv

Python 3.6, OS Windows 7
I am trying to read a .txt file with pd.read_csv() using a relative file path. According to the pd.read_csv() API documentation, the filepath argument can be any valid string path.
So, in order to define the relative path I use pathlib module. I have defined the relative path as:
df_rel_path = pathlib.Path.cwd() / ("folder1") / ("folder2") / ("file.txt")
a = str(df_rel_path)
Finally, I just want to use it to feed pd.read_csv() as:
df = pd.read_csv(a, engine = "python", sep = "\s+")
However, I am just getting an error stating "No such file or directory: ..." showing double backslashes on the folder path.
I have tried to manually write the path on pd.read_csv() using a raw string, that is, using r"relative/path". However, I am still getting the same result, double backslashes. Is there something I am overlooking?

You can get what you want by using os module
df_rel_path = os.path.abspath(os.path.join(os.getcwd(), "folder1", "folder2"))
This way the os module will deal with joining the path parts with the proper separator. You can omit os.path.abspath if you read a file that's within the same directory, but I wrote it for the sake of completeness.
For more info, refer to this SO question: Find current directory and file's directory

You need a filename to call pd.read_csv. In the example, 'a' is only the path and does not point to a specific file. You could do something like this:
df_rel_path = pathlib.Path.cwd() / ("folder1") / ("folder2")
a = str(df_rel_path)
df = pd.read_csv(a+'/' +'filename.txt')
With the filename your code works for me (on Windows 10):
df_rel_path = pathlib.Path.cwd() / ("folder1") / ("folder2")/ ("file.txt")
a = str(df_rel_path)
df = pd.read_csv(a)
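It may also be worth ruling out the double backslashes themselves as the cause: Python's error messages show the path as a string repr, in which every backslash is escaped, so the path can look doubled even though the string holds single backslashes. A minimal sketch follows, using PureWindowsPath so it behaves the same on any platform; the "C:/data" base directory is a made-up placeholder.

```python
from pathlib import PureWindowsPath

# Hypothetical base directory, used only for illustration.
path = PureWindowsPath("C:/data") / "folder1" / "folder2" / "file.txt"
text = str(path)

print(text)        # the actual string: single backslashes
print(repr(text))  # the repr, as shown in error messages: doubled backslashes
```

If the repr is the only place the doubled backslashes appear, the "No such file or directory" error most likely means the file is simply not at that location.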

File transfer for R extension in NetLogo - filename string with backslash and quotes

I need to use the R extension in NetLogo to do some network calculations. I am creating the network in NetLogo, exporting it to a text file, having R read the text file and construct a graph and calculate properties, then getting the calculation results. The export, read, calculate and get are being controlled by NetLogo through the R extension.
However, NetLogo and R have different default working directories. The problem I have about changing directories in R breaking the connection to the extensions (see R extension breaks connection to extensions directory in NetLogo) is affecting my attempts to use BehaviorSpace on the model.
My new approach is to not change the R working directory, but simply to provide the full path to R of the exported file.
r:clear
let dir pathdir:get-model
r:eval "library(igraph)"
; read network in R (avoid bug of R change working directory)
let runstring (word "r:eval \"gg <- read_graph(file = \"" dir "\\netlogo.gml\", format = \"gml\")\"")
print runstring
run runstring
This produces the correct string to run, output from print statement:
r:eval "gg <- read_graph(file = "C:\Users\Jen\Desktop\Intervention Effect\netlogo.gml", format = "gml")"
But I get an error on the run runstring that this nonstandard character is not allowed. Debugging by putting my constructed string into the run command directly, I have realised it is because I am now in a string environment and have to escape ('\') all my backslashes and quotes. That is, the command that would work if directly typed or included in the NetLogo code, will not work if it is provided as a string to be run.
I haven't yet been able to construct a string to put into the line run runstring that works, even by hand. This means I don't know what the string looks like that I am trying to create. Having identified the appropriate target string, I will need code to take the variable 'dir', convert it to a string, add the various \ characters to the dir, add the various \ characters to the quotes for the rest of the command, and join it so that it runs.
Can anyone provide some bits of this to get me further along?
Still struggling with this
I am now trying to work backwards. Find a string that works and then create it.
If I hard code the run command as follows, NetLogo closes. Even though if I copy the text between the quotes and enter it directly into R, it does what is expected.
let Rstring "gg <- read_graph(file = 'C:\\Users\\Jen\\Desktop\\Intervention Effect\\Networks\\netlogo.gml', format = 'gml')"
r:eval Rstring

The pathdir option ended up working. Here is example code for anyone who has a similar problem in the future.
let filename (word "Networks/netlogo" behaviorspace-run-number ".gml")
export-simple-gml filename
r:clearLocal
let dir pathdir:get-model
set filename (word dir "/" filename)
r:put "fn" filename
r:eval "gg <- read_graph(file = fn, format = 'gml')"
r:eval "V(gg)$name <- V(gg)$id" ; gml uses 'id', but igraph uses 'name'
I have a separate procedure for the actual export, which constructs a simplified gml file because the igraph import of gml format files is somewhat crippled. That's the procedure called within the code above, and the relevant piece is:
to export-simple-gml [ FN ]
  carefully [ file-close-all ] [ ]
  carefully [ file-delete FN ] [ ]
  file-open FN
  file-print <line to write to file>
  ...
end
