FileNotFoundError long file path python - filepath longer than 255 characters - python-3.x

Normally I don't ask questions, because I find answers on this forum. This place is a goldmine.
I am trying to move some files from a legacy storage system (CIFS share) to Box using the Python SDK. It works fine as long as the file path is shorter than 255 characters.
I am using os.walk, passing the share name in Unix format, to list the files in the directory.
Here is the file name.
//dalnsphnas1.mydomain.com/c$/fs/hdrive/home/abcvodopivec/ENV Resources/New Regulation Review/Regulation Reviews and Comment Letters/Stormwater General Permits/CT S.W. Gen Permit/PRMT0012_FLPR Comment Letter on Proposed Stormwater Regulations - 06-30-2009.pdf
I also tried escaping the spaces in the file name, but I still get FileNotFoundError, even though the file is there.
//dalnsphnas1.mydomain.com/c$/fs/hdrive/home/abcvodopivec/ENV Resources/New Regulation Review/Regulation Reviews and Comment Letters/Stormwater General Permits/CT S.W. Gen Permit/PRMT0012_FLPR\ Comment\ Letter\ on\ Proposed\ Stormwater\ Regulations\ -\ 06-30-2009.pdf
So I tried to shorten the path using win32api.GetShortPathName, but it throws the same FileNotFoundError. This works fine on files with path length less than 255 characters.
I also tried copying the file to another destination folder with copyfile(src, dst) to work around this issue, and I still get the same error.
import os, sys
import argparse
import win32api
import win32con
import win32security
from os import walk
from shutil import copyfile  # copyfile is used below but was never imported
parser = argparse.ArgumentParser(
    description='Migration Script',
)
parser.add_argument('-p', '--home_path', required = True, help='Home Drive Path')
args = vars(parser.parse_args())
if args['home_path']:
    pass
else:
    print("Usage : script.py -p <path>")
    print("-p <directory path>/")
    sys.exit()
dst = (args['home_path'] + '/' + 'long_file_path_dir')
for dirname, dirnames, filenames in os.walk(args['home_path']):
    for filename in filenames:
        file_path = (dirname + '/' + filename)
        path_len = len(file_path)
        if(path_len > 255):
            #short_path = win32api.GetShortPathName(file_path)
            copyfile(file_path, dst, follow_symlinks=True)

After a lot of trial and error, I figured out the solution (thanks to the Stack Overflow forum).
I switched from the Unix format to a UNC path,
then prefixed each file path generated by os.walk with r'\\?\UNC', like below. A UNC path starts with two backslashes, so I had to remove one of them to make it work:
file_path = (r'\\?\UNC' + file_path[1:])
Thanks again for everyone who responded.
Shynee
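The \\?\UNC prefixing described above can be wrapped in a small helper. This is only a sketch: to_extended_length is a hypothetical name (not part of os or win32api), and it assumes the input is an absolute path, either a drive path or a UNC path, written with forward or backward slashes.

```python
def to_extended_length(path):
    """Hypothetical helper: add the Windows extended-length prefix to an absolute path."""
    # The extended-length prefix turns off path normalization, so use backslashes only.
    path = path.replace("/", "\\")
    if path.startswith("\\\\?\\"):
        return path  # already prefixed, leave untouched
    if path.startswith("\\\\"):
        # UNC path: drop one leading backslash and put the UNC prefix in front.
        return r"\\?\UNC" + path[1:]
    # Drive path such as C:\data\file.txt
    return "\\\\?\\" + path


unc = to_extended_length("//server/share/dir/file.pdf")
local = to_extended_length("C:/data/file.txt")
```

On Windows you would then pass the result to open() or shutil.copyfile(). The string handling itself does not depend on the platform, so it can be checked anywhere.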

Related

Python filepaths have double backslashes

Ultimately, I want to loop through every pdf in specified directory ('C:\Users\dude\pdfs_for_parsing') and print the metadata for each pdf. The issue is that when I try to loop through the "directory" I'm receiving the error "FileNotFoundError: [Errno 2] No such file or directory:". I understand this error is occurring because I now have double slashes in my filepaths for some reason.
Example Code
import PyPDF2
import os
path_of_the_directory = r'C:\Users\dude\pdfs_for_parsing'
directory = []
ext = ('.pdf')
def isolate_pdfs():
    for files in os.listdir(path_of_the_directory):
        if files.endswith(ext):
            x = os.path.abspath(files)
            directory.append(x)
    for pdf in directory:
        reader = PyPDF2.PdfReader(pdf)
        information = reader.metadata
        print(information)
isolate_pdfs()
If I print the file paths one at a time, I see that the files have single '/' like I'm expecting:
for pdf in directory:
    print(pdf)
The '//' seems to get added when I try to open each of the PDFs 'PDFFile = open(pdf,'rb')'
Your issue has nothing to do with //, it's here:
os.path.abspath(files)
Say you have C:\Users....\x.pdf. You list that directory, so files will contain x.pdf. You then take the absolute path of x.pdf, which abspath assumes to be in the current working directory, not in the directory you listed. You should replace it with:
x = os.path.join(path_of_the_directory, files)
Other notes:
PDFFile and PDF shouldn't be in uppercase. Prefer pdf_file and pdf_reader. The latter also avoids confusion with the for pdf in...
Try to use a debugger rather than print statements. This is how I found your bug. It can be in your IDE or on the command line with python -i. You can step through your code, test a few variations, fiddle with the variables...
Why is ext = ('.pdf') written with parentheses? They don't do anything, but they suggest that it might be a tuple (it isn't).
As an exercise the first for can be written as: directory = [os.path.join(path_of_the_directory, x) for x in os.listdir(path_of_the_directory) if x.endswith(ext)]
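To see the difference between the two calls discussed above, here is a small self-contained sketch. It uses a temporary directory with an empty x.pdf instead of the real pdfs_for_parsing folder, and resolve_both_ways is a hypothetical helper name used only for illustration.

```python
import os
import tempfile


def resolve_both_ways(folder, name):
    """Return (abspath result, join result) for a bare file name from os.listdir."""
    return os.path.abspath(name), os.path.join(folder, name)


with tempfile.TemporaryDirectory() as folder:
    # Create an empty placeholder file; its content does not matter here.
    open(os.path.join(folder, "x.pdf"), "wb").close()
    for name in os.listdir(folder):
        via_abspath, via_join = resolve_both_ways(folder, name)
        # abspath anchors the bare name to the current working directory.
        abspath_points_to_cwd = os.path.dirname(via_abspath) == os.getcwd()
        # join anchors it to the directory that was actually listed.
        join_exists = os.path.exists(via_join)
```

The abspath result only points at a real file if the script happens to run from inside the listed directory, which would explain why the original code can appear to work in some setups and fail in others.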

Problem using glob: file not found after os.path.join()

I ran into a strange problem using glob (Python 3.10.0/Linux).
I use glob to locate the required file with the following construct:
def get_last_file(folder, date=datetime.today().date()):
    os.chdir(folder)
    _files = glob.glob("*.csv")
    _files.sort(key=os.path.getctime)
    os.chdir(os.path.join("..", ".."))
    for _filename in _files[::-1]:
        string = str(date).split("-")
        if "".join(string) in _filename:
            return _filename
    # if cannot find the specific date, return newest file
    return _files[-1]
but when I try to call
os.path.join(fileDir, file)
with the resulting file, I get a relative path, which leads to:
FileNotFoundError: [Errno 2] No such file or directory: 'data/1109.csv'.
The file certainly exists, and when I try os.path.join(fileDir, '1109.csv'), the file is found.
The weirdest thing is that if I do:
filez = get_last_file(fileDir, datetime.today().date())
file = '1109.csv'
I still get file not found for file after os.path.join(fileDir, file).
Should I avoid using glob altogether?
I came up with this solution:
file =''
_mtime=0
for root, dirs, filenames in os.walk(fileDir):
    for f in sorted(filenames):
        if f.endswith(".csv"):
            if os.path.getmtime(fileDir+f) > _mtime:
                _mtime = os.path.getmtime(fileDir+f)
                file = f
print (f'fails {file}')
and the resulting os.path.join(fileDir, file) gives a (relative) path suitable for further operations.
This also accounts for the difference between getctime and getmtime.
While not a direct solution, try looking at Python's pathlib module. It often leads to cleaner, less buggy solutions.
import os
from datetime import datetime
from pathlib import Path
def get_last_file(folder, date=None):
    folder = Path(folder) # Works for both relative and absolute paths
    if date is None:
        date = datetime.today().date() # evaluated on every call, not once at definition time
    # glob() returns a generator, so sort it into a list; no os.chdir() is needed
    _files = sorted(folder.glob("*.csv"), key=os.path.getctime)
    date_string = "".join(str(date).split("-"))
    for _filename in _files[::-1]:
        if date_string in _filename.name:
            return _filename
    # if cannot find the specific date, return newest file
    return _files[-1]
Then, instead of using os.path.join(), you can write path_dir / file_name, where path_dir is a Path object. It may also be that changing the working directory inside your function (the os.chdir calls) is what leads to the unexpected behaviour.
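As a rough illustration of the path_dir / file_name style, here is a sketch that picks the most recently modified CSV without ever changing the working directory. newest_csv is a hypothetical helper, and the data folder with its two files is created in a temporary directory purely for the demonstration.

```python
import os
import tempfile
from pathlib import Path


def newest_csv(folder):
    """Return the most recently modified *.csv in folder as a Path, or None if there is none."""
    candidates = list(Path(folder).glob("*.csv"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"  # the / operator replaces os.path.join
    data_dir.mkdir()
    for offset, name in enumerate(["1108.csv", "1109.csv"]):
        path = data_dir / name
        path.write_text("a,b\n")
        # Set explicit modification times so the ordering is deterministic.
        os.utime(path, (1_000_000 + offset, 1_000_000 + offset))
    newest = newest_csv(data_dir)
    newest_name = newest.name
    newest_exists = newest.exists()
```

Because the returned Path already includes the folder, it can be opened directly, and nothing depends on what the current working directory is at the time.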

Python - Script that copy certain files by file name

I wrote a script to copy files with specific names from one folder to another.
The file name format I want to copy is 2021052444592AKC. However, the script I wrote copies all files ending in AKC, even though the if condition specifies that a file should be copied only if its name starts with "202105" and ends with "AKC". The folder contains other files in the same format, that is, "YYYYMMDD44592threeUpperCaseLetters".
Can anyone help, because I haven't found the answer to this problem, thanks in advance :)
P.S I'm using Python3 in PyCharm
import shutil
import os
os.chdir(r"C:\\")
# without a double backslash and the letter r, the compiler throws an error
dir_src = r"C:\\Users\\Adam\\Desktop\1\\"
dir_dst = r"C:\\Users\\Adam\\Desktop\\2\\"
for filename in os.listdir(dir_src):
    if filename.startswith("202105") and filename.endswith("AKC"):
        shutil.copy(dir_src + filename, dir_dst)
print("End")
I'm not sure exactly why your script is failing, but you might want to try a solution with a regular expression (re).
import os
import re
import shutil
pattern = re.compile(r'^202105(\d{2})44592AKC$')
os.chdir(r"C:\\")
# without a double backslash and the letter r, the compiler throws an error
dir_src = r"C:\\Users\\Adam\\Desktop\\1\\"
dir_dst = r"C:\\Users\\Adam\\Desktop\\2\\"
for filename in os.listdir(dir_src):
    if pattern.match(filename):
        shutil.copy(dir_src + filename, dir_dst)
print("End")
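To check the pattern without touching the file system, the filter can be tried on a plain list of names first. This is a sketch only: select_files is a hypothetical helper, and the sample names other than 2021052444592AKC are made up for the test.

```python
import re

# Same idea as the pattern above: 202105, two day digits, 44592, then AKC.
PATTERN = re.compile(r"202105\d{2}44592AKC")


def select_files(filenames):
    """Return only the names that match the whole pattern, keeping their order."""
    return [name for name in filenames if PATTERN.fullmatch(name)]


sample = [
    "2021052444592AKC",      # wanted: May 2021, ends with AKC
    "2021062444592AKC",      # wrong month
    "2021052444592XYZ",      # wrong suffix
    "2021052444592AKC.bak",  # extra extension
]
selected = select_files(sample)
```

Using fullmatch instead of match means the ^ and $ anchors are not needed, and a name with anything after AKC is rejected.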

Search anywhere in a filename for a string, case insensitive. Python 3

Env: Python 3.6, O/S: Windows 10
I have the following code that will search for filenames that contain a string either at the start (.startswith) of a filename or the end of a filename (.endswith), including sub directories and is case sensitive, i.e. searchText = 'guess' as opposed to searchText = 'Guess'.
I would like to modify if FILE.startswith(searchText): so that it allows a search anywhere in the filename and is case insensitive. Is this possible?
For example, a directory contains two files called GuessMyNumber.py and guessTheNumber.py.
I would like to search for 'my' and have the code return the filename GuessMyNumber.py.
#!/usr/bin/env python3
import os
# set text to search for
searchText = 'Guess'
# the root (top of tree hierarchy) to search, remember to change \ to / for Windows
TOP = 'C:/works'
found = 0
for root, dirs, files in os.walk(TOP, topdown=True, onerror=None, followlinks=True):
    for FILE in files:
        if FILE.startswith(searchText):
            print ("\nFile {} exists..... \t\t{}".format(FILE, os.path.join(root)))
            found += 1
        else:
            pass
print('\n File containing \'{}\' found {} times'.format(searchText, found))
Thanks guys,
Tommy.
A simple glob-based approach:
#!/usr/bin/env python3
import os
import glob
# set text to search for
searchText = 'Guess'
# the root (top of tree hierarchy) to search, remember to change \ to / for Windows
TOP = 'C:/works'
found = 0
for filename in glob.iglob(os.path.join(TOP, '**', f'*{searchText}*'), recursive=True):
    print ("\nFile {} exists..... \t\t{}".format(filename, os.path.dirname(filename)))
    found += 1
print('\n File containing \'{}\' found {} times'.format(searchText, found))
A simple fnmatch-based approach:
#!/usr/bin/env python3
import os
import fnmatch
# set text to search for
searchText = 'Guess'
# the root (top of tree hierarchy) to search, remember to change \ to / for Windows
TOP = 'C:/works'
found = 0
for root, dirnames, filenames in os.walk(TOP, topdown=True, onerror=None, followlinks=True):
    for filename in filenames:
        if fnmatch.fnmatch(filename, f'*{searchText}*'):
            print ("\nFile {} exists..... \t\t{}".format(filename, os.path.join(root)))
            found += 1
print('\n File containing \'{}\' found {} times'.format(searchText, found))
You could also use a regular expression (more general), as supported by re, instead of the shell-style wildcard patterns (less general) supported by glob and fnmatch.
However, in this simple scenario, wildcard patterns are more than enough.
Alternatively, you can collect all the .py files in your directory with their absolute paths and use a simple regex to check whether the absolute path contains "Guess" or "guess" (depending on your use). Try this code and let us know:
import pathlib
import os
import re
required=[]
for filepath in pathlib.Path('C:\\your\\directory').glob('**/*.py'): #any file extension
    required.append(os.path.abspath(filepath))
# re.IGNORECASE makes the pattern match both "Guess" and "guess"
nameregex=re.compile(r'(.*Guess)', re.IGNORECASE)
mo=list(filter(nameregex.match, required))
print(mo)
#or len will give you the length of the list
print(len(mo))
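For the original question (a match anywhere in the filename, ignoring case), a plain substring test on lowercased names may already be enough. The sketch below assumes that is all that is needed; find_files is a hypothetical helper, and the two files from the question are recreated in a temporary directory rather than in C:/works.

```python
import os
import tempfile


def find_files(top, search_text):
    """Return paths under top whose file name contains search_text, ignoring case."""
    needle = search_text.casefold()
    matches = []
    for root, _dirs, files in os.walk(top):
        for name in files:
            if needle in name.casefold():  # substring test anywhere in the name
                matches.append(os.path.join(root, name))
    return matches


with tempfile.TemporaryDirectory() as top:
    sub = os.path.join(top, "sub")
    os.mkdir(sub)
    for name in ("GuessMyNumber.py", "guessTheNumber.py"):
        open(os.path.join(sub, name), "w").close()
    hits_my = [os.path.basename(p) for p in find_files(top, "my")]
    hits_guess = sorted(os.path.basename(p) for p in find_files(top, "GUESS"))
```

Only the file name is tested, not the directory part, so a folder that happens to contain the search text does not produce false hits.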

Moving files in python based on file and folder name

I'm relatively new to Python (I don't use it every day), but I am trying to simplify some things. I have keys with long names, where a subset of the key (or file name) matches the name of the associated folder. (Excuse the indentation; it is properly indented in my script.) For example:
file1 would be: 101010-CDFGH-8271.dat and folder is CDFGH-82
file2 would be: 101010-QWERT-7425.dat and folder is QWERT-74
import os
import glob
import shutil
files = os.listdir("files/location")
dest_1 = os.listdir("dest/location")
for f in files:
    file = f[10:21]
    for d in dest_1:
        dire = d
        if file == dire:
            shutil.move(file, dest_1)
The code runs with no errors, however nothing moves. Look forward to your reply and chance to learn.
Sorry updated the format.
Try a variation of:
import os
import shutil
basedir = "dest/location"
for fname in os.listdir("files/location"):
    dirname = os.path.join(basedir, fname[10:21])
    if os.path.isdir(dirname):
        path = os.path.join("files/location", fname)
        shutil.move(path, dirname)
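If the real names follow the two examples in the question (101010-CDFGH-8271.dat goes to CDFGH-82), the folder name can also be derived by splitting on the hyphens instead of using fixed slice positions. This is a sketch under that assumption only; folder_for and move_to_matching_folders are hypothetical helpers, and the directories are temporary stand-ins for files/location and dest/location.

```python
import os
import shutil
import tempfile


def folder_for(fname):
    """Derive the folder name from a name shaped like 101010-CDFGH-8271.dat (assumed format)."""
    _prefix, letters, rest = fname.split("-", 2)
    return f"{letters}-{rest[:2]}"


def move_to_matching_folders(src_dir, dest_dir):
    """Move each file into dest_dir/<derived folder> if that folder exists; return moved names."""
    moved = []
    for fname in sorted(os.listdir(src_dir)):
        if fname.count("-") < 2:
            continue  # name does not follow the assumed format
        target = os.path.join(dest_dir, folder_for(fname))
        if os.path.isdir(target):
            # Move using the full source path, not the bare file name.
            shutil.move(os.path.join(src_dir, fname), target)
            moved.append(fname)
    return moved


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dest:
    for name in ("101010-CDFGH-8271.dat", "101010-QWERT-7425.dat"):
        open(os.path.join(src, name), "w").close()
    os.mkdir(os.path.join(dest, "CDFGH-82"))  # only one matching folder exists
    moved = move_to_matching_folders(src, dest)
    arrived = os.path.exists(os.path.join(dest, "CDFGH-82", "101010-CDFGH-8271.dat"))
    left_behind = os.listdir(src)
```

Files without a matching folder are left where they are, which makes it easy to see afterwards which keys had no destination.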
