Python - Script that copies certain files by file name - python-3.x

I wrote a script to copy files with specific names from one folder to another.
The file name format I want to copy is 2021052444592AKC. However, the script I wrote copies all files ending in AKC, even though the if condition says to copy a file only if its name starts with "202105" and ends with "AKC". The folder also contains other files in the same format, that is "YYYYMMDD44592threeUpperCaseLetters".
Can anyone help? I haven't found the answer to this problem. Thanks in advance :)
P.S. I'm using Python 3 in PyCharm.
import shutil
import os
os.chdir(r"C:\\")
# without a double backslash and the letter r, the compiler throws an error
dir_src = r"C:\\Users\\Adam\\Desktop\1\\"
dir_dst = r"C:\\Users\\Adam\\Desktop\\2\\"
for filename in os.listdir(dir_src):
    if filename.startswith("202105") and filename.endswith("AKC"):
        shutil.copy(dir_src + filename, dir_dst)
print("End")

I'm not sure exactly why your script is failing, but you might want to try a solution with a regular expression (re).
import os
import re
import shutil
# match names such as 2021052444592AKC: "202105", a two-digit day, then "44592AKC"
pattern = re.compile(r'^202105(\d{2})44592AKC$')
os.chdir(r"C:\\")
# without a double backslash and the letter r, the compiler throws an error
dir_src = r"C:\\Users\\Adam\\Desktop\\1\\"
dir_dst = r"C:\\Users\\Adam\\Desktop\\2\\"
for filename in os.listdir(dir_src):
    if pattern.match(filename):
        shutil.copy(dir_src + filename, dir_dst)
print("End")
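
The pattern above is tied to one month and one suffix. As a minimal sketch, assuming the general format really is an 8-digit date, the literal 44592 and three uppercase letters (as the question describes), the check can be split into "shape" and "which month/suffix". The names NAME_PATTERN and wanted are made up for this illustration, and the file names are sample data, not real files.

```python
import re

# Assumed general shape: 8-digit date, the literal 44592, three uppercase letters.
NAME_PATTERN = re.compile(r"(\d{8})44592([A-Z]{3})")

def wanted(filename, month_prefix="202105", suffix="AKC"):
    """Return True if filename has the expected shape, month and suffix."""
    m = NAME_PATTERN.fullmatch(filename)  # fullmatch rejects extra leading/trailing text
    if m is None:
        return False
    return m.group(1).startswith(month_prefix) and m.group(2) == suffix

# Sample names only; in the real script these would come from os.listdir(dir_src).
names = ["2021052444592AKC", "2021042444592AKC", "2021052444592XYZ", "notes_AKC"]
selected = [n for n in names if wanted(n)]
print(selected)
```

Testing the filter on a plain list of names like this, before any call to shutil.copy, makes it easier to see which condition lets an unexpected file through.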

Related

Python filepaths have double backslashes

Ultimately, I want to loop through every pdf in specified directory ('C:\Users\dude\pdfs_for_parsing') and print the metadata for each pdf. The issue is that when I try to loop through the "directory" I'm receiving the error "FileNotFoundError: [Errno 2] No such file or directory:". I understand this error is occurring because I now have double slashes in my filepaths for some reason.
Example Code
import PyPDF2
import os
path_of_the_directory = r'C:\Users\dude\pdfs_for_parsing'
directory = []
ext = ('.pdf')
def isolate_pdfs():
    for files in os.listdir(path_of_the_directory):
        if files.endswith(ext):
            x = os.path.abspath(files)
            directory.append(x)
    for pdf in directory:
        reader = PyPDF2.PdfReader(pdf)
        information = reader.metadata
        print(information)
isolate_pdfs()
If I print the file paths one at a time, I see that the files have single '/' like I'm expecting:
for pdf in directory:
    print(pdf)
The '//' seems to get added when I try to open each of the PDFs 'PDFFile = open(pdf,'rb')'
Your issue has nothing to do with //; it's here:
os.path.abspath(files)
Say the directory contains C:\Users....\x.pdf. Listing that directory returns bare file names, so files is just x.pdf. os.path.abspath then resolves x.pdf against the current working directory, not against path_of_the_directory, so the resulting path points to a file that does not exist. You should replace that line with:
x = os.path.join(path_of_the_directory, files)
Other notes:
PDFFile and PDF shouldn't be in uppercase. Prefer pdf_file and pdf_reader. The latter also avoids confusion with the pdf in for pdf in...
Try to use a debugger rather than print statements. This is how I found your bug. It can be in your IDE or on the command line with python -i. You can step through your code, test a few variations, fiddle with the variables...
Why is ext = ('.pdf') written with parentheses? They don't do anything, but they suggest that it might be a tuple (it isn't).
As an exercise, the first for loop can be written as: directory = [os.path.join(path_of_the_directory, x) for x in os.listdir(path_of_the_directory) if x.endswith(ext)]
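
To see the difference between the two calls without touching real PDFs, here is a small sketch. It uses a temporary folder as a hypothetical stand-in for path_of_the_directory and an empty file named x.pdf; the variable names are only for illustration.

```python
import os
import tempfile

# Hypothetical stand-in for path_of_the_directory: a temporary folder with one file.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "x.pdf"), "wb") as f:
        f.write(b"")

    name = os.listdir(folder)[0]         # os.listdir returns bare names
    wrong = os.path.abspath(name)        # resolved against the current working directory
    right = os.path.join(folder, name)   # resolved against the folder that was listed

    # abspath is documented as equivalent to normpath(join(os.getcwd(), path))
    wrong_resolves_to_cwd = wrong == os.path.normpath(os.path.join(os.getcwd(), name))
    right_exists = os.path.exists(right)
    print(name, wrong_resolves_to_cwd, right_exists)
```

Unless the script happens to be started from inside the PDF folder, the path built by abspath points at a file that is not there, which matches the FileNotFoundError in the question.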

Date as part of file name naming convention for Gsheet API exported xlsx file using python

May I know what function I need to add or replace in my Python script? Here's my issue: I have exported an xlsx file from the Gsheet API to my server, and I need to give it a generic file name made of a name and the date (e.g. FILENAME20211107.xlsx).
Here's my code:
with open("/xlsx/FILENAME.xlsx", 'wb') as f:
    f.write(res.content)
From your comment, "the goal is to set File naming convention using date in the script for example. If I run the script the extracted file from google api will be automatically named like this format "filename_202111107.xlsx"", I understand that you want to use a specific file name such as filename_202111107.xlsx in your script. In this case, how about the following modification?
Modified script:
import datetime # Please add this.
import os # Please add this.
path = "/xlsx/"
prefix = "sampelName"
suffix = datetime.datetime.now().strftime("%Y%m%d")
filename = prefix + "_" + suffix + ".xlsx"
v = os.path.join(path, filename)
print(v) # <--- /xlsx/sampelName_20211108.xlsx
with open(v, 'wb') as f:
    f.write(res.content)
When the above script is run, v is /xlsx/sampelName_20211108.xlsx.
Note:
In your comment, you use 202111107 in filename_202111107.xlsx. I understood that, when the script is run on November 7, 2021, you want 20211107 rather than 202111107.
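
As a small sketch of the same idea, the name can be built by a function that takes the date as a parameter, which makes the format easy to check for a fixed day. The function name dated_filename and its defaults are assumptions for this illustration; the question's format has no underscore, so none is added here.

```python
import datetime
import os

def dated_filename(prefix, day, folder="/xlsx/", extension=".xlsx"):
    """Build a path such as /xlsx/FILENAME20211107.xlsx from a prefix and a date."""
    stamp = day.strftime("%Y%m%d")  # zero-padded year, month and day
    return os.path.join(folder, prefix + stamp + extension)

# A fixed date for the example; a real run would pass datetime.date.today().
example = dated_filename("FILENAME", datetime.date(2021, 11, 7))
print(example)
```

Because %m and %d are always zero-padded to two digits, the stamp is always eight characters long, so a nine-digit value such as 202111107 cannot come out of strftime("%Y%m%d").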

python script to slice filenames to certain length

I am new to Python and am trying to write a script that trims the last characters of filenames down to a specific length.
It worked at first, but then the loop stopped unexpectedly with an error message saying that the file doesn't exist.
Error message: "FileNotFoundError: [WinError 2] The system cannot find the file specified:"
Here is my script, please tell me what is wrong!
import os
# define a function to trim the last characters to a specific length
def renamer(folderpath, newlength):
    while True:
        for filename in os.listdir(folderpath):
            root = os.path.splitext(filename)[0]
            exten = os.path.splitext(filename)[1]
            while len(root) >= newlength:
                os.rename(folderpath + '\\' + root + exten, folderpath + '\\' + root[:-1] + exten)
                continue
            if len(root) <= newlength:
                break
I do not like the way you are doing the task, but to solve your problem, here is the mistake you are making.
You renamed the file from root='Name' to root='Nam' but did not update the value of root. So the next time the loop runs, it again looks for a file 'Name', which obviously does not exist, because you renamed it to 'Nam'.
So update the value of root as well and you will be good to go.
But again, I should mention that you should solve it some other way.
There are 2 problems here:
It's a good idea to use os.path.join to join the folder and file name instead of concatenating \\ directly, so that the code works on all systems without changes (e.g. *nix OSes that use / instead of \ as the separator).
Like #Illrnr said, the problem is that after the first rename (e.g. abcdefgh.png to abcdefg.png), the code keeps looking for the old filename (abcdefgh.png), and this raises the error.
Using a while loop to shorten the name one character at a time complicates your logic and also slows the code down a lot, because of the many calls to rename. You can shorten the name to the required length in one go, without all those loops and length tracking.
Try this and see if you understand the code:
import os
def renamer(folderpath, newlength):
    for filename in os.listdir(folderpath):
        root, exten = os.path.splitext(filename)
        if len(root) > newlength:
            oldname = os.path.join(folderpath, root + exten)
            newname = os.path.join(folderpath, root[:newlength] + exten)
            os.rename(oldname, newname)
            print(f"Shortened {oldname} to {newname}")
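
To try the one-step rename safely, the following sketch runs the same logic in a temporary folder. The helper name shortened_name and the sample file abcdefgh.png are made up for this illustration.

```python
import os
import tempfile

def shortened_name(filename, newlength):
    """Return filename with its root cut to at most newlength characters."""
    root, exten = os.path.splitext(filename)
    return root[:newlength] + exten

with tempfile.TemporaryDirectory() as folder:
    # Create one empty sample file with an 8-character root.
    open(os.path.join(folder, "abcdefgh.png"), "w").close()

    for filename in os.listdir(folder):
        newname = shortened_name(filename, 5)
        if newname != filename:  # skip names that are already short enough
            os.rename(os.path.join(folder, filename), os.path.join(folder, newname))

    result = os.listdir(folder)
print(result)
```

One caveat to keep in mind: two long names that share the same first characters shorten to the same name. On Windows os.rename then raises FileExistsError, while on POSIX systems the existing file is silently replaced, so a real script may want to check for collisions first.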

copy and rename files starting at a specific integer value in python

My code works, apart from a few pain points that perhaps you can help me understand. I want to copy files from one directory to another and rename them at the same time. For example:
c:\path\
octo.jpeg
novem.jpeg
decem.jpeg
to:
c:\newpath\
001.jpeg
002.jpeg
003.jpeg
The code I wrote after a cursory Google search is below, but I'm not sure why I need the 'r' in the path variables. I'm sure I don't need the 'files = os.listdir(srcPath)' line. The script moves the files and renames them using the 'count' variable in the for loop, but I want to name each file starting at a specific number, say 65. Should I use the shutil library and its copy2 method to first copy the files and then rename them, or is there an easier way?
import os
from os import path
srcPath = r'C:\Users\Talyn\Desktop\New folder\Keep\New folder'
destPath = r'C:\Users\Talyn\Desktop\New folder\Keep\hold'
#files = os.listdir(srcPath)
def main():
    for count, filename in enumerate(os.listdir(srcPath)):
        dst = '{:03d}'.format(count) + ".jpeg"
        os.rename(os.path.join(srcPath, filename), os.path.join(destPath, dst))
if __name__=="__main__":
    main()
From the official Python Docs:
Both string and bytes literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and treat backslashes as literal characters.
The r tells the Python interpreter to treat the backslashes (\) in the path string as literal characters and not as escape characters.
For naming the files from a specific number:
dst = '{:03d}'.format(count + your_number) + ".jpeg"
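Using copyfile from shutil (import it with from shutil import copyfile, and build the paths with os.path.join, because srcPath and destPath have no trailing separator):
copyfile(os.path.join(srcPath, filename), os.path.join(destPath, dst))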
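
Putting the pieces together, here is a minimal sketch that copies (rather than moves) the files and numbers them from 65. It uses enumerate's start argument instead of adding an offset and shutil.copy2 as mentioned in the question; the function name copy_numbered and the temporary folders are assumptions for this illustration.

```python
import os
import shutil
import tempfile

def copy_numbered(src_dir, dest_dir, start=65, extension=".jpeg"):
    """Copy every file in src_dir to dest_dir as 065.jpeg, 066.jpeg, ..."""
    # sorted() makes the numbering reproducible; os.listdir order is arbitrary.
    for count, filename in enumerate(sorted(os.listdir(src_dir)), start=start):
        dst = "{:03d}".format(count) + extension
        shutil.copy2(os.path.join(src_dir, filename), os.path.join(dest_dir, dst))

# Temporary folders stand in for srcPath and destPath.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dest:
    for name in ("octo.jpeg", "novem.jpeg", "decem.jpeg"):
        open(os.path.join(src, name), "w").close()

    copy_numbered(src, dest)
    copied = sorted(os.listdir(dest))
    originals = sorted(os.listdir(src))
print(copied)
```

Since copy2 copies straight to the new name, there is no separate rename step, and the source files stay in place, unlike with os.rename.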

FileNotFoundError long file path python - filepath longer than 255 characters

Normally I don't ask questions, because I find answers on this forum. This place is a goldmine.
I am trying to move some files from a legacy storage system (CIFS share) to BOX using the Python SDK. It works fine as long as the file path is shorter than 255 characters.
I am using os.walk, passing the share name in unix format, to list the files in the directory.
Here is the file name:
//dalnsphnas1.mydomain.com/c$/fs/hdrive/home/abcvodopivec/ENV Resources/New Regulation Review/Regulation Reviews and Comment Letters/Stormwater General Permits/CT S.W. Gen Permit/PRMT0012_FLPR Comment Letter on Proposed Stormwater Regulations - 06-30-2009.pdf
I also tried escaping the spaces in the file name, but I still get FileNotFoundError, even though the file is there.
//dalnsphnas1.mydomain.com/c$/fs/hdrive/home/abcvodopivec/ENV Resources/New Regulation Review/Regulation Reviews and Comment Letters/Stormwater General Permits/CT S.W. Gen Permit/PRMT0012_FLPR\ Comment\ Letter\ on\ Proposed\ Stormwater\ Regulations\ -\ 06-30-2009.pdf
So I tried to shorten the path using win32api.GetShortPathName, but it throws the same FileNotFoundError. It works fine on files with a path length of less than 255 characters.
I also tried to copy the file to another destination folder using copyfile(src, dst) to work around this issue, and I still get the same error.
import os, sys
import argparse
import win32api
import win32con
import win32security
from os import walk
from shutil import copyfile  # needed for the copyfile call below
parser = argparse.ArgumentParser(
    description='Migration Script',
)
parser.add_argument('-p', '--home_path', required = True, help='Home Drive Path')
args = vars(parser.parse_args())
if args['home_path']:
    pass
else:
    print("Usage : script.py -p <path>")
    print("-p <directory path>/")
    sys.exit()
dst = (args['home_path'] + '/' + 'long_file_path_dir')
for dirname, dirnames, filenames in os.walk(args['home_path']):
    for filename in filenames:
        file_path = (dirname + '/' + filename)
        path_len = len(file_path)
        if(path_len > 255):
            #short_path = win32api.GetShortPathName(file_path)
            copyfile(file_path, dst, follow_symlinks=True)
After a lot of trial and error, I figured out the solution (thanks to the Stack Overflow forum).
I switched from the unix format to a UNC path.
Then I prefixed each file path generated through os.walk with r'\\?\UNC', as shown below. A UNC path starts with two backslashes, so I have to remove one of them to make it work.
file_path = (r'\\?\UNC' + file_path[1:])
Thanks again to everyone who responded.
Shynee
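
The prefixing step described above can be wrapped in a small helper. This is only a sketch of that step: the function name to_extended_unc and the server name in the example are made up, and the conversion of forward slashes is an assumption based on the switch from unix format to UNC mentioned in the answer.

```python
def to_extended_unc(path):
    """Return the extended-length form of a UNC path, for use on Windows."""
    unc = path.replace("/", "\\")        # unix-style separators to backslashes
    if unc.startswith("\\\\?\\"):
        return unc                       # already in extended-length form
    if not unc.startswith("\\\\"):
        raise ValueError("not a UNC path: " + path)
    return r"\\?\UNC" + unc[1:]          # drop one of the two leading backslashes

# Hypothetical share path, written in the unix style used in the question.
example = to_extended_unc("//server.example.com/share/folder/file.pdf")
print(example)
```

The helper is pure string handling, so it can be checked on any system, but the resulting path is only meaningful to Windows file APIs.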

Resources