python script to slice filenames to certain length

I am new to Python and am trying to write a script that trims characters from the end of filenames until they are a specific length.
It worked at first, but then the loop stopped unexpectedly with an error saying that the file doesn't exist.
Error message: "FileNotFoundError: [WinError 2] The system cannot find the file specified:"
Here is my script. Please tell me what is wrong!
import os
# define a function to trim the last characters to a specific length
def renamer(folderpath, newlength):
    while True:
        for filename in os.listdir(folderpath):
            root = os.path.splitext(filename)[0]
            exten = os.path.splitext(filename)[1]
            while len(root) >= newlength:
                os.rename(folderpath + '\\' + root + exten, folderpath + '\\' + root[:-1] + exten)
                continue
            if len(root) <= newlength:
                break

I don't like the way you are approaching the task, but to solve your immediate problem, here is the mistake.
You renamed the file from root='Name' to root='Nam' but did not update the value of root. So the next time the inner loop runs, it again looks for a file called 'Name', which no longer exists because you renamed it to 'Nam'.
Update the value of root after each rename and you will be good to go.
But again, I would recommend solving it some other way.
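
For illustration, here is a minimal sketch of that fix, keeping the one-character-at-a-time approach. It makes a few assumptions that are not in the question: the outer while True loop is dropped, the comparison is > rather than >= so that names end up exactly newlength characters long, and paths are built with os.path.join.

```python
import os


def renamer(folderpath, newlength):
    """Trim each base name to at most newlength characters, one character per rename."""
    # os.listdir returns a list, so renaming files while iterating over it is safe.
    for filename in os.listdir(folderpath):
        root, exten = os.path.splitext(filename)
        while len(root) > newlength:
            shorter = root[:-1]
            os.rename(
                os.path.join(folderpath, root + exten),
                os.path.join(folderpath, shorter + exten),
            )
            # Keep root in sync with the name that now exists on disk.
            root = shorter
```

The important line is root = shorter: without it, the next iteration asks os.rename for a file name that is already gone.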

There are a few problems here:
It's a good idea to use os.path.join to join the folder and file name instead of concatenating '\\' directly, so that the code works on all systems without changes (e.g. *nix OSes that use / instead of \ as the separator).
As Illrnr said, the problem is that after the first rename (e.g. abcdefgh.png to abcdefg.png), the code continues looking for the old filename (abcdefgh.png), and this raises the error.
Using a while loop to shorten the name one character at a time complicates your logic and adds a lot of runtime because of all the calls to rename. You can shorten the name to the required length in one go, without the loops and length tracking.
Try this and see if you understand the code:
import os
def renamer(folderpath, newlength):
    for filename in os.listdir(folderpath):
        root, exten = os.path.splitext(filename)
        if len(root) > newlength:
            oldname = os.path.join(folderpath, root + exten)
            newname = os.path.join(folderpath, root[:newlength] + exten)
            os.rename(oldname, newname)
            print(f"Shortened {oldname} to {newname}")
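
One thing to watch for when truncating in one go: two different names can shorten to the same result (for example report_january.txt and report_june.txt both become report_j.txt at length 8). The sketch below is one possible way to guard against that, not part of the original answer. The function name shorten_names and the choice to skip colliding files rather than number them are assumptions made for the example.

```python
import os


def shorten_names(folderpath, newlength):
    """Truncate base names to newlength characters, skipping renames that would collide."""
    skipped = []
    for filename in sorted(os.listdir(folderpath)):
        root, exten = os.path.splitext(filename)
        if len(root) <= newlength:
            continue
        oldname = os.path.join(folderpath, filename)
        newname = os.path.join(folderpath, root[:newlength] + exten)
        if os.path.exists(newname):
            # Another file already has the shortened name; leave this one alone.
            skipped.append(filename)
            continue
        os.rename(oldname, newname)
    return skipped
```

The returned list tells you which files were left untouched, so you can decide what to do with them by hand.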

Related

Python filepaths have double backslashes

Ultimately, I want to loop through every pdf in specified directory ('C:\Users\dude\pdfs_for_parsing') and print the metadata for each pdf. The issue is that when I try to loop through the "directory" I'm receiving the error "FileNotFoundError: [Errno 2] No such file or directory:". I understand this error is occurring because I now have double slashes in my filepaths for some reason.
Example Code
import PyPDF2
import os
path_of_the_directory = r'C:\Users\dude\pdfs_for_parsing'
directory = []
ext = ('.pdf')
def isolate_pdfs():
    for files in os.listdir(path_of_the_directory):
        if files.endswith(ext):
            x = os.path.abspath(files)
            directory.append(x)
    for pdf in directory:
        reader = PyPDF2.PdfReader(pdf)
        information = reader.metadata
        print(information)
isolate_pdfs()
If I print the file paths one at a time, I see that the files have single '/' like I'm expecting:
for pdf in directory:
    print(pdf)
The '//' seems to get added when I try to open each of the PDFs 'PDFFile = open(pdf,'rb')'

Your issue has nothing to do with //, it's here:
os.path.abspath(files)
Say you have C:\Users....\x.pdf. You list that directory, so files will contain x.pdf. You then take the absolute path of x.pdf, and abspath assumes the file is in the current working directory, not in the directory you listed. You should replace it with:
x = os.path.join(path_of_the_directory, files)
Other notes:
PDFFile and PDF shouldn't be in uppercase. Prefer pdf_file and pdf_reader. The latter also avoids confusion with the for pdf in... loop variable.
Try to use a debugger rather than print statements. This is how I found your bug. It can be in your IDE or on the command line (python -m pdb script.py for the debugger, or python -i script.py to get an interactive prompt once the script has run). You can step through your code, test a few variations, fiddle with the variables...
Why is ext = ('.pdf') in parentheses? They don't do anything, but they suggest that it might be a tuple (it isn't; that would need a trailing comma, as in ('.pdf',)).
As an exercise, the first for loop can be written as: directory = [os.path.join(path_of_the_directory, x) for x in os.listdir(path_of_the_directory) if x.endswith(ext)]
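
To make the abspath point concrete, here is a small sketch. The helper name list_pdfs is made up for this example. It builds each path from the folder that was listed, while the last line shows that os.path.abspath only prepends the current working directory to a bare file name.

```python
import os


def list_pdfs(folder, ext=".pdf"):
    """Return full paths of the files in folder whose names end with ext."""
    return [
        os.path.join(folder, name)
        for name in sorted(os.listdir(folder))
        if name.endswith(ext)
    ]


# abspath never looks at `folder`; it resolves the bare name against os.getcwd().
print(os.path.abspath("x.pdf"))
```

Unless the script happens to be run from inside the PDF folder, the path printed by the last line points at a file that does not exist.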

Python - Script that copy certain files by file name

I wrote a script to copy files with specific names from one folder to another.
The file name format I want to copy is 2021052444592AKC. However, the script I wrote copies all files ending with AKC, even though the if condition says to copy only files whose names start with "202105" and end with "AKC". The folder also contains other files in the same format, that is "YYYYMMDD44592threeUpperCaseLetters".
Can anyone help? I haven't found the answer to this problem. Thanks in advance :)
P.S. I'm using Python 3 in PyCharm.
import shutil
import os
os.chdir(r"C:\\")
# without a double backslash and the letter r, the compiler throws an error
dir_src = r"C:\\Users\\Adam\\Desktop\1\\"
dir_dst = r"C:\\Users\\Adam\\Desktop\\2\\"
for filename in os.listdir(dir_src):
    if filename.startswith("202105") and filename.endswith("AKC"):
        shutil.copy(dir_src + filename, dir_dst)
print("End")

I'm not sure exactly why your script is failing, but you might want to try a solution with a regular expression (re).
import os
import re
import shutil
# Match names such as 2021052444592AKC: "202105", a two-digit day, then "44592AKC"
pattern = re.compile(r'^202105(\d{2})44592AKC$')
# A raw string cannot end in a single backslash, so the trailing separator is left off; os.path.join adds it
dir_src = r"C:\Users\Adam\Desktop\1"
dir_dst = r"C:\Users\Adam\Desktop\2"
for filename in os.listdir(dir_src):
    if pattern.match(filename):
        shutil.copy(os.path.join(dir_src, filename), dir_dst)
print("End")
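
If you want to reuse the same idea for other months or suffixes, the regular expression can be built from parameters. This is only a sketch under the assumption that every name has the form YYYYMMDD44592 followed by three upper-case letters, as described in the question. The function name copy_matching and its parameters are hypothetical.

```python
import os
import re
import shutil


def copy_matching(dir_src, dir_dst, year_month="202105", suffix="AKC"):
    """Copy files named <year_month><DD>44592<suffix> from dir_src to dir_dst."""
    # re.escape keeps the parameters literal; fullmatch requires the whole name to fit.
    pattern = re.compile(re.escape(year_month) + r"\d{2}44592" + re.escape(suffix))
    copied = []
    for filename in sorted(os.listdir(dir_src)):
        if pattern.fullmatch(filename):
            shutil.copy(os.path.join(dir_src, filename), dir_dst)
            copied.append(filename)
    return copied
```

Returning the list of copied names makes it easy to print or check exactly which files were selected.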

I would like a way to have a "try again" for wrong user inputs. Is there a way to do this?

I've got a list of files I'm looping over and a list of folders. I match my filenames to the folders that contain matching words, and that works fine. My code detects whether there's a matching folder for a file and tells me which ones; then I type the name of that folder and it moves the file there. It loops over the files in the list, which can sometimes be large. However, if I accidentally mistype the name of the folder, which is a user input, it skips my file and moves on to the next one. Is there a way to get my code to NOT move on to the next file, but instead prompt me again?
if len(matching_folders) >= 2:
    print(f"There is MORE than one folder for {filename}" + "\n")
    if not filename in files_to_move:
        continue
    for item in matching_folders:
        print(item + "\n")
    answer_2 = input(f"Type name of folder: " + "\n")
    item_words = answer_2.lower().split(' ')
    for folder in folder_list:
        count = 0
        folder_words = folder.lower().split(' ')
        for word in item_words:
            if word in folder_words:
                count += 1
        if count == 2:
            folder_path = os.path.join(paths[1], folder)
            destination_file_path = os.path.join(folder_path, filename)
            shutil.move(source_file_path, destination_file_path)
            print(f"File moved to --> {folder}")
Excuse me if this is bad code; I'm just learning and taking it one step at a time. But again, if I make a typo, my code goes on to the next file in the loop (there's actually a for loop over filename one level above all this code, but there's a lot of other irrelevant stuff above it, so I didn't include it). I want it to not go to the next file if I make a typo. Thanks.

Use os.path.isdir(user_inputted_folder) to control your logic. This will return True if the user_inputted_folder exists and is a directory (not a regular file).
https://docs.python.org/3/library/os.path.html#os.path.isdir
If the actual folder is named "Folder1", and you misinput "Folderrr1", then this will return False as long as "Folderrr1" doesn't also exist (if it exists, then this "typo" can't be caught).
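
A minimal sketch of how that check could drive a retry loop, assuming the folders all live under one base directory (paths[1] in the question). The function name ask_for_folder and the read parameter, which only exists so the prompt can be replaced in tests, are made up for this example.

```python
import os


def ask_for_folder(base_path, prompt="Type name of folder: ", read=input):
    """Keep asking until the typed name is an existing directory under base_path."""
    while True:
        answer = read(prompt).strip()
        folder_path = os.path.join(base_path, answer)
        # An empty answer would join to base_path itself, so reject it explicitly.
        if answer and os.path.isdir(folder_path):
            return folder_path
        print(f"'{answer}' is not a folder in {base_path}, please try again.")
```

Inside the per-file loop you would call folder_path = ask_for_folder(paths[1]) and only then move the file, so a typo leads to another prompt instead of the next file.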

Problem with multivariables in string formatting

I have several files in a folder named t_000.png, t_001.png, t_002.png and so on.
I have made a for-loop to import them using string formatting. But when I use the for-loop I got the error
No such file or directory: '/file/t_0.png'
This is the code that I have used. I think I should use multiple %s, but I do not understand how.
for i in range(file.shape[0]):
    im = Image.open(dir + 't_%s.png' % str(i))
    file[i] = im

You need to pad the number with leading zeroes. With the type of formatting you're currently using, this should work:
im = Image.open(dir + 't_%03d.png' % i)
where the format specifier %03d means "format this integer with a width of at least 3 characters, padded with leading zeroes".
You can also use Python's other (more recent) string formatting syntax, which is somewhat more succinct:
im = Image.open(f"{dir}t_{i:03d}.png")

You are not padding the number with zeros, thus you get t_0.png instead of t_000.png.
The recommended way of doing this in Python 3 is via the str.format function:
for i in range(file.shape[0]):
    im = Image.open(dir + 't_{:03d}.png'.format(i))
    file[i] = im
You can see more examples in the documentation.
Formatted string literals are also an option if you are using Python 3.6 or a more recent version, see Green Cloak Guy's answer for that.

Try this:
import os
for i in range(file.shape[0]):
    im = Image.open(os.path.join(dir, f't_{i:03d}.png'))
    file[i] = im
(For versions of Python prior to 3.6, change f't_{i:03d}.png' to 't_{:03d}.png'.format(i) or 't_%03d.png' % i.)
The trick is to specify the number of leading zeros; take a look at the official docs for more info.
Also, you should replace the dir + file name concatenation with the more robust os.path.join(dir, file), which works regardless of whether dir ends with a directory separator (i.e. '/' on your platform).
Note also that dir is the name of a built-in function in Python (and file was a built-in in Python 2), so you may want to rename your variables to avoid shadowing them.
Also note that if file is a NumPy array, file[i] = im may not work as expected.
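
As a quick check of the padding, the sketch below formats the same index with all three styles mentioned in this thread and wraps the f-string version in a small helper. The name frame_path and its width parameter are invented for the example.

```python
import os


def frame_path(folder, index, width=3):
    """Build the path to frame number `index`, zero-padded to `width` digits."""
    return os.path.join(folder, f"t_{index:0{width}d}.png")


# The three formatting styles produce the same zero-padded name.
i = 7
names = ["t_%03d.png" % i, "t_{:03d}.png".format(i), f"t_{i:03d}.png"]
print(names)  # ['t_007.png', 't_007.png', 't_007.png']
```

The width is a minimum, not a maximum: an index with more than three digits, such as 1234, is written out in full rather than truncated.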

The code seems correct but my files aren't getting deleted

I heard that Python can make life easier. I wanted to remove duplicates in folderA by comparing folderB with folderA, so I decided to download Python and try coding with it. My code seems correct, but my files are not being deleted. What's wrong with it?
I tried unlink, but that doesn't work either.
import os
with open(r"C:\pathto\output.txt", "w") as a:
    for path, subdirs, files in os.walk(r'C:\pathto\directoryb'):
        for filename in files:
            #f = os.path.join(path, filename)
            #a.write(str(f) + os.linesep)
            a.write(str(filename) + '\n')
textFile = open(r'C:\output.txt', 'r')
line = textFile.readline()
while line:
    target = str(line)
    todelete = 'C:\directorya' + target
    if (os.path.exists(todelete)):
        os.remove(todelete)
    else:
        print("failed")
    line = textFile.readline()
textFile.close()
I want my files deleted. Basically, folderA contains some of the files in folderB, and I'm trying to delete them from folderA.

The call to os.remove(todelete) is fine: it deletes the file at the path stored in todelete. The problem is that the path it is given does not point to an existing file, so os.path.exists returns False and the code prints "failed" instead.
todelete = 'C:\directorya' + target
if (os.path.exists(todelete)):
    os.remove(todelete) # only reached if the path in todelete actually exists
Two things are wrong with that path. First, readline() keeps the trailing newline, so target ends in '\n'. Second, there is no separator between the folder and the file name, so the result looks like C:\directoryafile.txt followed by a newline. Strip the line and build the path with os.path.join instead.
Also note that the script writes the list to C:\pathto\output.txt but reads it back from C:\output.txt.
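
For reference, here is a sketch of one way to do the whole job without the intermediate text file. It is an assumption-laden example rather than a drop-in fix: the function name delete_duplicates is made up, files are matched by name only (as in the question), and each path is built with os.path.join so there is no stray newline or missing separator.

```python
import os


def delete_duplicates(folder_a, folder_b):
    """Delete files in folder_a whose names also appear anywhere under folder_b."""
    names_in_b = set()
    for _path, _subdirs, files in os.walk(folder_b):
        names_in_b.update(files)

    deleted = []
    for name in sorted(names_in_b):
        candidate = os.path.join(folder_a, name)
        if os.path.isfile(candidate):
            os.remove(candidate)  # removes the file on disk at this path
            deleted.append(name)
    return deleted
```

Since this deletes files, it is worth trying it first on copies of the two folders and printing the returned list.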
