Rename multiple files in Python from another list - python-3.x

I am trying to rename multiple files using suffixes taken from another list, for example renaming test.wav to test_1.wav using the list ['_1','_2'].
import os
list_2 = ['_1','_2']
path = '/Users/file_process/new_test/'
file_name = os.listdir(path)
for name in file_name:
    for ele in list_2:
        new_name = name.replace('.wav',ele+'.wav')
        os.renames(os.path.join(path,name),os.path.join(path,new_name))
But I get the error "FileNotFoundError: [Errno 2] No such file or directory: /Users/file_process/new_test/test.wav -> /Users/file_process/new_test/test_2.wav".
The first file in the folder was renamed to test_1.wav, but the rest were not.

You are looping over the whole list for each file, so the first file gets renamed once per list element. Instead, iterate over the file names and the list together in a single for loop.
This can be done using the zip(file_name, list_2) function.
This renames each file by appending the corresponding element of the list. We just have to make sure the list has as many elements as there are files, because zip stops at the shorter of the two.
Code:
import os
list_2 = ['_1','_2']
path = '/Users/file_process/new_test/'
file_name = os.listdir(path)
for name, ele in zip(file_name, list_2):
    # drop the 4-character '.wav' extension, append the suffix, then put the extension back
    new_name = name[:-4] + ele + '.wav'
    print(new_name)
    os.renames(os.path.join(path,name),os.path.join(path,new_name))

There is an error in your algorithm.
For each file in the outer loop (for name in file_name), the inner loop first renames test.wav to test_1.wav. After that step, no file named test.wav exists any more (it is now test_1.wav). The inner loop still tries to rename test.wav to test_2.wav and, of course, cannot find it.
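The pairing described above can be sketched as a small function. This is only an illustration, not the asker's or the answerers' code: rename_with_suffixes is a hypothetical helper, it sorts the listing because os.listdir returns names in arbitrary order, and it uses os.path.splitext instead of slicing off four characters. The demo runs in a temporary directory so no real files are touched.

```python
import os
import tempfile


def rename_with_suffixes(path, suffixes):
    """Append one suffix per file, e.g. test.wav -> test_1.wav (hypothetical helper)."""
    # os.listdir returns names in arbitrary order, so sort for a predictable pairing.
    names = sorted(os.listdir(path))
    if len(names) != len(suffixes):
        raise ValueError(f"{len(names)} files but {len(suffixes)} suffixes")
    new_names = []
    for name, suffix in zip(names, suffixes):
        stem, ext = os.path.splitext(name)  # 'test.wav' -> ('test', '.wav')
        new_name = stem + suffix + ext
        os.rename(os.path.join(path, name), os.path.join(path, new_name))
        new_names.append(new_name)
    return new_names


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.wav", "b.wav"):
        open(os.path.join(demo_dir, name), "w").close()
    print(rename_with_suffixes(demo_dir, ["_1", "_2"]))  # ['a_1.wav', 'b_2.wav']
```

Each file is renamed exactly once, so the FileNotFoundError from the nested loops cannot occur.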

Related

Python filepaths have double backslashes

Ultimately, I want to loop through every PDF in a specified directory ('C:\Users\dude\pdfs_for_parsing') and print the metadata for each PDF. The issue is that when I try to loop through "directory" I receive the error "FileNotFoundError: [Errno 2] No such file or directory:". I understand this error is occurring because I now have double slashes in my file paths for some reason.
Example Code
import PyPDF2
import os
path_of_the_directory = r'C:\Users\dude\pdfs_for_parsing'
directory = []
ext = ('.pdf')
def isolate_pdfs():
    for files in os.listdir(path_of_the_directory):
        if files.endswith(ext):
            x = os.path.abspath(files)
            directory.append(x)
    for pdf in directory:
        reader = PyPDF2.PdfReader(pdf)
        information = reader.metadata
        print(information)
isolate_pdfs()
If I print the file paths one at a time, I see that the files have single '/' like I'm expecting:
for pdf in directory:
    print(pdf)
The '//' seems to get added when I try to open each of the PDFs with PDFFile = open(pdf,'rb').
Your issue has nothing to do with //. It's here:
os.path.abspath(files)
Say you have C:\Users....\x.pdf. You list that directory, so files will contain just x.pdf. You then take the absolute path of x.pdf, which abspath resolves against the current working directory, not against the directory you listed. You should replace it with:
x = os.path.join(path_of_the_directory, files)
Other notes:
PDFFile and PDF shouldn't be in uppercase. Prefer pdf_file and pdf_reader. The latter also avoids confusion with the pdf in for pdf in...
Try to use a debugger rather than print statements. This is how I found your bug. It can be in your IDE or on the command line with python -i. You can step through your code, test a few variations, fiddle with the variables...
Why is ext = ('.pdf') in parentheses? They don't do anything, but they suggest it might be a tuple (it isn't).
As an exercise, the first for loop can be written as: directory = [os.path.join(path_of_the_directory, x) for x in os.listdir(path_of_the_directory) if x.endswith(ext)]
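The difference between os.path.abspath and os.path.join described above can be checked with a short sketch. It assumes a throwaway temporary directory containing an empty x.pdf rather than the asker's real folder, and list_pdf_paths is a hypothetical helper name.

```python
import os
import tempfile


def list_pdf_paths(directory):
    """Return full paths of the .pdf files in `directory` (hypothetical helper)."""
    return [
        os.path.join(directory, name)
        for name in sorted(os.listdir(directory))
        if name.endswith(".pdf")
    ]


with tempfile.TemporaryDirectory() as pdf_dir:
    open(os.path.join(pdf_dir, "x.pdf"), "w").close()

    # abspath resolves a bare file name against the current working directory.
    wrong = os.path.abspath("x.pdf")
    # join builds the path from the directory that was actually listed.
    right = list_pdf_paths(pdf_dir)[0]

    print("abspath gives:", wrong)
    print("join gives:   ", right)
    print("join path exists:", os.path.exists(right))
```

Unless the script happens to run from inside the listed directory, the first path points somewhere the file does not live, which is what produces the FileNotFoundError.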

How to copy merge files of two different directories with different extensions into one directory and remove the duplicated ones

I need a Python function that does the following:
I have two directories: one contains .xml files and the other contains .pdf files. To simplify things, consider this example:
Directory 1: a.xml, b.xml, c.xml
Directory 2: a.pdf, c.pdf, d.pdf
Output:
Directory 3: a.xml, b.xml, c.xml, d.pdf
As you can see, the xml files take priority when files in both directories share the same name.
I would be thankful for your help.
You need the shutil and os modules to achieve this. The function below works under the following assumptions:
All files in a given directory have the same extension
priority_directory is the directory whose files are kept when names collide
secondary_directory is the directory whose files are dropped when names collide
Try:
import os,shutil
def copy_files(priority_directory,secondary_directory,destination = "new_directory"):
    file_names = [os.path.splitext(filename)[0] for filename in os.listdir(priority_directory)] # get the file names to check for collisions
    os.mkdir(destination) # make a new directory
    for file in os.listdir(priority_directory): # this loop copies the first directory as it is
        file_path = os.path.join(priority_directory,file)
        dst_path = os.path.join(destination,file)
        shutil.copy(file_path,dst_path)
    for file in os.listdir(secondary_directory): # this loop checks for collisions and drops files whose names collide
        if os.path.splitext(file)[0] not in file_names:
            file_path = os.path.join(secondary_directory,file)
            dst_path = os.path.join(destination,file)
            shutil.copy(file_path,dst_path)
    print(os.listdir(destination))
Let's run it with your directory names as arguments:
copy_files('directory_1','directory_2','directory_3')
A new directory named directory_3 will be created with the desired files in it.
This works for all similar cases, whatever the extensions are.
Note: I guess there should be no need to do this, because a directory can hold two files with the same name as long as their extensions differ.
Rough working solution:
import os
from shutil import copy2
d1 = './d1/'
d2 = './d2/'
d3 = './d3/'
ext_1 = '.xml'
ext_2 = '.pdf'
def get_files(d: str, files: list):
    for filename in os.listdir(d):
        dup = False
        if filename[-4:] == ext_2:
            # skip a .pdf if an .xml with the same base name was already collected
            for (x, y) in files:
                if y == filename[:-4] + ext_1:
                    dup = True
                    break
        if dup:
            continue
        files.append((d, filename))
files = []
get_files(d1, files)
get_files(d2, files)
os.makedirs(d3, exist_ok=True) # make sure the destination directory exists
for d, file in files:
    copy2(os.path.join(d, file), d3)
I'll see if I can get it to look/perform better.

For Loop to Move and Rename .html Files - Python 3

I'm asking for help creating a loop so that this script goes through all files in a local directory. Currently the script works with a single HTML file, but I would like it to pick the first file in the directory and loop until it reaches the last one.
It would also help to have a line that appends (1), (2), (3), etc. to the end of the name when names are duplicated.
Can anyone help with renaming thousands of files using a string that is parsed with BeautifulSoup4? Each file contains a name and reference number at the same position/line. It could be the same name and reference number, or a different reference number with the same name.
import bs4, shutil, os
src_dir = os.getcwd()
print(src_dir)
dest_dir = os.mkdir('subfolder')
os.listdir()
dest_dir = src_dir+"/subfolder"
src_file = os.path.join(src_dir, 'example_filename_here.html')
shutil.copy(src_file, dest_dir)
exampleFile = open('example_filename_here.html')
exampleSoup = bs4.BeautifulSoup(exampleFile.read(), 'html.parser')
elems = exampleSoup.select('.bodycopy')
type(elems)
elems[2].getText()
dst_file = os.path.join(dest_dir, 'example_filename_here.html')
new_dst_file_name = os.path.join(dest_dir, elems[2].getText()+ '.html')
os.rename(dst_file, new_dst_file_name)
os.chdir(dest_dir)
print(elems[2].getText())
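The duplicate-name part of the question can be sketched independently of the HTML parsing. This is a minimal sketch under stated assumptions: unique_path is a hypothetical helper, "Jane Doe 12345" is a made-up placeholder for the text returned by elems[2].getText(), and creating an empty file stands in for copying the real HTML file.

```python
import os
import tempfile


def unique_path(dest_dir, base_name, ext=".html"):
    """Return a path in dest_dir that does not exist yet (hypothetical helper).

    Tries 'name.html' first, then 'name (1).html', 'name (2).html', and so on.
    """
    candidate = os.path.join(dest_dir, base_name + ext)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(dest_dir, f"{base_name} ({counter}){ext}")
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as dest_dir:
    for _ in range(3):
        path = unique_path(dest_dir, "Jane Doe 12345")  # placeholder for the parsed name
        open(path, "w").close()  # stand-in for copying the real HTML file
    print(sorted(os.listdir(dest_dir)))
```

In a loop over os.listdir(src_dir), the parsed name of each file would be passed to such a helper before the copy or rename, so a later file with the same name never overwrites an earlier one.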

read_csv won't read from a file list and specified directory

I have an issue with using Python read_csv in a function. If I use the following code, there is no issue. It will read the two files which I can then append to a single dataframe output:
import pandas as pd
directory = r"\\*my directory*"
files1 = ['001 Data.txt', '002 Data.txt']
def read_data(directory, files):
    list_ = []
    for file in files:
        df = pd.read_csv(directory + '\\' + file, sep='\t', header=0)
        ...do stuff...
        list_.append(df)
    return list_
df_mct20 = read_data(directory, files1) # This will generate my list which I can then concatenate
df_final = pd.concat(df_mct20)
The above code works fine. However, if I call this exact "read_data()" function in a for loop, I get the "No such file or directory" error:
files2 = ['003 Data.txt', '004 Data.txt']
for file in files2:
    df2 = read_data(directory, file) # The error shows up here "No such file or directory"
    ...want to do stuff...
I've tried a number of things and can't seem to get it to work. Any help will be greatly appreciated~
The second argument of your function has to be a list of filenames. In your loop you pass a single string instead, so the for loop inside the function iterates over its characters and tries to open a file named after each one.
You either need to rewrite your function to take one filename at a time, or pass files2 to it as a list without looping over the filenames yourself.
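The effect of passing a string where a list is expected can be seen without pandas. In this sketch, build_paths is a hypothetical stand-in for the loop inside read_data, and "data" is a placeholder directory name.

```python
import os


def build_paths(directory, files):
    """Mimic the loop in read_data: one path per element of `files` (hypothetical helper)."""
    return [os.path.join(directory, file) for file in files]


files2 = ["003 Data.txt", "004 Data.txt"]

# Passing one string: the loop walks over its characters, one "file" per character.
print(build_paths("data", files2[0])[:3])

# Passing a one-element list: the loop sees the whole file name.
print(build_paths("data", [files2[0]]))

# Or pass the whole list once, without an outer loop.
print(build_paths("data", files2))
```

So inside the original for loop, read_data(directory, [file]) would work, as would a single call read_data(directory, files2).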

Rename files according to list

I'm trying to rename files in a directory using a list. My code so far will only rename the first file before giving me a FileNotFoundError. How can I read the list and rename my files in the same order as it?
import os
import glob
fileLib = ('/filepath1/')
ref = ('/filepath2/ref.csv')
for file in glob.glob(os.path.join(fileLib, '*.csv')):
    with open(ref) as list1:
        line = list1.read().split(',\n')
        for name in line:
            os.rename(file, os.path.join(fileLib, '{}.csv'.format(name)))
You're applying the rename to the same file, since the loops are nested.
So the first time it works, and the next time it tries to rename a file that has been already renamed.
Reorganize your code. First, read the new names file:
fileLib = '/filepath1/'
ref = '/filepath2/ref.csv'
with open(ref) as list1:
    newnames = list1.read().split(',\n')
Then zip the directory contents and the list of new names together in a single loop:
for file,newname in zip(glob.glob(os.path.join(fileLib, '*.csv')),newnames):
    os.rename(file, os.path.join(fileLib, '{}.csv'.format(newname)))
Since zip stops when one of its iterables is exhausted, the renaming will be only partial if the glob result is longer than the list of new names, so it is better to check that both lists have the same size before renaming.
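One way to make that size check explicit is to build the full list of (old, new) pairs first and rename only once the plan is valid. This is a sketch, not the answerer's code: plan_renames is a hypothetical helper, it lists the directory with sorted(os.listdir(...)) instead of glob to get a stable order, and it also rejects duplicate new names, which is an extra assumption about what should count as an error.

```python
import os
import tempfile


def plan_renames(directory, new_names, ext=".csv"):
    """Pair each existing file with its new name, refusing a size mismatch.

    Hypothetical helper: returns a list of (old_path, new_path) tuples.
    """
    old_names = sorted(n for n in os.listdir(directory) if n.endswith(ext))
    if len(old_names) != len(new_names):
        raise ValueError(f"{len(old_names)} files but {len(new_names)} new names")
    if len(set(new_names)) != len(new_names):
        raise ValueError("new names contain duplicates")
    return [
        (os.path.join(directory, old), os.path.join(directory, new + ext))
        for old, new in zip(old_names, new_names)
    ]


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as file_lib:
    for name in ("f1.csv", "f2.csv"):
        open(os.path.join(file_lib, name), "w").close()
    for old_path, new_path in plan_renames(file_lib, ["alpha", "beta"]):
        os.rename(old_path, new_path)
    print(sorted(os.listdir(file_lib)))  # ['alpha.csv', 'beta.csv']
```

Because nothing is renamed until the whole plan has been checked, a mismatch leaves the directory untouched instead of half renamed.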
