Moving files in python based on file and folder name - python-3.x

I am relatively new to Python (I don't use it every day), but I am trying to simplify some things. I have keys (file names) with long names, and a subset of each key matches the name of its associated folder. For example:
file1 would be: 101010-CDFGH-8271.dat and folder is CDFGH-82
file2 would be: 101010-QWERT-7425.dat and folder is QWERT-74
import os
import glob
import shutil
files = os.listdir("files/location")
dest_1 = os.listdir("dest/location")
for f in files:
    file = f[10:21]
    for d in dest_1:
        dire = d
        if file == dire:
            shutil.move(file, dest_1)
The code runs with no errors, but nothing moves. I look forward to your reply and the chance to learn.
Sorry updated the format.

Try a variation of:
import os
import shutil
basedir = "dest/location"
for fname in os.listdir("files/location"):
    # the folder name is taken from a slice of the file name
    dirname = os.path.join(basedir, fname[10:21])
    if os.path.isdir(dirname):
        path = os.path.join("files/location", fname)
        shutil.move(path, dirname)
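The question's loop passes the sliced name (not the full source path) and the list `dest_1` (not a directory) to `shutil.move`, and since no error was raised, the slice apparently never matched a folder name. Below is a minimal sketch of the answer's approach wrapped in a function. The name `move_by_key` and the `start`/`stop` parameters are illustrative assumptions, not part of the original posts. For the sample names shown in the question, the folder name corresponds to `fname[7:15]`; the real keys may need a different offset.

```python
import os
import shutil


def move_by_key(src_dir, dest_root, start, stop):
    """Move each file in src_dir into dest_root/<fname[start:stop]>, if that folder exists.

    Returns the list of file names that were moved (hypothetical helper).
    """
    moved = []
    for fname in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, fname)
        if not os.path.isfile(src_path):
            continue  # skip subdirectories
        key = fname[start:stop]
        if not key:
            continue  # name too short to contain a key
        target_dir = os.path.join(dest_root, key)
        if os.path.isdir(target_dir):
            # give shutil.move the full source path and a real destination path
            shutil.move(src_path, os.path.join(target_dir, fname))
            moved.append(fname)
    return moved
```

Printing `fname[start:stop]` for a few files before moving anything is a cheap way to confirm that the slice really produces the folder names.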

Related

How to copy merge files of two different directories with different extensions into one directory and remove the duplicated ones

I need a Python function that does the following.
I have two directories: one contains files in .xml format and the other contains files in .pdf format. To simplify things, consider this example:
Directory 1: a.xml, b.xml, c.xml
Directory 2: a.pdf, c.pdf, d.pdf
Output:
Directory 3: a.xml, b.xml, c.xml, d.pdf
As you can see, the .xml files take priority when files in both directories share the same base name.
I would be thankful for your help.
You can use the shutil and os modules to achieve this. The function below works under the following assumptions:
All files in a given directory have the same extension
The priority_directory is the directory whose files take priority
The secondary_directory is the directory whose files are dropped in case of a name collision
Try:
import os
import shutil
def copy_files(priority_directory, secondary_directory, destination="new_directory"):
    # base names (without extension) of the priority files, used to detect collisions
    file_names = {os.path.splitext(filename)[0] for filename in os.listdir(priority_directory)}
    os.makedirs(destination, exist_ok=True)  # create the destination directory if it does not exist
    for file in os.listdir(priority_directory):  # this loop copies the first directory as it is
        file_path = os.path.join(priority_directory, file)
        dst_path = os.path.join(destination, file)
        shutil.copy(file_path, dst_path)
    for file in os.listdir(secondary_directory):  # this loop skips files whose base name collides
        if os.path.splitext(file)[0] not in file_names:
            file_path = os.path.join(secondary_directory, file)
            dst_path = os.path.join(destination, file)
            shutil.copy(file_path, dst_path)
    print(os.listdir(destination))
Let's run it with your directory names as arguments:
copy_files('directory_1','directory_2','directory_3')
A new directory named directory_3 is created with the desired files in it.
This works for all similar cases, whatever the extensions are.
Note: I guess there should be no need to do this, because a directory can hold two files with the same base name as long as their extensions differ.
Rough working solution:
import os
from shutil import copy2
d1 = './d1/'
d2 = './d2/'
d3 = './d3/'  # destination directory; it must already exist
ext_1 = '.xml'
ext_2 = '.pdf'
def get_files(d: str, files: list):
    # collect (directory, filename) pairs, skipping a .pdf when the matching .xml is already listed
    for filename in os.listdir(d):
        dup = False
        if filename[-4:] == ext_2:
            for (x, y) in files:
                if y == filename[:-4] + ext_1:
                    dup = True
                    break
        if dup:
            continue
        files.append((d, filename))
files = []
get_files(d1, files)  # the priority directory must be processed first
get_files(d2, files)
for d, file in files:
    copy2(d + file, d3)
I'll see if I can get it to look/perform better.
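Both answers follow the same idea: copy the priority directory, then copy only those secondary files whose base name is not already taken. A more compact sketch of that idea using pathlib and a set is shown below. The function name `merge_with_priority` is an illustrative assumption and does not come from either answer.

```python
import shutil
from pathlib import Path


def merge_with_priority(priority_dir, secondary_dir, dest_dir):
    """Copy all files from priority_dir, then only those files from
    secondary_dir whose base name (without extension) is not already taken.

    Returns the sorted file names found in dest_dir afterwards.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    taken = set()  # base names already copied from the priority directory
    for src in sorted(Path(priority_dir).iterdir()):
        if src.is_file():
            shutil.copy2(src, dest / src.name)
            taken.add(src.stem)

    for src in sorted(Path(secondary_dir).iterdir()):
        if src.is_file() and src.stem not in taken:
            shutil.copy2(src, dest / src.name)

    return sorted(p.name for p in dest.iterdir())
```

A set lookup avoids rescanning the list of collected files for every secondary file, which matters only when the directories are large.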

For Loop to Move and Rename .html Files - Python 3

I'm asking for help in trying to create a loop to make this script go through all files in a local directory. Currently I have this script working with a single HTML file, but would like it so it picks the first file in the directory and just loops until it gets to the last file in the directory.
It would also help to have a line that appends (1), (2), (3), etc. to the end of the name when names are duplicated.
Can anyone help with renaming thousands of files using a string that is parsed with BeautifulSoup4? Each file contains a name and reference number at the same position/line. Files may share both the name and the reference number, or have the same name with a different reference number.
import bs4, shutil, os
src_dir = os.getcwd()
print(src_dir)
dest_dir = os.mkdir('subfolder')
os.listdir()
dest_dir = src_dir+"/subfolder"
src_file = os.path.join(src_dir, 'example_filename_here.html')
shutil.copy(src_file, dest_dir)
exampleFile = open('example_filename_here.html')
exampleSoup = bs4.BeautifulSoup(exampleFile.read(), 'html.parser')
elems = exampleSoup.select('.bodycopy')
type(elems)
elems[2].getText()
dst_file = os.path.join(dest_dir, 'example_filename_here.html')
new_dst_file_name = os.path.join(dest_dir, elems[2].getText()+ '.html')
os.rename(dst_file, new_dst_file_name)
os.chdir(dest_dir)
print(elems[2].getText())
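One possible way to structure the loop and the duplicate-name suffix is sketched below, using only the standard library. The names `unique_path`, `copy_and_rename_all`, and `get_title` are hypothetical. The title extraction is passed in as a function so that the BeautifulSoup lines from the script above can be plugged in unchanged. The sketch assumes the extracted text contains no characters that are invalid in file names.

```python
import os
import shutil


def unique_path(dest_dir, stem, ext=".html"):
    """Return a path in dest_dir that does not exist yet, adding (1), (2), ... if needed."""
    candidate = os.path.join(dest_dir, stem + ext)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(dest_dir, f"{stem} ({counter}){ext}")
        counter += 1
    return candidate


def copy_and_rename_all(src_dir, dest_dir, get_title):
    """Copy every .html file in src_dir to dest_dir, named after get_title(path)."""
    os.makedirs(dest_dir, exist_ok=True)
    for fname in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, fname)
        if not (fname.lower().endswith(".html") and os.path.isfile(src_path)):
            continue  # ignore non-HTML files and subdirectories
        title = get_title(src_path).strip()
        shutil.copy(src_path, unique_path(dest_dir, title))


# A get_title built from the question's own parsing lines might look like this
# (assumes bs4 is installed and the third '.bodycopy' element holds the name):
#
# def title_from_bodycopy(path):
#     with open(path) as fh:
#         soup = bs4.BeautifulSoup(fh.read(), 'html.parser')
#     return soup.select('.bodycopy')[2].getText()
```

Copying to the final name in one step avoids the separate copy-then-rename used for the single-file case.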

Compare by NAME only, and not by NAME + EXTENSION using existing code; Python 3.x

The python 3.x code (listed below) does a great job of comparing files from two different directories (Input_1 and Input_2) and finding the files that match (are the same between the two directories). Is there a way I can alter the existing code (below) to find files that are the same BY NAME ONLY between the two directories. (i.e. find matches by name only and not name + extension)?
comparison = filecmp.dircmp(Input_1, Input_2) #Specifying which directories to compare
common_files = ', '.join(comparison.common) #Finding the common files between the directories
TextFile.write("Common Files: " + common_files + '\n') # Writing the common files to a new text file
Example:
Directory 1 contains: Tacoma.xlsx, Prius.txt, Landcruiser.txt
Directory 2 contains: Tacoma.doc, Avalon.xlsx, Rav4.doc
"TACOMA" are two different files (different extensions). Could I use basename or splitext somehow to compare files by name only and have it return "TACOMA" as a matching file?
To get the file name, try:
from os import path
fil = r'..\file.doc'  # raw string, so that \f is not read as a form-feed escape
fil_name = path.splitext(fil)[0].split('\\')[-1]
This stores file in fil_name. To compare files, run:
from os import listdir, path
from os.path import isfile, join
def compare(dir1, dir2):
    files1 = [f for f in listdir(dir1) if isfile(join(dir1, f))]
    files2 = [f for f in listdir(dir2) if isfile(join(dir2, f))]
    common_files = []
    for i in files1:
        for j in files2:
            if(path.splitext(i)[0] == path.splitext(j)[0]): #this compares it name by name.
                common_files.append(i)
    return common_files
Now just call it:
common_files = compare(dir1,dir2)
String comparison in Python is case-sensitive. If you want the common files regardless of upper or lower case, then instead of:
if(path.splitext(i)[0] == path.splitext(j)[0]):
use:
if(path.splitext(i)[0].lower() == path.splitext(j)[0].lower()):
Your code worked very well! Thank you again, Infinity TM! The final version of the code is shown below for anyone else to look at. (Note that Input_3 and Input_4 are the directories.)
def Compare():
    Input_3 = #Your directory here
    Input_4 = #Your directory here
    files1 = [f for f in listdir(Input_3) if isfile(join(Input_3, f))]
    files2 = [f for f in listdir(Input_4) if isfile(join(Input_4, f))]
    common_files = []
    for i in files1:
        for j in files2:
            if(path.splitext(i)[0].lower() == path.splitext(j)[0].lower()):
                common_files.append(path.splitext(i)[0])
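The nested loops compare every file in one directory with every file in the other. The same comparison can be expressed as a set intersection of base names, sketched below as an alternative. The function name `common_stems` and the `ignore_case` parameter are illustrative assumptions.

```python
import os


def common_stems(dir1, dir2, ignore_case=True):
    """Return the sorted base names (no extension) present in both directories."""

    def stems(directory):
        result = set()
        for name in os.listdir(directory):
            if os.path.isfile(os.path.join(directory, name)):
                stem = os.path.splitext(name)[0]
                result.add(stem.lower() if ignore_case else stem)
        return result

    # set intersection keeps only the names found in both directories
    return sorted(stems(dir1) & stems(dir2))
```

Because sets hold each name once, a base name that occurs several times in one directory is reported only once.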

Rename multiple files in Python from another list

I am trying to rename multiple files using suffixes from another list. For example, rename test.wav to test_1.wav using the list ['_1','_2'].
import os
list_2 = ['_1','_2']
path = '/Users/file_process/new_test/'
file_name = os.listdir(path)
for name in file_name:
    for ele in list_2:
        new_name = name.replace('.wav',ele+'.wav')
        os.renames(os.path.join(path,name),os.path.join(path,new_name))
However, I get the error "FileNotFoundError: [Errno 2] No such file or directory: /Users/file_process/new_test/test.wav -> /Users/file_process/new_test/test_2.wav".
The first file in the folder was renamed to test_1.wav, but the rest were not.
You are looping over the whole list for each file. You need to iterate over the file names and the list together in a single for loop.
This can be done with zip(file_name, list_2).
This renames each file by appending the corresponding element of the list. Just make sure the list has as many elements as there are files, because zip stops at the shorter of the two.
Code:
import os
list_2 = ['_1','_2']
path = '/Users/file_process/new_test/'
file_name = os.listdir(path)
for name, ele in zip(file_name, list_2):
    # drop the 4-character '.wav' extension, add the suffix, then put the extension back
    new_name = name[:-4] + ele + '.wav'
    print(new_name)
    os.renames(os.path.join(path,name),os.path.join(path,new_name))
There is an error in your algorithm.
For each file in the outer loop (for name in file_name), the inner loop first renames test.wav to test_1.wav. After that step, no file named test.wav exists any more (it has already been renamed to test_1.wav). The inner loop nevertheless tries to rename test.wav to test_2.wav and, of course, cannot find it.
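The pairing of one file with one suffix can be sketched as a small function. The name `add_suffixes` is an illustrative assumption. Two details differ from the posted code: the listing is sorted, because os.listdir returns names in arbitrary order, and os.path.splitext replaces the fixed `[:-4]` slice.

```python
import os


def add_suffixes(directory, suffixes, ext=".wav"):
    """Rename the files ending in ext, pairing each with one suffix, in sorted order.

    Returns the new file names. Files without a matching suffix are left alone,
    because zip stops at the shorter sequence.
    """
    names = sorted(n for n in os.listdir(directory) if n.endswith(ext))
    renamed = []
    for name, suffix in zip(names, suffixes):
        stem, extension = os.path.splitext(name)
        new_name = stem + suffix + extension
        os.rename(os.path.join(directory, name), os.path.join(directory, new_name))
        renamed.append(new_name)
    return renamed
```

Each file is renamed exactly once, so the FileNotFoundError from the nested loops cannot occur.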

Extracting file names from a list using single line for loop - python

I am trying to see if I can extract the file names from an os.listdir() output, omitting the '.csv' part, in a single-line for loop.
For example, my list of file names looks like this:
files = ['OPS020.csv','OPS340.csv','OPS230.csv','OPS349.csv']
All I could come up with was this:
file_names = [f.split('.') for f in files]
file_names = [f[0] for f in file_names]
Is there a more elegant and shorter way to do this ?
the output i'm expecting is
file_names : ['OPS020','OPS340','OPS230','OPS349']
I guess, something like this would work.
from os import path
files = ['OPS020.csv','OPS340.csv','OPS230.csv','OPS349.csv']
filenames = [path.splitext(x)[0] for x in files]
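An equivalent one-liner with pathlib, shown as an alternative sketch: `Path(name).stem` drops the last extension, just as `path.splitext(name)[0]` does for a bare file name.

```python
from pathlib import Path

files = ['OPS020.csv', 'OPS340.csv', 'OPS230.csv', 'OPS349.csv']

# .stem is the final path component without its last suffix
file_names = [Path(f).stem for f in files]
print(file_names)  # ['OPS020', 'OPS340', 'OPS230', 'OPS349']
```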
