How to join a directory with a wildcard search in Python - python-3.x

I am trying to read all the CSV files in a particular set of directories. The subdirectories are named report followed by a date, like 'report2021-12-22-14_15' and 'report2022-01-22-11_10'. At the moment I join the path manually, as below:
root = os.path.join(base, 'report2021-12-22-14_15', 'report')
Is there a way to do a wildcard search like 'report*' when joining the directories, so that I do not miss any subdirectories? Below is the snippet:
import os
from fnmatch import fnmatch
base = '/Users/user/Desktop/report_files/'
root = os.path.join(base, 'report2021-12-22-14_15', 'report')
pattern = "report.csv"
for path, dirs, files in os.walk(root):  # os.walk yields (path, dirs, files)
    for name in files:
        if fnmatch(name, pattern):
            file_path = os.path.join(path, name)
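One possible approach, sketched here rather than taken from the original thread, is to let glob expand the 'report*' wildcard and then walk each matched directory. The function name find_report_csvs and its parameters are made up for illustration.

```python
import glob
import os
from fnmatch import fnmatch


def find_report_csvs(base, dir_pattern="report*", file_pattern="report.csv"):
    """Collect files matching file_pattern under every subdirectory of base
    whose name matches dir_pattern (hypothetical helper)."""
    matches = []
    # glob expands the wildcard, so every 'report...' subdirectory is picked up
    for report_dir in sorted(glob.glob(os.path.join(base, dir_pattern))):
        if not os.path.isdir(report_dir):
            continue  # ignore plain files that happen to match the pattern
        # walk the matched directory, including its inner 'report' folder
        for path, dirs, files in os.walk(report_dir):
            for name in files:
                if fnmatch(name, file_pattern):
                    matches.append(os.path.join(path, name))
    return matches
```

Calling find_report_csvs('/Users/user/Desktop/report_files/') would then return one path per matching CSV file, without naming each dated folder by hand.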

Related

Find Files with the term "deadbolt" in it and return only first subfolder with os.walk

This script takes a search term and a path to a folder. Its goal is to search every subfolder for files whose names contain the term "deadbolt", collect them in a list, and return that list.
So far so good, but in the end I want to delete the first-level subfolder in which the script found a deadbolt file.
For example, I have the following folder structure:
d:/Movies/
├─ Movie1/subfolder1Movie1/subfolder2Movie1/movie1.mp4.deadbolt
├─ Movie2/subfolder1Movie2/subfolder2Movie2/movie2.mpeg
├─ Movie3/subfolder1Movie3/subfolder2Movie3/movie3.avi.deadbolt
In this case I provide the path "D:\Movies" and the term "deadbolt" and want the script to return ["Movie1","Movie3"], because I want to delete those folder structures completely, with their subfolders and files. But how can I get the first subfolder in which a file was found without using a regex?
import os
import re
def findDeadbolts(searchTerm, search_path):
    results = []
    for root, dir, files in os.walk(search_path, topdown=True):
        for filename in files:
            if searchTerm in filename:
                fullPath = os.path.join(root, filename)
                results.append(fullPath)
                # Don't want to do it with regex since names can be quite complex
                pattern = r"(?<=Movies\\)[a-zA-Z0-9_\-!?]+"
                print(re.search(pattern, fullPath)[0])
    return results
print(findDeadbolts('deadbolt', 'D:\\Movies'))
I found a solution for this.
It uses the "parts" attribute of "pathlib.Path", which gives me every component of a path. Since I know the root path, I can split both paths into parts; because indexing starts at 0, the element of the found path at index len(rootParts) is the first subfolder below the root.
from pathlib import Path
import os
from shutil import rmtree
foundPath = r"D:\Movies\Movie2\Movie2.avi.deadbolt"  # raw strings keep the backslashes literal
rootPath = r"D:\Movies"
foundParts = Path(foundPath).parts
rootParts = Path(rootPath).parts
folder = foundParts[len(rootParts)]  # first path component below the root
rmtree(os.path.join(rootPath, folder))  # deletes the folder and everything in it
If there is a better solution for this, comment below. ;-)
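A possible variation, offered as a sketch and not as the poster's method, combines the walk and the "first subfolder" step by using Path.relative_to instead of counting parts. The function name top_level_folders_with is hypothetical, and the sketch only returns the folder names; deleting them is left to the caller.

```python
import os
from pathlib import Path


def top_level_folders_with(search_term, search_path):
    """Return the names of the first-level subfolders of search_path that
    contain, at any depth, a file whose name includes search_term."""
    root = Path(search_path)
    found = set()
    for current, dirs, files in os.walk(root):
        if any(search_term in name for name in files):
            # path of the current folder relative to the root, e.g. Movie1/sub1/sub2
            rel = Path(current).relative_to(root)
            if rel.parts:  # empty when the file sits directly in the root
                found.add(rel.parts[0])
    return sorted(found)
```

Each returned name could then be passed to shutil.rmtree(os.path.join(search_path, name)), as in the answer above, once the list has been checked.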

How to rename the files of different format from the same folder but different subfolder using python

I have a scenario where I have to rename the files in a folder. The current layout is:
Elements (main folder)
  2 (subfolder 1)
    sample_2_description.txt (filename 1)
    sample_2_video.avi (filename 2)
  3 (subfolder 2)
    sample_3_tag.jpg (filename 1)
    sample_3_analysis.GIF (filename 2)
    sample_3_word.docx (filename 3)
I want to change the names of the files to:
Elements (main folder)
  2 (subfolder 1)
    description.txt (filename 1)
    video.avi (filename 2)
  3 (subfolder 2)
    tag.jpg (filename 1)
    analysis.GIF (filename 2)
    word.docx (filename 3)
Could anyone guide me on how to write the code?
Recursive directory traversal to rename a file can be based on this answer. All we are required to do is to replace the file name instead of the extension in the accepted answer.
Here is one way - split the file name by _ and use the last index of the split list as the new name
import os
directory = "/path/to/parent/folder"  # the main folder (Elements in the example)
for subdir, dirs, files in os.walk(directory):
    for filename in files:
        # create the new name by splitting the old name by _ and grabbing the last piece
        newName = filename.split("_")[-1]
        if newName != filename:
            filePath = os.path.join(subdir, filename)  # full path to the existing file
            newFilePath = os.path.join(subdir, newName)  # same folder, new name
            os.rename(filePath, newFilePath)  # rename your file
Hope this helps.
Check the code example below for the first file (filename1); replace path with the actual path of the file:
import os
os.rename(r'path\sample_2_description.txt', r'path\description.txt')
print("File Renamed!")
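Taking only the last piece after splitting on _ would shorten a name such as sample_2_my_notes.txt to notes.txt. A minimal sketch of an alternative, assuming every file starts with exactly two underscore-separated prefix fields (like sample_2_), is to limit the number of splits. The function name strip_prefix and the parts_to_drop parameter are invented for this illustration.

```python
from pathlib import Path


def strip_prefix(root, parts_to_drop=2):
    """Rename files under root by dropping the first parts_to_drop
    underscore-separated fields of each file name (hypothetical helper)."""
    renamed = []
    # sorted() materialises the listing before any file is renamed
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        pieces = path.name.split("_", parts_to_drop)
        if len(pieces) <= parts_to_drop or not pieces[-1]:
            continue  # the name does not carry the expected prefix
        target = path.with_name(pieces[-1])
        if target.exists():
            continue  # never overwrite an existing file
        path.rename(target)
        renamed.append(target)
    return renamed
```

Files that do not match the assumed prefix, or whose new name is already taken, are left untouched, which makes a second run harmless.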

creating corresponding subfolders and writing a portion of the file in new files inside those subfolders using python

I have a folder named "data". It contains the subfolders "data_1", "data_2", and "data_3", and these subfolders contain some text files. I want to go through all these subfolders and generate subfolders with the same names inside another folder named "processed_data". For each original file I also want to generate a file with "processed" as a prefix in its name, and write to it every line of the original file that contains "1293".
I am using the code below but I do not get the required result. Neither the subfolders "data_1", "data_2", and "data_3" nor the files are created.
import os
folder_name=""
def pre_processor():
data_location="D:\data" # folder containing all the data
for root, dirs, files in os.walk(data_location):
for dir in dirs:
#folder_name=""
folder_name=dir
for filename in files:
with open(os.path.join(root, filename),encoding="utf8",mode="r") as f:
processed_file_name = 'D:\\processed_data\\'+folder_name+'\\'+'processed'+filename
processed_file = open(processed_file_name,"w", encoding="utf8")
for line_number, line in enumerate(f, 1):
if "1293" in line:
processed_file.write(str(line))
processed_file.close()
pre_processor()
You might need to elaborate on the issue you are having; e.g., are the files being created, but empty?
A few things I notice:
1) Your indentation is off (not sure if this is just a copy-paste issue though): the body of pre_processor is not indented under the def line, so the function itself is empty.
Try this:
import os
folder_name=""
def pre_processor():
    data_location=r"D:\data" # folder containing all the data
    for root, dirs, files in os.walk(data_location):
        for dir in dirs:
            #folder_name=""
            folder_name=dir
        for filename in files:
            with open(os.path.join(root, filename), encoding="utf8",mode="r") as f:
                processed_file_name = 'D:\\processed_data\\'+folder_name+'\\'+'processed'+filename
                processed_file = open(processed_file_name,"w", encoding="utf8")
                for line_number, line in enumerate(f, 1):
                    if "1293" in line:
                        processed_file.write(str(line))
                processed_file.close()
pre_processor()
2) Check whether processed_data and its subfolders exist; if not, create them first, as this code will not do so.
Instead of building the path to the new folder by hand, you could just replace the name of the folder in the path.
Furthermore, you are not creating the subfolders.
This code should work, but it uses Linux-style forward slashes, so replace them if you are on Windows:
import os
def pre_processor():
    data_location = "data"  # folder containing all the data
    for root, dirs, files in os.walk(data_location):
        for filename in files:
            joined_path = os.path.join(root, filename)
            # mirror the subfolder under processed_data (Linux-style separators)
            processed_folder_name = root.replace("data/", "processed_data/", 1)
            processed_file_name = processed_folder_name + '/processed' + filename
            if not os.path.exists(processed_folder_name):
                os.makedirs(processed_folder_name)
            with open(joined_path, encoding="utf8", mode="r") as f:
                with open(processed_file_name, "w", encoding="utf8") as processed_file:
                    for line in f:
                        if "1293" in line:  # keep only the lines containing 1293
                            processed_file.write(line)
pre_processor()
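A separator-independent variant can be sketched with os.path.relpath, which avoids replacing text inside the path. This is an illustration under assumptions, not the answerer's code: the function name filter_tree and its parameters are made up, and the destination folder is assumed to live outside the source folder so that the walk does not visit its own output.

```python
import os


def filter_tree(src_root, dst_root, needle="1293", prefix="processed"):
    """Mirror the folder tree of src_root under dst_root and copy only the
    lines containing needle into files named prefix + original name."""
    written = []
    for root, dirs, files in os.walk(src_root):
        # location of the current folder relative to the source root
        rel = os.path.relpath(root, src_root)
        out_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(out_dir, exist_ok=True)  # creates data_1, data_2, ... as needed
        for filename in files:
            out_path = os.path.join(out_dir, prefix + filename)
            with open(os.path.join(root, filename), encoding="utf8") as src, \
                    open(out_path, "w", encoding="utf8") as dst:
                dst.writelines(line for line in src if needle in line)
            written.append(out_path)
    return written
```

With the layout from the question, filter_tree("data", "processed_data") would create processed_data/data_1 and so on, each holding the filtered copies.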

Moving files in python based on file and folder name

Relatively new to Python (not using it every day), but I am trying to simplify some things. I basically have keys with long names, where a subset of the key (the file name) matches the name of the associated folder. (Excuse the indentation; it is properly indented in my script.) For example:
file1 would be: 101010-CDFGH-8271.dat and folder is CDFGH-82
file2 would be: 101010-QWERT-7425.dat and folder is QWERT-74
import os
import glob
import shutil
files = os.listdir("files/location")
dest_1 = os.listdir("dest/location")
for f in files:
    file = f[10:21]
    for d in dest_1:
        dire = d
        if file == dire:
            shutil.move(file, dest_1)
The code runs with no errors, however nothing moves. Look forward to your reply and chance to learn.
Sorry updated the format.
Try a variation of:
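import os
import shutil
basedir = "dest/location"
for fname in os.listdir("files/location"):
    # adjust the slice so it yields the folder name;
    # for the sample names in the question that would be fname[7:15]
    dirname = os.path.join(basedir, fname[10:21])
    if os.path.isdir(dirname):
        path = os.path.join("files/location", fname)  # full path to the source file
        shutil.move(path, dirname)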
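As a rough illustration of the same idea with the slice made explicit: for the sample name 101010-CDFGH-8271.dat, the characters at positions 7 to 14 spell CDFGH-82, so the sketch below defaults to slice(7, 15). The function name move_by_key and the key parameter are hypothetical, and real file names may need a different slice.

```python
import os
import shutil


def move_by_key(src_dir, dest_dir, key=slice(7, 15)):
    """Move each file in src_dir into the subfolder of dest_dir whose name
    equals the part of the file name selected by key."""
    moved = []
    for fname in os.listdir(src_dir):
        folder = fname[key]
        if not folder:
            continue  # name too short to contain a key
        src_path = os.path.join(src_dir, fname)
        target_dir = os.path.join(dest_dir, folder)
        # only move real files, and only when the matching folder already exists
        if os.path.isfile(src_path) and os.path.isdir(target_dir):
            shutil.move(src_path, target_dir)
            moved.append(fname)
    return sorted(moved)
```

The important difference from the code in the question is that shutil.move receives the full source path and a single destination folder, not a bare name fragment and a list.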

Exclude directory path if contains a given string

I wish to exclude any path from further action if it contains a given string.
Code example:
import os
Dirpath = input('What directory path e.g. C:/ ')
FileType = input('What Ext type to search for e.g. txt ')
for root, dirs, files in os.walk(Dirpath):
    for file in files:
        if file.endswith(FileType):
            print(os.path.join(root, file))
I need to ignore any path which contains Dropbox, e.g.
c:/Users\ljh36\Dropbox\Shared Folders\walk.tmp
Can any guidance be given, please?
An example in the os.walk documentation shows removing a directory named CVS from the dirs list; you can adapt this to your code. Removing a name from dirs in place stops os.walk from descending into that directory.
In the prompt string, the forward slash can be written as a backslash if you put r just before the string, so you do not need to escape it with an additional backslash.
import os
Dirpath = input(r'What directory path e.g. C:\ ')
FileType = input('What Ext type to search for e.g. txt ')
for root, dirs, files in os.walk(Dirpath):
    # Remove 'Dropbox' from the list of dirs to walk.
    if 'Dropbox' in dirs:
        dirs.remove('Dropbox')
    for file in files:
        if file.endswith(FileType):
            print(os.path.join(root, file))
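The code above skips only a directory named exactly Dropbox. If the goal is to skip every directory whose name merely contains the string, one way to sketch it is to filter dirs in place with slice assignment. The function name find_files and its parameters are assumptions made for this example.

```python
import os


def find_files(top, extension, exclude="Dropbox"):
    """Return files under top ending with extension, skipping every
    directory whose name contains exclude (hypothetical helper)."""
    matches = []
    for root, dirs, files in os.walk(top):
        # dirs[:] = ... edits the list in place; os.walk then
        # does not descend into the directories that were dropped
        dirs[:] = [d for d in dirs if exclude not in d]
        for name in files:
            if name.endswith(extension):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Rebinding the name with dirs = [...] would not have the same effect, because os.walk keeps using the original list object.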
