Python read oldest file first - python-3.x

I have object and attribute data in separate CSV files, and there are three different types of objects.
The directory may contain other files, but I only have to read and process the object and attribute files. After reading an object file, I have to read its corresponding attribute file.
Below are my code and sample file names.
plant = []
flower = []
person = []
for file_name in os.listdir(dir_path):
    if os.path.isfile(os.path.join(dir_path, file_name)):
        if file_name.startswith('plant_file'):
            plant.append(file_name)
        if file_name.startswith('person_file'):
            person.append(file_name)
        if file_name.startswith('flower_file'):
            flower.append(file_name)
for file_name in person:
    object_file_path = dir_path + file_name
    attribute_file_path = dir_path + file_name.replace('file','attributes_file')
    read_object_csv = pd.read_csv(object_file_path)
    read_attribute_csv = pd.read_csv(attribute_file_path)
for file_name in flower:
    object_file_path = dir_path + file_name
    attribute_file_path = dir_path + file_name.replace('file','attributes_file')
    read_object_csv = pd.read_csv(object_file_path)
    read_attribute_csv = pd.read_csv(attribute_file_path)
Each file name contains the date and time in the format YYYYMMDDHHMMSS. Sample file names are:
plant_attributes_file_20221013134403.csv
plant_attributes_file_20221013142151.csv
plant_attributes_file_20221013142455.csv
plant_file_20221013134403.csv
plant_file_20221013142151.csv
plant_file_20221013142455.csv
person_file_20221012134948.csv
person_file_20221012140706.csv
person_attributes_file_20221012134948.csv
person_attributes_file_20221012140706.csv
How can I sort the file names in each list by timestamp, so that the oldest file is loaded first and the latest file last?
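One possible approach, sketched here under the assumption that every file name ends in an underscore, a 14-digit YYYYMMDDHHMMSS stamp, and `.csv` (as in the samples): extract the stamp with a regular expression and use it as the sort key. The helper name `timestamp_of` and the pattern `TIMESTAMP_RE` are made up for this illustration.

```python
import re
from datetime import datetime

# Assumption: names end in "_<14 digits>.csv", as in the sample file names.
TIMESTAMP_RE = re.compile(r"_(\d{14})\.csv$")

def timestamp_of(file_name):
    """Parse the trailing YYYYMMDDHHMMSS stamp of a file name into a datetime."""
    match = TIMESTAMP_RE.search(file_name)
    if match is None:
        raise ValueError(f"no timestamp found in {file_name!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")

# Deliberately out of order to show the effect of the sort.
person = [
    "person_file_20221012140706.csv",
    "person_file_20221012134948.csv",
]

# Oldest first, latest last.
person_sorted = sorted(person, key=timestamp_of)
print(person_sorted)
```

Because the stamp has a fixed width and each list shares one prefix, a plain `sorted(person)` would probably give the same order here; parsing the stamp just makes the intent explicit and fails loudly on unexpected names.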

Related

how to load files into AudioSegment.from_file

I am trying to write a Python script that converts a file from '.m4a' to '.wav'. However, it cannot find the file and returns "[WinError 2] The system cannot find the file specified" every time.
paths[0] holds the full path of the file in question.
m4a_file = paths[0]
wav_filename = filename + ".wav"
#track = AudioSegment.from_file(m4a_file, format= 'm4a')
track = AudioSegment.from_file(file=relative_path,
                               format='m4a')
file_handle = track.export(wav_filename, format='wav')
I've tried giving both the full path and the relative path of the file in question:
'c:\Users\user\....\Workspace\Pollux\my_file.m4a'
and
'my_file.m4a'
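A small diagnostic sketch that may help narrow this down. It rests on an assumption about this setup: pydub passes non-WAV formats to an external ffmpeg (or avconv) executable, so a "file not found" error can refer to that executable rather than to the audio file. The helper name `diagnose` is hypothetical.

```python
import os
import shutil

def diagnose(path):
    """Return a list of likely causes for a 'file not found' error."""
    problems = []
    # First possibility: the input path really does not point at a file.
    if not os.path.isfile(path):
        problems.append(f"input file not found: {path}")
    # Second possibility: the external converter is not on PATH.
    if shutil.which("ffmpeg") is None and shutil.which("avconv") is None:
        problems.append("neither ffmpeg nor avconv is on PATH")
    return problems

print(diagnose("my_file.m4a"))
```

An empty list means neither of these two checks found a problem, in which case the cause lies elsewhere.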

How to create and add files to a directory?

I'm writing a program that takes large PDFs and converts each page to a .jpg, then puts the .jpg files of each PDF into their own directory (which the program needs to create).
I have completed the conversion part of the program, but I am stuck on creating a directory and adding the files to the directory.
Here's my code so far.
import glob, sys, fitz, os, shutil
zoom_x = 2.0
zoom_y = 2.0
mat = fitz.Matrix(zoom_x, zoom_y) # to get better resolution
all_files = glob.glob('/Users/homefolder/Downloads/*.pdf') # image path
print(all_files)
for filename in all_files:
    doc = fitz.open(filename)
    head, tail = os.path.split(doc.name)
    save_file_name = tail.split('.')[0]
    for page in doc: # iterate through the pages
        # print(page)
        pix = page.get_pixmap(matrix=mat)
        # render the image
        filepath_save = '/Users/homefolder/Downloads/files' + save_file_name + str(page.number) + '.jpg'
        pix.save(filepath_save) # save image
sample = glob.glob('/Users/homefolder/Downloads/*.jpg')
How would I write the code to create a directory for each pdf file and add those .jpg's to the directory?
You can create a directory for each PDF and save the processed pages into it. I also refactored your code a bit:
import glob, fitz, os
zoom_x = 2.0
zoom_y = 2.0
mat = fitz.Matrix(zoom_x, zoom_y)
pdf_files = glob.glob('/Users/homefolder/Downloads/*.pdf')
save_to = '/Users/homefolder/Downloads/pdf_as_img/'
for path in pdf_files:
    doc = fitz.open(path)
    base_name, _ = os.path.splitext(os.path.basename(doc.name))
    directory_to_save = os.path.join(save_to, base_name)
    if not os.path.exists(directory_to_save):
        os.makedirs(directory_to_save)
    for page in doc:
        pix = page.get_pixmap(matrix=mat)
        filepath_save = os.path.join(directory_to_save, str(page.number) + '.jpg')
        pix.save(filepath_save)
This script creates a directory for every PDF file and saves its pages into that directory as .jpg files.
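The directory-creation step on its own can also be sketched with pathlib, which folds the existence check into a single call. This is an alternative illustration, not the answer's code; the helper name `page_path` is made up, and the rendering with fitz is left out so the block runs without any PDF.

```python
import tempfile
from pathlib import Path

def page_path(pdf_path, save_to, page_number):
    """Create <save_to>/<pdf name without extension>/ if needed and
    return the .jpg path for one page inside it."""
    out_dir = Path(save_to) / Path(pdf_path).stem
    # parents=True creates save_to as well; exist_ok=True makes repeat calls safe.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / f"{page_number}.jpg"

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = page_path("report.pdf", tmp, 0)
    print(target.parent.name, target.name)
```

In the loop above, the returned path would be what gets passed to `pix.save(...)`.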

Re-naming multiple files

I have multiple directories inside which there are multiple files.
In directory1 files have the name format:
1.2.826.0.1.3680043.2.133.1.3.49.1.124.27456-3-1-10jd0au.dcm
1.2.826.0.1.3680043.2.133.1.3.49.1.124.27456-3-2-10jd0av.dcm
....
1.2.826.0.1.3680043.2.133.1.3.49.1.124.27456-3-10-17v7m18.dcm
In directory2:
1.2.826.0.1.3680043.2.133.1.3.49.1.46.34440-4-1-r3hu3u.dcm
1.2.826.0.1.3680043.2.133.1.3.49.1.46.34440-4-2-r3hu3v.dcm
....
and so on.
How can I rename these to just 1.dcm, 2.dcm, ... in each directory?
My attempt is as follows:
for dpath, dnames, fnames in os.walk(dir_path):
    for dname in dnames:
        directory = os.path.join(dir_path,dname)
        for filename in os.listdir(directory):
            old_name = os.path.join(directory,filename)
            new = filename[filename.find("-"):]
            new_name = os.path.join(directory, new)
            os.rename(old_name, new_name)
But this only yields:
-3-1-10jd0au.dcm
-3-10-17v7m18.dcm
You could write a function that uses a regex to extract the parts of the filename that you need, for example:
import re
def ExtractNumber(filename):
    parts = re.search(r".*-.*-(.*)-.*(\..*)", filename)
    return parts.group(1) + parts.group(2)
print(ExtractNumber("1.2.826.0.1.3680043.2.133.1.3.49.1.124.27456-3-1-10jd0au.dcm"))
print(ExtractNumber("1.2.826.0.1.3680043.2.133.1.3.49.1.124.27456-3-10-17v7m18.dcm"))
Outputs:
1.dcm
10.dcm
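To apply that extraction to every file under a root directory, something along these lines might work. It is a sketch that reuses the same regular expression; the names `short_name` and `rename_in_tree` are hypothetical, and skipping a file when the target name already exists is a design choice made here, not part of the answer.

```python
import os
import re

# Same pattern as above: third hyphen-separated field plus the extension.
NAME_RE = re.compile(r".*-.*-(.*)-.*(\..*)")

def short_name(filename):
    """Return e.g. '10.dcm' for a matching name, or None if it does not match."""
    match = NAME_RE.match(filename)
    return match.group(1) + match.group(2) if match else None

def rename_in_tree(root):
    """Rename matching files under root in place; return the (old, new) pairs."""
    renamed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            new = short_name(filename)
            if new is None or new == filename:
                continue
            target = os.path.join(dirpath, new)
            if os.path.exists(target):
                continue  # skip rather than overwrite an existing file
            os.rename(os.path.join(dirpath, filename), target)
            renamed.append((filename, new))
    return renamed

print(short_name("1.2.826.0.1.3680043.2.133.1.3.49.1.46.34440-4-2-r3hu3v.dcm"))
```

Calling `rename_in_tree(dir_path)` would then handle every subdirectory in one pass; trying it on a copy of the data first seems prudent, since renames are not undone automatically.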

Find folders that has no data in it and get their folder names

I want to find folders that have no data in them and get their folder names.
The first- and second-level folders are named with random numbers, and only some of the folders contain data.
My code is:
path = 'M://done/mesh/*'
FL = glob.glob(path)
FL2 = glob.glob(FL[0] + '/*')
FL2
['M://done/mesh\\41\\23',
'M://done/mesh\\41\\24',
'M://done/mesh\\41\\33',
'M://done/mesh\\41\\34',
'M://done/mesh\\41\\35',
'M://done/mesh\\41\\36',
'M://done/mesh\\41\\43',
'M://done/mesh\\41\\44',
'M://done/mesh\\41\\45',
'M://done/mesh\\41\\46',
'M://done/mesh\\41\\47',
'M://done/mesh\\41\\53',
'M://done/mesh\\41\\54',
'M://done/mesh\\41\\55',
'M://done/mesh\\41\\63',
'M://done/mesh\\41\\64',
'M://done/mesh\\41\\65',
'M://done/mesh\\41\\66',
'M://done/mesh\\41\\67',
'M://done/mesh\\41\\74',
'M://done/mesh\\41\\75',
'M://done/mesh\\41\\76',
'M://done/mesh\\41\\77',
'M://done/mesh\\41\\85',
'M://done/mesh\\41\\86',
'M://done/mesh\\41\\87']
FL2[24][24:26] + FL2[24][27:30] + '0000' # why do I need [24:26] and [27:30]?
finding_files = ['_Caminfo.dat','running.csv']
print(FL2[0] + '/0000/02_output/' + FL2[0][24:26] + FL2[0][27:30] + '0000/' + fn1[0])
'41860000'
You can use os.listdir to find empty folders, like below:
import os
folder_list = ["D:\\test_1", "D:\\test_2"]
empty_folders = []
for folder in folder_list:
    try:
        if not os.listdir(folder):
            empty_folders.append(folder)
    except FileNotFoundError:
        pass
print(empty_folders)
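When the folders are nested and their names are not known in advance, the same check can be done over a whole tree. The following is a sketch using os.walk; the helper name `find_empty_dirs` is made up, and "empty" is taken here to mean no files and no subdirectories.

```python
import os
import tempfile

def find_empty_dirs(root):
    """Return the directories under root that contain neither files nor subdirectories."""
    return sorted(
        dirpath
        for dirpath, dirnames, filenames in os.walk(root)
        if not dirnames and not filenames
    )

# Demonstration on a throwaway tree: 41/23 holds a file, 41/24 is empty.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "41", "23"))
    os.makedirs(os.path.join(tmp, "41", "24"))
    open(os.path.join(tmp, "41", "23", "running.csv"), "w").close()
    empty = find_empty_dirs(tmp)
    print([os.path.basename(path) for path in empty])
```

`os.path.basename` gives the folder name on its own, which avoids slicing the path string by character position.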

Delete each file older than X days in Y folder

I wrote this, but it doesn't work.
giorni holds the maximum number of days a file may stay on the SD card, and file_dir is the default location where the files are analyzed.
import os
from datetime import datetime, timedelta
file_dir = "/home/pi/" #location
giorni = 2 #n max of days
giorni_pass = datetime.now() - timedelta(giorni)
for root, dirs, files in os.walk(file_dir):
    for file in files:
        filetime = datetime.fromtimestamp(os.path.getctime(file))
        if filetime > giorni_pass:
            os.remove(file)
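Solved with:
for root, dirs, files in os.walk(file_dir):
    for file in files:
        # build the full path from the directory currently being walked
        path = os.path.join(root, file)
        filetime = datetime.fromtimestamp(os.path.getctime(path))
        # older than the cutoff means an earlier timestamp, hence "<"
        if filetime < giorni_pass:
            os.remove(path)
This works because files contains bare file names, relative to the directory currently being walked (root, which equals file_dir at the top level). To operate on those files you first have to build the full path with path = os.path.join(root, file). The comparison also has to be filetime < giorni_pass; with > the loop would delete the files that are newer than the cutoff.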
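The same cleanup can be sketched with pathlib. Note the swap: this version compares the modification time (`st_mtime`) instead of calling `os.path.getctime`, which is an assumption about what "older" should mean here. The function name `remove_older_than` and its `now` parameter are made up for the illustration.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400

def remove_older_than(directory, days, now=None):
    """Delete regular files under directory whose modification time is
    more than `days` days in the past; return the removed paths."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    removed = []
    # rglob("*") visits every entry in the tree, so subdirectories are covered.
    for path in Path(directory).rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

A call such as `remove_older_than(file_dir, giorni)` would mirror the loop above. Since deletion is permanent, printing the candidate paths before unlinking them is a reasonable first run.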
