Python: Delete a special file - .htaccess

you know special files or folders where the first place is a point. For example: .example_folder or .examaple_file. Think about the .htaccess/.htpassword files. I know how to delete folders and files on the regular way with Python. But how can I delete some special files like this? The other problem is the special files don't have extensions like .txt oder .jpg etc.. When I try to delete the special files/folders on the regular way Python skips all the special files/folder. Does someone has any idea?
def delete_temp_update_files(path_files):
files = glob.glob(path_files)
for f in files:
os.remove(f)
def delete_temp_update_folders(folder_path, path_files):
folder_paths = glob.glob(folder_path)
'''
First, all folders are deleted in this current folder (folder_paths).
'''
if not folder_paths:
'''
The list (folder_paths) is empty, that means there aren't somer folders.
In this case its enough to delete all files - everything including
hidden files.
'''
result_files = delete_temp_update_files(path_files)
return result_files
else:
'''
There are some folders. Before the files are deleted, the folder must be deleted.
'''
for folder_element in folder_paths:
shutil.rmtree(folder_element, ignore_errors=True)
'''
Now all folders are delete have been deleted. Netx all files should be deleted.
'''
result_files = delete_temp_update_files(path_files)
return result_files
def on_delete_files_folders(update_temp_zip_file):
# The variable named (all_files) shows all files - everything including hidden files
all_files = os.path.join(update_temp_zip_file, '*')
# The variable named (all_folders) shows all folders in current folder.
all_folders = os.path.join(update_temp_zip_file, '*/')
delete_temp_update_folders(all_folders, all_files)
on_delete_files_folders("PATH/TO/YOUR/FOLDER")

Your decision is rather sophisticated.
You may use my script below:
import os
def del_recursive(path):
for file in os.listdir(path):
file = os.path.join(path, file)
if os.path.isdir(file):
try:
os.rmdir(file)
except:
del_recursive(file)
os.rmdir(file)
else:
os.remove(file)
If you want to delete everything in a directory you also may use shutil.rmtree(). Both variants work on Python 2.7.3.

Related

Extract text from first page of word document and use it as a folder name, then move the file inside that folder

I have hundreds of word documents that needs to be processed but need to organized them first by versions in subfolders.
I basically get a drop of these word documents within a single folder and need to automate the organization moving forward before I get nuts.
So I have a script that basically creates a folder with the same name of the file and moves the file inside that folder, this part is done.
Now I need to go into each subfolder, and get the document version from within the first word page of each document, then create a sub-folder withe version number and move the word file into that subfolder.
The structure should be as follows (taking two folders as examples):
(Folder) Test
(Subfolder) 12.0
Test.docx
(Folder) Test1
(Subfolder) 13.0
Test1.docx
Luckily I was able to figure it out that "doc.paragraphs[6].text" will always return the version information in a single line as follows:
>>> doc.paragraphs[6].text
'Version Number: 12.0'
Would appreciate if someone can point me out to the right direction.
This is the script I have so far:
#!/usr/bin/env python3
import glob, os, shutil, docx, sys
folder = sys.argv[1]
#print(folder)
for file_path in glob.glob(os.path.join(folder, '*.docx')):
new_dir = file_path.rsplit('.', 1)[0]
#print(new_dir)
try:
os.mkdir(os.path.join(folder, new_dir))
except WindowsError:
# Handle the case where the target dir already exist.
pass
shutil.move(file_path, os.path.join(new_dir, os.path.basename(file_path)))
Please see below the complete solution to your requirement.
Note: To know about re.search go through https://www.geeksforgeeks.org/python-regex-re-search-vs-re-findall/
import docx, os, glob, re, shutil
from pathlib import Path
def create_dir(path): # function to check if a given path exist and create one if not
# Check whether the specified path exists or not
is_exist = os.path.exists(path)
# Create a new directory the path does not exist
if not is_exist:
os.makedirs(path)
folder = fr"C:\Users\rams\Documents\word_docs" #my local folder
for file in glob.glob(os.path.join(folder, '*.docx')):
# Test, Test1, Test2 in your structure
main_folder = os.path.join(folder,Path(file).stem)
file_name = os.path.basename(file)
# Get the first line from the docx
doc = docx.Document(file).paragraphs[0].text
# group(1) = Version Number: (.*)
version_no = re.search("(Version Number: (.*))", doc).group(1)
# extract the number portion from version_no
sub_folder = version_no.split(':')[1].strip()
# path to actual sub_folder with version_no
sub_folder = os.path.join(main_folder, sub_folder)
# destination path
dest_file_path = os.path.join(sub_folder, file_name)
for i in [main_folder,sub_folder]:
create_dir(i) # function call
# to move the file to the corresponding version folder (overwrite if exists)
if os.path.exists(dest_file_path):
os.remove(dest_file_path)
shutil.move(file, sub_folder)
else:
shutil.move(file, sub_folder)
Before execution:
After Execution
So you have a script that creates a folder name being the file name and moves the file inside that folder. This part is done. OK.
Now you know how to get the document version from within the first word page of each document you need to create a sub-folder with this version number and move the word file into that sub-folder. This can be done using the same code as before replacing:
new_dir = file_path.rsplit('.', 1)[0]
with
document_dir = os.path.dirname(file_path)
document_name = os.path.basename(file_path)
# check if the document is already in the right directory:
assert os.path.basename(document_dir) == document_name.rsplit('.', 1)[0]
# here comes: doc = some_function_getting_the_doc_object(file_path)
doc_version_tuple = doc.paragraphs[6].text.rsplit(': ', 1)
# check if doc_version_tuple has the right content:
assert doc_version_tuple[0] == 'Version Number'
doc_version = doc_version_tuple[1]
new_dir = os.path.join(document_dir, doc_version)
Notice that you can also do both of the two steps in one run over the list of full path document names.
Notice further that running the script you posted in your question twice without the check:
assert os.path.basename(document_dir) != document_name.rsplit('.', 1)[0]
giving an Error if the script was already run and the documents are already in folders with the document name will destroy what you already achieved and you will need to write another script to reverse it.
The above is the reason why it would be a good idea to have a backup copy of all the documents you can use to re-create the directory with the documents in case something goes wrong. And ... it is generally a good idea to have always a backup copy if you work on files especially when using a self-written script.

How to create a file in python having name = ".gitignore"

I was trying to create a file without a name in python (only filetype)
I tried this -
open(".gitignore","w+").close()
But it does not work.
edit - it does work real issue is in getting file through glob.glob
classify_folder_name = #path of the folder which contain .gitignore file
rel_paths = glob.glob(classify_folder_name + '/**', recursive=True)
for local_file in rel_paths:
print(local_file)
it does not print .gitignore file.
Any help will be appreciated.
Note -: don't want to use os.listdir()
There are few things that you might check:
files with dot at the beginning are hidden so whatever OS you are using, make sure you have hidden files visibility enabled
It might be saved in different directory
open(".gitignore","w+").close()
It would be better if you do this:
To create a file:
with open('.gitignore', 'w') as fp:
pass

Scons Copying File on Changes?

Well i have some extra text files, with different extensions, and i need them to be copied to the bin. Right now i am using:
files = []
for root, dirs, files in os.walk("extra_src"):
for file in files:
files.append(["extra_src" + os.sep + file, "bin" + os.sep + file])
for element in files:
command = Command(target = element[1], source = element[0], action = Copy("$TARGET", "$SOURCE"))
Requires(program, command)
Is there any other way to get it to register the files and simply specify all the files in a said directory? I can use Command(..., Copy("dir1", "dir2")) but it doesn't detect changes, and doesn't clean out the bin of those files.
Try something along the lines of:
import os # for os.path.join
inst = env.Install('bin', Glob(os.path.join('extra_src','*.*')))
env.Depends(program, inst) # if required
Note, how the Glob() function will even find files that don't exist yet, but get created by another build step.

Replacing files in one folder and all its subdirectories with modified versions in another folder

I have two folders, one called 'modified' and one called 'original'.
'modified' has no subdirectories and contains 4000 wav files each with unique names.
The 4000 files are copies of files from 'original' except this folder has many subdirectories inside which the original wav files are located.
I want to, for all the wav files in 'modified', replace their name-counterpart in 'original' wherever they may be.
For example, if one file is called 'sound1.wav' in modified, then I want to find 'sound1.wav' in some subdirectory of 'original' and replace the original there with the modified version.
I run Windows 8 so command prompt or cygwin would be best to work in.
As requested, I've written the python code that does the above. I use the 'os' and 'shutil' modules to first navigate over directories and second to overwrite files.
'C:/../modified' refers to the directory containing the files we have modified and want to use to overwrite the originals.
'C:/../originals' refers to the directory containing many sub-directories with files with the same names as in 'modified'.
The code works by listing every file in the modified directory, and for each file, we state the path for the file. Then, we look through all the sub-directories of the original directory, and where the modified and original files share the same name, we replace the original with the modified using shutil.copyfile().
Because we are working with Windows, it was necessary to change the direction of the slashes to '/'.
This is performed for every file in the modified directory.
If anyone ever has the same problem I hope this comes in handy!
import os
import shutil
for wav in os.listdir('C:/../modified'):
modified_file = 'C:/../modified/' + wav
for root, dirs, files in os.walk('C:/../original'):
for name in files:
if name == wav:
original_file = root + '/' + name
original_file = replace_file.replace('\\','/')
shutil.copyfile(modified_file, original_file)
print wav + ' overwritten'
print 'Complete'

create a list of files to be deleted

I am working on a search-and-destroy type program which I need it to do is search all directories with a certain file-name and append them to a list. after that delete all those files...not objects in list or the list...
import os
file_list=[]
for root, dirs, files in os.walk(path-to-dir'):
for f_name in files:
if f_name.startswith("file-name"):
file_list.append(f_name)
I could write up to appending part of the code but I don't know next...
Some help please
To remove a file from your computer, use os.remove(). It takes full path to the file as it's parameter, so instead of calling os.remove("infectedFile.dll") you would call os.remove("C:/program files/avira/infectedFile.dll")
So your file_list should contain full paths to the files, and then just call:
for file in file_list:
os.remove(file)
Modify your file_list.append(f_name). The f_name is only a bare name. You need to add the path to the file name in the time of processing, because you do not know where the file was found in the directory hierarchy:
file_list.append(os.path.join(root, f_name))
The root variable contains the path during walking.
To make check whether your code works, just print the content of the list:
print('\n'.join(file_list))
Or you can do it in the loop to get ready for the later part:
for fname in file_list:
print(fname)
Then you just add the os.remove(fname) to remove the file name:
for fname in file_list:
print('removing', fname)
os.remove(fname)

Resources