I am creating an automation script that needs to move some files from one folder to another and rename them along the way.
I have tried the shutil and os modules, but neither has worked for me so far.
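import os
import re
import shutil

# Raw strings need single backslashes; os.path.join adds the separator.
src = r'C:\Users\XX\Downloads'
dst = r'C:\Users\XX\Documents\UIPATH_DUMP'
regex = re.compile('MSS_')
files = os.listdir(src)
for i in files:
    if regex.match(i):  # match() anchors at the start, so this is a prefix test
        src1 = os.path.join(src, i)
        dst1 = os.path.join(dst, i)
        shutil.move(src1, dst1)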
The expected result is that my file gets moved to the destination folder. What I can't figure out is how to rename it. Maybe os.rename() would work?
You can use os.rename() to move the file to another path as well as rename it.
For example, if the original file is:
"/Users/billy/d1/xfile.txt"
and you would like to move it to folder "d2" and name it "yfile.txt", you can use the following line of code:
os.rename('/Users/billy/d1/xfile.txt', '/Users/billy/d2/yfile.txt')
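To combine this with the prefix filter from the question, a minimal sketch might look like the following. The function name `move_and_rename` and the renaming rule (swapping one prefix for another) are assumptions for illustration, since the question does not say what the new names should be. It uses `shutil.move` rather than `os.rename` because `shutil.move` also works when source and destination are on different drives or filesystems.

```python
import os
import shutil

def move_and_rename(src_dir, dst_dir, old_prefix, new_prefix):
    """Move files whose names start with old_prefix, swapping in new_prefix.

    The prefix-swap rule is a hypothetical example of a renaming scheme.
    """
    os.makedirs(dst_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, name)
        if not (name.startswith(old_prefix) and os.path.isfile(src_path)):
            continue
        new_name = new_prefix + name[len(old_prefix):]
        dst_path = os.path.join(dst_dir, new_name)
        # Passing the full target path (folder + new name) moves and renames at once.
        shutil.move(src_path, dst_path)
        moved.append(new_name)
    return moved
```

Any other rule for building `new_name` (a timestamp, a counter) would slot into the same place.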
Related
I have hundreds of Word documents that need to be processed, but first I need to organize them by version into subfolders.
I get a drop of these Word documents in a single folder and need to automate the organization going forward before I go nuts.
I already have a script that creates a folder with the same name as each file and moves the file into that folder; this part is done.
Now I need to go into each subfolder, read the document version from the first page of each document, create a subfolder named after the version number, and move the Word file into it.
The structure should be as follows (taking two folders as examples):
(Folder) Test
    (Subfolder) 12.0
        Test.docx
(Folder) Test1
    (Subfolder) 13.0
        Test1.docx
Luckily, I was able to figure out that "doc.paragraphs[6].text" always returns the version information on a single line, as follows:
>>> doc.paragraphs[6].text
'Version Number: 12.0'
I would appreciate it if someone could point me in the right direction.
This is the script I have so far:
#!/usr/bin/env python3
import glob, os, shutil, docx, sys
folder = sys.argv[1]
#print(folder)
for file_path in glob.glob(os.path.join(folder, '*.docx')):
    # file_path already includes folder, so new_dir is the full path of the new directory
    new_dir = file_path.rsplit('.', 1)[0]
    #print(new_dir)
    try:
        os.mkdir(new_dir)
    except FileExistsError:
        # Handle the case where the target dir already exists.
        pass
    shutil.move(file_path, os.path.join(new_dir, os.path.basename(file_path)))
Please see below the complete solution to your requirement.
Note: For background on re.search, see https://www.geeksforgeeks.org/python-regex-re-search-vs-re-findall/
import docx, os, glob, re, shutil
from pathlib import Path

def create_dir(path):  # create the given path if it does not exist yet
    # Check whether the specified path exists or not
    is_exist = os.path.exists(path)
    # Create a new directory if the path does not exist
    if not is_exist:
        os.makedirs(path)

folder = r"C:\Users\rams\Documents\word_docs"  # my local folder
for file in glob.glob(os.path.join(folder, '*.docx')):
    # Test, Test1, Test2 in your structure
    main_folder = os.path.join(folder, Path(file).stem)
    file_name = os.path.basename(file)
    # Get the first line from the docx
    # (in the question's documents the version line is paragraphs[6])
    doc = docx.Document(file).paragraphs[0].text
    # group(1) = Version Number: (.*)
    version_no = re.search("(Version Number: (.*))", doc).group(1)
    # extract the number portion from version_no
    sub_folder = version_no.split(':')[1].strip()
    # path to actual sub_folder with version_no
    sub_folder = os.path.join(main_folder, sub_folder)
    # destination path
    dest_file_path = os.path.join(sub_folder, file_name)
    for i in [main_folder, sub_folder]:
        create_dir(i)  # function call
    # move the file to the corresponding version folder (overwrite if exists)
    if os.path.exists(dest_file_path):
        os.remove(dest_file_path)
    shutil.move(file, sub_folder)
So you have a script that creates a folder named after each file and moves the file into that folder. This part is done. OK.
Now that you know how to get the document version from the first page of each document, you need to create a sub-folder named after this version number and move the Word file into it. This can be done using the same code as before, replacing:
new_dir = file_path.rsplit('.', 1)[0]
with
document_dir = os.path.dirname(file_path)
document_name = os.path.basename(file_path)
# check if the document is already in the right directory:
assert os.path.basename(document_dir) == document_name.rsplit('.', 1)[0]
# get the doc object, e.g. with python-docx (imported as docx in your script):
doc = docx.Document(file_path)
doc_version_tuple = doc.paragraphs[6].text.rsplit(': ', 1)
# check if doc_version_tuple has the right content:
assert doc_version_tuple[0] == 'Version Number'
doc_version = doc_version_tuple[1]
new_dir = os.path.join(document_dir, doc_version)
Note that you can also do both steps in a single pass over the list of full document paths.
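As a rough sketch of that single pass, the function below files each document under `<name>/<version>/` in one go. The names `organize` and `read_version` are hypothetical. `read_version` stands in for whatever returns the version line, for example `lambda p: docx.Document(p).paragraphs[6].text` with python-docx.

```python
import glob
import os
import shutil

def organize(folder, read_version):
    """Move each .docx in folder to folder/<name>/<version>/<name>.docx.

    read_version is a caller-supplied function (an assumption of this sketch)
    that takes a file path and returns text like 'Version Number: 12.0'.
    """
    moved = []
    for file_path in sorted(glob.glob(os.path.join(folder, '*.docx'))):
        label, _, version = read_version(file_path).partition(': ')
        if label != 'Version Number' or not version:
            continue  # leave documents without a recognizable version line alone
        name = os.path.splitext(os.path.basename(file_path))[0]
        target_dir = os.path.join(folder, name, version.strip())
        os.makedirs(target_dir, exist_ok=True)
        # shutil.move returns the final path of the moved file
        moved.append(shutil.move(file_path, target_dir))
    return moved
```

Because the glob only matches files directly inside `folder`, a second run finds nothing left to move and leaves the already organized documents untouched.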
Note further that running the script you posted in your question twice will destroy what you have already achieved, and you will need to write another script to reverse it, unless you add a check like:
assert os.path.basename(document_dir) != document_name.rsplit('.', 1)[0]
which raises an error if the script was already run and the documents are already in folders named after them.
This is why it is a good idea to keep a backup copy of all the documents, so you can re-create the directory if something goes wrong. And it is generally a good idea to always have a backup when you work on files, especially with a self-written script.
I have the following code, which works as expected except when the source file is the same as the destination file. I have tried os.path.isfile/isdir/exists, but I'm hitting a wall.
So essentially this loops through the file_list and moves the file in the list to the destination. However, it can happen that the source and destination are the same, so, if the file's location is the same as the destination, then it is trying to move itself to itself and obviously fails. So in the following I need to add a check, if the file's location (source) is the same as the destination then pass.
def move_files(file_list, destination):
    for file in file_list:
        source_file = file
        shutil.move(source_file, destination)
In this case the destination is a folder path and the source is a folder path + file name, so I need to ignore the source's file name and compare the path with the destination.
I feel I'm overcomplicating this, but any help is appreciated.
You can use the abspath and dirname functions of os.path. The first returns the absolute, normalized version of a path; the second returns the directory portion of a path (everything before the final file name).
import os
import shutil

def move_files(file_list, destination):
    # just in case you don't provide absolute paths
    # you can also consider using `expanduser`
    destination = os.path.abspath(destination)
    for file in file_list:
        file_abs_path = os.path.abspath(file)
        # skip files that already live in the destination folder
        if os.path.dirname(file_abs_path) != destination:
            shutil.move(file_abs_path, destination)
https://docs.python.org/3/library/os.path.html#os.path.abspath
https://docs.python.org/3/library/os.path.html#os.path.dirname
https://docs.python.org/3/library/os.path.html#os.path.expanduser
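If symbolic links or differently spelled paths are a possibility, a variant can compare fully resolved paths instead. This is only a sketch, not a drop-in replacement: the name `move_files_resolved` is made up here, and it returns the moved paths purely to make the behaviour easy to check.

```python
import shutil
from pathlib import Path

def move_files_resolved(file_list, destination):
    """Move files into destination, skipping those already located there."""
    dest = Path(destination).resolve()
    moved = []
    for file in file_list:
        src = Path(file).resolve()
        if src.parent == dest:
            continue  # source folder and destination are the same: nothing to do
        moved.append(shutil.move(str(src), str(dest)))
    return moved
```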
I think this is what you want.
Also, I found that the source_file variable does nothing special here, so I dropped it.
import os
import shutil

def move_files(file_list, destination):
    dir_lst = os.listdir(destination)
    for file in file_list:
        # os.listdir() returns bare names, so compare the file's name, not its full path.
        # This only moves the file if no file with that name is in the destination folder.
        if os.path.basename(file) not in dir_lst:
            shutil.move(file, destination)
For a complete version using only the os module:
import os

def move_files(file_list, destination):
    dir_lst = os.listdir(destination)
    for file in file_list:
        name = os.path.basename(file)
        if name not in dir_lst:  # only move the file if it's not already in the destination folder
            # os.rename() needs the full target path, not just the folder
            os.rename(file, os.path.join(destination, name))
How can I use Python to create new folders relative to my current working directory?
For example, my path is C:/Documents/Code with no folders within and just has my Python file. How do I store some data within C:/Documents/Code/Data without hard coding the absolute path?
This is what I've been trying:
path = "/Data/file.txt"
file = open(path, "w")
This gives me the error of "No such file or directory".
Thanks for any assistance!
Prefix the path with a single dot (.), which denotes your current working directory:
path = "./Data/file.txt"
# ^
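Note that `open()` does not create missing folders, so `Data` has to exist before the file can be written. A minimal sketch, assuming the folder should be created on demand (the helper name `open_in_subfolder` is hypothetical):

```python
import os

def open_in_subfolder(folder, filename, mode="w"):
    """Create folder (relative to the current working directory) if needed,
    then open filename inside it."""
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    return open(os.path.join(folder, filename), mode)
```

Used as `with open_in_subfolder("Data", "file.txt") as f: ...`, this writes to Data/file.txt under the current working directory. Keep in mind that relative paths are resolved against the directory the script is run from, which is not necessarily the folder containing the script.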
As per the code below, I am having issues zipping a directory with Python 3's shutil.make_archive function. The .testdir directory does get zipped, but the archive ends up in /home/pi instead of /home/pi/Backups.
zip_loc = '/home/pi/.testdir'
zip_dest = '/home/pi/Backups/'
shutil.make_archive(zip_loc, 'zip', zip_dest)
Could anyone explain what I am doing wrong?
Reading the docs, I came up with:
zip_loc = '/home/pi/.testdir'
zip_dest = '/home/pi/Backups/testdir'  # archive path without the extension; the name 'testdir' is just an example
shutil.make_archive(base_name=zip_dest, format='zip', root_dir=zip_loc)
From the docs:
base_name is the name of the file to create, including the path, minus any format-specific extension.
root_dir is a directory that will be the root directory of the archive; for example, we typically chdir into root_dir before creating the archive.
base_dir is the directory where we start archiving from; i.e. base_dir will be the common prefix of all files and directories in the archive.
root_dir and base_dir both default to the current directory.
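To see how these parameters fit together, here is a small sketch that zips the contents of one directory into an archive stored somewhere else. The helper name `zip_directory` and its `archive_name` parameter are assumptions of this example, not part of `shutil`.

```python
import os
import shutil

def zip_directory(source_dir, dest_dir, archive_name):
    """Zip the contents of source_dir into dest_dir/archive_name.zip."""
    os.makedirs(dest_dir, exist_ok=True)
    # base_name: where the archive goes, without the '.zip' extension.
    base_name = os.path.join(dest_dir, archive_name)
    # root_dir: the directory whose contents end up at the top of the archive.
    return shutil.make_archive(base_name, 'zip', root_dir=source_dir)
```

With the paths from the question, a call might look like `zip_directory('/home/pi/.testdir', '/home/pi/Backups', 'testdir')`, where the archive name is chosen freely.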
Before writing the archive, change to the right directory:
old_path = os.getcwd()
os.chdir(path)
-> write the archive
After writing the archive, change back to the old directory:
os.chdir(old_path)
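The same idea can be wrapped in a small context manager so the original directory is restored even if archiving raises an exception. This is a sketch; the name `working_directory` is made up for illustration.

```python
import os
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory to path."""
    old_path = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_path)  # always return to where we started
```

The archive would then be written inside a `with working_directory(path):` block.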
I have some extra text files with different extensions, and I need them copied to the bin directory. Right now I am using:
import os

copies = []  # separate name: os.walk() below rebinds `files` on every iteration
for root, dirs, files in os.walk("extra_src"):
    for file in files:
        copies.append([os.path.join(root, file), "bin" + os.sep + file])
for element in copies:
    command = Command(target = element[1], source = element[0], action = Copy("$TARGET", "$SOURCE"))
    Requires(program, command)
Is there another way to register the files by simply specifying all the files in a given directory? I can use Command(..., Copy("dir1", "dir2")), but it doesn't detect changes and doesn't clean those files out of the bin.
Try something along the lines of:
import os # for os.path.join
inst = env.Install('bin', Glob(os.path.join('extra_src','*.*')))
env.Depends(program, inst) # if required
Note that the Glob() function will even find files that don't exist yet but are created by another build step.