I am getting FileNotFoundError when trying to create a for-in loop - python-3.x

I am making an image classifier. I have successfully created the folders and manually placed the .txt files, which contain links to images from Google, inside each folder.
I am now trying to download the images listed in each .txt file, either into a subfolder inside each category folder or into the category folder that also holds the .txt file. However, I keep getting errors. Please help.
I moved the .txt files into the category folders by hand, because they were originally in the parent folder named PLANTS, but this has not made a difference.
I expected all the images to be downloaded into their respective folders from the Google image links in the .txt files, but it does not work, whether the .txt file is inside the main PLANTS folder or inside a category folder within it.
Instead I get the error below:
FileNotFoundError
please see attached screen shot

Long filenames or paths with spaces are supported by NTFS in Windows NT. However, these filenames or directory names require quotation marks around them when they are specified in a command prompt operation. Failure to use the quotation marks results in the error message.
from pathlib import Path
path = Path("content/gdrive/My Drive/Colab Notebooks/Flowers/PLANTS")
for i in range(len(folders)):
    # Creating the directory needs a Path object, so join with "/":
    dest_for_creating_directory = path/folders[i]
    # Looking up a directory whose name contains spaces needs a quoted string,
    # so use this one in the download-images function instead of dest:
    dest_for_searching_directory = str(path)+"/"+folders[i]
Note: it's better practice to write folder and file names without spaces.
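A minimal sketch of the same idea using only the standard library's pathlib. It assumes `folders` holds the category names and that each URL list is stored as `<category>/<category>.txt`; that file naming and the helper name `category_paths` are assumptions for illustration, not something stated in the question. It creates each category folder if needed and fails early with the full resolved path, which makes a wrong root folder easy to spot.

```python
from pathlib import Path


def category_paths(root, folders):
    """Return {category: (dest_dir, url_file)} and create each dest_dir.

    Hypothetical helper: assumes every URL list is stored as
    <root>/<category>/<category>.txt.
    """
    root = Path(root)
    result = {}
    for name in folders:
        dest = root / name                       # Path objects join with "/"
        dest.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        url_file = dest / f"{name}.txt"
        if not url_file.is_file():
            # Report the resolved path so a wrong root is obvious
            raise FileNotFoundError(f"URL list not found: {url_file.resolve()}")
        result[name] = (dest, url_file)
    return result
```

Spaces in folder names such as "My Drive" need no extra quoting here, because the path is passed to Python functions rather than typed on a command line.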

Related

Loop through multiple folders and subfolders using Pyspark in Azure Blob container (ADLS Gen2)

I am trying to loop through multiple folders and subfolders in Azure Blob container and read multiple xml files.
Eg: I have files in YYYY/MM/DD/HH/123.xml format
Similarly, I have multiple subfolders under month, date, and hours, with multiple XML files at the lowest level.
My intention is to loop through all these folders and read the XML files. I have tried a few Pythonic approaches, which did not give me the intended result. Can you please help me with any ideas for implementing this?
import glob, os
for filename in glob.iglob('2022/08/18/08/225.xml'):
    if os.path.isfile(filename): #code does not enter the for loop
        print(filename)
import os
dir = '2022/08/19/08/'
r = []
for root, dirs, files in os.walk(dir): #Code not moving past this for loop, no exception
    for name in files:
        filepath = root + os.sep + name
        if filepath.endswith(".xml"):
            r.append(os.path.join(root, name))
return r
glob is a plain Python function, so it does not recognize the blob folder path directly when the code runs in PySpark. You have to give the path from the root. Also, make sure to specify recursive=True.
For example, I checked the above code in Databricks, both the glob version and the os version.
Both returned no results, because they need the absolute path, starting from the root folder.
glob code:
import glob, os
for file in glob.iglob('/path_from_root_to_folder/**/*.xml',recursive=True):
    print(file)
For me, in Databricks, the root to access is /dbfs, and I used csv files.
Using os:
With the absolute root, my blob files are listed from the folders and subfolders.
I used Databricks for my repro after mounting the container. Wherever you run this code in PySpark, make sure the path starts from the root of the folder. When using glob, also set recursive=True.
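As a rough sketch of both approaches side by side, the two helpers below list every .xml file under a root folder, once with a recursive glob and once with os.walk. The function names are made up for illustration, and `root` is assumed to be an absolute path that the Python process can see (in the Databricks setup described above, that would be a path starting with /dbfs).

```python
import glob
import os


def find_xml_with_glob(root):
    """List .xml files under root using a recursive glob pattern."""
    pattern = os.path.join(root, "**", "*.xml")
    # "**" only matches nested folders when recursive=True is passed
    return sorted(glob.glob(pattern, recursive=True))


def find_xml_with_walk(root):
    """List .xml files under root by walking every subfolder."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".xml"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Note that os.walk on a path that does not exist yields nothing and raises no exception, which matches the "no exception, loop never entered" symptom in the question.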
There is an easier way to solve this problem with PySpark!
The tough part is that all the files have to have the same format. In the Azure Databricks sample directory, there is a /cs100 folder that has a bunch of files that can be read in as text (line by line).
The trick is the option called "recursiveFileLookup". It will assume that the directories are created by Spark. You cannot mix and match files.
I added the name of the input file to the dataframe. Last but not least, I converted the dataframe to a temporary view.
Looking at a simple aggregate query, we have 10 unique files. The biggest has a little more than 1 M records.
If you need to cherry-pick files from a mixed directory, this method will not work.
However, I think that is an organizational cleanup task rather than a file-reading one.
Last but not least, use the correct formatter to read XML.
spark.read.format("com.databricks.spark.xml")

Python 3 - Copy files if they do not exist in destination folder

I am attempting to move a couple thousand pdfs from one file location to another. The source folder contains multiple subfolders and I am combining just the pdfs (technical drawings) into one folder to simplify searching for the rest of my team.
The main goal is to only copy over files that do not already exist in the destination folder. I have tried a couple of different options, most recently what is shown below, and in all cases every file is copied every time. Prior to today, any time I attempted a bulk file move, I would receive errors if the file existed in the destination folder, but I no longer do.
I have verified that some of the files exist in both locations but are still being copied. Is there something I am missing or can modify to correct?
Thanks for the assistance.
import os.path
import shutil
source_folder = os.path.abspath(r'\\source\file\location')
dest_folder = os.path.abspath(r'\\dest\folder\location')
for folder, subfolders, files in os.walk(source_folder):
    for file in files:
        path_file=os.path.join(folder, file)
        if os.path.exists(file) in os.walk(dest_folder):
            print(file+" exists.")
        if not os.path.exists(file) in os.walk(dest_folder):
            print(file+' does not exist.')
            shutil.copy2(path_file, dest_folder)
os.path.exists returns a Boolean value. os.walk creates a generator which produces triples of the form (dirpath, dirnames, filenames). So, that first conditional will never be true.
Also, even if that conditional were correct, your second conditional has a redundancy since it's merely the negation of the first. You could replace it with else.
What you want is something like
if file in os.listdir(dest_folder):
    ...
else:
    ...
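Putting that check into the full loop might look like the sketch below. It is one possible arrangement, not the only one: the function name and the `extension` parameter are made up for illustration, and the destination listing is read once into a set instead of calling os.listdir for every file. Files are compared by name only, so two different drawings with the same file name count as the same file.

```python
import os
import shutil


def copy_missing_files(source_folder, dest_folder, extension=".pdf"):
    """Copy matching files into dest_folder, skipping names already there."""
    existing = set(os.listdir(dest_folder))  # read the destination once
    copied = []
    for folder, _subfolders, files in os.walk(source_folder):
        for file in files:
            if not file.lower().endswith(extension):
                continue  # not a file type we care about
            if file in existing:
                continue  # already present in the destination
            shutil.copy2(os.path.join(folder, file), dest_folder)
            existing.add(file)  # also guards against duplicate names in source
            copied.append(file)
    return copied
```

This assumes the destination folder is not located inside the source folder; otherwise os.walk would also visit the copies.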

How do you move the contents of a randomly named folder to a new location when you don't know what the folder name will be?

I am using requests to download an image with python. That part works ok. When I download a file I provide a name: sitea.zip . When that file is decompressed it contains a folder with a random name, something like ZX234564563SDSD, that has a qcow2 image in it named gw-vm.qcow2.
I need to move the gw-vm.qcow2 to a specific folder for each site that I download an image for.
I can't figure out how to cd into that randomly named folder to get at the gw-vm.qcow2 file.
Right now I am using os.system('unzip sitea.zip') to decompress.
I don't know how to cd into the resulting folder to then perform the following: os.system('mv gw-vm.qcow2 /opt/unetlab/addons/qemu/sc-branch-a-1.0/gw-vm.qcow2')
Any direction is appreciated.
It's going to be a function of os.walk() -- it performs a dirwalk on a directory. Check out the docs on it: https://docs.python.org/3/library/os.html?#os.walk
Good luck, hope that helps!
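A rough sketch of how os.walk could be used here. It swaps the `unzip` and `mv` shell calls for the standard library's zipfile and shutil modules, which is a substitution and not what the question uses, and the function name is made up for illustration. The archive is extracted into a temporary directory, the tree is walked until the wanted file name turns up, and that file is moved to the destination, so the random folder name never has to be known.

```python
import os
import shutil
import tempfile
import zipfile


def extract_and_move(zip_path, wanted_name, dest_path):
    """Extract zip_path, find wanted_name at any depth, move it to dest_path."""
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        # Walk the extracted tree; the random folder name does not matter
        for dirpath, _dirnames, filenames in os.walk(tmp):
            if wanted_name in filenames:
                os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
                shutil.move(os.path.join(dirpath, wanted_name), dest_path)
                return dest_path
    raise FileNotFoundError(f"{wanted_name} not found in {zip_path}")
```

With the names from the question, a call would look like extract_and_move('sitea.zip', 'gw-vm.qcow2', '/opt/unetlab/addons/qemu/sc-branch-a-1.0/gw-vm.qcow2').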

Get all files in a matching subdirectory name

I have this piece of code from another project:
import pathlib
p = pathlib.Path(root)
for img_file in p.rglob("*.jpg"):
    #Do something for each image file
It finds all jpg files in the whole directory and its subfolders and acts upon them.
I have a directory that contains 100+ 'main' folders, with each folder having some combination of 2 subfolders; let's call them 'FolderA' and 'FolderB'. The main folders can have one, both, or none of these subfolders.
I want to run a piece of code against all the pdf files contained within the 'FolderB' subdirectories, but ignore all files in the main folders and 'FolderA' folders.
Can someone help me manipulate the above code to allow me to continue?
Many thanks!
You can modify the pattern to just search for what you want:
from pathlib import Path
p = Path("root")
for file in p.rglob("*FolderB/*.pdf"):
    # Do something with file
    pass
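A small self-contained variant of the pattern above, wrapped in a function so it can be tried on any directory. The function name and the `subfolder` parameter are made up for illustration. It drops the leading `*` so that only directories named exactly FolderB match; the `*FolderB` pattern would also pick up a directory such as OtherFolderB.

```python
from pathlib import Path


def pdfs_in_folder_b(root, subfolder="FolderB"):
    """Return PDFs sitting directly inside any directory named `subfolder`."""
    # rglob searches root and all its subdirectories for the pattern
    return sorted(Path(root).rglob(f"{subfolder}/*.pdf"))
```

Files in the main folders and in FolderA are skipped because their parent directory does not match the pattern. PDFs nested deeper inside FolderB are not returned either; a pattern such as "FolderB/**/*.pdf" would be one way to include them.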

WPF C# Unable to find subfolder image resource

I have several images in an assembly that I have all marked as resource in their property setting.
For testing I put an image at the root of my project.
Img.Source = new BitmapImage(new Uri("pack://application:,,,/MyAssembly;component/image.png", UriKind.Absolute));
and it finds and loads the image just fine.
If I place the image in a subfolder no matter what folder or what level, it returns back that it is unable to find the resource.
The folder structure is Resources/Base/Weather/image.png
Img.Source = new BitmapImage(new Uri("pack://application:,,,/MyAssembly;component/Resources/Base/Weather/image.png", UriKind.Absolute));
When I run the application and try to load that image I get this error
Cannot locate resource 'resources/base/weather/image.png'.
Notice the lowercase on the folder names. I am at a loss as to what to try next. I have tried many variations including using the # but that doesn't help. I really do not want to load up the root directory with images.
Thoughts Anyone???

Resources