I am a Python newbie and need to create a script that will parse some files and put them into a SQL db. So I am trying to create smaller scripts that do what I want, then combine them into a larger script.
To that end, I am trying to run this code:
import os
fileList = []
testDir = "/home/me/somedir/dir1/test"
for i in os.listdir(testDir):
    if os.path.isfile(i):
        fileList.append(i)
for fileName in fileList:
    print(fileName)
When I look at the output, I do not see any files listed. I tried the path without quotes and got stack errors. So searching showed I need the double quotes.
Where did I go wrong?
I found this code that works fine:
import os
in_path = "/home/me/dir/"
for dir_path, subdir_list, file_list in os.walk(in_path):
    for fname in file_list:
        full_path = os.path.join(dir_path, fname)
        print(full_path)
I can use full_path to do my next step.
If anyone has any performance tips, feel free to share them. Or point me in the right direction.
That is most likely because you are executing your script from a folder outside your testDir. os.listdir returns only the names of the entries, while os.path.isfile needs the full path to check whether something is a file. If the full path is not provided, it checks whether a file with the given name exists in the folder from which the script is executed. To fix this, give it the full path of the file, which you can build with os.path.join like this:
for name in os.listdir(testDir):
    if os.path.isfile(os.path.join(testDir, name)):
        fileList.append(name)
or, if you also want the full path:
for name in os.listdir(testDir):
    path = os.path.join(testDir, name)
    if os.path.isfile(path):
        fileList.append(path)
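A minimal sketch for seeing the difference directly: it runs os.path.isfile on each bare name and on the joined path and reports both results. The function name compare_checks is made up for illustration and is not part of the question's code.

```python
import os


def compare_checks(directory):
    """For each entry, report isfile() on the bare name vs. on the joined path."""
    results = []
    for name in sorted(os.listdir(directory)):
        # A bare name is resolved against os.getcwd(), not `directory`.
        bare = os.path.isfile(name)
        # Joining with `directory` makes the check independent of the cwd.
        joined = os.path.isfile(os.path.join(directory, name))
        results.append((name, bare, joined))
    return results
```

Calling compare_checks(testDir) from some other working directory should show False in the middle column for files that exist only in testDir, which matches the empty output in the question.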
Related
I have the following structure. I want to iterate through sub folders (machine, gunshot) and process .wav files and build mfccresult folder in each category and the .csv file in it. I have the following code and the MFCC folder is keep forming in already formed MFCC folder.
parent_dir = 'sound'
for subdirs, dirs, files in os.walk(parent_dir):
    resultsDirectory = subdirs + '/MFCC/'
    if not os.path.isdir("resultsDirectory"):
        os.makedirs(resultsDirectory)
    for filename in os.listdir(subdirs):
        if filename.endswith('.wav'):
            (rate,sig) = wav.read(subdirs + "/" +filename)
            mfcc_feat = mfcc(sig,rate)
            fbank_feat = logfbank(sig,rate)
            outputFile = resultsDirectory + "/" + os.path.splitext(filename)[0] + ".csv"
            file = open(outputFile, 'w+')
            numpy.savetxt(file, fbank_feat, delimiter=",")
            file.close()
What version of python are you using? Not sure if this has changed in the past, but os.walk does not return "subdirs" as the first of the tuple, but the dirpath. See here for python 3.6.
I don't know your absolute path, but seeing as you are passing in the path sound as a relative reference, I assume it is a folder inside the directory where you run your Python code. So, for example, let's say you are running this file (let's call it mycode.py) from
/home/username/myproject/mycode.py
and you have some subdirectory:
/home/username/myproject/sound/
So:
resultsDirectory = subdirs + '/MFCC/'
as written in your code above would resolve to:
/home/username/myproject/sound/MFCC/
So your first if statement will be entered since this is not an existing directory. Thereby you create a new directory:
/home/username/myproject/sound/MFCC/
From there, you take
filename in os.listdir(subdirs)
This also appears to be a misunderstanding of the output of this function. os.listdir() returns the names of every entry in the directory, subdirectories as well as files, not only files. See here for the man on that.
So now you are looping through the directories in:
/home/username/myproject/sound/
Here, I assume you have some of the directories from your diagram already made. So I assume you have:
/home/username/myproject/sound/machine_sound
/home/username/myproject/sound/gun_shot_sound
or something along those lines.
So the if statement will never be entered, since your directory names do not end with '.wav'.
Even if it did enter, you'd still have issues, as filename will actually be equal to machine_sound on the first loop, and gun_shot_sound the second time through.
Maybe you are using some other wav library, but the python built-in is called wave and you need to call the wave.open() on the file not wav.read(). See here for the docs.
I'm not sure what you were trying to achieve with the call to os.path.splitext(filename)[0], but you can read about it here. In this case you will end up with the same thing that went in, so machine_sound and gun_shot_sound.
Your output file will thus result in:
/home/username/myproject/sound/MFCC/machine_sound.csv
on the first loop, and
/home/username/myproject/sound/MFCC/gun_shot_sound.csv
the second time through.
So in conclusion, I'm not sure what is happening when you say "MFCC folder is keep forming in already formed MFCC folder" but you definitely have a lot of reading ahead of you before you can understand your own code, and have any hope of fixing it to do what you want. Assuming you read through the links I provided, you should be able to do that though. Good luck!
Additionally, you had quite a few typos in your code that I edited, including the immensely important whitespace characters. You should clean that up and ensure your code runs before posting it here, then double check that your copy/paste action did not result in any errors. People will be much more willing to help if you clean up your presentation a bit.
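A small sketch for checking what os.walk and os.listdir actually return, rather than guessing. It builds a throwaway tree in a temporary directory; the names machine, a.wav and notes.txt are invented for the demonstration, and describe_tree is a hypothetical helper.

```python
import os
import tempfile


def describe_tree(root):
    """Return one (dirpath, dirnames, filenames) tuple per directory under `root`."""
    # os.walk yields the path of the current directory first, then the
    # names of its subdirectories, then the names of its files.
    return [(dirpath, sorted(dirnames), sorted(filenames))
            for dirpath, dirnames, filenames in os.walk(root)]


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "machine"))
    open(os.path.join(root, "machine", "a.wav"), "w").close()
    open(os.path.join(root, "notes.txt"), "w").close()

    # os.listdir returns every entry name, directories and files alike.
    print(sorted(os.listdir(root)))      # ['machine', 'notes.txt']
    for row in describe_tree(root):
        print(row)
    # splitext strips only the final extension from a name.
    print(os.path.splitext("a.wav")[0])  # a
```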
for subdir,dirs,files in os.walk(parent_dir):
    for folder in next(os.walk(parent_dir))[1]:
        resultsDirectory= folder + '/MFCC'
        absPath = os.path.join(parent_dir, resultsDirectory)
        if not os.path.isdir(absPath):
            os.makedirs(absPath)
    for filename in os.listdir(subdir):
        print('listdir')
        if filename.endswith('.wav'):
            print("csv file writing")
            (rate,sig) = wav.read(subdir + "/" +filename)
            mfcc_feat = mfcc(sig,rate)
            fbank_feat = logfbank(sig,rate)
            print("fbank_feat")
            outputFile =subdir + "/MFCC"+"/" + os.path.splitext(filename)[0] + ".csv"
            file = open(outputFile, "w+")
            numpy.savetxt(file, fbank_feat, delimiter=",")
            file.close()
Here the csv file is stored in the subdirectory, not in the MFCC folder for each category.
I have an issue with the output file path.
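One possible way to organise the paths, sketched under assumptions: the results folder is called MFCC, one is wanted next to every group of .wav files, and the feature extraction itself (wav.read, mfcc, logfbank in the code above) is left out so the sketch runs with the standard library only. The name plan_csv_outputs is hypothetical.

```python
import os


def plan_csv_outputs(parent_dir, results_name="MFCC"):
    """Yield (wav_path, csv_path) pairs, creating one results folder per category."""
    for dirpath, dirnames, filenames in os.walk(parent_dir):
        # Prune results folders in place so os.walk never descends into
        # them; otherwise each MFCC folder would get its own MFCC folder.
        dirnames[:] = [d for d in dirnames if d != results_name]

        wav_files = [f for f in filenames if f.lower().endswith(".wav")]
        if not wav_files:
            continue  # nothing to process here, so create no folder

        results_dir = os.path.join(dirpath, results_name)
        os.makedirs(results_dir, exist_ok=True)
        for filename in sorted(wav_files):
            stem = os.path.splitext(filename)[0]
            yield (os.path.join(dirpath, filename),
                   os.path.join(results_dir, stem + ".csv"))
```

Each csv_path could then be handed to numpy.savetxt after the features for the matching wav_path have been computed.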
I have an issue when I try to run a basic function in my code pertaining to the default locations of VMs in a Windows box. These VMs are stored as a single file.
For some reason, the loop with the directory that glob is interacting with is not finding any files.
I have to use glob at the beginning of the path and the end of the path, so that this script can be used around my department.
I have researched os.walk() and os.listdir(); both fail because of the way I have written it: I get the error TypeError: expected str, bytes or os.PathLike object, not list.
I need a list of the VMs so that I can write a script that clones all of the VMs within that list through the vix API.
def getVMs():
    vmloc = glob.glob('**\\Documents\\Virtual Machines\\*.vmdk', recursive=True)
    for f in vmloc:
        print(f)
The problem is that it prints a null output and I cannot figure out why. Any help would be appreciated.
EDIT:
I also tried to finalize the path with creating the path through os.path and created the full path of the VM folder:
def getVMs():
    path = os.path.join('..','C:','\\','Users',os.getlogin(),'Documents','Virtual Machines\\',)
    for vmloc in glob.glob(path +'**.vmdk', recursive=True):
        print(vmloc)
It still produces a null output
The issue lay in the fact that I was having it look for .vmdk files within the directory above the actual dir that I needed.
path = os.path.join('..','C:','\\','Users',os.getlogin(),'Documents','Virtual Machines\\',)
path1 = glob.glob(path + '**\\' + '**.vmdk' , recursive=True )
for vms in path1:
    print(vms)
I was able to find all of the VMDK files after that
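For reference, glob treats '**' as "this directory and everything below it" only when it is a whole path component and recursive=True is passed; inside a name, as in '**.vmdk', it behaves like a single '*'. A minimal sketch of a recursive search built on that rule follows. The helper find_by_extension and its parameters are hypothetical, and the base folder is passed in rather than hard-coded.

```python
import glob
import os


def find_by_extension(base_dir, extension):
    """Return every file under `base_dir` (at any depth) ending in `extension`."""
    # '**' must be its own path component to mean "this directory and all
    # subdirectories"; it also needs recursive=True to take effect.
    # glob.escape keeps special characters in base_dir from being
    # interpreted as wildcards.
    pattern = os.path.join(glob.escape(base_dir), "**", "*" + extension)
    return sorted(glob.glob(pattern, recursive=True))
```

The result is a plain list of path strings, so it can be looped over directly when cloning each VM.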
The following is a piece of the code:
files = glob.iglob(studentDir + '/**/*.py',recursive=True)
for file in files:
    shutil.copy(file, newDir)
The thing is: I plan to get all the files with extension .py and also all the files whose names contain "write". Is there anything I can do to change my code? Many thanks for your time and attention.
If you want that recursive option, you could use:
patterns = ['/**/*write*','/**/*.py']
for p in patterns:
    files = glob.iglob(studentDir + p, recursive=True)
    for file in files:
        shutil.copy(file, newDir)
If the wanted files are in the same directory you could simply use:
# Flatten the per-pattern results into one list of file names.
certainfiles = [f for e in ['*.py', '*write*'] for f in glob.glob(e)]
for file in certainfiles:
    shutil.copy(file, newDir)
I would suggest the use of pathlib which has been available from version 3.4. It makes many things considerably easier.
In this case '**' stands for 'descend the entire folder'.
'*.py' has its usual meaning.
path is an object, but you can recover its string representation with the str function; path.name gives just the file name.
When you want the entire path name, call path.absolute() and take the str of that.
Don't worry, you'll get used to it. :) If you look at the other goodies in pathlib you'll see it's worth it.
import shutil
from pathlib import Path
studentDir = <something>
newDir = <something else>
for path in Path(studentDir).glob('**/*.py'):
    if 'write' in str(path):
        shutil.copy(str(path.absolute()), newDir)
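If the goal is the union (every .py file plus every file whose name contains "write"), one possible pathlib sketch is below. The function copy_matching and its needle parameter are hypothetical names, and it assumes newDir lies outside studentDir so that copies are not picked up again by the search.

```python
import shutil
from pathlib import Path


def copy_matching(student_dir, new_dir, needle="write"):
    """Copy files that end in .py OR have `needle` in their name; return the names copied."""
    copied = []
    for path in Path(student_dir).rglob("*"):
        if not path.is_file():
            continue
        # path.name is just the file name, so a folder called "write"
        # higher up the path does not cause a false match.
        if path.suffix == ".py" or needle in path.name:
            shutil.copy(path, new_dir)
            copied.append(path.name)
    return sorted(copied)
```

Files with the same name in different subfolders overwrite each other in the destination, as they would with the glob versions above.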
I'm trying to loop through the files in a folder that have the extension .jpg or .jpeg. This looks like it should work, but for some reason nothing is printing. When I run it, no compile errors, it just doesn't print anything to the Python console.
import os
def find_images(image_dir, extensions=None):
    default_extensions = ('jpg', 'jpeg')
    if extensions is None:
        extensions = default_extensions
    elif isinstance(extensions, str):
        extensions = (extensions,)
    for root, dirs, filenames in os.walk(image_dir):
        for filename in filenames:
            print(filename, filename.split('.',1))
            if filename.split('.', 1)[-1].lower() in extensions:
                yield os.path.join(root, filename)
def main(image_dir=None):
    if image_dir is None:
        image_dir="D:/userName/Pictures/Gotham"
    find_images(image_dir)
main() #doesn't return anything in console
find_images("D:/userName/Pictures/Gotham", ) # also doesn't do anything...
All of the questions I tried to find online give the method of doing this (looping through files in a directory), but nothing is showing me how to use the path correctly, as that's what I think I'm not doing right. There are pictures in the folder, I just can't seem to see why the script isn't listing the name for each.
I have also tried image_dir=r'D:\userName\Pictures\Gotham' and ...='D://userName//Pictures//Gotham' to no avail.
edit: meant to add this. This does work:
file_path = "D:/userName/Pictures/"
def list_images():
    for root, dirs, files in os.walk(file_path):
        for file in files:
            if file.endswith(".jpg"):
                print(file, file.split('.',1))
list_images()
...so what am I overlooking in the above find_images(...)?
(FYI - I'm trying to implement things from this answer)
I didn't notice the sneaky yield in your function... Well, you have made a generator function so it just returns an iterator when you call it - to get it to actually start executing you have to start iterating through it:
for image in find_images("D:/userName/Pictures/Gotham"):
    print(image)
or at least get the first element to start it:
first_result = next(find_images("D:/userName/Pictures/Gotham"))
Also, while I'm here, that's a very unsafe way to check for an extension: if the file name has multiple dots (e.g. my.image.jpg) it won't be recognized. Use str.endswith() or slice from the back of the string instead of str.split() (or don't limit splits to one).
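Both points can be combined in a short sketch: a generator that must be iterated to do any work, and an extension check that looks only at the last dot. The helper has_extension is a made-up name, and using os.path.splitext is one option among those mentioned, not the only one.

```python
import os


def has_extension(filename, extensions=("jpg", "jpeg")):
    """Return True if `filename` ends in one of `extensions` (case-insensitive)."""
    if isinstance(extensions, str):
        extensions = (extensions,)
    # splitext splits on the last dot only, so 'my.image.jpg' -> '.jpg'.
    suffix = os.path.splitext(filename)[1].lower().lstrip(".")
    return suffix in extensions


def find_images(image_dir, extensions=("jpg", "jpeg")):
    """Generator: lazily yield the full path of each matching file."""
    for root, _dirs, filenames in os.walk(image_dir):
        for filename in filenames:
            if has_extension(filename, extensions):
                yield os.path.join(root, filename)
```

Calling find_images(folder) on its own only creates the generator; wrapping it in list(...) or a for loop is what makes the directory walk run.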
I'm attempting to create a script that looks into a specific directory and then lists all the files of my chosen types in addition to all folders within the original location.
I have managed the first part of listing all the files of the chosen types, however am encountering issues listing the folders.
The code I have is:
import datetime, os
now = datetime.datetime.now()
myFolder = 'F:\\'
textFile = 'myTextFile.txt'
outToFile = open(textFile, mode='w', encoding='utf-8')
filmDir = os.listdir(path=myFolder)
for file in filmDir:
    if file.endswith(('avi','mp4','mkv','pdf')):
        outToFile.write(os.path.splitext(file)[0] + '\n')
    if os.path.isdir(file):
        outToFile.write(os.path.splitext(file)[0] + '\n')
outToFile.close()
It is successfully listing all avi/mp4/mkv/pdf files, however isn't ever going into the if os.path.isdir(file): even though there are multiple folders in my F: directory.
Any help would be greatly appreciated. Even if it is suggesting a more effective/efficient method entirely that does the job.
Solution found thanks to Son of a Beach
if os.path.isdir(file):
changed to
if os.path.isdir(os.path.join(myFolder, file)):
os.listdir returns the names of the files, not the fully-qualified paths to the files.
You should use a fully qualified path name in os.path.isdir() (unless you've already told Python where to look).
Eg, instead of using if os.path.isdir(file): you could use:
if os.path.isdir(os.path.join(myFolder, file)):
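Since the question also asked about a more efficient method, here is a sketch using os.scandir, whose entries can answer is_dir() without a separate path join. The function name list_media_and_folders is hypothetical; the extension list is taken from the question, with leading dots added so that a name merely ending in "avi" does not match.

```python
import os


def list_media_and_folders(folder, extensions=(".avi", ".mp4", ".mkv", ".pdf")):
    """Return sorted names of subfolders plus extension-stripped names of matching files."""
    names = []
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_dir():
                # Keep folder names whole; splitext would chop 'Season.1'.
                names.append(entry.name)
            elif entry.name.lower().endswith(extensions):
                names.append(os.path.splitext(entry.name)[0])
    return sorted(names)
```

The returned names can be written to the text file with one '\n'.join(...) call instead of a write per entry.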