use glob module to get certain files with python?


The following is a piece of the code:
files = glob.iglob(studentDir + '/**/*.py', recursive=True)
for file in files:
    shutil.copy(file, newDir)
The thing is: I plan to get all the files with extension .py and also all the files whose names contain "write". Is there anything I can do to change my code? Many thanks for your time and attention.

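If you want the recursive option, you could use:
patterns = ['/**/*write*', '/**/*.py']
for p in patterns:
    files = glob.iglob(studentDir + p, recursive=True)
    for file in files:
        shutil.copy(file, newDir)
Note that a file matching both patterns (for example write_test.py) is simply copied twice, and a directory whose name contains "write" would make shutil.copy raise an error.
If the wanted files are all in the current working directory, you could simply use:
certainfiles = [f for e in ['*.py', '*write*'] for f in glob.glob(e)]
for file in certainfiles:
    shutil.copy(file, newDir)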
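The recursive approach above can be sketched a little more defensively. This is only a sketch under some assumptions: the function name copy_py_and_write_files is made up for illustration, new_dir is assumed to lie outside student_dir, and files with the same name in different subfolders will overwrite each other in new_dir.

```python
import glob
import os
import shutil


def copy_py_and_write_files(student_dir, new_dir):
    """Copy every .py file and every file whose name contains 'write'."""
    patterns = ["**/*.py", "**/*write*"]
    matches = set()  # a set drops files that match both patterns
    for pattern in patterns:
        full_pattern = os.path.join(student_dir, pattern)
        matches.update(glob.glob(full_pattern, recursive=True))

    os.makedirs(new_dir, exist_ok=True)
    copied = []
    for path in sorted(matches):
        if os.path.isfile(path):  # skip directories such as "write_ups"
            shutil.copy(path, new_dir)
            copied.append(path)
    return copied
```

Calling copy_py_and_write_files(studentDir, newDir) returns the list of source paths that were copied, which is handy for a quick sanity check.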

I would suggest using pathlib, which has been available since Python 3.4. It makes many things considerably easier.
In this case '**' means 'descend recursively through the entire folder tree', so '**/*' matches every entry below studentDir.
The suffix and name checks then pick out the .py files and the files whose names contain "write".
path is an object; str(path) gives its string representation, and path.name gives just the file name.
When you want the absolute path, call path.absolute() and take the str of that.
Don't worry, you'll get used to it. :) If you look at the other goodies in pathlib you'll see it's worth it.
import shutil
from pathlib import Path
studentDir = <something>
newDir = <something else>
for path in Path(studentDir).glob('**/*'):
    if path.is_file() and (path.suffix == '.py' or 'write' in path.name):
        shutil.copy(str(path.absolute()), newDir)

Related

How do I use a file path list in a loop?

First of all, I have to say that I'm totally new to Python. I am trying to use it to analyze EEG data with a toolbox. I have 30 EEG data files.
I want to create a loop instead of doing each analysis separately. Below you can see the code I wrote to access all the directories to be analyzed in a folder:
import os
path = 'my/data/directory'
folder = os.fsencode(path)
filenames = []
for file in os.listdir(folder):
    filename = os.fsdecode(file)
    if filename.endswith('.csv') and filename.startswith('p'): # whatever file types you're using
        filenames.append(filename)
filenames.sort()
But after that, I couldn't figure out how to use them in a loop. This way I could list all the files, but I couldn't find how to refer to each of them in turn within the following code:
file = pd.read_csv("filename", header=None)
fg.fit(freqs, file, [0, 60]) #The rest of the code is like this, but this part is not related
Normally I have to write the whole file path in the part that says "filename". I can open all file paths with the code I created above, but I don't know how to use them in this code respectively.
I would be glad if you help.

First of all, this was a good attempt.
What you need to do is make a list of files. From there you can do whatever you want.
You can do this as follows:
from os import listdir
from os.path import isfile, join
mypath = 'F:/code' # whatever path you want
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
# now do the for loop
for i in onlyfiles:
    # do whatever you want
    print(i)
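To connect this back to the question, the key step is to pass the loop variable (joined with the folder) to the reader instead of the literal string "filename". Below is a rough sketch of that idea. The helper names matching_paths, load_rows and process_all are invented for illustration, and csv.reader is only a stand-in so the example runs without pandas; in the asker's script the call would presumably be pd.read_csv(path, header=None) followed by the fg.fit(...) step.

```python
import csv
import os


def matching_paths(folder, prefix="p", suffix=".csv"):
    """Return sorted full paths of files in `folder` named <prefix>...<suffix>."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.startswith(prefix) and name.endswith(suffix)
    )


def load_rows(path):
    """Stand-in loader; pd.read_csv(path, header=None) would go here."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))


def process_all(folder):
    """Load every matching file, keyed by its bare file name."""
    results = {}
    for path in matching_paths(folder):
        data = load_rows(path)  # pass the variable, not the string "filename"
        results[os.path.basename(path)] = data
    return results
```

Keeping the full path in the list means the loop body never has to know which folder the file came from.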

Python and Pandas - Reading the only CSV file in a directory without knowing the file name

I'm putting together a Python script that uses pandas to read data from a CSV file, sort and filter that data, then save the file to another location.
This is something I have to run regularly - at least weekly if not daily. The original file is updated every day and is placed in a folder but each day the file name changes and the old file is removed so there is only one file in the directory.
I am able to make all this work by specifying the file location and name in the script, but since the name of the file changes each day, I'd rather not have to edit the script every time I want to run it.
Is there a way to read that file based solely on the location? As I mentioned, it's the only file in the directory. Or is there a way to use a wildcard in the name? The name of the file is always something like: ABC_DEF_XXX_YYY.csv where XXX and YYY change daily.
I appreciate any help. Thanks!

from os import listdir
CSV_Files = [file for file in listdir('<path to folder>') if file.endswith('.csv')]
If there is only one CSV file in the folder, you can do
CSV_File = CSV_Files[0]
afterwards. Note that this is the bare file name; join it with the folder path before opening it.

To get the file names solely based on the location:
import os, glob
os.chdir("/ParentDirectory")
for file in glob.glob("*.csv"):
    print(file)

Assume that dirName holds the directory holding your file.
A call to os.listdir(dirName) gives you files or child directories in this directory (of course, you must earlier import os).
To limit the list to just files, we must write a little more, e.g.
[f for f in os.listdir(dirName) if os.path.isfile(os.path.join(dirName, f))]
So we have a full list of files. To get the first file, add [0] to the above expression, so
fn = [f for f in os.listdir(dirName) if os.path.isfile(os.path.join(dirName, f))][0]
gives you the name of the first file, but without the directory.
To get the full path, use os.path.join(dirName, fn).
So the whole script, adding check for proper extension, can be:
import os
dirName = r"C:\Users\YourName\whatever_path_you_wish"
fn = [f for f in os.listdir(dirName)
      if f.endswith('.csv') and os.path.isfile(os.path.join(dirName, f))][0]
path = os.path.join(dirName, fn)
Then you can e.g. open this file or make any other use of it, as you need.
Edit
The above program will fail if the directory given does not contain any file
with the required extension. To make the program more robust, change it to
something like below:
fnList = [f for f in os.listdir(dirName)
          if f.endswith('.csv') and os.path.isfile(os.path.join(dirName, f))]
if len(fnList) > 0:
    fn = fnList[0]
    path = os.path.join(dirName, fn)
    print(path)
    # Process this file
else:
    print('No such file')
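The question also asks about using a wildcard in the name. One possible way to do that is with pathlib's glob, sketched below. The function name find_single_csv is hypothetical, and raising an error when the folder holds zero or several matches is a design choice of this sketch, not something the question requires.

```python
from pathlib import Path


def find_single_csv(folder, pattern="*.csv"):
    """Return the one file in `folder` matching `pattern`, or raise."""
    matches = sorted(p for p in Path(folder).glob(pattern) if p.is_file())
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one file matching {pattern!r} in {folder}, "
            f"found {len(matches)}"
        )
    return matches[0]
```

With the naming scheme from the question, find_single_csv(dirName, "ABC_DEF_*.csv") would return the full path of the day's file, which can then be handed to pd.read_csv.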

Copy files in Python

I want to copy files with a specific file extension from one directory and put them in another directory. I tried searching and found code the same as what I'm doing, however it doesn't appear to do anything. Any help would be great.
import shutil
import os
source = "/tmp/folder1/"
destination = "/tmp/newfolder/"
for files in source:
    if files.endswith(".txt"):
        shutil.move(files,destination)

I think the problem is your for-loop. You are actually looping over the string "/tmp/folder1/" instead of over the files in the folder. Your for-loop goes through the string character by character (/, t, m, p, etc.).
What you want is to loop over a list of the files in the source folder. How that works is described here: How do I list all files of a directory?.
Going from there, you can run through the filenames, test their extensions, and move them just as you showed.

Your "for files in source" picks one character after another from your string "source" (the for loop doesn't know that source is a path; to it, this is just a plain str object).
You have to use os.listdir:
import shutil
import os
source = "source/"
destination = "dest/"
for files in os.listdir(source): #list all files and directories
    if os.path.isfile(os.path.join(source, files)): #is this a file
        if files.endswith(".txt"):
            shutil.move(os.path.join(source, files),destination) #move the file
os.path.join is used to join a directory and a filename (to get a complete path).
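Since the question title talks about copying while the code moves the files, here is a small sketch of the copying variant using pathlib. The function name copy_by_extension is made up for this example, and creating the destination folder when it is missing is an assumption about what the asker wants.

```python
import shutil
from pathlib import Path


def copy_by_extension(source, destination, extension=".txt"):
    """Copy files ending in `extension` from source to destination."""
    source, destination = Path(source), Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in source.iterdir():
        # is_file() filters out directories that happen to end in ".txt"
        if entry.is_file() and entry.suffix == extension:
            shutil.copy2(entry, destination)  # copy2 also keeps timestamps
            copied.append(entry.name)
    return sorted(copied)
```

For example, copy_by_extension("/tmp/folder1/", "/tmp/newfolder/") leaves the originals in place; swapping shutil.copy2 for shutil.move gives the moving behaviour from the answer above.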

List files of specific file type and directories

I'm attempting to create a script that looks into a specific directory and then lists all the files of my chosen types in addition to all folders within the original location.
I have managed the first part of listing all the files of the chosen types, however am encountering issues listing the folders.
The code I have is:
import datetime, os
now = datetime.datetime.now()
myFolder = 'F:\\'
textFile = 'myTextFile.txt'
outToFile = open(textFile, mode='w', encoding='utf-8')
filmDir = os.listdir(path=myFolder)
for file in filmDir:
    if file.endswith(('avi','mp4','mkv','pdf')):
        outToFile.write(os.path.splitext(file)[0] + '\n')
    if os.path.isdir(file):
        outToFile.write(os.path.splitext(file)[0] + '\n')
outToFile.close()
It is successfully listing all avi/mp4/mkv/pdf files, however isn't ever going into the if os.path.isdir(file): even though there are multiple folders in my F: directory.
Any help would be greatly appreciated. Even if it is suggesting a more effective/efficient method entirely that does the job.
Solution found thanks to Son of a Beach
if os.path.isdir(file):
changed to
if os.path.isdir(os.path.join(myFolder, file)):

os.listdir returns the names of the files, not the fully-qualified paths to the files.
You should use a fully qualified path name in os.path.isdir() (unless you've already told Python where to look).
Eg, instead of using if os.path.isdir(file): you could use:
if os.path.isdir(os.path.join(myFolder, file)):
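As for a more efficient method, one option is os.scandir, whose entries already know whether they are files or directories, so no os.path.join is needed for the check. The following is only a sketch: the function name list_media_and_folders is hypothetical, the extensions carry a leading dot so that a name like "navi" does not match 'avi', and folder names are kept whole rather than passed through os.path.splitext.

```python
import os


def list_media_and_folders(folder, extensions=(".avi", ".mp4", ".mkv", ".pdf")):
    """Return folder names plus extension-less names of matching files."""
    names = []
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_dir():
                names.append(entry.name)
            elif entry.is_file() and entry.name.lower().endswith(extensions):
                names.append(os.path.splitext(entry.name)[0])
    return sorted(names)
```

The returned list can then be written to the text file one name per line, as in the original script.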

Why is this code not printing the directory contents?

I am a Python newbie and need to create a script that will parse some files and put them into a SQL db. So I am trying to create smaller scripts that do what I want, then combine them into a larger script.
To that end, I am trying to run this code:
import os
fileList = []
testDir = "/home/me/somedir/dir1/test"
for i in os.listdir(testDir):
    if os.path.isfile(i):
        fileList.append(i)
for fileName in fileList:
    print(fileName)
When I look at the output, I do not see any files listed. I tried the path without quotes and got stack errors. So searching showed I need the double quotes.
Where did I go wrong?

I found this code that works fine:
import os
in_path = "/home/me/dir/"
for dir_path, subdir_list, file_list in os.walk(in_path):
    for fname in file_list:
        full_path = os.path.join(dir_path, fname)
        print(full_path)
I can use full_path to do my next step.
If anyone has any performance tips, feel free to share them. Or point me in the right direction.

That is because you are most likely executing your script from a folder other than testDir. os.listdir returns only the names, while os.path.isfile needs the full path of the file to check whether it is a file or not. If the full path is not provided, it checks whether a file with the given name exists in the current working directory, i.e. the folder from which the script is executed. To fix this, give it the full path of the file, which you can build with os.path.join like this:
for name in os.listdir(testDir):
    if os.path.isfile(os.path.join(testDir, name)):
        fileList.append(name)
or, if you also want the full path:
for name in os.listdir(testDir):
    path = os.path.join(testDir, name)
    if os.path.isfile(path):
        fileList.append(path)
