First of all, I have to say that I'm totally new to Python. I am trying to use it to analyze EEG data with a toolbox. I have 30 EEG data files.
I want to create a loop instead of doing each analysis separately. Below you can see the code I wrote to collect all the files to be analyzed in a folder:
import os

path = 'my/data/directory'
folder = os.fsencode(path)

filenames = []

for file in os.listdir(folder):
    filename = os.fsdecode(file)
    if filename.endswith('.csv') and filename.startswith('p'):  # whatever file types you're using
        filenames.append(filename)

filenames.sort()
But after that, I couldn't figure out how to use them in a loop. I can list all the files this way, but I couldn't work out how to refer to each of them in turn within the following code:
file = pd.read_csv("filename", header=None)
fg.fit(freqs, file, [0, 60]) #The rest of the code is like this, but this part is not related
Normally I have to write the whole file path in the part that says "filename". I can list all the file names with the code I created above, but I don't know how to use them in this code one after another.
I would be glad if you help.
First of all, this was a good attempt.
What you need to do is make a list of files. From there you can do whatever you want...
You can do this as follows:
from os import listdir
from os.path import isfile, join

mypath = 'F:/code'  # whatever path you want
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

# now do the for loop
for i in onlyfiles:
    # do whatever you want
    print(i)
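Applied to the question, a minimal sketch might look like this. It assumes pandas is imported and that freqs and fg come from your toolbox setup, as in the question's snippet:
import os
import pandas as pd

path = 'my/data/directory'
filenames = sorted(f for f in os.listdir(path)
                   if f.endswith('.csv') and f.startswith('p'))

for filename in filenames:
    file = pd.read_csv(os.path.join(path, filename), header=None)
    fg.fit(freqs, file, [0, 60])  # rest of the analysis goes here, as in the question
os.path.join builds the full path for each file, so pandas can find it regardless of the current working directory.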
I have multiple folders in a directory and each folder has multiple files. I have a script which checks for a specific file in each folder and does some data preprocessing and analysis if that file is present.
A snippet of it is given below.
import pandas as pd
import json
import os

rootdir = os.path.abspath(os.getcwd())
df_list = []

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        if file.startswith("StudyParticipants") and file.endswith(".csv"):
            temp = pd.read_csv(os.path.join(subdir, file))
            .....
            .....
            'some analysis'

Merged_df.to_excel(path + '\Processed Data Files\Study_Participants_Merged.xlsx')
Now, I want to automate this process. I want this script to be executed whenever a new folder is added. This is my first attempt at exploring an automation process and I have been stuck on this for quite a while without major progress.
I am using windows system and Jupyter notebook to create these dataframes and perform analysis.
Any help is greatly appreciated.
Thanks.
I've written a script which you only need to run once and it will keep working.
Please note:
1.) This solution does not take into account which folder was created. If this information is required I can rewrite the answer.
2.) This solution assumes folders won't be deleted from the main folder. If this isn't the case, I can rewrite the answer as well.
import time
import os

def DoSomething():
    pass

if __name__ == '__main__':
    # go to folder of interest
    os.chdir('/home/somefolders/.../A1')
    # get current number of folders inside it
    N = len(os.listdir())
    while True:
        time.sleep(5)  # sleep for 5 secs
        if N != len(os.listdir()):
            print('New folder added! Doing something useful...')
            DoSomething()
            N = len(os.listdir())  # update N
Take a look at watchdog:
http://thepythoncorner.com/dev/how-to-create-a-watchdog-in-python-to-look-for-filesystem-changes/
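For illustration, a minimal sketch using the watchdog package might look like this (the watched path and the handler body are placeholders for your own folder and analysis):
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class NewFolderHandler(FileSystemEventHandler):
    def on_created(self, event):
        # only react to newly created directories
        if event.is_directory:
            print(f'New folder detected: {event.src_path}')
            # run your preprocessing/analysis here

observer = Observer()
observer.schedule(NewFolderHandler(), path='path/to/rootdir', recursive=False)  # placeholder path
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()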
You could also code a very simple watchdog service on your own (a sketch follows this list):
list all files in the directory you want to observe
wait a time span you define, say a few seconds
make a new list of the filesystem
compare the two lists and take their difference
the resulting difference is your set of filesystem changes
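A bare-bones sketch of that idea (the directory path and polling interval are placeholders):
import os
import time

watch_dir = '/path/to/rootdir'  # placeholder
before = set(os.listdir(watch_dir))

while True:
    time.sleep(5)  # polling interval
    after = set(os.listdir(watch_dir))
    for name in after - before:  # entries that appeared since the last check
        print('New entry:', name)
        # trigger your processing here
    before = after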
Best regards
I have a folder of 90,000 PDF documents with sequential numeric titles (e.g. 02.100294.PDF). I have a list of around 70,000 article titles drawn from this folder. I want to build a Python program that matches titles from the list to titles in the folder and then moves the matched files to a new folder.
For example, say I have the following files in "FOLDER":
1.100.PDF
1.200.PDF
1.300.PDF
1.400.PDF
Then, I have a list with the following titles:
1.200.PDF
1.400.PDF
I want a program that matches the two document titles from the list (1.200 and 1.400) to the documents in FOLDER, and then moves these two files to "NEW_FOLDER".
Any idea how to do this in Python?
Thank you!
EDIT: This is the code I currently have. The source directory is 'scr', and 'dst' is the new destination. "Conden_art" is the list of files I want to move. I am trying to see if the file in 'scr' matches a name listed in 'conden_art'. If it does, I want to move it to 'dst'. Right now, the code is finding no matches and is only printing 'done'. This issue is different from just moving files because I need to match file names to a list, and then move them.
import shutil
import os

for file in scr:
    if filename in conden_art:
        shutil.copy(scr, dst)
    else:
        print('done')
SOLVED!
Here is the code I used that ended up working. Thanks for all of your help!
import shutil
import os
import pandas as pd

scr = filepath-1
dst = filepath-2

files = os.listdir(scr)

for f in files:
    if f in conden_art:
        shutil.move(scr + '\\' + f, dst)
Here's a way to do it -
from os import listdir
from os.path import isfile, join
import shutil

files = [f for f in listdir(src) if isfile(join(src, f))]  # this is your list of files at the source path

for i in conden_art:
    if i in files:
        shutil.move(i, dst + i)  # moving the files in conden_art to dst/
src and dst here are your paths for the source and destination. Make sure you are at the src path before running the for loop; otherwise, Python will be unable to find the file.
Rather than looping through the files in the source directory it would be quicker to loop through the filenames you already have. You can use os.path.exists() to check if a file is available to be moved.
from os import path
import shutil

for filename in conden_art:
    src_fp, dst_fp = path.join(src, filename), path.join(dst, filename)
    if path.exists(src_fp):
        shutil.move(src_fp, dst_fp)
        print(f'{src_fp} moved to {dst}')
    else:
        print(f'{src_fp} does not exist')
I'm putting together a Python script that uses pandas to read data from a CSV file, sort and filter that data, then save the file to another location.
This is something I have to run regularly, at least weekly if not daily. The original file is updated every day and placed in a folder, but each day the file name changes and the old file is removed, so there is only ever one file in the directory.
I am able to make all this work by specifying the file location and name in the script, but since the name of the file changes each day, I'd rather not have to edit the script every time I want to run it.
Is there a way to read that file based solely on the location? As I mentioned, it's the only file in the directory. Or is there a way to use a wildcard in the name? The name of the file is always something like: ABC_DEF_XXX_YYY.csv where XXX and YYY change daily.
I appreciate any help. Thanks!
from os import listdir
CSV_Files = [file for file in listdir('<path to folder>') if file.endswith('.csv')]
If there is only 1 CSV file in the folder, you can do
CSV_File = CSV_Files[0]
afterwards.
To get the file names solely based on the location:
import os, glob

os.chdir("/ParentDirectory")

for file in glob.glob("*.csv"):
    print(file)
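Since the name always follows the ABC_DEF_XXX_YYY.csv pattern, a sketch that combines a wildcard with pandas could look like this (the folder path is a placeholder, and it assumes exactly one file matches):
import glob
import os
import pandas as pd

folder = '/path/to/folder'  # placeholder
matches = glob.glob(os.path.join(folder, 'ABC_DEF_*.csv'))

if len(matches) == 1:
    df = pd.read_csv(matches[0])
    # sort/filter df here, then save it to the other location as in your script
else:
    print('Expected exactly one matching file, found', len(matches))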
Assume that dirName holds the path of the directory containing your file.
A call to os.listdir(dirName) gives you the files and child directories in this directory (of course, you must import os first).
To limit the list to just files, we must write a little more, e.g.
[f for f in os.listdir(dirName) if os.path.isfile(os.path.join(dirName, f))]
So we have a full list of files. To get the first file, add [0] to the
above expression, so
fn = [f for f in os.listdir(dirName) if os.path.isfile(os.path.join(dirName, f))][0]
gives you the name of the first file, but without the directory.
To have the full path, use os.path.join(dirName, fn)
So the whole script, adding a check for the proper extension, can be:
import os

dirName = r"C:\Users\YourName\whatever_path_you_wish"
fn = [f for f in os.listdir(dirName)
      if f.endswith('.csv') and os.path.isfile(os.path.join(dirName, f))][0]
path = os.path.join(dirName, fn)
Then you can, e.g., open this file or make any use of it as you need.
Edit
The above program will fail if the directory given does not contain any file
with the required extension. To make the program more robust, change it to
something like below:
fnList = [f for f in os.listdir(dirName)
          if f.endswith('.csv') and os.path.isfile(os.path.join(dirName, f))]

if len(fnList) > 0:
    fn = fnList[0]
    path = os.path.join(dirName, fn)
    print(path)
    # Process this file
else:
    print('No such file')
The following is a piece of the code:
files = glob.iglob(studentDir + '/**/*.py', recursive=True)

for file in files:
    shutil.copy(file, newDir)
The thing is: I plan to get all the files with extension .py and also all the files whose names contain "write". Is there anything I can do to change my code? Many thanks for your time and attention.
If you want that recursive option, you could use:
patterns = ['/**/*write*', '/**/*.py']

for p in patterns:
    files = glob.iglob(studentDir + p, recursive=True)
    for file in files:
        shutil.copy(file, newDir)
If the wanted files are all in the same directory, you could simply use:
certainfiles = [f for e in ['*.py', '*write*'] for f in glob.glob(e)]

for file in certainfiles:
    shutil.copy(file, newDir)
I would suggest the use of pathlib, which has been available since version 3.4. It makes many things considerably easier.
In this case '**' stands for 'descend the entire folder tree'.
'*.py' has its usual meaning.
path is a Path object; str(path) gives its string representation, and path.name gives just the file name.
When you want the entire path name, use path.absolute() and take the str of that.
Don't worry, you'll get used to it. :) If you look at the other goodies in pathlib you'll see it's worth it.
import shutil
from pathlib import Path

studentDir = <something>
newDir = <something else>

for path in Path(studentDir).glob('**/*.py'):
    if 'write' in str(path):
        shutil.copy(str(path.absolute()), newDir)
I want to copy files with a specific file extension from one directory and put them in another directory. I tried searching and found code much the same as what I'm doing, however it doesn't appear to do anything; any help would be great.
import shutil
import os

source = "/tmp/folder1/"
destination = "/tmp/newfolder/"

for files in source:
    if files.endswith(".txt"):
        shutil.move(files, destination)
I think the problem is your for-loop. You are actually looping over the string "/tmp/folder1/" instead of looping over the files in the folder. What your for-loop does is go through the string character by character (/, t, m, p, etc.).
What you want is to loop over a list of the files in the source folder. How that works is described here: How do I list all files of a directory?.
Going from there you can run through the filenames, test for their extension and move them just as you showed.
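As one sketch of that idea, glob can build the list of matching files directly (same paths as in the question, and shutil.move kept from the original code):
import glob
import os
import shutil

source = "/tmp/folder1/"
destination = "/tmp/newfolder/"

# glob returns the full path of every .txt file in the source folder
for filepath in glob.glob(os.path.join(source, "*.txt")):
    shutil.move(filepath, destination)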
Your "for file in source" pick one character after another one from your string "source" (the for doesn't know that source is a path, for him it is just a basic str object).
You have to use os.listdir :
import shutil
import os

source = "source/"
destination = "dest/"

for files in os.listdir(source):  # list all files and directories
    if os.path.isfile(os.path.join(source, files)):  # is this a file?
        if files.endswith(".txt"):
            shutil.move(os.path.join(source, files), destination)  # move the file
os.path.join is used to join a directory and a filename (to have a complete path).
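A tiny illustration of what os.path.join produces (output shown as a comment):
import os

full_path = os.path.join("source/", "example.txt")
print(full_path)  # -> source/example.txt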