saving text files to .npy file - python-3.x

I have many text files in a directory with numerical extensions (for example: signal_data1.9995100000000001, signal_data1.99961, etc.).
The contents of the files are given below:
signal_data1.9995100000000001
-1.710951390504200198e+00
5.720409824754981720e-01
2.730176313110273423e+00
signal_data1.99961
-6.710951390504200198e+01
2.720409824754981720e-01
6.730176313110273423e+05
I just want to arrange the above files into a single .npy file as
-1.710951390504200198e+00,5.720409824754981720e-01, 2.730176313110273423e+00
-6.710951390504200198e+01,2.720409824754981720e-01, 6.730176313110273423e+05
So, I want to implement the same procedure for many files of a directory.
I tried a loop as follows:
import numpy as np
import glob
for file in glob.glob('./signal_*'):
    np.savez('data', file)
However, it does not give what I want as depicted above. So here I need help. Thanks in advance.

Here is another way of achieving it:
import os
import numpy as np
dirPath = './data/'  # folder where you store your data
rows = []
with os.scandir(dirPath) as entries:
    for entry in entries:  # read each file in your folder
        with open(entry.path, "r") as dataFile:
            # strip whitespace and convert each non-empty line to a float
            rows.append([float(line) for line in dataFile if line.strip()])
# one row per file; np.save writes a real binary .npy file
np.save("a.npy", np.array(rows))

You can use np.loadtxt() and np.save():
import glob
import numpy as np
a = np.array([np.loadtxt(f) for f in sorted(glob.glob('./signal_*'))])
np.save('data.npy', a)
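Note that sorted() orders the file names alphabetically, which is not always the numeric order of the extensions. The following is only a sketch of one way to handle that, assuming every file is named signal_data followed by a number and holds the same number of values. The helper name stack_signal_files and the numeric sort key are illustrative and not part of the answer above.

```python
import glob
import os

import numpy as np


def stack_signal_files(pattern, out_path):
    """Load every file matching `pattern` and save them as rows of one .npy array."""
    # Sort by the number that follows "signal_data" instead of alphabetically,
    # so that 9.5 comes before 10.5 (assumes this naming scheme).
    def numeric_suffix(path):
        return float(os.path.basename(path)[len("signal_data"):])

    files = sorted(glob.glob(pattern), key=numeric_suffix)
    # np.loadtxt returns a 1-D array per file; together they form a 2-D array
    # with one row per file.
    data = np.array([np.loadtxt(f) for f in files])
    np.save(out_path, data)
    return data
```

Calling stack_signal_files('./signal_*', 'data.npy') and then np.load('data.npy') should give back an array with one row per input file.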

Related

How to read many files have a specific format in python

I am a little confused about how to read all the lines of many files whose names range from "datalog.txt.98" to "datalog.txt.120".
This is my code:
import json
file = "datalog.txt."
i = 97
for line in file:
    i+=1
    f = open (line + str (i),'r')
    for row in f:
        print (row)
Here, you will find an example of one line in one of those files:
I really need your help.
I suggest using a loop to open multiple files whose names differ only in their numeric suffix.
To better understand this project I would recommend researching the following topics
for loops,
String manipulation,
Opening a file and reading its content,
List manipulation,
String parsing.
This is one of my favourite beginner guides.
To generate the integers at the end of the file names, I would look into Python for loops.
I think this is what you are trying to do:
# create a list to store all your file content
files_content = []
# the prefix is of type string
filename_prefix = "datalog.txt."
# loop over the suffixes 98 to 120 (range excludes its end value)
for i in range(98, 121):
    # make the filename variable with the prefix and
    # the integer i which you need to convert to a string type
    filename = filename_prefix + str(i)
    # open the file and read all the lines into a variable
    with open(filename) as f:
        content = f.readlines()
    # append the file content to the files_content list
    files_content.append(content)
To strip the surrounding whitespace from each line, add the following line before appending:
content = [x.strip() for x in content]
files_content.append(content)
Here's an example of printing out files_content
for file in files_content:
    print(file)
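The pieces above can be combined into one function. This is only a sketch: the name read_numbered_files is hypothetical, and skipping missing files is an added assumption about how gaps in the numbering should be handled.

```python
import os


def read_numbered_files(prefix, first, last):
    """Return {suffix: [stripped lines]} for the files prefix+first .. prefix+last."""
    contents = {}
    for i in range(first, last + 1):  # last + 1 because range excludes its end
        filename = f"{prefix}{i}"
        if not os.path.isfile(filename):
            continue  # skip gaps in the numbering instead of raising an error
        with open(filename) as f:
            contents[i] = [line.strip() for line in f]
    return contents
```

For the files in the question, the call would be read_numbered_files("datalog.txt.", 98, 120).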

Read multiple text files, search few strings , replace and write in python

I have tens of text files in my local directory named something like test1, test2, test3, and so on. I would like to read all these files, search for a few strings in them, replace them with other strings, and finally save the results back into my directory under names like newtest1, newtest2, newtest3, and so on.
For instance, if there was a single file, I would have done the following:
#Read the file
with open('H:\\Yugeen\\TestFiles\\test1.txt', 'r') as file :
    filedata = file.read()
#Replace the target string
filedata = filedata.replace('32-83 Days', '32-60 Days')
#write the file out again
with open('H:\\Yugeen\\TestFiles\\newtest1.txt', 'w') as file:
    file.write(filedata)
Is there any way that I can achieve this in python?
If you use Python 3 you can use scandir from the os library.
Python 3 docs: os.scandir
With that you can get the directory entries.
with os.scandir('H:\\Yugeen\\TestFiles') as it:
Then loop over these entries and your code could look something like this.
Notice I changed the path in your code to the entry object path.
import os
# Get the directory entries
with os.scandir('H:\\Yugeen\\TestFiles') as it:
    # Iterate over directory entries
    for entry in it:
        # If not a file, continue to the next iteration
        # This is not needed if you are 100% sure there are only files in the directory
        if not entry.is_file():
            continue
        # Read the file
        with open(entry.path, 'r') as file:
            filedata = file.read()
        # Replace the target string
        filedata = filedata.replace('32-83 Days', '32-60 Days')
        # Write the file out again (this overwrites the original file)
        with open(entry.path, 'w') as file:
            file.write(filedata)
If you use Python 2 you can use listdir (also applicable to Python 3).
Python 2 docs: os.listdir
In this case the code structure is the same, but you also need to build the full path to each file, since listdir only returns the filename.
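A minimal sketch of the listdir variant, which also saves under a new name as the question asks rather than overwriting. The function name replace_in_files and its prefix parameter are hypothetical, introduced only for illustration.

```python
import os


def replace_in_files(directory, old, new, prefix="new"):
    """Copy every file in `directory` to prefix+name with `old` replaced by `new`."""
    written = []
    # os.listdir returns the complete list up front, so files created inside
    # the loop are not visited again.
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)  # listdir returns bare names only
        if not os.path.isfile(path) or name.startswith(prefix):
            continue  # skip sub-directories and output from earlier runs
        with open(path, "r") as f:
            filedata = f.read()
        out_path = os.path.join(directory, prefix + name)
        with open(out_path, "w") as f:
            f.write(filedata.replace(old, new))
        written.append(out_path)
    return written
```

With the paths from the question, the call would be replace_in_files('H:\\Yugeen\\TestFiles', '32-83 Days', '32-60 Days'). The original files are left untouched.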

Python3: Index out of range for script that worked before

The attached script returns:
IndexError: list index out of range
for the line starting with values = {line.split (...)
values=dict()
with open(csv) as f:
    lines =f.readlines()
values = {line.split(',')[0].strip():line.split(',')[1].strip() for line in lines}
However, yesterday I used it to do exactly the same thing:
replacing certain text in a directory of XML files with different text.
import os
from distutils.dir_util import copy_tree
drc = 'D:/Spielwiese/00100_Arbeitsverzeichnis'
backup = 'D:/Spielwiese/Backup/'
csv = 'D:/persons1.csv'
copy_tree(drc, backup)
values=dict()
with open(csv) as f:
    lines =f.readlines()
values = {line.split(',')[0].strip():line.split(',')[1].strip() for line in lines}
#Getting a list of the full paths of files
for dirpath, dirname, filename in os.walk(drc):
    for fname in filename:
        #Joining dirpath and filenames
        path = os.path.join(dirpath, fname)
        #Opening the files for reading only
        filedata = open(path,encoding="Latin-1").read()
        for k,v in values.items():
            filedata=filedata.replace(k,v)
        f = open(path, 'w',encoding="Latin-1")
        # We are writing the changes to the files
        f.write(filedata)
        f.close() #Closing the files
print("In case something went wrong, you can find a backup in " + backup)
I don't see anything weird and, as mentioned, I could use it before ... :-o
Any ideas on how to fix it?
Best wishes,
K
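One likely cause, assuming persons1.csv now contains a blank line (for example an extra newline at the end of the file) or a line without a comma: line.split(',') then returns a single-element list, and index [1] raises the IndexError. A minimal sketch that skips such lines follows. The helper name load_replacements is hypothetical.

```python
def load_replacements(csv_path):
    """Build {old: new} from 'old,new' lines, skipping lines without a comma."""
    values = {}
    with open(csv_path) as f:
        for line in f:
            # partition never raises: sep is '' when the line has no comma
            key, sep, value = line.partition(',')
            if not sep:
                continue  # blank or malformed line
            # keep only the second field, like split(',')[1] in the question
            values[key.strip()] = value.split(',')[0].strip()
    return values
```

In the script from the question, values = load_replacements(csv) would replace the readlines() call and the dict comprehension.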

Python 3.7: Batch renaming numbered files in a directory while preserving their sequence

I'm relatively new to Python, and have only recently started trying to use it for data analysis. I have a list of image files in a directory that have been acquired in sequence, and they have been named as so:
IMG_E5.1.tif
IMG_E5.2.tif
IMG_E5.3.tif
...
...
IMG_E5.107.tif
I would like to replace the dot and the number following it with an underscore and a four-digit integer, while preserving the initial numbering of the file, like so:
IMG_E5_0001.tif
IMG_E5_0002.tif
IMG_E5_0003.tif
...
...
IMG_E5_0107.tif
Could you advise me on how this can be done, or if there is already an answer that I'm not aware of, link me to it? Many thanks!
I managed to find a method that works for this
import os
import os.path as path
from glob import glob
# Get current working directory
file_path = os.getcwd()
file_list = []
for i in range(1, 500):
    # Generate file name (with wildcards) to search for
    file_name = path.abspath(file_path + "/IMG*" + "." + str(i) + ".tif")
    # Search for files
    file = glob(file_name)
    # If found, append the first match to the list
    if file:
        file_list.append(file[0])
for file in file_list:
    # Split from the right at the last two periods, so that periods
    # in directory names are left alone
    file_name, file_num, file_ext = file.rsplit(".", 2)
    file_new = path.abspath(file_name + "_"
                            + str(file_num).zfill(4)
                            + "." + file_ext)
    os.rename(file, file_new)
I am still relatively inexperienced with coding, so if there is a more straightforward and efficient way to tackle this problem, do let me know. Thanks.
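One possibly more direct approach is to match the names with a regular expression instead of probing every number from 1 to 499. This is a sketch under the assumption that all files follow the stem.N.tif pattern shown above. The names rename_numbered and PATTERN are illustrative.

```python
import os
import re

# Matches names such as "IMG_E5.12.tif": stem, dot, digits, dot, extension.
PATTERN = re.compile(r"^(?P<stem>.+)\.(?P<num>\d+)\.(?P<ext>tif)$")


def rename_numbered(directory, width=4):
    """Rename stem.N.tif to stem_000N.tif for every matching file in `directory`."""
    renamed = []
    for name in sorted(os.listdir(directory)):
        match = PATTERN.match(name)
        if match is None:
            continue  # leave files that do not fit the pattern untouched
        new_name = "{}_{}.{}".format(
            match["stem"], match["num"].zfill(width), match["ext"]
        )
        os.rename(os.path.join(directory, name), os.path.join(directory, new_name))
        renamed.append(new_name)
    return renamed
```

Running rename_numbered(os.getcwd()) a second time changes nothing, because the renamed files no longer match the pattern.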

Python - Spyder 3 - Open a list of .csv files and remove all double quotes in every file

I've read everything I can find and tried about 20 examples from SO and Google, and nothing seems to work.
This should be very simple, but I cannot get it to work. I just want to point to a folder, and replace every double quote in every file in the folder. That is it. (And I don't know Python well at all, hence my issues.) I have no doubt that some of the scripts I've tried to retask must work, but my lack of Python skill is getting in the way. This is as close as I've gotten, and I get errors. If I don't get errors it seems to do nothing. Thanks.
import glob
import csv
mypath = glob.glob('\\C:\\csv\\*.csv')
for fname in mypath:
    with open(mypath, "r") as infile, open("output.csv", "w") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        for row in reader:
            writer.writerow(item.replace("""", "") for item in row)
You don't need to use csv-specific file opening and writing; I think that makes it more complex. How about this instead:
import os
mypath = r'\path\to\folder'
for file in os.listdir(mypath):  # This will loop through every file in the folder
    if '.csv' in file:  # Check if it's a csv file
        fpath = os.path.join(mypath, file)
        fpath_out = fpath + '_output'  # Create an output file with a similar name to the input file
        with open(fpath) as infile:
            lines = infile.readlines()  # Read all lines
        with open(fpath_out, 'w') as outfile:
            for line in lines:  # One line at a time
                outfile.write(line.replace('"', ''))  # Remove each " and write the line
Let me know if this works, and respond with any error messages you may have.
I found the solution to this based on the original answer provided by u/Jeff. The quotes were actually smart quotes (u'\u201d', to be exact), not straight quotes. That is why I could get nothing to work. That is a great way to spend like two days; now if you'll excuse me I have to go jump off the roof. But for posterity, here is what I used that worked. (And note: there is the left curving smart quote as well, which is u'\u201c'.)
import os
mypath = 'C:\\csv\\'
myoutputpath = 'C:\\csv\\output\\'
for file in os.listdir(mypath):  # This will loop through every file in the folder
    if '.csv' in file:  # Check if it's a csv file
        fpath = os.path.join(mypath, file)
        fpath_out = os.path.join(myoutputpath, file)  # Write a file of the same name into the output folder
        with open(fpath) as infile:
            lines = infile.readlines()  # Read all lines
        with open(fpath_out, 'w') as outfile:
            for line in lines:  # One line at a time
                outfile.write(line.replace(u'\u201d', ''))  # Remove each right smart quote and write the line
        # No explicit close() is needed: the with blocks close both files
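Since both curly quotes and straight quotes can turn up, a variant that removes all three in one pass may be useful. This is a sketch only: the function name strip_quotes is hypothetical, and UTF-8 is an assumed encoding that may need changing for the actual files.

```python
import os

# Map each quote character to None so that str.translate deletes it.
QUOTES = {ord(ch): None for ch in '"\u201c\u201d'}


def strip_quotes(in_dir, out_dir, encoding="utf-8"):
    """Write a copy of every .csv file in `in_dir` to `out_dir` without quotes."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for name in sorted(os.listdir(in_dir)):
        src = os.path.join(in_dir, name)
        # Only process regular files whose name ends in .csv
        if not (name.lower().endswith(".csv") and os.path.isfile(src)):
            continue
        with open(src, encoding=encoding) as infile:
            text = infile.read()
        dst = os.path.join(out_dir, name)
        with open(dst, "w", encoding=encoding) as outfile:
            outfile.write(text.translate(QUOTES))
        written.append(dst)
    return written
```

With the folders from the answer above, the call would be strip_quotes('C:\\csv\\', 'C:\\csv\\output\\').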
