Append to or create an HDF5 file in Julia

I don't seem to be able to create a file in "r+" mode in Julia:

julia> using HDF5

julia> fid = h5open("/tmp/test.h5", "r+")
...
ERROR: Cannot access file /tmp/test.h5
...
However:
julia> fid = h5open("/tmp/test.h5", "w")
HDF5 data file: /tmp/test.h5
Is this the intended behaviour? If so, what is the right way to append to an HDF5 file, and create if it doesn't exist?
My attempt:
close(h5open("/tmp/test.h5", "w"))  # looks ugly to me

for dataset in ["A", "B", "C"]
    A = long_operation_which_returns_lots_of_data()
    h5open("/tmp/test.h5", "r+") do file
        write(file, "group/$dataset", A)
    end
end
EDIT: In my scenario, each loop iteration takes a long time to compute and generates a lot of data, which stays in memory. Writing to file at each iteration and clearing the object from memory is therefore necessary.

First, the close(h5open(...))  # looks ugly line from the question WILL clobber (i.e. delete the contents of) any existing file.
A workaround for appending could be to check for file existence using isfile. Like:
h5open("/tmp/test.h5",isfile("/tmp/test.h5") ? "r+" : "w") do file
write(file,"group/J",[10,11,12,13])
end
You could also try a try/catch block:
f = try
    h5open("/tmp/non.h5", "r+")
catch e
    if isa(e, ErrorException)
        h5open("/tmp/non.h5", "w")
    else
        throw(e)
    end
end
In any case, extra ugliness can be safely tucked away in a function and away from the main flow.
When opening a non-existing file there are some error messages from the HDF5 C library. IIRC there is a method to turn them off.

Related

Reading a list of tuples from a text file in python

I am reading a text file and I want to read a list of tuples so that I can add another tuple to it in my program and write that appended tuple back to the text file.
Example in the file
[('john', 'abc')]
Want to write back to the file as
[('john', 'abc'), ('jack', 'def')]
However, whenever I write back to the file, the appended tuple ends up in double quotes along with the square brackets. I just want it to appear as above.
You can write a reusable function that takes two parameters, file_path (the file you want to write to) and tup (the tuple you want to append), and put your logic inside it. Later you can call this function with the proper data and it will do the job for you.
Note: Don't forget to read the documentation given as comments in the code.
tuples.txt (Before writing)
[('john', 'abc')]
Code
def add_tuple_to_file(file_path, tup):
    with open(file_path, 'r+') as f:
        content = f.read().strip()  # read content from file and remove surrounding whitespace
        tuples = eval(content)      # convert string-format tuple to the original tuple object (not possible using json.loads())
        tuples.append(tup)          # append new tuple `tup` to the old list
        f.seek(0)                   # after reading, the file pointer is at the end of the file; place it back at the beginning
        f.truncate()                # truncate file (erase old content)
        f.write(str(tuples))        # write back the updated list

# Try
add_tuple_to_file("./tuples.txt", ('jack', 'def'))
tuples.txt (After writing back)
[('john', 'abc'), ('jack', 'def')]
References
https://www.geeksforgeeks.org/python-ways-to-convert-string-to-json-object/
How to open a file for both reading and writing?
You can use ast.literal_eval to get the list object from the string.
import ast
s = "[('john', 'abc')]"
o = ast.literal_eval(s)
print(repr(o)==s)
o.append(('jack', 'def'))
newstr = repr(o)
print(newstr)
Here it is in action.
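As a minimal sketch of how this slots into the first answer's file handling, with ast.literal_eval replacing eval (the helper name just mirrors the first answer and is illustrative):

import ast

def add_tuple_to_file(file_path, tup):
    with open(file_path, 'r+') as f:
        tuples = ast.literal_eval(f.read().strip())  # safely parse the stored list of tuples
        tuples.append(tup)                           # append the new tuple
        f.seek(0)                                    # move back to the start of the file
        f.truncate()                                 # erase the old content
        f.write(repr(tuples))                        # write the updated list back

add_tuple_to_file("./tuples.txt", ('jack', 'def'))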

Can't iterate through a csv.reader object when turning it to a list via a different variable

I'm doing some basic CSV files manipulation when I stumbled upon this problem, now I know how to avoid it but I just want to have a better understanding of what is going on.
Whenever I iterate over the csv.reader object directly, it works fine. However, whenever I first turn the object into a list via another variable, the loop over the csv.reader object no longer produces anything.
import csv

def checkPlayer(discordName, playerName):
    with open('PLAYERS.csv', 'r') as fil:
        r = csv.reader(fil)
        l = list(r)
        lineNum = 1
        for line in r:
            print(line)
The l = list(r) line is what stops the loop from printing anything.
The code below works fine, and the loop executes normally.
def checkPlayer(discordName, playerName):
    with open('PLAYERS.csv', 'r') as fil:
        r = csv.reader(fil)
        lineNum = 1
        for line in r:
            print(line)
I expect the reason this happens is that turning the csv.reader into a list iterates over the object, which leaves the csv.reader at its end point before the loop executes.
The csv.reader represents the underlying file (it only adds the CSV structure on top of the raw file). Therefore, it can be read only once, from beginning to end, and it returns nothing once you reach the end of the file.
When you run l = list(r) you read the whole contents of the file represented by the handle r and reach the end of the underlying file. Any further iteration over the handle (for line in r:) therefore starts at the end of the file, and you read nothing.
To fix this you can rewind the file to the beginning: fil.seek(0) before reading from the file for the second time:
with open('PLAYERS.csv', 'r') as fil:
    r = csv.reader(fil)
    l = list(r)
    lineNum = 1
    fil.seek(0)
    for line in r:
        print(line)
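Alternatively, since l already holds every parsed row, you can skip the rewind and iterate over the list instead (a minimal sketch using the same PLAYERS.csv from the question):

import csv

with open('PLAYERS.csv', 'r') as fil:
    rows = list(csv.reader(fil))   # read and parse the whole file once

for line in rows:                  # the in-memory list can be iterated as many times as needed
    print(line)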

Reading file and getting values from a file. It shows only first one and others are empty

I am reading a file using with open in Python and then doing all the other operations inside the with block. When I call the function, only the first operation inside the block produces a value, while the others are empty. I can make this work with another approach such as readlines, but I have not figured out why this version does not work. I thought the reason might be that the file gets closed, but with open takes care of that. Could anyone please suggest what's wrong?
def read_datafile(filename):
    with open(filename, 'r') as f:
        a = [lines.split("\n")[0] for number, lines in enumerate(f) if number == 2]
        b = [lines.split("\n")[0] for number, lines in enumerate(f) if number == 3]
        c = [lines.split("\n")[0] for number, lines in enumerate(f) if number == 2]
        return a, b, c

read_datafile('data_file_name')
I only get values for a, and all the others are empty. When a is commented out, I get a value for b and the others are empty.
Updates
The file looks like this:
-0.6908270760153553 -0.4493128078936575 0.5090918714784820
0.6908270760153551 -0.2172871921063448 0.5090918714784820
-0.0000000000000000 0.6666999999999987 0.4597549674638203
0.3097856229862140 -0.1259623621214220 0.5475896447896115
0.6902143770137859 0.4593623621214192 0.5475896447896115
The construct
with open(filename) as handle:
    a = [line for line in handle if condition]
    b = [line for line in handle]
will always return an empty b because the iterator in a already consumed all the data from the open filehandle. Once you reach the end of a stream, additional attempts to read anything will simply return nothing.
If the input is seekable, you can rewind it and read all the same lines again; or you can close it (explicitly, or implicitly by leaving the with block) and open it again - but a much more efficient solution is to read it just once, and pick the lines you actually want from memory. Remember that reading a byte off a disk can easily take several orders of magnitude more time than reading a byte from memory. And keep in mind that the data you read could come from a source which is not seekable, such as standard output from another process, or a client on the other side of a network connection.
def read_datafile(filename):
    with open(filename, 'r') as f:
        lines = [line for line in f]
    a = lines[2]
    b = lines[3]
    c = lines[2]
    return a, b, c
If the file could be too large to fit into memory at once, you end up with a different set of problems. Perhaps in this scenario, where you only seem to want a few lines from the beginning, only read that many lines into memory in the first place.
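A minimal sketch of that idea using itertools.islice, which stops reading after the first few lines (the count of 4 here just matches the indices used above):

from itertools import islice

def read_datafile(filename):
    with open(filename, 'r') as f:
        lines = list(islice(f, 4))   # read only the first 4 lines into memory
    return lines[2], lines[3], lines[2]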
What exactly are you trying to do with this script? The lines variable here may not contain what you want: it will contain a single line, because the file gets enumerated line by line.

add new row to numpy using realtime reading

I am using a microstacknode accelerometer and intend to save its readings into a CSV file.
while True:
    numpy.loadtxt('foo.csv', delimiter=",")
    raw = accelerometer.get_xyz(raw=True)
    g = accelerometer.get_xyz()
    ms = accelerometer.get_xyz_ms2()
    a = numpy.asarray([[raw['x'], raw['y'], raw['z']]])
    numpy.savetxt("foo.csv", a, delimiter=",", newline="\n")
However, only one line ever ends up saved. Any help? I'm still quite a noobie at Python.
NumPy is not the best solution for this type of thing.
This should do what you intend:
while True:
    raw = accelerometer.get_xyz(raw=True)
    fobj = open('foo.csv', 'a')
    fobj.write('{},{},{}\n'.format(raw['x'], raw['y'], raw['z']))
    fobj.close()
Here fobj = open('foo.csv', 'a') opens the file in append mode. So if the file already exists, the next write will go to the end of the file, keeping the existing data in place.
Let's have a look at your code. This line:
numpy.loadtxt('foo.csv', delimiter=",")
reads the whole file but does not do anything with the data it reads, because you don't assign the result to a variable. You would need to do something like this:
data = numpy.loadtxt('foo.csv', delimiter=",")
This line:
numpy.savetxt("foo.csv",a,delimiter=",",newline="\n")
creates a new file with the name foo.csv, overwriting the existing one. Therefore, you see only one line, the last one written.
This should do the same but does not open and close the file all the time:
with open('foo.csv', 'a') as fobj:
    while True:
        raw = accelerometer.get_xyz(raw=True)
        fobj.write('{},{},{}\n'.format(raw['x'], raw['y'], raw['z']))
The with open() opens the file with the promise to close it even in case of an exception, for example if you break out of the while True loop with Ctrl-C.
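If you want to stay with NumPy anyway, a possible workaround (just a sketch, not tested against your setup) is to pass numpy.savetxt an already-open file handle in append mode, so each call adds a row instead of overwriting the file:

import numpy

with open('foo.csv', 'ab') as f:    # 'ab' = append in binary mode, which savetxt accepts
    while True:
        raw = accelerometer.get_xyz(raw=True)             # accelerometer object from your code
        a = numpy.asarray([[raw['x'], raw['y'], raw['z']]])
        numpy.savetxt(f, a, delimiter=",")                # appends one row per iteration
        f.flush()                                         # push the row to disk right away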

Comparing input against .txt and receiving error

I'm trying to compare a user's input with a .txt file, but they are never equal. The .txt file contains the number 12. When I check what the .txt is, it prints out as
<_io.TextIOWrapper name='text.txt' encoding='cp1252'>
my code is
import vlc

a = input("test ")
rflist = open("text.txt", "r")
print(a)
print(rflist)
if rflist == a:
    p = vlc.MediaPlayer('What Sarah Said.mp3')
    p.play()
else:
    print('no')
So am I doing something wrong with my open(), or is it something else entirely?
To print the contents of the file instead of the file object, try
print(rflist.read())
instead of
print(rflist)
A file object is not the text contained in the file itself, but rather a wrapper object that facilitates operations on the file, like reading its contents or closing it.
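Applying that to the comparison itself, here is a sketch of the corrected snippet (the vlc usage and file name are taken from the question):

import vlc

a = input("test ")
with open("text.txt", "r") as rflist:
    contents = rflist.read().strip()   # the text stored in the file, e.g. "12"

print(a)
print(contents)

if contents == a:                      # compare two strings, not a string and a file object
    p = vlc.MediaPlayer('What Sarah Said.mp3')
    p.play()
else:
    print('no')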
rflist.read() or rflist.readline() is correct.
Read the documentation section 7.2
Dive Into Python is a fantastic book for getting started with Python. Take a look at it and you won't be able to put it down.
