Adding attributes to an HDF5 file

I have this script to convert some .txt files into .hdf5 files, which are later used as input for another function.
I had implemented this before and everything was running smoothly (about two weeks ago).
name = "ecg.hdf5"
sampling_rate = 250;
ecg = np.genfromtxt('ecg.txt')
hf = h5py.File(name, 'w')
# Create subgroups and add dataset
signals = hf.create_group('signals/ECG/raw')
ecg = signals.create_dataset('ecg',data = ecg)
# max and min for plot limits
ecg_max = max(ecg)
ecg_min = min(ecg)
# Add attributes
ecg.attrs.create('json','{"name": "signal0", "resolution": 16, "labels": ["I"], "units": {"signal": {"max": %f, "min": %f}, "time": {"label": "second"}}, "sampleRate": %d, "type": "/ECG/raw/ecg"}' %(ecg_max,ecg_min,sampling_rate))
hf.close()
When I run it, I keep getting this error and can't attach the attribute:
Error adding attribute
Any idea, please?
Thanks in advance

Solved it: updating the h5py version from 2.9.0 to 2.10.0 fixed the error.
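As a side note, building the attribute value with json.dumps is a bit more robust than %-formatting the string by hand. A minimal sketch, assuming ecg_ds, ecg_max, ecg_min and sampling_rate are defined as in the script above:
import json

metadata = {
    "name": "signal0",
    "resolution": 16,
    "labels": ["I"],
    "units": {"signal": {"max": float(ecg_max), "min": float(ecg_min)},
              "time": {"label": "second"}},
    "sampleRate": sampling_rate,
    "type": "/ECG/raw/ecg",
}
# h5py stores the string as a variable-length string attribute
ecg_ds.attrs['json'] = json.dumps(metadata)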

Related

Operations on os.path.getctime

I am reading the following JSON:
{
  "Age": 15,
  "startTime": {
    "date": "06/15/2021",
    "time": "4:04 pm",
    "utcTimestamp": 1623765862
  },...
with
data = json.load(self.fName)
Dict['StartDate'] = data['startTime']['date']
Dict['StartTime'] = data['startTime']['time']
I also created a variable to track the file creation time:
fileCreationTime= datetime.fromtimestamp(os.path.getctime(fname)).strftime('%Y-%m-%d %H:%M:%S')
I am trying to find the amount of time between the time the json file was created and the "StartTime" in the json file.
I tried a few things including:
daysToUpload = datetime.fromtimestamp(os.path.getctime(fname)).strftime('%Y-%m-%d') - Dict['StartDate']
But that did not work. (unsupported operand type(s) for -: 'str' and 'str').
Maybe I can use the UTC timestamp, but:
>>> os.path.getctime(fname)
1635377313.0170193
I'm not sure how to relate that to the UTC timestamp in the JSON.
I'd like:
timeToUpload = fileCreationDate - TimeSpecifiedInJsonFile
This is Python 3.x running on Windows.
To get the time difference, you can simply use the utcTimestamp value itself, without converting it to a datetime, as follows:
fileCreationDate = int(os.path.getctime(fname)) # utc timestamp
TimeSpecifiedInJsonFile = int(data['startTime']['utcTimestamp']) # utc timestamp
timeToUpload = fileCreationDate - TimeSpecifiedInJsonFile
print(timeToUpload)  # prints the time difference in seconds
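If you want a human-readable difference, you can wrap the seconds in a timedelta. A minimal sketch, assuming the same fname and JSON structure as in the question:
import os
import json
from datetime import timedelta

with open(fname) as f:                        # fname as in the question
    data = json.load(f)

creation_ts = os.path.getctime(fname)         # seconds since the epoch
start_ts = data['startTime']['utcTimestamp']  # seconds since the epoch
timeToUpload = timedelta(seconds=creation_ts - start_ts)

print(timeToUpload)       # e.g. '134 days, 3:57:31'
print(timeToUpload.days)  # whole days between the start time and file creation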

Numpy Array value setting issues

I have a data set that spans a certain length of time, with a data point for each of these time points. I would like to create a much finer timescale and fill the empty data points with zeros. I wrote a piece of code to do this, but it isn't doing what I want. I tried a sample case, though, and that seems to work. Both pieces of code are below.
This piece of code does not do what I want it to.
import numpy as np
TD_t = np.array([36000, 36500, 37000, 37500, 38000, 38500, 39000, 39500, 40000, 40500, 41000, 41500, 42000, 42500,
43000, 43500, 44000, 44500, 45000, 45500, 46000, 46500, 47000, 47500, 48000, 48500, 49000, 49500,
50000, 50500, 51000, 51500, 52000, 52500, 53000, 53500, 54000, 54500, 55000, 55500, 56000, 56500,
57000, 57500, 58000, 58500, 59000, 59500, 60000, 60500, 61000, 61500, 62000, 62500, 63000, 63500,
64000, 64500, 65000, 65500, 66000])
TD_d = np.array([-0.05466527, -0.04238242, -0.04477601, -0.02453717, -0.01662798, -0.02548617, -0.02339215,
-0.01186576, -0.0029057 , -0.01094671, -0.0095005 , -0.0190277 , -0.01215644, -0.01997112,
-0.01384497, -0.01610656, -0.01927564, -0.02119056, -0.011634 , -0.00544096, -0.00046568,
-0.0017769 , -0.0007341, 0.00193066, 0.01359107, 0.02054919, 0.01420335, 0.01550565,
0.0132394 , 0.01371563, 0.01959774, 0.0165316 , 0.01881992, 0.01554435, 0.01409003,
0.01898334, 0.02300266, 0.03045158, 0.02869013, 0.0238423 , 0.02902356, 0.02568908,
0.02954539, 0.02537967, 0.02927247, 0.02138605, 0.02815635, 0.02733237, 0.03321588,
0.03063803, 0.03783137, 0.04110955, 0.0451221 , 0.04646263, 0.04472884, 0.04935833,
0.03372911, 0.04031406, 0.04165237, 0.03940343, 0.03805504])
time = np.arange(0, 100001, 1)
data = np.zeros_like(time)
for i in range(0, len(TD_t)):
    t = TD_t[i]
    data[t] = TD_d[i]
    print(i, t, TD_d[i], data[t])
But for some reason this code works.
import numpy
nums = numpy.array([0,1,2,3])
data = numpy.zeros_like(nums)
data[0] = nums[2]
data[0], nums[2]
Any help will be much appreciated!!
It's because the dtype of data is inherited from time and so ends up as int64; when you assign one of the small float values to an element, it gets truncated to zero.
Try changing the line to:
data = np.zeros_like(time, dtype=float)
and it should work (or use whatever dtype the TD_d array has).
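A minimal sketch of the difference, using the first value from the question's TD_d array:
import numpy as np

time = np.arange(0, 100001, 1)            # integer array
data_int = np.zeros_like(time)            # inherits the int64 dtype
data_int[36000] = -0.05466527             # truncated on assignment
print(data_int[36000])                    # 0

data_float = np.zeros_like(time, dtype=float)
data_float[36000] = -0.05466527
print(data_float[36000])                  # -0.05466527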

Folium library error in choropleth

I am using the folium library with an open data set from Kaggle:
map.choropleth(geo_path=country_geo, data=plot_data,
               columns=['CountryCode', 'Value'],
               key_on='feature.id',
               fill_color='YlGnBu', fill_opacity=0.7, line_opacity=0.2,
               legend_name=hist_indicator
               )
The above part of the code is giving me the following error:
TypeError: choropleth() got an unexpected keyword argument 'geo_path'
When I replace geo_path with geo_data I get this error:
JSONDecodeError: Expecting value: line 7 column 1 (char 6)
Is this related to "UCSanDiegoX: DSE200x Python for Data Science"? I took Cody's advice and renamed geo_path to geo_data in the call to map.choropleth.
At the GitHub repository, take care to use the raw data, which is in fact a file in GeoJSON format. The first lines should start like the snippet below:
{"type":"FeatureCollection","features":[
{"type":"Feature","properties":{"name":"Afghanistan"},"geometry":
{"type":"Polygon","coordinates":[[[61.210817,35.650072],.....
geo_path doesn't work because it is not a parameter for choropleth. You are correct in replacing it with geo_data.
Your second error is likely due to a non-existent or incorrectly formatted GeoJSON file.
From http://python-visualization.github.io/folium/docs-master/modules.html?highlight=chor# your argument for geo_data needs to be a "URL, file path, or data (json, dict, geopandas, etc) to your GeoJSON geometries".
GeoJSON formatted files follow this structure from geojson.org:
{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [125.6, 10.1]
  },
  "properties": {
    "name": "Dinagat Islands"
  }
}
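For reference, here is a minimal sketch of the same call against a recent folium release, where Map.choropleth has been replaced by the folium.Choropleth class; the file name world_countries.json is a placeholder, and plot_data and hist_indicator are assumed to be defined as in the question:
import folium

m = folium.Map(location=[30, 0], zoom_start=2)

folium.Choropleth(
    geo_data='world_countries.json',  # URL, file path, or GeoJSON dict (placeholder name)
    data=plot_data,                   # DataFrame holding the values to plot
    columns=['CountryCode', 'Value'],
    key_on='feature.id',
    fill_color='YlGnBu',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name=hist_indicator,
).add_to(m)

m.save('choropleth.html')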

How to increase resolution of gif image?

How can I increase the resolution of a GIF generated by the rgl package in R (the plot3d and movie3d functions), either externally or through R?
R Code :
MyX<-rnorm(10,5,1)
MyY<-rnorm(10,5,1)
MyZ<-rnorm(10,5,1)
plot3d(MyX, MyY, MyZ, xlab="X", ylab="Y", zlab="Z", type="s", box=T, axes=F)
text3d(MyX, MyY, MyZ, text=c(1:10), cex=5, adj=1)
movie3d(spin3d(axis = c(0, 0, 1), rpm = 4), duration=15, movie="TestMovie",
        type="gif", dir=("~/Desktop"))
Output: (animated GIF not shown)
Update
Adding this line at the beginning of the code solved the problem:
r3dDefaults$windowRect <- c(0, 100, 1400, 1400)
I don't think you can do much about the resolution of the GIF itself. As an alternative, you have to make the image much larger; when you then display it smaller, it looks better. This is untested, as a recent upgrade broke a thing or two for me, but this did work under 2.15:
par3d(windowRect = c(0, 0, 500, 500))  # make the window large
par3d(zoom = 1.1)                      # larger values make the image smaller
# you can test your settings interactively at this point
M <- par3d("userMatrix")               # save your settings to pass to the movie
movie3d(par3dinterp(userMatrix = list(M,
                                      rotate3d(M, pi, 1, 0, 0),
                                      rotate3d(M, pi, 0, 1, 0))),
        duration = 5, fps = 50,
        movie = "MyMovie")
HTH. If it doesn't quite work for you, check out the functions used and tune up the settings.

CouchDB historical view snapshots

I have a database with documents that are roughly of the form:
{"created_at": some_datetime, "deleted_at": another_datetime, "foo": "bar"}
It is trivial to get a count of non-deleted documents in the DB, assuming that we don't need to handle "deleted_at" in the future. It's also trivial to create a view that reduces to something like the following (using UTC):
[
  {"key": ["created", 2012, 7, 30], "value": 39},
  {"key": ["deleted", 2012, 7, 31], "value": 12},
  {"key": ["created", 2012, 8, 2], "value": 6}
]
...which means that 39 documents were marked as created on 2012-07-30, 12 were marked as deleted on 2012-07-31, and so on. What I want is an efficient mechanism for getting the snapshot of how many documents "existed" on 2012-08-01 (0+39-12 == 27). Ideally, I'd like to be able to query a view or a DB (e.g. something that's been precomputed and saved to disk) with the date as the key or index, and get the count as the value or document. e.g.:
[
  {"key": [2012, 7, 30], "value": 39},
  {"key": [2012, 7, 31], "value": 27},
  {"key": [2012, 8, 1], "value": 27},
  {"key": [2012, 8, 2], "value": 33}
]
This can be computed easily enough by iterating through all of the rows in the view, keeping a running counter and summing up each day as I go, but that approach slows down as the data set grows larger, unless I'm smart about caching or storing the results. Is there a smarter way to tackle this?
Just for the sake of comparison (I'm hoping someone has a better solution), here's (more or less) how I'm currently solving it (in untested ruby pseudocode):
require 'date'

def date_snapshots(rows)
  current_date = nil
  current_count = 0
  rows.inject({}) {|hash, reduced_row|
    type, *ymd = reduced_row["key"]
    this_date = Date.new(*ymd)
    if current_date
      # deal with the days where nothing changed
      (current_date.succ ... this_date).each do |date|
        key = date.strftime("%Y-%m-%d")
        hash[key] = current_count
      end
    end
    # update the counter and deal with the current day
    current_date = this_date
    current_count += reduced_row["value"] if type == "created_at"
    current_count -= reduced_row["value"] if type == "deleted_at"
    key = current_date.strftime("%Y-%m-%d")
    hash[key] = current_count
    hash
  }
end
Which can then be used like so:
rows = couch_server.db(foo).design(bar).view(baz).reduce.group_level(3).rows
date_snapshots(rows)["2012-08-01"]
An obvious small improvement would be to add a caching layer, although it isn't quite as trivial to make that caching layer play nicely with incremental updates (e.g. the changes feed).
I found an approach that seems much better than my original one, assuming that you only care about a single date:
def size_at(date=Time.now.to_date)
  ymd = [date.year, date.month, date.day]
  added = view.reduce.
    startkey(["created_at"]).
    endkey(  ["created_at", *ymd, {}]).rows.first || {}
  deleted = view.reduce.
    startkey(["deleted_at"]).
    endkey(  ["deleted_at", *ymd, {}]).rows.first || {}
  added.fetch("value", 0) - deleted.fetch("value", 0)
end
Basically, let CouchDB do the reduction for you. I didn't originally realize that you could mix and match reduce with startkey/endkey.
Unfortunately, this approach requires two hits to the DB (although those could be parallelized or pipelined). And it doesn't work as well when you want to get a lot of these sizes at once (e.g. view the whole history, rather than just look at one date).
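For readers more comfortable with raw HTTP than the Ruby client, the same two reduced range queries can be issued directly against CouchDB's view API. A minimal Python sketch, with the database, design document and view names (mydb, bar, baz) purely hypothetical:
import json
import requests

BASE = "http://localhost:5984/mydb/_design/bar/_view/baz"  # hypothetical names

def reduced_total(type_key, ymd):
    # Sum of the view's values for type_key up to and including the date ymd.
    params = {
        "reduce": "true",
        "group": "false",
        "startkey": json.dumps([type_key]),
        "endkey": json.dumps([type_key, *ymd, {}]),
    }
    rows = requests.get(BASE, params=params).json().get("rows", [])
    return rows[0]["value"] if rows else 0

def size_at(ymd):
    # documents created up to ymd minus documents deleted up to ymd
    return reduced_total("created_at", ymd) - reduced_total("deleted_at", ymd)

print(size_at([2012, 8, 1]))  # e.g. 27 for the sample data above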
