siphon error - 400: NETCDF4 format not supported for ANY_POINT feature typ - netcdf4

I'm trying to get a dataset from TDSCatalog with siphon, but when I request multiple variables the last line raises that error. Here is the code:
import siphon
from siphon.catalog import TDSCatalog
import datetime
from xarray.backends import NetCDF4DataStore
import xarray as xr
point = [-7, 41]
hours = 48
best_gfs = TDSCatalog('http://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/'
'Global_0p25deg/catalog.xml?dataset=grib/NCEP/GFS/Global_0p25deg/Best')
best_gfs.datasets
best_ds = list(best_gfs.datasets.values())[0]
ncss = best_ds.subset()
query = ncss.query()
query.lonlat_point(point[1], point[0]).time_range(datetime.datetime.utcnow(), datetime.datetime.utcnow() + datetime.timedelta(hours=hours))
query.accept('netcdf4')
query.variables('Temperature_surface',
'Relative_humidity_height_above_ground',
'u-component_of_wind_height_above_ground',
'v-component_of_wind_height_above_ground',
'Wind_speed_gust_surface'
)
data = ncss.get_data(query)
Thanks!

That message is because your point request is trying to return a mix of time series (your _surface variables) and time series of profiles (the u/v wind components). The combination of different features in a single netCDF file is unsupported by the netCDF CF-Conventions.
One work-around is to request CSV or XML formatted data instead (which siphon can still parse and return as a dictionary-of-arrays).
The other is to make separate requests for fields with different geometry. So one for Temperature_surface and Wind_speed_gust_surface, one for u-component_of_wind_height_above_ground and v-component_of_wind_height_above_ground, and one final one for Relative_humidity_height_above_ground. This last split is working around an apparent bug in the THREDDS Data Server where profiles with different vertical levels can't be combined either.
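The second work-around could be sketched roughly as below. This is only an illustration, not siphon's own recipe: the helper name fetch_by_geometry and the VARIABLE_GROUPS list are made up here, and ncss is assumed to be the object returned by best_ds.subset() in the question. The grouping itself follows the split described above.

```python
# Variable groups that share the same geometry, as described in the answer.
VARIABLE_GROUPS = [
    ["Temperature_surface", "Wind_speed_gust_surface"],
    ["u-component_of_wind_height_above_ground",
     "v-component_of_wind_height_above_ground"],
    ["Relative_humidity_height_above_ground"],
]


def fetch_by_geometry(ncss, lon, lat, start, end, groups=VARIABLE_GROUPS):
    """Issue one point request per variable group (hypothetical helper).

    `ncss` is assumed to behave like the object returned by
    `best_ds.subset()`: it offers `query()` and `get_data(query)`.
    """
    results = []
    for names in groups:
        query = ncss.query()
        query.lonlat_point(lon, lat).time_range(start, end)
        query.accept("netcdf4")
        query.variables(*names)
        results.append(ncss.get_data(query))
    return results
```

With the names from the question this would be called as fetch_by_geometry(ncss, lon, lat, start, end), giving back one result per group in the same order as VARIABLE_GROUPS.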

How to read multiple levels of JSON file through Python pandas?

I am trying to read a JSON file where the data sit at several levels, i.e. top --> inner --> innermost.
I have tried pd.json_normalize, but I don't think it is working. I have attached a screenshot. In it, the topmost level is "WTFY_Combined", and inside it there are three more levels of data. Out of those levels, I need to read "OccuMa", which is marked in yellow, and inside "OccuMa" there is another level of data, "OccuCode" and "OccuDesc". I need to read those two into two different dataframes.
I know that one way of doing this is to put those two into two separate JSON files, but in reality I will have such multi-level structures to read.
I am trying the code below:
import pandas as pd
import json as js
with open("filepath", "r") as f:
    data = js.loads(f.read())
df_flat = pd.json_normalize(data, record_path=['OccuCode'])
df_flat2 = pd.json_normalize(data, record_path=['OccuDesc'])
But it is not working: it gives a KeyError, for the obvious reason that I am not mapping the data into the dataframe properly.
import pandas as pd
import json as js
with open("filepath", "r") as f:
    json_data = js.loads(f.read())
json_data = json_data['WTYF_Combined']['Extraction']['abc']['OccuMa']
df_data = pd.DataFrame(json_data)[['OccuCode','OccuDesc']]
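The same drill-down can be tried without the real file. The sketch below uses a made-up sample (the values "A1", "Analyst" and so on are invented) and assumes that "OccuMa" holds a list of records, which the screenshot would have to confirm. The dig helper is hypothetical too; it only exists to report which key is missing instead of a bare KeyError.

```python
import json

# Hypothetical sample shaped like the structure described in the question.
raw = """{"WTYF_Combined": {"Extraction": {"abc": {"OccuMa": [
    {"OccuCode": "A1", "OccuDesc": "Analyst"},
    {"OccuCode": "B2", "OccuDesc": "Builder"}
]}}}}"""


def dig(obj, path):
    """Follow a sequence of keys into nested dicts, naming the missing key."""
    for depth, key in enumerate(path):
        if not isinstance(obj, dict) or key not in obj:
            raise KeyError("missing key %r at depth %d" % (key, depth))
        obj = obj[key]
    return obj


records = dig(json.loads(raw), ["WTYF_Combined", "Extraction", "abc", "OccuMa"])

# One list per field; each can be turned into its own dataframe later.
codes = [row["OccuCode"] for row in records]
descs = [row["OccuDesc"] for row in records]
```

If two separate dataframes are needed, pd.DataFrame({"OccuCode": codes}) and pd.DataFrame({"OccuDesc": descs}) would build them from these lists.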

seasonal_decompose : How to use seasonal_decompose:Practical Implementation for seasonal_decompose

How to use seasonal_decompose. How to deal with various errors while using seasonal_decompose. How can we practically use or implement seasonal_decompose.
Get all imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import datetime
from statsmodels.tsa.seasonal import seasonal_decompose
Prepare test data
data = {'Unix Timestamp': ['1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12'],
'Date': ['4/20/2021 0:02','4/20/2021 0:01','4/20/2021 0:00','4/19/2021 23:59','4/19/2021 23:58','4/19/2021 23:57','4/19/2021 23:56','4/19/2021 23:55','4/19/2021 23:54','4/19/2021 23:53','4/19/2021 23:52','4/19/2021 23:51','4/19/2021 23:50','4/19/2021 23:49','4/19/2021 23:48','4/19/2021 23:47','4/19/2021 23:46','4/20/2021 0:02','4/20/2021 0:01','4/20/2021 0:00','4/19/2021 23:59','4/19/2021 23:58','4/19/2021 23:57','4/19/2021 23:56','4/19/2021 23:55','4/19/2021 23:54','4/19/2021 23:53','4/19/2021 23:52','4/19/2021 23:51','4/19/2021 23:50','4/19/2021 23:49','4/19/2021 23:48','4/19/2021 23:47','4/19/2021 23:46'],
'Symbol': ['BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD'],
'Open': [55717.47,55768.94,55691.79,55777.86,55803.5,55690.64,55624.69,55651.82,55688.08,55749.28,55704.59,55779.38,55816.61,55843.69,55880.12,55890.88,0,55717.47,55768.94,55691.79,55777.86,55803.5,55690.64,55624.69,55651.82,55688.08,55749.28,55704.59,55779.38,55816.61,55843.69,55880.12,55890.88,0],
'High': [55723,55849.82,55793.15,55777.86,55823.88,55822.91,55713.02,55675.92,55730.21,55749.28,55759.27,55779.38,55835.57,55863.89,55916.47,55918.87,0,55723,55849.82,55793.15,55777.86,55823.88,55822.91,55713.02,55675.92,55730.21,55749.28,55759.27,55779.38,55835.57,55863.89,55916.47,55918.87,0],
'Low': [55541.69,55711.74,55691.79,55677.92,55773.08,55682.56,55624.63,55621.58,55641.46,55688.08,55695.42,55688.66,55769.46,55797.08,55815.99,55826.84,0,55541.69,55711.74,55691.79,55677.92,55773.08,55682.56,55624.63,55621.58,55641.46,55688.08,55695.42,55688.66,55769.46,55797.08,55815.99,55826.84,0]}
df=pd.DataFrame(data)
Perform decomposition
df_seasonal = seasonal_decompose(df)
We get our first error
ValueError: could not convert string to float:
Let's fix the above error by running the code below, which parses the Date strings into datetime objects
df['Date'] = df['Date'].apply(
lambda x : datetime.datetime.strptime(str(x),'%m/%d/%Y %H:%M')
)
Now if you run seasonal_decompose again, you will get a new error
df_seasonal = seasonal_decompose(df)
Now the new error will be
TypeError: float() argument must be a string or a number, not 'Timestamp'
To fix this error we pass one column at a time, and the column passed must be numeric. Try the decomposition using the code below
df_seasonal = seasonal_decompose(df['Open'])
Now you get a new error, as shown below
ValueError: You must specify a period or x must be a pandas object with a PeriodIndex or a DatetimeIndex with a freq not set to None
There are two solutions to this error
First solution: use the period parameter of seasonal_decompose
df_seasonal = seasonal_decompose(df['Open'], period=1) ## period=1 makes the error go away, but it is not necessarily the right value
In the code above the data has one row per minute, and hence period was set to 1. However, this need not be correct: period is the number of observations in one seasonal cycle of the input data (for example, 1440 for a daily cycle in minute data). For the complete list of freq abbreviations, see the offset aliases section of the pandas documentation.
Second solution: create a datetime index for the data along with a frequency
df = df.set_index(df.Date).asfreq('2Min') ## 'M' means months and 'S' means seconds. 2Min is used here because the data is already at 1Min frequency
df_seasonal = seasonal_decompose(df['Open']) ## here we didn't use period and freq argument
In seasonal_decompose we also have to choose the model (by default it is 'additive'). We can set the model to either 'additive' or 'multiplicative'. A rule of thumb for selecting the right model is to check in a plot whether the trend and seasonal variation are relatively constant over time, in other words, linear. If yes, we select the additive model. Otherwise, if the trend and seasonal variation increase or decrease over time, we use the multiplicative model. That means that before we call seasonal_decompose we should plot the preprocessed data over time and see if there are any trends or cycles.
Finally we could run it without error.
Another error that we might see is TypeError: Index(...) must be called with a collection of some kind, 'seasonal' was passed. This again comes from incorrect usage of seasonal_decompose, as in the example below
df_bt_decomp = seasonal_decompose(df_bt[['Open','High']],period=1) ## this is wrong because two columns are passed together, and both are value columns, not an index.
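To make the meaning of period and of the additive model more concrete, here is a hand-rolled sketch of a classical additive decomposition in plain Python. It is not the statsmodels implementation and should not be expected to reproduce its output exactly; the function name additive_decompose and the toy series are made up for illustration. The idea is: estimate the trend with a centered moving average spanning one full cycle, then average what is left over at each position within the cycle.

```python
def additive_decompose(values, period):
    """Split a series into trend + seasonal + residual (illustrative only)."""
    n = len(values)
    if period < 2 or n < 2 * period:
        raise ValueError("need period >= 2 and at least two full cycles")
    half = period // 2
    # Centered moving-average weights; an even period gets half-weight ends.
    if period % 2:
        weights = [1.0 / period] * period
    else:
        weights = [0.5 / period] + [1.0 / period] * (period - 1) + [0.5 / period]
    # The trend is undefined (None) near the edges, where the window does not fit.
    trend = [None] * n
    for t in range(half, n - half):
        window = values[t - half:t + half + 1]
        trend[t] = sum(w * v for w, v in zip(weights, window))
    # Average the detrended values at each position within the cycle.
    sums = [0.0] * period
    counts = [0] * period
    for t in range(n):
        if trend[t] is not None:
            sums[t % period] += values[t] - trend[t]
            counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    centre = sum(means) / period  # shift so the seasonal part averages to zero
    seasonal = [means[t % period] - centre for t in range(n)]
    resid = [None if trend[t] is None else values[t] - trend[t] - seasonal[t]
             for t in range(n)]
    return trend, seasonal, resid


# Toy series: a linear trend plus a pattern that repeats every 4 observations.
pattern = [1.0, -1.0, 2.0, -2.0]
series = [float(t) + pattern[t % 4] for t in range(16)]
trend, seasonal, resid = additive_decompose(series, period=4)
```

With period=4 the repeating pattern ends up in seasonal and the straight line in trend. With a period that does not match the real cycle length, part of the pattern leaks into the trend and residual instead, which is why period=1 in the first solution is only a way to get past the error.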

Python 3 - Scipy and KDEpy

I am using python-3.x and I want to evaluate the estimated pdf on a provided set of points using KDEpy, but I couldn't get it right.
I used scipy.stats.gaussian_kde and it works very well when I apply the pdf method, since I am interested in evaluating the estimated pdf on a provided set of points.
So the question is how to get the same result as scipy.stats.gaussian_kde if I use the KDEpy FFTKDE.
Here is a small example that describes what I am looking for:
import numpy as np
from scipy.stats import gaussian_kde
data = np.array([[-1.84134663, -1.42036525, -1.38819347],
[-2.58165693, -2.49423057, -1.57609454],
[-0.78776371, -0.79168188, 0.21967791],
[-1.0165618 , -1.78509185, -0.68373997],
[-1.21764947, -0.43215885, -0.34393573]])
my_pdf1 = gaussian_kde(data.T, bw_method = None )
my_pdf1.pdf(data.T)
print (my_pdf1.pdf(data.T)) # here we will Evaluate the estimated pdf on a provided set of points
the result is:
[0.24234078 0.22071922 0.23802877 0.22474656 0.25402297]
How can I get the same result using the KDEpy FFTKDE?
from KDEpy import FFTKDE
my_pdf2 = FFTKDE(kernel="gaussian").fit(data.T).evaluate()
But I don't know how to evaluate the estimated pdf on a provided set of points, as the pdf method of scipy's gaussian_kde does.
You can create an equidistant grid using e.g. numpy.linspace and pass it to .evaluate():
from KDEpy import FFTKDE
import numpy as np
x_grid = np.linspace(-10, 10, num=2**10)
my_pdf = FFTKDE(kernel="gaussian").fit(data.T).evaluate(x_grid)
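For reference, "evaluating the estimated pdf on a provided set of points" just means summing one kernel per sample at each requested point. The sketch below does this directly for 1-D data with a Gaussian kernel in plain Python. It is neither KDEpy's nor scipy's implementation: the function name and the fixed bandwidth are assumptions made for illustration, and scipy picks its bandwidth differently, so the numbers will not match the output above. KDEpy's NaiveKDE and TreeKDE may accept arbitrary evaluation points directly, which is worth checking in its documentation.

```python
import math


def gaussian_kde_at(samples, points, bandwidth):
    """Naive 1-D Gaussian KDE evaluated at arbitrary points (illustrative)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    # Normalisation so that the estimate integrates to one.
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in points
    ]


# Evaluate the estimate at the sample locations themselves.
samples = [-1.84, -2.58, -0.79, -1.02, -1.22]
density = gaussian_kde_at(samples, samples, bandwidth=0.5)
```

This costs one kernel evaluation per (sample, point) pair, which is the price of arbitrary evaluation points; a grid-based estimator avoids that cost by restricting where it evaluates.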

RuntimeWarning: divide by zero encountered in log when using pvlib

I'm using PVLib to model a PV system. I'm pretty new to coding and Python, and this is my first time using PVLib, so not surprisingly I've hit some difficulties.
Specifically, I've created the following code using the extensive readthedocs examples at http://pvlib-python.readthedocs.io/en/latest/index.html
import pandas as pd
import numpy as np
from numpy import isnan
import datetime
import pytz
# pvlib imports
import pvlib
from pvlib.forecast import GFS, NAM, NDFD, HRRR, RAP
from pvlib.pvsystem import PVSystem, retrieve_sam
from pvlib.modelchain import ModelChain
# set location (Royal Greenwich Observatory, London, UK)
latitude, longitude, tz = 51.4769, 0.0005, 'Europe/London'
# specify time range.
start = pd.Timestamp(datetime.date.today(), tz=tz)
end = start + pd.Timedelta(days=5)
periods = 8 # number of periods that the GFS model and/or the model chain allows us to forecast power output.
# specify what irradiance variables we want
irrad_vars = ['ghi', 'dni', 'dhi']
# Use Global Forecast System model. The GFS is the US model that provides forecasts for the entire globe.
fx_model = GFS() # note: gives output in 3-hourly intervals
# retrieve data in processed format (convert temps from Kelvin to Celsius, combine elements of wind speed, complete irradiance data)
# Returns pandas.DataFrame object
fx_data = fx_model.get_processed_data(latitude, longitude, start, end)
# load module and inverter specifications
sandia_modules = pvlib.pvsystem.retrieve_sam('SandiaMod')
cec_inverters = pvlib.pvsystem.retrieve_sam('cecinverter')
module = sandia_modules['SolarWorld_Sunmodule_250_Poly__2013_']
inverter = cec_inverters['ABB__PVI_3_0_OUTD_S_US_Z_M_A__240_V__240V__CEC_2014_']
# model a fixed system in the UK. 10 strings of 250W panels, with 40 panels per string. Gives a nominal 100kW array
system = PVSystem(module_parameters=module, inverter_parameters=inverter, modules_per_string=40, strings_per_inverter=10)
# use a ModelChain object to calculate modelling intermediates
mc = ModelChain(system, fx_model.location, orientation_strategy='south_at_latitude_tilt')
# extract relevant data for model chain
mc.run_model(fx_data.index, weather=fx_data)
# OTHER CODE AFTER THIS TO DO SOMETHING WITH THE DATA
Having used a lot of print() statements in the console to debug, I can see that at the final line
mc.run_model(fx_data.index....
I get the following error:
/opt/pyenv/versions/3.6.0/lib/python3.6/site-packages/pvlib/pvsystem.py:1317:
RuntimeWarning: divide by zero encountered in log
module['Voco'] + module['Cells_in_Series']*delta*np.log(Ee) +
/opt/pyenv/versions/3.6.0/lib/python3.6/site-packages/pvlib/pvsystem.py:1323:
RuntimeWarning: divide by zero encountered in log
module['C3']*module['Cells_in_Series']*((delta*np.log(Ee)) ** 2) +
As a result, when I then go on to look at the ac_power outputs, I get what looks like erroneous data (every hour with a forecast that is not NaN = 3000 W).
I'd really appreciate any help you can give as I don't know what's causing it. Maybe I'm specifying the system incorrectly?
Thanks, Matt
I think the warnings you're seeing are ok to ignore. A handful of pvlib algorithms spit out warnings due to things like 0 values at night.
I think your problem with the non-NaN values is unrelated to the warnings. Study the other modeling results (stored as mc attributes -- see documentation and source code) to see if you can track down the source of your problem.

Abaqus Python script -- Reading 'TENSOR_3D_FULL' data from *.odb file

What I want: strain values LE11, LE22, LE12 at nodal points
My script is:
#!/usr/local/bin/python
# coding: latin-1
# making the ODB commands available to the script
from odbAccess import*
import sys
import csv
odbPath = "my *.odb path"
odb = openOdb(path=odbPath)
assembly = odb.rootAssembly
# count the number of frames
NumofFrames = 0
for v in odb.steps["Step-1"].frames:
    NumofFrames = NumofFrames + 1
# create a variable that refers to the reference (undeformed) frame
refFrame = odb.steps["Step-1"].frames[0]
# create a variable that refers to the node set ‘Region Of Interest (ROI)’
ROINodeSet = odb.rootAssembly.nodeSets["ROI"]
# create a variable that refers to the reference coordinate ‘REFCOORD’
refCoordinates = refFrame.fieldOutputs["COORD"]
# create a variable that refers to the coordinates of the node
# set in the test frame of the step
ROIrefCoords = refCoordinates.getSubset(region=ROINodeSet,position= NODAL)
# count the number of nodes
NumofNodes =0
for v in ROIrefCoords.values:
    NumofNodes = NumofNodes +1
# looping over all the frames in the step
for i1 in range(NumofFrames):
    # create a variable that refers to the current frame
    currFrame = odb.steps["Step-1"].frames[i1+1]
    # create a variable that refers to the strain 'LE'
    Str = currFrame.fieldOutputs["LE"]
    ROIStr = Str.getSubset(region=ROINodeSet, position= NODAL)
    # initialize list
    list = [[]]
    # loop over all the nodes in each frame
    for i2 in range(NumofNodes):
        strain = ROIStr.values [i2]
        list.insert(i2,[str(strain.dataDouble[0])+";"+str(strain.dataDouble[1])+\
        ";"+str(strain.dataDouble[3])])
# write the list in a new *.csv file (code not included for brevity)
odb.close()
The error I get is:
strain = ROIStr.values [i2]
IndexError: Sequence index out of range
Additional info:
Details for ROIStr:
ROIStr.name
'LE'
ROIStr.type
TENSOR_3D_FULL
ROIStr.description
'Logarithmic strain components'
ROIStr.componentLabels
('LE11', 'LE22', 'LE33', 'LE12', 'LE13', 'LE23')
ROIStr.getattribute
'getattribute of openOdb(r'path to .odb').steps['Step-1'].frames[1].fieldOutputs['LE'].getSubset(position=INTEGRATION_POINT, region=openOdb(r'path to.odb').rootAssembly.nodeSets['ROI'])'
When I use the same code for VECTOR objects, like 'U' for nodal displacement or 'COORD' for nodal coordinates, everything works without a problem.
The error happens in the first loop. So, it is not the case where it cycles several loops before the error happens.
Question: Does anyone know what is causing the error in the above code?
Here is the reason you get an IndexError. Strains are (obviously) calculated at the integration points; according to the ABQ Scripting Reference Guide:
A SymbolicConstant specifying the position of the output in the element. Possible values are:
NODAL, specifying the values calculated at the nodes.
INTEGRATION_POINT, specifying the values calculated at the integration points.
ELEMENT_NODAL, specifying the values obtained by extrapolating results calculated at the integration points.
CENTROID, specifying the value at the centroid obtained by extrapolating results calculated at the integration points.
To use your code, therefore, you should request the results with position=ELEMENT_NODAL:
ROIStr = Str.getSubset(region=ROINodeSet, position=ELEMENT_NODAL)
With
ROIStr.values[0].data
you will then get an array containing the 6 independent components of your tensor.
Alternative Solution
For reading time series of results for a node set, you can use the function xyPlot.xyDataListFromField(). I noticed that this function is much faster than using odbread, and the code is also shorter. The only drawback is that you need an Abaqus license to use it (in contrast to odbread, which works with abaqus python and only needs an installed version of Abaqus, not a network license).
For your application, you should do something like:
from abaqus import *
from abaqusConstants import *
from abaqusExceptions import *
import visualization
import xyPlot
import displayGroupOdbToolset as dgo
results = session.openOdb(your_file + '.odb')
# without this, you won't be able to extract the results
session.viewports['Viewport: 1'].setValues(displayedObject=results)
xyList = xyPlot.xyDataListFromField(odb=results, outputPosition=NODAL, variable=((
'LE', INTEGRATION_POINT, ((COMPONENT, 'LE11'), (COMPONENT, 'LE22'), (
COMPONENT, 'LE33'), (COMPONENT, 'LE12'), )), ), nodeSets=(
'ROI', ))
(Of course you have to add LE13 etc.)
You will get a list of xyData
type(xyList[0])
<type 'xyData'>
containing the desired data for each node and each output. Its size will therefore be
len(xyList)
number_of_nodes*number_of_requested_outputs
where the first number_of_nodes elements of the list are LE11 at each node, then LE22, and so on.
You can then transform this into a NumPy array (after import numpy as np):
LE11_1 = np.array(xyList[0])
would be LE11 at the first node, with dimensions:
LE11_1.shape
(NumberTimeFrames, 2)
That is, for each time step you have time and output variable.
NumPy arrays are also very easy to write on text files (check out numpy.savetxt).
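Given that ordering, the flat list can be regrouped per component with plain slicing. The helper below is a hypothetical sketch (the name group_by_component is not part of the Abaqus API) and it relies entirely on the ordering described above, so it is worth checking against a couple of known nodes before trusting it.

```python
def group_by_component(xy_list, components, number_of_nodes):
    """Split a flat, component-major list into {component: per-node items}.

    Assumes the order described above: all nodes for the first component,
    then all nodes for the second one, and so on.
    """
    expected = len(components) * number_of_nodes
    if len(xy_list) != expected:
        raise ValueError("expected %d items, got %d" % (expected, len(xy_list)))
    grouped = {}
    for i, name in enumerate(components):
        start = i * number_of_nodes
        grouped[name] = list(xy_list[start:start + number_of_nodes])
    return grouped


# Stand-in strings instead of real xyData objects, for two nodes.
demo = ["n1_LE11", "n2_LE11", "n1_LE22", "n2_LE22"]
by_component = group_by_component(demo, ["LE11", "LE22"], number_of_nodes=2)
```

With the real data this would be called with xyList, the component labels in the order they were requested, and the number of nodes in the 'ROI' set.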
