Python 3 - Scipy and KDEpy - python-3.x

I am using python-3.x and I want to Evaluate the estimated pdf on a provided set of points using the KDEpy but I couldn't get it right,
I used the scipy.stats.gaussian_kde and it is fine and work very well when I apply the pdf Method as I am interested in the Evaluate the estimated pdf on a provided set of points.
so the question is how to get the same result from the scipy.stats.kde if I used the KDEpy FFTKDE
here a small example that describes what I am looking for:
from scipy.stats.kde import gaussian_kde
data = np.array([[-1.84134663, -1.42036525, -1.38819347],
[-2.58165693, -2.49423057, -1.57609454],
[-0.78776371, -0.79168188, 0.21967791],
[-1.0165618 , -1.78509185, -0.68373997],
[-1.21764947, -0.43215885, -0.34393573]])
my_pdf = gaussian_kde(data.T, bw_method = None )
my_pdf1.pdf(data.T)
print (my_pdf1.pdf(data.T)) # here we will Evaluate the estimated pdf on a provided set of points
the result is:
[0.24234078 0.22071922 0.23802877 0.22474656 0.25402297]
how to get the same result by using the KDEpy FFTKDE
from KDEpy import FFTKDE
my_pdf2 = FFTKDE(kernel="gaussian").fit(data.T).evaluate()
but I don't know how to do the Evaluate the estimated pdf on a provided set of points similar to the scipy.stats.kde with pdf method.

You can create an equidistant grid using e.g. numpy.linspace and pass it to .evaluate():
from KDEpy import FFTKDE
import numpy as np
x_grid = np.linspace(-10, 10, num=2**10)
my_pdf = FFTKDE(kernel="gaussian").fit(data.T).evaluate(x_grid)

Related

How to create a variable number of dimensions for mgrid

I would like to create a meshgrid of variable dimensions by specifying the dimensions with a variable i.e. specifying dim=2, rather than manually changing the expression as in the example below to set a 2D mesh grid.
How would I implement a wrapper function for this?
The problem stems from not being familiar with the syntax that mgrid uses (index_tricks).
import numpy as np
mgrid = np.mgrid[
-5:5:5j,
-5:5:5j,
]
Observed documentation for mgrid, but there seems to be no info on setting the number of dimensions with a variable.
You can create a tuple containing the slice manually, and repeat it some number of times:
import numpy as np
num_dims = 2
mgrid = np.mgrid[(slice(-5, 5, 5j),) * num_dims]

siphon error - 400: NETCDF4 format not supported for ANY_POINT feature typ

I'm trying to get a dataset from TDScatalog with siphon but with multiples variables show me that error or the last line. Here the code:
import siphon
from siphon.catalog import TDSCatalog
import datetime
from xarray.backends import NetCDF4DataStore
import xarray as xr
point = [-7, 41]
hours = 48
best_gfs = TDSCatalog('http://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/'
'Global_0p25deg/catalog.xml?dataset=grib/NCEP/GFS/Global_0p25deg/Best')
best_gfs.datasets
best_ds = list(best_gfs.datasets.values())[0]
ncss = best_ds.subset()
query = ncss.query()
query.lonlat_point( point[1], point[0] ).time_range(datetime.datetime.utcnow(), datetime.datetime.utcnow() + datetime.timedelta(hours))
query.accept('netcdf4')
query.variables('Temperature_surface',
'Relative_humidity_height_above_ground',
'u-component_of_wind_height_above_ground',
'v-component_of_wind_height_above_ground',
'Wind_speed_gust_surface'
)
data = ncss.get_data(query)
Thanks!
That message is because your point request is trying to return a mix of time series (your _surface variables) and time series of profiles (the u/v wind components). The combination of different features in a single netCDF file is unsupported by the netCDF CF-Conventions.
One work-around is to request CSV or XML formatted data instead (which siphon can still parse and return as a dictionary-of-arrays).
The other is to make separate requests for fields with different geometry. So one for Temperature_surface and Wind_speed_gust_surface, one for u-component_of_wind_height_above_ground and v-component_of_wind_height_above_ground, and one final one for Relative_humidity_height_above_ground. This last split is working around an apparent bug in the THREDDS Data Server where profiles with different vertical levels can't be combined either.

How to read specific keypoints in COCOEval

I need to calculate the mean average precision (mAP) of specific keypoints (and not for all keypoints, as it done by default).
Here's my code :
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
# https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
cocoGt = COCO('annotations/person_keypoints_val2017.json') # initialize COCO ground truth api
cocoDt = cocoGt.loadRes('detections/results.json') # initialize COCO pred api
cat_ids = cocoGt.getCatIds(catNms=['person'])
imgIds = cocoGt.getImgIds(catIds=cat_ids)
cocoEval = COCOeval(cocoGt, cocoDt, 'keypoints')
cocoEval.params.imgIds = imgIds
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()
print(cocoEval.stats[0])
This code prints the mAP for all keypoints ['nose', ...,'right_ankle'] but I need only for few specific keypoints like ['nose', 'left_hip', 'right_hip']
I recently solved this and evaluated only the 13 key points, leaving behind the eyes and the ears as per my application.
Just open the cocoeval.py under pycocotools, then head over to the computeOKS function, where you will encounter two sets of keypoints—ground truth keypoints—and detection keypoints, such as a NumPy array.
Make sure to do proper slicing for that 51 array size Python lists.
For example, if you wish to only check the mAP for nose, the slicing would be as follows:
g= np.array(gt['keypoints'][0:3])
Similarly, do it for a dt array.
Also, set the sigma values of those unwanted key points to 0.
You are all set!

Is there a way using librosa's waveplot to store the coordinates of the graph rather than show the image of the waveplot?

I am working on an audio project where I am using Librosa and have the following code from an example online. Rather than opening up an image with a graph of the amplitude versus time, I want to be able to store the coordinates that make up the graph in an array. I have tried a lot of different examples found on stackoverflow as well as other websites with no luck. I am relatively new to python and this is my first question on stackoverflow so please be kind.
import librosa.display
import matplotlib.pyplot as plt
from IPython.display import display, Audio
filename = 'queen2.mp3'
samples, sampleRate = librosa.load(filename)
display(Audio(filename))
plt.figure(figsize=(12, 4))
librosa.display.waveplot(y, sr=None, max_points=200)
plt.show()
librosa is open-source (under the ISC license), so you can look at the code to see how it does this. The documentation for functions has a handy [source] link which takes you do the code. For librosa.display.waveplot you will see that it calls a function __envelope() to compute the envelope. Presumably it is these coordinates you are after.
hop_length = 1
y = __envelope(y, hop_length)
y_top = y[0]
y_bottom = -y[-1]
import numpy as np
def __envelope(x, hop):
'''Compute the max-envelope of non-overlapping frames of x at length hop
x is assumed to be multi-channel, of shape (n_channels, n_samples).
'''
x_frame = np.abs(util.frame(x, frame_length=hop, hop_length=hop))
return x_frame.max(axis=1)

Reprojecting Xarray Dataset

I'm trying to reproject a Lambert Conformal dataset to Plate Carree. I know that this can easily be done visually using cartopy. However, I'm trying to create a new dataset rather than just show a reprojected image. Below is methodology I have mapped out but I'm unable to subset the dataset properly (Python 3.5, MacOSx).
from siphon.catalog import TDSCatalog
import xarray as xr
from xarray.backends import NetCDF4DataStore
import numpy as np
import cartopy.crs as ccrs
from scipy.interpolate import griddata
import numpy.ma as ma
from pyproj import Proj, transform
import metpy
# Declare bounding box
min_lon = -78
min_lat = 36
max_lat = 40
max_lon = -72
boundinglat = [min_lat, max_lat]
boundinglon = [min_lon, max_lon]
# Load the dataset
cat = TDSCatalog('https://thredds.ucar.edu/thredds/catalog/grib/NCEP/HRRR/CONUS_2p5km/latest.xml')
dataset_name = sorted(cat.datasets.keys())[-1]
dataset = cat.datasets[dataset_name]
ds = dataset.remote_access(service='OPENDAP')
ds = NetCDF4DataStore(ds)
ds = xr.open_dataset(ds)
# parse the temperature at 850 and # 0z reftime
tempiso = ds.metpy.parse_cf('Temperature_isobaric')
t850 = tempiso[0][2]
# transform bounding lat/lons to src_proj
src_proj = tempiso.metpy.cartopy_crs #aka lambert conformal conical
extents = src_proj.transform_points(ccrs.PlateCarree(), np.array(boundinglon), np.array(boundinglat))
# subset the data using the indexes of the closest values to the src_proj extents
t850_subset = t850[(np.abs(tempiso.y.values - extents[1][0])).argmin():(np.abs(tempiso.y.values - extents[1][1])).argmin()][(np.abs(tempiso.x.values - extents[0][1])).argmin():(np.abs(tempiso.x.values - extents[0][0])).argmin()]
# t850_subset should be a small, reshaped dataset, but it's shape is 0x2145
# now use nplinspace, npmeshgrid & scipy interpolate to reproject
My transform point > find nearest value subsetting isn't working. It's claiming the closest points are outside the realm of the dataset. As noted, I plan to use nplinspace, npmeshgrid and scipy interpolate to create a new, square lat/lon dataset from t850_subset.
Is there an easier way to resize & reproject an xarray dataset?
Your easiest path forward is to take advantage of xarray's ability to do pandas-like data selection; this is IMO the best part of xarray. Replace your last two lines with:
# By transposing the result of transform_points, we can unpack the
# x and y coordinates into individual arrays.
x_lim, y_lim, _ = src_proj.transform_points(ccrs.PlateCarree(),
np.array(boundinglon), np.array(boundinglat)).T
t850_subset = t850.sel(x=slice(*x_lim), y=slice(*y_lim))
You can find more information in the documentation on xarray's selection and indexing functionality. You would probably also be interested in xarray's built-in support for interpolation. And if interpolation methods beyond SciPy's are of interest, MetPy also has a suite of other interpolation methods.
We have various "regridding" methods in Iris, if that isn't too much of a context switch for you.
Xarray explains its relationship to Iris here, and provides a to_iris() method.

Resources