I am using "arch" package of python . I am fitting a GARCH(1,1) model with mean model ARX. After the fitting, we can call the conditional volatility directly. However, I don't know how to call the modeled conditional mean values
Any help?
If you use the attribute resid you can compute fitted values. For example
import datetime as dt
import pandas_datareader.data as web
st = dt.datetime(1990,1,1)
en = dt.datetime(2016,1,1)
data = web.get_data_yahoo('^GSPC', start=st, end=en)
returns = 100 * data['Adj Close'].pct_change().dropna()
from arch import arch_model
am = arch_model(returns, mean='AR',lags=1)
res = am.fit(update_freq=5)
fitted = returns - res.resid
fitted.plot()
Related
data=pd.read_csv("https://raw.githubusercontent.com/sharmaroshan/Online-Shoppers-Purchasing- Intention/master/online_shoppers_intention.csv")
I'm trying to perform feature selection based on ANOVA (Categorical vs Numerical variable).
dependent variable: Revenue
independent variable: Administrative,Administrative_Duration
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
model = ols('Revenue ~ Informational',data = data).fit()
anova_table=anova_lm(model)
But getting an following error,
Value Error(Shape issue)
The problem is related to the column Revenue in data, because it is boolean.
In fact, if you convert from boolean to integer then it works:
data.Revenue = data.Revenue.astype(int)
I am running a panel reggression using Python linearmodels, something like:
import pandas as pd
from linearmodels.panel import PanelOLS
data = pd.read_csv('data.csv', sep=',')
data = data.set_index(['panel_id', 'date'])
controls = ['A','B','C']
controls['const'] = 1
model = PanelOLS(data.Y, controls, entity_effects= True)
result = model.fit(use_lsdv=True)
I really need to pull out the coefficient on the constant, but looks like this would not work
intercept = result.summary.const
Could not really find the answer in
linearmodels' documentation on github
More generally, does anyone know how to pull out the estimate coefficients from the linearmodels summary? Thank you!
result.params['const']
would give the intercept, in general result.params gives the series of regression coefficients in linearmodels
I'm trying to reproject a Lambert Conformal dataset to Plate Carree. I know that this can easily be done visually using cartopy. However, I'm trying to create a new dataset rather than just show a reprojected image. Below is methodology I have mapped out but I'm unable to subset the dataset properly (Python 3.5, MacOSx).
from siphon.catalog import TDSCatalog
import xarray as xr
from xarray.backends import NetCDF4DataStore
import numpy as np
import cartopy.crs as ccrs
from scipy.interpolate import griddata
import numpy.ma as ma
from pyproj import Proj, transform
import metpy
# Declare bounding box
min_lon = -78
min_lat = 36
max_lat = 40
max_lon = -72
boundinglat = [min_lat, max_lat]
boundinglon = [min_lon, max_lon]
# Load the dataset
cat = TDSCatalog('https://thredds.ucar.edu/thredds/catalog/grib/NCEP/HRRR/CONUS_2p5km/latest.xml')
dataset_name = sorted(cat.datasets.keys())[-1]
dataset = cat.datasets[dataset_name]
ds = dataset.remote_access(service='OPENDAP')
ds = NetCDF4DataStore(ds)
ds = xr.open_dataset(ds)
# parse the temperature at 850 and # 0z reftime
tempiso = ds.metpy.parse_cf('Temperature_isobaric')
t850 = tempiso[0][2]
# transform bounding lat/lons to src_proj
src_proj = tempiso.metpy.cartopy_crs #aka lambert conformal conical
extents = src_proj.transform_points(ccrs.PlateCarree(), np.array(boundinglon), np.array(boundinglat))
# subset the data using the indexes of the closest values to the src_proj extents
t850_subset = t850[(np.abs(tempiso.y.values - extents[1][0])).argmin():(np.abs(tempiso.y.values - extents[1][1])).argmin()][(np.abs(tempiso.x.values - extents[0][1])).argmin():(np.abs(tempiso.x.values - extents[0][0])).argmin()]
# t850_subset should be a small, reshaped dataset, but it's shape is 0x2145
# now use nplinspace, npmeshgrid & scipy interpolate to reproject
My transform point > find nearest value subsetting isn't working. It's claiming the closest points are outside the realm of the dataset. As noted, I plan to use nplinspace, npmeshgrid and scipy interpolate to create a new, square lat/lon dataset from t850_subset.
Is there an easier way to resize & reproject an xarray dataset?
Your easiest path forward is to take advantage of xarray's ability to do pandas-like data selection; this is IMO the best part of xarray. Replace your last two lines with:
# By transposing the result of transform_points, we can unpack the
# x and y coordinates into individual arrays.
x_lim, y_lim, _ = src_proj.transform_points(ccrs.PlateCarree(),
np.array(boundinglon), np.array(boundinglat)).T
t850_subset = t850.sel(x=slice(*x_lim), y=slice(*y_lim))
You can find more information in the documentation on xarray's selection and indexing functionality. You would probably also be interested in xarray's built-in support for interpolation. And if interpolation methods beyond SciPy's are of interest, MetPy also has a suite of other interpolation methods.
We have various "regridding" methods in Iris, if that isn't too much of a context switch for you.
Xarray explains its relationship to Iris here, and provides a to_iris() method.
I am trying to solve a maximization problem using Pyomo which has a recursive relationship. I am trying to maximize the revenue from a battery and it involves updating the state of charge of the battery every hour (which is the recursive relationship here). I am using the following code:
import pyomo
import numpy as np
from pyomo.environ import *
import pandas as pd
model = ConcreteModel()
N = 24 #number of hours
lmpdata = np.random.randint(1,10,24) #LMP Data (to be imported from MISO/PJM)
R = 0 #discount
eta_s = 0.99 #self-discharge efficiency
eta_c = 0.95 #round-trip efficiency
gammas_min = 0.1 #fraction of energy capacity to reserve for discharging
gammas_max = 0.05 #fraction of energy capacity to reserve for charging
S_bar = 50 #energy capacity
Q_bar = 50 #energy charge/discharge rating
model.qd = Var(range(N), within = NonNegativeReals) #variables for energy sold at time t
model.qr = Var(range(N), within = NonNegativeReals) #variables for energy purchased at time t
model.obj = Objective(expr = sum((model.qd[i]-model.qr[i])*lmpdata[i]*np.exp(-R*(i+1)) for i in range(N)), sense = maximize) #objective function
model.SOC = np.zeros(N) #state of charge (s(t) in Sandia's Model)
model.SOC[0] = 25 #SOC at hour 0
#recursion relation describing the SOC
def con_rule1(model,i):
model.SOC[i] = eta_s*model.SOC[i-1] + eta_c*model.qr[i-1] - model.qd[i-1]
return (eta_s*model.SOC[i-1] + eta_c*model.qr[i-1] - model.qd[i-1]== model.SOC[i])
#def con_rule1(model,i):
model.con1 = Constraint(range(1,N), rule = con_rule1)
#model.con2 = Constraint(expr = eta_s*SOC[N-1] + eta_c*model.qr[N-1] - model.qd[N-1] == SOC[0]) #SOC relation for the last hour
#SOC boundaries
def con_rule2(model,i):
return (gammas_min*S_bar <= eta_s*model.SOC[i] + eta_c*model.qr[i] - model.qd[i] <= (1-gammas_max)*S_bar)
model.con3 = Constraint(range(N), rule = con_rule2)
#limits the total energy charged over each time step to the energy
#charge limit (derived from the power limit)
#It restricts the throughput based on the power rating
def con_rule3(model,i):
return (0 <= model.qr[i]+model.qd[i] <= Q_bar)
model.con4 = Constraint(range(N),rule = con_rule3)
def pyomo_postprocess(options=None, instance=None, results=None):
model.qd.display()
model.qr.display()
model.pprint()
However, when I try to run the code, I am getting the following error:
Implicit conversion of Pyomo NumericValue type `<class 'pyomo.core.kernel.expr_coopr3._SumExpression'>' to a float is
disabled. This error is often the result of using Pyomo components as
arguments to one of the Python built-in math module functions when
defining expressions. Avoid this error by using Pyomo-provided math
functions.
I could not find any reference to Pyomo's math function in its documentation. It would be great if anyone could help me solve this problem!
Pyomo defines its own set of math module functions for operations like exp, log, sin, etc. If you want to use any of these functions in your Pyomo expressions you should make sure they are the ones provided by Pyomo and not from some other Python package. I think the issue with your model is that you are using np.exp in your Objective function. The Pyomo math functions are automatically imported when you import pyomo.environ so you should be able to replace np.exp with exp to get the Pyomo-defined function.
According to the documentation and other SO questions, ElasticNetCV accepts multiple output regression. When I try it, though, it fails. Code:
from sklearn import linear_model
import numpy as np
import numpy.random as rnd
nsubj = 10
nfeat_train = 5
nfeat_predict = 20
x = rnd.random((nsubj, nfeat_train))
y = rnd.random((nsubj, nfeat_predict))
lm = linear_model.LinearRegression()
lm.fit(x,y) # works
el = linear_model.ElasticNetCV()
el.fit(x,y) # fails
Error message:
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
This is with scikit-learn version 0.14.1. Is this a mismatch between the documentation and implementation?
You may want to take a look at sklearn.linear_model.MultiTaskElasticNetCV. But beware, this object assumes that your multiple targets share features. Thus, a feature is either active for all tasks (with variable activation for each, which can be small), or active for none of them. Before using this object, make sure this is the functionality you need.