Efficient interpolation implementation

I'm writing an interpolation method without using library functions that do it directly. The signature of the function is:
def interpolate(self, f: callable, a: float, b: float, n: int) -> callable:
    """
    Parameters
    ----------
    f : callable
        The function to interpolate.
    a : float
        Beginning of the interpolation range.
    b : float
        End of the interpolation range.
    n : int
        Maximal number of points to use.

    Returns
    -------
    The interpolating function.
    """
Right now my implementation is a straightforward Lagrange interpolation, as explained here: https://www.codesansar.com/numerical-methods/python-program-lagrange-interpolation-method.htm
However, this kind of implementation is O(n^2), and I'm looking for a more efficient solution that runs in O(n).
(Maybe Bézier curves can help here somehow?)

I was doing the same thing and thought of NumPy elementwise calculation; here is the code that addresses your question:
for xi, yi in zip(x, y):
    yp += yi * np.prod((xp - x[x != xi]) / (xi - x[x != xi]))
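For context, here is a minimal self-contained sketch of how that line fits into a full evaluation; the node arrays x and y and the query point xp are assumptions for illustration, not part of the original snippet:

import numpy as np

# Hypothetical sample data: distinct nodes x and function values y.
x = np.linspace(0.0, 1.0, 5)
y = np.sin(2 * np.pi * x)

xp = 0.3   # query point at which to evaluate the interpolant
yp = 0.0
for xi, yi in zip(x, y):
    # Lagrange basis polynomial for node xi, evaluated at xp.
    yp += yi * np.prod((xp - x[x != xi]) / (xi - x[x != xi]))
print(yp)  # close to sin(2*pi*0.3), up to interpolation error

Note that evaluating all n basis products this way still costs O(n^2) per query point; it vectorizes the inner loop over nodes but does not change the asymptotic complexity.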

Related

Implementing alternative Fibonacci sequence

So I'm struggling with Question 3. I think the representation of L would be a function that goes something like this:
import numpy as np

def L(a, b):
    # L is a 2x2 matrix, that is
    return np.dot([[0, 1], [1, 1]], [a, b])

def fibPow(n):
    if n == 1:
        return L(0, 1)
    if n % 2 == 0:
        return np.dot(fibPow(n / 2), fibPow(n / 2))
    else:
        return np.dot(L(0, 1), np.dot(fibPow(n // 2), fibPow(n // 2)))
Given b, I'm pretty sure I'm wrong. What should I be doing? Any help would be appreciated. I don't think I'm supposed to use the golden-ratio property of the Fibonacci series. What should my a and b be?
EDIT: I've updated my code. For some reason it doesn't work. L will give me the right answer, but my exponentiation seems to be wrong. Can someone tell me what I'm doing wrong?
With the edited code, you are almost there. Just don't cram everything into one function; that leads to subtle mistakes, which I think you may enjoy finding.
Now, L is not a function. As I said before, it is a matrix, and the core of the problem is to compute its nth power. Consider:
L = [[0, 1], [1, 1]]

def nth_power(matrix, n):
    if n == 1:
        return matrix
    if n % 2 == 0:
        temp = nth_power(matrix, n // 2)   # integer division, so n stays an int
        return np.dot(temp, temp)
    else:
        temp = nth_power(matrix, n // 2)
        return np.dot(matrix, np.dot(temp, temp))

def fibPow(n):
    # L^n applied to (0, 1) yields (F(n), F(n+1)); take the second entry.
    Ln = nth_power(L, n)
    return np.dot(Ln, [0, 1])[1]
The nth_power is almost identical to your approach, with some trivial optimization. You may optimize it further by eliminating recursion, as in the sketch below.
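For instance, a minimal iterative sketch of the same squaring idea (one possible way to eliminate the recursion, not the only one):

import numpy as np

def nth_power_iter(matrix, n):
    # Exponentiation by squaring, iteratively: walk the bits of n.
    result = np.identity(2, dtype=np.int64)
    base = np.asarray(matrix, dtype=np.int64)
    while n > 0:
        if n & 1:                      # odd bit: multiply the current square in
            result = np.dot(result, base)
        base = np.dot(base, base)      # square
        n >>= 1
    return result

Keep in mind that int64 entries overflow beyond roughly F(92); switching to Python-int arithmetic (dtype=object, or plain nested lists) avoids that.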
First things first: there is no L(n, a, b). There is just L(a, b), a well-defined linear operator which transforms a vector (a, b) into the vector (b, a+b).
Now a huge hint: a linear operator is a matrix (in this case 2x2, and very simple). Can you spell it out?
Now, applying this matrix n times in a row to an initial vector (in this case (0, 1)) is, by matrix magic, equivalent to applying the nth power of L once to the initial vector. This is what Question 2 is about.
Once you determine what this matrix looks like, fibPow reduces to computing its nth power and multiplying the result by (0, 1). To get O(log n) complexity, check out exponentiation by squaring.
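Spelling the hint out, a sketch with plain Python integers (so large n cannot overflow), assuming n >= 1:

def mat_mult(A, B):
    # 2x2 integer matrix product.
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def mat_pow(M, n):
    # Exponentiation by squaring: O(log n) matrix multiplications.
    if n == 1:
        return M
    half = mat_pow(M, n // 2)
    sq = mat_mult(half, half)
    return sq if n % 2 == 0 else mat_mult(M, sq)

def fib(n):
    # L = [[0,1],[1,1]] sends (a, b) to (b, a+b); applying L^n to (0, 1)
    # gives (F(n), F(n+1)), so the top-right entry of L^n is F(n).
    return mat_pow([[0, 1], [1, 1]], n)[0][1]

print(fib(10))  # 55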

Bachelier Normal Implied Vol Python Calculation (Help) Jäckel

Writing a Python script to calculate implied normal vol, in line with the Jäckel article (the industry standard):
https://jaeckel.000webhostapp.com/ImpliedNormalVolatility.pdf
They say they are using the inverse of a generalized incomplete gamma function.
For a call: find the x that satisfies G(x) = v/(K - F), where G is the inverse incomplete gamma function, x = (K - F)/(sigma*sqrt(T)), and v is the value of the call.
For that x, the implied vol is sigma = (K - F)/(x*sqrt(T)).
Example I am working with:
F = 40
K = 38
T = 100/365
v = 5.25
Vol = 20%
Using the equations, I should be able to back out the vol of 20%.
Scipy has the inverses of the lower and upper incomplete gamma functions among its special functions:
Lower: scipy.special.gammaincinv(a, y) (a must be a positive parameter)
Upper: scipy.special.gammainccinv(a, y) (a must be a positive parameter)
Implementation:
import sympy
from scipy import special, optimize

SIG = sympy.symbols('SIG')
F = 40
T = 100/365
K = 38

def Objective(sig):
    SIG = sig
    return (special.gammaincinv(.5, ((F - K)**2)/(2*T*SIG**2))
            + special.gammainccinv(.5, ((F - K)**2)/(2*T*SIG**2))
            + 5.25/(K - F))

x = optimize.brentq(Objective, -20.00, 20.00, args=(), xtol=1.48e-8,
                    rtol=1.48e-8, maxiter=1000, full_output=True)
IV = (K - F)/x * T**.5
print(IV)
I know I am wrong, but where am I going wrong, how do I fix it, and how do I use what I read in the article?
Did you also post this on the Quantitative Finance Stack Exchange? You may get a better response there.
This is not my field, but it looks like your main problem is that brentq requires the passed Objective function to return values with opposite signs at the two endpoints -20 and 20. That will not end up happening here because, according to the scipy docs, gammaincinv and gammainccinv always return a value between 0 and infinity.
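To see that requirement in isolation, here is a toy sketch unrelated to the volatility formulas; brentq succeeds only when the objective changes sign between the endpoints:

from scipy import optimize

f = lambda x: x**2 - 4
print(optimize.brentq(f, 0.0, 5.0))  # 2.0, since f(0) < 0 and f(5) > 0

# A strictly positive objective never changes sign, so brentq raises:
# optimize.brentq(lambda x: x**2 + 1, -20.0, 20.0)
# ValueError: f(a) and f(b) must have different signs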
I'm not sure how to fix this, unfortunately. Did you try implementing the analytic solution (rather than iterative root finding) in the second part of the paper?

How can I prevent Python from automatically converting int to float when I use pow and mod?

I am trying to encrypt with RSA using the formula c = m^p mod q.
The problem is that if the number is too large, Python 3 converts it to float when doing the modulo.
I tried to stop the conversion by casting to int:
c = int(int(pow(n, p)) % q)
The problem is that when p is too big, the result automatically has decimals, and Python thinks I am trying to convert an int to a float, which leads to this:
OverflowError: int too large to convert to float
Is there a way to solve this?
This may not solve your problem, but it does address the specific concerns you put forth in your question and suggests possible causes based on what you've told us.
The problem you're having isn't with %. As per the documentation,
The floor division and modulo operators are connected by the following identity: x == (x//y)*y + (x%y).
Given integer x and y, (x//y)*y is always an exact integer, so x - (x//y)*y == x%y must also be an integer.
Since you said you are using the built-in pow function, I suspect that your problem is that your inputs are floats instead of ints. In that case, both pow and ** will try to convert the other argument to float, which could be the source of your error. If this is the case, wrapping each argument in int will make the error go away, but your RSA implementation will be incorrect.
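A minimal sketch of both points, with made-up numbers: with genuine int inputs, % and pow stay exact, and the three-argument form of pow computes the modular power without the giant intermediate value.

m, p, q = 1234, 4097, 7919           # hypothetical RSA-like toy numbers
c = pow(m, p, q)                     # modular exponentiation, ints all the way
assert c == (m ** p) % q             # same value, but m ** p builds a huge int first

# Mixing in a float reproduces the error from the question:
# (10 ** 400) % 2.0
# OverflowError: int too large to convert to float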

pymc3 theano function usage

I'm trying to define a complex custom likelihood function using pymc3. The likelihood function involves a lot of iteration, and therefore I'm trying to use theano's scan method to define iteration directly within theano. Here's a greatly simplified example that illustrates the challenge that I'm facing. The (fake) likelihood function I'm trying to define is simply the sum of two pymc3 random variables, p and theta. Of course, I could simply return p+theta, but the actual likelihood function I'm trying to write is more complicated, and I believe I need to use theano.scan since it involves a lot of iteration.
import pymc3 as pm
from pymc3 import Model, Uniform, DensityDist
import theano.tensor as T
import theano
import numpy as np

### theano test
theano.config.compute_test_value = 'raise'

X = np.asarray([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])

### pymc3 implementation
with Model() as bg_model:
    p = pm.Uniform('p', lower=0, upper=1)
    theta = pm.Uniform('theta', lower=0, upper=.2)

    def logp(X):
        f = p + theta
        print("f", f)
        get_ll = theano.function(name='get_ll', inputs=[p, theta], outputs=f)
        print("p keys ", p.__dict__.keys())
        print("theta keys ", theta.__dict__.keys())
        print("p name ", p.name, " p.type ", p.type, " type(p) ", type(p), " p.tag ", p.tag)
        result = get_ll(p, theta)
        print("result", result)
        return result

    y = pm.DensityDist('y', logp, observed=X)  # Nx4 y = f(f,x,tx,n | p, theta)
When I run this, I get the error:
TypeError: ('Bad input argument to theano function with name "get_ll" at index 0(0-based)', 'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?')
I understand that the issue occurs in line
result=get_ll(p, theta)
because p and theta are of type pymc3.TransformedRV, and the input to a theano function needs to be a scalar number or a simple numpy array. However, a pymc3 TransformedRV does not seem to have any obvious way of obtaining the current value of the random variable itself.
Is it possible to define a log likelihood function that involves the use of a theano function that takes as input a pymc3 random variable?
The problem is that your th.function get_ll is a compiled theano function, which takes as input numerical arrays. Instead, pymc3 is sending it a symbolic variable (theano tensor). That's why you're getting the error.
As to your solution, you're right in saying that just returning p+theta is the way to go. If you have scans and whatnot in your logp, then you would return the scan variable of interest; there is no need to compile a theano function here. For example, if you wanted to add 1 to each element of a vector (as an impractical toy example), you would do:
def logp(X):
    the_sum, the_sum_upd = th.scan(lambda x: x + 1, sequences=[X])
    return the_sum
That being said, if you need gradients, you will need to compute your the_sum variable in a theano Op and provide a grad() method along with it (you can see a toy example of that in the answer here). If you do not need gradients, you might be better off doing everything in Python (or C, numba, or cython, for performance) and using the as_op decorator.
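For reference, a tiny sketch of the as_op route; the itypes/otypes here are assumptions about your data shapes:

import theano.tensor as tt
from theano.compile.ops import as_op

@as_op(itypes=[tt.dmatrix], otypes=[tt.dmatrix])
def add_one(x):
    # Plain numpy code runs in here. Note that an Op wrapped this way
    # provides no gradient, so gradient-based samplers cannot use it.
    return x + 1.0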

curve fitting with integer inputs Python 3.3

I am using scipy's curve_fit to fit a function and wanted to know if there is a way to tell it that the only possible entries are integers, not real numbers. Any ideas as to another way of doing this?
In its general form, an integer programming problem is NP-hard (see here). There are some efficient heuristic or approximate algorithms for this problem, but none guarantees an exact optimal solution.
In scipy you may implement a grid search over the integer coefficients and use, say, curve_fit over the real parameters for the given integer coefficients. As for the grid search, scipy has the brute function.
For example, if y = a * x + b * x^2 + some noise, where a has to be an integer, this may work.
Generate some test data with a = 5 and b = -1.5:
import numpy as np

coef, n = [5, -1.5], 50
xs = np.linspace(0, 10, n)[:, np.newaxis]
xs = np.hstack([xs, xs**2])
noise = 2 * np.random.randn(n)
ys = np.dot(xs, coef) + noise
A function which, given the integer coefficient, fits the real coefficient using curve_fit:
def optfloat(intcoef, xs, ys):
    from scipy.optimize import curve_fit
    def poly(xs, floatcoef):
        return np.dot(xs, [intcoef, floatcoef])
    popt, pcov = curve_fit(poly, xs, ys)
    errsqr = np.linalg.norm(poly(xs, *popt) - ys)
    return dict(errsqr=errsqr, floatcoef=popt)
A function which, given the integer coefficient, uses the above to optimize the float coefficient and returns the error:
def errfun(intcoef, *args):
    xs, ys = args
    return optfloat(intcoef, xs, ys)['errsqr']
Minimize errfun using scipy.optimize.brute to find the optimal integer coefficient, then call optfloat with it to find the optimal real coefficient:
from scipy.optimize import brute

grid = [slice(1, 10, 1)]  # grid search over 1, 2, ..., 9
# specifying finish=None below is important: it stops brute from
# polishing the integer result with a float optimizer
intcoef = brute(errfun, grid, args=(xs, ys,), finish=None)
floatcoef = optfloat(intcoef, xs, ys)['floatcoef'][0]
Using this method I obtain [5.0, -1.50577] for the optimal coefficients, which is exact for the integer coefficient, and close enough for the real coefficient.
In general, the answer is no: scipy.optimize.curve_fit(), the leastsq() it is based on, and (AFAIK) all the other solvers in scipy.optimize work strictly on floating-point numbers.
You could try increasing the value of epsfcn (which has a default value of numpy.finfo('double').eps ~ 2.e-16), which is used as the initial step for all variables in the problem. The basic issue is that the fitting algorithm adjusts a floating-point number, and if you do
int_var = int(float_var)
and the algorithm changes float_var from 1.0 to 1.00000001, it will see no difference in the result and decide that that value does not actually alter the fit metric.
Another approach would be to have a floating-point parameter tmp_float_var that is freely adjusted by the fitting algorithm, but then, in your objective function, use
int_var = int(tmp_float_var / numpy.finfo('double').eps)
as the value of your integer variable. That might need a little tweaking and might be a little unstable, but it ought to work.
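A rough sketch of that second approach, with a hypothetical model function; as said above, it may need tweaking and can be unstable:

import numpy as np

eps = np.finfo('double').eps

def model(x, tmp_float_var, b):
    # A fitter such as curve_fit adjusts tmp_float_var freely; dividing by
    # eps turns even the smallest float steps into visible integer changes.
    int_var = int(tmp_float_var / eps)
    return int_var * x + b * x**2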
