EGF coefficients in Sage - combinatorics

To extract coefficients from an ordinary generating function (OGF) in Sage, I do it this way:
sage: var('z')
z
sage: T=-1/2*(sqrt(-4*z + 1) - 1)/z
sage: T.series(z,101).coefficient(z,100)
896519947090131496687170070074100632420837521538745909320
The above gives the coefficient of $z^{100}$ in the OGF $-\frac{\sqrt{1-4z}-1}{2z}$ (the power-series solution of $T(z)=1+z\cdot T(z)^2$). However, there is an easier way to do it, namely lazy power series:
sage: L.<z> = LazyPowerSeriesRing(QQ)
sage: T = L()
sage: T._name = 'C'
sage: T.define(1 + z*T^2)
sage: T.coefficient(100)
896519947090131496687170070074100632420837521538745909320
My problem is the following: I want to extract coefficients from the EGF $\displaystyle e^{e^z-1}$, and the first method works:
sage: var('z')
z
sage: F=exp(exp(z)-1)
sage: F.taylor(z,0,21).coefficient(z,20)
263898766507/12412765347840000
My question is: how can I extract coefficients with something like F.coefficient(20), given that in this case LazyPowerSeriesRing does not work?

Why work with the lazy constructor at all? I tried it in a similar manner with an ordinary power series ring:
sage: PREC = 30 # use a higher precision than the needed coefficient(s)
sage: L.<z> = PowerSeriesRing(QQ, default_prec=PREC)
sage: L
Power Series Ring in z over Rational Field
sage: L.default_prec()
30
sage: F = exp( exp(z) - 1 )
sage: F.coefficients()[:6]
[1, 1, 1, 5/6, 5/8, 13/30]
sage: F.coefficients()[20]
263898766507/12412765347840000
and the needed coefficient was reproduced. To be concrete:
sage: L.<z> = PowerSeriesRing(QQ, default_prec=30)
sage: F = exp( exp(z) - 1 )
sage: F.coefficients()[20]
263898766507/12412765347840000
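Since $F$ is an exponential generating function, multiplying the coefficient of $z^{20}$ by $20!$ recovers the 20th Bell number. As a quick sanity check (a sketch; the list index 20 matches the exponent here only because all coefficients up to $z^{20}$ are nonzero, which holds for this series):
sage: factorial(20) * F.coefficients()[20]
51724158235372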

Related

Finding the parameters of a function via curve fit

I'm trying to estimate the parameters (v, n, k) defined in fit_func. I tried the default least-squares fit, but it could not find the parameters successfully.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def fit_func(x, v, n, k):
    return v * x ** n / (k ** n + x ** n)

x = np.array([2.5, 2.71317829, 4.08, 4.18604651, 5.19379845, 6.92,
              7.98449612, 8.94, 9.92248062, 9.94, 12.36, 13.48837209])
y = np.array([0.16054661, 0.14643943, 0.11639118, 0.11796543, 0.15609638, 0.29527088,
              0.40774818, 0.51331307, 0.6163489, 0.61807529, 0.78372639, 0.78643515])

popt, pcov = curve_fit(fit_func, x, y)
print(popt)
plt.plot(x, y, '*')
plt.plot(x, fit_func(x, *popt), 'r')
plt.show()
I get the following error:
raise RuntimeError("Optimal parameters not found: " + errmsg)
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
I'm not sure if I have selected the right method.
Suggestions on alternate methods that I could use to estimate the parameters will be really helpful.
The function y(x) = v * x ** n / (k ** n + x ** n) = v / (k ** n * x ** (-n) + 1) is monotonic: strictly increasing or strictly decreasing, never both. This is not a convenient shape for the data, which first decrease and then increase. This can be one cause of the bad fitting.
Another possible cause of failure is the choice of initial values for the parameters v, k, n, which have to be set in order to start the iterative computation for non-linear regression.
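For instance, one can pass an explicit starting point and a larger evaluation budget to curve_fit (the p0 values below are rough, illustrative guesses only, not fitted results):
# a sketch: supply an initial guess for (v, n, k) and allow more function calls
popt, pcov = curve_fit(fit_func, x, y, p0=[1.0, 3.0, 8.0], maxfev=5000)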
Just looking at the distribution of the points, one can see that a cubic function would be more suitable; a sketch follows below. This is much simpler because the regression is linear and requires no initial guess for the parameters. The resulting fit is very good.
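A minimal sketch of that cubic fit with numpy.polyfit (my illustration of the suggestion, reusing the x, y arrays and imports from the question above):
# fit y ~ c3*x^3 + c2*x^2 + c1*x + c0 by linear least squares; no initial guess needed
coeffs = np.polyfit(x, y, deg=3)
xs = np.linspace(x.min(), x.max(), 200)
plt.plot(x, y, '*')
plt.plot(xs, np.polyval(coeffs, xs), 'r')
plt.show()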

Numpy finding the number of points within a specific distance in absolute value

I have a numpy array. I want to find, for each point, the number of points that lie within an epsilon distance of it (componentwise, in absolute value).
My current code (for an n x 2 array, but in general I expect the array to be n x m) is:
epsilon = np.array([0.5, 0.5])
np.array([1 / float(np.sum(np.all(np.abs(X - x) <= epsilon, axis=1))) for x in X])
But this code may not be efficient for an array of, say, 1 million rows and 50 columns. Is there a better, more efficient method?
For example, with data
X = np.random.rand(10, 2)
you can solve this using broadcasting:
1 / np.sum(np.all(np.abs(X[:, None, ...] - X[None, ...]) <= epsilon, axis=-1), axis=-1)
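Assembled into a runnable sketch (note that the broadcasted difference materializes an n x n x m intermediate array, so for a million rows you would have to process X in chunks; the chunking is omitted here):
import numpy as np

X = np.random.rand(10, 2)
epsilon = np.array([0.5, 0.5])
# count, for each row, how many rows lie within epsilon in every coordinate
counts = np.sum(np.all(np.abs(X[:, None, :] - X[None, :, :]) <= epsilon, axis=-1), axis=-1)
result = 1 / counts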

How to visualize feasible region for linear programming (with arbitrary inequalities) in Numpy/MatplotLib?

I need to implement a solver for linear programming problems. All of the restrictions are of the <= kind, such as
5x + 10y <= 10
There can be an arbitrary number of these restrictions. Also, x >= 0 and y >= 0 hold implicitly.
I need to find the optimal solution (maximization) and show the feasible region in matplotlib. I've found the optimal solution by implementing the simplex method, but I can't figure out how to draw the graph.
Some approaches I've found:
This link finds the minimum of the y values from each constraint and uses plt.fill_between() to draw the region. But it doesn't work when I change the order of the equations, and I'm not sure which y values to take the minimum of, so I can't use it for arbitrary restrictions.
Solve every pair of restrictions for its intersection point and draw a polygon through the resulting vertices. Not efficient.
An easier approach might be to have matplotlib compute the feasible region on its own (with you only providing the constraints) and then simply overlay the "constraint" lines on top.
import numpy as np
import matplotlib.pyplot as plt

# plot the feasible region
d = np.linspace(-2, 16, 300)
x, y = np.meshgrid(d, d)
plt.imshow(((y >= 2) & (2*y <= 25 - x) & (4*y >= 2*x - 8) & (y <= 2*x - 5)).astype(int),
           extent=(x.min(), x.max(), y.min(), y.max()), origin="lower", cmap="Greys", alpha=0.3)

# plot the lines defining the constraints
x = np.linspace(0, 16, 2000)
# y >= 2
y1 = (x * 0) + 2
# 2y <= 25 - x
y2 = (25 - x) / 2.0
# 4y >= 2x - 8
y3 = (2*x - 8) / 4.0
# y <= 2x - 5
y4 = 2*x - 5

# Make plot
plt.plot(x, y1, label=r'$y\geq2$')
plt.plot(x, y2, label=r'$2y\leq25-x$')
plt.plot(x, y3, label=r'$4y\geq 2x - 8$')
plt.plot(x, y4, label=r'$y\leq 2x-5$')
plt.xlim(0, 16)
plt.ylim(0, 11)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.xlabel(r'$x$')
plt.ylabel(r'$y$')
plt.show()
This is a vertex enumeration problem. You can use the function lineqs, which visualizes the system of inequalities A x >= b for any number of lines. The function will also display the vertices on which the plot is built.
The last two rows of A and b below encode x >= 0 and y >= 0:
from intvalpy import lineqs
import numpy as np

A = -np.array([[5, 10],
               [-1, 0],
               [0, -1]])
b = -np.array([10, 0, 0])
lineqs(A, b, title='Solution', color='gray', alpha=0.5, s=10, size=(15, 15), save=False, show=True)

How to avoid NaN in numpy implementation of logistic regression?

EDIT: I already made significant progress. My current question is written after my last edit below and can be answered without the context.
I currently follow Andrew Ng's Machine Learning Course on Coursera and tried to implement logistic regression today.
Notation:
X is a (m x n)-matrix with vectors of input variables as rows (m training samples of n-1 variables, the entries of the first column are equal to 1 everywhere to represent a constant).
y is the corresponding vector of expected output samples (column vector with m entries equal to 0 or 1)
theta is the vector of model coefficients (row vector with n entries)
For an input row vector x the model will predict the probability sigmoid(x * theta.T) for a positive outcome.
This is my Python3/numpy implementation:
import numpy as np
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

vec_sigmoid = np.vectorize(sigmoid)

def logistic_cost(X, y, theta):
    summands = np.multiply(y, np.log(vec_sigmoid(X * theta.T))) + np.multiply(1 - y, np.log(1 - vec_sigmoid(X * theta.T)))
    return -np.sum(summands) / len(y)

def gradient_descent(X, y, learning_rate, num_iterations):
    num_parameters = X.shape[1]  # dim theta
    theta = np.matrix([0.0 for i in range(num_parameters)])  # init theta
    cost = [0.0 for i in range(num_iterations)]
    for it in range(num_iterations):
        error = np.repeat(vec_sigmoid(X * theta.T) - y, num_parameters, axis=1)
        error_derivative = np.sum(np.multiply(error, X), axis=0)
        theta = theta - (learning_rate / len(y)) * error_derivative
        cost[it] = logistic_cost(X, y, theta)
    return theta, cost
This implementation seems to work fine, but I encountered a problem when calculating the logistic cost. At some point the gradient descent algorithm converges to a pretty well-fitting theta, and the following happens:
For some input row X_i with expected outcome 1, X_i * theta.T becomes positive with a good margin (for example 23.207). This leads sigmoid(X_i * theta.T) to evaluate to exactly 1.0 (because of limited floating-point precision, I think). This is a good prediction, since the expected outcome is 1, but it breaks the calculation of the logistic cost: np.log(1 - vec_sigmoid(X*theta.T)) evaluates to -inf at that row. This shouldn't matter, since the term is multiplied by 1 - y = 0, but 0 * (-inf) is NaN, and once a NaN occurs the whole sum is broken.
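A two-line demonstration of that failure mode (np.log(0.0) yields -inf rather than NaN; the NaN appears only after the multiplication by zero; np as imported above):
np.log(0.0)        # -inf, with a RuntimeWarning
0.0 * np.log(0.0)  # nan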
How should I handle this in the vectorized implementation, since np.multiply(1 - y, np.log(1 - vec_sigmoid(X*theta.T))) is calculated in every row of X (not only where y = 0)?
Example input:
X = np.matrix([[1. , 0. , 0. ],
               [1. , 1. , 0. ],
               [1. , 0. , 1. ],
               [1. , 0.5, 0.3],
               [1. , 1. , 0.2]])
y = np.matrix([[0],
               [1],
               [1],
               [0],
               [1]])
Then theta, _ = gradient_descent(X, y, 10000, 10000) (yes, in this case we can set the learning rate this large) sets theta to:
theta = np.matrix([[-3000.04008972, 3499.97995514, 4099.98797308]])
This leads vec_sigmoid(X * theta.T) to be the really good prediction:
np.matrix([[0.00000000e+00],   # 0
           [1.00000000e+00],   # 1
           [1.00000000e+00],   # 1
           [1.95334953e-09],   # nearly zero
           [1.00000000e+00]])  # 1
but logistic_cost(X, y, theta) evaluates to NaN.
EDIT:
I came up with the following solution. I just replaced the logistic_cost function with:
def new_logistic_cost(X, y, theta):
    term1 = vec_sigmoid(X * theta.T)
    term1[y == 0] = 1
    term2 = 1 - vec_sigmoid(X * theta.T)
    term2[y == 1] = 1
    summands = np.multiply(y, np.log(term1)) + np.multiply(1 - y, np.log(term2))
    return -np.sum(summands) / len(y)
By using the masks I just calculate log(1) at the places where the result will be multiplied by zero anyway. Now log(0) will only happen in genuinely wrong implementations of gradient descent.
Open questions: How can I make this solution cleaner? Is it possible to achieve a similar effect in a more elegant way?
If you don't mind using SciPy, you could import expit and xlog1py from scipy.special:
from scipy.special import expit, xlog1py
and replace the expression
np.multiply(1 - y, np.log(1 - vec_sigmoid(X*theta.T)))
with
xlog1py(1 - y, -expit(X*theta.T))
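For completeness, the y * log(sigmoid) term can get the same treatment with scipy.special.xlogy, since xlogy(0, 0) is defined as 0. A sketch of the whole cost function rewritten this way (my extension of the answer above, not part of it):
import numpy as np
from scipy.special import expit, xlogy, xlog1py

def safe_logistic_cost(X, y, theta):
    p = X * theta.T  # linear predictor
    # xlogy(a, b) = a*log(b) and xlog1py(a, b) = a*log1p(b); both return 0 wherever a == 0
    summands = xlogy(y, expit(p)) + xlog1py(1 - y, -expit(p))
    return -np.sum(summands) / len(y)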
I know it is an old question, but I ran into the same problem, and maybe this can help others in the future: I solved it by normalizing the data before appending the X0 column.
def normalize_data(X):
    mean = np.mean(X, axis=0)
    std = np.std(X, axis=0)
    return (X - mean) / std
After this all worked well!

Linear Regression algorithm works with one data-set but not on another, similar data-set. Why?

I created a linear regression algorithm following a tutorial and applied it to the data-set provided, and it works fine. However, the same algorithm does not work on another, similar data-set. Can somebody tell me why this happens?
def computeCost(X, y, theta):
    inner = np.power(((X * theta.T) - y), 2)
    return np.sum(inner) / (2 * len(X))

def gradientDescent(X, y, theta, alpha, iters):
    temp = np.matrix(np.zeros(theta.shape))
    params = int(theta.ravel().shape[1])
    cost = np.zeros(iters)
    for i in range(iters):
        err = (X * theta.T) - y
        for j in range(params):
            term = np.multiply(err, X[:, j])
            temp[0, j] = theta[0, j] - ((alpha / len(X)) * np.sum(term))
        theta = temp
        cost[i] = computeCost(X, y, theta)
    return theta, cost

alpha = 0.01
iters = 1000
g, cost = gradientDescent(X, y, theta, alpha, iters)
print(g)
alpha = 0.01
iters = 1000
g, cost = gradientDescent(X, y, theta, alpha, iters)
print(g)
On running the algorithm on this dataset I get the output matrix([[ nan, nan]]) and the following errors:
C:\Anaconda3\lib\site-packages\ipykernel\__main__.py:2: RuntimeWarning: overflow encountered in power
from ipykernel import kernelapp as app
C:\Anaconda3\lib\site-packages\ipykernel\__main__.py:11: RuntimeWarning: invalid value encountered in double_scalars
However this data set works just fine and outputs matrix([[-3.24140214, 1.1272942 ]]).
Both datasets are similar; I have been over them many times but can't seem to figure out why the algorithm works on one dataset but not on the other. Any help is welcome.
Edit: Thanks Mark_M for editing tips :-)
[Much better question, btw]
It's hard to know exactly what's going on here, but basically your cost is going in the wrong direction and spiraling out of control, which results in an overflow when you try to square the value.
I think in your case it boils down to your step size (alpha) being too big, which can cause gradient descent to go the wrong way. You need to watch the cost during gradient descent and make sure it's always going down; if it's not, either something is broken or alpha is too large.
Personally, I would reevaluate the code and try to get rid of the loops. It's a matter of preference, but I find it easier to work with X and y as column vectors. Here is a minimal example:
import numpy as np
from numpy import genfromtxt

# this is your 'bad' data set from github
my_data = genfromtxt('testdata.csv', delimiter=',')

def computeCost(X, y, theta):
    inner = np.power(((X @ theta.T) - y), 2)
    return np.sum(inner) / (2 * len(X))

def gradientDescent(X, y, theta, alpha, iters):
    for i in range(iters):
        # you don't need the extra loop - this can be vectorized,
        # making it much faster and simpler
        theta = theta - (alpha / len(X)) * np.sum((X @ theta.T - y) * X, axis=0)
        cost = computeCost(X, y, theta)
        if i % 10 == 0:  # just look at the cost every ten loops for debugging
            print(cost)
    return (theta, cost)

# notice the small alpha value
alpha = 0.0001
iters = 100

# here x is a single column of the data
X = my_data[:, 0].reshape(-1, 1)
ones = np.ones([X.shape[0], 1])
X = np.hstack([ones, X])

# theta is a row vector
theta = np.array([[1.0, 1.0]])

# y is a column vector
y = my_data[:, 1].reshape(-1, 1)

g, cost = gradientDescent(X, y, theta, alpha, iters)
print(g, cost)
Another useful technique is to normalize your data before doing the regression. This is especially useful when you have more than one feature; a sketch follows below.
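A minimal sketch of that normalization (standardize the feature columns before prepending the column of ones; same idea as the normalize_data helper in the logistic-regression question above):
# standardize each feature column to zero mean and unit variance
X_feat = my_data[:, 0].reshape(-1, 1)
X_feat = (X_feat - X_feat.mean(axis=0)) / X_feat.std(axis=0)
X = np.hstack([np.ones([X_feat.shape[0], 1]), X_feat])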
As a side note: if your step size is right, you shouldn't get overflows no matter how many iterations you run, because the cost will decrease with every iteration and the rate of decrease will slow.
After 1000 iterations I arrived at a theta and cost of:
[[ 1.03533399 1.45914293]] 56.041973778
after 100:
[[ 1.01166889 1.45960806]] 56.0481988054
You can use this to look at the fit in an IPython notebook:
%matplotlib inline
import matplotlib.pyplot as plt
plt.scatter(my_data[:, 0].reshape(-1,1), y)
axes = plt.gca()
x_vals = np.array(axes.get_xlim())
y_vals = g[0][0] + g[0][1]* x_vals
plt.plot(x_vals, y_vals, '--')
