Multiplying functions together in Python - python-3.x

I am using Python at the moment and I have a function that I need to multiply against itself for different constants.
e.g. If I have f(x,y)= x^2y+a, where 'a' is some constant (possibly list of constants).
If 'a' is a list (of unknown size as it depends on the input), then if we say a = [1,3,7] the operation I want to do is
(x^2y+1)*(x^2y+3)*(x^2y+7)
but generalised to n elements in 'a'. Is there an easy way to do this in Python as I can't think of a decent way around this problem? If the size in 'a' was fixed then it would seem much easier as I could just define the functions separately and then multiply them together in a new function, but since the size isn't fixed this approach isn't suitable. Does anyone have a way around this?

You can numpy ftw, it's fairly easy to get into.
import numpy as np
a = np.array([1,3,7])
x = 10
y = 0.2
print(x ** (2*y) + a)
print(np.sum(x**(2*y)+a))
Output:
[3.51188643 5.51188643 9.51188643]
18.53565929452874

I haven't really got much for it to be honest, I'm still trying to figure out how to get the functions to not overlap.
a=[1,3,7]
for i in range(0,len(a)-1):
def f(x,y):
return (x**2)*y+a[i]
def g(x,y):
return (x**2)*y+a[i+1]
def h(x,y):
return f(x,y)*g(x,y)
f1= lambda y, x: h(x,y)
integrate.dblquad(f1, 0, 2, lambda x: 1, lambda x: 10)
I should have clarified that the end result can't be in floats as it needs to be integrated afterwards using dblquad.

Related

Make the integrate.quad function return only the analytical value

The integrate.quad function in scipy displays the analytical and absolute error of the integration. How do
I make it output just the analytical portion?
The code below returns (21.333333333333332, 2.3684757858670003e-13). I just want 21.3333
from scipy import integrate
x2 = lambda x: x**2
integrate.quad(x2, 0, 4)
This way you can print only the 21.3333
from scipy import integrate
x2 = lambda x: x**2
results = integrate.quad(x2, 0, 4)
print(results[0])
The output will be:
21.333333333333336
integrate.quad(x2,0,4) returns a tuple. Indexing is one way of dealing with this:
my_result = integrate.quad(x2, 0, 4)[1]
Another option is unpacking the tuple, using underscore for the values you are not interested in:
_, my_result = integrate.quad(x2, 0, 4)
Note that the underscore is still accessible like any other variable would have been (you could also have called it a, or temp, or not_interesting), but using an underscore is a Pythonic way that will make other people reading your code understand that the variable will not be used.

How do I call a list of numpy functions without a for loop?

I'm doing data analysis that involves minimizing the least-square-error between a set of points and a corresponding set of orthogonal functions. In other words, I'm taking a set of y-values and a set of functions, and trying to zero in on the x-value that gets all of the functions closest to their corresponding y-value. Everything is being done in a 'data_set' class. The functions that I'm comparing to are all stored in one list, and I'm using a class method to calculate the total lsq-error for all of them:
self.fits = [np.poly1d(np.polyfit(self.x_data, self.y_data[n],10)) for n in range(self.num_points)]
def error(self, x, y_set):
arr = [(y_set[n] - self.fits[n](x))**2 for n in range(self.num_points)]
return np.sum(arr)
This was fine when I had significantly more time than data, but now I'm taking thousands of x-values, each with a thousand y-values, and that for loop is unacceptably slow. I've been trying to use np.vectorize:
#global scope
def func(f,x):
return f(x)
vfunc = np.vectorize(func, excluded=['x'])
…
…
#within data_set class
def error(self, x, y_set):
arr = (y_set - vfunc(self.fits, x))**2
return np.sum(arr)
func(self.fits[n], x) works fine as long as n is valid, and as far as I can tell from the docs, vfunc(self.fits, x) should be equivalent to
[self.fits[n](x) for n in range(self.num_points)]
but instead it throws:
ValueError: cannot copy sequence with size 10 to array axis with dimension 11
10 is the degree of the polynomial fit, and 11 is (by definition) the number of terms in it, but I have no idea why they're showing up here. If I change the fit order, the error message reflects the change. It seems like np.vectorize is taking each element of self.fits as a list rather than a np.poly1d function.
Anyway, if someone could either help me understand np.vectorize better, or suggest another way to eliminate that loop, that would be swell.
As the functions in question all have a very similar structure we can "manually" vectorize once we've extracted the poly coefficients. In fact, the function is then a quite simple one-liner, eval_many below:
import numpy as np
def poly_vec(list_of_polys):
O = max(p.order for p in list_of_polys)+1
C = np.zeros((len(list_of_polys), O))
for p, c in zip(list_of_polys, C):
c[len(c)-p.order-1:] = p.coeffs
return C
def eval_many(x,C):
return C#np.vander(x,11).T
# make example
list_of_polys = [np.poly1d(v) for v in np.random.random((1000,11))]
x = np.random.random((2000,))
# put all coeffs in one master matrix
C = poly_vec(list_of_polys)
# test
assert np.allclose(eval_many(x,C), [p(x) for p in list_of_polys])
from timeit import timeit
print('vectorized', timeit(lambda: eval_many(x,C), number=100)*10)
print('loopy ', timeit(lambda: [p(x) for p in list_of_polys], number=10)*100)
Sample run:
vectorized 6.817315469961613
loopy 56.35076989419758

Scipy stats.probplot not returning r^2 value bug?

I'm currently plotting the normal proability plot of a set of feature variables in a data matrix X using scipy. However, the module I'm using is not returning my r^2 value. Here's my simple code:
Data_Matrix=pd.read_csv('My csv')
My_datum=My_Data.as_matrix()
#loop through all feature variables
for i in range(My_datum.shape[1]):
#just a simple print line for the column index and its shapiro test
results
print(i)
print(a)
#plot
pplot=stats.probplot(My_datum[:,i],dist='norm',fit=True, plot=plt )
a=sp.stats.shapiro(My_datum[:,i])
I've tried using the same line on some simpler numpy array, but to no avail. I'm working the Ipython console on spyder 3. Thanks in advance!
Probplot returns the R value already. You just need to square it to give you R^2.
(slope, intercept, r) = stats.probplot(My_datum[:,i], dist='norm', fit=True, plot=plt)
R_squared = r**2
So this is a little weird but it returns a list of tuples. You actually just need to extract the second tuple from the list by indexing the probplot function. You'll need to unpack the tuple to then square it.
slope, intercept, r = stats.probplot(My_datum[:,i], dist='norm', fit=True, plot=plt)[1]
r2 = r**2

My Python trapezoidal code is saying x not defined

I am trying to find the integration of Sin^2(x)/x^2 using trapezoidal rule but while running it is saying x id not defined. Can anyone suggest me what is wrong?
import math
c=math.sin`
def trapezoidal(f, a, b, N):
if x!=0:
h=(b-a)/N
s=0.0
s+=f(a)/2.0
for i in range (1,N):
s+=f(a+i*h)
s+=f(b)/2.0
Y=s*h
else:
y=1
return Y`
for n in range (1,11):
N=2**n
result=trapezoidal(lambda x:((c(x)*c(x))/(x**2)), 0, 1000, N)
print(repr(n).rjust(10), repr(result).rjust(30))`
You're not calling the x in the function. In your case you want to loop over every element in the function f.
If you don't want to use the pythonic way, you can have a look at numpy and scipy. These packages offer optimised functions for basic operation such as integrations and matrix calculations. Have a look at np.trapz (https://docs.scipy.org/doc/numpy/reference/generated/numpy.trapz.html), sp.integrate (https://docs.scipy.org/doc/scipy-0.18.1/reference/tutorial/integrate.html)

TypeError: append() missing 1 required positional argument: 'values'

I have variable 'x_data' sized 360x190, I am trying to select particular rows of data.
x_data_train = []
x_data_train = np.append([x_data_train,
x_data[0:20,:],
x_data[46:65,:],
x_data[91:110,:],
x_data[136:155,:],
x_data[181:200,:],
x_data[226:245,:],
x_data[271:290,:],
x_data[316:335,:]],axis = 0)
I get the following error :
TypeError: append() missing 1 required positional argument: 'values'
where did I go wrong ?
If I am using
x_data_train = []
x_data_train.append(x_data[0:20,:])
x_data_train.append(x_data[46:65,:])
x_data_train.append(x_data[91:110,:])
x_data_train.append(x_data[136:155,:])
x_data_train.append(x_data[181:200,:])
x_data_train.append(x_data[226:245,:])
x_data_train.append(x_data[271:290,:])
x_data_train.append(x_data[316:335,:])
the size of the output is 8 instead of 160 rows.
Update:
In matlab, I will load the text file and x_data will be variable having 360 rows and 190 columns.
If I want to select 1 to 20 , 46 to 65, ... rows of data , I simply give
x_data_train = xdata([1:20,46:65,91:110,136:155,181:200,226:245,271:290,316:335], :);
the resulting x_data_train will be the array of my desired.
How can do that in python because it results array of 8 subsets of array for 20*192 each, but I want it to be one array 160*192
Short version: the most idiomatic and fastest way to do what you want in python is this (assuming x_data is a numpy array):
x_data_train = np.vstack([x_data[0:20,:],
x_data[46:65,:],
x_data[91:110,:],
x_data[136:155,:],
x_data[181:200,:],
x_data[226:245,:],
x_data[271:290,:],
x_data[316:335,:]])
This can be shortened (but made very slightly slower) by doing:
xdata[np.r_[0:20,46:65,91:110,136:155,181:200,226:245,271:290,316:335], :]
For your case where you have a lot of indices I think it helps readability, but in cases where there are fewer indices I would use the first approach.
Long version:
There are several different issues at play here.
First, in python, [] makes a list, not an array like in MATLAB. Lists are more like 1D cell arrays. They can hold any data type, including other lists, but they cannot have multiple dimensions. The equivalent of MATLAB matrices in Python are numpy arrays, which are created using np.array.
Second, [x, y] in Python always creates a list where the first element is x and the second element is y. In MATLAB [x, y] can do one of several completely different things depending on what x and y are. In your case, you want to concatenate. In Python, you need to explicitly concatenate. For two lists, there are several ways to do that. The simplest is using x += y, which modifies x in-place by putting the contents of y at the end. You can combine multiple lists by doing something like x += y + z + w. If you want to keep x, unchanged, you can assign to a new variable using something like z = x + y. Finally, you can use x.extend(y), which is roughly equivalent to x += y but works with some data types besides lists.
For numpy arrays, you need to use a slightly different approach. While Python lists can be modified in-place, strictly speaking neither MATLAB matrices nor numpy arrays can be. MATLAB pretends to allow this, but it is really creating a new matrix behind-the-scenes (which is why you get a warning if you try to resize a matrix in a loop). Numpy requires you to be more explicit about creating a new array. The simplest approach is to use np.hstack, which concatenates two arrays horizontally (or np.vstack or np.dstack for vertical and depth concatenation, respectively). So you could do z = np.hstack([v, w, x, y]). There is an append method and function in numpy, but it almost never works in practice so don't use it (it requires careful memory management that is more trouble than it is worth).
Third, what append does is to create one new element in the target list, and put whatever variable append is called with in that element. So if you do x.append([1,2,3]), it adds one new element to the end of list x containing the list [1,2,3]. It would be more like x = [x, {{1,2,3}}}, where x is a cell array.
Fourth, Python makes heavy use of "methods", which are basically functions attached to data (it is a bit more complicated than that in practice, but those complexities aren't really relevant here). Recent versions of MATLAB has added them as well, but they aren't really integrated into MATLAB data types like they are in Python. So where in MATLAB you would usually use sum(x), for numpy arrays you would use x.sum(). In this case, assuming you were doing appending (which you aren't) you wouldn't use the np.append(x, y), you would use x.append(y).
Finally, in MATLAB x:y creates a matrix of values from x to y. In Python, however, it creates a "slice", which doesn't actually contain all the values and so can be processed much more quickly by lists and numpy arrays. However, you can't really work with multiple slices like you do in your example (nor does it make sense to because slices in numpy don't make copies like they do in MATLAB, while using multiple indexes does make a copy). You can get something close to what you have in MATLAB using np.r_, which creates a numpy array based on indexes and slices. So to reproduce your example in numpy, where xdata is a numpy array, you can do xdata[np.r_[1:20,46:65,91:110,136:155,181:200,226:245,271:290,316:335], :]
More information on x_data and np might be needed to solve this but...
First: You're creating 2 copies of the same list: np and x_data_train
Second: Your indexes on x_data are strange
Third: You're passing 3 objects to append() when it only accepts 2.
I'm pretty sure revisiting your indexes on x_data will be where you solve the current error, but it will result in another error related to passing 2 values to append.
And I'm also sure you want
x_data_train.append(object)
not
x_data_train = np.append(object)
and you may actually want
x_data_train.extend([objects])
More on append vs extend here: append vs. extend

Resources