Python 3: "ndarray is not contiguous" error when constructing a regression function

This code calculates a linear regression by defining a function, standRegres, which we implement ourselves. Although we could fit the model with the functions in sklearn or statsmodels, here we try to construct the function on our own. Unfortunately, I run into an error I can't get past, so I'm asking for your help.
The whole code runs without any problem until the last row. If I run the last row, an error message appears: "ValueError: ndarray is not contiguous".
import os
import pandas as pd
import numpy as np
import pylab as pl
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
# load data
iris = load_iris()
# Define a DataFrame
df = pd.DataFrame(iris.data, columns = iris.feature_names)
# take a look
df.head()
#len(df)
# rename the column name
df.columns = ['sepal_length','sepal_width','petal_length','petal_width']
X = df[['petal_length']]
y = df['petal_width']
from numpy import *
#########################
# Define function to do matrix calculation
def standRegres(xArr, yArr):
    xMat = mat(xArr); yMat = mat(yArr).T
    xTx = xMat.T * xMat
    if linalg.det(xTx) == 0.0:
        print("this matrix is singular, cannot do inverse!")
        return NA
    else:
        ws = xTx.I * (xMat.T * yMat)
        return ws
# test
x0 = np.ones((150,1))
x0 = pd.DataFrame(x0)
X0 = pd.concat([x0,X],axis = 1)
# test
standRegres(X0,y)
I tried to solve it but don't know how. Could you help me? I'd really appreciate it!

Your problem stems from using the mat function. Stick to array.
In order to use array, you'll need to use the @ operator for matrix multiplication, not *. Finally, you have a line that says xTx.I, but that attribute isn't defined for general arrays, so we can use numpy.linalg.inv instead.
def standRegres(xArr, yArr):
    xMat = array(xArr); yMat = array(yArr).T
    xTx = xMat.T @ xMat
    if linalg.det(xTx) == 0.0:
        print("this matrix is singular, cannot do inverse!")
        return None
    else:
        ws = linalg.inv(xTx) @ (xMat.T @ yMat)
        return ws
# test
x0 = np.ones((150,1))
x0 = pd.DataFrame(x0)
X0 = pd.concat([x0,X],axis = 1)
# test
standRegres(X0,y)
# Output: array([-0.36651405, 0.41641913])
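As a quick sanity check (my addition, not part of the original answer), the same coefficients can be recovered with numpy's built-in least-squares solver:
import numpy as np
# Verification sketch: lstsq solves the same least-squares problem more robustly.
ws_check, *_ = np.linalg.lstsq(np.asarray(X0, dtype=float),
                               np.asarray(y, dtype=float), rcond=None)
print(ws_check)  # should match standRegres(X0, y) up to floating-point error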

Related

How does one save torch.nn.Sequential models in pytorch properly?

I am well aware of loading the state dictionary and then having an instance of the class be loaded with the old dictionary of parameters (e.g. this great question & answer). Unfortunately, when I have a torch.nn.Sequential I of course do not have a class definition for it.
So I wanted to double check: what is the proper way to do it? I believe torch.save is sufficient (so far my code has not collapsed), though these things can be more subtle than one might expect (e.g. I get a warning when I use pickle, but torch.save uses it internally, so it's confusing). Also, numpy has its own save functions (e.g. see this answer) which tend to be more efficient, so there might be a subtle trade-off I might be overlooking.
My test code:
# creating data and running through a nn and saving it
import torch
import torch.nn as nn
from pathlib import Path
from collections import OrderedDict
import numpy as np
import pickle
path = Path('~/data/tmp/').expanduser()
path.mkdir(parents=True, exist_ok=True)
num_samples = 3
Din, Dout = 1, 1
lb, ub = -1, 1
x = torch.distributions.Uniform(low=lb, high=ub).sample((num_samples, Din))
f = nn.Sequential(OrderedDict([
    ('f1', nn.Linear(Din, Dout)),
    ('out', nn.SELU())
]))
y = f(x)
# save data torch to numpy
x_np, y_np = x.detach().cpu().numpy(), y.detach().cpu().numpy()
np.savez(path / 'db', x=x_np, y=y_np)
print(x_np)
# save model
with open('db_saving_seq', 'wb') as file:
    pickle.dump({'f': f}, file)
# load model
with open('db_saving_seq', 'rb') as file:
    db = pickle.load(file)
f2 = db['f']
# test that it outputs the right thing
y2 = f2(x)
y_eq_y2 = y == y2
print(y_eq_y2)
db2 = {'f': f, 'x': x, 'y': y}
torch.save(db2, path / 'db_f_x_y')
print('Done')
db3 = torch.load(path / 'db_f_x_y')
f3 = db3['f']
x3 = db3['x']
y3 = db3['y']
yy3 = f3(x3)
y_eq_y3 = y == y3
print(y_eq_y3)
y_eq_yy3 = y == yy3
print(y_eq_yy3)
Related:
related question from forum: https://discuss.pytorch.org/t/how-to-save-nn-sequential-as-a-model/89117/14
As can be seen in the code, torch.nn.Sequential is based on torch.nn.Module:
https://pytorch.org/docs/stable/_modules/torch/nn/modules/container.html#Sequential
So you can use
f = torch.nn.Sequential(...)
torch.save(f.state_dict(), path)
just like with any other torch.nn.Module.
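To load it back, you rebuild the same Sequential and then restore the weights. A minimal sketch, assuming the architecture definition is available at load time (the layer names and shapes here mirror the question's model):
import torch
import torch.nn as nn
from collections import OrderedDict
# Rebuild the identical architecture, then load the saved parameters into it.
f_loaded = nn.Sequential(OrderedDict([
    ('f1', nn.Linear(1, 1)),
    ('out', nn.SELU())
]))
f_loaded.load_state_dict(torch.load(path))
f_loaded.eval()  # switch to inference mode if you only need predictions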

How do I calculate the global efficiency of graph in igraph (python)?

I am trying to calculate the global efficiency of a graph in igraph, but I am not sure if I am using the module correctly. I think there is a solution that might make a bit of sense, but it is in R, and I wasn't able to decipher what they were saying.
I have tried writing the code in a networkx fashion, trying to emulate the way they calculate global efficiency, but I have been unsuccessful so far. I am using igraph because I am dealing with large graphs. Any help would be really appreciated :D
This is what I have tried:
import igraph
import pandas as pd
import numpy as np
from itertools import permutations
datasafe = pd.read_csv("b1.csv", index_col=0)
D = datasafe.values
g = igraph.Graph.Adjacency((D > 0).tolist())
g.es['weight'] = D[D.nonzero()]
def efficiency_weighted(g):
    weights = g.es["weight"][:]
    eff = (1.0 / np.array(g.shortest_paths_dijkstra(weights=weights)))
    return eff
def global_efficiecny_weighted(g):
    n = 180.0
    denom = n*(n-1)
    g_eff = sum(efficiency_weighted(g) for u, v in permutations(g, 2))
    return g_eff
global_efficiecny_weighted(g)
The error message I am getting says: TypeError: 'Graph' object is not iterable
Assuming that you want the nodal efficiency for all nodes, you can do this:
import numpy as np
from igraph import *
np.seterr(divide='ignore')
# Example using a random graph with 20 nodes
g = Graph.Erdos_Renyi(20,0.5)
# Assign weights on the edges. Here 1s everywhere
g.es["weight"] = np.ones(g.ecount())
def nodal_eff(g):
    weights = g.es["weight"][:]
    sp = (1.0 / np.array(g.shortest_paths_dijkstra(weights=weights)))
    np.fill_diagonal(sp, 0)
    N = sp.shape[0]
    ne = (1.0/(N-1)) * np.apply_along_axis(sum, 0, sp)
    return ne
eff = nodal_eff(g)
print(eff)
#[0.68421053 0.81578947 0.73684211 0.76315789 0.76315789 0.71052632
# 0.81578947 0.81578947 0.81578947 0.73684211 0.71052632 0.68421053
# 0.71052632 0.81578947 0.84210526 0.76315789 0.68421053 0.68421053
# 0.78947368 0.76315789]
To get the global efficiency, just take the mean:
np.mean(eff)
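Equivalently, a small convenience wrapper (my addition, not part of the original answer) bundles the two steps:
def global_eff(g):
    # Global efficiency = mean of the nodal efficiencies.
    return np.mean(nodal_eff(g))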

Solving the Lorenz model using Runge-Kutta 4th order in Python without a package

I wish to solve the Lorenz model in Python without the help of a package, but my code does not work as expected. I do not know why I am not getting the expected results and the Lorenz attractor. The main problem, I guess, is related to how the values for the solutions of x, y, and z are stored. Below is my code for the Runge-Kutta 45 for the Lorenz model with a 3D plot of the solutions:
import numpy as np
import matplotlib.pyplot as plt
#from scipy.integrate import odeint
#a) Defining the Runge-Kutta45 method
def fx(x,y,z,t):
    dxdt = sigma*(y-z)
    return dxdt
def fy(x,y,z,t):
    dydt = x*(rho-z)-y
    return dydt
def fz(x,y,z,t):
    dzdt = x*y-beta*z
    return dzdt
def RungeKutta45(x,y,z,fx,fy,fz,t,h):
    k1x,k1y,k1z = h*fx(x,y,z,t), h*fy(x,y,z,t), h*fz(x,y,z,t)
    k2x,k2y,k2z = h*fx(x+k1x/2,y+k1y/2,z+k1z/2,t+h/2), h*fy(x+k1x/2,y+k1y/2,z+k1z/2,t+h/2), h*fz(x+k1x/2,y+k1y/2,z+k1z/2,t+h/2)
    k3x,k3y,k3z = h*fx(x+k2x/2,y+k2y/2,z+k2z/2,t+h/2), h*fy(x+k2x/2,y+k2y/2,z+k2z/2,t+h/2), h*fz(x+k2x/2,y+k2y/2,z+k2z/2,t+h/2)
    k4x,k4y,k4z = h*fx(x+k3x,y+k3y,z+k3z,t+h), h*fy(x+k3x,y+k3y,z+k3z,t+h), h*fz(x+k3x,y+k3y,z+k3z,t+h)
    return x+(k1x+2*k2x+2*k3x+k4x)/6, y+(k1y+2*k2y+2*k3y+k4y)/6, z+(k1z+2*k2z+2*k3z+k4z)/6
sigma=10.
beta=8./3.
rho=28.
tIn=0.
tFin=10.
h=0.05
totalSteps=int(np.floor((tFin-tIn)/h))
t=np.zeros(totalSteps)
x=np.zeros(totalSteps)
y=np.zeros(totalSteps)
z=np.zeros(totalSteps)
for i in range(1, totalSteps):
    x[i-1] = 1.  #Initial condition
    y[i-1] = 1.  #Initial condition
    z[i-1] = 1.  #Initial condition
    t[0] = 0.    #Starting value of t
    t[i] = t[i-1]+h
    x,y,z = RungeKutta45(x,y,z,fx,fy,fz,t[i-1],h)
#Plotting solution
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
fig=plt.figure()
ax=fig.gca(projection='3d')
ax.plot(x,y,z,'r',label='Lorentz 3D Solution')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
ax.legend()
I changed the integration step (by the way, this is the classical 4th-order Runge-Kutta method, not any adaptive RK45) to use Python's core concept of lists and list operations extensively, to reduce the number of places where the computation is defined. There were no errors there to correct, but I think the algorithm is now more concentrated.
You had an error in the system that turned it into one that rapidly diverges: you had fx = sigma*(y-z), while the Lorenz system has fx = sigma*(y-x).
Your main loop also had some strange assignments: in every iteration you first set the previous coordinates to the initial conditions and then replaced the full arrays with the RK step applied to the full arrays. I replaced that completely; there was no small fix that would lead to a correct solution.
import numpy as np
import matplotlib.pyplot as plt
#from scipy.integrate import odeint
def fx(x,y,z,t): return sigma*(y-x)
def fy(x,y,z,t): return x*(rho-z)-y
def fz(x,y,z,t): return x*y-beta*z
#a) Defining the classical Runge-Kutta 4th order method
def RungeKutta4(x,y,z,fx,fy,fz,t,h):
    k1x,k1y,k1z = ( h*f(x,y,z,t) for f in (fx,fy,fz) )
    xs,ys,zs,ts = ( r+0.5*kr for r,kr in zip((x,y,z,t),(k1x,k1y,k1z,h)) )
    k2x,k2y,k2z = ( h*f(xs,ys,zs,ts) for f in (fx,fy,fz) )
    xs,ys,zs,ts = ( r+0.5*kr for r,kr in zip((x,y,z,t),(k2x,k2y,k2z,h)) )
    k3x,k3y,k3z = ( h*f(xs,ys,zs,ts) for f in (fx,fy,fz) )
    xs,ys,zs,ts = ( r+kr for r,kr in zip((x,y,z,t),(k3x,k3y,k3z,h)) )
    k4x,k4y,k4z = ( h*f(xs,ys,zs,ts) for f in (fx,fy,fz) )
    return (r+(k1r+2*k2r+2*k3r+k4r)/6 for r,k1r,k2r,k3r,k4r in
            zip((x,y,z),(k1x,k1y,k1z),(k2x,k2y,k2z),(k3x,k3y,k3z),(k4x,k4y,k4z)))
sigma=10.
beta=8./3.
rho=28.
tIn=0.
tFin=10.
h=0.01
totalSteps=int(np.floor((tFin-tIn)/h))
t = totalSteps * [0.0]
x = totalSteps * [0.0]
y = totalSteps * [0.0]
z = totalSteps * [0.0]
x[0],y[0],z[0],t[0] = 1., 1., 1., 0. #Initial condition
for i in range(1, totalSteps):
    t[i] = t[i-1] + h  # keep the time grid filled (not strictly needed for this autonomous system)
    x[i],y[i],z[i] = RungeKutta4(x[i-1],y[i-1],z[i-1], fx,fy,fz, t[i-1], h)
Using tFin = 40 and h = 0.01 I get an image looking like the typical picture of the Lorenz attractor.
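To reproduce the plot, the 3D plotting code from the question can be reused essentially unchanged. A sketch, using the modern add_subplot API instead of the deprecated fig.gca(projection='3d'):
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # create a 3D axes
ax.plot(x, y, z, 'r', label='Lorenz 3D solution')
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
ax.legend()
plt.show()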

Solving simple ODE using scipy odeint gives straight line at 0

I am trying to solve a simple ODE:
dN/dt = N*(rho(t)-beta)/lambda
Rho is a function of time, and I've generated it using linspace. The code works for other equations but somehow gives a flat straight line at 0 (you can see it in the graph). Any guidance on how to correct it?
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
def model2(N, t, rho):
    beta_val = 0.0065
    lambda_val = 0.00002
    k = (rho - beta_val) / lambda_val
    dNdt = k*N
    print(rho)
    return dNdt
# initial condition
N0 = [0]
# number of time points
n = 200
# time points
t = np.linspace(0,200,n)
rho = np.linspace(6,9,n)
#rho =np.array([6,6.1,6.2,6.3,6.4,6.5,6.6,6.7,6.8,6.9,7.0,7.1,7.2,7.3,7.4,7.5,7.6,7.7,7.8,7.9]) # Array of constants
# store solution
NSol = np.empty_like(t)
# record initial conditions
NSol[0] = N0[0]
# solve ODE
for i in range(1,n):
    # span for next time step
    tspan = [t[i-1],t[i]]
    # solve for next step
    N = odeint(model2,N0,tspan,args=(rho[i],))
    print(N)
    # store solution for plotting
    NSol[i] = N[0][0]
    # next initial condition
    #z0 = N0[0]
# plot results
plt.plot(t,rho,'g:',label='rho(t)')
plt.plot(t,NSol,'b-',label='NSol(t)')
plt.ylabel('values')
plt.xlabel('time')
plt.legend(loc='best')
plt.show()
This is the graph I get after running this code
I modified your code (and the coefficients) to make it work.
When the coefficients also depend on t, they have to be Python functions called by the derivative function. Note also that with N0 = 0 the derivative k*N is identically zero, so the solution stays flat at 0 no matter what; the initial condition has to be nonzero:
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
# Define
def model2(N, t, rho):
    beta_val = 0.0065
    lambda_val = 0.02
    k = (rho(t) - beta_val) / lambda_val
    dNdt = k*N
    return dNdt
def rho(t):
    return .001 + .003/20*t
# Solve
tspan = np.linspace(0, 20, 10)
N0 = .01
N = odeint(model2, N0 , tspan, args=(rho,))
# Plot
plt.plot(tspan, N, label='NSol(t)');
plt.ylabel('N');
plt.xlabel('time'); plt.legend(loc='best');
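If rho is only available as tabulated values (as in the question's linspace array) rather than as a formula, one option (my suggestion, not part of the original answer) is to interpolate the table inside the callback; here I tabulate the same rho(t) as above to keep the magnitudes sane:
t_grid = np.linspace(0, 20, 10)
rho_grid = .001 + .003/20*t_grid  # tabulated version of rho(t) above
def rho_tab(t):
    # Interpolate the tabulated values at whatever times the solver asks for.
    return np.interp(t, t_grid, rho_grid)
N_tab = odeint(model2, .01, t_grid, args=(rho_tab,))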

Outputting coefficients when running linear regression using sklearn

I'm attempting to run a simple linear regression on a data set and retrieve the coefficients. The data, which comes from a .csv file, looks like:
"","time","LakeHuron"
"1",1875,580.38
"2",1876,581.86
"3",1877,580.97
"4",1878,580.8
...
import pandas as pd
import numpy as np
from sklearn import datasets, linear_model
def Main():
    location = r"~/Documents/Time Series/LakeHuron.csv"
    ts = pd.read_csv(location, sep=",", parse_dates=[0], header=None)
    ts.drop(ts.columns[[0]], axis=1, inplace=True)
    length = len(ts)
    x = ts[1].values
    y = ts[2].values
    x = x.reshape(length, 1)
    y = y.reshape(length, 1)
    regr = linear_model.LinearRegression()
    regr.fit(x, y)
    print(regr.coef_)
if __name__ == "__main__":
    Main()
Since this is a simple linear model, $Y_t = a_0 + a_1 t$, which in this case should be $Y_t = 580.202 - 0.0242t$. All that prints out when running the above code is [[-0.02420111]]. Is there any way to get the second coefficient, 580.202?
I've had a look at the documentation on http://scikit-learn.org/stable/modules/linear_model.html and it outputs two variables in the array.
Looks like you only have one X and one Y, so the output is correct.
Try this:
# coef_ : array, shape (n_features,) or (n_targets, n_features)
print(regr.coef_)
# intercept_ : array, the independent term in the linear model
print(regr.intercept_)
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression
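Alternatively (a sketch of my own, not from the original answer), you can get both numbers in a single coefficient array by adding an explicit column of ones and disabling the built-in intercept, mirroring the hand-rolled regression in the first question; x and y here are assumed to be the (n, 1) arrays built inside Main() above:
import numpy as np
from sklearn import linear_model
# Prepend a constant column so the intercept becomes an ordinary coefficient.
x_aug = np.hstack([np.ones_like(x), x])
regr2 = linear_model.LinearRegression(fit_intercept=False)
regr2.fit(x_aug, y)
print(regr2.coef_)  # [[intercept, slope]], i.e. roughly [[580.202, -0.0242]]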
