Trying to find the Jordan form of a given matrix using Colab.
But it always fails or times out.
Not sure why this is failing.
import numpy as np
import sys
from sympy import Matrix
sys.set_int_max_str_digits(15000)
a = np.array([[1, 2, 4, 8], [1, 3, 9, 27], [1, 4, 16, 64], [1, 5, 25, 125]])
m = Matrix(a)
P, J = m.jordan_form()
J
I tried finding the Jordan form in MATLAB and on online calculators like
https://www.wolframalpha.com/input/?i=jordan+normal+form+calculator
It works fine on these platforms.
I am not sure why Colab and Jupyter are not able to compute the Jordan form of this matrix.
Firstly, Colab and Jupyter are simply environments in which you run Python code, and the issue here has nothing to do with Colab, Jupyter, or any other IDE.
Secondly, the reason you do not get a result for your example is algorithmic. The matrix you are using is ill-conditioned: there are roughly four orders of magnitude of difference between the four eigenvalues, and the underlying algorithm gets stuck while trying to calculate the Jordan form.
If you try, as an example:
a = np.array([[5, 4, 2, 1], [0, 1, -1, -1], [-1, -1, 3, 0], [1, 1, -1, 2]])
you will see that your code runs correctly and quickly.
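If you want to see the spread of the eigenvalues yourself, a quick numerical check with NumPy is enough (this is only a diagnostic sketch, not a fix for the symbolic computation):
import numpy as np
a = np.array([[1, 2, 4, 8], [1, 3, 9, 27], [1, 4, 16, 64], [1, 5, 25, 125]])
# numerical eigenvalues, just to inspect how widely their magnitudes are spread
print(np.linalg.eigvals(a))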
I am writing a simple library where, given a dataset, it runs a bunch of analyses and shows a lot of plots mid-way. (There are many plt.show() calls)
I have written simple tests to check if different functions run without any error with pytest.
The problem is that once I run pytest, it starts showing all of these plots, and I have to close them one by one, which takes a lot of time.
How can I silence all the plots and just see if all the tests passed or not?
If your backend supports interactive display with plt.ion(), then you will need only minimal changes (four lines) to your code:
import matplotlib.pyplot as plt

# define a keyword that controls whether interactive mode is turned on
show_kw = True  # <--- added
# show_kw = False

if show_kw:     # <--- added
    plt.ion()   # <--- added

# your usual script
plt.plot([1, 2, 3], [4, 5, 6])
plt.show()
plt.plot([1, 3, 7], [4, 6, -1])
plt.show()
plt.plot([1, 3, 7], [4, 6, -1])
plt.show()

# closes all figure windows if show_kw is True;
# has no effect if no figure window is open
plt.close("all")  # <--- added

print("finished")
However, if the plot generation itself is time-consuming, this will not be feasible, as it only saves you from closing the figures one by one - they will still be generated. In this case, you can switch the backend to a non-GUI version that cannot display the figures:
import matplotlib.pyplot as plt
from matplotlib import get_backend
import warnings

show_kw = True
# show_kw = False

if show_kw:
    curr_backend = get_backend()
    # switch to a non-GUI backend, preventing the plots from being displayed
    plt.switch_backend("Agg")
    # suppress the UserWarning that Agg cannot show plots
    warnings.filterwarnings("ignore", "Matplotlib is currently using agg")

plt.plot([1, 2, 3], [4, 5, 6])
plt.show()
plt.plot([1, 3, 7], [4, 6, -1])
plt.show()
plt.plot([1, 3, 7], [4, 6, -1])
plt.show()

# display some plots selectively
if show_kw:
    # restore the original backend
    plt.switch_backend(curr_backend)

plt.plot([1, 2, 3], [-2, 5, -1])
plt.show()

print("finished")
I have created a MultiDiGraph of the motorway network of the Netherlands using the osmnx package.
The graph is a MultiDiGraph returned by osmnx. Since I am interested in computing the k shortest paths between an origin and a destination, I tried the networkx library. However, networkx does not seem to work with a MultiDiGraph; all I can compute is the single shortest path.
I would like to ask if there is any other way to perform the k-shortest-path computation in Python over a MultiDiGraph.
Try using the networkx command shortest_simple_paths (documentation).
It returns a generator which returns one path at a time from shortest to longest.
import networkx as nx

G = nx.karate_club_graph()
X = nx.shortest_simple_paths(G, 0, 5)
k = 5
for counter, path in enumerate(X):
    print(path)
    if counter == k - 1:
        break
> [0, 5]
> [0, 6, 5]
> [0, 10, 5]
> [0, 6, 16, 5]
> [0, 4, 6, 5]
This will work with DiGraphs, but I'm not sure about a MultiDiGraph. It's not clear to me that a road network would be a MultiDiGraph, however.
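One possible workaround (a sketch, not from the original answer) is to collapse the MultiDiGraph into a plain DiGraph, keeping only the cheapest of any parallel edges, and then run shortest_simple_paths on the result; to_digraph_min_weight, G_osmnx, origin_node and dest_node are placeholder names:
import networkx as nx

def to_digraph_min_weight(G_multi, weight="length"):
    # keep, for every (u, v) pair, only the parallel edge with the smallest weight
    G = nx.DiGraph()
    G.add_nodes_from(G_multi.nodes(data=True))
    for u, v, data in G_multi.edges(data=True):
        w = data.get(weight, 1)
        if not G.has_edge(u, v) or w < G[u][v][weight]:
            G.add_edge(u, v, **{weight: w})
    return G

# G_simple = to_digraph_min_weight(G_osmnx, weight="length")
# paths = nx.shortest_simple_paths(G_simple, origin_node, dest_node, weight="length")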
How does numpy's matrix class work? I understand it will likely be removed in the future, so I am trying to understand its behaviour so I can reproduce it with ndarrays.
>>> x=np.matrix([[1,1,1],[2,2,2],[3,3,3]])
>>> x[:,0] + x[0,:]
matrix([[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
Seems like a row of ones got added to every row.
>>> x=np.matrix([[1,2,3],[1,2,3],[1,2,3]])
>>> x[0,:] + x[:,0]
matrix([[2, 3, 4],
[2, 3, 4],
[2, 3, 4]])
Now it seems like a column of ones got added to every column. What it does with the identity is even weirder:
>>> x=np.matrix([[1,0,0],[0,1,0],[0,0,1]])
>>> x[0,:] + x[:,0]
matrix([[2, 1, 1],
[1, 0, 0],
[1, 0, 0]])
EDIT:
It seems that if you take an (N,1) matrix and add it to a (1,N) matrix, one of these is replicated to form an (N,N) matrix and the other is added to every row or column of this new matrix. It seems to be a convenience restricted to vectors of the right sizes. A nice use case was networkx's implementation of Floyd-Warshall.
Is there an equivalently convenient one-liner for this using standard numpy ndarrays?
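For reference, plain ndarrays give the same result through ordinary broadcasting as long as both slices are kept two-dimensional: an (N,1) column plus a (1,N) row broadcasts to (N,N). A minimal sketch for the first example above:
import numpy as np

x = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
# x[:, 0:1] has shape (3, 1) and x[0:1, :] has shape (1, 3),
# so their sum broadcasts to the same (3, 3) result as the matrix version
print(x[:, 0:1] + x[0:1, :])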
Neither downsampling nor resizing is a feasible option for me, as suggested here.
I tried to pad the shorter lists with NaNs, but that raised an error as well.
Is there any workaround?
My code looks something like this:
from bokeh.charts import output_file, Line, save
lines=[[1,2,3],[1,2]]
output_file("example.html",title="toy code")
p = Line(lines,plot_width=600,plot_height=600, legend=False)
save(p)
However, as you can see below, you can plot two lines of different lengths using multi_line.
From the Bokeh user guide on multiple lines:
from bokeh.plotting import figure, output_file, show
output_file("patch.html")
p = figure(plot_width=400, plot_height=400)
p.multi_line([[1, 3, 2], [3, 4, 6, 6]], [[2, 1, 4], [4, 7, 8, 5]],
             color=["firebrick", "navy"], alpha=[0.8, 0.3], line_width=4)
show(p)
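Applied to the data from the question, that would look roughly like this (a sketch; the x-values are simply the index positions of each list):
from bokeh.plotting import figure, output_file, save

lines = [[1, 2, 3], [1, 2]]
# multi_line takes a list of x-lists and a list of y-lists,
# so the sub-lists are allowed to have different lengths
xs = [list(range(len(y))) for y in lines]

output_file("example.html", title="toy code")
p = figure(plot_width=600, plot_height=600)
p.multi_line(xs, lines, line_width=2)
save(p)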
I have a training set of data. The Python script for creating the model also calculates the attributes into a numpy array (it's a bit vector). I then want to use VarianceThreshold to eliminate all features that have zero variance (e.g. all 0 or all 1). I then run get_support(indices=True) to get the indices of the selected columns.
My issue now is how to get only the selected features for the data I want to predict. I first calculate all the features and then use array indexing, but it does not work:
x_predict_all = getAllFeatures(suppl_predict)
x_predict = x_predict_all[indices] #only selected features
indices is a numpy array.
The returned array x_predict has the correct length len(x_predict), but the wrong shape: x_predict.shape[1] is still the original number of features. My classifier then throws an error because of the wrong shape:
prediction = gbc.predict(x_predict)
File "C:\Python27\lib\site-packages\sklearn\ensemble\gradient_boosting.py", li
ne 1032, in _init_decision_function
self.n_features, X.shape[1]))
ValueError: X.shape[1] should be 1855, not 2090.
How can I solve this issue?
You can do it like this:
Test data
import numpy as np
from sklearn.feature_selection import VarianceThreshold

X = np.array([[0, 2, 0, 3],
              [0, 1, 4, 3],
              [0, 1, 1, 3]])

selector = VarianceThreshold()
Alternative 1
>>> selector.fit(X)
>>> idxs = selector.get_support(indices=True)
>>> X[:, idxs]
array([[2, 0],
[1, 4],
[1, 1]])
Alternative 2
>>> selector.fit_transform(X)
array([[2, 0],
[1, 4],
[1, 1]])
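For the prediction data from the question, the key point is to select columns rather than rows, i.e. index along axis 1 (a sketch; x_predict_all comes from the question, idxs from Alternative 1 above):
# select the same feature columns for the new data (columns, not rows)
x_predict = x_predict_all[:, idxs]

# or, equivalently, reuse the already fitted selector
x_predict = selector.transform(x_predict_all)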