Hiding matplotlib plots while doing tests with pytest - python-3.x

I am writing a simple library where, given a dataset, it runs a bunch of analyses and shows a lot of plots mid-way. (There are many plt.show() calls)
I have written simple pytest tests to check that the different functions run without errors.
The problem is that once I run pytest, it starts showing all of these plots, and I have to close them one by one, which takes a lot of time.
How can I silence all the plots and just see if all the tests passed or not?

If your backend supports interactive display with plt.ion(), then you will need only minimal changes (four lines) to your code:
import matplotlib.pyplot as plt
#define a keyword whether the interactive mode should be turned on
show_kw=True #<--- added
#show_kw=False
if show_kw: #<--- added
    plt.ion() #<--- added
#your usual script
plt.plot([1, 2, 3], [4, 5, 6])
plt.show()
plt.plot([1, 3, 7], [4, 6, -1])
plt.show()
plt.plot([1, 3, 7], [4, 6, -1])
plt.show()
#closes all figure windows if show_kw is True
#has no effect if no figure window is open
plt.close("all") #<--- added
print("finished")
However, if generating the plots is time-consuming, this approach does not help much: it only saves you from closing the windows one by one, and the figures are still generated. In that case, you can switch to a non-GUI backend that cannot display figures:
import matplotlib.pyplot as plt
from matplotlib import get_backend
import warnings
show_kw=True
#show_kw=False
if show_kw:
    curr_backend = get_backend()
    #switch to non-Gui, preventing plots being displayed
    plt.switch_backend("Agg")
    #suppress UserWarning that agg cannot show plots
    warnings.filterwarnings("ignore", "Matplotlib is currently using agg")
plt.plot([1, 2, 3], [4, 5, 6])
plt.show()
plt.plot([1, 3, 7], [4, 6, -1])
plt.show()
plt.plot([1, 3, 7], [4, 6, -1])
plt.show()
#display selectively some plots
if show_kw:
    #restore backend
    plt.switch_backend(curr_backend)
plt.plot([1, 2, 3], [-2, 5, -1])
plt.show()
print("finished")
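For the pytest case specifically, one option is to select the non-GUI Agg backend before pyplot is imported, for instance at the top of the test module or in conftest.py. The sketch below is only an illustration: make_plots() is a hypothetical stand-in for a library function that calls plt.show(), and test_make_plots_runs is an assumed test name, neither of which comes from the question.

```python
import matplotlib
matplotlib.use("Agg")  # choose the non-GUI backend before pyplot is imported
import matplotlib.pyplot as plt


def make_plots():
    """Hypothetical stand-in for a library function that calls plt.show()."""
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [4, 5, 6])
    plt.show()  # under Agg this does not open a window or block
    return fig


def test_make_plots_runs():
    # The test only checks that the function runs and produced one axes.
    fig = make_plots()
    assert len(fig.axes) == 1
    plt.close("all")  # release figures so they do not pile up across tests
```

Setting the environment variable MPLBACKEND=Agg when invoking pytest should have the same effect without touching the code.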

Related

Google Colab Times out

I am trying to find the Jordan form of a given matrix using Colab.
But it always fails or times out.
I am not sure why this is failing.
import numpy as np
import sys
from sympy import Matrix
sys.set_int_max_str_digits(15000)
a = np.array([[1, 2, 4, 8], [1, 3, 9, 27], [1, 4, 16, 64], [1, 5, 25, 125]])
m = Matrix(a)
P, J = m.jordan_form()
J
I tried finding the Jordan form in Matlab and with online calculators like
https://www.wolframalpha.com/input/?i=jordan+normal+form+calculator
It works fine on these platforms.
I am not sure why Colab and Jupyter are not able to compute the Jordan form of the matrix.
Firstly, Colab and Jupyter are simply environments in which you can run Python code; the issue here has nothing to do with using Colab, Jupyter, or any IDE.
Secondly, the reason you do not get a result in your example is algorithmic. The matrix you are using is ill-conditioned: there is a difference of four orders of magnitude between its four eigenvalues, and the underlying algorithm gets stuck while trying to calculate the Jordan form.
If you try, as an example:
a = np.array([[5, 4, 2, 1], [0, 1, -1, -1], [-1, -1, 3, 0], [1, 1, -1, 2]])
you will see that your code works well and fast.
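One way to inspect the conditioning claim numerically is to look at the floating-point eigenvalues and the condition number with NumPy. This is only a rough check under the assumption that a numeric view is informative here; it does not reproduce SymPy's exact symbolic algorithm, and the variable names are illustrative.

```python
import numpy as np

# The matrix from the question, as floats for numerical linear algebra.
a = np.array([[1, 2, 4, 8],
              [1, 3, 9, 27],
              [1, 4, 16, 64],
              [1, 5, 25, 125]], dtype=float)

eigenvalues = np.linalg.eigvals(a)         # floating-point eigenvalues
magnitudes = np.sort(np.abs(eigenvalues))  # sorted by absolute value
spread = magnitudes[-1] / magnitudes[0]    # largest vs. smallest magnitude
condition_number = np.linalg.cond(a)       # 2-norm condition number

print("eigenvalues:", eigenvalues)
print("largest/smallest magnitude:", spread)
print("condition number:", condition_number)
```

A large spread or condition number suggests that small perturbations change the result noticeably, which is consistent with the explanation above.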

What's the most efficient way to scatter plot a pair of 2D lists on top of each other in python?

I am working with Python and matplotlib. Let's say, for example, I have the following lists:
A=[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
B=[[4, 2, 6], [3, 2, 1], [5, 1, 4]]
Each row of these lists represents a single scatter plot, A being the x-axis and B being the y-axis. Is there an efficient way of stacking these scatter plots on top of each other into a single scatter plot? I have already tried a "for" loop:
for i in range(len(A)):
    plt.scatter(A[i], B[i])
It works, but it's a bit slow when working with larger numbers of entries. Is there a more efficient way to do this?
Unless there is a reason to do multiple calls to scatter, I would recommend flattening the lists and doing a single call to plt.scatter like so:
import itertools
import matplotlib.pyplot as plt
A=[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
B=[[4, 2, 6], [3, 2, 1], [5, 1, 4]]
A_flat = list(itertools.chain.from_iterable(A))
B_flat = list(itertools.chain.from_iterable(B))
plt.scatter(A_flat, B_flat)
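If NumPy is available and every row has the same length, np.ravel is another way to flatten the lists before the single scatter call. This is a sketch under that equal-length assumption; ragged rows would still need itertools as above.

```python
import numpy as np
import matplotlib.pyplot as plt

A = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
B = [[4, 2, 6], [3, 2, 1], [5, 1, 4]]

# np.ravel flattens a rectangular nested list row by row into a 1D array.
A_flat = np.ravel(A)
B_flat = np.ravel(B)

# One scatter call draws all nine points in a single collection.
points = plt.scatter(A_flat, B_flat)
```

Call plt.show() afterwards to display the figure.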

Plotting list using matplot lib

I have a list like this:
[[5, 1.1066720079718957], [10, 1.075297753414681], [15, 1.0958222358382397], [20, 1.092081009894558], [25, 1.0968130408510393]]
I am trying to plot it using matplotlib,
where the values 5, 10, 15, 20, 25 are on the x-axis and 1.1066720079718957, 1.075297753414681, 1.0958222358382397, 1.092081009894558, 1.0968130408510393 are on the y-axis.
I am not able to do it in a few lines.
import matplotlib.pyplot as plt
data = [[5, 1.1066720079718957], [10, 1.075297753414681], [15, 1.0958222358382397], [20, 1.092081009894558], [25, 1.0968130408510393]]
plt.plot([i[0] for i in data], [i[1] for i in data])
plt.show()
Maybe these two links will also be useful if you are working with nested lists and list comprehensions:
Nested lists python
What does "list comprehension" mean? How does it work and how can I use it?
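As an alternative to the two list comprehensions, zip(*data) can split the pairs into x and y values in one step. This is a small sketch using the data from the question; the names x, y and line are illustrative.

```python
import matplotlib.pyplot as plt

data = [[5, 1.1066720079718957], [10, 1.075297753414681],
        [15, 1.0958222358382397], [20, 1.092081009894558],
        [25, 1.0968130408510393]]

# zip(*data) transposes the list of [x, y] pairs into two tuples.
x, y = zip(*data)

# plt.plot returns a list of Line2D objects; unpack the single line.
line, = plt.plot(x, y)
```

Call plt.show() afterwards to display the figure.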

How to convert a grouped pandas dataframe into a numpy 3d array and apply right-padding?

In order to feed data into an LSTM network to predict remaining useful life (RUL), I need to create a 3D numpy array (No of machines, No of sequences, No of variables).
I already tried to combine solutions from stackoverflow and managed to create a prototype (which you can see below).
import numpy as np
import tensorflow as tf
import pandas as pd
df = pd.DataFrame({'ID': [1, 1, 2, 3, 3, 3, 3],
                   'V1': [1, 2, 2, 3, 3, 4, 2],
                   'V2': [4, 2, 3, 2, 1, 5, 1],
                   })
df_desired_result = np.array([[[1, 4], [2, 2], [-99, -99]],
                              [[2, 3], [-99, -99], [-99, -99]],
                              [[3, 2], [3, 1], [4, 5]]])
max_len = df['ID'].value_counts().max()
def pad_df(df, cols, max_seq, group_col='ID'):
    array_for_pad = np.array(list(df[cols].groupby(df[group_col]).apply(pd.DataFrame.as_matrix)))
    padded_array = tf.keras.preprocessing.sequence.pad_sequences(array_for_pad,
                                                                 padding='post',
                                                                 maxlen=max_seq,
                                                                 value=-99
                                                                 )
    return padded_array
#testing prototype
pad_df(df, ['V1', 'V2'], max_len)
But when I apply the code above to my data, it applies the right-padding correctly but all values are set to 0.0.
I can't fully figure out this behaviour, I noticed that in the first line of my function, I get returned an array with nested arrays for 'array_for_pad'.
Here is a screenshot of the result: [image: result padding]
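One possible cause, not confirmed by the question, is that pad_sequences converts its output to dtype 'int32' by default, so float values between -1 and 1 would be truncated to 0; passing dtype='float32' may be worth trying. Independently of that, the padding can be sketched with pandas and NumPy alone. The helper name pad_groups and its parameters are hypothetical, and the output is padded to the length of the longest group.

```python
import numpy as np
import pandas as pd


def pad_groups(df, cols, group_col="ID", pad_value=-99.0):
    """Hypothetical helper: right-pad each group's rows into a 3D float array."""
    # One 2D float array of shape (rows in group, len(cols)) per group.
    groups = [g[cols].to_numpy(dtype=float)
              for _, g in df.groupby(group_col, sort=True)]
    max_len = max(len(g) for g in groups)
    # Start from an array filled with the pad value, then copy each group in.
    out = np.full((len(groups), max_len, len(cols)), pad_value, dtype=float)
    for i, g in enumerate(groups):
        out[i, :len(g), :] = g
    return out


df = pd.DataFrame({"ID": [1, 1, 2, 3, 3, 3, 3],
                   "V1": [1, 2, 2, 3, 3, 4, 2],
                   "V2": [4, 2, 3, 2, 1, 5, 1]})
padded = pad_groups(df, ["V1", "V2"])
```

Keeping the array as float from the start avoids any silent integer conversion of the values.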

Is there any way to plot lines of different lengths with bokeh?

Neither downsampling nor resizing is a feasible option for me, as suggested here.
I tried to pad the shorter lists with NaNs, but that threw an error as well.
Is there any workaround?
My code looks something like this:
from bokeh.charts import output_file, Line, save
lines=[[1,2,3],[1,2]]
output_file("example.html",title="toy code")
p = Line(lines,plot_width=600,plot_height=600, legend=False)
save(p)
As the example below shows, you can plot two lines of different lengths with multi_line.
From Bokeh user guide on multiple lines:
from bokeh.plotting import figure, output_file, show
output_file("patch.html")
p = figure(plot_width=400, plot_height=400)
p.multi_line([[1, 3, 2], [3, 4, 6, 6]], [[2, 1, 4], [4, 7, 8, 5]],
             color=["firebrick", "navy"], alpha=[0.8, 0.3], line_width=4)
show(p)
