Can I see all attributes of a pyplot without showing the graph? - python-3.x

I am working on developing homework as a TA for a course at my university.
We are using Otter Grader (an extension of OKPy) to grade student submissions of guided homework we provide through Jupyter Notebooks.
Students are being asked to plot horizontal lines on their plots using matplotlib.pyplot.axhline(), and I am hoping to use an assert call to determine whether they added the horizontal line to their plots.
Is there a way to see all attributes that have been added to a pyplot in matplotlib?

I don't believe there is a way to check directly whether axhline() was called, but you can check whether any line is horizontal by inspecting the Line2D objects stored in the axes' lines attribute.
import matplotlib.pyplot as plt
import numpy as np
def is_horizontal(line2d):
    x, y = line2d.get_data()
    y = np.array(y)  # axhline stores its data as a plain list, not a numpy array
    y_bool = y == y[0]  # boolean array: True where a y value equals the first y value
    return np.all(y_bool)
t = np.linspace(-10, 10, 1000)
plt.plot(t, t**2)
plt.plot(t, t)
plt.axhline(y=5)  # xmin/xmax are axes fractions between 0 and 1; the defaults span the full width
ax = plt.gca()
assert any(map(is_horizontal, ax.lines)), 'There are no horizontal lines on the plot.'
plt.show()
This code raises an AssertionError unless at least one Line2D object contains data in which all the y values are the same.
Note that for the above to work, the axhline function has to be used instead of the hlines function. hlines adds a LineCollection to the axes (in ax.collections) rather than a Line2D object to ax.lines.
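If a grader should accept either axhline or hlines, one option is to look in both ax.lines and ax.collections. The following is only a minimal sketch of that idea and not part of the original answer: the helper name has_horizontal_line is hypothetical, and the Agg backend is selected purely so that nothing is displayed while the check runs.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection


def has_horizontal_line(ax):
    """Hypothetical helper: True if ax holds a horizontal Line2D or LineCollection segment."""
    # Lines added by plot() and axhline() live in ax.lines
    for line in ax.lines:
        y = np.asarray(line.get_ydata(), dtype=float)
        if y.size >= 2 and np.all(y == y[0]):
            return True
    # Lines added by hlines() live in ax.collections as a LineCollection
    for coll in ax.collections:
        if isinstance(coll, LineCollection):
            for seg in coll.get_segments():
                seg = np.asarray(seg)
                if len(seg) >= 2 and np.all(seg[:, 1] == seg[0, 1]):
                    return True
    return False


t = np.linspace(-10, 10, 1000)
fig, (ax_a, ax_b) = plt.subplots(1, 2)

ax_a.plot(t, t**2)
before = has_horizontal_line(ax_a)         # only a parabola so far
ax_a.axhline(y=5)
after_axhline = has_horizontal_line(ax_a)  # Line2D found in ax.lines

ax_b.plot(t, t**2)
ax_b.hlines(y=5, xmin=-10, xmax=10)
after_hlines = has_horizontal_line(ax_b)   # segment found in ax.collections

plt.close(fig)  # the figure is never shown
```

In an Otter Grader test, the same helper could be called on plt.gca() inside an assert, as in the answer above.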

Related

Using setp to hide axes spines

I am trying to use setp in matplotlib to set the visibility of the spines to False, but I get the error "AttributeError: 'str' object has no attribute 'update'".
As far as I understand, setp can change the properties of an iterable of objects, and I want to apply it to the spines.
What is the correct syntax to use setp effectively?
Here is an MWE:
import matplotlib.pyplot as plt
x = range(0,10)
y = [i*i for i in x]
plt.plot(x,y) #Plotting x against y
axes = plt.gca() #Getting the current axis
axes.spines['top'].set_visible(False) #It works
plt.setp(axes.spines, visible=False) #It raises the error
plt.show() #Showing the plot
Versions: python3.8.2, Matplotlib 3.2.1
In Matplotlib 3.2.1, axes.spines is an OrderedDict. When you iterate over a dict or OrderedDict like this:
for key in axes.spines:
    print(type(key))
you are iterating over the keys, which are strings and have no update method. You can see which properties can be set with plt.setp() by passing in just the iterable or object, like so:
plt.setp(axes.spines)
This returns None, because it refers to the keys, which are strings and have no update method.
Following the same logic, if we try this:
plt.setp(axes.spines.values())
we see that this does return the possible arguments.
So, in summary, changing plt.setp(axes.spines, visible=False) to plt.setp(axes.spines.values(), visible=False) hides all the spines, since it iterates over the spine objects and not the keys.
Full code:
import matplotlib.pyplot as plt
x = range(0,10)
y = [i*i for i in x]
plt.plot(x,y) #Plotting x against y
axes = plt.gca() #Getting the current axis
axes.spines['top'].set_visible(False)
plt.setp(axes.spines.values(), visible=False)
plt.show() #Showing the plot
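One way to confirm the effect without looking at the figure is to read the visibility back from each spine. This is a small sketch rather than part of the original answer; the variable name visibility is made up for illustration, and the Agg backend is chosen only to avoid opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = range(0, 10)
ax.plot(x, [i * i for i in x])

# Pass the spine objects (the values), not the string keys, to setp
plt.setp(ax.spines.values(), visible=False)

# Read the state back: maps each spine name to its visibility flag
visibility = {name: spine.get_visible() for name, spine in ax.spines.items()}
plt.close(fig)
```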
I will post my desperate solution, only for the record and in case it helps somebody, though @axe319's answer can hardly be beaten.
I just had to iterate over the spine names:
spine_names = ('top','right', 'bottom', 'left')
for spine_name in spine_names:
    axes.spines[spine_name].set_visible(False)
It works, but it is not so elegant or flexible and, obviously, it gives up on using setp :-\
Warning:
Somebody might think that an alternative solution is
axes.set_frame_on(False)
But it is not. I tried it: although it does hide all the spines at once, just as set_visible(False) does, the command axes.spines[spine_name].set_visible(True) has no effect afterwards!

Legend overwritten by plot - matplotlib

I have a plot that looks as follows:
I want to put labels for both the line plot and the markers in red. However, the legend is not appearing because the plot is taking up its space.
Update
It turns out I cannot put several strings in plt.legend()
I made the figure bigger by using the following:
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
However, now I have only one label in the legend, with the marker appearing on the line, whereas I want two: one for the marker alone and another for the line alone:
Updated code:
plt.plot(range(len(y)), y, '-bD', c='blue', markerfacecolor='red', markeredgecolor='k', markevery=rare_cases, label='%s' % target_var_name)
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
# changed this over here
plt.legend()
plt.savefig(output_folder + fig_name)
plt.close()
What you want to do (have two labels for a single object) is not completely impossible but it's MUCH easier to plot separately the line and the rare values, e.g.
# boilerplate
import numpy as np
import matplotlib.pyplot as plt
# synthesize some data
N = 501
t = np.linspace(0, 10, N)
s = np.sin(np.pi*t)
rare = np.zeros(N, dtype=bool)
rare[:20] = True
np.random.shuffle(rare)
plt.plot(t, s, label='Curve')
plt.scatter(t[rare], s[rare], label='rare')
plt.legend()
plt.show()
Update
[...] it turns out I cannot put several strings in plt.legend()
Well, you can, as long as ① the several strings are in an iterable (a tuple or a list) and ② the number of strings (i.e., labels) equals the number of artists (i.e., thingies) in the plot.
plt.legend(('a', 'b', 'c'))
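As a rough illustration of that rule (not taken from the original answer), the sketch below draws two artists, passes two labels, and reads the legend entries back. The data and the label names are made up, and the Agg backend is used only so that no window opens.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 10, 101)
s = np.sin(np.pi * t)

fig, ax = plt.subplots()
ax.plot(t, s)                 # first artist: the curve
ax.scatter(t[::10], s[::10])  # second artist: a few highlighted points

# Two artists, two labels, given as one iterable
legend = ax.legend(("curve", "highlighted"))
legend_labels = [text.get_text() for text in legend.get_texts()]
plt.close(fig)
```

Passing label= to each plotting call and then calling legend() with no arguments, as in the example further up, is usually less fragile because the labels cannot drift out of order.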

Python: Pickle.load function returns the correct 3d-scatter plot, but is not interactive anymore

This is my first question here, so let me know if I should make any improvements regarding, e.g., the formulation of the question, the code and so on.
I am creating several 3-D scatter plots in Python and want to save them for later reuse and comparison. I am using Qt5 as the graphics backend in Spyder, which perfectly displays my interactive 3-D scatter plot (so I can rotate it around the axes and flip it) when I run the original code.
Now I am able to successfully save the created plot and also load it into a new script, which opens the plot in Qt5 as well. But somehow the interactivity is gone, meaning I am no longer able to rotate the plot around the axes or flip it.
I was unable to find any guidance to that issue or find any person with a similar problem. Do you guys have any idea? I'll put the relevant part of my sample Code below:
""" First script """
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import pandas as pd
import pickle
testdf = pd.DataFrame({"X" : x, "Y" : y, "Z" : z}) #x and y are the criteria, z the values, stored as lists
# Create 3d scatter plot
fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(111, projection="3d")
ax.scatter(x, y, z, c=z, marker="o")
ax.set_xlabel("Initial Notional Cluster")
ax.set_ylabel("Laufzeit in Month Cluster")
ax.set_zlabel("Vol. Weighted Margin")
plt.show()
# Save the figure object as binary file
file = open(r"Init_Lfz_VolWeightedMargin.pkl", "wb")
pickle.dump(fig, file)
file.close()
""" Second script """
import pickle
import matplotlib.pyplot as plt
figx = pickle.load(open(r"Init_Lfz_VolWeightedMargin.pkl", "rb"))
plt.show()
Any idea why the interactivity is gone? According to the pickle documentation and other use cases, this should not happen.
Many thanks.

Add labels to each box in seaborn's factorplot boxplot

I know there are similar answers such as this one, but that one applies to seaborn's boxplot and it's not working for me with seaborn's factorplot. On a simple factorplot:
import seaborn as sns
tips = sns.load_dataset("tips")
means = tips.groupby(["sex","smoker","time"])["tip"].mean().values
means_labels = [str(int(s)) for s in means]
with sns.plotting_context("notebook",font_scale=2):
    g = sns.factorplot(x="sex", y="total_bill", hue="smoker",
                       col="time", data=tips, kind="box", size=6, aspect=.7)
How can one add an annotation (in the example above, the means_labels) below each box, like this:
As I said, I tried using the answer above to at least try to get the position of each box:
import matplotlib.pyplot as plt
ax = plt.gca()
pos = range(len(means))
for tick,label in zip(pos,ax.get_xticklabels()):
    ax.text(pos[tick], means[tick] + 0.5, means_labels[tick],
            horizontalalignment='center', color='r', weight='semibold')
But this produces:
I believe this is because I'm passing the whole plot's axes instead of the "factorplot" axes. But I couldn't find a way to do so (if instead of ax=plt.gca() I use, like in the example, ax=sns.factorplot(...), I get the error: AttributeError: module 'seaborn' has no attribute 'gca').

Plotting a chart in which the Y data is text and the X data is numeric, from a dictionary. Matplotlib & Python 3 [duplicate]

I can create a simple bar chart in matplotlib from a 'simple' dictionary:
import matplotlib.pyplot as plt
D = {u'Label1':26, u'Label2': 17, u'Label3':30}
plt.bar(range(len(D)), D.values(), align='center')
plt.xticks(range(len(D)), D.keys())
plt.show()
But I do not know how to draw a line plot from the text and numeric data of a dictionary like this one:
T_OLD = {'10': 'need1', '11': 'need2', '12': 'need1', '13': 'need2', '14': 'need1'}
Like the picture below
You may use numpy to convert the dictionary to an array with two columns, which can be plotted.
import matplotlib.pyplot as plt
import numpy as np
T_OLD = {'10' : 'need1', '11':'need2', '12':'need1', '13':'need2','14':'need1'}
x = list(zip(*T_OLD.items()))
# sort array, since dictionary is unsorted
x = np.array(x)[:,np.argsort(x[0])].T
# let second column be "True" if "need2", else be "False"
x[:,1] = (x[:,1] == "need2").astype(int)
# plot the two columns of the array
plt.plot(x[:,0], x[:,1])
#set the labels accordingly
plt.gca().set_yticks([0,1])
plt.gca().set_yticklabels(['need1', 'need2'])
plt.show()
The following version is independent of the actual content of the dictionary; the only assumption is that the keys can be converted to floats.
import matplotlib.pyplot as plt
import numpy as np
T_OLD = {'10': 'run', '11': 'tea', '12': 'mathematics', '13': 'run', '14' :'chemistry'}
x = np.array(list(zip(*T_OLD.items())))
u, ind = np.unique(x[1,:], return_inverse=True)
x[1,:] = ind
x = x.astype(float)[:,np.argsort(x[0])].T
# plot the two columns of the array
plt.plot(x[:,0], x[:,1])
#set the labels accordingly
plt.gca().set_yticks(range(len(u)))
plt.gca().set_yticklabels(u)
plt.show()
Use numeric values for your y-axis ticks, and then map them to desired strings with plt.yticks():
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# example data
times = pd.date_range(start='2017-10-17 00:00', end='2017-10-17 5:00', freq='H')
data = np.random.choice([0,1], size=len(times))
data_labels = ['need1','need2']
fig, ax = plt.subplots()
ax.plot(times, data, marker='o', linestyle="None")
plt.yticks([0, 1], data_labels)  # one tick position per label
plt.xlabel("time")
Note: It's generally not a good idea to use a line graph to represent categorical changes in time (e.g. from need1 to need2). Doing that gives the visual impression of a continuum between time points, which may not be accurate. Here, I changed the plotting style to points instead of lines. If for some reason you need the lines, just remove linestyle="None" from the call to ax.plot().
UPDATE
(per comments)
To make this work with a y-axis category set of arbitrary length, use ax.set_yticks() and ax.set_yticklabels() to map to y-axis values.
For example, given a set of potential y-axis values, labels, let N be the size of a subset of labels (here we'll set it to 4, but it could be any size).
Then draw a random sample, data, of y values and plot it against time, labeling the y-axis ticks based on the full set labels. Note that we still use set_yticks() first with numerical markers, and then replace them with our category labels using set_yticklabels().
labels = np.array(['A','B','C','D','E','F','G'])
N = 4
# example data
times = pd.date_range(start='2017-10-17 00:00', end='2017-10-17 5:00', freq='H')
data = np.random.choice(np.arange(len(labels)), size=len(times))
fig, ax = plt.subplots(figsize=(15,10))
ax.plot(times, data, marker='o', linestyle="None")
ax.set_yticks(np.arange(len(labels)))
ax.set_yticklabels(labels)
plt.xlabel("time")
This gives the exact desired plot:
import matplotlib.pyplot as plt
from collections import OrderedDict
T_OLD = {'10' : 'need1', '11':'need2', '12':'need1', '13':'need2','14':'need1'}
T_SRT = OrderedDict(sorted(T_OLD.items(), key=lambda t: t[0]))
plt.plot(map(int, T_SRT.keys()), map(lambda x: int(x[-1]), T_SRT.values()),'r')
plt.ylim([0.9,2.1])
ax = plt.gca()
ax.set_yticks([1,2])
ax.set_yticklabels(['need1', 'need2'])
plt.title('T_OLD')
plt.xlabel('time')
plt.ylabel('need')
plt.show()
For Python 3.X the plotting line needs to explicitly convert the map() output to lists:
plt.plot(list(map(int, T_SRT.keys())), list(map(lambda x: int(x[-1]), T_SRT.values())),'r')
as in Python 3.X map() returns an iterator as opposed to a list in Python 2.7.
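The difference can be seen without any plotting. The snippet below is a small illustration written for Python 3 only; the variable names are made up.

```python
keys = ['10', '11', '12']

mapped = map(int, keys)
is_list = isinstance(mapped, list)  # in Python 3, map() does not return a list

as_list = list(mapped)   # consume the iterator once to get the values
leftover = list(mapped)  # a second pass finds the iterator already exhausted
```

Because the iterator can only be consumed once, converting it to a list up front also makes the values safe to reuse, for example for both plotting and computing axis limits.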
The plot uses the dictionary keys converted to ints and the last characters of need1 or need2, also converted to ints. This relies on the particular structure of your data; if the values were need1 and need3, it would need a couple more operations.
After plotting and changing the axes limits, the program simply modifies the tick labels at y positions 1 and 2. It then also adds the title and the x and y axis labels.
The important part is that the dictionary/input data has to be sorted. One way to do this is to use an OrderedDict. Here T_SRT is an OrderedDict object sorted by the keys in T_OLD.
The output is:
This is a more general case for more values/labels in T_OLD. It assumes that the label is always 'needX', where X is any number. It can readily be extended to the general case of any string preceding the number, though that would require more processing.
import matplotlib.pyplot as plt
from collections import OrderedDict
import re
T_OLD = {'10' : 'need1', '11':'need8', '12':'need11', '13':'need1','14':'need3'}
T_SRT = OrderedDict(sorted(T_OLD.items(), key=lambda t: t[0]))
x_val = list(map(int, T_SRT.keys()))
y_val = list(map(lambda x: int(re.findall(r'\d+', x)[-1]), T_SRT.values()))
plt.plot(x_val, y_val,'r')
plt.ylim([0.9*min(y_val),1.1*max(y_val)])
ax = plt.gca()
y_axis = list(set(y_val))
ax.set_yticks(y_axis)
ax.set_yticklabels(['need' + str(i) for i in y_axis])
plt.title('T_OLD')
plt.xlabel('time')
plt.ylabel('need')
plt.show()
This solution finds the number at the end of the label using re.findall, to accommodate the possibility of multi-digit numbers. The previous solution just took the last character of the string because the numbers were single digits. It still assumes that the number used for the plotting position is the last number in the string, hence the [-1]. Again, for Python 3.X the map output is explicitly converted to a list, a step that is not necessary in Python 2.7.
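The extraction step can be checked in isolation. The snippet below is only a sketch: the label 'v2need3' is a made-up example added to show why the last match is taken.

```python
import re

labels = ['need1', 'need8', 'need11', 'v2need3']

# Taking only the last character breaks on multi-digit numbers ('need11' gives '1')
last_char = [label[-1] for label in labels]

# re.findall returns every run of digits; [-1] keeps the final one
numbers = [int(re.findall(r'\d+', label)[-1]) for label in labels]
```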
The labels are now generated by first selecting unique y-values using set and then renaming their labels through concatenation of the strings 'need' with its corresponding integer.
The limits of the y-axis are set to 0.9 times the minimum value and 1.1 times the maximum value. The rest of the formatting is as before.
The result for this test case is:
