matplotlib scatter issue with python 3.x

I just updated my system from Python 2.x to Python 3.x via the Anaconda distribution. My script, which was compatible with Python 2.x, no longer works properly. I've fixed most of it, but I have no clue how to fix the error from matplotlib's scatter. I want to plot scatter points (circles) that are color coded by a calculated statistical value. Each circle is labeled accordingly.
From googling around, it seems that a bug was found in matplotlib (with Python 3.x) where scatter does not work when its input arguments are iterator types. I am not sure whether this bug has been fixed in the latest version of matplotlib.
Partial code:
n=[2,4,5,6,7,8,12]
XPOS, YPOS = [0,1,2,3,4,5,6], [0,1,2,3,4,5,6]
data = np.loadtxt(infile)
value = data[:,1]
stat = median_absolute_deviation(value)*1000.
for i in range(7):
    plt.scatter(XPOS[i],YPOS[i], s=1500, c=stat, cmap='RdYlGn_r', edgecolors='black', vmin=0.1, vmax=1.0)
    plt.text(XPOS[i], YPOS[i], n[i])
File "//anaconda3/lib/python3.7/site-packages/matplotlib/pyplot.py", line 2841, in scatter
None else {}), **kwargs)
File "//anaconda3/lib/python3.7/site-packages/matplotlib/__init__.py", line 1589, in inner
return func(ax, *map(sanitize_sequence, args), **kwargs)
File "//anaconda3/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 4446, in scatter
get_next_color_func=self._get_patches_for_fill.get_next_color)
File "//anaconda3/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 4257, in _parse_scatter_color_args
n_elem = c_array.shape[0]
IndexError: tuple index out of range

I just tried to reproduce this. It seems to work if you pass x, y and c as lists rather than as scalars:
import numpy as np
import matplotlib.pyplot as plt
n = [2,4,5,6,7,8,12]
XPOS, YPOS = [0,1,2,3,4,5,6], [0,1,2,3,4,5,6]
N = 8
colors = np.linspace(0, 1, N)
for i in range(N-1):
    plt.scatter([XPOS[i]], [YPOS[i]], s=1500, c=[colors[i]], cmap='RdYlGn_r',
                edgecolors='black', vmin=0.1, vmax=1.0)
    plt.text(XPOS[i], YPOS[i], n[i])
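As a variation on the loop above, all points can be drawn with one scatter call if x, y and c are passed as equal-length sequences. This is only a sketch: it assumes there is one statistic per point, and the `stats` array below is made-up stand-in data, not the values from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

labels = [2, 4, 5, 6, 7, 8, 12]
xpos = np.arange(7)
ypos = np.arange(7)
# Hypothetical per-point statistics standing in for the question's values
stats = np.linspace(0.1, 1.0, len(labels))

fig, ax = plt.subplots()
# x, y and c are sequences of the same length, so c is never a bare scalar
points = ax.scatter(xpos, ypos, s=1500, c=stats, cmap="RdYlGn_r",
                    edgecolors="black", vmin=0.1, vmax=1.0)
for x, y, label in zip(xpos, ypos, labels):
    ax.text(x, y, str(label), ha="center", va="center")
fig.colorbar(points, ax=ax, label="statistic")
```

If the same statistic really applies to every circle, repeating it (for example `c=[stat] * 7`) keeps the argument a sequence.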

Related

datetime64 values, ufunc 'isfinite' not supported for the input types

I would like to plot graphs with a gradient fill. As an example, here is the code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
from matplotlib.patches import Polygon
def main():
    # This works
    # x = np.array([0,1,2])
    # This not
    x = np.array('2015-12-23 12:51:00',dtype='datetime64[ns]')
    x = np.insert(x, 0, ('2015-12-24 12:51:00'))
    x = np.insert(x, 0, ('2015-12-25 12:51:00'))
    y = np.array([1,2,99])
    gradient_fill(x, y)
    plt.show()
def gradient_fill(x, y, fill_color=None, ax=None, **kwargs):
    """
    Plot a line with a linear alpha gradient filled beneath it.

    Parameters
    ----------
    x, y : array-like
        The data values of the line.
    fill_color : a matplotlib color specifier (string, tuple) or None
        The color for the fill. If None, the color of the line will be used.
    ax : a matplotlib Axes instance
        The axes to plot on. If None, the current pyplot axes will be used.
    Additional arguments are passed on to matplotlib's ``plot`` function.

    Returns
    -------
    line : a Line2D instance
        The line plotted.
    im : an AxesImage instance
        The transparent gradient clipped to just the area beneath the curve.
    """
    if ax is None:
        ax = plt.gca()
    line, = ax.plot(x, y, **kwargs)
    if fill_color is None:
        fill_color = line.get_color()
    zorder = line.get_zorder()
    alpha = line.get_alpha()
    alpha = 1.0 if alpha is None else alpha
    z = np.empty((100, 1, 4), dtype=float)
    rgb = mcolors.colorConverter.to_rgb(fill_color)
    z[:,:,:3] = rgb
    z[:,:,-1] = np.linspace(0, alpha, 100)[:,None]
    xmin, xmax, ymin, ymax = x.min(), x.max(), y.min(), y.max()
    im = ax.imshow(z, aspect='auto', extent=[xmin, xmax, ymin, ymax],
                   origin='lower', zorder=zorder)
    xy = np.column_stack([x, y])
    xy = np.vstack([[xmin, ymin], xy, [xmax, ymin], [xmin, ymin]])
    clip_path = Polygon(xy, facecolor='none', edgecolor='none', closed=True)
    ax.add_patch(clip_path)
    im.set_clip_path(clip_path)
    ax.autoscale(True)
    return line, im

main()
If the x values have the datatype float it works. If I use the datetime values as x the following error is shown:
if not np.any(np.isfinite(xys)):
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
I've installed numpy-1.23.1. This is the same error as discussed in: MatPlotLib, datetimes, and TypeError: ufunc 'isfinite' not supported for the input types…
There was an issue and a pull request for this years ago: https://github.com/numpy/numpy/pull/7856
As I understand it, this should have been fixed as of NumPy 1.17.
This appears to be a matplotlib bug; see https://github.com/matplotlib/matplotlib/issues/22105
This has been fixed in matplotlib: the extent values passed to imshow may now carry units, so datetime values are supported. The fix will be out in the next release.
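Until a release with that fix is installed, one possible workaround is to convert the datetime values to matplotlib's float date numbers before they reach imshow, and to tell the axis to format them as dates. The following is a minimal sketch under that assumption; the simple `Blues` gradient is only illustrative and is not the question's gradient_fill function.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

x = np.array(["2015-12-23T12:51", "2015-12-24T12:51", "2015-12-25T12:51"],
             dtype="datetime64[ns]")
y = np.array([1, 2, 99])

# Plain floats (days since matplotlib's date epoch) instead of datetime64 values
x_num = mdates.date2num(x)

fig, ax = plt.subplots()
ax.plot(x_num, y)
gradient = np.linspace(0, 1, 100).reshape(-1, 1)
im = ax.imshow(gradient, aspect="auto", origin="lower", cmap="Blues", alpha=0.5,
               extent=[x_num.min(), x_num.max(), y.min(), y.max()])
ax.xaxis_date()  # show the float positions as dates on the x-axis
```

Passing `x_num` instead of `x` to a function like gradient_fill should have the same effect, since every later step (extent, column_stack, Polygon) then works on floats.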

Can I see all attributes of a pyplot without showing the graph?

I am working on developing homework as a TA for a course at my university.
We are using Otter Grader (an extension of OKPy) to grade student submissions of guided homework we provide through Jupyter Notebooks.
Students are being asked to plot horizontal lines on their plots using matplotlib.pyplot.axhline(), and I am hoping to use an assert call to determine whether they added the horizontal line to their plots.
Is there a way to see all attributes that have been added to a pyplot in matplotlib?
I don't believe there is a way to check whether axhline itself was called, but you can check whether any of the lines are horizontal by going through the Line2D objects in the axes' lines attribute.
import matplotlib.pyplot as plt
import numpy as np
def is_horizontal(line2d):
    x, y = line2d.get_data()
    y = np.array(y) # The axhline method does not return data as a numpy array
    y_bool = y == y[0] # Returns a boolean array of True or False if the first number equals all the other numbers
    return np.all(y_bool)
t = np.linspace(-10, 10, 1000)
plt.plot(t, t**2)
plt.plot(t, t)
plt.axhline(y=5) # xmin/xmax are axes fractions between 0 and 1; the defaults already span the full width
ax = plt.gca()
assert any(map(is_horizontal, ax.lines)), 'There are no horizontal lines on the plot.'
plt.show()
This code raises an AssertionError unless at least one Line2D object on the axes contains data in which all the y values are the same.
Note that for the above to work, axhline has to be used rather than hlines. The hlines method does not add a Line2D object to ax.lines; it adds a LineCollection to ax.collections instead.
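If students might use either function, the check could be extended to look at ax.collections as well. This is a rough sketch; `has_horizontal_line` is a hypothetical helper name, and it treats any line whose y values are all equal as horizontal.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

def has_horizontal_line(ax):
    """Return True if the axes hold a horizontal Line2D or LineCollection segment."""
    # Lines added by plot() or axhline() live in ax.lines
    for line in ax.lines:
        y = np.asarray(line.get_ydata(), dtype=float)
        if y.size > 1 and np.all(y == y[0]):
            return True
    # Lines added by hlines() live in ax.collections as a LineCollection
    for coll in ax.collections:
        if isinstance(coll, LineCollection):
            for seg in coll.get_segments():
                if len(seg) > 1 and np.all(seg[:, 1] == seg[0, 1]):
                    return True
    return False

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
found_before = has_horizontal_line(ax)  # only a diagonal line so far
ax.hlines(y=0.5, xmin=0, xmax=1)
found_after = has_horizontal_line(ax)   # the hlines segment is now detected
```

In a grader, the assert would then be `assert has_horizontal_line(plt.gca())`.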

Bar plot with different minimal value for each bar

I'm trying to reproduce this type of graph:
Basically, the y-axis represents the start and end dates of a phenomenon for each year.
But here is what I get when I try to plot my data:
It seems that, no matter what, the bar for each year is plotted from the minimum value of the y-axis.
Here is the data I use
Here is my code :
select=pd.read_excel("./writer.xlsx")
select=pd.DataFrame(select)
select["dte"]=pd.to_datetime(select.dte)
select["month_day"]=pd.DatetimeIndex(select.dte).strftime('%B %d')
select["month"]=pd.DatetimeIndex(select.dte).month
select["day"]=pd.DatetimeIndex(select.dte).day
gs=gridspec.GridSpec(2,2)
fig=plt.figure()
ax1=plt.subplot(gs[0,0])
ax2=plt.subplot(gs[0,1])
ax3=plt.subplot(gs[1,:])
###2 others graphs that works just fine
data=pd.DataFrame()
del select["res"],select["Seuil"],select["Seuil%"] #these don't matter for that graph
for year_ in list(set(select.dteYear)):
    temp=select.loc[select["dteYear"]==year_]
    temp2=temp.iloc[[0,-1]] #the beginning and ending of the phenomenon
    data=pd.concat([data,temp2]).reset_index(drop=True)
data=data.sort_values(["month","day"])
ax3.bar(data["dteYear"],data["month_day"],tick_label=data["dteYear"])
plt.show()
If you have any clue that could help me, I'd really appreciate it, because I haven't found any example of how to make this type of graph.
Thanks!
EDIT :
I tried something else :
height,bottom,x_position=[], [], []
for year_ in list(set(select.dteYear)):
    temp=select.loc[select["dteYear"]==year_]
    bottom.append(temp["month_day"].iloc[0])
    height.append(temp["month_day"].iloc[-1])
    x_position.append(year_)
    temp2=temp.iloc[[0,-1]]
    data=pd.concat([data,temp2]).reset_index(drop=True)
ax3.bar(x=x_position,height=height,bottom=bottom,tick_label=x_position)
I got this error :
Traceback (most recent call last):
File "C:\Users\E31\Documents\cours\stage_dossier\projet_python\tool_etiage\test.py", line 103, in <module>
ax3.bar(x=x_position,height=height,bottom=bottom,tick_label=x_position)
File "C:\Users\E31\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\__init__.py", line 1352, in inner
return func(ax, *map(sanitize_sequence, args), **kwargs)
File "C:\Users\E31\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\axes\_axes.py", line 2357, in bar
r = mpatches.Rectangle(
File "C:\Users\E31\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\patches.py", line 752, in __init__
super().__init__(**kwargs)
File "C:\Users\E31\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\patches.py", line 101, in __init__
self.set_linewidth(linewidth)
File "C:\Users\E31\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\patches.py", line 406, in set_linewidth
self._linewidth = float(w)
TypeError: only size-1 arrays can be converted to Python scalars
To make a bar graph that shows the span between two dates, start by getting your data into a clean format in the dataframe, so that the bottom and top values of the bar are easy to access for each year you are plotting. After that, you can simply plot the bars and pass the 'bottom' parameter. The hardest part in your case may be specifying the datetime differences correctly. I added an x tick locator and a y tick formatter for the datetimes.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.dates as mdates
# make function that returns a random datetime
# between a start and stop date
def random_date(start, stop):
    days = (stop - start).days
    rand = np.random.randint(days)
    return start + pd.Timedelta(rand, unit='days')
# simulate poster's data
T1 = pd.to_datetime('July 1 2021')
T2 = pd.to_datetime('August 1 2021')
T3 = pd.to_datetime('November 1 2021')
df = pd.DataFrame({
    'year' : np.random.choice(np.arange(1969, 2020), size=15, replace=False),
    'bottom' : [random_date(T1, T2) for x in range(15)],
    'top' : [random_date(T2, T3) for x in range(15)],
}).sort_values(by='year').set_index('year')
# define fig/ax and figsize
fig, ax = plt.subplots(figsize=(16,8))
# plot data
ax.bar(
    x = df.index,
    height = (df.top - df.bottom),
    bottom = df.bottom,
    color = '#9e7711'
)
# add x_locator (every 2 years), y tick datetime formatter, grid
# hide top/right spines, and rotate the x ticks for readability
x_locator = ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(2))
y_formatter = ax.yaxis.set_major_formatter(mdates.DateFormatter('%d %b'))
tick_params = ax.tick_params(axis='x', rotation=45)
grid = ax.grid(axis='y', dashes=(8,3), alpha=0.3, color='gray')
hide_spines = [ax.spines[s].set_visible(False) for s in ['top','right']]
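The answer above simulates data that is already in bottom/top form. For data like the question's, with one row per observed date, the reshaping step might look like the sketch below. The small `select` frame is made-up stand-in data, and projecting every date onto the reference year 2000 is an assumption made here so that all years share one y-axis.

```python
import pandas as pd

# Hypothetical stand-in for the question's spreadsheet: one row per observed day
select = pd.DataFrame({
    "dte": pd.to_datetime(["2001-07-10", "2001-08-02", "2001-09-15",
                           "2002-06-28", "2002-10-03"]),
})
select["dteYear"] = select["dte"].dt.year

# Move every date into the same reference year (2000 is a leap year, so Feb 29 is valid)
same_year = pd.to_datetime(select["dte"].dt.strftime("2000-%m-%d"))

# First and last date of the phenomenon for each year
spans = (same_year.groupby(select["dteYear"])
         .agg(["min", "max"])
         .rename(columns={"min": "bottom", "max": "top"}))
spans["height"] = spans["top"] - spans["bottom"]
```

`spans.index`, `spans["height"]` and `spans["bottom"]` can then be passed to ax.bar as x, height and bottom, in the same way as in the answer's code.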

AttributeError: module 'pandas.plotting' has no attribute '_matplotlib'

I am generating a graph using pandas. When I run my script locally, I get this error:
AttributeError: module 'pandas.plotting' has no attribute '_matplotlib'
I installed pandas for both my user and system. The operating system is WSL2 Ubuntu 20.04.
This is the relevant part of my code:
import pandas as pd
import matplotlib.pyplot as plt
def plot_multi(data, cols=None, spacing=.1, **kwargs):
    from pandas import plotting
    # Get default color style from pandas - can be changed to any other color list
    if cols is None: cols = data.columns
    if len(cols) == 0: return
    colors = getattr(getattr(plotting, '_matplotlib').style, '_get_standard_colors')(num_colors=len(cols))
    # First axis
    print(data.loc[:, cols[0]])
    ax = data.loc[:, cols[0]].plot(label=cols[0], color=colors[0], **kwargs)
    ax.set_ylabel(ylabel=cols[0])
    lines, labels = ax.get_legend_handles_labels()
    for n in range(1, len(cols)):
        # Multiple y-axes
        ax_new = ax.twinx()
        ax_new.spines['right'].set_position(('axes', 1 + spacing * (n - 1)))
        data.loc[:, cols[n]].plot(ax=ax_new, label=cols[n], color=colors[n % len(colors)], **kwargs)
        ax_new.set_ylabel(ylabel=cols[n])
        # Proper legend position
        line, label = ax_new.get_legend_handles_labels()
        lines += line
        labels += label
    ax.legend(lines, labels, loc=0)
    return ax
This worked on a university lab machine. I'm not sure why it's not working locally.
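The failing line reaches into `_matplotlib`, which is a private part of pandas and may differ between the pandas versions installed on the two machines (that is an assumption, not something confirmed here). Since the code comment says the colors "can be changed to any other color list", one way to sidestep the private lookup is to take the colors from matplotlib's public default color cycle. `default_colors` below is a hypothetical helper name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

def default_colors(num_colors):
    """Return num_colors colors, cycling through matplotlib's default color cycle."""
    cycle = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    return [cycle[i % len(cycle)] for i in range(num_colors)]

colors = default_colors(12)
```

In plot_multi, the `colors = getattr(...)` line would then become `colors = default_colors(len(cols))`.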

matplotlib spine adjustment changes tick label visibility

I've found some odd behavior with pyplot. When I run the following code:
#! /usr/bin/env python3
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 2 * np.pi, 100)
y = 2 * np.sin(x)
fig, (ax0, ax1) = plt.subplots(nrows = 2, sharex=True)
ax0.plot(x, y)
ax1.plot(x, y)
#ax0.spines['top'].set_position(('outward', 0))
plt.show()
it produces the plot
However, uncommenting the ax0.spines... line produces this plot
Note that on the top subplot, the x-axis has acquired labels on its ticks. Is this the expected behavior (and due to a misunderstanding on my part of the pyplot API), or is this a bug with pyplot?
Note that this is a minimized example of an issue I noticed with some more complex graph formatting code I'm working on. While the set_position() call in this case has no effect, in my code I'm actually bumping all spines outwards. I found with my testing, however, that the change in position seems not to have an effect -- rather, it's the fact of calling the set_position() function at all.
It turns out the problem was specific to matplotlib 2.0.0; it is fixed in 2.1.0.
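For anyone who cannot upgrade, a possible workaround is to hide the tick labels on the top subplot explicitly after moving the spine. This is a sketch that has not been checked against matplotlib 2.0.0 itself; it only shows the tick_params call that turns the labels off.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
y = 2 * np.sin(x)

fig, (ax0, ax1) = plt.subplots(nrows=2, sharex=True)
ax0.plot(x, y)
ax1.plot(x, y)
ax0.spines['top'].set_position(('outward', 0))
# Explicitly switch the x tick labels of the top subplot off again
ax0.tick_params(axis='x', labelbottom=False)
fig.canvas.draw()
```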
