How to remove empty x-axis coordinates in Matplotlib

I'm developing in Python with the pandas, numpy and matplotlib modules to draw several subplots from a dataframe, using the following code:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import matplotlib.ticker as ticker
data = {'Name': ['Status', 'Status', 'HMI', 'Allst', 'Drvr', 'CurrTUBand', 'RUSource', 'RUReqstrPriority', 'RUReqstrSystem', 'RUResReqstStat', 'CurrTUBand', 'DSP', 'SetDSP', 'SetDSP', 'DSP', 'RUSource', 'RUReqstrPriority', 'RUReqstrSystem', 'RUResReqstStat', 'Status', 'Delay', 'Status', 'Delay', 'HMI', 'Status', 'Status', 'HMI', 'DSP'],
'Value': [4, 4, 2, 1, 1, 1, 0, 7, 0, 4, 1, 1, 3, 0, 3, 0, 7, 0, 4, 1, 0, 1, 0, 1, 4, 4, 2, 3],
'Id_Par': [0, 0, 0, 0, 0, 0, 10, 10, 10, 10, 10, 0, 0, 22, 22, 28, 28, 28, 28, 0, 0, 38, 38, 0, 0, 0, 0, 0]
}
signals_df = pd.DataFrame(data)
def plot_signals(signals_df):
    # Count signals by parallel
    signals_df['Count'] = signals_df.groupby('Id_Par').cumcount().add(1).mask(signals_df['Id_Par'].eq(0), 0)
    # Subtract Parallel values from the index column
    signals_df['Sub'] = signals_df.index - signals_df['Count']
    id_par_prev = signals_df['Id_Par'].unique()
    id_par = np.delete(id_par_prev, 0)
    signals_df['Prev'] = [1 if x in id_par else 0 for x in signals_df['Id_Par']]
    signals_df['Final'] = signals_df['Prev'] + signals_df['Sub']
    # Convert and set Subtract to index
    signals_df.set_index('Final', inplace=True)
    # Get individual names and variables for the chart
    names_list = [name for name in signals_df['Name'].unique()]
    num_names_list = len(names_list)
    # Creation Graphics
    fig, ax = plt.subplots(nrows=num_names_list, figsize=(10, 10), sharex=True)
    plt.xticks(color='SteelBlue', fontweight='bold')
    # Matplotlib's categorical feature to convert x-axis values to string
    x_values = [-1, ]
    for name in names_list:
        x_values.append(signals_df[signals_df["Name"] == name]["Value"].index.values[0])
    x_values.append(len(signals_df) - 1)
    x_values = [str(i) for i in sorted(set(x_values))]
    print(x_values)
    for pos, (a_, name) in enumerate(zip(ax, names_list)):
        # Creating a dummy plot and then remove it
        dummy, = ax[pos].plot(x_values, np.zeros_like(x_values))
        dummy.remove()
        # Get data
        data = signals_df[signals_df["Name"] == name]["Value"]
        # Get values axis-x and axis-y
        x_ = np.hstack([-1, data.index.values, len(signals_df) - 1])
        y_ = np.hstack([0, data.values, data.iloc[-1]])
        # Plotting the data by position
        ax[pos].plot(x_.astype('str'), y_, drawstyle='steps-post', marker='*', markersize=8, color='k', linewidth=2)
        ax[pos].set_ylabel(name, fontsize=8, fontweight='bold', color='SteelBlue', rotation=30, labelpad=35)
        ax[pos].yaxis.set_major_formatter(ticker.FormatStrFormatter('%0.1f'))
        ax[pos].yaxis.set_tick_params(labelsize=6)
        ax[pos].grid(alpha=0.4, color='SteelBlue')
        # Labeling the markers with CAN-Values
        for i in range(len(y_)):
            if i == 0:
                xy = [x_[0].astype('str'), y_[0]]
            else:
                xy = [x_[i - 1].astype('str'), y_[i - 1]]
            ax[pos].text(x=xy[0], y=xy[1], s=str(xy[1]), color='k', fontweight='bold', fontsize=12)
    plt.show()
plot_signals(signals_df)
I'm having trouble when names are repeated. Following the approach from an earlier answer, I use Matplotlib's categorical feature and convert the x-axis values to strings; this is the result I get:
I have been trying to change the pandas condition used in this line: x_values.append(signals_df[signals_df["Name"] == name]["Value"].index.values[0]). When I print the variable x_values, it contains the wrong indices, ['-1', '0', '2', '3', '4', '5', '6', '11', '12', '20', '27'], and I can't get it to work.
What I expect to achieve is a graph like the following:
The yellow shading marks the jumps that the plot must make on the x-axis and that are not being drawn on the y-axis. Thank you very much to anyone who can help; any comments are welcome.

I'm leaving this answer for anyone who later searches for the same problem. I found my error: the for loop that built x_values was not correct. I replaced it as follows:
# Matplotlib's categorical feature: convert x-axis values to strings
# (can_signals is presumably the DataFrame called signals_df in the question, after set_index('Final'))
x_values = [-1]
x_values += list(set(can_signals.index))
x_values = [str(i) for i in sorted(x_values)]
The graph now renders as expected:
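The fix amounts to taking every distinct index value once, adding the -1 starting point, sorting numerically, and only then converting to strings (sorting the strings instead would place '10' before '2'). A minimal sketch of that step in isolation; build_x_labels is a hypothetical helper name and the sample index is made up for illustration:

```python
def build_x_labels(index_values, start=-1):
    """Return sorted, de-duplicated x positions as strings, including `start`."""
    # Sort numerically *before* converting to str so that 10 comes after 2
    positions = sorted(set(index_values) | {start})
    return [str(p) for p in positions]


# Made-up index with repeated positions, similar in shape to the 'Final' column
sample_index = [0, 1, 2, 2, 3, 10, 10, 11]
x_labels = build_x_labels(sample_index)
print(x_labels)
```

Every position that occurs anywhere in the index then appears exactly once on the shared categorical x-axis, regardless of which signal name it belongs to.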

Related

Python: Plot histograms with customized bins

I am using matplotlib.pyplot to make a histogram. Because of how the data is distributed, I want to set up the bins manually. The details are as follows:
Any value = 0 in one bin;
Any value > 60 in the last bin;
Any value > 0 and <= 60 are in between the bins described above and the bin size is 5.
Could you please give me some help? Thank you.
I'm not sure what you mean by "the bin size is 5". You can either plot a histogram by specifying the bins as a sequence:
import matplotlib.pyplot as plt
data = [0, 0, 1, 2, 3, 4, 5, 6, 35, 60, 61, 82, -5] # your data here
plt.hist(data, bins=[0, 0.5, 60, max(data)])
plt.show()
But each bar's width will match its interval, meaning, in this example, that the "0" case will be barely visible:
(Note that 60 falls in the last bin when the bins are specified as this sequence, because every bin except the last is half-open on the right; changing the sequence to [0, 0.5, 60.5, max(data)] would fix that.)
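To check which bin a boundary value such as 60 ends up in, np.histogram can be called with the same edges that plt.hist would receive. This is a small check, assuming NumPy's documented rule that every bin is half-open, [left, right), except the last one, which also includes its right edge:

```python
import numpy as np

data = [0, 0, 1, 2, 3, 4, 5, 6, 35, 60, 61, 82, -5]

# Edges from the example: bins are [0, 0.5), [0.5, 60), [60, 82]
counts_a, _ = np.histogram(data, bins=[0, 0.5, 60, max(data)])

# Edge moved just past 60: bins are [0, 0.5), [0.5, 60.5), [60.5, 82]
counts_b, _ = np.histogram(data, bins=[0, 0.5, 60.5, max(data)])

print(counts_a)  # 60 is counted in the last bin
print(counts_b)  # 60 is counted in the middle bin
```

The value -5 lies outside every bin and is dropped in both cases.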
What you (probably) need is first to categorize your data and then plot a bar chart of the categories:
import matplotlib.pyplot as plt
import pandas as pd
data = [0, 0, 1, 2, 3, 4, 5, 6, 35, 60, 61, 82, -5] # your data here
df = pd.DataFrame()
df['data'] = data
def find_cat(x):
    if x == 0:
        return "0"
    elif x > 60:
        return "> 60"
    elif x > 0:
        return "> 0 and <= 60"
df['category'] = df['data'].apply(find_cat)
df.groupby('category', as_index=False).count().plot.bar(x='category', y='data', rot=0, width=0.8)
plt.show()
Output:
Building off Tranbi's answer, you could specify the bin edges as detailed in the link they shared.
import matplotlib.pyplot as plt
import pandas as pd
data = [0, 0, 1, 2, 3, 4, 5, 6, 35, 60, 61, 82, -6] # your data here
df = pd.DataFrame()
df['data'] = data
bin_edges = [-5, 0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
bin_edges_offset = [x+0.000001 for x in bin_edges]
plt.figure()
plt.hist(df['data'], bins=bin_edges_offset)
plt.show()
IIUC, you want a classic histogram for values between 0 (not included) and 60 (included), plus two extra bins on the sides for 0 and for values > 60.
In that case I would recommend plotting the 3 regions separately:
import matplotlib.pyplot as plt
data = [0, 0, 1, 2, 3, 4, 5, 6, 35, 60, 61, 82, -3] # your data here
fig, axes = plt.subplots(1,3, sharey=True, width_ratios=[1, 12, 1])
fig.subplots_adjust(wspace=0)
# counting 0 values and drawing a bar between -5 and 0
axes[0].bar(-5, data.count(0), width=5, align='edge')
axes[0].xaxis.set_visible(False)
axes[0].spines['right'].set_visible(False)
axes[0].set_xlim((-5, 0))
# histogram between (0, 60]
axes[1].hist(data, bins=12, range=(0.0001, 60.0001))
axes[1].yaxis.set_visible(False)
axes[1].spines['left'].set_visible(False)
axes[1].spines['right'].set_visible(False)
axes[1].set_xlim((0, 60))
# counting values > 60 and drawing a bar between 60 and 65
axes[2].bar(60, len([x for x in data if x > 60]), width=5, align='edge')
axes[2].xaxis.set_visible(False)
axes[2].yaxis.set_visible(False)
axes[2].spines['left'].set_visible(False)
axes[2].set_xlim((60, 65))
plt.show()
Output:
Edit: If you want to plot the probability density, I would edit the data and simply use hist:
import matplotlib.pyplot as plt
data = [0, 0, 1, 2, 3, 4, 5, 6, 35, 60, 61, 82, -3] # your data here
data2 = []
for el in data:
    if el < 0:
        pass
    elif el > 60:
        data2.append(61)
    else:
        data2.append(el)
plt.hist(data2, bins=14, density=True, range=(-4.99,65.01))
plt.show()
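The binning rule from the question (one bin for 0, 5-wide bins over (0, 60], one bin for everything above 60) can also be computed directly with np.digitize and np.bincount. A sketch under the assumption that negative values are out of scope, as in the answers above; the labels list is only an illustrative naming choice:

```python
import numpy as np

data = [0, 0, 1, 2, 3, 4, 5, 6, 35, 60, 61, 82, -3]

# Assumption: negative values are simply ignored
values = np.array([x for x in data if x >= 0])

edges = np.arange(0, 61, 5)  # 0, 5, 10, ..., 60

# right=True gives edges[i-1] < x <= edges[i]:
#   index 0      -> x <= 0 (only 0 here, negatives were filtered out)
#   index 1..12  -> (0, 5], (5, 10], ..., (55, 60]
#   index 13     -> x > 60
bin_index = np.digitize(values, edges, right=True)
counts = np.bincount(bin_index, minlength=len(edges) + 1)

labels = ["0"] + [f"({lo}, {lo + 5}]" for lo in edges[:-1].tolist()] + ["> 60"]

for label, count in zip(labels, counts):
    print(f"{label:>10}: {count}")
```

Since labels and counts have the same length, they can be passed to plt.bar(labels, counts) to draw equal-width bars for all fourteen categories.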

Is there a way to add labels to a plot legend in one step?

My legend currently looks like this:
I want the legend labels to run from 0 to 7, but I don't want to add a for loop to my code and correct each label one by one. My code looks like this:
fig, ax = plt.subplots()
ax.set_title('Clusters by OPTICS in 2D space after PCA')
ax.set_xlabel('First Component')
ax.set_ylabel('Second Component')
points = ax.scatter(
pca_2_spec[:,0],
pca_2_spec[:,1],
s = 7,
marker='o',
c = pred_pca_2_spec,
cmap= 'rainbow')
ax.legend(*points.legend_elements(), title = 'cluster')
plt.show()
Assuming pred_pca_2_spec is a NumPy array with values [0, 5, 10, 15, 20, 25, 30, 35], you can map them to the range 0-7 by simply dividing each element by 5.
Sample Data:
import numpy as np
from matplotlib import pyplot as plt
np.random.seed(54)
pca_2_spec = np.random.randint(-100, 300, (100, 2))
pred_pca_2_spec = np.random.choice([0, 5, 10, 15, 20, 25, 30, 35], 100)
Plotting Code:
fig, ax = plt.subplots()
ax.set_title('Clusters by OPTICS in 2D space after PCA')
ax.set_xlabel('First Component')
ax.set_ylabel('Second Component')
points = ax.scatter(
pca_2_spec[:, 0],
pca_2_spec[:, 1],
s=7,
marker='o',
c=pred_pca_2_spec / 5, # Divide By 5
cmap='rainbow')
ax.legend(*points.legend_elements(), title='cluster')
plt.show()
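Dividing by 5 only works because the cluster values happen to be multiples of 5. When they are arbitrary, np.unique with return_inverse=True maps each distinct value to a consecutive integer starting at 0. A sketch with made-up labels (pred stands in for pred_pca_2_spec):

```python
import numpy as np

# Made-up cluster labels; any set of distinct values works
pred = np.array([0, 5, 10, 35, 5, 20, 0, 15, 30, 25])

# unique_values is sorted; cluster_ids[i] is the position of pred[i] in it
unique_values, cluster_ids = np.unique(pred, return_inverse=True)

print(unique_values)  # distinct original labels, ascending
print(cluster_ids)    # the same points relabelled 0..7
```

Passing c=cluster_ids to ax.scatter then gives legend_elements() the relabelled values without a loop over the legend entries.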

Changing the grid properties of insets in matplotlib

This is a follow up to my question posted here. A network diagram is added as an inset in matplotlib figure.
import networkx as nx
import matplotlib.pyplot as plt
G = nx.gnm_random_graph(n=10, m=15, seed=1)
nxpos = nx.spring_layout(G, dim=3, seed=1)
nxpts = [nxpos[pt] for pt in sorted(nxpos)]
nx_lines = [(nxpts[i], nxpts[j]) for i, j in G.edges()]
# node values
values = [[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[30, 80, 10, 79, 70, 60, 75, 78, 65, 10],
[1, .30, .10, .79, .70, .60, .75, .78, .65, .90]]
time = [0.0, 0.1, 0.2] # in seconds
fig, ax = plt.subplots()
ax.plot(
[1, 2, 3], [1, 2, 3],
'go-',
label='line 1',
linewidth=2
)
from mpl_toolkits.mplot3d import (Axes3D)
from matplotlib.transforms import Bbox
rect = [.6, 0, .5, .5]
bbox = Bbox.from_bounds(*rect)
inax = fig.add_axes(bbox, projection = '3d')
# inax.axis('off')
# set angle
angle = 25
inax.view_init(10, angle)
# hide axes, make transparent
# inax.set_facecolor('none')
inax.grid('off')
import numpy as np
# plot 3d
seen = set()
for i, j in G.edges():
    x = np.stack((nxpos[i], nxpos[j]))
    inax.plot(*x.T, color = 'k')
    if i not in seen:
        inax.scatter(*x[0], color = 'skyblue')
        seen.add(i)
    if j not in seen:
        inax.scatter(*x[1], color = "skyblue")
        seen.add(j)
fig.show()
I would like to change the grid properties, i.e., set the grid color to red and change the line width. I tried inax.grid('on', color='r'), but this doesn't change the color. Suggestions on how to change these settings would be really helpful.
You can do it like this:
inax.w_xaxis._axinfo.update({'grid' : {'color': 'red', 'linewidth': 0.8, 'linestyle': '-'}})
inax.w_yaxis._axinfo.update({'grid' : {'color': 'red', 'linewidth': 0.8, 'linestyle': '-'}})
inax.w_zaxis._axinfo.update({'grid' : {'color': 'red', 'linewidth': 0.8, 'linestyle': '-'}})
Output:
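The w_xaxis, w_yaxis and w_zaxis aliases are missing from newer Matplotlib releases, where the same axis objects are reached through xaxis, yaxis and zaxis. A minimal sketch of the same idea on those attributes; note that _axinfo is a private attribute, so this is an assumption about Matplotlib internals that may change between versions:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()
inax = fig.add_subplot(projection='3d')

grid_style = {'color': 'red', 'linewidth': 0.8, 'linestyle': '-'}

# _axinfo is private API: each 3D axis keeps its grid style in _axinfo['grid']
for axis in (inax.xaxis, inax.yaxis, inax.zaxis):
    axis._axinfo['grid'].update(grid_style)

fig.canvas.draw()  # the grid is drawn with the updated style
```

Updating the nested 'grid' dictionary in place keeps any other keys it already holds, instead of replacing the whole entry.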

Adding image generated from another library as inset in matplotlib

I've generated a network figure using the vedo library and I'm trying to add it as an inset to a figure generated in matplotlib:
import networkx as nx
import matplotlib.pyplot as plt
from vedo import *
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
G = nx.gnm_random_graph(n=10, m=15, seed=1)
nxpos = nx.spring_layout(G, dim=3, seed=1)
nxpts = [nxpos[pt] for pt in sorted(nxpos)]
nx_lines = [(nxpts[i], nxpts[j]) for i, j in G.edges()]
pts = Points(nxpts, r=12)
edg = Lines(nx_lines).lw(2)
# node values
values = [[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[30, 80, 10, 79, 70, 60, 75, 78, 65, 10],
[1, .30, .10, .79, .70, .60, .75, .78, .65, .90]]
time = [0.0, 0.1, 0.2] # in seconds
vplt = Plotter(N=1)
pts1 = pts.cmap('Blues', values[0])
vplt.show(
pts1, edg,
axes=False,
bg='white',
at=0,
interactive=False,
zoom=1.5
).screenshot("network.png")
ax = plt.subplot(111)
ax.plot(
[1, 2, 3], [1, 2, 3],
'go-',
label='line 1',
linewidth=2
)
arr_img = vplt.screenshot(returnNumpy=True, scale=1)
im = OffsetImage(arr_img, zoom=0.25)
ab = AnnotationBbox(im, (1, 0), xycoords='axes fraction', box_alignment=(1.1, -0.1), frameon=False)
ax.add_artist(ab)
plt.show()
ax.figure.savefig(
"output.svg",
transparent=True,
dpi=600,
bbox_inches="tight"
)
The resolution of the image in the inset is too low. Suggestions on how to add the inset without loss of resolution would be really helpful.
EDIT:
The answer posted below works for adding a 2D network, but I am still looking for ways that will be useful for adding a 3D network in the inset.
I am not familiar with vedo, but the general procedure would be to create an inset axes and plot the image with imshow. However, your code uses networkx, which has matplotlib bindings, so you can do this directly without vedo.
EDIT: code edited for 3d plotting
import networkx as nx
import matplotlib.pyplot as plt
G = nx.gnm_random_graph(n=10, m=15, seed=1)
nxpos = nx.spring_layout(G, dim=3, seed=1)
nxpts = [nxpos[pt] for pt in sorted(nxpos)]
nx_lines = [(nxpts[i], nxpts[j]) for i, j in G.edges()]
# node values
values = [[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[30, 80, 10, 79, 70, 60, 75, 78, 65, 10],
[1, .30, .10, .79, .70, .60, .75, .78, .65, .90]]
time = [0.0, 0.1, 0.2] # in seconds
fig, ax = plt.subplots()
ax.plot(
[1, 2, 3], [1, 2, 3],
'go-',
label='line 1',
linewidth=2
)
from mpl_toolkits.mplot3d import (Axes3D)
from matplotlib.transforms import Bbox
rect = [.6, 0, .5, .5]
bbox = Bbox.from_bounds(*rect)
inax = fig.add_axes(bbox, projection = '3d')
# inax = add_inset_axes(,
# ax_target = ax,
# fig = fig, projection = '3d')
# inax.axis('off')
# set angle
angle = 25
inax.view_init(10, angle)
# hide axes, make transparent
# inax.set_facecolor('none')
# inax.grid('off')
import numpy as np
# plot 3d
seen = set()
for i, j in G.edges():
    x = np.stack((nxpos[i], nxpos[j]))
    inax.plot(*x.T, color = 'k')
    if i not in seen:
        inax.scatter(*x[0], color = 'skyblue')
        seen.add(i)
    if j not in seen:
        inax.scatter(*x[1], color = "skyblue")
        seen.add(j)
fig.show()
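For the image route mentioned at the start of this answer (an inset axes plus imshow), a minimal sketch could look like the following. The random array is only a placeholder for a screenshot array such as the one returned by the external library, and the inset position is an arbitrary choice:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Placeholder for an RGB screenshot array (height x width x 3), not a real render
arr_img = np.random.default_rng(0).random((200, 200, 3))

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [1, 2, 3], 'go-', label='line 1', linewidth=2)

# Inset in the lower-right corner: [x0, y0, width, height] in axes fractions
inset = ax.inset_axes([0.6, 0.05, 0.35, 0.35])
inset.imshow(arr_img)
inset.axis('off')  # hide ticks and frame around the image

fig.canvas.draw()
```

The inset is still a raster image, so its sharpness depends on the pixel dimensions of the array that is passed in.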

How to autoscale y-axis for bargraph in matplotlib?

I need to autoscale the y-axis on my bargraph in matplotlib in order to display the small differences in values. The reason why it needs to be autoscaled instead of having a fixed limit is because the values will change depending on what the user inputs. I've tried yscale log, but that doesn't work for negative values. I've tried symlog, but the graph stays the same. This is my current code:
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = range(700, 710, 1)
fig, ax = plt.subplots()
ax.bar(x, y)
plt.show()
Plots are automatically scaled for the full range of the data provided to the API.
For a bar plot, the best option to display the differences in the values of the bars, is probably to set the ylim for vertical bars or xlim for horizontal bars.
negative data
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = range(-700, -750, -5)
fig, ax = plt.subplots(figsize=(7, 5))
ax.bar(x, y)
plt.ylim(min(y), max(y))
positive data
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = range(700, 750, 5)
fig, ax = plt.subplots(figsize=(7, 5))
ax.bar(x, y)
plt.ylim(min(y), max(y))
mixed data
If the data has a wide range of positive and negative values, there's probably no good option; as you've noted, symlog doesn't help here.
The best option may be to plot the positive and negative data separately.
Boolean masks don't work with a list, so convert the lists to numpy arrays.
import numpy as np
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [700, -700, 710, -710, 720, -720, 730, -730, 740, -740]
x = np.array(x)
y = np.array(y)
mask = y >= 0 # positive mask
pos_y = y[mask] # get the positive values
neg_y = y[~mask] # get the negative values; ~ is not
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(7, 5))
ax1.bar(x[mask], pos_y) # also mask x to plot the bar at the correct x-tick
ax1.set_title('Positive Values')
ax1.set_ylim(min(pos_y), max(pos_y))
ax1.set_xticks(range(0, 12)) # buffer the number of x-ticks, so the x-ticks of the two plots align.
ax2.bar(x[~mask], neg_y)
ax2.set_title('Negative Values')
ax2.set_ylim(min(neg_y), max(neg_y))
ax2.set_xticks(range(0, 12))
plt.tight_layout() # better spacing between the two plots

Resources