Matplotlib sliding window not plotting correctly - python-3.x

I have code that computes a rolling-window average (window length 30) over a sequence (e.g. of length 300).
This gives 10 averages, but they are plotted against ticks 1-10 rather than spaced across the sequence, one per window of 30.
The only way I can get it to look right is to plot over len(windowlength), but then the x-axis isn't right.
Is there any way to space the results manually?
windows30 = (sliding_window(sequence, 30))
Overall_Mean = mean(sequence)
fig, (ax) = plt.subplots()
plt.subplots_adjust(left=0.07, bottom=0.08, right=0.96, top=0.92, wspace=0.20, hspace=0.23)
ax.set_ylabel('mean (%)')
ax.set_xlabel(' Length') # axis titles
ax.yaxis.grid(True, linestyle='-', which='major', color='lightgrey', alpha=0.5)
ax.plot(windows30, color='r', marker='o', markersize=3)
ax.plot([0, len(sequence)], [Overall_Mean, Overall_Mean], lw=0.75)
plt.show()

From what I understand, you have a list of length 300 that holds only 10 actual values, with None everywhere else. If that is the case, you can remove the None values from your windows30 list using the following solution.
Code Demonstration:
import numpy as np
import random
import matplotlib.pyplot as plt
# Generating the list of Nones and numbers
listofzeroes = [None] * 290
numbers = random.sample(range(50), 10)
numbers.extend(listofzeroes)
# Removing Nones from the list
numbers = [value for value in numbers if value is not None]
step = len(numbers)
x_values = np.linspace(0,300,step) # Generate x-values
plt.plot(x_values,numbers, color='red', marker='o')
This is a working example; the code relevant to you starts at the second comment.
The above code will work independently of where the Nones are located in your list. I hope this solves your problem.
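If the goal is to place each average over the window it summarizes, another option is to compute one x position per window and pass it to the plot call explicitly. The sketch below assumes consecutive, non-overlapping windows; the question does not show sliding_window, so window_means is a hypothetical stand-in, and the input data is a placeholder.

```python
import numpy as np

def window_means(sequence, size):
    """Return (centers, means) for consecutive non-overlapping windows.

    Hypothetical stand-in for the question's sliding_window helper;
    a trailing partial window is dropped.
    """
    values = np.asarray(sequence, dtype=float)
    n_windows = len(values) // size
    trimmed = values[:n_windows * size].reshape(n_windows, size)
    # Window k covers indices [k*size, (k+1)*size), so its center index
    # is k*size + (size - 1) / 2
    centers = np.arange(n_windows) * size + (size - 1) / 2
    return centers, trimmed.mean(axis=1)

sequence = np.arange(300)  # placeholder data
centers, means = window_means(sequence, 30)
print(centers[:3])
print(means[:3])
```

Passing both arrays to the plot call, for example ax.plot(centers, means, color='r', marker='o', markersize=3), spreads the 10 points over the full 0-300 range, so the overall-mean line drawn from 0 to len(sequence) shares the same x-axis.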


How to set axis ticks with non periodical increment in matplotlib

I have a 2D array representing the efficiency of a process for a given set of parameters A and B. Parameter A, along the columns, changes uniformly from 0 to 225 in increments of one. The problem is with the rows, where parameter B was changed in the following order:
[16 ,18 ,20 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30 ,31 ,32 ,33 ,35 ,40 ,45 ,50 ,55 ,60 ,65 ,70 ,75 ,80 ,85 ,90 ,95 ,100 ,105 ,110 ,115 ,120 ,125]
So even though the rows increase with increment one, they represent a non-uniform increment of the parameter B. What I need is to showcase the values of the parameter B on the y-axis. Using axes.set_yticks() does not give me what I am looking for, and I do understand why but I do not know how to solve it.
A minimum example:
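import numpy as np
import matplotlib.pyplot as plt
# Placeholder for the efficiency array (assumption: random data with the shape printed below)
x = np.random.random((35, 225))
# Define parameter B values
parb_increment = [16, 18, 20] + list(range(21,34)) + list(range(35,126,5))
print(len(parb_increment))
print(x.shape)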
# Figure and axes
figure, axes = plt.subplots(figsize=(10, 8))
# Plotting
im = axes.imshow(x, aspect='auto',
                 origin="lower",
                 cmap='Blues',
                 interpolation='none',
                 extent=(0, x.shape[1], 0, parb_increment[-1]))
# Unsuccessful trial for yticks
axes.set_yticks(parb_increment, labels=parb_increment)
# Colorbar
cb = figure.colorbar(im, ax=axes)
The previous code prints the output below and produces a figure (not reproduced here) in which the ticks are not only misplaced but also start from an incorrect position.
35
(35, 225)
imshow draws every row of the array with the same height: the extent is divided evenly among the rows, and aspect only scales the image as a whole. Unfortunately you can't make the row height variable, and it won't change even if you modify the y-axis ticks. That's why the ticks in your example are misaligned with the rows of pixels.
Therefore, one solution to your problem is to duplicate rows so that each row's height matches its non-uniform increment.
See the example below:
import numpy as np
import matplotlib.pyplot as plt
# Generate fake data
x = np.random.random((3, 4))
# Create uniform x-ticks and non-uniform y-ticks
x_increment = np.arange(0, x.shape[1]+1, 1)
y_increment = np.arange(0, x.shape[0]+1, 1) * np.arange(0, x.shape[0]+1, 1)
# Plot the data
fig, ax = plt.subplots(figsize=(6, 10))
img = ax.imshow(
    x,
    extent=(
        0, x.shape[1], 0, y_increment[-1]
    )
)
fig.colorbar(img, ax=ax)
ax.set_xlim(0, x.shape[1])
ax.set_xticks(x_increment)
ax.set_ylim(0, y_increment[-1])
ax.set_yticks(y_increment);
This replicates your problem and produces the following outcome.
The solution
First, determine the number of repeats of each row in the array:
nr_of_repeats_per_row =np.diff(y_increment)
nr_of_repeats_per_row = nr_of_repeats_per_row[::-1]
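You need to reverse the order because, with imshow's default origin ('upper'), the first row of the array is drawn at the top of the image, whereas the differences in y_increment run from the bottom of the axis upwards, i.e. starting from the last row in the array. (With origin="lower", as in your code, the reversal is not needed.)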
Now you can repeat each row in the array a specific number of times:
x_extended = np.repeat(x, nr_of_repeats_per_row, axis=0)
Replot with the x_extended:
fig, ax = plt.subplots(figsize=(6, 10))
img = ax.imshow(
    x_extended,
    extent=(
        0, x.shape[1], 0, y_increment[-1]
    ),
    interpolation="none"
)
fig.colorbar(img, ax=ax)
ax.set_xlim(0, x.shape[1])
ax.set_xticks(x_increment)
ax.set_ylim(0, y_increment[-1])
ax.set_yticks(y_increment);
And you should get this.
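As an alternative to repeating rows (a different technique from the one above, not part of the original answer), pcolormesh accepts explicit coordinates for each row and column. The sketch below assumes matplotlib 3.3 or later for shading="nearest" and uses random placeholder data with the shape from the question; the B values are treated as row centers.

```python
import numpy as np
import matplotlib.pyplot as plt

# Parameter B values from the question; x is placeholder data of the same shape
parb_increment = [16, 18, 20] + list(range(21, 34)) + list(range(35, 126, 5))
x = np.random.random((len(parb_increment), 225))

fig, ax = plt.subplots(figsize=(10, 8))
cols = np.arange(x.shape[1])
# shading="nearest" treats cols and parb_increment as cell centers, so each
# row is drawn around its own B value and row heights follow the spacing
mesh = ax.pcolormesh(cols, parb_increment, x, shading="nearest", cmap="Blues")
ax.set_yticks(parb_increment)
fig.colorbar(mesh, ax=ax)
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Because the row geometry comes from the coordinates themselves, the ticks line up with the rows without enlarging the array.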

Modify position of colorbar so that extend triangle is above plot

So, I have to make a bunch of contourf plots for different days that need to share colorbar ranges. That was easy to do, but sometimes the maximum value for a given date is above the colorbar range, and that changes the look of the plot in a way I don't need. When that happens, I want the extend triangle to be added above the "original colorbar". It's clear in the attached picture.
I need the code to run automatically: right now I only feed in the data and the colorbar range and it outputs the images, so the fitting of the colorbar needs to be automatic. I can't hard-code padding values because the figure size changes depending on the area being plotted.
The reason I need this behavior is that eventually I want to make a .gif, and I can't have the colorbar move in that short video. I need the triangle to be added, when needed, to the top (and bottom) without messing with the "main" colorbar.
Thanks!
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize, BoundaryNorm
from matplotlib import cm
###############
## Finds the appropriate option for variable "extend" in fig colorbar
def find_extend(vmin, vmax, datamin, datamax):
    #extend{'neither', 'both', 'min', 'max'}
    if datamin >= vmin:
        if datamax <= vmax:
            extend="neither"
        else:
            extend="max"
    else:
        if datamax <= vmax:
            extend="min"
        else:
            extend="both"
    return extend
###########
vmin=0
vmax=30
nlevels=8
# plt.get_cmap is used because cm.get_cmap was removed in recent matplotlib versions
colormap=plt.get_cmap("rainbow")
### Creating data
z_1=30*abs(np.random.rand(5, 5))
z_2=37*abs(np.random.rand(5, 5))
data={1:z_1, 2:z_2}
x=range(5)
y=range(5)
## Plot
for day in [1, 2]:
    fig = plt.figure(figsize=(4,4))
    ## Normally figsize=get_figsize(bounds) and bounds is retrieved from gdf.total_bounds
    ## The function creates the figure size based on the x/y ratio of the bounds
    ax = fig.add_subplot(1, 1, 1)
    norm=BoundaryNorm(np.linspace(vmin, vmax, nlevels+1), ncolors=colormap.N)
    z=data[day]
    cs=ax.contourf(x, y, z, cmap=colormap, norm=norm, vmin=vmin, vmax=vmax)
    extend=find_extend(vmin, vmax, np.nanmin(z), np.nanmax(z))
    fig.colorbar(cm.ScalarMappable(norm=norm, cmap=colormap), ax=ax, extend=extend)
    plt.close(fig)
You can do something like this, placing a triangle on top of the colorbar manually:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
fig, ax = plt.subplots()
pc = ax.pcolormesh(np.random.randn(20, 20))
cb = fig.colorbar(pc)
# Triangle vertices in colorbar-axes coordinates; y > 1 lies above the colorbar
trixy = np.array([[0, 1], [1, 1], [0.5, 1.05]])
p = mpatches.Polygon(trixy, transform=cb.ax.transAxes,
                     clip_on=False, edgecolor='k', linewidth=0.7,
                     facecolor='m', zorder=4, snap=True)
cb.ax.add_patch(p)
plt.show()
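To apply this automatically across several days, the triangle can be added only when the data exceed vmax, while the colorbar itself is always created without an extend so that its size stays fixed. The following is a minimal sketch under stated assumptions: add_over_triangle is a hypothetical helper, the data are random placeholders, and the triangle color (the top color of the colormap) is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

def add_over_triangle(cbar, color, height=0.05):
    """Draw a triangle just above a vertical colorbar (hypothetical helper)."""
    # Vertices in colorbar-axes coordinates; y > 1 lies above the bar
    tri = np.array([[0, 1], [1, 1], [0.5, 1 + height]])
    patch = mpatches.Polygon(tri, transform=cbar.ax.transAxes, clip_on=False,
                             facecolor=color, edgecolor='k', linewidth=0.7,
                             zorder=4)
    cbar.ax.add_patch(patch)
    return patch

vmin, vmax = 0, 30
rng = np.random.default_rng(0)
data = {1: 30 * rng.random((5, 5)), 2: 37 * rng.random((5, 5))}
data[2][0, 0] = 36.0  # make sure day 2 exceeds vmax in this demo

triangles = {}
for day, z in data.items():
    fig, ax = plt.subplots(figsize=(4, 4))
    pc = ax.pcolormesh(z, vmin=vmin, vmax=vmax, cmap="rainbow")
    # The colorbar is always created without an extend, so it never resizes
    cbar = fig.colorbar(pc, ax=ax)
    if np.nanmax(z) > vmax:
        triangles[day] = add_over_triangle(cbar, color=pc.cmap(1.0))
    # fig.savefig(...) would go here
    plt.close(fig)
```

A triangle below the bar for values under vmin could be drawn the same way with vertices [[0, 0], [1, 0], [0.5, -height]].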

Legend overwritten by plot - matplotlib

I have a plot that looks as follows:
I want to add legend labels for both the line plot and the red markers. However, the legend is not appearing because the plot is taking up its space.
Update
it turns out I cannot put several strings in plt.legend()
I made the figure bigger by using the following:
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
However, now I have only one label in the legend, with the marker drawn on top of the line, whereas I would rather have two entries: one for the marker alone and another for the line alone:
Updated code:
plt.plot(range(len(y)), y, '-D', color='blue', markerfacecolor='red', markeredgecolor='k', markevery=rare_cases, label='%s' % target_var_name)
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
# changed this over here
plt.legend()
plt.savefig(output_folder + fig_name)
plt.close()
What you want to do (have two labels for a single object) is not completely impossible, but it's MUCH easier to plot the line and the rare values separately, e.g.
# boilerplate
import numpy as np
import matplotlib.pyplot as plt
# synthesize some data
N = 501
t = np.linspace(0, 10, N)
s = np.sin(np.pi*t)
rare = np.zeros(N, dtype=bool); rare[:20]=True; np.random.shuffle(rare)
plt.plot(t, s, label='Curve')
plt.scatter(t[rare], s[rare], label='rare')
plt.legend()
plt.show()
Update
[...] it turns out I cannot put several strings in plt.legend()
Well, you can, as long as ① the several strings are in an iterable (a tuple or a list) and ② the number of strings (i.e., labels) equals the number of artists (i.e., thingies) in the plot.
plt.legend(('a', 'b', 'c'))
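If the single plot call with markevery has to stay, one way to still get two legend entries is to hand the legend proxy handles instead of the plotted line. This is a sketch, not part of the original answer: the data, the rare_cases indices and the label texts are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

y = np.sin(np.linspace(0, 10, 200))  # placeholder data
rare_cases = [10, 50, 120]           # placeholder indices of the rare values

fig, ax = plt.subplots()
ax.plot(range(len(y)), y, '-D', color='blue', markerfacecolor='red',
        markeredgecolor='k', markevery=rare_cases)

# Proxy artists: they are never drawn on the axes, only in the legend
handles = [
    Line2D([], [], color='blue', label='target'),
    Line2D([], [], linestyle='None', marker='D', markerfacecolor='red',
           markeredgecolor='k', label='rare cases'),
]
leg = ax.legend(handles=handles)
```

The legend then shows a plain blue line and a standalone red diamond, independent of how the data were drawn.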

Matplotlib - sequentially creating figures with the same size

I need to create a sequence of .pdf files where each .pdf contains a figure with five plots.
As I am going to include them in a LaTeX article, I wanted them all to be the same width and height so that each figure's corners are vertically aligned on both left and right sides.
I thought this would be enough, but apparently not:
common_figsize=(6,5)
fig, ax = plt.subplots(figsize = common_figsize)
# five plots in a loop for the first figure.
# my_code()...
plt.savefig("Figure-1.pdf", transparent=True)
plt.close(fig)
fig, ax = plt.subplots(figsize = common_figsize)
# five plots in a loop for the new figure.
# my_code()...
plt.savefig("Figure-2.pdf", transparent=True)
plt.close(fig)
If I understand correctly, this does not do exactly what I want because of different scales originating from different yticks resolutions.
For both figures, pyplot is fed the same list for xticks.
In this case, it is a list of 50 values, from 1 to 50.
CHUNK_COUNT = 50
x_step = CHUNK_COUNT // 10  # integer division: range() requires an int in Python 3
new_xticks = list(range(x_step, CHUNK_COUNT + x_step, x_step)) + [1]
plt.xticks(new_xticks)
ax.set_xlim(left=1, right=CHUNK_COUNT)
This creates both figures with an X-axis that goes from 1 to 50.
So far so good.
However, I haven't figured out how to deal with the problem of yticks resolution.
One of the figures had fewer yticks than the other, so I overrode it to have as many ticks as the other:
# Add yticks to Figure 1.
y_divisor = 6
y_step = (100 - min_y_tick) / y_divisor
new_yticks = [min_y_tick + y_step * i for i in range(0, y_divisor + 1)]
plt.yticks(new_yticks)
This resulted in the following two images, Figure 1 and Figure 2 (not reproduced here); the bounding square of each figure is in fact different.
In summary, I believe matplotlib is accepting the figsize parameter, but then rearranges plot elements to accommodate for different tick values and text lengths.
Is it possible for it to operate in reverse? To change label and text rotations automagically so that the squares are absolutely the same length and height?
Apologies if this is a duplicate and thanks for the help.
EDIT:
Finally able to provide a minimal, complete and verifiable example.
Among the tests, I removed the custom yticks code and the problem still persists:
from matplotlib.lines import Line2D
import matplotlib.ticker as mtick
import matplotlib.pyplot as plt
from matplotlib import rc
# activate latex text rendering
rc('text', usetex=True)
from matplotlib import rcParams
rcParams.update({'figure.autolayout': True})
CHUNK_COUNT = 50
common_figsize=(6,5)
plot_counter = 5
x_step = int(int(CHUNK_COUNT) / 10)
new_xticks = list(range(x_step, int(CHUNK_COUNT) + x_step, x_step)) + [1]
##### Plot Figure 1
fig, ax = plt.subplots(figsize = common_figsize)
plt.ylabel("Summary of a simple YY axis")
plt.yticks(rotation=45)
ax.yaxis.set_major_formatter(mtick.PercentFormatter(is_latex=False))
for i in range(0, plot_counter):
    xvals = range(1, CHUNK_COUNT + 1)
    yvals = []
    for j in xvals:
        yvals.append(j + i)
    plt.plot(xvals, yvals)
plt.xticks(new_xticks)
ax.set_xlim(left=1, right=int(CHUNK_COUNT))
plt.savefig("Figure_1.png", transparent=True)
plt.close(fig)
##### Plot Figure 2
fig, ax = plt.subplots(figsize = common_figsize)
plt.ylabel("Summary of another YY axis")
plt.yticks(rotation=45)
ax.yaxis.set_major_formatter(mtick.PercentFormatter(is_latex=False))
for i in range(0, plot_counter):
    xvals = range(1, CHUNK_COUNT + 1)
    yvals = []
    for j in xvals:
        yvals.append((j + i) / 100)
    plt.plot(xvals, yvals)
plt.xticks(new_xticks)
ax.set_xlim(left=1, right=int(CHUNK_COUNT))
plt.savefig("Figure_2.png", transparent=True)
plt.close(fig)
It turns out this was due to a mistake on my part.
I carried over code from another context where autolayout was active:
from matplotlib import rcParams
rcParams.update({'figure.autolayout': True})
After setting it to False, the figure squares all had the same dimensions:
from matplotlib import rcParams
rcParams.update({'figure.autolayout': False})
Despite the length differences in ytick elements, it is now respecting the dimensions specified in my original question.
These results were generated with the MWE I added at the end of my question.
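To check that two figures really end up with the same axes box, the axes position can be compared after drawing. The sketch below makes two assumptions that are not in the original post: the margin values are arbitrary, and fixing them with subplots_adjust is an extra precaution on top of disabling autolayout.

```python
import matplotlib.pyplot as plt

plt.rcParams['figure.autolayout'] = False  # the setting identified above
common_figsize = (6, 5)
# Assumed margins, in figure fractions; adjust to fit the longest tick labels
margins = dict(left=0.15, right=0.97, bottom=0.12, top=0.95)

boxes = []
for scale in (1, 100):
    fig, ax = plt.subplots(figsize=common_figsize)
    fig.subplots_adjust(**margins)
    # Different y ranges give tick labels of different widths
    ax.plot(range(1, 51), [j / scale for j in range(1, 51)])
    fig.canvas.draw()  # run the layout so the position is final
    boxes.append(ax.get_position().bounds)
    plt.close(fig)

print(boxes[0] == boxes[1])
```

With fixed margins the axes rectangle no longer depends on the tick label widths, so long labels may be clipped rather than shifting the plot.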

Plotting a chart with text data on the Y axis and numeric data on the X axis from a dictionary. Matplotlib & Python 3 [duplicate]

I can create a simple bar chart in matplotlib from a 'simple' dictionary:
import matplotlib.pyplot as plt
D = {u'Label1':26, u'Label2': 17, u'Label3':30}
plt.bar(range(len(D)), list(D.values()), align='center')
plt.xticks(range(len(D)), list(D.keys()))
plt.show()
But I do not know how to draw a line plot from the text and numeric data of a dictionary like this one:
T_OLD = {'10': 'need1', '11': 'need2', '12': 'need1', '13': 'need2', '14': 'need1'}
The result should look like the picture below.
You may use numpy to convert the dictionary to an array with two columns, which can be plotted.
import matplotlib.pyplot as plt
import numpy as np
T_OLD = {'10' : 'need1', '11':'need2', '12':'need1', '13':'need2','14':'need1'}
x = list(zip(*T_OLD.items()))
# sort the columns by key (dictionaries are unordered before Python 3.7)
x = np.array(x)[:,np.argsort(x[0])].T
# let the second column be 1 if "need2", else 0
x[:,1] = (x[:,1] == "need2").astype(int)
# plot the two columns of the array, converted from strings to numbers
plt.plot(x[:,0].astype(float), x[:,1].astype(int))
# set the labels accordingly
plt.gca().set_yticks([0,1])
plt.gca().set_yticklabels(['need1', 'need2'])
plt.show()
The following version is independent of the actual content of the dictionary; the only assumption is that the keys can be converted to floats.
import matplotlib.pyplot as plt
import numpy as np
T_OLD = {'10': 'run', '11': 'tea', '12': 'mathematics', '13': 'run', '14' :'chemistry'}
x = np.array(list(zip(*T_OLD.items())))
u, ind = np.unique(x[1,:], return_inverse=True)
x[1,:] = ind
x = x.astype(float)[:,np.argsort(x[0])].T
# plot the two columns of the array
plt.plot(x[:,0], x[:,1])
# set the labels accordingly
plt.gca().set_yticks(range(len(u)))
plt.gca().set_yticklabels(u)
plt.show()
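The key step in this version is np.unique with return_inverse=True, which turns arbitrary string labels into integer codes. A small standalone sketch of just that mapping, using the same example dictionary (the variable names are illustrative):

```python
import numpy as np

T_OLD = {'10': 'run', '11': 'tea', '12': 'mathematics', '13': 'run', '14': 'chemistry'}

# Sort the keys numerically, then look up the values in that order
keys = sorted(T_OLD, key=float)
x_vals = np.array([float(k) for k in keys])

# labels holds the sorted unique strings; codes[i] is the index in labels
# of the i-th value, i.e. its y position on the plot
labels, codes = np.unique([T_OLD[k] for k in keys], return_inverse=True)
print(labels)
print(codes)
```

x_vals and codes can then be passed to plt.plot, with set_yticks(range(len(labels))) and set_yticklabels(labels) restoring the text on the y-axis.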
Use numeric values for your y-axis ticks, and then map them to desired strings with plt.yticks():
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# example data
times = pd.date_range(start='2017-10-17 00:00', end='2017-10-17 5:00', freq='H')
data = np.random.choice([0,1], size=len(times))
data_labels = ['need1','need2']
fig, ax = plt.subplots()
ax.plot(times, data, marker='o', linestyle="None")
# one tick position per label (0 -> 'need1', 1 -> 'need2')
plt.yticks([0, 1], data_labels)
plt.xlabel("time")
Note: It's generally not a good idea to use a line graph to represent categorical changes in time (e.g. from need1 to need2). Doing that gives the visual impression of a continuum between time points, which may not be accurate. Here, I changed the plotting style to points instead of lines. If for some reason you need the lines, just remove linestyle="None" from the call to ax.plot().
UPDATE
(per comments)
To make this work with a y-axis category set of arbitrary length, use ax.set_yticks() and ax.set_yticklabels() to map to y-axis values.
For example, given a set of potential y-axis values labels, let N be the size of a subset of labels (here we'll set it to 4, but it could be any size).
Then draw a random sample data of y values and plot against time, labeling the y-axis ticks based on the full set labels. Note that we still use set_yticks() first with numerical markers, and then replace with our category labels with set_yticklabels().
labels = np.array(['A','B','C','D','E','F','G'])
N = 4
# example data
times = pd.date_range(start='2017-10-17 00:00', end='2017-10-17 5:00', freq='H')
data = np.random.choice(np.arange(len(labels)), size=len(times))
fig, ax = plt.subplots(figsize=(15,10))
ax.plot(times, data, marker='o', linestyle="None")
ax.set_yticks(np.arange(len(labels)))
ax.set_yticklabels(labels)
plt.xlabel("time")
This gives the exact desired plot:
import matplotlib.pyplot as plt
from collections import OrderedDict
T_OLD = {'10' : 'need1', '11':'need2', '12':'need1', '13':'need2','14':'need1'}
T_SRT = OrderedDict(sorted(T_OLD.items(), key=lambda t: t[0]))
plt.plot(map(int, T_SRT.keys()), map(lambda x: int(x[-1]), T_SRT.values()),'r')
plt.ylim([0.9,2.1])
ax = plt.gca()
ax.set_yticks([1,2])
ax.set_yticklabels(['need1', 'need2'])
plt.title('T_OLD')
plt.xlabel('time')
plt.ylabel('need')
plt.show()
For Python 3.X, the plotting line needs to explicitly convert the map() output to lists:
plt.plot(list(map(int, T_SRT.keys())), list(map(lambda x: int(x[-1]), T_SRT.values())),'r')
as in Python 3.X map() returns an iterator as opposed to a list in Python 2.7.
The plot uses the dictionary keys converted to ints and the last character of need1 or need2, also converted to an int. This relies on the particular structure of your data; if the values were need1 and need3, it would need a couple more operations.
After plotting and changing the axes limits, the program simply modifies the tick labels at y positions 1 and 2. It then also adds the title and the x and y axis labels.
Important part is that the dictionary/input data has to be sorted. One way to do it is to use OrderedDict. Here T_SRT is an OrderedDict object sorted by keys in T_OLD.
The output is:
This is a more general case for more values/labels in T_OLD. It assumes that the label is always 'needX', where X is any number. This can readily be extended to the general case of any string preceding the number, though it would require more processing.
import matplotlib.pyplot as plt
from collections import OrderedDict
import re
T_OLD = {'10' : 'need1', '11':'need8', '12':'need11', '13':'need1','14':'need3'}
T_SRT = OrderedDict(sorted(T_OLD.items(), key=lambda t: t[0]))
x_val = list(map(int, T_SRT.keys()))
y_val = list(map(lambda x: int(re.findall(r'\d+', x)[-1]), T_SRT.values()))
plt.plot(x_val, y_val,'r')
plt.ylim([0.9*min(y_val),1.1*max(y_val)])
ax = plt.gca()
y_axis = list(set(y_val))
ax.set_yticks(y_axis)
ax.set_yticklabels(['need' + str(i) for i in y_axis])
plt.title('T_OLD')
plt.xlabel('time')
plt.ylabel('need')
plt.show()
This solution finds the number at the end of the label using re.findall to accommodate for the possibility of multi-digit numbers. Previous solution just took the last component of the string because numbers were single digit. It still assumes that the number for plotting position is the last number in the string, hence the [-1]. Again for Python 3.X map output is explicitly converted to list, step not necessary in Python 2.7.
The labels are now generated by first selecting unique y-values using set and then renaming their labels through concatenation of the strings 'need' with its corresponding integer.
The limits of y-axis are set as 0.9 of the minimum value and 1.1 of the maximum value. Rest of the formatting is as before.
The result for this test case is:
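For the fully general case mentioned above, where any string may precede the number, one possible approach is to split each label into a prefix and a trailing number and keep the original label for the tick text. This is a sketch rather than part of the original answer: split_label is a hypothetical helper, and the mixed-prefix dictionary is made-up example data.

```python
import re

def split_label(label):
    """Split 'prefixNNN' into (prefix, NNN); hypothetical helper."""
    match = re.fullmatch(r'(.*?)(\d+)', label)
    if match is None:
        raise ValueError(f"no trailing number in {label!r}")
    return match.group(1), int(match.group(2))

# Made-up data with different prefixes before the number
T_OLD = {'10': 'need1', '11': 'step8', '12': 'need11', '13': 'need1', '14': 'stage3'}

keys = sorted(T_OLD, key=int)
x_val = [int(k) for k in keys]
y_val = [split_label(T_OLD[k])[1] for k in keys]

# Map each y position back to its original label for the tick text
tick_labels = {split_label(v)[1]: v for v in T_OLD.values()}
y_ticks = sorted(tick_labels)
```

Plotting then works as before with plt.plot(x_val, y_val, 'r'), ax.set_yticks(y_ticks) and ax.set_yticklabels([tick_labels[t] for t in y_ticks]). Note that two labels with different prefixes but the same number would collide in tick_labels, so this only suits data where the trailing number is unique per label.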
