Matplotlib RuntimeError: exceeds Locator.MAXTICKS when using MultipleLocator

I am plotting a Matplotlib chart with 10000 x axis data points. To avoid the X axis labels overlapping, I have used a Major MultipleLocator of 40 and a minor MultipleLocator of 10. This code works for 1000 data points.
from matplotlib import pyplot as plt
import numpy as np
import matplotlib.ticker as mticker
# generating 9999 data points (years 1 to 9999)
years = [i for i in range(1,10000)]
data = np.random.rand(len(years))
fig, ax = plt.subplots(figsize = (18,6))
ind = np.arange(len(data))
bars1 = ax.bar(ind, data,
label='Data')
ax.set_title("Data vs Year")
#Format Y Axis
ax.set_ylabel("Data")
ax.set_ylim((0,1))
#Format X Axis
ax.set_xticks(range(0,len(ind)))
ax.set_xticklabels(years)
ax.set_xlabel("Years")
ax.xaxis.set_major_locator(mticker.MultipleLocator(40))
ax.xaxis.set_major_formatter(mticker.FormatStrFormatter('%d'))
ax.xaxis.set_minor_locator(mticker.MultipleLocator(10))
fig.autofmt_xdate()
ax.xaxis_date()
plt.tight_layout()
plt.show()
The code above produces the following error:
RuntimeError: Locator attempting to generate 1102 ticks from -510.0 to 10500.0: exceeds Locator.MAXTICKS
Can you please tell me what is wrong with this chart?

First of all, you should remove these two lines:
ax.set_xticks(range(0,len(ind)))
ax.set_xticklabels(years)
These two lines first place a tick on every one of the roughly 10000 bars. Because ax.xaxis.set_major_locator() and ax.xaxis.set_minor_locator() are called afterwards and replace those ticks, the two lines are not needed. The remaining problem is the line ax.xaxis.set_minor_locator(mticker.MultipleLocator(10)): over the visible range it generates 1102 ticks, which exceeds mticker.Locator.MAXTICKS (1000). In my testing, the argument has to be at least 12 to stay under the limit.
Passing a larger argument to mticker.MultipleLocator() produces fewer ticks.
If, for whatever reason, you really do need 277 major ticks (every 40) and 1102 minor ticks (every 10), you can raise the limit with mticker.Locator.MAXTICKS = 2000
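As a rough sketch of the advice above, the plot can be built without set_xticks/set_xticklabels and with wider locator spacing. The spacings 400 and 100 are arbitrary choices for illustration (they are not from the question), and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so the sketch runs headless
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

years = np.arange(1, 10000)
data = np.random.rand(len(years))

fig, ax = plt.subplots(figsize=(18, 6))
ax.bar(years, data, label="Data")
ax.set_title("Data vs Year")
ax.set_xlabel("Years")
ax.set_ylabel("Data")
ax.set_ylim((0, 1))

# No set_xticks / set_xticklabels: the locators alone decide where ticks go.
# 400 and 100 are illustrative spacings that keep the tick count small.
ax.xaxis.set_major_locator(mticker.MultipleLocator(400))
ax.xaxis.set_minor_locator(mticker.MultipleLocator(100))
ax.xaxis.set_major_formatter(mticker.FormatStrFormatter("%d"))

# Count the ticks the locators produce for the current view limits
n_major = len(ax.xaxis.get_majorticklocs())
n_minor = len(ax.xaxis.get_minorticklocs())
print(n_major, n_minor, mticker.Locator.MAXTICKS)
```

Both counts should come out far below Locator.MAXTICKS. The same check can be used to find the smallest spacing that still fits under the limit for a given data range.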

Related

Why is Python matplot not starting from the point where my Data starts [duplicate]

So currently learning how to import data and work with it in matplotlib and I am having trouble even tho I have the exact code from the book.
This is what the plot looks like, but my question is how can I get it where there is no white space between the start and the end of the x-axis.
Here is the code:
import csv
from matplotlib import pyplot as plt
from datetime import datetime
# Get dates and high temperatures from file.
filename = 'sitka_weather_07-2014.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    #for index, column_header in enumerate(header_row):
    #    print(index, column_header)
    dates, highs = [], []
    for row in reader:
        current_date = datetime.strptime(row[0], "%Y-%m-%d")
        dates.append(current_date)
        high = int(row[1])
        highs.append(high)
# Plot data.
fig = plt.figure(dpi=128, figsize=(10,6))
plt.plot(dates, highs, c='red')
# Format plot.
plt.title("Daily high temperatures, July 2014", fontsize=24)
plt.xlabel('', fontsize=16)
fig.autofmt_xdate()
plt.ylabel("Temperature (F)", fontsize=16)
plt.tick_params(axis='both', which='major', labelsize=16)
plt.show()
By default, Matplotlib adds a margin at the edges so that the data fit nicely within the axis spines. On the y axis such a margin is probably desired in this case. The default margin is 0.05, expressed as a fraction of the data range.
To set the margin to 0 on the x axis, use
plt.margins(x=0)
or
ax.margins(x=0)
depending on the context. Also see the documentation.
In case you want to get rid of the margin in the whole script, you can use
plt.rcParams['axes.xmargin'] = 0
at the beginning of your script (the same works for y, of course). If you want to get rid of the margin entirely and permanently, you can change the corresponding lines in the matplotlib rc file:
axes.xmargin : 0
axes.ymargin : 0
Example
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset('tips')
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
tips.plot(ax=ax1, title='Default Margin')
tips.plot(ax=ax2, title='Margins: x=0')
ax2.margins(x=0)
Alternatively, use plt.xlim(..) or ax.set_xlim(..) to manually set the limits of the axes such that there is no white space left.
If you only want to remove the margin on one side but not the other, e.g. remove the margin from the right but not from the left, you can use set_xlim() on a matplotlib axes object.
import seaborn as sns
import matplotlib.pyplot as plt
import math
max_x_value = 100
x_values = [i for i in range (1, max_x_value + 1)]
y_values = [math.log(i) for i in x_values]
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
sns.lineplot(ax=ax1, x=x_values, y=y_values)
sns.lineplot(ax=ax2, x=x_values, y=y_values)
ax2.set_xlim(-5, max_x_value) # tune the -5 to your needs
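For the date-based plot in the question, the effect of margins(x=0) can be checked with a small sketch like the one below. The dates and temperatures are synthetic stand-ins for the CSV file (made up for illustration), and the Agg backend is assumed only so the code runs without a display.

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for a headless run
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Synthetic stand-in for the CSV data (values are invented for illustration)
dates = [datetime(2014, 7, 1) + timedelta(days=i) for i in range(31)]
highs = [60 + (i * 7) % 15 for i in range(31)]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.plot(dates, highs, c="red")
ax1.set_title("Default margin")

ax2.plot(dates, highs, c="red")
ax2.margins(x=0)  # no padding left and right of the data
ax2.set_title("margins(x=0)")

fig.autofmt_xdate()

# With a zero x margin the view limits coincide with the first and last date
xmin, xmax = ax2.get_xlim()
first, last = mdates.date2num(dates[0]), mdates.date2num(dates[-1])
print(xmin == first, xmax == last)
```

The left panel keeps the default padding, so its lower x limit lies before the first date; the right panel starts and ends exactly at the data.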

Plotting data with variable frequency vs elapsed time

I have a dataset in pandas with measurements acquired with varying sample time.
I'm trying to plot the data against elapsed time, but the time axis gets messed up.
I've read that .autofmt_xdate() should solve the issue, but it only works for data with a fixed sampling frequency. With my data, the x axis has no labels at all.
Here is a simple example of both cases:
import pandas as pd
import matplotlib.pyplot as plt
idx1 = pd.to_timedelta(['00:00:00', '00:00:30', '00:01:00', '00:01:30', '00:02:00'])
idx2 = pd.to_timedelta(['00:00:01', '00:00:30', '00:01:00', '00:01:30', '00:02:00'])
vals = range(5)
s1= pd.Series(vals, idx1)
s2= pd.Series(vals, idx2)
# Labels on x are ok
plt.figure()
plt.gca().set_title('fixed frequency f=30s')
s1.plot()
plt.gcf().autofmt_xdate()
plt.show()
# Labels on x are messed up
plt.figure()
plt.gca().set_title('variable frequency')
s2.plot()
plt.gcf().autofmt_xdate()
plt.show()

Is there a maximum amount of ticklabels in a matplotlib axes?

I have two lists, one containing a bunch of years and the other containing some integers. Each list has 17 values.
When I make a simple bar chart
plt.bar(keys,values)
plt.show()
the X axis of the graph shows only some of the years in the keys list, e.g. only 2000, 2002, 2005, 2007, 2010, 2012, 2015. It has skipped 2001, 2003, 2006, 2008, 2009, etc.
Is it because there is a maximum number of keys allowed in a bar chart, so it picked some keys at random?
If not, how do I fix this?
There is a maximum number of ticks on a matplotlib axis (Locator.MAXTICKS, 1000 by default). That limit is far above the 17 values here, however, and you would run into severe lag when creating the figure before it becomes relevant.
The usual automatic ticking by matplotlib is that the axes are equipped with just as many labels as needed. I.e. if you plot 50 points on a plot, you would not want to have 50 labels as well. Further if you plot a point at 0.853164 you would not want to have such odd number being displayed as ticklabel on the axes.
Now, I cannot think of any reason matplotlib would produce the labels you report about, 2000,2002,2005,2007,2010,2012,2015, because the automatic locator for the ticks chooses equidistant points on the axes. For any help with this specific problem, a MCVE would be needed.
But in general there are two concepts from which you may choose.
Numerical axes
When plotting numbers, matplotlib will by default choose a linear axes and tick it automatically as described above.
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(2000, 2017)
y = np.random.randint(5,21, size=len(x))
plt.bar(x,y)
plt.show()
In this case an equidistant ticking of 2.5 units is chosen to have 7 nicely spaced ticks on the axes. If instead you would want to have every bar ticked, you could use a custom ticker.
E.g. a MultipleLocator with the interval set to 1,
import matplotlib.ticker as mticker
plt.gca().xaxis.set_major_locator(mticker.MultipleLocator(1))
plt.gca().tick_params(axis="x", rotation=90)
Or, a FixedLocator with the locations set to the x values of the bars,
import matplotlib.ticker as mticker
plt.gca().xaxis.set_major_locator(mticker.FixedLocator(x))
plt.gca().tick_params(axis="x", rotation=90)
Categorical axes
You may also decide that your x axis shall be categorical. This means that every unique value gets its own tick and those ticks are equally spaced, independent of their value. This is most easily accomplished by converting the numbers to strings.
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(2000, 2017)
y = np.random.randint(5,21, size=len(x))
cats = list(map(str, x))
plt.bar(cats,y)
plt.gca().tick_params(axis="x", rotation=90)
plt.show()
The result is visually the same as above, but this time, the number 2000 is not at location 2000, but at its index 0, 2001 is at 1 and so on.
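The claim about tick locations on a categorical axis can be checked with a short sketch. The bar heights are arbitrary, and the Agg backend is assumed only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for a headless run
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(2000, 2017)
y = np.arange(5, 5 + len(x))      # arbitrary bar heights for illustration
cats = [str(v) for v in x]        # strings make the axis categorical

fig, ax = plt.subplots()
bars = ax.bar(cats, y)
ax.tick_params(axis="x", rotation=90)

# On a categorical axis each category sits at its index: 0, 1, 2, ...
tick_positions = list(ax.get_xticks())
bar_centers = [b.get_x() + b.get_width() / 2 for b in bars]
print(tick_positions)
print(bar_centers)
```

Both lists hold the indices of the categories rather than the year values, which is why every bar gets its own tick regardless of how the years are spaced.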
You can show all the ticks in this way:
plt.xticks(np.arange(min(keys), max(keys)+1, 1.0), rotation=45)
Example:
keys = [2000, 2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016]
values = range(1,18)
import numpy as np
import matplotlib.pyplot as plt
plt.bar(keys,values)
plt.xticks(np.arange(min(keys), max(keys)+1, 1.0), rotation=45)
plt.show()

how to plot using matplotlib histogram where x-axis is daily timestamp range [duplicate]

My data records the time at which workers finished their daily task.
The list has 100,000 entries. The times are within the range of 15:00:00 to 19:00:00.
I have difficulty plotting the data as a histogram because the x-axis values are times rather than plain numbers. (My histogram has an empty gap from 61 to 99 minutes.)
Data=['16:24:00',
'17:48:00',
'16:10:00',
'16:46:00',
'17:13:00',
'15:31:00',
'16:23:00',
'16:53:00',
'16:28:00',
'16:33:00',
'17:38:00',
'17:08:00',
'16:29:00',
'16:25:00',
'16:17:00',
'17:38:00',
'16:29:00',
...]
I have tried using matplotlib.dates to format the axis but encounter ValueError: ordinal must be >= 1
Attempt 1
fig, ax = plt.subplots(1,1)
ax.hist(Data ,bins=50)
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))
ax.xaxis.set_minor_locator(mdates.MinuteLocator())
plt.show()
Attempt 2
fig, ax = plt.subplots(1,1)
locator = mdates.AutoDateLocator()
ax.hist(Data ,bins=50)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(locator))
ax.xaxis.set_minor_locator(mdates.MinuteLocator())
plt.show()
I hope to get something like this but with x-axis printed and y-axis represents occurrences
Imho the problem is that your Data are strings and therefore treated as categorical data. Transform them into datetime objects:
from datetime import datetime
from matplotlib import pyplot as plt
from matplotlib import dates as mdates
Data=['16:24:00',
'17:48:00',
'16:10:00',
'16:46:00',
'17:13:00',
'15:31:00',
'16:23:00',
'16:53:00',
'16:28:00',
'16:33:00',
'17:38:00',
'17:08:00',
'16:29:00',
'16:25:00',
'16:17:00',
'17:38:00',
'16:29:00']
#convert strings into datetime objects
conv_time = [datetime.strptime(i, "%H:%M:%S") for i in Data]
#define bin number
bin_nr = 7
fig, ax = plt.subplots(1,1)
#create histogram, get bin position for label
_counts, bins, _patches = ax.hist(conv_time, bins = bin_nr)
#set xticks at bin edges
plt.xticks(bins)
#reformat bin label into format hour:minute
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
plt.show()
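A variation on the same idea, sketched under a few assumptions: instead of letting hist pick 7 bins, the bin edges are fixed at 30-minute steps from 15:00 to 19:00. The 30-minute width and the shortened sample list are illustrative choices, not part of the original answer. Note that strptime without a date component places every time on 1900-01-01, so the edges use that date too.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for a headless run
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

data = ['16:24:00', '17:48:00', '16:10:00', '16:46:00', '17:13:00', '15:31:00']

# Parse the strings; the date part defaults to 1900-01-01
times = [datetime.strptime(s, "%H:%M:%S") for s in data]
x = mdates.date2num(times)

# Illustrative bin edges: every 30 minutes from 15:00 to 19:00
edges = mdates.date2num(
    [datetime(1900, 1, 1, 15 + m // 60, m % 60) for m in range(0, 241, 30)]
)

fig, ax = plt.subplots()
counts, bins, _patches = ax.hist(x, bins=edges)

# One tick per bin edge, shown as hour:minute
ax.set_xticks(edges)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
fig.autofmt_xdate()

print([int(c) for c in counts])
```

Fixed edges keep the bins aligned to round clock times, which tends to make the tick labels easier to read than edges derived from the data range.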

Seaborn right ytick [duplicate]

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
d = ['d1','d2','d3','d4','d5','d6']
value = [111111, 222222, 333333, 444444, 555555, 666666]
y_cumsum = np.cumsum(value)
sns.barplot(d, value)
sns.pointplot(d, y_cumsum)
plt.show()
I'm trying to make a Pareto diagram with barplot and pointplot, but I can't print percentages on the right-side y ticks. Also, if I manipulate the y ticks, the labels overlap:
plt.yticks([1,2,3,4,5])
The ticks overlap, as in the image.
Edit: I mean that I also want quarter percentages (0, 25%, 50%, 75%, 100%) on the right-hand side of the graphic.
From what I understood, you want to show the percentages on the right hand side of your figure. To do that, we can create a second y axis using twinx(). All we need to do then is to set the limits of this second axis appropriately, and set some custom labels:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
d = ['d1','d2','d3','d4','d5','d6']
value = [111111, 222222, 333333, 444444, 555555, 666666]
fig, ax = plt.subplots()
ax2 = ax.twinx() # create a second y axis
y_cumsum = np.cumsum(value)
sns.barplot(x=d, y=value, ax=ax)
sns.pointplot(x=d, y=y_cumsum, ax=ax)
y_max = y_cumsum.max() # maximum of the array
# find the percentages of the max y values.
# This will be where the "0%, 25%" labels will be placed
ticks = [0, 0.25*y_max, 0.5*y_max, 0.75*y_max, y_max]
ax2.set_ylim(ax.get_ylim()) # set second y axis to have the same limits as the first y axis
ax2.set_yticks(ticks)
ax2.set_yticklabels(["0%", "25%","50%","75%","100%"]) # set the labels
ax2.grid(False) # turn off the grid of the second axis
plt.show()
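As an alternative to writing the percentage labels by hand, the right-hand axis can be formatted with matplotlib's PercentFormatter. This is a sketch that uses plain matplotlib bars and a line in place of the seaborn calls (a substitution made here for self-containment), with the Agg backend assumed so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for a headless run
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

d = ['d1', 'd2', 'd3', 'd4', 'd5', 'd6']
value = [111111, 222222, 333333, 444444, 555555, 666666]
y_cumsum = np.cumsum(value)
total = y_cumsum[-1]

fig, ax = plt.subplots()
ax.bar(d, value)                                  # stand-in for sns.barplot
ax.plot(d, y_cumsum, marker="o", color="black")   # stand-in for sns.pointplot
ax.set_ylim(0, total * 1.05)

ax2 = ax.twinx()                 # second y axis sharing the same x axis
ax2.set_ylim(ax.get_ylim())      # same limits, so both scales line up
ax2.set_yticks(np.linspace(0, total, 5))

# xmax is the value that corresponds to 100%
pct = mticker.PercentFormatter(xmax=total, decimals=0)
ax2.yaxis.set_major_formatter(pct)
ax2.grid(False)
```

Because the formatter derives each label from the tick value, the labels stay correct if the tick positions are changed later, for example to steps of 10%.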
