Matplotlib Formatting X-Axis Shows Jan-1-1970 - python-3.x

I'm new to Matplotlib and trying to format the dates on the x-axis. If I just use plt.xticks, the dates are correct. But if I try to format the values using ax.xaxis.set_major_formatter, the axis labels change to dates starting at Jan-1-1970. I'm sure this is newbie stuff; thanks for the bootstrap. (BTW, I'm running this in a JupyterLab notebook.)
import numpy as np
import pandas as pd
from datetime import date
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
%matplotlib inline
# data to plot
#df_plot = df_dts[df_dts.dt>"8/17/2020"]
df_plot = pd.DataFrame({
'dt': [date(2020,8,19), date(2020,8,20), date(2020,8,21), date(2020,8,22)],
'open_cnt': [2,15,2,7],
'close_cnt': [0,2,11,0]
})
# create plot
fig, ax = plt.subplots()
fig.set_size_inches(10, 5, forward=True)
index = np.arange(len(df_plot))
bar_width = 0.35
opacity = 0.8
rects1 = plt.bar(index, df_plot.open_cnt, bar_width, alpha=opacity, color='orange', label='Open')
rects2 = plt.bar(index + bar_width, df_plot.close_cnt, bar_width, alpha=opacity, color='g', label='Close')
plt.xlabel('Date')
plt.ylabel('Workitems')
plt.title('Open & Close Rates')
plt.xticks(index + bar_width/2, df_plot.dt)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b-%d-%y'))
plt.show()

Instead of fiddling with the formatter, set the index of your DataFrame to the desired text representation of your dates and call plot.bar on the result:
fig, ax = plt.subplots(figsize=(10,5))
ax = df_plot.set_index(df_plot.dt.map(lambda s: s.strftime('%b-%d-%y')))\
.plot.bar(ax=ax, legend=False, title='Open & Close Rates', rot=0,
color=['orange', 'green'])
ax.set_xlabel('Date')
ax.set_ylabel('Workitems');
For your data I got the following picture:
As you can see, my code is more concise than yours.

If you don't want to change your original code much, you can simply do the transformation when you set the xticks.
df_plot['dt'] = pd.to_datetime(df_plot['dt'], format='%Y-%m-%d')
plt.xticks(index + bar_width/2, df_plot['dt'].dt.strftime('%b-%d-%y'))
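Neither answer spells out why the labels jump to 1970, so here is a hedged sketch of the likely cause. The bars are drawn at positions 0, 1, 2, ..., and DateFormatter interprets tick positions as days since Matplotlib's date epoch. I am assuming the default epoch of 1970-01-01 used by Matplotlib 3.3 and later, so position 0 formats as Jan-01-70. If the bars are placed at real date numbers (via mdates.date2num), the same formatter gives the expected labels. The Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
from datetime import date
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

dates = [date(2020, 8, 19), date(2020, 8, 20), date(2020, 8, 21), date(2020, 8, 22)]
open_cnt = [2, 15, 2, 7]
close_cnt = [0, 2, 11, 0]

formatter = mdates.DateFormatter('%b-%d-%y')

# A tick at position 0 is read as "0 days since the epoch".
# Assumption: default epoch (1970-01-01), i.e. Matplotlib 3.3 or later.
label_at_zero = formatter(0)  # 'Jan-01-70' under the default epoch

# Place the bars at real date numbers instead of 0, 1, 2, ...
x = mdates.date2num(dates)
bar_width = 0.35  # in days, because the x-axis unit is now days

fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(x - bar_width / 2, open_cnt, bar_width, alpha=0.8, color='orange', label='Open')
ax.bar(x + bar_width / 2, close_cnt, bar_width, alpha=0.8, color='g', label='Close')
ax.set_xticks(x)  # one tick per date, centered between the two bars
ax.xaxis.set_major_formatter(formatter)
ax.set_xlabel('Date')
ax.set_ylabel('Workitems')
ax.set_title('Open & Close Rates')
ax.legend()

fig.canvas.draw()  # make sure the tick labels are populated
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(label_at_zero, tick_labels)
```

With this layout the formatter can stay in place, and the bar width is expressed in days rather than in index units.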

Related

Combine bar plot and line plot in seaborn [duplicate]

I have dataframe like this:
df_meshX_min_select = pd.DataFrame({
'Number of Elements' : [5674, 8810,13366,19751,36491],
'Time (a)' : [42.14, 51.14, 55.64, 55.14, 56.64],
'Different Result(Temperature)' : [0.083849, 0.057309, 0.055333, 0.060516, 0.035343]})
I tried to combine a bar plot (Number of Elements vs. Different Result) and a line plot (Number of Elements vs. Time) in the same figure, but ran into the following problem:
It seems that the x values don't match when the two plots are combined, even though the data frame shows that both use exactly the same x values.
My expectation is to combine these two plots into one figure.
This is the code that I wrote:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
df_meshX_min_select = pd.DataFrame({
'Number of Elements' : [5674, 8810,13366,19751,36491],
'Time (a)' : [42.14, 51.14, 55.64, 55.14, 56.64],
'Different Result(Temperature)' : [0.083849, 0.057309, 0.055333, 0.060516, 0.035343]})
x1= df_meshX_min_select["Number of Elements"]
t1= df_meshX_min_select["Time (a)"]
T1= df_meshX_min_select["Different Result(Temperature)"]
#Create combo chart
fig, ax1 = plt.subplots(figsize=(10,6))
color = 'tab:green'
#bar plot creation
ax1.set_title('Mesh Analysis', fontsize=16)
ax1.set_xlabel('Number of elements', fontsize=16)
ax1.set_ylabel('Different Result(Temperature)', fontsize=16)
ax1 = sns.barplot(x='Number of Elements', y='Different Result(Temperature)', data = df_meshX_min_select)
ax1.tick_params(axis='y')
#specify we want to share the same x-axis
ax2 = ax1.twinx()
color = 'tab:red'
#line plot creation
ax2.set_ylabel('Time (a)', fontsize=16)
ax2 = sns.lineplot(x='Number of Elements', y='Time (a)', data = df_meshX_min_select, sort=False, color=color, ax=ax2)
ax2.tick_params(axis='y', color=color)
#show plot
plt.show()
Can anyone help me, please?
Seaborn and pandas use a categorical x-axis for bar plots (internally numbered 0,1,2,...) and floating-point numbers for a line plot. Note that your x-values aren't evenly spaced, so either the bars would have weird distances between them, or wouldn't align with the x-values from the line plot.
Here is a solution using standard matplotlib to combine both graphs.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
df_meshx_min_select = pd.DataFrame({
'number of elements': [5674, 8810, 13366, 19751, 36491],
'time (a)': [42.14, 51.14, 55.64, 55.14, 56.64],
'different result(temperature)': [0.083849, 0.057309, 0.055333, 0.060516, 0.035343]})
x1 = df_meshx_min_select["number of elements"]
t1 = df_meshx_min_select["time (a)"]
d1 = df_meshx_min_select["different result(temperature)"]
fig, ax1 = plt.subplots(figsize=(10, 6))
color = 'limegreen'
ax1.set_title('mesh analysis', fontsize=16)
ax1.set_xlabel('number of elements', fontsize=16)
ax1.set_ylabel('different result(temperature)', fontsize=16, color=color)
ax1.bar(x1, height=d1, width=2000, color=color)
ax1.tick_params(axis='y', colors=color)
ax2 = ax1.twinx() # share the x-axis, new y-axis
color = 'crimson'
ax2.set_ylabel('time (a)', fontsize=16, color=color)
ax2.plot(x1, t1, color=color)
ax2.tick_params(axis='y', colors=color)
plt.show()
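If you would rather keep evenly spaced bars (the categorical layout described above), one option is to plot both series against the same integer positions and only label the ticks with the element counts. This is a minimal sketch in plain matplotlib, not the seaborn call from the question; the variable names are made up for illustration, and the Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import matplotlib.pyplot as plt

elements = [5674, 8810, 13366, 19751, 36491]
times = [42.14, 51.14, 55.64, 55.14, 56.64]
diffs = [0.083849, 0.057309, 0.055333, 0.060516, 0.035343]

# Shared categorical positions 0, 1, 2, ... for both plots
positions = np.arange(len(elements))

fig, ax1 = plt.subplots(figsize=(10, 6))
ax1.bar(positions, diffs, color='limegreen')
ax1.set_xticks(positions)
ax1.set_xticklabels([str(n) for n in elements])  # show the real values as labels
ax1.set_xlabel('Number of elements', fontsize=16)
ax1.set_ylabel('Different Result(Temperature)', fontsize=16)

ax2 = ax1.twinx()  # share the x-axis, new y-axis
ax2.plot(positions, times, color='crimson', marker='o')
ax2.set_ylabel('Time (a)', fontsize=16)
```

The trade-off is that the x-axis no longer reflects the true spacing between element counts; it only preserves their order.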
I was plotting a boxplot with a lineplot and had the same problem, even though my two x-axes were identical. I solved it by converting my x-axis column to string type:
df_meshX_min_select['Number of Elements'] = df_meshX_min_select['Number of Elements'].astype('string')
After this conversion, the combined plot works with seaborn.

Pandas & Matplotlib: personalize the date format in a line chart

I want to make the dates on the x-axis look prettier; currently they cannot even be read. What is the best way to do this?
Below are the code and the actual graph:
import matplotlib.pyplot as plt
import pandas as pd
df = dataset
# gca stands for 'get current axes'
ax = plt.gca()
y1 = df['Predicted_Lower']
y2 = df['Predicted_Upper']
x = df['Date']
ax.fill_between(x,y1, y2, facecolor="#CC6666", alpha=0.7)
df.plot(kind='line',x='Date',y='Predicted_Lower',color='white',ax=ax)
df.plot(kind='line',x='Date',y='Predicted_Upper',color='white', ax=ax)
df.plot(kind='line',x='Date',y='Predicted', color='yellow', ax=ax)
df.plot(kind='line',x='Date',y='Actuals', color='green', ax=ax)
plt.xticks(rotation=45)
plt.show()
You can reduce the number of labels by passing the locs and labels parameters to matplotlib.pyplot.xticks. For example, get the current locs and labels and show only every third one:
# ...
df.plot(kind='line',x='Date',y='Actuals', color='green', ax=ax)
locs, labels = plt.xticks()
plt.xticks(locs[::3], labels[::3], rotation=45)
plt.show()
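Another option, if the Date column holds real datetimes (an assumption, since the question does not show the dtype) and you plot with matplotlib directly, is to let a date locator choose the tick positions and a date formatter shape the labels. The data below is a made-up stand-in for the question's dataset, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical stand-in data: 90 consecutive days starting 2020-01-01
dates = [datetime(2020, 1, 1) + timedelta(days=i) for i in range(90)]
values = [i % 7 for i in range(90)]

fig, ax = plt.subplots()
ax.plot(dates, values, color='green')

# One tick at the start of each month, labelled like "Feb 2020"
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
fig.autofmt_xdate(rotation=45)  # rotate and right-align the date labels

fig.canvas.draw()  # make sure the tick labels are populated
labels = [t.get_text() for t in ax.get_xticklabels()]
```

Compared with slicing locs and labels, this keeps the ticks on meaningful dates even if the data range changes.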

setting manual x-axis ticks violin plot

I'm trying to build a violin plot using matplotlib.
I tried to set the x-axis tick labels manually, based on the example provided here, but it does not work. What am I missing?
Here is an MWE:
#!/usr/bin/env python3
import os
import numpy as np
import warnings
import matplotlib.pyplot as plt
import matplotlib.cbook
import matplotlib as mpl
warnings.filterwarnings("ignore",category=matplotlib.cbook.mplDeprecation)
OUTPUT_PATH=os.getcwd() + "/"
# Dots per inch for figure.
DPI = 500
def test_plot():
    fig = plt.figure()
    vector_size=100
    bucket2 = np.random.rand(vector_size)
    bucket3 = np.random.rand(vector_size)
    bucket4 = np.random.rand(vector_size)
    bucket5 = np.random.rand(vector_size)
    bucket6 = np.random.rand(vector_size)
    pos = [1,2,3,4,5]
    data= [np.array(bucket2), np.array(bucket3), np.array(bucket4), np.array(bucket5), np.array(bucket6)]
    axes1 = fig.add_subplot(111)
    axes1.violinplot(data, pos, points=100, widths=0.7, showmeans=False, showextrema=True, showmedians=True)
    axes1.set_xlabel('x-axis')
    axes1.set_ylabel('y-axis')
    xticks_t = ["",".1-.2", ".2-.3", ".3-.4", ".4-.5", ">.5"]
    axes1.set_xticklabels(xticks_t)
    axes1.set_xlim([0, 5])
    axes1.spines['right'].set_visible(False)
    axes1.spines['top'].set_visible(False)
    axes1.xaxis.set_ticks_position('bottom')
    axes1.yaxis.set_ticks_position('left')
    fig.tight_layout()
    file_name = 'test_violin.pdf'
    fig.savefig(OUTPUT_PATH + str(file_name), bbox_inches='tight', dpi=DPI, pad_inches=0.1)
    fig.clf()
    plt.close()
    pass
test_plot()
You can use a LaTeX expression for the last tick label so that > is displayed correctly:
xticks_t = ["",".1-.2", ".2-.3", ".3-.4", ".4-.5", r"$>.5$"]
and comment out the x-axis limits (with set_xlim([0, 5]) the last violin, centered at 5, is cut off):
# axes1.set_xlim([0, 5])
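A variation that may be more robust, offered as a sketch rather than as what the answer above did: set the tick positions explicitly with set_xticks before setting the labels, so each label is tied to a violin position and no empty placeholder label is needed. The seeded random data is a stand-in for the question's buckets, and the Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # seeded stand-in for the question's buckets
pos = [1, 2, 3, 4, 5]
data = [rng.random(100) for _ in pos]
labels = [".1-.2", ".2-.3", ".3-.4", ".4-.5", ">.5"]

fig, ax = plt.subplots()
ax.violinplot(data, pos, points=100, widths=0.7,
              showmeans=False, showextrema=True, showmedians=True)

# Fix the tick positions first, then attach one label per position
ax.set_xticks(pos)
ax.set_xticklabels(labels)
ax.set_xlabel('x-axis')
ax.set_ylabel('y-axis')

tick_texts = [t.get_text() for t in ax.get_xticklabels()]
```

Because the ticks are pinned to pos, the labels stay aligned with the violins even if the x-limits are changed later.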

Why is Python matplot not starting from the point where my Data starts [duplicate]

I'm currently learning how to import data and work with it in matplotlib, and I'm having trouble even though I have the exact code from the book.
This is what the plot looks like. My question is: how can I remove the white space at the start and the end of the x-axis?
Here is the code:
import csv
from matplotlib import pyplot as plt
from datetime import datetime
# Get dates and high temperatures from file.
filename = 'sitka_weather_07-2014.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    #for index, column_header in enumerate(header_row):
        #print(index, column_header)
    dates, highs = [], []
    for row in reader:
        current_date = datetime.strptime(row[0], "%Y-%m-%d")
        dates.append(current_date)
        high = int(row[1])
        highs.append(high)
# Plot data.
fig = plt.figure(dpi=128, figsize=(10,6))
plt.plot(dates, highs, c='red')
# Format plot.
plt.title("Daily high temperatures, July 2014", fontsize=24)
plt.xlabel('', fontsize=16)
fig.autofmt_xdate()
plt.ylabel("Temperature (F)", fontsize=16)
plt.tick_params(axis='both', which='major', labelsize=16)
plt.show()
There is an automatic margin set at the edges, which ensures that the data fits nicely within the axis spines. In this case such a margin is probably desired on the y axis. By default the margin is 0.05, expressed as a fraction of the data range.
To set the margin to 0 on the x axis, use
plt.margins(x=0)
or
ax.margins(x=0)
depending on the context. Also see the documentation.
In case you want to get rid of the margin in the whole script, you can use
plt.rcParams['axes.xmargin'] = 0
at the beginning of your script (same for y of course). If you want to get rid of the margin entirely and forever, you might want to change the corresponding lines in the matplotlib rc file:
axes.xmargin : 0
axes.ymargin : 0
Example
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset('tips')
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
tips.plot(ax=ax1, title='Default Margin')
tips.plot(ax=ax2, title='Margins: x=0')
ax2.margins(x=0)
Alternatively, use plt.xlim(..) or ax.set_xlim(..) to manually set the limits of the axes such that there is no white space left.
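To see what the margin does in numbers, here is a small sketch that compares the x-limits of the same line with and without ax.margins(x=0). It assumes the default rcParams (axes.xmargin of 0.05); the data is made up for illustration, and the Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

x = list(range(11))       # data spans 0..10
y = [v * v for v in x]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(x, y)
ax1.set_title('Default margin')
ax2.plot(x, y)
ax2.set_title('Margins: x=0')
ax2.margins(x=0)          # remove the padding on the x-axis only

# With the default 5 % margin, the limits extend 0.05 * 10 = 0.5 past the data
default_xlim = ax1.get_xlim()
tight_xlim = ax2.get_xlim()
print(default_xlim, tight_xlim)
```

The y-axis keeps its default margin in both panels, since only x was set to 0.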
If you only want to remove the margin on one side but not the other, e.g. remove the margin from the right but not from the left, you can use set_xlim() on a matplotlib axes object.
import seaborn as sns
import matplotlib.pyplot as plt
import math
max_x_value = 100
x_values = [i for i in range (1, max_x_value + 1)]
y_values = [math.log(i) for i in x_values]
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
sns.lineplot(ax=ax1, x=x_values, y=y_values)
sns.lineplot(ax=ax2, x=x_values, y=y_values)
ax2.set_xlim(-5, max_x_value) # tune the -5 to your needs

MatPlotLib + GeoPandas: Plot Multiple Layers, Control Figsize

Given the shape file available here, I can produce the basic map that I need, with county labels and even some points on the map (see below). The issue I'm having is that I cannot seem to control the size of the figure with figsize.
Here's what I have:
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline
figsize=5,5
fig = plt.figure(figsize=(figsize),dpi=300)
shpfile=r'Y:\HQ\TH\Groups\NR\PSPD\Input\US_Counties\cb_2015_us_county_20m.shp'
c=gpd.read_file(shpfile)
c=c.loc[c['GEOID'].isin(['26161','26093','26049','26091','26075','26125','26163','26099','26115','26065'])]
c['coords'] = c['geometry'].apply(lambda x: x.representative_point().coords[:])
c['coords'] = [coords[0] for coords in c['coords']]
ax=c.plot()
#Control some attributes regarding the axis (for the plot above)
ax.spines['top'].set_visible(False);ax.spines['bottom'].set_visible(False);ax.spines['left'].set_visible(False);ax.spines['right'].set_visible(False)
ax.tick_params(axis='y',which='both',left=False,right=False,color='none',labelcolor='none')
ax.tick_params(axis='x',which='both',top=False,bottom=False,color='none',labelcolor='none')
for idx, row in c.iterrows():
    ax.annotate(row['NAME'], xy=row['coords'],
                horizontalalignment='center')
lat2=[42.5,42.3]
lon2=[-84,-83.5]
#Add another plot...
ax.plot(lon2,lat2,alpha=1,marker='o',linestyle='none',markeredgecolor='none',markersize=15,color='white')
plt.show()
As you can see, I opted to call the plots by the axis name because I need to control attributes of the axis, such as tick_params. I'm not sure if there is a better approach. This seems like a "no-brainer" but I can't seem to figure out why I can't control the figure size.
Thanks in advance!
I just had to do the following:
1. Use fig, ax = plt.subplots(1, 1, figsize = (figsize))
2. Use the ax=ax argument in c.plot()
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline
figsize=5,5
#fig = plt.figure(figsize=(figsize),dpi=300)
#ax = fig.add_subplot(111)
fig, ax = plt.subplots(1, 1, figsize = (figsize))
shpfile=r'Y:\HQ\TH\Groups\NR\PSPD\Input\US_Counties\cb_2015_us_county_20m.shp'
c=gpd.read_file(shpfile)
c=c.loc[c['GEOID'].isin(['26161','26093','26049','26091','26075','26125','26163','26099','26115','26065'])]
c['coords'] = c['geometry'].apply(lambda x: x.representative_point().coords[:])
c['coords'] = [coords[0] for coords in c['coords']]
c.plot(ax=ax)
ax.spines['top'].set_visible(False);ax.spines['bottom'].set_visible(False);ax.spines['left'].set_visible(False);ax.spines['right'].set_visible(False)
ax.tick_params(axis='y',which='both',left=False,right=False,color='none',labelcolor='none')
ax.tick_params(axis='x',which='both',top=False,bottom=False,color='none',labelcolor='none')
for idx, row in c.iterrows():
    ax.annotate(row['NAME'], xy=row['coords'],
                horizontalalignment='center')
lat2=[42.5,42.3]
lon2=[-84,-83.5]
ax.plot(lon2,lat2,alpha=1,marker='o',linestyle='none',markeredgecolor='none',markersize=15,color='white')
