How to fill the area near the y axis in a plot? - python-3.x

I need to plot two features of a dataframe, with df['DEPTH'] inverted on the y-axis and df['SPECIES'] on the x-axis. Since the plot is a line that varies with depth, I would like to fill with color the area between the line and the y-axis (the left side of the line). So I wrote some code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'DEPTH': [100, 150, 200, 250, 300, 350, 400, 450, 500, 550],
                   'SPECIES': [12, 8, 9, 6, 10, 7, 4, 3, 1, 2]})
plt.plot(df['SPECIES'], df['DEPTH'])
plt.fill_between(df['SPECIES'], df['DEPTH'])
plt.ylabel('DEPTH')
plt.xlabel('SPECIES')
plt.ylim(np.max(df['DEPTH']), np.min(df['DEPTH']))
I tried plt.fill_between, but the left part of the plot doesn't get completely filled.
Does anyone know how the filled part (blue color) can reach the y-axis?

Instead of fill_between, you can use fill_betweenx. By default it fills from x = 0, so you also need to set the lower x limit to 0 for the filled area to touch the y-axis.
plt.plot(df['SPECIES'], df['DEPTH'])
# changing fill_between to fill_betweenx -- the order also changes
plt.fill_betweenx(df['DEPTH'], df['SPECIES'])
plt.ylabel('DEPTH')
plt.xlabel('SPECIES')
plt.ylim(np.max(df['DEPTH']), np.min(df['DEPTH']))
# setting the lower limit to 0 for the filled area to reach y axis.
plt.xlim(0,np.max(df['SPECIES']))
plt.show()
The result is below.
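The same idea can be sketched with the object-oriented interface. This is only an illustration, assuming a headless Agg backend; it spells out the baseline as x2=0 (the default of fill_betweenx) so the intent is visible, and the alpha value is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'DEPTH': [100, 150, 200, 250, 300, 350, 400, 450, 500, 550],
                   'SPECIES': [12, 8, 9, 6, 10, 7, 4, 3, 1, 2]})

fig, ax = plt.subplots()
ax.plot(df['SPECIES'], df['DEPTH'])
# Fill horizontally between the curve and the baseline x2=0 (the default)
ax.fill_betweenx(df['DEPTH'], df['SPECIES'], x2=0, alpha=0.5)
ax.set_xlabel('SPECIES')
ax.set_ylabel('DEPTH')
# Start the x-axis at 0 so the filled area touches the y-axis
ax.set_xlim(0, df['SPECIES'].max())
# Passing (max, min) inverts the depth axis
ax.set_ylim(df['DEPTH'].max(), df['DEPTH'].min())
fig.canvas.draw()
```

If the baseline should be something other than 0, the x2 argument can be set to that value, with the lower x limit set to match.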

Related

Matplotlib not displaying all values

I am trying to display the following values in the form of a bar chart. However, I am only getting one value displayed (619,1). Below is the code which I used in an attempt to plot the below graph:
import matplotlib.pyplot as plt
plt.style.use('ggplot')
values= [1, 2, 3, 4, 5]
a = [619, 101, 815, 1361, 178]
plt.figure(figsize=(5, 5))
plt.bar(a, values)
plt.show()
The bar width is set to a default value of 0.8 so when your x-axis has such a large range, the bars are so skinny that they disappear.
The reason for the 0.8 is that bar charts are typically used for labelled categories, which are effectively spaced by 1 along the x-axis.
So you can set the width directly. (It's also possible to calculate a width, to make this more automatic, but then you need to decide about overlaps, etc.)
plt.figure(figsize=(5, 5))
plt.xlim(0, 1450)
plt.bar(a, values, width = 50)
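One possible way to calculate the width automatically is to take a fraction of the smallest gap between neighbouring x positions, which keeps bars from overlapping. The helper name auto_bar_width and the 0.8 fraction are assumptions made for this sketch, not part of matplotlib.

```python
import numpy as np

def auto_bar_width(x, fraction=0.8):
    """Return `fraction` of the smallest gap between distinct x positions.

    Hypothetical helper for illustration only.
    """
    xs = np.unique(np.asarray(x, dtype=float))  # sorted, duplicates removed
    if xs.size < 2:
        return fraction  # single bar: fall back to the default width of 0.8
    return fraction * np.diff(xs).min()

a = [619, 101, 815, 1361, 178]
width = auto_bar_width(a)  # the closest pair here is 101 and 178
```

The result can then be passed as plt.bar(a, values, width=width). Bars that are close together will still be narrow relative to the full axis range, so a fixed width may remain the better choice for very uneven spacing.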
It seems your data might be better suited for a horizontal bar plot (but don't take this too seriously as it may not have the right meaning at all), and if you want horizontal bars, you can do so like this:
plt.barh(values, a)

How to fill areas between curves with different scales in a plot?

I have a dataframe with three features: DEPTH, PERMEABILITY and POROSITY. I would like to plot DEPTH on the y-axis and PERMEABILITY and POROSITY together on the x-axis, although the last two features have different scales.
df = pd.DataFrame({'DEPTH(m)': [100, 150, 200, 250, 300, 350, 400, 450, 500, 550],
'PERMEABILITY(mD)': [1000, 800, 900, 600, 200, 250, 400, 300, 100, 200],
'POROSITY(%)': [0.30, 0.25, 0.15, 0.19, 0.15, 0.10, 0.15, 0.19, 0.10, 0.15]})
I already managed to plot them together, but now I need to fill with two different colors the areas between the curves. For example, when PERMEABILITY curve is on the right side of POROSITY, the area between them should be green. If PERMEABILITY is on the left side, the area between curves should be yellow.
f, ax1 = plt.subplots()
ax1.set_xlabel('PERMEABILITY(mD)')
ax1.set_ylabel('DEPTH(m)')
ax1.set_ylim(df['DEPTH(m)'].max(), df['DEPTH(m)'].min())
ax1.plot(df['PERMEABILITY(mD)'], df['DEPTH(m)'], color='red')
ax1.tick_params(axis='x', labelcolor='red')
ax2 = ax1.twiny()
ax2.set_xlabel('POROSITY(%)')
ax2.plot(df['POROSITY(%)'], df['DEPTH(m)'], color='blue')
ax2.tick_params(axis='x', labelcolor='blue')
So the right output should look like this (sorry for the Paint image below):
Could anyone help me with this?
You can use the fill_betweenx() function; however, because you use twiny, you need to convert one of your data series to the scale of the other. Below, I converted your POROSITY data to fit the PERMEABILITY axis.
Then you can use two conditional fill_betweenx calls, one where PERMEABILITY is larger and one where the rescaled POROSITY is larger, to assign different colors to those patches. Also, since your data is discrete, the curves cross between sample points, so you need to set interpolate=True in both calls for the fill to extend to the actual crossing points.
f, ax1 = plt.subplots()
ax1.set_xlabel('PERMEABILITY(mD)')
ax1.set_ylabel('DEPTH(m)')
ax1.set_ylim(df['DEPTH(m)'].max(), df['DEPTH(m)'].min())
ax1.plot(df['PERMEABILITY(mD)'], df['DEPTH(m)'], color='red')
ax1.tick_params(axis='x', labelcolor='red')
ax2 = ax1.twiny()
ax2.set_xlabel('POROSITY(%)')
ax2.plot(df['POROSITY(%)'], df['DEPTH(m)'], color='blue')
ax2.tick_params(axis='x', labelcolor='blue')
# convert POROSITY axis to PERMEABILITY
# value-min / range -> normalized POROSITY (normp)
# normp*newrange + newmin -> stretched POROSITY to PERMEABILITY
z=df['POROSITY(%)']
x=df['PERMEABILITY(mD)']
nz=((z-np.min(z))/(np.max(z)-np.min(z)))*(np.max(x)-np.min(x))+np.min(x)
# fill between in green where PERMEABILITY is larger
ax1.fill_betweenx(df['DEPTH(m)'],x,nz,where=x>=nz,interpolate=True,color='g')
# fill between in yellow where POROSITY is larger
ax1.fill_betweenx(df['DEPTH(m)'],x,nz,where=x<=nz,interpolate=True,color='y')
plt.show()
The result is as below (I might have used different colors, but I assume that's not a concern).
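The min-max conversion used above can be pulled out into a small function, sketched here with NumPy only. The name rescale is an assumption for this illustration, and the sketch assumes the input is not constant (otherwise the division would be by zero).

```python
import numpy as np

def rescale(values, target):
    """Map `values` linearly onto the [min, max] range of `target`.

    Hypothetical helper; assumes `values` contains at least two distinct numbers.
    """
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    # (value - min) / range -> normalized to [0, 1]
    norm = (values - values.min()) / (values.max() - values.min())
    # normalized * new range + new min -> stretched to the target scale
    return norm * (target.max() - target.min()) + target.min()

porosity = [0.30, 0.25, 0.15, 0.19, 0.15, 0.10, 0.15, 0.19, 0.10, 0.15]
permeability = [1000, 800, 900, 600, 200, 250, 400, 300, 100, 200]
nz = rescale(porosity, permeability)
```

Note that the rescaled curve should line up with the curve drawn on the twinned axis only while both axes keep their automatic limits; if either x range is set manually, the mapping would need to use the axis limits instead of the data minimum and maximum.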

interpolate.interp1d linear plot doesn't agree with new inputs to the function

I have used scipy.interpolate.interp1d to have a linear interpolation between two arrays with float values. Then, I plotted the interpolation function with matplotlib. However, I noticed that some new values (that weren't originally included in the arrays representing x and y data) yield different results when plugged into the interpolation function, than what the plot suggests.
I am essentially trying to find the intersection points between a few lines that are parallel to the x-axis and the interpolation function's linear curve. By research online, I saw that many people use scipy's interpolate.interp1d for this purpose.
Here is the code:
from scipy import interpolate
import matplotlib.pyplot as plt
# Data
size = [12, 9, 6.5, 4.8, 2, 0.85, 0.45, 0.15, 0.07]
poW = [100, 99, 98, 97, 94, 80, 50, 6, 1]
# Approximate function f: size = f(poW)
f = interpolate.interp1d(poW, size, kind="linear")
# Here I create the plot
plt.axes(xscale='log') # scale x-axis
plt.plot(size, poW, "bs", # add data points with blue squares
f(poW), poW, "b") # add a blue trendline
# Draw D_10 as an additional point
plt.plot(f(10), 10, "rx", markersize=15)
# Draw D_30 as an additional point
plt.plot(f(30), 30, "rx", markersize=15)
# Draw D_60 as an additional point
plt.plot(f(60), 60, "rx", markersize=15)
plt.show()
The additional points I plot in the last 3 lines before plt.show(), don't correspond to the same positions indicated by the plot of the interpolation function itself. This is pretty interesting for me, and I can't seem to locate the problem here. I am pretty new to matplotlib and scipy, so I am sure I must be missing something. Any help or pointing in the right direction will be appreciated!
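One likely explanation is the logarithmic x-axis: matplotlib joins the data points with segments that are straight on screen, which on a log axis corresponds to interpolating log10(size) linearly, while interp1d with kind="linear" interpolates size itself. The sketch below compares the two using np.interp in place of scipy's interp1d (a substitution made only to keep the example dependency-light); the function names d_linear and d_log are assumptions for illustration.

```python
import numpy as np

size = np.array([12, 9, 6.5, 4.8, 2, 0.85, 0.45, 0.15, 0.07])
poW = np.array([100, 99, 98, 97, 94, 80, 50, 6, 1], dtype=float)

# np.interp needs increasing x-coordinates, so reverse both arrays
poW_inc, size_inc = poW[::-1], size[::-1]

def d_linear(p):
    """Interpolate size linearly in poW (what interp1d kind='linear' does)."""
    return np.interp(p, poW_inc, size_inc)

def d_log(p):
    """Interpolate log10(size) linearly in poW, matching a log-scaled axis."""
    return 10 ** np.interp(p, poW_inc, np.log10(size_inc))

for p in (10, 30, 60):
    # The two estimates agree at the data points and differ in between
    print(p, d_linear(p), d_log(p))
```

Points computed with d_log should fall on the line drawn between the markers on the log-scaled plot. Which of the two interpolations is appropriate for the data is a separate question that the plot alone does not answer.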

matplotlib: controlling position of y axis label with multiple twinx subplots

I wrote a Python script based on matplotlib that generates curves based on a common timeline. The number of curves sharing the same x axis in my plot can vary from 1 to 6 depending on user options.
Each of the data plotted use different y scales and require a different axis for drawing. As a result, I may need to draw up to 5 different Y axes on the right of my plot. I found the way in some other post to offset the position of the axes as I add new ones, but I still have two issues:
1. How to control the position of the multiple axes so that the tick labels don't overlap?
2. How to control the position of each axis label so that it is placed vertically at the bottom of each axis? And how to preserve this alignment as the display window is resized, zoomed in, etc.?
I probably need to write some code that will first query the position of the axis and then a directive that will place the label relative to that position but I really have no idea how to do that.
I cannot share my entire code because it is too big, but I derived it from the code in this example. I modified that example by adding one extra plot and one extra axis to more closely match what I intend to do in my script.
import matplotlib.pyplot as plt
def make_patch_spines_invisible(ax):
    ax.set_frame_on(True)
    ax.patch.set_visible(False)
    for sp in ax.spines.values():
        sp.set_visible(False)
fig, host = plt.subplots()
fig.subplots_adjust(right=0.75)
par1 = host.twinx()
par2 = host.twinx()
par3 = host.twinx()
# Offset the right spine of par2. The ticks and label have already been
# placed on the right by twinx above.
par2.spines["right"].set_position(("axes", 1.2))
# Having been created by twinx, par2 has its frame off, so the line of its
# detached spine is invisible. First, activate the frame but make the patch
# and spines invisible.
make_patch_spines_invisible(par2)
# Second, show the right spine.
par2.spines["right"].set_visible(True)
par3.spines["right"].set_position(("axes", 1.4))
make_patch_spines_invisible(par3)
par3.spines["right"].set_visible(True)
p1, = host.plot([0, 1, 2], [0, 1, 2], "b-", label="Density")
p2, = par1.plot([0, 1, 2], [0, 3, 2], "r-", label="Temperature")
p3, = par2.plot([0, 1, 2], [50, 30, 15], "g-", label="Velocity")
p4, = par3.plot([0,0.5,1,1.44,2],[100, 102, 104, 108, 110], "m-", label="Acceleration")
host.set_xlim(0, 2)
host.set_ylim(0, 2)
par1.set_ylim(0, 4)
par2.set_ylim(1, 65)
host.set_xlabel("Distance")
host.set_ylabel("Density")
par1.set_ylabel("Temperature")
par2.set_ylabel("Velocity")
par3.set_ylabel("Acceleration")
host.yaxis.label.set_color(p1.get_color())
par1.yaxis.label.set_color(p2.get_color())
par2.yaxis.label.set_color(p3.get_color())
par3.yaxis.label.set_color(p4.get_color())
tkw = dict(size=4, width=1.5)
host.tick_params(axis='y', colors=p1.get_color(), **tkw)
par1.tick_params(axis='y', colors=p2.get_color(), **tkw)
par2.tick_params(axis='y', colors=p3.get_color(), **tkw)
par3.tick_params(axis='y', colors=p4.get_color(), **tkw)
host.tick_params(axis='x', **tkw)
lines = [p1, p2, p3, p4]
host.legend(lines, [l.get_label() for l in lines])
# fourth y axis is not shown unless I add this line
plt.tight_layout()
plt.show()
When I run this, I obtain the following plot:
(Image: output from the above script)
In this image, question 2 above means that I would want the y-axis labels 'Temperature', 'Velocity', 'Acceleration' to be drawn directly below each of the corresponding axis.
Thanks in advance for any help.
Regards,
L.
What worked for me was ImportanceOfBeingErnest's suggestion of using text (with a line like
host.text(1.2, 0, "Velocity" , ha="left", va="top", rotation=90,
transform=host.transAxes))
instead of trying to control the label position.
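A reduced sketch of that approach, applied to all three right-hand axes, could look like the following. The offsets dictionary, the margin values and the Agg backend are assumptions for this illustration; on older Matplotlib versions the make_patch_spines_invisible helper from the question may still be needed to make the detached spines show up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, host = plt.subplots()
# Leave room on the right for the extra axes and below for the hanging labels
fig.subplots_adjust(right=0.6, bottom=0.3)

# Horizontal position of each right-hand axis, in axes fractions of host
offsets = {"Temperature": 1.0, "Velocity": 1.2, "Acceleration": 1.4}

labels = []
for name, x in offsets.items():
    par = host.twinx()
    par.spines["right"].set_position(("axes", x))
    par.spines["right"].set_visible(True)
    # Place the label with text() instead of set_ylabel(): anchored at the
    # bottom of the spine and hanging downwards, in axes coordinates
    labels.append(host.text(x, 0, name, ha="left", va="top", rotation=90,
                            transform=host.transAxes))

fig.canvas.draw()
```

Because the text is positioned with host.transAxes, it is tied to the axes box and not to data or pixel coordinates, so it should keep its place relative to each spine when the window is resized or the data is zoomed.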

Can't make dates appear on x-axis in pyplot

So I've been trying to plot some data. I have got the data to fetch from a database and placed it all correctly into the variable text_. This is the snippet of the code:
import sqlite3
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from dateutil.parser import parse
fig, ax = plt.subplots()
# Twin the x-axis twice to make independent y-axes.
axes = [ax, ax.twinx(), ax.twinx()]
# Make some space on the right side for the extra y-axis.
fig.subplots_adjust(right=0.75)
# Move the last y-axis spine over to the right by 20% of the width of the axes
axes[-1].spines['right'].set_position(('axes', 1.2))
# To make the border of the right-most axis visible, we need to turn the frame on. This hides the other plots, however, so we need to turn its fill off.
axes[-1].set_frame_on(True)
axes[-1].patch.set_visible(False)
# And finally we get to plot things...
text_ = [('01/08/2017', 6.5, 143, 88, 60.2, 3), ('02/08/2017', 7.0, 146, 90, 60.2, 4),
('03/08/2017', 6.7, 142, 85, 60.2, 5), ('04/08/2017', 6.9, 144, 86, 60.1, 6),
('05/08/2017', 6.8, 144, 88, 60.2, 7), ('06/08/2017', 6.7, 147, 89, 60.2, 8)]
colors = ('Green', 'Red', 'Blue')
label = ('Blood Sugar Level (mmol/L)', 'Systolic Blood Pressure (mm Hg)', 'Diastolic Blood Pressure (mm Hg)')
y_axisG = [text_[0][1], text_[1][1], text_[2][1], text_[3][1], text_[4][1], text_[5][1]] #Glucose data
y_axisS = [text_[0][2], text_[1][2], text_[2][2], text_[3][2], text_[4][2], text_[5][2]] # Systolic Blood Pressure data
y_axisD = [text_[0][3], text_[1][3], text_[2][3], text_[3][3], text_[4][3], text_[5][3]] # Diastolic Blood Pressure data
AllyData = [y_axisG, y_axisS, y_axisD] #list of the lists of data
dates = [text_[0][0], text_[1][0], text_[2][0], text_[3][0], text_[4][0], text_[5][0]] # the dates as strings
x_axis = [(parse(x, dayfirst=True)) for x in dates] #converting the dates to datetime format for the graph
Blimits = [5.5, 130, 70] #lower limits of the axis
Tlimits = [8, 160, 100] #upper limits of the axis
for ax, color, label, AllyData, Blimits, Tlimits in zip(axes, colors, label, AllyData, Blimits, Tlimits):
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y')) # formats the date
    plt.gca().xaxis.set_major_locator(mdates.DayLocator())
    data = AllyData
    ax.plot(data, color=color) # plots the data on each y-axis
    ax.set_ylim([Blimits, Tlimits]) # limits
    ax.set_ylabel(label, color=color) # y-axis labels
    ax.tick_params(axis='y', colors=color)
axes[0].set_xlabel('Date', labelpad=20)
plt.gca().set_title("Last 6 Month's Readings",weight='bold',fontsize=15)
plt.show()
The code currently makes this graph:
(Image: graph with no x-values)
I understand the problem is probably in the ax.plot part, but I'm not sure what exactly. I tried writing that line as ax.plot(data, x_axis, color=color); however, this messed up the whole graph, and the dates didn't show up on the x-axis like I wanted them to.
Is there something I've missed?
If this has been answered elsewhere, please can you show me how to implement that into my code by editing my code?
Thanks a ton
Apparently x_axis is never actually used in the code. Instead of
ax.plot(data, color=color)
which plots the data against its indices, you would want to plot the data against the dates stored in x_axis.
ax.plot(x_axis, data, color=color)
Finally, adding plt.gcf().autofmt_xdate() just before plt.show() will rotate the date labels so that they don't overlap.
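Put together for a single series, the fix could be sketched as below. This is a reduced illustration and not the full three-axis script: it assumes the Agg backend, parses the dates with datetime.strptime in place of dateutil, and uses a day-first '%d/%m/%Y' tick format on the assumption that the dates are meant to be read day first.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime

dates = ['01/08/2017', '02/08/2017', '03/08/2017',
         '04/08/2017', '05/08/2017', '06/08/2017']
glucose = [6.5, 7.0, 6.7, 6.9, 6.8, 6.7]

# Convert the day-first strings to datetime objects
x_axis = [datetime.strptime(d, '%d/%m/%Y') for d in dates]

fig, ax = plt.subplots()
# x values first, then y values
ax.plot(x_axis, glucose, color='green')
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m/%Y'))
ax.set_ylabel('Blood Sugar Level (mmol/L)', color='green')
fig.autofmt_xdate()  # rotate the date labels so they don't overlap
fig.canvas.draw()

tick_labels = [t.get_text() for t in ax.get_xticklabels()]
```

The argument order matters here: ax.plot(data, x_axis) would put the dates on the y-axis, which would explain the distorted graph described in the question.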
