I'm trying to add a line of best fit to a scatter plot using plt.axhline(). My current code is below; it works, but it does not yet include the axhline call:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('test5.csv', sep=',')  # read_csv already returns a DataFrame
x = df["IE ratio"]
y = df["109"]
x1 = df["IE ratio"].mean()
plt.axvline(x1, 0, 1, c= 'k')
plt.scatter(x, y, s = 10)
plt.ylabel('Appearance of mutation')
plt.xlabel('IE spectrum')
plt.show()
I've tried to bring in plt.axhline() but can't work out which arguments to pass to get my desired output.
Here's the plot I get with a red line I've drawn on to show what I'm hoping to produce.
Thanks in advance for any advice or help!
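Computing the mean of column 109 within narrow bins of IE ratio and plotting it might get you a bit further, but more information would be needed to give more relevant advice.
import numpy as np
# Assign each row to an IE-ratio bin, then take the mean of "109" within each bin
bins = pd.cut(df['IE ratio'], bins=np.arange(4.6, 5.6, 0.01))
df['109_mean'] = df.groupby(bins, observed=True)['109'].transform('mean')
df.sort_values('IE ratio').plot('IE ratio', '109_mean')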
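The binning idea above can be sketched on its own as a small helper. This is a minimal sketch, not the answerer's code: the function name binned_mean and the demo values are hypothetical, and it assumes only that pandas and NumPy are available.

```python
import numpy as np
import pandas as pd

def binned_mean(x, y, edges):
    """Mean of y inside each bin of x; returns (bin centres, means), skipping empty bins."""
    edges = np.asarray(edges, dtype=float)
    frame = pd.DataFrame({"x": x, "y": y})
    # labels=False gives the integer bin index; values outside the edges become NaN
    frame["bin"] = pd.cut(frame["x"], bins=edges, labels=False)
    means = frame.dropna(subset=["bin"]).groupby("bin")["y"].mean()
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres[means.index.to_numpy().astype(int)], means.to_numpy()

# Hypothetical data standing in for the "IE ratio" and "109" columns
x_demo = [0.1, 0.2, 1.5, 1.7]
y_demo = [1.0, 3.0, 10.0, 20.0]
centres, means = binned_mean(x_demo, y_demo, edges=[0.0, 1.0, 2.0])
```

Plotting centres against means (for example with plt.plot) then gives a trend line through the scatter without assuming any particular functional form.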
I managed to get it working using old practice notebooks I have. I used the code below.
import numpy as np
import matplotlib.pyplot as plt
from numpy.polynomial import Polynomial
x = df["IE ratio"]
y = df["44"]
xfit = np.linspace(1.3, 4.0, 30)
q = Polynomial.fit(x, y, deg=5)
plt.scatter(x, y, s = 10)
plt.plot(xfit, q(xfit), c='k')
x1 = df["IE ratio"].mean()
plt.axvline(x1, 0, 1, c= 'k')
Thank you for your advice guys and gals :)
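For readers who want a straight line rather than a degree-5 curve: plt.axhline() only draws horizontal lines, so it cannot represent a sloped fit. A minimal sketch of a degree-1 fit with the same Polynomial class follows; the data here are hypothetical and exactly linear so the result is easy to check.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical, exactly linear data: y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Polynomial.fit works in a scaled domain; convert() maps the
# coefficients back to the original x units (lowest degree first)
line = Polynomial.fit(x, y, deg=1)
intercept, slope = line.convert().coef

# Points along the fitted line, ready for plt.plot(xfit, yfit)
xfit = np.linspace(x.min(), x.max(), 50)
yfit = line(xfit)
```

With real data, the scatter call stays as in the code above and plt.plot(xfit, yfit, c='k') draws the line on top of it.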
I'm using the code below to draw the ECC curve y^2 + x^3 + x^2 = 0:
import numpy as np
import matplotlib.pyplot as plt
import math
def main():
    fig = plt.figure()
    ax = fig.add_subplot(111)
    y, x = np.ogrid[-2:2:1000j, -2:2:1000j]
    ax.contour(x.ravel(), y.ravel(), pow(y, 2) + pow(x, 3) + pow(x, 2), [0], colors='red')
    ax.grid()
    plt.show()
if __name__ == '__main__':
    main()
The output is
The expected image, however, is this
As we can see, the isolated point at (0,0) is not drawn. Any suggestions to solve this issue?
As already mentioned in the comment, it seems that a single point is not displayed as a contour. The best solution would be if the application indicates such points in some way by itself. Perhaps the library allows this, but I have not found a way and therefore show two workarounds here:
Option 1:
The isolated point at (0,0) could be marked explicitly:
ax.plot(0, 0, color="red", marker = "o", markersize = 2.5, zorder = 10)
If there are several isolated points, a masked array is a good choice here.
Option 2:
The plot can be slightly varied around z = 0, e.g. z = 0.0002:
z = pow(y,2) + pow(x, 2) + pow(x, 3)
ax.contour(x.ravel(), y.ravel(), z, [0.0002], colors='red', zorder=10)
This will move the whole plot. Alternatively, the area around the isolated point alone could be shifted (by adding a second contour call with a small x,y grid around the isolated point at (0,0)). This does not change the rest.
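A likely reason the point is missed, as far as I understand contouring, is that a level is only traced where the gridded values cross it. Near the origin the function is non-negative and merely touches zero, so there is nothing to cross. The sketch below checks this numerically; the helper name curve and the grid sizes are my own choices.

```python
import numpy as np

def curve(x, y):
    """Left-hand side of y^2 + x^3 + x^2 = 0."""
    return y**2 + x**3 + x**2

# Small window around the isolated point at (0, 0)
y, x = np.ogrid[-0.5:0.5:101j, -0.5:0.5:101j]
z = curve(x, y)
# No negative values here, so the zero level is never crossed in this window
has_sign_change_near_origin = bool((z < 0).any())

# The full plotting window does contain negative values (for x < -1),
# which is why the main branch of the curve is drawn
y_full, x_full = np.ogrid[-2:2:1000j, -2:2:1000j]
has_sign_change_full = bool((curve(x_full, y_full) < 0).any())
```

This is also why shifting the level slightly above zero, as in Option 2, makes a tiny closed contour appear around the origin.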
I would like to create a version of this 2D binned "color map" with smoothed colors.
I am not even sure this would be the correct nomenclature for the plot, but, essentially, I want my figure to be color coded by the median values of a third variable for points that reside in each defined bin of my (X, Y) space.
Even though I am able to accomplish that to a certain degree (see example), I would like to find a way to create a version of the same plot with a smoothed color gradient. That would allow me to visualize the overall behavior of my distribution.
I tried ideas described here: Smoothing 2D map in python
and here: Python: binned_statistic_2d mean calculation ignoring NaNs in data
as well as links therein, but could not find a clear solution to the problem.
This is what I have so far:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.stats import binned_statistic_2d
np.random.seed(999)  # seed NumPy's generator, which is the one used to draw the data below
x = np.random.normal (0,10,5000)
y = np.random.normal (0,10,5000)
z = np.random.uniform(0,10,5000)
fig = plt.figure(figsize=(20, 20))
plt.rcParams.update({'font.size': 10})
ax = fig.add_subplot(3,3,1)
ax.set_axisbelow(True)
plt.grid(True, lw=0.5, zorder=-1)
x_bins = np.arange(-50., 50.5, 1.)
y_bins = np.arange(-50., 50.5, 1.)
cmap = plt.get_cmap('jet_r', 1000) # just a colormap
ret = binned_statistic_2d(x, y, z, statistic=np.median, bins=[x_bins, y_bins]) # Bin (X, Y) and create a map of the medians of "Colors"
plt.imshow(ret.statistic.T, origin='lower', extent=(-50, 50, -50, 50), cmap=cmap)
plt.xlim(-40,40)
plt.ylim(-40,40)
plt.xlabel("X", fontsize=15)
plt.ylabel("Y", fontsize=15)
ax.set_yticks([-40,-30,-20,-10,0,10,20,30,40])
bounds = np.arange(2.0, 20.0, 1.0)
plt.colorbar(ticks=bounds, label="Color", fraction=0.046, pad=0.04)
# save plots
plt.savefig("Whatever_name.png", bbox_inches='tight')
Which produces the following image (from random data):
Therefore, the simple question would be: how to smooth these colors?
Thanks in advance!
PS: sorry for excessive coding, but I believe a clear visualization is crucial for this particular problem.
Thanks to everyone who viewed this issue and tried to help!
I ended up being able to solve my own problem. In the end, it was all about image smoothing with a Gaussian kernel.
This link: Gaussian filtering a image with Nan in Python gave me the insight for the solution.
I basically implemented exactly the same code but, at the end, restored the NaN pixels of the original 2D array in the smoothed result. Unlike the solution from the link, my version does NOT leave NaN pixels filled with values derived from the surrounding pixels: the filter does fill them, but I then set them back to NaN.
Here is the final figure produced for the example I provided:
Final code, for reference, for those who might need it in the future:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.stats import binned_statistic_2d
import scipy.stats as st
import scipy.ndimage
import scipy as sp
np.random.seed(999)  # seed NumPy's generator, which is the one used to draw the data below
x = np.random.normal (0,10,5000)
y = np.random.normal (0,10,5000)
z = np.random.uniform(0,10,5000)
fig = plt.figure(figsize=(20, 20))
plt.rcParams.update({'font.size': 10})
ax = fig.add_subplot(3,3,1)
ax.set_axisbelow(True)
plt.grid(True, lw=0.5, zorder=-1)
x_bins = np.arange(-50., 50.5, 1.)
y_bins = np.arange(-50., 50.5, 1.)
cmap = plt.get_cmap('jet_r', 1000) # just a colormap
ret = binned_statistic_2d(x, y, z, statistic=np.median, bins=[x_bins, y_bins]) # Bin (X, Y) and create a map of the medians of "Colors"
sigma=1 # standard deviation for Gaussian kernel
truncate=5.0 # truncate filter at this many sigmas
U = ret.statistic.T.copy()
V=U.copy()
V[np.isnan(U)]=0
VV=sp.ndimage.gaussian_filter(V,sigma=sigma)
W=0*U.copy()+1
W[np.isnan(U)]=0
WW=sp.ndimage.gaussian_filter(W,sigma=sigma)
np.seterr(divide='ignore', invalid='ignore')
Z=VV/WW
Z[np.isnan(U)] = np.nan # restore the NaN pixels of the original map
plt.imshow(Z, origin='lower', extent=(-50, 50, -50, 50), cmap=cmap)
plt.xlim(-40,40)
plt.ylim(-40,40)
plt.xlabel("X", fontsize=15)
plt.ylabel("Y", fontsize=15)
ax.set_yticks([-40,-30,-20,-10,0,10,20,30,40])
bounds = np.arange(2.0, 20.0, 1.0)
plt.colorbar(ticks=bounds, label="Color", fraction=0.046, pad=0.04)
# save plots
plt.savefig("Whatever_name.png", bbox_inches='tight')
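The smoothing step in the code above can be isolated into a reusable function. This is a minimal sketch of the same normalised-convolution idea; the name nan_gaussian_filter and the demo array are hypothetical, and unlike the script above it passes truncate through to the filter (4.0 is SciPy's default).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nan_gaussian_filter(U, sigma=1.0, truncate=4.0):
    """Gaussian-smooth U while ignoring NaNs; NaN pixels stay NaN in the output."""
    U = np.asarray(U, dtype=float)
    nan_mask = np.isnan(U)
    V = np.where(nan_mask, 0.0, U)      # data with NaNs replaced by 0
    W = (~nan_mask).astype(float)       # weights: 1 where data exist, 0 elsewhere
    VV = gaussian_filter(V, sigma=sigma, truncate=truncate)
    WW = gaussian_filter(W, sigma=sigma, truncate=truncate)
    with np.errstate(divide="ignore", invalid="ignore"):
        Z = VV / WW                     # renormalise by the smoothed weights
    Z[nan_mask] = np.nan                # put the original gaps back
    return Z

# Hypothetical check: a constant map with one gap stays constant around the gap
demo = np.full((5, 5), 3.0)
demo[2, 2] = np.nan
smoothed = nan_gaussian_filter(demo, sigma=1.0)
```

In the script above this would be called as Z = nan_gaussian_filter(ret.statistic.T, sigma=1) before plt.imshow.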
I was very happy to learn what plotly can do in Python. I use PyCharm and found that I could display numbers, figures, statistics, etc. by setting the browser renderer. Yet I am a bit confused. First, when I run code containing 'import plotly.io as pio' followed by 'pio.renderers.default = "browser"', the data and graphics take ages to load, although the browser itself opens in a split second. Second, is there an alternative to the "browser" choice, e.g. one that lets fig.show() display directly in the PyCharm console? For Jupyter I think the alternative is "notebook", but that is not PyCharm. If there are alternatives to pio that render the output of code execution in the console, I'd be all ears and eyes. Thanks in advance for any advice.
import plotly.figure_factory as ff
import plotly.graph_objects as go
import numpy as np
import plotly.io as pio
pio.renderers.default = "browser"
## Create first figure
x1,y1 = np.meshgrid(np.arange(0, 2, .2), np.arange(0, 2, .2))
u1 = np.cos(x1)*y1
v1 = np.sin(x1)*y1
fig1 = ff.create_quiver(x1, y1, u1, v1, name='Quiver')
fig1.show()
## Create second figure
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
Y, X = np.meshgrid(x, y)
u = -1 - X**2 + Y
v = 1 + X - Y**2
fig2 = ff.create_streamline(x, y, u, v, arrow_scale=.1, name='Streamline')
fig2.show()
Sometimes the results are indeed rendered in a browser, but mostly the browser stops loading and leaves me staring at an utterly blank white page. That is why I'd fancied rendering as in matplotlib, where the graphics are shown directly in the console.
The official answer to your question is that you can use other renderers like “png” or “svg” as provided by Orca. If your figure is too complex, however, you may have trouble with any renderer, depending on your hardware setup.
More info here: https://plot.ly/python/renderers/
This is a bit detailed, but any help is appreciated. It's a slightly annoying feature of seaborn that regplot can't handle datetime axes. This is especially so when you want to use the lowess parameter, which estimates local means to give a curve (as opposed to a line) that tracks changes in a plot. (This functionality is available in R, for instance.)
This is a problem for me because I have a linear plot that I'd like to smooth out but without using a rolling mean. This is my plot:
To solve this problem, I took the index of my dataframe as the x-axis, and used statsmodels to calculate the lowess line as follows:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import statsmodels.api as sm
rated = pd.read_csv('filepath.csv')
rated['linear'] = rated.index
x = rated['linear']
y = rated['trust_num']
lowess = sm.nonparametric.lowess(y, x, frac=.2)
low_y = [i[1] for i in lowess]
low_x = [i[0] for i in lowess]
rated['low_y'] = low_y
rated['low_x'] = low_x
h = sns.lineplot(x = 'linear', y = 'trust_num', data = rated)
h = sns.lineplot(x = 'linear', y = 'low_y', data = rated, color = 'r')
This produces exactly what I'd like:
The final step comes with assigning dates to the x-axis, which I do as follows:
labels = [i for i in rated['date']]
h.set_xticklabels(labels)
The result is a clustered x-axis, as below:
Fair enough, this is a common plotting problem. So I try to rotate my xtick labels:
plt.xticks(rotation=45)
But it makes no difference. Can anyone advise how I might declutter the axis? Seems a pain to nearly get there and fall at a seemingly simple problem!
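One possible way to declutter the axis, offered as a hedged sketch rather than a confirmed fix: supplying one label per row is a likely cause of the crowding, so the idea is to place a tick only at every n-th row and label just those ticks. The frame below is a hypothetical stand-in for rated, and the step of 30 rows is an arbitrary choice.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the `rated` frame: one row per day
rated = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=120, freq="D").strftime("%Y-%m-%d"),
    "trust_num": np.linspace(0.0, 1.0, 120),
})
rated["linear"] = rated.index

fig, ax = plt.subplots()
ax.plot(rated["linear"], rated["trust_num"])

# Keep one tick every `step` rows and label it with the matching date
step = 30
positions = rated["linear"].to_numpy()[::step]
labels = rated["date"].to_numpy()[::step]
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=45, ha="right")
fig.tight_layout()
```

The same two calls should work on the Axes returned by sns.lineplot (h in the code above), since seaborn returns a regular matplotlib Axes.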
I need help with setting the limits of y-axis on matplotlib. Here is the code that I tried, unsuccessfully.
import matplotlib.pyplot as plt
plt.figure(1, figsize = (8.5,11))
plt.suptitle('plot title')
ax = []
aPlot = plt.subplot(321, facecolor = 'w', title = "Year 1")  # 'axisbg' was replaced by 'facecolor' in newer Matplotlib
ax.append(aPlot)
plt.plot(paramValues,plotDataPrice[0], color = '#340B8C',
         marker = 'o', ms = 5, mfc = '#EB1717')
plt.xticks(paramValues)
plt.ylabel('Average Price')
plt.xlabel('Mark-up')
plt.grid(True)
plt.ylim((25,250))
With the data I have for this plot, I get y-axis limits of 20 and 200. However, I want the limits 20 and 250.
Get the current axes via plt.gca(), and then set the limits on them:
ax = plt.gca()
ax.set_xlim([xmin, xmax])
ax.set_ylim([ymin, ymax])
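A minimal, self-contained sketch of the same approach, with hypothetical data values, reads the limits back to confirm they took effect:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [30, 120, 190])  # hypothetical data

# Set both y limits explicitly, then read them back
ax.set_ylim(20, 250)
ymin, ymax = ax.get_ylim()
```

Reading the limits back with get_ylim() is a quick way to check whether a later call has overridden them.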
One thing you can do is set the axis range yourself using matplotlib.pyplot.axis:
from matplotlib import pyplot as plt
plt.axis([0, 10, 0, 20])
0,10 is for x axis range.
0,20 is for y axis range.
Or you can use matplotlib.pyplot.xlim or matplotlib.pyplot.ylim:
plt.ylim(-2, 2)
plt.xlim(0,10)
Another workaround is to get the plot's axes and reassign changing only the y-values:
x1,x2,y1,y2 = plt.axis()
plt.axis((x1,x2,25,250))
You can create an Axes object with matplotlib.pyplot.axes and call set_ylim() on it. It would be something like this:
import matplotlib.pyplot as plt
axes = plt.axes()
axes.set_ylim([0, 1])
Just for fine-tuning: if you want to set only one boundary of an axis and leave the other unchanged, you can use one or more of the following statements:
plt.xlim(right=xmax) #xmax is your value
plt.xlim(left=xmin) #xmin is your value
plt.ylim(top=ymax) #ymax is your value
plt.ylim(bottom=ymin) #ymin is your value
Take a look at the documentation for xlim and for ylim
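The one-sided form can be checked with a short sketch (hypothetical data; the Axes method set_ylim accepts the same top and bottom keywords as plt.ylim):

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [30, 120, 190])  # hypothetical data

bottom_before, _ = ax.get_ylim()    # autoscaled lower limit
ax.set_ylim(top=250)                # change only the upper limit
bottom_after, top_after = ax.get_ylim()
```

Here bottom_after keeps the autoscaled value while top_after becomes 250.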
This worked at least in matplotlib version 2.2.2:
plt.axis([None, None, 0, 100])
Probably this is a nice way to set up for example xmin and ymax only, etc.
To add to @Hima's answer, if you want to modify a current x or y limit you could use the following.
import numpy as np # you probably already do this so no extra overhead
import matplotlib.pyplot as plt
fig, axes = plt.subplots()
axes.plot(data[:,0], data[:,1])
xlim = axes.get_xlim()
# example of how to zoom out by 10% (factor = 0.1); a negative factor zooms in
factor = 0.1
new_xlim = (xlim[0] + xlim[1])/2 + np.array((-0.5, 0.5)) * (xlim[1] - xlim[0]) * (1 + factor)
axes.set_xlim(new_xlim)
I find this particularly useful when I want to zoom out or zoom in just a little from the default plot settings.
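The same arithmetic can be wrapped in a small helper so it works for either axis. This is a sketch; the name scaled_limits is hypothetical.

```python
def scaled_limits(limits, factor):
    """Widen (factor > 0) or narrow (factor < 0) a (low, high) pair about its centre."""
    low, high = limits
    centre = (low + high) / 2
    half_width = (high - low) / 2 * (1 + factor)
    return centre - half_width, centre + half_width

# Zooming out by 10% on limits (0, 10) widens the range by 0.5 on each side
new_low, new_high = scaled_limits((0.0, 10.0), 0.1)
```

It would be used as axes.set_xlim(scaled_limits(axes.get_xlim(), 0.1)), and likewise with get_ylim and set_ylim.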
This should work. Your code works for me, like for Tamás and Manoj Govindan. It looks like you could try to update Matplotlib. If you can't update Matplotlib (for instance if you have insufficient administrative rights), maybe using a different backend with matplotlib.use() could help.