How do I correctly write the syntax for performing and plotting a for loop operation? - python-3.x

I am trying to write a for loop that uses a defined function (B_lambda) and takes values of wavelength and temperature to produce values of intensity. That is, I want the loop to evaluate B_lambda at every value in my wavelength range for each temperature in the temperature list, and then plot the results. I am not very good with the syntax; I have tried many approaches, but none produces what I need and most give errors. I do not know how to plot from a for loop, and the online sources I have checked do not show how to use a defined function inside a for loop. My latest code, which seems to have the fewest errors, is below, followed by the error message:
import matplotlib.pylab as plt
import numpy as np
from astropy import units as u
import scipy.constants
%matplotlib inline
#Importing constants to use.
h = scipy.constants.h
c = scipy.constants.c
k = scipy.constants.k
wavelengths= np.arange(1000,30000)*1.e-10
temperature=[3000,4000,5000,6000]
for lam in wavelengths:
    for T in temperature:
        B_lambda = ((2*h*c**2)/(lam**5))*((1)/(np.exp((h*c)/(lam*k*T))-1))
plt.figure()
plt.plot(wavelengths,B_lambda)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-73b866241c49> in <module>
17 B_lambda = ((2*h*c**2)/(lam**5))*((1)/(np.exp((h*c)/(lam*k*T))-1))
18 plt.figure()
---> 19 plt.plot(wavelengths,B_lambda)
20
21
/usr/local/lib/python3.6/dist-packages/matplotlib/pyplot.py in plot(scalex, scaley, data, *args, **kwargs)
2787 return gca().plot(
2788 *args, scalex=scalex, scaley=scaley, **({"data": data} if data
-> 2789 is not None else {}), **kwargs)
2790
2791
/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_axes.py in plot(self, scalex, scaley, data, *args, **kwargs)
1663 """
1664 kwargs = cbook.normalize_kwargs(kwargs, mlines.Line2D._alias_map)
-> 1665 lines = [*self._get_lines(*args, data=data, **kwargs)]
1666 for line in lines:
1667 self.add_line(line)
/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_base.py in __call__(self, *args, **kwargs)
223 this += args[0],
224 args = args[1:]
--> 225 yield from self._plot_args(this, kwargs)
226
227 def get_next_color(self):
/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_base.py in _plot_args(self, tup, kwargs)
389 x, y = index_of(tup[-1])
390
--> 391 x, y = self._xy_from_xy(x, y)
392
393 if self.command == 'plot':
/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_base.py in _xy_from_xy(self, x, y)
268 if x.shape[0] != y.shape[0]:
269 raise ValueError("x and y must have same first dimension, but "
--> 270 "have shapes {} and {}".format(x.shape, y.shape))
271 if x.ndim > 2 or y.ndim > 2:
272 raise ValueError("x and y can be no greater than 2-D, but have "
ValueError: x and y must have same first dimension, but have shapes (29000,) and (1,)

First thing to note (and this is minor) is that astropy is not required to run your code. So, you can simplify the import statements.
import matplotlib.pylab as plt
import numpy as np
import scipy.constants
%matplotlib inline
#Importing constants to use.
h = scipy.constants.h
c = scipy.constants.c
k = scipy.constants.k
wavelengths= np.arange(1000,30000,100)*1.e-10 # here, I chose steps of 100, because plotting 29000 datapoints takes a while
temperature=[3000,4000,5000,6000]
Secondly, to tidy up the loop a bit, you can write a helper function that you call from within your loop:
def f(lam, T):
    return ((2*h*c**2)/(lam**5))*((1)/(np.exp((h*c)/(lam*k*T))-1))
Now you can collect the output of your function, together with the input parameters, e.g. in a list of tuples:
outputs = []
for lam in wavelengths:
    for T in temperature:
        outputs.append((lam, T, f(lam, T)))
Since you vary both wavelength and temperature, a 3d plot makes sense:
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(111, projection='3d')
ax.plot(*zip(*outputs))
An alternative would be to display the data as an image, using colour to indicate the function output.
I am also including an alternative method to generate the data in this one. Since the function f can take arrays as input, you can feed one temperature at a time, and with it, all the wavelengths simultaneously.
# initialise output as array with proper shape
outputs = np.zeros((len(wavelengths), len(temperature)))
for i, T in enumerate(temperature):
    outputs[:,i] = f(wavelengths, T)
The output now is a large matrix, which you can visualise as an image:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(outputs, aspect=10e8, interpolation='none',
          extent=[
              np.min(temperature),
              np.max(temperature),
              np.max(wavelengths),
              np.min(wavelengths)]
          )
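If the goal is simply intensity against wavelength with one curve per temperature, a plain 2D line plot may be enough. The following is a minimal sketch under the same constants and ranges as above; the function name planck, the axis labels, and the Agg backend line are assumptions added for illustration, not part of the original post:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import scipy.constants

h = scipy.constants.h
c = scipy.constants.c
k = scipy.constants.k


def planck(lam, T):
    """B_lambda for wavelength(s) lam in metres and temperature T in kelvin."""
    return (2 * h * c**2 / lam**5) / (np.exp(h * c / (lam * k * T)) - 1)


wavelengths = np.arange(1000, 30000, 100) * 1.e-10  # angstrom converted to metres
temperatures = [3000, 4000, 5000, 6000]

fig, ax = plt.subplots()
curves = {}
for T in temperatures:
    # planck() accepts the whole wavelength array at once,
    # so y always has the same length as x
    curves[T] = planck(wavelengths, T)
    ax.plot(wavelengths, curves[T], label=f"T = {T} K")

ax.set_xlabel("wavelength (m)")
ax.set_ylabel("B_lambda")
ax.legend()
```

Because each call to ax.plot receives two arrays of equal length, the "x and y must have same first dimension" error from the question should not occur here.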

Related

Draw vertical line in seaborn pairplot [duplicate]

I have the following plot code in seaborn
import seaborn as sns
import matplotlib.pyplot as plt
iris= sns.load_dataset("iris")
g= sns.pairplot(iris,
x_vars=["sepal_width", "sepal_length"],
y_vars=["petal_width"])
This produces a row of two scatter plots.
Now I am trying to add a vertical line at x=3 on both plots.
I have tried using plt.axvline(x=3, ls='--', linewidth=3, color='red'); however, that draws the line only on the last plot.
How can I have the vertical line drawn on both plots?
I have tried g.map_offdiag(plt.axvline(x=1, ls='--', linewidth=3, color='red')) and the g.map() variant, however I get the following error.
TypeError Traceback (most recent call last)
<ipython-input-12-612fcf2a7fef> in <module>
9
10 # plt.axvline(x=3, ls='--', linewidth=3, color='red')
---> 11 g.map_offdiag(plt.axvline(x=1, ls='--', linewidth=3, color='red'))
12
13
~/PycharmProjects/venv/lib/python3.8/site-packages/seaborn/axisgrid.py in map_offdiag(self, func, **kwargs)
1318 if x_var != y_var:
1319 indices.append((i, j))
-> 1320 self._map_bivariate(func, indices, **kwargs)
1321 return self
1322
~/PycharmProjects/venv/lib/python3.8/site-packages/seaborn/axisgrid.py in _map_bivariate(self, func, indices, **kwargs)
1463 if ax is None: # i.e. we are in corner mode
1464 continue
-> 1465 self._plot_bivariate(x_var, y_var, ax, func, **kws)
1466 self._add_axis_labels()
1467
~/PycharmProjects/venv/lib/python3.8/site-packages/seaborn/axisgrid.py in _plot_bivariate(self, x_var, y_var, ax, func, **kwargs)
1471 def _plot_bivariate(self, x_var, y_var, ax, func, **kwargs):
1472 """Draw a bivariate plot on the specified axes."""
-> 1473 if "hue" not in signature(func).parameters:
1474 self._plot_bivariate_iter_hue(x_var, y_var, ax, func, **kwargs)
1475 return
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/inspect.py in signature(obj, follow_wrapped)
3091 def signature(obj, *, follow_wrapped=True):
3092 """Get a signature object for the passed callable."""
-> 3093 return Signature.from_callable(obj, follow_wrapped=follow_wrapped)
3094
3095
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/inspect.py in from_callable(cls, obj, follow_wrapped)
2840 def from_callable(cls, obj, *, follow_wrapped=True):
2841 """Constructs Signature for the given callable object."""
-> 2842 return _signature_from_callable(obj, sigcls=cls,
2843 follow_wrapper_chains=follow_wrapped)
2844
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/inspect.py in _signature_from_callable(obj, follow_wrapper_chains, skip_bound_arg, sigcls)
2214
2215 if not callable(obj):
-> 2216 raise TypeError('{!r} is not a callable object'.format(obj))
2217
2218 if isinstance(obj, types.MethodType):
TypeError: <matplotlib.lines.Line2D object at 0x1287cc580> is not a callable object
Any suggestions?
The easiest way is to loop through the axes and call axvline() on each of them. Note that .ravel() converts the 2D array of axes to a 1D array.
import seaborn as sns
import matplotlib.pyplot as plt
iris = sns.load_dataset("iris")
g = sns.pairplot(iris,
x_vars=["sepal_width", "sepal_length"],
y_vars=["petal_width"])
for ax in g.axes.ravel():
    ax.axvline(x=3, ls='--', linewidth=3, c='red')
plt.show()
To use g.map() or its variants, the first parameter needs to be a function (without calling), followed by the parameters. So, in theory it would be g.map(plt.axvline, x=1, ls='--', linewidth=3, color='red'). But for a pairplot, the mapped function gets called with an x and y parameter from the pairplot, which conflict with the x as parameter for axvline.
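The TypeError in the question comes from passing the result of a call rather than the function itself. A small sketch of that difference, using matplotlib only (the Agg backend line and the variable names are assumptions for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# The function object itself is callable; this is the kind of argument g.map() expects.
func_is_callable = callable(plt.axvline)

# Calling it draws a line immediately and returns a Line2D artist.
# That artist is not callable, which is what the TypeError complains about.
line = plt.axvline(x=1, ls="--", linewidth=3, color="red")
result_is_callable = callable(line)

print(func_is_callable, result_is_callable)  # True False
plt.close("all")
```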

How do I solve the ValueError that appears when using the optimize.curve_fit function in my code?

What am I doing wrong here? I have tried to see if the shape of ydata and t are the same and they are in fact the same. The only thing that works is when I slice the output of the integrate.odeint function using [:,1] to give me the curve fit of didt. But the thing is, I require all three curves because I plan on graphing all three results. Your help is greatly appreciated.
import pandas as pd
import numpy as np
from datetime import datetime
from scipy import optimize
from scipy import integrate
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
N0=10203134 #susceptible population
ydata=np.array(df_plot.Jordan[253:]) #beginning on 1/10/2020
t=np.arange(len(ydata))
I0=ydata[0] #initial conditions
R0=4821
S0=N0-I0-R0
def SIR_Model(SIR,t,beta,alpha):
    S,I,R=SIR
    dsdt=-beta*S*I/N0
    didt=beta*S*I/N0 -alpha*I
    drdt=alpha*I
    return dsdt,didt,drdt
def fit_ode(x,beta,alpha):
    return integrate.odeint(SIR_Model,(S0,I0,R0),t,args=(beta,alpha))
print(ydata.shape)
print(t.shape)
popt,pcov=optimize.curve_fit(fit_ode,t,ydata)
perr=np.sqrt(np.diag(pcov))
print("standard deviation errors:",str(perr))
print("Optimal Parameters: Beta=",popt[0], "alpha:",popt[1])
ValueError Traceback (most recent call last)
<ipython-input-194-aa1bb286dbba> in <module>
----> 1 popt,pcov=optimize.curve_fit(fit_ode,t,ydata)
2 perr=np.sqrt(np.diag(pcov))
3 print("standard deviation errors:",str(perr))
4 print("Optimal Parameters: Beta=",popt[0], "alpha:",popt[1])
5
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, jac, **kwargs)
782 # Remove full_output from kwargs, otherwise we're passing it in twice.
783 return_full = kwargs.pop('full_output', False)
--> 784 res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
785 popt, pcov, infodict, errmsg, ier = res
786 ysize = len(infodict['fvec'])
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in leastsq(func, x0, args, Dfun, full_output, col_deriv, ftol, xtol, gtol, maxfev, epsfcn, factor, diag)
408 if not isinstance(args, tuple):
409 args = (args,)
--> 410 shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
411 m = shape[0]
412
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
22 def _check_func(checker, argname, thefunc, x0, args, numinputs,
23 output_shape=None):
---> 24 res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
25 if (output_shape is not None) and (shape(res) != output_shape):
26 if (output_shape[0] != 1):
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in func_wrapped(params)
482 if transform is None:
483 def func_wrapped(params):
--> 484 return func(xdata, *params) - ydata
485 elif transform.ndim == 1:
486 def func_wrapped(params):
ValueError: operands could not be broadcast together with shapes (138,3) (138,)
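The shapes in the error, (138,3) and (138,), indicate that the model function returns all three compartments while ydata holds a single series. One possible approach, sketched below, is to fit only the infected column and then integrate once more with the fitted parameters to obtain all three curves for plotting. This is a sketch under stated assumptions: the data is synthetic (the original df_plot is not available here), and the names solve_sir and fit_infected, the population size, and the starting guess p0 are all hypothetical.

```python
import numpy as np
from scipy import integrate, optimize

# Synthetic stand-in for the real case counts (assumption, not the original data)
N0 = 1000.0
I0, R0 = 1.0, 0.0
S0 = N0 - I0 - R0
t = np.arange(60)


def SIR_Model(SIR, t, beta, alpha):
    S, I, R = SIR
    dsdt = -beta * S * I / N0
    didt = beta * S * I / N0 - alpha * I
    drdt = alpha * I
    return dsdt, didt, drdt


def solve_sir(x, beta, alpha):
    """Integrate the model; returns an array of shape (len(x), 3)."""
    return integrate.odeint(SIR_Model, (S0, I0, R0), x, args=(beta, alpha))


def fit_infected(x, beta, alpha):
    """Model function for curve_fit: only the infected column, shape (len(x),)."""
    return solve_sir(x, beta, alpha)[:, 1]


# Generate noise-free "observations" from known parameters
true_beta, true_alpha = 0.4, 0.1
ydata = fit_infected(t, true_beta, true_alpha)

# p0 is an assumed starting guess; real data may need a different one
popt, pcov = optimize.curve_fit(fit_infected, t, ydata, p0=(0.35, 0.15))

# Integrate again with the fitted parameters to recover all three curves
S_fit, I_fit, R_fit = solve_sir(t, *popt).T
```

S_fit, I_fit and R_fit each have the same shape as t, so they can be plotted against t individually.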

Jupyter Notebook: Python 3

I have edited my question to include the whole code, so please take a look, #Nathon_Marotte.
I am trying to run this code and it gives me an error:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
observations = 1000
xs = np.random.uniform(low=-10, high=10, size=(observations,1))
zs = np.random.uniform(-10,10,(observations,1))
inputs = np.column_stack((xs,zs))
print(inputs.shape)
noise= np.random.uniform(-1, 1, (observations, 1))
targets = 2*xs - 3*zs + 5 + noise
print(targets.shape)
#observations=1000
targets = targets.reshape(observations,)
fig=plt.figure()
ax = fig.add_subplot(111,projection='3d')
ax.plot(xs, zs, targets)
ax.set_xlabel('xs')
ax.set_ylabel('zs')
ax.set_zlabel('Targets')
ax.view_init(azim=250)
plt.show()
targets=targets.reshape(observations,)
Error:
ValueError Traceback (most recent call last)
<ipython-input-44-28d2a78b4ad5> in <module>
3 ax = fig.add_subplot(111,projection='3d')
4
----> 5 ax.plot(xs, zs, targets)
6
7 ax.set_xlabel('xs')
F:\Softwares\Anaconda\Installed\lib\site-packages\mpl_toolkits\mplot3d\axes3d.py in plot(self, xs, ys, zdir, *args, **kwargs)
1467
1468 # Match length
-> 1469 zs = np.broadcast_to(zs, np.shape(xs))
1470
1471 lines = super().plot(xs, ys, *args, **kwargs)
<__array_function__ internals> in broadcast_to(*args, **kwargs)
F:\Softwares\Anaconda\Installed\lib\site-packages\numpy\lib\stride_tricks.py in broadcast_to(array, shape, subok)
178 [1, 2, 3]])
179 """
--> 180 return _broadcast_to(array, shape, subok=subok, readonly=True)
181
182
F:\Softwares\Anaconda\Installed\lib\site-packages\numpy\lib\stride_tricks.py in _broadcast_to(array, shape, subok, readonly)
121 'negative')
122 extras = []
--> 123 it = np.nditer(
124 (array,), flags=['multi_index', 'refs_ok', 'zerosize_ok'] + extras,
125 op_flags=['readonly'], itershape=shape, order='C')
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (1000,) and requested shape (1000,1)
I am a newbie and do not have sufficient knowledge to fix this, so it would be great if you could help me fix this bug.
Thanking you in advance.
I am done with my question and it worked when I used Google Colab resources.
Thank you all and especially #NathanMarotte
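The traceback says that an array of shape (1000,) cannot be broadcast to (1000,1): targets was reshaped to one dimension while xs and zs were still column vectors. One way that should avoid the mismatch is to flatten all three arrays before plotting. A minimal sketch; the *_flat names and the Agg backend line are assumptions for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib

observations = 1000
xs = np.random.uniform(-10, 10, (observations, 1))
zs = np.random.uniform(-10, 10, (observations, 1))
noise = np.random.uniform(-1, 1, (observations, 1))
targets = 2 * xs - 3 * zs + 5 + noise

# Flatten everything to shape (observations,) so the three inputs match
xs_flat, zs_flat, targets_flat = xs.ravel(), zs.ravel(), targets.ravel()

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot(xs_flat, zs_flat, targets_flat)
ax.set_xlabel("xs")
ax.set_ylabel("zs")
ax.set_zlabel("Targets")
ax.view_init(azim=250)
```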

Single-arrow quiver plots don't seem to work with the cartopy PlateCarree transform

I'm trying to draw a single arrow at a time (i.e. just showing a wind vector at one location) using plt.quiver on a map. However, when doing so with the ccrs.PlateCarree() transform, I get an error where some sub-function tries to get the number of dimensions of a 0-dim array / int.
Here are some minimal working examples:
import numpy as np
from matplotlib import pyplot as plt
import cartopy
import cartopy.crs as ccrs
%matplotlib inline
#### Works: simple quiver plot
plt.quiver(0,0,1,-1)
#### Works: simple quiver plot on a map axis, without specifying transform (this obviously would get the location wrong, but just to show what works/doesn't)
plt.subplot(projection=ccrs.EckertIV())
plt.quiver(0,0,1,-1)
#### Works: simple quiver plot on a map axis with some other transform
plt.subplot(projection=ccrs.EckertIV())
plt.quiver(0,0,1,-1,transform=ccrs.EckertIV())
#### Doesn't work: simple quiver plot on a map axis with the PlateCarree transform
plt.subplot(projection=ccrs.EckertIV())
plt.quiver(0,0,1,-1,transform=ccrs.PlateCarree())
This last call produces the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-93-dec36b09994d> in <module>
1 plt.subplot(projection=ccrs.EckertIV())
----> 2 plt.quiver(0,0,1,-1,transform=ccrs.PlateCarree())
~/.conda/envs/climate1/lib/python3.7/site-packages/matplotlib/pyplot.py in quiver(data, *args, **kw)
2791 def quiver(*args, data=None, **kw):
2792 __ret = gca().quiver(
-> 2793 *args, **({"data": data} if data is not None else {}), **kw)
2794 sci(__ret)
2795 return __ret
~/.conda/envs/climate1/lib/python3.7/site-packages/cartopy/mpl/geoaxes.py in wrapper(self, *args, **kwargs)
308
309 kwargs['transform'] = transform
--> 310 return func(self, *args, **kwargs)
311 return wrapper
312
~/.conda/envs/climate1/lib/python3.7/site-packages/cartopy/mpl/geoaxes.py in quiver(self, x, y, u, v, *args, **kwargs)
1837 # Transform the vectors if the projection is not the same as the
1838 # data transform.
-> 1839 if (x.ndim == 1 and y.ndim == 1) and (x.shape != u.shape):
1840 x, y = np.meshgrid(x, y)
1841 u, v = self.projection.transform_vectors(t, x, y, u, v)
AttributeError: 'int' object has no attribute 'ndim'
A workaround right now is to just plot the same arrow twice on top of itself, e.g.
plt.subplot(projection=ccrs.EckertIV())
plt.quiver(np.array([0,0]),np.array([0,0]),np.array([1,1]),np.array([-1,-1]),transform=ccrs.PlateCarree())
but I was wondering if I was missing something in terms of just getting this to work normally.
I'm using cartopy 0.18.0 and matplotlib 3.2.1.
Thanks in advance for any advice!
Sometimes one must adhere to the call signature strictly.
Call signature:
quiver([X, Y], U, V, [C], **kw)
X, Y: 1D or 2D array-like
If each of X, Y, U and V is passed as an array instead of a bare number, the call works:
plt.quiver(np.array([0]), np.array([0]),
           np.array([1]), np.array([-1]), transform=ccrs.PlateCarree())
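To avoid writing np.array([...]) around every argument, the scalars could be wrapped with np.atleast_1d. The helper name as_1d_arrays below is hypothetical, and the cartopy call is left as a comment because it needs cartopy installed:

```python
import numpy as np


def as_1d_arrays(*values):
    """Hypothetical helper: wrap each scalar (or sequence) in a 1-D NumPy array."""
    return [np.atleast_1d(v) for v in values]


x, y, u, v = as_1d_arrays(0, 0, 1, -1)
print(x.ndim, x.shape)  # 1 (1,)

# With cartopy installed, the arrays could then be passed on, e.g.:
# plt.subplot(projection=ccrs.EckertIV())
# plt.quiver(x, y, u, v, transform=ccrs.PlateCarree())
```

Each value now has the .ndim and .shape attributes that the traceback shows cartopy's quiver wrapper reading.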

How to do clustering with the k-means algorithm for an imported data set with proper scaling of both axes

I am new to data science, Python, and Jupyter Notebook, and I am currently studying how to do k-means clustering on a data set. I came across ways of defining the data inline:
Data = {'x': [25,34,22,27,33,33,31,22,35,34,67,54,57,43,50,57,59,52,65,47,49,48,35,33,44,45,38,43,51,46],
'y': [79,51,53,78,59,74,73,57,69,75,51,32,40,47,53,36,35,58,59,50,25,20,14,12,20,5,29,27,8,7]
}
df = DataFrame(Data,columns=['x','y'])
and use of blobs
data = make_blobs(n_samples=200, n_features=2, centers=4, cluster_std=1.6, random_state=50)
but I would like to know how to write proper code that imports a CSV file from my computer and runs k-means with scaling. I could not find relevant blogs to help me. Thank you in advance.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from sklearn.cluster import KMeans
data=pd.read_csv("C:/Users/Dulangi/Downloads/winequality-red.csv")
data
data["alcohol"]=data["alcohol"]/data["alcohol"].max()
data["quality"]=data["quality"]/data["quality"].max()
plt.scatter(data["alcohol"],data['quality'])
plt.xlabel("alcohol")
plt.ylabel('quality')
plt.show()
x=data.copy()
kmeans=KMeans(2)
kmeans.fit(x)
clusters=x.copy()
clusters['cluster_pred']=kmeans.fit_predict(x)
plt.scatter(clusters["alcohol"],clusters['quality'],c=clusters['cluster_pred'],cmap='rainbow')
plt.xlabel("alcohol")
plt.ylabel('quality')
plt.show()
from sklearn import preprocessing
x_scaled=preprocessing.scale(x)
#x_scaled
wcss=[]
for i in range(1,30):
kmeans=KMeans(i)
kmeans.fit(x_scaled)
wcss.append(kmeans.inertia_)
wcss
plt.plot(range(1,30),wcss)
plt.xlabel('Number of clusters')
plt.ylabel('WCSS')
plt.show()
This is what I tried.
The error I got:
ValueError Traceback (most recent call last)
<ipython-input-12-d4955ce8615e> in <module>
39
40
---> 41 plt.plot(range(1,30),wcss)
42 plt.xlabel('Number of clusters')
43 plt.ylabel('WCSS')
~\Anaconda3\lib\site-packages\matplotlib\pyplot.py in plot(scalex, scaley, data, *args, **kwargs)
2787 return gca().plot(
2788 *args, scalex=scalex, scaley=scaley, **({"data": data} if data
-> 2789 is not None else {}), **kwargs)
2790
2791
~\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py in plot(self, scalex, scaley, data, *args, **kwargs)
1664 """
1665 kwargs = cbook.normalize_kwargs(kwargs, mlines.Line2D._alias_map)
-> 1666 lines = [*self._get_lines(*args, data=data, **kwargs)]
1667 for line in lines:
1668 self.add_line(line)
~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in __call__(self, *args, **kwargs)
223 this += args[0],
224 args = args[1:]
--> 225 yield from self._plot_args(this, kwargs)
226
227 def get_next_color(self):
~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in _plot_args(self, tup, kwargs)
389 x, y = index_of(tup[-1])
390
--> 391 x, y = self._xy_from_xy(x, y)
392
393 if self.command == 'plot':
~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in _xy_from_xy(self, x, y)
268 if x.shape[0] != y.shape[0]:
269 raise ValueError("x and y must have same first dimension, but "
--> 270 "have shapes {} and {}".format(x.shape, y.shape))
271 if x.ndim > 2 or y.ndim > 2:
272 raise ValueError("x and y can be no greater than 2-D, but have "
ValueError: x and y must have same first dimension, but have shapes (29,) and (1,)
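The shapes in the error, (29,) and (1,), suggest that wcss ended up with a single entry, which would happen if wcss.append(...) ran only once, for example because it sat outside the loop body. A sketch of the elbow loop with every step inside the loop follows. It uses synthetic blobs as a stand-in for the CSV (an assumption), and the names ks, n_init=10 and random_state=0 are illustrative choices:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the CSV columns (assumption: two numeric features)
X, _ = make_blobs(n_samples=200, n_features=2, centers=4,
                  cluster_std=1.6, random_state=50)
x_scaled = StandardScaler().fit_transform(X)

ks = range(1, 11)
wcss = []
for k in ks:
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
    kmeans.fit(x_scaled)
    wcss.append(kmeans.inertia_)  # must stay inside the loop: one value per k

print(len(ks), len(wcss))  # 10 10
```

With one WCSS value per k, plt.plot(ks, wcss) receives x and y of equal length.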
You can do this easily with scikit-learn.
import pandas as pd
data=pd.read_csv('myfile.csv')
df=pd.DataFrame(data,index=None)
df.head()
Check if rows contain any null values
df.isnull().sum()
Drop all the rows with null values if any
df.dropna(inplace=True)
Normalize data
Normalize the data with MinMax scaling provided by sklearn
from sklearn import preprocessing
# assumes df has a non-numeric 'title' column and that it is the last column
minmax_processed = preprocessing.MinMaxScaler().fit_transform(df.drop('title',axis=1))
df_numeric_scaled = pd.DataFrame(minmax_processed, index=df.index, columns=df.columns[:-1])
df_numeric_scaled.head()
from sklearn.cluster import KMeans
Apply K-Means Clustering
What k to choose?
Let's fit models with 1 to 19 clusters (range(1, 20) excludes 20) to our data and take a look at the corresponding score values.
Nc = range(1, 20)
kmeans = [KMeans(n_clusters=i) for i in Nc]
score = [kmeans[i].fit(df_numeric_scaled).score(df_numeric_scaled) for i in range(len(kmeans))]
KMeans.score returns the negative of the within-cluster sum of squared distances, so these values are never positive. The closer a score is to 0, the closer the observations are to their cluster centers; a large negative value indicates that the cluster centers are far from the observations.
Based on these score values, we plot an elbow curve to decide which number of clusters is optimal. Note that we are dealing with a tradeoff between the number of clusters (and hence the computation required) and the relative accuracy.
import matplotlib.pyplot as pl
pl.plot(Nc,score)
pl.xlabel('Number of Clusters')
pl.ylabel('Score')
pl.title('Elbow Curve')
pl.show()
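To see how the score relates to the WCSS (inertia) used in the question, the two can be compared on a small example. This is a sketch on synthetic data; the array X and the parameter choices are assumptions for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))  # synthetic data, not the original CSV

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
score = km.score(X)

# score is the negative of the k-means objective on X, so on the
# training data it should be close to -km.inertia_
print(score, -km.inertia_)
```

Plotting the score therefore gives the same elbow as plotting the inertia, only mirrored about the horizontal axis.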
Fit K-Means for clustering with k=5
kmeans = KMeans(n_clusters=5)
kmeans.fit(df_numeric_scaled)
df['cluster'] = kmeans.labels_
df.head()
