replace one element in multiple lists of list - python-3.x

I have a list that contains multiple lists, and I want to replace "\xa0" in each inner list, but I don't know how to do this. My sample list looks like this:
[['0001/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'LSSZEC18033999', '\xa0'],
['0001/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '40693008366', '\xa0'],
['0002/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'APLU750808254', 'HTHC18032101'],
['0002/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '02037823030', '\xa0'],
['0003/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'LSSZEC18032365', '\xa0'],
['0003/19-20', 'NHAVA SHEVA SEA (INNSA1)', 'SHAE19030155', '\xa0'],
['0004/18-19', 'NHAVA SHEVA SEA (INNSA1)', '0258A33647', 'LLLNVS842311NVS'],
['0004/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '17602776476', '\xa0'],
['0005/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'APLU750808254', 'HTHC18032101'],
['0005/19-20', 'NHAVA SHEVA SEA (INNSA1)', 'SNKO02A190301057', '\xa0'],
['0006/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'SZWY18030109', '\xa0'],
['0006/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '40684842450', '3986'],
['0007/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'SRL18030520', '\xa0'],
['0007/19-20', 'NHAVA SHEVA SEA (INNSA1)', 'HDMUJPNS1768154', '\xa0'],
['0008/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'YSNBF18030315', '\xa0'],
['0008/19-20', 'MUMBAI', 'CTLQD19036504', '\xa0'],
['0009/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'SNKO02A180300433', '\xa0'],
['0009/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '51404381786', 'X8867ANKF7X'],
['0010/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'SNKO02A180300587', '\xa0'],
['0010/19-20', 'NHAVA SHEVA SEA (INNSA1)', 'SRL19030377', '\xa0']]
need help.

Try the below code, hope this helps.
data = [['0001/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'LSSZEC18033999', '\xa0'], ['0001/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '40693008366', '\xa0'], ['0002/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'APLU750808254', 'HTHC18032101'], ['0002/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '02037823030', '\xa0'], ['0003/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'LSSZEC18032365', '\xa0'], ['0003/19-20', 'NHAVA SHEVA SEA (INNSA1)', 'SHAE19030155', '\xa0'], ['0004/18-19', 'NHAVA SHEVA SEA (INNSA1)', '0258A33647', 'LLLNVS842311NVS'], ['0004/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '17602776476', '\xa0'], ['0005/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'APLU750808254', 'HTHC18032101'], ['0005/19-20', 'NHAVA SHEVA SEA (INNSA1)', 'SNKO02A190301057', '\xa0'], ['0006/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'SZWY18030109', '\xa0'], ['0006/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '40684842450', '3986'], ['0007/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'SRL18030520', '\xa0'], ['0007/19-20', 'NHAVA SHEVA SEA (INNSA1)', 'HDMUJPNS1768154', '\xa0'], ['0008/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'YSNBF18030315', '\xa0'], ['0008/19-20', 'MUMBAI', 'CTLQD19036504', '\xa0'], ['0009/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'SNKO02A180300433', '\xa0'], ['0009/19-20', 'SAHAR AIR CARGO ACC (INBOM4)', '51404381786', 'X8867ANKF7X'], ['0010/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'SNKO02A180300587', '\xa0'], ['0010/19-20', 'NHAVA SHEVA SEA (INNSA1)', 'SRL19030377', '\xa0']]
newdata = [[sent.replace(u'\xa0', u' ') for sent in lst] for lst in data]
print(newdata)

in_list = [['123', '\xa0'], ['123', '\xa0'], ['123', '\xa0'], ['123', '\xa0']]
out_list = [[i.replace('\xa0', '') if i == '\xa0' else i for i in sub_list] for sub_list in in_list]
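Both answers above build a new list with a nested list comprehension. As a minimal sketch, the same idea can be wrapped in a small helper. The name `replace_in_rows` and its `old`/`new` parameters are hypothetical and not part of the question. It assumes that real data may also contain non-string items and leaves those untouched.

```python
def replace_in_rows(rows, old='\xa0', new=''):
    """Return a new list of lists with `old` replaced by `new` in every string item."""
    return [
        [item.replace(old, new) if isinstance(item, str) else item for item in row]
        for row in rows
    ]


# Two rows taken from the sample list in the question
rows = [
    ['0001/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'LSSZEC18033999', '\xa0'],
    ['0002/18-19', 'NHAVA SHEVA SEA (INNSA1)', 'APLU750808254', 'HTHC18032101'],
]
cleaned = replace_in_rows(rows)
print(cleaned)
```

The original `rows` list is not modified; pass `new=' '` to get a plain space instead of an empty string.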

Related

Text in table cells not available in Word doc (python-docx)

I'm trying to extract the text from certain columns in tables saved in docx files, so I'm using the python-docx library to parse the documents but it's only returning the text from certain cells. I've used opc-diag to get the xml for the word doc, and I've pasted a snippet below. The only cells that I can read the text from are the ones containing numbers (so 1 in the snippet), but I can't see what's different about those cells in the XML. I know I might have to end up writing my own parser (can't use Word as the code will be hosted in an AWS service) but I feel like I'm missing something obvious. Has anybody come across anything like this before? I found some other stackoverflow answers mentioning <w:sdt> tags causing problems, but I don't have any of those.
The code I'm using to extract text from cells -
for table in raw_script.tables:
    column_data = []
    for column in table.columns:
        for cell in column.cells:
            if cell.text not in column_data:
                column_data.append(cell.text)
    print(column_data)
That prints ['', '1', '2', '3', '4', '5'], which isn't what I want!
The document.xml snippet, if it helps -
<w:body>
<w:tbl>
<w:tblPr>
<w:tblW w:w="10556" w:type="dxa"/>
<w:tblLayout w:type="fixed"/>
<w:tblLook w:val="04A0" w:firstRow="1" w:lastRow="0" w:firstColumn="1" w:lastColumn="0" w:noHBand="0" w:noVBand="1"/>
</w:tblPr>
<w:tblGrid>
<w:gridCol w:w="2978"/>
<w:gridCol w:w="794"/>
<w:gridCol w:w="1622"/>
<w:gridCol w:w="4084"/>
<w:gridCol w:w="1078"/>
</w:tblGrid>
<w:tr w:rsidR="00C409AE" w:rsidTr="00305E71">
<w:trPr>
<w:cantSplit/>
<w:trHeight w:val="2127"/>
</w:trPr>
<w:tc>
<w:tcPr>
<w:tcW w:w="2978" w:type="dxa"/>
<w:tcBorders>
<w:top w:val="nil"/>
<w:left w:val="nil"/>
<w:bottom w:val="nil"/>
<w:right w:val="nil"/>
</w:tcBorders>
<w:shd w:val="clear" w:color="auto" w:fill="auto"/>
<w:hideMark/>
</w:tcPr>
<w:p w:rsidR="00C409AE" w:rsidRPr="00D01A3D" w:rsidRDefault="00CF5F0E" w:rsidP="00C409AE">
<w:pPr>
<w:spacing w:after="0" w:line="240" w:lineRule="auto"/>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:szCs w:val="20"/>
<w:u w:val="single"/>
<w:lang w:val="en-AU" w:eastAsia="en-AU"/>
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:szCs w:val="20"/>
<w:u w:val="single"/>
<w:lang w:val="en-AU" w:eastAsia="en-AU"/>
</w:rPr>
<w:t>
Sunset</w:t>
</w:r>
</w:p>
<w:p w:rsidR="00C409AE" w:rsidRPr="00D01A3D" w:rsidRDefault="00C409AE" w:rsidP="00C409AE">
<w:pPr>
<w:spacing w:after="0" w:line="240" w:lineRule="auto"/>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:szCs w:val="20"/>
<w:u w:val="single"/>
<w:lang w:val="en-AU" w:eastAsia="en-AU"/>
</w:rPr>
</w:pPr>
</w:p>
<w:p w:rsidR="00C409AE" w:rsidRPr="00D01A3D" w:rsidRDefault="00C409AE" w:rsidP="00C409AE">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:b/>
<w:color w:val="000000"/>
<w:u w:val="single"/>
</w:rPr>
</w:pPr>
<w:r w:rsidRPr="00D01A3D">
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:b/>
<w:color w:val="000000"/>
<w:u w:val="single"/>
</w:rPr>
<w:t>
Series Title: 10:00:02</w:t>
</w:r>
</w:p>
<w:p w:rsidR="00C409AE" w:rsidRPr="00D01A3D" w:rsidRDefault="00CF5F0E" w:rsidP="00C409AE">
<w:pPr>
<w:spacing w:after="0" w:line="240" w:lineRule="auto"/>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:b/>
<w:color w:val="000000"/>
<w:u w:val="single"/>
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:b/>
<w:color w:val="000000"/>
<w:u w:val="single"/>
</w:rPr>
<w:t>
Sample Script</w:t>
</w:r>
</w:p>
<w:p w:rsidR="00C409AE" w:rsidRPr="00305E71" w:rsidRDefault="00C409AE" w:rsidP="00C409AE">
<w:pPr>
<w:spacing w:after="0" w:line="240" w:lineRule="auto"/>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:szCs w:val="20"/>
<w:u w:val="single"/>
<w:lang w:val="en-AU" w:eastAsia="zh-TW"/>
</w:rPr>
</w:pPr>
</w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:tcW w:w="794" w:type="dxa"/>
<w:tcBorders>
<w:top w:val="nil"/>
<w:left w:val="nil"/>
<w:bottom w:val="nil"/>
<w:right w:val="nil"/>
</w:tcBorders>
<w:shd w:val="clear" w:color="auto" w:fill="auto"/>
<w:noWrap/>
<w:hideMark/>
</w:tcPr>
<w:p w:rsidR="00C409AE" w:rsidRDefault="00C409AE" w:rsidP="00C409AE">
<w:pPr>
<w:spacing w:after="0" w:line="240" w:lineRule="auto"/>
<w:jc w:val="center"/>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:i/>
<w:iCs/>
<w:color w:val="000000"/>
<w:lang w:val="en-AU" w:eastAsia="en-AU"/>
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
<w:i/>
<w:iCs/>
<w:color w:val="000000"/>
</w:rPr>
<w:t>
1</w:t>
</w:r>
</w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:tcW w:w="1622" w:type="dxa"/>
<w:tcBorders>
<w:top w:val="nil"/>
<w:left w:val="nil"/>
<w:bottom w:val="nil"/>
<w:right w:val="nil"/>
</w:tcBorders>
<w:shd w:val="clear" w:color="auto" w:fill="auto"/>
<w:noWrap/>
<w:hideMark/>
</w:tcPr>
<w:p w:rsidR="00C409AE" w:rsidRDefault="00C409AE" w:rsidP="00C409AE">
<w:pPr>
<w:rPr>
<w:rFonts w:cs="Calibri"/>
<w:b/>
<w:bCs/>
<w:color w:val="000000"/>
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:rFonts w:cs="Calibri"/>
<w:b/>
<w:bCs/>
<w:color w:val="000000"/>
</w:rPr>
<w:t>
SONG:</w:t>
</w:r>
</w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:tcW w:w="4084" w:type="dxa"/>
<w:tcBorders>
<w:top w:val="nil"/>
<w:left w:val="nil"/>
<w:bottom w:val="nil"/>
<w:right w:val="nil"/>
</w:tcBorders>
<w:shd w:val="clear" w:color="auto" w:fill="auto"/>
<w:hideMark/>
</w:tcPr>
<w:p w:rsidR="00C409AE" w:rsidRDefault="00CF5F0E" w:rsidP="00C409AE">
<w:pPr>
<w:rPr>
<w:rFonts w:cs="Calibri"/>
<w:color w:val="000000"/>
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:rFonts w:cs="Calibri"/>
<w:color w:val="000000"/>
</w:rPr>
<w:t>
# Theme Music Lyrics</w:t>
</w:r>
</w:p>
<w:p w:rsidR="00CF5F0E" w:rsidRDefault="00CF5F0E" w:rsidP="00C409AE">
<w:pPr>
<w:rPr>
<w:rFonts w:cs="Calibri"/>
<w:color w:val="000000"/>
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:rFonts w:cs="Calibri"/>
<w:color w:val="000000"/>
</w:rPr>
<w:t>
# Second Line Of Theme</w:t>
</w:r>
</w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:tcW w:w="1078" w:type="dxa"/>
<w:tcBorders>
<w:top w:val="nil"/>
<w:left w:val="nil"/>
<w:bottom w:val="nil"/>
<w:right w:val="nil"/>
</w:tcBorders>
<w:shd w:val="clear" w:color="auto" w:fill="auto"/>
<w:noWrap/>
<w:hideMark/>
</w:tcPr>
<w:p w:rsidR="00C409AE" w:rsidRDefault="00C409AE" w:rsidP="00C409AE">
<w:pPr>
<w:jc w:val="right"/>
<w:rPr>
<w:rFonts w:cs="Calibri"/>
<w:color w:val="000000"/>
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:rFonts w:cs="Calibri"/>
<w:color w:val="000000"/>
</w:rPr>
<w:t>
10:00:01</w:t>
</w:r>
</w:p>
</w:tc>
</w:tr>
<w:tr w:rsidR="00C409AE" w:rsidTr="00305E71">
<w:trPr>
<w:cantSplit/>
<w:trHeight w:val="1512"/>
</w:trPr>
<w:tc>
<w:tcPr>
<w:tcW w:w="2978" w:type="dxa"/>
<w:tcBorders>
<w:top w:val="nil"/>
<w:left w:val="nil"/>
<w:bottom w:val="nil"/>
<w:right w:val="nil"/>
</w:tcBorders>
<w:shd w:val="clear" w:color="auto" w:fill="auto"/>
<w:hideMark/>
</w:tcPr>

Curve Fitting multiple x variables

I'm currently trying to curve fit some experimental data to a simple power-law equation.
Nu = C*Re**m*Pr**(1/3)
I am trying to use the scipy.optimize.curve_fit function to do this, but I get the error "Result from function call is not a proper array of floats." I don't know why I am getting this error, but I wonder whether it is because my equation needs more than one input array.
My code is as follows
import matplotlib.pyplot as plt
import scipy.optimize as so
def function(C, m):
    result = []
    for i,j in zip(Re, Pr):
        y = C * i ** m * j ** (1/3)
        result.append(y)
    return result
parameters, covariance = so.curve_fit(function, Re, Nu)
y2 = function(Re, Pr, *parameters)
print(parameters)
plt.plot(Re, Nu)
plt.plot(Re, y2)
plt.show()
Here is a graphing 3D surface fitter using curve_fit that has a 3D scatterplot, 3D surface plot, and a contour plot. Note that the initial parameter estimates are all 1.0, and this example does not use scipy's genetic algorithm to estimate initial parameter values.
import numpy, scipy, scipy.optimize
import matplotlib
from mpl_toolkits.mplot3d import Axes3D # registers the '3d' projection on older matplotlib
from matplotlib import cm # to colormap 3D surfaces from blue to red
import matplotlib.pyplot as plt

graphWidth = 800 # units are pixels
graphHeight = 600 # units are pixels

# 3D contour plot lines
numberOfContourLines = 16

def SurfacePlot(func, data, fittedParameters):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    # Axes3D(f) no longer attaches itself to the figure in recent matplotlib
    axes = f.add_subplot(111, projection='3d')
    axes.grid(True)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    xModel = numpy.linspace(min(x_data), max(x_data), 20)
    yModel = numpy.linspace(min(y_data), max(y_data), 20)
    X, Y = numpy.meshgrid(xModel, yModel)
    Z = func(numpy.array([X, Y]), *fittedParameters)
    axes.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=1, antialiased=True)
    axes.scatter(x_data, y_data, z_data) # show data along with plotted surface
    axes.set_title('Surface Plot (click-drag with mouse)') # add a title for surface plot
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    axes.set_zlabel('Z Data') # Z axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def ContourPlot(func, data, fittedParameters):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    xModel = numpy.linspace(min(x_data), max(x_data), 20)
    yModel = numpy.linspace(min(y_data), max(y_data), 20)
    X, Y = numpy.meshgrid(xModel, yModel)
    Z = func(numpy.array([X, Y]), *fittedParameters)
    axes.plot(x_data, y_data, 'o')
    axes.set_title('Contour Plot') # add a title for contour plot
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    CS = matplotlib.pyplot.contour(X, Y, Z, numberOfContourLines, colors='k')
    matplotlib.pyplot.clabel(CS, inline=1, fontsize=10) # labels for contours
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def ScatterPlot(data):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    # Axes3D(f) no longer attaches itself to the figure in recent matplotlib
    axes = f.add_subplot(111, projection='3d')
    axes.grid(True)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    axes.scatter(x_data, y_data, z_data)
    axes.set_title('Scatter Plot (click-drag with mouse)')
    axes.set_xlabel('X Data')
    axes.set_ylabel('Y Data')
    axes.set_zlabel('Z Data')
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def func(data, a, alpha, beta):
    t = data[0]
    p_p = data[1]
    return a * (t**alpha) * (p_p**beta)

if __name__ == "__main__":
    xData = numpy.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
    yData = numpy.array([11.0, 12.1, 13.0, 14.1, 15.0, 16.1, 17.0, 18.1, 90.0])
    zData = numpy.array([1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.0, 9.9])
    data = [xData, yData, zData]
    initialParameters = [1.0, 1.0, 1.0] # these are the same as scipy default values in this example
    # here a non-linear surface fit is made with scipy's curve_fit()
    fittedParameters, pcov = scipy.optimize.curve_fit(func, [xData, yData], zData, p0 = initialParameters)
    ScatterPlot(data)
    SurfacePlot(func, data, fittedParameters)
    ContourPlot(func, data, fittedParameters)
    print('fitted parameters', fittedParameters)
    modelPredictions = func(data, *fittedParameters)
    absError = modelPredictions - zData
    SE = numpy.square(absError) # squared errors
    MSE = numpy.mean(SE) # mean squared errors
    RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
    Rsquared = 1.0 - (numpy.var(absError) / numpy.var(zData))
    print('RMSE:', RMSE)
    print('R-squared:', Rsquared)
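For the question's own correlation, Nu = C*Re**m*Pr**(1/3), the key point is that curve_fit calls the model as f(xdata, *params), so the independent variables must come first and can be stacked into one array. The following is only a sketch under stated assumptions: the function name `nusselt`, the synthetic noise-free data, the "true" values C = 0.023 and m = 0.8, and the initial guess are all made up for illustration and are not the asker's measurements.

```python
import numpy as np
from scipy.optimize import curve_fit


def nusselt(X, C, m):
    # X stacks both independent variables: row 0 is Re, row 1 is Pr
    Re, Pr = X
    return C * Re**m * Pr**(1.0 / 3.0)


# Synthetic, noise-free data generated from assumed values C=0.023, m=0.8
Re = np.array([1.0e4, 2.0e4, 5.0e4, 1.0e5, 2.0e5])
Pr = np.array([0.7, 1.0, 3.0, 5.0, 7.0])
Nu = nusselt((Re, Pr), 0.023, 0.8)

# xdata is a 2 x N array; p0 is a rough initial guess (an assumption)
params, cov = curve_fit(nusselt, np.vstack((Re, Pr)), Nu, p0=[0.05, 0.7])
print(params)
```

Because the model returns a NumPy array rather than a Python list built in a loop, the result has the float-array form that curve_fit expects.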

Fitting 3d data

I would like to fit a function to a 3d data.
I read the data with pandas:
df = pd.read_csv('data.csv')
Ca = df.Ca
q = df.q
L = df.L0
Then, I define my 3d function (z=f(x,y)) as:
def func(q, Ca, l0, v0, beta):
    return l0 + q*v0*(1+beta/(q*Ca))
then I use curve_fit to find the best fit parameters:
from scipy.optimize import curve_fit
guess = (1,1,1)
popt, pcov = curve_fit(func, q,Ca,L, guess)
It gives me the following error:
ValueError: `sigma` has incorrect shape.
Do you know what is the mistake and how to solve it?
Thanks a lot for your help
Here is a graphical 3D fitter with 3D scatter plot, 3D surface plot, and 3D contour plot.
import numpy, scipy, scipy.optimize
import matplotlib
from mpl_toolkits.mplot3d import Axes3D # registers the '3d' projection on older matplotlib
from matplotlib import cm # to colormap 3D surfaces from blue to red
import matplotlib.pyplot as plt

graphWidth = 800 # units are pixels
graphHeight = 600 # units are pixels

# 3D contour plot lines
numberOfContourLines = 16

def SurfacePlot(func, data, fittedParameters):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    # Axes3D(f) no longer attaches itself to the figure in recent matplotlib
    axes = f.add_subplot(111, projection='3d')
    axes.grid(True)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    xModel = numpy.linspace(min(x_data), max(x_data), 20)
    yModel = numpy.linspace(min(y_data), max(y_data), 20)
    X, Y = numpy.meshgrid(xModel, yModel)
    Z = func(numpy.array([X, Y]), *fittedParameters)
    axes.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=1, antialiased=True)
    axes.scatter(x_data, y_data, z_data) # show data along with plotted surface
    axes.set_title('Surface Plot (click-drag with mouse)') # add a title for surface plot
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    axes.set_zlabel('Z Data') # Z axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def ContourPlot(func, data, fittedParameters):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    xModel = numpy.linspace(min(x_data), max(x_data), 20)
    yModel = numpy.linspace(min(y_data), max(y_data), 20)
    X, Y = numpy.meshgrid(xModel, yModel)
    Z = func(numpy.array([X, Y]), *fittedParameters)
    axes.plot(x_data, y_data, 'o')
    axes.set_title('Contour Plot') # add a title for contour plot
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    CS = matplotlib.pyplot.contour(X, Y, Z, numberOfContourLines, colors='k')
    matplotlib.pyplot.clabel(CS, inline=1, fontsize=10) # labels for contours
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def ScatterPlot(data):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    # Axes3D(f) no longer attaches itself to the figure in recent matplotlib
    axes = f.add_subplot(111, projection='3d')
    axes.grid(True)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    axes.scatter(x_data, y_data, z_data)
    axes.set_title('Scatter Plot (click-drag with mouse)')
    axes.set_xlabel('X Data')
    axes.set_ylabel('Y Data')
    axes.set_zlabel('Z Data')
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def func(data, a, alpha, beta):
    x = data[0]
    y = data[1]
    return a * (x**alpha) * (y**beta)

if __name__ == "__main__":
    xData = numpy.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
    yData = numpy.array([11.0, 12.1, 13.0, 14.1, 15.0, 16.1, 17.0, 18.1, 90.0])
    zData = numpy.array([1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.0, 9.9])
    data = [xData, yData, zData]
    initialParameters = [1.0, 1.0, 1.0] # these are the same as scipy default values in this example
    # here a non-linear surface fit is made with scipy's curve_fit()
    fittedParameters, pcov = scipy.optimize.curve_fit(func, [xData, yData], zData, p0 = initialParameters)
    ScatterPlot(data)
    SurfacePlot(func, data, fittedParameters)
    ContourPlot(func, data, fittedParameters)
    print('fitted parameters', fittedParameters)
    modelPredictions = func(data, *fittedParameters)
    absError = modelPredictions - zData
    SE = numpy.square(absError) # squared errors
    MSE = numpy.mean(SE) # mean squared errors
    RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
    Rsquared = 1.0 - (numpy.var(absError) / numpy.var(zData))
    print('RMSE:', RMSE)
    print('R-squared:', Rsquared)
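As for the `sigma` error in the question: curve_fit takes its positional arguments in the order (f, xdata, ydata, p0, sigma), so the call curve_fit(func, q, Ca, L, guess) treats Ca as ydata, L as p0, and the three-element guess as sigma. A minimal sketch of one way around this is to pack q and Ca into a single xdata argument and unpack them inside the model. The arrays below are synthetic stand-ins for the CSV columns, generated from assumed parameter values (l0 = 2, v0 = 3, beta = 0.5), not the asker's data.

```python
import numpy as np
from scipy.optimize import curve_fit


def func(X, l0, v0, beta):
    # X packs both independent variables: row 0 is q, row 1 is Ca
    q, Ca = X
    return l0 + q * v0 * (1 + beta / (q * Ca))


# Synthetic stand-ins for df.q, df.Ca and df.L0 (assumed values, noise-free)
q = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Ca = np.array([0.1, 0.5, 0.2, 1.0, 0.3, 0.8])
L = func((q, Ca), 2.0, 3.0, 0.5)

guess = (1.0, 1.0, 1.0)
# Both independent variables go in as one xdata argument; p0 is passed by keyword
popt, pcov = curve_fit(func, (q, Ca), L, p0=guess)
print(popt)
```

Passing `p0` by keyword avoids the positional mix-up even if more arguments are added later.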

Linear approximation of the given equation with python3

I was given a set of raw data and have to model it using machine learning techniques. After some research, I decided to use linear approximation.
Description of the equation.
z - depth (meters)
T(z) - temperature at the depth z
T(zᵢ) - temperature at the depth zᵢ
T₀ - temperature at the surface (It is constant and known)
K - coefficient of geothermal gradient (How the temperature changes with respect to the depth)
Mᵢ - flow rate of the liquid at the depth zᵢ
As the equation shows, we can find the temperature of the liquid at any depth in the well bore.
I have lists of depths, temperatures, and liquid flow rates, and I have to fit the equation to these data in Python 3. Currently I use the matplotlib library for this kind of calculation.
Here is an example of non-linear multiple regression in Python 3; it should be easy to adapt to your multiple regression problem.
import numpy, scipy, scipy.optimize
import matplotlib
from mpl_toolkits.mplot3d import Axes3D # registers the '3d' projection on older matplotlib
from matplotlib import cm # to colormap 3D surfaces from blue to red
import matplotlib.pyplot as plt

graphWidth = 800 # units are pixels
graphHeight = 600 # units are pixels

# 3D contour plot lines
numberOfContourLines = 16

def SurfacePlot(func, data, fittedParameters):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    # Axes3D(f) no longer attaches itself to the figure in recent matplotlib
    axes = f.add_subplot(111, projection='3d')
    axes.grid(True)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    xModel = numpy.linspace(min(x_data), max(x_data), 20)
    yModel = numpy.linspace(min(y_data), max(y_data), 20)
    X, Y = numpy.meshgrid(xModel, yModel)
    Z = func(numpy.array([X, Y]), *fittedParameters)
    axes.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=1, antialiased=True)
    axes.scatter(x_data, y_data, z_data) # show data along with plotted surface
    axes.set_title('Surface Plot (click-drag with mouse)') # add a title for surface plot
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    axes.set_zlabel('Z Data') # Z axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def ContourPlot(func, data, fittedParameters):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    xModel = numpy.linspace(min(x_data), max(x_data), 20)
    yModel = numpy.linspace(min(y_data), max(y_data), 20)
    X, Y = numpy.meshgrid(xModel, yModel)
    Z = func(numpy.array([X, Y]), *fittedParameters)
    axes.plot(x_data, y_data, 'o')
    axes.set_title('Contour Plot') # add a title for contour plot
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    CS = matplotlib.pyplot.contour(X, Y, Z, numberOfContourLines, colors='k')
    matplotlib.pyplot.clabel(CS, inline=1, fontsize=10) # labels for contours
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def ScatterPlot(data):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    # Axes3D(f) no longer attaches itself to the figure in recent matplotlib
    axes = f.add_subplot(111, projection='3d')
    axes.grid(True)
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]
    axes.scatter(x_data, y_data, z_data)
    axes.set_title('Scatter Plot (click-drag with mouse)')
    axes.set_xlabel('X Data')
    axes.set_ylabel('Y Data')
    axes.set_zlabel('Z Data')
    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems

def func(data, a, alpha, beta):
    t = data[0]
    p_p = data[1]
    return a * (t**alpha) * (p_p**beta)

if __name__ == "__main__":
    xData = numpy.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
    yData = numpy.array([11.0, 12.1, 13.0, 14.1, 15.0, 16.1, 17.0, 18.1, 90.0])
    zData = numpy.array([1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.0, 9.9])
    data = [xData, yData, zData]
    # this example uses curve_fit()'s default initial parameter values
    fittedParameters, pcov = scipy.optimize.curve_fit(func, [xData, yData], zData)
    ScatterPlot(data)
    SurfacePlot(func, data, fittedParameters)
    ContourPlot(func, data, fittedParameters)
    print('fitted parameters', fittedParameters)

Why the colorbar is not normalized (0 to 1)? How to force it?

I am plotting a confusion matrix with the function from scikit-learn, but I do not know why the colorbar does not range from 0 to 1. Is there a way to force it?
import itertools
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

def plot_confusion_matrix(cm, title='Confusion matrix RF', cmap=plt.cm.viridis):
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(np.unique(y)))
    plt.xticks(tick_marks, rotation=90)
    ax = plt.gca()
    ax.set_xticklabels(['s'+lab for lab in (ax.get_xticks()+1).astype(str)])
    plt.yticks(tick_marks)
    ax.set_yticklabels(['s'+lab for lab in (ax.get_yticks()+1).astype(str)])
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

# y, y_true and y_pred are the asker's label arrays (not shown in the question)
cm_imp = confusion_matrix(y_true, y_pred)
cm_imp_normalized = cm_imp.astype('float') / cm_imp.sum(axis=1)[:, np.newaxis]
plt.figure(figsize=(8,6))
plot_confusion_matrix(cm_imp_normalized)
plt.show()
print("")
print("")
You can set the color range using the vmin, vmax arguments to imshow.
plt.imshow(data, cmap=cmap, vmin=0, vmax=1)
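By default imshow scales the colormap to the minimum and maximum of the data it is given, so a row-normalized matrix whose largest entry is below 1 gets a colorbar that stops short of 1. A small self-contained sketch of the vmin/vmax fix follows. The 3-class matrix is hypothetical, and the Agg backend is selected only so the snippet runs without a display; both are assumptions rather than part of the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical 3-class confusion matrix (counts), not taken from the question
cm = np.array([[5, 1, 0],
               [2, 3, 1],
               [0, 1, 4]])
# Normalize each row so it sums to 1; the largest entry here is 5/6
cm_normalized = cm.astype(float) / cm.sum(axis=1)[:, np.newaxis]

fig, ax = plt.subplots()
# vmin/vmax pin the color scale to [0, 1] regardless of the data range
im = ax.imshow(cm_normalized, interpolation='nearest', cmap='viridis', vmin=0, vmax=1)
cbar = fig.colorbar(im, ax=ax)
print(im.get_clim())
```

In the question's function, the same two keyword arguments would go into the existing plt.imshow call.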
