What is going on with stochastic gradient descent? - python-3.x

I am working with multivariate linear regression and using stochastic gradient descent (SGD) to optimize.
I am working on this dataset:
http://archive.ics.uci.edu/ml/machine-learning-databases/abalone/
For every run all hyperparameters and everything else are the same: epochs=200 and alpha=0.1.
On the first run I got final_cost=0.0591; running the program again with everything unchanged I got final_cost=1.0056; running again, final_cost=0.8214; running again, final_cost=15.9591; running again, final_cost=2.3162; and so on.
As you can see, with everything kept the same, each run changes the final cost by a large amount, sometimes jumping from 0.8 straight to 15.9, or from 0.05 to 1.00. On top of that, the graph of the cost after every epoch within a single run is very zigzag, unlike batch GD, where the cost curve decreases smoothly.
I can't understand why SGD is behaving so strangely and giving different results on different runs.
I tried the same thing with batch GD and everything is absolutely fine and smooth, as expected. With batch GD, no matter how many times I run the same code, the result is exactly the same every time.
But in the case of SGD, I literally cried. Here is my code:
import time
import numpy as np
import pandas as pd                      # df is expected to be a pandas DataFrame
import matplotlib.pyplot as plt

class Abalone:
    def __init__(self, df, epochs=200, miniBatchSize=250, alpha=0.1):
        self.df = df.dropna()
        self.epochs = epochs
        self.miniBatchSize = miniBatchSize
        self.alpha = alpha
        print("abalone created")
        self.modelTheData()

    def modelTheData(self):
        self.TOTAL_ATTR = len(self.df.columns) - 1
        self.TOTAL_DATA_LENGTH = len(self.df.index)
        # 60/40 train/test split (note: this refers to the module-level df, not self.df)
        self.df_trainingData = df.drop(df.index[int(self.TOTAL_DATA_LENGTH * 0.6):])
        self.TRAINING_DATA_SIZE = len(self.df_trainingData)
        self.df_testingData = df.drop(df.index[:int(self.TOTAL_DATA_LENGTH * 0.6)])
        self.TESTING_DATA_SIZE = len(self.df_testingData)
        self.miniBatchSize = int(self.TRAINING_DATA_SIZE / 10)
        self.thetaVect = np.zeros((self.TOTAL_ATTR + 1, 1), dtype=float)
        self.stochasticGradientDescent()

    def stochasticGradientDescent(self):
        self.finalCostArr = np.array([])
        startTime = time.time()
        for i in range(self.epochs):
            # reshuffle the training data at the start of every epoch
            self.df_trainingData = self.df_trainingData.sample(frac=1).reset_index(drop=True)
            miniBatches = [self.df_trainingData.loc[x:x + self.miniBatchSize -
                           ((x + self.miniBatchSize) / (self.TRAINING_DATA_SIZE - 1)), :]
                           for x in range(0, self.TRAINING_DATA_SIZE, self.miniBatchSize)]
            self.epochCostArr = np.array([])
            for j in miniBatches:
                tempMat = j.values
                self.actualValVect = tempMat[:, self.TOTAL_ATTR:]
                tempMat = tempMat[:, :self.TOTAL_ATTR]
                # design matrix with a leading column of ones for the bias term
                self.desMat = np.append(np.ones((len(j.index), 1), dtype=float), tempMat, 1)
                del tempMat
                self.trainData()
                currCost = self.costEvaluation()
                self.epochCostArr = np.append(self.epochCostArr, currCost)
            self.finalCostArr = np.append(self.finalCostArr,
                                          self.epochCostArr[len(miniBatches) - 1])
        endTime = time.time()
        print(f"execution time : {endTime - startTime}")
        self.graphEvaluation()
        print(f"final cost : {self.finalCostArr[len(self.finalCostArr) - 1]}")
        print(self.thetaVect)

    def trainData(self):
        self.predictedValVect = self.predictResult()
        diffVect = self.predictedValVect - self.actualValVect
        partialDerivativeVect = np.matmul(self.desMat.T, diffVect)
        self.thetaVect -= (self.alpha / len(self.desMat)) * partialDerivativeVect

    def predictResult(self):
        return np.matmul(self.desMat, self.thetaVect)

    def costEvaluation(self):
        cost = sum((self.predictedValVect - self.actualValVect) ** 2)
        return cost / (2 * len(self.actualValVect))

    def graphEvaluation(self):
        plt.title("cost at end of all epochs")
        x = range(len(self.epochCostArr))
        y = self.epochCostArr
        plt.plot(x, y)
        plt.xlabel("iterations")
        plt.ylabel("cost")
        plt.show()
I kept epochs=200 and alpha=0.1 for all runs but I got a totally different result in each run.
The vectors below are the theta vectors, where the first entry is the bias and the remaining entries are the weights.
RUN 1 =>>
[[ 5.26020144]
[ -0.48787333]
[ 4.36479114]
[ 4.56848299]
[ 2.90299436]
[ 3.85349625]
[-10.61906207]
[ -0.93178027]
[ 8.79943389]]
final cost : 0.05917831328836957
RUN 2 =>>
[[ 5.18355814]
[ -0.56072668]
[ 4.32621647]
[ 4.58803884]
[ 2.89157598]
[ 3.7465471 ]
[-10.75751065]
[ -1.03302031]
[ 8.87559247]]
final cost: 1.0056239103948563
RUN 3 =>>
[[ 5.12836056]
[ -0.43672936]
[ 4.25664898]
[ 4.53397465]
[ 2.87847224]
[ 3.74693215]
[-10.73960775]
[ -1.00461585]
[ 8.85225402]]
final cost : 0.8214901206702101
RUN 4 =>>
[[ 5.38794798]
[ 0.23695412]
[ 4.43522951]
[ 4.66093372]
[ 2.9460605 ]
[ 4.13390252]
[-10.60071883]
[ -0.9230675 ]
[ 8.87229324]]
final cost: 15.959132174895712
RUN 5 =>>
[[ 5.19643132]
[ -0.76882106]
[ 4.35445135]
[ 4.58782119]
[ 2.8908931 ]
[ 3.63693031]
[-10.83291949]
[ -1.05709616]
[ 8.865904 ]]
final cost: 2.3162151072779804
I am unable to figure out what is going wrong. Does SGD behave like this, or did I do something stupid while converting my code from batch GD to SGD? And if SGD does behave like this, how do I know how many times I have to rerun it? I am not lucky enough to get a cost as small as 0.05 on the first run every time; sometimes the first run gives a cost around 10.5, sometimes 0.6, and maybe rerunning it many times would give a cost even smaller than 0.05.
When I approach the exact same problem with the exact same code and hyperparameters, just replacing the SGD function with normal batch GD, I get the expected result: after each iteration over the same data the cost decreases smoothly (a monotonically decreasing function), and no matter how many times I rerun the same program I get exactly the same result, as is obvious.
"Keeping everything the same but using batch GD with epochs=20000 and alpha=0.1,
I got final_cost=2.7474"
    def BatchGradientDescent(self):
        self.costArr = np.array([])
        startTime = time.time()
        for i in range(self.epochs):
            tempMat = self.df_trainingData.values
            self.actualValVect = tempMat[:, self.TOTAL_ATTR:]
            tempMat = tempMat[:, :self.TOTAL_ATTR]
            self.desMat = np.append(np.ones((self.TRAINING_DATA_SIZE, 1), dtype=float), tempMat, 1)
            del tempMat
            self.trainData()
            if i % 100 == 0:
                currCost = self.costEvaluation()
                self.costArr = np.append(self.costArr, currCost)
        endTime = time.time()
        print(f"execution time : {endTime - startTime} seconds")
        self.graphEvaluation()
        print(self.thetaVect)
        print(f"final cost : {self.costArr[len(self.costArr)-1]}")
Can somebody help me figure out what is actually going on? Every opinion/solution is a big gain for me in this new field :)

You missed the most important, and only, difference between GD ("Gradient Descent") and SGD ("Stochastic Gradient Descent").
Stochasticity - literally "the quality of lacking any predictable order or plan", i.e. randomness.
This means that while in the GD algorithm the order of the samples in each epoch stays constant, in SGD the order is randomly shuffled at the beginning of every epoch.
So every run of GD with the same initialization and hyperparameters will produce exactly the same result, while SGD most definitely will not (as you have experienced).
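If you want successive runs to be comparable while debugging, one option (my suggestion, not part of the original answer) is to fix the random state of the per-epoch shuffle. A minimal sketch with toy data:
import numpy as np
import pandas as pd

# Seeding the shuffle makes the mini-batch order, and therefore the whole SGD
# run, identical across program runs. Using the epoch index keeps the order
# different across epochs but reproducible between runs.
df = pd.DataFrame({"x": np.arange(10, dtype=float), "y": 2.0 * np.arange(10)})
for epoch in range(3):
    shuffled = df.sample(frac=1, random_state=epoch).reset_index(drop=True)
    # ... build mini-batches from `shuffled` exactly as in the question's code ...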
The reason for using stochasticity is to prevent the model from memorizing the training samples (which would result in overfitting, where accuracy on the training set is high but accuracy on unseen samples is bad).
Now, regarding the big differences in final cost values between runs in your case, my guess is that your learning rate is too high. You can use a lower constant value, or better yet, a decaying learning rate (which gets lower as the epoch number gets higher).
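A minimal sketch of a decaying learning rate (the initial value and decay constant here are arbitrary illustrations, not tuned values):
# inverse-time decay: alpha shrinks as the epoch index grows
alpha0, decay = 0.1, 0.05
for epoch in range(200):
    alpha = alpha0 / (1.0 + decay * epoch)
    # ... run the mini-batch updates for this epoch with this alpha ...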

Related

Pan Tompkins Lowpass filter overflow

The Pan-Tompkins algorithm [1] for removing noise from an ECG/EKG is cited often. It uses a low-pass filter followed by a high-pass filter. The output of the high-pass filter looks great, but (depending on the starting conditions) the output of the low-pass filter will continuously increase or decrease. Given enough time, the numbers eventually get to a size the programming language cannot handle and roll over. If I run this on an Arduino (which uses a variant of C), it rolls over on the order of 10 seconds. Not ideal. Is there a way to get rid of this bias? I've tried messing with initial conditions, but I'm fresh out of ideas. The advantage of this algorithm is that it's not very computationally intensive and will run comfortably on a modest microprocessor.
[1] Pan, Jiapu; Tompkins, Willis J. (March 1985). "A Real-Time QRS Detection Algorithm". IEEE Transactions on Biomedical Engineering. BME-32 (3): 230–236.
Python code to illustrate the problem (uses numpy and matplotlib):
import numpy as np
import matplotlib.pyplot as plt

#low-pass filter
def lpf(x):
    y = x.copy()
    for n in range(len(x)):
        if(n < 12):
            continue
        y[n,1] = 2*y[n-1,1] - y[n-2,1] + x[n,1] - 2*x[n-6,1] + x[n-12,1]
    return y

#high-pass filter
def hpf(x):
    y = x.copy()
    for n in range(len(x)):
        if(n < 32):
            continue
        y[n,1] = y[n-1,1] - x[n,1]/32 + x[n-16,1] - x[n-17,1] + x[n-32,1]/32
    return y
ecg = np.loadtxt('ecg_data.csv', delimiter=',',skiprows=1)
plt.plot(ecg[:,0], ecg[:,1])
plt.title('Raw Data')
plt.grid(True)
plt.savefig('raw.png')
plt.show()
#Application of lpf
f1 = lpf(ecg)
plt.plot(f1[:,0], f1[:,1])
plt.title('After Pan-Tompkins LPF')
plt.xlabel('time')
plt.ylabel('mV')
plt.grid(True)
plt.savefig('lpf.png')
plt.show()
#Application of hpf
f2 = hpf(f1[16:,:])
print(f2[-300:-200,1])
plt.plot(f2[:-100,0], f2[:-100,1])
plt.title('After Pan-Tompkins LPF+HPF')
plt.xlabel('time')
plt.ylabel('mV')
plt.grid(True)
plt.savefig('hpf.png')
plt.show()
raw data in CSV format:
timestamp,ecg_measurement
96813044,2.2336266040
96816964,2.1798632144
96820892,2.1505377292
96824812,2.1603128910
96828732,2.1554253101
96832660,2.1163244247
96836580,2.0576734542
96840500,2.0381231307
96844420,2.0527858734
96848340,2.0674486160
96852252,2.0283479690
96856152,1.9648094177
96860056,1.9208210945
96863976,1.9159335136
96867912,1.9208210945
96871828,1.8768328666
96875756,1.7986314296
96879680,1.7448680400
96883584,1.7155425548
96887508,1.7057673931
96891436,1.6520038604
96895348,1.5591397285
96899280,1.4809384346
96903196,1.4467253684
96907112,1.4369501113
96911032,1.3978494453
96914956,1.3440860509
96918860,1.2952101230
96922788,1.3000977039
96926684,1.3343108892
96930604,1.3440860509
96934516,1.3489736318
96938444,1.3294233083
96942364,1.3782991170
96946284,1.4222873687
96950200,1.4516129493
96954120,1.4369501113
96958036,1.4320625305
96961960,1.4565005302
96965872,1.4907135963
96969780,1.5053763389
96973696,1.4613881111
96977628,1.4125122070
96981548,1.4076246261
96985476,1.4467253684
96989408,1.4809384346
96993324,1.4760508537
96997236,1.4711632728
97001160,1.4907135963
97005084,1.5444769859
97008996,1.5982404708
97012908,1.5835777282
97016828,1.5591397285
97020756,1.5786901473
97024676,1.6324535369
97028604,1.6911046504
97032516,1.6959922313
97036444,1.6764417648
97040364,1.6813293457
97044296,1.7155425548
97048216,1.7448680400
97052120,1.7253177165
97056048,1.6911046504
97059968,1.6911046504
97063880,1.7302052974
97067796,1.7741935253
97071724,1.7693059444
97075644,1.7350928783
97079564,1.7595307826
97083480,1.8719452857
97087396,2.0381231307
97091316,2.2482893466
97095244,2.4828934669
97099156,2.7468230724
97103088,2.9960899353
97106996,3.0987291336
97110912,2.9178886413
97114836,2.5171065330
97118756,2.0185728073
97122668,1.5053763389
97126584,1.1094819307
97130492,0.8015640258
97134396,0.5767350673
97138308,0.4545454502
97142212,0.4349951267
97146124,0.4692081928
97150020,0.4887585639
97153924,0.4594330310
97157828,0.4105571746
97161740,0.3861192512
97165660,0.3763440847
97169580,0.3714565038
97173492,0.3225806236
97177404,0.2639296054
97181316,0.2394916772
97185236,0.2297165155
97189148,0.2443792819
97193060,0.2248289346
97196972,0.1857282543
97200900,0.1808406734
97204812,0.2199413537
97208732,0.2492668628
97212652,0.2443792819
97216572,0.2199413537
97220484,0.2248289346
97224404,0.2834799575
97228316,0.3274682044
97232228,0.3665689229
97236132,0.3861192512
97240036,0.4398827075
97243936,0.5083088874
97247836,0.6109481811
97251748,0.7086998939
97255660,0.7771260738
97259568,0.8553275108
97263476,0.9775171279
97267392,1.1094819307
97271308,1.1974585056
97275228,1.2512218952
97279148,1.2952101230
97283056,1.3734115362
97286992,1.4760508537
97290900,1.5493645668
97294820,1.5738025665
97298740,1.5982404708
97302652,1.6471162796
97306584,1.7106549739
97310500,1.7546432018
97314420,1.7546432018
97318340,1.7644183635
97322272,1.8084066390
97326168,1.8621701240
97330072,1.8963831901
97333988,1.8817204475
97337912,1.8572825431
97341840,1.8670577049
97345748,1.8866080284
97349668,1.8768328666
97353580,1.8230694770
97357500,1.7595307826
97361424,1.7302052974
97365332,1.7350928783
97369252,1.6959922313
97373168,1.6226783752
97377092,1.5298142433
97381012,1.4613881111
97384940,1.4320625305
97388860,1.4076246261
97392780,1.3440860509
97396676,1.2658846378
97400604,1.2121212482
97404532,1.1974585056
97408444,1.1779080629
97412356,1.1192570924
97416264,1.0361680984
97420164,0.9628542900
97424068,0.9286412239
97427988,0.9042033195
97431892,0.8406646728
97435804,0.7575757503
97439708,0.6940371513
97443628,0.6793744087
97447540,0.6793744087
97451452,0.6549364089
97455356,0.6060606002
97459240,0.5767350673
97463140,0.6011730194
97467044,0.6451612472
97470964,0.6842619895
97474884,0.6891495704
97478796,0.7184750556
97482700,0.8064516067
97486612,0.8846529960
97490516,0.9335288047
97494428,0.9530791282
97498340,0.9481915473
97502256,0.9726295471
97506156,0.9921798706
97510060,0.9726295471
97513980,0.8846529960
97517884,0.7869012832
97521796,0.7086998939
97525692,0.6549364089
97529604,0.5913978099
97533516,0.4887585639
97537428,0.3567937374
97541348,0.2639296054
97545260,0.2003910064
97549148,0.1417399787
97553060,0.0928641223
97556980,0.0537634420
97560892,0.0342130994
97564804,0.0146627559
97568708,0.0244379281
97572628,0.0048875851
97576500,0.0000000000
97580324,0.0000000000
97584172,0.0097751703
97588060,0.0244379281
97591980,0.0195503416
97595900,0.0146627559
97599812,0.0488758563
97603724,0.1319648027
97607628,0.2248289346
97611548,0.3030303001
97615444,0.3665689229
97619364,0.4496578693
97623276,0.5718474864
97627176,0.7038123130
97631076,0.8064516067
97634988,0.8699902534
97638900,0.9384163856
97642816,1.0361680984
97646720,1.1485825777
97650644,1.2365591526
97654572,1.2658846378
97658492,1.2805473804
97662404,1.3294233083
97666324,1.3782991170
97670244,1.3831867027
97674148,1.3489736318
97678068,1.3049852848
97681988,1.2903225421
97685908,1.3000977039
97689812,1.3098728656
97693728,1.2463343143
97697648,1.1876833438
97701568,1.1681329011
97705488,1.1876833438
97709412,1.1827956390
97713328,1.1339198350
97717244,1.0752688646
97721144,1.0557184219
97725056,1.0703812837
97728972,1.0850440216
97732872,1.0752688646
97736788,1.0459432601
97740696,1.0508308410
97744600,1.0948191833
97748520,1.1290322542
97752444,1.1192570924
97756364,1.0850440216
97760272,1.0801564407
97764168,1.1094819307
97768072,1.1339198350
97771996,1.1143695116
97775920,1.0557184219
97779840,1.0166177749
97783756,0.9970674514
97787668,0.9921798706
97791580,0.9530791282
97795500,0.8846529960
97799412,0.8504399299
97803316,0.8455523490
97807212,0.8699902534
97811124,0.8699902534
97815028,0.8308895111
97818940,0.8064516067
97822844,0.8211143493
97826756,0.8651026725
97830668,0.9042033195
97834572,0.8895405769
97838476,0.8993157386
97842396,0.9530791282
97846304,1.0410556793
97850204,1.0850440216
97854112,1.0899316024
97858020,1.1192570924
97861940,1.2267839908
97865860,1.4320625305
97869772,1.6911046504
97873688,1.9892473220
97877604,2.3020527362
97881524,2.6197457313
97885452,2.8299119949
97889372,2.7761485576
97893292,2.4535679817
97897216,1.9745845794
97901136,1.4956011772
97905052,1.0899316024
97908964,0.8260019302
97912864,0.6695992469
97916772,0.6353860855
97920684,0.7038123130
97924588,0.8553275108
97928496,1.0215053558
97932408,1.1388074159
97936324,1.2023460865
97940228,1.2463343143
97944148,1.3098728656
97948064,1.3734115362
97951988,1.3929618644
97955912,1.3734115362
97959836,1.3636363744
97963764,1.3782991170
97967684,1.4173997879
97971612,1.4173997879
97975540,1.3880742835
97979460,1.3831867027
97983372,1.4027370452
97987292,1.4467253684
97991216,1.4565005302
97995124,1.4320625305
97999036,1.4173997879
98002964,1.4662756919
98006884,1.5249266624
98010812,1.5689149856
98014740,1.5689149856
98018668,1.5689149856
98022592,1.6129032135
98026516,1.6715541839
98030436,1.6911046504
98034348,1.6617790222
98038268,1.6422286987
98042192,1.6715541839
98046124,1.7204301357
98050032,1.7399804592
98053952,1.7155425548
98057880,1.6911046504
98061800,1.7106549739
98065716,1.7595307826
98069636,1.7937438488
98073556,1.7888562679
98077476,1.7741935253
98081408,1.8084066390
98085312,1.8719452857
98089228,1.9305962562
98093144,1.9257086753
98097048,1.9257086753
98100968,1.9599218368
98104884,2.0332355499
98108804,2.0967741012
98112724,2.1016616821
98116652,2.0869989395
98120564,2.0967741012
98124484,2.1456501483
98128404,2.1847507953
98132324,2.1749756336
98136252,2.1212120056
98140156,2.0967741012
98144068,2.1114368438
98147996,2.0967741012
98151908,2.0430107116
98155824,1.9501466751
98159748,1.8817204475
98163664,1.8475073814
98167584,1.8377322196
98171508,1.7937438488
98175440,1.7253177165
98179364,1.7057673931
98183296,1.7106549739
98187200,1.7448680400
98191108,1.7546432018
98195032,1.7302052974
98198952,1.7302052974
98202868,1.7741935253
98206800,1.8426198005
98210712,1.8866080284
98214628,1.8914956092
98218548,1.8914956092
98222472,1.9403715133
98226396,2.0087976455
98230308,2.0527858734
98234212,2.0527858734
98238132,2.0527858734
98242044,2.0869989395
98245964,2.1407625675
98249892,2.1798632144
98253812,2.1749756336
98257740,2.1652004718
98261660,2.1896383762
98265588,2.2385141849
98269516,2.2678396701
98273444,2.2385141849
98277364,2.1896383762
98281292,2.1798632144
98285212,2.1994135379
98289140,2.2091886997
98293052,2.1798632144
98296980,2.1212120056
98300892,2.0918865203
98304804,2.1114368438
98308732,2.1163244247
98312660,2.0674486160
98316572,2.0087976455
98320480,1.9892473220
98324392,1.9892473220
98328308,2.0087976455
98332216,1.9892473220
98336132,1.9501466751
98340048,1.9354838371
98343972,1.9696969985
98347888,1.9843597412
98351812,1.9696969985
98355736,1.9159335136
98359664,1.8866080284
98363576,1.9012707710
98367484,1.9305962562
98371408,1.9208210945
98375324,1.8817204475
98379240,1.8719452857
98383156,1.8817204475
98387072,1.9305962562
98390984,1.9403715133
98394904,1.9159335136
98398832,1.9012707710
98402744,1.9354838371
98406672,1.9794721603
98410584,1.9941349029
98414492,1.9696969985
98418416,1.9550342559
98422336,1.9843597412
98426260,2.0430107116
98430164,2.0723361968
98434076,2.0527858734
98437988,2.0381231307
98441900,2.0625610351
98445820,2.1065492630
98449740,2.1309874057
98453660,2.1065492630
98457572,2.0869989395
98461492,2.0918865203
98465404,2.1456501483
98469324,2.1847507953
98473236,2.1749756336
98477148,2.1505377292
98481052,2.1652004718
98484972,2.1945259571
98488900,2.2287390232
98492820,2.2091886997
98496732,2.1700880527
98500644,2.1652004718
98504556,2.2091886997
98508476,2.2531769275
98512404,2.2336266040
98516324,2.1994135379
98520244,2.2043011188
98524152,2.2531769275
98528068,2.2873899936
98531988,2.2727272510
98535908,2.2238514423
98539836,2.1994135379
98543764,2.2336266040
98547676,2.2580645084
98551588,2.2482893466
98555508,2.1994135379
98559436,2.1652004718
98563356,2.1603128910
98567268,2.1700880527
98571164,2.1309874057
98575068,2.0527858734
98578992,1.9843597412
98582920,1.9648094177
98586840,1.9696969985
98590756,1.9501466751
98594680,1.8963831901
98598596,1.8523949623
98602528,1.8572825431
98606456,1.8621701240
98610376,1.8670577049
98614292,1.8328446388
98618204,1.8132943153
98622132,1.8426198005
98626048,1.8963831901
98629968,1.9257086753
98633892,1.8914956092
98637808,1.8670577049
98641716,1.8914956092
98645640,1.9941349029
98649556,2.1456501483
98653476,2.3313782215
98657404,2.5708699226
98661316,2.8885631561
98665236,3.2306940555
98669148,3.4799609184
98673064,3.4604105949
98676972,3.1769306659
98680884,2.7614858150
98684796,2.2678396701
98688720,1.8230694770
98692628,1.4418377876
98696556,1.2023460865
98700476,1.1241446733
98704400,1.1876833438
98708316,1.3098728656
98712228,1.4125122070
98716148,1.4858260154
98720060,1.5493645668
98723988,1.6275659561
98727908,1.6764417648
98731836,1.6911046504
98735748,1.6617790222
98739676,1.6471162796
98743608,1.6715541839
98747532,1.7057673931
98751452,1.7057673931
98755372,1.6568914413
98759300,1.6275659561
98763220,1.6422286987
98767140,1.6862169265
98771056,1.6911046504
98774964,1.6617790222
98778884,1.6568914413
98782812,1.6911046504
98786728,1.7448680400
98790632,1.7790811061
98794544,1.7693059444
98798452,1.7644183635
98802384,1.7888562679
98806312,1.8279570579
98810224,1.8426198005
98814132,1.8181818962
98818044,1.7986314296
98821972,1.8181818962
98825900,1.8523949623
98829832,1.8719452857
98833760,1.8377322196
98837684,1.8035190582
98841596,1.7986314296
98845528,1.8377322196
98849456,1.8670577049
98853368,1.8523949623
98857292,1.8181818962
98861220,1.8328446388
98865140,1.8866080284
98869048,1.9305962562
98872968,1.9305962562
98876888,1.9012707710
98880800,1.9208210945
98884704,1.9599218368
98888624,1.9892473220
98892544,1.9599218368
98896464,1.8866080284
98900376,1.8426198005
98904296,1.8377322196
98908216,1.8328446388
98912132,1.7839686870
98916040,1.7008798122
98919956,1.6471162796
98923884,1.6373411178
98927812,1.6324535369
98931740,1.5982404708
98935644,1.5151515007
98939564,1.4613881111
98943492,1.4418377876
98947424,1.4271749496
98951348,1.3685239553
98955260,1.2707722187
98959180,1.1925709247
98963092,1.1534701585
98967008,1.1339198350
98970932,1.1045943498
98974840,1.0215053558
98978748,0.9677418708
98982660,0.9579667091
98986572,0.9775171279
98990492,0.9824047088
98994396,0.9237536430
98998308,0.8748778343
99002212,0.8797654151
99006132,0.9188660621
99010036,0.9286412239
99013956,0.9090909004
99017836,0.8895405769
99021740,0.9042033195
99025644,0.9530791282
99029556,0.9921798706
99033468,0.9970674514
99037380,0.9872922897
99041296,1.0166177749
99045196,1.0752688646
99049100,1.1192570924
99053016,1.1290322542
99056940,1.0997067642
99060840,1.1094819307
99064744,1.1485825777
99068668,1.1925709247
99072588,1.2023460865
99076508,1.1925709247
99080428,1.2023460865
99084340,1.2658846378
99088252,1.3343108892
99092168,1.3587487936
99096084,1.3343108892
99100004,1.3294233083
99103924,1.3636363744
99107860,1.4027370452
99111772,1.3831867027
99115700,1.3343108892
99119628,1.3147605657
99123556,1.3343108892
99127480,1.3587487936
99131404,1.3538612127
99135324,1.3049852848
99139236,1.2756597995
99143156,1.2903225421
99147076,1.3196481466
99151012,1.3147605657
99154924,1.2707722187
99158844,1.2072336673
99162764,1.2023460865
99166676,1.2267839908
99170588,1.2365591526
99174508,1.1974585056
99178420,1.1632453203
99182340,1.1534701585
99186248,1.1876833438
99190164,1.1974585056
99194072,1.1583577394
99197996,1.1192570924
99201916,1.1192570924
99205832,1.1730204820
99209748,1.2072336673
99213668,1.2023460865
99217588,1.1779080629
99221488,1.1876833438
99225412,1.2267839908
99229332,1.2707722187
99233244,1.2609970569
99237152,1.2365591526
99241068,1.2463343143
99244988,1.2805473804
99248900,1.2952101230
99252820,1.2805473804
99256732,1.2316715717
99260660,1.2316715717
99264588,1.2854349613
99268512,1.3391984701
99272436,1.3538612127
99276364,1.3343108892
99280292,1.3391984701
99284212,1.3782991170
99288116,1.4271749496
99292040,1.4369501113
99295964,1.4076246261
99299892,1.4076246261
99303816,1.4662756919
99307740,1.5395894050
99311652,1.5640274047
99315564,1.5444769859
99319484,1.5444769859
99323412,1.5786901473
99327332,1.6275659561
99331252,1.6520038604
99335156,1.6422286987
99339076,1.6275659561
99343004,1.6422286987
99346924,1.6666666030
99350844,1.6568914413
99354764,1.6031280517
99358676,1.5542521476
99362604,1.5542521476
99366532,1.5835777282
99370460,1.5982404708
99374372,1.5835777282
99378300,1.5640274047
99382204,1.5835777282
99386132,1.6373411178
99390056,1.6715541839
99393980,1.6520038604
99397892,1.6275659561
99401812,1.6422286987
99405736,1.6862169265
99409664,1.7106549739
99413580,1.6911046504
99417500,1.6568914413
99421432,1.6715541839
99425348,1.7204301357
99429256,1.8084066390
99433164,1.9208210945
99437068,2.0918865203
99440980,2.3655912876
99444912,2.7321603298
99448828,3.0596284866
99452752,3.2453567981
99456680,3.1867058277
99460600,2.9374389648
99464516,2.5610947608
99468428,2.1163244247
99472356,1.6813293457
99476284,1.3343108892
99480200,1.1436949968
99484112,1.1339198350
99488036,1.2365591526
99491956,1.3440860509
99495864,1.4320625305
99499780,1.5298142433
99503708,1.6422286987
99507636,1.7350928783
99511556,1.7644183635
99515480,1.7399804592
99519396,1.7350928783
99523320,1.7448680400
99527220,1.7350928783
99531140,1.6862169265
99535064,1.5933528900
99538980,1.5102639198
99542892,1.4711632728
99546820,1.4467253684
99550748,1.3978494453
99554668,1.3049852848
99558588,1.2072336673
99562504,1.1485825777
99566428,1.1192570924
99570348,1.0752688646
99574256,1.0068426132
99578176,0.9384163856
99582084,0.9188660621
99585988,0.9188660621
99589900,0.9188660621
99593812,0.8895405769
99597716,0.8748778343
99601636,0.8651026725
99605552,0.9090909004
99609436,0.9481915473
99613356,0.9530791282
99617268,0.9237536430
99621180,0.9335288047
99625080,1.0019550323
99628980,1.0752688646
99632888,1.0801564407
99636792,1.0703812837
99640704,1.0899316024
99644616,1.1436949968
99648536,1.2170088291
99652444,1.2170088291
99656356,1.2023460865
99660268,1.2072336673
99664180,1.2561094760
99668084,1.3000977039
99671980,1.3147605657
99675900,1.2952101230
99679820,1.3000977039
99683728,1.3587487936
99687652,1.4027370452
99691568,1.4222873687
99695484,1.3978494453
99699404,1.3880742835
99703328,1.4173997879
99707248,1.4565005302
99711156,1.4760508537
99715064,1.4271749496
99718988,1.3929618644
99722908,1.3929618644
99726828,1.4076246261
99730748,1.3831867027
99734668,1.3147605657
99738580,1.2561094760
99742492,1.2414467334
99746420,1.2658846378
99750340,1.2658846378
99754252,1.2365591526
99758168,1.2121212482
99762084,1.2365591526
99766012,1.3000977039
99769916,1.3538612127
99773856,1.3685239553
99777780,1.3929618644
99781704,1.4662756919
99785620,1.5640274047
99789532,1.6568914413
99793460,1.6959922313
99797392,1.7057673931
99801312,1.7399804592
99805228,1.7937438488
99809148,1.8377322196
99813072,1.8377322196
99816996,1.8230694770
99820920,1.8475073814
99824840,1.9061583518
99828756,1.9501466751
99832680,1.9599218368
99836608,1.9501466751
99840536,1.9599218368
99844452,2.0087976455
99848364,2.0527858734
99852268,2.0527858734
99856184,2.0283479690
99860092,2.0185728073
99864012,2.0576734542
99867932,2.0967741012
99871836,2.0869989395
99875740,2.0478982925
99879652,2.0234603881
99883564,2.0527858734
99887484,2.1065492630
99891404,2.1163244247
99895332,2.0772237777
99899236,2.0527858734
99903156,2.0821113586
99907076,2.1065492630
99910996,2.1016616821
99914916,2.0576734542
99918828,2.0283479690
99922740,2.0430107116
99926652,2.0821113586
99930572,2.1016616821
99934492,2.0576734542
99938404,2.0332355499
99942316,2.0674486160
99946220,2.1309874057
99950124,2.1749756336
99954052,2.1652004718
99957972,2.1260998249
99961892,2.1456501483
99965804,2.1945259571
99969732,2.2336266040
99973644,2.2189638614
99977564,2.1945259571
99981492,2.2043011188
99985404,2.2482893466
99989332,2.2922775745
99993252,2.2580645084
99997164,2.2238514423
100001084,2.2189638614
100005044,2.2678396701
100009004,2.2776148319
100012956,2.2385141849
100016924,2.1798632144
100020892,2.1603128910
100024844,2.1798632144
100028804,2.2140762805
100032756,2.1798632144
100036716,2.1358749866
100040676,2.1163244247
100044644,2.1358749866
100048604,2.1603128910
100052556,2.1407625675
100056516,2.0967741012
100060468,2.0918865203
100064420,2.1163244247
100068384,2.1407625675
100072340,2.1065492630
100076292,2.0478982925
100080244,2.0332355499
100084196,2.0478982925
100088156,2.0674486160
100092100,2.0332355499
100096056,1.9696969985
100100004,1.9110459327
100103968,1.9061583518
100107928,1.9208210945
100111884,1.8768328666
100115844,1.8181818962
100119812,1.7888562679
100123776,1.8084066390
100127712,1.8475073814
100131672,1.8523949623
100135636,1.8181818962
100139604,1.8035190582
100143560,1.8377322196
100147524,1.8768328666
100151488,1.8719452857
100155448,1.8523949623
100159404,1.8132943153
100163376,1.8426198005
100167328,1.8963831901
100171276,1.9110459327
100175232,1.9061583518
100179188,1.9501466751
100183132,2.1016616821
100187084,2.3216030597
100191036,2.5904202461
100194996,2.8787879943
100198956,3.1769306659
100202916,3.4555230140
100206876,3.5826001167
100210836,3.4115347862
100214804,3.0205278396
100218748,2.5317692756
100222708,2.1016616821
100226676,1.7937438488
100230640,1.5933528900
100234592,1.4858260154
100238548,1.5053763389
100242500,1.6422286987
100246456,1.8377322196
100250424,1.9990224838
100254380,2.0967741012
100258340,2.1652004718
100262284,2.2434017658
100266236,2.3411533832
100270196,2.4242424964
100274140,2.4731183052
100278112,2.4975562095
100282068,2.5562071800
100286020,2.6441838741
100289992,2.7028348445
100293948,2.7077224254
100297908,2.7126100063
100301868,2.7468230724
100305820,2.8054740905
100309764,2.8250244140
100313724,2.7908113002
100317676,2.7370479106
100321636,2.7223851680
100325600,2.7419354915
100329556,2.7517106533
100333516,2.7126100063
100337468,2.6735093593
100341428,2.6686217784
100345392,2.7028348445
100349348,2.7272727489
100353316,2.6881721019
100357260,2.6441838741
100361212,2.6588466167
100365180,2.6832845211
100369140,2.7077224254
100373100,2.6783969402
100377052,2.6148581504
100381012,2.6001954078
100384960,2.6246333122
100388916,2.6490714550
100392860,2.6197457313
100396828,2.5659823417
100400788,2.5562071800
100404740,2.5806450843
100408692,2.6099705696
100412644,2.5904202461
100416588,2.5366568565
100420548,2.5268816947
100424508,2.5610947608
100428460,2.5953078269
100432412,2.5757575035
100436372,2.5171065330
100440324,2.4926686286
100444276,2.5219941139
100448228,2.5366568565
100452196,2.5073313713
100456148,2.4389052391
100460108,2.3949170112
100464068,2.3753666877
100468028,2.3655912876
100471988,2.3069403171
The trick seems to be the initial conditions. Setting the first 13 values of the input and of the low-pass filter output to zero makes the bias go away.
#low-pass filter
def lpf(x):
    y = x.copy()
    for n in range(13):
        y[n,1] = 0
        x[n,1] = 0
    for n in range(len(x)):
        if(n < 12):
            continue
        y[n,1] = 2*y[n-1,1] - y[n-2,1] + x[n,1] - 2*x[n-6,1] + x[n-12,1]
    return y
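For reference, the same difference equations can also be run through scipy.signal.lfilter, which starts from an all-zero internal state; that plays the same role as zeroing the start-up samples above, so no growing bias appears. This is a sketch under the assumption that scipy is available; the function names are mine, not from the post.
import numpy as np
from scipy.signal import lfilter

def pan_tompkins_lpf(sig):
    # y[n] = 2*y[n-1] - y[n-2] + x[n] - 2*x[n-6] + x[n-12]
    b = np.zeros(13)
    b[0], b[6], b[12] = 1.0, -2.0, 1.0
    a = np.array([1.0, -2.0, 1.0])
    # the double pole at z = 1 is only cancelled by the numerator zeros when the
    # recursion starts from a consistent (zero) state; lfilter's default zero
    # initial conditions provide exactly that
    return lfilter(b, a, sig)

def pan_tompkins_hpf(sig):
    # y[n] = y[n-1] - x[n]/32 + x[n-16] - x[n-17] + x[n-32]/32
    b = np.zeros(33)
    b[0], b[16], b[17], b[32] = -1/32, 1.0, -1.0, 1/32
    a = np.array([1.0, -1.0])
    return lfilter(b, a, sig)

# usage on the question's data: filtered = pan_tompkins_hpf(pan_tompkins_lpf(ecg[:, 1]))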

Gradients vanishing despite using Kaiming initialization

I was implementing a conv block in PyTorch with an activation function (PReLU). I used Kaiming initialization to initialize all my weights and set all the biases to zero. However, when I tested these blocks (by stacking 100 such conv+activation blocks on top of each other), I noticed that the output values I am getting are of the order of 10^(-10). Is this normal, considering I am stacking up to 100 layers? Adding a small bias to each layer fixes the problem, but in Kaiming initialization the biases are supposed to be zero.
Here is the conv block code:
import numpy as np
import torch
import torch.nn as nn
from collections.abc import Iterable   # collections.Iterable was removed in Python 3.10

def same_padding(kernel_size):
    # helper not shown in the original post; assumed to return the symmetric
    # padding that keeps the spatial size unchanged for odd kernel sizes
    return kernel_size // 2

def convBlock(
    input_channels, output_channels, kernel_size=3, padding=None, activation="prelu"
):
    """
    Initializes a conv block using Kaiming initialization
    """
    padding_par = 0
    if padding == "same":
        padding_par = same_padding(kernel_size)
    conv = nn.Conv2d(input_channels, output_channels, kernel_size, padding=padding_par)
    relu_negative_slope = 0.25
    act = None
    if activation == "prelu" or activation == "leaky_relu":
        nn.init.kaiming_normal_(conv.weight, a=relu_negative_slope, mode="fan_in")
        if activation == "prelu":
            act = nn.PReLU(init=relu_negative_slope)
        else:
            act = nn.LeakyReLU(negative_slope=relu_negative_slope)
    if activation == "relu":
        nn.init.kaiming_normal_(conv.weight, nonlinearity="relu")
        act = nn.ReLU()
    nn.init.constant_(conv.bias.data, 0)
    block = nn.Sequential(conv, act)
    return block

def flatten(lis):
    for item in lis:
        if isinstance(item, Iterable) and not isinstance(item, str):
            for x in flatten(item):
                yield x
        else:
            yield item

def Sequential(args):
    flattened_args = list(flatten(args))
    return nn.Sequential(*flattened_args)
This is the test code:
ls = []
for i in range(100):
    ls.append(convBlock(3, 3, 3, "same"))
model = Sequential(ls)
test = np.ones((1, 3, 5, 5))
model(torch.Tensor(test))
And the output I am getting is
tensor([[[[-1.7771e-10, -3.5088e-10, 5.9369e-09, 4.2668e-09, 9.8803e-10],
[ 1.8657e-09, -4.0271e-10, 3.1189e-09, 1.5117e-09, 6.6546e-09],
[ 2.4237e-09, -6.2249e-10, -5.7327e-10, 4.2867e-09, 6.0034e-09],
[-1.8757e-10, 5.5446e-09, 1.7641e-09, 5.7018e-09, 6.4347e-09],
[ 1.2352e-09, -3.4732e-10, 4.1553e-10, -1.2996e-09, 3.8971e-09]],
[[ 2.6607e-09, 1.7756e-09, -1.0923e-09, -1.4272e-09, -1.1840e-09],
[ 2.0668e-10, -1.8130e-09, -2.3864e-09, -1.7061e-09, -1.7147e-10],
[-6.7161e-10, -1.3440e-09, -6.3196e-10, -8.7677e-10, -1.4851e-09],
[ 3.1475e-09, -1.6574e-09, -3.4180e-09, -3.5224e-09, -2.6642e-09],
[-1.9703e-09, -3.2277e-09, -2.4733e-09, -2.3707e-09, -8.7598e-10]],
[[ 3.5573e-09, 7.8113e-09, 6.8232e-09, 1.2285e-09, -9.3973e-10],
[ 6.6368e-09, 8.2877e-09, 9.2108e-10, 9.7531e-10, 7.0011e-10],
[ 6.6954e-09, 9.1019e-09, 1.5128e-08, 3.3151e-09, 2.1899e-10],
[ 1.2152e-08, 7.7002e-09, 1.6406e-08, 1.4948e-08, -6.0882e-10],
[ 6.9930e-09, 7.3222e-09, -7.4308e-10, 5.2505e-09, 3.4365e-09]]]],
grad_fn=<PreluBackward>)
Amazing question (and welcome to StackOverflow)! Research paper for quick reference.
TLDR
Try wider networks (64 channels)
Add Batch Normalization after activation (or even before, shouldn't make much difference)
Add residual connections (shouldn't improve much over batch norm, last resort)
Please check these out in this order and leave a comment on what (if anything) worked in your case (I'm also curious).
Things you do differently
Your neural network is very deep, yet very narrow (only 81 parameters per layer!)
Due to the above, one cannot reliably create those weights from a normal distribution, as the sample is just too small (see the quick parameter count after this list).
Try wider networks, 64 channels or more.
You are trying a much deeper network than they did.
Section: Comparison Experiments
We conducted comparisons on a deep but efficient model with 14 weight
layers (actually 22 was also tested in comparison with Xavier)
That was due to the release date of this paper (2015) and hardware limitations "back in the days" (let's say).
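To make the 81-parameter point concrete, a quick count (my own arithmetic, not from the answer):
# weights per 3x3 conv layer (bias excluded)
narrow = 3 * 3 * 3 * 3      # 3 -> 3 channels:      81 weights
wide = 64 * 64 * 3 * 3      # 64 -> 64 channels: 36,864 weights
print(narrow, wide)         # 81 36864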
Is this normal?
The approach itself is quite strange with layers of this depth, at least currently:
each conv block is usually followed by an activation like ReLU and by Batch Normalization (which normalizes the signal and helps with exploding/vanishing signals), as sketched below;
networks of this depth (even half the depth you've got) usually also use residual connections (though this is not directly linked to vanishing/small signals, and more connected to the degradation problem of very deep networks, like 1000 layers).
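A minimal sketch of the batch-normalization suggestion, adapted from the question's convBlock (the function name and defaults here are mine, not from the post):
import torch.nn as nn

def conv_bn_block(in_ch, out_ch, kernel_size=3, negative_slope=0.25):
    # Conv -> PReLU -> BatchNorm: the normalization rescales the activations in
    # every block, so the signal no longer shrinks multiplicatively with depth.
    conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
    nn.init.kaiming_normal_(conv.weight, a=negative_slope, mode="fan_in")
    nn.init.constant_(conv.bias, 0)
    return nn.Sequential(conv, nn.PReLU(init=negative_slope), nn.BatchNorm2d(out_ch))

# e.g. stacking 100 of these blocks keeps the output at a sensible scale:
# model = nn.Sequential(*[conv_bn_block(3, 3) for _ in range(100)])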

PuLP solvers do not respond to options fed to them

So I've got a fairly large optimization problem and I'm trying to solve it within a sensible amount of time.
I've set it up as:
import pulp as pl
my_problem = pl.LpProblem("My problem", pl.LpMinimize)
# write to problem file
my_problem.writeLP("MyProblem.lp")
And then tried, alternatively:
solver = CPLEX_CMD(timeLimit=1, gapRel=0.1)
status = my_problem .solve(solver)
solver = pl.apis.CPLEX_CMD(timeLimit=1, gapRel=0.1)
status = my_problem .solve(solver)
path_to_cplex = r'C:\Program Files\IBM\ILOG\CPLEX_Studio1210\cplex\bin\x64_win64\cplex.exe' # and yes this is the actual path on my machine
solver = pl.apis.cplex_api.CPLEX_CMD(timeLimit=1, gapRel=0.1, path=path_to_cplex)
status = my_problem .solve(solver)
solver = pl.apis.cplex_api.CPLEX_CMD(timeLimit=1, gapRel=0.1, path=path_to_cplex)
status = my_problem .solve(solver)
It runs in each case.
However, the solver does not respond to the timeLimit or gapRel instructions.
If I use timelimit it does warn that this is deprecated in favour of timeLimit. Same for fracgap: it tells me I should use relGap. So somehow I am talking to the solver.
However, no matter what values I pick for timeLimit and relGap, it always returns the exact same answer and takes the exact same amount of time (several minutes).
Also, I have tried alternative solvers, and I cannot get any of them to accept their variants of time limits or optimization gaps.
In each case, the problem solves and returns a status: optimal message. But it just ignores the time limit and gap instructions.
Any ideas?
An example out of the zoo:
import pulp
import cplex

bus_problem = pulp.LpProblem("bus", pulp.LpMinimize)
nbBus40 = pulp.LpVariable('nbBus40', lowBound=0, cat='Integer')
nbBus30 = pulp.LpVariable('nbBus30', lowBound=0, cat='Integer')
# Objective function
bus_problem += 500 * nbBus40 + 400 * nbBus30, "cost"
# Constraints
bus_problem += 40 * nbBus40 + 30 * nbBus30 >= 300
solver = pulp.CPLEX_CMD(options=['set timelimit 40'])
bus_problem.solve(solver)
print(pulp.LpStatus[bus_problem.status])
for variable in bus_problem.variables():
    print("{} = {}".format(variable.name, variable.varValue))
The correct way to pass solver options as a dictionary:
pulp.CPLEX_CMD(options={'timelimit': 40})
@Alex Fleisher has it correct with pulp.CPLEX_CMD(options=['set timelimit 40']). This also works for CBC, using the following syntax:
prob.solve(COIN_CMD(options=['sec 60','Presolve More','Multiple 15', 'Node DownFewest','HEUR on', 'Round On','PreProcess Aggregate','PassP 10','PassF 40','Strong 10','Cuts On', 'Gomory On', 'CutD -1', 'Branch On', 'Idiot -1', 'sprint -1','Reduce On','Two On'], msg=True))
It is important to understand that the parameters, and their associated options, are specific to each solver. PuLP seems to be calling CBC via the command line, so an investigation of those things is required. Hope that helps.
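One way to check whether a limit is actually being honoured (my own sketch, assuming a recent PuLP where the bundled CBC solver accepts the unified timeLimit/gapRel keyword arguments) is simply to time the solve:
import time
import pulp

# hypothetical check on the bus_problem from the example above
solver = pulp.PULP_CBC_CMD(timeLimit=40, gapRel=0.1, msg=True)
start = time.time()
status = bus_problem.solve(solver)
print(pulp.LpStatus[status], "solved in %.1f s" % (time.time() - start))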

scipy.optimize.minimize() not converging giving success=False

I recently tried to implement the backpropagation algorithm in Python. I tried fmin_tnc and bfgs, but neither of them actually worked, so please help me figure out the problem.
import numpy as np
import scipy.optimize as op

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def costFunction(nnparams, X, y, input_layer_size=400, hidden_layer_size=25, num_labels=10, lamda=1):
    Theta1 = np.reshape(nnparams[0:hidden_layer_size * (input_layer_size + 1)],
                        (hidden_layer_size, (input_layer_size + 1)))
    Theta2 = np.reshape(nnparams[(hidden_layer_size * (input_layer_size + 1)):],
                        (num_labels, hidden_layer_size + 1))
    m = X.shape[0]
    J = 0
    y = y.reshape(m, 1)
    Theta1_grad = np.zeros(Theta1.shape)
    Theta2_grad = np.zeros(Theta2.shape)
    X = np.concatenate([np.ones([m, 1]), X], 1)
    # vectorized forward pass for the cost
    a2 = sigmoid(Theta1.dot(X.T))
    a2 = np.concatenate([np.ones([1, a2.shape[1]]), a2])
    h = sigmoid(Theta2.dot(a2))
    c = np.array(range(1, 11))
    y = y == c            # one-hot encode the labels 1..10
    for i in range(y.shape[0]):
        J = J + (-1 / m) * np.sum(y[i, :] * np.log(h[:, i]) + (1 - y[i, :]) * np.log(1 - h[:, i]))
    DEL2 = np.zeros(Theta2.shape)
    DEL1 = np.zeros(Theta1.shape)
    # backpropagation, one example at a time
    for i in range(m):
        z2 = Theta1.dot(X[i, :].T)
        a2 = sigmoid(z2).reshape(-1, 1)
        a2 = np.concatenate([np.ones([1, a2.shape[1]]), a2])
        z3 = Theta2.dot(a2)
        # print('z3 shape', z3.shape)
        a3 = sigmoid(z3).reshape(-1, 1)
        # print('a3 shape = ', a3.shape)
        delta3 = (a3 - y[i, :].T.reshape(-1, 1))
        # print('y shape ', y[i, :].T.shape)
        delta2 = ((Theta2.T.dot(delta3)) * (a2 * (1 - a2)))
        # print('shapes = ', delta3.shape, a3.shape)
        DEL2 = DEL2 + delta3.dot(a2.T)
        DEL1 = DEL1 + (delta2[1, :]) * (X[i, :])
    Theta1_grad = np.zeros(np.shape(Theta1))
    Theta2_grad = np.zeros(np.shape(Theta2))
    Theta1_grad[:, 0] = (DEL1[:, 0] * (1 / m))
    Theta1_grad[:, 1:] = (DEL1[:, 1:] * (1 / m)) + (lamda / m) * (Theta1[:, 1:])
    Theta2_grad[:, 0] = (DEL2[:, 0] * (1 / m))
    Theta2_grad[:, 1:] = (DEL2[:, 1:] * (1 / m)) + (lamda / m) * (Theta2[:, 1:])
    grad = np.concatenate([Theta1_grad.reshape(-1, 1), Theta2_grad.reshape(-1, 1)])
    return J, grad
This is how I called the function (op is scipy.optimize)
r2=op.minimize(fun=costFunction, x0=nnparams, args=(X, dataY.flatten()),
method='TNC', jac=True, options={'maxiter': 400})
r2 is like this
fun: 3.1045444063663266
jac: array([[-6.73218494e-04],
[-8.93179045e-05],
[-1.13786179e-04],
...,
[ 1.19577741e-03],
[ 5.79555099e-05],
[ 3.85717533e-03]])
message: 'Linear search failed'
nfev: 140
nit: 5
status: 4
success: False
x: array([-0.97996948, -0.44658952, -0.5689309 , ..., 0.03420931,
-0.58005183, -0.74322735])
Please help me find the correct way of minimizing this function. Thanks in advance.
Finally solved it. The problem was that I used np.random.randn() to generate the random Theta values, which draws from a standard normal distribution; too many values fell within the same narrow range, which led to symmetry in the theta values. Because of this symmetry, the optimization terminates in the middle of the process.
The simple solution was to use np.random.rand() (which provides a uniform random distribution) instead of np.random.randn().
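A minimal sketch of that uniform initialization (the epsilon_init value is my choice for illustration, not something stated in the answer):
import numpy as np

def rand_initialize_weights(l_in, l_out, epsilon_init=0.12):
    # uniform values in [-epsilon_init, epsilon_init]; breaks symmetry between units
    return np.random.rand(l_out, 1 + l_in) * 2 * epsilon_init - epsilon_init

Theta1 = rand_initialize_weights(400, 25)   # hidden layer
Theta2 = rand_initialize_weights(25, 10)    # output layer
nnparams = np.concatenate([Theta1.ravel(), Theta2.ravel()])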

How does sklearn.linear_model.LinearRegression work with insufficient data?

To solve a 5-parameter model, I need at least 5 data points to get a unique solution. For the x and y data below:
import numpy as np
x = np.array([[-0.24155831, 0.37083184, -1.69002708, 1.4578805 , 0.91790011,
0.31648635, -0.15957368],
[-0.37541846, -0.14572825, -2.19695883, 1.01136142, 0.57288752,
0.32080956, -0.82986857],
[ 0.33815532, 3.1123936 , -0.29317028, 3.01493602, 1.64978158,
0.56301755, 1.3958912 ],
[ 0.84486735, 4.74567324, 0.7982888 , 3.56604097, 1.47633894,
1.38743513, 3.0679506 ],
[-0.2752026 , 2.9110031 , 0.19218081, 2.0691105 , 0.49240373,
1.63213241, 2.4235483 ],
[ 0.89942508, 5.09052174, 1.26048572, 3.73477373, 1.4302902 ,
1.91907482, 3.70126468]])
y = np.array([-0.81388378, -1.59719762, -0.08256274, 0.61297275, 0.99359647,
1.11315445])
I used only 6 data points to fit an 8-parameter model (7 slopes and 1 intercept).
from sklearn.linear_model import LinearRegression
lr = LinearRegression().fit(x, y)
print(lr.coef_)
array([-0.83916772, -0.57249998, 0.73025938, -0.02065629, 0.47637768,
-0.36962192, 0.99128474])
print(lr.intercept_)
0.2978781587718828
Clearly, it's using some kind of assignment to reduce the degrees of freedom. I tried to look into the source code but couldn't find anything about that. What method do they use to find the parameters of an under-specified model?
You don't need to reduce the degrees of freedom; it simply finds a solution to the least-squares problem min_beta sum_i (dot(beta, x_i) + beta_0 - y_i)**2. For example, in the non-sparse case it uses linalg.lstsq from scipy. The default solver for this optimization problem is the gelsd LAPACK driver. If
A = np.concatenate((ones_v, X), axis=1)
is the augmented array with ones as its first column, then your solution is given by
beta = np.linalg.pinv(A.T @ A) @ A.T @ y
where we use the pseudoinverse precisely because the matrix may not be of full rank. Of course, the solver doesn't actually use this formula; it uses a singular value decomposition of A to compute the solution.
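A minimal sketch of that formula on the question's data (note: scikit-learn centers x and y and fits the intercept separately, so its minimum-norm coefficients can differ slightly from the minimum-norm solution of the augmented system below):
import numpy as np

# x, y as defined in the question
A = np.concatenate((np.ones((x.shape[0], 1)), x), axis=1)   # prepend a column of ones
beta = np.linalg.pinv(A.T @ A) @ A.T @ y                    # minimum-norm least-squares solution
print("intercept:", beta[0])
print("slopes:", beta[1:])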
