scikit-learn roc_curve: why does it sometimes return a threshold value of 2? - scikit-learn
Correct me if I'm wrong: the "thresholds" returned by scikit-learn's roc_curve should be an array of numbers in [0,1]. However, it sometimes gives me an array whose first number is close to 2. Is this a bug, or did I do something wrong? Thanks.
In [1]: import numpy as np
In [2]: from sklearn.metrics import roc_curve
In [3]: np.random.seed(11)
In [4]: aa = np.random.choice([True, False],100)
In [5]: bb = np.random.uniform(0,1,100)
In [6]: fpr,tpr,thresholds = roc_curve(aa,bb)
In [7]: thresholds
Out[7]:
array([ 1.97396826, 0.97396826, 0.9711752 , 0.95996265, 0.95744405,
0.94983331, 0.93290463, 0.93241372, 0.93214862, 0.93076592,
0.92960511, 0.92245024, 0.91179548, 0.91112166, 0.87529458,
0.84493853, 0.84068543, 0.83303741, 0.82565223, 0.81096657,
0.80656679, 0.79387241, 0.77054807, 0.76763223, 0.7644911 ,
0.75964947, 0.73995152, 0.73825262, 0.73466772, 0.73421299,
0.73282534, 0.72391126, 0.71296292, 0.70930102, 0.70116428,
0.69606617, 0.65869235, 0.65670881, 0.65261474, 0.6487222 ,
0.64805644, 0.64221486, 0.62699782, 0.62522484, 0.62283401,
0.61601839, 0.611632 , 0.59548669, 0.57555854, 0.56828967,
0.55652111, 0.55063947, 0.53885029, 0.53369398, 0.52157349,
0.51900774, 0.50547317, 0.49749635, 0.493913 , 0.46154029,
0.45275916, 0.44777116, 0.43822067, 0.43795921, 0.43624093,
0.42039077, 0.41866343, 0.41550367, 0.40032843, 0.36761763,
0.36642721, 0.36567017, 0.36148354, 0.35843793, 0.34371331,
0.33436415, 0.33408289, 0.33387442, 0.31887024, 0.31818719,
0.31367915, 0.30216469, 0.30097917, 0.29995201, 0.28604467,
0.26930354, 0.2383461 , 0.22803687, 0.21800338, 0.19301808,
0.16902881, 0.1688173 , 0.14491946, 0.13648451, 0.12704826,
0.09141459, 0.08569481, 0.07500199, 0.06288762, 0.02073298,
0.01934336])
Most of the time these thresholds are not used, for example in calculating the area under the curve, or plotting the False Positive Rate against the True Positive Rate.
Yet to plot a reasonable-looking curve, one needs a threshold at which no data points are predicted positive. Since scikit-learn's ROC curve function does not require normalised probabilities as scores (any score is fine), setting this point's threshold to 1 isn't sufficient; setting it to inf is sensible, but coders often expect finite data (and it's possible the implementation also works for integer thresholds). Instead the implementation uses max(score) + epsilon, where epsilon = 1. This may be cosmetically deficient, but you haven't given any reason why it's a problem!
From the documentation:
thresholds : array, shape = [n_thresholds]
Decreasing thresholds on the decision function used to compute
fpr and tpr. thresholds[0] represents no instances being predicted
and is arbitrarily set to max(y_score) + 1.
So the first element of thresholds is close to 2 because it is max(y_score) + 1, in your case thresholds[1] + 1.
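To make the documented convention concrete, here is a minimal pure-Python sketch of how ROC points and their thresholds can be built. It is not scikit-learn's implementation; `roc_points` is a hypothetical helper, and the sentinel `max(score) + 1` simply mirrors the documentation quoted above (other scikit-learn versions may use a different sentinel, so check the documentation for yours).

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr, thresholds) for binary labels; illustration only."""
    pos = sum(1 for y in y_true if y)
    neg = len(y_true) - pos
    # Sort by decreasing score so that each step lowers the threshold.
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    # Sentinel above every score: no instance is predicted positive here.
    thresholds = [pairs[0][0] + 1]
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(pairs):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample sharing this score.
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            thresholds.append(score)
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr, thresholds


y_true = [True, False, True, False]
y_score = [0.875, 0.75, 0.375, 0.125]
fpr, tpr, thresholds = roc_points(y_true, y_score)
print(thresholds)      # first entry is the sentinel, the rest are real scores
print(thresholds[1:])  # drop the sentinel if only real scores are wanted
```

If the extra value is a nuisance, slicing with `thresholds[1:]` (together with `fpr[1:]` and `tpr[1:]`) leaves only thresholds that correspond to actual scores.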
This seems like a bug to me: in roc_curve(aa,bb), 1 is added to the first threshold. You could open an issue here: https://github.com/scikit-learn/scikit-learn/issues
Related
detect highest peaks automatically from noisy data python
Is there any way to detect the highest peaks using a python library without setting any parameter?. I'm developing a user interface and I want the algorithm to be able to detect highest peaks automatically... I want it to be able to detect these peaks in picture below: graph here Data looks like this: 8.60291e-07 -1.5491e-06 5.64568e-07 -9.51195e-07 1.07203e-06 4.6521e-07 6.43967e-07 -9.86092e-07 -9.82323e-07 6.38977e-07 -1.93884e-06 -2.98309e-08 1.33543e-06 1.05064e-06 1.17332e-06 -1.53549e-07 -8.9357e-07 1.59176e-06 -2.17331e-06 1.46756e-06 5.63301e-07 -8.77556e-07 7.47681e-09 -8.30101e-07 -3.6647e-07 5.27046e-07 -1.94983e-06 1.89018e-07 1.22533e-06 8.00735e-07 -8.51166e-07 1.13437e-06 -2.75787e-07 1.79601e-06 -1.67875e-06 1.13529e-06 -1.29865e-06 9.9688e-07 -9.34486e-07 8.89931e-07 -3.88634e-07 1.15124e-06 -4.23569e-07 -1.8029e-07 1.20537e-07 4.10736e-07 -9.99077e-07 -3.62984e-07 2.97916e-06 -1.95828e-06 -1.07398e-06 2.422e-06 -6.33202e-07 -1.36953e-06 1.6694e-06 -4.71764e-07 3.98849e-07 -1.0071e-06 -9.72984e-07 8.13553e-07 2.64193e-06 -3.12365e-06 1.34049e-06 -1.30419e-06 1.48369e-07 1.26033e-06 -2.59872e-07 4.28284e-07 -6.44356e-07 2.99934e-07 8.34335e-07 3.53226e-07 -7.08252e-07 4.1243e-07 2.41525e-06 -8.92159e-07 8.82339e-08 4.31945e-06 3.75152e-06 1.091e-06 3.8204e-06 -1.21356e-06 3.35564e-06 -1.06234e-06 -5.99808e-07 2.18155e-06 5.90652e-07 -1.36728e-06 -4.97017e-07 -7.77283e-08 8.68263e-07 4.37645e-07 -1.26514e-06 2.26413e-06 -8.52966e-07 -7.35596e-07 4.11911e-07 1.7585e-06 -inf 1.10779e-08 -1.49507e-06 9.87305e-07 -3.85296e-06 4.31265e-06 -9.89227e-07 -1.33537e-06 4.1713e-07 1.89362e-07 3.21968e-07 6.80237e-08 2.31636e-07 -2.98523e-07 7.99133e-07 7.36305e-07 6.39862e-07 -1.11932e-06 -1.57262e-06 1.86305e-06 -3.63716e-07 3.83865e-07 -5.23293e-07 1.31812e-06 -1.23608e-06 2.54684e-06 -3.99796e-06 2.90441e-06 -5.20203e-07 1.36295e-06 -1.89317e-06 1.22366e-06 -1.10373e-06 2.71276e-06 9.48181e-07 7.70881e-06 5.17066e-06 6.21254e-06 1.3513e-05 1.47878e-05 
8.78543e-06 1.61819e-05 1.68438e-05 1.16082e-05 5.74059e-06 4.92458e-06 1.11884e-06 -1.07419e-06 -1.28517e-06 -2.70949e-06 1.65662e-06 1.42964e-06 3.40604e-06 -5.82825e-07 1.98288e-06 1.42819e-06 1.65517e-06 4.42749e-07 -1.95609e-06 -2.1756e-07 1.69164e-06 8.7204e-08 -5.35324e-07 7.43546e-07 -1.08687e-06 2.07289e-06 2.18529e-06 -2.8161e-06 1.88821e-06 4.07272e-07 1.063e-06 8.47244e-07 1.53879e-06 -9.0799e-07 -1.26709e-07 2.40044e-06 -9.48166e-07 1.41788e-06 3.67615e-07 -1.29199e-06 3.868e-06 9.54654e-06 2.51951e-05 2.2769e-05 7.21716e-06 1.36545e-06 -1.32681e-06 -3.09641e-06 4.90417e-07 2.99335e-06 1.578e-06 6.0025e-07 2.90656e-06 -2.08258e-06 -1.54214e-06 2.19757e-07 3.74982e-06 -1.76944e-06 2.15018e-06 -1.01935e-06 4.37469e-07 1.39078e-06 6.39587e-07 -1.7807e-06 -6.16455e-09 1.61557e-06 1.59644e-06 -2.35217e-06 5.29449e-07 1.9169e-06 -7.54822e-07 2.00342e-06 -3.28452e-06 3.91663e-06 1.66016e-08 -2.65897e-06 -1.4064e-06 4.67987e-07 1.67786e-06 4.69543e-07 -8.90106e-07 -1.4584e-06 1.37915e-06 1.98483e-06 -2.3735e-06 4.45618e-07 1.91504e-06 1.09653e-06 -8.00873e-07 1.32321e-06 2.04846e-06 -1.50656e-06 7.23816e-07 2.06049e-06 -2.43918e-06 1.64417e-06 2.65411e-07 -2.66107e-06 -8.01788e-07 2.05121e-06 -1.74988e-06 1.83594e-06 -8.14026e-07 -2.69342e-06 1.81152e-06 1.11664e-07 -4.21863e-06 -7.20551e-06 -5.92407e-07 -1.44629e-06 -2.08136e-06 2.86105e-06 3.77911e-06 -1.91898e-06 1.41742e-06 2.67914e-07 -8.55835e-07 -9.8584e-07 -2.74115e-06 3.39044e-06 1.39639e-06 -2.4964e-06 8.2486e-07 2.02432e-06 1.65793e-06 -1.43094e-06 -3.36807e-06 -8.96515e-07 5.31323e-06 -8.27209e-07 -1.39221e-06 -3.3754e-06 2.12372e-06 3.08218e-06 -1.42947e-06 -2.36777e-06 3.86218e-06 2.29327e-06 -3.3941e-06 -1.67291e-06 2.63828e-06 2.21008e-07 7.07794e-07 1.8172e-06 -2.00082e-06 1.80664e-06 6.69739e-07 -3.95395e-06 1.92148e-06 -1.07187e-06 -4.04938e-07 -1.76553e-06 2.7099e-06 1.30768e-06 1.41812e-06 -1.55518e-07 -3.78302e-06 4.00137e-06 -8.38623e-07 4.54651e-07 1.00027e-06 1.32196e-06 -2.62717e-06 
1.67865e-06 -6.99249e-07 2.8837e-06 -1.00516e-06 -3.68011e-06 1.61847e-06 1.90887e-06 1.59641e-06 4.16779e-07 -1.35245e-06 1.65717e-06 -2.92667e-06 3.6203e-07 2.53528e-06 -2.0578e-07 -3.41919e-07 -1.42154e-06 -2.33322e-06 3.07175e-06 -2.69165e-08 -8.21045e-07 2.3175e-06 -7.22992e-07 1.49069e-06 8.75488e-07 -2.02676e-06 -2.81158e-07 3.6004e-06 -3.94708e-06 4.72983e-06 -1.38873e-06 -6.92139e-08 -1.4678e-06 1.04251e-06 -2.06625e-06 3.10406e-06 -8.13873e-07 7.23694e-07 -9.78912e-07 -8.65967e-07 7.37335e-07 1.52563e-06 -2.33591e-06 1.78265e-06 9.58435e-07 -5.22064e-07 -2.29736e-07 -4.26996e-06 -6.61411e-06 1.14789e-06 -4.32697e-06 -5.32779e-06 2.12241e-06 -1.40726e-06 1.76086e-07 -3.77194e-06 -2.71326e-06 -9.49402e-08 1.70807e-07 -2.495e-06 4.22324e-06 -3.62476e-06 -9.56055e-07 7.16583e-07 3.01447e-06 -1.41229e-06 -1.67694e-06 7.61627e-07 3.55881e-06 2.31015e-06 -9.50378e-07 4.45251e-08 -1.94791e-06 2.27081e-06 -3.34717e-06 3.05688e-06 4.57062e-07 3.87326e-06 -2.39215e-06 -3.52682e-06 -2.05212e-06 5.26495e-06 -3.28613e-07 -5.76569e-07 -7.46338e-07 5.98795e-06 8.80493e-07 -4.82965e-06 2.56839e-06 -1.58792e-06 -2.2294e-06 1.83841e-06 2.65482e-06 -3.10474e-06 -3.46741e-07 2.45557e-06 2.01328e-06 -3.92606e-06 inf -8.11737e-07 5.72174e-07 1.57245e-06 8.02612e-09 -2.901e-06 1.22079e-06 -6.31714e-07 3.06241e-06 1.20059e-06 -1.80344e-06 4.90784e-07 3.74243e-06 -2.94342e-07 -3.45764e-08 -3.42099e-06 -1.43695e-06 5.91064e-07 3.47308e-06 3.78232e-06 4.01093e-07 -1.58435e-06 -3.47375e-06 1.34943e-06 1.11768e-06 1.95212e-06 -8.28033e-07 1.53705e-06 6.38031e-07 -1.84702e-06 1.34689e-06 -6.98669e-07 1.81653e-06 -2.42355e-06 -1.35257e-06 3.04367e-06 -1.21976e-06 1.61896e-06 -2.69528e-06 1.84601e-06 6.45447e-08 -4.94263e-07 3.47568e-06 -2.00531e-06 3.56693e-06 -3.19446e-06 2.72141e-06 -1.39059e-06 2.20032e-06 -1.76819e-06 2.32727e-07 -3.47382e-07 2.11823e-07 -5.22614e-07 2.69846e-06 -1.47983e-06 2.14554e-06 -6.27594e-07 -8.8501e-10 7.89124e-07 -2.8653e-07 8.30902e-07 -2.12857e-06 
-1.90887e-07 1.07593e-06 1.40781e-06 2.41641e-06 -4.52689e-06 2.37207e-06 -2.19479e-06 1.65131e-06 1.2706e-06 -2.18387e-06 -1.72821e-07 5.41687e-07 7.2879e-07 7.56927e-07 1.57739e-06 -3.79395e-07 -1.02887e-06 -1.20987e-06 1.43066e-06 8.96301e-08 5.09766e-07 -2.8812e-06 -2.35944e-06 2.25912e-06 -2.78967e-06 -4.69913e-06 1.60822e-06 6.9342e-07 4.6225e-07 -1.33276e-06 -3.59033e-06 1.11206e-06 1.83521e-06 2.39163e-06 2.3468e-08 5.91431e-07 -8.80249e-07 -2.77405e-08 -1.13184e-06 -1.28036e-06 1.66229e-06 2.81784e-06 -2.97589e-06 8.73413e-08 1.06439e-06 2.39075e-06 -2.76974e-06 1.20862e-06 -5.12817e-07 -5.19104e-07 4.51324e-07 -4.7168e-07 2.35608e-06 5.46906e-07 -1.66748e-06 5.85236e-07 6.42944e-07 2.43164e-07 4.01031e-07 -1.93646e-06 2.07416e-06 -1.16116e-06 4.27155e-07 5.2951e-07 9.09149e-07 -8.71887e-08 -1.5564e-09 1.07266e-06 -9.49402e-08 2.04016e-06 -6.38123e-07 -1.94241e-06 -5.17294e-06 -2.18622e-06 -8.26703e-06 2.54364e-06 4.32614e-06 8.3847e-07 -2.85309e-06 2.72345e-06 -3.42752e-06 -1.36871e-07 2.23346e-06 5.26825e-07 1.3566e-06 -2.17111e-06 2.1463e-07 2.06479e-06 1.76929e-06 -1.2655e-06 -1.3797e-06 3.10706e-06 -4.72189e-06 4.38138e-06 6.41815e-07 -3.25623e-08 -4.93707e-06 5.05743e-06 5.17578e-07 -5.30524e-06 3.62463e-06 5.68909e-07 1.16226e-06 1.10843e-06 -5.00854e-07 9.48761e-07 -2.18701e-06 -3.57635e-07 4.26709e-06 -1.50836e-06 -5.84412e-06 3.5054e-06 3.94019e-06 -4.7623e-06 2.05856e-06 -2.22992e-07 1.64969e-06 2.64694e-06 -8.49487e-07 -3.63562e-06 1.0386e-06 1.69461e-06 -2.05798e-06 3.60349e-06 3.42651e-07 -1.46686e-06 1.19949e-06 -1.60519e-06 2.37793e-07 6.12366e-07 -1.54669e-06 1.43668e-06 1.87009e-06 -2.22626e-06 2.15155e-06 -3.10571e-06 2.05188e-06 -4.40002e-07 2.06683e-06 -1.11362e-06 5.96924e-07 -2.64471e-06 2.4892e-06 1.13083e-06 -3.23181e-07 5.10651e-07 2.73499e-07 -1.24899e-06 1.40564e-06 -9.3158e-07 1.45947e-06 3.70544e-07 -1.62628e-06 -1.70215e-06 1.72098e-06 8.19031e-07 -5.57709e-07 1.10107e-06 -2.81845e-06 1.57654e-07 3.30716e-06 -9.75403e-07 
1.73126e-07 1.30447e-06 7.64771e-08 -6.65344e-07 -1.4346e-06 5.03171e-06 -2.84576e-06 2.3212e-06 -2.73373e-06 2.16675e-08 2.24026e-06 -4.11682e-08 -3.36642e-06 1.78775e-06 1.28174e-08 -9.32068e-07 2.97177e-06 -1.05338e-06 9.42505e-07 2.02362e-07 -1.81326e-06 2.16995e-06 2.83722e-07 -1.2648e-06 9.21814e-07 -8.9447e-07 -1.61597e-06 3.5036e-06 -6.79626e-08 1.52823e-06 -2.98682e-06 5.57404e-07 9.5166e-07 7.10419e-07 -1.28528e-06 -3.76038e-07 -1.03845e-06 2.96631e-06 -1.18356e-06 -2.77313e-07 3.24149e-06 -1.85455e-06 -1.27747e-07 3.6264e-07 4.66431e-07 -1.54443e-06 1.38437e-06 -1.53119e-06 7.4231e-07 -1.2388e-06 1.99774e-06 1.15799e-06 1.39478e-06 -2.93527e-06 -2.03012e-06 2.46667e-06 2.16751e-06 -2.50354e-06 3.95905e-07 5.74371e-07 1.33575e-07 -3.98315e-07 4.93927e-07 -5.23987e-07 -1.74713e-07 6.49384e-07 -7.16766e-07 2.35733e-06 -4.91333e-08 -1.88138e-06 1.74722e-06 4.03503e-07 3.5965e-07 1.44836e-07]
The task you are describing can be treated as anomaly/outlier detection. One possible solution is to use a z-score transformation and treat every value with a z-score above a certain threshold as an outlier. Because there is no clear definition of an outlier, no method can detect such peaks without setting some parameter (here, the threshold). One possible solution could be:
import numpy as np

def detect_outliers(data):
    outliers = []
    d_mean = np.mean(data)
    d_std = np.std(data)
    threshold = 3  # this defines what you would consider a peak (outlier)
    for point in data:
        z_score = (point - d_mean) / d_std
        if np.abs(z_score) > threshold:
            outliers.append(point)
    return outliers

# create normal data
data = np.random.normal(size=100)
# create outliers
outliers = np.random.normal(100, size=3)
# combine normal data and outliers
full_data = data.tolist() + outliers.tolist()
# print outliers
print(detect_outliers(full_data))
If you only want to detect peaks, remove the np.abs function call from the code. This code snippet is based on a Medium post, which also provides another way of detecting outliers.
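One practical wrinkle: the sample data in the question contains `inf` and `-inf` entries, and a mean or standard deviation computed over those is not a usable number. The sketch below is a standard-library variant of the same z-score idea that skips non-finite values first. The function name `detect_peaks`, the default threshold, and the toy signal are assumptions made for illustration; it uses the population standard deviation, which matches the default of `np.std`.

```python
import math
import statistics


def detect_peaks(data, threshold=3.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    # Ignore inf/-inf/nan when estimating the mean and spread.
    finite = [x for x in data if math.isfinite(x)]
    mean = statistics.fmean(finite)
    std = statistics.pstdev(finite)  # population std, like np.std's default
    return [
        (i, x)
        for i, x in enumerate(data)
        if math.isfinite(x) and (x - mean) / std > threshold  # upward peaks only
    ]


# Toy signal: small noise-like values, one inf artifact, one genuine peak.
signal = [0.0, 1.0, -1.0, 0.5, -0.5] * 8 + [float("inf"), 25.0]
peaks = detect_peaks(signal)
print(peaks)
```

Returning indices alongside values makes it easier to mark the peaks on a plot. The threshold still has to be chosen; this only removes the need to clean the data by hand.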
How does sklearn.linear_model.LinearRegression work with insufficient data?
To solve a 5 parameter model, I need at least 5 data points to get a unique solution. For the x and y data below:
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[-0.24155831, 0.37083184, -1.69002708, 1.4578805 , 0.91790011, 0.31648635, -0.15957368],
              [-0.37541846, -0.14572825, -2.19695883, 1.01136142, 0.57288752, 0.32080956, -0.82986857],
              [ 0.33815532, 3.1123936 , -0.29317028, 3.01493602, 1.64978158, 0.56301755, 1.3958912 ],
              [ 0.84486735, 4.74567324, 0.7982888 , 3.56604097, 1.47633894, 1.38743513, 3.0679506 ],
              [-0.2752026 , 2.9110031 , 0.19218081, 2.0691105 , 0.49240373, 1.63213241, 2.4235483 ],
              [ 0.89942508, 5.09052174, 1.26048572, 3.73477373, 1.4302902 , 1.91907482, 3.70126468]])
y = np.array([-0.81388378, -1.59719762, -0.08256274, 0.61297275, 0.99359647, 1.11315445])
I used only 6 data points to fit an 8 parameter model (7 slopes and 1 intercept).
lr = LinearRegression().fit(x, y)
print(lr.coef_)
array([-0.83916772, -0.57249998, 0.73025938, -0.02065629, 0.47637768, -0.36962192, 0.99128474])
print(lr.intercept_)
0.2978781587718828
Clearly, it's using some kind of assignment to reduce the degrees of freedom. I tried to look into the source code but couldn't find anything about that. What method does it use to find the parameters of an underspecified model?
You don't need to reduce the degrees of freedom; it simply finds a solution to the least squares problem min sum_i (dot(beta,x_i)+beta_0-y_i)**2. For example, in the non-sparse case it uses the lstsq function from scipy.linalg. The default solver for this optimization problem is the gelsd LAPACK driver. If A = np.concatenate((ones_v, X), axis=1) is the augmented array with ones as its first column, then your solution is given by
beta = np.linalg.pinv(A.T @ A) @ A.T @ y
where we use the pseudoinverse precisely because the matrix may not be of full rank. Of course, the solver doesn't actually evaluate this formula directly; it uses the singular value decomposition of A to compute the result.
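The following NumPy sketch illustrates the point on made-up data: with fewer samples than parameters, many coefficient vectors fit the data exactly, and an SVD-based least-squares solver returns the one with the smallest norm, which is also what the pseudoinverse gives. The random data and variable names are assumptions for illustration. scikit-learn's handling of the intercept may differ from this augmented-matrix formulation, so the numbers need not match `lr.coef_` for the question's data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 7))  # 6 samples, 7 features: underdetermined
y = rng.normal(size=6)

# Augment with a column of ones so the first coefficient acts as the intercept.
A = np.column_stack([np.ones(len(X)), X])

# SVD-based least squares and the pseudoinverse agree on the solution.
beta_lstsq, _, rank, _ = np.linalg.lstsq(A, y, rcond=None)
beta_pinv = np.linalg.pinv(A) @ y

# Adding any null-space direction of A gives another exact fit with a larger norm.
_, _, vt = np.linalg.svd(A)
null_vec = vt[-1]  # A @ null_vec is (numerically) zero
other = beta_lstsq + null_vec

max_residual = np.max(np.abs(A @ beta_lstsq - y))
print(rank, max_residual)
print(np.linalg.norm(beta_lstsq), np.linalg.norm(other))
```

So no degrees of freedom are "assigned"; among all exact fits, the solver's choice is pinned down by the minimum-norm criterion.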
Obtaining hyperpolarization depth from electrophysiological graph
I am working on electrophysiological data which is in .abf format. I want to obtain the hyperpolarization depth as indicated above in the figure. This is what I have done so far; import matplotlib.pyplot as plt import pyabf import pandas as pd abf = pyabf.ABF("test.abf") abf.setSweep(10) # I can access a given sweep. Here sweep 10 df = pd.DataFrame({'time': abf.sweepX, 'current':abf.sweepY}) df1 = df.loc[15650:15800] df1.plot(x='time', y='current') I am thinking to apply change in derivative to find the first point of interest (x1,y1) and then lower point (x2,y2), but it looks complex. I would appreciate if someone give some hint or procedure. The dataset as follow, time current 0.7825 -63.323975 0.78255 -63.171387 0.7826 -62.89673 0.78265 -62.713623 0.7827 -62.469482 0.78275 -62.37793 0.7828 -62.10327 0.78285 -61.950684 0.7829 -61.76758 0.78295 -61.584473 0.783 -61.401367 0.78305 -61.24878 0.7831 -61.035156 0.78315 -60.85205 0.7832 -60.72998 0.78325 -60.516357 0.7833 -60.455322 0.78335 -60.2417 0.7834 -60.08911 0.78345 -59.96704 0.7835 -59.814453 0.78355 -59.661865 0.7836 -59.509277 0.78365 -59.417725 0.7837 -59.23462 0.78375 -59.11255 0.7838 -58.95996 0.78385 -58.86841 0.7839 -58.685303 0.78395 -58.59375 0.784 -58.441162 0.78405 -58.34961 0.7841 -58.19702 0.78415 -58.044434 0.7842 -57.922363 0.78425 -57.769775 0.7843 -57.678223 0.78435 -57.434082 0.7844 -57.34253 0.78445 -56.9458 0.7845 -56.274414 0.78455 -54.96216 0.7846 -53.253174 0.78465 -51.208496 0.7847 -48.950195 0.78475 -46.325684 0.7848 -43.09082 0.78485 -38.42163 0.7849 -31.036377 0.78495 -22.033691 0.785 -13.397217 0.78505 -6.072998 0.7851 -0.61035156 0.78515 2.7160645 0.7852 3.9367676 0.78525 3.4179688 0.7853 1.3427734 0.78535 -1.4953613 0.7854 -5.0964355 0.78545 -9.185791 0.7855 -13.641357 0.78555 -18.249512 0.7856 -23.132324 0.78565 -27.98462 0.7857 -32.714844 0.78575 -37.261963 0.7858 -41.47339 0.78585 -45.22705 0.7859 -48.553467 0.78595 -51.54419 0.786 -53.985596 0.78605 -56.18286 0.7861 -58.013916 
0.78615 -59.539795 0.7862 -60.760498 0.78625 -61.88965 0.7863 -62.652588 0.78635 -63.323975 0.7864 -63.934326 0.78645 -64.2395 0.7865 -64.60571 0.78655 -64.78882 0.7866 -65.00244 0.78665 -64.971924 0.7867 -65.093994 0.78675 -65.03296 0.7868 -64.971924 0.78685 -64.819336 0.7869 -64.78882 0.78695 -64.66675 0.787 -64.48364 0.78705 -64.42261 0.7871 -64.2395 0.78715 -64.11743 0.7872 -63.964844 0.78725 -63.842773 0.7873 -63.659668 0.78735 -63.568115 0.7874 -63.446045 0.78745 -63.26294 0.7875 -63.171387 0.78755 -62.98828 0.7876 -62.89673 0.78765 -62.74414 0.7877 -62.713623 0.78775 -62.530518 0.7878 -62.438965 0.78785 -62.37793 0.7879 -62.25586 0.78795 -62.164307 0.788 -62.042236 0.78805 -62.01172 0.7881 -61.88965 0.78815 -61.88965 0.7882 -61.73706 0.78825 -61.706543 0.7883 -61.645508 0.78835 -61.61499 0.7884 -61.523438 0.78845 -61.462402 0.7885 -61.431885 0.78855 -61.340332 0.7886 -61.37085 0.78865 -61.279297 0.7887 -61.279297 0.78875 -61.157227 0.7888 -61.187744 0.78885 -61.09619 0.7889 -61.157227 0.78895 -61.12671 0.789 -61.09619 0.78905 -61.12671 0.7891 -61.00464 0.78915 -61.00464 0.7892 -60.97412 0.78925 -60.97412 0.7893 -60.943604 0.78935 -61.00464 0.7894 -60.913086 0.78945 -60.97412 0.7895 -60.943604 0.78955 -60.913086 0.7896 -60.943604 0.78965 -60.85205 0.7897 -60.85205 0.78975 -60.821533 0.7898 -60.88257 0.78985 -60.88257 0.7899 -60.913086 0.78995 -60.88257 0.79 -60.913086
We can plot the difference in current between consecutive points (which, up to a constant factor, is the derivative, since the times are evenly spaced). The first chart shows the actual diffs. Based on this we can set some threshold, such as 0.3, and apply it to filter the main DataFrame. The filtered values are shown in orange on the second chart:
# assumes df is indexed by time and has a single 'current' column
fig, ax = plt.subplots(2, figsize=(8,8))
# plot derivative
df['current'].diff().plot(ax=ax[0])
# current
threshold = 0.3
df['filtered'] = df.loc[df['current'].diff().abs() > threshold, 'current']
df.plot(ax=ax[1])
# add spans
x = df['filtered'].dropna()
ax[1].axhspan(x.iloc[0], x.iloc[-1], alpha=0.3, edgecolor='skyblue', facecolor="none", hatch='////')
ax[1].axvspan(x.index.min(), x.index.max(), alpha=0.3, edgecolor='orange', facecolor="none", hatch='\\\\')
If you're interested in the range values, you can drop the NaN values in the filtered subset and find the min and max of the index:
print('min', df['filtered'].dropna().index.min())
print('max', df['filtered'].dropna().index.max())
Output:
min 0.78445
max 0.7865
For the value of the gap you can use:
abs(df['filtered'].dropna().iloc[-1] - df['filtered'].dropna().iloc[0])
Output:
7.6599100000000035
Note: Alternatively, we can get the left edges of these spans as the points where the diff at the point is lower than the threshold and the diff at the next point is higher than the threshold, and similarly for the right edges. This would also work if we have multiple peaks:
threshold = 0.3
x = df['current'].diff().abs()
spanA = df.loc[(x < threshold) & (x.shift(-1) >= threshold)]
spanB = df.loc[(x >= threshold) & (x.shift(-1) < threshold)]
print(spanA)
          current
time
0.7844  -57.34253
print(spanB)
          current
time
0.7865  -64.60571
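The left-edge/right-edge idea from the note can also be written without pandas. The sketch below is a plain-Python illustration under the same assumption of evenly spaced samples; `steep_spans` and the toy trace are hypothetical, and the threshold still has to be chosen by looking at the data.

```python
def steep_spans(values, threshold):
    """Return (start, end) index pairs bracketing runs of large consecutive differences."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    spans = []
    start = None
    for i, d in enumerate(diffs):
        if d >= threshold and start is None:
            start = i                 # last "flat" sample before the steep run
        elif d < threshold and start is not None:
            spans.append((start, i))  # first sample after the steep run
            start = None
    if start is not None:             # run continues to the end of the data
        spans.append((start, len(values) - 1))
    return spans


# Toy trace: baseline, a spike, then a lower plateau after the spike.
trace = [-60.0, -60.0, -40.0, 0.0, -40.0, -66.0, -66.0, -66.0]
spans = steep_spans(trace, threshold=0.3)
start, end = spans[0]
depth = trace[start] - trace[end]  # baseline before the spike minus level after it
print(spans, depth)
```

Each span brackets one spike, so with several spikes in a sweep the same subtraction can be applied per span.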
How to record the value of a variable within odeint?
I would like to know if there is a way to record the value of a specific variable within the function of integration, without having to print it within the definition of the function, which in many cases, due to the algorithm of prediction-correction, lead to more or less values than the final vector returned by the function? Example let's try with this code: import numpy as np from scipy.integrate import odeint import matplotlib.pyplot as plt def essai(y, t): a = y[0] c1 = a a = c1 / a**2 return [a] # Solving essai0 = [10] t = np.linspace(0, 2000, 10) y = odeint(essai, essai0, t) a = y[:, 0] # Graphs fig, ax = plt.subplots() ax.plot(t, a, 'k--', label='a') legend = ax.legend(loc='lower right', shadow=True, fontsize='x-large') legend.get_frame().set_facecolor('#FFFCCC') # 00FFCC plt.xlabel('x') plt.ylabel('y') plt.title('y vs x') plt.show() I would like to record the values of c1 which depends on a. What should I do? If I print, it I get (because of pred-corr algorithm): 10.0 10.001203411814794 10.00120326701222 10.002406534059283 10.00240638930896 10.031168251789499 10.03116843523562 10.059847893733858 10.059848247411573 10.088446178306066 10.088446526968276 10.178981333917179 10.1789826635142 10.26872274187664 10.268720875457465 10.251795853148066 10.251794757670828 10.324093402400061 10.324093338929458 10.395889284010963 10.395889126663482 10.467192620394076 10.467192470562162 10.60836217080531 10.608361512785885 10.747675991273601 10.747676529983982 10.885208084361661 10.88520861500753 11.021024408838219 11.021024559158226 11.15518691385528 11.15518704871583 11.389028983440005 11.389029612664437 11.618166387462095 11.618166372845774 11.842871925632974 11.842870666797078 12.063390475531826 12.0633901508557 12.279950446401756 12.279950250452782 12.492757035192547 12.492756877414479 12.790475076345272 12.79047467718475 13.081418818481728 13.081418595295522 13.366029970579808 13.366030900758636 13.644707388512776 13.644707798536366 13.917805722870085 
13.917805853240296 14.185647189512732 14.185647276304193 14.448524340486092 14.44852440612534 14.849045554474056 14.849045812160185 15.239043242348172 15.239044113472564 15.619306858637934 15.619307570817467 15.990530200625596 15.990530706701604 16.353328829257094 16.35332918566708 16.70825155213741 16.708251810028536 17.055790075751844 17.055790265472186 17.52054793291328 17.520548366986496 17.97329155702487 17.97329263337524 18.414908470097206 18.41490919183692 18.84617978510828 18.846180323693773 19.26780035288661 19.26780072790131 19.68039039537204 19.680390669145883 20.084506483562638 20.084506685872917 20.63204921728682 20.632049705019547 21.165431430483114 21.16543268212929 21.685699626883885 21.685700483180575 22.193774842932424 22.193775478119036 22.69047628806277 22.69047673120133 23.176535191516802 23.1765355148269 23.652607704971896 23.652607943862492 24.296731084127696 24.296731656936466 24.92421316694978 24.924214631653445 25.536282592848192 25.536283593100098 26.134020839947766 26.134021582629195 26.718389929663125 26.718390447872228 27.290248649274574 27.290249027491374 27.8503676838429 27.85036796338048 28.60821935477876 28.608220025227006 29.346505899333515 29.346507613905608 30.066670806260635 30.066671977520553 30.769984796557875 30.769985666417984 31.457578314647648 31.457578921761066 32.13046057231114 32.13046101551341 32.78953730742519 32.789537635058444 33.68118868621462 33.68118947182226 34.54983545122736 34.549837459883506 35.39717380841791 35.397175180698845 36.22469707822626 36.224698097642104 37.033733817898586 37.03373452954837 37.82547018189015 37.82547070150822 38.60097077071101 38.60097115490064 39.65004988104156 39.650050802111195 40.67207751401193 40.67207986867377 41.669047220267416 41.66904882908885 42.64271422854618 42.64271542393563 43.594640193459966 43.59464102811222 44.52621945824691 44.52622006777859 45.43870353935591 45.438703990091476 46.67300975177773 46.673010832232926 47.87550305124021 47.87550581301012 
49.04852683447106 49.04852872160157 50.194144483083306 50.19414588551954 51.3141919066777 51.31419288605143 52.41030839692969 52.41030911225109 53.48396538985435 53.483965918885744 54.9362075454971 54.93620881348237 56.35103457439806 56.35103781516747 57.73120149400896 57.731203708595864 59.07913425147381 59.07913589751868 60.39699143853227 60.39699258818307 61.68670054765226 61.68670138744394 62.94999176730058 62.949992388453296 64.65865496068966 64.65865644932029 Which is much more values than I may expect with t = np.linspace(0, 2000, 10) which divide the intervale of time in tenth of 200. I have thought to this problem for a long time without find a really good way to do it and I would be delighted to know how to bypass this problem.
There is no relation between the points at which the solver evaluates the ODE function during its internal steps and the sample points requested for the output. Moreover, the evaluation points can deviate from the solution trajectory with some error of an order lower than the order of the integration method. The easiest way to do what you want in a structured fashion is to define c1 as a separate function and then call it on the results:
def c1_func(y):
    return y[0]

def essai(y, t):
    a = y[0]
    c1 = c1_func(y)
    a = c1 / a**2
    return [a]
...
y = odeint(...
c1_val = c1_func(y.T)
plt.plot(t, c1_val)
or so.
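The mismatch between function evaluations and output points is not specific to odeint. As a dependency-free illustration, the sketch below swaps in a hand-written fixed-step RK4 integrator (a stand-in, not what odeint does internally) for the same equation, logs every evaluation of the right-hand side, and then computes c1 once per requested output. The names `rk4`, `rhs`, `calls` and the `substeps` parameter are assumptions made for the example.

```python
import math


def c1_func(a):
    return a  # in the question's example, c1 is simply a


calls = []  # one entry per right-hand-side evaluation


def rhs(a, t):
    c1 = c1_func(a)
    calls.append(c1)  # logging here records every internal stage
    return c1 / a**2


def rk4(f, a0, ts, substeps=50):
    """Classic RK4 with a fixed number of substeps between output times."""
    a = a0
    out = [a]
    for t0, t1 in zip(ts, ts[1:]):
        h = (t1 - t0) / substeps
        t = t0
        for _ in range(substeps):
            k1 = f(a, t)
            k2 = f(a + h / 2 * k1, t + h / 2)
            k3 = f(a + h / 2 * k2, t + h / 2)
            k4 = f(a + h * k3, t + h)
            a += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            t += h
        out.append(a)
    return out


ts = [2000 * i / 9 for i in range(10)]  # same grid as np.linspace(0, 2000, 10)
a_vals = rk4(rhs, 10.0, ts)
c1_vals = [c1_func(a) for a in a_vals]  # exactly one value per output time

print(len(calls), len(c1_vals))
print(a_vals[-1], math.sqrt(2 * 2000 + 100))  # numerical vs exact solution
```

Values logged inside the right-hand side also include intermediate stages that do not lie on the final trajectory, which is why evaluating c1 on the returned solution is the more reliable route.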
PACF function in statsmodels.tsa.stattools gives numbers greater than 1 when using ywunbiased?
I have a dataframe of length 177 and I want to calculate and plot the partial autocorrelation function (PACF). I have the data imported etc. and I do:
from statsmodels.tsa.stattools import pacf
ys = pacf(data[key][array].diff(1).dropna(), alpha=0.05, nlags=176, method="ywunbiased")
xs = range(lags+1)
plt.figure()
plt.scatter(xs,ys[0])
plt.grid()
plt.vlines(xs, 0, ys[0])
plt.plot(ys[1])
The method used results in numbers greater than 1 for very long lags (90ish), which is incorrect, and I get the following warning:
RuntimeWarning: invalid value encountered in sqrt
return rho, np.sqrt(sigmasq)
but since I can't see their source code I don't know what this means. To be honest, when I search for PACF, all the examples only carry out PACF up to 40 or 60 lags or so, and they never have any significant PACF after lag=2, so I couldn't compare to other examples either. But when I use:
method="ols" # or method="ywmle"
the numbers are corrected. So it must be the algorithm they use to solve it. I tried importing inspect and using the getsource method, but it's useless: it just shows that it uses another package, and I can't find that. If you know where the problem arises from, I would really appreciate the help.
For your reference, the values for data[key][array] are: [1131.130005, 1144.939941, 1126.209961, 1107.300049, 1120.680054, 1140.839966, 1101.719971, 1104.23999, 1114.579956, 1130.199951, 1173.819946, 1211.920044, 1181.27002, 1203.599976, 1180.589966, 1156.849976, 1191.5, 1191.329956, 1234.180054, 1220.329956, 1228.810059, 1207.01001, 1249.47998, 1248.290039, 1280.079956, 1280.660034, 1294.869995, 1310.609985, 1270.089966, 1270.199951, 1276.660034, 1303.819946, 1335.849976, 1377.939941, 1400.630005, 1418.300049, 1438.23999, 1406.819946, 1420.859985, 1482.369995, 1530.619995, 1503.349976, 1455.27002, 1473.98999, 1526.75, 1549.380005, 1481.140015, 1468.359985, 1378.550049, 1330.630005, 1322.699951, 1385.589966, 1400.380005, 1280.0, 1267.380005, 1282.829956, 1166.359985, 968.75, 896.23999, 903.25, 825.880005, 735.090027, 797.869995, 872.8099980000001, 919.1400150000001, 919.320007, 987.4799800000001, 1020.6199949999999, 1057.079956, 1036.189941, 1095.630005, 1115.099976, 1073.869995, 1104.48999, 1169.430054, 1186.689941, 1089.410034, 1030.709961, 1101.599976, 1049.329956, 1141.199951, 1183.26001, 1180.550049, 1257.640015, 1286.119995, 1327.219971, 1325.829956, 1363.609985, 1345.199951, 1320.640015, 1292.280029, 1218.890015, 1131.420044, 1253.300049, 1246.959961, 1257.599976, 1312.410034, 1365.680054, 1408.469971, 1397.910034, 1310.329956, 1362.160034, 1379.319946, 1406.579956, 1440.670044, 1412.160034, 1416.180054, 1426.189941, 1498.109985, 1514.680054, 1569.189941, 1597.569946, 1630.73999, 1606.280029, 1685.72998, 1632.969971, 1681.550049, 1756.540039, 1805.810059, 1848.359985, 1782.589966, 1859.449951, 1872.339966, 1883.949951, 1923.569946, 1960.22998, 1930.6700440000002, 2003.369995, 1972.290039, 2018.050049, 2067.560059, 2058.899902, 1994.9899899999998, 2104.5, 2067.889893, 2085.51001, 2107.389893, 2063.110107, 2103.840088, 1972.180054, 1920.030029, 2079.360107, 2080.409912, 2043.939941, 1940.2399899999998, 1932.22998, 2059.73999, 2065.300049, 2096.949951, 
2098.860107, 2173.600098, 2170.949951, 2168.27002, 2126.149902, 2198.810059, 2238.830078, 2278.8701170000004, 2363.639893, 2362.719971, 2384.199951, 2411.800049, 2423.409912, 2470.300049, 2471.649902, 2519.360107, 2575.26001, 2584.840088, 2673.610107, 2823.810059, 2713.830078, 2640.8701170000004, 2648.050049, 2705.27002, 2718.3701170000004, 2816.290039, 2901.52002, 2913.97998]
Your time series is pretty clearly not stationary, so the Yule-Walker assumptions are violated. More generally, the PACF is usually appropriate only for stationary time series. You might difference your data first, before considering the partial autocorrelations.
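Separately from stationarity, one mechanism that can push Yule-Walker estimates outside [-1, 1] at long lags is the denominator of the autocovariance estimate. The sketch below is a pure-Python illustration, not the statsmodels code; it assumes the "unbiased" variant divides the lag-k sum by n - k while the biased variant divides by n. The function `acf` and the toy series are made up for the example.

```python
def acf(x, lag, unbiased):
    """Sample autocorrelation at one lag, normalised by the lag-0 autocovariance."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    # Sum of products of deviations that are `lag` steps apart.
    s = sum(d[t] * d[t + lag] for t in range(n - lag))
    denom = (n - lag) if unbiased else n
    return (s / denom) / (sum(v * v for v in d) / n)


series = [3.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, 3.0]
biased_r7 = acf(series, 7, unbiased=False)
unbiased_r7 = acf(series, 7, unbiased=True)
print(biased_r7, unbiased_r7)
```

At the longest lag only one product enters the sum, so the n / (n - k) rescaling inflates it well beyond 1. With nlags=176 on a differenced series of 176 points, the longest lags in the question are in the same regime, which would be consistent with the values looking fine under "ols" or "ywmle".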