I am trying to efficiently produce all combinations of a NumPy array that satisfy a condition. My code currently looks like this:
import numpy as np
import itertools
a = np.array([1,11,12,13])
a = np.tile(a,(13,1))
a = a.flatten()
for c in itertools.combinations(a,4):
    if np.sum(c)==21:
        print(c)
If you only care about unique combinations (and there are only 256 of them), you can use itertools.product:
version_1 = np.vstack(list(sorted({tuple(row) for row in list(itertools.combinations(a, 4))}))) # unique combinations, your way
version_2 = np.array(list(itertools.product((1, 11, 12, 13), repeat=4))) # same result, but faster
assert (version_1 == version_2).all()
I'm using this answer to get the unique elements of a Numpy array.
So the final answer would be:
import itertools, numpy as np
a = np.array(list(itertools.product((1, 11, 12, 13), repeat=4)))
for arr in a[a.sum(axis=1) == 21]:
    print(arr)
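The same filter can be wrapped in a small helper so that the values, the tuple length, and the target sum become parameters. This is only a sketch: the name rows_with_sum is made up for illustration and is not part of NumPy or itertools. Note that no 4-tuple drawn from (1, 11, 12, 13) sums to 21 (three 1s give at most 16, two 1s give at least 24), so the second call uses 24 to show a non-empty result.

```python
import itertools
import numpy as np

def rows_with_sum(values, k, target):
    """Return every ordered k-tuple of `values` (with repetition) whose sum is `target`.

    Hypothetical helper, named here only for illustration.
    """
    candidates = np.array(list(itertools.product(values, repeat=k)))
    return candidates[candidates.sum(axis=1) == target]

no_hits = rows_with_sum((1, 11, 12, 13), 4, 21)  # empty: 21 is not reachable
hits = rows_with_sum((1, 11, 12, 13), 4, 24)     # arrangements of two 1s and two 11s

print(no_hits.shape)
print(hits)
```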
import numpy as np
import pandas as pd
df = pd.DataFrame({'dt': ['2021-2-13', '2022-2-15'],
'w': [5, 7],
'n': [11, 8]})
df.reset_index()
print(list(df.loc[:,'dt'].values))
gives: ['2021-2-13', '2022-2-15']
NEEDED: [('2021-2-13'), ('2022-2-15')]
Important (in response to a question in the comments): the "NEEDED" form is how mplfinance accepts the vlines argument for plot (I checked). I need to draw vertical lines at specified dates on the x-axis of the chart.
import mplfinance as mpf
RES['Date'] = RES['Date'].dt.strftime('%Y-%m-%d')
my_vlines=RES.loc[:,'Date'].values # does NOT work
fig, axlist = mpf.plot( ohlc_df, type="candle", vlines= my_vlines, xrotation=30, returnfig=True, figsize=(6,4))
It only works with an explicit list: my_vlines= [('2022-01-18'), ('2022-02-25')]
SOLVED: Oh, it really appears to be so simple after all
my_vlines=list(RES.loc[:,'Date'].values)
Your question asks for a list of Numpy arrays but your desired output looks like Tuples. If you need Tuples, note that it's the comma that makes the tuple not the parentheses, so you'd do something like this:
desired_format = [(x,) for x in list(df.loc[:,'dt'].values)]
If you want numpy arrays, you could do this
desired_format = [np.array(x) for x in list(df.loc[:,'dt'].values)]
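To make the point about the comma concrete, here is a quick check (the variable names are only for illustration). Parentheses alone leave a string unchanged, which is why a plain list of date strings is already equal to the NEEDED form shown in the question:

```python
with_parens = ('2021-2-13')    # parentheses only group; this is still a str
with_comma = ('2021-2-13',)    # the trailing comma makes a 1-tuple

print(type(with_parens).__name__)
print(type(with_comma).__name__)

needed = [('2021-2-13'), ('2022-2-15')]
plain = ['2021-2-13', '2022-2-15']
print(needed == plain)         # both are lists of plain strings
```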
I think I understand your problem. Please see the example code below and let me know if this resolves your problem. I expanded on your dataframe to meet mplfinance plot criteria.
import pandas as pd
import numpy as np
import mplfinance as mpf
df = pd.DataFrame({'dt': ['2021-2-13', '2022-2-15'],'Open': [5,7],'Close': [11, 8],'High': [21,30],'Low': [7, 3]})
df['dt']=pd.to_datetime(df['dt'])
df.set_index('dt', inplace = True)
mpf.plot(df, vlines = dict(vlines = df.index.tolist()))
Say I would like to smooth the following daily data, named oildata, with scipy.signal.savgol_filter:
from scipy.signal import savgol_filter
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = np.random.uniform(0, 10, size=90)
index= pd.date_range('20130226', periods=90)
oildata = pd.Series(data, index)
savgol_filter(oildata, 5, 3)
plt.plot(oildata)
plt.plot(pd.Series(savgol_filter(oildata, 5, 3), index=oildata.index))
plt.show()
Out:
When I replace savgol_filter(oildata, 5, 3) with savgol_filter(oildata, 31, 3):
Besides trial and error, are there any criteria or methods for quickly selecting a suitable pair of window_length (which must be a positive odd integer) and polyorder (which must be less than window_length)? Thanks.
Reference:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.savgol_filter.html
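One way to make the trial and error less subjective is to compute a number for each candidate window and compare them side by side. The sketch below is illustrative only and is not a selection rule from the SciPy documentation: the roughness measure (standard deviation of first differences), the candidate windows, and the fixed seed are all assumptions chosen for the example.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)            # fixed seed, for illustration only
data = rng.uniform(0, 10, size=90)        # same shape of data as in the question

def roughness(y):
    """Standard deviation of first differences: lower means smoother."""
    return float(np.std(np.diff(y)))

results = {}
for window_length in (5, 11, 21, 31):     # odd windows, each larger than polyorder
    smoothed = savgol_filter(data, window_length, 3)
    results[window_length] = roughness(smoothed)
    print(window_length, round(results[window_length], 3))
```

A lower value only says the curve is smoother, not that it is a better fit, so it would still need to be weighed against how much real signal the wider window removes.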
I want to use lmfit in order to fit my data.
The function I am using has only one argument, features. The content of features will vary (both columns and values), so I can't initialize the parameters.
I tried to create a dataframe as here, but I can't use the guess method because this is for LorentzianModel and I just want to use Model.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import lmfit
from sklearn.linear_model import LinearRegression
df = {'a': [0, 0.2, 0.3], 'b':[14, 10, 9], 'target':[100, 200, 300]}
df = pd.DataFrame(df)
X = df[['a', 'b']]
y = df[['target']]
model = LinearRegression().fit(X, y)
features = pd.DataFrame({"a": np.array([0, 0.11, 0.36]),
"b": np.array([10, 14, 8])})
def eval_custom(features):
    res = model.predict(features)
    return res
x_val = features[["a"]].values
def calling_func(features, x_val):
    pred_custom = eval_custom(features)
    df = pd.DataFrame({'x': np.squeeze(x_val), 'y': np.squeeze(pred_custom)})
    themodel = lmfit.Model(eval_custom)
    params = themodel.guess(df['y'], x=df['x'])
    result = themodel.fit(df['y'], params, x = df['x'])
    result.plot_fit()
calling_func(features, x_val)
The model function needs to take independent variables and the individual model parameters as arguments. You're wrapping all of that into a single pandas Dataframe and then sending that. Don't do that.
If you need to create a dataframe from the current values of the model, do that inside your model function.
Also: a generic model function does not have a working guess function. Use model.make_params() and definitely, definitely (no exceptions, nope not ever) provide actual initial values for every parameter.
I have an array, arr1, which is a pd.Series array of length 1000 where some values are repeated. And I want to map every unique value in arr1 to a new value that is in a np array, arr2. I only know how to do this using a for loop:
import numpy as np
import pandas as pd
arr1 = pd.Series(np.random.choice(1000,1000, replace=True))
arr1_unq = arr1.drop_duplicates()
arr2 = np.random.choice(1000,len(arr1_unq), replace=False)
arr2_unq = np.unique(arr2)
for i in range(len(arr2)):
    arr1[arr1==arr1_unq.iloc[i]]=arr2[i]
How can I do this more efficiently without using a for loop?
pandas.Series.map should do it:
mapping = dict(zip(arr1_unq, arr2_unq))
arr1 = arr1.map(mapping)  # map returns a new Series, so assign the result
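Here is the same idea on a tiny, deterministic Series, so the result is easy to check by eye. The values and variable names are made up for illustration:

```python
import pandas as pd

s = pd.Series([3, 1, 3, 2, 1])
unique_vals = s.drop_duplicates()            # 3, 1, 2 (order of first appearance)
new_vals = [30, 10, 20]                      # one replacement per unique value

mapping = dict(zip(unique_vals, new_vals))   # {3: 30, 1: 10, 2: 20}
mapped = s.map(mapping)                      # all values are replaced in one pass

print(mapped.tolist())
```

Because every value is looked up in the dictionary at once, a replacement value can never be matched again by a later step, which can happen when the values are overwritten one at a time in a loop.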
I'm using a camera to store raw data in a NumPy array, but I don't understand what a colon before a number means when indexing a NumPy array.
import numpy as np
import picamera
camera = picamera.PiCamera()
camera.resolution = (128, 112)
data = np.empty((128, 112, 3), dtype=np.uint8)
camera.capture(data, 'rgb')
data = data[:128, :112]
NumPy array indexing is explained in the docs.
This example shows what is selected:
import numpy as np
data = np.arange(64).reshape(8, 8)
print(data)
data = data[:3, :5]
print(data)
The result will be the first 5 elements of each of the first 3 rows of the array.
As in standard Python, lst[:3] means the first three elements (i.e. the elements with index < 3). In NumPy you can do the same for every dimension with the syntax given in your question.
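Applied to an array with the same shape as the one in the question, a short sketch (using a zero-filled array in place of real camera data) shows that data[:128, :112] selects everything, because the stop values equal the lengths of the first two axes. The third axis is untouched in every case.

```python
import numpy as np

data = np.zeros((128, 112, 3), dtype=np.uint8)   # stand-in for the captured frame

same = data[:128, :112]      # stops equal the axis lengths: nothing is cut
cropped = data[:64, :56]     # first 64 along axis 0, first 56 along axis 1
clipped = data[:500, :500]   # stops past the end are clipped, no error is raised

print(same.shape, cropped.shape, clipped.shape)
```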