AttributeError: 'numpy.ndarray' object has no attribute 'rolling' - python-3.x

When I try to compute a moving average (rolling mean) on log-transformed data, I get this error. Where am I going wrong?
This version, with the original data, worked fine:
# Rolling statistics
rolmean = data.rolling(window=120).mean()
rolSTD = data.rolling(window=120).std()
With the log-transformed data:
MA = X.rolling(window=120).mean()
MSTD = X.rolling(window=120).std()
AttributeError: 'numpy.ndarray' object has no attribute 'rolling'

rolling is a method of pandas objects (DataFrame and Series), not of NumPy arrays, so you have to convert the NumPy array to a pandas DataFrame before calling it.
The change could be something like this:
dataframe = pd.DataFrame(X)
rolmean = dataframe.rolling(120).mean()
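A minimal sketch of the conversion, assuming X was produced by applying np.log to a plain array. The sample data and the window of 3 are made up for illustration; the question uses a window of 120.

```python
import numpy as np
import pandas as pd

# Illustrative data only; the original series is not shown in the question.
data = pd.Series(np.arange(1.0, 11.0))

# Taking the log of the underlying array yields a plain ndarray,
# which has no .rolling method.
X = np.log(data.to_numpy())

# Wrap the array in a Series (reusing the original index) to get .rolling back.
X = pd.Series(X, index=data.index)

MA = X.rolling(window=3).mean()
MSTD = X.rolling(window=3).std()
print(MA.tail())
```

Calling np.log(data) directly on the Series should also return a Series, in which case no conversion is needed.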

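There is no rolling attribute in NumPy. NumPy does have:
numpy.roll(your_array, shift, axis=None)
but note that numpy.roll only shifts the elements circularly along an axis. It does not compute a moving average or standard deviation, so it is not a replacement for pandas rolling.
Hope this helps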
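If staying in NumPy is preferred, a rolling mean and standard deviation can be sketched with sliding_window_view, assuming NumPy 1.20 or later. The array and window size below are illustrative, and ddof=1 is chosen to match the pandas default for std.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(1.0, 7.0)   # illustrative data: 1, 2, ..., 6
window = 3                # illustrative window size

# Each row of `windows` is one length-3 window over x (a view, no copy).
windows = sliding_window_view(x, window)

rolling_mean = windows.mean(axis=1)
rolling_std = windows.std(axis=1, ddof=1)  # ddof=1 matches pandas' rolling std
print(rolling_mean)
print(rolling_std)
```

Unlike pandas, this returns only the complete windows, so the result is shorter than the input by window - 1 elements.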

Related

nunique() not producing correct output in aggregate functions

I am using an aggregation on the following data frame:
df = pd.DataFrame({'col1':['team1','team1','team2','team3'],
'col2':[23, 4, 5 ,6],
'col3':['user1','user1','user2','user2']})
gb = df.groupby('col1')
gb.agg({'col2':np.sum,
'col3':nunique()})
But it seems nunique() is not compatible with groupby. Please see the following output:
NameError: name 'nunique' is not defined
How can I count unique values in this example? Help is appreciated.
Using NumPy:
gb = df.groupby('col1')
gb.agg({'col2':np.sum,
'col3':np.nunique()})
This gives a new error: AttributeError: module 'numpy' has no attribute 'nunique'
nunique is a pandas method, not a NumPy function, so you need to use:
gb.agg({'col2':np.sum, 'col3':lambda x: len(np.unique(x))})
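As an alternative sketch, agg also accepts pandas method names as strings, so 'nunique' can be passed directly without a lambda. The data frame is the one from the question; 'sum' is used in place of np.sum here, which is a substitution on my part.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['team1', 'team1', 'team2', 'team3'],
                   'col2': [23, 4, 5, 6],
                   'col3': ['user1', 'user1', 'user2', 'user2']})

# String names are resolved to the matching groupby methods.
result = df.groupby('col1').agg({'col2': 'sum', 'col3': 'nunique'})
print(result)
```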

AttributeError: 'float' object has no attribute 'log' / TypeError: ufunc 'log' not supported for the input types

I have a series of fluorescence intensity data in a column ('2.4M'). I tried to create a new column 'ln_2.4M' by taking the natural log of column '2.4M', but I got an error:
AttributeError: 'float' object has no attribute 'log'
df["ln_2.4M"] = np.log(df["2.4M"])
I tried using a for loop to iterate the log over each fluorescence data in the column "2.4M":
ln2_4M = []
for x in df["2.4M"]:
    ln2_4M = np.log(x)
    print(ln2_4M)
Although it printed the log of each value in column "2.4M" correctly, I am unable to use the data because it also raised a TypeError:
ufunc 'log' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
I am not sure why. Any help in understanding what is happening and how to fix this problem is appreciated. Thanks.
I then tried using the method below and it worked:
df["2.4M"] = pd.to_numeric(df["2.4M"],errors = 'coerce')
df["ln_24M"] = np.log(df["2.4M"])
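Both errors are consistent with the column not being stored as a numeric dtype, for example because some cells hold text. The sketch below assumes that cause; the sample values, including the "n/a" entry, are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column contents: numbers stored as text plus one bad entry.
df = pd.DataFrame({"2.4M": ["1.0", "2.5", "n/a", "10"]})
print(df["2.4M"].dtype)  # not a float dtype, so np.log cannot work on it

# errors="coerce" turns anything unparseable into NaN instead of raising.
df["2.4M"] = pd.to_numeric(df["2.4M"], errors="coerce")
df["ln_2.4M"] = np.log(df["2.4M"])
print(df)
```

Checking df.dtypes before applying NumPy functions is a quick way to spot this kind of problem.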

Unable to replace values using Dict on DataFrame column

I am creating a dictionary code_data by loading a CSV file to a data frame and converting it with to_dict method. This is a fragment of my code:
path = r"E:\Knoema_Work_Dataset\hrfpfwd\MetaData\MetaData_num_person.csv"
code_data = pd.read_csv(path, usecols=['value', 'display_value'], dtype=object)
code_data = code_data.set_index('value')['display_value'].to_dict()
In the following line I am attempting to replace its values:
data["Number of Deaths"] = data["Number of Deaths"].replace(code_data)
Sadly, it leads to an error:
Cannot compare types 'ndarray(dtype=int64)' and 'str'
Could you provide me with some assistance with regards to my problem?
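The error message suggests a likely cause: the column is int64, while dtype=object makes read_csv load the 'value' column as strings, so the dictionary keys are strings. A minimal sketch of one possible fix, converting the keys to integers before calling replace. The CSV contents and sample column are invented for illustration, and the fix assumes every key is a valid integer.

```python
import io
import pandas as pd

# Hypothetical stand-in for the metadata CSV from the question.
csv_text = "value,display_value\n1,One\n2,Two\n"

code_data = pd.read_csv(io.StringIO(csv_text),
                        usecols=['value', 'display_value'], dtype=object)
code_data = code_data.set_index('value')['display_value'].to_dict()
# Keys are strings here because of dtype=object: {'1': 'One', '2': 'Two'}

# Hypothetical integer column standing in for the real data.
data = pd.DataFrame({"Number of Deaths": [1, 2, 1]})

# Convert the keys so they match the integer values in the column.
int_keyed = {int(key): label for key, label in code_data.items()}
data["Number of Deaths"] = data["Number of Deaths"].replace(int_keyed)
print(data)
```

Casting the column to strings with astype(str) and keeping the original dictionary would be the mirror-image approach.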

AttributeError: 'numpy.ndarray' object has no attribute 'sqrt'

I am trying to split a dataframe into equal samples and apply a function to calculate a value for each sample. If any sample's value is greater than 0.3, I want to save the filename in the result dataframe.
df=pd.DataFrame({'Value':[-0.016,-0.006,0.003,-0.011,-0.036,-0.031,-0.014,-0.006,-0.01 ,-0.009,0.004,0.001,-0.012,-0.021,-0.008,0.001,-0.011,-0.01,-0.006,0.002,0.004],'Nmae':[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]})
x=pd.DataFrame([x.values.sqrt(np.mean(df2['Value']**2)) for x in np.array_split(df2, (len(df2)/10))])
I am getting this error:
AttributeError: 'numpy.ndarray' object has no attribute 'sqrt'
Does anyone have another effective way to do this task?
This is a working version of your code. sqrt is a NumPy function, not an array method, so it is called as np.sqrt(...):
res= [np.sqrt(np.mean((x.Value**2))) for x in np.array_split(df, len(df)//10)]
An alternative way of approaching this with pandas would be to define a new column 'Split_variable' that labels each sample, and use it to apply your calculations:
df.groupby('Split_variable')['Value'].apply(lambda x: np.sqrt(np.mean((x**2))))
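The answer does not show how 'Split_variable' is built. One way to sketch it, assuming fixed-size chunks of consecutive rows, is integer division of the row position by the chunk size. The data and the chunk size of 2 are illustrative, and this grouping differs slightly from np.array_split, which balances chunk sizes.

```python
import numpy as np
import pandas as pd

# Illustrative data; chunk_size is a hypothetical parameter.
df = pd.DataFrame({'Value': [3.0, 4.0, 0.0, 0.0]})
chunk_size = 2

# Rows 0-1 get label 0, rows 2-3 get label 1, and so on.
df['Split_variable'] = np.arange(len(df)) // chunk_size

# Root mean square of 'Value' within each chunk.
rms = df.groupby('Split_variable')['Value'].apply(
    lambda x: np.sqrt(np.mean(x ** 2)))
print(rms)
```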

Find Max Value in a field of a shapefile

I have a shapefile (mich_co.shp) in which I am trying to find the county with the maximum population. My idea was to use the max() function, but I could not get it to work. Here is my code so far:
from osgeo import ogr
import os
shapefile = "C:/Users/root/Python/mich_co.shp"
driver = ogr.GetDriverByName("ESRI Shapefile")
dataSource = driver.Open(shapefile, 0)
layer = dataSource.GetLayer()
for feature in layer:
    print(feature.GetField("pop"))
layer.ResetReading()
However, the code above only prints all the values of the "pop" field, like this:
10635.0
9541.0
112039.0
29234.0
23406.0
15477.0
8683.0
58990.0
106935.0
17465.0
156067.0
43868.0
135099.0
I tried:
print(max(feature.GetField("pop")))
but it raises TypeError: 'float' object is not iterable. I've also tried:
for feature in range(layer):
and it raises TypeError: 'Layer' object cannot be interpreted as an integer.
Any help or hints would be much appreciated.
Thank you!
max() needs an iterable, such as a list. Try to build a list:
pops = [ feature.GetField("pop") for feature in layer ]
print(max(pops))
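Since the question asks for the county rather than just the largest value, max can also be given a key function so that it returns the whole feature. The sketch below uses a small stand-in class so it runs without GDAL; FakeFeature, the "name" field, and the county labels are all hypothetical.

```python
class FakeFeature:
    """Hypothetical stand-in for an OGR feature, so this runs without GDAL."""

    def __init__(self, fields):
        self._fields = fields

    def GetField(self, name):
        return self._fields[name]


# With GDAL installed, `layer` would come from dataSource.GetLayer() instead.
layer = [
    FakeFeature({"name": "County A", "pop": 10635.0}),
    FakeFeature({"name": "County B", "pop": 156067.0}),
    FakeFeature({"name": "County C", "pop": 43868.0}),
]

# key= tells max to compare features by their "pop" value.
largest = max(layer, key=lambda feature: feature.GetField("pop"))
print(largest.GetField("name"), largest.GetField("pop"))
```

With a real layer, the name of the field holding the county name would need to be checked against the shapefile's attribute table.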
