TypeError: 'float' object cannot be interpreted as an integer on linspace - python-3.x

TypeError Traceback (most recent call last)
d:\website\SpeechProcessForMachineLearning-master\SpeechProcessForMachineLearning-master\speech_process.ipynb Cell 15' in <cell line: 1>()
-->1 plot_freq(signal, sample_rate)
d:\website\SpeechProcessForMachineLearning-master\SpeechProcessForMachineLearning-master\speech_process.ipynb Cell 10' in plot_freq(signal, sample_rate, fft_size)
2 def plot_freq(signal, sample_rate, fft_size=512):
3 xf = np.fft.rfft(signal, fft_size) / fft_size
----> 4 freq = np.linspace(0, sample_rate/2, fft_size/2 + 1)
5 xfp = 20 * np.log10(np.clip(np.abs(xf), 1e-20, 1e100))
6 plt.figure(figsize=(20, 5))
File <__array_function__ internals>:5, in linspace(*args, **kwargs)
File ~\AppData\Local\Programs\Python\Python39\lib\site-packages\numpy\core\function_base.py:120, in linspace(start, stop, num, endpoint, retstep, dtype, axis)
23 #array_function_dispatch(_linspace_dispatcher)
24 def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None,
25 axis=0):
26 """
27 Return evenly spaced numbers over a specified interval.
28
(...)
118
119 """
--> 120 num = operator.index(num)
121 if num < 0:
122 raise ValueError("Number of samples, %s, must be non-negative." % num)
TypeError: 'float' object cannot be interpreted as an integer
How can I fix this problem?
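A likely cause: in Python 3 the `/` operator always returns a float, so `fft_size/2 + 1` evaluates to `257.0`, while `np.linspace` requires an integer `num`. Below is a minimal sketch of one possible fix using floor division. The helper name `rfft_frequencies` and the 16000 Hz sample rate are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def rfft_frequencies(sample_rate, fft_size=512):
    """Frequency axis for the output of np.fft.rfft(signal, fft_size)."""
    # fft_size / 2 would be a float in Python 3; // keeps num an int.
    num_bins = fft_size // 2 + 1
    return np.linspace(0, sample_rate / 2, num_bins)

# Assumed example values: 16 kHz audio, 512-point FFT.
freq = rfft_frequencies(16000, 512)

# NumPy's built-in helper should give the same bins for an even fft_size.
freq_builtin = np.fft.rfftfreq(512, d=1 / 16000)
print(freq.shape, np.allclose(freq, freq_builtin))
```

Inside `plot_freq`, the equivalent change would be `freq = np.linspace(0, sample_rate/2, fft_size//2 + 1)`.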

Related

How to subset a xarray.Dataset according to lat/lon values taken from a SRTM DEM extents

I have a year-wise (1980-2020) precipitation dataset in netCDF format. I am importing the files into xarray to get the 40 years of precipitation values merged into one dataset:
import netCDF4
import numpy
import xarray as xr
import pandas as pd
prcp=xr.open_mfdataset('/home/hrsa/Sayantan/HAR_V2/prcp/HARv2_d10km_d_2d_prcp_*.nc',combine = 'nested', concat_dim="time")
prcp
which renders:
xarray.Dataset
Dimensions:      (time: 14976, west_east: 381, south_north: 252)
Coordinates:
    time         (time) datetime64[ns] 1980-01-01 ... 2020-12-31
    west_east    (west_east) float32 -1.675e+06 -1.665e+06 ... 2.125e+06
    south_north  (south_north) float32 -7.45e+05 -7.35e+05 ... 1.765e+06
    lon          (south_north, west_east) float32 dask.array<chunksize=(252, 381), meta=np.ndarray>
    lat          (south_north, west_east) float32 dask.array<chunksize=(252, 381), meta=np.ndarray>
Data variables:
    prcp         (time, south_north, west_east) float32 dask.array<chunksize=(366, 252, 381), meta=np.ndarray>
Attributes: (33)
This is a large dataset, so I need to subset it to the extent of an SRTM image, whose bounds (in EPSG:4326) are defined as:
# Extents of the SRTM DEM covering Panchi_B and the SASE AWS/Base Camp
min_lon = 77.0
min_lat = 32.0
max_lon = 78.0
max_lat = 33.0
In order to subset according to above coordinates I have tried the following:
prcp = prcp.sel(lat = slice(min_lat,max_lat), lon = slice(min_lon,max_lon))
The error output:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/.pyenv/versions/3.9.7/envs/v3.9.7/lib/python3.9/site-packages/xarray/core/indexing.py:73, in group_indexers_by_index(data_obj, indexers, method, tolerance)
72 try:
---> 73 index = xindexes[key]
74 coord = data_obj.coords[key]
KeyError: 'lat'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
Input In [25], in <cell line: 1>()
----> 1 prcp = prcp.sel(lat = slice(min_lat,max_lat), lon = slice(min_lon,max_lon))
File ~/.pyenv/versions/3.9.7/envs/v3.9.7/lib/python3.9/site-packages/xarray/core/dataset.py:2501, in Dataset.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
2440 """Returns a new dataset with each array indexed by tick labels
2441 along the specified dimension(s).
2442
(...)
2498 DataArray.sel
2499 """
2500 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
-> 2501 pos_indexers, new_indexes = remap_label_indexers(
2502 self, indexers=indexers, method=method, tolerance=tolerance
2503 )
2504 # TODO: benbovy - flexible indexes: also use variables returned by Index.query
2505 # (temporary dirty fix).
2506 new_indexes = {k: v[0] for k, v in new_indexes.items()}
File ~/.pyenv/versions/3.9.7/envs/v3.9.7/lib/python3.9/site-packages/xarray/core/coordinates.py:421, in remap_label_indexers(obj, indexers, method, tolerance, **indexers_kwargs)
414 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "remap_label_indexers")
416 v_indexers = {
417 k: v.variable.data if isinstance(v, DataArray) else v
418 for k, v in indexers.items()
419 }
--> 421 pos_indexers, new_indexes = indexing.remap_label_indexers(
422 obj, v_indexers, method=method, tolerance=tolerance
423 )
424 # attach indexer's coordinate to pos_indexers
425 for k, v in indexers.items():
File ~/.pyenv/versions/3.9.7/envs/v3.9.7/lib/python3.9/site-packages/xarray/core/indexing.py:110, in remap_label_indexers(data_obj, indexers, method, tolerance)
107 pos_indexers = {}
108 new_indexes = {}
--> 110 indexes, grouped_indexers = group_indexers_by_index(
111 data_obj, indexers, method, tolerance
112 )
114 forward_pos_indexers = grouped_indexers.pop(None, None)
115 if forward_pos_indexers is not None:
File ~/.pyenv/versions/3.9.7/envs/v3.9.7/lib/python3.9/site-packages/xarray/core/indexing.py:84, in group_indexers_by_index(data_obj, indexers, method, tolerance)
82 except KeyError:
83 if key in data_obj.coords:
---> 84 raise KeyError(f"no index found for coordinate {key}")
85 elif key not in data_obj.dims:
86 raise KeyError(f"{key} is not a valid dimension or coordinate")
KeyError: 'no index found for coordinate lat'
How can I resolve this issue? Any help will be appreciated, Thank you.
Edit (for @Robert Wilson):
In order to find out the ranges, I did the following:
lon = prcp.lon.to_dataframe()
lon
lat = prcp.lat.to_dataframe()
lat
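The message "no index found for coordinate lat" suggests the cause: `lat` and `lon` are two-dimensional coordinates over `(south_north, west_east)`, not indexed dimensions, so `.sel` has nothing to slice. One possible workaround is to build a boolean mask from the 2D arrays and convert it to positional slices. The sketch below uses a small synthetic grid as a stand-in for `prcp.lat.values` and `prcp.lon.values`; the grid itself and the variable names are assumptions for illustration.

```python
import numpy as np

# Extents from the question.
min_lon, max_lon = 77.0, 78.0
min_lat, max_lat = 32.0, 33.0

# Hypothetical 2D grid standing in for prcp.lat.values / prcp.lon.values.
lon2d, lat2d = np.meshgrid(np.linspace(75.0, 82.0, 8),
                           np.linspace(30.0, 35.0, 6))

# True wherever a grid cell falls inside the bounding box.
mask = ((lat2d >= min_lat) & (lat2d <= max_lat) &
        (lon2d >= min_lon) & (lon2d <= max_lon))

# Rows and columns that contain at least one cell inside the box.
rows = np.where(mask.any(axis=1))[0]
cols = np.where(mask.any(axis=0))[0]
row_slice = slice(int(rows.min()), int(rows.max()) + 1)
col_slice = slice(int(cols.min()), int(cols.max()) + 1)

# With the real dataset the slices would be applied positionally, e.g.
# prcp.isel(south_north=row_slice, west_east=col_slice)
print(row_slice, col_slice)
```

An alternative that stays inside xarray is `prcp.where((prcp.lat >= min_lat) & (prcp.lat <= max_lat) & (prcp.lon >= min_lon) & (prcp.lon <= max_lon), drop=True)`, which should mask cells outside the box and drop rows and columns that are entirely outside it.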

ValueError: Incompatible dimension for X and Y matrices in cosine similarity

I'm trying to find the cosine similarity between two sets of documents in Python 3.x, so I wrote the following code:
count_vectorizer = CountVectorizer(stop_words=stopwords)
sparse_matrix = count_vectorizer.fit_transform(formatted0)
doc_term_matrix = sparse_matrix.todense()
sparse_matrix = count_vectorizer.fit_transform(formatted)
doc_term_matrix1 = sparse_matrix.todense()
z=cosine_similarity(doc_term_matrix,doc_term_matrix1)
The length of doc_term_matrix is 29982 and the length of doc_term_matrix1 is 346, but I'm getting this error message:
/opt/conda/lib/python3.9/site-packages/sklearn/utils/validation.py:593: FutureWarning: np.matrix usage is deprecated in 1.0 and will raise a TypeError in 1.2. Please convert to a numpy array with np.asarray. For more information see: https://numpy.org/doc/stable/reference/generated/numpy.matrix.html
warnings.warn(
/opt/conda/lib/python3.9/site-packages/sklearn/utils/validation.py:593: FutureWarning: np.matrix usage is deprecated in 1.0 and will raise a TypeError in 1.2. Please convert to a numpy array with np.asarray. For more information see: https://numpy.org/doc/stable/reference/generated/numpy.matrix.html
warnings.warn(
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_888/79579735.py in <module>
----> 1 z419=cosineSimilarity(splittedCosine419,doc_term_matrix)
2 z419
/tmp/ipykernel_888/2223236548.py in cosineSimilarity(splitted_german, doc_term_matrix)
8 sparse_matrix = count_vectorizer.fit_transform(formatted)
9 doc_term_matrix1 = sparse_matrix.todense()
---> 10 z=cosine_similarity(doc_term_matrix1,doc_term_matrix)
11 return z
/opt/conda/lib/python3.9/site-packages/sklearn/metrics/pairwise.py in cosine_similarity(X, Y, dense_output)
1249 # to avoid recursive import
1250
-> 1251 X, Y = check_pairwise_arrays(X, Y)
1252
1253 X_normalized = normalize(X, copy=True)
/opt/conda/lib/python3.9/site-packages/sklearn/metrics/pairwise.py in check_pairwise_arrays(X, Y, precomputed, dtype, accept_sparse, force_all_finite, copy)
179 )
180 elif X.shape[1] != Y.shape[1]:
--> 181 raise ValueError(
182 "Incompatible dimension for X and Y matrices: "
183 "X.shape[1] == %d while Y.shape[1] == %d" % (X.shape[1], Y.shape[1])
ValueError: Incompatible dimension for X and Y matrices: X.shape[1] == 1027 while Y.shape[1] == 10346
Can you suggest steps to resolve this issue?
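The two matrices have different column counts because `fit_transform` is called twice, so each call learns its own vocabulary. A minimal sketch of one way around this, assuming it is acceptable to build a single vocabulary from both document sets: fit the vectorizer once, then only `transform` each set. The two short document lists are hypothetical stand-ins for `formatted0` and `formatted`.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-ins for `formatted0` and `formatted` from the question.
docs_a = ["the cat sat on the mat", "dogs chase cats", "birds sing"]
docs_b = ["the dog sat on the log", "cats sing"]

vectorizer = CountVectorizer()
vectorizer.fit(docs_a + docs_b)          # learn one shared vocabulary

matrix_a = vectorizer.transform(docs_a)  # same columns for both sets
matrix_b = vectorizer.transform(docs_b)

# cosine_similarity accepts sparse input, so .todense() is not needed;
# skipping it also avoids the np.matrix FutureWarning.
similarity = cosine_similarity(matrix_a, matrix_b)
print(similarity.shape)
```

Fitting on only one of the two sets and transforming the other is also possible; words unseen during fitting are then simply ignored.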

I keep getting "TypeError: only integer scalar arrays can be converted to a scalar index" while using custom-defined metric in KNeighborsClassifier

I am using a custom-defined metric in scikit-learn's KNeighborsClassifier. Here's my code:
def chi_squared(x,y):
    return np.divide(np.square(np.subtract(x,y)), np.sum(x,y))
The function above is my implementation of the chi-squared distance. I used NumPy functions because, according to the scikit-learn docs, a metric function takes two one-dimensional NumPy arrays.
I have passed the chi_squared function as an argument to KNeighborsClassifier().
knn = KNeighborsClassifier(algorithm='ball_tree', metric=chi_squared)
However, I keep getting the following error:
TypeError Traceback (most recent call last)
<ipython-input-29-d2a365ebb538> in <module>
4
5 knn = KNeighborsClassifier(algorithm='ball_tree', metric=chi_squared)
----> 6 knn.fit(X_train, Y_train)
7 predictions = knn.predict(X_test)
8 print(accuracy_score(Y_test, predictions))
~/.local/lib/python3.8/site-packages/sklearn/neighbors/_classification.py in fit(self, X, y)
177 The fitted k-nearest neighbors classifier.
178 """
--> 179 return self._fit(X, y)
180
181 def predict(self, X):
~/.local/lib/python3.8/site-packages/sklearn/neighbors/_base.py in _fit(self, X, y)
497
498 if self._fit_method == 'ball_tree':
--> 499 self._tree = BallTree(X, self.leaf_size,
500 metric=self.effective_metric_,
501 **self.effective_metric_params_)
sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree.__init__()
sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree._recursive_build()
sklearn/neighbors/_ball_tree.pyx in sklearn.neighbors._ball_tree.init_node()
sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree.rdist()
sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.DistanceMetric.rdist()
sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.PyFuncDistance.dist()
sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.PyFuncDistance._dist()
<ipython-input-29-d2a365ebb538> in chi_squared(x, y)
1 def chi_squared(x,y):
----> 2 return np.divide(np.square(np.subtract(x,y)), np.sum(x,y))
3
4
5 knn = KNeighborsClassifier(algorithm='ball_tree', metric=chi_squared)
<__array_function__ internals> in sum(*args, **kwargs)
~/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py in sum(a, axis, dtype, out, keepdims, initial, where)
2239 return res
2240
-> 2241 return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
2242 initial=initial, where=where)
2243
~/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
85 return reduction(axis=axis, out=out, **passkwargs)
86
---> 87 return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
88
89
TypeError: only integer scalar arrays can be converted to a scalar index
I can reproduce your error message with:
In [173]: x=np.arange(3); y=np.array([2,3,4])
In [174]: np.sum(x,y)
Traceback (most recent call last):
File "<ipython-input-174-1a1a267ebd82>", line 1, in <module>
np.sum(x,y)
File "<__array_function__ internals>", line 5, in sum
File "/usr/local/lib/python3.8/dist-packages/numpy/core/fromnumeric.py", line 2247, in sum
return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
File "/usr/local/lib/python3.8/dist-packages/numpy/core/fromnumeric.py", line 87, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
TypeError: only integer scalar arrays can be converted to a scalar index
Correct use(s) of np.sum:
In [175]: np.sum(x)
Out[175]: 3
In [177]: np.sum(np.arange(6).reshape(2,3), axis=0)
Out[177]: array([3, 5, 7])
In [178]: np.sum(np.arange(6).reshape(2,3), 0)
Out[178]: array([3, 5, 7])
(re)read the np.sum docs if necessary!
Using np.add instead of np.sum:
In [179]: np.add(x,y)
Out[179]: array([2, 4, 6])
In [180]: x+y
Out[180]: array([2, 4, 6])
The following should be equivalent:
np.divide(np.square(np.subtract(x,y)), np.add(x,y))
(x-y)**2/(x+y)
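Note that both expressions return an array with one entry per feature, whereas a custom metric passed to scikit-learn is expected to return a single number. A minimal sketch of a scalar version follows, assuming the usual definition of chi-squared distance as the sum of the per-feature terms (some references also multiply by 0.5). The `eps` guard against zero denominators and the function name are additions for illustration, not part of the original code.

```python
import numpy as np

def chi_squared_distance(x, y, eps=1e-12):
    """Sum over features of (x - y)^2 / (x + y), returned as a float."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # eps (an assumption) avoids 0/0 when both features are zero.
    return float(np.sum((x - y) ** 2 / (x + y + eps)))

d = chi_squared_distance([1, 2, 3], [2, 3, 4])
print(d)
```

It would be passed as `KNeighborsClassifier(metric=chi_squared_distance)`. A ball tree assumes the function is a true metric, which this form may not satisfy, so `algorithm='brute'` could be the safer choice.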

Unsupported operand types with df.copy() method

I'm uploading my dataset, and I'm copying my dataset, but an error is appearing.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
house_data=pd.read_csv("/home/houseprice.csv")
# we evaluate the price of a house for those cases where the information is missing, for each variable
def analyse_na_value(df, var):
    df - df.copy()
    # we mark the variable as 1 where the observation is missing
    # and as 0 where the observation has a real value
    df[var] = np.where(df[var].isnull(), 1 , 0)
    #print(df[var].isnull())
    # we calculate the median SalePrice where the information is missing or present
    df.groupby(var)['SalePrice'].median().plot.bar()
    plt.title(var)
    plt.show()
for var in vars_with_na:
    analyse_na_value(house_data, var)
This is the line that raises the error. When I comment it out, I don't get an error:
df - df.copy()
TypeError Traceback (most recent call last)
~/anaconda3/lib/python3.8/site-packages/pandas/core/ops/array_ops.py in na_arithmetic_op(left, right, op, is_cmp)
142 try:
--> 143 result = expressions.evaluate(op, left, right)
144 except TypeError:
~/anaconda3/lib/python3.8/site-packages/pandas/core/computation/expressions.py in evaluate(op, a, b, use_numexpr)
232 if use_numexpr:
--> 233 return _evaluate(op, op_str, a, b) # type: ignore
234 return _evaluate_standard(op, op_str, a, b)
~/anaconda3/lib/python3.8/site-packages/pandas/core/computation/expressions.py in _evaluate_numexpr(op, op_str, a, b)
118 if result is None:
--> 119 result = _evaluate_standard(op, op_str, a, b)
120
~/anaconda3/lib/python3.8/site-packages/pandas/core/computation/expressions.py in _evaluate_standard(op, op_str, a, b)
67 with np.errstate(all="ignore"):
---> 68 return op(a, b)
69
TypeError: unsupported operand type(s) for -: 'str' and 'str'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-31-25d58bc46c86> in <module>
15
16 for var in vars_with_na:
---> 17 analyse_na_value(house_data, var)
<ipython-input-31-25d58bc46c86> in analyse_na_value(df, var)
1 #we evaluate the price of a house for those cases where the information is missing, for each variable
2 def analyse_na_value(df, var):
----> 3 df - df.copy()
4
5 # we indicate as a variable as 1 where the observation is missing
~/anaconda3/lib/python3.8/site-packages/pandas/core/ops/__init__.py in f(self, other, axis, level, fill_value)
649 if isinstance(other, ABCDataFrame):
650 # Another DataFrame
--> 651 new_data = self._combine_frame(other, na_op, fill_value)
652
653 elif isinstance(other, ABCSeries):
~/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py in _combine_frame(self, other, func, fill_value)
5864 return func(left, right)
5865
-> 5866 new_data = ops.dispatch_to_series(self, other, _arith_op)
5867 return new_data
5868
~/anaconda3/lib/python3.8/site-packages/pandas/core/ops/__init__.py in dispatch_to_series(left, right, func, axis)
273 # _frame_arith_method_with_reindex
274
--> 275 bm = left._mgr.operate_blockwise(right._mgr, array_op)
276 return type(left)(bm)
277
~/anaconda3/lib/python3.8/site-packages/pandas/core/internals/managers.py in operate_blockwise(self, other, array_op)
362 Apply array_op blockwise with another (aligned) BlockManager.
363 """
--> 364 return operate_blockwise(self, other, array_op)
365
366 def apply(self: T, f, align_keys=None, **kwargs) -> T:
~/anaconda3/lib/python3.8/site-packages/pandas/core/internals/ops.py in operate_blockwise(left, right, array_op)
36 lvals, rvals = _get_same_shape_values(blk, rblk, left_ea, right_ea)
37
---> 38 res_values = array_op(lvals, rvals)
39 if left_ea and not right_ea and hasattr(res_values, "reshape"):
40 res_values = res_values.reshape(1, -1)
~/anaconda3/lib/python3.8/site-packages/pandas/core/ops/array_ops.py in arithmetic_op(left, right, op)
188 else:
189 with np.errstate(all="ignore"):
--> 190 res_values = na_arithmetic_op(lvalues, rvalues, op)
191
192 return res_values
~/anaconda3/lib/python3.8/site-packages/pandas/core/ops/array_ops.py in na_arithmetic_op(left, right, op, is_cmp)
148 # will handle complex numbers incorrectly, see GH#32047
149 raise
--> 150 result = masked_arith_op(left, right, op)
151
152 if is_cmp and (is_scalar(result) or result is NotImplemented):
~/anaconda3/lib/python3.8/site-packages/pandas/core/ops/array_ops.py in masked_arith_op(x, y, op)
90 if mask.any():
91 with np.errstate(all="ignore"):
---> 92 result[mask] = op(xrav[mask], yrav[mask])
93
94 else:
TypeError: unsupported operand type(s) for -: 'str' and 'str'
As far as I know, copy() works in Python 3, but I don't know whether it works with pandas under Python 3.
How can I get rid of this error without commenting out that line?
I think you are supposed to write df = df.copy() (assignment), not df - df.copy(). As written, you are subtracting the data frame from itself, which fails on the string columns. I would also recommend using a different variable name for the copy. See the official pandas documentation for DataFrame.copy.
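To make the difference concrete, here is a minimal sketch with the assignment in place. The tiny data frame is a hypothetical stand-in for houseprice.csv, and the function returns the medians instead of plotting them so that the example stays self-contained.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the house-price data.
house_data = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 80.0, np.nan],
    "SalePrice": [200000, 150000, 250000, 170000],
})

def median_price_by_missingness(df, var):
    df = df.copy()  # assignment: work on a copy, leave the caller's frame alone
    # 1 where the observation is missing, 0 where it has a real value
    df[var] = np.where(df[var].isnull(), 1, 0)
    return df.groupby(var)["SalePrice"].median()

medians = median_price_by_missingness(house_data, "LotFrontage")
print(medians)
```

Because the function works on a copy, `house_data` still contains its original NaN values afterwards.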

TypeError: 'float' object cannot be interpreted as an integer in stride_tricks.as_strided

I get an error when trying to replicate the code given here.
import numpy as np
n=4
m=5
a = np.arange(1,n*m+1).reshape(n,m)
sz = a.itemsize
h,w = a.shape
bh,bw = 2,2
shape = (h/bh, w/bw, bh, bw)
strides = sz*np.array([w*bh,bw,w,1])
blocks=np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
print(blocks)
I got the following error message. What might be the reason?
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-0c3a23be3e7f> in <module>
12
13
---> 14 blocks=np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
15 print(blocks)
~\AppData\Local\Continuum\anaconda3\envs\dropletflow\lib\site-packages\numpy\lib\stride_tricks.py in as_strided(x, shape, strides, subok, writeable)
100 interface['strides'] = tuple(strides)
101
--> 102 array = np.asarray(DummyArray(interface, base=x))
103 # The route via `__interface__` does not preserve structured
104 # dtypes. Since dtype should remain unchanged, we set it explicitly.
~\AppData\Local\Continuum\anaconda3\envs\dropletflow\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
499
500 """
--> 501 return array(a, dtype, copy=False, order=order)
502
503
TypeError: 'float' object cannot be interpreted as an integer
Your shape is (2.0, 2.5, 2, 2), but the shape parameter expects a sequence of integers (as described in the API documentation for np.lib.stride_tricks.as_strided). In Python 3, h/bh and w/bw are true divisions and always produce floats.
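A minimal sketch of the same code with floor division, assuming it is acceptable to ignore the columns that do not fill a complete block (here `w = 5` is not a multiple of `bw = 2`, so the last column is left out):

```python
import numpy as np

n, m = 4, 5
a = np.arange(1, n * m + 1).reshape(n, m)
sz = a.itemsize
h, w = a.shape
bh, bw = 2, 2

# Floor division keeps every entry of the shape an int.
shape = (h // bh, w // bw, bh, bw)
# Byte strides: next block row, next block column, next row, next element.
strides = tuple(int(s) for s in sz * np.array([w * bh, bw, w, 1]))

blocks = np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
print(blocks)
```

`as_strided` performs no bounds checking, so it is worth confirming that the shape and strides stay inside the original array before relying on the result.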
