'set' object cannot be interpreted as an integer - python-3.x

I have the following python code:
text = "this’s a sent tokenize test. this is sent two. is this sent three? sent 4 is cool! Now it’s your turn."
from nltk.tokenize import sent_tokenize
sent_tokenize_list = sent_tokenize(text)
import numpy as np
lenDoc=len(sent_tokenize_list)
features={'position','rate'}
score = np.empty((lenDoc, 2), dtype=object)
score=[[0 for x in range(sent_tokenize_list)] for y in range(features)]
for i,sentence in enumerate(sent_tokenize_list):
score[i,features].append((lenDoc-i)/lenDoc)
But it results in the following error:
TypeError Traceback (most recent call last) <ipython-input-27-c53da2b2ab02> in <module>()
13
14
---> 15 score=[[0 for x in range(sent_tokenize_list)] for y in range(features)]
16 for i,sentence in enumerate(sent_tokenize_list):
17 score[i,features].append((lenDoc-i)/lenDoc)
TypeError: 'set' object cannot be interpreted as an integer

range() takes an int. features is a set, so range(features) raises this error. range(sent_tokenize_list) has the same problem: sent_tokenize_list is a list, not an int.
If you want x and y to be indexes into sent_tokenize_list and features, use this: score=[[0 for x in range(len(sent_tokenize_list))] for y in range(len(features))]
If you want x and y to be the values of sent_tokenize_list and features themselves, remove range() from that line and iterate over the collections directly.
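As a minimal sketch of one way to build the score table once the range() calls are fixed: the code below assumes a hard-coded list of sentences in place of the sent_tokenize output, and swaps the set for a list so each feature has a stable column index. The names sentences, len_doc and position_col are illustrative, not from the question.

```python
import numpy as np

# Hypothetical stand-in for the output of sent_tokenize(text).
sentences = ["first.", "second.", "third?", "fourth!", "fifth."]

# A list (not a set) so that each feature maps to a fixed column.
features = ["position", "rate"]

len_doc = len(sentences)
# One row per sentence, one column per feature.
score = np.zeros((len_doc, len(features)))

position_col = features.index("position")
for i, _sentence in enumerate(sentences):
    # Earlier sentences get a higher position score.
    score[i, position_col] = (len_doc - i) / len_doc

print(score)
```

Indexing with score[i, position_col] works here because score is a NumPy array; a nested Python list would need score[i][position_col] instead.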

Related

Cannot create a numpy array using numpy's `full()` method and a python list

I can create a numpy array from a python list as follows:
>>> a = [1,2,3]
>>> b = np.array(a).reshape(3,1)
>>> print(b)
[[1]
[2]
[3]]
However, I don't know what causes error in the following code:
Code :
>>> a = [1,2,3]
>>> b = np.full((3,1), a)
Error :
ValueError Traceback (most recent call last)
<ipython-input-275-1ab6c109dda4> in <module>()
1 a = [1,2,3]
----> 2 b = np.full((3,1), a)
3 print(b)
/usr/local/lib/python3.6/dist-packages/numpy/core/numeric.py in full(shape, fill_value, dtype, order)
324 dtype = array(fill_value).dtype
325 a = empty(shape, dtype, order)
--> 326 multiarray.copyto(a, fill_value, casting='unsafe')
327 return a
328
<__array_function__ internals> in copyto(*args, **kwargs)
ValueError: could not broadcast input array from shape (3) into shape (3,1)
Even though the list a has 3 elements and I expect a 3x1 numpy array, the full() method fails to produce it.
I also read NumPy's broadcasting article. However, it focuses on arithmetic operations, so I couldn't get anything useful from it.
It would be great if you could help me understand the difference between the two array creation methods above and the cause of the error.
NumPy cannot broadcast the two shapes together because your list is interpreted as a one-dimensional array (np.array(a).shape is (3,)), which broadcasts like a 'row vector', while you are asking for a 'column vector' (shape (3, 1)). If you are set on using np.full, you can shape your list as a column vector from the start:
>>> import numpy as np
>>>
>>> a = [[1],[2],[3]]
>>> b = np.full((3,1), a)
Another option is to convert a into a numpy array ahead of time and add a new axis to match the desired output shape.
>>> a = [1,2,3]
>>> a = np.array(a)[:, np.newaxis]
>>> b = np.full((3,1), a)
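The broadcasting behaviour described above can be checked with a short sketch. It uses only np.full, np.array and reshape; the variable names failed, col and rows are illustrative. It also shows a shape that the flat list does broadcast into, (2, 3), where the list is repeated along the rows.

```python
import numpy as np

a = [1, 2, 3]

# A flat list has shape (3,), which cannot be broadcast into (3, 1).
failed = False
try:
    np.full((3, 1), a)
except ValueError:
    failed = True

# Reshaping to a column first makes the shapes match exactly.
col = np.array(a).reshape(3, 1)
b = np.full((3, 1), col)

# The same flat list does broadcast into (2, 3): it fills every row.
rows = np.full((2, 3), a)

print(failed)
print(b)
print(rows)
```

Since col already has the target shape, np.full adds nothing over np.array(a).reshape(3, 1) in this case; it is mainly useful when the fill value really is meant to be repeated.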

Is there a way to get the sum of non-zero elements in numpy array? I keep getting a TypeError

I googled this question extensively and I can't figure out what's wrong. I keep getting:
"TypeError: '>' not supported between instances of 'numpy.ndarray' and 'int'".
I searched it and the websites led me to download a package and that still didn't solve my issue. My code looks as follows:
result=[]
FracPos = np.array(result)
for x in lines:
result.append(x.split())
TotalCells = np.array(result)[:,2]
print(type(TotalCells))
print(type(FracPos))
FracPos = np.sum((TotalCells)>0)
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-12-0972db6a45c4> in <module>
17 TotalCells = np.array(result)[:,2]
18 print(type(TotalCells))
---> 19 FracPos = np.sum((TotalCells)>0)
TypeError: '>' not supported between instances of 'numpy.ndarray' and 'int'
I can't figure out why I get the error on the last line or how to fix it. I am very new to Python, so I'm sure I'm missing something obvious, but the other questions like this are about comparing numpy.ndarray with strings or lists, where I understand why the comparison fails.
It is because result holds str objects: x.split() returns strings, so TotalCells is an array of strings and cannot be compared with an int. If that column contains only numbers, convert it first:
TotalCells = np.array(result)[:, 2].astype(float)
Then, in order to get the sum of the positive elements, you would have to do:
FracPos = np.sum(TotalCells[TotalCells > 0])
The line you wrote counts the number of positive elements; it does not sum them.
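A small sketch of the difference between counting and summing the positive entries. The contents of lines are made up for illustration (the question does not show the real data), and the third column is converted with astype(float) before any comparison.

```python
import numpy as np

# Hypothetical input: whitespace-separated rows with a number in column 2.
lines = ["a b 3", "c d 0", "e f 5", "g h -2"]

result = [line.split() for line in lines]

# Splitting yields strings, so convert the column to float first.
total_cells = np.array(result)[:, 2].astype(float)

# Summing the boolean mask counts how many entries are positive.
count_positive = np.sum(total_cells > 0)

# Indexing with the mask selects the positive entries, which are then summed.
sum_positive = np.sum(total_cells[total_cells > 0])

print(count_positive, sum_positive)
```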

Creating mathematical equations using numpy in python

I want to create equations using numpy array multiplication, i.e. I want to keep all variables in one array and their coefficients in another array, and multiply the two to produce an expression that I can pass to the m.Equation() method of GEKKO. I tried the code below but it failed; please let me know how I can achieve my goal.
By "it failed" I mean that it raised an error and did not let me use x*y==1 as an equation in the m.Equation() method available in GEKKO.
import numpy as np
from gekko import GEKKO
X = np.array([x,y,z])
y = np.array([4,5,6])
m = GEKKO(remote=False)
m.Equation(x*y==1)
# I wanted to get a result like 4x+5y+6z=1
The error I get is below
Traceback (most recent call last):
File "C:\Users\kk\AppData\Local\Programs\Python\Python37\MY WORK FILES\numpy practise.py", line 5, in <module>
X = np.array([x,y,z])
NameError: name 'x' is not defined
You need to define variables and make the coefficients into a Gekko object. You can use an array to make the variables and a parameter for the coefficients:
from gekko import GEKKO
m = GEKKO(remote=False)
X = m.Array(m.Var, 3)
y = m.Param([4, 5, 6])
eq = m.Equation(X.dot(y) == 1)
print(eq.value)
Output:
((((v1)*(4))+((v2)*(5)))+((v3)*(6)))=1

Fit function hmmlearn doesn't work: fit() takes 2 positional arguments but 3 were given

I am trying to run a hidden markov model, however the fit function doesn't work properly.
Code:
import numpy as np
from hmmlearn import hmm
X1 = [[0.5], [1.0], [-1.0], [0.42], [0.24]]
X2 = [[2.4], [4.2], [0.5], [-0.24]]
X = np.concatenate([X1, X2])
lengths = [len(X1), len(X2)]
hmm.GaussianHMM(n_components=3).fit(X, lengths)
I get this error message:
TypeError Traceback (most recent call last)
<ipython-input-16-cdfada1be202> in <module>()
8 lengths = [len(X1), len(X2)]
9
---> 10 hmm.GaussianHMM(n_components=3).fit(X, lengths)
TypeError: fit() takes 2 positional arguments but 3 were given
Please check which version of hmmlearn you have and update it. The lengths parameter is available in newer versions, as documented here:
http://hmmlearn.readthedocs.io/en/latest/api.html#hmmlearn.hmm.GaussianHMM.fit
Then try (as @Harpal suggested):
hmm.GaussianHMM(n_components=3).fit(X, lengths=lengths)
This error can be reproduced with hmmlearn 0.1.1.
However, if you run pip install hmmlearn==0.2.0 in your virtual env and then call hmm.GaussianHMM(n_components=3).fit(X, lengths=lengths),
things should work out just fine!
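For context on what lengths carries: X stacks both observation sequences into one array, and lengths records how many rows belong to each. The sketch below uses only NumPy and does not call hmmlearn; it recovers the original sequences from X and lengths to show that the pair is consistent. The names boundaries and recovered are illustrative.

```python
import numpy as np

X1 = [[0.5], [1.0], [-1.0], [0.42], [0.24]]
X2 = [[2.4], [4.2], [0.5], [-0.24]]

# Stack the sequences and remember how long each one is.
X = np.concatenate([X1, X2])
lengths = [len(X1), len(X2)]

# The lengths must add up to the number of rows in X.
assert sum(lengths) == X.shape[0]

# Split points are the cumulative lengths, excluding the final total.
boundaries = np.cumsum(lengths)[:-1]
recovered = np.split(X, boundaries)

print([seq.shape for seq in recovered])
```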

Numpy printing with iterator .format throws an error

I'm using the Anaconda distribution, Python 3.61, and a Jupyter notebook for SciPy/NumPy. I can use print(' blah {} '.format(x)) to format numbers, but if I iterate over a NumPy array I get an error.
# test of formatting
'{:+f}; {:+f}'.format(3.14, -3.14) # show it always
This example is taken from the Python 3.6 manual, section 6.1.3.2, and I get the expected response. So I know it isn't that I've forgotten to import something, i.e. it is built in.
If I do this:
C_sense = C_pixel + C_stray
print('Capacitance of node')
for x, y in np.nditer([Names,C_sense]):
print('The {} has C ={} [F]'.format(x,y))
I get output
Capacitance of node
The 551/751 has C =8.339999999999999e-14 [F]
The 554 has C =3.036e-13 [F]
The 511 has C =1.0376e-12 [F]
But if I do this:
# now with formatting
C_sense = C_pixel + C_stray
print('Capacitance of node')
for x, y in np.nditer([Names,C_sense]):
print('The {} has C ={:.3f} [F]'.format(x,y))
I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-9-321e0b5edb03> in <module>()
3 print('Capacitance of node')
4 for x, y in np.nditer([Names,C_sense]):
----> 5 print('The {} has C ={:.3f} [F]'.format(x,y))
TypeError: unsupported format string passed to numpy.ndarray.__format__
I've attached a screen shot of my Jupyter notebook to show context of this code.
The error comes from the formatter: np.nditer yields zero-dimensional numpy arrays rather than plain Python numbers, and numpy.ndarray.__format__ does not know what to do with a format spec such as .3f.
Does the following work?
for x,y in zip(Names,C_sense):
print('The {} has C ={:.3f} [F]'.format(x,y))
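A self-contained sketch of the zip approach, using the names and values visible in the question's output as sample data (the real Names and C_sense arrays are not shown, so these are assumptions). It formats with .3e rather than .3f, because values of this size all print as 0.000 in fixed-point notation.

```python
import numpy as np

# Sample data reconstructed from the printed output in the question.
names = np.array(["551/751", "554", "511"])
c_sense = np.array([8.34e-14, 3.036e-13, 1.0376e-12])

formatted = []
for name, c in zip(names, c_sense):
    # zip yields NumPy scalars; float() makes the conversion explicit.
    formatted.append("The {} has C = {:.3e} [F]".format(name, float(c)))

for line in formatted:
    print(line)
```

Calling float() on the value would also be a way to keep the np.nditer loop, since it turns the zero-dimensional array into a plain Python float before formatting.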
