Why does second iteration always fail? - python-3.x

I am trying to create a dict of z_scores by filtering a dataframe based upon five locations.
No matter which location is first in the list, I always get the first key:value pair placed into
the dict, and no matter which location is second, I always get this error:
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm 2018.2.1\plugins\python\helpers\pydev\pydevd.py", line 1434, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files\JetBrains\PyCharm 2018.2.1\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/Mark/PycharmProjects/main/main.py", line 104, in <module>
z_score = z_score(base, df['SalePrice'])
TypeError: 'numpy.float64' object is not callable
Since every list value works when it is first, I don't see why every subsequent iteration fails.
My code:
def z_score(val, array, bessel=0):
mean = array.mean()
st_dev = std(array, ddof=bessel)
distance = val - mean
z = distance / st_dev
return z
neighborhoods = ['NAmes', 'CollgCr', 'OldTown', 'Edwards', 'Somerst']
base = 200000
z_scores = {}
for neighborhood in neighborhoods:
df = houses.loc[houses['Neighborhood'] == neighborhood]
z_score = z_score(base, df['SalePrice'])
z_scores[neighborhood] = z_score
sorted_z_scores = sorted(z_scores.items(), key=lambda x: x[1], reverse=True)
print(sorted_z_scores)

In python, the interpreter puts a higher priority on your variable names in the local scope than method names, so when you use z_score as a variable name, it masks access to the z_score method name, if you change the name of your z_score variable, your code should run.

Related

why is a generator function sometimes called not a generator?

For the same object definition in the same program I get TypeError: 'G' object is not iterable in one place and not in another. I am wondering why and if there is a workaround?
class G: # generator class
def __init__(self):
self.c = 1
def __next__(self):
yield self.c
g = G()
for i in range(3):
print(next(g))
g = G()
i = 0
for x in g:
print(x)
i += 1
if i >= 3:
break
In the first for loop it works but in the second for loop I get the runtime error message TypeError: 'G' object is not iterable
The exact output is here:
<generator object G.__next__ at 0x000001ECA83E43C0>
<generator object G.__next__ at 0x000001ECA83E43C0>
<generator object G.__next__ at 0x000001ECA83E43C0>
Traceback (most recent call last):
File "C:\Users\15104\AppData\Local\Programs\Python\Python310\lib\code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 1, in <module>
File "C:\Program Files\JetBrains\PyCharm 2022.2\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm 2022.2\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/15104/Dropbox/Programming/Python/try_tesserOCR/try1/try_with1.py", line 16, in <module>
for x in g:
TypeError: 'G' object is not iterable
Although in this simple example one might say "why would anybody do that", this is because it is taken out of a longer program. Taking out the second g = G() doesn't change the behavior.
The only difference I see in the code is that the first loop calls the next() function explicitly and the second one calls it implicitly through the for x in g: statement. However it seems like "for" is perfectly valid way to use an iterable.

What type do I need to pass to multiprocessing.Value for tables object

I have a table object that I want to pass it to multiple threads. I use multiprocessing.Value function to create a semaphore for that object. However, it tells me that Float32Atom is not hashable. Not sure what to do in this case?
>>> import tables as tb
>>> f = tb.open_file('dot.h5', 'w')
>>> filters = tb.Filters(complevel=5, complib='blosc')
>>> n_ = 10000
>>> W_hat = f.create_carray(f.root, 'data', tb.Float32Atom(), shape=(n_, n_), filters=filters)
>>> W_hat = Value(tb.Float32Atom(), W_hat)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/lib/python3.8/multiprocessing/context.py", line 135, in Value
return Value(typecode_or_type, *args, lock=lock,
File "/home/lib/python3.8/multiprocessing/sharedctypes.py", line 74, in Value
obj = RawValue(typecode_or_type, *args)
File "/home/lib/python3.8/multiprocessing/sharedctypes.py", line 48, in RawValue
type_ = typecode_to_type.get(typecode_or_type, typecode_or_type)
TypeError: unhashable type: 'Float32Atom'
If it is correct that you have only threads (and no processes), you can just use multiprocessing.Semaphore. All threads run in the same context, so you can use it for all of them.
see https://docs.python.org/3/library/threading.html#semaphore-objects

Does the object.attribute syntax in python count as a name?

I am wondering if something counts as a name if it is expressed as an object.attribute syntax. The motivation comes from trying to understand this code from Learning Python:
def makeopen(id):
original = builtins.open
def custom(*pargs, **kargs):
print('Custom open call %r' %id, pargs, kargs)
return original(*pargs,*kargs)
builtins.open(custom)
I wanted to map out each name/variable to the scope that they exist in. I am unsure what to do with builtins.open. Is builtins.open a name? In the book the author does state that object.attribute lookup follows completely different rules to plain looksups, which would mean to me that builtins.open is not a name at all, since the execution model docs say that scopes define where names are visible. Since object.attribute syntax is visible in any scope, it doesn't fit into this classification and is not a name.
However the conceptual problem I have is then defining what builtins.open is? It is still a reference to an object, and can be rebound to any other object. In that sense it is a name, even although it doesn't follow scope rules?
Thank you.
builtins.open is just another way to access the global open function:
import builtins
print(open)
# <built-in function open>
print(builtins.open)
# <built-in function open>
print(open == builtins.open)
# True
From the docs:
This module provides direct access to all ‘built-in’ identifiers of
Python; for example, builtins.open is the full name for the built-in
function open()
Regarding the second part of your question, I'm not sure what you mean. (Almost) every "name" in Python can be reassigned to something completely different.
>>> list
<class 'list'>
>>> list = 1
>>> list
1
However, everything under builtins is protected, otherwise some nasty weird behavior was bound to happen in case someone(thing) reassigned its attributes during runtime.
>>> import builtins
>>> builtins.list = 1
Traceback (most recent call last):
File "C:\Program Files\PyCharm 2018.2.1\helpers\pydev\_pydev_comm\server.py", line 34, in handle
self.processor.process(iprot, oprot)
File "C:\Program Files\PyCharm 2018.2.1\helpers\third_party\thriftpy\_shaded_thriftpy\thrift.py", line 266, in process
self.handle_exception(e, result)
File "C:\Program Files\PyCharm 2018.2.1\helpers\third_party\thriftpy\_shaded_thriftpy\thrift.py", line 254, in handle_exception
raise e
File "C:\Program Files\PyCharm 2018.2.1\helpers\third_party\thriftpy\_shaded_thriftpy\thrift.py", line 263, in process
result.success = call()
File "C:\Program Files\PyCharm 2018.2.1\helpers\third_party\thriftpy\_shaded_thriftpy\thrift.py", line 228, in call
return f(*(args.__dict__[k] for k in api_args))
File "C:\Program Files\PyCharm 2018.2.1\helpers\pydev\_pydev_bundle\pydev_console_utils.py", line 217, in getFrame
return pydevd_thrift.frame_vars_to_struct(self.get_namespace(), hidden_ns)
File "C:\Program Files\PyCharm 2018.2.1\helpers\pydev\_pydevd_bundle\pydevd_thrift.py", line 239, in frame_vars_to_struct
keys = dict_keys(frame_f_locals)
File "C:\Program Files\PyCharm 2018.2.1\helpers\pydev\_pydevd_bundle\pydevd_constants.py", line 173, in dict_keys
return list(d.keys())
TypeError: 'int' object is not callable

Specify a one or many field using marshmallow

I would like to specify a schema field wichi accepts one or many resources. However I seem only able to specify one behavior or the other.
>>> class Resource(marshmallow.Schema):
... data = marshmallow.fields.Dict()
...
>>> class ContainerSchema(marshmallow.Schema):
... resource = marshmallow.fields.Nested(ResourceSchema, many=True)
...
>>> ContainerSchema().dump({'resource': [{'data': 'DATA'}]})
MarshalResult(data={'resource': [{'data': 'DATA'}]}, errors={})
In the above example a list must be defined. However I would prefer not to:
>>> ContainerSchema().dump({'resource': {'data': 'DATA'}})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/lib64/python3.6/site-packages/marshmallow/schema.py", line 513, in dump
**kwargs
File "/lib64/python3.6/site-packages/marshmallow/marshalling.py", line 147, in serialize
index=(index if index_errors else None)
File "/lib64/python3.6/site-packages/marshmallow/marshalling.py", line 68, in call_and_store
value = getter_func(data)
File "/lib64/python3.6/site-packages/marshmallow/marshalling.py", line 141, in <lambda>
getter = lambda d: field_obj.serialize(attr_name, d, accessor=accessor)
File "/lib64/python3.6/site-packages/marshmallow/fields.py", line 252, in serialize
return self._serialize(value, attr, obj)
File "/lib64/python3.6/site-packages/marshmallow/fields.py", line 448, in _serialize
schema._update_fields(obj=nested_obj, many=self.many)
File "/lib64/python3.6/site-packages/marshmallow/schema.py", line 760, in _update_fields
ret = self.__filter_fields(field_names, obj, many=many)
File "/lib64/python3.6/site-packages/marshmallow/schema.py", line 810, in __filter_fields
obj_prototype = obj[0]
KeyError: 0
Can I have a schema allowing both a single item or many of it?
The point with giving the arguments as a list - whether it's one or many - is so the schema knows how to handle it in either case. For the schema to process arguments of a different format, like not in a list, you need to add a preprocessor to the schema, like this:
class ContainerSchema(marshmallow.Schema):
resource = marshmallow.fields.Nested(ResourceSchema, many=True)
#pre_dump
def wrap_indata(self, indata):
if type(indata['resource']) is dict:
indata['resource'] = [indata['resource']]

Contradicting Errors?

So I'm trying to edit a csv file by writing to a temporary file and eventually replacing the original with the temp file. I'm going to have to edit the csv file multiple times so I need to be able to reference it. I've never used the NamedTemporaryFile command before and I'm running into a lot of difficulties. The most persistent problem I'm having is writing over the edited lines.
This part goes through and writes over rows unless specific values are in a specific column and then it just passes over.
I have this:
office = 3
temp = tempfile.NamedTemporaryFile(delete=False)
with open(inFile, "rb") as oi, temp:
r = csv.reader(oi)
w = csv.writer(temp)
for row in r:
if row[office] == "R00" or row[office] == "ALC" or row[office] == "RMS":
pass
else:
w.writerow(row)
and I get this error:
Traceback (most recent call last):
File "H:\jcatoe\Practice Python\pract.py", line 86, in <module>
cleanOfficeCol()
File "H:\jcatoe\Practice Python\pract.py", line 63, in cleanOfficeCol
for row in r:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
So I searched for that error and the general consensus was that "rb" needs to be "rt" so I tried that and got this error:
Traceback (most recent call last):
File "H:\jcatoe\Practice Python\pract.py", line 86, in <module>
cleanOfficeCol()
File "H:\jcatoe\Practice Python\pract.py", line 67, in cleanOfficeCol
w.writerow(row)
File "C:\Users\jcatoe\AppData\Local\Programs\Python\Python35-32\lib\tempfile.py", line 483, in func_wrapper
return func(*args, **kwargs)
TypeError: a bytes-like object is required, not 'str'
I'm confused because the errors seem to be saying to do the opposite thing.
If you read the tempfile docs you'll see that by default it's opening the file in 'w+b' mode. If you take a closer look at your errors, you'll see that you're getting one on read, and one on write. What you need to be doing is making sure that you're opening your input and output file in the same mode.
You can do it like this:
import csv
import tempfile
office = 3
temp = tempfile.NamedTemporaryFile(delete=False)
with open(inFile, 'r') as oi, tempfile.NamedTemporaryFile(delete=False, mode='w') as temp:
reader = csv.reader(oi)
writer = csv.writer(temp)
for row in reader:
if row[office] == "R00" or row[office] == "ALC" or row[office] == "RMS":
pass
else:
writer.writerow(row)

Resources