Please correct my code.
PS - I'm fairly new to Python.
class Contact:
    def __init__(self, cid, email):
        self.cid = cid
        self.email = email

def ind(contacts):
    index = {}
    # Code here
    return index

contacts = [Contact(1, 'a'),
            Contact(2, 'b'),
            Contact(3, 'c'),
            Contact(4, 'a')]

print(ind(contacts))
I need the output to be like:
{'a':[1,4], 'b':2, 'c':3}
The following methods create list values for every key, like:
{'a':[1,4], 'b':[2], 'c':[3]}
I can't imagine why this wouldn't be fine, but I've added a method at the end that gets your specific output.
This doesn't guarantee the order of the emails (a plain dict preserves insertion order on Python 3.7+, but older versions don't):
def ind(contacts):
    index = {}
    for contact in contacts:
        index.setdefault(contact.email, []).append(contact.cid)
    return index
To maintain order (e.g. start with 'a'), add from collections import OrderedDict to the top of your file and then the method is:
def ind(contacts):
    index = OrderedDict()
    for contact in contacts:
        index.setdefault(contact.email, []).append(contact.cid)
    return index
The printout of index will look different, but it acts the same as a normal dict object (just with ordering).
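For example, with the sample contacts above, printing the OrderedDict result looks something like this (the exact repr varies slightly between Python versions) instead of the plain {...} form:
OrderedDict([('a', [1, 4]), ('b', [2]), ('c', [3])])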
Exact output (with ordering):
def ind(contacts):
    index = OrderedDict()
    for contact in contacts:
        if contact.email in index:
            value = index[contact.email]
            if not isinstance(value, list):
                # promote the single cid to a list before appending
                index[contact.email] = [value]
            index[contact.email].append(contact.cid)
        else:
            index[contact.email] = contact.cid
    return index
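For reference, a quick check of this last version against your sample contacts gives the mixed list/scalar output you asked for (dict() is only used here to get the familiar {...} printout of the OrderedDict):
print(dict(ind(contacts)))
# {'a': [1, 4], 'b': 2, 'c': 3}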
Hi, I am writing unit tests using pytest, but I am not able to mock a few DB functions. We are using psycopg2 for DB connections and executions. The response of a query returned from psycopg2 is of the type DictRow, which can be accessed either by key or by index.
Ex:
response = ['prajwal', '23', 'engineer']  # Response of the query "select name, age, job from users"
>>> response[0]
'prajwal'
>>> response['name']
'prajwal'
I want to know whether there is any way we can convert a dict/list to the above-mentioned type.
Looking at the source for psycopg2, creating a DictRow requires passing in a DictCursor object. However, the only things it uses from the DictCursor appear to be its index and description attributes.
# found in lib\site-packages\psycopg2\extras.py
class DictRow(list):
    """A row object that allow by-column-name access to data."""

    __slots__ = ('_index',)

    def __init__(self, cursor):
        self._index = cursor.index
        self[:] = [None] * len(cursor.description)
The index looks like a dict mapping each key to its position in the row, e.g. index['name'] = 0.
The description can simply be the dict you want to convert (only its length is used).
If you're feeling hacky, you could take advantage of duck typing and pretend you're passing in a cursor when you're really just satisfying those two requirements.
The only caveat is that after we instantiate the DictRow, we still need to populate it; our fake cursor hack will take care of the rest.
from psycopg2.extras import DictRow

class DictRowHack:
    def __init__(self, my_dict):
        # we need to set these 2 attributes so that
        # it auto populates our indexes
        self.index = {key: i for i, key in enumerate(my_dict)}
        self.description = my_dict

def dictrow_from_dict(my_dict):
    # this is just a little helper function
    # so you don't always need to go through
    # the steps to recreate a DictRow
    fake_cursor = DictRowHack(my_dict)
    my_dictrow = DictRow(fake_cursor)
    for k, v in my_dict.items():
        my_dictrow[k] = v
    return my_dictrow

response = {'name': 'prajwal', 'age': '23', 'job': 'engineer'}
my_dictrow = dictrow_from_dict(response)
print(my_dictrow[1])
print(my_dictrow['name'])
print(type(my_dictrow))
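For reference, running this on Python 3.7+ (where dict insertion order is preserved, so the fake cursor's index maps 'name' to 0, 'age' to 1 and 'job' to 2) should print something like:
23
prajwal
<class 'psycopg2.extras.DictRow'>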
I need to implement dictionaries in this code. Not all of it needs to be changed, just the parts where I can change it so that it still does the same job.
In a test file I have a list of three strings: (1, once), (2, twice), (2, twice).
I'm guessing the number will represent the value.
This code passes the tests but I am struggling to understand how I can use dictionaries to make it do the same job.
If anyone can help, I'd be grateful.
The current code is:
The list items are in a test file elsewhere.
class Bag:
    def __init__(self):
        """Create a new empty bag."""
        self.items = []

    def add(self, item):
        """Add one copy of item to the bag. Multiple copies are allowed."""
        self.items.append(item)

    def count(self, item):
        """Return the number of copies of item in the bag.
        Return zero if the item doesn't occur in the bag.
        """
        counter = 0
        for an_item in self.items:
            if an_item == item:
                counter += 1
        return counter

    def clear(self, item):
        """Remove all copies of item from the bag.
        Do nothing if the item doesn't occur in the bag.
        """
        index = 0
        while index < len(self.items):
            if self.items[index] == item:
                self.items.pop(index)
            else:
                index += 1

    def size(self):
        """Return the total number of copies of all items in the bag."""
        return len(self.items)

    def ordered(self):
        """Return the items by decreasing number of copies.
        Return a list of (count, item) pairs.
        """
        result = set()
        for item in self.items:
            result.add((self.count(item), item))
        return sorted(result, reverse=True)
I have been scratching my head over it for a while now. These are the only dictionary operations I am allowed to use:
items[key] = value
len(items)
dict()
items[key]
key in items
del items[key]
Thank you
Start with the simplest possible problem. You have an empty bag:
self.items = {}
and now a caller is trying to add an item, with bag.add('twice').
Where shall we put the item?
Well, we're going to need some unique index.
Hmmm, different every time, different every time, what changes with each .add()?
Right, that's it, use the length!
n = len(self.items)
self.items[n] = new_item
So items[0] = 'twice'.
Now, does this still work after a 2nd call?
Yes. items[1] = 'twice'.
Following this approach you should be able to refactor the other methods to use the new scheme.
Use unit tests, or debug statements like print('after clear() items is: ', self.items), to help you figure out if the Right Thing happened.
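To make the hint concrete, here is a minimal sketch of just add() and count() under that length-as-key scheme (names and docstrings borrowed from the question's Bag class; the remaining methods are left to refactor in the same spirit):
class Bag:
    def __init__(self):
        """Create a new empty bag, backed by a dict instead of a list."""
        self.items = dict()

    def add(self, item):
        """Add one copy of item to the bag. Multiple copies are allowed."""
        n = len(self.items)        # a fresh key, as long as items are only being added
        self.items[n] = item

    def count(self, item):
        """Return the number of copies of item in the bag."""
        counter = 0
        for key in self.items:     # check every stored copy
            if self.items[key] == item:
                counter += 1
        return counter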
In my items.py:
class NewAdsItem(Item):
    AdId = Field()
    DateR = Field()
    AdURL = Field()
In my pipelines.py:
import sqlite3
from scrapy.conf import settings

con = None

class DbPipeline(object):

    def __init__(self):
        self.setupDBCon()
        self.createTables()

    def setupDBCon(self):
        # This is NOT OK!
        # I want to get the items already HERE!
        dbfile = settings.get('SQLITE_FILE')
        self.con = sqlite3.connect(dbfile)
        self.cur = self.con.cursor()

    def createTables(self):
        # OR optionally HERE.
        self.createDbTable()

    ...

    def process_item(self, item, spider):
        self.storeInDb(item)
        return item

    def storeInDb(self, item):
        # This is OK, I CAN get the items in here, using:
        # item.keys() and/or item.values()
        sql = "INSERT INTO {0} ({1}) VALUES ({2})".format(
            self.dbtable,
            ','.join(item.keys()),
            ','.join(['?'] * len(item.keys()))
        )
        ...
How can I get the item field names (like "AdId", etc.) from items.py before process_item() (in pipelines.py) is executed?
I use scrapy runspider myspider.py for execution.
I already tried to add "item" and/or "spider" like this def setupDBCon(self, item), but that didn't work, and resulted in:
TypeError: setupDBCon() missing 1 required positional argument: 'item'
UPDATE: 2018-10-08
Result (A):
Partially following the solution from @granitosaurus, I found that I can get the item keys as a list by:
Adding (a): from adbot.items import NewAdsItem to my main spider code.
Adding (b): ikeys = NewAdsItem.fields.keys() inside the spider class mentioned above.
I could then access the keys from my pipelines.py via:
def open_spider(self, spider):
    self.ikeys = list(spider.ikeys)
    print("Keys in pipelines: \t%s" % ",".join(self.ikeys))
    #self.createDbTable(ikeys)
However, there were 2 problems with this method:
I was not able to get the ikeys list into createDbTable(). (I kept getting errors about missing arguments here and there.)
The ikeys list (as retrieved) was rearranged and did not keep the order of the items as they appear in items.py, which partially defeated the purpose. I still don't understand why they are out of order, when all the docs say that Python 3 should keep the order of dicts and lists, while at the same time, when using process_item() and getting the items via item.keys(), their order remains intact.
Result (B):
At the end of the day, it turned out too laborious and complicated to fix (A), so I just imported the relevant items.py class into my pipelines.py, and used the item list as a global variable, like this:
def createDbTable(self):
    self.ikeys = NewAdsItem.fields.keys()
    print("Keys in createDbTable: \t%s" % ",".join(self.ikeys))
    ...
In this case I just decided to accept that the list obtained seems to be alphabetically sorted, and worked around the issue by just changing the key names. (Cheating!)
This is disappointing, because the code is ugly and contorted.
Any better suggestions would be much appreciated.
Scrapy pipelines have 3 connected methods:
process_item(self, item, spider)
This method is called for every item pipeline component.
process_item() must either: return a dict with data, return an Item (or any descendant class) object, return a Twisted Deferred or raise DropItem exception. Dropped items are no longer processed by further pipeline components.
open_spider(self, spider)
This method is called when the spider is opened.
close_spider(self, spider)
This method is called when the spider is closed.
https://doc.scrapy.org/en/latest/topics/item-pipeline.html
So you can only get access to the item in the process_item method.
If you want to get the item class, however, you can attach it to the spider class:
class MySpider(Spider):
    item_cls = MyItem

class MyPipeline:
    def open_spider(self, spider):
        fields = spider.item_cls.fields
        # fields is a dictionary of key: default value
        self.setup_table(fields)
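For illustration, setup_table() could be something as simple as the hypothetical sketch below; it assumes the pipeline already holds an sqlite3 connection/cursor and a table name (self.con, self.cur, self.dbtable, as in the question's DbPipeline):
def setup_table(self, fields):
    # fields is the Item's fields dict, so its keys are the column names
    columns = ", ".join(fields.keys())  # e.g. "AdId, DateR, AdURL"
    self.cur.execute("CREATE TABLE IF NOT EXISTS {0} ({1})".format(self.dbtable, columns))
    self.con.commit()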
Alternatively, you can lazy-load during the process_item method itself:
class MyPipeline:
    item = None

    def process_item(self, item, spider):
        if not self.item:
            self.item = item
            self.setup_table(item)
I created a class which is basically a hobby book. The book can be accessed by two methods: enter(n, h), which takes a name and keeps adding hobbies to that name (one name can have multiple hobbies), and a lookup method that returns the set of hobbies for a particular name. My hobby book is storing every hobby I insert under every name. Can someone help me fix it?
class Hobby:
    def __init__(self):
        self.dic = {}
        self.hby = set()

    def enter(self, n, h):
        if n not in self.dic.items():
            self.dic[n] = self.hby
        for k in self.dic.items():
            self.hby.add(h)

    def lookup(self, n):
        return self.dic[n]
I tried running the following cases:
>>> d = Hobby(); d.enter('Roj', 'soccer'); d.lookup('Roj')
{'soccer'}
>>> d.enter('Max', 'reading'); d.lookup('Max')
{'reading', 'soccer'}  # should return just reading
>>> d.enter('Roj', 'music'); d.lookup('Roj')
{'reading', 'soccer', 'music'}  # should return soccer and music
Why are you reinventing a dict here? And why are you using a single shared set that you always add values to, and referencing it from every key, which ensures that every lookup returns the same set?
Don't reinvent the wheel, use collections.defaultdict:
import collections
d = collections.defaultdict(set)
d["Roj"].add("soccer")
d["Roj"]
# {'soccer'}
d["Max"].add("reading")
d["Max"]
# {'reading'}
d["Roj"].add("music")
d["Roj"]
# {'soccer', 'music'}
UPDATE - If you really want to do it through your own class (and before you do, watch Stop Writing Classes!), you can do it as:
class Hobby(object):
    def __init__(self):
        self.container = {}

    def enter(self, n, h):
        if n not in self.container:
            self.container[n] = {h}
        else:
            self.container[n].add(h)

    def lookup(self, n):
        return self.container.get(n, None)
d = Hobby()
d.enter("Roj", "soccer")
d.lookup("Roj")
# {'soccer'}
d.enter("Max", "reading")
d.lookup("Max")
# {'reading'}
d.enter("Roj", "music")
d.lookup("Roj")
# {'soccer', 'music'}
Note how no extra set is used here - every dict key gets its own set to populate.
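As a side note, the two branches in enter() could also be collapsed with dict.setdefault, the same idiom used in the very first answer above; this is just an equivalent spelling, not a behavioural change:
def enter(self, n, h):
    # setdefault returns the existing set for n, or inserts and returns a new empty one
    self.container.setdefault(n, set()).add(h)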
I'm trying to write a Database Abstraction Layer in Python which lets you construct SQL statements using chained function calls such as:
results = (db.search("book")
             .author("J. K. Rowling")
             .price("<40.00")
             .title("Harry")
             .execute())
but I am running into problems when I try to dynamically add the required methods to the db class.
Here are the important parts of my code:
import inspect

def myName():
    return inspect.stack()[1][3]

class Search():
    def __init__(self, family):
        self.family = family
        self.options = ['price', 'name', 'author', 'genre']
        # self.options is generated based on family, but this is an example
        for opt in self.options:
            self.__dict__[opt] = self.__Set__
        self.conditions = {}

    def __Set__(self, value):
        self.conditions[myName()] = value
        return self

    def execute(self):
        return self.conditions
However, when I run the example such as:
print(db.search("book").price(">4.00").execute())
outputs:
{'__Set__': '>4.00'}
Am I going about this the wrong way? Is there a better way to get the name of the function being called or to somehow make a 'hard copy' of the function?
You can simply add the search functions (methods) after the class is created:
class Search:  # The class does not include the search methods, at first
    def __init__(self):
        self.conditions = {}

def make_set_condition(option):  # Factory function that generates a "condition setter" for "option"
    def set_cond(self, value):
        self.conditions[option] = value
        return self
    return set_cond

for option in ('price', 'name'):  # The class is extended with additional condition setters
    setattr(Search, option, make_set_condition(option))
>>> Search().name("Nice name").price('$3').conditions  # Example
{'price': '$3', 'name': 'Nice name'}
PS: This class has an __init__() method that does not have the family parameter (the condition setters are dynamically added at runtime, but are added to the class, not to each instance separately). If Search objects with different condition setters need to be created, then the following variation on the above method works (the __init__() method has a family parameter):
import types

class Search:  # The class does not include the search methods, at first
    def __init__(self, family):
        self.conditions = {}
        for option in family:  # Each instance is extended with additional condition setters
            # The new 'option' attributes must be methods, not regular functions:
            setattr(self, option, types.MethodType(make_set_condition(option), self))

def make_set_condition(option):  # Factory function that generates a "condition setter" for "option"
    def set_cond(self, value):
        self.conditions[option] = value
        return self
    return set_cond
>>> o0 = Search(('price', 'name')) # Example
>>> o0.name("Nice name").price('$3').conditions
{'price': '$3', 'name': 'Nice name'}
>>> dir(o0) # Each Search object has its own condition setters (here: name and price)
['__doc__', '__init__', '__module__', 'conditions', 'name', 'price']
>>> o1 = Search(('director', 'style'))
>>> o1.director("Louis L").conditions # New method name
{'director': 'Louis L'}
>>> dir(o1) # Each Search object has its own condition setters (here: director and style)
['__doc__', '__init__', '__module__', 'conditions', 'director', 'style']
Reference: http://docs.python.org/howto/descriptor.html#functions-and-methods
If you really need search methods that know about the name of the attribute they are stored in, you can simply set it in make_set_condition() with
set_cond.__name__ = option # Sets the function name
(just before the return set_cond). Before doing this, the method Search.price has the following name:
>>> Search.price
<function set_cond at 0x107f832f8>
after setting its __name__ attribute, you get a different name:
>>> Search.price
<function price at 0x107f83490>
Setting the method name this way makes error messages involving the method easier to understand.
Firstly, you are not adding anything to the class, you are adding it to the instance.
Secondly, you don't need to access __dict__ directly. The self.__dict__[opt] = self.__Set__ is better done with setattr(self, opt, self.__Set__).
Thirdly, don't use __xxx__ as attribute names. Those are reserved for Python-internal use.
Fourthly, as you noticed, Python is not easily fooled. The internal name of the method you call is still __Set__, even though you access it under a different name. :-) The name is set when you define the method as a part of the def statement.
You probably want to create and set the option methods with a metaclass. You also might want to actually create those methods instead of trying to use one method for all of them. If you really want to use only one, __getattr__ is the way, but it can be a bit fiddly; I generally recommend against it. Lambdas or other dynamically generated methods are probably better.
Here is some working code to get you started (not the whole program you were trying to write, but something that shows how the parts can fit together):
class Assign:
    def __init__(self, searchobj, key):
        self.searchobj = searchobj
        self.key = key

    def __call__(self, value):
        self.searchobj.conditions[self.key] = value
        return self.searchobj

class Book():
    def __init__(self, family):
        self.family = family
        self.options = ['price', 'name', 'author', 'genre']
        self.conditions = {}

    def __getattr__(self, key):
        if key in self.options:
            return Assign(self, key)
        raise RuntimeError('There is no option for: %s' % key)

    def execute(self):
        # XXX do something with the conditions.
        return self.conditions

b = Book('book')
print(b.price(">4.00").author('J. K. Rowling').execute())
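For reference, this prints the collected conditions, something like the following (key order is insertion order on Python 3.7+):
{'price': '>4.00', 'author': 'J. K. Rowling'}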