Round robin on iterable - python-3.x

Consider the following, simple round-robin implementation:
from itertools import chain, repeat
class RoundRobin:
    def __init__(self, iterable):
        self._iterable = set(iterable)

    def __iter__(self):
        for value in chain.from_iterable(repeat(self._iterable)):
            yield value
Example usage:
machines = ['test1', 'test2', 'test3', 'test4']
rr_machines = RoundRobin(machines)

for machine in rr_machines:
    # Do something
    pass
While this works, I was wondering if there is a way to modify the iterable in the RoundRobin class that would also affect existing iterators.
E.g. suppose that while I'm consuming values from the iterator, one of the machines in the set becomes unavailable, and I want to prevent it from being returned.
The only solution I could think of was to implement a separate Iterator class. Of course, that still leaves the question of what to do when all machines have become unavailable and no more values can be returned (a StopIteration exception?).

itertools.repeat yields the very same object (here, your set of elements) over and over, and a set cannot be mutated while it is being iterated.
It is a matter of creating another implementation of repeat that re-creates a copy of the set at each pass. That is possible because in this case we know the iterable to be repeated is a container, while itertools.repeat has to work with any object:
def mutable_repeat(container):
    # Yield a fresh copy of the container on each pass, so that
    # chain.from_iterable always iterates an up-to-date snapshot.
    while True:
        yield container.copy()
Just using this in place of repeat allows you to make "on the fly" changes to your self._iterable set: values can be added to or removed from that set. (Although a removed value will most likely be issued one last time after being removed.)
If you need to guard against issuing a removed value even once, you can do so by adding some more logic to the whole thing.
Instead of interacting with self._iterable directly from outside your class, you could do:
class RoundRobin:
    def __init__(self, iterable):
        self._iterable = set(iterable)
        self._removed = set()

    def __iter__(self):
        # repeat() already yields individual items, so iterate it directly
        for value in self.repeat():
            yield value

    def remove(self, item):
        self._removed.add(item)
        self._iterable.remove(item)

    def add(self, item):
        self._iterable.add(item)

    def repeat(self):
        while True:
            for item in self._iterable.copy():
                if item not in self._removed:
                    yield item
            self._removed = set()
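To see the two-set version in action, here is a self-contained sketch; the machine names are just placeholders:

```python
class RoundRobin:
    def __init__(self, iterable):
        self._iterable = set(iterable)
        self._removed = set()

    def __iter__(self):
        # repeat() yields individual items, so iterate it directly
        for value in self.repeat():
            yield value

    def remove(self, item):
        self._removed.add(item)
        self._iterable.remove(item)

    def add(self, item):
        self._iterable.add(item)

    def repeat(self):
        while True:
            # Iterate a snapshot so the live set may be mutated meanwhile
            for item in self._iterable.copy():
                if item not in self._removed:
                    yield item
            self._removed = set()

rr = RoundRobin(['test1', 'test2'])
it = iter(rr)
seen = [next(it) for _ in range(4)]   # two full cycles over both machines
rr.remove('test2')                    # 'test2' must never be issued again
after = [next(it) for _ in range(3)]
print(set(seen))    # {'test1', 'test2'}
print(set(after))   # {'test1'}
```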

Related

Should the "self" reference be used for iteration variables in for loops inside classes?

Title pretty much has my full question. For example,
class Test():
    def foobar(self, my_list):
        for element in my_list:
            do_something()

OR

class Test():
    def foobar(self, my_list):
        for self.element in my_list:
            do_something()
What exactly is the difference between the two, with respect to the variable element or self.element?
I think my confusion comes from the fact that my_list is passed to the method, so it is not an instance of Test, therefore self would be inappropriate as element refers to an element of my_list. But the variable element is being created in Test, so every Test instance will have its own element.
The answer is generally "no". The for loop rebinds the variable to each element of the list in turn, and that binding persists after the loop ends. The better way is to keep element a local variable, which goes away when the method returns. But if you assign to self.element, then by the time the loop ends it is equivalent to self.element = my_list[-1]: the attribute persists after the method exits and is accessible from other methods and from anyone holding the class instance. That is usually not what you want.
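A quick way to see the difference is to run both versions side by side; the class names here are made up for illustration:

```python
class LoopsLocally:
    def foobar(self, my_list):
        for element in my_list:        # plain local variable
            pass

class LoopsOnSelf:
    def foobar(self, my_list):
        for self.element in my_list:   # rebinds an instance attribute each pass
            pass

a = LoopsLocally()
a.foobar([1, 2, 3])
print(hasattr(a, 'element'))   # False -- the local vanished with the call

b = LoopsOnSelf()
b.foobar([1, 2, 3])
print(b.element)               # 3 -- equivalent to self.element = my_list[-1]
```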

Access instance from passed callback

I'm writing a game in python that has items in it. The player can use items, and different items can do different things. Instead of making a bunch of subclasses for each item type, I'm passing a callback function upon initialization. However, in some cases I need to access the item instance from the callback function - this is where I'm stuck.
Here's what I'm trying to do:
class Item:
    def __init__(self, use_callback, regen=0):
        self.use_callback = use_callback
        self.regen = regen

def heal(self, player):
    player.health += self.regen

item = Item(heal, regen=30)
item.use_callback(player)
However, only the player object is passed to the heal function and not the item object: TypeError: heal() missing 1 required positional argument.
It's inconvenient for me to use subclasses, since I'm using a table of item drops for each enemy that contains information about the items they drop, and it's easier to instantiate an Item upon death than to figure out which subclass I need to instantiate.
Any ideas on how to get the reference to the item object?
How about wrapping the callback to pass in the object:
class Item:
    def __init__(self, use_callback, regen=0):
        self.use_callback = lambda *args, **kwargs: use_callback(self, *args, **kwargs)
        self.regen = regen

def heal(item, player):
    if item.regen is not None:
        player.health += item.regen

item = Item(heal, regen=30)
item.use_callback(player)
An alternate architecture I would put some thought into is giving the Player object a consume method. The advantage here is that the complexity is taken out of your Item object. There is probably a slightly neater way to write this:

item = Item(effects=[(heal, {'regen': 30}),
                     (gravity_boost, {'multiplier': 2}),
                     (speed, {})])

class Player:
    def consume(self, item):
        for callback, kwargs in item.effects:
            callback(self, item, **kwargs)
Beyond this it might be worth considering a simple publish/subscribe system that would decouple your objects so that they have no cross dependencies. This adds architectural simplicity at the cost of some code complexity.
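A minimal sketch of what such a publish/subscribe layer could look like; every name here (EventBus, subscribe, publish) is invented for illustration, not an existing library API:

```python
class EventBus:
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, event_name, handler):
        # Register a handler to be called whenever event_name is published
        self._subscribers.setdefault(event_name, []).append(handler)

    def publish(self, event_name, **payload):
        # Fan the payload out to every registered handler
        for handler in self._subscribers.get(event_name, []):
            handler(**payload)

class Player:
    def __init__(self):
        self.health = 70

def apply_regen(player, regen):
    player.health += regen

bus = EventBus()
bus.subscribe('item_consumed', apply_regen)

player = Player()
bus.publish('item_consumed', player=player, regen=30)
print(player.health)  # 100
```

Items only publish events and handlers only react to them, so Item and Player never reference each other directly.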

Load inconsistent data in pymongo

I am working with pymongo and want to ensure that saved data can still be loaded even after additional data elements have been added to the schema.
I have used this for classes that don't need to have the information processed before assigning it to class attributes:
class MyClass(object):
    def __init__(self, instance_id):
        # set default values
        self.database_id = instance_id
        self.myvar = 0

        # load values from database
        self.__load()

    def __load(self):
        data_dict = Collection.find_one({"_id": self.database_id})
        for key, attribute in data_dict.items():
            self.__setattr__(key, attribute)
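To show this pattern outside a live database, here is the same __load logic run against a stub; FakeCollection is a stand-in I made up for a pymongo collection whose find_one returns a plain dict:

```python
class FakeCollection:
    """Stand-in for a pymongo collection (illustration only)."""
    _docs = {42: {"_id": 42, "myvar": 7, "extra_field": "added later"}}

    @classmethod
    def find_one(cls, query):
        return cls._docs.get(query["_id"])

class MyClass(object):
    def __init__(self, instance_id):
        # set default values
        self.database_id = instance_id
        self.myvar = 0
        # load values from the (fake) database
        self.__load()

    def __load(self):
        data_dict = FakeCollection.find_one({"_id": self.database_id})
        for key, attribute in data_dict.items():
            # Unknown keys simply become new attributes
            setattr(self, key, attribute)

obj = MyClass(42)
print(obj.myvar)        # 7
print(obj.extra_field)  # added later
```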
However, in classes that I have to process the data from the database this doesn't work:
class Example(object):
    def __init__(self, name):
        self.name = name
        self.database_id = None
        self.member_dict = {}
        self.load()

    def load(self):
        data_dict = Collection.find_one({"name": self.name})
        self.database_id = data_dict["_id"]
        for element in data_dict["element_list"]:
            self.process_element(element)
        for member_name, member_info in data_dict["member_class_dict"].items():
            self.member_dict[member_name] = MemberClass(member_info)

    def process_element(self, element):
        print("Do Stuff")
Two example use cases I have are:
1) A list of strings that are used to set flags; this is done by calling a function with the string as the argument (process_element above).
2) A dictionary of dictionaries that is used to create a list of instances of a class (MemberClass(member_info) above).
I tried creating properties to handle this but found that __setattr__ doesn't look for properties.
I know I could redefine __setattr__ to look for specific names but it is my understanding that this would slow down all set interactions with the class and I would prefer to avoid that.
I also know I could use a bunch of try/excepts to catch the errors but this would end up making the code very bulky.
I don't mind the load function being slowed down a bit for this but very much want to avoid anything that will slow down the class outside of loading.
So the solution that I came up with is to use the idea of changing the __setattr__ method but instead to handle the exceptions in the load function instead of the __setattr__.
def load(self):
    data_dict = Collection.find_one({"name": self.name})
    for key, attribute in data_dict.items():
        if key == "_id":
            self.database_id = attribute
        elif key == "element_list":
            for element in attribute:
                self.process_element(element)
        elif key == "member_class_dict":
            for member_name, member_info in attribute.items():
                self.member_dict[member_name] = MemberClass(member_info)
        else:
            self.__setattr__(key, attribute)
This provides all of the functionality of overriding the __setattr__ method without slowing down any future calls to __setattr__ outside of loading the class.
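The same if/elif chain can also be phrased as a dictionary of per-key handlers, which keeps load short as more special keys appear. This is only a sketch with made-up handler bodies, not a drop-in for the original class:

```python
class Example:
    def __init__(self):
        self.database_id = None
        self.member_dict = {}
        self.flags = []
        # Map special keys to handlers; anything else falls back to setattr
        self._handlers = {
            "_id": lambda v: setattr(self, "database_id", v),
            "element_list": lambda v: [self.process_element(e) for e in v],
        }

    def process_element(self, element):
        # Stand-in for the real flag-setting logic
        self.flags.append(element)

    def load(self, data_dict):
        for key, attribute in data_dict.items():
            handler = self._handlers.get(key)
            if handler is not None:
                handler(attribute)
            else:
                setattr(self, key, attribute)

ex = Example()
ex.load({"_id": 1, "element_list": ["a", "b"], "name": "demo"})
print(ex.database_id, ex.flags, ex.name)  # 1 ['a', 'b'] demo
```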

Creating a list of Class objects from a file with no duplicates in attributes of the objects

I am currently taking some computer science courses in school and have come to a dead end, so I need a little help. As the title says, I need to create a list of class objects from a file, where an object with a duplicate attribute is not added to the list. I was able to do this successfully with a Python set(), but apparently that isn't allowed for this particular assignment; I have tried various other ways but can't get it working without a set. I believe the point of this assignment is comparing data structures in Python, and using the slowest method possible, as it also has to be timed. My code using the set() is provided below.
import time

class Students:
    def __init__(self, LName, FName, ssn, email, age):
        self.LName = LName
        self.FName = FName
        self.ssn = ssn
        self.email = email
        self.age = age

    def getssn(self):
        return self.ssn

def main():
    t1 = time.time()
    f = open('InsertNames.txt', 'r')
    studentlist = []
    seen = set()
    for line in f:
        parsed = line.split(' ')
        parsed = [i.strip() for i in parsed]
        if parsed[2] not in seen:
            studentlist.append(Students(parsed[0], parsed[1], parsed[2], parsed[3], parsed[4]))
            seen.add(parsed[2])
        else:
            print(parsed[2], 'already in list, not added')
    f.close()
    print('final list length: ', len(studentlist))
    t2 = time.time()
    print('time = ', t2 - t1)

main()
A note, that the only duplicates to be checked for are those of the .ssn attribute and the duplicate should not be added to the list. Is there a way to check what is already in the list by that specific attribute before adding it?
edit: Forgot to mention only 1 list allowed in memory.
You can write

if not any(s.ssn == parsed[2] for s in studentlist):

without committing to this comparison as the meaning of ==. At this level of coursework, you are probably expected to write out the loop and set a flag yourself rather than use a generator expression.
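The spelled-out version with an explicit flag, using a few made-up rows in place of the file, would look like this:

```python
class Students:
    def __init__(self, LName, FName, ssn, email, age):
        self.LName = LName
        self.FName = FName
        self.ssn = ssn
        self.email = email
        self.age = age

# Sample rows standing in for the parsed lines of InsertNames.txt
parsed_rows = [
    ['Doe', 'John', '111-11-1111', 'jd@example.com', '20'],
    ['Roe', 'Jane', '222-22-2222', 'jr@example.com', '21'],
    ['Doe', 'John', '111-11-1111', 'jd@example.com', '20'],  # duplicate ssn
]

studentlist = []
for parsed in parsed_rows:
    found = False
    for s in studentlist:       # linear scan instead of a set lookup
        if s.ssn == parsed[2]:
            found = True
            break
    if not found:
        studentlist.append(Students(*parsed))

print(len(studentlist))  # 2
```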
Since you already took the time to write a class representing a student and since ssn is a unique identifier for the instances, consider writing an __eq__ method for that class.
def __eq__(self, other):
    return self.ssn == other.ssn
This will make your life easier when you want to compare two students, and in your case make a list (specifically not a set) of students.
Then your code would look something like:

student_list = []

with open('InsertNames.txt') as f:
    for line in f:
        student = Student(*line.strip().split())
        if student not in student_list:
            student_list.append(student)
Explanation

- Opening a file with a with statement makes your code cleaner and lets it handle errors and do cleanup correctly. And since 'r' is the default mode for open, it doesn't need to be there.
- You should strip the line before splitting it, just to handle some edge cases, but this is not obligatory.
- Calling split with no argument makes it split on runs of whitespace, so the ' ' argument isn't necessary. (Note that this does not mean a single space character is the default argument.)
- Creating the student before adding it to the list sounds like too much overhead for this simple use, but since only one __init__ call is made it is not that bad. The plus side is that it makes the code more readable with the not in check.
- The in operator (and not in, of course) checks whether the object is in the list using the object's __eq__ method. Since you implemented that method, the check works for your Student class instances.
- Only if the student doesn't already exist in the list will it be added.

One final thing: no list is created here other than the return value of split and the student_list you created.
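Putting the pieces together, with a few made-up lines standing in for InsertNames.txt:

```python
class Student:
    def __init__(self, LName, FName, ssn, email, age):
        self.LName = LName
        self.FName = FName
        self.ssn = ssn
        self.email = email
        self.age = age

    def __eq__(self, other):
        # Two students are "equal" when their ssn matches
        return self.ssn == other.ssn

# Sample lines in place of reading the file
lines = [
    'Doe John 111-11-1111 jd@example.com 20',
    'Roe Jane 222-22-2222 jr@example.com 21',
    'Doe Jon 111-11-1111 jon@example.com 22',  # same ssn -> duplicate
]

student_list = []
for line in lines:
    student = Student(*line.strip().split())
    if student not in student_list:   # uses Student.__eq__
        student_list.append(student)

print(len(student_list))  # 2
```

One caveat worth knowing: defining __eq__ sets __hash__ to None, so these instances can no longer go into a set, which happens to match the assignment's list-only requirement.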

Python avoiding large array allocation multiple times

I have to compute a function many, many times.
To compute this function, the elements of an array must be computed.
The array is quite large.
How can I avoid allocating the array in every function call?
The code I have tried goes something like this:
class FunctionCalculator(object):
    def __init__(self, data):
        """
        Get the data and do some small handling of it.
        Let's say that we do

            self.data = data
        """

    def function(self, point):
        return numpy.sum(numpy.array([somecomputations(item) for item in self.data]))
Well, maybe my concern is unfounded, so let me ask this first.
Question: Is it true that the array [somecomputations(item) for item in data] is being allocated and deallocated for every call to function?
Thinking that that is the case I have tried
class FunctionCalculator(object):
    def __init__(self, data):
        """
        Get the data and do some small handling of it.
        Let's say that we do

            self.data = data
        """
        self.number_of_data = range(0, len(data))
        self.my_array = numpy.zeros(len(data))

    def function(self, point):
        for i in self.number_of_data:
            self.my_array[i] = somecomputations(self.data[i])
        return numpy.sum(self.my_array)
This is slower than the previous version. I assume that the list comprehension in the first version can be run entirely in C, while in the second version only smaller parts of the loop can be translated into optimized C code.
I have very little idea of how Python works inside.
Question: Is there a good way to skip the array allocation in every function call and at the same time take advantage of a well optimized loop on the array?
I am using Python3.5
Looping over the array in Python is unnecessary and crosses the Python/C boundary many times, hence the slowdown. The beauty of numpy arrays is that functions operate on them element-wise in a single call. I think the fastest would be:

return numpy.sum(somecomputations(self.data))

somecomputations may need a bit of modification so that it accepts a whole array, but often it will work off the bat. Also note that you are not using point anywhere in function.
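For instance, with a concrete stand-in for somecomputations built from NumPy ufuncs (the expression below is invented for illustration), the whole reduction happens in one call with no per-call Python list:

```python
import numpy as np

class FunctionCalculator:
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def function(self, point):
        # Stand-in for somecomputations: any expression built from NumPy
        # ufuncs operates on the whole array at once, with no Python-level
        # loop and no temporary Python list on each call.
        return np.sum(self.data ** 2 + point)

calc = FunctionCalculator([1.0, 2.0, 3.0])
result = calc.function(0.5)
print(result)  # 15.5
```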
