I am developing an intelligent agent for board games using the MCTS algorithm.
Monte Carlo tree search (MCTS) is a popular method in AI, mostly used for games (such as Go and chess). In this method, an agent builds a tree of the states that result from choosing the moves allowed in the current state. The agent may search the tree for a limited time; during this period, it expands the tree toward the nodes that are most promising (for winning the game).
The picture below shows the process:
For more information you can check this link:
1 - http://www.cameronius.com/research/mcts/about/index.html
In the root node of the tree there is a variable rootstate that holds the current state of the game. A deepcopy of rootstate is used to simulate the future states as we go deeper in the tree.
I used this code for the deepcopy of the gamestate class, because deepcopy does not work well with Cython objects due to their problems with the pickle protocol:
cdef class gamestate:
    # ... other functions
    def __deepcopy__(self, memo_dictionary):
        res = gamestate(self.size)
        res.PLAYERS = self.PLAYERS
        res.size = int(self.size)
        res.board = np.array(self.board, dtype=np.int32)
        res.white_groups = deepcopy(self.white_groups) # a module which checks if white player has won the game
        res.black_groups = deepcopy(self.black_groups) # a module which checks if black player has won the game
        # the black_groups and white_groups are also cython objects which the same deepcopy function is implemented for them
        # .... etc
        return res
Whenever an MCTS iteration starts, a deepcopy of the state is stored in memory.
The problem is that at the beginning of the game the search runs between 2000 and 3000 iterations per second, which is expected, but as the game tree expands, the rate drops to 1 iteration per second. It gets even worse as each iteration takes longer to complete. When I checked the memory usage, I noticed that it increases from 0.6 percent to 90 percent each time I call the agent to search. I implemented the same algorithm in pure Python and it has no issues of this type, so I guess the __deepcopy__ function causes the problem. I was once advised here to write my own pickle protocol for Cython objects, but I am not very familiar with the pickle module.
Can anyone suggest a protocol to use for my Cython objects to get rid of this obstacle?
Edit 2:
I have added some parts of the code that might help.
The code below is the deepcopy of the class unionfind, which is used for white_groups and black_groups in gamestate:
cdef class unionfind:
    cdef public:
        dict parent
        dict rank
        dict groups
        list ignored
    cdef __init__(self):
        # initialize variables ...
    def __deepcopy__(self, memo_dictionary):
        res = unionfind()
        res.parent = self.parent
        res.rank = self.rank
        res.groups = self.groups
        res.ignored = self.ignored
        return res
This is the search function, which runs during the allowed time:
cdef class mctsagent:
    def search(self, time_budget):
        cdef int num_rollouts = 0
        while (num_rollouts < time_budget):
            state_copy = deepcopy(self.rootstate)
            node, state = self.select_node(state_copy) # expansion runs inside the select_node function
            turn = state.turn()
            outcome = self.roll_out(state)
            self.backup(node, turn, outcome)
            num_rollouts += 1
The issue is probably these lines:
res.white_groups = deepcopy(self.white_groups) # a module which checks if white player has won the game
res.black_groups = deepcopy(self.black_groups) # a module which checks if black player has won the game
What you should be doing is calling deepcopy with memo_dictionary as the second argument. This is deepcopy's record of whether it has already copied an object. Without it, deepcopy ends up copying the same object multiple times (hence the huge memory use):
res.white_groups = deepcopy(self.white_groups, memo_dictionary) # a module which checks if white player has won the game
res.black_groups = deepcopy(self.black_groups, memo_dictionary) # a module which checks if black player has won the game
From the documentation of the copy module: If the __deepcopy__() implementation needs to make a deep copy of a component, it should call the deepcopy() function with the component as first argument and the memo dictionary as second argument.
(edit: just seen that @Blckknght already pointed this out in the comments)
(edit2: unionfind looks to mainly contain Python objects. There probably isn't much value in it being a cdef class rather than a normal class. Also, your current __deepcopy__ for it doesn't actually make a copy of those dictionaries; you should be doing res.parent = deepcopy(self.parent, memo_dictionary) etc. If you just made it a normal class, this would be implemented automatically.)
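To make the role of the memo dictionary concrete, here is a minimal pure-Python sketch. The classes UnionFind and GameState are hypothetical stand-ins for the Cython classes in the question, not the asker's actual code; the point is only that passing the memo through keeps a shared component shared instead of copying it once per reference.

```python
from copy import deepcopy


class UnionFind:
    """Hypothetical pure-Python stand-in for the question's unionfind class."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def __deepcopy__(self, memo):
        res = UnionFind()
        # Register the copy first, so any later reference to this object
        # found during the same deepcopy call reuses `res`.
        memo[id(self)] = res
        res.parent = deepcopy(self.parent, memo)
        res.rank = deepcopy(self.rank, memo)
        return res


class GameState:
    """Hypothetical stand-in for gamestate; both attributes share one helper."""

    def __init__(self, groups):
        self.white_groups = groups
        self.black_groups = groups  # same object on purpose, for the demo

    def __deepcopy__(self, memo):
        res = GameState.__new__(GameState)
        memo[id(self)] = res
        # Passing `memo` on means the shared helper is copied only once.
        res.white_groups = deepcopy(self.white_groups, memo)
        res.black_groups = deepcopy(self.black_groups, memo)
        return res


def copy_state(state):
    """Return an independent copy of `state`."""
    return deepcopy(state)
```

With this in place, copy_state(state) yields a state whose white_groups and black_groups are still one and the same new object, and changes to the copy do not leak back into the original.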
First, I'd like to thank the StackOverflow community for the tremendous help it provided me over the years, without me having to ask a single question.
I could not find anything that I can relate to my problem, though it is probably due to my lack of understanding of the subject, rather than the absence of a response on the website. My apologies in advance if this is a duplicate.
I am relatively new to multiprocessing; some time ago I succeeded in using multiprocessing.pool in a very simple way, where I didn't need any feedback between the child processes.
Now I am facing a much more complicated problem, and I am just lost in the multiprocessing documentation. I hence ask for your help, your kindness and your patience.
I am trying to build a parallel tempering Monte Carlo algorithm from a class.
The basic class very roughly goes as follows:
import numpy as np
class monte_carlo:
    def __init__(self):
        self.x=np.ones((1000,3))
        self.E=np.mean(self.x)
        self.Elist=[]
    def simulation(self,temperature):
        self.T=temperature
        for i in range(3000):
            self.MC_step()
            if i%10==0:
                self.Elist.append(self.E)
        return
    def MC_step(self):
        x=self.x.copy()
        k = np.random.randint(1000)
        x[k] = (x[k] + np.random.uniform(-1,1,3))
        temp_E=np.mean(self.x)
        if np.random.random()<np.exp((self.E-temp_E)/self.T):
            self.E=temp_E
            self.x=x
        return
Obviously, I simplified a great deal (the actual class is 500 lines long!) and built fake functions for simplicity: __init__ takes a bunch of parameters as arguments, there are many more lists of measurements besides self.Elist, and also many arrays derived from self.x that I use to compute them. The key point is that each instance of the class contains a lot of information that I want to keep in memory and do not want to copy over and over again, to avoid a dramatic slowdown. Otherwise I would just use the multiprocessing.pool module.
Now, the parallelization I want to do, in pseudo-code:
def proba(dE,pT):
    return np.exp(-dE/pT)
Tlist=[1.1,1.2,1.3]
N=len(Tlist)
G=[]
for _ in range(N):
    G.append(monte_carlo())
for _ in range(5):
    for i in range(N): # this loop should be ran in multiprocess
        G[i].simulation(Tlist[i])
    for i in range(N//2):
        dE=G[i].E-G[i+1].E
        pT=G[i].T + G[i+1].T
        p=proba(dE,pT) # (proba is a function, giving a probability depending on dE)
        if np.random.random() < p:
            T_temp = G[i].T
            G[i].T = G[i+1].T
            G[i+1].T = T_temp
Synthesis: I want to run several instances of my monte-carlo class in parallel child processes, with different values for a parameter T, then periodically pause everything to change the different T's, and run again the child processes/class instances, from where they paused.
Doing this, I want each class-instance/child-process to stay independent from one another, save its current state with all internal variables while it is paused, and do as few copies as possible. This last point is critical, as the arrays inside the class are quite big (some are 1000x1000), and a copy will therefore very quickly become quite time-costly.
Thanks in advance, and sorry if I am not clear...
Edit:
I am using a distant machine with many (64) CPUs, running on Debian GNU/Linux 10 (buster).
Edit2:
I made a mistake in my original post: in the end, the temperatures must be exchanged between the class-instances, and not inside the global Tlist.
Edit3: Charchit's answer works perfectly for the test code, on both my personal machine and the distant machine I usually use to run my code. I have hence marked it as the accepted answer.
However, I want to report here that when I insert the actual, more complicated code instead of the oversimplified monte_carlo class, the distant machine gives me some strange errors:
Unable to init server: Could not connect: Connection refused
(CMC_temper_all.py:55509): Gtk-WARNING **: ##:##:##:###: Locale not supported by C library.
Using the fallback 'C' locale.
Unable to init server: Could not connect: Connection refused
(CMC_temper_all.py:55509): Gdk-CRITICAL **: ##:##:##:###: gdk_cursor_new_for_display: assertion 'GDK_IS_DISPLAY (display)' failed
(CMC_temper_all.py:55509): Gdk-CRITICAL **: ##:##:##:###: gdk_cursor_new_for_display: assertion 'GDK_IS_DISPLAY (display)' failed
The "##:##:##:###" are (or seem to be) IP addresses.
Without the call to set_start_method('spawn') this error shows only once, at the very beginning, while when I use this method, it seems to show at every occurrence of result.get()...
The strangest thing is that the code otherwise seems to work fine: it does not crash, produces the data files I ask for, etc.
I think this deserves a new question, but I put it here nonetheless in case someone has a quick answer.
If not, I will resort to adding, one by one, the variables, methods, etc. that are present in my actual code but not in the test example, to try to find the origin of the bug. My best guess for now is that the memory required by each child process with the actual code is too large for the distant machine to accept, due to some restrictions implemented by the admin.
What you are looking for is sharing state between processes. As per the documentation, you can either create shared memory, which is restrictive about the data it can store and is not thread-safe, but offers better speed and performance; or you can use server processes through managers. The latter is what we are going to use since you want to share whole objects of user-defined datatypes. Keep in mind that using managers will impact speed of your code depending on the complexity of the arguments that you pass and receive, to and from the managed objects.
Managers, proxies and pickling
As mentioned, managers create server processes to store objects, and allow access to them through proxies. I have answered a question with better details on how they work, and how to create a suitable proxy here. We are going to use the same proxy defined in the linked answer, with some variations. Namely, I have replaced the factory functions inside __getattr__ with something that can be pickled using pickle. This means that you can run instance methods of managed objects created with this proxy without resorting to multiprocess. The result is this modified proxy:
from multiprocessing.managers import NamespaceProxy, BaseManager
import types
import numpy as np
class A:
    def __init__(self, name, method):
        self.name = name
        self.method = method
    def get(self, *args, **kwargs):
        return self.method(self.name, args, kwargs)
class ObjProxy(NamespaceProxy):
    """Returns a proxy instance for any user defined data-type. The proxy instance will have the namespace and
    functions of the data-type (except private/protected callables/attributes). Furthermore, the proxy will be
    picklable and its state can be shared among different processes. """
    def __getattr__(self, name):
        result = super().__getattr__(name)
        if isinstance(result, types.MethodType):
            return A(name, self._callmethod).get
        return result
Solution
Now we only need to make sure that when we are creating objects of monte_carlo, we do so using managers and the above proxy. For that, we create a class constructor called create. All objects for monte_carlo should be created with this function. With that, the final code looks like this:
from multiprocessing import Pool
from multiprocessing.managers import NamespaceProxy, BaseManager
import types
import numpy as np
class A:
    def __init__(self, name, method):
        self.name = name
        self.method = method
    def get(self, *args, **kwargs):
        return self.method(self.name, args, kwargs)
class ObjProxy(NamespaceProxy):
    """Returns a proxy instance for any user defined data-type. The proxy instance will have the namespace and
    functions of the data-type (except private/protected callables/attributes). Furthermore, the proxy will be
    picklable and its state can be shared among different processes. """
    def __getattr__(self, name):
        result = super().__getattr__(name)
        if isinstance(result, types.MethodType):
            return A(name, self._callmethod).get
        return result
class monte_carlo:
    def __init__(self):
        self.x = np.ones((1000, 3))
        self.E = np.mean(self.x)
        self.Elist = []
        self.T = None
    def simulation(self, temperature):
        self.T = temperature
        for i in range(3000):
            self.MC_step()
            if i % 10 == 0:
                self.Elist.append(self.E)
        return
    def MC_step(self):
        x = self.x.copy()
        k = np.random.randint(1000)
        x[k] = (x[k] + np.random.uniform(-1, 1, 3))
        temp_E = np.mean(self.x)
        if np.random.random() < np.exp((self.E - temp_E) / self.T):
            self.E = temp_E
            self.x = x
        return
    @classmethod
    def create(cls, *args, **kwargs):
        # Register class
        class_str = cls.__name__
        BaseManager.register(class_str, cls, ObjProxy, exposed=tuple(dir(cls)))
        # Start a manager process
        manager = BaseManager()
        manager.start()
        # Create and return this proxy instance. Using this proxy allows sharing of state between processes.
        inst = eval("manager.{}(*args, **kwargs)".format(class_str))
        return inst
def proba(dE,pT):
    return np.exp(-dE/pT)
if __name__ == "__main__":
    Tlist = [1.1, 1.2, 1.3]
    N = len(Tlist)
    G = []
    # Create our managed instances
    for _ in range(N):
        G.append(monte_carlo.create())
    for _ in range(5):
        # Run simulations in the manager server
        results = []
        with Pool(8) as pool:
            for i in range(N): # this loop should be ran in multiprocess
                results.append(pool.apply_async(G[i].simulation, (Tlist[i], )))
            # Wait for the simulations to complete
            for result in results:
                result.get()
        for i in range(N // 2):
            dE = G[i].E - G[i + 1].E
            pT = G[i].T + G[i + 1].T
            p = proba(dE, pT) # (proba is a function, giving a probability depending on dE)
            if np.random.random() < p:
                T_temp = Tlist[i]
                Tlist[i] = Tlist[i + 1]
                Tlist[i + 1] = T_temp
    print(Tlist)
This meets the criteria you wanted. It does not create any copies at all, rather, all arguments to the simulation method call are serialized inside the pool and sent to the manager server where the object is actually stored. It gets executed there, and the results (if any) are serialized and returned in the main process. All of this, with only using the builtins!
Output
[1.2, 1.1, 1.3]
Edit
Since you are using Linux, I encourage you to use multiprocessing.set_start_method inside the if __name__ ... clause to set the start method to "spawn". Doing this will ensure that the child processes do not have access to variables defined inside the clause.
I'm running a constrained optimisation with scipy.optimize.minimize(method='COBYLA').
In order to evaluate the cost function, I need to run a relatively expensive simulation to compute a dataset from the input variables, and the cost function is one (cheap to compute) property of that dataset. However, two of my constraints are also dependent on that expensive data.
So far, the only way I have found to constrain the optimisation is to have each of the constraint functions recompute the same dataset that the cost function already has calculated (simplified quasi-code):
def costfun(x):
    data = expensive_fun(x)
    return(cheap_fun1(data))
def constr1(x):
    data = expensive_fun(x)
    return(cheap_fun2(data))
def constr2(x):
    data = expensive_fun(x)
    return(cheap_fun3(data))
constraints = [{'type':'ineq', 'fun':constr1},
               {'type':'ineq', 'fun':constr2}]
# initial guess
x0 = np.ones((6,))
opt_result = minimize(costfun, x0, method='COBYLA',
                      constraints=constraints)
This is clearly not efficient because expensive_fun(x) is called three times for every x.
I could change this slightly to include a universal "evaluate some cost" function which runs the expensive computation, and then evaluates whatever criterion it has been given. But while that saves me from having to write the "expensive" code several times, it still runs three times for every iteration of the optimizer:
# universal cost function evaluator
def criterion_from_x(x, cfun):
    data = expensive_fun(x)
    return(cfun(data))
def costfun(data):
    return(cheap_fun1(data))
def constr1(data):
    return(cheap_fun2(data))
def constr2(data):
    return(cheap_fun3(data))
constraints = [{'type':'ineq', 'fun':criterion_from_x, 'args':(constr1,)},
               {'type':'ineq', 'fun':criterion_from_x, 'args':(constr2,)}]
# initial guess
x0 = np.ones((6,))
opt_result = minimize(criterion_from_x, x0, method='COBYLA',
                      args=(costfun,), constraints=constraints)
I have not managed to find any way to set something up where x is used to generate data at each iteration, and data is then passed to both the objective function as well as the constraint functions.
Does something like this exist? I've noticed the callback argument to minimize(), but that is a function which is called after each step. I'd need some kind of preprocessor which is called on x before each step, whose results are then available to the cost function and constraint evaluation. Maybe there's a way to sneak it in somehow? I'd like to avoid writing my own optimizer.
One more traditional way to solve this would be to evaluate the constraints in the cost function (which has all the data it needs for that), have it add a penalty for violated constraints to the main cost function, and run the optimizer without the explicit constraints. I have tried this before, though, and found that the main cost function can become somewhat chaotic where the constraints are violated, so an optimizer might get stuck in some place that violates the constraints and not find its way out again.
Another approach would be to produce some kind of global variable in the cost function and write the constraint evaluation to use that global variable, but that could be very dangerous if multithreading/-processing gets involved, or if the name I choose for the global variable collides with a name used anywhere else in the code:
def costfun(x):
    global data
    data = expensive_fun(x)
    return(cheap_fun1(data))
def constr1(x):
    global data
    return(cheap_fun2(data))
def constr2(x):
    global data
    return(cheap_fun3(data))
I know that some people use file I/O for cases where the cost function involves running a large simulation which produces a bunch of output files. After that, the constraint functions can just access those files -- but my problem is not that big.
I'm currently using Python v3.9 and scipy 1.9.1.
You could write a decorator class in the same vein as scipy's MemoizeJac that caches the return value of the expensive function each time it is called:
import numpy as np
class MemoizeData:
    def __init__(self, obj_fun, exp_fun, constr_fun):
        self.obj_fun = obj_fun
        self.exp_fun = exp_fun
        self.constr_fun = constr_fun
        self._data = None
        self.x = None
    def _compute_if_needed(self, x, *args):
        # recompute only when x differs from the cached point
        if not np.all(x == self.x) or self._data is None:
            self.x = np.asarray(x).copy()
            self._data = self.exp_fun(x)
    def __call__(self, x, *args):
        self._compute_if_needed(x, *args)
        return self.obj_fun(self._data)
    def constraint(self, x, *args):
        self._compute_if_needed(x, *args)
        return self.constr_fun(self._data)
Consequently, the expensive function is only evaluated once for each iteration. Then, after writing all your constraints into one constraint function, you could use it like this:
from scipy.optimize import minimize
def all_constrs(data):
    return np.hstack((cheap_fun2(data), cheap_fun3(data)))
obj = MemoizeData(cheap_fun1, expensive_fun, all_constrs)
constr = {'type': 'ineq', 'fun': obj.constraint}
x0 = np.ones(6)
opt_result = minimize(obj, x0, method="COBYLA", constraints=constr)
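To check that this kind of caching really saves calls, a call counter helps. The sketch below is a dependency-free variation on the same idea, not the class above: CachedData, toy_expensive_fun and the three small functions are hypothetical names, and the "expensive" function is a trivial placeholder for the real simulation.

```python
class CachedData:
    """Cache the most recent result of an expensive function of x."""

    def __init__(self, expensive_fun):
        self.expensive_fun = expensive_fun
        self.calls = 0      # how many times expensive_fun actually ran
        self._key = None
        self._data = None

    def __call__(self, x):
        # A tuple of floats is an immutable snapshot of x, so later
        # in-place changes to x cannot corrupt the cache key.
        key = tuple(float(v) for v in x)
        if key != self._key:
            self._data = self.expensive_fun(x)
            self._key = key
            self.calls += 1
        return self._data


def toy_expensive_fun(x):
    # Placeholder for the real simulation (assumption for this sketch).
    return {"total": sum(x), "largest": max(x)}


data_for = CachedData(toy_expensive_fun)


def costfun(x):
    return data_for(x)["total"]


def constr1(x):
    return 10.0 - data_for(x)["largest"]


def constr2(x):
    return data_for(x)["total"] + 1.0
```

Evaluating costfun, constr1 and constr2 at the same x leaves data_for.calls at 1; it only increases when x changes.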
While Joni was writing their answer, I found another one, which is admittedly more hacky. I prefer theirs, but for the sake of completeness, I wanted to post this one, too.
It's derived from the material from https://mdobook.github.io/ and the accompanying video tutorials from BYU FLow Lab, in particular this video:
The trick is to use non-local variables to keep a cache of the last evaluation of the expensive function:
import numpy as np
last_x = None
last_data = None
def compute_data(x):
    data = expensive_fun(x)
    return(data)
def get_last_data(x):
    # at module level the cache must be declared `global`;
    # `nonlocal` is only valid inside a nested function
    global last_x, last_data
    if not np.array_equal(x, last_x):
        last_data = compute_data(x)
        last_x = np.copy(x)  # copy, in case the optimizer reuses the array
    return(last_data)
def costfun(x):
    data = get_last_data(x)
    return(cheap_fun1(data))
def constr1(x):
    data = get_last_data(x)
    return(cheap_fun2(data))
def constr2(x):
    data = get_last_data(x)
    return(cheap_fun3(data))
...and then everything can progress as in my original code in the question.
Reasons why I prefer Joni's class-based version:
variable scopes are clearer than with nonlocal
If some of the functions allow calculation of their Jacobian, or there are other things worth buffering, the added complexity is held in check better than with
Having a class instance do all the work also allows you to do other interesting things, like keeping a record of all past evaluations and the path taken by the optimizer, without having to use a separate callback function. Very useful for debugging/tweaking convergence if the optimizer won't converge or takes too long, but also to visualize or otherwise investigate the objective function or similar.
The same ability might actually be really cool for things like constructing a response surface model from the results of previous function evaluations. That could be used to establish a starting guess in case the expensive function is some numerical method that benefits from a good starting point.
Both approaches allow the use of "cheap" constraints which don't require the expensive function to be evaluated, by simply providing them as separate functions. Not sure whether that would help much with compute times, though. I suppose that would depend on the algorithm used by the optimizer.
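As a rough illustration of the "keep a record of all past evaluations" idea, a wrapper like the following could be placed around the objective. RecordingObjective is a hypothetical name introduced for this sketch, not part of scipy, and it only stores what it was asked to evaluate.

```python
class RecordingObjective:
    """Wrap an objective and keep every (x, value) pair it was asked for."""

    def __init__(self, fun):
        self.fun = fun
        self.history = []

    def __call__(self, x):
        value = self.fun(x)
        # Store x as a tuple so later in-place changes do not alter the log.
        self.history.append((tuple(x), value))
        return value

    def best(self):
        """Return the recorded (x, value) pair with the lowest value."""
        return min(self.history, key=lambda item: item[1])
```

An instance can be passed to the optimizer in place of the plain objective; afterwards, history holds the path of evaluated points, and best() could supply a starting guess for a later run.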
I am working on processing a dataset that includes dense GPS data. My goal is to use parallel processing to test my dataset against all possible distributions and return the best one with the parameters generated for said distribution.
Currently, I have code that does this in serial thanks to this answer https://stackoverflow.com/a/37616966. Of course, it is going to take entirely too long to process my full dataset. I have been playing around with multiprocessing, but can't seem to get it to work right. I want it to test multiple distributions in parallel, keeping track of sum of square error. Then I want to select the distribution with the lowest SSE and return its name along with the parameters generated for it.
def fit_dist(distribution, data=data, bins=200, ax=None):
    #Block of code that tests the distribution and generates params
    return(distribution.name, best_params, sse)
if __name__ == '__main__':
    p = Pool()
    result = p.map(fit_dist, DISTRIBUTIONS)
    p.close()
    p.join()
I need some help with how to actually make use of the return values on each of the iterations in the multiprocessing to compare those values. I'm really new to python especially multiprocessing so please be patient with me and explain as much as possible.
The problem I'm having is it's giving me an "UnboundLocalError" on the variables that I'm trying to return from my fit_dist function. The DISTRIBUTIONS list is 89 objects. Could this be related to the parallel processing, or is it something to do with the definition of fit_dist?
With the help of Tomerikoo's comment and some further struggling, I got the code working the way I wanted it to. The UnboundLocalError was due to me not putting the return statement in the correct block of code within my fit_dist function. To answer the question I did the following.
from multiprocessing import Pool
def fit_dist(distribution):
    # ... fitting code that computes params and sse ...
    #put this return under the right section of this method
    return [distribution.name, params, sse]
if __name__ == '__main__':
    p = Pool()
    result = p.map(fit_dist, DISTRIBUTIONS)
    p.close()
    p.join()
    '''filter out the None object results. Due to the nature of the distribution fitting,
    some distributions are so far off that they result in None objects'''
    res = list(filter(None, result))
    #iterates over nested list storing the lowest sum of squared errors in best_sse
    best_sse = float('inf')  # start above any real SSE
    for dist in res:
        if best_sse > dist[2] > 0:
            best_sse = dist[2]
    '''iterates over list pulling out sublist of distribution with best sse.
    The sublists are made up of a string, tuple with parameters,
    and float value for sse so that's why sse is always index 2.'''
    for dist in res:
        if dist[2]==best_sse:
            best_dist_list = dist
The rest of the code simply consists of me using that list to construct charts and plots with that best distribution overtop of a histogram of my raw data.
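The two loops that pick the winner can also be collapsed into a single min() call. This is only a sketch under the same assumptions as the code above: each result is either None or a [name, params, sse] list, and only positive SSE values count. select_best is a hypothetical helper name.

```python
def select_best(results):
    """Return the [name, params, sse] entry with the smallest positive SSE.

    Returns None if no entry qualifies.
    """
    valid = [r for r in results if r is not None and r[2] > 0]
    if not valid:
        return None
    return min(valid, key=lambda r: r[2])


# Made-up example values, purely for illustration.
example_results = [
    ["norm", (0.1, 1.2), 0.42],
    None,                      # a fit that failed
    ["gamma", (2.0, 0.0, 1.5), 0.17],
    ["expon", (0.0, 1.0), 0.95],
]
best = select_best(example_results)
```

Here best is the "gamma" entry, since 0.17 is the lowest SSE among the valid results.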
I have a list of zipcodes that I want to pull business listings for using the Yelp Fusion API. Each zipcode will require at least one API call (often many more), so I want to be able to keep track of my API usage, as the daily limit is 25000. I have defined each zipcode as an instance of a user-defined Locale class. This Locale class has a class variable Locale.pulls, which acts as a global counter for the number of pulls.
I want to multithread this using the multiprocessing module, but I am not sure whether I need to use locks and, if so, how I would do so. The concern is race conditions, as I need to be sure each thread sees the current number of pulls, defined as the Locale.pulls class variable in the pseudocode below.
import multiprocessing.dummy as mt
class Locale():
    pulls = 0
    MAX_PULLS = 20000
    def __init__(self,x,y):
        #initialize the instance with arguments needed to complete the API call
        pass
    def pull(self):
        if Locale.pulls > Locale.MAX_PULLS:
            return None
        else:
            # make the request, store the returned data and increment the counter
            self.data = self.call_yelp()
            Locale.pulls += 1
def main():
    #zipcodes below is a list of arguments needed to initialize each zipcode as a Locale class object
    pool = mt.Pool(len(zipcodes)//100) # let each thread work on 100 zipcodes
    data = pool.map(Locale, zipcodes)
A simple solution would be to check that len(zipcodes) < MAX_PULLS before running the map().
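If a zipcode can need more than one call, a pre-check on len(zipcodes) is not enough, and the counter itself has to be protected. The following is a minimal sketch of one way to do that with a threading.Lock; PullCounter, pull, pull_all and the fetch argument are hypothetical names, with fetch standing in for the real Yelp request.

```python
import threading
from multiprocessing.dummy import Pool  # thread-based pool, as in the question


class PullCounter:
    """Thread-safe counter that refuses to exceed a fixed budget."""

    def __init__(self, max_pulls):
        self.max_pulls = max_pulls
        self.pulls = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        """Reserve one pull; return False once the budget is spent."""
        # Check and increment happen under one lock, so two threads can
        # never both claim the last remaining pull.
        with self._lock:
            if self.pulls >= self.max_pulls:
                return False
            self.pulls += 1
            return True


def pull(args):
    """Worker: make one call if the budget allows, else return None."""
    counter, zipcode, fetch = args
    if not counter.try_acquire():
        return None
    return fetch(zipcode)


def pull_all(zipcodes, max_pulls, fetch, threads=4):
    """Map `fetch` over zipcodes without exceeding `max_pulls` calls."""
    counter = PullCounter(max_pulls)
    with Pool(threads) as pool:
        data = pool.map(pull, [(counter, z, fetch) for z in zipcodes])
    return data, counter.pulls
```

The design choice here is to reserve a pull before making the request, so the limit is never overshot even if several threads reach the check at the same moment.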
So I am making a text adventure game, and currently I am making the enemies. My class random_enemies makes trash mobs for your character to fight, and it has methods called weak, normal, strong, etc. that scale with your character depending on which one it is. When I call random_enemies.weak it says (NameError: global variable "p" is not defined) even though it should be.
import random
from character import *
from player import *
class random_enemies(character):
    def __init__(self,name,hp,maxhp,attack_damage,ability_power,exp):
        super(random_enemies,self).__init__(name,hp,maxhp)
        self.attack_damage = attack_damage
        self.ability_power = ability_power
        self.exp = exp
    def weak():
        self.hp = random.randint(p.maxhp/10, p.maxhp/5)
        self.attack_damage = None
        self.ability_power = None
        self.exp = None
from character import *
class player(character):
    def __init__(self,name,hp,maxhp,attack_damage,ability_power):
        super(player,self).__init__(name, hp, maxhp)
        self.attack_damage = attack_damage
        self.ability_power = ability_power
This is my player class and below is the class that player gets "maxhp" from.
class character(object):
    def __init__(self,name,hp,maxhp):
        self.name = name
        self.hp = hp
        self.maxhp = maxhp
    def attack(self,other):
        pass
p=player(Players_name, 100, 100, 10, 5,)
while (p.hp>0):
    a=input("What do you want to do?")
    if a=="Instructions":
        Instructions()
    elif a=="Commands":
        Commands()
    elif a=="Fight":
        print("Level",wave,"Wave Begins")
        if wave < 6:
            b = random_enemies.weak()
            print("A",b,"Appeared!")
            print("Stats of",b, ": \n Health=", b.hp,"Attack Damage=",b.attack_damage)
            continue
I just made this really quickly just to test if everything I had was working until I got the error. This is also the place where random_enemies.weak() was called. Also in this is where I defined what "p" was.
So, first of all, follow a naming convention. For Python code I recommend PEP 8.
You have a problem with classes vs. instances in your code. First, you need an instance of a class before you can use it:
enemy = random_enemy() # a better name would be RandomEnemy
In Python, every instance method takes self as its first parameter, and you need to pass the method whatever other arguments it needs to do its work. weak is a method, so it should look more like this:
def weak(self, player):
    # the method for weak ... weak attack ?
    # remember to change p to player, which is more meaningful
    ...
Now that you have your instance and it has a method weak which receives a player as argument, you can use it as follows:
# you can't use random_enemy here as you tried because it is a class
# you need a random_enemy instance, the enemy that I talked about earlier
b = enemy.weak(player) # renamed p to player because it is more meaningful
For this all to work, you will need one more thing. weak() needs to return something. Right now you are using what it returns, nothing! The code that you posted is b = random_enemies.weak(). Because weak() does not have a return clause, b will always be None.
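Putting these points together, a minimal runnable sketch could look like this. The class names follow PEP 8 and are illustrative rather than the asker's exact code, and the choice to have weak return the enemy itself is an assumption about what the caller wants to receive.

```python
import random


class Character:
    def __init__(self, name, hp, maxhp):
        self.name = name
        self.hp = hp
        self.maxhp = maxhp


class Player(Character):
    pass


class RandomEnemy(Character):
    def weak(self, player):
        """Scale this enemy to the given player and return it."""
        # Integer division, because random.randint expects integers.
        self.hp = random.randint(player.maxhp // 10, player.maxhp // 5)
        self.maxhp = self.hp
        return self  # without a return, the caller would get None


player = Player("Hero", 100, 100)
# Create an instance first, then call the method on that instance.
enemy = RandomEnemy("Goblin", 0, 0).weak(player)
print("A", enemy.name, "appeared with", enemy.hp, "health")
```

Because the player is passed in as an argument, weak no longer depends on a global variable named p being defined somewhere else.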
Some notes: Avoid one-letter variables unless there is a long standing convention (like using i for loop counter). It is easier to understand what you are trying to do if you define player instead of just p.
Python has a really great tutorial for all this stuff.