Capture 1 resource from a request of many resources: easier approach? - simpy

I am relatively new to Python and very new to Simpy. I am trying to build a model of a hospital system that:
Has roughly 20 resources (units) where each resource has a capacity of 5 to 50 (beds)
Not all units can serve the same patients. They have specialties.
When a patient needs a bed, the hospital requests 1 bed from roughly 5 of the 20 units.
So, what I want to do is make a request against multiple Unit resources and only capture 1 bed from 1 available Unit. After many iterations, I think I have found a way of doing this, but my approach feels overly complicated.
Below I am showing code using SimPy's conditional AnyOf. The way AnyOf works, if more than one resource has availability, then more than one request will be filled. So, after the AnyOf request, I release any extra captured beds and cancel any requests still pending.
Is there an easier approach?
from dataclasses import dataclass
import simpy
from simpy.events import AnyOf, Event
import random

@dataclass
class Unit():
    env: simpy.Environment
    identifier: str
    capacity: int = 1

    def __post_init__(self):
        self.beds: simpy.Resource = simpy.Resource(self.env, self.capacity)

def make_request(env, units: list[Unit]):
    # create a list of simpy request events for a random pool of units;
    # the purpose is to put them in a conditional request
    any_of_request: list[Event] = []
    request_to_unit_dictionary = {}

    random_pool_size = random.randint(1, 5)
    for _ in range(random_pool_size):
        random_unit = random.randint(0, 9)
        unit = units[random_unit]
        res_request = unit.beds.request()
        request_to_unit_dictionary[res_request] = unit
        any_of_request.append(res_request)

    get_one_bed_request = AnyOf(env, any_of_request)
    captured_units = yield get_one_bed_request

    # how do I determine what unit was captured by the request?
    # it is possible that more than one resource is available
    # and more than one request gets filled
    captured_requests = list(captured_units.keys())
    captured_request = captured_requests[0]
    captured_unit: Unit = request_to_unit_dictionary[captured_request]
    print(captured_unit)

    # if I understand correctly, if I only want 1 request then
    # release any "extra" captures
    for r in captured_requests:
        if r != captured_request:
            r.resource.release(r)

    # cancel any of the requests not captured
    for r in any_of_request:
        if r not in captured_requests:
            r.cancel()

env = simpy.Environment()

# create 10 units, each with a capacity of 1 for this example
units: list[Unit] = []
for x in range(10):
    units.append(Unit(identifier=f'unit{x}', env=env, capacity=1))

env.process(make_request(env, units))
env.process(make_request(env, units))
env.process(make_request(env, units))
env.process(make_request(env, units))
env.run()
Thanks,
Dan

Related

How to fetch data through multiple accounts with threading in Python 3

I want to achieve a function that can fetch data in parallel. The background is that the information of 100 sites can be fetched from site A.
The same account can't be used more than once at a time, so I created 5 different accounts on site A that enable me to fetch information with 5 accounts.
account info like
worker1 pawd
worker2 pawd
worker3 pawd
worker4 pawd
worker5 pawd
If you want to get information of site B from site A, then you need to type a command like get info for siteB_IP on site A.
Suppose there are 100 IPs stored in a list named IPlist.
How do I fetch the information of the 100 IPs with the 5 available accounts in parallel by threading, so that all of the information can be stored in a variable without conflict?
What I have tried is below; this code cannot be executed because I have not found a way to achieve the solution:
import threading

user = 'root'
pwd = 'Changeme123'

# the first step is to log on with the default account
rs = link.send_cmd(r':lognew:' + '"' + user + '","' + pwd + '"')

# then get all neighbor IPs from the logon site; the function parse_multi is used for parsing data
IPlist = parse_multi(link.send_cmd('get-IP-info:0xffff'))

def Fetchinfo(user, ip):
    rs = link.send_cmd(r':lognew:' + '"' + user + '","' + pwd + '"')
    areainfo = link.send_cmd('get info for ' + ip)

for ip in IPlist:
    # how to handle 100 IPs in the situation of 5 available accounts?
    thread = threading.Thread(target=Fetchinfo, args=[worker, ip])
Since you don't want calls from the same account ID and password to happen concurrently, you can define a function that sequentially loops through a sub-list of IPs and fetches them synchronously:
def fetch_data_for_ips(account_id, account_password, ips_to_fetch):
    results = list()
    for ip_to_fetch in ips_to_fetch:
        # fetch with the account_id and password synchronously
        result = ...
        results.append(result)
    return results
Then, use a thread pool to run the different batches concurrently, one per account:
from concurrent.futures import ThreadPoolExecutor, as_completed

# Split the workload for each account to fetch
num, remainder = divmod(len(ip_list), len(accounts))
num_ips_for_each_account = num + bool(remainder)

# This gives e.g. [[1,2,3], [4,5,6]], where each sublist is for each account to fetch
ip_lists_for_each_account = [ip_list[i: i + num_ips_for_each_account]
                             for i in range(0, len(ip_list), num_ips_for_each_account)]

# You should only need a number of threads equal to the number of accounts you have
with ThreadPoolExecutor(len(accounts)) as executor:
    # Feel free to use a set instead if you don't need to know which result came from which thread
    futures = dict()
    results = list()
    for (account_id, account_password), ips_to_fetch in zip(accounts, ip_lists_for_each_account):
        future = executor.submit(fetch_data_for_ips, account_id, account_password, ips_to_fetch)
        futures[future] = account_id
    for future in as_completed(futures):
        result = future.result()
        account_id = futures[future]
        print(f'{account_id} fetched these:', result)
        results.extend(result)
You can refer to the sample code below, as rcshon suggested.
def fetch_data_for_ips(account_id, ips_to_fetch):
    results = list()
    for ip_to_fetch in ips_to_fetch:
        # fetch with the account_id and password synchronously
        result = ','.join((account_id, ip_to_fetch))
        results.append(result)
    return results

from concurrent.futures import ThreadPoolExecutor, as_completed

accounts = ['worker1', 'worker2', 'worker3', 'worker4', 'worker5']
ip_list = [str(_) for _ in range(10)]

# Split the workload for each account to fetch
num, remainder = divmod(len(ip_list), len(accounts))
num_ips_for_each_account = num + bool(remainder)

# This gives e.g. [[1,2,3], [4,5,6]], where each sublist is for each account to fetch
ip_lists_for_each_account = [ip_list[i: i + num_ips_for_each_account]
                             for i in range(0, len(ip_list), num_ips_for_each_account)]

# You should only need a number of threads equal to the number of accounts you have
with ThreadPoolExecutor(len(accounts)) as executor:
    # Feel free to use a set instead if you don't need to know which result came from which thread
    futures = dict()
    results = list()
    for account_id, ips_to_fetch in zip(accounts, ip_lists_for_each_account):
        future = executor.submit(fetch_data_for_ips, account_id, ips_to_fetch)
        futures[future] = account_id
    for future in as_completed(futures):
        result = future.result()
        account_id = futures[future]
        print(f'{account_id} fetched these:', result)
        results.extend(result)
Output:
worker3 fetched these: ['worker3,4', 'worker3,5']
worker2 fetched these: ['worker2,2', 'worker2,3']
worker1 fetched these: ['worker1,0', 'worker1,1']
worker4 fetched these: ['worker4,6', 'worker4,7']
worker5 fetched these: ['worker5,8', 'worker5,9']

Simpy resource unavailability

I am trying to make resources unavailable for a certain time in SimPy. The issue is that with timeout I find the resource is still active and serving during the time it should be unavailable. Can anyone help me with this in case you have encountered such a problem? Thanks a lot!
import numpy as np
import simpy

def interarrival():
    return np.random.exponential(10)

def servicetime():
    return np.random.exponential(20)

def servicing(env, servers_1):
    i = 0
    while True:
        i = i + 1
        yield env.timeout(interarrival())
        print("Customer " + str(i) + " arrived in the process at " + str(env.now))
        state = 0
        env.process(items(env, i, servers_array, state))

def items(env, customer_id, servers_array, state):
    with servers_array[state].request() as request:
        yield request
        t_arrival = env.now
        print("Customer " + str(customer_id) + " arrived in " + str(state) + " at " + str(t_arrival))
        yield env.timeout(servicetime())
        t_depart = env.now
        print("Customer " + str(customer_id) + " departed from " + str(state) + " at " + str(t_depart))
        if state == 1:
            print("Customer exists")
        else:
            state = 1
            env.process(items(env, customer_id, servers_array, state))

def delay(env, servers_array):
    while True:
        if env.now % 1440 >= 540 and env.now <= 1080:
            # note: this yields a plain int, not a simpy event --
            # the likely source of the AttributeError mentioned below
            yield (1080 - env.now % 1440)
        else:
            print(str(env.now), "resources will be blocked")
            resource_unavailability_dict = dict()
            resource_unavailability_dict[0] = []
            resource_unavailability_dict[1] = []
            for nodes in resource_unavailability_dict:
                for _ in range(servers_array[nodes].capacity):
                    resource_unavailability_dict[nodes].append(servers_array[nodes].request())
            print(resource_unavailability_dict)
            for nodes in resource_unavailability_dict:
                yield env.all_of(resource_unavailability_dict[nodes])
            if env.now < 540:
                yield env.timeout(540)
            else:
                yield env.timeout((int(env.now / 1440) + 1) * 1440 + 540 - env.now)
            for nodes in resource_unavailability_dict:
                for request in resource_unavailability_dict[nodes]:
                    servers_array[nodes].release(request)
            print(str(env.now), "resources are released")

env = simpy.Environment()
servers_array = []
servers_array.append(simpy.Resource(env, capacity=5))
servers_array.append(simpy.Resource(env, capacity=7))
env.process(servicing(env, servers_array))
env.process(delay(env, servers_array))
env.run(until=2880)
The code is given above. Actually, I have two nodes, 0 and 1, where the server capacities are 5 and 7 respectively. The servers are unavailable before 9 AM (540 minutes from midnight) and after 6 PM every day. I am trying to create the unavailability using timeout, but it is not working. Can you suggest how I can modify the code to incorporate it?
I am also getting the error AttributeError: 'int' object has no attribute 'callbacks', which I can't figure out.
So the problem with simpy resources is that the capacity is a read-only attribute. To get around this, you need something to seize and hold the resource offline. In essence, I have two types of users: the ones that do "real work" and the ones that control the capacity. I am using a simple resource, which means that the queue at the scheduled time will get processed before the capacity change occurs. Using a priority resource means the current users of a resource can finish their processes before the capacity change occurs, or you can use a pre-emptive resource to interrupt users holding resources at the scheduled time. Here is my code:
"""
one way to change a resouce capacity on a schedule
note the the capacity of a resource is a read only atribute
Programmer: Michael R. Gibbs
"""
import simpy
import random
def schedRes(env, res):
"""
Performs maintenance at time 100 and 200
waits till all the resources have been seized
and then spend 25 time units doing maintenace
and then release
since I am using a simple resource, maintenance
will wait of all request that are already in
the queue when maintenace starts to finish
you can change this behavior with a priority resource
or pre-emptive resource
"""
# wait till first scheduled maintenance
yield env.timeout(100)
# build a list of requests for each resource
# then wait till all requests are filled
res_maint_list = []
print(env.now, "Starting maintenance")
for _ in range(res.capacity):
res_maint_list.append(res.request())
yield env.all_of(res_maint_list)
print(env.now, "All resources seized for maintenance")
# do maintenance
yield env.timeout(25)
print(env.now, "Maintenance fisish")
# release all the resources
for req in res_maint_list:
res.release(req)
print(env.now,"All resources released from maint")
# wait till next scheduled maintenance
dur_to_next_maint = 200 -env.now
if dur_to_next_maint > 0:
yield env.timeout(dur_to_next_maint)
# do it all again
res_maint_list = []
print(env.now, "Starting maintenance")
for _ in range(res.capacity):
res_maint_list.append(res.request())
yield env.all_of(res_maint_list)
print(env.now, "All resources seized for maintenance")
yield env.timeout(25)
print(env.now, "Maintenance fisish")
for req in res_maint_list:
res.release(req)
print(env.now,"All resources released from maint")
def use(env, res, dur):
"""
Simple process of a user seizing a resource
and keeping it for a little while
"""
with res.request() as req:
print(env.now, f"User is in queue of size {len(res.queue)}")
yield req
print(env.now, "User has seized a resource")
yield env.timeout(dur)
print(env.now, "User has released a resource")
def genUsers(env,res):
"""
generate users to seize resources
"""
while True:
yield env.timeout(10)
env.process(use(env,res,21))
# set up
env = simpy.Environment()
res = simpy.Resource(env,capacity=2) # may want to use a priority or preemtive resource
env.process(genUsers(env,res))
env.process(schedRes(env, res))
# start
env.run(300)
One way to do this is with preemptive resources. When it is time to make resources unavailable, issue a bunch of requests with the highest priority to seize idle resources and to preempt resources currently in use. These requests would then release the resources when it's time to make the resources available again. Note that you will need to add some logic for how the preempted processes resume once the resources become available again. If you do not need to preempt processes, you can just use priority resources instead of preemptive resources.
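For illustration, here is a minimal sketch of that preemptive idea; this is not code from the question or the answers above, and the capacity, times, and priorities are made-up assumptions:

import simpy

def user(env, res, dur):
    # low-priority user; may be preempted while holding a unit
    with res.request(priority=1) as req:
        yield req
        try:
            yield env.timeout(dur)
            print(env.now, "user finished normally")
        except simpy.Interrupt:
            # resume/requeue logic for preempted users would go here
            print(env.now, "user was preempted")

def offline_window(env, res, close_at, reopen_at):
    yield env.timeout(close_at)
    # lowest priority number wins; preempt=True bumps current users
    reqs = [res.request(priority=-1, preempt=True) for _ in range(res.capacity)]
    yield env.all_of(reqs)
    print(env.now, "all units offline")
    yield env.timeout(reopen_at - close_at)
    for req in reqs:
        res.release(req)
    print(env.now, "units available again")

env = simpy.Environment()
res = simpy.PreemptiveResource(env, capacity=2)
env.process(user(env, res, dur=40))
env.process(offline_window(env, res, close_at=20, reopen_at=60))
env.run(until=100)

The key point is that the offline requests use a higher priority than any real user, so they win ties and, with preempt=True, interrupt users already holding a unit.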

How to get the list of followers from an Instagram account without getting banned?

I am trying to scrape all the followers of some particular Instagram accounts. I am using Python 3.8.3 and the latest version of the Instaloader library. The code I have written is given below:
# Import the required libraries:
import instaloader
import time
from random import randint

# Start time:
start = time.time()

# Create an instance of instaloader:
loader = instaloader.Instaloader()

# Credentials & target account:
user_id = USERID
password = PASSWORD
target = TARGET  # Account of which the list of followers needs to be scraped;

# Login or load the session:
loader.login(user_id, password)

# Obtain the profile metadata of the target:
profile = instaloader.Profile.from_username(loader.context, target)

# Print the list of followers and save it in a text file:
try:
    # The list to store the collected user handles of the followers:
    followers_list = []

    # Variables used to apply pauses to slow down scraping:
    count = 0
    short_counter = 1
    short_pauser = randint(19, 24)
    long_counter = 1
    long_pauser = randint(4900, 5000)

    # Fetch the followers one by one:
    for follower in profile.get_followers():
        sleeper = randint(840, 1020)

        # Short pause for the process:
        if short_counter % short_pauser == 0:
            short_counter = 0
            short_pauser = randint(19, 24)
            print('\nShort Pause.\n')
            time.sleep(1)

        # Long pause for the process:
        if long_counter % long_pauser == 0:
            long_counter = 0
            long_pauser = randint(4900, 5000)
            print('\nLong pause.\n')
            time.sleep(sleeper)

        # Append the list and print the follower's user handle:
        followers_list.append(follower.username)
        print(count, '', followers_list[count])

        # Increment the counters accordingly:
        count = count + 1
        short_counter = short_counter + 1
        long_counter = long_counter + 1

    # Store the followers list in a txt file:
    txt_file = target + '.txt'
    with open(txt_file, 'a+') as f:
        for the_follower in followers_list:
            f.write(the_follower)
            f.write('\n')

except Exception as e:
    print(e)

# End time:
end = time.time()
total_time = end - start

# Print the time taken for execution:
print('Time taken for complete execution:', total_time, 's.')
I am getting the following error after scraping some data:
HTTP Error 400 (Bad Request) on GraphQL Query. Retrying with shorter page length.
HTTP Error 400 (Bad Request) on GraphQL Query. Retrying with shorter page length.
400 Bad Request
In fact, the error occurs when Instagram detects unusual activity, disables the account for a while, and prompts the user to change the password.
I have tried -
(1) Slowing down the process of scraping.
(2) Adding pauses in between in order to make the program more human-like.
Still, no progress.
How to bypass such restrictions and get the complete list of all the followers?
If getting the entire list is not possible, what is the best way to get at least 20,000 followers list (from multiple accounts) without getting banned / disabled account / facing such inconveniences?

Running an MPI Python script in an Azure ML MPI pipeline

I'm trying to run a distributed Python job through Azure ML pipelines using the MpiStep pipeline class, referring to the example below - https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb
I implemented the same, but even when I change the node count parameter in the MpiStep class, the script always shows the size (i.e. comm.Get_size()) as 1. Can you please help me with what I'm missing here? Is there any specific setup required on the cluster?
Code snippets:
Pipeline code snippet:
model_dir = model_ds.path('./' + saved_model_blob + '/', data_reference_name='saved_model_path').as_mount()
label_dir = model_ds.path('./' + model_label_blob + '/', data_reference_name='model_label_blob').as_mount()
input_images = result_ds.path('./' + score_blob_name + '/', data_reference_name='Input_images').as_mount()
output_container = 'abc'
inti_container = 'xyz'

distributed_batch_score_step = MpiStep(
    name="batch_scoring",
    source_directory=SCRIPT_FOLDER,
    script_name="batch_scoring_script_mpi.py",
    arguments=["--dataset_path", input_images,
               "--model_name", model_dir,
               "--label_dir", label_dir,
               "--intermediate_data_container", inti_container,
               "--output_container", output_container],
    compute_target=gpu_cluster,
    inputs=[input_images, model_dir, label_dir],
    pip_packages=["tensorflow", "tensorflow-gpu==1.13.1", "pillow", "azure-keyvault", "azure-storage-blob"],
    conda_packages=["mesa-libgl-cos6-x86_64", "mpi4py==3.0.2", "opencv=3.4.2", "scikit-learn=0.21.2"],
    use_gpu=True,
    allow_reuse=False,
    node_count=nodecount_param,
    process_count_per_node=1
)
Python Script code snippet:
def run(input_dataset,comm):
rank = comm.Get_rank()
size = comm.Get_size()
print("Rank:" , rank)
print("Size:", size) # shows always 1, even the input node count is >1
print(MPI.Get_processor_name())
file_names = get_file_names(args.dataset_path)
sorted(file_names)
partition_size = len(file_names) // size
print("partition_size-->",partition_size)
partitioned_filenames = file_names[rank * partition_size: (rank + 1) * partition_size]
print("RANK {} - is processing {} images out of the total {}".format(rank, len(partitioned_filenames),
len(file_names)))
# call to Function 01
# call to Function 02
img_names = score_df['image_name'].unique()
output_batch = pd.DataFrame()
for i in img_names:
# call to Function 3
output_batch = output_batch.append(pp_output, ignore_index=True)
output_paths_list = comm.gather(output_batch, root=0)
print("RANK {} - number of pre-aggregated output files {}".format(rank, len(output_batch)))
print("saved in", currentDT + '\\' + 'data.csv')
if rank == 0:
print("RANK {} - number of aggregated output files {}".format(rank, len(output_paths_list)))
print("RANK {} - end".format(rank))
if __name__ == "__main__":
with tf.device('/GPU:0'):
init()
comm = MPI.COMM_WORLD
run(args.dataset_path,comm)
It turned out the issue was due to the package version: mpi4py was originally installed via conda with conda_packages=["mpi4py==3.0.2"], and it worked after changing the install to pip - pip_packages=["mpi4py"].
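For reference, applied to the pipeline snippet above, that fix would look something like the following sketch; only the two package lists change, and the elided arguments are unchanged from the original:

distributed_batch_score_step = MpiStep(
    name="batch_scoring",
    source_directory=SCRIPT_FOLDER,
    script_name="batch_scoring_script_mpi.py",
    arguments=[...],  # unchanged from the original snippet
    compute_target=gpu_cluster,
    inputs=[input_images, model_dir, label_dir],
    # mpi4py moved from conda_packages to pip_packages:
    pip_packages=["tensorflow", "tensorflow-gpu==1.13.1", "pillow",
                  "azure-keyvault", "azure-storage-blob", "mpi4py"],
    conda_packages=["mesa-libgl-cos6-x86_64", "opencv=3.4.2", "scikit-learn=0.21.2"],
    use_gpu=True,
    allow_reuse=False,
    node_count=nodecount_param,
    process_count_per_node=1
)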

Assign Class attributes from list elements

I'm not sure if the title accurately describes what I'm trying to do. I have a Python 3.x script that I wrote that will issue a flood warning to my Facebook page when the river near my home has reached its lowest flood stage. Right now the script works, however it only reports data from one measuring station. I would like to be able to process the data from all of the stations in my county (a total of 5), so I was thinking that maybe a class method may do the trick, but I'm not sure how to implement it. I've been teaching myself Python since January and feel pretty comfortable with the language for the most part, and while I have a good idea of how to build a class object I'm not sure how my flow chart should look. Here is the code now:
#!/usr/bin/env python3
'''
Facebook Flood Warning Alert System - this script will post a notification to
Facebook whenever the Sabine River @ Hawkins reaches flood stage (22.3')
'''
import requests
import facebook
from lxml import html

graph = facebook.GraphAPI(access_token='My_Access_Token')
river_url = 'http://water.weather.gov/ahps2/river.php?wfo=SHV&wfoid=18715&riverid=203413&pt%5B%5D=147710&allpoints=143204%2C147710%2C141425%2C144668%2C141750%2C141658%2C141942%2C143491%2C144810%2C143165%2C145368&data%5B%5D=obs'
ref_url = 'http://water.weather.gov/ahps2/river.php?wfo=SHV&wfoid=18715&riverid=203413&pt%5B%5D=147710&allpoints=143204%2C147710%2C141425%2C144668%2C141750%2C141658%2C141942%2C143491%2C144810%2C143165%2C145368&data%5B%5D=all'

def checkflood():
    r = requests.get(river_url)
    tree = html.fromstring(r.content)
    stage = ''.join(tree.xpath('//div[@class="stage_stage_flow"]//text()'))
    warn = ''.join(tree.xpath('//div[@class="current_warns_statmnts_ads"]/text()'))
    stage_l = stage.split()
    level = float(stage_l[2])

    # check if we're at flood level
    if level < 22.5:
        pass
    elif level == 37:
        major_diff = level - 23.0
        major_r = ('The Sabine River near Hawkins, Tx has reached [Major Flood Stage]: @', stage_l[2], 'Ft. ', str(round(major_diff, 2)), ' Ft. \n Please click the link for more information.\n\n Current Warnings and Alerts:\n ', warn)
        major_p = ''.join(major_r)
        graph.put_object(parent_object='me', connection_name='feed', message=major_p, link=ref_url)
    <--snip-->

checkflood()
Each station has different categories for flood stage - Action, Flood, Moderate, Major - with different depths per station. So for the Sabine River at Hawkins it will be Action - 22', Flood - 24', Moderate - 28', Major - 32'. For the other stations those depths are different. So I know that I'll have to start out with something like:
class River:
    def __init__(self, id, stage):
        self.id = id        # station ID
        self.stage = stage  # river level

    @staticmethod
    def check_flood(stage):
        if stage < 22.5:
            pass
        elif stage.....
but from there I'm not sure what to do. Where should it be added in(to?) the code, should I write a class to handle the Facebook postings as well, is this even something that needs a class method to handle, is there any way to clean this up for efficiency? I'm not looking for anyone to write this up for me, but some tips and pointers would sure be helpful. Thanks everyone!
EDIT: Here is what I figured out, and it is working:
class River:
    name = ""
    stage = ""
    action = ""
    flood = ""
    mod = ""
    major = ""
    warn = ""

    def checkflood(self):
        if float(self.stage) < float(self.action):
            pass
        elif float(self.stage) >= float(self.major):
            <--snip-->

mineola = River()
mineola.name = stations[0]
mineola.stage = stages[0]
mineola.action = "13.5"
mineola.flood = "14.0"
mineola.mod = "18.0"
mineola.major = "21.0"
mineola.alert = warn[0]

hawkins = River()
hawkins.name = stations[1]
hawkins.stage = stages[1]
hawkins.action = "22.5"
hawkins.flood = "23.0"
hawkins.mod = "32.0"
hawkins.major = "37.0"
hawkins.alert = warn[1]

<--snip-->
So from here I'm trying to collapse all the individual river blocks into one block. What I have tried so far is this:
>>> class River:
...     name = ""
...     stage = ""
...     def testcheck(self):
...         return self.name, self.stage
...
>>> for n in range(num_river):
...     stations[n] = River()
...     stations[n].name = stations[n]
...     stations[n].stage = stages[n]
...
>>> for n in range(num_river):
...     stations[n].testcheck()
...
<__main__.River object at 0x7fbea469bc50> 4.13
<__main__.River object at 0x7fbea46b4748> 20.76
<__main__.River object at 0x7fbea46b4320> 22.13
<__main__.River object at 0x7fbea46b4898> 16.08
So this doesn't give me the printed results that I was expecting. How can I return the string instead of the object? Will I be able to define the class attributes in this manner, or will I have to list them out individually? Thanks again!
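For what it's worth, the object reprs in that output most likely come from stations[n].name = stations[n]: the list entry has just been overwritten with a River(), so the object itself is assigned as the name. A minimal sketch of one way around it, assuming hypothetical station_names and stages lists in place of the original data:

# hypothetical stand-ins for the original data
station_names = ['mineola', 'hawkins', 'gladewater', 'longview']
stages = ['4.13', '20.76', '22.13', '16.08']

class River:
    name = ""
    stage = ""
    def testcheck(self):
        return self.name, self.stage

# build the objects in a separate list so the names are not overwritten
stations = []
for river_name, river_stage in zip(station_names, stages):
    r = River()
    r.name = river_name    # assign the string, not the River object
    r.stage = river_stage
    stations.append(r)

for r in stations:
    print(*r.testcheck())  # prints e.g. "mineola 4.13"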
After reading many, many, many articles and tutorials on class objects I was able to come up with a solution for creating the objects using list elements.
class River():
    def __init__(self, river, stage, flood, action):
        self.river = river
        self.stage = stage
        self.flood = flood
        self.action = action

    def alerts(self):
        if float(self.stage) < float(self.flood):
            #alert = "The %s is below Flood Stage (%sFt) @ %s Ft. \n" % (self.river, self.flood, self.stage)
            pass
        elif float(self.stage) > float(self.flood):
            alert = "The %s has reached Flood Stage(%sFt) @ %sFt. Warnings: %s \n" % (self.river, self.flood, self.stage, self.action)
            return alert

'''this is the function that I was trying to create
to build the class objects automagically'''
def riverlist():
    river_list = []
    for n in range(len(rivers)):
        station = River(rivers[n], stages[n], floods[n], warns[n])
        river_list.append(station)
    return river_list

if __name__ == '__main__':
    for x in riverlist():
        print(x.alerts())
