Unit testing a private callback factory - python-3.x

I am trying to create unit tests for a private callback factory within an alarm manager class. I don't want to test the private method directly, as I think that contravenes TDD good practice, but I cannot figure out a good way to test this.
Please don't just tell me my code is bad (I already know!). I just want to know whether it is possible to use the unittest framework to test this, or whether I need to re-architect my code.
My class is an alarm manager that is responsible for managing the creation, scheduling and sounding of an alarm. When an alarm is created, it is passed to a scheduler along with a callback function. To create this callback function, there is a private callback factory that takes in the alarm object and generates the custom callback (while capturing some variables from the manager class).
The problem I have is that unless I use the full functionality of the scheduler, the callback will never get executed, and patching time so that it fires within a reasonable test run would be painful. Furthermore, the callback is never returned to the caller, so there is no easy way of inspecting it (or even calling it directly).
The manager class looks largely like the code below:
class Manager:

    def __init__(self):
        self._scheduler = alarm.scheduler.Scheduler()
        self._player = sound.player.Player()

    def create_alarm(self, name, new_alarm):
        self._scheduler.add_job(name, new_alarm.find_next_alarm(),
                                self._create_callback(name, new_alarm))

    def _create_callback(self, name, callback_alarm):
        def callback():
            self._player.play(callback_alarm.get_playback())
        return callback
Overall, is there a way to extract the callback if I make the scheduler a mock object? Or is there some other clever way to test that the callback is doing what it should?
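For concreteness, here is a minimal sketch of that mock-extraction idea (assuming alarm.scheduler.Scheduler and sound.player.Player can be patched at the import paths used in __init__, and that the alarm object can be a simple Mock):

# Sketch only: the patch targets and alarm attributes below are assumptions.
import unittest
from unittest import mock

class TestManagerCallback(unittest.TestCase):

    @mock.patch("sound.player.Player")
    @mock.patch("alarm.scheduler.Scheduler")
    def test_callback_plays_the_alarm(self, mock_scheduler_cls, mock_player_cls):
        manager = Manager()
        fake_alarm = mock.Mock()
        fake_alarm.get_playback.return_value = "playback-data"

        manager.create_alarm("wake_up", fake_alarm)

        # Pull the callback back out of the mocked scheduler's add_job call...
        _name, _next_time, callback = mock_scheduler_cls.return_value.add_job.call_args[0]

        # ...then call it directly and assert on the observable behaviour.
        callback()
        mock_player_cls.return_value.play.assert_called_once_with("playback-data")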

Related

Designing a system to centrally manage series of events on different systems

I have a problem at work where I need to perform a series of sequential tasks on different devices. These devices do not need to interact with each other, and each of the sequential tasks can be performed on each device independently.
Assuming I have tasks (A -> B -> C -> D) (e.g. the end of A triggers B, the end of B triggers C, and so on), devices (dev1, dev2) can execute these tasks independently of each other.
How can I design a centralized system that executes each task on each device? I cannot use threading or multiprocessing due to infrastructure limitations.
I'm looking for some design suggestions (classes) and how I can go about designing it.
The first approach I thought about was brute force, where I blindly use loops to iterate over the devices and perform each task.
For the second approach I was reading about the State design pattern, but I was not sure how to implement it.
EDIT: I have implemented the answer I have provided below. However, I would like to know the correct way to transfer information between states. I know states need to be mutually exclusive, but each task needs to access certain resources that are shared across all the states. How can I structure this?
I have used the State design pattern to handle this. I have a concrete Device class with a method called "perform_task". This method changes behavior based on the state the device is in; at a given point it can be in TaskA, TaskB, and so on.
class Device():
    _state = None

    def __init__(self):
        """Constructor method"""
        self.switch_to(TaskA())

    def switch_to(self, state):
        self._state = state
        self._state.context = self

    def perform_task(self):
        self._state.perform_task()
Then I have a State abstract class with the abstract methods, followed by the concrete state classes themselves.
from abc import ABC, abstractmethod

class State(ABC):

    @property
    def context(self):
        return self._context

    @context.setter
    def context(self, context):
        self._context = context

    @abstractmethod
    def perform_task(self):
        pass

class TaskA(State):
    def perform_task(self):
        # Do something, then hand the device over to the next state.
        self.context.switch_to(TaskB())

class TaskB(State):
    def perform_task(self):
        # Do something.
        pass
Doing so, we can extend this to any number of states in the future and handle new conditions too.
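For illustration, a minimal sketch of how these classes might be driven (the driver loop below is an assumption, not part of the design above):

# Illustrative driver only.
devices = [Device(), Device()]  # e.g. dev1 and dev2
for device in devices:
    device.perform_task()  # runs TaskA, which switches the device to TaskB
    device.perform_task()  # runs TaskB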
I would probably try something with Flask for a super simple API, plus a client app on the devices that polls data from the central API and posts results back, so the central server knows the progress and what is currently in use. The client app would be a super simple loop with a sleep so it won't sit at 100% CPU when it doesn't need to.
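A rough sketch of that idea (the endpoint names, URLs and the run_task helper are illustrative assumptions):

# Sketch only: endpoint names, URLs and run_task are assumptions.

# --- Central server: a tiny Flask API that hands out tasks and records results ---
from flask import Flask, jsonify, request

app = Flask(__name__)
progress = {}  # device name -> last reported result

@app.route("/next-task/<device>")
def next_task(device):
    # Decide which task the device should run next (hard-coded here for brevity).
    return jsonify({"task": "TaskA"})

@app.route("/result/<device>", methods=["POST"])
def report_result(device):
    progress[device] = request.get_json()
    return jsonify({"ok": True})

# --- Device-side client: poll for work, run it, report back, then sleep ---
import time
import requests

def client_loop(device_name, run_task):
    while True:
        job = requests.get(f"http://central-server:5000/next-task/{device_name}").json()
        result = run_task(job["task"])  # run_task is the device's own task runner
        requests.post(f"http://central-server:5000/result/{device_name}", json=result)
        time.sleep(10)  # sleep so the loop doesn't spin at 100% CPU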

How do I retrieve data from a Django DB before sending off Celery task to remote worker?

I have a celery shared_task that is scheduled to run at certain intervals. Every time this task runs, it first needs to retrieve data from the Django DB in order to complete the calculation. This task may or may not be sent to a celery worker that is on a separate machine, so inside the celery task I can't make any queries to a local Django database.
So far I have tried using signals to accomplish this, since I know that functions decorated with @before_task_publish are executed before the task is even published to the message queue. However, I don't know how I can actually get the data to the task.
from celery import shared_task
from celery.signals import before_task_publish

@shared_task
def run_computation(data):
    perform_computation(data)

@before_task_publish.connect
def receiver_before_task_publish(sender=None, headers=None, body=None, **kwargs):
    data = create_data()
    # How do I get the data to the task from here?
Is this the right way to approach this in the first place? Or would I be better off making an API route that the celery task can get to retrieve the data?
I'm posting the solution that worked for me; thanks for the help @solarissmoke.
What works best for me is utilizing Celery "chain" callback functions and separate RabbitMQ queues for designating what would be computed locally and what would be computed on the remote worker.
My solution looks something like this:
from celery import signature

# `app` below is the project's Celery application instance.

@app.task
def create_data_task():
    # This creates the data to be passed to the analysis function.
    return create_data()

@app.task
def perform_computation_task(data):
    # This performs the computation with the given data.
    return perform_computation(data)

@app.task
def store_results(result):
    # This would store the result in the DB; for now we just print it.
    print(result)

@app.task
def run_all_computation():
    task = (
        signature("path.to.tasks.create_data_task", queue="default")
        | signature("path.to.tasks.perform_computation_task", queue="remote_computation")
        | signature("path.to.tasks.store_results", queue="default")
    )
    task()
It's important to note here that these tasks are not run serially; they are in fact separate tasks run by the workers and therefore do not block a single thread. Each task is only activated by a callback from the one before it. I declared two celery queues in RabbitMQ: a default one called "default", and one specifically for remote computation called "remote_computation". This is described explicitly here, including how to point celery workers at the created queues.
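For reference, a minimal sketch of that queue setup (assuming app is the project's Celery instance; each worker is then started so that it only consumes its own queue, e.g. celery -A proj worker -Q default and celery -A proj worker -Q remote_computation, where proj is a placeholder for the project module):

# Sketch only: declares the two queues described above; `app` is the Celery instance.
from kombu import Queue

app.conf.task_queues = (
    Queue("default"),
    Queue("remote_computation"),
)
app.conf.task_default_queue = "default"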
It is possible to modify the task data in place, from the before_task_publish handler, so that it gets passed to the task. I will say upfront that there are many reasons why this is not a good idea:
from celery.signals import before_task_publish

@before_task_publish.connect
def receiver_before_task_publish(sender=None, headers=None, body=None, **kwargs):
    data = create_data()
    # Modify the body of the task data.
    # `body` is a tuple, the first entry of which is a tuple of arguments
    # to the task, so we replace the first argument (data) with our own.
    body[0][0] = data
    # Alternatively modify the kwargs, which is a bit more explicit:
    body[1]['data'] = data
This works, but it should be obvious why it's risky and prone to breakage. Assuming you have control over the task call sites, I think it would be better to drop the signal altogether and just have a simple function that does the work for you, i.e.:
def create_task():
    data = create_data()
    run_computation.delay(data)
And then in your calling code, just call create_task() instead of calling the task directly (which is presumably what you do right now).

How to test writing to the queue without actually writing to the queue

I am sure this is probably a simple question, but I am new to programming and I am having a tough time understanding mocks. I am wondering if it's possible in Python to test a function that writes to a queue without actually writing to the queue.
def write_to_queue():
    # call queue client
    # do some status checking
    return status

def post():
    # do something
    return write_to_queue()
Is there a way to test write_to_queue without actually writing to the queue?
Mocking usually works like this:
1. Make sure the object to mock is injected into the tested code and not just created there.
2. Inject the mock instead of the real thing in the test scenario.
3. Assert on the expected behavior by verifying that the expected method calls were executed, as in the sketch below.
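A minimal sketch of those three steps with unittest.mock (the injected queue client, its send() method and the payload shape are assumptions, since the real client isn't shown):

# Sketch only: the injected client and its send() API are assumed.
from unittest import mock

def write_to_queue(queue_client, payload):
    response = queue_client.send(payload)  # call queue client
    return response.get("status")          # do some status checking

def test_write_to_queue_without_touching_the_real_queue():
    fake_client = mock.Mock()
    fake_client.send.return_value = {"status": "ok"}

    status = write_to_queue(fake_client, {"id": 1})

    fake_client.send.assert_called_once_with({"id": 1})
    assert status == "ok"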

How to create an async multiprocessing JobQueue in Python?

I'm trying to make a Python 'JobQueue' that performs computationally intensive tasks asynchronously on a local machine, with a mechanism that returns the results of each task to the main process. Python's multiprocessing.Pool has an apply_async() function that meets those requirements by accepting an arbitrary function, its multiple arguments, and callback functions that deliver the results back. For example...
import multiprocessing

pool = multiprocessing.Pool(poolsize)
pool.apply_async(func, args=args,
                 callback=mycallback,
                 error_callback=myerror_callback)
The only problem is that the function given to apply_async() must be serializable with pickle, and the functions I need to run concurrently are not. FYI, the reason is that the target function is a method of an object that contains an IDL object, for example:
from idlpy import IDL
self.idl_obj = IDL.obj_new('ImageProcessingEngine')
This is the error message received at the pool.apply_async() line:
'Can't pickle local object 'IDL.__init__.<locals>.run''
What I tried
I made a simple implementation of a JobQueue that works perfectly fine in Python 3.6+, provided the Job object and its run() method are picklable. I like how the main process can receive an arbitrarily complex amount of data returned from the asynchronously executed function via a callback function.
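For context, a hypothetical sketch of that kind of JobQueue (not the implementation referred to above; the class and method names are made up):

# Hypothetical sketch only, not the implementation mentioned above.
import multiprocessing

class JobQueue:
    def __init__(self, poolsize=4):
        self._pool = multiprocessing.Pool(poolsize)
        self.results = []

    def submit(self, func, *args):
        # func and args must be picklable, which is exactly the limitation described above.
        self._pool.apply_async(func, args=args,
                               callback=self.results.append,
                               error_callback=print)

    def join(self):
        self._pool.close()
        self._pool.join()
        return self.results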
I tried to use pathos.pools.ProcessPool since it uses dill instead of pickle. However, it doesn't have a method similar to apply_async().
Are there any other options, or 3rd party libraries that provide this functionality using dill, or by some other means?
How about creating a stub function that instantiates the IDL endpoint as a function static variable?
Please note that this is only a sketch of the code, as it is hard to say from the question whether you are passing IDL objects as parameters to the function you run in parallel or whether it serves another purpose.
from idlpy import IDL

def stub_fun(paramset):
    if 'idl_obj' not in dir(stub_fun):  # instantiate once per worker process
        stub_fun.idl_obj = IDL.obj_new('ImageProcessingEngine')
    return stub_fun.idl_obj(paramset)
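Hypothetical usage with the pool from the question: only stub_fun and its picklable paramset argument cross the process boundary, while the IDL object is created lazily inside each worker process.

# Sketch only: reuses the names from the question (paramset, mycallback, myerror_callback).
import multiprocessing

pool = multiprocessing.Pool(4)
pool.apply_async(stub_fun, args=(paramset,),
                 callback=mycallback,
                 error_callback=myerror_callback)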

Calling child class methods from parent classes; aggregation, inheritance or other?

Firstly, I'm new to Python and OOP, so I'm still learning how things work.
I am trying to port a websocket API helper to a new websocket library, as the original library has bugs and its GitHub repo is inactive. The new library (Autobahn on Twisted) is asynchronous and works via callbacks, whereas the original library placed new responses onto a queue for synchronous processing. This adds another level of complexity that I am still learning.
Currently, there are 3 modules that I have created/modified. These are:
connector.py
  - contains the Protocol class, which extends WebSocketClientProtocol
  - Protocol handles low-level interaction with the API, such as formatting messages for sending, and receiving and processing (at a low level) responses
client.py
  - contains the Client class, which extends Protocol
  - Client contains higher-level functions for subscribing/unsubscribing to streams and accessing other API endpoints
user_code.py
  - contains the userClass class
  - my initial design was going to have this class inherit from Client to perform operations on, and process, the response data
My problem is in the userClass class, and its interaction with the other classes. I want to be able to call userClass methods from inside the Protocol class when messages are received and processed, and call Client methods from the userClass to request messages and subscribe to streams.
For the first attempt, I created an abstract class containing all the methods I wanted to call, and used userClass to implement all of them. This (I think) meant I could safely call the child methods from the parent methods, but I could not call the Client methods from userClass without creating a circular reference, and it seemed fragile (in other words, everything broke) when I moved things to a new module.
My second attempt had Client as an object inside userClass, using an aggregation relationship. Inside Client and Protocol, I referenced the userClass class rather than an instance; however, I then lose the reference to the instance when calling methods.
I have not yet attempted straight inheritance, with userClass inheriting from Client and overriding "dummy" methods from the parent classes, as it seemed like there would be a lot of code duplication.
This example shows the functionality that I would like:
class Protocol(WebSocketClientProtocol):
    def onOpen(self):
        print("open")
        self.connectionOpened()

    def send(self, msg):
        # sendMessage expects a bytes payload
        self.sendMessage(json.dumps(msg).encode("utf-8"))

class Client(Protocol):
    def subscribe(self, msg):
        self.send("subscribe_" + msg)

class userClass(Client):
    def connectionOpened(self):
        self.subscribe("this_stream")
What design paradigm should I follow to get this behaviour?
Thanks in advance
