I am developing a data pipeline in Python with around 7 or 8 modes. The pipeline calls into many classes. For simplicity, I have written a small test script (pseudocode) below; in the real code, every function is imported from a class.
Some modes/tasks are independent steps, and a few can be combined to form a data pipeline.
For example, test_flow is an independent workflow. create_flow and monitor_flow can be called as independent tasks, or sometimes together.
Is there a better way to design the pipeline? There are about 8 modes, and I feel the design (calling --modes as below) is a bit clumsy. Please let me know if there are more elegant ways. Thanks.
def test_flow:
    print(test_flow)
def create_flow:
    print(create_flow)
def monitor_flow:
    print(monitor_flow)
if __name__ == "__main__":
    if args.mode == "test_flow":
        test_flow
    if args.mode == "create_flow":
        create_flow
    if args.mode == "monitor_flow":
        monitor_flow
Your example code is full of syntax errors!
I would suggest something like this, but you probably would want to ensure further that only certain functions are reachable via the command line:
import sys

def test_flow():
    print("called test_flow")

def create_flow():
    print("called create_flow")

def monitor_flow():
    print("called monitor_flow")

def main(argv):
    if len(argv) > 1:
        # .get() returns None for an unknown name instead of raising KeyError
        specifiedCall = globals().get(argv[1])
        if callable(specifiedCall):
            specifiedCall()

if __name__ == "__main__":
    main(sys.argv)
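One way to make sure only certain functions are reachable from the command line is an explicit dispatch table combined with argparse's choices. The following is a minimal sketch, not a definitive design: the MODES table, the combined mode name create_and_monitor, and the return value of main are assumptions made for illustration.

```python
import argparse


def test_flow():
    print("called test_flow")


def create_flow():
    print("called create_flow")


def monitor_flow():
    print("called monitor_flow")


# Only the names listed here can be selected on the command line.
# "create_and_monitor" is a hypothetical combined mode.
MODES = {
    "test_flow": [test_flow],
    "create_flow": [create_flow],
    "monitor_flow": [monitor_flow],
    "create_and_monitor": [create_flow, monitor_flow],
}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Run one pipeline mode")
    # argparse rejects any value that is not a key of MODES
    parser.add_argument("--mode", required=True, choices=sorted(MODES))
    args = parser.parse_args(argv)
    for step in MODES[args.mode]:
        step()
    # Return the names of the steps that ran, which is handy for testing
    return [step.__name__ for step in MODES[args.mode]]
```

To use it as a script, call main() under the usual if __name__ == "__main__": guard. Adding a ninth mode then means adding one entry to MODES rather than another if branch.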
Consider the following piece of code.
from multiprocessing import Process

class A(Process):
    def __init__(self):
        self.a = None
        super(A, self).__init__()

    def run(self) -> None:
        self.a = 4

    def print_a(self):
        print(self.a)

if __name__ == '__main__':
    b = A()
    b.start()
    b.print_a()
I want the output to be:
4
But after running the code, the output is:
None
What am I doing wrong, and how do I fix it?
My final remarks:
I am a noob in Python. I used to use Thread in Java, and we never had such an issue there. Initially, I didn't know there was a godly module like threading in Python too.
Simply put, what I wanted to do is exactly what threading's Thread class is for. After switching to it, all my issues were fixed. The code I provided above is just a simplified version of the actual code, where the shared variable is a non-picklable Web3 object, which was giving me tons of problems with multiprocessing. My understanding and knowledge of Python was shallow: I didn't realise that giving each process separate memory for its variables is what Process is for...
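For reference, the threading-based fix described above could look roughly like this. It is a minimal sketch that reuses the class name A from the question; the join() call is an addition, needed so that run() has finished before the attribute is read.

```python
import threading


class A(threading.Thread):
    def __init__(self):
        super().__init__()
        self.a = None

    def run(self):
        # Runs in the new thread, which shares memory with the main thread
        self.a = 4

    def print_a(self):
        print(self.a)


b = A()
b.start()
b.join()      # wait until run() has finished
b.print_a()   # prints 4
```

With Process, run() executes in a child process that works on its own copy of the object, so the parent's self.a stays None no matter how long it waits.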
I want to use a function name other than main as a main function. Is it possible to do it? This sounds like a basic question but since I'm new to python, I wanted to make sure that I am going correctly.
file1.py
def func1():
    """some code"""

def func2():
    """Some code"""

def main(arg1,arg2):
    func1()
    func2()

if __name__=="__main__":
    main(arg1,arg2)
I expect to use a different name for "main()" method here.
You can just do it:
def not_main(arg1, arg2):
    func1()
    func2()

if __name__ == "__main__":
    not_main(arg1, arg2)
The if __name__ == "__main__": check is not something you can rename: it tests whether the current file is being run as a script (as opposed to being imported as a module). The convention is to have a function main() that holds the "main business logic" of your application and to call it inside the if __name__ == "__main__": block, but there are plenty of reasons not to do that, and it's not at all required.
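A complete, runnable version of this idea might look like the sketch below. The entry-point name run_pipeline, the return values, and the way arg1 and arg2 are read from the command line (with fallback defaults) are all assumptions for illustration; the question does not say where the arguments come from.

```python
import sys


def func1():
    return "func1 ran"


def func2():
    return "func2 ran"


def run_pipeline(arg1, arg2):
    """Entry point. The name is arbitrary; Python attaches no meaning to 'main'."""
    return [func1(), func2(), "args: %s, %s" % (arg1, arg2)]


if __name__ == "__main__":
    # Hypothetical argument handling: take up to two command-line
    # arguments and fall back to placeholder defaults.
    arg1, arg2 = (sys.argv[1:3] + ["a", "b"])[:2]
    for line in run_pipeline(arg1, arg2):
        print(line)
```

Importing this file from another module does not trigger the block at the bottom, so run_pipeline can be reused or tested without side effects.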
How do I write a test for the default behavior of a method that prints the range we give it? Below is my attempt: the code from my implementation file, followed by the test case file.
class FizzBuzzService:
    def print_number(self, num):
        for i in range(num):
            print(i, end=' ')

import unittest
from app.logic import FizzBuzzService

class FizzBuzzServiceTestCases(unittest.TestCase):
    def setUp(self):
        """
        Create an instance of fizz_buzz_service
        """
        self.fizzbuzz = FizzBuzzService()

    def test_it_prints_a_number(self):
        """
        Test for the default behavior of printing the range that we give
        fizz_buzz_service
        """
        number_range = range(10)
        self.assertEqual(self.fizzbuzz.print_number(10), print(*number_range))
For me at least, TDD is about finding a good design as much as it's about testing. As you've seen, testing things like output is hard.
Printing like this is known as a side effect: put simply, the method does something beyond computing a result from its input parameters. My solution would be to give print_number fewer side effects, then test that. If you need to print, you can write another function higher up that prints the output of print_number but contains no other meaningful logic, so it doesn't really need testing. Here's an example with your code changed to have no side effect (one of several possible alternatives):
class FizzBuzzService:
    def print_number(self, num):
        for i in range(num):
            yield i

import unittest

class FizzBuzzServiceTestCases(unittest.TestCase):
    def setUp(self):
        """
        Create an instance of fizz_buzz_service
        """
        self.fizzbuzz = FizzBuzzService()

    def test_it_prints_a_number(self):
        """
        Test for the default behavior of printing the range that we give
        fizz_buzz_service
        """
        number_range = range(10)
        output = []
        for x in self.fizzbuzz.print_number(10):
            output.append(x)
        # Compare lists: in Python 3 a range object never equals a list
        self.assertEqual(list(number_range), output)
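You need to capture standard output in your tests to do that:
import sys
import io

def test_it_prints_a_number(self):
    initial_stdout = sys.stdout
    sys.stdout = io.StringIO()
    try:
        self.fizzbuzz.print_number(10)
        value = sys.stdout.getvalue()
    finally:
        # Always restore the real stdout, even if the call raises
        sys.stdout = initial_stdout
    # print(i, end=' ') leaves a trailing space after every number
    self.assertEqual(value, "0 1 2 3 4 5 6 7 8 9 ")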
As you can see, it's really messy, so I'd highly recommend against it. Tests based on string contents, especially standard output, are utterly fragile. Besides, the whole point of TDD is to write well-designed, isolated code that is easily testable. If your code is difficult to test, then it is a sure indication that there's a problem in your design.
How about dividing your code into two parts: one that produces the numbers and needs to be tested, and another that just prints them?
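If capturing output really is necessary, the standard library's contextlib.redirect_stdout does the swap and the restore automatically. The following is a small sketch under that assumption; the helper name captured_output is made up for illustration, and FizzBuzzService is copied from the question so the block is self-contained.

```python
import io
from contextlib import redirect_stdout


class FizzBuzzService:
    def print_number(self, num):
        for i in range(num):
            print(i, end=' ')


def captured_output(func, *args):
    """Call func(*args) and return everything it wrote to stdout."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args)
    return buffer.getvalue()
```

The string comparison is still brittle (note the trailing space that end=' ' produces), so the design advice above continues to apply.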
def get_numbers(self, num):
    return range(num)

def print_number(self, num):
    # Unpack the range so the numbers themselves are printed
    print(*self.get_numbers(num))

# Now you can easily test the get_numbers method.
Now, if you really want to test the printing functionality, the better way would be to use mocking.
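A mocking-based test could be sketched as follows with unittest.mock from the standard library. The test class and method names are assumptions, and FizzBuzzService is repeated in its split form so the block runs on its own.

```python
import unittest
from unittest import mock


class FizzBuzzService:
    def get_numbers(self, num):
        return range(num)

    def print_number(self, num):
        print(*self.get_numbers(num))


class PrintNumberTestCase(unittest.TestCase):
    def test_print_number_calls_print(self):
        # Replace the built-in print with a mock for the duration of the call
        with mock.patch("builtins.print") as fake_print:
            FizzBuzzService().print_number(3)
        # The test checks how print was called, not what reached the terminal
        fake_print.assert_called_once_with(0, 1, 2)
```

Saved in a test module, this can be run with python -m unittest. It verifies the arguments handed to print rather than parsing captured text.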
I have a question on multiprocessing and tkinter. I am having some problems getting my process to run in parallel with the tkinter GUI. I have created a simple example to practice and have been reading up on the basics of multiprocessing. However, when I apply them to tkinter, only one process runs at a time (see "Using Multiprocessing module for updating Tkinter GUI"). Additionally, when I added a queue to communicate between processes (see "How to use multiprocessing queue in Python?"), the process won't even start.
Goal:
I would like to have one process that counts down and puts the values in the queue and one to update tkinter after 1 second and show me the values.
All advice is kindly appreciated
Kind regards,
S
EDIT: I want the data to be available when the after method is being called. So the problem is not with the after function, but with the method being called by the after function. It will take 0.5 second to complete the calculation each time. Consequently the GUI is unresponsive for half a second, each second.
EDIT2: Corrections were made to the code based on the feedback but this code is not running yet.
class Countdown():
    """Countdown prior to changing the settings of the flows"""
    def __init__(self,q):
        self.master = Tk()
        self.label = Label(self.master, text="", width=10)
        self.label.pack()
        self.counting(q)
    # Countdown()
    def counting(self, q):
        try:
            self.i = q.get()
        except:
            self.label.after(1000, self.counting, q)
        if int(self.i) <= 0:
            print("Go")
            self.master.destroy()
        else:
            self.label.configure(text="%d" % self.i)
            print(i)
            self.label.after(1000, self.counting, q)

def printX(q):
    for i in range(10):
        print("test")
        q.put(9-i)
        time.sleep(1)
    return

if __name__ == '__main__':
    q = multiprocessing.Queue()
    n = multiprocessing.Process(name='Process2', target=printX, args = (q,))
    n.start()
    GUI = Countdown(q)
    GUI.master.mainloop()
Multiprocessing does not work inside the interactive IPython notebook (see "Multiprocessing working in Python but not in iPython").
As an alternative, you can use Spyder.
No code will run after you call mainloop until the window has been destroyed. You need to start your other process before you call mainloop.
You are calling the after function incorrectly. The second argument must be the function to call, not a call to the function.
If you call it like
self.label.after(1000, self.counting(q))
it will call counting(q) immediately and pass its return value to after as the function to call.
To schedule a function with arguments, the syntax is
self.label.after(1000, self.counting, q)
Also, start your second process before you create the window and call counting.
n = multiprocessing.Process(name='Process2', target=printX, args = (q,))
n.start()
GUI = Countdown(q)
GUI.master.mainloop()
Also, you only need to call mainloop once. Either position you have works, but you just need one.
Edit: Also, you need to put (9-i) in the queue to make it count down:
q.put(9-i)
inside the printX function.
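Putting the points above together, a working countdown might be sketched like this. It is an illustration under stated assumptions, not the asker's final code: the helper names producer, latest_value, run_gui and main are made up, the GUI polls the queue without blocking (get_nowait) every 100 ms so it stays responsive, and tkinter is imported inside run_gui so the queue helpers can be used without a display.

```python
import multiprocessing
import queue
import time


def producer(q, start=9, delay=1.0):
    """Runs in the child process: put start, start-1, ..., 0 on the queue."""
    for value in range(start, -1, -1):
        q.put(value)
        time.sleep(delay)


def latest_value(q, default=None):
    """Return the newest queued item without blocking, or default if empty."""
    value = default
    while True:
        try:
            value = q.get_nowait()
        except queue.Empty:
            return value


def run_gui(q):
    import tkinter as tk  # imported here so the helpers work without a display

    root = tk.Tk()
    label = tk.Label(root, text="", width=10)
    label.pack()

    def poll():
        value = latest_value(q)
        if value is not None:
            label.configure(text=str(value))
            if value <= 0:
                print("Go")
                root.destroy()
                return
        # Pass the function itself, not the result of calling it
        root.after(100, poll)

    poll()
    root.mainloop()


def main():
    q = multiprocessing.Queue()
    # Start the worker before entering mainloop
    worker = multiprocessing.Process(target=producer, args=(q,), daemon=True)
    worker.start()
    run_gui(q)
    worker.join()
```

Call main() from under an if __name__ == '__main__': guard, which multiprocessing needs on platforms that spawn a fresh interpreter for each child process. Because poll never blocks, a slow calculation in the worker no longer freezes the window.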
I have a little desktop game written in Python and would like to be able to access its internals while the game is running. I was thinking of doing this by running web.py on a separate thread and serving pages, so that when I access http://localhost:8080/map it would display a map of the current level for debugging purposes.
I got web.py installed and running, but I don't really know where to go from here. I tried starting web.application in a separate thread, but for some reason I cannot share data between threads (I think).
Below is a simple example that I was using to test this idea. I thought that http://localhost:8080/ would return a different number every time, but it keeps showing the same one (5). If I print common_value inside the while loop, it is being incremented, but it starts from 5.
What am I missing here, and is the approach anywhere close to sensible? I would really like to avoid using a database if possible.
import web
import thread

urls = (
    '/(.*)', 'hello'
)
app = web.application(urls, globals())
common_value = 5

class hello:
    def GET(self):
        return str(common_value)

if __name__ == "__main__":
    thread.start_new_thread(app.run, ())
    while 1:
        common_value = common_value + 1
After searching around a bit, I found a solution that works:
If common_value is defined in a separate module and imported from there, the above code works. So in essence (excuse the naming):
thingy.py
common_value = 0
server.py
import web
import thread
import thingy
import sys; sys.path.insert(0, ".")

urls = (
    '/(.*)', 'hello'
)
app = web.application(urls, globals())
thingy.common_value = 5

class hello:
    def GET(self):
        return str(thingy.common_value)

if __name__ == "__main__":
    thread.start_new_thread(app.run, ())
    while 1:
        thingy.common_value = thingy.common_value + 1
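I got errors about arguments, but after changing:
def GET(self):
to:
def GET(self, *args):
it works now. The URL pattern '/(.*)' contains a capture group, and web.py passes the captured text to GET as an extra argument.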
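The same idea, one shared object that both the game loop and the server thread read and write, can be sketched with the standard library alone. Note that this swaps web.py for http.server and adds a lock around the shared value; the names GameState, STATE, DebugHandler, start_debug_server and fetch are hypothetical, and this is an illustration rather than a drop-in replacement for the code above.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import ProxyHandler, build_opener


class GameState:
    """Hypothetical holder for data shared by the game and the debug server."""

    def __init__(self, value=0):
        self._lock = threading.Lock()
        self._value = value

    def increment(self):
        with self._lock:
            self._value += 1

    def read(self):
        with self._lock:
            return self._value


STATE = GameState(5)


class DebugHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = str(STATE.read()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the game's console quiet


def start_debug_server(port=0):
    """Serve STATE on localhost from a daemon thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), DebugHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server):
    """Read the served value, bypassing any configured HTTP proxy."""
    opener = build_opener(ProxyHandler({}))
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    with opener.open(url, timeout=5) as response:
        return int(response.read())
```

In a game, the main loop would call STATE.increment() (or update richer state) while the daemon thread answers requests. Every request sees the current value because both sides hold a reference to the same object.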