How to idiomatically end an Asyncio Operation in Python - python-3.x

I'm working on code where I have a long running shell command whose output is sent to disk. This command will generate hundreds of GBs per file. I have successfully written code that calls this command asynchronously and successfully yields control (awaits) for it to complete.
I also have code that can asynchronously read that file as it is being written to so that I can process the data contained therein. The problem I'm running into is that I can't find a way to stop the file reader once the shell command completes.
I guess I'm looking for some sort of interrupt I can pass into my reader function once the shell command ends, which I can use to tell it to close the file and wrap up the event loop.
Here is my reader function. Right now, it runs forever, waiting for new data to be written to the file.
import asyncio

PERIOD = 0.5

async def readline(f):
    while True:
        data = f.readline()
        if data:
            return data
        await asyncio.sleep(PERIOD)

async def read_zmap_file():
    with open('/data/largefile.json'
              , mode = 'r+t'
              , encoding = 'utf-8'
              ) as f:
        i = 0
        while True:
            line = await readline(f)
            print('{:>10}: {!s}'.format(str(i), line.strip()))
            i += 1

loop = asyncio.get_event_loop()
loop.run_until_complete(read_zmap_file())
loop.close()
If my approach is off, please let me know. I'm relatively new to asynchronous programming. Any help would be appreciated.

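So, I'd do something like
reader = loop.create_task(read_zmap_file())
Then, in the code that manages the shell process, once the process exits you can do
reader.cancel()
You can still drive the loop with the call below. Once the task has been cancelled, this call raises asyncio.CancelledError, so catch that around it.
loop.run_until_complete(reader)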
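A minimal sketch of the cancel approach, assuming Python 3.7+ (for `asyncio.run`). The names `follow` and `fake_writer` are hypothetical: `fake_writer` stands in for the shell command, and the polling interval is shortened for the demo. On cancellation the reader drains whatever is still unread before it re-raises, so the last lines are not lost.

```python
import asyncio
import os
import tempfile

PERIOD = 0.05  # polling interval, shortened for the demo


async def follow(path, sink):
    """Poll `path` for new lines and append them to `sink` until cancelled."""
    with open(path, encoding="utf-8") as f:
        try:
            while True:
                line = f.readline()
                if line:
                    sink.append(line.rstrip("\n"))
                else:
                    await asyncio.sleep(PERIOD)
        except asyncio.CancelledError:
            # Drain anything written before the cancel arrived, then re-raise.
            sink.extend(rest.rstrip("\n") for rest in f)
            raise


async def fake_writer(path):
    """Hypothetical stand-in for the shell command: appends three lines."""
    with open(path, "a", encoding="utf-8") as f:
        for i in range(3):
            f.write(f"line {i}\n")
            f.flush()
            await asyncio.sleep(PERIOD)


async def main():
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    sink = []
    reader = asyncio.create_task(follow(path, sink))
    try:
        await fake_writer(path)   # real code: await the subprocess here
        reader.cancel()           # ask the reader to stop
        try:
            await reader          # wait for its cleanup to finish
        except asyncio.CancelledError:
            pass
    finally:
        os.remove(path)
    return sink


result = asyncio.run(main())
```

Awaiting the task after `cancel()` matters: it gives the reader a chance to close the file before the loop shuts down.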
Alternatively, you could simply set a flag somewhere and use that flag in your while statement. You don't need to use asyncio primitives when something simpler works.
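A sketch of the flag variant, under the same assumptions as before (Python 3.7+, a hypothetical `fake_writer` in place of the shell command). `StopFlag` is an illustrative name, nothing more than an object with one attribute. The reader checks the flag only after `readline()` comes back empty, so it finishes the file before leaving.

```python
import asyncio
import os
import tempfile

PERIOD = 0.05  # polling interval, shortened for the demo


class StopFlag:
    """Plain mutable flag; no asyncio primitive needed on a single loop."""
    done = False


async def follow(path, flag, sink):
    with open(path, encoding="utf-8") as f:
        while True:
            line = f.readline()
            if line:
                sink.append(line.rstrip("\n"))
            elif flag.done:
                break  # writer finished and nothing is left to read
            else:
                await asyncio.sleep(PERIOD)


async def fake_writer(path):
    """Hypothetical stand-in for the shell command: appends three lines."""
    with open(path, "a", encoding="utf-8") as f:
        for i in range(3):
            f.write(f"line {i}\n")
            f.flush()
            await asyncio.sleep(PERIOD)


async def main():
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    flag, sink = StopFlag(), []
    reader = asyncio.create_task(follow(path, flag, sink))
    try:
        await fake_writer(path)  # real code: await the subprocess here
        flag.done = True         # tell the reader the writer is finished
        await reader             # returns normally once the file is drained
    finally:
        os.remove(path)
    return sink


flag_result = asyncio.run(main())
```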
That said, I'd look into ways that your reader can avoid the periodic sleep. If your reader will be able to keep up with the shell command, I'd recommend a pipe, because pipes can be used with select (and thus added to an event loop). Then in your reader you can write to a file if you need a permanent log. I realize the discussion of avoiding the periodic sleep is beyond the scope of this question, and I don't want to go into more detail than I have, but you did ask for hints on how best to approach async programming.
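One way the pipe idea could look with asyncio's own subprocess support, as a rough sketch rather than a drop-in replacement. The command here is a hypothetical stand-in (a short Python one-liner) for the real shell command, and `run_and_read` is an illustrative name. The `async for` loop ends by itself when the command closes its stdout, so no sleep and no explicit stop signal are needed.

```python
import asyncio
import sys


async def run_and_read(cmd):
    """Run `cmd`, reading its stdout line by line through a pipe."""
    proc = await asyncio.create_subprocess_exec(
        *cmd, stdout=asyncio.subprocess.PIPE
    )
    lines = []
    # Iteration stops at EOF, i.e. when the child closes its stdout.
    async for raw in proc.stdout:
        lines.append(raw.decode("utf-8").rstrip())
    returncode = await proc.wait()
    return returncode, lines


# Hypothetical command standing in for the long-running shell command.
demo_cmd = [sys.executable, "-c", "for i in range(3): print('record', i)"]
returncode, lines = asyncio.run(run_and_read(demo_cmd))
```

If a permanent copy is needed, the loop body can also write each line to a file as it arrives.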

Related

Get realtime output from a long-running executable using python

It's my first time asking a question on here so bear with me.
I'm trying to make a python3 program that runs executable files for x amount of time and creates a log of all output in a text file. For some reason the code I have so far works only with some executables. I'm new to python and especially subprocess so any help is appreciated.
import time
import subprocess

def CreateLog(executable, timeout=5):
    time_start = time.time()
    process = subprocess.Popen(executable, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True)
    f = open("log.txt", "w")
    while process.poll() is None:
        output = process.stdout.readline()
        if output:
            f.write(output)
        if time.time() > time_start + timeout:
            process.kill()
            break
I was recently experimenting with crypto mining and came across nanominer. I tried using this Python code on nanominer and the log file was empty. I am aware that nanominer already logs its output, but the point is: why does the Python code fail?
You are interacting through .poll() (R U dead yet?) and .readline().
It's not clear you want to do that.
There seem to be two cases for your long-lived child:
1. it runs "too long" silently
2. it runs forever, regularly producing output text at e.g. one-second intervals
The 2nd case is the easy one.
Just use for line in process.stdout:, consume the line,
peek at the clock, and maybe send a .kill() just as you're already doing.
No need for .poll(), as child exiting will produce EOF on that pipe.
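A sketch of that loop for the chatty case, with a few assumptions: `create_log` is an illustrative rewrite of the question's function, the log path is passed in, and the demo command is a hypothetical stand-in for the real executable. The clock is only checked when a line arrives, which is why this covers the second case and not the silent one.

```python
import os
import subprocess
import sys
import tempfile
import time


def create_log(cmd, log_path, timeout=5.0):
    """Copy the child's stdout to log_path, killing it after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    with open(log_path, "w") as log, subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True
    ) as proc:
        for line in proc.stdout:      # ends at EOF when the child exits
            log.write(line)
            if time.monotonic() > deadline:
                proc.kill()
                break
    # Leaving the with-block closes the pipe and waits for the child.
    return proc.returncode


# Hypothetical child that prints three lines and exits.
demo_cmd = [sys.executable, "-c", "for i in range(3): print('tick', i)"]
fd, demo_log = tempfile.mkstemp(suffix=".txt")
os.close(fd)
code = create_log(demo_cmd, demo_log, timeout=5.0)
with open(demo_log) as f:
    logged = f.read().splitlines()
os.remove(demo_log)
```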
For the 1st case, you will want to set an alarm.
See https://docs.python.org/3/library/signal.html#example
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)
After "too long", five seconds, your handler will run.
It can do anything you desire.
You'll want it to have access to the process handle,
which will let you send a .kill().
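A sketch of the alarm approach for the silent case. It assumes a Unix-like system (`SIGALRM` is not available on Windows) and that the code runs in the main thread, since Python only allows signal handlers to be installed there. `run_with_alarm` and the sleeping child are hypothetical; the handler is a closure so that it can reach the process handle.

```python
import signal
import subprocess
import sys


def run_with_alarm(cmd, timeout):
    """Read the child's stdout; kill the child after `timeout` whole seconds."""
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True
    ) as proc:

        def handler(signum, frame):
            # Killing the child closes its end of the pipe,
            # so the blocked read below sees EOF and returns.
            proc.kill()

        previous = signal.signal(signal.SIGALRM, handler)
        signal.alarm(timeout)
        try:
            lines = [line.rstrip("\n") for line in proc.stdout]
        finally:
            signal.alarm(0)                           # cancel a pending alarm
            signal.signal(signal.SIGALRM, previous)   # restore the old handler
    return proc.returncode, lines


# Hypothetical silent child that would otherwise run for 30 seconds.
silent_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
alarm_code, alarm_lines = run_with_alarm(silent_cmd, timeout=1)
```

A negative return code means the child was terminated by that signal number.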

Threads will not close off after program completion

I have a script that retrieves temperature data using requests. Since I had to make multiple requests (around 13000), I decided to explore the use of multi-threading, which I am new to.
The program works by grabbing longitude/latitude data from a csv file and then making a request to retrieve the temperature data.
The problem that I am facing is that the script does not finish fully when the last temperature value is retrieved.
Here is the code. I have shortened it so it is easy to see what I am doing:
import logging
import threading
from queue import Queue

num_threads = 16
q = Queue(maxsize=0)

def get_temp(q):
    while not q.empty():
        work = q.get()
        if work is None:
            break
        ## rest of my code here
        q.task_done()
At main:
def main():
    for o in range(num_threads):
        logging.debug('Starting Thread %s', o)
        worker = threading.Thread(target=get_temp, args=(q,))
        worker.setDaemon(True)
        worker.start()
    logging.info("Main Thread Waiting")
    q.join()
    logging.info("Job complete!")
I do not see any errors on the console, and the temperature is being successfully written to another file. I have tried running a test csv file with only a few longitude/latitude references, and the script seems to finish executing fine.
So is there a way of shedding light as to what might be happening in the background? I am using Python 3.7.3 on PyCharm 2019.1 on Linux Mint 19.1.
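q.join() does not wait for the threads; it blocks until task_done() has been called for every item that was put on the queue. If a worker takes an item with q.get() and leaves the loop without calling q.task_done() (for example through the break, or because an exception is raised in the rest of the code), q.join() never returns.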
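A minimal sketch of a worker loop that pairs every `get` with a `task_done()`, so that `q.join()` can return. The doubling is a placeholder for the real request, and `worker` and `run` are illustrative names, not the asker's code.

```python
import threading
from queue import Empty, Queue


def worker(q, results):
    while True:
        try:
            item = q.get_nowait()
        except Empty:
            return                      # queue drained, let the thread end
        try:
            results.append(item * 2)    # placeholder for the real request
        finally:
            q.task_done()               # always runs, even if the work raised


def run(items, num_threads=4):
    q = Queue()
    for item in items:
        q.put(item)                     # fill the queue before starting workers
    results = []
    threads = [
        threading.Thread(target=worker, args=(q, results), daemon=True)
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    q.join()                            # returns once every item is marked done
    return results


doubled = run(range(100))
```

Filling the queue before the threads start matters here, because a worker that finds the queue empty exits straight away.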

Threads or asyncio gather?

Which is the best method to do concurrent i/o operations?
thread or
asyncio
There will be list of files.
I open the files and generate a graph using the .txt file and store it on the disk.
I have tried using threads, but it's time consuming and sometimes it does not generate a graph for some files.
Is there any other method?
I tried the code below with async on the load_instantel_ascii function, but it raises an exception
for fl in self.finallist:
    k = randint(0, 9)
    try:
        task2.append( * [load_instantel_ascii(fleName = fl, columns = None,
                                              out = self.outdir,
                                              separator = ',')])
    except:
        print("Error on Graph Generation")
event_loop.run_until_complete(asyncio.gather(yl1
                                             for kl1 in task2)
                              )
If I understood everything correctly and you want asynchronous file I/O, then asyncio itself doesn't support it out of the box. In the end, all asyncio-related stuff that provides async file I/O does it using a thread pool.
But that probably doesn't mean you shouldn't use asyncio: this lib is cool as a way to write asynchronous code in the first place, even if it is a wrapper over threads. I would give something like aiofiles a try.
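To illustrate the "threads underneath" point with the standard library only, here is a sketch that hands blocking file work to the event loop's default thread pool via `run_in_executor` and collects the results with `asyncio.gather`. It assumes Python 3.7+. `summarize` is a hypothetical stand-in for `load_instantel_ascii`, and the file names and contents are made up for the demo.

```python
import asyncio
import os
import tempfile


def summarize(path):
    """Hypothetical blocking job: read comma-separated numbers and sum them."""
    with open(path, encoding="utf-8") as f:
        values = [float(x) for x in f.read().split(",")]
    return os.path.basename(path), sum(values)


async def process_all(paths):
    loop = asyncio.get_running_loop()
    # Each call runs in the default ThreadPoolExecutor.
    jobs = [loop.run_in_executor(None, summarize, p) for p in paths]
    # return_exceptions=True keeps one bad file from hiding the other results.
    return await asyncio.gather(*jobs, return_exceptions=True)


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.txt", "1,2,3"), ("b.txt", "4,5"), ("c.txt", "6")]:
        p = os.path.join(tmp, name)
        with open(p, "w", encoding="utf-8") as f:
            f.write(text)
        paths.append(p)
    summaries = asyncio.run(process_all(paths))
```

`gather` returns results in the order the jobs were passed in, and with `return_exceptions=True` a failed file shows up as an exception object in its slot, which makes it easy to see which inputs did not produce output.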

using threading to call the main function again?

I am really trying to wrap my head around the concept of threading in practical applications. I am using the threading module in Python 3.4 and I am not sure if the logic is right for the program functionality.
Here is the gist of my code:
def myMain():
    """ my main function actually uses sockets to
    send data over a network
    """
    # there will be an infinite loop call it loop-1 here
    while True:
        # perform encoding scheme
        # send data out... with all other exception handling
        # here is infinite loop 2 which waits for messages
        # from other devices
        while True:
            # Check for incoming messages
            callCheckFunction()  # <---- Should I call this on a thread?
The above-mentioned callCheckFunction() will do some comparison on the received data, and if the data values don't match I want to run the myMain() function again.
Here is the gist of the callCheckFunction():
def callCheckFunction():
    if data == 'same':
        # some work done here and then get out
        # function and return back to listening on the socket
        pass
    elif data == 'not same':
        myMain()  # <---- Should I thread this one too??
This might be complicated, but I am not sure if threading is the thing I want. I did a nasty hack by calling the myMain() function in the above-mentioned fashion, which works great! But I assume there will definitely be some limit to calling the function within the function, and I want my code to be a bit professional, not hacky!
I have my mind set on threading since I am listening to the socket in an infinite fashion; when some new data comes in, the whole myMain() is called back, creating a kind of hectic recursion which I want to control.
EDIT
So I have managed to make the code a bit more modular, i.e. I have split the two infinite loops into two different functions.
Now myMain is divided into task1() and task2(), and the gist is as follows:
def task1():
    while True:
        # encoding and sending data
        # in the end I call task2() since it the primary
        # state which decides things
        task2()  # <---- still need to decide if I should thread or not

def task2():
    while True:
        # check for incoming messages
        checker = threading.Thread(target=callCheckFunction, daemon=True)
        checker.start()
        checker.join()
Now, since callCheckFunction() needs func1(), I decided to thread func1() in the function. Note that func1 is actually kind of the main() of the code:
def callCheckFunction():
    # the 'same' branch is unchanged and omitted here
    if data == 'not same':
        thready = threading.Thread(target=func1, daemon=True)
        thready.start()
        thready.join()
Results
With little understanding I did manage to get the code working. But I am not sure if this is really hacky or a professional way of doing things! I can of course share the code via GitHub, and also a finite state machine for the system. Also, I am not sure if this code is thread safe! Help/suggestions are really needed.

Jython How to stop script from thread?

I'm looking for some exit code that will be run from a thread but will be able to kill the main script. It's in Jython but I can't use java.lang.System.exit() because I still want the Java app I'm in to run, and sys.exit() isn't working. Ideally I would like to output a message then exit.
My code uses the threading.Timer function to run a function after a certain period of time. Here I'm using it to end a for loop that is executing for longer than 1 sec. Here is my code:
import threading

def exitFunct():
    #exit code here
    pass

t = threading.Timer(1.0, exitFunct)
t.start()
for i in range(1, 2000):
    print i
Well, if you had to, you could call mainThread.stop(). But you shouldn't.
This article explains why what you're trying to do is considered a bad idea.
If you want to kill the current process and you don't care about flushing IO buffers or resetting the terminal, you can use os._exit().
I don't know why they made this so hard.
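A different, cooperative way to get a similar effect, offered only as a sketch: the timer does not kill anything, it just sets a `threading.Event` that the loop checks on each pass. This is written for CPython 3 and has not been tried on Jython; the function name, the short sleep standing in for real work, and the 0.1-second limit are all hypothetical.

```python
import threading
import time


def run_until_deadline(limit, seconds):
    """Loop up to `limit` times, stopping early once the timer fires."""
    stop = threading.Event()
    timer = threading.Timer(seconds, stop.set)  # timer thread only sets the flag
    timer.start()
    done = 0
    try:
        for _ in range(limit):
            if stop.is_set():
                print("time limit reached, stopping")
                break
            time.sleep(0.001)   # placeholder for the real work
            done += 1
    finally:
        timer.cancel()          # no-op if the timer already fired
    return done, stop.is_set()


done, timed_out = run_until_deadline(limit=100000, seconds=0.1)
```

The loop leaves at a point of its own choosing, so output can be flushed and a message printed before the script ends, and the surrounding Java application is left alone.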
