Read console output realtime in lua - io

How can I manage to periodically read the output of a script while it is running?
In the case of youtube-dl, it sends download information (progress/speed/eta) about the video being downloaded to the terminal.
With the following code I am able to capture the complete output of the script (on Linux) in a temporary file:
tmpFile = io.open("/tmp/My_Temp.tmp", "w+")
f = io.popen("youtube-dl http://www.youtube.com/watch?v=UIqwUx_0gJI", 'r')
tmpFile:write(f:read("*all"))
Instead of waiting for the script to complete and writing all the data at the end, I would like to be able to capture "snapshots" of the latest information that youtube-dl has sent to the terminal.
My overall goal is to capture the download information in order to design a progress bar using Iup.
If there are more intelligent ways of capturing download information I will be happy to take advice as well.
Regardless, if it is possible to use io.popen(), os.execute(), or other tools in such a way I would still like to know how to capture the real time console output.

This works fine both on Windows and Linux. Lines are displayed in real-time.
local pipe = io.popen'ping google.com'
for line in pipe:lines() do
  print(line)
end
pipe:close()
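Update:
If the previous code didn't work (for example, because the program rewrites its progress line without emitting newlines), try reading one character at a time, as dualed suggested: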
local pipe = io.popen'youtube-dl with parameters'
repeat
  local c = pipe:read(1)
  if c then
    -- Do something with the char received
    io.write(c) io.flush()
  end
until not c
pipe:close()

Related

How can I pass and receive information dynamically within a subprocess?

I'm developing a Python code that can run two applications and exchange information between them during their run time.
The basic scheme is something like:
start a subprocess with the 1st application
start a subprocess with the 2nd application
1st application performs some calculation, writes a file A, and waits for input
2nd application reads file A, performs some calculation, writes a file B, and waits for input
1st application reads file B, performs some calculation, writes a file C, and waits for input
...and so on until some condition is met
I know how to start one Python subprocess, and now I'm learning how to pass/receive information during run time.
I'm testing my Python code using a super-simple application that just reads a file, makes a plot, closes the plot, and returns 0.
I was able to pass an input to a subprocess using subprocess.communicate() and I could tell that the subprocess used that information (plot opens and closes), but here the problems started.
I can only send an input string once. After the first subprocess.communicate() in my code below, the subprocess hangs there. I suspect I might have to use subprocess.stdin.write() instead, since I read that subprocess.communicate() waits for end-of-file, and I want to send several different inputs while the application is running. But I also read that the use of stdin.write() and stdout.read() is discouraged. I tried this second alternative (see #alternative in the code below), but in this case the application doesn't seem to receive the inputs, i.e. it doesn't do anything and the code ends.
Debugging is complicated because I haven't found a neat way to output what the subprocess is receiving as input and giving as output. (I tried to implement the solutions described here, but I must have done something wrong: Python: How to read stdout of subprocess in a nonblocking way, A non-blocking read on a subprocess.PIPE in Python)
Here is my working example. Any help is appreciated!
import os
import subprocess
from subprocess import PIPE
# Set application name
app_folder = 'my_folder_path'
full_name_app = os.path.join(app_folder, 'test_subprocess.exe')
# Start process
out_app = subprocess.Popen([full_name_app], stdin=PIPE, stdout=PIPE)
# Pass argument to process
N = 5
for n in range(N):
    str_to_communicate = f'{{\'test_{n+1}.mat\', {{\'t\', \'y\'}}}}' # funny looking string - but this how it needs to be passed
    bytes_to_communicate = str_to_communicate.encode()
    output_communication = out_app.communicate(bytes_to_communicate)
    # output_communication = out_app.stdin.write(bytes_to_communicate) # alternative
    print(f'Communication command #{n+1} sent')
# Terminate process
out_app.terminate()
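The repeated-input idea described in the question can be sketched as follows. This is a minimal sketch, not a fix for the real application: it assumes the child reads one line per request and answers with exactly one line, and it uses a tiny Python child (CHILD_CODE, a hypothetical stand-in for test_subprocess.exe) so the example is self-contained. The key points are writing to stdin, flushing after each message, and reading one reply line at a time instead of calling communicate().

```python
import subprocess
import sys

# Hypothetical stand-in for the real application: it reads one line at a
# time from stdin and answers each line with one line on stdout.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('received ' + line.strip(), flush=True)\n"
)


def talk_to_child(messages):
    """Send each message to the child and collect one reply line per message."""
    replies = []
    with subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        for message in messages:
            proc.stdin.write(message + "\n")
            proc.stdin.flush()  # without this the child may never see the data
            replies.append(proc.stdout.readline().strip())
        proc.stdin.close()  # end of input lets the child exit on its own
    return replies


if __name__ == "__main__":
    for reply in talk_to_child([f"test_{n + 1}.mat" for n in range(5)]):
        print(reply)
```

If the real application does not terminate its replies with a newline, or buffers its own output, readline() will block; in that case the reading side needs a different strategy (for example a reader thread).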

Is there a way to save a file that is written by a python script, if the python script is killed before it is completed?

I have been running a web scraper script written in Python. I had to terminate the script because of an issue with my internet connection. By then, the script had been running for 2-3 hours. I used a for loop to write the data into a CSV file, and I call file.close() to save the file once the for loop is over; but because I terminated the program early, those two hours were wasted.
When I tried to delete the newly created CSV file (its size is 0 kB), I was told 'The action can't be completed because the file is open in Python'. I assume all the data I extracted is still in RAM (maybe that's why I am not allowed to delete the 0 kB CSV file?).
So, is there any way to access that data and write it into the above-mentioned CSV file? (Otherwise, I will have to run the same program for another two hours and wait for the results.)
Here's my code!
#! python3.8
import csv
import time

fileCsv = open('newCsv.csv','w',newline='')
outputWriter = csv.writer(fileCsv)
for i in range(100000): # whatever range
    num, name = 10000, 'hello' # The data extracted from the website
    outputWriter.writerow([num,name])
    time.sleep(1)
fileCsv.close() # My program was terminated before this line, in the for loop
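Using with should help here: the file is closed, and its buffer flushed, even if the loop is interrupted by an exception such as KeyboardInterrupt.
import csv
import time

with open('newCsv.csv','w',newline='') as f:
    wr = csv.writer(f) # writerow belongs to the csv writer, not to the file object
    for i in range(100000): # whatever range
        num, name = 10000, 'hello' # The data extracted from the website
        wr.writerow([num,name])
        time.sleep(1)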
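A with block does not help if the process is killed outright (for example from the task manager), because no Python cleanup code runs in that case. A minimal sketch of one way to limit the loss, assuming it is acceptable to pay for a flush on every row: write each row and push it to the operating system immediately. The function name write_rows_safely and the demo file name are made up for this illustration.

```python
import csv
import os
import tempfile


def write_rows_safely(path, rows):
    """Write rows one at a time, flushing after each so little is lost on a kill."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)
            f.flush()             # hand Python's buffer to the operating system
            os.fsync(f.fileno())  # optional: ask the OS to write it to disk


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.gettempdir(), "newCsv_demo.csv")
    write_rows_safely(demo_path, [[10000, "hello"], [10001, "world"]])
    with open(demo_path, newline="") as f:
        print(list(csv.reader(f)))
```

Flushing per row is slower than buffering, but for a scraper that sleeps between requests the cost is negligible.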

How to extract python output out of the cmd?

I am using cmd in Windows 7 and I have encountered the following problem:
I type python in cmd to start the interpreter and then enter:
import requests
r=requests.get("https://nameofthepege.com")
r.text
After that the whole console fills up with HTML code. The last 200 to 300 lines of the output are visible but the rest are not. How can I see more lines?
Moreover, is there any way to extract the HTML code produced by the r.text command into a new file from within the Python environment or the cmd?
Regarding your first question.
After that the whole console gets full of html code. The last 200 to
300 lines of the output are visible but the rest are not. How can I
see more lines?
Response: By default, the CMD screen buffer is limited to 300 lines. You should increase the Command Prompt screen buffer size.
The tutorial below explains how to do that:
https://www.tenforums.com/tutorials/94089-change-command-prompt-screen-buffer-size-windows.html
Regarding your second question.
Moreover, is there any way to extract the html code produced by the
r.text command in a new file from within the python environment or the
cmd?
Response: You can write the content of r.text to a file by opening the file with Python's open() function and writing to it. More information about reading and writing files is at the link below:
https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
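As a minimal sketch of the second response: the helper below writes any string to a file as UTF-8. The name save_text and the file name page.html are made up for this illustration, and the example uses a literal string instead of a real request so that it runs without network access; in the session from the question you would pass r.text.

```python
import os
import tempfile
from pathlib import Path


def save_text(text, path):
    """Write a string to a file as UTF-8; return the number of characters written."""
    return Path(path).write_text(text, encoding="utf-8")


if __name__ == "__main__":
    # In the question's session this would be: save_text(r.text, "page.html")
    html = "<html><body>example</body></html>"
    target = os.path.join(tempfile.gettempdir(), "page.html")
    save_text(html, target)
    print("saved to", target)
```

Specifying the encoding explicitly avoids errors on Windows when the page contains characters that the console's default code page cannot represent.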

audio file isn't being parsed with Google Speech

This question is a followup to a previous question.
The snippet of code below almost works: it runs without error yet gives back a None value for results_list. This means it is accessing the file (I think) but just can't extract anything from it.
I have a file, sample.wav, living publicly here: https://storage.googleapis.com/speech_proj_files/sample.wav
I am trying to access it by specifying source_uri='gs://speech_proj_files/sample.wav'.
I don't understand why this isn't working. I don't think it's a permissions problem. My session is instantiated fine. The code chugs for a second, yet always comes up with no result. How can I debug this?? Any advice is much appreciated.
from google.cloud import speech
speech_client = speech.Client()
audio_sample = speech_client.sample(
    content=None,
    source_uri='gs://speech_proj_files/sample.wav',
    encoding='LINEAR16',
    sample_rate_hertz=44100)
results_list = audio_sample.async_recognize(language_code='en-US')
Ah, that's my fault from the last question. That's the async_recognize command, not the sync_recognize command.
That library has three recognize commands. sync_recognize reads the whole file and returns the results. That's probably the one you want. Remove the letter "a" and try again.
Here's an example Python program that does this: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/cloud-client/transcribe.py
FYI, here's a summary of the other types:
async_recognize starts a long-running, server-side operation to transcribe the whole file. You can call the operation.poll() method to ask the server whether it has finished and, when complete, get the results via operation.results.
The third type is streaming_recognize, which sends you results continually as they are processed. This can be useful for long files where you want some results immediately, or if you're continuously uploading live audio.
I finally got something to work:
import time
from google.cloud import speech
speech_client = speech.Client()
sample = speech_client.sample(
    content=None,
    source_uri='gs://speech_proj_files/sample.wav',
    encoding='LINEAR16',
    sample_rate=44100,  # called sample_rate_hertz in the question's code; use the name your installed version accepts
)
# The language code is passed to async_recognize below, not to sample().
retry_count = 100
operation = sample.async_recognize(language_code='en-US')
while retry_count > 0 and not operation.complete:
    retry_count -= 1
    time.sleep(10)
    operation.poll() # API call
print(operation.complete)
print(operation.results[0].transcript)
print(operation.results[0].confidence)
Then something like
for op in operation.results:
    print(op.transcript)
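The poll-until-complete pattern used above can be isolated into a small helper. This is only a sketch of the pattern: FakeOperation is a hypothetical stand-in with the same complete, poll() and results names that the answer's code uses, and nothing here calls the Google API.

```python
import time


class FakeOperation:
    """Hypothetical stand-in for a long-running operation; not the Google API."""

    def __init__(self, polls_needed):
        self._polls_left = polls_needed
        self.complete = False
        self.results = None

    def poll(self):
        # Pretend the server finishes after a fixed number of polls.
        self._polls_left -= 1
        if self._polls_left <= 0:
            self.complete = True
            self.results = ["hello world"]


def wait_until_complete(operation, max_polls=100, delay=10.0):
    """Poll until the operation completes or the retry budget is used up."""
    for _ in range(max_polls):
        if operation.complete:
            return True
        time.sleep(delay)
        operation.poll()
    return operation.complete


if __name__ == "__main__":
    op = FakeOperation(polls_needed=3)
    if wait_until_complete(op, max_polls=5, delay=0.1):
        print(op.results)
    else:
        print("gave up before the operation finished")
```

Checking the return value before touching results avoids indexing into an empty result when the retry budget runs out first.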

pcap file viewing library in python 3

I'm looking at trying to read pcap files from various CTF events.
Ideally, I would like something that can do the breakdown of information such as wireshark, but just being able to read the timestamp and return the packet as a bytestring of some kind would be welcome.
The problem is that there is little or no python 3 support with all the commonly cited libraries: dpkt, pylibpcap, pcapy, etc.
Does anyone know of a pcap library that works with python 3?
To my knowledge, there are at least 2 packages that seem to work with Python 3: pure-pcapfile and dpkt:
pure-pcapfile is easy to install in Python 3 using pip. It's very easy to use but still limited to decoding Ethernet and IP data. The rest is left to you. But it works right out of the box.
dpkt doesn't work right out of the box and needs some manipulation first. They are porting it to Python 3 and plan to have a Python 2 and 3 compatible version for version 2.0. Unfortunately, it's not there yet. However, it is far more complete than pure-pcapfile and can decode many protocols. If your packet embeds several layers of protocols, it will decode them automatically for you. The only problem is that you need to make a few corrections here and there to make it work (at the time of writing this comment).
pure-pcapfile
The only one that I have found working for Python 3 so far is pcapfile. You can find it at https://pypi.python.org/pypi/pypcapfile/ or install it by doing pip3 install pypcapfile.
It offers only basic functionality, but it works very well for me and has been updated quite recently (at the time of writing this message):
from pcapfile import savefile
file = open('mypcapfile.pcap', 'rb')
pcapfile = savefile.load_savefile(file,verbose=True)
If everything goes well, you should see something like this:
[+] attempting to load mypcapfile.pcap
[+] found valid header
[+] loaded 1234 packets
[+] finished loading savefile.
A few remarks now. I'm using Python 3.4.3. Doing import pcapfile alone will not import the submodules (I'm still a beginner with Python), only the basic information and functions of the package, which is why savefile is imported explicitly above. Next, you have to explicitly open your file in read binary mode by passing 'rb' as the mode to the open() function. The documentation doesn't say this explicitly.
The rest is like in the documentation:
packet = pcapfile.packets[12]
to access the packet number 12 (the 13th packet then, the first one being at 0). And you have basic functionalities like
packet.timestamp
to get a timestamp or
packet.raw()
to get raw data.
The documentation mentions functions to do packet decoding of some standard formats like Ethernet and IP.
dpkt
dpkt is not available for Python 3 out of the box (at the time of writing), so you need to do the following, assuming you have access to a command line. The code is available at https://github.com/kbandla/dpkt.git and you must download it first:
git clone https://github.com/kbandla/dpkt.git
cd dpkt
git checkout --track origin/migrate_py3
git pull
These 4 commands do the following:
clone (download) the code from its git repository on GitHub
go into the newly created directory named dpkt
switch to the branch named migrate_py3, which contains the Python 3 code. As you can see from the name of this branch, it's still experimental. So far it works for me.
(just in case) download the code again
Then copy the directory named dpkt into your project or wherever Python 3 can find it.
Later on, in Python 3 here is what you have to do to get started:
import dpkt
file = open('mypcapfile.pcap','rb')
will open your file. Don't forget the 'rb' binary mode in Python 3 (same thing as in pure-pcapfile).
pcap = dpkt.pcap.Reader(file)
will read and decode your file
for ts, buf in pcap:
    eth = dpkt.ethernet.Ethernet(buf)
    print(eth)
will, for example, decode the Ethernet packets and print them. Then read the documentation on how to use dpkt. If your packets contain an IP or TCP layer, then dpkt.ethernet.Ethernet(buf) will decode them as well. Also note that in the for loop, we have access to the timestamps in ts.
If you want to iterate in a less constrained way, you can fetch the packets one at a time:
(ts,buf) = next(pcap)
eth = dpkt.ethernet.Ethernet(buf)
where the first line gets the next tuple from the pcap file. Once everything has been read, next() raises StopIteration (or returns a default value if you pass one, as in next(pcap, None)).
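If all you need is what the question asks for as a minimum (the timestamp and the packet as a bytestring), the classic pcap format is simple enough to read with the standard library alone. The following is a minimal sketch, not a replacement for the libraries above: it assumes the classic libpcap file format with microsecond timestamps (not pcapng), does no protocol decoding, and read_pcap is a name made up for this illustration.

```python
import io
import struct


def read_pcap(stream):
    """Yield (timestamp, raw_bytes) for each packet in a classic pcap stream."""
    header = stream.read(24)  # the global header is 24 bytes long
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian machine
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    while True:
        record = stream.read(16)  # per-packet header: 4 unsigned 32-bit fields
        if len(record) < 16:
            return
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated final packet
        yield ts_sec + ts_usec / 1_000_000, data


if __name__ == "__main__":
    # Build a tiny one-packet capture in memory so the demo needs no file.
    global_header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    packet = struct.pack("<IIII", 1, 500000, 3, 3) + b"abc"
    for ts, buf in read_pcap(io.BytesIO(global_header + packet)):
        print(ts, buf)
```

With a real capture, pass a file opened in 'rb' mode, for example read_pcap(open('mypcapfile.pcap', 'rb')).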

Resources