Capture a terminal output in real time of an python module - python-3.x

The python 'yfinance' module downloads the quotes of many Financial Securities in a pandas dataframe and in the meanwhile it displays a progress bar in the console. In this way:
import yfinance as yf
Tickerlist = ["AAPL","GOOG","MSFT"]
quote = yf.download(tickers=Tickerlist,period='max',interval='1d',group_by='ticker')
I would like to capture the console progress bar in real time, and the code should be this:
import sys
import subprocesss
process = subprocess.Popen(["yf.download","tickers=Tickerlist","period='max'","interval='1d'","group_by='ticker'"],stdout=quote)
while True:
out = process.stdout.read(1)
sys.stdout.write(out)
sys.stdout.flush()
I make a big mess with subprocess. I need your help! Thanks.
I have already seen all the links that deal with this topic but without being able to solve my problem.

You need two python files to do what you want.
one is yf_download.py and second is run.py
The file code looks like this and you can run it through run.py
python run.py
yf_download.py
import sys
import yfinance as yf
Tickerlist = ["AAPL","GOOG","MSFT"]
def run(period):
yf.download(tickers=Tickerlist, period=period,interval='1d',group_by='ticker')
if __name__ == '__main__':
period = sys.argv[1]
run(period)
run.py
import sys
import subprocess
process = subprocess.Popen(["python", "yf_download.py", "max"],stdout=subprocess.PIPE)
while True:
out = process.stdout.read(1)
if process.poll() is not None:
break
if out != '':
sys.stdout.buffer.write(out)
sys.stdout.flush()

Related

How to Unittest sys.argv with multiple values

Let's say I have a Python script file that uses sys.argv variables. And I want to test the sys.argv with different values in multiple tests. The issue is that I have to patch and import it multiple times but i think it holds the sys.args values only from the first import so both TestCases print the same values ['test1', 'Test1']. Is my approach wrong?
example.py
import sys
ex1 = sys.argv[0]
ex2 = sys.argv[1]
print(ex1)
print(ex2)
test_example.py
import unittest
import mock
import sys
class TestExample(unittest.TestCase):
#mock.patch.object(sys, 'argv', ['test1', 'Test1'])
def test_example1(self):
import example
#mock.patch.object(sys, 'argv', ['test2', 'Test2'])
def test_example2(self):
import example
The problem is that Python will not (unless you work hard at it) reimport the same file more than once.
What you should do is have the import only define stuff, and define a main() function in your module that gets called when it's time to run stuff. You call this main() function directly in the script when the script is called by itself (see the last two lines of example.py below), but then you can also call that function by name during unit testing.
For what it's worth, this is best practice in Python: your scripts and module should mostly define functions, classes and whatnot on import, but not run much. Then you call your main() function in a __name__ guard. See What does if __name__ == "__main__": do? for more details about that.
Here's your code modified to work, testing your module with two different sets of sys.argv values:
example.py:
import sys
def main():
ex1 = sys.argv[0]
ex2 = sys.argv[1]
print(ex1)
print(ex2)
if __name__ == "__main__":
main()
test-example.py:
import unittest
import mock
import sys
import example
class TestExample(unittest.TestCase):
#mock.patch.object(sys, 'argv', ['test1', 'Test1'])
def test_example1(self):
example.main()
#mock.patch.object(sys, 'argv', ['test2', 'Test2'])
def test_example2(self):
example.main()
Results from python -m unittest:
..
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
test1
Test1
test2
Test2

Python stuck at last program execution

I am new to Python and I think I broke my python :(
I was trying Sentdex's PyQt4 YouTube tutorial right here.
I made the changes from PyQt4 to PyQt5. This is the code I was playing around. So I think, I messed up by printing the whole page on the console.
Now the output is:
Load finished
Look at you shinin!
Press any key to continue . . .
This is being shown for any code executed. That is python shows this code even if I try print("hello") in Visual code. I even tried to restart. Now like a virus, it is not clearing.
import bs4 as bs
import sys
import urllib.request
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl
class Page(QWebEnginePage):
def __init__(self, url):
self.app = QApplication(sys.argv)
QWebEnginePage.__init__(self)
self.html = ''
self.loadFinished.connect(self._on_load_finished)
self.load(QUrl(url))
self.app.exec_()
def _on_load_finished(self):
self.html = self.toHtml(self.Callable)
print('Load finished')
def Callable(self, html_str):
self.html = html_str
self.app.quit()
def main():
page = Page('https://pythonprogramming.net/parsememcparseface/')
soup = bs.BeautifulSoup(page.html, 'html.parser')
js_test = soup.find('p', class_='jstest')
print js_test.text
print (soup)
#js_test = soup.find('div', class_='aqi-meter-panel')
#display.popen.terminate()
if __name__ == '__main__': main()
OK, so finally got the problem fixed..went manually inside the temp files in C:\Users\xxx\AppData\Local and started on a deletion rampage...removed many files and folder remotely related to python,vscode and conda...this gave an error warning first time I executed my program again...then on subsequent run...no issue...python back to its normal self...surprised that I was not able to find any solution on the net for this.

Why is multiprocessing not working with python dash framework - Python3.6

I'm trying to implement multiprocessing library for splitting up a dataframe into parts, process it on multiple cores of CPU and then concatenate the results back into a final dataframe in a python dash application. The code works fine when I try it outside of the dash application (when I run the code standalone without enclosing it in a dash application). But when I enclose the same code in a dash application, I get an error. I have shown the code below:
I have tried the multiprocessing code out of the dash framework and it works absolutely fine.
import dash
from dash.dependencies import Input, Output, State
import dash_core_components as dcc
import dash_html_components as html
import flask
import dash_table_experiments as dt
import dash_table
import dash.dependencies
import base64
import time
import os
import pandas as pd
from docx import *
from docx.text.paragraph import Paragraph
from docx.text.paragraph import Run
import xml.etree.ElementTree as ET
import multiprocessing as mp
from multiprocessing import Pool
from docx.document import Document as doctwo
from docx.oxml.table import CT_Tbl
from docx.oxml.text.paragraph import CT_P
from docx.table import _Cell, Table
from docx.text.paragraph import Paragraph
import io
import csv
import codecs
import numpy as np
app = dash.Dash(__name__)
application = app.server
app.config.supress_callback_exceptions = True
app.layout = html.Div(children=[
html.Div([
html.Div([
html.H4(children='Reader'),
html.Br(),
],style={'text-align':'center'}),
html.Br(),
html.Br(),
html.Div([
dcc.Upload(html.Button('Upload File'),id='upload-data',style = dict(display = 'inline-block')),
html.Br(),
]
),
html.Div(id='output-data-upload'),
])
])
#app.callback(Output('output-data-upload', 'children'),
[Input('upload-data', 'contents')],
[State('upload-data', 'filename')])
def update_output(contents, filename):
if contents is not None:
content_type, content_string = contents.split(',')
decoded = base64.b64decode(content_string)
document = Document(io.BytesIO(decoded))
combined_df = pd.read_csv('combined_df.csv')
def calc_tfidf(input1):
input1 = input1.reset_index(drop=True)
input1['samplecol'] = 'sample'
return input1
num_cores = mp.cpu_count() - 1 #number of cores on your machine
num_partitions = mp.cpu_count() - 1 #number of partitions to split dataframe
df_split = np.array_split(combined_df, num_partitions)
pool = Pool(num_cores)
df = pd.concat(pool.map(calc_tfidf, df_split))
pool.close()
pool.join()
return len(combined_df)
else:
return 'No File uploaded'
app.css.append_css({'external_url': 'https://codepen.io/plotly/pen/EQZeaW.css'})
if __name__ == '__main__':
app.run_server(debug=True)
The above dash application takes as input any file. Upon uploading the file in the front end, a local CSV file (any file, in my case it is combined_df.csv) is loaded into a dataframe. Now I want to split the dataframe into parts using multiprocessing, process it and combine it back. But the above code results in the following error:
AttributeError: Can't pickle local object 'update_output..calc_tfidf'
What's wrong with this piece of code?
Okay I've figured it out now!. The problem is that the function calc_tfidf was not defined as a global function. I changed the function to be a global function and it worked perfect.
Simple checks when left unsolved at times might lead to days of redundant efforts! :(

How do I stop argparser from printing default search?

Link to my code: https://pastebin.com/y4zLD2Dp
The imports that have not been used are going to be used as I progress through my project, I just like to have all imports I need ready to go. The goal for this program will be a youtube video downloader into mp3 format first. This is my first big project to my standards, only been coding for just over 2 months.
from bs4 import BeautifulSoup
from selenium import webdriver
import requests
import sqlite3
import argparse
import sys
from selenium import webdriver
from apiclient.discovery import build
from apiclient.errors import HttpError
from oauth2client.tools import argparser
#To get a developer key visit https://console.developers.google.com.
DEVELOPER_KEY = ""
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"
'''Searches for results on youtube and stores them in the database.'''
def yt_search(options):
youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
developerKey=DEVELOPER_KEY)
search = youtube.search().list(q=options.q, part="id,snippet",
maxResults=options.max_results).execute()
video_id = []
#Add results to a list and print them.
for result in search.get("items", []):
if result["id"]["kind"] == "youtube#video":
video_id.append("%s (%s)" % (result["snippet"]["title"],
result["id"]["videoId"]))
else:
continue
print("Videos:\n", "\n".join(video_id), "\n")
def download(args):
print(args)
"""Arguments for the program."""
if __name__ == '__main__':
parser = argparse.ArgumentParser(description= "This program searches for
Youtube links and allows you to download songs from said list. " +
"Please remember to be specific in your
searches for best results.")
parser.add_argument("--q", help="Search term", default="Aeryes")
parser.add_argument("--max-results", help="Max results", default=25)
parser.add_argument("--d", type=download, help="Download a video from
search results.")
args = parser.parse_args()
if len(sys.argv) < 2:
parser.parse_args(['--help'])
sys.exit(1)
try:
yt_search(args)
except HttpError:
print("HTTP error")
The problem that I am having is that upon running the --d cmd in the CLI it works and prints the arg as expected (This is just a test to see that functions are working with the parser) but after it prints a list of default youtube links from --q default which I do not want it to do. How do I stop this from happening. Should I use subparser or is there something that I am missing?
If anyone has good resources for argparser module other than official doc support please share.

Testing a module with input() and print() inside - using TestCase

Situation:
A students task is to write a python script, doing something with only input() and print().
Example task: "Write python script that asks for number 'n' and prints string of 'n' stars."
He writes a script like this:
solution.py
n = int(input())
print('*' * n)
I want to write python module, that tests that his script is working correctly.
My solution till now is to do this:
test.py
from io import StringIO
import sys
sys.stdout = StringIO()
sys.stdin = StringIO(str(2))
import riesenie
s = sys.stdout.getvalue()
if s != '**' * 2 + '\n':
raise Exception('WRONG')
else:
print("OK", file=sys.stderr)
BUT, I would like to achieve this behaviour using TestCase. Something like:
test.py
from unittest import TestCase
class TestSolution(TestCase):
def test_stars(self):
sys.stdout = StringIO()
sys.stdin = StringIO(str(2))
import riesenie
s = sys.stdout.getvalue()
self.assertEqual('**', s)
But this is not working.
Is there a way to achieve this? Thank you in advance.

Resources