I have a view that generates data and streams it in real time. I can't figure out how to send this data to a variable that I can use in my HTML template. My current solution just outputs the data to a blank page as it arrives, which works, but I want to include it in a larger page with formatting. How do I update, format, and display the data as it is streamed to the page?
import flask
import time, math
app = flask.Flask(__name__)
@app.route('/')
def index():
    def inner():
        # simulate a long process to watch
        for i in range(500):
            j = math.sqrt(i)
            time.sleep(1)
            # this value should be inserted into an HTML template
            yield str(i) + '<br/>\n'
    return flask.Response(inner(), mimetype='text/html')
app.run(debug=True)
You can stream data in a response, but you can't dynamically update a template the way you describe. The template is rendered once on the server side, then sent to the client.
One solution is to use JavaScript to read the streamed response and output the data on the client side. Use XMLHttpRequest to make a request to the endpoint that will stream the data. Then periodically read from the stream until it's done.
This introduces complexity, but allows updating the page directly and gives complete control over what the output looks like. The following example demonstrates that by displaying both the current value and the log of all values.
This example assumes a very simple message format: a single line of data, followed by a newline. This can be as complex as needed, as long as there's a way to identify each message. For example, each loop could return a JSON object which the client decodes.
from math import sqrt
from time import sleep
from flask import Flask, render_template
app = Flask(__name__)
@app.route("/")
def index():
    return render_template("index.html")
@app.route("/stream")
def stream():
    def generate():
        for i in range(500):
            yield "{}\n".format(sqrt(i))
            sleep(1)
    return app.response_class(generate(), mimetype="text/plain")
<p>This is the latest output: <span id="latest"></span></p>
<p>This is all the output:</p>
<ul id="output"></ul>
<script>
var latest = document.getElementById('latest');
var output = document.getElementById('output');
var xhr = new XMLHttpRequest();
xhr.open('GET', '{{ url_for('stream') }}');
xhr.send();
var position = 0;
function handleNewData() {
  // the response text includes the entire response so far
  // split the messages, then take the messages that haven't been handled yet
  // position tracks how many messages have been handled
  // messages end with a newline, so split will always show one extra empty message at the end
  var messages = xhr.responseText.split('\n');
  messages.slice(position, -1).forEach(function(value) {
    latest.textContent = value;  // update the latest value in place
    // build and append a new item to a list to log all output
    var item = document.createElement('li');
    item.textContent = value;
    output.appendChild(item);
  });
  position = messages.length - 1;
}
var timer;
timer = setInterval(function() {
  // check the response for new data
  handleNewData();
  // stop checking once the response has ended
  if (xhr.readyState == XMLHttpRequest.DONE) {
    clearInterval(timer);
    latest.textContent = 'Done';
  }
}, 1000);
</script>
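The JSON-per-line message format mentioned above could look roughly like this on the Python side. This is only a sketch: the field names `i` and `sqrt` and the helper `new_messages` are made up for illustration, and `new_messages` simply mirrors in Python what the JavaScript `handleNewData` does with `position`.

```python
import json
from math import sqrt


def generate(n):
    """Yield one JSON document per line (newline-delimited JSON)."""
    for i in range(n):
        yield json.dumps({"i": i, "sqrt": sqrt(i)}) + "\n"


def new_messages(response_text, position):
    """Return (decoded messages not handled yet, new position).

    Split on newlines, skip what was already handled, and ignore the
    last piece, which is either empty or an incomplete message.
    """
    lines = response_text.split("\n")
    complete = lines[:-1]
    fresh = [json.loads(line) for line in complete[position:]]
    return fresh, len(complete)


# Simulate the response text growing as chunks arrive.
chunks = list(generate(3))
text = chunks[0] + chunks[1]
first, pos = new_messages(text, 0)
text += chunks[2]
second, pos = new_messages(text, pos)
```

In a Flask view, the `generate` generator would be passed to `app.response_class` exactly like the plain-text version; on the client, `JSON.parse` would take the place of `json.loads`.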
An <iframe> can be used to display streamed HTML output, but it has some downsides. The frame is a separate document, which increases resource usage. Since it's only displaying the streamed data, it might not be easy to style it like the rest of the page. It can only append data, so long output will render below the visible scroll area. It can't modify other parts of the page in response to each event.
index.html renders the page with a frame pointed at the stream endpoint. The frame has fairly small default dimensions, so you may want to style it further. Use render_template_string, which knows to escape variables, to render the HTML for each item (or use render_template with a more complex template file). An initial line can be yielded to load CSS in the frame first.
from flask import render_template_string, stream_with_context
@app.route("/stream")
def stream():
    @stream_with_context
    def generate():
        yield render_template_string('<link rel=stylesheet href="{{ url_for("static", filename="stream.css") }}">')
        for i in range(500):
            yield render_template_string("<p>{{ i }}: {{ s }}</p>\n", i=i, s=sqrt(i))
            sleep(1)
    return app.response_class(generate())
<p>This is all the output:</p>
<iframe src="{{ url_for("stream") }}"></iframe>
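To illustrate why escaping matters when each streamed item is turned into HTML, here is a minimal sketch that uses the standard library's `html.escape` in place of Jinja's autoescaping (swapped in only so the snippet runs without Flask). The `generate(items)` signature and the sample labels are hypothetical.

```python
from html import escape
from math import sqrt


def generate(items):
    """Yield one HTML fragment per item, escaping the untrusted label."""
    for label, value in items:
        yield "<p>{}: {}</p>\n".format(escape(label), sqrt(value))


fragments = list(generate([("four", 4), ("<script>alert(1)</script>", 9)]))
```

The second label comes out as inert text rather than a script tag, which is roughly what `render_template_string` does for `{{ i }}` and `{{ s }}` in the example above.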
Five years late, but this can actually be done the way you were initially trying to do it; JavaScript is unnecessary (Edit: the author of the accepted answer added the iframe section after I wrote this). You just have to embed the output as an <iframe>:
from flask import Flask, render_template, Response
import time, math
app = Flask(__name__)
@app.route('/content')
def content():
    """
    Render the content at a URL different from index
    """
    def inner():
        # simulate a long process to watch
        for i in range(500):
            j = math.sqrt(i)
            time.sleep(1)
            # this value should be inserted into an HTML template
            yield str(i) + '<br/>\n'
    return Response(inner(), mimetype='text/html')
@app.route('/')
def index():
    """
    Render a template at the index. The content will be embedded in this template
    """
    return render_template('index.html.jinja')
app.run(debug=True)
Then the 'index.html.jinja' file will include an <iframe> with the content URL as the src, which would look something like:
<!doctype html>
<head>
<title>Title</title>
</head>
<body>
<div>
<iframe frameborder="0"
onresize="noresize"
style='background: transparent; width: 100%; height:100%;'
src="{{ url_for('content')}}">
</iframe>
</div>
</body>
When rendering user-provided data render_template_string() should be used to render the content to avoid injection attacks. However, I left this out of the example because it adds additional complexity, is outside the scope of the question, isn't relevant to the OP since he isn't streaming user-provided data, and won't be relevant for the vast majority of people seeing this post since streaming user-provided data is a far edge case that few if any people will ever have to do.
I originally had a problem similar to the one posted here: a model is being trained, and the progress update should stay in one place on the page and be formatted in HTML. The following answer is for future reference, or for people who are trying to solve the same problem and need inspiration.
A good solution is to use an EventSource in JavaScript, as described here. The listener can be started using a context variable, such as one set from a form or another source. The listener is stopped by sending a stop command. In this example, a sleep call stands in for the real work so that the updates are visible. Lastly, HTML formatting can be achieved using JavaScript DOM manipulation.
Flask Application
import flask
import time
app = flask.Flask(__name__)
@app.route('/learn')
def learn():
    def update():
        yield 'data: Prepare for learning\n\n'
        # Prepare model
        time.sleep(1.0)
        for i in range(1, 101):
            # Perform update
            time.sleep(0.1)
            yield f'data: {i}%\n\n'
        yield 'data: close\n\n'
    return flask.Response(update(), mimetype='text/event-stream')
@app.route('/', methods=['GET', 'POST'])
def index():
    train_model = False
    if flask.request.method == 'POST':
        if 'train_model' in list(flask.request.form):
            train_model = True
    return flask.render_template('index.html', train_model=train_model)
app.run(threaded=True)
HTML Template
<form action="/" method="post">
<input name="train_model" type="submit" value="Train Model" />
</form>
<p id="learn_output"></p>
{% if train_model %}
<script>
var target_output = document.getElementById("learn_output");
var learn_update = new EventSource("/learn");
learn_update.onmessage = function (e) {
if (e.data == "close") {
learn_update.close();
} else {
target_output.innerHTML = "Status: " + e.data;
}
};
</script>
{% endif %}
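For reference, the wire format that EventSource expects is small enough to sketch without Flask. The helper `sse_message` and the `steps` parameter are hypothetical names, not part of Flask or of the answer above; the `close` sentinel is the same convention the template checks for.

```python
def sse_message(data):
    """Format one server-sent event.

    Each line of the payload gets its own "data:" field, and a blank
    line terminates the message.
    """
    lines = ["data: {}".format(part) for part in str(data).split("\n")]
    return "\n".join(lines) + "\n\n"


def update(steps):
    """Yield progress messages followed by the "close" sentinel."""
    yield sse_message("Prepare for learning")
    for i in range(1, steps + 1):
        yield sse_message("{}%".format(i * 100 // steps))
    yield sse_message("close")


messages = list(update(4))
```

A generator like `update` would be wrapped in `flask.Response(..., mimetype='text/event-stream')` as in the application above. Splitting multi-line payloads into several `data:` fields keeps a newline inside the payload from ending the message early.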
Related
I am trying to learn web-scraping on asynchronous javascript-heavy sites. I chose a real estate website to do that. So, I have done the search by hand and came up with the URL as the first step. Here is the url:
CW_url = "https://www.cushmanwakefield.com/en/united-states/properties/invest/invest-property-search#q=Los%20angeles&sort=%40propertylastupdateddate%20descending&f:PropertyType=[Office,Warehouse%2FDistribution]&f:Country=[United%20States]&f:StateProvince=[CA]"
I then tried to write code to read the page using beautiful soup:
while iterations < 10:
    time.sleep(5)
    html = driver.execute_script("return document.documentElement.outerHTML")
    sel_soup = bs(html, 'html.parser')
    forsales = sel_soup.findAll("for sale")
    iterations += 1
    print(f'iteration {iterations} - forsales: {forsales}')
I also tried using requests-html:
from requests_html import HTMLSession, HTML
from requests_html import AsyncHTMLSession
asession = AsyncHTMLSession()
r = await asession.get(CW_url)
r.html.arender(wait = 5, sleep = 5)
r.text.find('for sale')
But, this gives me -1, which means the text could not be found! The r.text does give me a wall of HTML text, and inside that there seems to be some javascript not run yet!
<script type="text/javascript">
var endpointConfiguration = {
itemUri: "sitecore://web/{34F7EE0A-4405-44D6-BF43-13BC99AE8AEE}?lang=en&ver=4",
siteName: "CushmanWakefield",
restEndpointUri: "/coveo/rest"
};
if (typeof (CoveoForSitecore) !== "undefined") {
CoveoForSitecore.SearchEndpoint.configureSitecoreEndpoint(endpointConfiguration);
CoveoForSitecore.version = "5.0.788.5";
var context = document.getElementById("coveo3a949f41");
if (!!context) {
CoveoForSitecore.Context.configureContext(context);
}
}
</script>
I thought that, because the URL contains all the search criteria, the site would make the fetch request, return the data, and generate the HTML. Apparently not! So, what am I doing wrong, and how should I deal with this or similar sites? Ideally, one would replace the search criteria in CW_url and let the code retrieve and store the data.
I have a simple Flask app that writes a statistics table from a database to a page. How can I plot a plotly.express chart on this page?
Code for the chart that I want to integrate into the Flask app (taken from https://plotly.com/python/time-series/):
# Using plotly.express
import plotly.express as px
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/...')
fig = px.line(df, x='Date', y='AAPL.High')
fig.show()
There need to be more answers floating around out there that actually show how to do this without using dash. I will share my working example of using plotly.express and flask. I'll cut out most of the data work and figure building to just show what you need to do.
Imports needed
You'll need these in addition to your usual px and flask imports.
from plotly import utils
from json import dumps
Short Explanation: JSON and Plotly.js are key
I use pandas to get a dataframe in a function called get_data, and build a scatter plot with connected lines in a function called get_lfig. The only important thing to note is that get_lfig returns a figure generated from px.scatter(), but it can be any figure. The trick is to turn your figure into JSON and use it on the template side; you don't do much in Python.
Example 1: Create the fig, JSON and pass it to the template
Altogether it looks something like this
from flask import Flask, render_template
from extensions import get_data, get_lfig
from plotly import utils
from json import dumps
app = Flask(__name__)
@app.route('/')
def home():
    # function that gets the data, private and public sets
    days = 2
    priv_data, pub_data = get_data(hours=24*days)
    # function that returns a px.scatter figure
    all_pub_fig = get_lfig(pub_data)
    # turn the figure into a JSON then pass it to the template
    all_pub_json = dumps(all_pub_fig, cls=utils.PlotlyJSONEncoder)
    return render_template('graph.html', pub_lines_JSON=all_pub_json)
if __name__ == '__main__':
    app.run(debug=True, port=8080, host='0.0.0.0')
On the template side, you want to include the src for plotly.js somewhere and then just use the Plotly.plot() function to populate your graph in a div.
<!-- somewhere up top -->
<script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
...
<h3>Line Graph Representation</h3>
<!-- scatter plot goes in this div -->
<div id='all-pub-graph'></div>
<script type="text/javascript">
// here is where the JSON gets plugged in via JS
var the_pubs_graph = {{pub_lines_JSON | safe}};
// you target the graph div id in the first arg,
// put your graph in the second, and set the third as {}
Plotly.plot("all-pub-graph", the_pubs_graph, {});
</script>
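If the `cls=` argument to `dumps` looks like magic: it accepts a `json.JSONEncoder` subclass that knows how to serialize types plain JSON can't handle, and `utils.PlotlyJSONEncoder` is passed the same way. The sketch below shows only the mechanism. `FigureEncoder` and the dict standing in for a figure are hypothetical, and this is not a description of what Plotly's encoder handles internally.

```python
import json
from datetime import date


class FigureEncoder(json.JSONEncoder):
    """Hypothetical encoder: teaches json.dumps about one extra type."""

    def default(self, obj):
        if isinstance(obj, date):
            return obj.isoformat()
        # Fall back to the base class, which raises TypeError.
        return super().default(obj)


# A stand-in for a figure: plain dicts and lists plus a date value.
figure = {"data": [{"x": [date(2020, 1, 1)], "y": [1.5]}], "layout": {}}
figure_json = json.dumps(figure, cls=FigureEncoder)
```

The resulting string is what gets dropped into the template and handed to the JavaScript side.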
Example 2: Create the JSON, send it to JS fetch request
Knowing you just want a JSON of your figure, you can take this further and handle it in all JS without having to pass it to the template directly.
Here I handle a post request from on-page selectors that make an api call every time they're changed. The get_lfig function now returns a JSON of the figure instead of a figure object.
# requires: from flask import request
@app.route('/get-graphs', methods=['POST'])
def get_graphs():
    if request.method == 'POST':
        # This whole block is just form handling and data stuff
        form = dict(request.form)
        agg_func = form['agg_func']
        days = int(form['days'])
        interval = int(form['interval'])
        if (len(agg_func) > 0) and (interval != 0):
            pub, priv = get_data(hours=24*days, interval=interval, agg_func=agg_func)
        else:
            pub, priv = get_data(hours=24*days)
        # get JSON figures from the data
        pub_f, priv_f = get_lfig(pub), get_lfig(priv)
        return {'public': pub_f, 'private': priv_f}
On the template side I use an event listener attached to a form so every time I make a change the graphs get updated. I still need the JS function to be able to find the url, so I pass it to the function using url_for() since utils.js isn't being rendered and can't take advantage of that same template functionality.
<div id="selectors">
<form id="graph-selectors" method="post" onchange="selector_changed('{{url_for('get_graphs')}}')">
<label for="day-selector">Days</label>
<select name="days" id="day-selector">
{% for opt in range(1,10) %}
<option value={{opt}}>{{opt}}</option>
{% endfor %}
</select>
<label for="function-selector">Aggregate Func</label>
<select name="agg_func" id="function-selector">
{% for opt in ['','mean','sum','min','max','std','count'] %}
<option value="{{opt}}">{{opt}}</option>
{% endfor %}
</select>
<label for="interval-selector">Aggregate Interval</label>
<select name="interval" id="interval-selector">
{% for opt in [0, 5, 10, 30, 60] %}
<option value={{opt}}>{{opt}} minutes</option>
{% endfor %}
</select>
</form>
</div>
<div id="content">
<div id="graph-section"></div>
</div>
<script src="{{ url_for('static', filename='utils.js') }}"></script>
Finally, the selector_changed function that makes the API call is stored in my utils.js and looks like this:
// I just use this so it clears existing graphs between changes
const clearChildren = (parent) => {
while (parent.lastChild) {
parent.removeChild(parent.lastChild);
}
}
async function selector_changed(gUrl) {
// get the form data
var form_data = new FormData(document.querySelector('form#graph-selectors'))
// send it to get_graphs()
let response = await fetch(gUrl, {
method: "POST",
body: form_data
});
// get_graphs() returns the figure's JSON
let graphJSONs = await response.json();
// declare and clear the target area graphs will go in
var target_area = document.getElementById('graph-section');
clearChildren(target_area);
// create the public server graph section
var pub_area = document.createElement('div');
var pub_header = document.createElement('h3');
var pub_graph = JSON.parse(graphJSONs['public']);
pub_header.textContent = "Public Servers";
pub_area.id = 'public-graphs';
// create the private server graph section
var priv_area = document.createElement('div');
var priv_header = document.createElement('h3');
var priv_graph = JSON.parse(graphJSONs['private']);
priv_header.textContent = "Internal Servers";
priv_area.id = 'private-graphs';
// add everything to the page
target_area.append(pub_area); // start w/the divs
target_area.append(priv_area);
Plotly.plot(pub_area.id, pub_graph, {}); // then add the graphs
Plotly.plot(priv_area.id, priv_graph, {});
pub_area.prepend(pub_header); // then add the headers
priv_area.prepend(priv_header);
}
This is a lot of code, but I wanted to show two ways to handle this:
1. Creating a JSON and passing it to the template directly, and
2. Handling it as an API call that responds to fetch requests.
The second option is faster and doesn't require refreshing the entire page every time; the first option just shows how to do it with as little code as possible. Either way, there should be enough here to modify to your needs and for my future reference. (:
I think this is a little more complicated than just calling fig.show(). After a quick look at the Plotly library, I found the package dash_html_components (imported as html); with it you can return HTML containing your chart to put in a web site:
from flask import Flask
app = Flask(__name__)
import dash_core_components as dcc
import dash_html_components as html
import plotly.express as px
import pandas as pd
@app.route('/chart')
def chart():
    df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/...')
    fig = px.line(df, x='Date', y='AAPL.High')
    return html.Div([dcc.Graph(figure=fig)])
I am setting up a web application using Flask and Flask-socketio for Websockets, using eventlet as recommended in the documentation.
The purpose is to create a coding platform where users can subscribe, join the current challenge, and submit their own solution (a piece of code) for each exercise. The solution is then assessed for correctness on the server via test cases: users write a function that must produce a specific output for a given input, and if it doesn't, it scores zero.
The issue is that some server-side computation in executing the uploaded code may take several seconds to complete before returning the output to the client. In this case, the client disconnects upon receiving the data from the server, after the long wait. You can see below some code which reproduces the issue, the time.sleep represents the long server side computation, which will cause the client to disconnect and reconnect.
Note: in the code I put 40 seconds of sleep so that the client disconnects every time. With a shorter sleep (say between 10 and 20 seconds), sometimes it works fine and sometimes it disconnects.
Why is that happening? How can I fix it?
from flask import Flask
from flask_socketio import SocketIO
from flask_socketio import emit, disconnect
import time
import random
flask_app = Flask(__name__)
socketio = SocketIO(flask_app, async_mode='eventlet')
webpage = '''
<html>
<body>
<p id="demo">Some content</p>
<button type="button" onclick="do_on_server()">Do server-side computation</button>
<script src="//cdnjs.cloudflare.com/ajax/libs/socket.io/2.2.0/socket.io.js" integrity="sha256-yr4fRk/GU1ehYJPAs8P4JlTgu0Hdsp4ZKrx8bDEDC3I=" crossorigin="anonymous"></script>
<script type="text/javascript" charset="utf-8">
var socket = io();
function do_on_server(){
socket.emit('do_on_server', {});
}
socket.on('feedback', feedback);
function feedback(data){
document.getElementById("demo").innerHTML = data["msg"];
}
</script>
</body>
</html>
'''
@flask_app.route('/')
def index():
    return webpage
@socketio.on('do_on_server')
def do_on_server(json):
    print('starting computation')
    # long computation
    time.sleep(40)
    print('done computing')
    emit('feedback', {'msg': random.random()})
@socketio.on('connect')
def on_connect():
    print('Connected')
@socketio.on('disconnect')
def on_disconnect():
    print('Disconnecting')
if __name__ == '__main__':
    socketio.run(flask_app)
Console output:
>python socket_timeout.py
Connected
starting computation
done computing
Disconnecting
Connected
I have the following Python script which is using Flask-socketio
from flask import Flask, render_template
from flask_socketio import SocketIO, emit
from time import sleep
app = Flask(__name__)
app.config['SECRET_KEY'] = 'P#ssw0rd'
socketio = SocketIO(app)
@app.route('/')
def index():
    return render_template('index.html')
@socketio.on('connect')
def on_connect():
    payload1 = 'Connected!!!'
    payload2 = 'Doing thing 1'
    payload3 = 'Doing thing 2'
    emit('send_thing', payload1, broadcast=True)
    sleep(2)
    emit('send_thing', payload2, broadcast=True)
    sleep(2)
    emit('send_thing', payload3, broadcast=True)
if __name__ == '__main__':
    socketio.run(app)
And here is the corresponding index.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>SocketIO Python</title>
</head>
<body>
<div id="my-div"></div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/1.4.5/socket.io.js"></script>
<script>
(function init() {
var socket = io()
var divElement = document.getElementById('my-div')
socket.on('send_thing', function(payload) {
var dataElement = document.createElement('inner')
dataElement.innerHTML = payload
divElement.appendChild(dataElement)
})
})()
</script>
</body>
</html>
What I am trying to achieve is that when a client connects, it first says 'Connected!!!' and then 2 seconds later a new 'inner' element appears that says 'Doing thing 1' followed by 2 seconds later a new 'inner' element appears that says 'Doing thing 2' etc.
But what is happening is that when a client connects, it sends all 3 lines at the same time (after 4 seconds which is both sleep statements). This is the first time using SocketIO so I'm sure I've done something wrong.
When you use eventlet or gevent, the time.sleep() function is blocking: it does not allow any other tasks to run while it sleeps.
Three ways to address this problem:
Use socketio.sleep() instead of time.sleep().
Use eventlet.sleep() or gevent.sleep().
Monkey patch the Python standard library so that time.sleep() becomes async-friendly.
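The difference between a blocking and a cooperative sleep can be seen without eventlet installed. The sketch below uses asyncio from the standard library as a stand-in (it is not eventlet or gevent, and Flask-SocketIO is not involved), but the effect is analogous: a blocking sleep holds up every other task, while a cooperative one lets them run. The function names are made up for the illustration.

```python
import asyncio
import time


async def blocking_task():
    time.sleep(0.2)  # blocks the whole event loop


async def cooperative_task():
    await asyncio.sleep(0.2)  # yields so other tasks can run


async def timed(task_factory):
    """Run two copies of a task concurrently and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(task_factory(), task_factory())
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(timed(blocking_task))
cooperative_elapsed = asyncio.run(timed(cooperative_task))
```

The two blocking tasks run back to back and take about twice as long as the two cooperative ones, which overlap.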
I have a CherryPy server running on a BeagleBone Black. Server generates a simple webpage and does local SPI reads / writes (hardware interface). The application is going to be used on a local network with 1-2 clients at a time.
I need to prevent a CherryPy class function from being called again, by one or more other requests, before the first call completes.
Thoughts?
As saaj commented, a simple threading.Lock() will prevent the handler from being run at the same time by another client. I might also add that using cherrypy.session.acquire_lock() will prevent the same client from running two handlers simultaneously.
Refreshing article on Python locks and stuff: http://effbot.org/zone/thread-synchronization.htm
Although I would make saaj's solution much simpler by using a "with" statement in Python, to hide all those fancy lock acquisitions/releases and try/except block.
import threading
import cherrypy
lock = threading.Lock()
@cherrypy.expose
def index(self):
    with lock:
        # do stuff in the handler.
        # this code will only be run by one client at a time
        return '<html></html>'
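To see that the "with" form really serializes the handler, here is a small standalone sketch with plain threads instead of CherryPy. The `handler` function and the `active`/`max_active` counters are hypothetical and exist only to observe how many threads are inside the locked section at once.

```python
import threading
import time

lock = threading.Lock()
active = 0       # handlers currently inside the critical section
max_active = 0   # highest concurrency observed


def handler():
    global active, max_active
    with lock:  # acquired here, released on exit even if an exception is raised
        active += 1
        max_active = max(max_active, active)
        time.sleep(0.05)  # stand-in for the SPI read/write
        active -= 1


threads = [threading.Thread(target=handler) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, `max_active` never goes above 1, no matter how many threads call `handler` at the same time.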
It is a general synchronization question, though the CherryPy side has a subtlety. CherryPy is a threaded server, so it is sufficient to have an application-level lock, e.g. threading.Lock.
The subtlety is that you can't see the run-or-fail behaviour from within a single browser because of pipelining, Keep-Alive or caching. Which one it is is hard to guess as the behaviour varies in Chromium and Firefox. As far as I can see CherryPy will try to serialize processing of request coming from single TCP connection, which effectively results in subsequent requests waiting for active request in a queue. With some trial-and-error I've found that adding cache-prevention token leads to the desired behaviour (even though Chromium still sends Connection: keep-alive for XHR where Firefox does not).
If run-or-fail in single browser isn't important to you you can safely ignore the previous paragraph and JavaScript code in the following example.
Update
The cause of request serialisation coming from one browser to the same URL doesn't lie in server-side. It's an implementation detail of a browser cache (details). Though, the solution of adding random query string parameter, nc, is correct.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import threading
import time
import cherrypy
config = {
    'global' : {
        'server.socket_host' : '127.0.0.1',
        'server.socket_port' : 8080,
        'server.thread_pool' : 8
    }
}
class App:
    lock = threading.Lock()
    @cherrypy.expose
    def index(self):
        return '''<!DOCTYPE html>
<html>
<head>
<title>Lock demo</title>
<script type='text/javascript' src='http://cdnjs.cloudflare.com/ajax/libs/qooxdoo/3.5.1/q.min.js'></script>
<script type='text/javascript'>
function runTask(wait)
{
var url = (wait ? '/runOrWait' : '/runOrFail') + '?nc=' + Date.now();
var xhr = q.io.xhr(url);
xhr.on('loadend', function(xhr)
{
if(xhr.status == 200)
{
console.log('success', xhr.responseText)
}
else if(xhr.status == 503)
{
console.log('busy');
}
});
xhr.send();
}
q.ready(function()
{
q('p a').on('click', function(event)
{
event.preventDefault();
var wait = parseInt(q(event.getTarget()).getData('wait'));
runTask(wait);
});
});
</script>
</head>
<body>
<p><a href='#' data-wait='0'>Run or fail</a></p>
<p><a href='#' data-wait='1'>Run or wait</a></p>
</body>
</html>
'''
    def calculate(self):
        time.sleep(8)
        return 'Long task result'
    @cherrypy.expose
    def runOrWait(self, **kwargs):
        self.lock.acquire()
        try:
            return self.calculate()
        finally:
            self.lock.release()
    @cherrypy.expose
    def runOrFail(self, **kwargs):
        locked = self.lock.acquire(False)
        if not locked:
            raise cherrypy.HTTPError(503, 'Task is already running')
        else:
            try:
                return self.calculate()
            finally:
                self.lock.release()
if __name__ == '__main__':
    cherrypy.quickstart(App(), '/', config)
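The run-or-fail behaviour rests entirely on the non-blocking form of `Lock.acquire`. A stripped-down sketch without CherryPy, where `run_or_fail` and `nested` are hypothetical helpers and a nested call stands in for a second request arriving while the first is still running:

```python
import threading

lock = threading.Lock()


def run_or_fail(task):
    """Run task if the lock is free; otherwise report busy without waiting."""
    if not lock.acquire(blocking=False):
        return "busy"
    try:
        return task()
    finally:
        lock.release()


def nested():
    # Called while the lock is already held, so this attempt must fail.
    return run_or_fail(lambda: "inner result")


outer = run_or_fail(nested)
after = run_or_fail(lambda: "done")
```

Here `outer` ends up as "busy" because the inner attempt could not take the lock, and `after` succeeds because the `finally` block released it. In the CherryPy version the "busy" branch is where the 503 is raised.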