Context: I'm trying to create a simple image processing application using Plotly Dash. It has to be able to display images and execute some operations on an image while updating the image's display. The image will either be generated on the fly or uploaded to the application.
The images are in the numpy ndarray format, because my processing is based on numpy and matplotlib operations that use that format.
Question: What Python code allows displaying an ndarray while being able to update it through some GUI operation?
I'm basically searching for the closest thing that Dash can offer to a matplotlib.pyplot.imshow().
Research: In my research, I've found this repository, which probably incorporates all the basic features I need to get started, but as a beginner in Plotly Dash I struggle to extract the code I need, and it doesn't seem to use numpy either.
I've found this question, which is very close to what I'm asking, but the only answer doesn't incorporate numpy arrays.
I've found a solution, with flask.Response and the src parameter of html.Img, and OpenCV for the image encoding:
import time

import cv2
import numpy as np
import dash
import dash_html_components as html
from flask import Flask, Response


def get_image(seed=0):
    """Generate a diagonal stripe pattern, shifted by `seed`, as JPEG bytes."""
    size = 400
    res = np.mod((np.arange(size)[..., None] + np.arange(size)[None, ...]) + seed, 255)
    # cv2.imencode expects an 8-bit image
    ret, jpeg = cv2.imencode('.jpg', res.astype(np.uint8))
    return jpeg.tobytes()


def gen():
    """Yield an endless multipart JPEG stream, one frame every ~33 ms (~30 fps)."""
    i = 0
    while True:
        time.sleep(0.03333)
        frame = get_image(i)
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')
        i += 1


server = Flask(__name__)
app = dash.Dash(__name__, server=server)


@server.route('/video_feed')
def video_feed():
    return Response(gen(),
                    mimetype='multipart/x-mixed-replace; boundary=frame')


app.layout = html.Div([
    html.H1("Webcam Test"),
    html.Img(src="/video_feed")
])

if __name__ == '__main__':
    app.run_server(debug=True)
Result:
In the browser, the stripes move slowly; I guess a GIF would've made a better demonstration.
Be careful not to send too many images per second, or you'll overwhelm the application and the browser and they'll eventually crash. So use time.sleep or an equivalent rate limiter in the generator loop.
Though, I'm still interested in how other people would do this. One drawback of this solution is that, I think, all users would have to share the same display, defined at the path /video_feed, as shown in the sketch below.
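An alternative that avoids the shared endpoint would be to serve each image as a base64 data URI from a regular Dash callback, so every client gets its own image state. Here is a minimal sketch, assuming Pillow for the PNG encoding; the button, the component IDs, and the shift-by-10 update are placeholders for whatever GUI operation you have in mind:

import base64
import io

import numpy as np
from PIL import Image
import dash
import dash_html_components as html
from dash.dependencies import Input, Output


def ndarray_to_data_uri(arr):
    # Encode a uint8 ndarray as a base64 PNG data URI usable as an <img> src
    buf = io.BytesIO()
    Image.fromarray(arr.astype(np.uint8)).save(buf, format='PNG')
    return 'data:image/png;base64,' + base64.b64encode(buf.getvalue()).decode('ascii')


app = dash.Dash(__name__)
app.layout = html.Div([
    html.Button('Shift stripes', id='shift', n_clicks=0),
    html.Img(id='display'),
])


@app.callback(Output('display', 'src'), [Input('shift', 'n_clicks')])
def update_image(n_clicks):
    # Regenerate the ndarray on every click and re-encode it for display
    size = 400
    arr = np.mod(np.arange(size)[..., None] + np.arange(size)[None, ...] + 10 * n_clicks, 255)
    return ndarray_to_data_uri(arr)


if __name__ == '__main__':
    app.run_server(debug=True)

Since the array travels inside the callback response rather than through a shared route, each browser session only sees its own image.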
Related
I have created a function which takes the LaTeX and the description of a question and renders it as an image. What I want is to split the text so that I can display it properly on a website.
from io import BytesIO

import textwrap as tw
import matplotlib.pyplot as plt


def render_latex(formula, fontsize=12, dpi=300, format_='svg'):
    """Renders a LaTeX formula into an image."""
    note1_txt = 'This is sample question to check the new line break for latex, which is not working as expected.'
    fig = plt.figure(figsize=(0.01, 0.10))
    note1_txt += tw.fill(tw.dedent(formula.rstrip()), width=60)
    fig.text(0, 0, u'${}$'.format(note1_txt), fontsize=fontsize)
    buffer_ = BytesIO()
    fig.savefig(buffer_, dpi=dpi, transparent=True, format=format_,
                bbox_inches='tight', pad_inches=2)
    plt.close(fig)
    return buffer_.getvalue()


if __name__ == '__main__':
    image_bytes = render_latex(
        r'\theta=\theta+C(1+\theta-\beta)\sqrt{1-\theta}succ_mul \theta=\theta+C(1+\theta-\beta)\sqrt{1-\theta}succ_mul \theta=\theta+C(1+\theta-\beta)\sqrt{1-\theta}succ_mul',
        fontsize=10, dpi=200, format_='png')
    with open('formula.png', 'wb') as image_file:
        image_file.write(image_bytes)
After executing, I get an image in which the line breaks are not applied.
I want to format it across multiple lines without losing the LaTeX formatting. I have tried reducing the width; with a larger width it renders properly, but that is not my case: for the mobile view I render it as an image. Please help me sort this out.
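For what it's worth, matplotlib's mathtext does not honor newline characters inside a single $...$ block, which would explain the behavior. A sketch of a workaround is to wrap the text first and give each wrapped line its own $...$ segment; note this is an assumption-laden illustration, and naive wrapping at spaces can split a LaTeX command in half, so a real implementation may need to wrap only at safe delimiters:

from io import BytesIO

import textwrap as tw
import matplotlib.pyplot as plt


def render_latex_multiline(formula, fontsize=12, dpi=200):
    # Wrap at spaces, then make each wrapped line its own mathtext segment,
    # since '\n' inside one $...$ block is not rendered as a line break
    lines = tw.wrap(formula.strip(), width=60)
    text = '\n'.join('${}$'.format(line) for line in lines)
    fig = plt.figure(figsize=(0.01, 0.01))
    fig.text(0, 0, text, fontsize=fontsize)
    buffer_ = BytesIO()
    fig.savefig(buffer_, dpi=dpi, transparent=True, format='png',
                bbox_inches='tight', pad_inches=0.1)
    plt.close(fig)
    return buffer_.getvalue()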
If we look at the previews of the environments, they show the episode number increasing in the animation in the bottom right corner (https://gym.openai.com/envs/CartPole-v1/). Is there a command to explicitly show that?
I don't think there is a command to do that directly available in OpenAI Gym, but I've written some code that you can probably adapt to your purposes. This is the end result:
This is how I achieved it:
1. For each step, you obtain the frame with env.render(mode='rgb_array').
2. You convert the frame (which is a numpy array) into a PIL image.
3. You write the episode number on top of the PIL image using utilities from PIL.ImageDraw (see the function _label_with_episode_number in the code snippet below).
4. You save the labeled image into a list of frames.
5. You render the list of frames as a GIF using imageio.
Here is the code I wrote for obtaining a GIF of the behavior of a random agent with the Episode number displayed in the top left corner of each frame:
import os

import gym
import imageio
import numpy as np
from PIL import Image
import PIL.ImageDraw as ImageDraw


def _label_with_episode_number(frame, episode_num):
    """Draw 'Episode: N' in the top left corner of an RGB frame."""
    im = Image.fromarray(frame)
    drawer = ImageDraw.Draw(im)
    # Use white text on dark frames and black text on light ones
    if np.mean(im) < 128:
        text_color = (255, 255, 255)
    else:
        text_color = (0, 0, 0)
    drawer.text((im.size[0] / 20, im.size[1] / 18),
                f'Episode: {episode_num + 1}', fill=text_color)
    return im


def save_random_agent_gif(env):
    frames = []
    for i in range(5):
        state = env.reset()
        for t in range(500):
            action = env.action_space.sample()
            frame = env.render(mode='rgb_array')
            frames.append(_label_with_episode_number(frame, episode_num=i))
            state, _, done, _ = env.step(action)
            if done:
                break
    env.close()
    os.makedirs('./videos/', exist_ok=True)
    imageio.mimwrite(os.path.join('./videos/', 'random_agent.gif'), frames, fps=60)


env = gym.make('CartPole-v1')
save_random_agent_gif(env)
You can find a working version of the code here: https://github.com/RishabhMalviya/dqn_experiments/blob/master/train_and_visualize.py#L10
import time

# importing OpenCV
import cv2
import numpy as np
import pytesseract
from PIL import ImageGrab

bboxes = [(1469, 1014, 1495, 1029)]


def imToString():
    # Path of the Tesseract executable
    pytesseract.pytesseract.tesseract_cmd = r'D:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
    while True:
        for box in bboxes:
            # ImageGrab captures the screen in a loop;
            # bbox is used to capture a specific area
            cap = ImageGrab.grab(bbox=box)
            # Convert the image to monochrome so it is easier for the OCR
            # to read (cv2.cvtColor takes a numpy ndarray as an argument),
            # then obtain the output string
            tesstr = pytesseract.image_to_string(
                cv2.cvtColor(np.array(cap), cv2.COLOR_BGR2GRAY),
                lang='eng', config='digits')
            cap.show()
            time.sleep(5)
            print(tesstr)


# Calling the function
imToString()
It captures an image like this:
It isn't always two digits; it can be one or three digits too.
pytesseract returns values like asi and oli.
So, which image-to-text (OCR) algorithm should I use for this problem, and how do I use it? I need a very precise value; in this example it's 53, so the output should be around 50.
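As a general note, Tesseract tends to struggle on tiny, low-resolution crops like this one, so preprocessing often matters more than the choice of engine. Below is a sketch of a preprocessing pipeline (upscale, grayscale, binarize, digit whitelist) that might help; the scale factor and page-segmentation mode are assumptions to tune, not a definitive recipe:

import cv2
import numpy as np
import pytesseract


def read_digits(pil_image):
    """Upscale, binarize, and OCR a small screen grab, keeping digits only."""
    img = cv2.cvtColor(np.array(pil_image), cv2.COLOR_RGB2GRAY)
    # Enlarge the crop so the glyphs are big enough for Tesseract
    img = cv2.resize(img, None, fx=4, fy=4, interpolation=cv2.INTER_CUBIC)
    # Otsu thresholding gives a clean black-and-white image
    _, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # psm 7 treats the input as a single text line; the whitelist keeps digits only
    config = '--psm 7 -c tessedit_char_whitelist=0123456789'
    return pytesseract.image_to_string(img, config=config).strip()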
I am trying to insert a saved PDF image into a ReportLab flowable.
I have seen several answers to similar questions, and many involve using PyPDF2 like this:
import PyPDF2

input1 = PyPDF2.PdfFileReader(open(path + "image.pdf", "rb"))
page0 = input1.getPage(0)
xObject = page0['/Resources']['/XObject'].getObject()

for obj in xObject:
    # Do something here
    pass
The trouble I'm having is with a sample image I've saved from Matplotlib as a PDF. When I try to access that saved image with the code above, it returns nothing under page0['/Resources']['/XObject'].
In fact, here's what I see when I look at page0 and /XObject:
'/XObject': {}
Here's the code I used to generate the PDF:
import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
plt.rcdefaults()
fig, ax = plt.subplots()
# Example data
people = ('Tom', 'Dick', 'Harry', 'Slim', 'Jim')
y_pos = np.arange(len(people))
performance = 3 + 10 * np.random.rand(len(people))
error = np.random.rand(len(people))
ax.barh(y_pos, performance, xerr=error, align='center',
color='green', ecolor='black')
ax.set_yticks(y_pos)
ax.set_yticklabels(people)
ax.invert_yaxis() # labels read top-to-bottom
ax.set_xlabel('Performance')
ax.set_title('How fast do you want to go today?')
plt.savefig(path+'image.pdf',bbox_inches='tight')
Thanks in advance!
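Worth noting: matplotlib's PDF backend writes the chart as vector drawing operations in the page's content stream, so unless the figure contains raster images there is nothing under /XObject to extract, which matches the empty dictionary above. If the goal is just to get the chart into a ReportLab flowable, one workaround sketch is to save the figure as a PNG in memory and hand it to a platypus Image (this assumes Pillow is installed so ReportLab can read the in-memory buffer; the output file name and sizes are placeholders):

from io import BytesIO

import matplotlib.pyplot as plt
from reportlab.lib.units import inch
from reportlab.platypus import Image, SimpleDocTemplate

# A small stand-in figure; build your real chart the same way
fig, ax = plt.subplots()
ax.barh(['Tom', 'Dick'], [3, 7], color='green')

# Render the figure to an in-memory PNG instead of a PDF on disk
buf = BytesIO()
fig.savefig(buf, format='png', dpi=200, bbox_inches='tight')
buf.seek(0)

# Embed the PNG as a flowable in a ReportLab document
doc = SimpleDocTemplate('report.pdf')
doc.build([Image(buf, width=5 * inch, height=3 * inch)])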
I am currently trying to use librosa to perform an STFT, such that the parameters resemble the STFT process from a different framework (Kaldi).
The audio file is fash-b-an251.
Kaldi does it using a sample frequency of 16 kHz, window_size = 400 (25 ms), hop_length = 160 (10 ms).
The spectrogram extracted from this looks like this:
I then tried to do the same using librosa:
import os
import sys

import numpy as np
import scipy
import librosa
import librosa.display
import matplotlib.pyplot as plt
from matplotlib import cm

# Input parameter: relative path to the audio file
if len(sys.argv) < 2:
    print("Missing Arguments!")
    print("python spectogram_librosa.py path_to_audio_file")
    sys.exit()

path = sys.argv[1]
abs_path = os.path.abspath(path)
spectogram_dnn = "/home/user/dnn/spectogram"
if not os.path.exists(spectogram_dnn):
    print("spectogram_dnn folder didn't exist!")
    os.makedirs(spectogram_dnn)
    print("Created!")

y, sr = librosa.load(abs_path, sr=16000)
# Note: logamplitude was renamed to amplitude_to_db in later librosa versions
D = librosa.logamplitude(
    np.abs(librosa.core.stft(y, win_length=400, hop_length=160,
                             window=scipy.signal.hanning, center=False)),
    ref_power=np.max)
librosa.display.specshow(D, sr=16000, hop_length=160,
                         x_axis='time', y_axis='log', cmap=cm.jet)
plt.colorbar(format='%+2.0f dB')
plt.title('Log power spectrogram')
plt.show()
input()
sys.exit()
This is basically taken from here:
I've modified the STFT call so that it fits my parameters.
The problem is that it creates an entirely different plot.
So, what am I doing wrong in librosa? Why is this plot so different from the one created in Kaldi?
Am I missing something?
It has to do with the Hz scale. The frequency axis in the first image is linear, while in the second image it is logarithmic. You can fix it by changing the scale in either of the images to match the other.
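Concretely, assuming the Kaldi plot is the linear one, changing y_axis in the specshow call from 'log' to 'linear' should make the librosa plot match:

# Same D, sr, and hop_length as above, displayed on a linear frequency axis
librosa.display.specshow(D, sr=16000, hop_length=160,
                         x_axis='time', y_axis='linear', cmap=cm.jet)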