I need to make pytesseract.image_to_string faster

I'm capturing the screen and then reading text from it with Tesseract to turn it into a string. The problem is that it's too slow for what I need: I'm getting about 5.6 fps and I need more like 10-20. (The imports are included in the code below.)
I've tried everything I know and nothing has helped.
import time
import win32api
import win32gui
import pytesseract
from PIL import ImageGrab

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
time.sleep(7)

def getDesiredWindow():
    """Returns the top-left and bottom-right of the desired window."""
    print('Click the top left of the desired region.')
    pt1 = detectClick()
    print('First position set!')
    time.sleep(1)
    print('Click the bottom right of the desired region.')
    pt2 = detectClick()
    print('Got the window!')
    return pt1, pt2

def detectClick():
    """Detects and returns the click position."""
    state_left = win32api.GetKeyState(0x01)
    print("Waiting for click...")
    while True:
        a = win32api.GetKeyState(0x01)
        if a != state_left:  # button state changed
            state_left = a
            if a < 0:
                print('Detected left click')
                return win32gui.GetCursorPos()

def gettext(pt1, pt2):
    """This is the part where I need it to be faster."""
    # From the two input points, define the desired box
    box = (pt1[0], pt1[1], pt2[0], pt2[1])
    image = ImageGrab.grab(box)
    return pytesseract.image_to_string(image)

My solution was to make the image smaller.
This might affect the image_to_string result and make it less accurate, but in my case my images were 1500 px wide and I got a 3x speedup this way.
Try changing basewidth and measuring again:
from PIL import Image

basewidth = 600
img = Image.open('yourimage.png')
# scale the height to preserve the aspect ratio
wpercent = basewidth / float(img.size[0])
hsize = int(float(img.size[1]) * wpercent)
img = img.resize((basewidth, hsize), Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10
img.save('yourimage.png')
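Applied to the capture loop from the question, the same idea looks roughly like the sketch below; the 0.4 scale factor is an assumption you would tune against your own accuracy requirements:

from PIL import Image, ImageGrab
import pytesseract

def gettext_fast(pt1, pt2, scale=0.4):
    # hypothetical variant of the question's gettext(): shrink before OCR,
    # since Tesseract's runtime grows with the number of pixels it scans
    box = (pt1[0], pt1[1], pt2[0], pt2[1])
    image = ImageGrab.grab(box)
    new_size = (max(1, int(image.width * scale)), max(1, int(image.height * scale)))
    image = image.resize(new_size, Image.LANCZOS)
    return pytesseract.image_to_string(image)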


Is noLoop stopping execution of draw?

This is my first post here, so I apologize if I'm making any mistakes.
I recently started studying Processing in Python mode, and I'm trying to write a sketch that, after you select an image from your computer, reads its colors and inserts them into a list. The final goal is to calculate the percentage of certain colors in the image. For this I am using the following code:
img = None
tam = 5
cores_img = []

def setup():
    size(500, 500)
    selectInput(u"Escolha a ilustração para leitura de cores", "adicionar_imagens")
    noLoop()

def adicionar_imagens(selection):
    global img
    if selection == None:
        print(u"Seleção cancelada")
    else:
        print(u"Você selecionou " + selection.getAbsolutePath())
        img = loadImage(selection.getAbsolutePath())

def draw():
    if img is not None:
        image(img, 0, 0)
        for xg in range(0, img.width, tam):
            x = map(xg, 0, img.width, 0, img.width)
            for yg in range(0, img.height, tam):
                y = map(yg, 0, img.height, 0, img.height)
                cor = img.get(int(x), int(y))
                cores_img.append(cor)
        print(cores_img)
I'm using noLoop() so that the colors are added to the list only once. However, draw() does not seem to be running: the setup actions are performed, but when the image is selected, nothing happens, and there is no error message either.
I'm completely lost about what might be happening. If anyone has any ideas and can help, I'd really appreciate it!
Calling noLoop() indeed stops the draw() loop from running, which means that by the time you've selected an image, nothing more will happen.
You can however manually call draw() (or redraw()) once the image is loaded:
img = None
tam = 5
cores_img = []

def setup():
    size(500, 500)
    selectInput(u"Escolha a ilustração para leitura de cores", "adicionar_imagens")
    noLoop()

def adicionar_imagens(selection):
    global img
    if selection == None:
        print(u"Seleção cancelada")
    else:
        print(u"Você selecionou " + selection.getAbsolutePath())
        img = loadImage(selection.getAbsolutePath())
        redraw()

def draw():
    if img is not None:
        image(img, 0, 0)
        for xg in range(0, img.width, tam):
            x = map(xg, 0, img.width, 0, img.width)
            for yg in range(0, img.height, tam):
                y = map(yg, 0, img.height, 0, img.height)
                cor = img.get(int(x), int(y))
                cores_img.append(cor)
        print(cores_img)
You should pay attention to a few details:
- As the reference mentions, calling get() is slow; pixels[x + y * width] is faster (just remember to call loadPixels() if the array doesn't look right). A minimal sketch of this follows the example below.
- PImage already has a pixels array: calling img.resize(img.width / tam, img.height / tam) should downsample the image so you can read that same list.
- x = map(xg, 0, img.width, 0, img.width) (and similarly y) maps from one range to the same range, which has no effect.
e.g.
img = None
tam = 5
cores_img = None

def setup():
    size(500, 500)
    selectInput(u"Escolha a ilustração para leitura de cores", "adicionar_imagens")
    noLoop()

def adicionar_imagens(selection):
    global img, cores_img
    if selection == None:
        print(u"Seleção cancelada")
    else:
        print(u"Você selecionou " + selection.getAbsolutePath())
        img = loadImage(selection.getAbsolutePath())
        print("total pixels", len(img.pixels))
        img.resize(img.width / tam, img.height / tam)
        cores_img = list(img.pixels)
        print("resized pixels", len(img.pixels))
        print(cores_img)

def draw():
    pass
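For the first note above, here is a minimal sketch of reading one pixel through the pixels array instead of get(); it assumes img is an already loaded PImage and x, y are integer coordinates:

img.loadPixels()
cor = img.pixels[x + y * img.width]  # same colour that img.get(x, y) would return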
Update
I thought that calling noLoop on setup would make draw run once. Still
it won't print the image... I'm calling 'image (img, 0, 0)' at the end
of 'else', on 'def adicionar_imagens (selection)'. Should I call it
somewhere else?
Think of adicionar_imagens time-wise: it runs separately from setup() and draw().
You are right that draw() should be called once (because of noLoop()); however, it is called as soon as setup() completes, not later (navigating the file system, selecting a file and confirming takes time).
draw() needs to be forced to run again after the image is loaded.
Here's an updated snippet:
img = None
# optional: potentially useful for debugging
img_resized = None
tam = 5
cores_img = None

def setup():
    size(500, 500)
    selectInput(u"Escolha a ilustração para leitura de cores", "adicionar_imagens")
    noLoop()

def adicionar_imagens(selection):
    global img, img_resized, cores_img
    if selection == None:
        print(u"Seleção cancelada")
    else:
        print(u"Você selecionou " + selection.getAbsolutePath())
        img = loadImage(selection.getAbsolutePath())
        # make a copy of the original image (to keep it intact)
        img_resized = img.get()
        # resize the copy
        img_resized.resize(img.width / tam, img.height / tam)
        # convert the resized pixels array to a python list
        cores_img = list(img_resized.pixels)
        # force redraw
        redraw()
        # print data
        print("total pixels", len(img.pixels))
        print("resized pixels", len(img_resized.pixels))
        # print(cores_img)

def draw():
    print("draw called " + str(millis()) + "ms after sketch started")
    # if an img was selected and loaded, display it
    if img != None:
        image(img, 0, 0)
    # optionally display the resized image
    if img_resized != None:
        image(img_resized, 0, 0)
Here are a couple of notes that may be helpful:
- Each pixel in the list is a packed 32-bit ARGB colour (i.e. all channels are stored in a single value). If you need individual colour channels, remember you have functions like red(), green(), blue() available. (Also, if that gets slow, note that the examples include faster versions using bit shifting and masking; see the sketch after this list.)
- The Histogram example could be helpful. You would need to port it from Java to Python syntax and use 3 histograms (one for each colour channel), but the principle of counting intensities is nicely illustrated.
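As a hedged illustration of that bit-shifting idea (Processing Python mode; c is assumed to be one packed colour taken from cores_img):

c = cores_img[0]
a = (c >> 24) & 0xFF  # alpha
r = (c >> 16) & 0xFF  # red
g = (c >> 8) & 0xFF   # green
b = c & 0xFF          # blue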

Reading a barcode using OpenCV QRCodeDetector

I am trying to use OpenCV on Python3 to create an image with a QR code and read that code back.
Here is some relevant code:
def make_qr_code(self, data):
    qr = qrcode.QRCode(
        version=2,
        error_correction=qrcode.constants.ERROR_CORRECT_H,
        box_size=10,
        border=4,
    )
    qr.add_data(data)
    return numpy.array(qr.make_image().get_image())

# // DEBUG
img = numpy.ones([380, 380, 3]) * 255
index = self.make_qr_code('Hello StackOverflow!')
img[:index.shape[0], :index.shape[1]][index] = [0, 0, 255]
frame = img
# // DEBUG

self.show_image_in_canvas(0, frame)
frame_mono = cv.cvtColor(numpy.uint8(frame), cv.COLOR_BGR2GRAY)
self.show_image_in_canvas(1, frame_mono)

qr_detector = cv.QRCodeDetector()
data, bbox, rectifiedImage = qr_detector.detectAndDecode(frame_mono)
if len(data) > 0:
    print("Decoded Data : {}".format(data))
    self.show_image_in_canvas(2, rectifiedImage)
else:
    print("QR Code not detected")
(the calls to show_image_in_canvas are just for showing the images in my GUI so I can see what is going on).
When inspecting frame and frame_mono visually, they look OK to me.
However, the QR code detector doesn't return anything (it takes the else branch: "QR Code not detected").
There is literally nothing in the frame other than the QR code I just generated. What do I need to configure about cv.QRCodeDetector, or what additional preprocessing do I need to do on my frame, to make it find the QR code?
OP here; I solved the problem by having a good look at the generated QR code and comparing it to codes from other sources.
The problem was not in the detection, but in the generation of the QR codes.
Apparently the array that qrcode.QRCode returns has False (or maybe it was 0 and I assumed it was a boolean) in the grid squares that are part of the code, and True (or non-zero) in the squares that are not.
So when I did img[:index.shape[0], :index.shape[1]][index] = [0, 0, 255] I was actually creating a negative image of the QR code.
When I inverted the index array, the QR code changed from the image on the left to the image on the right, and the detection succeeded.
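A minimal sketch of that inversion, reusing the names from the snippet above (numpy.logical_not is one way to flip the boolean mask):

mask = numpy.logical_not(index)                           # the code squares become True
img[:mask.shape[0], :mask.shape[1]][mask] = [0, 0, 255]   # paint the modules dark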
In addition I decided to switch to the ZBar library because it's much better at detecting these codes under less perfect circumstances (like from a webcam image).
import cv2
import sys

filename = sys.argv[1]
# Or you can give the file directly like this:
# filename = f'images/filename.jpg' where images is the folder of files you are trying to read

# read the QR code image
# if the QR code is not black/white it is better to convert it to grayscale
# zero means grayscale
img = cv2.imread(filename, 0)
img_origin = cv2.imread(filename)

# initialize the cv2 QRCode detector
detector = cv2.QRCodeDetector()

# detect and decode
data, bbox, straight_qrcode = detector.detectAndDecode(img)

# if there is a QR code
if bbox is not None:
    print(f"QRCode data:\n{data}")
    # bbox has shape (1, n, 2) and holds float corner coordinates,
    # so convert them to int before drawing
    n_lines = len(bbox[0])
    bbox1 = bbox.astype(int)  # float to int conversion
    for i in range(n_lines):
        # draw a line between each pair of neighbouring corners
        point1 = tuple(bbox1[0, i])
        point2 = tuple(bbox1[0, (i + 1) % n_lines])
        cv2.line(img_origin, point1, point2, color=(255, 0, 0), thickness=2)
    # display the result (the original image with the bounding box drawn on it)
    cv2.imshow("img", img_origin)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
else:
    print("QR code not detected")
To re-state the accepted answer: the background of the QR code must be white and the foreground must be black. So if the generated QR code has a white foreground, you must invert the colors, e.g.:
import cv2
img = cv2.imread('C:/Users/N/qrcode.jpg')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Invert colors so foreground is black
img_invert = cv2.bitwise_not(img_gray)
cv2.imshow('gray', img_gray)
cv2.imshow('inverted', img_invert)
cv2.waitKey(1)
qr_detector = cv2.QRCodeDetector()
text, _, _ = qr_detector.detectAndDecode(img_invert)
print(text)

How to detect objects which have almost the same color as their background?

original image
image after kmeans clustering
image I get as result
I am working on malaria parasite detection using thick blood smear microscopy images. I have tried to segment the parasite objects, but it is difficult since they have almost the same color as their background. I have used cv2.kmeans() to cluster the parasite and non-parasite pixels.
import csv as csv
import matplotlib.pyplot as plt
import os
import cv2
import numpy as np

def smooth(img):
    dest = cv2.medianBlur(img, 7)
    #dest = cv2.GaussianBlur(img, (7,7), 0)
    return dest

def process(path, img):
    image = cv2.imread(path + img, 1)
    image = smooth(image)
    return image

def kmeans(img, name):
    output = []
    image = img.reshape(img.shape[0] * img.shape[1], 3)
    image = np.float32(image)
    nclusters = 5
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    attempts = 10
    flags = cv2.KMEANS_RANDOM_CENTERS
    compactness, labels, centers = cv2.kmeans(image, nclusters, None, criteria, attempts, flags)
    centers = np.uint8(centers)
    res = centers[labels.flatten()]
    res2 = res.reshape((img.shape))
    cv2.imwrite(dest + name[:-4] + '.png', res2)
    im_color = cv2.imread(dest + name[:-4] + '.png', cv2.IMREAD_COLOR)
    im_gray = cv2.cvtColor(im_color, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(im_gray, thresh=100, maxval=255, type=cv2.THRESH_BINARY_INV)
    mask3 = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)  # 3 channel mask
    im_thresh_color = cv2.bitwise_and(img, mask3)
    cv2.imwrite("C:\\Users\\user\\Desktop\\lbim2\\" + name[:-4] + ".png", im_thresh_color)

def preprocess(path):
    images = []
    j = 0
    print("Median Blur")
    for i in os.listdir(path):
        print(i)
        images.append(process(path, i))
        print(images[j].shape)
        #print(images[1].shape)
        images[j] = kmeans(images[j], i)
        j += 1
        print(i)

dest = '../output1/'
print("Preprocess")
preprocess('../input1/')
But I get an image with all pixel values 0 (a black output).

How to get screenshot and change DPI on the clipboard?

Under Win7 I would like to get the content of a window onto the clipboard, set/adjust the DPI setting of the clipboard image, and paste it into a final application.
The MCVE below is not yet working as desired.
There is an issue:
sometimes the window is apparently not yet brought to the foreground and ImageGrab.grab(bbox) grabs the wrong content. Waiting for some time (2-5 sec) helps, but is not very practical. How can I avoid or work around this?
Here is the code:
from io import BytesIO
from PIL import Image, ImageGrab
import win32gui, win32clipboard
import time

def get_screenshot(window_name, dpi):
    hwnd = win32gui.FindWindow(None, window_name)
    if hwnd != 0:
        win32gui.SetForegroundWindow(hwnd)
        time.sleep(2)  ### sometimes window is not yet in foreground. delay/timing problem???
        bbox = win32gui.GetWindowRect(hwnd)
        screenshot = ImageGrab.grab(bbox)
        width, height = screenshot.size
        lmargin = 9
        tmargin = 70
        rmargin = 9
        bmargin = 36
        screenshot = screenshot.crop(box=(lmargin, tmargin, width - rmargin, height - bmargin))
        win32clipboard.OpenClipboard()
        win32clipboard.EmptyClipboard()
        output = BytesIO()
        screenshot.convert("RGB").save(output, "BMP", dpi=(dpi, dpi))
        data = output.getvalue()[14:]  # drop the 14-byte BMP file header to get a CF_DIB
        output.close()
        win32clipboard.SetClipboardData(win32clipboard.CF_DIB, data)
        win32clipboard.CloseClipboard()
        print("Screenshot taken...")
    else:
        print("No window found named:", window_name)

window_name = "Gnuplot (window id : 0)"
get_screenshot(window_name, 200)
Edit:
This attempted improvement also sometimes gets the wrong content. Maybe somebody can explain why?
win32gui.SetForegroundWindow(hwnd)
for i in range(1000):
    print(i)
    time.sleep(0.01)
    if win32gui.GetForegroundWindow() == hwnd:
        break
bbox = win32gui.GetWindowRect(hwnd)
Addition:
That's what I (typically) get when I remove the time.sleep(2) delay line.
Left: desired content; right: received content. How can I get a reliable capture of the desired window's content? What's wrong with the code? The larger I set the delay, the higher the probability that I get the desired content, but I don't want to wait several seconds to be sure. How can I check whether the system is ready for a screenshot?
As discussed, you can use the approach from Python Screenshot of inactive window PrintWindow + win32gui, shown below:
import win32gui
import win32ui
from ctypes import windll
from PIL import Image

hwnd = win32gui.FindWindow(None, 'Calculator')

# Change the line below depending on whether you want the whole window
# or just the client area.
#left, top, right, bot = win32gui.GetClientRect(hwnd)
left, top, right, bot = win32gui.GetWindowRect(hwnd)
w = right - left
h = bot - top

hwndDC = win32gui.GetWindowDC(hwnd)
mfcDC = win32ui.CreateDCFromHandle(hwndDC)
saveDC = mfcDC.CreateCompatibleDC()

saveBitMap = win32ui.CreateBitmap()
saveBitMap.CreateCompatibleBitmap(mfcDC, w, h)
saveDC.SelectObject(saveBitMap)

# Change the line below depending on whether you want the whole window
# or just the client area.
#result = windll.user32.PrintWindow(hwnd, saveDC.GetSafeHdc(), 1)
result = windll.user32.PrintWindow(hwnd, saveDC.GetSafeHdc(), 0)
print(result)

bmpinfo = saveBitMap.GetInfo()
bmpstr = saveBitMap.GetBitmapBits(True)
im = Image.frombuffer(
    'RGB',
    (bmpinfo['bmWidth'], bmpinfo['bmHeight']),
    bmpstr, 'raw', 'BGRX', 0, 1)

win32gui.DeleteObject(saveBitMap.GetHandle())
saveDC.DeleteDC()
mfcDC.DeleteDC()
win32gui.ReleaseDC(hwnd, hwndDC)

if result == 1:
    # PrintWindow succeeded
    im.save("test.png")
Thanks to @Tarun Lalwani for pointing to this answer, I finally have code which is working for me for the time being. However, it seems quite lengthy to me, with a lot of different modules. Maybe it can still be simplified; suggestions are welcome.
Code:
### get the content of a window and crop it
import win32gui, win32ui, win32clipboard
from io import BytesIO
from ctypes import windll
from PIL import Image
# user input
window_name = 'Gnuplot (window id : 0)'
margins = [8,63,8,31] # left, top, right, bottom
dpi = 96
hwnd = win32gui.FindWindow(None, window_name)
left, top, right, bottom = win32gui.GetWindowRect(hwnd)
width = right - left
height = bottom - top
crop_box = (margins[0],margins[1],width-margins[2],height-margins[3])
hwndDC = win32gui.GetWindowDC(hwnd)
mfcDC = win32ui.CreateDCFromHandle(hwndDC)
saveDC = mfcDC.CreateCompatibleDC()
saveBitMap = win32ui.CreateBitmap()
saveBitMap.CreateCompatibleBitmap(mfcDC, width, height)
saveDC.SelectObject(saveBitMap)
result = windll.user32.PrintWindow(hwnd, saveDC.GetSafeHdc(), 0)
bmpinfo = saveBitMap.GetInfo()
bmpstr = saveBitMap.GetBitmapBits(True)
im = Image.frombuffer('RGB', (bmpinfo['bmWidth'], bmpinfo['bmHeight']),
                      bmpstr, 'raw', 'BGRX', 0, 1).crop(crop_box)
win32clipboard.OpenClipboard()
win32clipboard.EmptyClipboard()
output = BytesIO()
im.convert("RGB").save(output, "BMP", dpi=(dpi,dpi))
data = output.getvalue()[14:]  # drop the 14-byte BMP file header to get a CF_DIB
output.close()
win32clipboard.SetClipboardData(win32clipboard.CF_DIB, data)
win32clipboard.CloseClipboard()
win32gui.DeleteObject(saveBitMap.GetHandle())
saveDC.DeleteDC()
mfcDC.DeleteDC()
win32gui.ReleaseDC(hwnd, hwndDC)
print('"'+window_name+'"', "is now on the clipboard with", dpi, "dpi.")
### end of code

Connecting slider to Graphics View in PyQt

I'm trying to display image data read from a binary file (I have the code written for retrieving this data and storing it as an image for use with QImage()). What I would like to do is connect a slider to a Graphics View widget so that when you move the slider, it moves through the frames and displays the image from that frame (these are echograms ranging from 1 to 500 frames in length). I'm very new to PyQt and was curious how one might even begin doing this.
from PyQt4.QtCore import *
from PyQt4.QtGui import *
import numpy as np

class FileHeader(object):
    fileheader_fields = ("filetype","fileversion","numframes","framerate","resolution","numbeams","samplerate","samplesperchannel","receivergain","windowstart","winlengthsindex","reverse","serialnumber","date","idstring","ID1","ID2","ID3","ID4","framestart","frameend","timelapse","recordInterval","radioseconds","frameinterval","userassigned")
    fileheader_formats = ('S3','B','i4','i4','i4','i4','f','i4','i4','i4','i4','i4','i4','S32','S256','i4','i4','i4','i4','i4','i4','i4','i4','i4','i4','S136')

    def __init__(self, filename, parent=None):
        a = QApplication([])
        filename = str(QFileDialog.getOpenFileName(None, "open file", "C:/vprice/DIDSON/DIDSON Data", "*.ddf"))
        self.infile = open(filename, 'rb')
        dtype = dict(names=self.fileheader_fields, formats=self.fileheader_formats)
        self.fileheader = np.fromfile(self.infile, dtype=dtype, count=1)
        self.fileheader_length = self.infile.tell()
        for field in self.fileheader_fields:
            setattr(self, field, self.fileheader[field])

    def get_frame_first(self):
        frame = Frame(self.infile)
        print self.fileheader
        self.infile.seek(self.fileheader_length)
        print frame.frameheader
        print frame.data

    def __iter__(self):
        self.infile.seek(self.fileheader_length)
        for _ in range(self.numframes):
            yield Frame(self.infile)

    #def close(self):
    #    self.infile.close()

    def display(self):
        print self.fileheader

class Frame(object):
    frameheader_fields = ("framenumber","frametime","version","status","year","month","day","hour","minute","second","hsecond","transmit","windowstart","index","threshold","intensity","receivergain","degc1","degc2","humidity","focus","battery","status1","status2","velocity","depth","altitude","pitch","pitchrate","roll","rollrate","heading","headingrate","sonarpan","sonartilt","sonarroll","latitude","longitude","sonarposition","configflags","userassigned")
    frameheader_formats = ("i4","2i4","S4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","i4","S16","S16","f","f","f","f","f","f","f","f","f","f","f","f","f8","f8","f","i4","S60")
    data_format = "uint8"

    def __init__(self, infile):
        dtype = dict(names=self.frameheader_fields, formats=self.frameheader_formats)
        self.frameheader = np.fromfile(infile, dtype=dtype, count=1)
        for field in self.frameheader_fields:
            setattr(self, field, self.frameheader[field])
        ncols, nrows = 96, 512
        self.data = np.fromfile(infile, self.data_format, count=ncols*nrows)
        self.data = self.data.reshape((nrows, ncols))

class QEchogram():
    def __init__(self):
        self.__colorTable = []
        self.colorTable = None
        self.threshold = [50, 255]
        self.painter = None
        self.image = None

    def echogram(self):
        fileheader = FileHeader(self)
        frame = Frame(fileheader.infile)
        echoData = frame.data
        #fileName = fileName
        # define the size of the data (and resulting image)
        self.size = [echoData.shape[0], echoData.shape[1]]
        #size = [96, 512]
        # create a color table for our image
        # first define the colors as RGB triplets
        colorTable = [(255,255,255),
                      (159,159,159),
                      (95,95,95),
                      (0,0,255),
                      (0,0,127),
                      (0,191,0),
                      (0,127,0),
                      (255,255,0),
                      (255,127,0),
                      (255,0,191),
                      (255,0,0),
                      (166,83,60),
                      (120,60,40),
                      (200,200,200)]
        # then create a color table for Qt - this encodes the color table
        # into a list of 32bit integers (4 bytes) where each byte is the
        # red, green, blue and alpha 8 bit values. In this case we don't
        # set alpha so it defaults to 255 (opaque)
        ctLength = len(colorTable)
        self.__ctLength = ctLength
        __colorTable = []
        for c in colorTable:
            __colorTable.append(QColor(c[0], c[1], c[2]).rgb())
        # scale the data into the color table's range and clamp it
        echoData = np.round((echoData - self.threshold[0]) * (float(self.__ctLength) / (self.threshold[1] - self.threshold[0])))
        echoData[echoData < 0] = 0
        echoData[echoData > self.__ctLength - 1] = self.__ctLength - 1
        echoData = echoData.astype(np.uint8)
        self.data = echoData
        # create an image from our numpy data
        image = QImage(echoData.data, echoData.shape[1], echoData.shape[0], echoData.shape[1],
                       QImage.Format_Indexed8)
        image.setColorTable(__colorTable)
        # convert to ARGB
        image = image.convertToFormat(QImage.Format_ARGB32)
        # save the image to file
        image.save(fileName)
        self.image = QImage(self.size[0], self.size[1], QImage.Format_ARGB32)
        self.painter = QPainter(self.image)
        self.painter.drawImage(QRect(0.0, 0.0, self.size[0], self.size[1]), image)

    def getImage(self):
        self.painter.end()
        return self.image

    def getPixmap(self):
        self.painter.end()
        return QPixmap.fromImage(self.image)

if __name__ == "__main__":
    data = QEchogram()
    fileName = "horizontal.png"
    data.echogram()
    dataH = data.data
    print "Horizontal data", dataH
I could give you a more specific answer if you showed what you have tried so far, but for now I will make assumptions and give you an example.
First, create a QSlider and set its minimum/maximum to the range of images you have available. When you slide it, the sliderMoved signal fires and tells you the new value.
Next, you can create a list containing all of your QPixmap images ahead of time. If the images are huge and you are concerned about memory, you might have to create them on demand using your already coded approach (a sketch of that follows below), but to keep the example simple we will assume you can put them in a list.
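For the on-demand case, a hedged sketch of what that could look like; frame_to_qimage() is a hypothetical stand-in for your existing QImage conversion code:

class PixmapCache(object):
    """Creates QPixmaps lazily from Frame objects and caches them."""
    def __init__(self, frames):
        self.frames = frames  # list of Frame objects from your reader
        self._cache = {}

    def pixmap(self, index):
        if index not in self._cache:
            qimage = frame_to_qimage(self.frames[index])  # hypothetical helper
            self._cache[index] = QtGui.QPixmap.fromImage(qimage)
        return self._cache[index]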
Then you create your QGraphics setup, using a single QGraphicsPixmapItem. This item can have its pixmap replaced on demand.
Putting it all together, you get something like this:
from PyQt4 import QtCore, QtGui

class Widget(QtGui.QWidget):

    def __init__(self, parent=None):
        super(Widget, self).__init__(parent)
        self.resize(640, 480)
        self.layout = QtGui.QVBoxLayout(self)

        self.scene = QtGui.QGraphicsScene(self)
        self.view = QtGui.QGraphicsView(self.scene)
        self.layout.addWidget(self.view)

        self.image = QtGui.QGraphicsPixmapItem()
        self.scene.addItem(self.image)
        self.view.centerOn(self.image)

        self._images = [
            QtGui.QPixmap('Smiley.png'),
            QtGui.QPixmap('Smiley2.png')
        ]

        self.slider = QtGui.QSlider(self)
        self.slider.setOrientation(QtCore.Qt.Horizontal)
        self.slider.setMinimum(0)
        # max is the last index of the image list
        self.slider.setMaximum(len(self._images) - 1)
        self.layout.addWidget(self.slider)

        # set it to the first image, if you want.
        self.sliderMoved(0)
        self.slider.sliderMoved.connect(self.sliderMoved)

    def sliderMoved(self, val):
        print "Slider moved to:", val
        try:
            self.image.setPixmap(self._images[val])
        except IndexError:
            print "Error: No image at index", val

if __name__ == "__main__":
    app = QtGui.QApplication([])
    w = Widget()
    w.show()
    w.raise_()
    app.exec_()
You can see that we set the range of the slider to match your image list; at any time, you can change this range if the contents of your image list change. When sliderMoved fires, it uses the value as an index into the image list and sets the pixmap.
I also added a check to our sliderMoved() slot just in case your slider range gets out of sync with your image list: if you slide to an index that doesn't exist in the list, it fails gracefully and leaves the existing image.
A lot of the work you are doing (converting image data to QImage, displaying frames with a slider) might be solved better using a library written for this purpose. There are a couple of libraries I can think of that work with PyQt and provide everything you need:
guiqwt
pyqtgraph
(disclaimer: shameless plug)
If you can collect all of the image data into a single 3D numpy array, the code for displaying this in pyqtgraph looks like:
import pyqtgraph as pg
pg.image(imageData)
This would give you a zoomable image display with frame slider and color lookup table controls.
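For example, a hedged sketch of feeding the Frame data from the question into pyqtgraph; it assumes FileHeader iterates over frames as in the question's code and that a Qt event loop is running:

import numpy as np
import pyqtgraph as pg

frames = np.stack([frame.data for frame in FileHeader('echogram.ddf')])  # (numframes, 512, 96)
pg.image(frames)  # for 3D data, pyqtgraph adds a frame slider automatically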
