Why does this code slow down? Graphics.py? (python-3.x)

I have some code that reads a small BMP (128x96) file and puts the RGB values into a list.
I then run a nested loop and read the RGB values in reverse from the list and draw them on the screen.
It starts quite quickly and draws the first 20 lines in a second, but progressively slows down to such an extent that I've never seen it finish. It's only a small 128x96 image.
I feel it's the calls to the graphics.py library, but why? Or is it something else?
I'm running this on a Raspberry Pi, if that's of use. Python 3.4.2.
If you're interested in trying it, you can find the supporting files here: https://drive.google.com/open?id=1yM9Vn1Nugnu79l1UNShamEAGd2VWF3T4
(It contains the graphics.py library I'm using and the tiny BMP file, plus the actual file in question, called SlowDownWhy.py.)
import math
import sys
from graphics import *
from PIL import Image

# Initialise vars for image width and height
iw = 0
ih = 0
img = Image.open("ani1.bmp", "r")  # Open image
iw, ih = img.size  # Set image width and height
ch = int(1000 / ih)  # Cube height
cw = ch  # Cube width
win = GraphWin("My Window", iw * cw, ih * ch)
win.setBackground(color_rgb(128, 128, 128))

# Transfer bitmap RGB values to a flat list - 'RGBlist'
pix_val = list(img.getdata())
RGBlist = [x for sets in pix_val for x in sets]
noe = (iw * ih * 3) - 3
x = iw
y = ih
for vy in range(ih):
    y = y - 1
    x = iw
    for vx in range(iw):
        x = x - 1
        r = RGBlist[noe]
        g = RGBlist[noe + 1]
        b = RGBlist[noe + 2]
        noe = noe - 3
        cx = x * cw
        cy = y * ch
        aPoint = Rectangle(Point(cx, cy), Point(cx + cw, cy + ch))
        aPoint.setFill(color_rgb(r, g, b))
        aPoint.draw(win)
It should create a window no bigger than 1000 pixels in height and start drawing the picture from the bottom right to the top left, line by line, but it slows down progressively.

Ignoring the invalid syntax, this is simply because of the way graphics.py is programmed: it is not designed to handle this many objects on the screen. (It uses tkinter in the back end, which slows down with 128*96 = 12,288 objects.) For rendering images, you should either integrate them directly or use another library, for example pygame.
To integrate it into the graphics.py program, there is the Image class, which you overwrote with the PIL.Image library (this is the reason why you never do import *). Look here: Importing custom images into graphics.py
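As a hedged illustration of the pygame route: the sketch below loads the same BMP once and blits it as a single scaled surface instead of creating 12,288 rectangle objects. The file name and the "no taller than 1000 pixels" rule come from the question; everything else is an assumption.

import pygame

pygame.init()
img = pygame.image.load("ani1.bmp")  # pygame reads BMP natively
scale = 1000 // img.get_height()     # keep the window under 1000px tall
surf = pygame.transform.scale(img, (img.get_width() * scale, img.get_height() * scale))
screen = pygame.display.set_mode(surf.get_size())
screen.blit(surf, (0, 0))            # one blit, no per-pixel objects
pygame.display.flip()

# keep the window open until it is closed
running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
pygame.quit()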

Related

How to detect an object in an image rather than screen with pyautogui?

I am using the pyautogui.locateOnScreen() function to locate elements in Chrome, get their x, y coordinates, and click them. But at some point I need to take a screenshot of a part of the screen and search for the object I want in this screenshot, then get its coordinates. Is it possible to do this with pyautogui?
My example code:
coord_one = pyautogui.locateOnScreen("first_image.png",confidence=0.95)
scshoot = pyautogui.screenshot(region=coord_one)
coord_two = # search second image in scshoot and if it can be detected get coordinates of it.
If it is not possible with pyautogui, can you advise the easiest, smartest way?
Thanks in advance.
I don't believe there is a built-in direct way to do what you need, but the python-opencv library does the job.
The following code sample assumes you have a screen capture you just took, "capture.png", and you want to find "logo.png" in that capture, which you know is a subsection of "capture.png".
Minimal example
"""Get bounding box of cropped image from original image."""
import cv2 as cv
import numpy as np
img_rgb = cv.imread(r'res/original.png')
# the cropped image, expected to be smaller
target_img = cv.imread(r'res/crop.png')
_, w, h = target_img.shape[::-1]
res = cv.matchTemplate(img_rgb,target_img,cv.TM_CCOEFF_NORMED)
# with the method used, the data in res are top-left pixel coords
min_val, max_val, min_loc, max_loc = cv.minMaxLoc(res)
top_left = max_loc
# if we add to it the width and height of the target, then we get the bbox.
bottom_right = (top_left[0] + w, top_left[1] + h)
cv.rectangle(img_rgb,top_left, bottom_right, 255, 2)
cv.imshow('', img_rgb)
cv.waitKey(0)  # needed so the window actually renders and stays open
MatchTemplate
From the docs, MatchTemplate "simply slides the template image over the input image (as in 2D convolution) and compares the template and patch of input image under the template image." Under the hood, this offers methods such as square difference to compare the images represented as arrays.
See more
For a more in-depth explanation, check the OpenCV docs, as the code is based entirely on their example.
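To tie this back to pyautogui, a hedged sketch of the glue code might look like the following; the file name second_image.png and the 0.95 threshold are assumptions carried over from the question's naming.

import cv2 as cv
import numpy as np
import pyautogui

coord_one = pyautogui.locateOnScreen("first_image.png", confidence=0.95)

# pyautogui.screenshot() returns a PIL image in RGB order;
# convert it to the BGR numpy array OpenCV expects
shot = pyautogui.screenshot(region=coord_one)
img = cv.cvtColor(np.array(shot), cv.COLOR_RGB2BGR)

template = cv.imread("second_image.png")
res = cv.matchTemplate(img, template, cv.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv.minMaxLoc(res)
if max_val > 0.95:        # treat anything below the threshold as "not found"
    x, y = max_loc        # top-left corner, relative to the screenshot region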

Create new raster (.tif) from standard deviation stretched bands, works with dstack but not to write a new file, Python

I am sorry if the title is unclear; I am new to Python and my vocabulary is limited.
What I am trying to do is apply a standard deviation stretch to each band in a .tif raster and then create a new raster (.tif) by stacking those bands using GDAL (Python).
I am able to create new false-color rasters with differing band combinations and save them, and I am able to create my desired image in Python using dstack (first block of code), but I am unable to save that image as a georectified .tif file.
So to create the stretched image using dstack my code looks like:
import os
import numpy as np
import matplotlib.pyplot as plt
import math
from osgeo import gdal

# code from my prof
def std_stretch_data(data, n=2):
    """Applies an n-standard-deviation stretch to data."""
    # Get the mean and n standard deviations.
    mean, d = data.mean(), data.std() * n
    # Calculate new min and max as integers. Make sure the min isn't
    # smaller than the real min value, and the max isn't larger than
    # the real max value.
    new_min = math.floor(max(mean - d, data.min()))
    new_max = math.ceil(min(mean + d, data.max()))
    # Convert any values smaller than new_min to new_min, and any
    # values larger than new_max to new_max.
    data = np.clip(data, new_min, new_max)
    # Scale the data.
    data = (data - data.min()) / (new_max - new_min)
    return data

# open the raster
img = gdal.Open(r'/Users/Rebekah/ThesisData/TestImages/OG/OG_1234.tif')
# open the bands
red = img.GetRasterBand(1).ReadAsArray()
green = img.GetRasterBand(2).ReadAsArray()
blue = img.GetRasterBand(3).ReadAsArray()
# create alpha band where a 0 indicates a transparent pixel and 1 an opaque pixel
# (this is from class and I don't FULLY understand it)
alpha = np.where(red + green + blue == 0, 0, 1).astype(np.byte)
red_stretched = std_stretch_data(red, 1)
green_stretched = std_stretch_data(green, 1)
blue_stretched = std_stretch_data(blue, 1)
data_stretched = np.dstack((red_stretched, green_stretched, blue_stretched, alpha))
plt.imshow(data_stretched)
plt.show()
And that gives me a beautiful image of exactly what I want in a separate window. But nowhere in that code is an option to assign a projection or save it as a multiband .tif.
So I took that and applied it as best I could to the code I use to create false-color images, and it fails (code below). If I create a 4-band .tif with the alpha band, the output is an empty .tif, and if I create a 3-band .tif and omit the alpha band, the output is an entirely black .tif.
import os
import numpy as np
import matplotlib.pyplot as plt
import math
from osgeo import gdal

# code from my professor
def std_stretch_data(data, n=2):
    """Applies an n-standard-deviation stretch to data."""
    # Get the mean and n standard deviations.
    mean, d = data.mean(), data.std() * n
    # Calculate new min and max as integers. Make sure the min isn't
    # smaller than the real min value, and the max isn't larger than
    # the real max value.
    new_min = math.floor(max(mean - d, data.min()))
    new_max = math.ceil(min(mean + d, data.max()))
    # Convert any values smaller than new_min to new_min, and any
    # values larger than new_max to new_max.
    data = np.clip(data, new_min, new_max)
    # Scale the data.
    data = (data - data.min()) / (new_max - new_min)
    return data

# open image
img = gdal.Open(r'/Users/Rebekah/ThesisData/TestImages/OG/OG_1234.tif')
# get GeoTIFF driver
gtiff_driver = gdal.GetDriverByName('GTiff')
# read in bands
red = img.GetRasterBand(1).ReadAsArray()
green = img.GetRasterBand(2).ReadAsArray()
blue = img.GetRasterBand(3).ReadAsArray()
# create alpha band where a 0 indicates a transparent pixel and 1 an opaque pixel
# (this is from class and I don't FULLY understand it)
alpha = np.where(red + green + blue == 0, 0, 1).astype(np.byte)
# apply the 1-standard-deviation stretch
red_stretched = std_stretch_data(red, 1)
green_stretched = std_stretch_data(green, 1)
blue_stretched = std_stretch_data(blue, 1)
# create empty tif file
NewImg = gtiff_driver.Create('/Users/riemann/ThesisData/TestImages/FCI_tests/1234_devst1.tif', img.RasterXSize, img.RasterYSize, 4, gdal.GDT_Byte)
if NewImg is None:
    raise IOError('could not create new raster')
# set the projection and geotransform of the new raster to match the original
NewImg.SetProjection(img.GetProjection())
NewImg.SetGeoTransform(img.GetGeoTransform())
# write new bands to the new raster
band1 = NewImg.GetRasterBand(1)
band1.WriteArray(red_stretched)
band2 = NewImg.GetRasterBand(2)
band2.WriteArray(green_stretched)
band3 = NewImg.GetRasterBand(3)
band3.WriteArray(blue_stretched)
alpha_band = NewImg.GetRasterBand(4)
alpha_band.WriteArray(alpha)
del band1, band2, band3, img, alpha_band
I am not entirely sure how to go on from here to create a new file displaying the stretch on the different bands.
The image is just a 4-band raster (NAIP) downloaded from EarthExplorer. I can upload the specific image I am using for my test if needed, but there is nothing inherently special about this file compared to other NAIP images.
You should close the new Dataset (NewImg) as well, by either adding it to the del list you already have or setting it to None.
That properly closes the file and makes sure all data is written to disk.
There is, however, another issue: you are scaling your data between 0 and 1, but storing it as a Byte. So either change the output datatype from gdal.GDT_Byte to something like gdal.GDT_Float32, or multiply your scaled data to fit the output datatype; in the case of Byte, multiply by 255 (don't forget the alpha). You should round it properly for accuracy, as GDAL will otherwise truncate to the nearest integer.
You can use np.iinfo() to check what the range of a datatype is, in case you are unsure what multiplication to use for other datatypes.
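A hedged sketch of that fix, reusing the question's variable names (the rounding and the np.uint8 cast are the assumptions here):

import numpy as np

max_val = np.iinfo(np.uint8).max  # 255, the top of the Byte range

# scale the 0-1 stretched bands up to 0-255, round, then cast before writing
band1.WriteArray(np.round(red_stretched * max_val).astype(np.uint8))
band2.WriteArray(np.round(green_stretched * max_val).astype(np.uint8))
band3.WriteArray(np.round(blue_stretched * max_val).astype(np.uint8))
alpha_band.WriteArray((alpha * max_val).astype(np.uint8))

NewImg = None  # close the dataset so everything is flushed to disk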
Depending on your use case, it might be easiest to use gdal.Translate for the scaling. If you modified your scaling function a little to return the scaling parameters instead of the data, you could use something like:
ds = gdal.Translate(output_file, input_file, outputType=gdal.GDT_Byte, scaleParams=[
    [old_min_r, old_max_r, new_min_r, new_max_r],  # red
    [old_min_g, old_max_g, new_min_g, new_max_g],  # green
    [old_min_b, old_max_b, new_min_b, new_max_b],  # blue
    [old_min_a, old_max_a, new_min_a, new_max_a],  # alpha
])
ds = None
You could also add the exponents keyword for non-linear stretching.
Using gdal.Translate would save you from all the standard file-creation boilerplate, though you would still need to think about the datatype, since it might change compared to the input file.
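A minimal sketch of that modified function, assuming the clip limits are all gdal.Translate needs; std_stretch_params and the output file name are hypothetical, and red, green, blue and img are the arrays and dataset from the question.

import math
from osgeo import gdal

def std_stretch_params(data, n=2):
    """Return (old_min, old_max) clip limits for an n-standard-deviation stretch."""
    mean, d = data.mean(), data.std() * n
    return (math.floor(max(mean - d, data.min())),
            math.ceil(min(mean + d, data.max())))

old_min_r, old_max_r = std_stretch_params(red, 1)
old_min_g, old_max_g = std_stretch_params(green, 1)
old_min_b, old_max_b = std_stretch_params(blue, 1)

ds = gdal.Translate('stretched.tif', img, outputType=gdal.GDT_Byte, scaleParams=[
    [old_min_r, old_max_r, 0, 255],  # red
    [old_min_g, old_max_g, 0, 255],  # green
    [old_min_b, old_max_b, 0, 255],  # blue
])
ds = None  # close to flush everything to disk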

How can I use VIPS for image normalization?

I want to normalize the exposure and color palettes of a set of images. For context, this is for training a neural net in image classification on medical images. I'm also doing this for hundreds of thousands of images, so efficiency is very important.
So far I've been using VIPS, specifically PyVIPS, and would prefer a solution using that library. After finding this answer and looking through the documentation, I tried
x = pyvips.Image.new_from_file('test.ndpi')
x = x.hist_norm()
x.write_to_file('test_normalized.tiff')
but that seems to always produce a pure-white image.
You need hist_equal for histogram equalisation.
The main docs are here:
https://libvips.github.io/libvips/API/current/libvips-histogram.html
However, that will be extremely slow for large slide images. It will need to scan the whole slide once to build the histogram, then scan again to equalise it. It would be much faster to find the histogram of a low-res layer, then use that to equalise the high-res one.
For example:
#!/usr/bin/env python3
import sys
import pyvips

# open the slide image and get the number of layers ... we are not fetching
# pixels, so this is quick
x = pyvips.Image.new_from_file(sys.argv[1])
levels = int(x.get("openslide.level-count"))

# find the histogram of the highest level ... again, this should be quick
x = pyvips.Image.new_from_file(sys.argv[1], level=levels - 1)
hist = x.hist_find()

# from that, compute the transform for histogram equalisation
equalise = hist.hist_cum().hist_norm()

# and use that on the full-res image
x = pyvips.Image.new_from_file(sys.argv[1])
x = x.maplut(equalise)
x.write_to_file(sys.argv[2])
Another factor is that histogram equalisation is non-linear, so it will distort lightness relationships. It can also distort colour relationships and make noise and compression artifacts look crazy. I tried that program on an image I have here:
$ ~/try/equal.py bild.ndpi[level=7] y.jpg
The stripes are from the slide scanner and the ugly fringes from compression.
I think I would instead find image max and min from the low-res level, then use them to do a simple linear stretch of pixel values.
Something like:
x = pyvips.Image.new_from_file(sys.argv[1])
levels = int(x.get("openslide.level-count"))
x = pyvips.Image.new_from_file(sys.argv[1], level=levels - 1)
mn = x.min()
mx = x.max()
x = pyvips.Image.new_from_file(sys.argv[1])
x = (x - mn) * (256 / (mx - mn))
x.write_to_file(sys.argv[2])
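One hedged caveat on that sketch: vips arithmetic promotes the image to float, so depending on the output format you may want to cast back to 8-bit before writing; the cast("uchar") call is the assumed fix here.

x = (x - mn) * (256 / (mx - mn))
x = x.cast("uchar")  # assumption: the source slide is 8-bit
x.write_to_file(sys.argv[2])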
Did you find the new Region feature in pyvips? It makes generating patches for training MUCH faster, up to 100x faster in some cases:
https://github.com/libvips/pyvips/issues/100#issuecomment-493960943
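For reference, a hedged sketch of that Region API (the file name and patch size are arbitrary):

import pyvips

image = pyvips.Image.new_from_file("slide.ndpi")
region = pyvips.Region.new(image)

# fetch() decodes only the requested tile and returns raw pixel bytes
patch = region.fetch(0, 0, 256, 256)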

How to adaptively split an image into regions and set a different text orientation for each one?

[input sample image]
I am trying to pre-process my images in order to improve OCR quality. However, I am stuck with a problem.
The images I am dealing with contain different text orientations within the same image (two pages, the first vertical and the second horizontally oriented, scanned into the same image).
The text direction is automatically detected for the first part; nevertheless, the rest of the text from the other page is completely missed.
I was thinking of creating a zonal template to detect the regions of interest, but I don't know how.
Or automatically detect the border, split the image adaptively, then flip the split part to achieve the required result.
I could set the split at a fixed pixel height, but that is not constant either.
from tesserocr import PyTessBaseAPI, RIL
import cv2
from PIL import Image

with PyTessBaseAPI() as api:
    filePath = r'sample.jpg'
    img = Image.open(filePath)
    api.SetImage(img)
    boxes = api.GetComponentImages(RIL.TEXTLINE, True)
    print('Found {} textline image components.'.format(len(boxes)))
    for i, (im, box, _, _) in enumerate(boxes):
        # im is a PIL image object
        # box is a dict with x, y, w and h keys
        api.SetRectangle(box['x'], box['y'], box['w'], box['h'])
        ocrResult = api.GetUTF8Text()
        conf = api.MeanTextConf()

for box in boxes:
    box = boxes[0][1]
    x = box.get('x')
    y = box.get('y')
    h = box.get('h')
    w = box.get('w')
    cimg = cv2.imread(filePath)
    crop_img = cimg[y:y+h, x:x+w]
    cv2.imshow("cropped", crop_img)
    cv2.waitKey(0)
[output image]
As you can see, I can apply orientation detection, but I won't get any meaningful text out of such an image.
Try Tesseract API method GetComponentImages and then DetectOrientationScript on each component image.
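A hedged sketch of that suggestion, reusing the question's file name; the PSM.AUTO_OSD mode and the availability of Tesseract's osd traineddata are assumptions.

from tesserocr import PyTessBaseAPI, PSM, RIL
from PIL import Image

# orientation/script detection needs an OSD-capable page segmentation mode
with PyTessBaseAPI(psm=PSM.AUTO_OSD) as api:
    api.SetImage(Image.open('sample.jpg'))
    boxes = api.GetComponentImages(RIL.BLOCK, True)
    for im, box, _, _ in boxes:
        api.SetRectangle(box['x'], box['y'], box['w'], box['h'])
        osd = api.DetectOrientationScript()
        if osd is not None:
            print(box, 'rotation:', osd['orient_deg'], 'script:', osd['script_name'])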

Matplotlib - Stacked Bar Chart with ~1000 Bars

Background:
I'm working on a program to show a 2d cross section of 3d data. The data is stored in a simple text csv file in the format x, y, z1, z2, z3, etc. I take a start and end point and flick through the dataset (~110,000 lines) to create a line of points between these two locations, and dump them into an array. This works fine, and fairly quickly (takes about 0.3 seconds). To then display this line, I've been creating a matplotlib stacked bar chart. However, the total run time of the program is about 5.5 seconds. I've narrowed the bulk of it (3 seconds worth) down to the code below.
'values' is an array with the x, y and z values plus a leading identifier, which isn't used in this part of the code. The first plt.bar call plots the bar sections, and the second creates an arbitrary floor at -2000. In order to generate a continuous-looking section, I'm using a gap of zero between bars.
import matplotlib.pyplot as plt

for values in crossSection:
    prevNum = None
    layerColour = None
    if values != None:
        for i in range(3, len(values)):
            if values[i] != 'n':
                num = float(values[i].strip())
                if prevNum != None:
                    plt.bar(spacing, prevNum - num, width=interval,
                            bottom=num, color=layerColour,
                            edgecolor=None, linewidth=0)
                prevNum = num
                layerColour = layerParams[i].strip()
        if prevNum != None:
            plt.bar(spacing, prevNum + 2000, width=interval, bottom=-2000,
                    color=layerColour, linewidth=0)
    spacing += interval
I'm sure there's a more efficient way to do this, but I'm new to Matplotlib and still unfamiliar with its capabilities. The other main use of time in the code is:
plt.savefig('output.png')
which takes about a second, but I figure that's to be expected for saving the file and I can't do anything about it.
Question:
Is there a faster way of generating the same output (a stacked bar chart or something that looks like one) by using plt.bar() better, or a different Matplotlib function?
EDIT:
I forgot to mention in the original post that I'm using Python 3.2.3 and Matplotlib 1.2.0
Leaving this here in case someone runs into the same problem...
While not exactly the same as using bar(), with a sufficiently large dataset (large enough that using bar() takes a few seconds) the results are indistinguishable from stackplot(). If I sort the data into layers using the method given by tcaswell and feed it into stackplot(), the chart is created in 0.2 seconds rather than 3 seconds.
EDIT
Code provided by tcaswell to turn the data into layers:
import numpy as np

accum_values = []
for values in crosssection:
    accum_values.append([float(v.strip()) for v in values[3:]])
accum_values = np.vstack(accum_values).T
layer_params = [l.strip() for l in layerParams]
bottom = np.zeros(accum_values[0].shape)
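For completeness, a hedged sketch of feeding those layers into stackplot() as described above, continuing from the accum_values and layer_params built in the block just shown (interval is the variable from the question):

import numpy as np
import matplotlib.pyplot as plt

x = interval * np.arange(accum_values.shape[1])
plt.stackplot(x, accum_values, colors=layer_params)
plt.savefig('output.png')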
It looks like you are drawing each bar individually; you can pass sequences to bar (see this example).
I think something like:
import numpy as np

accum_values = []
for values in crosssection:
    accum_values.append([float(v.strip()) for v in values[3:]])
accum_values = np.vstack(accum_values).T
layer_params = [l.strip() for l in layerParams]
bottom = np.zeros(accum_values[0].shape)
ax = plt.gca()
spacing = interval * np.arange(len(accum_values[0]))
for data, color in zip(accum_values, layer_params):
    ax.bar(spacing, data, bottom=bottom, color=color, linewidth=0, width=interval)
    bottom += data
will be faster (because each call to bar creates one BarContainer, and I suspect the source of your issues is that you were creating one for each bar instead of one for each layer).
I don't really understand what you are doing with the bars that have tops below their bottoms, so I didn't try to implement that; you will have to adapt this a bit.
