How to get the value of a pixel from the screen (GTK 3, Shell, Linux) [duplicate] - linux

I am trying to do some automation on Linux, and I can't find a good way to get a pixel value from the screen, given 2 coordinates.
I have this python code:
#!/usr/bin/env python3
import pyautogui
import sys
image = pyautogui.screenshot()
print(str(image.getpixel((int(sys.argv[1]), int(sys.argv[2])))))
How can I do this without taking a screenshot and instead read from the pixel buffer?
If someone knows of a program that can do this (I've heard AutoHotkey on Windows can), that would also be helpful, since I'm using shell scripts (and lots of xdotool) for the automation.

The following code, when called as ./getColor.py [X coordinate] [Y coordinate], prints the decimal RGB color value of the specified pixel on the screen in the form (R, G, B):
#!/usr/bin/env python3
import gi, sys
gi.require_version('Gdk', '3.0')
from gi.repository import Gdk
pixbuf = Gdk.pixbuf_get_from_window(Gdk.get_default_root_window(), int(sys.argv[1]), int(sys.argv[2]), 1, 1)
print(tuple(pixbuf.get_pixels()))
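Because the script prints the color as a tuple such as (255, 0, 128), a calling automation script usually has to compare that output with an expected color. Below is a minimal sketch of such a comparison, assuming the output format shown above. The names parse_rgb and colors_match and the tolerance parameter are hypothetical helpers for illustration and are not part of GTK.

```python
import ast


def parse_rgb(text):
    """Parse a string like '(255, 0, 128)' into a tuple of three ints."""
    value = ast.literal_eval(text.strip())
    if len(value) != 3:
        raise ValueError("expected three channel values, got %r" % (value,))
    return tuple(int(channel) for channel in value)


def colors_match(actual, expected, tolerance=0):
    """Return True if every channel differs by at most `tolerance`."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


# Hypothetical output captured from ./getColor.py
pixel = parse_rgb("(250, 3, 128)")
print(colors_match(pixel, (255, 0, 128), tolerance=5))
```

A small tolerance can help when anti-aliasing or color profiles shift the value slightly.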

Related

Python - Spyder ignoring picker enabled plot

I am writing this script in Spyder (Python 3.5) and I want it to do this:
1) Plot something
2) Allow me to pick some values from the plot
3) Store those values into a variable
4) Do something with that variable
I have checked this thread: Store mouse click event coordinates with matplotlib and modified the function presented there for my own code. The problem is that Spyder seems to ignore the interactive plot and runs the whole script at once, without waiting for me to pick any values from the plot. As I am using the values for further calculations, I obviously get an error from this. I have even tried adding an input('Press enter to continue...') after the plot, to see if that made it stop and wait for my picks, but it does not work either.
When I run the script step by step, it works fine, I get the plot, pick my values, print the variable and find all of them in there and use them afterwards. So the question is: how can I make it work when I run the whole script?
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import plot as plot
def onpick(event):
    ymouse = event.ydata
    print('y of mouse: {:.2f}'.format(ymouse))
    times.append(ymouse)
    if len(times) == 5:
        f.canvas.mpl_disconnect(cid)
    return times
#
t=np.arange(1000)
y=t**3
f=plt.figure(1)
ax=plt.gca()
ax.plot(t,y,picker=5)
times=[]
cid=f.canvas.mpl_connect('button_press_event',onpick)
plt.show()
#Now do something with times
mtimes=np.mean(times)
print(mtimes)
(Spyder maintainer here) I think that to solve this problem you need to go to
Preferences > IPython console > Graphics
and turn off the option called Activate support. That will make your script block the console while a plot is shown, so you can capture the mouse clicks you need on it.
The only problem is that you need to run
In [1]: %matplotlib qt5
before running your code, because Spyder no longer does that for you.
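One detail in the question's script is independent of the Spyder setting: event.ydata is None when a click lands outside the axes, which makes the format call fail. Below is a minimal sketch of a handler that skips such clicks, assuming five picks are wanted as in the question. ClickCollector and run_interactive are hypothetical names, and run_interactive still relies on a blocking backend as described above.

```python
class ClickCollector:
    """Collect the y value of mouse clicks until `limit` values are stored."""

    def __init__(self, limit=5):
        self.limit = limit
        self.values = []

    @property
    def done(self):
        return len(self.values) >= self.limit

    def on_click(self, event):
        # event.ydata is None when the click is outside the axes
        if self.done or event.ydata is None:
            return
        self.values.append(event.ydata)


def run_interactive():
    """Show the plot from the question and return the mean of the picked values."""
    import numpy as np
    import matplotlib.pyplot as plt

    t = np.arange(1000)
    fig, ax = plt.subplots()
    ax.plot(t, t ** 3)
    collector = ClickCollector(limit=5)
    fig.canvas.mpl_connect('button_press_event', collector.on_click)
    plt.show()  # returns after the window is closed, if the backend blocks
    return np.mean(collector.values) if collector.values else None
```

Calling run_interactive() opens the figure; closing the window after clicking returns the mean of the collected values.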

Move to searched text on active screen with pyautogui

I am trying to make a program that searches for some text on a web page, then places the mouse cursor on the highlighted text after it has been found. Is this possible using pyautogui? If so, how? If not, are there any alternatives?
Example code below:
import time
import webbrowser
import pyautogui
var = 'Filtered Questions'
webbrowser.open('https://stackexchange.com/')
time.sleep(2)
pyautogui.hotkey('ctrl', 'f')
pyautogui.typewrite(var)
#code to place mouse cursor to the occurrence of var
I would prefer to not use the pyautogui.moveTo() or pyautogui.moveRel() because the text I am searching for on the website is not static. The position of the searched text varies when the web page loads. Any help would be highly appreciated.
If you use Chrome or Chromium as your browser, there is a much easier and more stable approach using ONLY pyautogui:
Perform Ctrl + F with pyautogui
Perform Ctrl + Enter to 'click' on the search result / open the link related to the result
With other browsers, you have to check whether these keyboard shortcuts also exist.
Yes, you can do that, but you additionally need Tesseract (and the Python-module pytesseract) for text recognition and PIL for taking screenshots.
Then perform the following steps:
Open the page
Open and perform the search (ctrl+f with pyautogui) - the view changes to the first result
Take a screenshot (with PIL)
Convert the image to text and data (with Tesseract) and find the text and the position
Use pyautogui to move the mouse and click on it
Here is the needed code for getting the image and the related data:
import time
from PIL import ImageGrab # screenshot
import pytesseract
from pytesseract import Output
pytesseract.pytesseract.tesseract_cmd = (r"C:\...\AppData\Local\Programs\Tesseract-OCR\tesseract") # needed for Windows as OS
screen = ImageGrab.grab() # screenshot
cap = screen.convert('L') # make grayscale
data=pytesseract.image_to_boxes(cap,output_type=Output.DICT)
print(data)
In data you will find all the information you need to move the mouse and click on the text.
The downside of this approach is the resource-consuming OCR step, which takes a few seconds on slower machines.
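Turning the data dict into a mouse position can be sketched as follows. This assumes the dict layout that image_to_boxes returns with Output.DICT (parallel lists under 'char', 'left', 'bottom', 'right' and 'top'), one character per entry with no spaces, and box coordinates measured from the bottom of the image. locate_text is a hypothetical helper name.

```python
def locate_text(boxes, text, image_height):
    """Return the (x, y) screen center of `text` in image_to_boxes output, or None.

    Box entries hold one character each and skip spaces, and their y axis
    starts at the bottom of the image, so y is flipped with `image_height`.
    """
    needle = text.replace(' ', '')
    haystack = ''.join(boxes['char'])
    start = haystack.find(needle)
    if not needle or start == -1:
        return None
    span = range(start, start + len(needle))
    left = min(boxes['left'][i] for i in span)
    right = max(boxes['right'][i] for i in span)
    bottom = min(boxes['bottom'][i] for i in span)
    top = max(boxes['top'][i] for i in span)
    return ((left + right) // 2, image_height - (bottom + top) // 2)
```

With the variables from the code above, a call could look like locate_text(data, 'Filtered Questions', screen.height), and the result, if not None, can be passed to pyautogui.moveTo().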
I stumbled upon this question while researching the topic. Basically the answer is no. Two major points:
1) Pyautogui has the option of searching using images. Using this you could, for example, take a screenshot of each piece of text you want to find, save them as individual image files, and then use those to search for the text dynamically and move the mouse there, click, or do whatever you need to. However, as explained in the docs, each search takes 1-2 seconds, which is rather impractical.
2) In some cases, but not always, using ctrl+f on a website and searching for the text will scroll the page so that the result is vertically centered. However, that relies on strong assumptions about where the text is. If it's at the top or the bottom of the page, you obviously won't be able to use that method.
If you're trying to automate clicks and have links with distinguishable names, my advice would be to parse the page source and trigger the link programmatically. Otherwise you're probably better off with an automation suite like Blue Prism.
pyautogui is for controlling mouse and keyboard and for automating other GUI applications. If your need is to find a text on a webpage, you may look for better options that are intended for scraping webpages. For instance: Selenium
If you are a newcomer looking for how to find a string of text anywhere on your screen and stumbled upon this old question through a Google search, you can use the following snippet which I have used in my own projects (It takes a raw string of text as an input, and if the text is found on the screen, return the coordinates, and if not, return None):
import pyautogui
import pytesseract
import cv2
import numpy as np
# In case you're on Windows and pytesseract doesn't
# find your installation (Happened to me)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
def find_coordinates_text(text, lang='eng'):
    # Take a screenshot of the main screen
    screenshot = pyautogui.screenshot()
    # Convert the screenshot to grayscale
    img = cv2.cvtColor(np.array(screenshot), cv2.COLOR_RGB2GRAY)
    # Find the provided text (text) on the grayscale screenshot
    # using the provided language (lang); Tesseract expects
    # three-letter language codes such as 'eng'
    data = pytesseract.image_to_data(img, lang=lang, output_type='data.frame')
    # Find the coordinates of the provided text (text).
    # Note: image_to_data returns one row per word, so only
    # single-word text can match a row exactly
    try:
        match = data[data['text'] == text]
        x, y = match['left'].iloc[0], match['top'].iloc[0]
    except IndexError:
        # The text was not found on the screen
        return None
    # Text was found, return the coordinates
    return (x, y)
Usage:
text_to_find = 'Filtered Questions'
coordinates = find_coordinates_text(text_to_find)
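Since image_to_data reports one entry per recognized word, a multi-word string such as 'Filtered Questions' will not equal any single entry. Below is a minimal sketch of matching a phrase across consecutive words. It assumes the dict form of the result (output_type=Output.DICT, with parallel 'text', 'left', 'top', 'width' and 'height' lists), and find_phrase is a hypothetical helper name.

```python
def find_phrase(data, phrase):
    """Return the (x, y) center of the first occurrence of `phrase`, or None.

    `data` holds one entry per recognized word; the phrase is matched
    against runs of consecutive entries.
    """
    wanted = phrase.split()
    if not wanted:
        return None
    words = [str(word).strip() for word in data['text']]
    for start in range(len(words) - len(wanted) + 1):
        if words[start:start + len(wanted)] == wanted:
            span = range(start, start + len(wanted))
            left = min(data['left'][i] for i in span)
            top = min(data['top'][i] for i in span)
            right = max(data['left'][i] + data['width'][i] for i in span)
            bottom = max(data['top'][i] + data['height'][i] for i in span)
            return ((left + right) // 2, (top + bottom) // 2)
    return None
```

The returned point is the center of the box around all matched words, which is usually a safer click target than the top-left corner.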

Imagemagick's import is frustratingly slow

I am writing a small Python script to take screenshots of a game window (which will be in the background/minimized) and perform some simple template matching and OCR using cv2.
I am currently calling ImageMagick's import as follows:
import -window windowID png:-
to take a screenshot of an inactive window.
However, this takes almost 4 s and is easily the slowest part of my script, by a factor of 100.
Is there any alternative to import or perhaps another way of approaching this that will be faster?
I have already tried graphicsmagick (ended up being slower than imagemagick) and xwd (did not capture the unfocused window even though the windowID was specified)
Link to full python script (Line 44 is where the screenshot taking happens)
You are doing all the PNG encoding and zlib compression in ImageMagick and then decompressing it all again in OpenCV. I guess you would do better if you found a format that is more closely shared between the two.
Specifically, ImageMagick could give you RGB pixels directly, which you could then convert to BGR very easily in OpenCV with cvtColor().
import -window windowID rgb:
You would have to query the window dimensions to get width and height.
Alternatively, you could use PPM format which OpenCV can also read without any libraries and which includes dimensions:
import -window windowID ppm:
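To illustrate why PPM is cheap to consume, here is a minimal sketch that splits binary PPM output into its dimensions and raw RGB bytes using only the standard library. It assumes import is asked to write to stdout with ppm:- (mirroring the png:- form used in the question) and that the maximum channel value is 255. grab_window_ppm, parse_ppm and pixel_at are hypothetical helper names.

```python
import subprocess


def grab_window_ppm(window_id):
    """Run ImageMagick's import and return the PPM bytes from stdout."""
    result = subprocess.run(
        ["import", "-window", str(window_id), "ppm:-"],
        check=True, stdout=subprocess.PIPE)
    return result.stdout


def parse_ppm(data):
    """Split binary PPM (P6) bytes into (width, height, maxval, pixel_bytes)."""
    tokens = []
    pos = 0
    while len(tokens) < 4:
        # Skip whitespace between header tokens
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            # A comment runs to the end of the line
            pos = data.index(b"\n", pos) + 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    if tokens[0] != b"P6":
        raise ValueError("not a binary PPM file")
    width, height, maxval = (int(token) for token in tokens[1:])
    # Exactly one whitespace byte separates the header from the pixels
    return width, height, maxval, data[pos + 1:]


def pixel_at(pixels, width, x, y):
    """Return the (R, G, B) tuple at (x, y), assuming 8-bit channels."""
    offset = (y * width + x) * 3
    return tuple(pixels[offset:offset + 3])
```

The pixel bytes are in RGB order, so they still need the RGB-to-BGR conversion mentioned above before being used as an OpenCV image.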

is it possible to edit matplotlib plot interactively?

I am not sure if this is an acceptable question in SE.
I am wondering if it is possible to edit a matplotlib plot interactively, i.e.,
# plot
plt.plot(x, y[1])
plt.plot(x, -1.0*y[2])
plt.show()
will open up a tk screen with the plot. Now, say, I want to modify the linewidth or enter x/y label. Is it possible to do that interactively (either on the screen, using mouse like xmgrace or from a gnuplot like command prompt)?
You can do simple interactive editing with pylustrator
pip install pylustrator
One way to do what (I think) you ask for is to use ipython. ipython is an interactive python environment which comes with many python distributions.
A quick example:
In a cmd, type >>> ipython, which will load the ipython terminal. In ipython, type:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1)
ax.plot([1, 2, 3, 4, 5], [1, 2, 3, 4, 5], 'r-')
fig.show()
Now you have a figure while the ipython terminal is still "free". This means that you can issue commands to the figure, like ax.set_xlabel('XLABEL') or ax.set_yticks([0, 5]). To make the changes show on screen, you need to redraw the canvas, which is done by calling fig.canvas.draw().
Note that with ipython, you have full tab-completion with all functions to all objects! Typing fig.get_ and then tab gives you the full list of functions beginning with fig.get_, this is extremely helpful!
Also note that you can run python-scripts in ipython, with run SCRIPT.py in the ipython-cmd, and at the same time having access to all variables defined in the script. They can then be used as above.
Hope this helps!
No, it is not generally possible to do what you want (dynamically interact with a matplotlib plot using the mouse).
What you see is a rendering of your plot on a "canvas", but it does not include a graphical user interface (GUI) like you have with e.g. xmgrace, Origin etc.
That being said, if you wish to pursue it you have a number of possible options, including:
Modify the matplotlib source code yourself to include a GUI
Do something with buttons, like in YuppieNetworking's answer here:
Change dynamically the contents of a matplotlib plot
But it is probably quicker and more convenient to just use some other plotting software, where someone has already designed a decent user interface for you.
Alternatively, using an iPython notebook to quickly modify your plot script works well enough.
There is a navigation toolbar in qt4agg matplotlib backend which you can add easily. Not much, but at least good scaling...
Not working code, just some fragments:
from matplotlib.backends.backend_qt4agg import FigureCanvas
from matplotlib.backends.backend_qt4agg import NavigationToolbar2QT as NavigationToolbar
from matplotlib.figure import Figure
from matplotlib.backends.qt_compat import QtCore, QtWidgets, is_pyqt5
self.figure = Figure(figsize=(5, 3))
self.canvas = FigureCanvas(self.figure)
self.addToolBar(QtCore.Qt.BottomToolBarArea,
                NavigationToolbar(self.canvas, self))
Here self is your window object derived from QtGui.QMainWindow.

getting the color of pixels on the screen in Python3.x

I want to get the color of pixels on the screen in Python 3.x (for example at x, y).
I work on Windows and use tkinter.
But I can't use PIL in Python 3.x.
How can I do this?
Thank you.
You can use PIL on Python 3.x; there is a binary installer for Windows. The same import might not work as is on both Python 2.x and Python 3.x:
try:
    import Image  # Python 2.x
except ImportError:
    from PIL import Image  # Python 3.x
You could capture an area of the screen using an analog of grab() method.

Resources