how can I extract text contents from GUI apps in linux? - linux

I want to extract text contents from GUI apps,here are 2 examples::
example 1:
Suppose I opened firefox, and input url : www.google.com
how can I extract the string "www.google.com" from firefox using my own app ?
example 2:
open calculator(using gcalctool),then input 1+1
How can I extract the string "1+1" of calculator from my own program?
in brief ,what I want is to find out whether there is a way to extract the text contents from any widget of an GUI application
Thanks

I don't think there's a generic way to do this, at least not a very elegant one.
Some inelegant ideas:
You might be able to modify the X window system or even some toolkit framework to extract what is being displayed in specific window elements as text.
You could take a screenshot and use an OCR library to convert the pixels back into text for the interesting areas.
You could recompile the apps of interest to add some kind of mechanism for asking them questions.
You could use something like xtest to inject events highlighting the region of interest and copying it to the clipboard.

I believe firefox and gcalctool are for examples only and you just want to know in general how to pass output of one application to other application.
There are many ways to do that on Linux, like:
piping
application1 | application2
btw here is the Firefox command line manual if you want to start firefox on Ubuntu with a URL. eg:
firefox "$url"
where $url is a variable whose value can be www.mozilla.org

That sounds difficult. Supposing you're running X11, you can very easily grab a window picture ( see "man xwd"); however there is no easy way to get to the text unless it's selected and therefore copied to the clipboard.
Alternatively, if you only want to capture user input, this is quite easy to do, too, by activation the X11 record extension: put this in your /etc/X11/xorg.conf:
Section "Module"
Load "record"
#Load other modules you need ...
EndSection
though it may prove difficult to use too, see example code for Xorg/X11 record extension fails

Related

How to create a linux terminal ASCII character logo?

I'm a newbie to Linux. I want to create my own ASCII character logo to be displayed on the Linux terminal(it is for pleasure and also to learn). I searched through the internet and found there are tools available for that work. For example,Figlet,Neofetch,Screenfetch etc. But I want to know if there is any method to create a such a logo except hard-coding the logo. If anyone know please help.
You're much more likely to get an answer if you explain clearly.
By 'ASCII character logo' you could mean that you want to create your own character (as in letter/number/symbol) and use it in the terminal just like you'd be able to display the 'L' letter. For that you'd need to create your own font, add your character in and set the terminal to use that font. There are plenty of tools online that can help you create a font.
If you mean you simply want to display some ascii art on the terminal, you can use something like this: http://patorjk.com/software/taag/#p=display&f=Graffiti&t=Type%20Something%20. You can save the text to a file on your linux computer, and print it back out again using the cat command.
An example is below:
cat ascii_logo.txt
You do also list things like figet, which will automatically generate the ASCII art for you from some text - I'm not sure why these don't fulfill your needs?

Script to paste a specific string into a text field with a hotkey

I am trying to find a way to paste a predefined string upon entering a specific keyboard sequence, on any app.
For example if I have to paste an url or a password into a field, I can have said password in a hidden script and when I press, say, [ctrl] + [5], it would write "example123" on the text field where my cursor is.
Ideally without copying to the clipboard (I'd prefer keeping what I have on my clipboard and also avoiding to paste a password or such by mistake elsewhere).
I have tried every solution I've found so far that include xclip, xdotool and xvkdb. All of them either do not work or are really inconsistent: They only paste the string sometimes, and when they do, it's usually only part of the string ("ample123" instead of "example123").
I thought of using compose key, which I heavily use anyway to write in french on an us keyboard, but it seems it only supports 1 character sequences, as nothing is printed when I modify my .XCompose to include custom output sequences of len > 1.
I am using Ubuntu 18.04 with Gnome as a DE. Ideally something that also works when logging back (like compose keys).
You need to walk the Document Object Model for either Gnome or your web-page. My concern is that with a desktop script you wont be able to access the web page because you will need to be able to establish a target to send string to. I see in your question that you tried using using "x{tool-name}" to grab the text field element. Delivering the sting really isn't the problem. The problem is getting the GUI element of text box pragmatically. The easiest way to get access to this in a user loaded web-page is with WebExtensions API which is how to make extensions for most modern browsers. Otherwise, if you can get away with only having access to Gnome's GUI I would try LDTP, it's a library used for testing, but it looks like it can be used for automation too.
For keyboard shortcuts:
It really shouldn't matter what the script is doing to how you want to activate it. I would just go to Gnome/Settings/Keyboard and set the path to where I saved the script to be the Command. If you go the WebExtension route, you will want to build the shortcut into your extension.

How to add comments to folder in linux and view them with mouse cursor

I run simulations for various choices of parameters. For each choice I store the resulting data in a folder, like
/home/me/Documents/MyProject/C=10/1.dat
/home/me/Documents/MyProject/C=10/2.dat
/home/me/Documents/MyProject/C=10/3.dat
...
and
/home/me/Documents/MyProject/C=20/1.dat
/home/me/Documents/MyProject/C=20/2.dat
/home/me/Documents/MyProject/C=20/3.dat
...and so forth.
would like to write a little text file AAA.txt which contains not just the C parameter but all the others too. Then when viewing this folder which contains the data I want to hold my cursor on the little file symbol and have a little box appear. This box should show just the content of AAA.txt, so I can quickly check which set of parameters was used in this particular run.
Anyone know how to do this? I use Ubuntu 14.04
I am not aware of ways to give you a custom "tooltip". As an alternative, you could look into creating custom thumbnails of your .dat files.
See here for how to do that with nautilus; the default file browser for Ubuntu.
Alternatively, you might look into what Gloobus can do for you.

Creating a Print Monitor / Print Handler

I'm having trouble getting started with building a Print Monitor / Print Handler for Windows using Visual Studio 2012 Ultimate with WDK 8. Basically, this is what I am trying to accomplish:
Create a print monitor (something an application can print to) that will generate a file with the content that should be printed (like the default XPS printer or a PDF printer), and then invokes the print handler
Create a print handler that will parse the generated file and do certain actions with it (check to see if certain text is present, upload the file online, etc)
I feel like the print handler part should not be too hard, but starting with the print monitor is what I'm stuck at. What would I do within VS12? I see options for "Printer Driver V4", "Printer Driver V4 Property Bag", and "Printer XPS Render Filter". Should I use one of those templates, and, if so, what would I do within them? Anything pointing me in the right direction would be appreciated!
EDIT:
Just some more clarification - I only need the text from the print output, but I've read from various sources that getting text-only output leads to no output at all from sources like Firefox, etc since they print text as glyphs.
I will be using the print handler to parse the text for keywords and then upload that information to a web server in a specific format. The print monitor just needs to capture and save the text information from whatever application is printing.
As you pointed out in your comments, some applications such as Firefox print using glyph indices instead of characters. In fact, quite a few do and it's becoming more common. What you need is a print driver. The good news is Microsoft has already written it for you and provided you with sample source code in the WDK. Start by reviewing this to understand your options. The Unidriver is perhaps a little simpler but the Postscript driver has the advantage of generating output that can readily be transformed to PDF or other formats that retain text information (as opposed to raster page images that lose all text information). As far as I'm concerned, don't even think about XPS; it's just an all around disaster.
To handle glyph indices, what you'll need to do is add code to the driver's OEMTextOut function that uses the font's cmap tables to translate glyph indices back into character codes. I'm unaware of any public domain libraries that parse font files, so you'll likely have to write your own code to do this. (Hint: If you support only OpenType/TrueType fonts, you'll cover 99% of all printing applications).
Getting the Microsoft sample code to build, install and run is mostly straightforward, but if you're new to the WDK and installing print drivers, plan on spending a week or more on just that. The glyph index translation part is far more complex and you should plan on spending a lot more time on that.

Tools for displaying text, powerpoint style, in linux

I have a problem where I need a way to display a repeating series of "images" on a computer monitor. Specifically, given a series of text files, I'd like a way to display the contents of said files on a screen in a way much like a powerpoint would.
My current thoughts are to find some tool that will take in a text file of some format, and then output an image which contains the text from the file. Then I'd put it in a directory and have some Slideshow program continuously go between the images in that directory. It's a very hacky solution, obviously.
So, does anyone know of tools that would do such a thing? Or is there a better way to do this? I've looked into the library libgd2, but it doesn't seem to support text-wrapping for images, which is something I'd need.
Thanks!
MagicPoint is a tool for displaying presentations. Presentations are written in a simple plain text file format, much like HTML.
You could easily generate the MagicPoint file automatically and then run it and display the presentation. You can also generate HTML, PS oder PDF from the presentation and display that.
Are you looking for powerpoint equivalent for linux? Openoffice??
have you tried some magic scripting with TeX?
a chain like
tex file | dvi2ps | ps2jpg > output
and define some TeX-Macros?
Showoff's pretty cool. It uses Markdown-formatted slides to create a simple little Sinatra app that you run (with showoff serve), and then view in a browser.
Docutils. See http://docutils.sourceforge.net/docs/user/slide-shows.html
The text syntax is reStructuredText
another idea:
text2gif
To complement the suggestions given by others, if you were going to write a program to do this, it would probably be more efficient to just render the text to the screen directly, rather than converting it to images first. It could probably be done using a canvas or text box component in a full-screen window on whatever window manager you are using (e.g. KDE or Gnome).
I give presentations with Opera's #media projection CSS support. On http://talks.webconverger.com/ you can find a template and an example which you can load in Opera's full screen mode and start sliding through.
So besides writing in a familiar language HTML, it's dead easy to share the slides and even get your audience to look at the slides as you're going through them.
If you are looking for something more flashy, there are tools on the Web to generate animations and what not, and again you would simply use a full screen browser to play it back to your audience.

Resources