I'm looking for a command-line method to do optical character recognition in Linux. The main problem, however, is that the characters are 7-segment LCD characters.
I would like to use GOCR, but it gets hung up on the broken strokes of the characters. If someone knows how I can help it along, or can suggest an alternative command-line OCR tool that recognizes 7-segment LCD characters, I would appreciate it.
It wasn't until I posted this question that it occurred to me that the proper description for the font type is "7-segment", not "LCD" font, and until then I had been using poor search terms on Google. Having done a better search, I found SSOCR (7-segment OCR), which fits the bill perfectly. There were other solutions on the web for handling 7-segment displays with GOCR, like shifting and overlaying the image to fill the breaks, but SSOCR is pretty much straightforward: it looks at and counts the segments themselves, so it is able to deal with the goofy and restricted character set of 7-segment displays.
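For anyone curious why the segment-counting approach copes so well with the restricted character set: once you know which of the seven segments are lit, recognition reduces to a table lookup. A rough Haskell sketch of the idea (my own illustration, not SSOCR's actual code):

    import Data.List (sort)

    -- The seven segments, conventionally labelled a-g:
    --      -a-
    --    f|   |b
    --      -g-
    --    e|   |c
    --      -d-
    data Segment = A | B | C | D | E | F | G
      deriving (Eq, Ord, Show)

    -- Map a set of lit segments to the digit it represents.
    decodeDigit :: [Segment] -> Maybe Char
    decodeDigit segs = lookup (sort segs) table
      where
        table =
          [ (sort [A,B,C,D,E,F],   '0')
          , (sort [B,C],           '1')
          , (sort [A,B,D,E,G],     '2')
          , (sort [A,B,C,D,G],     '3')
          , (sort [B,C,F,G],       '4')
          , (sort [A,C,D,F,G],     '5')
          , (sort [A,C,D,E,F,G],   '6')
          , (sort [A,B,C],         '7')
          , (sort [A,B,C,D,E,F,G], '8')
          , (sort [A,B,C,D,F,G],   '9') ]

    main :: IO ()
    main = print (map decodeDigit [[B,C], [A,B,D,E,G]])  -- [Just '1',Just '2']

A gap in a stroke never changes which segments are lit, which is why this approach shrugs off exactly the breaks that confuse a shape-matching OCR like GOCR.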
Motivation: I'm trying to write scripts which send keystrokes to the currently focused window. Right now I use xdotool, which lets me send raw keystrokes. However, I want the exact keystrokes to be a function of the current text around the input caret in the focused window.
Problem: Is there a generic way of reading the state of the text input caret -- both its current position and the text around it? Intuitively, I want the content of the current "text box" as well as the location of the cursor within that text box. Perhaps this is not possible in the general case, but is there a way of doing it that would work for Emacs and Firefox? I'm running Ubuntu Linux.
Further motivation: due to a bad case of RSI I control my computer by voice rather than typing. This works by setting up voice-activated scripts that are triggered by saying different phrases. When dictating English prose, it would be helpful to automatically capitalize words at the beginning of sentences. This automatic capitalization can be accomplished by reading the characters immediately before the input caret, checking if they contain a period, and if so, capitalizing the start of the next phrase that I dictate by voice.
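To make that concrete, the check I have in mind is roughly the following (a quick Haskell sketch of the logic only; the real thing would live inside my voice scripts):

    -- Given the text immediately before the caret, decide whether the
    -- next dictated phrase should start with a capital letter.
    shouldCapitalize :: String -> Bool
    shouldCapitalize beforeCaret =
      case dropWhile (`elem` " \t\n") (reverse beforeCaret) of
        []      -> True            -- caret is at the start of the text box
        (c : _) -> c `elem` ".!?"  -- caret follows a sentence terminator

    main :: IO ()
    main = mapM_ (print . shouldCapitalize) ["We stopped here. ", "and then we "]
    -- True, then False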
Thanks so much! If anybody can help me here, it would greatly increase my day-to-day accessibility.
Since there is no standard widget toolkit for X11, only a bunch of independently developed toolkits, there is no generic way to implement this.
As far as X11 and tools operating on its level (like xdotool) are concerned, there are only windows: either the InputOutput variety (i.e. visible windows that receive events and can be drawn to) or the Input-only variety, which are invisible and only receive events. There are no further refined "widgets", so to speak. You get a pixel grid, which you can draw to.
Accessibility interfaces are the toolkits' burden to implement (or, if you don't use a toolkit – then you're a badass – yours, the developer's): https://www.freedesktop.org/wiki/Accessibility/
The absolute generic way would be to take a screenshot of the currently focused window, employ a computer vision / machine learning based solution to identify the caret, then OCR the line of text around it. And to be honest, IMHO doing it that way would probably be a lot more reliable than hoping for the accessibility interfaces to be properly implemented.
Back in 199[456] I was using Linux and a Matrox graphics adapter. For programming I often used text mode and didn't bother to boot into X11. These graphics cards allowed for really high text resolutions and still had a very readable font. Occasionally I've wanted to test whether this font would work well for programming on X11 -- but I cannot find the font to give it a try!
I have searched intensively, for example here, but no font seems to look like the Matrox one. So, the questions:
Which font was used? Was it the regular console font that just looked better on those graphics cards?
Is this font available for X11? Which one is it?
Any examples / screenshots?
I'd be very glad if anyone could explain if I'm just hallucinating or if my memories are accurate.
UPDATE: I've since found a good resource. Selecting the font Px437_IBM_VGA_8x16.ttf and setting the terminal to 12px comes pretty close to my memories. Since monitor resolutions are much higher now, the font becomes pretty tiny, and scaling it up looks somewhat wrong. I will have to experiment.
This site dumps the ROMs of several old VGA BIOS chips and locates the bitmaps used by the character generator. There's a Matrox card from 1993 in there, but the fonts look quite ordinary to me.
What software or environment were you using, out of curiosity?
Also, have you made any progress on this subject on any other sites? I'm very curious as I'm going to be embarking on a highly relevant project at some point in the future.
I just found this font online: http://webdraft.hu/fonts/classic-console/
Maybe it will help.
I was thinking of implementing a labyrinth game in Haskell - the labyrinth will be made of ASCII symbols and I would like it to be colored - for example, walls as blue '#', coins as yellow 'o', and so on - and I was looking at System.Console.ANSI.
I would like to ask if it is possible at all to do this with this package. I was also wondering how to refresh the labyrinth when an action happens (for example, it can have coins in it, represented by 'o', and when the hero steps on a coin, he gets it and it should disappear) - will clearing the screen and printing the labyrinth again do the job smoothly?
Can you please give me some ideas, and maybe other packages if System.Console.ANSI won't do the job? Thank you very much in advance!
I suggest you have a look at vty-ui at http://hackage.haskell.org/package/vty-ui and http://jtdaugherty.github.com/vty-ui/. There's a very good user's manual for it. I've only played with it a little, but I think it would be well suited to your application.
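That said, for a game this simple, plain System.Console.ANSI (from the ansi-terminal package) should also do the job, and yes, redrawing the whole labyrinth each turn is smooth at these sizes. Repositioning the cursor and overwriting in place flickers even less than a full clearScreen. A minimal sketch, assuming the ansi-terminal API as I remember it (double-check the names against the current docs):

    import System.Console.ANSI
    import System.IO (BufferMode (NoBuffering), hSetBuffering, stdout)

    -- Colour scheme: blue walls, yellow coins, white everything else.
    cellColor :: Char -> Color
    cellColor '#' = Blue
    cellColor 'o' = Yellow
    cellColor _   = White

    -- Redraw the whole labyrinth, overwriting the previous frame.
    drawMaze :: [String] -> IO ()
    drawMaze rows = do
        setCursorPosition 0 0
        mapM_ drawRow rows
        setSGR [Reset]
      where
        drawRow row = mapM_ drawCell row >> putStrLn ""
        drawCell c  = do
          setSGR [SetColor Foreground Vivid (cellColor c)]
          putChar c

    main :: IO ()
    main = do
      hSetBuffering stdout NoBuffering
      hideCursor
      drawMaze
        [ "#########"
        , "#  o    #"
        , "#  ###  #"
        , "#    o  #"
        , "#########" ]
      showCursor

When the hero picks up a coin, update the maze data (replace that 'o' with a space) and call drawMaze again; the redraw is far faster than the eye can follow.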
Quick text-processing question. It's not necessarily related to programming, but this is the best place I figured I should go.
Rate down to tell me this kind of question is not welcome here. (Though, I really like my one little reputation point.)
Anyway, how can I encode text so that two characters get rendered in the same character space?
NOTE: this is for plain-text -- nothing particularly complex.
The best you can do is put a backspace character between the two. However, the outcome isn't likely to be useful to you; it will depend on what software is being used to display the text. Most likely the backspace will be ignored or shown as some generic "unavailable" glyph. The second most likely outcome is that the second character will completely erase the first. You'd have to be very lucky for the two characters to be displayed one over the other in the same space.
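For what it's worth, here's the technique in action (a tiny Haskell snippet; whether the overstrike actually shows depends entirely on the displaying software, as noted above):

    -- '\b' backs the cursor up one cell, so '_' and the letter land in
    -- the same column wherever backspace is honoured (e.g. nroff-style
    -- output piped through 'less' renders _\bA as an underlined A).
    main :: IO ()
    main = putStrLn "_\bA _\bB _\bC"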
If it's plain text to be processed by an arbitrary editor, as far as I know you can't. Even if your text is encoded in Unicode, I don't think it provides combining characters for normal letters, only for accents and similar marks which are intended to be combined with other glyphs.
BTW, I'm not sure that Stack Overflow is the right place for this kind of stuff; it might fit better on superuser.com.
Like any responsible developer, I'd like to make sure that the sites I produce are accessible to the widest possible audience, and that includes the significant fraction of the population with some form of colour blindness.
There are many websites which offer to filter a URL you feed them, either by rendering a picture or by filtering all content. However, both approaches seem to fail when rendering even moderately complex layouts, so I'd be interested in finding a client-side approach.
The ideal solution would be a system filter over the whole screen that can be used to test any program. The next best thing would be a browser plugin.
I came across Color Oracle and thought it might help. Here is the short description:
Color Oracle is a colorblindness simulator for Windows, Mac and Linux. It takes the guesswork out of designing for color blindness by showing you in real time what people with common color vision impairments will see.
Color Oracle is great, but another option is KMag, which is part of KDE in Linux. It's ostensibly a screen magnifier, but can simulate protanopia, deuteranopia, tritanopia and achromatopsia.
It differs from Color Oracle by requiring an additional window in which to display the re-coloured image, but an advantage is that one can modify the underlying image at the same time as previewing the simulation.
Here is a screenshot showing the original figure on the left, and the KMag window on the right, simulating protanopia.
Here's a link to a website that simulates various kinds of color blindness:
http://www.vischeck.com/
They let you check URLs and screenshots against three different kinds of color blindness (URL checking is a bit dated, though; the image check works better).
I'd encourage everyone to check their applications, by the way. Seeing your own app through others' eyes may be an eye-opener (pun intended).
I know this is quite an old question, but I've recently found an interesting solution to transparently simulate color blindness.
When working with Linux, you can simulate color blindness using the Color Filter plugin for Compiz. It comes with profiles for deuteranopia and protanopia and changes the colors of the whole screen in real time.
It's very nice because it works transparently in all applications (even within YouTube videos), but it will only work where Compiz is available, i.e. only under Linux.
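Under the hood, filters like this are just a per-pixel linear transform of the screen's colours. As a sketch of the idea, here's the commonly circulated rough approximation of protanopia in Haskell. The coefficients are an assumption on my part, not the plugin's actual profile data:

    type RGB = (Double, Double, Double)

    -- Approximate what a protanope sees. The matrix is the widely
    -- circulated approximation, NOT Compiz's actual colour profile.
    simulateProtanopia :: RGB -> RGB
    simulateProtanopia (r, g, b) =
      ( 0.567 * r + 0.433 * g
      , 0.558 * r + 0.442 * g
      ,             0.242 * g + 0.758 * b )

    main :: IO ()
    main = print (simulateProtanopia (1, 0, 0))
    -- (0.567,0.558,0.0): pure red collapses towards a muddy yellow-brown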
Here's an article that has some guidelines for optimizing UI for color blind users:
Particletree » Be Kind to the Color Blind
It contains a link to another article with the kind of tools you were asking for:
10 colour contrast checking tools to improve the accessibility of your design | 456 Berea Street
A great paper that explains a conversion that preserves color differences is:
Detail Preserving Reproduction of Color Images for Monochromats and Dichromats (PDF)
I haven't implemented the filter, but I plan to when I have some more free time.
I found Colour Simulations easy to use on Windows 10. This software can apply a color-blind filter to part of the screen or the whole screen, and what's great is that it lets me interact with my PC normally, as if the filter weren't there, even in fullscreen mode. It runs quite slowly on my 4K screen with an integrated graphics card, though.