Creating a Print Monitor / Print Handler - visual-studio-2012

I'm having trouble getting started with building a Print Monitor / Print Handler for Windows using Visual Studio 2012 Ultimate with WDK 8. Basically, this is what I am trying to accomplish:
Create a print monitor (something an application can print to) that will generate a file with the content that should be printed (like the default XPS printer or a PDF printer), and then invoke the print handler.
Create a print handler that will parse the generated file and perform certain actions on it (check whether certain text is present, upload the file online, etc.).
I feel like the print handler part should not be too hard, but starting with the print monitor is what I'm stuck at. What would I do within VS12? I see options for "Printer Driver V4", "Printer Driver V4 Property Bag", and "Printer XPS Render Filter". Should I use one of those templates, and, if so, what would I do within them? Anything pointing me in the right direction would be appreciated!
EDIT:
Just some more clarification - I only need the text from the print output, but I've read from various sources that getting text-only output leads to no output at all from sources like Firefox, etc since they print text as glyphs.
I will be using the print handler to parse the text for keywords and then upload that information to a web server in a specific format. The print monitor just needs to capture and save the text information from whatever application is printing.

As you pointed out in your comments, some applications such as Firefox print using glyph indices instead of characters. In fact, quite a few do and it's becoming more common. What you need is a print driver. The good news is Microsoft has already written it for you and provided you with sample source code in the WDK. Start by reviewing this to understand your options. The Unidriver is perhaps a little simpler but the Postscript driver has the advantage of generating output that can readily be transformed to PDF or other formats that retain text information (as opposed to raster page images that lose all text information). As far as I'm concerned, don't even think about XPS; it's just an all around disaster.
To handle glyph indices, what you'll need to do is add code to the driver's OEMTextOut function that uses the font's cmap tables to translate glyph indices back into character codes. I'm unaware of any public domain libraries that parse font files, so you'll likely have to write your own code to do this. (Hint: If you support only OpenType/TrueType fonts, you'll cover 99% of all printing applications).
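To illustrate the translation step only: a font's cmap maps character codes to glyph indices, so the driver needs the reverse lookup. The sketch below is in Python for brevity (a real driver would do this in C inside OEMTextOut), and it assumes the cmap has already been parsed into a dictionary. `SAMPLE_CMAP` and its glyph numbers are made up for the example, not taken from any real font.

```python
def invert_cmap(cmap):
    """Build a glyph-index -> character-code map from a code -> glyph map."""
    reverse = {}
    for code, glyph in sorted(cmap.items()):
        # Several character codes may share one glyph; keep the lowest code.
        reverse.setdefault(glyph, code)
    return reverse


def glyphs_to_text(glyph_indices, reverse, missing="\ufffd"):
    """Translate glyph indices back to text; unknown glyphs become U+FFFD."""
    return "".join(
        chr(reverse[g]) if g in reverse else missing for g in glyph_indices
    )


# Hypothetical parsed cmap: character code -> glyph index.
SAMPLE_CMAP = {0x48: 43, 0x69: 76, 0x20: 3, 0xA0: 3}

if __name__ == "__main__":
    reverse = invert_cmap(SAMPLE_CMAP)
    print(glyphs_to_text([43, 76], reverse))
```

Note that the reverse mapping is not always unique (here both the space and the no-break space point at glyph 3), and ligature glyphs may not appear in the cmap at all, so some policy for ambiguous or missing glyphs is needed.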
Getting the Microsoft sample code to build, install and run is mostly straightforward, but if you're new to the WDK and installing print drivers, plan on spending a week or more on just that. The glyph index translation part is far more complex and you should plan on spending a lot more time on that.

Related

Printing to Star TSP143LAN from NodeJS on Linux, with formatting

I have the proper CUPS drivers installed: I can print to my Star TSP143LAN using any application with print capability (like Chrome). I can print to this printer using the node-printer module, by specifying either the printer name or the printer’s network address, and setting the print mode to TEXT.
But I can’t seem to format what I print from NodeJS using the node-printer library. If I set the mode to RAW and send commands as specified in Star’s Command Line Emulator manual for this printer, node-printer will report a successful print but nothing happens. It doesn’t print.
I’m attempting to send these RAW commands because I want to do various formatting operations like make the font larger or bold, and so on.
I’ve tried the node-thermal-printer module but I’ve had no luck.
I’ve been scouring the internet for some help on this issue but I haven’t been able to find much. I’ve seen it mentioned that the TSP143 LAN doesn’t communicate in the same way as other star products and it’s best to use Star’s drivers as a go-between, but I’m not sure what that means. (I thought I might be doing that already when specifying the printer’s class name when attempting to print from node-printer...)
I didn’t have much trouble implementing the Star Swift SDK into an iOS app and doing formatting operations there. But I need to print from a NodeJS environment on Linux. I’m at a loss.
If there’s anybody to whom this sounds familiar and can point me in the right direction I’d be very grateful...
Thanks!
After quite a bit of research, it looks like the Star TSP100/TSP143 LAN is not able to print using Line Mode Commands or ESC/POS from Linux: the solution has been to generate a PDF from HTML (using wkhtmltopdf) and then print the PDF using the node printer library (https://github.com/tojocky/node-printer). I have not yet found a better way to properly format prints.
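The same HTML-to-PDF-to-printer pipeline can be sketched outside Node as well. The following Python sketch is an illustration rather than the setup described above: it swaps node-printer for the CUPS `lp` command, assumes both `wkhtmltopdf` and `lp` are on the PATH, and uses a hypothetical queue name.

```python
import subprocess


def build_commands(html_path, pdf_path, printer):
    """Return the two command lines: render HTML to PDF, then queue the PDF."""
    render = ["wkhtmltopdf", html_path, pdf_path]
    send = ["lp", "-d", printer, pdf_path]
    return render, send


def print_html(html_path, pdf_path, printer):
    """Run both steps in order, raising if either command fails."""
    for cmd in build_commands(html_path, pdf_path, printer):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "Star_TSP143" is a placeholder; use the queue name shown by `lpstat -p`.
    print_html("receipt.html", "receipt.pdf", "Star_TSP143")
```

Formatting such as larger or bold text then lives in the HTML and CSS of `receipt.html` rather than in printer commands.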

Using different program office extension

I have a program that can access a database with a whole bunch of articles.
Due to copyright, I can't access the database straight from my program, but I have a different program that can access it, and it's legitimate to copy small bits from the articles.
Because my friends and I quote a lot from these articles, I thought it would be useful if we could find an add-in for Word that will copy the requested part from an article.
Is there any add-in for Word that would let me use the program that I mentioned above so that I can access the database from within Word?
I would like to program this add-in myself, if possible.
Without further information about which operating system, and version of Word you are using, I can offer only a general outline.
1) It seems to me that you want to make a Word macro using Word Basic, or Visual Basic.
2) When you want to call your program which is external to Word, you need to use the shell command as outlined here from Microsoft's webpage.
I hope that helps you get started writing your macro!
CHEERS
Well, it's a workaround, but you can use an automation tool that runs a sequence of actions on a given GUI, such as WinRunner or TestQuest, to simulate use of the program. I assume these tools can take input from an XML or text file and log output to a text file.
If you have the output in a text file, you can parse it in any programming language, extract the information you need, and write it to Word or whatever format using OLE objects.

Getting data from a browser by screen-scraping

I have gone thru several relevant looking questions but they did not contain the answer I am looking for. So, here is my question:
I have several web applications at my workplace, written using different frameworks, and the authors are long gone, so I can't ask for feature updates. Hence I have to go through the same grueling sequence of actions every day to get data that amounts to a file of a few kilobytes.
I tried parsing the page source, but the authors' programming techniques were all over the place. Some even intentionally obscured the code so the data does not show up as text, and there is no reason for this, as the code they wrote is a company asset. Long story short, I realized that if I can copy and paste the textual content of these pages, I can process that data much more easily than by parsing the page source to get the text (which is sometimes totally impossible).
So, I am now looking for a browser plug-in (in windows or linux environments) or equivalent text based tools on windows or linux, which will load these pages and save the text on the screen to file(s) when invoked.
Despite how hard I tried, I am coming up empty handed.
I do not want to utilize the services of a third-party screen-scraping web site, as the data is company confidential and not accessible by outside parties. Everything has to happen on the client end, as I do not have access to the servers these apps are running on (mostly IIS on a Windows front end and an Oracle DB at the back end). The middle tier, as I have explained before, is anyone's wild guess, ranging from native Oracle apps to WebLogic to Tomcat to some in-house developed Java/JavaScript stuff.
Thanks for all the help in advance
After searching for an answer for well over a year, I came to realize, as long as I use windows, a modern version of it that is, autohotkey is my savior.
I open the web page, maximize it, place my cursor (mousemove, x, y) then left click (mouseclick, L) then send ctrl-A followed by ctrl-C.
Voila ! everything is in the clipboard. Then I activate my unix session (winactivate PuTTY) and send appropriate key press commands to launch the editor of my choice (which is vi) and finally send a shift-Insert to paste the clipboard into my document. Then save and exit of course.
As an added bonus, right after my document is saved, I can invoke the script of my choice to parse this file and give me back the portion(s) I am interested in.
I know it is not bullet proof, but for my purpose, it helps to a great extent. As a matter of fact, I can do whatever I want with this method.
What about something like this: http://www.nirsoft.net/utils/htmlastext.html
Freeware that converts an HTML page to text
Any of links, lynx or w3m will do what you want, they are text browsers and you can dump text from a webpage with, for example:
w3m -dump http://www.google.com > g.txt
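If no text browser is installed, a rough equivalent of the dump step can be sketched with Python's standard-library `html.parser`. This is only an approximation of what `w3m -dump` does: it keeps visible text, skips `script` and `style` content, does no layout, and, like the text browsers, sees only the markup it is given, not anything rendered later by JavaScript. The class and function names are made up for the example.

```python
from html.parser import HTMLParser


class TextDumper(HTMLParser):
    """Collect visible text, ignoring the contents of script/style tags."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text fragments of an HTML string, one per line."""
    parser = TextDumper()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


if __name__ == "__main__":
    page = "<html><body><h1>Report</h1><p>Total: <b>42</b></p></body></html>"
    print(html_to_text(page))
```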

how can I extract text contents from GUI apps in linux?

I want to extract text contents from GUI apps. Here are 2 examples:
example 1:
Suppose I opened Firefox and entered the URL www.google.com
How can I extract the string "www.google.com" from Firefox using my own app?
example 2:
Open the calculator (using gcalctool), then input 1+1
How can I extract the string "1+1" of the calculator from my own program?
In brief, what I want is to find out whether there is a way to extract the text contents from any widget of a GUI application.
Thanks
I don't think there's a generic way to do this, at least not a very elegant one.
Some inelegant ideas:
You might be able to modify the X window system or even some toolkit framework to extract what is being displayed in specific window elements as text.
You could take a screenshot and use an OCR library to convert the pixels back into text for the interesting areas.
You could recompile the apps of interest to add some kind of mechanism for asking them questions.
You could use something like xtest to inject events highlighting the region of interest and copying it to the clipboard.
I believe firefox and gcalctool are for examples only and you just want to know in general how to pass output of one application to other application.
There are many ways to do that on Linux, like:
piping
application1 | application2
btw here is the Firefox command line manual if you want to start firefox on Ubuntu with a URL. eg:
firefox "$url"
where $url is a variable whose value can be www.mozilla.org
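The `application1 | application2` pipe above can also be set up from a program rather than from the shell. Below is a minimal Python sketch using `subprocess`. The two commands are stand-ins (small Python one-liners) so the example runs anywhere, and this only helps when the first application writes its text to standard output, which GUI apps generally do not.

```python
import subprocess
import sys


def pipe(producer_cmd, consumer_cmd):
    """Run producer | consumer and return the consumer's output as text."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Close our copy so the producer sees a broken pipe if the consumer exits.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode()


# Stand-ins for application1 and application2.
PRODUCER = [sys.executable, "-c", "print('hello from application1')"]
CONSUMER = [sys.executable, "-c",
            "import sys; print(sys.stdin.read().upper(), end='')"]

if __name__ == "__main__":
    print(pipe(PRODUCER, CONSUMER), end="")
```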
That sounds difficult. Supposing you're running X11, you can very easily grab a window picture ( see "man xwd"); however there is no easy way to get to the text unless it's selected and therefore copied to the clipboard.
Alternatively, if you only want to capture user input, this is quite easy to do, too, by activating the X11 record extension. Put this in your /etc/X11/xorg.conf:
Section "Module"
Load "record"
#Load other modules you need ...
EndSection
though it may prove difficult to use too, see example code for Xorg/X11 record extension fails

Is any software decent at importing column-aligned text?

Here's something that's really irked me over the years. I've never used any software that, when importing data from a column-aligned text file, can figure out the column breaks in a correct manner.
Excel 2K3 and a lot of other Microsoft components that seem to share a common codebase (like the import options for SQL2K) attempt to figure out the column breaks for you. Unfortunately, they only look at the first n rows, and are often completely wrong.
OpenOffice.Org 3.1 has an import dialog almost exactly like Excel 2K3's, but it doesn't even attempt to guess the column breaks for you. And the latest version of Numbers doesn't appear to handle column-aligned imports at all.
Obviously column-aligned data is undesirable for a number of reasons, but a lot of older software (particularly in-house software various companies have floating around) exports data in this format so I do need to handle it every so often. Surely, somewhere, SOME software imports it well without me coding an import utility myself or manually specifying where twelve zillion columns start and stop?
OSX, Windows, whatever. I'm open to suggestions. Ultimate goal is to get it into a SQL Server table, but simply getting it into a Excel/XML/tab-delimited/etc file in the meantime would be fine because it's easy enough to get into SQL Server from there.
I tend to normalize such data with awk -- perhaps generating a csv file -- before trying to import it into Excel.
See the awk user's manual.
I don't think there is a silver bullet for your request. I think the best you can hope for is to define your input format once and be able to reuse that format when you receive a file with the same format again.
As one poster mentioned, you could use awk or, if .NET is more your thing, FileHelpers. It's an open source .NET library that does a good job reading and writing both fixed-length and delimited files. The downside is that you would be creating a .NET application to do the work (either inserting directly into a DB or perhaps creating an output file). On the plus side, once created, you could reuse the mapping classes if you get the same file format again.
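The "define the format once and reuse it" idea can be sketched in a few lines of Python. This is not FileHelpers or awk, just an illustration of the approach. The `LAYOUT` names and offsets are hypothetical and would come from whatever documentation exists for the export.

```python
import csv
import io

# Hypothetical layout: (column name, start offset, end offset), end exclusive.
LAYOUT = [("id", 0, 5), ("name", 5, 15), ("amount", 15, 23)]


def parse_fixed_width(lines, layout):
    """Slice each non-blank line at the layout's offsets and trim padding."""
    rows = []
    for line in lines:
        if line.strip():
            rows.append([line[start:end].strip() for _, start, end in layout])
    return rows


def to_csv(rows, layout):
    """Render the parsed rows as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([name for name, _, _ in layout])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sample = [
        "00001Alice        12.50",
        "00002Bob         103.00",
    ]
    print(to_csv(parse_fixed_width(sample, LAYOUT), LAYOUT), end="")
```

The resulting CSV can then go through any of the usual routes into Excel or SQL Server, and the same `LAYOUT` can be kept alongside the script for the next file in that format.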
Well, obviously no software can be entirely correct in guessing the layout of a fixed-column file, since there is no separator (though variable-width columns with higher maximum lengths will often leave enough space at the end to start guessing). For example, the following could be anywhere from 1 to 9 columns (I have personally had to figure out some super-packed fixed-column layouts like this, only much longer):
135464876
647873159
345467575
If SQL Server is the ultimate destination, have you looked into the SQL Server import wizard?
Right-click your database in Management Studio and select Tasks->Import Data. Proceed through and select "Flat File" as your data source. In the format dropdown, change from Delimited to Fixed Width. On the left you can now use the Columns screen to draw the column separators. There are also Advanced and Preview screens.
Try out this demo (I was on development team):
Personator 4
Install, run the program, go to Tools | ASCII Conversion | Import from ASCII.
The import will be to DBF/FoxPro, but you can then export that file into one of the formats you mentioned.
The start/stop guesser uses a few statistical formulas to try to get the boundaries correct; you get to verify and/or correct with a graphical editor after analysis.
If you save your file as a text file, open it in Microsoft Excel 2007, and select "Fixed Width", Excel will "guess" where the breaks occur (based on whitespace). If it guesses incorrectly, you can still change where the field breaks occur: on step 2 of the wizard, the vertical lines marking the breaks can be moved left or right by any number of characters. You can see which character number each field break falls on before importing.
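A whitespace-based guess of this kind can be sketched as follows. This is not Excel's actual algorithm, only one simple heuristic: a column starts wherever a character position that is non-blank in some row follows a position that is blank in every row. Unlike the importers criticized in the question, it looks at all rows rather than the first few.

```python
def guess_breaks(lines):
    """Return the start offsets of columns, guessed from all-blank positions."""
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    width = max(len(row) for row in rows)
    padded = [row.ljust(width) for row in rows]
    # blank[i] is True when every row has a space at position i.
    blank = [all(row[i] == " " for row in padded) for i in range(width)]
    return [
        i for i in range(width)
        if not blank[i] and (i == 0 or blank[i - 1])
    ]


def split_columns(line, starts):
    """Cut one line at the guessed start offsets and trim the padding."""
    bounds = starts + [None]
    return [line[a:b].strip() for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    sample = [
        "ID    NAME      AMOUNT",
        "1     Alice      12.50",
        "22    Bob       103.00",
    ]
    starts = guess_breaks(sample)
    for row in sample:
        print(split_columns(row, starts))
```

The heuristic has the limits already noted in this thread: fully packed data such as the nine-digit rows above comes back as a single column, and a value containing an embedded space in every row would be split in two, so the guessed breaks still need a manual check.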

Resources