extraneous space around italics and bolds in manpage? - linux

I am writing a man page. I am displaying only with man -l filename.man.
I am running into an issue where my text is displaying as:
file/path/[ option ]
Where I would like it to display as:
file/path/[option].
The offending lines of text (code?) look like:
.\" filename.man
The filepath is /file/path/[
.I option
], and it contains files.
I am extremely new to formatting with [gnt]roff, and appreciate any pointers or recommended readings.

The definitive guide to groff is the manual at gnu.org. I recommend the PDF for easy searching. It includes chapter 5.17.1, Changing Fonts, which explains \f. Note that there are at least two alternatives. The first uses \c at the end of a line so that the next line continues without an intervening space:
The filepath is /file/path/[\c
.I option\c
], and it contains files.
but the following is the preferred method, using a macro RI (roman-italic) provided specially for the purpose of joining "words" in an alternating roman then italic font:
The filepath is
.RI /file/path/[ option ],
and it contains files.
See chapter 4.1.3 of the PDF, Macros to set fonts, or the man page for groff_man (the man macros, also at gnu.org).
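As a rough illustration of how the two forms differ, here is a minimal Python sketch that generates both lines. The helper names ri and inline_italic are hypothetical and exist only for this example; the output is ordinary man-page source text.

```python
def ri(*parts):
    """Build a man-page .RI line: arguments alternate roman, italic, roman, ...

    The macro joins its arguments without spaces, so brackets can sit
    directly against the italic word.
    """
    # Quote arguments containing spaces so groff treats each as one argument.
    quoted = ['"{}"'.format(p) if " " in p else p for p in parts]
    return ".RI " + " ".join(quoted)


def inline_italic(word):
    """Wrap a word in the inline font escapes (italic on, then back to roman)."""
    return "\\fI" + word + "\\fR"


# The macro form from the answer above.
print(ri("/file/path/[", "option", "],"))
# The inline-escape form, kept on a single input line.
print("The filepath is /file/path/[" + inline_italic("option") + "], and it contains files.")
```

Either way the bracket and the italic word stay adjacent in the source, so no input line break falls between them to be turned into a space.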

This can be achieved by using the inline \fI<text>\fR escapes in place of the .I macro at the start of the line. In your code,
.\" filename.man
The filepath is /file/path/[\fIoption\fR], and it contains files.
I don't know why, and would appreciate comments or edits explaining the difference or linking to useful documentation.


Mathjax strangely render back-tick in code blocks

I have a Tcl code gist embedded in my own website. The page uses MathJax 2.7.5 configured with "TeX-MML-AM_CHTML". However, MathJax strangely rendered the code between back-ticks in the comments of the code. For example, one line in the source code is (can be found here in gist):
# `testPrintFlag` : integer
The letters "int" were rendered by MathJax as an integral symbol (see here).
The gist code block appears like this, but it should look like this. How can I fix this?
Thanks!
The configuration file TeX-MML-AM_CHTML includes the AsciiMath input processor, and AsciiMath uses back-ticks as its math delimiters. So all your back-ticks will cause AsciiMath to process their contents as math. If you aren't using AsciiMath input, you probably want to use a different configuration file, such as TeX-AMS_CHTML, which only does TeX input (not MathML and AsciiMath, as in your original). That will also be faster, as it is a smaller file.
If you are using AsciiMath input, then you could configure it to use a different delimiter. See the documentation for details.
You could also configure MathJax to skip containers with certain class names (e.g., class="gist"). See the ignoreClass option for the asciimath2jax preprocessor at the link above. There is a similar one for the tex2jax preprocessor.

How does PDF securing work?

I'm curious how PDF securing works. I can lock a PDF file so that the system can't recognize its text or manipulate the file. Everything I found was about "how to lock/unlock", but nothing about "how does it work". Could anyone explain it to me? Thanks.
The OP clarified in a comment
I mean lock on text recognition or manipulation with PDF file. There should be nothing about cryptography imho just some trick.
There are some options, among them:
You can render the text as a bitmap and include that bitmap in the PDF
-> no text information.
Or you can embed the font in question using a non-standard encoding without using standard glyph names
-> text information in an unknown encoding.
E.g. cf. the PDF analysed in this answer.
A special case: make the encoding wrong only for a few characters, maybe just one, probably a digit. This way an unalert person thinks everything was extracted ok, and only when the data is to be used, the errors start screwing things up, something which especially in case of wrong digits is hard to fix. E.g. cf. the PDF analysed in this answer.
Or you can put text in structures where text extraction software or copy&paste routines usually don't look, like creating a large pattern tile containing the text for some text area and filling the area with the matching pattern color.
-> text information present but not seen by most extractors.
E.g. cf. this answer; the technique here is used to make the text of a watermark non-extractable.
Or you can put extra text all over the page but make it invisible, e.g. under images, drawn in rendering mode 3 (invisible), located in some disabled optional content group (layer), ... Text extractors often do not check whether the text they extract actually is visible.
-> text information present but polluted by garbage text bits.
...
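To make the non-standard-encoding option concrete, here is a toy Python model. It is not a PDF parser and the shift-by-one encoding is purely hypothetical; it only shows why a page can render correctly while a naive extractor returns garbage.

```python
import string

LETTERS = string.ascii_lowercase

# Hypothetical custom font encoding: the code stored in the page content
# for each glyph is the glyph's letter shifted by one position.
CODE_TO_GLYPH = {LETTERS[(i + 1) % 26]: LETTERS[i] for i in range(26)}
GLYPH_TO_CODE = {glyph: code for code, glyph in CODE_TO_GLYPH.items()}


def store_on_page(text):
    """Translate visible text into the codes written to the page content."""
    return "".join(GLYPH_TO_CODE.get(ch, ch) for ch in text)


def render(codes):
    """A viewer draws glyphs through the embedded font, so the page looks right."""
    return "".join(CODE_TO_GLYPH.get(c, c) for c in codes)


def naive_extract(codes):
    """An extractor with no usable mapping treats the codes as standard text."""
    return codes


stored = store_on_page("total: abc")
print(render(stored))         # what the reader sees
print(naive_extract(stored))  # what copy and paste would give
```

The "special case" above corresponds to making only one or two entries of the table wrong, so the extracted text looks plausible at first glance.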

How to highlight portions of a PDF file programmatically (eg. using command line)

I am interested in highlighting portions of a PDF programmatically, hopefully through a command line tool of sorts. My particular PDF file is not OCRed so the text is not searchable, but the particular places that I would like to highlight occur on every page in the same position. I was wondering if there is a tool to do this where I can input the rectangle positions in pixels into the command line tool and it would highlight the relevant portions for me.
Previous Findings
I have looked over the internet and found a few sites noting how to do this by searching for the text. Unfortunately that is not possible for me as my PDF does not have OCR.
I have searched stackexchange for similar questions and found
How to Highlight Text in PDF with commandline (windows)? and https://stackoverflow.com/questions/32713633/how-to-highlight-text-in-pdf-using-acrobat-reader-from-command-line but both were unanswered.
Potential Ideas
The first link had a possible lead with a given link to
Add comments to PDF files automagically with regular expressions
which uses ghostscript to include annotations. Is it possible to use ghostscript to highlight the pages in a similar fashion by coordinates?
The second link mentioned using command line options for the adobe acrobat/reader exe file, but searching the relevant manual for the command line switches does not show any highlighting options. It may be possible that Adobe does not support the highlight option through command line anymore, which would be unfortunate.
My last idea would be using AutoHotkey to create a macro that does an actual highlight for me using a GUI program, but that would be the last resort.
What do you all think? Any ideas on what to do, or things to check out? I am willing to program out a solution and can work out the solution on Windows or Linux if necessary. Thanks in advance.
I would have thought a Highlight annotation was what you wanted. Highlight annotations are a type of text markup annotation and as such take a set of QuadPoints which describe the bounding box(es) to apply the annotation type to.
Since you say you know the co-ordinates this would seem appropriate for your use. Of course, you will have to create the Annotation on every page, and you will have to learn how to program this with a pdfmark, but I believe it should work.
Note that the co-ordinates are in user space (generally 72 points to the inch), NOT pixels. Because PDF is not an image format, there is no concept of pixels, except for included images.
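A minimal sketch of the pixel-to-user-space conversion, assuming the rectangle was measured on a rendering of the page at a known resolution, with the pixel origin at the top-left and the default PDF origin at the bottom-left. The function names are hypothetical, and the pdfmark keys and the QuadPoints ordering used here (top-left, top-right, bottom-left, bottom-right) are assumptions to check against the pdfmark reference and your viewer before relying on them.

```python
def pixel_rect_to_points(left, top, right, bottom, page_height_px, dpi):
    """Convert a pixel rectangle (origin top-left) to PDF user space
    (origin bottom-left, 72 units per inch)."""
    llx = left * 72.0 / dpi
    urx = right * 72.0 / dpi
    # Flip the vertical axis: pixel rows grow downward, PDF y grows upward.
    lly = (page_height_px - bottom) * 72.0 / dpi
    ury = (page_height_px - top) * 72.0 / dpi
    return llx, lly, urx, ury


def highlight_pdfmark(rect, page):
    """Format an annotation pdfmark for one rectangle on one page (assumed keys)."""
    llx, lly, urx, ury = rect
    quad = [llx, ury, urx, ury, llx, lly, urx, lly]

    def fmt(nums):
        return " ".join("{:g}".format(n) for n in nums)

    return ("[ /Rect [{}] /Subtype /Highlight /QuadPoints [{}] "
            "/Color [1 1 0] /SrcPg {} /ANN pdfmark").format(fmt(rect), fmt(quad), page)


# Example: a US Letter page rendered at 300 dpi is 2550 x 3300 pixels.
rect = pixel_rect_to_points(150, 300, 450, 360, page_height_px=3300, dpi=300)
print(highlight_pdfmark(rect, page=1))
```

Since the highlight sits at the same position on every page, the same rectangle can be reused with a different page number for each page.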
There are quite a few officially unsupported command line parameters to acrobat or the acrobat reader (acrord32.exe in Windows).
See: https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/pdf_open_parameters.pdf
This includes a parameter to highlight with four integers at left,right,top,bottom that are in some unspecified units but with 0,0 at the top left of the page.
EXCEPT... I have been unable to get this to work.
I can pass in parameters to search and zoom but highlight never shows anything.
For instance:
start acrord32 /n /s /a "search=MS25441&zoom=300&page=1&highlight=0,55,0,65" floorplan1_ABM_cameras.pdf
This opens the file, searches for the string, and zooms to 300%, but no highlight appears no matter what coordinates I specify.

VIM Convert Text to URL with Search/Replace

I have a document that contains long filenames, followed by a hyphen, followed by a description of the contents of the file. The files are all PDFs. I am converting this document into a page on our website, so that it has the filename, which should be a link to the file, followed by the description of the file contents.
I'm fairly versed in the basics of VIM, but advanced search/replace is something I'm lacking in. What I'd like to do is convert each filename into a link to that filename. For example:
WebAdapt_Prod_Int_10.1_Install.IIS7.2008R2.pdf - Step by step instructions for installing ArcSDE 10.1 for Oracle on the ‘Test’ environment, including configuration notes.
Should convert to:
<a href="WebAdapt_Prod_Int_10.1_Install.IIS7.2008R2.pdf">WebAdapt_Prod_Int_10.1_Install.IIS7.2008R2.pdf</a> - Step by step instructions for installing ArcSDE 10.1 for Oracle on the ‘Test’ environment, including configuration notes.
There are roughly 30 of these documents, so going line-by-line would be time consuming (though by the time I get a response I'll probably already have done it). I'd just like to know how to do this for the next time I'm given a big text file that needs formatting.
Thanks in advance!
Try this:
:%s!\v^\S+\.pdf!<a href="&">&</a>!
Please note that the above doesn't try to do anything with HTML entities though. See the Vim Tips wiki for a possible solution if that's a concern.
Edit: The way this works:
:% - operate on the entire file
s!...!...! - substitute, using ! as the delimiter so that the / in </a> needs no escaping
\v - set "very magic" syntax for regexps
^\S+\.pdf - match one or more non-spaces at the beginning of the line, followed by .pdf
<a href="&">&</a> - replace with the link: & is the matched string (that is, the filename).
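For a file too large to handle comfortably in the editor, the same idea can be sketched in Python. This assumes the goal is to wrap each leading filename in an anchor tag whose href is the filename itself; the function name link_filenames is hypothetical. Unlike the Vim one-liner, this version escapes HTML special characters in the filename, but it leaves the description text untouched.

```python
import html
import re


def link_filenames(text):
    """Wrap a leading '<non-spaces>.pdf' on each line in an HTML link."""
    def to_link(match):
        # Escape &, <, > and quotes so the name is safe in an attribute.
        name = html.escape(match.group(0), quote=True)
        return '<a href="{0}">{0}</a>'.format(name)

    # re.MULTILINE makes ^ match at the start of every line, like :% in Vim.
    return re.sub(r"^\S+\.pdf", to_link, text, flags=re.MULTILINE)


sample = "Install.IIS7.2008R2.pdf - Step by step instructions\nNo filename here\n"
print(link_filenames(sample))
```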

Mathjax: how to deal with this strange behavior?

I am using the markdown editor and I have loaded Mathjax in all pages of my website.
I have realized that this line of latex works well:
$(u_1)$
However, this one does not work (basically latex does not work):
$(u_1,u_2)$
In order to make this work, I have to write something like this:
$(u\_1,u\_2)$
I have a similar problem here. This does not work:
$$M=\left(\begin{array}{cc}
a & b \\
c & d \\
\end{array}\right)$$
But this works:
$$M=\left(\begin{array}{cc}
a & b \\\\
c & d \\\\
\end{array}\right)$$
This is a common issue of mixing LaTeX-input with Markdown. From the MathJax documentation:
There cannot be HTML tags within the math delimiters (other than <br>) as TeX-formatted math does not include HTML tags.
And later:
Another source of difficulty is when MathJax is used in content management systems that have their own document processing commands that are interpreted before the HTML page is created. For example, many blogs and wikis use formats like Markdown to allow you to create the content of your pages. In Markdown, the underscore is used to indicate italics, and this usage will conflict with MathJax’s use of the underscore to indicate a subscript. Since Markdown is applied to the page first, it will convert your subscript markers into italics (inserting tags into your mathematics, which will cause MathJax to ignore the math).
As other answers on SO (see the link at the top) point out, some markdown parsers are more aware of TeX-like syntax than others.
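The behaviour in the question can be reproduced with a toy model of the Markdown pass. This is a deliberately simplified sketch, not a real Markdown parser: it assumes a processor that treats any pair of underscores as emphasis (even inside a word) and that turns a backslash-escaped character into the bare character. The function name toy_markdown is hypothetical.

```python
import re

# Either a backslash escape (\\, \_ or \*) or a pair of underscores with text between.
_PATTERN = re.compile(r"\\([\\_*])|_([^_]+)_")


def toy_markdown(text):
    """Apply escape handling and underscore emphasis, as a Markdown pass might."""
    def replace(match):
        if match.group(1) is not None:
            # An escaped character is emitted literally, minus the backslash.
            return match.group(1)
        return "<em>" + match.group(2) + "</em>"

    return _PATTERN.sub(replace, text)


for source in [r"$(u_1)$", r"$(u_1,u_2)$", r"$(u\_1,u\_2)$", r"a & b \\", r"a & b \\\\"]:
    print(source, "->", toy_markdown(source))
```

With a single underscore there is no pair to match, so the math reaches MathJax intact. With two underscores, the text between them is wrapped in a tag and MathJax skips the expression. Escaping each underscore, or doubling each backslash, makes the toy pass output exactly the TeX that MathJax expects.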
