CLI PDF viewer for Linux [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
Hey, for quite a while now I have been looking for a PDF viewer for the command line.
As I like to work without X on Linux, and often work on a remote machine, I would like to have a tool to read PDFs. There are quite a lot of really good graphical programs (evince, okular, acroread, ...) to do the job, so I figured there should be at least one decent text-mode tool. But I don't even know of a crappy one!
Currently, I either start X only to read PDFs, or use pdftohtml+lynx.
However, the latter does not produce very good output, and most documents are just unreadable, especially if they contain mathematical formulas.
Google is full of people either saying it's not possible or suggesting the pdftohtml approach.
I realise this is not exactly a programming question, but I am currently considering starting a project to implement such a program, unless there already is a good one out there.
Thanks for any suggestions.

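Hi, I don't think you need to write a program just to read PDF files in console mode, because the less command can already do it for you. On many distributions less is set up with an input preprocessor (lesspipe) that converts the PDF to text first; without it you will just see raw binary data. So use it and enjoy: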
less "the name of pdf file"

Ok, you asked to know even "crappy" ones. Here are two (decide yourself about their respective crappiness):
First: Ghostscript's txtwrite output device
gs \
-dBATCH \
-dNOPAUSE \
-sDEVICE=txtwrite \
-sOutputFile=- \
/path/to/your/pdf
Second: XPDF's pdftotext CLI utility (better than Ghostscript):
pdftotext \
-f 13 \
-l 17 \
-layout \
-opw supersecret \
-upw secret \
-eol unix \
-nopgbrk \
/path/to/your/pdf \
- | less
This will display the page range 13 (first page) to 17 (last page), preserve the layout of a double-password protected named PDF file (using user and owner passwords secret and supersecret), with Unix EOL convention, but without inserting pagebreaks between PDF pages, piped through less...
pdftotext -h displays all available commandline options.
Of course, both tools only work for the text parts of PDFs (if they have any). Oh, and mathematical formulas also won't work too well... ;-)
Edit: I had mis-typed the command above (originally using pdftops instead of pdftotext).
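A minimal sketch of wrapping the pdftotext pipeline above in a small script, assuming pdftotext and less are installed and on PATH. The function names build_pdftotext_cmd and view_pdf are made up for illustration; only the -layout, -nopgbrk, -f and -l flags from the answer are used.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, first_page=None, last_page=None):
    """Build a pdftotext argument list that writes the text to stdout ('-')."""
    cmd = ["pdftotext", "-layout", "-nopgbrk"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    # The input file comes first, then '-' as the output file (stdout).
    cmd += [pdf_path, "-"]
    return cmd


def view_pdf(pdf_path, first_page=None, last_page=None):
    """Convert the PDF to text and page through it with less."""
    if shutil.which("pdftotext") is None or shutil.which("less") is None:
        raise RuntimeError("pdftotext and less must be on PATH")
    cmd = build_pdftotext_cmd(pdf_path, first_page, last_page)
    text = subprocess.run(cmd, check=True, capture_output=True).stdout
    # less reads the text from stdin and the keyboard from the terminal.
    subprocess.run(["less"], input=text, check=False)
```

Calling view_pdf("/path/to/your/pdf", 13, 17) would page through pages 13 to 17, as in the command above. Like the command itself, this only helps with the text parts of a PDF.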

There is also the green PDF viewer. There is a demo on YouTube.

By the way, I'm always in the same situation, and I use mc (Midnight Commander), which handles text PDFs very well...
Just view the file (F3) in mc

Try fbgs, which should be provided by the fbi or fbida package depending on your distribution. Note that it only works in real terminals (ttys).
http://web.archive.org/web/20150316143120/http://linuxers.org/howto/how-open-pdf-files-linux-console-using-fbgs-framebuffer-pdf-viewer

fbpdf is a framebuffer pdf viewer.
There is also a fork, jfbpdf, but at the moment I am not able to get it working.

This would only work if your PDF document is structured, i.e. it is a tagged PDF document.
This is required to get the correct reading-order of the text objects in the document.
Tagged PDF documents also allow you to re-flow the document, though I am not aware of any tool doing that with command line output.

Related

Is there a way to get, let's say, the top 5 comments of a YouTube video with a command-line utility on Linux?

I want to get the top comments of youtube videos.
Is there a way to do this with a scriptable command-line utility, or do I need to use curl and the API?
I thought of using youtube-dl, but there seems to be no such function.
Is there a similar tool capable of doing this?
Also I read some older questions, which suggested that there is no way of doing this (except by getting all comments and searching them locally), since it is not implemented in the API.
So I was wondering if this changed recently.
question from 2011
question from 2015
The API doesn't order comments into 'top comments' (unless you mean top-level comments, which is default) but you can use wget and parse the output file.
wget -O output "https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&maxResults=5&videoId=[VIDEO_ID]&order=time&textFormat=plainText&key=[API_KEY]"
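A rough sketch of the "parse the output" step, assuming the response is JSON with an items list in which each thread carries its text under snippet.topLevelComment.snippet.textDisplay (my understanding of the commentThreads response, not something stated above). The function names are made up for illustration, and the query parameters are the ones from the wget call.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def build_url(video_id, api_key, max_results=5):
    """Build the same request URL as the wget call above."""
    params = {
        "part": "snippet",
        "maxResults": max_results,
        "videoId": video_id,
        "order": "time",
        "textFormat": "plainText",
        "key": api_key,
    }
    return API_URL + "?" + urlencode(params)


def comment_texts(response):
    """Pull the top-level comment text out of each returned thread."""
    texts = []
    for item in response.get("items", []):
        # Assumed response layout; adjust if the API returns something else.
        snippet = item["snippet"]["topLevelComment"]["snippet"]
        texts.append(snippet["textDisplay"])
    return texts


def fetch_comments(video_id, api_key, max_results=5):
    """Download one page of comment threads and return their texts."""
    with urlopen(build_url(video_id, api_key, max_results)) as resp:
        return comment_texts(json.load(resp))
```

fetch_comments needs a valid API key and network access; comment_texts can also be run on the file saved by wget after loading it with json.load.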

Edit an applescript file from a linux computer [closed]

Closed 8 years ago.
Though applescript appears to be a scripting language like any other (wikipedia/applescript), for reasons I don't understand it seems these scripts are often saved as binaries. It seems like this isn't an issue for someone working on a Mac with a mac-based text editor that can open these scripts into a plain-text format where they can be edited and read, but for the rest of us, we just see gibberish. For instance, Github has many examples of .scpt files committed to repositories instead of/without the plain-text equivalent (a bit of Googling suggests this would be a .applescript file instead)
Question: Is there an open-source tool that can parse and serialize these binaries so that they can be viewed/edited in a standard plain text editor and saved back as .scpt?
(My context: I'd like to provide a user-friendly, OS-native, button-click way to launch my application on a Mac, rather than tell users to open a bash terminal and type stuff.)
Edit: I only have access to a Linux machine; I don't own a Mac.
Instead of trying to create an AppleScript on a non-Mac, what you can do is simply name your shell script file with a .command suffix and make sure that it has execute POSIX permissions for the user. The user can then double-click the file in the Finder to execute your script instead of having to enter Terminal commands.
If you would like to take advantage of AppleScript commands within your shell script file to add some simple GUI functionality, you can use the osascript command.
BTW, for reference: on a Mac the application "Script Editor" (or "AppleScript Editor" on older systems) is generally used to create AppleScripts. It provides several save options - the .scpt binary and .applescript plain text files you noted as well as .scptd script bundles and .app standard, double-clickable applications.
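A small sketch of producing such a .command file from a Linux machine. The helper name write_command_file and the example path are hypothetical; the only things taken from the answer are the .command suffix and the execute permission for the user.

```python
import os
import stat


def write_command_file(path, shell_command):
    """Write a double-clickable .command launcher and mark it executable."""
    if not path.endswith(".command"):
        raise ValueError("the file name must end in .command")
    # Force Unix line endings so the shebang line is read correctly.
    with open(path, "w", newline="\n") as f:
        f.write("#!/bin/sh\n")
        f.write(shell_command + "\n")
    # Add the execute bit for the owning user, keeping the other bits.
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)
    return path
```

For example, write_command_file("launch.command", "osascript -e 'display dialog \"Starting\"'") would create a launcher that shows a dialog through osascript. The execute bit has to survive whatever you use to ship the file, so check it again after unpacking on the Mac.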

Convert Microsoft Office documents to Text [closed]

Closed 5 years ago.
I'm looking for a library (or command line tool) to turn MS Office documents into either plaintext or HTML (for conversion to text).
It must run on Linux (not via Wine!).
I found antiword, but the last release was 2005, so it won't read the new Office 2007 formats.
I need it to read Word, Excel and PowerPoint documents.
The new office 2007 format is just (ZIP) compressed XML.
All the text (in at least the .docx format) is located, once you decompress the file, in the document.xml file in the word folder. Strip all the XML tags from it and you'll get the text. You'll lose the formatting, no doubt, but if you want to do text indexing or something like it, the formatting isn't relevant anyway. The order is preserved.
I haven't analyzed Excel and PowerPoint, but the approach should be similar. Excel might be trickier, depending on how the cells are stored in the XML file.
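The unzip-and-strip approach for .docx can be sketched with the Python standard library alone. This is an illustration rather than a complete converter: it assumes the text lives in w:t elements grouped into w:p paragraphs of word/document.xml, and it ignores headers, footnotes and anything else stored in other parts of the archive. The function name docx_to_text is made up.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path_or_file):
    """Return the plain text of a .docx, one line per paragraph."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join the pieces in order.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

print(docx_to_text("report.docx")) would then dump the text of a (hypothetical) report.docx in document order, with all formatting lost.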
The Apache POI library can extract text from office formats. This is used by Tika in Lucene. Tika can be executed as a command line tool:
curl http://.../document.doc \
| java -jar tika-app-x.y.jar --text \
| grep -q keyword
PyODConverter for automating OpenOffice. Use it to do the conversions.
OONinja has an example of converting DOC to PDF, but any import or export format that OpenOffice supports should work. It also has the advantage of running headless if required.
Other options include AbiWord or, if you really just want a command-line tool, wvWare, though I don't think it supports .docx.
You can use Autonomy Keyview with the appropriate licence to use in your application. It seems to be extremely powerful and can extract text from almost everything; we use it to identify text within arbitrary format files.
I've no idea what the licensing terms are, but they're available from your account manager :)

Get started with Latex on Linux [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
Impressed by is-latex-worth-learning-today, and by the many how-tos for Windows:
How do you get someone started with LaTeX on Linux?
How do you generate a PDF out of it and give up the OOo word processor?
Update:
Thanks for all the suggestions given here. I was able to create an awesome ppt using the Beamer class: http://github.com/becomingGuru/gids-django-ppt. I found this approach far better than using PowerPoint and the like.
Those interested may check out the TEX file, with many custom commands, and the corresponding presentation.
First you'll need to install it:
If you're using a distro which packages LaTeX (almost all will do) then look for texlive or tetex. TeX Live is the newer of the two, and is replacing tetex on most distributions now.
If you're using Debian or Ubuntu, something like:
apt-get install texlive
..will get it installed.
RedHat or CentOS need:
yum install tetex
Note: This needs root permissions, so either use su to switch user to root, or prefix the commands with sudo, if you aren't already logged in as the root user.
Next you'll need to get a text editor. Any editor will do, so whatever you are comfortable with. You'll find that advanced editors like Emacs (and vim) add a lot of functionality and so will help with ensuring that your syntax is correct before you try and build your document output.
Create a file called test.tex and put some content in it, say the example from the LaTeX primer:
\documentclass[a4paper,12pt]{article}
\begin{document}
The foundations of the rigorous study of \emph{analysis}
were laid in the nineteenth century, notably by the
mathematicians Cauchy and Weierstrass. Central to the
study of this subject are the formal definitions of
\emph{limits} and \emph{continuity}.
Let $D$ be a subset of $\mathbf{R}$ and let
$f \colon D \to \mathbf{R}$ be a real-valued function on
$D$. The function $f$ is said to be \emph{continuous} on
$D$ if, for all $\epsilon > 0$ and for all $x \in D$,
there exists some $\delta > 0$ (which may depend on $x$)
such that if $y \in D$ satisfies
\[ |y - x| < \delta \]
then
\[ |f(y) - f(x)| < \epsilon. \]
One may readily verify that if $f$ and $g$ are continuous
functions on $D$ then the functions $f+g$, $f-g$ and
$f.g$ are continuous. If in addition $g$ is everywhere
non-zero then $f/g$ is continuous.
\end{document}
Once you've got this file you'll need to run latex on it to produce some output (as a .dvi file to start with, which is possible to convert to many other formats):
latex test.tex
This will print a bunch of output, something like this:
=> latex test.tex
This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4)
entering extended mode
(./test.tex
LaTeX2e <2003/12/01>
Babel <v3.8d> and hyphenation patterns for american, french, german, ngerman, b
ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, esperanto, e
stonian, finnish, greek, icelandic, irish, italian, latin, magyar, norsk, polis
h, portuges, romanian, russian, serbian, slovak, slovene, spanish, swedish, tur
kish, ukrainian, nohyphenation, loaded.
(/usr/share/texmf/tex/latex/base/article.cls
Document Class: article 2004/02/16 v1.4f Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/size12.clo))
No file test.aux.
[1] (./test.aux) )
Output written on test.dvi (1 page, 1508 bytes).
Transcript written on test.log.
..don't worry about most of this output -- the important part is the Output written on test.dvi line, which says that it was successful.
Now you need to view the output file with xdvi:
xdvi test.dvi &
This will pop up a window with the beautifully formatted output in it. Hit `q' to quit this, or you can leave it open and it will automatically update when the test.dvi file is modified (so whenever you run latex to update the output).
To produce a PDF of this you simply run pdflatex instead of latex:
pdflatex test.tex
..and you'll have a test.pdf file created instead of the test.dvi file.
After this is all working fine, I would suggest going to the LaTeX primer page and running through the items on there as you need features for documents you want to write.
Future things to consider include:
Use tools such as xfig or dia to create diagrams. These can be easily inserted into your documents in a variety of formats. Note that if you are creating PDFs then you shouldn't use EPS (encapsulated postscript) for images -- use pdf exported from your diagram editor if possible, or you can use the epstopdf package to automatically convert from (e)ps to pdf for figures included with \includegraphics.
Start using version control on your documents. This seems excessive at first, but being able to go back and look at earlier versions when you are writing something large can be extremely useful.
Use make to run latex for you. When you start on having bibliographies, images and other more complex uses of latex you'll find that you need to run it over multiple files or multiple times (the first time updates the references, and the second puts references into the document, so they can be out-of-date unless you run latex twice...). Abstracting this into a makefile can save a lot of time and effort.
Use a better editor. Something like Emacs + AUCTeX is highly competent. This is of course a highly subjective subject, so I'll leave it at that (that and that Emacs is clearly the best option :)
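The "run latex twice" point above can be automated. The answer suggests make; as a stand-in, here is a small Python sketch that reruns pdflatex until the log stops asking for another pass. It assumes pdflatex is on PATH and only looks for the standard "Rerun to get cross-references right" warning, so bibliographies (bibtex or biber runs) are not handled. The function names are made up.

```python
import subprocess
from pathlib import Path


def needs_rerun(log_text):
    """True if LaTeX reported that labels changed and another pass is needed."""
    return "Rerun to get cross-references right" in log_text


def build_pdf(tex_file, max_runs=4):
    """Run pdflatex until references settle; return the number of runs used."""
    tex_path = Path(tex_file)
    log_path = tex_path.with_suffix(".log")
    for run in range(1, max_runs + 1):
        # Run inside the document's directory so the .aux and .log land there.
        subprocess.run(
            ["pdflatex", "-interaction=nonstopmode", tex_path.name],
            cwd=tex_path.parent,
            check=True,
        )
        if not needs_rerun(log_path.read_text(errors="replace")):
            return run
    return max_runs
```

build_pdf("test.tex") would produce test.pdf next to the source. A makefile has the extra advantage of rebuilding only when a source file has changed.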
To get started with LaTeX on Linux, you're going to need to install a couple of packages:
You're going to need a LaTeX distribution. This is the collection of programs that comprise the (La)TeX computer typesetting system. The standard LaTeX distribution on Unix systems used to be teTeX, but it has been superseded by TeX Live. Most Linux distributions have installation packages for TeX Live; see, for example, the package database entries for Ubuntu and Fedora.
You will probably want to install a LaTeX editor. Standard Linux text editors will work fine; in particular, Emacs has a nice package of (La)TeX editing macros called AUCTeX. Specialized LaTeX editors also exist; of those, Kile (KDE Integrated LaTeX Environment) is particularly nice.
You will probably want a LaTeX tutorial. The classic tutorial is "A (Not So) Short Introduction to LaTeX2e," but nowadays the LaTeX wikibook might be a better choice.
I would recommend starting with LyX; with it you can use LaTeX just as easily as OOo Writer.
It also lets you step deeper into LaTeX by manually adding LaTeX code to your document.
A PDF is just one click away after installation. LyX is cross-platform.
It depends on your Linux distribution and your preference of editors etc., but I would recommend starting with Kile (a KDE app), as it is easy to learn and installing it should install most of the packages needed for LaTeX and PDF generation. Just have a look at the screenshots.
If you use Ubuntu or Debian, I made a tutorial easy to follow: Install LaTeX on Ubuntu or Debian. This tutorial explains how to install LaTeX and how to create your first PDF.
LaTeX comes with most Linux distributions in the form of the teTeX distribution. Find all packages with 'teTeX' in the name and install them.
Most editors such as vim or emacs come with TeX editing modes. You can also get WYSIWYG-ish front-ends (technically WYSIWYM), of which perhaps the best known is LyX.
The best quick intro to LaTeX is Oetiker's 'The not so short intro to LaTeX'.
LaTeX works like a compiler. You compile the LaTeX document (which can include other files), which generates a file called a .dvi (device independent). This can be post-processed to various formats (including PDF) with various post-processors.
To produce a PDF, use dvips with the flag -Ppdf (IIRC - I don't have a makefile to hand) to produce a PS file with font rendering set up for conversion to PDF. The PDF conversion can then be done with ps2pdf or Distiller (if you have it).
The best format for including graphics in this environment is eps (Encapsulated Postscript) although not all software produces well-behaved postscript. Photographs in jpeg or other formats can be included using various mechanisms.
I would personally use a complete editing package such as:
TeXworks
TeXstudio
Then I would install "MiKTeX" as the compiling package, which allows you to generate a PDF from your document using the pdfLaTeX compiler.
yum -y install texlive
was not enough for my CentOS distro to get the latex command.
This site https://gist.github.com/melvincabatuan/350f86611bc012a5c1c6 contains additional packages. In particular:
yum -y install texlive texlive-latex texlive-xetex
was enough but the author also points out these as well:
yum -y install texlive-collection-latex
yum -y install texlive-collection-latexrecommended
yum -y install texlive-xetex-def
yum -y install texlive-collection-xetex
Only if needed:
yum -y install texlive-collection-latexextra

Command line program to create website screenshots (on Linux) [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
What is a good command line tool to create screenshots of websites on Linux? I need to automatically generate screenshots of websites without human interaction. The only tool that I found was khtml2png, but I wonder if there are others that aren't based on khtml (i.e. have good JavaScript support, ...).
A little more detail might be useful...
Start a firefox (or other browser) in an X session, either on your console or using a vncserver. You can use the --height and --width options to set the size of the window to full screen. Another firefox command can be used to set the URL being displayed in the first firefox window. Now you can grab the screen image with one of several commands, such as the "import" command from the Imagemagick package, or using gimp, or fbgrab, or xv.
#!/bin/sh
# start a server with a specific DISPLAY
vncserver :11 -geometry 1024x768
# point firefox and import at this vnc session
export DISPLAY=:11
# start firefox in the background so the script can continue
firefox &
sleep 10
# read URLs from a data file in a loop
count=1
while read -r url
do
# send URL to the running firefox session
firefox "$url"
# take a picture after waiting a bit for the load to finish
sleep 5
import -window root "image$count.jpg"
count=$((count + 1))
done < url_list.txt
# clean up when done
vncserver -kill :11
Try the nice small tool CutyCapt, which depends only on Qt and QtWebkit. ;)
Have a look at PhantomJS, which seems to be a free scriptable WebKit engine that runs on Linux, OSX and Windows. I've not used it since we currently use Browshot (a commercial solution), but when all our credits run out, we will seriously have a look at it (since it's free and can run on our servers).
scrot is a command line tool for taking screenshots. See the man page and this tutorial.
You might also want to look at scripting the browser. There are firefox add-ons that take screenshots such as screengrab (which can capture the entire page if you want, not just the visible bit) and you could then script the browser with greasemonkey to take the screenshots.
See Webkit2png.
I think this is what I used in the past.
Edit: I discovered I haven't used the above, but found this page with reviews of many different programs and techniques.
I know it's not a command line tool, but you could easily script up something to use http://browsershots.org/. It's not that useful for applications not hosted on external IPs.
A great tool nonetheless.
I don't know of anything custom built, I'm sure there could be something done with the gecko engine to render to a png file instead of the screen ...
Or, you could fire up firefox in full screen mode in a dedicated VNC server instance and use a screenshot grabber to take the screenshot. Fullscreen = minimal chrome, VNC server instance = no visible UI + you can choose your resolution.
Use xinit with Xvnc as the X server to do this - you'll need to read all the manpages.
Downsides are that the screenshot is always the same size, doesn't resize according to the web page ...
There is the import command, but you'll need X, and a little bash script that opens the browser window, then takes the screenshot and closes the browser.
You can find more information here, or just typing import --help in a shell ;)
http://khtml2png.sourceforge.net/
The deb file
http://sourceforge.net/projects/khtml2png/files/khtml2png2/2.7.6/khtml2png_2.7.6_i386.deb/download
worked on my Ubuntu after installing libkonq4 ... but you may have to cover other dependencies.
I think javascript support may be better now!
Stephan
This is not for the command line, but for batch operation on a larger set of URLs you can use Firefox with its add-on FireShot (licensed version?).
Open tabs for all URLs in your set (e.g. "open tabs for all bookmarks in this folder...").
Then in FireShot launch "Capture all tabs".
In the edit window, call "select all shots -> save all shots".
If you set the screenshot properties (size, file format, etc.) beforehand, you end up with a nice set of screenshot files.
Steffen
