How do I reference a label in one LaTeX file from another?

I have two files, a.tex and b.tex. In a.tex I have a label, \label{stuff}. In b.tex I need to refer to this label, \ref{stuff}.
I also have a main.tex file. Regardless of whether I use \include{a}\include{b} or \input{a}\input{b}, the reference does not appear in the PDF generated from main.tex.
The document class of main.tex is tufte-book.
I am doing something similar in my thesis. Using \hyperref[label]{text you want to show} in the text works for me. Don't forget to load the hyperref package.
\label{ab:PCL} % this is inside my abbreviations file
\hyperref[ab:PCL]{polycaprolactone} % this is in my text file


Minimal self-compiling to .pdf Rmarkdown file

I need to compose a simple R Markdown file with text, code, and the results of the executed code included in the resulting PDF file. I would prefer the source file to be executable and self-sufficient, avoiding the need for a makefile.
This is the best I have been able to achieve, and it is far from good:
#!/usr/bin/env Rscript
pandoc('hw_ch4.rmd', format='latex')
# TODO: how to NOT print the above commands to the resulting .pdf?
# TODO: how to avoid putting everything from here on in ""s?
# TODO: how to avoid mentioning the file name above?
# TODO: how to render special symbols, such as tilde, miu, sigma?
# Unicode character (U+3BC) not set up for use with LaTeX.
# See the inputenc package documentation for explanation.
# nano hw_ch4.rmd && ./hw_ch4.rmd && evince hw_ch4.pdf
4E1. In the model definition below, which line is the likelihood?
A: y_i is the likelihood, based on the expectation and deviation.
4M1. For the model definition below, simulate observed heights from the prior (not the posterior).
points <- 10
rnorm(points, mean=rnorm(points, 0, 10), sd=runif(points, 0, 10))
4M3. Translate the map model formula below into a mathematical model definition.
flist <- alist(
y tilda dnorm( mu , sigma ),
miu tilda dnorm( 0 , 10 ),
sigma tilda dunif( 0 , 10 )
What I eventually ended up using is the following header. At first it seemed neat, but later I realized:
+ it is indeed easy to compile in one step
- it is code duplication
- mixing an executable script and presentation data in one file is a security risk.
#!/usr/bin/env Rscript
argv <- commandArgs(trailingOnly=FALSE)
fname <- sub("--file=", "", argv[grep("--file=", argv)])
render(fname, output_format="pdf_document")
date: "compiled on: `r Sys.time()`"
The quit() line is supposed to guarantee that the rest of the file is treated as data. The <!--- and --> comments are to render the executable code as comments in the data interpretation. They are, in turn, hidden by the #s from the shell.

Why is python pulling in symbols instead of text from a pdf

I am trying to loop through a set of PDFs (all OCR'd) in a set of folders, search each PDF for key terms, and, if a PDF contains a certain term, save the folder name, file name, etc. This code works to an extent, except that it misses a few PDFs that contain the search terms. The reason is that when I read in a couple of the PDFs, some pages come out as gibberish (to me at least). For example, say I have read in a PDF named 'the_one.pdf'. It has 278 pages. When I search this document in Adobe Acrobat, I can find 'Search Term 1' on page 171, but when it is read with Python, Python outputs something like this:
˛ ˚˝ˆ˙)˛˚˜
˛ #%˛%
˛ ˛˚˛˝%
˛ ˛,
˛ -ˆ˚
˛ ˚˝.˝
Of course, it displays the majority of pages correctly, but for some reason it won't display a couple of them. For confidentiality reasons, I can't post the pdfs. Does anyone have any idea why this is happening?
Also, anything you can point out to speed up my code or make it more dynamic is helpful as well. Always looking to learn.
import PyPDF2
from os import walk
import os
import re
import csv
pdf_location = r'PDF Directory'
x = ['Search term 1', 'Search term 2', 'Search term 3', 'etc..']
key_terms = []
rule = []
filenamey = []
for dirpath, dirnames, filenames in walk(pdf_location):
    for filename in filenames:
        if filename.endswith('.pdf'):
            pdfFileObj = open(os.path.join(dirpath, filename), 'rb')
            pdfReader = PyPDF2.PdfFileReader(pdfFileObj, strict=False)
            num_pages = pdfReader.numPages
            count = 0
            text = ""
            # Concatenate the extracted text of every page
            while count < num_pages:
                pageObj = pdfReader.getPage(count)
                count += 1
                text += pageObj.extractText()
            for i in x:
                if re.search(i, text, re.IGNORECASE):
                    rulex = dirpath.split("Rule")[1]
                    filenamex = filename
                    key_termx = i  # record the term that matched (was x[0])
Parsing PDF is a complex task: the 1.7 spec has around 750 pages, and Adobe makes money with it, which is why it works for them.
PDFs internally have tables that hold
"how letters look" (glyphs)
"what Unicode letters those glyphs are mapped to" (you need that to copy and paste something from a PDF correctly)
and a cross-reference of which glyph maps to which Unicode code point. Fonts might also be (partly) embedded in the PDF.
That is (one reason) why PDFs can look 100% OK and could even be OCR'd OK, but if you copy and paste from a document that has a corrupt mapping between glyphs and Unicode code points, you only get gibberish.
I have heard that some programs even provide Unicode mappings for all glyphs that do not match up at all, on purpose (or through bad quality), to prevent copy and paste.
Bottom line: you can try to re-OCR the pages that give you gibberish, use Adobe Acrobat Pro (it has built-in OCR features) to extract their text, or just skip them.
You can try some other PDF-reading framework in case this one got something not quite right, but chances are slim if it almost always works and fails only for a few special PDFs.
I am just a novice with PDF, and there are more advanced people around who can chime in on this, but if you cannot share the PDF it is going to be hard to advise anything.
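As a rough illustration of the "just skip it" option, here is a minimal Python sketch that flags a page whose extracted text is mostly non-alphanumeric. The names readable_ratio and looks_like_gibberish and the 0.5 threshold are assumptions made up for this example, not part of PyPDF2. The heuristic also assumes English (ASCII) text, so it would need adjusting for other languages.

```python
import string

# Characters we expect to dominate a readable English page (assumption).
_EXPECTED = set(string.ascii_letters + string.digits)


def readable_ratio(text):
    """Return the share of non-whitespace characters that are ASCII letters or digits."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        # An empty page has no readable text at all.
        return 0.0
    return sum(ch in _EXPECTED for ch in visible) / len(visible)


def looks_like_gibberish(text, threshold=0.5):
    """Heuristic: treat a page as gibberish when too few characters are readable.

    The threshold is a hypothetical default; tune it on your own documents.
    """
    return readable_ratio(text) < threshold
```

In the loop from the question, one could call looks_like_gibberish(pageObj.extractText()) for each page and log the file name and page number for re-OCR instead of silently missing the search terms.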
Alternate approaches: Searching text in a PDF using Python?
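On the side question of making the search loop more dynamic, one possible sketch is to collect every matching term in a small helper rather than tracking one term at a time. The helper name find_terms is hypothetical. It also escapes each term with re.escape, which is a deliberate change from the question's code: a term such as 'etc..' is then matched literally instead of as a regular expression.

```python
import re


def find_terms(text, terms):
    """Return the terms that occur in text, ignoring case.

    Each term is escaped so that characters such as '.' or '+' are
    matched literally rather than as regular-expression syntax.
    """
    return [
        term
        for term in terms
        if re.search(re.escape(term), text, re.IGNORECASE)
    ]
```

With this helper, the inner loop of the question's script could become a single call such as find_terms(text, x), with one record saved per returned term.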

lhs2TeX produces invalid .tex file

I ran lhs2TeX on my f.lhs file ($ lhs2TeX f.lhs > f.tex). It completed successfully, creating f.tex and producing no errors.
However, when I then run $ pdflatex f.tex, I get (some output followed by) the following error:
Runaway argument?
\ignorespaces \SaveRestoreHook \column {B}{@{}>{\hspre }l<{\hspost }@\ETC.
! File ended while scanning use of \PT@scantoend.
<inserted text>
<*> core.tex
Why does the output of lhs2TeX fail to compile as valid TeX? It seems that lhs2TeX should not generate invalid TeX, even with invalid input.
I am not sure of the best way to debug this; the error message points to f.tex, but that is long generated output, not code I wrote. In my own code (f.lhs), I am not sure where to start, since lhs2TeX turns it into f.tex with no errors or output at all.
Any suggestions on how to approach debugging this?
I removed a bunch of the generated code while checking to ensure that what remained still caused the same error.
Here is what remains; this minimized version of f.tex causes the same error.
%% ODER: format == = "\mathrel{==}"
%% ODER: format /= = "\neq "
\newcommand{\tex}[1]{\text{\texfamily#1}} % NEU
%mathindent has to be defined
% This package provides two environments suitable to take the place
% of hscode, called "plainhscode" and "arrayhscode".
% The plain environment surrounds each code block by vertical space,
% and it uses \abovedisplayskip and \belowdisplayskip to get spacing
% similar to formulas. Note that if these dimensions are changed,
% the spacing around displayed math formulas changes as well.
% All code is indented using \leftskip.
% Changed 19.08.2004 to reflect changes in colorcode. Should work with
% CodeGroup.sty.
{{\parskip=0pt\parindent=0pt\par\vskip #1\noindent}}
% can be used, for instance, to redefine the code size, by setting the
% command to \small or something alike
% The command \sethscode can be used to switch the code formatting
% behaviour by mapping the hscode environment in the subst directive
% to a new LaTeX environment.
{\expandafter\let\expandafter\hscode\csname #1\endcsname
\expandafter\let\expandafter\endhscode\csname end#1\endcsname}
% "plain" mode is the proposed default.
% It should now work with \centering.
% This required some changes. The old version
% is still available for reference as oldplainhscode.
% Here, we make plainhscode the default environment.

Ignore the comment sign (%) in m-file within a string

In my code I have the following line:
fprintf(logfile,'Parameters: Size: %d\tH: %.4f\tF: %.1f\tI: %.3f\tR: %d\tSigma: %d\tDisp: %.1f\r\n',parameter_sets(ps,:));
which is too long, so I want to break it to:
fprintf(logfile,'Parameters: Size: %d\tH: %.4f\tF: %.1f\tI: %.3f\tR: ...
%d\tSigma: %d\tDisp: %.1f\r\n',parameter_sets(ps,:));
However, since the break is within a string, MATLAB sees the formatting %d sign in the second line as the start of a comment and ignores this line (and produces an error...).
So I tried to make it clearer with a [] that wraps the string:
fprintf(logfile,['Parameters: Size: %d\tH: %.4f\tF: %.1f\tI: %.3f\tR: ...
%d\tSigma: %d\tDisp: %.1f\r\n'],parameter_sets(ps,:));
but it did not help; it still interprets the second line as a comment. I also tried with and without the ellipsis (...) in different places, with no success.
So how can I write a line in a formatted way (i.e. a reasonable length) if it has a % sign in it?
Divide it into two strings on two lines like this:
fprintf(logfile,['Parameters: Size: %d\tH: %.4f\tF: %.1f\tI: %.3f\tR: ', ...
'%d\tSigma: %d\tDisp: %.1f\r\n'],parameter_sets(ps,:));
% notice the apostrophe and comma (',) before the ellipsis (...) at the end of the first line
% and the apostrophe (') at the start of the second line

Pdflatex error when using {-" ... "-} inline TeX comments in lhs2TeX

I have the following code block in my .lhs file which uses inline TeX comments:
main = print 0
However, after compiling with lhs2TeX, I get the following errors when compiling the generated .tex file:
! Missing $ inserted.
<inserted text>
l.269 \end{hscode}
I've inserted a begin-math/end-math symbol since I think
you left one out. Proceed, with fingers crossed.
! LaTeX Error: Bad math environment delimiter.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
l.269 \end{hscode}
Your command was ignored.
Type I <command> <return> to replace it with another command,
or <return> to continue without it.
! Missing $ inserted.
<inserted text>
l.269 \end{hscode}
I've inserted a begin-math/end-math symbol since I think
you left one out. Proceed, with fingers crossed.
! LaTeX Error: Bad math environment delimiter.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
l.269 \end{hscode}
Your command was ignored.
Type I <command> <return> to replace it with another command,
or <return> to continue without it.
! Missing $ inserted.
<inserted text>
l.269 \end{hscode}
I've inserted a begin-math/end-math symbol since I think
you left one out. Proceed, with fingers crossed.
! LaTeX Error: Bad math environment delimiter.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
l.269 \end{hscode}
Your command was ignored.
When I remove the " marks in the inline comment, the error disappears. Anyone know what's wrong?
P.S Here's the .tex file that lhs2TeX generates:
\documentclass{article}%% ODER: format == = "\mathrel{==}"
%% ODER: format /= = "\neq "
{<-> ssub * cmtt/m/it}{}
{<-> ssub * cmtt/bx/n}{}
\newcommand{\tex}[1]{\text{\texfamily#1}} % NEU
\newcommand{\anonymous}{\kern0.06em \vbox{\hrule\@width.5em}}
\newcommand{\rbind}{\mathbin{=\mkern-6.7mu<\!\!\!<}}% suggested by Neil Mitchell
%mathindent has to be defined
\newcommand{\onelinecommentchars}{\quad-{}- }
\newcommand{\hsindent}[1]{\quad}% default is fixed indentation
\newcommand{\Todo}[1]{$\langle$\textbf{To do:}~#1$\rangle$}
% This package provides two environments suitable to take the place
% of hscode, called "plainhscode" and "arrayhscode".
% The plain environment surrounds each code block by vertical space,
% and it uses \abovedisplayskip and \belowdisplayskip to get spacing
% similar to formulas. Note that if these dimensions are changed,
% the spacing around displayed math formulas changes as well.
% All code is indented using \leftskip.
% Changed 19.08.2004 to reflect changes in colorcode. Should work with
% CodeGroup.sty.
{{\parskip=0pt\parindent=0pt\par\vskip #1\noindent}}
% can be used, for instance, to redefine the code size, by setting the
% command to \small or something alike
% The command \sethscode can be used to switch the code formatting
% behaviour by mapping the hscode environment in the subst directive
% to a new LaTeX environment.
{\expandafter\let\expandafter\hscode\csname #1\endcsname
\expandafter\let\expandafter\endhscode\csname end#1\endcsname}
% "compatibility" mode restores the non-polycode.fmt layout.
% "plain" mode is the proposed default.
% It should now work with \centering.
% This required some changes. The old version
% is still available for reference as oldplainhscode.
% Here, we make plainhscode the default environment.
% The arrayhscode is like plain, but makes use of polytable's
% parray environment which disallows page breaks in code blocks.
% The mathhscode environment also makes use of polytable's parray
% environment. It is supposed to be used only inside math mode
% (I used it to typeset the type rules in my thesis).
% texths is similar to mathhs, but works in text mode.
% The framed environment places code in a framed box.
% The inlinehscode environment is an experimental environment
% that can be used to typeset displayed code inline.
\def\nextline{}}{\) }%
% The joincode environment is a separate environment that
% can be used to surround and thereby connect multiple code
% blocks.
\begin{document}\section{}Precis 0
\section{}Precis 1
The {-" ... "-} construct is quite low-level. It drops you in the environment TeX currently is in, which happens to be math mode already. So the solution to your problem is simple. Write the code to be inserted as if you are in math mode.
A different option is to use a normal comment, but make the comment characters invisible using \invisiblecomments. Normal comments are typeset as text by lhs2TeX.
The following complete lhs2TeX document demonstrates both options:
%include polycode.fmt
% Assume you are in math mode already:
main = print 0
% This works, too:
main = print 0
{- $\langle$Link$\rangle$ -}
