SublimeText3 + Pandown + pandoc: includes_paths not working

I'm using ST3 + Pandown + pandoc to convert Markdown to PDF. I want to use Pandown's includes_paths setting to avoid typing the path to my image directory every time, but I haven't been able to get it to work. Here's an MWE:
I have a directory structure as follows:
text.markdown
test/img.pdf
In text.markdown, I have:
![](img.pdf)
I've set includes_paths as follows in Pandown.sublime-settings:
"includes_paths":
[
"test/"
],
But no dice. I've also tried an absolute path, ./test, and test. Any ideas?

I think Pandown's includes_paths only applies to Pandoc's --include-in-header, --include-before-body, and --include-after-body options, not to image locations and the like.
From Pandown.sublime-settings about includes_paths:
Pandoc apparently doesn't search for values for its --include
arguments anywhere but the working directory, which makes
working from a standard stylesheet or standard script
sort of tedious.
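In other words, includes_paths is only consulted when Pandown hands files to pandoc's --include-in-header, --include-before-body, or --include-after-body flags. A sketch of that intended use (the directory and file name here are hypothetical):
"includes_paths":
[
    "~/styles/"
],
With this, an argument such as --include-in-header=header.html can resolve header.html against ~/styles/ even though pandoc itself only searches the working directory.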

A workaround: load the graphicx package via the YAML header and set \graphicspath in the body:
---
header-includes:
- \usepackage{graphicx}
---
\graphicspath{{test/}}
![](img.pdf)
Pandoc will say that it can't find img.pdf, but the image will be present in the final PDF.
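For reference, a minimal invocation that exercises the workaround outside Sublime (assuming pandoc and a LaTeX engine are installed):
pandoc text.markdown -o text.pdf
The warning comes from pandoc's own resource check; the LaTeX run that actually produces the PDF resolves img.pdf through \graphicspath.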

WGET - how to download embedded PDFs that have a download button from a text file URL list? Is it possible?

Happy New Year!
I wanted to see if anybody has ever successfully downloaded embedded PDF files from multiple URLs contained in a .txt file for a website.
For instance:
I tried several combinations of wget -i urllist.txt (which downloads all the HTML files perfectly); however, it doesn't also grab each HTML file's embedded PDF, which has a ?xxxxx slug on the end of the .pdf name.
The exact example of this obstacle is the following:
I have placed all 2 pages of links from this dataset into urllist.txt:
https://law.justia.com/cases/washington/court-of-appeals-division-i/2014/
One example URL within this dataset:
https://law.justia.com/cases/washington/court-of-appeals-division-i/2014/70147-9.html
The embedded PDF link is the following:
https://cases.justia.com/washington/court-of-appeals-division-i/2014-70147-9.pdf?ts=1419887549
The PDF files are actually named like "2014-70147-9.pdf?ts=1419887549"; the ?ts=xxxxxxxxxx part is different for each one.
The URL list contains 795 links. Does anyone have a successful method to download every .html in my urllist.txt while also downloading each page's accompanying .pdf?ts=xxxxxxxxxx file to go with it?
Thank you!
~ Brandon
Try using the following:
wget --level 1 --recursive --span-hosts --accept-regex 'https://law.justia.com/cases/washington/court-of-appeals-division-i/2014/.*\.html|https://cases.justia.com/washington/court-of-appeals-division-i/.*\.pdf.*' --input-file=urllist.txt
Details about the --level, --recursive, --span-hosts, --accept-regex, and --input-file options can be found in the wget documentation at https://www.gnu.org/software/wget/manual/html_node/index.html.
You will also need to know how regular expressions work; you can start at https://www.grymoire.com/Unix/Regular.html
You are looking for a web scraper. Be careful not to break any rules if you ever use one.
You could also process the content you have received through wget using some string manipulation in a bash script.
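For instance, a minimal Python sketch of that post-processing idea (assuming the cases.justia.com PDF links appear verbatim in each page's HTML source; urllist.txt is the asker's URL file):

import re
import urllib.request

# Scan each page listed in urllist.txt for cases.justia.com PDF links
# and download every PDF found, next to the pages.
PDF_RE = re.compile(r'https://cases\.justia\.com/\S+?\.pdf\?ts=\d+')

with open('urllist.txt') as f:
    pages = [line.strip() for line in f if line.strip()]

for page_url in pages:
    html = urllib.request.urlopen(page_url).read().decode('utf-8', 'replace')
    for pdf_url in set(PDF_RE.findall(html)):
        # Drop the ?ts=... query string for the local filename.
        name = pdf_url.rsplit('/', 1)[-1].split('?', 1)[0]
        urllib.request.urlretrieve(pdf_url, name)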

po2html: missing HTML template

I have no coding experience, but I need to convert a .po file into an .html file, so I am learning how to use the Translate Toolkit.
I am using the Translate Toolkit documented here: http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/html2po.html
I managed to get as far as installing Python and the toolkit.
I have gotten as far as the output below, and am unsure how to proceed.
The folder xh contains example.po (which has the relevant msgids and msgstrs) and example.html (which is blank).
C:\Users\bob>po2html -i C:\Users\bob\Desktop\xh -o C:\Users\bob\Desktop\xh
processing 1 files...
po2html: WARNING: Error processing: input C:\Users\Oh\Desktop\xh\example.po, output C:\Users\Oh\Desktop\xh\example.html, template None: must have template file for HTML files
[###########################################] 100%
How do I create this HTML template, and what does it look like? (My only knowledge is that you can create an HTML file in Notepad, but I'm not sure what to put in it, so example.html is blank at the moment.)
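For context on what the error is asking for: po2html works in reverse of html2po, re-inserting the .po file's translations into the original source-language HTML document, and that original document is the "template". So the template isn't something you write from scratch in Notepad; it's the HTML file the .po was extracted from. Assuming you have that file (call it template.html), the usual invocation with the toolkit's --template/-t option would look like this (a sketch, not a tested command):
po2html -t template.html -i example.po -o example.html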

How Can I Specify a Directory without Using the Full Directory Name? - Python 3.4

I don't want to specify the full path to a folder or file within my program, because if a user changes the installation folder, the program will no longer function properly. I've seen that in HTML you can use something like ./folder/directory/name and it works fine. Is there a way to do something similar in Python?
From https://docs.python.org/3/reference/datamodel.html
__file__ is the pathname of the file from which the module was loaded
You may find it helpful to apply os.path.abspath() to '.' or __file__.
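A minimal sketch of both suggestions (the folder/directory/name path is just the example from the question):

import os

# Build a path relative to this script's own location - stable even if
# the user moves the installation folder, as long as the data moves too.
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
data_path = os.path.join(SCRIPT_DIR, 'folder', 'directory', 'name')

# Or relative to the current working directory:
cwd_path = os.path.abspath('.')

print(data_path)
print(cwd_path)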

scons: How to deal with dynamic targets?

I'm trying to automate converting a PDF to a PNG file with SCons. The tool used for the conversion is convert from ImageMagick.
Here are the raw command lines:
convert input.pdf temp/temp.png
convert temp/*.png -append output.png
The first command generates one PNG file for each page of the PDF, so the target of the first command is a dynamic file list.
Here's the SConstruct file I'm working on:
convert = Builder(action=[
    Delete("${TARGET.dir}"),
    Mkdir("${TARGET.dir}"),
    "convert $SOURCE $TARGET"])
combine = Builder(action="convert $SOURCE -append $TARGET")
env = Environment(BUILDERS={"Convert": convert, "Combine": combine})
pdf = env.PDF("input.tex")
pngs = env.Convert("temp/temp.png", pdf)  # I don't know how to specify the target on this line
png = env.Combine('output.png', pngs)
Default(png)
The line pngs = env.Convert("temp/temp.png", pdf) is actually wrong, since the target is a list of files whose number I don't know before env.Convert is executed; as a result, the final output.png contains only the first page of the PDF.
Any hint is appreciated.
UPDATE:
I just found that I can use the command convert input.pdf -append output.png to avoid the two-step conversion.
Still, I'm curious how to handle the scenario where the intermediate temporary file list is unknown beforehand and a dynamic target list is required.
If you want to know how to handle the original (convert and combine) situation you proposed, I would suggest creating a builder with an SCons emitter. The emitter allows you to modify the list of source and target files. This works nicely for generated files that don't exist on a clean build.
As you mentioned, the convert step generates multiple targets; the trick is that you need to be able to "calculate" those targets in the emitter based on the source. For example, I recently created a wsdl2java builder and was able to do some simple WSDL parsing in the emitter to calculate all of the target Java files to be generated (the source being the WSDL).
Here is a general idea of what the build scripts should look like:
def convert_emitter(source, target, env):
    # Both source and target will be a list of nodes.
    # In this case, the target will be empty, and you need
    # to calculate all of the generated targets based on the
    # source PDF file. You will need to open the source file
    # with standard Python code. All of the targets will be
    # removed when cleaned (scons -c).
    target = []  # fill in accordingly
    return (target, source)

# Optionally, you could supply a function for the action,
# which would have the same signature as the emitter.
convert = env.Builder(emitter=convert_emitter,
                      action=[
                          Delete("temp"),
                          Mkdir("temp"),
                          "convert $SOURCE $TARGET"])
env.Append(BUILDERS={'Convert': convert})

# $SOURCES (all the generated PNGs), not $SOURCE, so every page is appended.
combine = env.Builder(action="convert $SOURCES -append $TARGET")
env.Append(BUILDERS={'Combine': combine})

pdf = env.PDF('input.tex')
# You can omit the target in this call, as it will be filled in by the emitter.
pngs = env.Convert(source=pdf)
png = env.Combine(target='output.png', source=pngs)
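As a concrete illustration of the "calculate the targets" step, here is one hedged way to fill in convert_emitter, assuming the pypdf package is available for page counting and that ImageMagick names multi-page output temp-0.png, temp-1.png, and so on. Note the caveat raised in the next answer: this only works if the PDF already exists when SCons evaluates the emitter.

from pypdf import PdfReader  # assumption: pypdf is installed

def convert_emitter(source, target, env):
    # Declare one PNG target per page of the source PDF, matching
    # ImageMagick's temp-0.png, temp-1.png, ... output naming.
    pages = len(PdfReader(str(source[0])).pages)
    target = ['temp/temp-%d.png' % i for i in range(pages)]
    return (target, source)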
Depending on what qualifies as "dynamic" for you, I believe the correct answer is: not possible.
As long as the source on which you would like to "dynamically" compute a target set is present when SCons is run, @Brady's solution should work fine. However, if the source in question is itself the target of some other command, it will not work. This is a fundamental limitation of SCons: it assumes the set of build targets can be statically determined from the base set of input (non-intermediate) sources. It computes the build/target/dependency graph in one sweep, then executes it in the next. It has no ability to run through some known portion of the build graph, stop to introspect some intermediate targets to dynamically compute the rest of the graph, and then continue. I'd frankly love to have this ability in the work I do with SCons, but I'm afraid it is just a fundamental limitation.
The best you can do is set the build up so that on the first run, it stops at the construction of the PDF (if no PDF target exists when the build script is executed). Once the PDF has been built, you can rerun the build and set things up so the rest of the build steps execute based on the PDF built from the last run. This more or less works decently... except for one problem. If the PDF ends up changing (and producing some new pages for instance), you'll actually have to rerun the build twice in order to capture the changes to the PDF, since any page counts (etc) will be based on the old version of the PDF.
I'd love for someone to prove me wrong here, but such is the way of things.
Looking at this, there's no requirement for the individual temp/*.png files to be kept - if there were, you shouldn't be putting them in a temp directory, and in any case you'd have to do quite a bit of work to figure out which pages to generate.
So it looks more sensible to do this as one step, so you'd have something like this:
png = env.Convert('output.png', 'input.pdf')
where the action for Convert is something like this:
Delete('temp'),
Mkdir('temp'),
'convert $SOURCE temp/page.png',
'convert temp/*.png -append $TARGET',
Delete('temp')
Though frankly, you might do better writing the whole thing as a single callable script, to make sure you get the page sorting correct.
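A hedged sketch of that single-script idea, written as a Python function action for SCons (the numeric sort is the point of doing it in one place - a plain alphabetical glob would put page-10 before page-2):

import glob
import os
import shutil
import subprocess

def convert_pdf(target, source, env):
    # Rasterize every page into a temp dir, then stack the pages
    # vertically into the final PNG.
    shutil.rmtree('temp', ignore_errors=True)
    os.makedirs('temp')
    subprocess.check_call(['convert', str(source[0]), 'temp/page.png'])
    # Multi-page PDFs come out as page-0.png, page-1.png, ...;
    # sort numerically so page-10 doesn't land before page-2.
    pages = sorted(glob.glob('temp/page-*.png'),
                   key=lambda p: int(p.rsplit('-', 1)[1].split('.')[0]))
    if not pages:  # a single-page PDF produces plain temp/page.png
        pages = ['temp/page.png']
    subprocess.check_call(['convert'] + pages + ['-append', str(target[0])])
    shutil.rmtree('temp')
    return 0  # tell SCons the action succeeded

png = env.Command('output.png', 'input.pdf', convert_pdf)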

Linux: wget - "Scheme missing" error when using the -i option

I am trying to download multiple files from Yahoo Finance using wget.
To do that, I used a Python script to generate a text file with all the URLs I need.
When downloading a single file (a CSV file) using the following command:
wget ichart.finance.yahoo.com/table.csv?s=BIOM3.SA&a=00&b=5&c=1900&d=04&e=21&f=2013&g=d&ignore=.csv
everything goes OK!
However, when the -i option is added, so that the URLs are read from the file instead of given directly, I get the error:
Invalid URL ichart.finance.yahoo.com/table.csv?s=BIOM3.SA&a=00&b=5&c=1900&d=04&e=21&f=2013&g=d&ignore=.csv: Scheme missing
The file containing the URLs is a text file with a single URL on each line. The URLs are exactly like the one in the first example, but with some different parameters.
Is there a way to correct this?
Thanks a lot for reading!!
To solve the problem, I added double quotes around the links and a web protocol (scheme). For example:
"http://ichart.finance.yahoo.com/table.csv?s=BIOM3.SA&a=00&b=5&c=1900&d=04&e=21&f=2013&g=d&ignore=.csv"
