weasyprint: change href link formatting in PDF

I am using weasyprint 56.1 with django-weasyprint 2.1.0, with vanilla settings.
When my HTML contains an ordinary hyperlink of the form
<a href="https://example.com">my link text</a>
I want weasyprint to generate PDF that looks like
my link text
and where that text is a hyperlink to https://example.com.
However, what I get instead is the following format:
my link text (https://example.com)
where both parts are hyperlinked.
The link is correct and works, but I do not want the URL to show.
I could not find anything about this in the weasyprint documentation.
I just spent an hour in the weasyprint source code trying to find the spot where this formatting happens, but to no avail.
What logic is responsible for this formatting and how can I change it?

This isn't an answer to the question you asked, but I had the same issue, and then realized that in my CSS I had this artifact from a template:
a[href]:after {
    content: " (" attr(href) ")";
}
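If the template stylesheet cannot be edited directly, one possible workaround is to override the rule at render time. The sketch below assumes plain WeasyPrint (not the django-weasyprint view classes) and assumes that a user stylesheet marked !important takes precedence over the template's rule in the cascade; the names OVERRIDE_CSS and render_pdf are made up for illustration. Deleting the rule from the template CSS remains the simpler fix.

```python
# Assumption: the template contains the a[href]:after rule shown above.
# A user stylesheet needs !important to win over an author stylesheet rule.
OVERRIDE_CSS = "a[href]:after { content: none !important; }"


def render_pdf(html_string):
    """Render HTML to PDF bytes with the URL-appending rule switched off."""
    # Imported lazily so the override string can be inspected
    # even on a machine where WeasyPrint is not installed.
    from weasyprint import CSS, HTML

    override = CSS(string=OVERRIDE_CSS)
    return HTML(string=html_string).write_pdf(stylesheets=[override])
```

Calling render_pdf('<a href="https://example.com">my link text</a>') should then produce a PDF in which only the link text is shown, provided the assumption about the cascade holds for your WeasyPrint version.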

Related

How to filter text after webscraping

So I'm trying to scrape this website that provides novels for free, for example this page: https://www.wuxiaworld.com/novel/martial-world/mw-chapter-1
I'm trying to extract only the title and the body of the chapter. Finding the title is easy enough since it's in an h4, but the body of the chapter is not wrapped in any specific div tag, so I cannot just isolate it. I was wondering how I'd do this. The closest I've gotten to just having the text is this.
P.S. I'm new to web scraping, sorry if my question is unclear or stupid.
I tried to identify whether the body text was under any exclusive div tag, but it wasn't, so I tried to select it under the closest div tag; this still returned a lot of useless and unwanted text.
Edit: @koro, there's more than one instance of fr-view being used, so it doesn't isolate the text. The fr-view class also appears before the chapter text.

I'm not versed in web scraping, but upon reviewing the page source HTML I see that <div class="fr-view"> only precedes the body text on the novel pages. If you start collecting text after the scraper identifies this line, you should be able to stop at the very next <a href="/novel..... tag and end up with only the novel text.
Some of the pages also include footnotes with extra information; these use an <a href=#footnote....> tag, so if you would like to keep the footnotes, stop on <a href=/novel...> and NOT on any <a href=...>.
P.S. I only looked at 4 pages, and while they all appear to have the format I've pointed out above, it's still possible that you may run into issues, but that's a bridge you can cross when you get there!
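The start/stop idea above can be sketched with Python's standard-library HTML parser. This is only an illustration on made-up sample markup (SAMPLE_HTML, ChapterTextParser and extract_chapter_text are hypothetical names), not a tested scraper for the real site. It starts at the first fr-view div, so if the page has several fr-view blocks, as the asker reports, the start condition would need adjusting.

```python
from html.parser import HTMLParser


class ChapterTextParser(HTMLParser):
    """Collect text from the first <div class="fr-view"> up to the next <a href="/novel...">."""

    def __init__(self):
        super().__init__()
        self.capturing = False  # True while we are inside the chapter body
        self.done = False       # True once the stop link has been seen
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        href = attrs.get("href") or ""
        if tag == "div" and "fr-view" in classes:
            self.capturing = True
        elif tag == "a" and self.capturing and href.startswith("/novel"):
            # Navigation link reached: stop. Footnote links (#footnote...) do not match.
            self.capturing = False
            self.done = True

    def handle_data(self, data):
        text = data.strip()
        if self.capturing and text:
            self.chunks.append(text)


def extract_chapter_text(html):
    parser = ChapterTextParser()
    parser.feed(html)
    return "\n".join(parser.chunks)


# Made-up markup that mimics the structure described above.
SAMPLE_HTML = """
<h4>Chapter 1</h4>
<div class="fr-view">
  <p>First paragraph.</p>
  <p>Second paragraph<a href="#footnote-1">[1]</a>.</p>
  <a href="/novel/martial-world/mw-chapter-2">Next Chapter</a>
  <p>Unrelated footer text.</p>
</div>
"""

print(extract_chapter_text(SAMPLE_HTML))
```

On the sample, the footnote marker is kept while the "Next Chapter" link and everything after it is dropped.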

Insert page break when using "markdown-pdf" nodejs module?

I'm using the node.js module "Markdown-PDF" (https://www.npmjs.com/package/markdown-pdf, version 9.0) to convert markdown to PDF, and I need to add a few page breaks to clean up the presentation in the PDF output.
I tried all the recommendations I could find on this and other forums, including inline HTML tags such as:
<div style="page-break-after: always;"></div>
And some CSS hacks, like applying page breaks to all div tags (as described here: http://forums.apricitysoftware.com/t/include-pdf-pagebreak-instructions-in-markdown/152). None of these work: all tags in the markdown (source) document appear un-rendered in the PDF (output) document.
Expected (ideal) behavior would be to add the page breaks to the markdown files, and have the PDF reflect the desired changes. Something like this, within my markdown files:
markdown text
markdown text
markdown text
[page break command]
markdown text
markdown text
markdown text
Thanks in advance for any assistance or suggestions that anyone can provide!

Got an assist from a friend and figured this out. Markdown-pdf uses HTML5Boilerplate, so you can edit the index.html file, found here on my system:
/usr/local/lib/node_modules/markdown-pdf/html5bp/index.html
I added the CSS described here: http://forums.apricitysoftware.com/t/include-pdf-pagebreak-instructions-in-markdown/152
And it worked. Was able to include the HTML tags described in the post and force page breaks. Success!

The styled div tag you mentioned only works if the html option of the remarkable object is set to true in the options parameter:
var markdownpdf = require("markdown-pdf")
var fs = require("fs")

var options = {
  paperFormat: "A4",
  paperOrientation: "landscape",
  // Let the Remarkable parser pass raw HTML through instead of escaping it
  remarkable: {
    html: true
  }
}

// Stream the markdown file through markdown-pdf and write the PDF to disk
fs.createReadStream("teste.md")
  .pipe(markdownpdf(options))
  .pipe(fs.createWriteStream("document.pdf"))
To use a markdown marker instead of the HTML div tag, I guess you could use preProcessMd to replace a specific pattern with the styled div tag.
Hope I could offer some help!
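The marker substitution itself is just a text replacement. As a rough sketch, here it is as a standalone Python step that could run on the markdown file before it is handed to markdown-pdf, rather than inside preProcessMd. The [pagebreak] marker and the function name expand_page_breaks are made up for illustration; the html option still has to be enabled for the resulting div to render.

```python
import re

# The styled div from the question, which forces a page break in the PDF.
PAGE_BREAK_DIV = '<div style="page-break-after: always;"></div>'

# Hypothetical marker: a line containing only "[pagebreak]" (trailing spaces allowed).
MARKER = re.compile(r"^\[pagebreak\][ \t]*$", re.MULTILINE)


def expand_page_breaks(markdown_text):
    """Replace every marker line with the page-break div; leave other text untouched."""
    return MARKER.sub(PAGE_BREAK_DIV, markdown_text)


print(expand_page_breaks("markdown text\n\n[pagebreak]\n\nmarkdown text\n"))
```

Matching only whole lines keeps a marker that appears inside a sentence from being rewritten by accident.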

netsuite - inline html

I am trying to use a custom column as a hyperlink to an external site.
On the PO detail page, I want to add a custom column whose value has the following HTML content:
<a href="http://google.com">Try Google</a>
So when I go to the PO detail page I want to have a link to google.com.
How can I do this? I tried this as Inline HTML, Free-Form Text and Rich Text. None of them gave me a link.

I found one way of doing this using an <iframe>.
1. I added an Inline HTML field.
2. I set the default value of that field to an <iframe> block which sends data to my service point.
3. In the service point I created the link (<a>) needed for that PO and printed it.
That worked for me.

I had this same problem and finally found the answer. You need to create a field as Inline HTML and then enclose the link and the url in single quotes, concatenating with double pipes:
'Search Google'
More help can be found here:
http://www.netsuiterp.com/2019/06/highlighting-url-link-custom-field.html

Use a Tag as Page Title in Kentico

I have a tag cloud on product listing pages on my site that goes to a tag results page which displays products that contain that chosen tag. I want to put a header at the top of that results page that says something like "Products Tagged As: (insert tag name here)"
Any advice? I can't seem to access the system variable that displays the currently chosen tag name. The page URL contains the tagID variable, if that helps:
Product-Features.aspx?tagid=36
I am using Portal Engine Kentico development, by the way. Thanks.

I know this question was asked a while ago, but I am posting my answer in case anyone who comes across it can use this snippet.
Try using following macro
{%tag="";foreach(g IN SiteObjects.TagGroups){foreach(t IN g.Tags){if(t.TagID=ToInt(QueryString.tagid)){tag=t.TagName;}}}return tag;%}
Note: I am using Kentico Version 9.0
For some reason the macro doesn't work in the page template directly. I put the above macro in a Static Text web part and it worked like a charm.
Hope it helps someone like myself.
Regards,
Gopala

Use following macro:
{% GlobalObjects.Tags.Where("TagID = " + ToInt(QueryString.GetValue("tagid", 0))) %}

COGNOS generate report in xls format

I have one prompt page, one HTML report output page and one XLS report output page. On the prompt page, I have a prompt that selects the output format (HTML/XLS) and a Generate button that generates the report. The Generate button needs to display the output page in the correct format.
The Generate button just does promptAction('finish'). The thing is that no matter what I select as the format (XLS, PDF, etc.), promptAction('finish') always generates the HTML output.
So is there a way to call something like promptAction('finish', varFormat)?

I normally do this the other way around: use native Cognos functionality to run the report in the required format (i.e. using "run with options"), then use a variable to detect the format that was applied and apply conditional formatting accordingly. In your case that would mean rendering the XLS page if XLS was selected and the HTML page if HTML was selected.

I remember having this problem with HTML vs PDF page rendering. I don't have Cognos in front of me, but what I found out is that I had to update my conditional style/format because the following would not work right. It was a strange problem, but I did come up with a workaround.
Old pseudocode that wouldn't work. I created a variable that says:
Case RENDER_TYPE
When PDF
Then PDF
When HTML
Then HTML
End
Then I put a conditional style on the page using this variable to make the page visible or not, and this would not work.
What I had to do was this:
Case
When RENDER_TYPE = 'HTML'
Then 'HTML'
Else 'PDF' <- or in your case XLS
End
Of course it's only good for two formats, but for some strange reason trying to use any value other than HTML created weird behavior.
In other words: if the render type <> 'HTML' then render PDF, otherwise render HTML.
I had problems any time I referred to the render variable with anything other than HTML. So basically I just had to test for HTML, and treat everything else as the other format.
