Documentation or reference for "NETSCAPE-Bookmark-file-1" DOCTYPE - browser

Is there any standard (possibly created after-the-fact) that governs <!DOCTYPE NETSCAPE-Bookmark-file-1> files? If you export bookmarks from either Chrome or Firefox (tried on Windows 10) you get this kind of file, which seems to be HTML of sorts.
I've tried searching the web but found only pragmatic results like parsers in specific programming stacks, or tips and tricks on importing and exporting it.
Is there any standard, RFC, format description, or reference parser, or something similar?

It is not even valid HTML, either technically or semantically. Modern browsers seem to interpret the de facto standard loosely when writing such files, but luckily also when importing them.
The best available format description (probably reverse engineered, yes) seems to be this one:
https://learn.microsoft.com/en-us/previous-versions/windows/internet-explorer/ie-developer/platform-apis/aa753582(v=vs.85)
And it's by Microsoft of all things...
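Since browsers read and write the format loosely, a tolerant tag-soup parser is usually a safer way to read it than a strict HTML or XML parser. Below is a minimal sketch using Python's standard `html.parser`. It assumes the commonly seen layout (`<H3>` for a folder name, the following `<DL>` for its contents, `<A HREF=...>` for each bookmark). The function and class names are made up for illustration, and real exports may carry extra attributes or nesting that this ignores.

```python
from html.parser import HTMLParser


class _BookmarkParser(HTMLParser):
    """Collect (folder_path, title, url) tuples from a bookmark export."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []   # results: (folder_path, title, url)
        self._folders = []    # stack of open <DL> levels (None for the unnamed root)
        self._pending = None  # folder name read from <H3>, waiting for its <DL>
        self._text = []       # text chunks of the <A> or <H3> being read
        self._href = None     # HREF of the <A> being read
        self._inside = None   # "a", "h3" or None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <DL> arrives as "dl".
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._inside, self._text = "a", []
        elif tag == "h3":
            self._inside, self._text = "h3", []
        elif tag == "dl":
            # A <DL> right after an <H3> opens that folder; the root <DL> has no name.
            self._folders.append(self._pending)
            self._pending = None

    def handle_data(self, data):
        if self._inside:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside == "a":
            path = "/".join(name for name in self._folders if name)
            title = "".join(self._text).strip()
            self.bookmarks.append((path, title, self._href))
            self._inside = None
        elif tag == "h3" and self._inside == "h3":
            self._pending = "".join(self._text).strip()
            self._inside = None
        elif tag == "dl" and self._folders:
            self._folders.pop()


def parse_bookmarks(text):
    """Parse the text of a NETSCAPE-Bookmark-file-1 export into a list of tuples."""
    parser = _BookmarkParser()
    parser.feed(text)
    parser.close()
    return parser.bookmarks


SAMPLE = """<!DOCTYPE NETSCAPE-Bookmark-file-1>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL><p>
    <DT><H3>Docs</H3>
    <DL><p>
        <DT><A HREF="https://example.com/a" ADD_DATE="1700000000">Example A</A>
    </DL><p>
    <DT><A HREF="https://example.com/b">Example B</A>
</DL><p>
"""

for folder, title, url in parse_bookmarks(SAMPLE):
    print(folder or "(root)", "|", title, "|", url)
```

The unclosed `<DT>` and `<p>` tags are simply ignored, which is why a strict parser is a poor fit for this format.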


Is there any way to sanitize an SVG file in C#? Are there any libraries?

On the client side we sanitize SVG files during upload, but the security team is asking for sanitization on the server side too.
I'm primarily a Python developer, but I thought I'd throw some research into the issue for ya. I used to develop for C, so I thought I should at least have a basic understanding of what's going on.
*.SVG files are structured like XML documents, and use the HTML DOM to access JavaScript and CSS functionalities. Trying to enumerate and script out every single catch for potential JavaScript-based security issues doesn't seem realistic, so personally, I'd just entirely remove all JavaScript sectors that do anything more than define simple variables, do math operations, or reference already-defined visual elements from any uploaded *.SVG files.
Since *.SVG files are based on XML and are human-readable, this could be accomplished by iterating through the file either line-by-line like a text file or element-by-element like an XML or HTML file. You'd want to go through and remove all the JavaScript scripts that don't meet the above criteria, save the result, then treat it as XML and run a standard XML-sanitization library on it before saving it back as *.SVG. I reckon this GitHub library and this StackOverflow thread could be helpful in that.
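The element-by-element idea can be sketched as follows. This is in Python rather than C#, purely to illustrate the approach, and the function name is hypothetical. It removes every `<script>` element outright and, as an extra assumption not stated above, also drops `on*` event-handler attributes such as `onload`. It is not an exhaustive sanitizer, and the standard-library parser is not hardened against maliciously constructed XML, so a production version would need a safer parser.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def strip_scripts(svg_text):
    """Return the SVG text with <script> elements and on* attributes removed."""
    # Keep the SVG namespace as the default one when serializing again.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)

    # Pass 1: remove <script> children. Tags look like "{namespace}script",
    # so only the local name after "}" is compared.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag.rsplit("}", 1)[-1].lower() == "script":
                parent.remove(child)

    # Pass 2 (an assumption beyond the text above): drop event-handler attributes.
    for element in root.iter():
        for name in list(element.attrib):
            if name.rsplit("}", 1)[-1].lower().startswith("on"):
                del element.attrib[name]

    return ET.tostring(root, encoding="unicode")


dirty = (
    '<svg xmlns="http://www.w3.org/2000/svg" onload="alert(1)">'
    "<script>alert(2)</script>"
    '<circle r="5"/>'
    "</svg>"
)
print(strip_scripts(dirty))
```

Working on the parsed tree means that differences in letter case, whitespace, or self-closing syntax in the source text do not change which elements get removed.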
I hope my response was helpful!
Your security team is right: client-side security is not security. It is just user convenience. Never rely on client-side checks. Anyone wanting to do bad things to your application will bypass client-side checks first thing.
Now, an SVG file can be used in an XSS attack mainly by leveraging the <script> tag.
Unfortunately, defusing/securing a script is a very complicated topic and prone to errors and both false positives and negatives.
So, I believe your only recourse is to remove scripts altogether. This might not be what you need.
But, if it is, then it's very simple to do. The script tag cannot be masqueraded inside the SVG, or the browser will not recognize it in the first place, making the attack moot. So a simple regex should suffice. Something like,
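// Remove every <script>...</script> block, across line breaks and regardless of case.
// Requires: using System.Text.RegularExpressions;
cleanSVGcode = Regex.Replace(
    userSVGcode,
    @"<script.*?script>",
    @"",
    RegexOptions.IgnoreCase | RegexOptions.Singleline
);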
It is possible to sanitize out further sequences. JavaScript calls won't work if they're written incorrectly or in an obfuscated way, so the number of these sequences is limited.
@"javascript:" => @"syntax:error:"
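For comparison, here is a rough Python equivalent of the two replacements above, written as a sketch and not as a vetted sanitizer. The function and constant names are illustrative only. `re.DOTALL` plays the role of the single-line option, so `.` also matches line breaks.

```python
import re

# Same pattern as in the answer: from "<script" up to the next "script>".
SCRIPT_BLOCK = re.compile(r"<script.*?script>", re.IGNORECASE | re.DOTALL)
JS_URL = re.compile(r"javascript:", re.IGNORECASE)


def clean_svg(svg_text):
    """Drop <script>...</script> blocks and neutralize javascript: URLs."""
    without_scripts = SCRIPT_BLOCK.sub("", svg_text)
    return JS_URL.sub("syntax:error:", without_scripts)


print(clean_svg('<svg><SCRIPT>alert(1)</SCRIPT><a href="javascript:alert(2)">x</a></svg>'))
```

One thing to check when testing a pattern like this: a self-closing `<script href="x.js"/>` with no later `script>` text in the file is not matched, so it is left in place.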

How to write a Navigator object standard W3C proposal?

Based on the w3schools site: "There is no public standard that applies to the navigator object, but all major browsers support it.".
I somehow see the navigator object as rather important, and with the rate at which browsers change their versions today even more so. So it rather baffles me that there is no standard for it.
What baffles me even more is that none of the browsers seem to have come up with the idea to include the two most important properties into this object:
navigator.browserName
navigator.browserVersion
We all have to parse the darn navigator.userAgent and hope from version to version that stuff in there did not change too much. Like it just did in IE11 for example...
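To illustrate how fragile this is, here is a small sketch that parses two user-agent strings. It is in Python purely for demonstration, and the function names are made up. The strings are typical IE10 and IE11 values; IE11 dropped the `MSIE` token, so a check written for earlier versions silently stops matching.

```python
import re

IE10_UA = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Trident/6.0)"
IE11_UA = "Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko"


def ie_version_naive(user_agent):
    """Pre-IE11 style check: rely on the 'MSIE <version>' token only."""
    match = re.search(r"MSIE (\d+)", user_agent)
    return int(match.group(1)) if match else None


def ie_version(user_agent):
    """Also accept the IE11 form, where the version follows 'rv:' after 'Trident/'."""
    match = re.search(r"MSIE (\d+)", user_agent) or re.search(
        r"Trident/.*rv:(\d+)", user_agent
    )
    return int(match.group(1)) if match else None


for ua in (IE10_UA, IE11_UA):
    print(ie_version_naive(ua), ie_version(ua))
```

Every such fix only covers the formats known at the time of writing, which is the maintenance burden described above.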
How can one even write a W3C proposal for a new standard?
Thanks to @Alohci and some more digging: the Navigator object does have a standard, so the way to propose a change to it is to send an email to public-html@w3.org.

Shortest path to render from DirectWrite to Direct3D11?

Man so much has changed since I learned DirectX 7.
Everywhere I look (except Wikipedia), it says I have to render from DWrite to D2D or GDI before I can do anything.
Is that Wikipedia article wrong? Can I not render to Direct3D?
I'd like to avoid having to render to D2D, since apparently, to get D2D to write to D3D, you have to open up a D3D10.1 device as well.
Does it really take all this just to render text in D3D11?
Unfortunately, Microsoft decided to remove native text support from their DirectX API. Now you can either use DirectWrite and then, as you said, render to GDI or D2D, which is somewhat clunky, or alternatively make your own font-handling class and use that (which is what I've chosen to do for my project).
There is a good tutorial on how to produce a custom Font-handling class, here: http://www.rastertek.com/dx11tut12.html
Obviously, you should write your own, but it provides a good starting point and lets you see all the necessary procedures. Something you will probably want to add is support for multiple fonts, for which I recommend creating a Font class that your font handler stores by name in a std::map< std::string, Font* > (a char* key would be compared by pointer address rather than by string contents).
Hope this helps! :)

Markdown to HTML conversion

I'm still in the middle of coding my final year project at university, and I have come across an issue where I need to convert either from HTML to Markdown or vice versa. I have no experience whatsoever of Perl, Python, etc., so I need an easy-to-implement solution; I only have about 6 weeks left to complete this. I'm writing the data from a WMD text box to SQL Server, and I can upload it as either Markdown or HTML, but if that data needs editing it cannot be in HTML, as this would be too confusing for the end user, who is assumed to have zero or very little computing "know how".
What should I do?
Karmastan's answer is probably the best here. Keeping the raw Markdown in the database is a really good solution, as it allows users to maintain the content in a form with which they're familiar.
However, if you have a bunch of HTML which is already converted, you might want to look at something like Markdownify: The HTML to Markdown converter for PHP.
Edit: based on what you've said below, there are a few things you should keep in mind:
Make sure that the following is set in wmd.js:
wmd_options = {"output": "Markdown"};
This ensures that you're storing Markdown in the database.
Source: How do you store the markdown using WMD in ASP.NET?
When outputting the Markdown to the web, you need to transform it to HTML. To do this, you'll need a library which does Markdown -> HTML conversion. Here are two examples:
Announcing Markdown.NET
Revisied Markdown.NET Library
I'm not a .NET developer, so I can't really help with how these libraries should be used, but hopefully the documentation will make that clear.
If you look at the web site for Markdown, you'll find a Perl script that converts Markdown-syntax documents to HTML. Keep Markdown text in your database and invoke the script whenever you need to display the text. No Perl knowledge required!

Creating PDF Invoices - Are there any templating solutions? [closed]

Our company is looking to integrate invoices into a new system we are developing.
We require a solution to create a layout of the invoice and then convert to pdf.
We have considered just laying out the invoice in html/css then converting to pdf.
We have also considered using SVG->PDF conversion.
Both of these solutions integrate well into our existing templating language used for our web application.
Historically we have been a Microsoft based business and used Crystal Reports for such a task but we are looking for an open source Linux solution for this project.
Does anyone have any suggestions of an approach or technology we could use for such a task?
Try this... create a blank invoice with Word (or whatever you want) and save it as a PDF.
Then use a PDF library to modify the PDF (insert the text at particular coordinates). We do this in the Microsoft world and it is extremely easy.
The biggest benefit is that we can use our own tools to create and modify the template. If we want to add some static text, we just crank open Word, make the change and save it to a PDF file (that is being used as a template).
For Microsoft, we use iTextSharp which is actually a C# port of the original Java version of iText
Additionally...
You can use Adobe Acrobat to insert fields in the PDF (address, phone, invoice number, line item 1, line item 2, etc...) and then use iText/iTextSharp to populate these fields at run time.
This is, in more detail, what we do... and it is extremely easy.
The normal way is to install (La)TeX (probably already on the linux box) and run pdflatex to get the pdfs. You can also use Apache FOP, if you prefer xslt and xsl-fo.
If the number of invoices to create is low you might want to use open-office (directly or as a toolkit).
If you want high-precision positioning and low-level access, a low-level pdf library (I don't know if iTextSharp works with mono) might be what you want.
I would try out LaTeX first, because it allows you to get results with the least effort.
I've previously produced invoices by templating a PostScript file, and then using Ghostscript's ps2pdf to convert those into PDFs.
We use Reportlab with Python. If you look around there are a load of ready-made forms/invoices/etc.
There are several OSS reporting engines (Jasper Reports, Pentaho and BIRT to name three) that you could use in much the same way as you have historically used Crystal Reports. One of the other posters mentions ReportLab, which is an option if you're using Python or can embed a Python runtime in your application.
Probably the most flexible solution is to create XML files with the invoice data and then use XSLT to transform them into PDF, HTML, whatever...
It depends on your environment. If you have access to Java, you might look at iText (http://www.lowagie.com/iText/), a library that allows you to generate PDF files on the fly.
There are two steps, if i understood correctly:
1) Creation of PDF template with placeholders to populate data programmatically
2) Populating the PDF template programmatically during run time
For #1, OpenOffice allows creation of PDF templates, which can then be populated programmatically. It's good enough for creating simple invoices that probably don't involve datagrid/table kind of stuff.
For #2, you already have the answers here - iText, iTextSharp.
Hope this helps!
I love wkhtmltopdf http://code.google.com/p/wkhtmltopdf/
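A minimal sketch of the HTML-then-convert route, assuming the `wkhtmltopdf` binary is installed and on the PATH. The template, field names, and function names are invented for illustration; a real system would use its existing templating language instead of `string.Template`.

```python
import subprocess
from decimal import Decimal
from html import escape
from string import Template

# Hypothetical invoice layout; real templates would carry CSS for print sizing.
INVOICE_TEMPLATE = Template("""<html><body>
<h1>Invoice $number</h1>
<p>Bill to: $customer</p>
<table>
$rows
<tr><td>Total</td><td>$total</td></tr>
</table>
</body></html>""")


def render_invoice(number, customer, items):
    """Render invoice HTML from (description, price) pairs, escaping user text."""
    rows = "\n".join(
        f"<tr><td>{escape(name)}</td><td>{price:.2f}</td></tr>"
        for name, price in items
    )
    total = sum(price for _, price in items)
    return INVOICE_TEMPLATE.substitute(
        number=escape(number),
        customer=escape(customer),
        rows=rows,
        total=f"{total:.2f}",
    )


def html_to_pdf(html_path, pdf_path):
    """Convert an HTML file to PDF by calling the external wkhtmltopdf tool."""
    subprocess.run(["wkhtmltopdf", html_path, pdf_path], check=True)


html = render_invoice(
    "2024-001",
    "Acme & Co",
    [("Widget", Decimal("19.99")), ("Gadget", Decimal("5.01"))],
)
print(html)
```

Writing `html` to a file and passing its path to `html_to_pdf` would produce the PDF. Prices are kept as `Decimal` so the total is not affected by binary floating-point rounding.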
Not sure what your goal is here, but there is an opensource php-library called fpdf, which also has an extension for taking a pre-made pdf as layout and then populate it with more content, generating a new PDF with that info.
However, I would go for a solution that you can integrate nicely into the platform you're building. I wouldn't go for an HTML->PDF solution, since in that kind of environment you won't have any clue about what will fit on a piece of paper, meaning you won't know when you should split the content across two separate templates.
You might also try using XSL:FO. XSL:FO is a documented standard for describing page layout: http://www.w3.org/TR/xsl/#fo-section.
I've had success on two projects creating documents by creating an XML schema that defines the content of the "PDF". I then use the XSD tool (from Microsoft) to generate a class representing this document. I then map my data into that structure, serialize the populated class to XML, and pass it, along with an XSL stylesheet that defines how that data should be mapped into FO, to an FO formatter. For formatters, I have used Alt-Soft's Xml2Pdf with success. There are a few others out there. There are some tools available to help create the XSL to FO stylesheet (e.g. stylusstudio and XmlSpy), but I recommend learning the FO constructs, as the tools seem to produce bloated stylesheets. FO is comparable to HTML (where a P tag is a BLOCK tag in FO), but can be tricky. The nice thing about FO is that some formatters support conversion to other formats, such as Word, HTML, etc.
Other options:
iTextSharp (C# port of iText). Just started reading about this. Open source and free. I don't think there is any "templating" supported with this, but I could be wrong about that.
SQL Server Reporting Services. Assuming your invoice data is in, or can be put in, a format that can be read by reporting services (SQL Server, Web Service, etc), define the layout in SSRS and then publish to reporting server. Use SSRS Web Services or query parameter execution to execute the report and have it output as PDF.
This html-2-pdf site may be a helpful starting point: http://maarten.lippmann.us/?p=101
A site a friend of mine built uses a script to churn HTML pages into printable PDFs, too - http://philambdaupsilon.org. Not sure on the exact details of it, but he is an SO user, and I'll send this question to him, too.
Unfortunately, the best system on the market (at present) is to pass the HTML & CSS to a ColdFusion server and have it return the rendered PDF. So if money isn't a big concern, this is the quickest-to-deploy solution and will render the best results.
I've tried very hard to get FPDF, TCPDF, the R&OS pdf class, and even CodeIgniter's recommendation to work, but nothing with stable output for anything beyond the most basic/bland HTML files.
Honestly, if the ColdFusion solution isn't viable, I'd use html2ps, and then ps2pdf to convert your files into a PDF.
(This is all assuming that you don't want to take the time to design each PDF using the native PDF-creator code in PHP. This is what systems like SugarCRM use. Though it's very functional with stable results, the actual creation of each PDF-generator file is a most painful process.)
We have used Jasper Reports before. It's not what you'd call user-friendly, but it will talk directly to your database.
html2pdf works very well. You can use this to generate both HTML and PDF reports from the same source.
I'm fiddling with Black Sheep Invoices right now, which was great at first, but now I'm having trouble actually getting it to render the PDFs. Lots of installation difficulties: probably a lot easier on your own server, but I'm on a shared host with it. The HTML output and data management portions are well done though, which is something you won't get out of just creating a PostScript template. I was hoping to find a reference to a library that has an active development team though (Black Sheep is not being updated at this time).
If you want browser perfect HTML converted to PDF then try commandlineprint
You'll need to install firefox on a linux distro, disable all firefox alerts and then run it through a virtual display. Check this thread for more details.
It's infuriating to get running well but does give you the best results for HTML to PDF conversion I've seen.
OK, a search of Google Code projects turned up Simple Invoices, which is awesome and well maintained.
I use TROFF for my invoices because of its extremely simple textual encoding. The logic is a few lines of Perl. Keeping it simple.
For a Ruby solution, try Prawn: http://prawn.majesticseacreature.com/
I use open office on the server and then generate the XML for the document (just unzip the document and hack away)
You can use the Dhek template editor to define areas/placeholders on an existing PDF without altering the document, and then populate them to generate the final document (e.g. with user values from a form): https://github.com/applicius/dhek .
