PDF parsing for an eBook app - uiwebview

Looking for some help on the pros and cons of using UIWebView vs. CGPDF for the purpose of creating an eBook reader app. The content would be in the form of PDF, thus I am trying to figure out which of the two API's - UIWebView or CGPDF would work the best.
I'm targeting features like zoom, table of contents, notes and annotations, and search.

If you are looking for search, annotations, table of contents, etc...you have to use CGPDF*. UIWebView does not contain implementation for those. Even then, CGPDF can be limited somewhat and is generally pretty frustrating :)
On the other hand, UIWebView is obviously much simpler to use.

Related

How can I replicate GitHub's app code block style in UITextView?

I am trying to create a markdown editor for fun and because I was interested in learning TextKit. I am working on iOS devices only, so I only have UIKit framework at my disposal, also I watched this WWDC18 video which explains some best practices to adopt with TextKit. At first I was really interested in knowing how to get the same result that the video shows at 22:13.
The code shown by the dev in the video makes use of AppKit and it wraps code in that rectangle by replacing the code section with a NSMutableParagraphStyle and by providing a custom NSTextBlock to the latter. As you can guess, I can't subclass NSTextBlock in UIKit and therefore I can't replicate the example shown.
I have something a little bit harder to ask though, what I really wanted to replicate is the GitHub application code block style, this is the final result that I would like to replicate:
As you can see it should have a rounded rectangle with a custom background color and also it needs to be scrollable in order to make code expand as much as it needs (I am not interested in syntax highlighting).
How can I achieve something like this if NSTextBlock is not available in UIKit?
For the moment all I have done is override NSTextStorage to set the correct font to the code section with custom NSAttributedString.
I have investigated in the app and noticed that it is managed by Ryan Nystrom which is the creator of GitHawk prior to moving to GitHub. The GitHawk markdown reader is very similar to the one that you find in the GitHub application and the funny thing is that GitHawk is open source so I took a look at that code and tried to understand how that could be replicated.
Turns out that GitHawk heavily uses IGListKit framework and parses markdown in ListDiffable elements, specifically the code blocks ListDiffables are given to a ListSectionController that created an horizontal UIScrollView to display code.
I gave up trying to understand more on this because GitHawk code is really really entangled and it is really hard, near impossible, to extract the specific functionality.

Google Docs: Table of content page numbers

we are currently building an application on the google cloud platform, which generates reports in Google Doc. For them, it is really important to have a table of content ... with page numbers. I know this is a feature request since a few years and there are add-ons (Paragraph Styles +, which didn't work for us) that provide this solution, butt we are considering to build this ourselves. if anybody has a suggestion on how we could start with this, it would be a great help!
thanks,
Best bet is to file a feature request on the product forums.
Currently the only way to do that level of manipulation of a doc to provide a custom TOC is to use Apps Script. It provides access to the document structure sufficient enough to build and insert a basic table of contents, but I'm not sure there's enough to do paging correctly (unless you force a page break on ever page...) There's no method to answer the question of "what page is this element on?"
Hacks like writing to a DOCX and converting don't work because TOCs are recognized for what they and show up without page numbers.
Of course you could write a DOCX or PDF with the TOC as you'd like and upload as a blob rather than as a Google Doc. They can still be viewed in Drive and such.

Any way in Expression Engine to simulate Wordpress' shortcode functionality?

I'm relatively new to Expression Engine, and as I'm learning it I am seeing some stuff missing that WordPress has had for a while. A big one for me is shortcodes, since I will use these to allow CMS users to place more complex content in place with their other content.
I'm not seeing any real equivalent to this in EE, apart from a forthcoming plugin that's in private beta.
As an initial test I'm attempting to fake shortcodes by using delimited strings (e.g. #foo#) in the content field, then using a regex to pull those out and pass them to a function that can retrieve the content out of EE's database.
This brings me to a second question, which is that in looking at EE's API docs, there doesn't appear to be a simple means of retrieving the channel entries programmatically (thinking of something akin to WP's built-in get_posts function).
So my questions are:
a) Can this be done?
b) If so, is my method of approaching it reasonable? Or is there something stupidly obvious I'm missing in my approach?
To reiterate, my main objective here is to have some means of allowing people managing content to drop a code in place in their content that will be replaced with channel content.
Thanks for any advice or help you can give me.
Here's a simple example of the functionality you're looking for.
1) Start by installing Low Replace.
2) Create two Global Variables called gv_hello and gv_goodbye with the values "Hello" and "Goodbye" respectively.
3) Put this text into the body of an entry:
[say_hello]
Nice to see you.
[say_goodbye]
4) Put this into your template, wrapping the Low Replace tag around your body field.
{exp:low_replace
find="[say_hello]|[say_goodbye]"
replace="{gv_hello}|{gv_goodbye}"
multiple="yes"
}
{body}
{/exp:low_replace}
5) It should output this into your browser:
Hello
Nice to see you.
Goodbye
Obviously, this is a really simple example. You can put full blown HTML into your global variable. For example, we've used that to render a complex, interactive graphic that isn't editable but can be easily dropped into a page by any editor.
Unfortunately, due to parse order issues, EE tags won't work inside Global Variables. If you need EE tags in your short code output, you'll need to use Low Variables addon instead of Global Variables.
Continued from the comment:
Do you have examples of the kind of shortcodes you want to support/include? Because i have doubts if controlling the page-layout from a text-field or wysiwyg-field is the way to go.
If you want editors to be able to adjust layout or show/hide extra parts on the page, giving them access to some extra fields in the channel, is (imo) much more manageable and future-proof. For instance some selectfields, a relationship (or playa) field, or a matrix, to let them choose which parts to include/exclude on a page, or which entry from another channel to pull content from.
As said in the comment: i totally understand if you want to replace some #foo# tags with images or data from another field (see other answers: nsm-transplant, low_replace). But, giving an editor access to shortcodes and picking them out, is like writing a template-engine to generate ee-template code for the ee-template-engine.
Using some custom fields to let editors pick and choose parts to embed is, i think, much more manageable.
That being said, you could make a plugin to parse the shortcodes from a textareas content, and then program a lot, to fetch data from other modules you want to support. For channel entries you could build out of the channel data library by objectiveHTML. https://github.com/objectivehtml/Channel-Data
I hear you, I too miss shortcodes from WP -- though the reason they work so easily there is the ubiquity of the_content(). With the great flexibility of EE comes fewer blanket solutions.
I'd suggest looking at NSM Transplant. It should fit the bill for you.
There is also a plugin called Shortcode, which you can find here at
Devot-ee
A quote from the page:
Shortcode aims to allow for more dynamic use of content by authors and
editors, allowing for injection of reusable bits of content or even
whole pieces of functionality into any field in EE

How to open iba (iBooks Author) books in a iOS app?

Is it possible to parse and read iba books in a iOS app?
This is probably impossible without a lot of reverse engineering. .iba files are based on epub as mr.octobor suggests, but the layout and rendering isn't achieved through normal html, css and javascript as in epub. Apple uses a lot of proprietary code in .iba, particularly to build the widgets.
Baldur Bjarnason laments this fact on this post, where you can get a better understanding of the situation: http://www.baldurbjarnason.com/notes/the-ibooks-builtin-widgets/
The piece by Daniel Glazman linked in that article is useful as well: http://www.glazman.org/weblog/dotclear/index.php?post/2012/01/20/iBooks-Author-a-nice-tool-but

Creating PDF Invoices - Are there any templating solutions? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
Improve this question
Our company is looking to integrate invoices into a new system we are developing.
We require a solution to create a layout of the invoice and then convert to pdf.
We have considered just laying out the invoice in html/css then converting to pdf.
We have also considered using SVG->PDf conversion.
Both of these solutions integrate well into our existing templating language used for our web application.
Historically we have been a Microsoft based business and used Crystal Reports for such a task but we are looking for an open source Linux solution for this project.
Does any one have any suggestions of an approach or technology we could use for such a task?
Try this... create a blank invoice with Word (or whatever you want) and save it as a PDF.
Then use a PDF library to modify the PDF (insert the text at particular coordinates). We do this in the Microsoft world and it is extremely easy.
The biggest benefit is that we can use our own tools to create and modify the template. If we want to add some static text, we just crank open Word, make the change and save it to a PDF file (that is being used as a template).
For Microsoft, we use iTextSharp which is actually a C# port of the original Java version of iText
Additionally...
You can use Adobe Acrobat to insert fields in the PDF (address, phone, invoice number, line item 1, line item 2, etc...) and then use iText/iTextSharp to populate these fields at run time.
This is, in more detail, what we do... and it is extremely easy.
The normal way is to install (La)TeX (probably already on the linux box) and run pdflatex to get the pdfs. You can also use Apache FOP, if you prefer xslt and xsl-fo.
If the number of invoices to create is low you might want to use open-office (directly or as a toolkit).
If you want high-precision positioning and low-level access, a low-level pdf library (I don't know if iTextSharp works with mono) might be what you want.
I would try out LaTeX first, because it allows you to get results with the least effort.
I've previously produced invoices by templating a PostScript file, and then using Ghostscript's ps2pdf to convert those into PDFs.
We use Reportlab with Python. If you look around there are a load of ready-made forms/invoices/etc.
There are several OSS reporting engines (Jasper Reports, Pentaho and BIRT to name three) that you could use in much the same way as you have historically used Crystal Reports. One of the other posters mentions ReportLab, which is an option if you're using Python or can embed a Python runtime in your application.
Probably the most flexible solution is to create XMLs with invoice data and then by using XSLTs transform the, into PDFs, HTMls, whatever...
It depends on your environment. If you have access to Java, you might look at iText (http://www.lowagie.com/iText/), a library that allows you to generate PDF files on the fly.
There are two steps, if i understood correctly:
1) Creation of PDF template with placeholders to populate data programmatically
2) Populating the PDF template programmatically during run time
For #1, OpenOffice allows creation of PDF templates, which can then be populated programmatically. It's good enough to create simple invoices that doesn't probably involve datagrid/table kind of stuff.
For #2, you already have the answers here - iText, iTextSharp.
Hope this helps!
I love wkhtmltopdf http://code.google.com/p/wkhtmltopdf/
Not sure what your goal is here, but there is an opensource php-library called fpdf, which also has an extension for taking a pre-made pdf as layout and then populate it with more content, generating a new PDF with that info.
However, I would go for a solution that you can integrate nicely into the plattform you're building, but I wouldn't go in a HTML->PDF solution since you won't have any clue about what would fit on a piece of paper regarding sizes in that kind of enviroment, meaning you won't know when you should split the content into two separate templates.
You might also try using XSL:FO. XSL:FO is a documented standard for describing page layout: http://www.w3.org/TR/xsl/#fo-section.
I've had success on two projects creating documents by creating an XML schema that defines the content of the "PDF". I then use the XSD tool (from Microsoft) to generate a class representing this document. I then map my data into that structure, serialize the populated class to XML, along with an XSL stylesheet that defines how that data should be mapped into FO, and pass it to an FO formatter. For formatters, I have use Alt-Soft's Xml2Pdf with success. There are a few others out there. There are some tools available to help create the XSL to FO stylesheet (i.e. stylusstudio and XmlSpy), but I recommend learning the FO constructs as the tools seem to produce bloated stylesheets. FO is comparable to HTML (where a P tag is a BLOCK tag in FO), but can be tricky. This nice thing about FO, is that some formatter support conversion to other formats, such as Word, HTML, etc.
Other options:
iTextSharp (C# port of iText). Just started reading about this. Open source and free. I don't think there is any "templating" supported with this, but I could be wrong about that.
SQL Server Reporting Services. Assuming your invoice data is in, or can be put in, a format that can be read by reporting services (SQL Server, Web Service, etc), define the layout in SSRS and then publish to reporting server. Use SSRS Web Services or query parameter execution to execute the report and have it output as PDF.
This html-2-pdf site may be a helpful starting point: http://maarten.lippmann.us/?p=101
A site a friend of mine built uses a script to churn HTML pages into printable PDFs, too - http://philambdaupsilon.org. Not sure on the exact details of it, but he is an SO user, and I'll send this question to him, too.
Unfortunately, the best system on the market (at present) is passing the HTML & CSS to a ColdFusion server and have that return the rendered PDF. So if money isn't a big concern, this is the quickest to deploy solution that'll render the best results.
I've tried very hard to get FPDF, TCPDF, the R&OS pdf class, and even CodeIgniter's recommendation to work, but nothing with stable output for anything beyond the most basic/bland HTML files.
Honestly, if the ColdFusion solution isn't viable, I'd use html2ps, and then ps2pdf to convert your files into a PDF.
(This is all assuming that you don't want to take the time and design each PDF using the native PDF-creator code in PHP. This is what systems like SugarCRM use. Though its very functional with stable results, the actual creation of each PDF-generator file is a most painful process)
We have used Jasper Reports before. It's not what you'd call user-friendly, but it will talk directly to your database.
html2pdf works very well. You can use this to generate both HTML and PDF reports from the same source.
I'm fiddling with Black Sheep Invoices right now, which is great at first but now I'm having trouble actually getting it to render the PDFs. Lots of installation difficulties--probably a lot easier on your own server but i'm up on a shared host with it. The HTML output and data management portions are well done though, which is something you won't get out of just creating a postscript template. I was hoping to find a reference to a library that has an active development team though (Black Sheep is not being updated at this time).
If you want browser perfect HTML converted to PDF then try commandlineprint
You'll need to install firefox on a linux distro, disable all firefox alerts and then run it through a virtual display. Check this thread for more details.
It's infuriating to get running well but does give you the best results for HTML to PDF conversion I've seen.
OK, a search of Google Code projects turned up Simple Invoices, which is awesome and well maintained.
I use TROFF for my invoices because of its extremely simple textual encoding. The logic is a few lines of Perl. Keeping it simple.
For a Ruby solution, try Prawn: http://prawn.majesticseacreature.com/
I use open office on the server and then generate the XML for the document (just unzip the document and hack away)
Some can use Dhek template editor to define area/placeholder for existing PDF, without altering existing document, and then populate it to generate final doc (e.g. with user values from a form): https://github.com/applicius/dhek .

Resources