How can I create a customized version of an existing PDF file with Node.js?

I have an old system that was written in PHP a long time ago that I would like to update to node.js to allow me to share code with a more modern system. Unfortunately, one of the main features of the PHP system is a tool that allows it to load an existing PDF file (which happens to be a government form), fill out the user's information, and provide a PDF to the browser that has all of that information present.
I have considered writing a PHP script that does only the PDF customization and using node for everything else, but it seems like this should be possible without requiring PHP to be installed.
Any idea how I might solve my problem just using node?

After a lot of searching and nearly giving up, I did eventually find that the HummusJS library will do what I want to do!
Update April 2020: In the intervening years since I posted this other options have cropped up which look like they should work. Since this question still gets a lot of attention I thought I'd come back and update with some other options:
pdf-lib - This one is my current favorite; it works great. It may have limitations for extremely large PDFs, but it is constantly improving and you can do nearly anything with it -- if not through the helper API then through the abstraction they provide which allows you to use nearly any raw PDF feature, though that requires more knowledge of the PDF file format than most possess.
It's worth noting that pdf-lib doesn't support loading encrypted PDFs, but you can use something like qpdf to strip the encryption before loading it.
https://www.npmjs.com/package/nopodofo - This one should be one of the best options out there, but I couldn't get it working myself on a Mac.
https://www.npmjs.com/package/node-pdfsign - Not exactly the same thing, but it can be used with other tools to do digital signatures on a PDF. I haven't used it yet, but I expect to.
Update Dec 2021: I'm still using pdf-lib and I think it's still the best available library, but there are a lot of new libraries that have come out in the last couple of years for handling PDFs, so it's worth looking around a bit.
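For anyone who wants to see the rough shape of the pdf-lib approach, here is a minimal sketch. It assumes the government form has ordinary AcroForm text fields (an XFA-only form would not work this way), and the function name fillPdfForm, the field names, and the file names are made up for illustration. The pdf-lib module is passed in as an optional third argument only so the function can be exercised without the package installed.

```javascript
// Sketch: fill text fields in an existing PDF form using pdf-lib.
// Assumes the PDF exposes AcroForm text fields; field names are hypothetical.
async function fillPdfForm(pdfBytes, values, pdfLib = require('pdf-lib')) {
  // Parse the existing document from its raw bytes.
  const pdfDoc = await pdfLib.PDFDocument.load(pdfBytes);
  const form = pdfDoc.getForm();

  // Write each value into the text field with the matching name.
  for (const [fieldName, text] of Object.entries(values)) {
    form.getTextField(fieldName).setText(text);
  }

  // Flattening bakes the values into the page so they are no longer editable.
  form.flatten();

  // Resolves to a Uint8Array that can be written to disk or sent to the browser.
  return pdfDoc.save();
}

// Hypothetical usage:
//   const fs = require('node:fs/promises');
//   const bytes = await fs.readFile('form.pdf');
//   const filled = await fillPdfForm(bytes, { applicant_name: 'Ada Lovelace' });
//   await fs.writeFile('form-filled.pdf', filled);
```

If you don't know the field names, form.getFields() returns the fields and each one has a getName() method, which is a quick way to list them first.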

Related

Convert pdf to images using Node package or open source tool

I'm looking for an open-source tool or an NPM package that can be run from node (for example by spawning a process and calling the command line).
The result I need is a PDF file converted/broken into images, where each page of the PDF becomes an image file.
I checked
https://npmjs.com/package/pdf-image -- seems to be last maintained 3 years ago.
same for https://npmjs.com/package/pdf-img-convert
Please advise which package/tool I can use?
Thanks in advance.
Be aware that, in general, https://npmjs.com/package/pdf-img-convert is updated frequently and is therefore the better of the two, but it has 3 pending pull requests, so check whether they affect your usage. (Note that https://npmjs.com/package/pdf-image has a much heavier set of dependencies that can break and a much longer list of pending pull requests, so your assumption about its age is correct.)
However, the current pdf-img-convert 1.0.3 has a broken dependency that needs a manual correction, because of a naming change by Mozilla earlier this year from es5 to legacy.
see https://github.com/olliet88/pdf-img-convert.js/issues/10
For a cross-platform open-source CLI tool I would suggest Artifex MuTool (AGPL, so not free for closed-source commercial use, but you're getting quality support). It has continuous daily commits, and it can be scripted in JavaScript via mutool run.
Out of the box, a simple mutool convert -o out%4d.png in.pdf will attempt to fix a broken PDF, but it may reject some files that need a more forgiving secondary approach such as the packages above.
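Since the question mentions spawning a process from node, the MuTool route could look roughly like this. It is a sketch that assumes the mutool binary is installed and on the PATH; the helper names mutoolConvertArgs and convertPdfToImages are my own, not part of any package.

```javascript
const { spawn } = require('node:child_process');

// Build the argument list for `mutool convert`. Kept separate from the
// spawn call so it can be inspected without MuPDF being installed.
function mutoolConvertArgs(inputPdf, outputPattern, pages) {
  const args = ['convert', '-o', outputPattern, inputPdf];
  if (pages) args.push(pages); // optional page range such as '1-3'
  return args;
}

// Run mutool and resolve once it exits successfully.
function convertPdfToImages(inputPdf, outputPattern, pages) {
  return new Promise((resolve, reject) => {
    const child = spawn('mutool', mutoolConvertArgs(inputPdf, outputPattern, pages), {
      stdio: 'inherit',
    });
    child.on('error', reject); // e.g. mutool is not on the PATH
    child.on('close', (code) =>
      code === 0 ? resolve() : reject(new Error(`mutool exited with code ${code}`))
    );
  });
}

// Hypothetical usage: one PNG per page, numbered by the pattern in the file name.
//   convertPdfToImages('in.pdf', 'out%4d.png').catch(console.error);
```

Passing the arguments as an array (rather than building one shell string) avoids quoting problems when file names contain spaces.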
Go ahead with the second one.
https://npmjs.com/package/pdf-img-convert

visual studio code resource collector

I'm working on a website with Visual Studio Code.
Is there a way to save only the files being used by a project into a separate folder?
Basically what I'm looking for is a tool which would scan all the local resources linked by all html files (meaning linked images, videos, files), and then it would save them all in a separate folder.
The reason I'm asking is that at the moment I'm testing things out, meaning I'm using image A, then image B, C and so on. These images live in subfolders, so I have ended up with some images that I'm actually using in the html pages and some that I'm not. The problem is that it's not simple to check which images I'm using.
You'll find the same principle in 3d applications, such as 3ds Max for instance, where, once you're done with the project, you can use a Resource Collector tool to strip out all the unused assets and save only the ones used by the project.
I've looked for an extension or a solution to this without any luck, so I guess an extension does not exist yet, but I think it would be a nice tool.
I don't understand why someone downvoted my post.
Either what I'm asking is already possible (although, like I said, I searched and didn't find anything), or whoever downvoted considers my request stupid.
Whatever the reason, I believe it would be more mature to give a proper answer, whichever of the two reasons above applies.
In fact:
The solution already exists: like I said, I didn't find it, so if someone knows the solution, why not simply post it here?
The solution doesn't exist but someone thinks it's a stupid idea. Well, it is not, and it would be polite and civilized to discuss it.
In the current era it has become so easy to express opinions without actually doing anything, by simply pressing a button that says nothing valuable, like an "I like".
I never stop being amazed at where social media behavior is taking us.
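In the absence of a ready-made extension, the tool described above can be approximated with a short Node script. This is only a sketch under a few assumptions: resources are referenced through src, href or poster attributes in the html files (CSS url(...) references and URL-encoded paths are not handled), and the function names extractLocalResources and collectResources are invented for this example.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Return the unique local paths referenced by src/href/poster attributes.
// Anything with a URL scheme (http:, data:, mailto:), protocol-relative
// URLs (//host/...) and pure fragments (#top) are treated as non-local.
function extractLocalResources(html) {
  const attribute = /\b(?:src|href|poster)\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  const found = new Set();
  for (const match of html.matchAll(attribute)) {
    const raw = (match[1] ?? match[2] ?? '').trim();
    if (raw === '' || /^(?:[a-z][a-z0-9+.-]*:|\/\/|#)/i.test(raw)) continue;
    const withoutQuery = raw.split(/[?#]/)[0]; // drop ?query and #fragment
    if (withoutQuery !== '') found.add(withoutQuery);
  }
  return [...found];
}

// Copy every referenced file that exists into outDir, keeping its path
// relative to projectRoot. Returns the list of files written.
function collectResources(projectRoot, htmlFiles, outDir) {
  const copied = [];
  for (const htmlFile of htmlFiles) {
    const html = fs.readFileSync(htmlFile, 'utf8');
    for (const ref of extractLocalResources(html)) {
      const source = ref.startsWith('/')
        ? path.join(projectRoot, ref) // root-relative reference
        : path.resolve(path.dirname(htmlFile), ref); // relative to the page
      if (!fs.existsSync(source) || !fs.statSync(source).isFile()) continue;
      const target = path.join(outDir, path.relative(projectRoot, source));
      fs.mkdirSync(path.dirname(target), { recursive: true });
      fs.copyFileSync(source, target);
      copied.push(target);
    }
  }
  return copied;
}
```

Anything left in the image subfolders that does not show up in the output folder is a candidate for deletion. A regular expression is a shortcut here; a real HTML parser would be more robust for unusual markup.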

Memcached on NodeJS - node-memcached or node-memcache, which one is more stable?

I need to implement a memory cache with Node, it looks like there are currently two packages available for doing this:
node-memcached (https://github.com/3rd-Eden/node-memcached)
node-memcache (https://github.com/vanillahsu/node-memcache)
Looking at both Github pages it looks like both projects are under active development with similar features.
Can anyone recommend one over the other? Does anyone know which one is more stable?
At the time of writing, the project 3rd-Eden/node-memcached doesn't seem to be stable, judging by its GitHub issue list (e.g. see issue #46). Moreover, I found its code quite hard to read (and thus hard to update), so I wouldn't suggest using it in your projects.
The second project, elbart/node-memcache, seems to work fine, and I feel good about the way its source code is written. So if I had to choose between only these two options, I would prefer elbart/node-memcache.
But as of now, both projects have a problem with storing BLOBs. There's an open issue for the 3rd-Eden/node-memcached project, and elbart/node-memcache simply doesn't support the option. (It would be fair to add that there's a fork of the project that is said to add the option of storing BLOBs, but I haven't tried it.)
So if you need to store BLOBs (e.g. images) in memcached, I suggest using the overclocked/mc module. I'm using it now in my project and have had no problems with it. It has nice documentation and is highly customizable but still easy to use. And at the moment it seems to be the only module that works fine for storing and retrieving BLOBs.
Since this is an old question/answer (2 years ago), and I got here by googling and then researching, I feel that I should tell readers that I definitely think 3rd-eden's memcached package is the one to go with. It seems to work fine, and based on the usage by others and recent updates, it is the clear winner. Almost 20K downloads for the month, 1300 just today, last update was made 21 hours ago. No other memcache package even comes close. https://npmjs.org/package/memcached
The best way I know of to see which modules are the most robust is to look at how many projects depend on them. You can find this on npmjs.org's search page. For example:
memcache has 3 dependent projects
memcached has 31 dependent projects
... and in the latter, I see connect-memcached, which would seem to lend some credibility there. Thus, I'd go with the latter barring any other input or recommendations.

How to manage application resources?

We are developing a web application which is available in 3 languages.
There are these key-value pairs to translate everything. At this moment we use Excel (key, german, french, english) for this. But this does not work well ... if there is more than 1 person editing this file, you have no chance to automatically merge the different files.
Is there a good (and free) tool which can handle this job?
--- additional information ---
(This is a STRUTS application.) But the question is how to manage this kind of information in general (or at least in a convenient way, which also supports multiple users editing this single file ("mergeable" filetypes)).
Why not use gettext and manage separate .po files? See that blog entry.
If you can store this information in plain text then you will be able to use a version control system like subversion to help you with merging changes. Subversion is free.
The free guide (the "Red Book") to subversion gives a fairly good explanation of how this kind of merging works.
http://svnbook.red-bean.com/en/1.5/svn.basic.vsn-models.html#svn.basic.vsn-models.copy-merge
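To make the plain-text suggestion concrete, here is a small sketch of turning the spreadsheet's rows into one key=value text file per language, which version control can then merge line by line. The function name toPropertiesFiles and the sample rows are made up, and it assumes the sheet has already been exported to rows of cells with a header row of (key, german, french, english). Escaping of special characters for Java .properties files is left out.

```javascript
// Convert spreadsheet rows into one "key=value" text body per language.
// rows[0] is the header: ['key', '<language 1>', '<language 2>', ...].
function toPropertiesFiles(rows) {
  const [header, ...data] = rows;
  const languages = header.slice(1);

  // Sort by key so that two people adding keys produce small, stable diffs.
  const sorted = [...data].sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));

  const files = {};
  languages.forEach((language, index) => {
    files[language] =
      sorted.map((row) => `${row[0]}=${row[index + 1] ?? ''}`).join('\n') + '\n';
  });
  return files;
}

// Hypothetical sample data in the (key, german, french, english) layout.
const example = toPropertiesFiles([
  ['key', 'german', 'french', 'english'],
  ['save', 'Speichern', 'Enregistrer', 'Save'],
  ['cancel', 'Abbrechen', 'Annuler', 'Cancel'],
]);
console.log(example.german);
```

With one line per key, two people adding different keys usually touch different lines, which is the case a line-based merge handles without conflicts.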
EDIT: Another thought - if you really want to stay using a spreadsheet - Google Docs supports simultaneous editing of a spreadsheet. You could import your existing spreadsheet and get your multi-user merging wishes for free with very little change to how you work.
Good Question.
There are some "best practices" depending on what you actually code in (Java, MS Windows C#).
I solved this (though I think there must be a better way) by using a SQL database instead of an Excel file, and I wrote a plugin for VS (VB6, ..., emacs) that can insert new keys into the database without a round trip through version control. Each key is the developer's best guess at a name for the label. (key => save, sv => "spara", no => "", en => "save").
The database can then be generated as a module, class, obj, or txt file appropriate to the code (platform),
and can be accessed in whatever way the IDE supports; in C#, for example: bt.label = corelang.save;
Someone else can then do all the language stuff, and then we just update the db and rerun the generation to the platform resources.
After years of seeing localization done, including at large companies like Sony, I can only say the "standard" is Excel :)
There are tons of good ideas around, and probably many better ways to do it, but in real life Excel seems to be the best and most cost-effective solution, since it doesn't require training or building complex new tools to get the job done.
I found out that IntelliJ IDEA (at least in versions 7 and 8) has an editor for application resources. But it is not free at all, and it does not scale to bigger resource files with more than 1,000 keys.
Another good choice would be to use Google's spreadsheets ... for those who don't know it, it is like an "online Excel web application". It can handle concurrent access from multiple users. Yay! But sadly, it comes from Google. This makes it impossible to use in commercial projects.
So,
still searching...
cheers,
mana

What is a good web-based Grid that accepts Excel clipboard data?

Any good recommendations for a platform agnostic (i.e. Javascript) grid control/plugin that will accept pasted Excel data and can emit Excel-compliant clipboard data during a Copy?
I believe Excel data is formatted as CSV during "normal" clipboard operations.
dhtmlxGrid looks promising, but the online demos don't actually copy contents to my clipboard!
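On the clipboard format: the plain-text flavour that Excel places on the clipboard is tab-separated, one row per line, rather than comma-separated. Whatever grid ends up being used, converting between that text and a 2D array is simple enough to sketch. The function names below are invented for this example, and cells that themselves contain tabs or line breaks (which Excel wraps in quotes) are not handled.

```javascript
// Parse the plain-text clipboard payload Excel produces:
// cells separated by tabs, rows separated by CRLF (or LF).
function parseExcelClipboard(text) {
  if (text === '') return [];
  return text
    .replace(/\r?\n$/, '') // Excel appends a trailing line break
    .split(/\r?\n/)
    .map((line) => line.split('\t'));
}

// Produce text that Excel will split back into rows and columns on paste.
function toExcelClipboard(rows) {
  return rows.map((row) => row.join('\t')).join('\r\n') + '\r\n';
}

// In a browser, a grid could hook these into the paste and copy events, e.g.
//   element.addEventListener('paste', (event) => {
//     const rows = parseExcelClipboard(event.clipboardData.getData('text/plain'));
//     // ...write rows into the grid starting at the selected cell...
//   });
```

Reading the pasted text from the paste event itself avoids the script-initiated clipboard access that browsers restrict.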
I'm currently using dhtmlxGrid and we have the Excel copy/paste functionality working. dhtmlXGrid is the most full featured javascript grid package that I've found.
On their website, dhtmlXGrid claims to support Clipboard functionality in the Professional version. (However, I noticed the Sample on their site isn't working on my Firefox. EDIT: It's probably the permissions issue that Nathan mentioned.)
In any case, we had to do some extra work to get the exact Excel copy and paste functionality we wanted. We essentially had to override some of their functionality to get the desired behavior. Their support was pretty good in helping us come up with a solution.
So to answer your question, you should be able to get them to support copy and paste if you purchase the Professional version. I'm just warning you that it may take some additional work to fine tune that behavior.
Overall, I'm happy with dhtmlXGrid. We use a lot of their features. Their support is pretty good. They usually take one day to respond since they are in Europe (I think). And Javascript is by its very nature open source so I can always dive in when I need to.
Not an answer, but a warning: my company bought the 2007 Infragistics ASP.NET controls just for the Grid, and we regret that choice.
The quality of API is horrible (in our opinion at least), making it very hard to program against the grid (for example, inconsistent naming conventions, but this is just an inconvenience, we have complaints about the object model as well).
So I can't say that I know of a better option, I just know I will give a try to something else before paying for Infragistics products again (and the email support we got was horrible as well).
I was wrestling with this problem several years ago (2004 I think). We ran into the problem that Firefox doesn't allow scripts to read the clipboard by default (but you can grant access to the clipboard).
There are other ways of reading the clipboard data as well... Flash, for instance, can read the clipboard. There's a good article on ajaxian that explains how to do this behind the scenes.
In the end, we couldn't find a web-based Grid that fit the bill, so we had to create our own in a mixture of Actionscript and Javascript.
I'd hate to be Captain Obvious here...but what about a plain old .NET Gridview control? You can copy Excel data into it and out of it...and you can run it on any system with the .NET platform installed.
http://dhtmlx.com/dhxdocs/doku.php?id=dhtmlxgrid:clipboard_operations
