I'd like to be able to produce PDF files in NodeJS.
Currently, we use puppeteer. We need to produce highly designed documents and so puppeteer/chromium gives me the ability to create a complex layout in HTML with the added benefit of also having the HTML version of the file.
It's great for relatively small documents where design is key.
The problem is when I try to produce long report documents. These documents do not require elaborate design. These are pretty much just a header with some information, and then a simple table with lots of records that stretch far as the eye can see, so they tend to be large. Like, really really large.
When I try using puppeteer for that, well pretty much just crashes and burns because loading such huge layouts into the underlying browser is just too much.
Currently I do "stitching". I create the document by having puppeteer create the doc in parts, and then I connect all those "doclets" into one using PDFKit.
But then I have problems like when one "doclet" ends and a new one begins, there are blank lines. (partially empty pages for no good reason from the perspective of a customer viewing it)
What I'm looking for is a library that has basic layout functionality but that doesn't use a browser (or perhaps uses something lightweight).
Problem is that libraries like PDFkit and pdf-lib seem to be too low level.
I'm going to literally have to "draw" the documents by telling it where exactly the text should be.
If I want tables, I'm going to straight up have to draw rectangles and stuff.
Having to create all of this manually would be a nightmare.
All I want is the ability to create simple layouts (tables, titles, text wrapping, background color) without having to use a library that just launches chromium.
Please, let me know if you know of any such option.
Thanks in advance!
What I tried:
PDFkit/pdf-lib - too low level. Unless I'm getting something wrong, there doesn't seem to be a way to create word wrapped layouts with basic tables.
jsPDF doesn't seem to be able to use the HTML functionality on the server(I think to get it to work I'd have to let it use a browser...? if so, doesn't really help).
Puppeteer/other libraries that pilot a browser - well, it uses a browser so a no-go for large docs.
Praying to Odin - No luck so far.
Is it possible to display an image directly from variable? I made a program that receives data from sql server and generates tables in excel. However sql could also have binary values for images and so on... which I would like to display in excel cells. But loadpicture, addpicture don't support this... I would like to avoid saving them into temporary files or avoid using external programs and display directly from memory while afterwards keeping the images in excel. Is this possible?
Ok tricky :)
Shape images can comes from file or from the web. So how about to implement a tiny webserver via vba which serves those images. VBA Webserver for Excel or
homebrewed web server vba
(If you get websocks running ) BTW: to implement a webserver is rather trivial. In your case he just has to deliver the images - which is done by a base64 encoded string with a few header lines. Another way is to modify some VB.Net Webserver to a COM DLL. Also not THAT complicated simple VB.NET Webserver <<< Might be a good idea if you are on 64 bit. Anyhow if you got the Webserver thing up you can now showel your database datas into the webserver and let he deliver them to a shape by a URL. To be honest that all is pure overkill. I use a software ram disc for such purposes. For example Softperfect Ram disc. So there is no real file written cause it uses a small amount of memory for the virtual disc. Its damn fast cause the computer ram is used and after reboot everything is gone if you whish. Sure you might parse the database data by VBA and write the pixels little by little to the shape. Some guys have made a full image processor in vb6 which is nearly vba Photo Daemon So you can see how easy to decompose and compose images. I just hope that something might be usable for you. This is not really a "answer" but might give you some ideas. The pure VBA Webserver is nice if you get the winsocks running. Ive fooled around a bit with this as we was still on 32 bit - was funny :)
This was tried three years ago and did not work directly from a variable. (see Inserting an Image into a sheet using Base64 in VBA?)
Giving up the "don't save to temporary file" idea and the whole thing is easy. (see How to embed a GIF image into an Excel file). Why not save it to a file? e.g. in a ram disk?
;-)
I am currently using josdirkson's SVG extrude script to form groups of 20-30 complex shapes. My goal is to individually manipulate each object as well as the group as a whole throughout the user's interaction. I have been able to achieve this so far, however, my load time can range from 7 to 20 seconds on a variety of devices. I was wondering if a lot of this could be just inherent in script that converts all the SVG paths into bezierCurves, etc. If this was the case, I was wondering if a viable solution might be to somehow export from Three.js to a JSON or other file type which would then be the subsequent data source users are loading from. I was looking at this thread briefly, but didn't want to get too far ahead of myself before crowd sourcing a solution! Any advice or input is greatly appreciated! Thank you!
Best advice I can give you is to have a look at the profile of the loading-process. To test this in chrome, you can add pairs of console.profile('something'); and console.profileEnd('something'); calls to your code for the region you want to analyze. Then open up the Profiles-panel in the devtools and reload the page / re-run the javascript.
This will probably be able to tell you if you are right about your assumptions. At least it will help you find the thing the time in JS is spent on.
And if that's really the case, you could do some caching of geometries, using geometry.toJSON() and new THREE.JSONLoader().parse(json) to save and restore the geometries. In most cases this should be significantly faster than somehow computing the geometry. (note: there are other, more space-efficient and even more performant ways to do the caching, but the json-format is a good place to start)
I'm putting together a mobile version of a webpage which consists entirely of client art. For the old-fashioned desktop version, I just used PNGs, but I really wanted to use SVG for mobile. SVGZ would be smaller and resolution independent, so it seemed like a perfect use case.
But the client is worried that, once his art is online in SVG, anyone could download the files and use his art illegally (he's had stuff he worked on pirated before, so he takes this pretty seriously.) This had never occurred to me until he brought it up, but the SVG would basically be his original source art.
I was wondering if there's any way to prevent the SVG files from being accessed by the user. As far I know this is impossible -- making the files available to the user-agent means making them available to the user -- but I wanted to ask around to be sure.
Thanks for any help.
No, this is impossible. If a web browser can request the files for display, then any computer anywhere can request the files and save the direct results.
Serving up intentionally degraded artwork (e.g. rasterization) is the only way to prevent people from having the originals. Of course, a determined thief could still re-trace the PNG and get a vectorized, resolution-independent close approximation of the original.
Your client could alternatively:
Include copyright comments in the source, proving ownership. (Yes, a thief could delete these.)
Include 'hidden' elements (0% opacity or placed under another item), proving ownership. (Yes, a thief could delete these.)
Use data steganography in the source SVG to watermark it (e.g. vary the decimal values in a path in a manner minor enough to not effect the result, but still embed custom data). (Yes, any thief suspecting this could lower decimal precision or transform all values in a manner that might remove this.)
Trust in the law to protect his works, or provide a recourse if they are stolen.
Trust in the goodness of most of mankind to not do this.
Decide that theft is the sincerest form of flattery, and not worry about it. :)
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
I want to protect only certain numbers that are displayed after each request. There are about 30 such numbers. I was planning to have images generated in the place of those numerbers, but if the image is not warped as with captcha, wont scripts be able to decipher the number anyway? Also, how much of a performance hit would loading images be vs text?
The only way to make sure bad-guys don't get your data is not to share it with anyone. Any other solution is essentially entering an arms race with the screen-scrapers. At one point or another, one of you will find the arms-race too costly to continue. If the data you are sharing has any perceptible value, then probably the screen-scrapers will be very determined.
It's not possible.
You use javascript and encrypt the page, using document.write() calls after decrypting. I either scrape from the browser's display or feed the page through a JS engine to get the output.
You use Flash. I can poke into the flash file and get the values. You encrypt them in the flash and I can just run it then grab the output from the interpreter's display as a sequence of images.
You use images and I can just feed them through an OCR.
You're in an arms race. What you need to do is make your information so useful and your pages so easy to use that you become the authority source. It's also handy to change your output formats regularly to keep up, but screen scrapers can handle that unless you make fairly radical changes. Radical changes drive users away because the page is continually unfamiliar to them.
Your image solution wont' help much, and images are far less efficient. A number is usually only a few bytes long in HTML encoding. Images start at a few hundred bytes and expand to a 1k or more depending on how large you want. Images also will not render in the font the user has selected for their browser window, and are useless to people who use assisted computing devices (visually impaired people).
Apart from the images, you could display the numbers using JavaScript or flash.
You could also use CSS to position individual digits using various combinations of absolute or relative positions.
You could also use JavaScript to help you create these DIV.
The point is just to obfuscate enough that it becomes really hard.
One more solution is to use images of segments or single dots and re-construct the images of the digits using CSS, a bit like a dot-matrix display.
You could litter the source of the page with these absolutely positioned DIVs and again make it more difficult to reconstruct by creating them dynamically.
At any rate, you can't stop a determined scraper from getting to the data: it doesn't take a lot to automate a web browser and take screenshots that can be fed to an OCR.
There is nothing anyone from paying someone pennies to get the data manually anyway.
The point is: how determined are your opponents (user?).
It's a bit like the software protection business: making things hard enough that you would deter casual 'pirates' is not too hard, and it's a fairly good approach in general.
However, if there is much value in the data you present, there is nothing you can really do to protect it.
All you can do it make it hard enough so that casual 'thieves' will prefer to continue paying for your services rather than circumvent it.
Javascript would probably be the easiest to implement, but you could get really creative and have large blocks of numbers with certain ones being viewable by placing layers on top of the invalid numbers, blending the wrong numbers into the background, or making them invisible via css and semi-randomly generated class names.
I can't believe I'm promoting a common malware scripting tactic, but...
You could encode the numbers as encoded Javascript that gets rendered at runtime.
Generate an image containing those numbers and display the image. :-)
I think you guys are being too reactive with these solutions. Javascript, Capcha, even litigation and the DMCA process don't address the complex adaptive nature of web scraping and data theft. Don't you think the "ideal" solution to prevent malicious bots and website scraping would be something working in a real-time proactive mitigation strategy? Very similar to a Content Protection Network. Just say'n.
Examples:
IBM - IBM ISS Data Security Services
DISTIL - www.distil.it
Can you provide a little more detail on what it is you're doing? Certainly there's a performance hit to create an image instead of dumping out the text of a number, but how often would you be doing this per day?
Using JavaScript is the same as using text. It's trivial to reverse engineer.
Use animated numbers using flash. It may not be fool proof but it would make it harder to crack.
What about posting a lot of dummy numbers and showing the right ones with external CSS? Just as long the scraper doesn't start to parse the external CSS.
Don't output the numbers, i.e. prefix
echo $secretNumber;
with //.
For all those that recommend using Javascript, or CSS to obfuscate the numbers, well there's probably a way around it. Firefox has a plugin called abduction. Basically what it does is saves the page to a file as an image. You could probably modify this plugin to save the image, and then analyze the image to find out the secret number that is trying to be hidden.
Basically, if there's enough incentive behind scraping these numbers from the page, then it will be done. Otherwise, just post a regular number, and make it easier on your users so they won't have to worry so much about not being able to copy and paste the number, or other such problems the result from this trickery.
just do something unexpected and weird (different every time) w/ CSS box model. Force them to actually use a browser backed screenscraper.
I don't think this is possible, you can make their job harder (use images as some suggested here) but this is all you can do, you can't stop a determined person from getting the data, if you don't want them to scrape your data, don't publish it, as simple as that ...
Assuming these numbers are updated often (if they aren't then protecting them is completely moot as a human can just transcribe them by hand) you can limit automated scraping via throttling. An automated script would have to hit your site often to check for updates, if you can limit these checks you win, without resorting to obfuscation.
For pointers on throttling see this question.