How to upload a huge (GBs) file to a webserver [closed]

I am creating an application where I want to upload huge files.
Here is a short description of what the application tries to achieve:
Create a vmdk file from the user's physical machine (using the VMware Converter tool; vmdk files can be GBs in size).
Upload this vmdk file to a remote server.
The purpose of having the vmdk file on the remote server is accessibility:
a user who is away from his physical machine can later log in via a web console and
instantiate a virtual machine from this vmdk on the remote server.
I think this makes the situation different from normal file uploads (10-20 MB file uploads).
rsync/scp/sftp might help, but...
Would this be possible using a web interface?
If not, do I need to create a separate client for the end user to convert and upload his files efficiently?
Any help is appreciated.

Use a file transfer protocol for this, not HTTP. You need a protocol that can restart the transfer in the middle in case the connection breaks.
BTW, I don't mean to use FTP.
I'm not an expert on all the current file transfer protocols (I've been an FTP expert, which is why I recommend against it).
However, in this situation, I think you're off base in assuming you need transparency. All the users of this system will already have the VMware Converter software on their machines. I see no reason they couldn't also have a small program of yours that does the actual upload. If there's an API to the Converter software, your program could automate the entire process: they'd run your program before going home for the night, and it would convert the machine to a vmdk and then upload it.
Exactly which protocol to use, I don't know. That might take some experimentation. However, if the use of the protocol is embedded within your small application and in the service, then your users will not need to know which protocols you're experimenting with. You'll be able to change them as you learn more, especially if you distribute your small program in a form that allows auto-update.
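To make that concrete, here is a minimal sketch of a resumable upload client in Python, assuming a hypothetical server that reports how many bytes it already has (the Upload-Offset header is borrowed from the tus resumable-upload convention) and accepts ranged PUTs; the URL and header names are illustrative, not a real API:

```python
import os
import requests  # third-party: pip install requests

CHUNK = 4 * 1024 * 1024  # 4 MiB per request keeps a retry after a drop cheap

def resumable_upload(path, url):
    """Upload `path` to `url`, resuming where a broken transfer left off."""
    size = os.path.getsize(path)
    # Ask the (hypothetical) server how much it already has.
    offset = int(requests.head(url).headers.get("Upload-Offset", 0))
    with open(path, "rb") as f:
        f.seek(offset)
        while offset < size:
            chunk = f.read(CHUNK)
            headers = {
                "Content-Range": f"bytes {offset}-{offset + len(chunk) - 1}/{size}",
            }
            r = requests.put(url, data=chunk, headers=headers, timeout=60)
            r.raise_for_status()
            offset += len(chunk)

resumable_upload("machine.vmdk", "https://example.com/uploads/machine.vmdk")
```

If the connection breaks, rerunning the function picks up at whatever offset the server reports, which is exactly the restartability a bare browser upload lacks.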

If you insist on using a web interface for this, the only way to pull it off is with something similar to a signed Java applet (I can't speak to Flash or other similar technologies, but I'm sure they're similarly capable).
Once you've crossed this threshold of going to an applet-like control, then you have far more freedom about what and how you can do things.
There's nothing wrong with HTTP per se for uploading files, it's just that the generic browser is a crummy client for it (no restartability, as mentioned, is but one limitation).
But with an applet you can select any protocol you want, you can throttle uploads so as not to saturate the client's connection, and you can restart, send pieces, do checksums, whatever.
You don't need an entire webpage devoted to this; it can be a small component. It can even be an invisible component (fired via JS). But the key factor is that it has to be a SIGNED component. An unsigned component can't interact with the user's file system, so you'll need to get the component signed. It can be your own cert, etc. It follows much the same mechanics as normal web certificates.
Obviously the client browser will need to support your applet tech as well.

Rsync would be ideal if you can find a host that supports it.
It can restart easily, retransfer only the changed parts of a file if that's useful to you, and has built-in options to use ssh, compression, etc.
It can also confirm that the remote copy matches the local file without transferring very much data.
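For example, invoked from Python (the host and destination path are placeholders; the flags are standard rsync options):

```python
import subprocess

# --partial keeps a half-transferred file so a broken upload can restart,
# -z compresses on the wire, -c forces checksum comparison of existing data,
# and -e ssh tunnels everything over SSH.
subprocess.run(
    ["rsync", "--partial", "--progress", "-z", "-c",
     "-e", "ssh", "machine.vmdk", "user@example.com:/uploads/"],
    check=True,
)
```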

I would run parallel FTP streams to speed up the process.
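A hedged sketch of that idea in Python, using one ftplib connection per part; the host, credentials, and the server-side reassembly step are my own assumptions, not a ready-made recipe:

```python
import ftplib
import io
import os
from concurrent.futures import ThreadPoolExecutor

HOST, USER, PASS = "ftp.example.com", "user", "secret"  # placeholders
PARTS = 4

class _Slice(io.RawIOBase):
    """File-like view of one byte range, so each part streams from disk."""
    def __init__(self, path, offset, length):
        self.f = open(path, "rb")
        self.f.seek(offset)
        self.remaining = length

    def read(self, n=-1):
        if self.remaining <= 0:
            return b""
        n = self.remaining if n < 0 else min(n, self.remaining)
        data = self.f.read(n)
        self.remaining -= len(data)
        return data

def upload_part(path, index, offset, length):
    # One FTP connection per part; the server side must concatenate the
    # .partN files afterwards (e.g. cat machine.vmdk.part* > machine.vmdk).
    ftp = ftplib.FTP(HOST, USER, PASS)
    name = f"{os.path.basename(path)}.part{index}"
    ftp.storbinary(f"STOR {name}", _Slice(path, offset, length))
    ftp.quit()

def parallel_upload(path):
    size = os.path.getsize(path)
    part = -(-size // PARTS)  # ceiling division
    with ThreadPoolExecutor(max_workers=PARTS) as pool:
        futures = [pool.submit(upload_part, path, i, i * part, part)
                   for i in range(PARTS)]
        for fut in futures:
            fut.result()  # surface any per-part failure

parallel_upload("machine.vmdk")
```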

Related

Does VSCode remote-ssh extension in remote development store ANY info client side?

This is a security-oriented question. Basically, I'm reviewing the appdata for VS Code and I see a couple of cache files. I'm trying to figure out whether any of the file data is being transferred onto the client OS, since that would be a security violation. I don't see a firm answer on this anywhere. Microsoft saying that it's "sandboxed" isn't good enough for my security concerns; I need to be reasonably certain.
Basically, if vscode-remote is ultimately a renderer like an ssh terminal, it's okay; however, if it does even a small amount of plain-text caching on WINDOWS, that's a no-no, since ultimately I'd be bypassing the security of the server.
Just to be clear, my access is secured over ssh and approved, but my viewing on the client side is what's in question.
It appears to be okay (I haven't found any files in violation), but I need something firmer, and of course it needs to come from an official source (or offer direct proof to substantiate the use case as secure).
This is not actually my own answer; one of the VS Code development team (Chuck Lantz) responded to a direct question by email:
Okay, have an update. We don't currently have the equivalent of an "In-Private" mode in the browser context where all caching is in RAM.
You can, however, run VS Code in portable mode and keep the contents in a more secure location. This keeps all data relative to the application folder, so you could put some or all of it in an encrypted virtual hard drive or even on a remote file share (e.g. using SSHFS).
Portable Mode in Visual Studio Code
It defaults to using the system temp location for some content, but you can change that to a sub-folder as well. The location of the data folders by OS is also listed in the article.
Thanks Chuck!
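For reference, a minimal sketch of the portable-mode layout the article describes, assuming an install directory of your choosing (the path below is a placeholder):

```python
import os

# Assumed layout from the Portable Mode article: an empty `data` folder next
# to the VS Code executable switches it into portable mode, and an empty
# `data/tmp` folder pulls the temp location in as well.
vscode_dir = r"D:\tools\VSCode"  # placeholder install location
os.makedirs(os.path.join(vscode_dir, "data", "tmp"), exist_ok=True)
```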

Does IPFS host files you access by default?

I could not find a straight answer, so I am sorry if this has already been solved.
I was wondering: as with BitTorrent, when you download something using IPFS, does it automatically 'seed'/host it?
I also know you can stop seeding with torrents. If IPFS automatically hosts a file after downloading it, what would be a way to stop hosting a specific file? All files?
Edit: if it doesn't automatically host files after a download, how would one go about hosting a file on IPFS?
Thanks a ton.
To understand IPFS, the best thing to do is take the time to read the white paper.
The protocol used for data distribution is inspired by BitTorrent and is named BitSwap. To protect against leeches (free-loading nodes that never share), BitSwap uses a credit-like system.
So to answer your questions: yes, when you download some content it's automatically hosted (or at least part of it), and if you try to trick the protocol by not hosting the content, your credit will drop and you will not be able to participate in the network.
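In practice this is managed through pinning. A small sketch against the ipfs command line (assuming a local ipfs daemon is installed and running; the CID is a placeholder):

```python
import subprocess

def ipfs(*args):
    """Thin wrapper around the local `ipfs` CLI."""
    return subprocess.run(["ipfs", *args], check=True,
                          capture_output=True, text=True).stdout

# Host a file: `ipfs add` stores and pins it, so garbage collection keeps it.
print(ipfs("add", "report.pdf"))

# Stop hosting a specific file: unpin its CID, then garbage-collect.
ipfs("pin", "rm", "QmExampleCid")  # placeholder CID
ipfs("repo", "gc")                 # actually frees the unpinned blocks
```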

Securely copy files to a UNC path using .NET

I need to copy files from one server to a UNC path on the same network. The ASP.NET app uses .NET 2.0.
Currently we're just using a simple System.IO.File.Copy method, and it works just fine, but we were asked to make sure the files are transferred securely.
I can think of two ways to do this: either write a WCF or ASMX service, install an SSL certificate on the target server, and use that, or explicitly encrypt each file before calling File.Copy and then decrypt the file once it's copied.
Am I missing an option? Are there better ways to do this? If not, which option would be best for my requirement?
Thanks in advance.
My initial concern was that a person on my LAN could just launch a simple tool and get a copy of the files being copied between servers.
After asking a related question on superuser.com (can a file being copied over my LAN be sniffed?), I learned that even if a regular person is able to launch a popular sniffer tool like Wireshark and configure it to see the stream of files being copied over the network, it would not be an easy task to convert that stream back into a file. It would take a higher skill level to do that.
However, for safety, I'd go with encrypting the stream (a WCF or ASMX service over SSL) so that even if they can see the stream, it'd still be encrypted.
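For illustration, here is the encrypt-before-copy pattern sketched in Python rather than .NET 2.0; the paths, and the choice of the cryptography package's Fernet recipe, are my own assumptions, not the asker's stack:

```python
from cryptography.fernet import Fernet  # third-party: pip install cryptography

key = Fernet.generate_key()  # in practice, distribute this key out-of-band
fernet = Fernet(key)

# Encrypt locally, then copy only ciphertext across the network share.
with open("report.pdf", "rb") as src:
    ciphertext = fernet.encrypt(src.read())
with open(r"\\target-server\share\report.pdf.enc", "wb") as dst:
    dst.write(ciphertext)

# On the far side (or after the copy), reverse it.
with open(r"\\target-server\share\report.pdf.enc", "rb") as enc:
    plaintext = fernet.decrypt(enc.read())
```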

What's the most secure way to send data from A to B? [closed]

If I had let's say a sensitive report in PDF format and wanted to send it to someone, what is the most secure way?
Does a desktop application make it more secure, since we are basically doing client-to-server communication via a private IP address? Then add some kind of standard encryption algorithm to the data as you send it over the wire?
What about a web-based solution? With a web-based one you have a third party in the loop. Sure, it would do the same kind of encryption that I would have on a desktop, but now instead of client->server directly, you have client->server | server<-client... You also have exposure to the broad internet for any intruders to jump in, making yourself more open to a man-in-the-middle attack. One thing the web has going for it is digital certificates, but I think that is more authentication than authorization, which the desktop approach doesn't have to deal with?
Obviously from a usability point of view, a person wants to just go to a web page and download the report he's expecting. But most secure? Is desktop the answer? Or is it just too hard to do from a usability perspective?
OK, there seems to be some confusion. I am a software engineer facing a problem where business users have some secure documents that they need to distribute. I am just wondering whether using the web and SSL/CA is the standard solution to this, or whether a desktop application could be the answer.
The method that comes to mind as being very easy (as in it has been done a lot and is proven) is just distributing via a website secured with SSL. It's trivial to set up (it doesn't matter if you're running Windows, *nix, etc.) and is a familiar pattern for the user.
Setting up a thick client is likely more work because you have to do the encryption yourself (not difficult these days, but there is more to know in terms of following best practices). I don't think you'll gain much (any?) security from having to maintain a significantly larger body of code.
Most secure would be to print it, give it to a courier in a locked briefcase, and have the courier hand-deliver it. I think that'd be going overboard, though :)
In real-world terms, unless you're talking national security (in which case, see the courier option above) or Trade Secrets Which Could Doom Your Company (again, see the courier option above), having a well-encrypted file downloaded from the web is secure enough. Use PGP encryption (or similar), and I recommend the Encrypt and Sign option; make the original website a secure one as well, and you're probably fine.
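A minimal sketch of that Encrypt and Sign step, driving the GnuPG command line from Python (the key IDs are placeholders and are assumed to already be in the local keyring):

```python
import subprocess

# One-pass encrypt-and-sign with gpg: the recipient's public key encrypts,
# the sender's private key signs.
subprocess.run(
    ["gpg", "--encrypt", "--sign",
     "--recipient", "reader@example.com",
     "--local-user", "sender@example.com",
     "--output", "report.pdf.gpg",
     "report.pdf"],
    check=True,
)
```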
The other thing about a desktop application is: how is it getting the report? If it's not generating the report locally, it's really doing just as many steps as a web page: the app requests the report, the report is generated, the server notifies the client, the client downloads.
A third option, though, is to use something other than the website to download the reports. For instance, you could allow the user to request the report through the web, but provide a secure FTP (SFTP or FTPS) site or an AS2 (or AS3) connection for the actual download.
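As a sketch of that download leg, an SFTP pull using the paramiko library; the host, username, and paths are placeholders, and key-based authentication is assumed:

```python
import paramiko  # third-party: pip install paramiko

client = paramiko.SSHClient()
client.load_system_host_keys()  # verify the server against known hosts
client.connect("reports.example.com", username="reportuser")

sftp = client.open_sftp()
sftp.get("/secure/reports/q3-report.pdf", "q3-report.pdf")
sftp.close()
client.close()
```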
Using a secure file transfer (or managed file transfer) product is definitely the best option for securely transferring electronic data. There are smaller, more personal-use solutions out there like Dropbox, and Enterprise solutions like BiscomDeliveryServer.com.
Print it off, seal it in an envelope, hire some armed guards for protection and hand deliver it to them.
You may think it's a silly answer, but unless you can identify what your threat vectors are, any answer is pretty meaningless, since there is no guarantee it will address those threats.
Any system is only as secure as its weakest link. If you sent the document securely and the user downloaded/saved it to their desktop, then you'd be no better off than with an insecure system. Even worse, they could get the document and then send it on to loads of people who shouldn't see it, etc. That leads on to the question of whether you have an actual requirement that they can only view, and not download, the document. If not, why go to all this effort?
But if they are able to download it, then the most secure method may be to send them an email telling them that the document is available. They then connect to a system (web/ftp?) using credentials sent separately to authenticate their access.
I'm surprised no one has mentioned a PK-encryption-over-email solution. Everyone in the "enterprise" gets a copy of everyone else's public key and their own private key. Lots of tools exist to do the heavy lifting. Start with PGP and work from there.

ensuring uploaded files are safe

My boss has come to me and asked how to ensure that a file uploaded through a web page is safe. He wants people to be able to upload pdfs and tiff images (and the like), and his real concern is someone embedding a virus in a pdf that is then viewed/altered (and the virus executed). I just read something about a procedure that could be used to destroy steganographic information embedded in images by altering the least significant bits. Could a similar process be used to ensure that a virus isn't implanted? Does anyone know of any programs that can scrub files?
Update:
So the team argued about this a little bit, and one developer found a post about letting the file download to the file system and having the antivirus software that protects the network check the files there. The poster essentially said that it was too difficult to use the API or the command line for a couple of products. This seems a little kludgy to me, because we are planning on storing the files in the db, but I haven't had to scan files for viruses before. Does anyone have any thoughts or experience with this?
http://www.softwarebyrob.com/2008/05/15/virus-scanning-from-code/
I'd recommend running your uploaded files through antivirus software such as ClamAV. I don't know about scrubbing files to remove viruses, but this will at least allow you to detect and delete infected files before you view them.
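A minimal sketch of wiring ClamAV's clamscan into an upload handler (the file path is a placeholder; clamdscan against a running daemon would be faster, but the exit-code contract is the same):

```python
import subprocess

def is_clean(path):
    """clamscan exits 0 for clean files and 1 when a signature matches."""
    result = subprocess.run(["clamscan", "--no-summary", path])
    if result.returncode == 1:
        return False            # infected: reject or quarantine the upload
    result.check_returncode()   # 2 or higher means the scan itself failed
    return True

if not is_clean("/tmp/upload-1234.pdf"):
    print("rejecting infected upload")
```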
Viruses embedded in image files are unlikely to be a major problem for your application. What will be a problem is JAR files. Image files with JAR trailers can be loaded from any page on the Internet as a Java applet, with same-origin bindings (cookies) pointing into your application and your server.
The best way to handle image uploads is to crop, scale, and transform them into a different image format. Images should have different sizes, hashes, and checksums before and after transformation. For instance, Gravatar, which provides the "buddy icons" for Stack Overflow, forces you to crop your image, and then translates it to a PNG.
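A minimal sketch of that transform using the Pillow library (the target size and output format are arbitrary choices, not a hardened recipe):

```python
from PIL import Image  # third-party: pip install Pillow

# Re-encoding keeps only pixel data, discarding metadata and anything
# appended after the image (such as a JAR trailer).
img = Image.open("upload.jpg").convert("RGB")
img.thumbnail((512, 512))           # rescaling changes the byte stream too
img.save("upload-clean.png", "PNG")
```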
Is it possible to construct a malicious PDF or DOC file that will exploit vulnerabilities in Word or Acrobat? Probably. But ClamAV is not going to do a very good job at stopping those attacks; those aren't "viruses", but rather vulnerabilities in viewer software.
It depends on your company's budget but there are hardware devices and software applications that can sit between your web server and the outside world to perform these functions. Some of these are hardware firewalls with anti-virus software built in. Sometimes they are called application gateways or application proxies.
Here are links to an open source gateway that uses Clam-AV:
http://en.wikipedia.org/wiki/Gateway_Anti-Virus
http://gatewayav.sourceforge.net/faq.html
You'd probably need to chain an actual virus scanner to the upload process (the same way many virus scanners ensure that a file you download in your browser is safe).
In order to do this yourself, you'd have to keep it up to date, which means keeping libraries of virus definitions around, which is likely beyond the scope of your application (and may not even be feasible depending on the size of your organization).
Yes, ClamAV should scan the file regardless of the extension.
Use a reverse proxy setup such as
www <-> HAVP <-> webserver
HAVP (http://www.server-side.de/) is a way to scan HTTP traffic through ClamAV or any other commercial antivirus software. It will prevent users from downloading infected files.
If you need HTTPS or anything else, you can put another reverse proxy or a web server in reverse-proxy mode in front of HAVP to handle the SSL.
Nevertheless, it does not work at upload time, so it will not prevent the files from being stored on your servers, but it will prevent them from being downloaded and thus propagated. So use it together with regular file scanning (e.g. clamscan).
