File upload fails when file is over 60K in size - Linux

I have been working to convert an in-house application away from FTP, as the security team has told us to get off FTP. So I've been using HTTP uploads instead, and for the most part it works very well. Our environment is a mishmash of Linux, HP-UX, Solaris, and AIX. On our Linux servers curl is universally available, so I have been using curl's POST capabilities for uploads, and it has worked flawlessly. Unfortunately, the Unix machines rarely have curl, or even wget, so I wrote a GET script in Perl, which works fine, and a POST script in Perl (lifted and adapted from elsewhere on the web) which works brilliantly on Unix, up until the data being uploaded exceeds about 60K (which curl handles fine on Linux, by the way). Beyond that, the Apache error log starts spitting out:
CGI.pm: Server closed socket during multipart read (client aborted?).
No such error ever occurs when I use curl for the upload. Here's my POST script, using Socket, since LWP is not available on every server, and not at all on any of the Unix servers.
#!/usr/bin/perl -w
use strict;
use Socket;

my $v = 'dcsm';
my $upfile   = $ARGV[0] or die 'Upload File not found or not specified.' . "\n";
my $hostname = $ARGV[1] or die 'Hostname not specified.' . "\n";

$| = 1;

my $host = "url.mycompany dot com";
my $url  = "/csmtar.cgi";

my $start = times;
my ( $iaddr, $paddr, $proto );
$iaddr = inet_aton($host);
$paddr = sockaddr_in( 80, $iaddr );
$proto = getprotobyname('tcp');

unless ( socket( SOCK, PF_INET, SOCK_STREAM, $proto ) ) {
    die "ERROR : init socket: $!";
}
unless ( connect( SOCK, $paddr ) ) {
    die "no connect: $!\n";
}

my $length = 0;
open( UH, "< $upfile" ) or warn "$!\n";
$length += -s $upfile;

my $boundary = 'nn7h23ffh47v98';
my @head = (
    "POST $url HTTP/1.1",
    "Host: $host",
    "User-Agent: z-uploader",
    "Content-Length: $length",
    "Content-Type: multipart/form-data; boundary=$boundary",
    "",
    "--$boundary",
    "Content-Disposition: form-data; name=\"hostname\"",
    "",
    "$hostname",
    "--$boundary",
    "Content-Disposition: form-data; name=\"ren\"",
    "",
    "true",
    "--$boundary",
    "Content-Disposition: file; name=\"filename\"; filename=\"$upfile\"",
    "--$boundary--",
    "",
    "",
);
my $header = join( "\r\n", @head );
$length += length($header);
$head[3] = "Content-Length: $length";
$header  = join( "\r\n", @head );
$length  = -s $upfile;
$length += length($header);

select SOCK;
$| = 1;
print SOCK $header;

while ( sysread( UH, my $buf, 8196 ) ) {
    if ( length($buf) < 8196 ) {
        $buf = $buf . "\r\n--$boundary";
        syswrite SOCK, $buf, length($buf);
    } else {
        syswrite SOCK, $buf, 8196;
    }
    print STDOUT '.';
}
close UH;
shutdown SOCK, 1;

my @data = (<SOCK>);
print STDOUT "result->@data\n";
close SOCK;
Anybody see something that jumps out at them?
UPDATE:
I made the following updates, and the errors appear to be unchanged.
To address the Content-Length issue, and to eliminate the chance of the final sysread returning exactly the buffer size (which would skip appending the final boundary), I made the following code update.
my $boundary = 'nn7h23ffh47v98';
my $content = <<EOF;
--$boundary
Content-Disposition: form-data; name="hostname"
$hostname
--$boundary
Content-Disposition: file; name="filename"; filename="$upfile"
--$boundary--
EOF
$length += length($content);
my $header = <<EOF;
POST $url HTTP/1.1
Host: $host
User-Agent: z-uploader
Content-Length: $length
Content-Type: multipart/form-data; boundary=$boundary
EOF
$header .= $content;
select SOCK;
$| = 1;
print SOCK $header;
my $incr = ( $length + 100 ) / 20;
$incr = sprintf( "%.0f", $incr );
while ( sysread( UH, my $buf, $incr ) ) {
    syswrite SOCK, $buf, $incr;
}
syswrite SOCK, "\n--$boundary", $incr;

You are asking whether "something jumps out" from looking at the code.
Two things jump out at me:
1) The Content-Length parameter in a POST HTTP message specifies the exact byte count of the entity portion of the HTTP message. See section 4.4 of RFC 2616.
You are setting the Content-Length: header to the exact size of the file you're uploading. Unfortunately, in addition to the file itself, you are also sending the MIME part headers and boundary delimiters.
The "entity" portion of the HTTP message, as defined by RFC 2616, consists of everything after the blank line that follows the last HTTP header. Everything below that point must be counted in the Content-Length: header. The Content-Length is NOT the size of the file you're uploading, but the size of the HTTP message's entire entity portion, which follows the headers.
2) Ignoring the broken Content-Length: header, if the size of the file happens to be an exact multiple of 8196 bytes, the MIME document you are constructing will most likely be corrupted. Your last sysread() call will get the last 8196 bytes in the file, which you will happily copy through; the next call to sysread() will return 0, and you will terminate the loop without ever emitting the trailing boundary delimiter. The MIME document will be corrupt.
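For illustration, here is a minimal sketch of the corrected framing, under the same assumptions as the question's script ($host, $url, and $boundary reused; the "ren" field and most error handling omitted for brevity). The Content-Length covers the whole entity (part headers, file bytes, and closing delimiter), and the closing delimiter is written after the copy loop, so it is sent even when the file length is an exact multiple of the buffer size:
#!/usr/bin/perl -w
use strict;
use Socket;

my ( $upfile, $hostname ) = @ARGV;
my $host     = "url.mycompany dot com";    # as obfuscated in the question
my $url      = "/csmtar.cgi";
my $boundary = 'nn7h23ffh47v98';

# Everything that precedes the file's bytes on the wire.
my $preamble =
    "--$boundary\r\n"
  . "Content-Disposition: form-data; name=\"hostname\"\r\n"
  . "\r\n"
  . "$hostname\r\n"
  . "--$boundary\r\n"
  . "Content-Disposition: form-data; name=\"filename\"; filename=\"$upfile\"\r\n"
  . "\r\n";

# Everything that follows the file's bytes: CRLF plus the closing delimiter.
my $epilogue = "\r\n--$boundary--\r\n";

# Content-Length must cover the WHOLE entity: parts, file bytes, and trailer.
my $length = length($preamble) + ( -s $upfile ) + length($epilogue);

my $header =
    "POST $url HTTP/1.1\r\n"
  . "Host: $host\r\n"
  . "User-Agent: z-uploader\r\n"
  . "Content-Length: $length\r\n"
  . "Content-Type: multipart/form-data; boundary=$boundary\r\n"
  . "\r\n";

socket( SOCK, PF_INET, SOCK_STREAM, getprotobyname('tcp') ) or die "socket: $!";
connect( SOCK, sockaddr_in( 80, inet_aton($host) ) ) or die "connect: $!";
select SOCK;
$| = 1;

print SOCK $header, $preamble;
open( UH, '<', $upfile ) or die "open: $!";
binmode UH;
# The trailing boundary is written AFTER the loop, so it goes out even when
# the last read fills the buffer exactly.
while ( sysread( UH, my $buf, 8192 ) ) {
    print SOCK $buf;
}
close UH;
print SOCK $epilogue;

shutdown SOCK, 1;
my @response = <SOCK>;
print STDOUT "result->@response\n";
close SOCK;
On hosts where LWP happens to be available, HTTP::Request::Common's form-data support takes care of all of this framing for you.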

Related

Reading and sending a PDF from Node to client - blank pages

I'm currently reading a PDF via a Node backend, sending it through an API gateway layer and back to the client. When the response hits the client, however, the PDF is downloaded with the correct number of pages but is completely blank. I've tried setting the encoding in a number of ways, but with no luck. When setting the encoding to binary and running a diff of the downloaded PDF against the original PDF, there are no visible differences, even though the file sizes differ.
Node backend:
export async function generatePDF (req, res, next) {
  try {
    const fStream = fs.createReadStream(path.join(__dirname, 'businesscard.pdf'), { encoding: 'binary' }) // have also tried without the binary encoding
    return fStream.pipe(res)
  } catch (err) {
    res.send(err)
  }
}
The API gateway simply sends a request to the Node backend and sets the content type before sending it on:
res.setHeader('Content-Type', 'application/pdf')
Frontend:
function retrievePDF () {
  return fetch('backendurlhere', {
    method: 'GET',
    headers: { 'Content-Type': 'application/pdf' },
    credentials: 'include'
  })
    .then(response => {
      return response.text()
    })
    .catch(err => {
      console.log('ERR', err)
    })
}
retrievePDF is called and then the following is performed via a React component:
generatePDF () {
  this.props.retrievePDF()
    .then(pdfString => {
      const blob = new Blob([pdfString], { type: 'application/pdf' })
      const objectUrl = window.URL.createObjectURL(blob)
      window.open(objectUrl)
    })
}
The string representation of the response looks a bit like this (simply a sample):
%PDF-1.4
1 0 obj
<<
/Title (þÿ)
/Creator (þÿ)
/Producer (þÿQt 5.5.1)
/CreationDate (D:20171003224921)
>>
endobj
2 0 obj
<<
/Type /Catalog
/Pages 3 0 R
>>
endobj
4 0 obj
<<
/Type /ExtGState
/SA true
/SM 0.02
/ca 1.0
/CA 1.0
/AIS false
/SMask /None>>
endobj
5 0 obj
[/Pattern /DeviceRGB]
endobj
6 0 obj
<<
/Type /Page
/Parent 3 0 R
/Contents 8 0 R
/Resources 10 0 R
/Annots 11 0 R
/MediaBox [0 0 142 256]
>>
endobj
10 0 obj
<<
/ColorSpace <<
/PCSp 5 0 R
/CSp /DeviceRGB
/CSpg /DeviceGray
>>
/ExtGState <<
/GSa 4 0 R
>>
/Pattern <<
>>
/Font <<
/F7 7 0 R
>>
/XObject <<
>>
>>
endobj
11 0 obj
[ ]
endobj
8 0 obj
<<
/Length 9 0 R
/Filter /FlateDecode
>>
stream
xåW]kÂ0}ϯ¸ÏÕ$mÆ`V6{{ºÊûûKÓ´vS¥N_f°WsÒ{ÏýÈMÛ»<ÑëzÙä¦Af&»q^©4MlE+6fcw-äUwp?ÖÓ%ëºX93Éî/tã¾·næ5Å¢trîeaiÎx-ù7vFËCí5nl¢¸Myláïmå·Ïgö²G±T ¹ïÒZk¢ð£¹¼)<äµµwm7ösÖ2¿P#¥ryëþèò]pÎÅ%åïÌDRqÿ)ôHTxpÄQOtjTI"ØBGd¤º
¢=¢£8Ú¶c¢téÑIþ¶c¡¶æ.ÇK»¾
ä¥.Inþ)(ÚbX¹Mqs«b²5B¡vÚ ò·ÚNeçmÇ.![¨±87¿ÜÂõ[H ¢à>ëRÄ]ZNæÚÂú¿·PWÒU4¢ØR]Ê®Kj±6\\ÐNØFG¬Ô;ÝRLüݱP[>·~'½%ä8M8丸0ýiiÕ}ت³S$=N*s'>¹³§VùGfûÉU`ËÁ¥wú®FéC^½"òºBcö
Ùå#endstream
endobj
The HTTP response looks as follows:
access-control-allow-credentials: true
access-control-allow-origin: http://frontend.dev.com
access-control-expose-headers: api-version, content-length, content-md5, content-type, date, request-id, response-time
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/octet-stream
Date: Wed, 09 May 2018 09:37:22 GMT
Server: nginx/1.13.3
Transfer-Encoding: chunked
vary: origin
I've also tried other methods of reading the file, such as readFileSync, and constructing chunks via fStream.on('data') and sending back as a Buffer. Nothing seems to work.
Note: I'm using Restify (not express)
Edit:
Running the file through a validator shows the following:
File teststring.pdf
Compliance pdf1.4
Result Document does not conform to PDF/A.
Details
Validating file "teststring.pdf" for conformance level pdf1.4
The 'xref' keyword was not found or the xref table is malformed.
The file trailer dictionary is missing or invalid.
The "Length" key of the stream object is wrong.
Error in Flate stream: data error.
The "Length" key of the stream object is wrong.
Error in Flate stream: data error.
The document does not conform to the requested standard.
The file format (header, trailer, objects, xref, streams) is corrupted.
The document does not conform to the PDF 1.4 standard.
Done.
For anyone having issues: I found out that in my gateway layer, the request was wrapped in a utility function that performed a text read on the response, i.e.
return response.text()
Reading the body as text decodes the raw bytes as UTF-8, which corrupts binary content like a PDF. I removed this and instead piped the response from the backend:
fetch('backendurl').then(({ body }) => { body.pipe(res) })
Hopefully this helps anyone employing the gateway pattern with similar issues.

Using SFTP to transfer images from an HTML form to a remote Linux server using Perl/CGI.pm

This is a school project, and the instructor has no knowledge of how to write the code.
I am using CGI and I am attempting to transfer a file without using Net::FTP or Net::SFTP since the server I am attempting to transfer it to will not allow connections from these services. I have written the HTML form and I am able to grab the name of the file uploaded through CGI.
Is it possible to use the SFTP command within a Perl script that resides on a Linux server using bash to transfer a file uploaded through an HTML form?
If anyone knows a way to do it, please post the code so I can modify it and insert it into my script.
use CGI qw(:standard);
use File::Basename;

# $productimage holds the uploaded filename, grabbed earlier via CGI;
# $safechars is a whitelist of allowed filename characters, defined elsewhere.
my ( $name, $path, $extension ) = fileparse( $productimage, '\..*' );
$productimage = $name . $extension;
$productimage =~ tr/ /_/;
$productimage =~ s/[^$safechars]//g;

if ( $productimage =~ /^([$safechars]+)$/ ) {
    $productimage = $1;
} else {
    die "Filename contains invalid characters";
}

$fh = upload('image');
$uploaddir = "../../.hidden/images";
open( UPLOADFILE, ">$uploaddir/$productimage" ) or die "$!";
binmode UPLOADFILE;
while (<$fh>) {
    print UPLOADFILE;
}
close UPLOADFILE;
This is the code I used to upload the file into the server.
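As to whether the system sftp command can be driven from a Perl script: yes, provided the account can authenticate non-interactively (e.g. with SSH keys), because sftp's batch mode cannot answer a password prompt. A rough sketch, with placeholder host, user, and paths:
#!/usr/bin/perl
# Sketch only: push the file saved by the CGI above to a remote host via the
# system sftp client in batch mode. Host, user, and paths are placeholders.
use strict;
use warnings;
use File::Temp qw(tempfile);

my $local   = '../../.hidden/images/upload.jpg';   # file written by the upload code
my $remote  = 'user@remote.example.com';
my $destdir = '/var/www/images';

# Write a one-off batch file of sftp commands.
my ( $bh, $batch ) = tempfile( UNLINK => 1 );
print $bh "put $local $destdir/\n";
close $bh;

# -b runs the batch file non-interactively and exits non-zero on failure.
system( 'sftp', '-b', $batch, $remote ) == 0
    or die "sftp failed: exit status $?";
For a single file, scp works the same way and avoids the batch file altogether.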

Encrypting/decrypting some file types with Rijndael 256 (CakePHP Security library) garbles contents

I am using CakePHP's Security::rijndael() function to encrypt and decrypt text and files. I previously wrote some code using mcrypt directly, which worked in the same way, but then I found Security::rijndael and realised I had reinvented the wheel. So the problem I have happens either way.
If I encrypt a string, or a text file, or a PDF document, the code below works perfectly and I get the correct decrypted string/file. However, if I try encrypting a .doc, .docx or an image file, the decrypted file is garbled.
Here's the code that does the encrypting/decrypting
public static function encrypt($plainText, $key) {
    $plainText = base64_encode($plainText);
    // Hash key to ensure it is long enough
    $hashedKey = Security::hash($key);
    $cipherText = Security::rijndael($plainText, $hashedKey, 'encrypt');
    return base64_encode($cipherText);
}

public static function decrypt($cipherText, $key) {
    $cipherText = base64_decode($cipherText);
    $hashedKey = Security::hash($key);
    $plainText = Security::rijndael($cipherText, $hashedKey, 'decrypt');
    return base64_decode($plainText);
}
...and this code actually presents the file to the user (I've edited the code to keep it simple):
public function download($id) {
    App::uses('File', 'Utility');
    $key = $this->User->getDocumentKey($id);
    $file = new File('my_encrypted_file.docx');
    $encrypted = $file->read();
    $decrypted = Encrypt::decrypt($encrypted, $key);
    header('Cache-Control: no-store, no-cache, must-revalidate');
    header('Content-Disposition: attachment; filename="my_decrypted_file.docx"');
    echo $decrypted;
    die();
}
Update - it appears that the encryption is a red herring, as the file is garbled even without encrypting and decrypting it! The following produces exactly the same broken file:
header('Content-Disposition: attachment; filename="test.docx"');
$file = new File($this->data['Model']['file']['tmp_name']);
echo $file->read();
die();
I think I know the reason for the problem now: it is line 208 in Security.php:
$out .= rtrim(mcrypt_decrypt($algorithm, $cryptKey, $text, $mode, $iv), "\0");
Since PHP's mcrypt uses zero-byte padding, this line removes the padding after decryption.
The problem is that a .docx file (as far as I could check) ends with a few null characters. If you remove even a single one of them, Word fails to open the file.
So what happens is that rtrim() also deletes those bytes, even though they are not part of the padding.
To fix this, you can add a termination character (for example X) to the end of your files before encrypting and remove it after decrypting. This prevents the trailing zero bytes of the .docx files from being cut off:
public static function encrypt($plainText, $key) {
    $plainText = base64_encode($plainText . "X"); // `X` terminates the file
    /* do encryption */
}

public static function decrypt($cipherText, $key) {
    /* do decryption */
    return rtrim(base64_decode($plainText), "X"); // cut off the terminating `X`
}
Well, I was barking up the wrong tree.
For whatever reason (whitespace at the start of some PHP file, maybe?), adding ob_clean(); immediately after sending the headers has fixed the problem.

Nginx: how to view what files are in downloading state?

I need to see whether someone is downloading a specific file at this moment. The underlying problem: I would like to know when someone interrupts downloading a file.
The server configuration:
server {
    listen 80;
    server_name mysite.com;

    location /ugp/ {
        alias /www/ugp/;
    }

    break;
}
A user can download files from http://mysite.com/ugp/, for example http://mysite.com/ugp/1.mp3.
UPDATE.
It's not so obvious how to do this by analyzing access.log. Some clients leave a 206 status code when the user stops downloading (Google Chrome); some do not (HTC Streaming Player, a mobile application):
85.145.8.243 - - [18/Jan/2013:16:08:41 +0300] "GET /ugp/6.mp3 HTTP/1.1" 200 10292776 "-" "HTC Streaming Player htc_wwe / 1.0 / htc_ace / 2.3.5"
85.145.8.243 - - [18/Jan/2013:16:08:41 +0300] "GET /ugp/2.mp3 HTTP/1.1" 200 697216 "-" "HTC Streaming Player htc_wwe / 1.0 / htc_ace / 2.3.5"
85.145.8.243 - - [18/Jan/2013:16:09:44 +0300] "GET /ugp/7.mp3 HTTP/1.1" 200 4587605 "-" "HTC Streaming Player htc_wwe / 1.0 / htc_ace / 2.3.5"
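One way to approximate detection from the log anyway is to compare the bytes nginx reports as sent (the field after the status code, $body_bytes_sent in the default combined format) with the file's size on disk. A rough sketch in Perl; the docroot and log format here are assumptions based on the config above:
#!/usr/bin/perl
# Sketch: flag probably-interrupted downloads by comparing bytes actually
# sent (as logged by nginx) against the file size on disk. Assumes the
# default combined log format and the /www docroot implied by the alias.
use strict;
use warnings;

my $docroot = '/www';

while ( my $line = <> ) {
    # ip - user [time] "GET /uri HTTP/1.1" status bytes "referer" "agent"
    next unless $line =~ m{"GET (\S+) HTTP/[\d.]+" (\d{3}) (\d+)};
    my ( $uri, $status, $sent ) = ( $1, $2, $3 );
    next unless $status == 200 || $status == 206;

    my $file = $docroot . $uri;
    next unless -f $file;
    my $size = -s $file;

    printf "interrupted? %s (%d of %d bytes sent)\n", $uri, $sent, $size
        if $sent < $size;
}
Run it as, say, perl check-downloads.pl /var/log/nginx/access.log. This only tells you after the fact; for live state, the status-module route mentioned below is a better fit.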
To get more flexibility, you can add a PHP layer and publish URLs like
http://mysite.com/download.php?file=2.mp3
where download.php reads the file from its real location (e.g. /var/www/files/2.mp3) and streams it while tracking state. Suggested code below (you would also need to emit the appropriate download headers):
<?php
// Here you know precisely that a download is requested
$path = '/home/var/www/files';
$file = "$path/" . $_GET['file']; // check user input!
$size = filesize($file);
$read = 0;
$state = 'downloading ' . $file;

// echo headers for a binary file download, then stream it
$f = fopen($file, 'rb');
if ($f) {
    // Output 100 bytes in each iteration
    while (!feof($f)) {
        $chunk = fread($f, 100);
        if ($chunk === false) {
            break;
        }
        // update $state somewhere: still downloading
        echo $chunk;
        $read += strlen($chunk);
    }
    fclose($f);
}

if ($read >= $size) {
    $state = 'done';
} else {
    $state = 'fail';
}
?>
This is an example to outline the algorithm - not tested at all! But it should be pretty close to working download code (usually readfile() is used, but that reads the whole file in one shot).
Have a look at the nginx extended status module.

mod_rewrite to download - makes suspicious-looking file

I decided to try and use mod_rewrite to hide the location of a file that a user can download.
So they click on a link that's directed to "/download/some_file/" and they instead get "/downloads/some_file.zip"
Implemented like this:
RewriteRule ^download/([^/\.]+)/?$ downloads/$1.zip [L]
This works, except that when the download prompt appears the user gets a file named "download" with no extension, which looks suspicious, and they might not realize they are supposed to unzip it. Is there a way of doing this so it looks like an actual file? Or is there a better way I should be doing this?
To provide some context for hiding the file's location: this is for a band whose music can be downloaded for free provided the user signs up for the mailing list.
Also note I need to do this within .htaccess.
You can set the filename by sending the Content-Disposition header:
https://serverfault.com/questions/101948/how-to-send-content-disposition-headers-in-apache-for-files
OK, so I believe I'm restricted as to which headers I can set using .htaccess.
So I have instead solved this using PHP.
I initially copied a download PHP script found here:
How to rewrite and set headers at the same time in Apache
However, my file was too big, so this was not working properly.
After a bit of googling I came across this: http://teddy.fr/blog/how-serve-big-files-through-php
So my complete solution is as follows...
First send requests to download script:
RewriteRule ^download/([^/\.]+)/?$ downloads/download.php?download=$1 [L]
Then get full filename, set headers, and serve it chunk by chunk:
<?php
if ($_GET['download']) {
    $file = $_SERVER['DOCUMENT_ROOT'] . 'media/downloads/' . $_GET['download'] . '.zip';
}

define('CHUNK_SIZE', 1024*1024); // Size (in bytes) of each chunk

// Read a file and output its content chunk by chunk
function readfile_chunked($filename, $retbytes = TRUE) {
    $buffer = '';
    $cnt = 0;
    $handle = fopen($filename, 'rb');
    if ($handle === false) {
        return false;
    }
    while (!feof($handle)) {
        $buffer = fread($handle, CHUNK_SIZE);
        echo $buffer;
        ob_flush();
        flush();
        if ($retbytes) {
            $cnt += strlen($buffer);
        }
    }
    $status = fclose($handle);
    if ($retbytes && $status) {
        return $cnt; // return num. bytes delivered like readfile() does.
    }
    return $status;
}

$save_as_name = basename($file);
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: no-cache');
header("Content-Type: application/zip");
header("Content-Disposition: attachment; filename=\"$save_as_name\"");
readfile_chunked($file);
?>
