Introduction
Say that on the same local network we have two Node.js servers set up with Express: Server A for the API and Server F for the form.
Server A is an API server that takes a request and saves it to a MongoDB database (files are stored as a Buffer and their details as other fields).
Server F serves up a form, handles the form post, and sends the form's data to Server A.
What is the most efficient way to send files between two NodeJS servers where the receiving server is Express API? Where does the file size matter?
1. HTTP Way
If the files I'm sending are PDF files (that won't exceed 50 MB), is it efficient to send the whole contents as a string over HTTP?
The algorithm is as follows:
Server F handles the file request using https://www.npmjs.com/package/multer and saves the file,
then Server F reads this file and makes an HTTP request via https://github.com/request/request along with some details on the file.
Server A receives this request, turns the file contents from a string into a Buffer, and saves a record in MongoDB along with the file details.
In this algorithm, both Server A (when storing into MongoDB) and Server F (when sending it over to Server A) read the whole file into memory, and the request between the two servers is about the same size as the file. (Are 50 MB requests alright?)
However, one thing to consider is that -with this method- I would be using the ExpressJS style of API for the whole process and it would be consistent with the rest of the app where the /list, /details requests are also defined in the routes. I like consistency.
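For comparison, here is a minimal sketch of the same HTTP route where the sending side pipes the file instead of reading it whole. It uses only Node's built-in http and fs modules rather than request. The /files path, the X-File-Name header, and the function names are made up for illustration, and the receiver still collects the body in memory because the setup above stores files as a Buffer.

```javascript
const fs = require('fs');
const http = require('http');

// Receiving side (the role of Server A): collect the request body chunk by
// chunk, then hand the file name and the complete Buffer to onFile.
function createReceiver(onFile) {
  return http.createServer(function (req, res) {
    const chunks = [];
    req.on('data', function (chunk) { chunks.push(chunk); });
    req.on('end', function () {
      onFile(req.headers['x-file-name'], Buffer.concat(chunks));
      res.writeHead(204);
      res.end();
    });
  });
}

// Sending side (the role of Server F): pipe the file into the request so the
// sender holds only a chunk at a time. Resolves with the response status.
function sendFile(filePath, fileName, port) {
  return new Promise(function (resolve, reject) {
    const req = http.request({
      host: '127.0.0.1',
      port: port,
      method: 'POST',
      path: '/files',               // hypothetical route
      headers: {
        'Content-Type': 'application/octet-stream',
        'X-File-Name': fileName     // hypothetical header carrying file details
      }
    }, function (res) {
      res.resume();                 // drain the empty response body
      res.on('end', function () { resolve(res.statusCode); });
    });
    req.on('error', reject);
    const file = fs.createReadStream(filePath);
    file.on('error', reject);
    file.pipe(req);                 // pipe() ends the request when the file ends
  });
}
```

With this shape, only the receiving side needs memory proportional to the file size, and the request handler could still live in an Express route.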
2. Socket.IO Way
In contrast to this algorithm, I've explored the https://github.com/nkzawa/socket.io-stream way, which breaks the consistency of the HTTP API on Server A (the handlers for socket.io events are defined not in the routes but in the file that has var server = http.createServer(app);).
Server F handles the form data as such in routes/some_route.js:
router.post('/', multer({dest: './uploads/'}).single('file'), function (req, res) {
    var api_request = {};
    api_request.name = req.body.name;
    //add other fields to api_request ...
    var has_file = req.hasOwnProperty('file');
    var io = require('socket.io-client');
    var transaction_sent = false;
    var socket = io.connect('http://localhost:3000');
    socket.on('connect', function () {
        console.log("socket connected to 3000");
        if (transaction_sent === false) {
            var ss = require('socket.io-stream');
            var stream = ss.createStream();
            ss(socket).emit('transaction new', stream, api_request);
            if (has_file) {
                var fs = require('fs');
                var filename = req.file.destination + req.file.filename;
                console.log('sending with file: ', filename);
                fs.createReadStream(filename).pipe(stream);
            }
            if (!has_file) {
                console.log('sending without file.');
            }
            transaction_sent = true;
            //get the response via socket
            socket.on('transaction new sent', function (data) {
                console.log('response from 3000:', data);
                //there might be a better way to close socket. But this works.
                socket.close();
                console.log('Closed socket to 3000');
            });
        }
    });
});
I said I'd be dealing with PDF files that are < 50 MB. However, if I use this program to send larger files in the future, is socket.io a better way to handle 1 GB files, since it uses streams?
This method does send the file and the details across, but I'm new to this library and don't know whether it should be used for this purpose or whether there is a better way of utilizing it.
Final thoughts
What alternative methods should I explore?
Should I send the file over SCP and make an HTTP request with file details including where I've sent it- thus, separating the protocols of files and API requests?
Should I always use streams because they don't store the whole file into memory? (that's how they work, right?)
This https://github.com/liamks/Delivery.js ?
References:
"File/Data transfer between two node.js servers" (this got me to try the socket.io-stream way)
"transfer files between two node.js servers over http" (for the HTTP way)
There are plenty of ways to achieve this, but not so many to do it right!
Socket.IO and WebSockets are efficient when you use them with a browser, but since you don't, there is no need for them.
The first method you can try is the built-in net module of Node.js: it makes a TCP connection between the servers and passes the data over it.
Keep in mind that you need to send chunks of data, not the entire file at once. The socket.write method of the net module seems to be a good fit for your case; see https://nodejs.org/api/net.html
Depending on the size of your files and the number of concurrent transfers, though, memory consumption can still be quite large.
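A rough sketch of the net approach, assuming one transfer at a time and a destination path chosen by the receiver. The function names are hypothetical. It relies on pipe(), which calls socket.write() for each chunk and pauses reading while the socket buffer is full, so the sender does not load the whole file.

```javascript
const fs = require('fs');
const net = require('net');

// Receiver: write the bytes of an incoming TCP connection to destPath and
// call onDone once the file has been flushed. This sketch handles one
// transfer at a time because every connection writes to the same path.
function createFileReceiver(destPath, onDone) {
  return net.createServer(function (socket) {
    const out = fs.createWriteStream(destPath);
    socket.pipe(out);
    out.on('finish', function () { onDone(destPath); });
  });
}

// Sender: connect, then pipe the file into the socket chunk by chunk.
// Resolves when the connection has been closed.
function sendFileOverTcp(filePath, port, host) {
  return new Promise(function (resolve, reject) {
    const socket = net.connect(port, host, function () {
      const file = fs.createReadStream(filePath);
      file.on('error', reject);
      file.pipe(socket);            // ends the socket when the file ends
    });
    socket.on('error', reject);
    socket.on('close', resolve);
  });
}
```

A raw TCP stream carries no file name or other details, so those would need a small protocol of their own or a separate API request.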
If you are running Linux on both servers, you could even send the files at the operating-system level with a simple Linux command called scp:
nohup scp -rpC /var/www/httpdocs/* remote_user@remote_domain.com:/var/www/httpdocs &
You can even do this from Windows to Linux or the other way around.
The scp client for Windows is pscp.exe, available from the PuTTY download page:
http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
Hope this helps !
Related
I'm trying to implement NodeJS and Socket.io for real-time communication between two devices (PCs and smartphones) in my company's product.
Basically, what I want to achieve is sending a notification to all online users when somebody changes something in a file.
All the basic functionality for saving the updates is already there, so once everything is stored and calculated, I send a POST request to my Node server saying that something changed and that it needs to notify the users.
The problem now is changing code in the NodeJS scripts. As long as I work alone, I can just upload the new files via FTP and restart the pm2 service, but when my colleagues start working with me on this story, we will have problems merging our changes without overwriting each other's work.
Launching a local server is also not possible, because we need the connection between our current server and the Node machine, and since our server is online it cannot access our localhosts.
Is there a way for a team to work together on the same Node server without overwriting each other's changes?
Implement changes using some other option rather than FTP. For example:
You can use webdav-fs in authenticated or non-authenticated mode:
// Using authentication:
var wfs = require("webdav-fs")(
    "http://example.com/webdav/",
    "username",
    "password"
);
wfs.readdir("/Work", function(err, contents) {
    if (!err) {
        console.log("Dir contents:", contents);
    } else {
        console.log("Error:", err.message);
    }
});
putFileContents(remotePath, data [, options])
Put some data in a remote file at remotePath. data is a Buffer or a String. options has a property called format, which can be "binary" (default) or "text".
var fs = require("fs");
var imageData = fs.readFileSync("someImage.jpg");
// `client` is assumed to be a client object created with the webdav package (see References)
client
    .putFileContents("/folder/myImage.jpg", imageData, { format: "binary" })
    .catch(function(err) {
        console.error(err);
    });
And use callbacks to notify your team, or lock the files via the callback.
References
webdav-fs
webdav
lockfile
Choosing Secure Passwords
I am trying to develop a very simple image server with NodeJS & SocketIO. A project I am working on requires me to load several hundred images on page-load (customer requirement). Currently, an HTTP request is made for each image via the HTML "img" tag. Given the reduced latency and overall efficiency of websockets compared to HTTP or Ajax, I was hoping to improve performance by sending images over websockets instead.
Unfortunately, reading images from the server's file-system with NodeJS and sending them over websockets with SocketIO has been significantly slower than the traditional HTTP requests served over Apache. Below is my server code:
var express = require('express'),
    app = express(),
    http = require('http'),
    fs = require("fs"),
    mime = require('mime'),
    server = http.createServer(app),
    io = require('socket.io').listen(server);
server.listen(151);
io.sockets.on('connection',function(socket){
    socket.emit('connected');
    socket.on('getImageData',function(file,callback){
        var path = 'c:/restricted_dir/'+file;
        fs.readFile(path,function(err,data){
            if (!err){
                var prefix = "data:" + mime.lookup(path) + ";base64,";
                var base64Image = prefix+data.toString('base64');
                socket.emit('imageData',data,callback);
            }
        });
    });
});
I have also tried buffering with "createReadStream", but I saw no significant speed improvement with this. I should also note that it is desirable to receive the image data as a Base64-encoded data URI so I can simply put that into the "src" attribute of the "img" tag. I understand Base64 means roughly a 33% increase in the data's size, but even when using binary image data, it still takes about 10 times longer than HTTP.
EDIT:
I suppose the real question here is, "are websockets really the best way to serve static files?" After further thought and additional reading, I strongly suspect the issue here is related to parallel processing. Since NodeJS operates on a single thread, maybe it is not the best solution for serving all these static image files? Does anyone have any thoughts on this?
Browsers usually open multiple connections to the same server to perform requests in parallel, and can also perform multiple requests per single connection, whereas you only have one websocket connection.
Also, the combination of fs.readFile(), Base64 encoding, and socket.emit() introduces significant overhead, whereas a regular httpd can use system calls like sendfile() and doesn't even have to touch the file contents before they are sent to the client.
The single-threaded nature of Node isn't an issue here, because Node can do I/O (which is what you're doing, minus the Base64-encoding) really well.
So I would say that websockets aren't very suitable for static file serving :)
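To put a number on the size cost of the Base64 step alone: Base64 encodes every 3 input bytes as 4 output characters. A small check, where base64Overhead is a name made up for this sketch:

```javascript
// Ratio of Base64-encoded length to raw byte length for a buffer of the
// given size. Base64 maps each group of 3 bytes to 4 characters.
function base64Overhead(byteLength) {
  const encodedLength = Buffer.alloc(byteLength).toString('base64').length;
  return encodedLength / byteLength;
}

console.log(base64Overhead(3000)); // 4000 / 3000, roughly 1.33
```

That is on top of the CPU time spent encoding on the server and decoding in the browser.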
I'm testing streaming with a basic node.js app that streams a file to the response, using code from here and here.
But if I make a request to http://127.0.0.1:8000/, then open another browser and request another file, the second file will not start to download until the first one is finished. In my example I created a 1 GB file: dd if=/dev/zero of=file.dat bs=1G count=1
But if I request three more files while the first one is downloading, the three files will start downloading simultaneously once the first file has finished.
How can I change the code so that it will respond to each request as it's made and not have to wait for the current download to finish?
var http = require('http');
var fs = require('fs');
var i = 1;
http.createServer(function(req, res) {
    console.log('starting #' + i++);
    // This line opens the file as a readable stream
    var readStream = fs.createReadStream('file.dat', { bufferSize: 64 * 1024 });
    // This will wait until we know the readable stream is actually valid before piping
    readStream.on('open', function () {
        console.log('open');
        // This just pipes the read stream to the response object (which goes to the client)
        readStream.pipe(res);
    });
    // This catches any errors that happen while creating the readable stream (usually invalid names)
    readStream.on('error', function(err) {
        res.end(err);
    });
}).listen(8000);
console.log('Server running at http://127.0.0.1:8000/');
Your code seems fine the way it is.
I checked it with node v0.10.3 by making a few requests in multiple term sessions:
$ wget http://127.0.0.1:8000
Two requests ran concurrently.
I get the same result when using two different browsers (i.e. Chrome & Safari).
Further, I can get concurrent downloads in Chrome by just changing the request url slightly, as in:
http://localhost:8000/foo
and
http://localhost:8000/bar
The behavior you describe seems to manifest when making multiple requests from the same browser for the same url.
This may be a browser limitation - it looks like the second request isn't even made until the first is completed or cancelled.
To answer your question, if you need multiple concurrent client downloads in a browser:
Ensure that your server code is implemented such that the file-to-URL mapping is 1-to-many (i.e. using a wildcard).
Ensure that your client code (i.e. JavaScript in the browser) uses a different URL for each request.
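A minimal sketch of both points, assuming the base URL carries no query string yet. The function names and the download query parameter are arbitrary choices for illustration.

```javascript
const fs = require('fs');
const http = require('http');

// Server side: ignore the request path and query, so /foo, /bar and
// /file.dat?download=1 all stream the same file (1-to-many mapping).
function createWildcardServer(filePath) {
  return http.createServer(function (req, res) {
    const readStream = fs.createReadStream(filePath);
    readStream.on('error', function (err) {
      res.statusCode = 500;
      res.end(err.message);
    });
    readStream.pipe(res);
  });
}

// Client side: derive a distinct URL for each request by appending a counter.
let downloadCounter = 0;
function uniqueUrl(baseUrl) {
  downloadCounter += 1;
  return baseUrl + '?download=' + downloadCounter;
}
```

In the browser, each download link or request would then use uniqueUrl(...) so that no two requests share an identical URL.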
I'd like to add a live functionality to a PHP based forum - new posts would be automatically shown to users as soon as they are created.
What I find a bit confusing is the interaction between the PHP code and NodeJS+socket.io.
How would I go about informing the NodeJS server about new posts and have the server inform the clients that are watching the thread in which the post was posted?
Edit
Tried the following code, and it seems to work, my only question is whether this is considered a good solution, as it looks kind of messy to me.
I use socket.io to listen to clients on port 81, and the server running on port 82 is intended to be used only by the forum: when a new post is created, a PHP script sends a POST request to localhost on port 82, along with the data.
Is this ok?
var io = require('socket.io').listen(81);
io.sockets.on('connection', function(socket) {
    socket.on('init', function(threadid) {
        socket.join(threadid);
    });
});
var forumserver = require('http').createServer(function(req, res) {
    if (res.socket.remoteAddress == '127.0.0.1' && req.method == 'POST') {
        req.on('data', function(chunk) {
            data = JSON.parse(chunk.toString());
            io.sockets.in(data.threadid).emit('new-post', data.content);
        });
    }
    res.end();
}).listen(82);
Your solution of an HTTP server running on a special port is exactly the solution I ended up with when faced with a similar problem. The PHP app simply uses curl to POST to the Node server, which then pushes a message out to socket.io.
However, your HTTP server implementation is broken. The data event is a Stream event; Streams do not emit messages, they emit chunks of data. In other words, the request entity data may be split up and emitted in two chunks.
If the data event emitted a partial chunk of data, JSON.parse would almost assuredly throw an exception, and your Node server would crash.
You either need to manually buffer data, or (my recommendation) use a more robust framework for your HTTP server like Express:
var express = require('express'), forumserver = express();
// express.bodyParser() is the Express 3 API (removed from core in Express 4);
// it handles buffering and parsing of the request entity for you
forumserver.use(express.bodyParser());
forumserver.post('/post/:threadid', function(req, res) {
    io.sockets.in(req.params.threadid).emit('new-post', req.body.content);
    res.send(204); // HTTP 204 No Content (empty response)
});
forumserver.listen(82);
PHP simply needs to post to http://localhost:82/post/1234 with an entity body containing content. (JSON, URL-encoded, or multipart-encoded entities are acceptable.) Make sure your firewall blocks port 82 on your public interface.
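If you would rather buffer manually, the other option mentioned above, a sketch with only the built-in http module could look like this. readJsonBody and createForumServer are hypothetical names, and notify stands in for the io.sockets.in(threadid).emit('new-post', content) call from the question.

```javascript
const http = require('http');

// Collect every chunk of the request body and parse only after 'end',
// so a body split across several 'data' events is handled correctly.
function readJsonBody(req, callback) {
  const chunks = [];
  req.on('data', function (chunk) { chunks.push(chunk); });
  req.on('error', callback);
  req.on('end', function () {
    let parsed;
    try {
      parsed = JSON.parse(Buffer.concat(chunks).toString('utf8'));
    } catch (err) {
      return callback(err);         // malformed JSON must not crash the server
    }
    callback(null, parsed);
  });
}

// notify(threadid, content) is a placeholder for the socket.io emit.
function createForumServer(notify) {
  return http.createServer(function (req, res) {
    if (req.method !== 'POST') {
      res.writeHead(405);
      return res.end();
    }
    readJsonBody(req, function (err, data) {
      if (err || data === null || typeof data !== 'object') {
        res.writeHead(400);
        return res.end();
      }
      notify(data.threadid, data.content);
      res.writeHead(204);
      res.end();
    });
  });
}
```

The response is sent only after the body has been read, so the PHP side can tell a rejected post (400) from an accepted one (204).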
Regarding the PHP code / forum's interaction with Node.JS, you probably need to create an API endpoint of sorts that can listen for changes made to the forum. Depending on your forum software, you would want to hook into the process of creating a new post and perform the API callback to Node.js at this time.
Socket.io out of the box is geared towards visitors of the site being connected on the frontend via Javascript. Upon the Node server receiving notification of a new post update, it would then notify connected clients of this new post and its details, at which point it would probably add new HTML to the DOM of the page the visitor is viewing.
You may want to arrange the Socket.io side so that users subscribe only to specific events by joining a specific room, such as "subforum123", so that they receive notifications only for applicable posts.
So this is my setup
I have a client from which files are uploaded to the node.js server (serverA) and from there I want to stream the files to another server (serverB) without saving the file temporarily (on serverA).
What is the simplest and the best way to achieve this?
I am able to upload the file to serverA but I don't want the temporary file to be stored.
Update:
It's a simple ajax file upload to serverA. The idea is to transfer byte-wise so that even if the connection drops, you can resume from that particular byte.
I am using express.js on serverA and backbone.js is the client using which I do the ajax uploads. For now there's no connection between A and B as such, they communicate through endpoints. serverA is running on port 4000 and serverB on port 5000. I want to somehow pipe the file from serverA to an endpoint on serverB.
Since the incoming request (req) is a readable stream, you could use the request module to pipe the current request into the other endpoint inside your Express route:
app.post('/myroute', function (req, res) {
    var request = require('request');
    // serverB listens on port 5000; the host and path here are placeholders
    req.pipe(request.post('http://localhost:5000/my/path')).pipe(res);
});
Would that approach work?
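The same idea can be sketched without the request module, using only the built-in http module. createForwarder and the shape of the target object are assumptions made for this example, not part of Express or request.

```javascript
const http = require('http');

// Forward each incoming upload to another server without buffering it or
// writing a temporary file. target = { host, port, path } describes serverB.
function createForwarder(target) {
  return http.createServer(function (req, res) {
    const upstream = http.request({
      host: target.host,
      port: target.port,
      path: target.path,
      method: 'POST',
      headers: {
        'Content-Type': req.headers['content-type'] || 'application/octet-stream'
      }
    }, function (upstreamRes) {
      // Relay serverB's status and body back to the original client.
      res.writeHead(upstreamRes.statusCode);
      upstreamRes.pipe(res);
    });
    upstream.on('error', function () {
      if (!res.headersSent) { res.writeHead(502); }
      res.end();
    });
    req.pipe(upstream);             // chunks flow through as they arrive
  });
}
```

For the setup in the question this would be started with something like createForwarder({ host: 'localhost', port: 5000, path: '/my/path' }).listen(4000). It does not cover resuming a transfer from a given byte after a dropped connection.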