MongoDB GridFS - Is it filename or fileName - node.js

Please look at the following image, from http://mongoexplorer.com/:
I've been trying to work through GridFS, referencing https://github.com/jamescarr/nodejs-mongodb-streaming. The files I uploaded come back nicely, and the stream returned by the following get function looks right.
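// Assumes the legacy mongodb native driver, which exported ObjectID and GridStore.
var mongoose = require('mongoose');
var mongodb = require('mongodb');
var ObjectID = mongodb.ObjectID;
var GridStore = mongodb.GridStore;
var gridfs = (function () {
  function gridfs() { }
  // Open the GridFS file with the given id for reading and pass the store to fn.
  gridfs.get = function (id, fn) {
    var db, store;
    db = mongoose.connection.db;
    id = new ObjectID(id);
    store = new GridStore(db, id, "r", {
      root: "fs"
    });
    return store.open(function (err, store) {
      if (err) {
        return fn(err);
      }
      return fn(null, store);
    });
  };
  return gridfs;
})();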
Using http://mongoexplorer.com/ I uploaded files into GridFS to test with, but they seem broken when I use the node code above to retrieve them.
That is when I noticed the filename / fileName thing. Looking here /node_modules/mongodb/lib/mongodb/gridfs/gridstore.js I saw the reference to filename with a lowercase 'N', but in my GridFS, it's fileName with a capital 'N'.
OK, so just for kicks, I changed it to lowercase in GridFS, but I still get some corruption in the stream (node code above) when retrieving files uploaded with http://mongoexplorer.com/.
Clicking Save as... in http://mongoexplorer.com/, however, brings back my file perfectly.
To get back to my question (since my tests didn't seem to prove anything), I am wondering which it is: filename with a lowercase 'N', or fileName with a capital 'N'?

Please use the latest mongodb native driver, as it contains a ton of fixes for GridFS. There are also a ton of examples of using GridFS as streams in the driver's GitHub repository, under the tests.
Docs are at
http://mongodb.github.com/node-mongodb-native
In general, I would say that if you only use the core functionality, you should stick to the driver itself. The one you are using relies on a driver that is way out of date, which explains your corruption issues.

Another Windows tool, namely MongoVue, also looks for filename instead of fileName. I'd say the answer is more likely filename than fileName.
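For what it's worth, the GridFS specification names the field filename (all lowercase) in the files collection, which matches what gridstore.js and MongoVue look for. Below is a minimal sketch of how a document written with the wrong casing could be normalized before use. The helper normalizeFilenameField is hypothetical and not part of any driver, and it only addresses the field name, not the stream corruption described in the question.

```javascript
// Hypothetical helper (not part of the mongodb driver): returns a copy of a
// GridFS files-collection document in which a mis-cased `fileName` key has
// been replaced by the lowercase `filename` key defined by the GridFS spec.
function normalizeFilenameField(fileDoc) {
  const copy = Object.assign({}, fileDoc);
  // Only take over the mis-cased value if no proper `filename` exists yet.
  if (copy.filename === undefined && copy.fileName !== undefined) {
    copy.filename = copy.fileName;
  }
  delete copy.fileName;
  return copy;
}

// Example document shaped like one uploaded by a tool that writes `fileName`.
const uploadedByTool = { _id: 1, length: 42, chunkSize: 1024 * 256, fileName: 'report.pdf' };
console.log(normalizeFilenameField(uploadedByTool).filename);
```

The function copies the document instead of mutating it, so the original can still be inspected when comparing what each tool wrote.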
When retrieving a small Windows file from GridStore, I found a bug, but I don't know how to fix it properly. I guess there must be some value like Chunk.CurrentSize or similar, but after looking at the chunk.js file in the native Node.js MongoDB driver (https://github.com/mongodb/node-mongodb-native/blob/master/lib/mongodb/gridfs/chunk.js), I did the following.
I found this:
Chunk.prototype.readSlice = function(length) {
if ((this.length() - this.internalPosition + 1) >= length) {
var data = null;
if (this.data.buffer != null) { //Pure BSON
data = this.data.buffer.slice(this.internalPosition, this.internalPosition + length);
} else { //Native BSON
data = new Buffer(length);
length = this.data.readInto(data, this.internalPosition);
}
this.internalPosition = this.internalPosition + length;
return data;
} else {
return null;
}
};
and moved the following line
data = this.data.buffer.slice(this.internalPosition, this.internalPosition + length);
into this if statement (1024 * 256 is the value from Chunk.DEFAULT_CHUNK_SIZE = 1024 * 256;):
if (this.data.buffer != null) { //Pure BSON
if (this.data.buffer.length > 1024 * 256) {
// move to here
}
else
{
data = this.data.buffer;
}
like so:
Chunk.prototype.readSlice = function(length) {
if ((this.length() - this.internalPosition + 1) >= length) {
var data = null;
if (this.data.buffer != null) { //Pure BSON
if (this.data.buffer.length > 1024 * 256) {
data = this.data.buffer.slice(this.internalPosition, this.internalPosition + length);
}
else
{
data = this.data.buffer;
}
} else { //Native BSON
data = new Buffer(length);
length = this.data.readInto(data, this.internalPosition);
}
this.internalPosition = this.internalPosition + length;
return data;
} else {
return null;
}
};
The issue with Windows files smaller than the chunk size is solved, but this isn't the most elegant solution. I'd like to propose this as the answer, but I realize that hard-coding the default chunk size, rather than reading a dynamic value, keeps this a workaround ;-)

Using socketio-file-upload to upload multiple files

I'm using Node.js with socket.io and socketio-file-upload to upload multiple files, and it works great! However, I'm having trouble getting the name attribute of the input each file comes from, so that I can save it to my DB.
When I upload 1 or more files, I can't seem to access the input field name or something that shows me which of the files come from which input field.
Here is my front:
var uploader = new SocketIOFileUpload(socket);
var array_files_lvl_3 = [
document.getElementById("l3_id_front"),
document.getElementById("l3_id_back"),
document.getElementById("l3_address_proof_1"),
document.getElementById("l3_address_proof_2"),
document.getElementById("l3_passport")
];
uploader.listenOnArraySubmit(document.getElementById("save_level_3"), array_files_lvl_3);
And here is my back:
var uploader = new siofu();
uploader.dir = "uploads/userL3";
uploader.listen(socket);
uploader.on('saved', function(evnt){
console.log(evnt);
//this "event" variable has a lot of information
//but none of it tells me the input name where it came from.
});
This is what the "evnt" variable holds:
Unfortunately, the library doesn't send that information, so no existing configuration option will help. This needs a code modification.
client.js:374
var _fileSelectCallback = function (event) {
  var files = event.target.files || event.dataTransfer.files;
  event.preventDefault();
  // Pass the element that triggered the event along with the files.
  var source = event.target;
  _baseFileSelectCallback(files, source);
};
client.js:343
var _baseFileSelectCallback = function (files, source) {
  if (files.length === 0) return;
  // Ensure existence of meta property on each file
  for (var i = 0; i < files.length; i++) {
    if (source) {
      if (!files[i].meta) files[i].meta = {
        sourceElementId: source.id || "",
        sourceElementName: source.name || ""
      };
    } else {
      if (!files[i].meta) files[i].meta = {};
    }
  }
  // ... rest of the function unchanged
After these changes I am able to get the details in event.file.meta.
I'm the author of socketio-file-upload.
It looks like the specific input field is not currently being recorded, but this would not be a hard feature to add. Someone opened a new issue and left a backpointer to this SO question.
A workaround would be to directly use submitFiles instead of listenOnArraySubmit. Something like this might work (untested):
// add a manual listener on your submit button
document.getElementById("save_level_3").addEventListener("click", () => {
let index = 0;
for (let element of array_files_lvl_3) {
let files = element.files;
for (let file of files) {
file.meta = { index };
}
uploader.submitFiles(files);
index++;
}
});
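Building on the workaround above, here is a sketch that records the id and name of the originating input instead of a numeric index. The helper name submitFromInputs and the meta keys sourceElementId and sourceElementName are illustrative choices (the keys mirror the patch in the other answer), not something the library defines. Like the workaround itself, this has not been tested against a real uploader.

```javascript
// Hypothetical helper: tag every selected file with the id and name of the
// <input> element it came from, then hand the files to the uploader.
// `uploader` is assumed to expose submitFiles(files), as used in the workaround.
function submitFromInputs(uploader, inputElements) {
  for (const element of inputElements) {
    const files = Array.from(element.files || []);
    if (files.length === 0) continue; // nothing selected in this input
    for (const file of files) {
      // Keep any meta already set on the file and add the source details.
      file.meta = Object.assign({}, file.meta, {
        sourceElementId: element.id || "",
        sourceElementName: element.name || ""
      });
    }
    uploader.submitFiles(element.files);
  }
}

// In the browser, wired to the button from the question:
// document.getElementById("save_level_3").addEventListener("click", function () {
//   submitFromInputs(uploader, array_files_lvl_3);
// });
//
// On the server, the value should then be readable in the 'saved' handler:
// uploader.on('saved', function (event) {
//   console.log(event.file.meta.sourceElementName);
// });
```

Using the element's own id or name avoids depending on the order of the array, which would silently break if inputs were added or reordered.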

NodeJS - read and write file causes corruption

I'm kinda new to NodeJS and I'm working on a simple file encoder.
I planned to change the very first 20kb of a file and just copy the rest of it.
So I used the following code, but it changed some bytes in the rest of the file.
Here is my code:
var fs = require('fs');
var config = require('./config');
fs.open(config.encodeOutput, 'w', function(err, fw) {
if(err) {
console.log(err);
} else {
fs.readFile(config.source, function(err, data) {
var start = 0;
var buff = readChunk(data, start);
while(buff.length) {
if(start < config.encodeSize) {
var buffer = makeSomeChanges(buff);
writeChunk(fw, buffer);
} else {
writeChunk(fw, buff);
}
start += config.ENCODE_BUFFER_SIZE;
buff = readChunk(data, start);
}
});
}
});
function readChunk(buffer, start) {
return buffer.slice(start, start + config.ENCODE_BUFFER_SIZE);
}
function writeChunk(fd, chunk) {
fs.writeFile(fd, chunk, {encoding: 'binary', flag: 'a'});
}
I opened encoded file and compared it with the original file.
I even commented these parts:
//if(start < config.encodeSize) {
// var buffer = makeSomeChanges(buff);
// writeChunk(fw, buffer);
//} else {
writeChunk(fw, buff);
//}
So my program just copies the file, but it still changes some bytes.
What is wrong?
So I checked the pattern and I realized some bytes are not in the right place and I guessed that it should be because I'm using async write function.
I changed fs.writeFile() to fs.writeFileSync() and everything is working fine now.
Since you were using asynchronous I/O, you should have queued the operations and waited for each one to finish, because multiple writes in flight at the same time are likely to corrupt your file. This explains why synchronous I/O solves your issue: a further write cannot start before the previous one has completed.
However, using synchronous APIs when asynchronous ones are available is a poor choice, because your program is blocked while it writes to the file. You should stay asynchronous and queue the writes so that each one waits for the previous one.
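One way to get such a queue without hand-rolling it is to await each write before issuing the next. The sketch below assumes a Node.js version that provides fs.promises. The function copyInChunks and its parameters transform and transformLimit are hypothetical stand-ins for makeSomeChanges and config.encodeSize from the question.

```javascript
const fs = require('fs');

// Copy `source` to `dest` chunk by chunk. Each write is awaited, so the next
// write cannot start before the previous one has completed.
// `transform` is applied only to chunks that start before `transformLimit`.
async function copyInChunks(source, dest, chunkSize, transform, transformLimit) {
  const data = await fs.promises.readFile(source);
  const out = await fs.promises.open(dest, 'w');
  try {
    for (let start = 0; start < data.length; start += chunkSize) {
      const chunk = data.subarray(start, start + chunkSize);
      const toWrite = start < transformLimit ? transform(chunk) : chunk;
      await out.write(toWrite); // writes at the current file position
    }
  } finally {
    await out.close(); // always release the file descriptor
  }
}

// Usage (paths and sizes are placeholders):
// copyInChunks('input.bin', 'output.bin', 4096, makeSomeChanges, 20 * 1024)
//   .catch(console.error);
```

The event loop stays free between writes, so other work can proceed, yet the bytes still land in the file in order.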

Office task pane app: how to get the whole document in an OOXml string?

I'm developing an Office Task Pane app that needs to access the whole document. I know there is an API getFileAsync()
https://msdn.microsoft.com/en-us/library/office/jj220084.aspx
Office.context.document.getFileAsync(fileType [, options], callback);
However, the fileType can only be one of three values: compressed, pdf, or text.
compressed
Returns the entire document (.pptx or .docx) in Office Open XML (OOXML) format as a byte array.
pdf
Returns the entire document in PDF format as a byte array.
text
Returns only the text of the document as a string. (Word only)
When it is compressed, the returned value is a byte array.
How can I get an OOXml string?
Or is there an API to select all content in a document so that I can use the getSelectedDataAsync() API?
In case someone finds this thread, I managed to solve this using zip.js.
var dataByteArray = [];
function getDocumentAsOoxml() {
Office.context.document.getFileAsync("compressed", { sliceSize: 100000 }, function (result) {
if (result.status == Office.AsyncResultStatus.Succeeded) {
// Get the File object from the result.
var myFile = result.value;
var state = {
file: myFile,
counter: 0,
sliceCount: myFile.sliceCount
};
getSlice(state);
}
});
}
function getSlice(state) {
state.file.getSliceAsync(state.counter, function (result) {
if (result.status == Office.AsyncResultStatus.Succeeded) {
readSlice(result.value, state);
}
});
}
function readSlice(slice, state) {
var data = slice.data;
// If the slice contains data, append it to the collected byte array.
if (data) {
dataByteArray = dataByteArray.concat(data);
state.counter++;
if (state.counter < state.sliceCount) {
getSlice(state);
} else {
closeFile(state);
}
}
}
function closeFile(state) {
// Close the file when you're done with it.
state.file.closeAsync(function (result) { });
// convert from byte array to a blob that can be read by zip.js
var byteArray = new Uint8Array(dataByteArray);
var blob = new Blob([byteArray]);
// Load zip.js library
$.getScript("/Scripts/zip.js/zip.js", function () {
zip.workerScriptsPath = "/Scripts/zip.js/";
// use a BlobReader to read the zip from a Blob object
zip.createReader(new zip.BlobReader(blob), function (reader) {
// get all entries from the zip file
reader.getEntries(function (entries) {
if (entries.length > 0) {
for (var i = 0; i < entries.length; i++) {
var entry = entries[i];
// find the file you are looking for
if (entry.filename == 'word/document.xml') {
entry.getData(new zip.TextWriter(), function (text) {
// text contains the entry data as a String
doSomethingWithText(text);
// close the zip reader
reader.close(function () {
// onclose callback
});
}, function (current, total) {
// onprogress callback
});
break;
}
}
}
});
}, function (error) {
// onerror callback
});
});
}
Hopefully there will be an easier way in the future.
This is a little late.
I've been working with Task Pane apps lately and, as it turns out, OOXML is natively compressed (unless I'm grossly mistaken).
My best advice would be to figure out which encoding the string uses and decode it with that encoding. I'm willing to bet that it's UTF-8.

Node JS Sync check if file is being uploaded

I need to check if a file is still being uploaded to an FTP server. I have no control over the server side (I'd just love a file.temp rename), so my option (my best guess, given what I know of FTP) is to ask for the file size or last-modified time again after an interval. My problem is that I would need this to be sync.
function isStillUploading(ftp, filePath) {
var startFileSize = -1;
var nextFileSize = -2;
ftp.size(filePath,
function sizeReturned(error, size) {
if (!error) startFileSize = size;
else startFileSize = -3;
});
// CPU-melting style delay.
var start = Date.now();
while (Date.now() - start < 500) { }
ftp.size(filePath,
function sizeReturned(error, size) {
if (!error) nextFileSize = size;
else nextFileSize = -4;
});
// This would be better, but I have no way of having
// root f wait for this and return a correct boolean.
//setTimeout(ftp.size(filePath,
// function sizeReturned(error, size) {
// if (!error) nextFileSize = size;
// else nextFileSize = -4;
// }),
// 500);
// TODO: add checks for -1 -2 -3 -4
console.log("File size change: " + startFileSize + ":" + nextFileSize);
return (startFileSize != nextFileSize);
}
Writing this in a callback style would still imply a loop somewhere to re-check the file's size or (IMO) nasty callback nesting I really don't like. As far as code readability goes, I think just making it sync would be so much easier.
Is there a simple way of doing this or should I re-write with callbacks and events?
Thank you for your help.
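One thing worth noting about the code above: the busy-wait loop blocks the event loop, so neither ftp.size callback can run before the function returns, and the comparison always sees the initial -1 and -2. Below is a sketch of an alternative that keeps the code linear without callback nesting, assuming a Node.js version with async/await. It relies only on the ftp.size(path, callback) shape shown in the question. The helpers getSize, delay and waitForUpload are hypothetical names.

```javascript
// Wrap the callback-style ftp.size(path, cb) call from the question in a Promise.
function getSize(ftp, filePath) {
  return new Promise((resolve, reject) => {
    ftp.size(filePath, (error, size) => (error ? reject(error) : resolve(size)));
  });
}

// Non-blocking pause: the event loop keeps running while we wait.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Resolves to true if the file size changed during the interval.
async function isStillUploading(ftp, filePath, intervalMs = 500) {
  const before = await getSize(ftp, filePath);
  await delay(intervalMs);
  const after = await getSize(ftp, filePath);
  return before !== after;
}

// Resolves once two consecutive size checks return the same value.
async function waitForUpload(ftp, filePath, intervalMs = 500) {
  while (await isStillUploading(ftp, filePath, intervalMs)) {
    // size is still changing, so check again
  }
}
```

The caller has to be async as well (or use .then), but the body reads top to bottom like the synchronous version, and errors from ftp.size surface as rejected promises instead of sentinel values such as -3 and -4.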

Check if a document exists in mongoose (Node.js)

I have seen a number of ways of finding documents in mongoDB such that there is no performance hit, i.e. you don't really retrieve the document; instead you just retrieve a count of 1 or 0 if the document exists or not.
In mongoDB, one can probably do:
db.<collection>.find(...).limit(1).size()
In mongoose, you either have callbacks or not. But in both cases, you are retrieving the entries rather than checking the count. I simply want a way to check if a document exists in mongoose — I don't want the document per se.
EDIT: Now fiddling with the async API, I have the following code:
for (var i = 0; i < deamons.length; i++){
var deamon = deamons[i]; // get the deamon from the parsed XML source
deamon = createDeamonDocument(deamon); // create a PSDeamon type document
PSDeamon.count({deamonId: deamon.deamonId}, function(error, count){ // check if the document exists
if (!error){
if (count == 0){
console.log('saving ' + deamon.deamonName);
deamon.save() // save
}else{
console.log('found ' + deamon.leagueName);
}
}
})
}
You should read up on JavaScript scope. Anyway, try the following code:
for (var i = 0; i < deamons.length; i++) {
(function(d) {
var deamon = d
// create a PSDeamon type document
PSDeamon.count({
deamonId : deamon.deamonId
}, function(error, count) {// check if the document exists
if (!error) {
if (count == 0) {
console.log('saving ' + deamon.deamonName);
// get the deamon from the parsed XML source
deamon = createDeamonDocument(deamon);
deamon.save() // save
} else {
console.log('found ' + deamon.leagueName);
}
}
})
})(deamons[i]);
}
Note: since this involves database operations, I have not tested it.
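The scoping problem behind this answer can be reproduced without a database. In the question's loop, var deamon is a single function-scoped binding, so every count callback that runs after the loop has finished sees the last element. The sketch below uses a hypothetical runLoop helper and plain arrays to show the difference between var and a block-scoped binding, which achieves the same effect as the immediately invoked function above.

```javascript
// Returns the values that deferred callbacks observe for the loop variable.
function runLoop(items, useVar) {
  const callbacks = [];
  if (useVar) {
    for (var i = 0; i < items.length; i++) {
      var shared = items[i];        // one function-scoped binding, overwritten each pass
      callbacks.push(() => shared);
    }
  } else {
    for (let j = 0; j < items.length; j++) {
      const own = items[j];         // fresh block-scoped binding per iteration
      callbacks.push(() => own);
    }
  }
  // Invoke the callbacks only after the loop is done, as an async callback would be.
  return callbacks.map((cb) => cb());
}

console.log(runLoop(['a', 'b', 'c'], true));  // [ 'c', 'c', 'c' ]
console.log(runLoop(['a', 'b', 'c'], false)); // [ 'a', 'b', 'c' ]
```

On runtimes that support let and const, declaring the per-iteration variable with either of them is usually enough, and the wrapping function becomes unnecessary.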
I found it simpler this way.
let docExists = await Model.exists({key: value});
console.log(docExists);
Otherwise, if you use it inside a function, make sure the function is async.
let docHandler = async () => {
let docExists = await Model.exists({key: value});
console.log(docExists);
};
You can use count; it doesn't retrieve entries. It relies on MongoDB's count operation, which:
Counts the number of documents in a collection.
Returns a document that contains this count as well as the command status.
