Querying MongoDB GridFS file data - Node.js

Hi, I am a newbie to GridFS and am able to insert a file and view its entry in GridFS using the commands below:
mongofiles -d myfiles put hi.txt
db.fs.files.findOne({'filename':'hi.txt'});
I need to view the contents of the file (hi.txt). I tried getResources but it didn't seem to work. I'm stuck here; any help would be much appreciated.

As far as I know, in Node.js (you didn't specifically mention it in your question, but you did tag it), the GridStore object is what you use to manipulate files:
(quoted from the GridStore doc)
Opening a GridStore (a single file in GridFS) is a bit similar to opening a database. At first you need to create a GridStore object and then open it.
var gs = new mongodb.GridStore(db, filename, mode[, options])
Reading from GridStore can be done with read
gs.read([size], callback)
where
size is the length of the data to be read
callback is a callback function with two parameters - error object (if an error occurred) and data (binary string)
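For example, reading back the file from the question could look roughly like this (a minimal sketch, assuming a 1.x/2.x version of the node-mongodb-native driver where GridStore is still available, and a local myfiles database):

var mongodb = require('mongodb');

mongodb.MongoClient.connect('mongodb://localhost:27017/myfiles', function (err, db) {
  if (err) throw err;

  // Open the file stored earlier with `mongofiles put hi.txt` in read mode
  var gs = new mongodb.GridStore(db, 'hi.txt', 'r');
  gs.open(function (err, gridStore) {
    if (err) throw err;

    // Read the whole file; `data` is a binary string / Buffer
    gridStore.read(function (err, data) {
      if (err) throw err;
      console.log(data.toString());
      gridStore.close(function () {
        db.close();
      });
    });
  });
});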
Streaming from GridStore:
You can stream data as it comes from the database using stream
gs.stream([autoclose=false])
where
autoclose If true, current GridStore will be closed when EOF and ‘close’ event will be fired
The function returns read stream based on this GridStore file. It supports the events ‘read’, ‘error’, ‘close’ and ‘end’.
Also, at the quoted doc site, there are a lot of useful examples for storing files, etc.
Recommended reading:
A primer for GridFS using the Mongo DB driver

Related

PDFTron: how to get a PDF file from GridFS (MongoDB), add a watermark to it, and send it to the client?

I am using GridFS to store large files in MongoDB.
Now I am using PDFTron for PDF editing and want to watermark a PDF.
The problem is that I am not able to read the file from the GridFS stream in the PDFTron Node.js SDK.
I also want to send it back to the client without storing it locally or anywhere else.
I am doing something like this:
const bucket = new mongodb.GridFSBucket(db);
const stream = bucket.openDownloadStream(ObjectId(file_id))
const pdfdoc = await PDFNet.PDFDoc.createFromFilter(stream);
The error I am getting is:
TypeError: 1st input argument in function 'createFromFilter' is of type 'object'. Expected type 'Filter'. Function Signature: createFromFilter(PDFNet.Filter)
The PDFDoc.createFromFilter API is expecting a PDFNet Filter, not whatever GridFS is returning.
https://www.pdftron.com/api/pdfnet-node/PDFNet.PDFDoc.html#.createFromFilter__anchor
You can see this sample on creating a PDFDoc object from a Filter
https://www.pdftron.com/documentation/samples/node/js/PDFDocMemoryTest
Though the easiest approach is to write your GridFS stream to a buffer, and then pass that buffer to PDFDoc.createFromBuffer. https://www.pdftron.com/api/pdfnet-node/PDFNet.PDFDoc.html#.createFromBuffer__anchor
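For instance, something along these lines (a rough sketch reusing the bucket, file_id and PDFNet names from the question; error handling and the watermarking step are left out):

// Collect the GridFS download stream into a single Buffer first
const chunks = [];
bucket.openDownloadStream(ObjectId(file_id))
  .on('data', (chunk) => chunks.push(chunk))
  .on('error', (err) => { /* handle the stream error */ })
  .on('end', async () => {
    const buffer = Buffer.concat(chunks);

    // Now PDFNet gets bytes it understands
    const pdfdoc = await PDFNet.PDFDoc.createFromBuffer(buffer);
    // ...apply the watermark here, then serialize the document back to a
    // buffer and send it to the client (see the PDFTron samples for that)
  });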

Upload small files via BinData type in MongoDB

I want to upload small files (less than 16 MB) to MongoDB via the BinData type, which I understand is the default option for smaller files, whereas GridFS is ideally used for files exceeding 16 MB in size.
Unfortunately I couldn't easily find proper documentation and examples of uploading files without GridFS in the MongoDB docs. The information I found about the BinData type is either quite limited or I failed to understand it. Going through several similar questions here (mostly Python-based) and elsewhere, I got some idea of how BinData is used, but I'm still unable to upload files successfully this way.
I need more information about uploading files via BinData, and especially the right way to initialise it, as I usually get "BinData is not a function" or "BinData is not defined" errors. Here's my current code where I'm testing the functionality:
import { Meteor } from "meteor/meteor";
import { Mongo } from "meteor/mongo";

export const Attachment = new Mongo.Collection("attachment");

let BinData = Mongo.BinData; // wrong initialisation

function createAttachment(fileData) {
  const data = new Buffer(fileData, "base64");
  Attachment.insert({ file: new BinData(0, data) });
}
Some helpful links:
BSON Types in Mongo
BSON spec
There are several Meteor packages that you can use for file uploading.
I have used this one myself: https://atmospherejs.com/vsivsi/file-collection
It can store your files in GridFS, and provides URLs for retrieving images, etc.
Also:
https://atmospherejs.com/jalik/ufs
https://atmospherejs.com/ostrio/files
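If you do want to stay with plain BinData rather than a package, here is a minimal sketch using the plain Node.js MongoDB driver outside Meteor (the connection URL and database name are placeholders); the driver stores a Buffer, or an explicit Binary wrapper, as BSON BinData:

const { MongoClient, Binary } = require('mongodb');

async function createAttachment(fileData) {
  const client = await MongoClient.connect('mongodb://localhost:27017'); // placeholder URL
  const attachments = client.db('mydb').collection('attachment');        // placeholder db name

  // A Buffer is serialized to BSON Binary (BinData) automatically;
  // wrapping it in Binary just makes the intent explicit.
  const data = Buffer.from(fileData, 'base64');
  await attachments.insertOne({ file: new Binary(data) });

  await client.close();
}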

How to read an incomplete file and wait for new data in Node.js

I have a UDP client that grabs some data from another source and writes it to a file on the server. Since this is a large amount of data, I don't want the end user to wait until it is fully written to the server before they can download it. So I made a Node.js server that grabs the latest data from the file and sends it to the user.
Here is the code:
var stream = fs.readFileSync(filename)
  .on("data", function(data) {
    response.write(data)
  });
The problem here is: if the download starts when the file is only, for example, 10 MB, fs.readFileSync will only read my file up to 10 MB. Even if 2 minutes later the file has grown to 100 MB, fs.readFileSync will never know about the new data. How can I do this in Node? I would like to somehow refresh the fs state, or perhaps wait for new data using the fs file system. Or is there some kind of fs file-content watcher?
EDIT:
I think the code below describes better what I would like to achieve; however, in this code it keeps reading forever and I don't have any variable from fs.read that can help me stop it:
fs.open(filename, 'r', function(err, fd) {
  var bufferSize = 1000,
      chunkSize = 512,
      buffer = new Buffer(bufferSize),
      bytesRead = 0;

  while (true) { // check if file has new content inside
    fs.read(fd, buffer, 0, chunkSize, bytesRead);
    bytesRead += buffer.length;
  }
});
Node has a built-in method for this in the fs module. It is tagged as unstable, so it can change in the future.
It's called fs.watchFile(filename[, options], listener).
You can read more about it here: https://nodejs.org/api/fs.html#fs_fs_watchfile_filename_options_listener
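In its simplest form, usage looks roughly like this (a minimal sketch; the filename and interval are placeholders):

const fs = require('fs');

// Poll the file every 500 ms; the listener gets the current and previous
// fs.Stats, so a size change means new data was appended
fs.watchFile('data.log', { interval: 500 }, (curr, prev) => {
  if (curr.size !== prev.size) {
    console.log(`file grew from ${prev.size} to ${curr.size} bytes`);
  }
});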
But I highly suggest you use one of the good, actively maintained modules, like
watchr:
From its readme:
Better file system watching for Node.js. Provides a normalised API over the
file watching APIs of different node versions, nested/recursive file
and directory watching, and accurate detailed events for
file/directory changes, deletions and creations.
The module page is here: https://github.com/bevry/watchr
(I've used the module in a couple of projects and it works great; I'm not affiliated with it in any other way.)
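For the original problem (streaming a file that is still growing), a rough sketch with the built-in fs.watchFile could look like this (assumptions: the writer only appends to the file, response is an HTTP response stream, and deciding when the file is finished so you can fs.unwatchFile and response.end() is left out):

const fs = require('fs');

function streamGrowingFile(filename, response) {
  let offset = 0;       // bytes already sent to the client
  let reading = false;  // guard so reads never overlap

  function sendNewData() {
    if (reading) return;
    reading = true;
    fs.stat(filename, (err, stats) => {
      if (err || stats.size <= offset) { reading = false; return; }
      // Read only the bytes that were appended since the last send
      fs.createReadStream(filename, { start: offset, end: stats.size - 1 })
        .on('data', (chunk) => response.write(chunk))
        .on('end', () => { offset = stats.size; reading = false; });
    });
  }

  sendNewData(); // send whatever is already in the file
  fs.watchFile(filename, { interval: 500 }, sendNewData); // then follow its growth
}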
You need to store the last known size of the file somewhere, e.g. in a database.
Read the file size first.
Load your file.
Then make a script that checks whether the file has changed.
You can poll for the size with jquery.post, compare it with your stored result, and decide in JavaScript whether you need to reload.

Extra data in Skipper stream

I'm currently writing code to make a Skipper GridFS adapter. When upload is called, files are passed in the callback, and I would like to add an extra field to those files containing the metadata of the GridFS store and the ID of the GridFS store.
I'm looking through the Upstream code in Skipper and see something called stream.extra. I'm guessing it's for passing extra data; how would I go about using it?
Thanks for working on this! You can add extra metadata to your stream by putting it on the __newFile object in the receiver's _write method. For example, in the bundled s3Receiver, you can see this on line 61:
__newFile.extra.fsName = fsName;
which is adding the newly generated filename as metadata on the uploaded file object. In your controller's upload callback, you can retrieve the extra data from the returned file objects:
req.file('myFile').upload(function (err, files) {
  var newFileName = files[0].extra.fsName;
});
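Putting that together for a GridFS-backed receiver, the _write method could look roughly like this (a sketch only; the receiver setup and the extra.gridFsId name are my own illustration, not part of Skipper):

const { Writable } = require('stream');

function buildReceiver(bucket) { // bucket: a mongodb.GridFSBucket
  const receiver = new Writable({ objectMode: true });

  receiver._write = function (__newFile, encoding, done) {
    const uploadStream = bucket.openUploadStream(__newFile.filename);

    // Anything put on __newFile.extra shows up on the file objects
    // passed to the upload callback in your controller
    __newFile.extra = __newFile.extra || {};
    __newFile.extra.gridFsId = uploadStream.id;

    __newFile
      .pipe(uploadStream)
      .on('error', done)
      .on('finish', function () { done(); });
  };

  return receiver;
}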

Node.js: Processing a stream without running out of memory

I'm trying to read a giant logfile (250,000 lines), parse each line into a JSON object, and insert each JSON object into CouchDB for analytics.
I'm trying to do this by creating a buffered stream that will process each chunk separately, but I always run out of memory after about 300 lines. It seems like using buffered streams and util.pump should avoid this, but apparently not.
(Perhaps there are better tools for this than node.js and CouchDB, but I'm interested in learning how to do this kind of file processing in node.js and think it should be possible.)
CoffeeScript below, JavaScript here: https://gist.github.com/5a89d3590f0a9ca62a23
fs = require 'fs'
util = require('util')
BufferStream = require('bufferstream')

files = [
  "logfile1",
]

files.forEach (file)->
  stream = new BufferStream({encoding:'utf8', size:'flexible'})
  stream.split("\n")
  stream.on("split", (chunk, token)->
    line = chunk.toString()
    # parse line into JSON and insert in database
  )
  util.pump(fs.createReadStream(file, {encoding: 'utf8'}), stream)
Maybe this helps:
Memory leak when using streams in Node.js?
Try using pipe() to solve it.
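As a rough illustration of the pipe() idea (split2 is my substitution for bufferstream's split, and insertIntoCouch is a hypothetical helper): pipe() propagates backpressure, so the read stream pauses until each line's database insert has called back, instead of buffering the whole file in memory.

const fs = require('fs');
const split2 = require('split2');        // splits the stream into lines
const { Writable } = require('stream');

const sink = new Writable({
  objectMode: true,
  write(line, encoding, callback) {
    // Parse the line into JSON and insert it into CouchDB; calling
    // callback() only after the insert is what throttles the reader
    insertIntoCouch(JSON.parse(line), callback); // hypothetical helper
  },
});

fs.createReadStream('logfile1', { encoding: 'utf8' })
  .pipe(split2())
  .pipe(sink);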
