Is there any way to load a CSV file into AWS OpenSearch? - node.js

Hi, does anyone know how to upload a CSV file to AWS OpenSearch directly using an API call (like the AWS Bulk API)? I want to do this with Node.js, and I don't want to use Kinesis or Logstash. The upload must also happen in chunks. I tried a lot but couldn't make it work.

OpenSearch provides a JavaScript client. You can use the Bulk API to upload documents in chunks.
Update 1:
Since you mentioned that you want to index a CSV file directly, you can use the elasticsearch-csv NPM package.
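A minimal sketch of the chunked Bulk API approach follows. It assumes a `client` object that exposes `bulk({ body })` the way the OpenSearch JavaScript client does. The helper names (`parseCsv`, `chunk`, `toBulkBody`, `uploadCsvInChunks`) are made up for illustration, and the CSV parsing is deliberately naive: it does not handle quoted fields or embedded commas.

```javascript
// Sketch only: all helper names are hypothetical, and the CSV parsing is naive.

// Turn CSV text into an array of plain objects keyed by the header row.
function parseCsv(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const headers = lines[0].split(",").map((h) => h.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    return Object.fromEntries(headers.map((h, i) => [h, (cells[i] ?? "").trim()]));
  });
}

// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// The Bulk API expects an action entry followed by the document itself.
function toBulkBody(index, docs) {
  return docs.flatMap((doc) => [{ index: { _index: index } }, doc]);
}

// Send the parsed rows one chunk at a time and return how many were sent.
// `client` is assumed to behave like the OpenSearch JavaScript client.
async function uploadCsvInChunks(client, index, csvText, chunkSize = 500) {
  let sent = 0;
  for (const docs of chunk(parseCsv(csvText), chunkSize)) {
    const response = await client.bulk({ body: toBulkBody(index, docs) });
    if (response.body && response.body.errors) {
      throw new Error(`Bulk request reported item errors after ${sent} documents`);
    }
    sent += docs.length;
  }
  return sent;
}
```

With the real client, `client` would typically come from `new Client({ node: ... })` in the `@opensearch-project/opensearch` package. Authentication for an AWS-managed domain is omitted here, and the chunk size of 500 is an arbitrary starting point rather than a recommended value.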

Related

How to upload MongoDB backup data directly to a GCP bucket using Node.js

I need to store MongoDB database backup data directly in a GCP bucket without storing the backup file locally. I have searched a lot but couldn't find a solution that fits. Is there any way to solve this?

NodeJS, how to handle image uploading with MongoDB?

I would like to know the best way to handle image uploading and saving the reference to the database. What I'm mostly interested in is the order in which you do the process.
Should you upload the images first in the front-end (say Cloudinary), and then call the API with result links to the images and save it to the database?
Or should you upload the images to the server first, and upload them from the back-end and save the reference afterwards?
OR, should you do the image uploading after you save the record in the database and then update it once the images were uploaded?
It really depends on the resources, timeline, and number of images you need to upload daily.
So basically, if you have very few images to upload, you can upload each image to your server and then upload it from there to whatever cloud storage (S3, Cloudinary, ...) you are using. This is very easy to implement (you can find code snippets on the internet), and you can securely keep your secret keys/credentials for the cloud platform on the server side.
But in my opinion, the best way of doing this is something like the following. I am taking user registration as an example.
Make a server call to get temporary credentials for uploading files to the cloud (generally, all providers offer this functionality, e.g. STS or signed URLs in AWS).
The user fills in the form and selects the image on the client side. When the user clicks the submit button, make one call to save the user in the database and start the upload with the credentials. If possible, keep a predictable path for the upload, such as /users/:userId for user uploads. This depends heavily on your use case.
When the upload finishes, make a server call to acknowledge it and store a flag in the database.
Now advantages of this approach are:
You are completely offloading your server from handling file operations which are pretty heavy and I/O blocking and you are distributing that load to all clients.
If you want to post process the files after upload you can easily integrate this with serverless platforms and do that on there and again offload that.
You can easily provide a retry mechanism to your users in case the file upload fails. They won't need to refill the data; they just upload the image/file again.
You don't need to expose the URL directly to the client for file upload, as you are using temporary credentials.
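The flow described in this answer could be sketched as below. Everything here is an assumption for illustration: `registerWithImage`, the `api` methods (`getUploadCredentials`, `createUser`, `acknowledgeUpload`), and `uploader.upload` are hypothetical placeholders for your own server endpoints and for whatever upload call your cloud provider offers.

```javascript
// Sketch of the client-side flow. All injected functions are hypothetical.
async function registerWithImage({ api, uploader }, form, imageFile) {
  // 1. Ask the server for temporary upload credentials (for example a signed URL).
  const credentials = await api.getUploadCredentials();

  // 2. Save the user record first, so the upload path can be derived from its id.
  const user = await api.createUser(form);
  const key = `users/${user.id}/${imageFile.name}`;

  // 3. Upload straight from the client to cloud storage.
  try {
    await uploader.upload(credentials, key, imageFile);
  } catch (err) {
    // The user already exists, so only the upload needs to be retried.
    return { user, key, uploaded: false, error: err.message };
  }

  // 4. Tell the server the upload finished so it can store a flag.
  await api.acknowledgeUpload(user.id, key);
  return { user, key, uploaded: true };
}
```

Returning `uploaded: false` instead of throwing is one possible design choice: it lets the caller offer a retry of the upload alone, without asking the user to fill in the form again.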
If the images are highly significant to your app, then ideally you should not complete the transaction until the image is saved. The approach should be to create an object in your code that you will eventually insert into MongoDB, start the upload of the image to the cloud, and then add the resulting link to this object. Finally, insert the object into MongoDB in one go. Do not make repeated calls. If anything fails before that point, raise an error and catch the exception.
There are several possible answers.
If you are working with big files larger than 16 MB (MongoDB's per-document size limit), please go with GridFS and multer
(that is, converting the images to a different format and saving them to MongoDB).
If your files are actually smaller than 16 MB, please try using this Converter, which changes a JPEG/PNG image into a format that can be saved to MongoDB. You can see it as an easy alternative to GridFS.
Please check this GitHub repo for more details.

Better/best approach to load huge CSV file into DynamoDB

I have a huge .csv file on my local machine. I want to load that data into DynamoDB (eu-west-1, Ireland). How would you do that?
My first approach was:
Iterate the CSV file locally
Send a row to AWS via a curl -X POST -d '<row>' .../connector/mydata
Process the previous call within a Lambda and write to DynamoDB
I do not like that solution because:
There are too many requests
If I send data without the CSV header information I have to hardcode the lambda
If I send data with the CSV header there is too much traffic
I was also considering putting the file in an S3 bucket and processing it with a Lambda, but the file is huge and the Lambda's memory and time limits scare me.
I am also considering doing the job on an EC2 machine, but I lose reactivity (if I turn off the machine while not used) or I lose money (if I do not turn off the machine).
I was told that Kinesis may be a solution, but I am not convinced.
Please tell me what would be the best approach to get the huge CSV file in DynamoDB if you were me. I want to minimise the workload for a "second" upload.
I prefer using Node.js or R. Python may be acceptable as a last solution.
If you want to do it the AWS way, then AWS Data Pipeline may be the best approach:
Here is a tutorial that does a bit more than you need, but should get you started:
The first part of this tutorial explains how to define an AWS Data
Pipeline pipeline to retrieve data from a tab-delimited file in Amazon
S3 to populate a DynamoDB table, use a Hive script to define the
necessary data transformation steps, and automatically create an
Amazon EMR cluster to perform the work.
http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-importexport-ddb-part1.html
If all your data is in S3, you can use AWS Data Pipeline's predefined template to 'import DynamoDB data from S3'. It should be straightforward to configure.
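For readers who would rather stay in Node.js, here is a hedged sketch of a different technique from the Data Pipeline answers above: stream the file line by line from the local machine, read the header once, and group rows into batches of 25, which is the per-request limit of DynamoDB's BatchWriteItem operation. The names `toBatchWriteParams`, `loadCsv`, and `linesOf` are hypothetical, `writeBatch` stands in for whatever SDK call you use, and the comma splitting is naive (no quoted fields).

```javascript
const fs = require("node:fs");
const readline = require("node:readline");

// BatchWriteItem accepts at most 25 put/delete requests per call.
const MAX_BATCH = 25;

// Build a RequestItems payload for one batch of plain-object rows.
// Plain objects as items assume a document-style client; the low-level
// client expects typed attribute values instead.
function toBatchWriteParams(tableName, rows) {
  return {
    RequestItems: {
      [tableName]: rows.map((row) => ({ PutRequest: { Item: row } })),
    },
  };
}

// Consume lines, treat the first non-empty line as the header, and pass
// each full batch to `writeBatch`. Returns the number of rows written.
async function loadCsv(lines, tableName, writeBatch) {
  let headers = null;
  let batch = [];
  let written = 0;
  for await (const line of lines) {
    if (line.trim() === "") continue;
    const cells = line.split(",").map((c) => c.trim()); // naive CSV split
    if (headers === null) {
      headers = cells;
      continue;
    }
    batch.push(Object.fromEntries(headers.map((h, i) => [h, cells[i] ?? ""])));
    if (batch.length === MAX_BATCH) {
      await writeBatch(toBatchWriteParams(tableName, batch));
      written += batch.length;
      batch = [];
    }
  }
  if (batch.length > 0) {
    await writeBatch(toBatchWriteParams(tableName, batch));
    written += batch.length;
  }
  return written;
}

// Line-by-line reader for a local file, so the whole file never sits in memory.
function linesOf(filePath) {
  return readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity,
  });
}
```

This sends the header only once and cuts the request count by a factor of 25 compared with one request per row. A real `writeBatch` would also need to retry any `UnprocessedItems` returned by BatchWriteItem, which this sketch leaves out.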

CSV/Text file upload using node js

I need to create a function to upload a CSV/txt file into MongoDB using the MEAN stack.
The function should work like this: I upload a file, it first checks whether the file is in text/CSV format, and then it uploads the file to MongoDB.
I searched on the internet and couldn't find any good material. If anyone has any ideas, please share.
I used angular-file-upload and wrote my own handler using multiparty for express.
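The format check the question asks for might look like the sketch below. It assumes the handler receives file objects shaped like the entries multiparty produces when parsing a form (`originalFilename` plus a `headers` map with a `content-type`). `isCsvOrText` and the two allow-lists are hypothetical names chosen for this example.

```javascript
const path = require("node:path");

// Hypothetical allow-lists for this example.
const ALLOWED_EXTENSIONS = new Set([".csv", ".txt"]);
const ALLOWED_TYPES = new Set(["text/csv", "text/plain"]);

// `file` is assumed to look like a multiparty file entry:
// { originalFilename, headers: { "content-type": ... }, path, size }.
function isCsvOrText(file) {
  const ext = path.extname(file.originalFilename || "").toLowerCase();
  const rawType = (file.headers && file.headers["content-type"]) || "";
  // Drop parameters such as "; charset=utf-8" before comparing.
  const type = rawType.split(";")[0].trim().toLowerCase();
  return ALLOWED_EXTENSIONS.has(ext) && ALLOWED_TYPES.has(type);
}
```

Both values come from the client, so this is a convenience check rather than a security guarantee. Some browsers may also report a CSV file with a different type, such as `application/vnd.ms-excel`, so the allow-list may need extending for your users.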

Saving Images to S3 from External URL with Node.js and MongoDB

I'm trying to save the images from a third-party API to my own S3 bucket using Node.js and MongoDB. The API provides a URL to the image on the third-party servers. I've never done this before but I'm assuming I have to download the image to my own server and then upload it to S3?
Should I save the image to MongoDB with GridFS and then delete it once it is on S3? If so, what's the best way to do that?
I've read and re-read this previous question:
Problem with MongoDB GridFS Saving Files with Node.JS
But I can't find good documentation on how I should determine buffer/chunk size and other attributes for a JPEG image file.
Once I've saved the file on my server/database, then it seems like I should use:
https://github.com/appsattic/node-awssum
To upload it to S3. Is that a good idea?
I apologize if this has an obvious answer; I'm pretty green when it comes to databases and server scripting.
The easiest thing to do would be to save the image onto disk and then stream the upload from there using AwsSum's S3 PutObject operation. Alternatively, if you have the file contents in a Buffer you can just use that.
Have a look at the following two examples and they should help you figure out what to do:
https://github.com/appsattic/node-awssum/blob/master/examples/amazon/s3/put-bucket.js
https://github.com/appsattic/node-awssum/blob/master/examples/amazon/s3/put-object-streaming.js
Let me know if you need any more help. :)
Cheers,
Andy
Disclaimer: I'm the author of AwsSum.
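The "save to disk, then stream the upload" idea can be sketched with Node's built-in modules alone. This is not AwsSum's API: `saveThenUpload` is a hypothetical helper, `source` is any readable stream (for example an HTTP response body), and `uploadStream` is a placeholder for whichever S3 upload call you use. The file size is passed along on the assumption that a streamed upload needs the content length up front.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");
const { pipeline } = require("node:stream/promises");

// Write `source` to a temporary file, then hand a read stream of that file
// (plus its size in bytes) to `uploadStream`. The temp directory is removed
// afterwards whether or not the upload succeeds.
async function saveThenUpload(source, fileName, uploadStream) {
  const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), "img-"));
  const tmpPath = path.join(dir, path.basename(fileName));
  try {
    // pipeline() rejects if either side of the copy fails.
    await pipeline(source, fs.createWriteStream(tmpPath));
    const { size } = await fs.promises.stat(tmpPath);
    return await uploadStream(fs.createReadStream(tmpPath), size);
  } finally {
    await fs.promises.rm(dir, { recursive: true, force: true });
  }
}
```

`uploadStream` must finish reading the stream before its promise resolves, because the temporary file is deleted as soon as it returns.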
