CKAN resource_create API - node.js

I am trying to read a list of files placed in a location using Node.js and create resources for those files in CKAN.
I am using CKAN API v3 (/api/3) to create the resources.
Node.js library used for iterating through files in a location: filehound
CKAN API used: /api/3/action/resource_create
Code to iterate over the files (please don't run it; it's only a sample snippet):
// file operations helper
const fileHound = require("filehound");
// filesystem
const fs = require("fs");
// to make http requests
const superagent = require("superagent");

var basePath = "/test-path", packageId = "8bf37d22-0c25-40c0-8faa-7de12ff927f5";

var files = fileHound.create()
    .paths(basePath)
    .find();

files.then(function (fileNames) {
    if (Array.isArray(fileNames)) {
        for (var fileName of fileNames) {
            superagent
                .post("http://test-ckan-domain.com/api/3/action/resource_create")
                .set("Authorization", "78d5d219-37de-41f7-9443-188bc564051e")
                .field("package_id", packageId)
                .field("name", fileName)
                .field("format", "CSV")
                .attach("upload", fs.createReadStream(fileName))
                .end(function (err, res) {
                    if (!err) console.log(fileName, "Status Code ", res.statusCode);
                    else console.log("Error ", err);
                });
        }
    }
});
After iterating through the files and uploading them to CKAN by creating a new resource for each, CKAN responds with "success": true and the "id" of the created resource, with HTTP status code 200.
However, only some of the resources are created and others are not. I am using CKAN's frontend to verify whether each resource was created under a package/dataset (since the response from the resource_create API is successful).
Is this a known issue with CKAN, or am I going wrong in my code?

You've probably solved this by now, but given that some resources are just missing while you only get 200 success responses, I'm wondering if your for loop just isn't working the way you expect. Be careful with the syntax: declaring fileName with var inside a for...of loop gives you one function-scoped variable shared by every iteration, so by the time the asynchronous .end() callbacks fire, fileName refers to the last file in the list and the logging can mislead you about which upload actually succeeded. Declaring it with let gives each iteration its own binding.
https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Statements/for...of
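A minimal sketch of the difference (placeholder file names, with setTimeout standing in for the asynchronous callbacks):

// With `var`, every callback sees the final value of fileName:
for (var fileName of ["a.csv", "b.csv"]) {
    setTimeout(function () { console.log("var:", fileName); }, 0); // prints "b.csv" twice
}
// With `let`, each iteration gets its own binding:
for (let fileName of ["a.csv", "b.csv"]) {
    setTimeout(function () { console.log("let:", fileName); }, 0); // prints "a.csv" then "b.csv"
}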

Related

Node.js - Why does my HTTP GET request return a 404 when I know the data is there at the URL I am using

I'm still new enough with Node that HTTP requests trip me up. I have checked all the answers to similar questions but none seem to address my issue.
I have been dealt the hand of having to go after JSON files in an API. I then parse those JSON files into rows that populate a SQL database. The API has one JSON file with an ID of 'keys.json' that looks like this:
{
    "keys": ["5sM5YLnnNMN_1540338527220.json", "5sM5YLnnNMN_1540389571029.json", "6tN6ZMooONO_1540389269289.json"]
}
Each array element in the keys property holds the name of one of the JSON data files in the API.
I am having problems getting either type of file returned to me, but I figure if I can learn what is wrong with the way I am trying to get 'keys.json', I can leverage that knowledge to get the individual JSON data files represented in the keys array.
I am using the npm modules 'request' and 'request-promise-native' as follows:
const request = require('request');
const rp = require('request-promise-native');
My URL is constructed from the following elements (I have used ... to keep my client anonymous, but other than that it is a direct copy):
let baseURL = 'http://localhost:3000/Users/doug5solas/sandbox/.../server/.quizzes/'; // this is the development value only
let keysID = 'keys.json';
Clearly the localhost aspect will have to go away when we deploy but I am just testing now.
Here is my HTTP call:
let options = {
    method: 'GET',
    uri: baseURL + keysID,
    headers: {
        'User-Agent': 'Request-Promise'
    },
    json: true // Automatically parses the JSON string in the response
};

rp(options)
    .then(function (res) {
        jsonKeysList = res.keys;
        console.log('Fetched', jsonKeysList);
    })
    .catch(function (err) {
        // API call failed
        let errMessage = err.options.uri + ' ' + err.statusCode + ' Not Found';
        console.log(errMessage);
        return errMessage;
    });
Here is my console output:
http://localhost:3000/Users/doug5solas/sandbox/.../server/.quizzes/keys.json 404 Not Found
It is clear to me that the .catch() clause is being taken and not the .then() clause, but I do not know why, because the data is there at that location. I know it is, because I placed it there manually.
Thanks to @Kevin B for the tip regarding serving of static files. I revamped the logic using express.static, served the file using that capability, and everything worked as expected.
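For reference, a minimal sketch of what serving the quiz files statically with Express can look like (the mount path, directory, and port here are assumptions for illustration, not the project's actual values):

const express = require('express');
const path = require('path');

const app = express();

// Serve everything under the quizzes directory as static files,
// so GET /quizzes/keys.json returns the JSON file directly.
app.use('/quizzes', express.static(path.join(__dirname, 'server', '.quizzes')));

app.listen(3000, () => console.log('Listening on port 3000'));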

How to download a multipart wav file from cloudant database and save locally using Node JS and REST API?

I am stuck retrieving a multipart attachment from Cloudant using the Node.js API, so I used the REST API to download the wav file from the Cloudant database instead. But it's not downloading the wav file from the https URL. When I enter the https URL directly in the browser, it prompts me to save the file locally, so the URL is correct.
Here is the code for the REST API call:
var request1 = require('request');
var filestream = fs.createWriteStream("input.wav");
var authenticationHeader = "Basic " + new Buffer(user + ":" + pass).toString("base64");
request1({ url: "example.com/data/1533979044129/female";, headers: { "Authorization": authenticationHeader } },
    function (error, httpResponse, body) {
        const statusCode = httpResponse.statusCode;
        httpResponse.pipe(filestream);
        httpResponse.on('end', function () {
            console.log("file complete");
            filestream.close();
        });
    });
The file size of input.wav is 0; it's not downloading the file. Please help.

Your callback has an error argument, which you are completely ignoring. Do something with that error, like print it out, so it can tell you what you're doing wrong. I definitely see at least one problem in your source, and the error from request should tell you what it is.
Edit: On second thought, the above code shouldn't even execute. You should share code that you have tested yourself; there are typos in there.
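As a rough sketch of what a working version could look like (keeping the request module from the question; the URL, user, and pass are placeholders, and the stream error handling is the main point):

var fs = require('fs');
var request = require('request');

var user = "USERNAME", pass = "PASSWORD"; // placeholders, assumed to be set as in the question
var authenticationHeader = "Basic " + Buffer.from(user + ":" + pass).toString("base64");
var filestream = fs.createWriteStream("input.wav");

request({ url: "https://example.com/data/1533979044129/female", headers: { "Authorization": authenticationHeader } })
    .on('error', function (err) {
        console.log("Request error:", err); // surface the error instead of ignoring it
    })
    .pipe(filestream)
    .on('finish', function () {
        console.log("file complete");
    });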

Microsoft Graph API unable to update Excel file

I have an Excel file (.xlsx). When it is already in my OneDrive, I can use a REST call like this to modify it:
/v1.0/me/drive/root:/SpreadSheetName.xlsx:/workbook/worksheets/content_stats/tables('RawStats')/Rows
When I modify my code to first upload the file to OneDrive (rather than using the file that is already there), and I use the REST API, I get the error:
Open navigation properties are not supported on OpenTypes. Property name: 'tables'.
I have searched the web for this message and cannot find anything related to what I am doing. The REST call for modifying the file which was just uploaded is nearly identical, although I reference the file by ID instead, as that is what is returned by the upload API. This is the URL I use to modify the file which was just uploaded:
/v1.0/me/drive/items:/<RealExcelSpreadsheetID>:/workbook/worksheets/content_stats/tables('RawStats')/Rows
Both are doing a POST. It is the exact same file; the only difference is that it is being uploaded first rather than already being in OneDrive. The file was definitely uploaded correctly, as when I go through the OneDrive web interface I can find it and view it online. This is a business account.
It was uploaded as MIME type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.
It uses these scopes delegated via OAuth2:
User.Read
User.ReadWrite
Files.Read
Files.ReadWrite
Files.ReadWrite.All
Sites.ReadWrite.All
Using Node.js and JavaScript, although that shouldn't matter.
Here is the code used to upload the file:
function copyTemplateInOneDrive(res, queryParam, officeAccessToken, callback) {
    var fs = require('fs');
    var excelExt = ".xlsx";
    var excelSpreadsheetFilenameStart = "stats";
    var uploadUrl = "https://graph.microsoft.com/v1.0/me/drive/root:/" +
        excelSpreadsheetFilename + dateNowFull() + excelExt + ":/content";
    var xlsxMimeType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    fs.readFile(excelSpreadsheetTemplateFilename, function read(error, fileContent) {
        var request = require('request');
        var options = {
            url: uploadUrl,
            headers: {
                'Accept': 'application/json',
                'Authorization': 'Bearer ' + officeAccessToken,
                'Content-Type': xlsxMimeType,
            },
            body: fileContent
        };
        request.put(options, function (error, response, body) {
            var result = JSON.parse(body);
            if ('id' in result) {
                console.log("Successfully uploaded template file to OneDrive");
                res.write("Successfully uploaded template file to OneDrive");
                var excelSpreadsheetID = result.id;
                excelSpreadsheetUrl = result.webUrl;
            } else {
                console.log("ERROR: unable to upload template file to OneDrive " + body);
                res.write("Error: unable to upload template file to OneDrive" + body);
                return;
            }
            callback(null, res, queryParam, officeAccessToken, excelSpreadsheetID);
        });
    });
}
It uses the async module from Node.js (hence the callback parameter). It also saves the returned ID and later passes it into the call to Microsoft Graph.
Your issue here is that the URL you're calling with the id uses the syntax that expects a file path (i.e. folder/filename.ext) rather than an id. This is why addressing the file by its path works for you while addressing it by its id does not.
There are two ways to address a file stored in OneDrive:
drive/items/{item-id}
/drive/root:/path/to/file (note the :)
You correctly switched your URI from drive/root to drive/items, but by leaving the : in place you are telling OneDrive to address the file by its path rather than its id. In other words, it's looking for a file named "{some-id}".
For addressing a file by its path, your URL is correct:
/drive/root:/{file-path}:/workbook/worksheets/content_stats/tables('RawStats')/Rows
For addressing a file by its id, however, you need to drop the colons:
/drive/items/{file-id}/workbook/worksheets/content_stats/tables('RawStats')/Rows
You can read about how files are addressed in the documentation for DriveItem.
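As an illustration only (not the asker's actual code), a minimal sketch of posting a row using the item-id form of the URL, reusing the request module and the excelSpreadsheetID and officeAccessToken values from the upload code; the row values are hypothetical:

var request = require('request');

var rowsUrl = "https://graph.microsoft.com/v1.0/me/drive/items/" + excelSpreadsheetID +
    "/workbook/worksheets/content_stats/tables('RawStats')/rows";

request.post({
    url: rowsUrl,
    headers: {
        'Authorization': 'Bearer ' + officeAccessToken
    },
    json: { values: [["2018-10-24", 42, "example"]] } // hypothetical row data
}, function (error, response, body) {
    if (error) return console.log("ERROR adding row: " + error);
    console.log("Row added, status " + response.statusCode);
});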

how to restrict making http calls from aws lambda

I am creating an application which takes Node.js code from the user, and I am creating a Lambda function on the fly using that code.
E.g. the code can be:
var http = require('http');
exports.handler = function (event, context) {
    console.log('start request to ' + event.url)
    http.get('http://##someapi', function (res) {
        console.log("Any Response : " + res.statusCode);
    }).on('error', function (e) {
        console.log("Error from API : " + e.message);
    });
    console.log('end request to ' + event.url)
    context.done(null);
}
But somehow I want to restrict HTTP/HTTPS calls made from that code, as I don't have control over what code will be passed by the user.
So is there any way to restrict that, like some sort of role, policy, or other configuration?
I am able to restrict DynamoDB access by specifying a policy on the role, so I have control over DB access but not over HTTP calls.
Simply prepend the user's code with the following:
(function () {
    function onlyAWS(module) {
        var isAWS = /amazonaws.com$/i
        var orig = module.request
        module.request = function restrictedRequest(opts, done) {
            if (typeof opts === 'string') opts = require('url').parse(opts)
            if (isAWS.test(opts.host || opts.hostname)) {
                return orig.call(module, opts, done)
            } else {
                throw new Error('No HTTP requests allowed')
            }
        }
    }
    onlyAWS(require('http'))
    onlyAWS(require('https'))
})()
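For example, a rough sketch of how the guard could be prepended to the user-supplied source before the Lambda is created (the file name and variable names here are hypothetical, not part of the answer above):

var fs = require('fs');

// Hypothetical: the guard IIFE above saved to a file, plus the code submitted by the user.
var guardSnippet = fs.readFileSync('only-aws-guard.js', 'utf8');
var userCode = 'exports.handler = function (event, context) { context.done(null); };';

// Prepend the guard so it runs before any of the user's code.
var finalSource = guardSnippet + '\n' + userCode;
// finalSource is what gets packaged and deployed as the Lambda handler file.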
One alternative would be putting these Lambdas in a VPC with restricted outbound access.
It sounds like a funny solution, but I found a simple fix for my problem. I am adding the code below along with the code entered by the user.
var require = function () {
    return "You are not allowed to do this operation";
}
Now if the user tries to include any third-party library like require('http'), it will not be allowed to instantiate the http lib in the Node code.
Using this solution I am able to block loading of all third-party libraries which I don't want the user to use in the AWS Lambda function.
I am still searching for a proper solution instead of using this hack in the code.

S3 file upload stream using node js

I am trying to find a solution to stream files to Amazon S3 using a Node.js server, with these requirements:
Don't store a temp file on the server or hold the complete file in memory (buffering up to some limit is fine, just not the whole file).
No restriction on uploaded file size.
Don't block the server until the file upload completes, because with a heavy file upload the waiting time of other requests would increase unexpectedly.
I don't want to upload directly from the browser because the S3 credentials would need to be shared in that case. Another reason to upload the file from the Node.js server is that some authentication may also need to be applied before uploading the file.
I tried to achieve this using node-multiparty, but it was not working as expected. You can see my solution and the issue at https://github.com/andrewrk/node-multiparty/issues/49. It works fine for small files but fails for a file of size 15 MB.
Any solution or alternative?
You can now use streaming with the official Amazon SDK for Node.js; see the section "Uploading a File to an Amazon S3 Bucket" or their example on GitHub.
What's even more awesome, you can finally do so without knowing the file size in advance. Simply pass the stream as the Body:
var AWS = require('aws-sdk');
var fs = require('fs');
var zlib = require('zlib');

var body = fs.createReadStream('bigfile').pipe(zlib.createGzip());
var s3obj = new AWS.S3({ params: { Bucket: 'myBucket', Key: 'myKey' } });
s3obj.upload({ Body: body })
    .on('httpUploadProgress', function (evt) { console.log(evt); })
    .send(function (err, data) { console.log(err, data); });
For your information, the v3 SDK was published with a dedicated module to handle that use case: https://www.npmjs.com/package/@aws-sdk/lib-storage
Took me a while to find it.
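A minimal sketch of what that looks like with @aws-sdk/lib-storage (the region, bucket, key, and file name are placeholders):

const { S3Client } = require("@aws-sdk/client-s3");
const { Upload } = require("@aws-sdk/lib-storage");
const fs = require("fs");

const upload = new Upload({
    client: new S3Client({ region: "us-east-1" }), // placeholder region
    params: {
        Bucket: "myBucket",                        // placeholder bucket
        Key: "myKey",
        Body: fs.createReadStream("bigfile"),      // a stream of unknown length is fine
    },
});

upload.on("httpUploadProgress", (progress) => console.log(progress));

upload.done()
    .then(() => console.log("upload complete"))
    .catch((err) => console.error(err));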
Give https://www.npmjs.org/package/streaming-s3 a try.
I used it for uploading several big files in parallel (>500 MB), and it worked very well.
It is very configurable and also allows you to track upload statistics.
You don't need to know the total size of the object, and nothing is written to disk.
If it helps anyone, I was able to stream from the client to S3 successfully (without memory or disk storage):
https://gist.github.com/mattlockyer/532291b6194f6d9ca40cb82564db9d2a
The server endpoint assumes req is a stream object; I sent a File object from the client, which modern browsers can send as binary data, and added the file info in the headers.
const fileUploadStream = (req, res) => {
    // get "body" args from header
    const { id, fn } = JSON.parse(req.get('body'));
    const Key = id + '/' + fn; // upload to s3 folder "id" with filename === fn
    const params = {
        Key,
        Bucket: bucketName, // set somewhere
        Body: req, // req is a stream
    };
    s3.upload(params, (err, data) => {
        if (err) {
            res.send('Error Uploading Data: ' + JSON.stringify(err) + '\n' + JSON.stringify(err.stack));
        } else {
            res.send(Key);
        }
    });
};
Yes, putting the file info in the headers breaks convention, but if you look at the gist it's much cleaner than anything else I found using streaming libraries, multer, busboy, etc.
+1 for pragmatism, and thanks to @SalehenRahman for his help.
I'm using the s3-upload-stream module in a working project here.
There are also some good examples from @raynos in his http-framework repository.
Alternatively you can look at https://github.com/minio/minio-js. It has a minimal set of abstracted APIs implementing the most commonly used S3 calls.
Here is an example of a streaming upload:
$ npm install minio
$ cat >> put-object.js << EOF
var Minio = require('minio')
var fs = require('fs')
// find out your s3 end point here:
// http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
var s3Client = new Minio({
    url: 'https://<your-s3-endpoint>',
    accessKey: 'YOUR-ACCESSKEYID',
    secretKey: 'YOUR-SECRETACCESSKEY'
})
var localFile = 'your_localfile.zip'
var fileStream = fs.createReadStream(localFile)
fs.stat(localFile, function (e, stat) {
    if (e) {
        return console.log(e)
    }
    s3Client.putObject('mybucket', 'hello/remote_file.zip', 'application/octet-stream', stat.size, fileStream, function (e) {
        return console.log(e) // should be null
    })
})
EOF
putObject() here is a fully managed single function call; for file sizes over 5 MB it automatically does a multipart upload internally. You can resume a failed upload as well, and it will start from where it left off by verifying the previously uploaded parts.
Additionally, this library is isomorphic and can be used in browsers as well.