Unify data from various requests in Node.js

On my current project in Node.js (where I'm a beginner), there is a source of data that can be queried (a web API) but is restricted to a certain number of entries from a given position. It is not possible to request the full record set (e.g. 1000 entries); you can only select a maximum of 50 entries from an offset.
The function to get the data is as follows:
let myResponseData = await client.getData({userId: myId, first: 50, after: cursor});
where userId is an authentication parameter, first is the number of records to request (0-50), and after is the pagination cursor.
The response looks like this:
{
  "totalRecords": 1000,
  "page_info": {
    "has_next_page": true,
    "end_cursor": "this is some random string"
  },
  "data": [
    {
      "id": 1,
      "name": "Name1"
    },
    {
      "id": 2,
      "name": "Name2"
    }
  ]
}
My current progress in fetching all the data is that I've already constructed a function that reads all pages from the data source, like this:
let myResponseData = await client.getData({ userId: myId });
let hasNextPage = myResponseData.page_info.has_next_page;
let cursor = myResponseData.page_info.end_cursor;
while (hasNextPage) {
  console.log('next...' + cursor);
  myResponseData = await client.getData({ userId: myId, first: 50, after: cursor });
  hasNextPage = myResponseData.page_info.has_next_page;
  cursor = myResponseData.page_info.end_cursor;
  myFunctionToJoinData(myResponseData); // Here I need help
}
This works so far, since the console logs 'next...' followed by the cursor string.
My goal is to end up with a JSON data object shaped like the response above, but containing the entries from all queries.
How can this be achieved?
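One way to achieve it is to keep a single accumulator object and append each page's data array to it. A minimal sketch, assuming the response shape shown above and the same client.getData call (getAllData is a hypothetical wrapper name):

// Sketch: accumulate all pages into one object shaped like the API response.
// Assumes the response fields shown above: totalRecords, page_info, data.
async function getAllData(client, myId) {
  let response = await client.getData({ userId: myId, first: 50 });
  const allData = {
    totalRecords: response.totalRecords,
    data: [...response.data] // include the first page's entries
  };
  while (response.page_info.has_next_page) {
    response = await client.getData({
      userId: myId,
      first: 50,
      after: response.page_info.end_cursor
    });
    allData.data.push(...response.data); // append this page's entries
  }
  return allData;
}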

Related

How to pull a range of objects from an array of objects and re-insert them at a new position in the array?

Desired Behaviour
Pull a range of objects from an array of objects and push them back to the array at a new index.
For example, pull objects from the array where their index is between 0 and 2, and push them back to the array at position 6.
For reference, in jQuery, the desired behaviour can be achieved with:
if (before_or_after === "before") {
  $("li").eq(new_position).before($("li").slice(range_start, range_end + 1));
} else if (before_or_after === "after") {
  $("li").eq(new_position).after($("li").slice(range_start, range_end + 1));
}
Schema
{
  "_id": ObjectId("*********"),
  "title": "title text",
  "description": "description text",
  "statements": [
    {
      "text": "string",
      "id": "********"
    },
    {
      "text": "string",
      "id": "********"
    },
    {
      "text": "string",
      "id": "********"
    },
    {
      "text": "string",
      "id": "********"
    },
    {
      "text": "string",
      "id": "********"
    }
  ]
}
What I've Tried
I am able to reposition a single object in an array of objects with the code below.
It uses pull to remove the object from the array and push to add it back to the array at a new position.
In order to do the same for a range of objects, I think I just need to modify the $pull and $push variables, but:
- I can't figure out how to use $slice in this context, either as a projection or an aggregation, in a $pull operation
- Because I can't figure out the first bit, I don't know how to attempt the second bit, the $push operation
// define the topic_id to search for
var topic_id = request_body.topic_id;
// make it usable as a search query
var o_id = new ObjectID(topic_id);
// define the statement_id to search for
var statement_id = request_body.statement_id;
// define new position
var new_position = Number(request_body.new_position);
// define old position
var old_position = Number(request_body.old_position);
// define before or after (this will be relevant later)
// var before_or_after = request_body.before_or_after;
// define the filter
var filter = { _id: o_id };
// define the pull update - to remove the object from the array of objects
var pull_update = {
  $pull: {
    statements: { id: statement_id } // <----- how do I pull a range of objects here
  }
};
// define the projection so that only the 'statements' array is returned
var options = { projection: { statements: 1 } };
try {
  // perform the pull update
  var topic = await collection.findOneAndUpdate(filter, pull_update, options);
  // get the returned statement object so that it can be inserted at the desired index
  var returned_statement = topic.value.statements[old_position];
  // define the push update - to add the object back to the array at the desired position
  var push_update = {
    $push: {
      statements: {
        $each: [returned_statement],
        $position: new_position
      }
    } // <----- how do I push the range of objects back into the array here
  };
  // perform the push update
  var topic = await collection.findOneAndUpdate(filter, push_update);
} catch (err) {
  console.error(err);
}
Environments
local:
$ mongod --version
db version v4.0.3
$ npm view mongodb version
3.5.9
$ node -v
v10.16.3
$ systeminfo
OS Name: Microsoft Windows 10 Home
OS Version: 10.0.18363 N/A Build 18363
production:
$ mongod --version
db version v3.6.3
$ npm view mongodb version
3.5.9
$ node -v
v8.11.4
RedHat OpenShift Online, Linux
Edit
Gradually figuring out parts of the problem, I think:
Using the example here, the following returns objects from the array with indexes 0-2 (i.e. 3 objects):
db.topics.aggregate([
  { "$match": { "_id": ObjectId("********") } },
  { "$project": { "statements": { "$slice": ["$statements", 0, 3] }, "_id": 0 } }
])
Not sure how to use that in a pull yet...
I also looked into using $in (even though I would rather just grab a range of objects than have to specify each object's id), but realised it does not preserve the order of the array values provided in the results returned:
Does MongoDB's $in clause guarantee order
Here is one solution to re-ordering results from $in in Node:
https://stackoverflow.com/a/34751295
Here is an example with mongo 3.5:
const mongo = require('mongodb')

;(async function () {
  const client = await mongo.connect('mongodb://localhost:27017')
  const coll = client.db('test').collection('test')
  const from0to99 = Array(100).fill('0').map((_, i) => String(i))
  const from5To28 = Array(24).fill('0').map((_, i) => String(i + 5))
  const insert = { statements: from0to99.map(_ => ({ id: _ })) }
  await coll.insertOne(insert)
  const all100ElementsRead = await coll.findOneAndUpdate(
    { _id: insert._id },
    {
      $pull: {
        statements: {
          id: { $in: from5To28 }
        }
      }
    },
    { returnOriginal: true }
  )
  /**
   * It shows the object with the desired _id BEFORE doing the $pull.
   * You can process all the old elements as you wish.
   */
  console.log(all100ElementsRead.value.statements)
  // I use the object read from the database to push back;
  // since I know the $in condition, I must filter the returned array
  const pushBack = all100ElementsRead.value.statements.filter(_ => from5To28.includes(_.id))
  // push back the 5-28 range at position 72
  const pushed = await coll.findOneAndUpdate(
    { _id: insert._id },
    {
      $push: {
        statements: {
          $each: pushBack,
          $position: 72 // 0-indexed
        }
      }
    },
    { returnOriginal: false }
  )
  console.log(pushed.value.statements) // show all the 100 elements
  client.close()
})()
This old issue helped
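(Side note: this example targets the 3.x Node driver listed in the environments above; if you are on driver 4.x, the returnOriginal option was replaced by returnDocument: 'before' / 'after'.)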
If you want the "desired behaviour" when mutating arrays, add these to your checklist:
- array.length must be at least 7 if you want to add/splice at index 6
- concat creates a new array
- array.push, splice, or a[a.length] = 'apple' mutate the original
Use slice() to select elements between index1 and index2, or run a native for loop to select a few elements of the array, or apply an array.filter() function.
Once you have selected the elements that need to be manipulated, you mentioned you want to add them to the end, so the method for that is below (see also the combined sketch after the splice example).
About adding elements at the end:
CONCAT EXAMPLE
const original = ['🦊']; // const does not mean it's immutable, just that it can't be reassigned
let newArray;
newArray = original.concat('🦄');
newArray = [...original, '🦄'];
// Result
newArray; // ['🦊', '🦄']
original; // ['🦊']
SPLICE EXAMPLE:
const zoo = ['🦊', '🐮'];
zoo.splice(
  zoo.length, // we want to add at the END of our array
  0,          // we do NOT want to remove any item
  '🐧', '🐦', '🐤' // these are the items we want to add
);
console.log(zoo); // ['🦊', '🐮', '🐧', '🐦', '🐤']
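Putting slice/splice together for the originally desired behaviour (pull the objects at indexes 0-2 and re-insert them at position 6), here is a small sketch; moveRange is a hypothetical helper, and note that the target index refers to the array after the range has been removed:

function moveRange(arr, rangeStart, rangeEnd, newPosition) {
  // splice mutates the original array and returns the removed items
  const moved = arr.splice(rangeStart, rangeEnd - rangeStart + 1);
  // re-insert at the new position (indexes have shifted after the removal)
  arr.splice(newPosition, 0, ...moved);
  return arr;
}

const letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
moveRange(letters, 0, 2, 6);
console.log(letters); // ['d', 'e', 'f', 'g', 'h', 'i', 'a', 'b', 'c']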

How to query batch by batch from ElasticSearch in nodejs

I'm trying to get data from ElasticSearch with my node application. My index holds 1 million records, so the whole record set cannot be sent to another service at once. That's why I want to get 10,000 records per request, as in this example:
const getCodesFromElasticSearch = async (batch) => {
  // batches are 1-based, so batch N starts at (N - 1) * size
  const startingCount = (batch - 1) * 1000;
  return esClient.search({
    index: `myIndex`,
    type: 'codes',
    _source: ['column1', 'column2', 'column3'],
    body: {
      from: startingCount,
      size: 1000,
      query: {
        bool: {
          must: [
            ....
          ],
          filter: {
            ....
          }
        }
      },
      sort: {
        sequence: {
          order: "asc"
        }
      }
    }
  }).then(data => data.hits.hits.map(esObject => esObject._source));
}
It works when batch = 1, but when it goes to batch = 2 and beyond I hit the problem that from must not be larger than 10,000, as per the documentation. I don't want to change the index's max_result_window limit either. Please let me know an alternative way to get the records 10,000 at a time.
The scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database.
So you can use the scroll API to get your whole 1M dataset, something like the code below, without using from. A normal Elasticsearch search is limited to 10k records per request, so when you try to use from with a greater value it returns an error; that's why scrolling is a good solution for this kind of scenario.
let allRecords = [];

// first we do a search, and specify a scroll timeout
var { _scroll_id, hits } = await esclient.search({
  index: 'myIndex',
  type: 'codes',
  scroll: '30s',
  body: {
    query: {
      "match_all": {}
    },
    _source: ["column1", "column2", "column3"]
  }
})

while (hits && hits.hits.length) {
  // append all new hits
  allRecords.push(...hits.hits)
  console.log(`${allRecords.length} of ${hits.total}`)
  var { _scroll_id, hits } = await esclient.scroll({
    scrollId: _scroll_id,
    scroll: '30s'
  })
}

console.log(`Complete: ${allRecords.length} records retrieved`)
You can also add your own query and sort to these existing code snippets.
As per the comment:
Step 1. Do a normal esclient.search and get the hits and _scroll_id. Here you need to send the hits data to your other service and keep the _scroll_id for fetching future batches of data.
Step 2. Use the _scroll_id from the first batch in a while loop with esclient.scroll until you get all of your 1M records. Keep in mind that you don't need to wait for the full 1M records; within the while loop, as soon as you get a response back, send it to your service batch by batch.
See Scroll API: https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/scroll_examples.html
See Search After: https://www.elastic.co/guide/en/elasticsearch/reference/5.2/search-request-search-after.html
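For completeness, the linked search_after approach avoids keeping a scroll context open on the cluster. A rough sketch under the same assumptions as the snippet above (the same esclient and index, plus a sortable sequence field); adapt to your client version:

let results = [];
let searchAfter; // the sort values of the last hit on the previous page

do {
  const body = {
    size: 1000,
    query: { match_all: {} },
    sort: [{ sequence: 'asc' }] // a deterministic sort is required for search_after
  };
  if (searchAfter) {
    body.search_after = searchAfter; // resume right after the previous page
  }
  var { hits } = await esclient.search({ index: 'myIndex', body });
  results.push(...hits.hits.map(h => h._source));
  // each hit carries its sort values; the last one is the cursor for the next page
  searchAfter = hits.hits.length ? hits.hits[hits.hits.length - 1].sort : null;
} while (searchAfter);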

NodeJS Iterate through City in JSON, Return Cities and Users in each City

I have the below snippet from a JSON Object that has 3,500 records in it.
[
  {
    "use:firstName": "Bob",
    "use:lastName": "Smith",
    "use:categoryId": 36,
    "use:company": "BobSmith",
    "use:webExId": "Bob.Smith#email.com",
    "use:address": {
      "com:addressType": "PERSONAL",
      "com:city": "US-TX",
      "com:country": 1
    }
  },
  {
    "use:firstName": "Jane",
    "use:lastName": "Doe",
    "use:categoryId": 36,
    "use:webExId": "Jane.Doe#email.com",
    "use:address": {
      "com:addressType": "PERSONAL",
      "com:city": "US-CA",
      "com:country": "1_1"
    }
  },
  {
    "use:firstName": "Sam",
    "use:lastName": "Sneed",
    "use:categoryId": 36,
    "use:webExId": "Sam.Sneed#email.com",
    "use:address": {
      "com:addressType": "PERSONAL",
      "com:city": "US-CA",
      "com:country": "1_1"
    }
  }
]
I am using NodeJS and I have been stuck on figuring out the best way to:
1. Iterate through ['use:address']['com:city'] to map out and identify all of the cities (in the example above, I have two, US-TX and US-CA, in the three records provided)
2. Then identify how many records match each city (in the example above, I would have US-TX: 1 and US-CA: 2)
The only code I have is the easy part: a forEach loop through the JSON data, defining a userCity variable (to make it easier for me) and then logging the results to the console (which is really unnecessary, but I did it to confirm I was looping through the JSON properly).
const fs = require('fs');

function test() {
  const webexSiteUserListJson = fs.readFileSync('./src/db/webexSiteUserDetail.json');
  const webexSiteUsers = JSON.parse(webexSiteUserListJson);
  webexSiteUsers.forEach((userDetails) => {
    let userCity = userDetails['use:address']['com:city'];
    console.log(userCity);
  });
};
I've been searching endlessly for help on the topic and am probably not formulating my question properly. Any suggestions are appreciated on how to:
1. Iterate through ['use:address']['com:city'] to map out and identify all of the cities.
2. Then identify how many records match each city (in the example above, I would have US-TX: 1 and US-CA: 2).
Thank you!
You could reduce the webexSiteUsers array into an object that is keyed by city, where each value is the number of times the city occurs. Something like the below should work.
const counts = webexSiteUsers.reduce((countMemo, userDetails) => {
  let userCity = userDetails['use:address']['com:city'];
  if (countMemo[userCity]) {
    countMemo[userCity] = countMemo[userCity] + 1;
  } else {
    countMemo[userCity] = 1;
  }
  return countMemo;
}, {});
counts will then be an object that looks like this.
{
  "US-TX": 1,
  "US-CA": 2
}
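If you also need the list of distinct cities (the first part of the question), the keys of counts give you exactly that:

// the distinct cities are the keys of the counts object
const cities = Object.keys(counts); // ['US-TX', 'US-CA']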

How to get one page of data and the total count from the database with knex.js?

I have a users table with some records (say 100). How can I get one page of data plus the total count from it when there are some where conditions?
I tried the following:
var model = knex.table('users').where('status', 1).where('online', 1)
var totalCount = await model.count();
var data = await model.offset(0).limit(10).select()
return {
  totalCount: totalCount[0]['count'],
  data: data
}
but I get
{
  "totalCount": "11",
  "data": [
    {
      "count": "11"
    }
  ]
}
How can I get the data list without writing the where clauses twice? I don't want to do it like this:
var totalCount = await knex.table('users').where('status', 1).where('online', 1).count();
var data = await knex.table('users').where('status', 1).where('online', 1).offset(0).limit(10).select()
return {
  totalCount: totalCount[0]['count'],
  data: data
}
Thank you :)
You should probably use a higher-level library like Objection.js, which already has a convenience method for getting pages and the total count.
You can do it like this with knex:
// query builder is mutable so when you call method it will change builders internal state
const query = knex('users').where('status',1).where('online', 1);
// by cloning original query you can reuse common parts of the query
const total = await query.clone().count();
const data = await query.clone().offset(0).limit(10);
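Putting it together with the return shape from the question, a sketch (assuming PostgreSQL, where count() resolves to rows like [{ count: '11' }]):

async function getPage(knex) {
  // build the shared filter once, then clone it for each use
  const query = knex('users').where('status', 1).where('online', 1);
  const [{ count }] = await query.clone().count();
  const data = await query.clone().offset(0).limit(10);
  return { totalCount: count, data: data };
}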

Couchbase & nodejs: View query with range, order and limited results

I am new to couchbase and I'm trying to understand how filtering, ordering and limiting results in a view work together.
Couchbase version: 3.0.1
I'm using nodejs as the SDK.
I have a map function like this
function (doc, meta) {
if (doc.type !== 'item' || !doc.category) {
return;
}
emit([doc.orderId, doc.category.id, doc.number], null);
}
And an item document that looks like this
{
"id": 1,
"type": "item",
"number": 1203,
"orderId": 2,
"category": {
"id": 10,
"title": "Carpet"
}
}
I would like to filter only items with orderId = 2 and category.id = 10, all this ordered by number descending. Because I have a paginator, I would like to display 20 items per page. I have thousands of items in the database.
With the query below, I get an error because of the order call. If I comment it out, I get the results, filtered, limited, and ordered by number ascending by default.
var order_id = 2,
    category_id = 10,
    limit = 20,
    skip = 0,
    range = [order_id, category_id],
    // suppose we have a valid couchbase connection and a viewQuery object
    query = viewQuery.from('items', 'myView')
      .limit(limit)
      .skip(skip)
      .order(2) // 2 = DESC. This line doesn't work
      .include_docs(true)
      .range(range, range.concat([{}]), true);

bucket.query(query, function (err, docs) {
  console.log(err);
  console.log(docs);
});
The error says:
Error: query_parse_error: No rows can match your key range, reverse your start_key and end_key or set descending=false
Note that if I order ASC, the error occurs too. I have to remove the call to the .order() function to get my view to behave properly.
Does anyone know why?
Thanks
When you order your query in descending order, you have to swap the start and end keys as well (the parameters to the range method).
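To illustrate, a sketch of the corrected call using the same variables as the question; for a descending query the start key must be the high end of the range:

query = viewQuery.from('items', 'myView')
  .limit(limit)
  .skip(skip)
  .order(2) // DESC
  .include_docs(true)
  .range(range.concat([{}]), range, true); // start and end keys swapped for descending order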
