How to query pageviews in Node.js with the @google-analytics/data library

Does anyone know how to query pageviews with the Google Analytics Data library for Node? I can't seem to find any documentation on which "metric" to query for that:
async function runReport() {
  const [response] = await analyticsDataClient.runReport({
    property: `properties/${propertyId}`,
    dimensions: [
      {
        name: 'date',
      },
    ],
    metrics: [
      {
        name: 'pageViews',
      }
    ],
    dateRanges: [
      {
        startDate: '7daysAgo',
        endDate: 'today',
      },
    ],
  });
  console.log('Report result:');
  response.rows.forEach(row => {
    console.log(row.dimensionValues[0], row.metricValues[0]);
  });
}
runReport();
This gives me the error: Field pageViews is not a valid metric.
Is anyone aware of a list that outlines the valid metrics you can query with this library? I wasn't able to find one. I can get activeUsers just fine, so I know the code and configuration are working.

Found it about 5 mins after I posted this question - go figure. Hope this helps someone: https://developers.google.com/analytics/devguides/reporting/data/v1/api-schema#metrics
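For anyone hitting the same error: per that schema, the GA4 Data API name for this metric is screenPageViews (the ga:pageviews name from Universal Analytics does not carry over), so the metrics block above becomes:

metrics: [
  {
    name: 'screenPageViews',
  }
],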

Related

How to filter with pagination efficiently with millions of records in mongodb?

I know there are a LOT of questions regarding this subject. And while most work, they perform really poorly when there are millions of records.
I have a collection with 10,000,000 records.
At first I was using mongoose-paginate-v2 and it took around 8s to get each page with no filtering, and 25s when filtering. Fairly decent compared to the other answers I found googling around. Then I read about aggregate (in some question about the same here) and it was a marvel: 7ms to get each page without filtering, no matter which page it is:
const pageSize = +req.query.pagesize;
const currentPage = +req.query.currentpage;
let recordCount;
ServiceClass.find().count().then((count) => {
  recordCount = count;
  ServiceClass.aggregate().skip(currentPage).limit(pageSize).exec().then((documents) => {
    res.status(200).json({
      message: msgGettingRecordsSuccess,
      serviceClasses: documents,
      count: recordCount,
    });
  })
  .catch((error) => {
    res.status(500).json({ message: msgGettingRecordsError });
  });
}).catch((error) => {
  res.status(500).json({ message: "Error getting record count" });
});
What I'm having issues with is filtering. aggregate doesn't really work like find, so my conditions are not working. I read the docs about aggregate and tried with [ {$match: {description: {$regex: regex}}} ] inside aggregate as a start, but it did not return anything.
This is my current working function for filtering and pagination (which takes 25s):
const pageSize = +req.query.pagesize;
const currentPage = +req.query.currentpage;
const filter = req.params.filter;
const regex = new RegExp(filter, 'i');
ServiceClass.paginate({
  $or: [
    { code: { $regex: regex } },
    { description: { $regex: regex } },
  ]
}, { limit: pageSize, page: currentPage }).then((documents) => {
  res.status(200).json({
    message: msgGettingRecordsSuccess,
    serviceClasses: documents
  });
}).catch((error) => {
  res.status(500).json({ message: "Error getting the records." });
});
code and description are both indexed: code has a unique index and description a normal one. I need to search for documents that contain a string in either the code or the description field.
What is the most efficient way to filter and paginate when you have millions of records?
The code below gets the paginated result from the database along with the total document count for that query in a single round trip.
const pageSize = +req.query.pagesize;
const currentPage = +req.query.currentpage;
const skip = currentPage * pageSize - pageSize;
const regex = new RegExp(req.params.filter, 'i');
const query = [
  {
    $match: { $or: [{ code: { $regex: regex } }, { description: { $regex: regex } }] },
  },
  {
    // $facet runs both sub-pipelines over the same matched set in one pass
    $facet: {
      result: [
        { $skip: skip },
        { $limit: pageSize },
        {
          $project: {
            createdAt: 0,
            updatedAt: 0,
            __v: 0,
          },
        },
      ],
      count: [
        { $count: "count" },
      ],
    },
  },
  {
    $project: {
      result: 1,
      count: {
        $arrayElemAt: ["$count", 0],
      },
    },
  },
];
const result = await ServiceClass.aggregate(query);
console.log(result);
// result is a single-element array: [{ result: [...], count: { count: <total> } }]
Hope it helps.
The most efficient way to filter and paginate millions of records is to use MongoDB's built-in pagination and filtering features: the $skip, $limit, and $match stages of the aggregate() pipeline.
You use $skip to skip a number of documents and $limit to cap how many are returned, while $match filters the documents against your conditions.
To filter your documents based on the code or description field, you can use the $match operator with the $or operator, like this:
ServiceClass.aggregate([
  { $match: { $or: [{ code: { $regex: regex } }, { description: { $regex: regex } }] } },
  { $skip: (currentPage - 1) * pageSize },  // $skip takes a document count, not a page number
  { $limit: pageSize }
])
You can also use the $text operator instead of $regex, which performs better for text-search queries when backed by a text index (see the sketch below).
It's also important to make sure the relevant fields (code and description) are indexed, as that speeds up the search; note, though, that a case-insensitive $regex generally cannot use a standard index efficiently, which is another argument for $text here.
You might have to adjust the query according to your specific use case and data.
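As a sketch of the $text suggestion (assuming a compound text index on code and description, and the default mongoose collection name; $text must appear in the first $match stage of the pipeline):

// One-time index creation, e.g. in mongosh:
// db.serviceclasses.createIndex({ code: "text", description: "text" })

ServiceClass.aggregate([
  // $text walks the text index instead of scanning every document with $regex
  { $match: { $text: { $search: filter } } },
  { $skip: (currentPage - 1) * pageSize },
  { $limit: pageSize }
])

Note that $text matches whole (stemmed) words rather than arbitrary substrings, so results can differ from the regex approach.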

Mongoose - How to populate only the first element in a nested array of every object

I am trying to create a Chat App in NodeJS, Mongoose and React Native, and I want to show the user the last message of every conversation.
I have an array of messages in my Conversation Schema as following:
const ConversationSchema = new mongoose.Schema({
  {...}
  messages: [
    {
      type: mongoose.Schema.Types.ObjectId,
      ref: 'Message'
    }
  ]
  {...}
})
I wonder if anyone can help me populate only the first message of every Conversation, so the response will be:
conversations: [
  {
    "_id": ObjectId(...),
    "messages": [
      { "text": "Last message" }
    ]
  },
  {
    "_id": ObjectId(...),
    "messages": [
      { "text": "Last message" }
    ]
  },
  ...
]
I am currently using the populate function of mongoose, but the problem is that it only populates the first conversation:
const query = {
  path: "messages",
  options: {
    limit: 1,
    sort: { _id: -1 }
  }
};

Conversation.find().populate(query).exec((err, conversations) => {
  {...}
});
Note: If I do not specify the limit: 1 and sort: { _id: -1 } it correctly populates all elements of the array but that's not what I am looking for.
Thanks to anyone who can help me!
You need to use perDocumentLimit instead of limit.
If you need the correct limit, you should use the perDocumentLimit option (new in Mongoose 5.9.0). Just keep in mind that populate() will execute a separate query for each document, which may cause populate() to be slower.
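Adapted to the code in the question, that would look something like this (a sketch; requires Mongoose >= 5.9.0):

Conversation.find().populate({
  path: 'messages',
  perDocumentLimit: 1,           // limit is applied per conversation, not across the whole result set
  options: { sort: { _id: -1 } } // newest message first
}).exec((err, conversations) => {
  // each conversation.messages now holds only its own latest message
});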

Slow .findByIdAndUpdate().lean() with MongoDB

I need to update/save documents with sizes between 100KB and 800KB. Update operations like console.time('save'); await doc.findByIdAndUpdate(...).lean(); console.timeEnd('save'); are taking 5s-10s to finish. The updates contain ~50KB at most.
The large document property which is being updated has a structure like so:
{
largeProp: [{
key1: { key1A:val, key1B:val, ... 10 more ... },
key2: { key1A:val, key1B:val, ... 10 more ... },
key3: { key1A:val, key1B:val, ... 10 more ... },
...300 more...
}, ...100 more... ]
}
I'm using a Node.js server on an Ubuntu VM with mongoose.js, with MongoDB hosted on a separate server. The MongoDB server does not show any unusual load; it usually stays under 7% CPU. However, my Node.js server hits 100% CPU usage with just this update operation (after a .findById() and some quick logic, 8ms-52ms). The .findById() takes about 500ms - 1s for this same object.
I need these saves to be much faster, and I don't understand why this is so slow.
I did not do much more profiling on the Mongoose query. Instead, I tested a native MongoDB query and it significantly improved the speed, so I will be using native MongoDB going forward.
const { ObjectId } = mongoose.Types;
let result = await mongoose.connection.collection('collection1')
  .aggregate([
    { $match: { _id: ObjectId(gameId) } },
    { $lookup: {
        localField: 'field1',
        from: 'collection2',
        foreignField: '_id',
        as: 'field1'
      }
    },
    { $unwind: '$field1' },
    { $project: {
        _id: 1,
        status: 1,
        createdAt: 1,
        slowArrProperty: { $slice: ["$positions", -1] },
        updatedAt: 1
      }
    },
    { $unwind: "$slowArrProperty" }
  ]).toArray();
if (result.length < 1) return {};
return result[0];
This query, along with some restructuring of my data model, solved my issue. Specifically, for the very large document property that was causing issues, I used { $slice: ["$positions", -1] } above to return only one of the objects in the array at a time.
Just from switching to native MongoDB queries (within the mongoose wrapper), I saw between 60x and 3000x improvements on query speeds.
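For completeness, the write side can go through the same mongoose.connection.collection() escape hatch. A hypothetical sketch (the $push shape and the newPosition name are illustrative, not from the original code), appending a single entry instead of rewriting the whole large array:

// Sends only the delta over the wire; no Mongoose document hydration or change tracking
await mongoose.connection.collection('collection1').updateOne(
  { _id: ObjectId(gameId) },
  { $push: { positions: newPosition } }
);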

Issue with filter in Google Analytics Reporting v4

I am using Analytics Reporting v4 and Node.js.
I need to get the number of triggered events for a group of dimensions.
For example:
dimensions: date, source, medium, campaign
metrics: pageviews, totalEvents
where eventAction = "Test Action"
When I combine these two metrics, pageviews and totalEvents, the result shows the wrong numbers. But when I use them separately, it works well.
True results for metrics:
total pageviews - 32 (but shows 17)
total events - 9
Maybe someone knows why? Maybe it does not count pageviews where the user didn't trigger the action ("Test Action")? And how can I do this correctly?
Response example - http://i.imgur.com/BUkqiQG.png
Request code:
reportRequests: [{
  view_id: viewId,
  dateRanges: [{
    startDate: '2020-02-10',
    endDate: '2020-02-10',
  }],
  dimensions: [
    { name: 'ga:date' },
    { name: 'ga:source' },
    { name: 'ga:medium' },
    { name: 'ga:campaign' }
  ],
  metrics: [
    { expression: 'ga:pageviews' },
    { expression: 'ga:totalEvents' },
  ],
  orderBys: [{
    fieldName: 'ga:date',
    sortOrder: "ASCENDING"
  }],
  dimensionFilterClauses: [{
    filters: [{
      dimension_name: 'ga:eventAction',
      operator: 'EXACT',
      expressions: ["Test Action"]
    }]
  }]
}]
This is because you are filtering the pageviews down to only those where the event occurred.
The source, medium, campaign dimensions are all session level. Therefore, when you report on those, and just pageviews, they give total pageviews.
However, when you filter the results to where eventAction=Test, it only returns the pageviews where that event action occurred.
Instead of this, I would suggest using a segment, something like:
"segments": [{
"dynamicSegment": {
"sessionSegment": {
"segmentFilters": [{
"simpleSegment" :{
"orFiltersForSegment": [{
"segmentFilterClauses":[{
"dimensionFilter": {
"dimensionName": "ga:eventAction",
"expressions": ["Test Action"]
}
}]
}]
}
}]
}
}
}]
More info: https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet#DynamicSegment
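One gotcha worth noting: the v4 API requires the ga:segment dimension in any request that includes segments, so the dimensions list above would become:

dimensions: [
  { name: 'ga:date' },
  { name: 'ga:source' },
  { name: 'ga:medium' },
  { name: 'ga:campaign' },
  { name: 'ga:segment' }  // required whenever "segments" is present in the request
],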

Elasticsearch _bulk update issue giving VersionConflictEngineException message

I am using Elasticsearch in one of my projects. I am facing an issue while updating a record. The error message that I am getting is:
{ _index: 'makes',
_type: 'make',
_id: '55b8cdbae36236490d00002a',
status: 409,
error: 'VersionConflictEngineException[[makes][0] [make][55b8cdbae36236490d00002a]: version conflict, current [168], provided [167]]' }
I am using the ES bulk API, and my application is in Node.js.
Let me share my code too:
var conditions = [];
conditions.push({
  update: {
    _index: config.elasticSearch.index,
    _type: config.elasticSearch.type,
    _id: id
  }
});
conditions.push({
  doc: {
    published: true
  }
});
client.bulk({
  body: conditions
}, function(err, resp) {
  console.log(resp);
  console.log(resp.items[0].update);
  return res.send({success: true, message: "Shows updated successful"});
});
Below is the value of the conditions array:
[ { update:
{ _index: 'makes',
_type: 'make',
_id: '55b8cdbae36236490d00002a' } },
{ doc: { published: true } } ]
When you query a record, the response includes the record along with its current version. When you then update it, but it has been updated by someone else in the meantime, the record in the database carries a higher version than the one the client provides, and you get a version conflict.
This can also happen because some operations are still queued, so you read a not-yet-processed record (hence the lower version). When this happens, try refreshing the index (https://www.elastic.co/guide/en/elasticsearch/reference/1.6/indices-refresh.html):
curl -XPOST 'http://localhost:9200/{your_index}/_refresh'
Then call your method again.
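If the conflicts come from concurrent writers rather than refresh lag, you can also let Elasticsearch retry the update itself by adding _retry_on_conflict to the bulk action metadata (named retry_on_conflict, without the leading underscore, in ES 6+); applied to the code above:

conditions.push({
  update: {
    _index: config.elasticSearch.index,
    _type: config.elasticSearch.type,
    _id: id,
    _retry_on_conflict: 3  // re-fetch the doc and re-apply the update up to 3 times on conflict
  }
});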
