Is it ok to use custom variables in the aggregation format for the feed?
When I push my activity, I push the following:
$data = [
    'actor' => '1',
    'verb' => "$verb",
    'object' => "$objectType:$objectId",
    'target' => "$targetObjectType:$targetObjectId",
    'time' => "$time",
    'foreign_id' => "$foreignId",
    // Custom field
    'object_type' => $objectType
];
When editing the aggregation feed, the documentation mentions:
The following variables are available to you: verb, time, object, target, id, actor.
The reason I want a custom variable is that I want to aggregate by verb, target, and object (type), so that I can show things such as "10 points were added to your item of id 1". If we use the id as well, e.g. object=point:1, then we can't use object in the aggregation, since the id will be different for each point and the activities will never aggregate.
I just tried using a custom variable in the aggregation, and it seems to be available and works. Is there anything wrong with doing that?
Yes, you can use custom variables in your aggregation format; there is nothing wrong with doing so. In fact, it's a great solution that gives you a lot of control over the aggregation. We should state that more clearly in the interface.
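For instance, reusing the custom object_type field from the payload above, an aggregation format along these lines (a sketch; the exact template syntax is whatever the feed editor accepts for its listed variables) would group activities by verb, target, and object type rather than by individual object id:

```
{{ verb }}_{{ target }}_{{ object_type }}
```

Activities sharing all three values then collapse into one aggregated group, regardless of the per-point id in the object field.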
Hello all!
I am trying to use the Aggregate filter plugin of Logstash v7.7 to correlate and combine data from two different CSV file inputs that represent API data calls. The idea is to produce a record showing a combined picture. As you might expect, the data may or may not arrive in the right sequence.
Here is an example:
/data/incoming/source_1/*.csv
StartTime, AckTime, Operation, RefData1, RefData2, OpSpecificData1
231313232,44343545,Register,ref-data-1a,ref-data-2a,op-specific-data-1
979898999,75758383,Register,ref-data-1b,ref-data-2b,op-specific-data-2
354656466,98554321,Cancel,ref-data-1c,ref-data-2c,op-specific-data-2
/data/incoming/source_2/*.csv
FinishTime,Operation,RefData1, RefData2, FinishSpecificData
67657657575,Cancel,ref-data-1c,ref-data-2c,FinishSpecific-Data-1
68445590877,Register,ref-data-1a,ref-data-2a,FinishSpecific-Data-2
55443444313,Register,ref-data-1a,ref-data-2a,FinishSpecific-Data-2
I have a single pipeline that is receiving both these CSVs, and I am able to process and write them as individual records to a single index. However, the idea is to combine records from the two sources into one record, each representing a superset of Operation-related information.
Unfortunately, despite several attempts, I have been unable to figure out how to achieve this via the Aggregate filter plugin. My primary question is whether this is a suitable use of this particular plugin? And if so, any suggestions would be welcome!
At the moment, I have this:
input {
  file {
    path => ['/data/incoming/source_1/*.csv']
    tags => ["source1"]
  }
  file {
    path => ['/data/incoming/source_2/*.csv']
    tags => ["source2"]
  }
}
filter {
  # use the tags to do some source 1 and 2 related massaging, calculations, etc.
  aggregate {
    task_id => "%{Operation}_%{RefData1}_%{RefData2}"
    code => "
      map['source_files'] ||= []
      map['source_files'] << { 'source_file' => event.get('path') }
    "
    push_map_as_event_on_timeout => true
    timeout => 600 # assuming this is the farthest apart they will arrive
  }
  ...
}
output {
  elasticsearch { ... }
}
And other such variations. However, I keep getting individual records written to the index and am unable to get a combined one. Again, as you can see from the data set, there's no guarantee of the sequencing of records, so I am wondering if this filter is the right tool for the job to begin with? :-\
Or is it just me not being able to use it right! ;-)
In either case, any inputs/comments/suggestions are welcome. Thanks!
PS: This message is being cross-posted from the Elastic forums. I am providing a link there too, just in case some answers pop up there.
The answer is to use Elasticsearch in upsert mode. Please see the specifics here.
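A sketch of what that could look like in the output section, assuming the same correlation key as the question's task_id (the hosts and index values here are placeholders):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "operations"
    # Both sources write to the same document id, so whichever record
    # arrives second merges into the document created by the first.
    document_id => "%{Operation}_%{RefData1}_%{RefData2}"
    action => "update"
    doc_as_upsert => true
  }
}
```

With doc_as_upsert, the update creates the document if it does not exist yet, so arrival order no longer matters.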
First, I recommend ensuring that the events reach you in order, so that the filter can handle them better. Second, you could set these options in your pipeline.yml: pipeline.workers: 1 and pipeline.ordered: true, thus guaranteeing the order of processing.
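As a sketch, a pipelines.yml entry with those two settings could look like this (the pipeline id and config path are placeholders):

```yaml
- pipeline.id: csv-correlation
  path.config: "/etc/logstash/conf.d/csv-correlation.conf"
  pipeline.workers: 1     # single worker, so events are not processed in parallel
  pipeline.ordered: true  # preserve event order (available since Logstash 7.7)
```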
I'm using express and express-validator in my Node.js app. I want to check for the presence of at least one of the incoming parameters. It's sort of an either-or combination.
Let's say my service accepts 2 parameters. I want to be sure at least one of them is provided by the client.
The code below works for just one, but I have no idea how to make it either-or.
req.checkBody('param1', 'Mandatory field param1 not populated').notEmpty();
Say you want to update a model that has id, status, and content... like a social media post, for example. Your controller may support updating the status of the model or its content. So, you could do something like the following:
import { body, oneOf, param } from 'express-validator';

export const updateModelValidation = [
  param('id').exists().isNumeric(), // <-- required model identifier
  oneOf( // <-- one of the following must exist
    [
      body('status').exists().isString(),
      body('content').exists().isString(),
    ],
  ),
];
You can use multiple validation chains and the oneOf function to validate against at least one of those chains.
https://www.npmjs.com/package/express-validator#oneofvalidationchains-message
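For comparison, the same either-or check can also be hand-rolled as plain Express-style middleware without the library. This is just an illustrative sketch; requireAtLeastOne is a made-up helper, not part of express-validator:

```javascript
// Hypothetical middleware: reject the request unless at least one of the
// listed body fields is present and non-empty.
function requireAtLeastOne(fields) {
  return function (req, res, next) {
    const present = fields.some(function (f) {
      return req.body && req.body[f] !== undefined && req.body[f] !== '';
    });
    if (!present) {
      // None of the fields were supplied: reply with a 400 and stop here.
      return res
        .status(400)
        .json({ error: 'Provide at least one of: ' + fields.join(', ') });
    }
    next();
  };
}

// e.g. app.post('/model/:id', requireAtLeastOne(['status', 'content']), handler);
```

The library approach is preferable when you also need per-field sanitization, but the sketch shows that the underlying check is simple.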
I'm having some trouble managing my i18n in my database.
For now I have just two languages available in my application, but in order to be scalable, I would like to do it the "best" way.
I could have duplicated all the fields, like description_fr and description_en, but I was not comfortable with that at all. What I've done for now is an external table, call it content, whose architecture is like this:
id_ref => entity referenced id (2)
type => table name (university)
field => field of the specific table (description)
lang => which lang (fr, en, es…)
content => and finally the appropriate content.
I think it is important to mention that I use SequelizeJS as my ORM, so I can use useful hooks such as afterFind, afterCreate and afterUpdate. Each time I want to find a resource, for example, after finding it my hook retrieves all the content for that resource and populates my object with the right values. It works, but I'm not in love with it.
But I have some troubles with this:
It increases my number of database requests considerably: if I select 50 rows, for example, I have to make 50 more requests, and that's just for one particular model. If I have nested models, it grows exponentially…
It's also complicated to fetch data by i18n'd content; for example, finding a university with a specific name is complicated.
And it's a lot of work for updating, etc.
So I wonder if it would be a good idea to save the data as JSON, directly in the table concerned. Something like:
{
  "fr": { "name": "Ma super université" },
  "en": { "name": "My kick ass university" }
}
And keep using Sequelize hooks to build and insert the proper data into my object.
What do you think ?
How do you manage this ?
EDIT
I use a MySQL database.
It concerns around 20 fields (across models).
I have to set the default value using my default_lang if there is no content set (e.g., event.description in French will be the same as the English one if there is no French content set).
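As a sketch of that JSON-column direction in plain JavaScript, independent of Sequelize (translate and DEFAULT_LANG are hypothetical names), the fallback rule from the EDIT could look like this:

```javascript
const DEFAULT_LANG = 'en';

// Resolve a translated field from a per-row JSON blob, falling back to the
// default language when the requested language has no value for that field.
function translate(i18nBlob, field, lang) {
  const forLang = i18nBlob[lang] && i18nBlob[lang][field];
  if (forLang !== undefined && forLang !== null && forLang !== '') {
    return forLang;
  }
  // No content for the requested language: fall back to default_lang.
  const fallback = i18nBlob[DEFAULT_LANG] && i18nBlob[DEFAULT_LANG][field];
  return fallback !== undefined ? fallback : null;
}
```

Something like this could be called from an afterFind hook to flatten the blob into the model instance, which keeps it to one query per find instead of one extra query per row.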
I used the npm package sequelize-i18n. It worked fine for me with Sequelize 3.23.2; unfortunately, it does not seem to support Sequelize 4.x yet.
For example, if I had a 'Conversation' model in a simple chat messaging system, I might do the following:
module.exports = {
  attributes: {
    messages: {
      collection: 'Message'
    }
  }
};
Is this allowed in SailsJs? If not, is it recommended to mimic a "Has" relationship from Conversation to Message by using some form of custom array? Such as below:
module.exports = {
  attributes: {
    messages: {
      type: 'array'
    }
  }
};
In a more complex scenario, my goal is to have the 'Conversation' know all of its 'Message' objects, but it is unnecessary for those 'Message' objects to know of its associated 'Conversation'.
I'd been using that construct for quite a while but only now did I find that the official docs don't specify it.
They mention that in one-way associations a model is associated with another model and don't mention collections. (Though they should work in just the same manner.)
For one-to-many associations they specify that a model can be associated with many other models (a collection) but don't specify what happens if you ignore the via attribute. They simply mention it is needed.
However, if you simply leave out the via attribute, the id field is used as the key for the association. So the construct you specified is allowed.
On a different note, you might want to reconsider keeping messages as either an array or a collection. Since you might need to add/retrieve/update/remove messages in a random fashion and collections and arrays can only be accessed as a whole, it might make sense to specify a relevant index on the Message collection and forgo having an association. This would let you quickly run queries like "retrieve the last 10 messages of thread " and so on.
I want to make a request like:
User.find().exec(function(){});
I know I can use toJSON in the model; however, I don't like this approach, since sometimes I need different parameters. For instance, if it's the logged-in user, I will return their email and other parameters. However, if the request for the same data is made by a different user, it should not include the email, and should return a smaller subset of parameters.
I've also tried using:
User.find({}, {username:1}) ...
User.find({}, {fields: {username:1}});
But not having any luck. How can I specify the fields I need returned?
So I actually found a weird workaround for this: the fields param WILL work as long as you pass other params with it, such as limit or order:
User.find({}, {fields: {username:1}}).limit(1);
Note that this will NOT work with findOne or any of the singular-returning types. This means that in your result callback you will get an array and need to index into it, e.g. user[0].
Of course, the other option is to just scrub your data on the way out, which is a pain if you are returning a large list of items. So if anything, this works for large lists where you might actually set limit(20), and for single items you can just explicitly return params until select() is available.
This is an update to the question: fields is no longer used as of Sails 0.11; please use select instead of fields.
Model.find({ field: 'value' }, { select: ['id', 'name'] })
  .paginate({ page: 1 }, { limit: 10 })
  .exec(function (err, results) {
    if (err) {
      return res.badRequest('reason');
    }
    res.json(results);
  });
Waterline does not currently support any "select" syntax; it always returns all fields for a model. That feature is currently in development and may make it into the next release, but for now the best way to do what you want would be to use model class methods to make custom finders. For example, User.findUser(criteria, cb) could find a user given criteria, and then check whether it was the logged-in user before deciding which data to return in the callback.
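The "decide which data to return" step of such a custom finder can be sketched as a plain whitelist helper (serializeUser and the field lists are illustrative names, not Waterline API):

```javascript
// Pick only the fields the requesting user is allowed to see: private
// fields such as email are returned only to the user themselves.
function serializeUser(user, requesterId) {
  const publicFields = ['id', 'username'];
  const privateFields = ['email'];
  const allowed = user.id === requesterId
    ? publicFields.concat(privateFields)
    : publicFields;
  const out = {};
  for (const f of allowed) {
    if (user[f] !== undefined) out[f] = user[f];
  }
  return out;
}
```

A User.findUser(criteria, cb) class method could run the normal find and then map each result through a helper like this before invoking the callback.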