NIFI - Jolt how to select the first non null value in a Json Array - transform

I have a problem that I am trying to solve in NiFi and would love your help in coming up with a solution. I have thought of using a Jolt transform to achieve this, but I am open to any other suggestions.
I have a JSON array that looks like this:
[
  {
    "val1": "AAA",
    "val2": "",
    "val3": "111",
    "val4": "red"
  },
  {
    "val1": "BBB",
    "val2": "2",
    "val3": "222",
    "val4": "blue"
  },
  {
    "val1": "CCC",
    "val2": "",
    "val3": "333",
    "val4": "orange"
  },
  {
    "val1": "DDD",
    "val2": "2",
    "val3": "4444",
    "val4": "green"
  }
]
and I wrote a Jolt spec:
[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "val1": "&",
        "val2": "&",
        "val3": "&"
      },
      "0": {
        "val4": "&"
      }
    }
  }
]
that transforms the JSON array to:
{
  "val4" : "red",
  "val1" : [ "BBB", "CCC", "DDD" ],
  "val2" : [ "2", "", "2" ],
  "val3" : [ "222", "333", "4444" ]
}
However, this is not exactly the outcome I am looking for. I need val2 to be a single value: I want to ignore all the empty-string occurrences and select the first non-empty string that is available.
val2 is either an empty string "" or some string that repeats across records, e.g. "2". (I am using 2 as an example here; val2 can be anything, such as 3, 123 or 345, but if it is 123, then every non-empty occurrence of val2 will be 123.)
Sample desired output:
{
  "val4" : "red",
  "val1" : [ "BBB", "CCC", "DDD" ],
  "val2" : "2",
  "val3" : [ "222", "333", "4444" ]
}
Any help would be appreciated. Thank you in advance

I figured it out... silly me.
I used a QueryRecord processor with an ORDER BY clause, and then a JoltTransform to select the first record.
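For readers who want to check the intended result outside NiFi, here is a minimal Python sketch of the desired transformation. It is not the QueryRecord/Jolt flow described above, only an illustration of the target output. The helper name collapse is hypothetical, and it assumes that the first non-empty val2 should be searched for across all records in their original order.

```python
from typing import Any

# Sample input copied from the question.
records = [
    {"val1": "AAA", "val2": "", "val3": "111", "val4": "red"},
    {"val1": "BBB", "val2": "2", "val3": "222", "val4": "blue"},
    {"val1": "CCC", "val2": "", "val3": "333", "val4": "orange"},
    {"val1": "DDD", "val2": "2", "val3": "4444", "val4": "green"},
]


def collapse(rows: list[dict[str, str]]) -> dict[str, Any]:
    """Hypothetical helper: build the desired output from the input array."""
    head, tail = rows[0], rows[1:]
    # Assumption: scan every record, in order, for the first non-empty val2.
    # If every val2 is empty, the result is None.
    first_val2 = next((r["val2"] for r in rows if r["val2"] != ""), None)
    return {
        "val4": head["val4"],               # taken from the first record only
        "val1": [r["val1"] for r in tail],  # collected from the remaining records
        "val2": first_val2,
        "val3": [r["val3"] for r in tail],
    }


result = collapse(records)
print(result)
```

Running it against the sample input is a quick way to compare with the "Sample desired output" in the question before wiring up the processors.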


Filtering Data in JSON based on value instead of Index - Kusto Query Language

I am trying to extract a specific field from JSON by filtering the data based on its value instead of its index.
For example, my JSON looks like this:
{
  "AllData": [
    { "ID": "1", "Value": "Value1" },
    { "ID": "2", "Value": "Value2" },
    { "ID": "3", "Value": "Value3" },
    { "ID": "4", "Value": "Value4" },
    { "ID": "5", "Value": "Value5" }
  ]
}
I need to project the section (ID and Value) where Value equals some valueX. However, valueX is not always at index X; it can be at any index, so I cannot project by index and need to project by value instead. I can use the contains operator in my where clause to keep only the rows whose AllData array contains the value, as shown below:
| where parse_json(MyJson) contains("Value5")
| project MyJson[5].ID, MyJson[5].Value // this may give wrong result because Value5 can be at some other index
Any Suggestions will be helpful.
You can use mv-apply:
let my_value = "Value3";
print d = dynamic({"AllData": [
    {"ID": "1", "Value": "Value1"},
    {"ID": "2", "Value": "Value2"},
    {"ID": "3", "Value": "Value3"},
    {"ID": "4", "Value": "Value4"},
    {"ID": "5", "Value": "Value5"}
]})
| mv-apply d = d.AllData on (
    project ID = d.ID, Value = d.Value
    | where Value == my_value
)
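The idea behind the mv-apply answer is to expand the array into one row per element and then filter on the element's value, so the position in the array never matters. As a rough illustration only (Python, not KQL), the same selection can be sketched like this; select_by_value is a hypothetical name introduced for the example.

```python
# Sample document with the same shape as the JSON in the question.
doc = {"AllData": [{"ID": str(i), "Value": f"Value{i}"} for i in range(1, 6)]}


def select_by_value(document: dict, wanted: str) -> list[dict]:
    """Hypothetical helper: return the ID/Value pairs whose Value equals `wanted`."""
    return [
        {"ID": entry["ID"], "Value": entry["Value"]}
        for entry in document["AllData"]  # one "row" per array element
        if entry["Value"] == wanted       # filter by value, never by index
    ]


matches = select_by_value(doc, "Value3")
print(matches)
```

Because the filter runs per element, reordering the AllData array does not change which entry is selected.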

MongoDB Aggregate with sum of array object values

I have a collection with the following data:
{ "id": 1,
  "name": "abc",
  "age" : "12",
  "quizzes": [
    { "id": "1", "time": "10" },
    { "id": "2", "time": "20" }
  ]
}
{ "id": 2,
  "name": "efg",
  "age" : "20",
  "quizzes": [
    { "id": "3", "time": "11" },
    { "id": "4", "time": "25" }
  ]
}
I would like to use a MongoDB aggregation to compute the sum of the quiz times for each document and store it in a totalTimes field.
This is the result that I would like to get from the query:
{ "id": 1,
  "name": "abc",
  "age" : "12",
  "totalTimes": "30",
  "quizzes": [
    { "id": "1", "time": "10" },
    { "id": "2", "time": "20" }
  ]
}
{ "id": 2,
  "name": "efg",
  "age" : "20",
  "totalTimes": "36",
  "quizzes": [
    { "id": "3", "time": "11" },
    { "id": "4", "time": "25" }
  ]
}
How can I query to get the sum of quizzes time?
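This is quite simple using $reduce. Note that time is stored as a string, so it is converted with $toInt, and the running total ($$value) must be included in the sum:
db.collection.aggregate([
  {
    $addFields: {
      totalTimes: {
        $reduce: {
          input: "$quizzes",
          initialValue: 0,
          in: {
            $sum: [
              "$$value",
              { $toInt: "$$this.time" }
            ]
          }
        }
      }
    }
  }
])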
Mongo Playground
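A $reduce expression folds an array into a single value by carrying an accumulator ($$value) from one element ($$this) to the next, starting from initialValue. The following Python sketch mirrors that fold on the sample documents; it is only an illustration of the logic, not MongoDB code, and with_total_times is a hypothetical helper name.

```python
# Sample documents copied from the question ("time" is stored as a string).
docs = [
    {"id": 1, "name": "abc", "age": "12",
     "quizzes": [{"id": "1", "time": "10"}, {"id": "2", "time": "20"}]},
    {"id": 2, "name": "efg", "age": "20",
     "quizzes": [{"id": "3", "time": "11"}, {"id": "4", "time": "25"}]},
]


def with_total_times(doc: dict) -> dict:
    """Hypothetical helper: return a copy of `doc` with a numeric totalTimes field."""
    total = 0                       # plays the role of initialValue / $$value
    for quiz in doc["quizzes"]:     # each element plays the role of $$this
        total += int(quiz["time"])  # like $toInt, then added to the accumulator
    return {**doc, "totalTimes": total}


results = [with_total_times(d) for d in docs]
for r in results:
    print(r["id"], r["totalTimes"])
```

Note that this produces a number for totalTimes, whereas the desired output in the question shows it as a string; converting back to a string would be an extra step.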

How do I remove a list from an array of lists based on a condition using groovy?

I am new to Groovy and need help removing an entire list entry if it does not meet a criterion.
Here is the JSON --
{
  "School" : "New Elementary School",
  "District" : "District1",
  "City" : "NewTown",
  "Students" : [ {
    "Name": "Student1",
    "Grade": "1"
  }, {
    "Name": "Student2",
    "Grade": "2"
  }, {
    "Name": "Student3",
    "Grade": "1"
  }, {
    "Name": "Student4",
    "Grade": "1"
  }, {
    "Name": "Student5",
    "Grade": "1"
  } ]
}
I want a JSON which will have students from Grade 1 only i.e. remove Student2.
Output should be --
{
  "School" : "New Elementary School",
  "District" : "District1",
  "City" : "NewTown",
  "Students" : [ {
    "Name": "Student1",
    "Grade": "1"
  }, {
    "Name": "Student3",
    "Grade": "1"
  }, {
    "Name": "Student4",
    "Grade": "1"
  }, {
    "Name": "Student5",
    "Grade": "1"
  } ]
}
I have the loop and the condition in place. I searched online for how to remove an entire entry from the list but can't seem to find anything.
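You can use the removeAll method to do that. For example:
List a = [[a:1],[a:2],[a:1]]
a.removeAll { it.a == 2 }   // modifies the list in place
// a is now [[a:1], [a:1]]
In your case, to keep only Grade 1 students (removeAll returns a boolean rather than the list, so do not assign its result back to students):
students.removeAll { it.Grade != "1" }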
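For comparison, the same filtering step can be sketched in Python, which is a different language from the Groovy answer and is shown only to illustrate the logic. The helper name keep_grade is hypothetical. Unlike removeAll, this version builds a new list and leaves the input untouched.

```python
# Sample data copied from the question.
school = {
    "School": "New Elementary School",
    "District": "District1",
    "City": "NewTown",
    "Students": [
        {"Name": "Student1", "Grade": "1"},
        {"Name": "Student2", "Grade": "2"},
        {"Name": "Student3", "Grade": "1"},
        {"Name": "Student4", "Grade": "1"},
        {"Name": "Student5", "Grade": "1"},
    ],
}


def keep_grade(data: dict, grade: str) -> dict:
    """Hypothetical helper: copy `data`, keeping only students in `grade`."""
    kept = [s for s in data["Students"] if s["Grade"] == grade]
    return {**data, "Students": kept}


filtered = keep_grade(school, "1")
print([s["Name"] for s in filtered["Students"]])
```

Whether to mutate in place (as removeAll does) or build a filtered copy is a design choice; the copy is easier to reason about when the original document is still needed later.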

Iterating over array in Cosmos DB

I have a Cosmos DB where a document called Auditlog resides.
The simplified structure is as follows:
[
  {
    "id": "1",
    "name": "A",
    "messages": [
      {
        "gps": {
          "src": "GPS"
        },
        "ts": "0"
      }
    ]
  },
  {
    "id": "2",
    "name": "B",
    "messages": [
      {
        "gps": {
          "src": "DR"
        },
        "ts": "1"
      }
    ]
  }
]
I want to filter the document to get all entries that have src: GPS.
The result also needs to show the ID.
I have no idea on how to accomplish this.
I tried using the 'IN'-operator but without luck.
Using the 'IN'-operator makes it impossible to display the ID.
I tried:
IN Auditlog.messages
WHERE c.gps.src = "GPS"
The result is correct but I need the ID to be displayed in the result.
The following just results in an array of empty objects:
IN Auditlog.messages
WHERE c.gps.src = "GPS"
Can someone please help me?
Thanks in advance.
Use a JOIN over the messages array so that the document's id stays in scope:
SELECT c.id
FROM c JOIN a IN c.messages
WHERE a.gps.src = "GPS"
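With the sample documents shown in the question, only the document with id "1" contains a message whose gps.src is "GPS", so the result will be:
[ { "id": "1" } ]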

How to find duplicate values in mongodb using distinct query?

I am working on a MongoDB distinct query. I have one collection with repeated entries that differ only by created_at, and I want to fetch them without the repeated values.
Sample JSON
{
  "posts": [{
    "id": "580a2eb915a0161010c2a562",
    "name": "\"Ah Me Joy\" Porter",
    "created_at": "15-10-2016"
  }, {
    "id": "580a2eb915a0161010c2a562",
    "name": "\"Ah Me Joy\" Porter",
    "created_at": "25-10-2016"
  }, {
    "id": "580a2eb915a0161010c2a562",
    "name": "\"Ah Me Joy\" Porter",
    "created_at": "01-10-2016"
  }, {
    "id": "580a2eb915a0161010c2bf572",
    "name": "Hello All",
    "created_at": "05-10-2016"
  }]
}
Mongodb Query
db.getCollection('posts').find({"id" : ObjectId("580a2eb915a0161010c2a562")})
So I would like to know how to use MongoDB's distinct query for this. Please have a look at my post and let me know.
Try the following:
db.getCollection('posts').distinct("id")
It will return all the unique IDs in the posts collection:
["580a2eb915a0161010c2a562", "580a2eb915a0161010c2bf572"]
From MongoDB docs:
The example uses the inventory collection, which contains the following documents:
{ "_id": 1, "dept": "A", "item": { "sku": "111", "color": "red" }, "sizes": [ "S", "M" ] }
{ "_id": 2, "dept": "A", "item": { "sku": "111", "color": "blue" }, "sizes": [ "M", "L" ] }
{ "_id": 3, "dept": "B", "item": { "sku": "222", "color": "blue" }, "sizes": "S" }
{ "_id": 4, "dept": "A", "item": { "sku": "333", "color": "black" }, "sizes": [ "S" ] }
To Return Distinct Values for a Field (dept):
db.inventory.distinct( "dept" )
The method returns the following array of distinct dept values:
[ "A", "B" ]
As I understand it, you want distinct results, that is, you want to eliminate the documents with a duplicate id from that collection.
distinct in MongoDB only returns the list of distinct values:
["580a2eb915a0161010c2a562", "580a2eb915a0161010c2bf572"]
So you should look into MongoDB aggregation instead:
db.getCollection('posts').aggregate([
  { "$group" : { "_id" : "$id", "name" : {"$first" : "$name"}, "created_at" : {"$first" : "$created_at"} }}
])
The output will be a list of results with one document per id, which eliminates the duplicate-id documents.
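The difference between the two approaches (distinct values only versus one whole document per id) can be sketched in Python. This is an illustration of the logic rather than MongoDB code; distinct_ids and first_per_id are hypothetical helper names, and "first" here means first in list order, which is only an approximation of what $first sees when the pipeline has no explicit sort.

```python
# Sample posts copied from the question.
posts = [
    {"id": "580a2eb915a0161010c2a562", "name": '"Ah Me Joy" Porter', "created_at": "15-10-2016"},
    {"id": "580a2eb915a0161010c2a562", "name": '"Ah Me Joy" Porter', "created_at": "25-10-2016"},
    {"id": "580a2eb915a0161010c2a562", "name": '"Ah Me Joy" Porter', "created_at": "01-10-2016"},
    {"id": "580a2eb915a0161010c2bf572", "name": "Hello All", "created_at": "05-10-2016"},
]


def distinct_ids(items: list[dict]) -> list[str]:
    """Like distinct("id"): only the unique values, no other fields."""
    return list(dict.fromkeys(item["id"] for item in items))


def first_per_id(items: list[dict]) -> list[dict]:
    """Like $group with $first: keep the first document seen for each id."""
    grouped: dict[str, dict] = {}
    for item in items:
        grouped.setdefault(item["id"], item)  # later duplicates are ignored
    return list(grouped.values())


print(distinct_ids(posts))
print(first_per_id(posts))
```

The first function returns bare id strings, while the second keeps the name and created_at of one representative document per id, which is what the $group stage provides.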
