Get objects with max value from grouped by linq query

I have the following LINQ query:
var invadersOrderedInColumns = from i in invaders
                               group i by i.GetPosition().X;
This groups the invaders that share the same X position. The next thing I want to do is retrieve the invader with the highest Y value from each of those columns.
Imagine each invader as a black block in the following image. It represents the invaders after the above LINQ query. Each X = Value is the key of a group.
Now, from each of these groups (columns), I want to get the invader with the highest Y position (so the bottom invader of each column when you look at the picture):
How can I get this done with a LINQ query?

I don't much care for the query syntax, but in extension method syntax it would look something like this.
var invadersOrderedInColumns = invaders
    .GroupBy(d => d.GetPosition().X)
    .Select(d => d.OrderByDescending(y => y.GetPosition().Y).First());
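The same group-then-pick-the-maximum idea can be sketched outside LINQ. The Python version below is only an illustration: the Invader class and its x and y fields are hypothetical stand-ins for the question's GetPosition(). It takes max() per group instead of sorting each group, which selects the same invader as long as Y values within a column are distinct.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Invader:
    # Hypothetical stand-in for the question's invader type.
    x: int
    y: int


def bottom_invaders(invaders):
    """Return, for each X column, the invader with the highest Y."""
    # groupby only merges adjacent items, so sort by the grouping key first.
    by_x = sorted(invaders, key=lambda i: i.x)
    return [
        max(column, key=lambda i: i.y)
        for _, column in groupby(by_x, key=lambda i: i.x)
    ]


invaders = [
    Invader(0, 0), Invader(0, 1),
    Invader(1, 0), Invader(1, 2), Invader(1, 1),
    Invader(2, 0),
]
result = bottom_invaders(invaders)
print(result)
```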

Related

Timeseries differencing - ArangoDB (AQL or Python)

I have a collection which holds documents, with each document having a data observation and the time that the data was captured.
e.g.
{
_key:....,
"data":26,
"timecaptured":1643488638.946702
}
where timecaptured is, for now, a UTC timestamp.
What I want to do is get the duration between consecutive observations, that is, the difference in timestamps between two documents in time order. With SQL I could do this with LAG, for example, but I am struggling to see how to do it in the database with ArangoDB and AQL. I have a lot of data and I don't really want to pull it all into pandas.
Any help is really appreciated.

Although the solution provided by CodeManX works, I prefer a different one:
FOR d IN docs
SORT d.timecaptured
WINDOW { preceding: 1 } AGGREGATE s = SUM(d.timecaptured), cnt = COUNT(1)
LET timediff = cnt == 1 ? null : d.timecaptured - (s - d.timecaptured)
RETURN timediff
We simply calculate the sum of the previous and the current document, and by subtracting the current document's timecaptured we can therefore calculate the timecaptured of the previous document. So now we can easily calculate the requested difference.
I only use the COUNT to return null for the first document (which has no predecessor). If you are fine with having a difference of zero for the first document, you can simply remove it.
However, neither approach is very straightforward or obvious. I have put it on my TODO list to add an APPEND aggregate function that could be used in WINDOW and COLLECT operations.
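To see why the SUM/COUNT trick recovers the previous timestamp, here is a minimal Python sketch that imitates a window of "current row plus at most one preceding row". It is not ArangoDB code; the function name and the sample timestamps are made up for illustration.

```python
def diffs_via_window_sum(timestamps):
    """Imitate WINDOW { preceding: 1 } with SUM and COUNT aggregates."""
    ordered = sorted(timestamps)
    diffs = []
    for idx, current in enumerate(ordered):
        # The window holds the previous value (if any) and the current one.
        window = ordered[max(0, idx - 1): idx + 1]
        s, cnt = sum(window), len(window)
        # s - current is the previous value, so current - (s - current)
        # is the gap to the predecessor; the first row has no predecessor.
        diffs.append(None if cnt == 1 else current - (s - current))
    return diffs


diffs = diffs_via_window_sum([10.0, 12.5, 11.0])
print(diffs)
```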

The WINDOW function doesn't give you direct access to the data in the sliding window but here is a rather clever workaround:
FOR doc IN collection
SORT doc.timecaptured
WINDOW { preceding: 1 }
AGGREGATE d = UNIQUE(KEEP(doc, "_key", "timecaptured"))
LET timediff = doc.timecaptured - d[0].timecaptured
RETURN MERGE(doc, {timediff})
The UNIQUE() function is available for window aggregations and can be used to get at the desired data (previous document). Aggregating full documents might be inefficient, so a projection should do, but remember that UNIQUE() will remove duplicate values. A document _key is unique within a collection, so we can add it to the projection to make sure that UNIQUE() doesn't remove anything.
The time difference is calculated by subtracting the previous document's timecaptured value from the current document's. In the case of the first record, d[0] is actually equal to the current document and the difference ends up being 0, which I think is sensible. You could also write d[-1].timecaptured - d[0].timecaptured to achieve the same. d[1].timecaptured - d[0].timecaptured, on the other hand, will give you the negated timestamp for the first record, because d[1] is null (no previous document) and evaluates to 0.
There is one risk: UNIQUE() may alter the order of the documents. You could use a subquery to sort by timecaptured again:
LET timediff = doc.timecaptured - (
FOR dd IN d SORT dd.timecaptured LIMIT 1 RETURN dd.timecaptured
)[0]
But it's not great for performance to use a subquery. Instead, you can use the aggregation variable d to access both documents and calculate the absolute value of the subtraction so that the order doesn't matter:
LET timediff = ABS(d[-1].timecaptured - d[0].timecaptured)
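The effect of taking the absolute value can be illustrated with a small Python sketch. It is an imitation of the idea rather than AQL: the window is deliberately reversed to mimic an aggregate that does not preserve order, and the document keys and timestamps are invented.

```python
def diffs_order_independent(docs):
    """Attach timediff to each doc without relying on window order."""
    ordered = sorted(docs, key=lambda d: d["timecaptured"])
    result = []
    for idx, doc in enumerate(ordered):
        window = ordered[max(0, idx - 1): idx + 1]
        # Simulate an aggregate that may hand back the window in any order.
        window = list(reversed(window))
        # With at most two entries, |last - first| is the gap either way;
        # for the first document both ends are the same doc, giving 0.
        timediff = abs(window[-1]["timecaptured"] - window[0]["timecaptured"])
        result.append({**doc, "timediff": timediff})
    return result


docs = [
    {"_key": "a", "timecaptured": 100.0},
    {"_key": "b", "timecaptured": 103.0},
    {"_key": "c", "timecaptured": 101.5},
]
with_diffs = diffs_order_independent(docs)
print(with_diffs)
```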

How do I filter a list of LiveData objects in Kotlin?

The title might be worded weirdly or unclearly, but I am creating a game using Android Studio with Kotlin as the language. I have a repository that retrieves the game's scores (and also stores them):
val readAllData: LiveData<List<ScoreDB>> = scoreDao.getScore()
Then in my leaderboard composable function I have:
val scoreList : LiveData<List<ScoreDB>> = vm.readAllData
I want to filter this list down to the top 10 scores. Once scoreList contains only the top ten scores, I was going to put it in a LazyColumn using something like this:
//TODO List highest scores from database in this lazycolumn
items(10){idx->
ScoreRow(idx)
}
I am stuck on how to filter scoreList down to only the top 10 scores and then display them in the LazyColumn. Thanks for the help.

You can use sortedByDescending() to sort the list and take() to get the first 10 elements. To display the result, create a new LiveData that holds the filtered list:
val topTenScoreList: LiveData<List<ScoreDB>> =
    Transformations.map(scoreList) {
        it.sortedByDescending { s -> s.score }.take(10)
    }
Then use topTenScoreList to generate the rows.

I want to filter out this list to display the top 10 scores.
OK, so you need
a list,
to sort the list highest to lowest, and
to take the first 10 of that.
So:
val topTenScores = scoreList // The LiveData
    .value // The current list (null until a value has been set)
    ?.sortedByDescending { it.score } // The list sorted by ScoreDB.score, highest first
    ?.take(10) // Keeping only the first 10
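The sort-descending-then-take pattern is not specific to Kotlin. As a rough illustration only, the Python sketch below does the same thing; the ScoreDB class here is a hypothetical stand-in with an assumed score field, and a missing list is treated as empty, much like a LiveData whose value has not been set yet.

```python
from dataclasses import dataclass


@dataclass
class ScoreDB:
    # Hypothetical stand-in for the question's entity; fields are assumed.
    player: str
    score: int


def top_scores(scores, n=10):
    """Return the n highest scores, highest first."""
    # Treat a missing list as empty instead of failing.
    return sorted(scores or [], key=lambda s: s.score, reverse=True)[:n]


all_scores = [ScoreDB(f"p{i}", (i * 7) % 13) for i in range(12)]
top_ten = top_scores(all_scores)
print([s.score for s in top_ten])
```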

Cosmos DB paginated query with custom order by clause

I want to do a select query in Cosmos DB that returns a maximum number of results (say 50) and then gives me the continuation token so I can continue the search where I left off.
Now let's say my query has several equality conditions in its where clause, e.g.
where prop1 = "a" and prop2 = "w" and prop3 = "g"
In the results that are returned, I want the records that satisfy prop1 = "a" to appear first, followed by the results that have prop2 = "w" followed by the ones with prop3 = "g".
Why do I need this? I could fetch all the data into my application and sort it there, but that would obviously mean pulling in far too much data. If I can't order it this way in Cosmos itself, the first page of results might contain no records with prop1 = "a" at all. I could keep requesting pages until I reach the ones with prop1 = "a" (I want to show those to the user as the first set of results), but with a huge dataset sitting in my Cosmos DB I might have to fetch something like 100 pages before I get the first such record.
How can I handle this scenario in Cosmos? Thanks!

So if I am understanding your question correctly, you want to accomplish this:
SELECT * FROM c
WHERE
c.prop1 = 'a'
AND
c.prop2 = 'b'
AND
c.prop3 = 'c'
ORDER BY
c.prop1, c.prop2, c.prop3
OFFSET 0 LIMIT 25
Luckily, you can now do this in Cosmos DB SQL, but there is a caveat: you have to set up a composite index on your collection to allow it.
So, for this collection, my composite index would look like this:
Now, if I wanted to change it to this:
SELECT * FROM c
WHERE
c.prop1 = 'a'
AND
c.prop2 = 'b'
AND
c.prop3 = 'c'
ORDER BY
c.prop1 DESC, c.prop2, c.prop3
OFFSET 0 LIMIT 25
I could add another composite index to cover that use-case. You can see in your settings it's an array of arrays so you can add as many combinations as you'd like.
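As a sketch of what that array of arrays could look like for the two ORDER BY clauses above, the Python snippet below builds a compositeIndexes fragment of an indexing policy and prints it as JSON. The property paths are taken from the example queries; the helper function name is made up, and the rest of the indexing policy is omitted.

```python
import json


def composite_index(*paths_and_orders):
    """Build one composite index: a list of {path, order} entries."""
    return [{"path": path, "order": order} for path, order in paths_and_orders]


indexing_policy_fragment = {
    "compositeIndexes": [
        # Matches ORDER BY c.prop1, c.prop2, c.prop3
        composite_index(
            ("/prop1", "ascending"),
            ("/prop2", "ascending"),
            ("/prop3", "ascending"),
        ),
        # Matches ORDER BY c.prop1 DESC, c.prop2, c.prop3
        composite_index(
            ("/prop1", "descending"),
            ("/prop2", "ascending"),
            ("/prop3", "ascending"),
        ),
    ]
}

print(json.dumps(indexing_policy_fragment, indent=2))
```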
This should get you to where you need to be if I understood your question correctly.

How to check if all element of a list is inside of a list of strings

I'm parsing a website to find available products and their sizes. There are 3 products loaded. A list named 'find_id_1' holds 3 elements; each element has a product name and its variant ids. I made 2 other lists, one named keywords and one named negative. The keywords list holds the keywords that my desired product title should have. If any element of the negative list is in the product title, then I don't want that product.
found_product = []
keywords = ['YEEZY','BOOST','700']
negative = ['INFANTS','KIDS']
find_id_1 = ['{"id":2069103968384,"title":
"\nYEEZY BOOST 700 V2","handle":**"yeezy-boost-700-v2-vanta-june-6"**,
[{"id":19434310238336,"parent_id":2069103968384,"available":true,
"sku":"193093889925","featured_image":null,"public_title":null,
"requires_shipping":true,"price":30000,"options"',
'{"id":2069103935616,"title":"\nYEEZY BOOST 700 V2 KIDS","handle":
"yeezy-boost-700-v2-vanta-kids-june-6",`
["10.5k"],"option1":"10.5k","option2":"",
`"option3":"","option4":""},{"id":19434309845120,"parent_id":2069103935616,
"available":false,"sku":"193093893625","featured_image":null,
"public_title":null,"requires_shipping":true,"price":18000,"options"',
'{"id":2069104001152,"title":"\nYEEZY BOOST 700 V2 INFANTS",
"handle":**"yeezy-boost-700-v2-vanta-infants-june-6"***,`
["4K"],"option1":"4k","option2":"",`
"option3":"","option4":""},{"id":161803398876,"parent_id":2069104001152,
"available":false,"sku":"193093893724",
"featured_image":null,"public_title":null,
"requires_shipping":true,"price":15000,"options"']
I've tried using a for loop to iterate through every element in find_id_1, then nested for loops that iterate through every element in keywords and negative, but I get the wrong product. Here's my code:
for product in find_id_1:
    for key in keywords:
        for neg in negative:
            if key in product:
                if neg not in product:
                    found_product = product
It prints the following:
'{"id":2069104001152,"title":"\nYEEZY BOOST 700 V2 INFANTS",
"handle":"yeezy-boost-700-v2-vanta-infants-june-6,`
["4K"],"option1":"4k","option2":"",`
"option3":"","option4":""},
{"id":161803398876,"parent_id":2069104001152,
"available":false,"sku":"193093893724",
"featured_image":null,"public_title":null,
"requires_shipping":true,"price":15000,"options"']
I'm trying to get it to return element 0 from find_id_1 because that's the only one that doesn't contain any of the elements from the list negative. Would using a for loop be the best and fastest way to iterate through my list? Thank you! Any help is welcome!

First of all, you shouldn't treat JSON data as a string. Parse it with the json library so that you can check just the title of the product. Otherwise, as the product list and the specification of each product get bigger, the time taken for the iteration increases.
To answer your question, you can simply do
for product in find_id_1:
    if any(key in product for key in keywords):
        if not any(neg in product for neg in negative):
            found_product.append(product)
This will get you the element as per your specification. However, I made some changes to your data, just to make it valid Python code:
found_product = []
keywords = ['YEEZY','BOOST','700']
negative = ['INFANTS','KIDS']
find_id_1 = [""""'{"id":2069103968384,"title":
"\nYEEZY BOOST 700 V2","handle":**"yeezy-boost-700-v2-vanta-june-6"**,
[{"id":19434310238336,"parent_id":2069103968384,"available":true,
"sku":"193093889925","featured_image":null,"public_title":null,
"requires_shipping":true,"price":30000,"options"'""",
""""'{"id":2069103935616,"title":"\nYEEZY BOOST 700 V2 KIDS","handle":
"yeezy-boost-700-v2-vanta-kids-june-6",`
["10.5k"],"option1":"10.5k","option2":"",
`"option3":"","option4":""},{"id":19434309845120,"parent_id":2069103935616,
"available":false,"sku":"193093893625","featured_image":null,
"public_title":null,"requires_shipping":true,"price":18000,"options"'""",
""""'{"id":2069104001152,"title":"\nYEEZY BOOST 700 V2 INFANTS",
"handle":**"yeezy-boost-700-v2-vanta-infants-june-6"***,`
["4K"],"option1":"4k","option2":"",`
"option3":"","option4":""},{"id":161803398876,"parent_id":2069104001152,
"available":false,"sku":"193093893724",
"featured_image":null,"public_title":null,
"requires_shipping":true,"price":15000,"options"'"""]
for product in find_id_1:
    if any(key in product for key in keywords):
        if not any(neg in product for neg in negative):
            found_product.append(product)
print(found_product)
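A minimal sketch of the "parse the JSON and check only the title" suggestion follows. The product strings here are simplified, hypothetical stand-ins (the strings in the question are truncated and not valid JSON), and the sketch assumes every keyword must appear in the title, as the question's title suggests, so it uses all() for the keywords rather than any().

```python
import json

keywords = ["YEEZY", "BOOST", "700"]
negative = ["INFANTS", "KIDS"]

# Hypothetical, simplified product records; real data would have more fields.
raw_products = [
    '{"id": 1, "title": "\\nYEEZY BOOST 700 V2"}',
    '{"id": 2, "title": "\\nYEEZY BOOST 700 V2 KIDS"}',
    '{"id": 3, "title": "\\nYEEZY BOOST 700 V2 INFANTS"}',
]


def title_matches(title):
    """True if the title has every keyword and none of the negative words."""
    words = title.upper().split()
    has_all_keywords = all(key in words for key in keywords)
    has_negative = any(neg in words for neg in negative)
    return has_all_keywords and not has_negative


products = [json.loads(raw) for raw in raw_products]
found = [p for p in products if title_matches(p["title"])]
print(found)
```

Matching on whole words of the title also avoids accidental hits elsewhere in the record, such as a keyword appearing inside the handle or a SKU.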

Does CouchDB support multiple range queries?

How are multiple range queries implemented in CouchDB? For a single range condition, startkey and endkey combination works fine, but the same thing is not working with a multiple range condition.
My View function is like this:
"function(doc){
if ((doc['couchrest-type'] == 'Item')
&& doc['loan_name']&& doc['loan_period']&&
doc['loan_amount'])
{ emit([doc['template_id'],
doc['loan_name'],doc['loan_period'],
doc['loan_amount']],null);}}"
I need to get the whole docs with loan_period > 5 and loan_amount > 30000. My startkey and endkey parameters are like this:
params = {:startkey =>["7446567e45dc5155353736cb3d6041c0",nil,5,30000],
:endkey=>["7446567e45dc5155353736cb3d6041c0",{},{},{}],:include_docs => true}
Here, I am not getting the desired result. I think my startkey and endkey params are wrong. Can anyone help me?

A CouchDB view is an ordered list of entries. Queries on a view return a contiguous slice of that list. As such, it's not possible to apply two inequality conditions.
Assuming that your loan_period is a discrete variable, this case would probably be best solved by emitting the loan_period first and then issuing one query for each period.
An alternative solution would be to use couchdb-lucene.

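You're using arrays as your keys. CouchDB compares arrays element by element, from left to right, until it finds two elements that are not equal.
E.g. to compare [1,'a',5] and [1,'c',0] it will compare 1 with 1, then 'a' with 'c', and will decide that [1,'a',5] is less than [1,'c',0] without ever looking at the third element.
This explains why your range key query fails: the view only emits documents whose loan_name is set, and nil sorts before every other value, so
["7446567e45dc5155353736cb3d6041c0",nil,5,30000] is less than every key emitted for that template_id, whatever its loan_period and loan_amount are. The 5 and 30000 in the startkey are never compared.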
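The element-by-element comparison can be imitated in a few lines of Python. This is a simplified model, not CouchDB's actual collation: it only orders null before numbers before strings, compares strings by code point, and uses a HIGH sentinel in place of {} in the endkey. The template id "t1" and the rows are invented sample data.

```python
HIGH = object()  # stands in for {} in the endkey, which sorts after strings


def collate(key):
    """Map a key array to a tuple using a simplified type order:
    null < numbers < strings < HIGH."""
    ranked = []
    for value in key:
        if value is None:
            ranked.append((0, 0))
        elif isinstance(value, (int, float)):
            ranked.append((1, value))
        elif isinstance(value, str):
            ranked.append((2, value))
        else:
            ranked.append((3, 0))
    return tuple(ranked)


# Keys shaped like [template_id, loan_name, loan_period, loan_amount].
rows = [
    ["t1", "car", 3, 10000],
    ["t1", "car", 6, 50000],
    ["t1", "home", 2, 90000],
    ["t1", "home", 12, 20000],
]
start = ["t1", None, 5, 30000]
end = ["t1", HIGH, HIGH, HIGH]

# A view query returns the contiguous slice between startkey and endkey.
selected = [r for r in sorted(rows, key=collate)
            if collate(start) <= collate(r) <= collate(end)]
# What the question actually wants.
wanted = [r for r in rows if r[2] > 5 and r[3] > 30000]

print(len(selected), wanted)
```

In this model every row falls inside the range, because the comparison is already decided at loan_name, while only one row satisfies both inequality conditions.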

Your emit statement looks a little strange to me. The purpose of emit is to produce a key (i.e. an index) and then the document's values that you are interested in.
for example:
emit( doc.index, [doc.name, doc.address, ....] );
You are generating an array for the index and no data for the view.
Also, CouchDB doesn't provide for an intersection of views, as that doesn't fit the map/reduce paradigm very well. So your needs boil down to trying to address the following:
Can I produce a unique index which I can then extract a particular range from? (using startkey & endkey)

Actually CouchDB allows views to have complex keys which are arrays of values as given in the question:
[template_id, loan_name, loan_period, loan_amount]
Have you tried
params = {:startkey =>["7446567e45dc5155353736cb3d6041c0",nil,5,30000],
:endkey=>["7446567e45dc5155353736cb3d6041c0",{}],:include_docs => true}
or perhaps
params = {:startkey =>["7446567e45dc5155353736cb3d6041c0","\u0000",5,30000],
:endkey=>["7446567e45dc5155353736cb3d6041c0","\u9999",{}],:include_docs => true}
