Update all values in a map using values from another map (Elixir)

How can I update all values in a map, using the values from corresponding keys of another map?
For example, I have the two maps below:
map = %{"December 2021" => 0, "November 2021" => 0, "October 2021" => 0}
map_2 = %{"December 2021" => 7, "November 2021" => 6}
And I want to update all values from map with the corresponding values from map_2, so in the end map = %{"December 2021" => 7, "November 2021" => 6, "October 2021" => 0}
I have tried:
Enum.map(map_2, fn {key, value} -> %{map | key => value} end)
I have also tried the code above with other functions like Map.update!/3 and similar ones, but they all return a list with maps for each iteration of Enum.map.
Does someone have any idea on how to do it?
Thanks in advance

I think you are looking for Enum.reduce/3. Reduce here means to reduce a collection to a single value. The collection to reduce here is map_2. The initial/single value you start with is the original collection map. The output is a modification of the initial value.
In this example, every key/value of map_2 is passed to a function that also expects an accumulator acc. acc is initialized with the initial value map (the second argument of Enum.reduce). In this case, every value of map_2 is added to map:
map = %{"December 2021" => 0, "November 2021" => 0, "October 2021" => 0}
map_2 = %{"December 2021" => 7, "November 2021" => 6}
Enum.reduce(map_2, map, fn {key, value}, acc ->
Map.put(acc, key, value)
end)
#> %{"December 2021" => 7, "November 2021" => 6, "October 2021" => 0}
If you want different behavior, for example skipping keys from map_2 that do not exist in map, change the reducer function accordingly.
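For readers who think more easily in Python, here is a rough analogue of the fold with a changed reducer. It is only a sketch, not Elixir code: the names base, updates and put_if_present are made up for illustration, and the "January 2022" entry is added purely to show a key that gets skipped.

```python
from functools import reduce

base = {"December 2021": 0, "November 2021": 0, "October 2021": 0}
# "January 2022" is a hypothetical extra key that does not exist in base.
updates = {"December 2021": 7, "November 2021": 6, "January 2022": 3}

def put_if_present(acc, item):
    """Reducer: overwrite a key only if the accumulator already has it."""
    key, value = item
    # Build a new dict instead of mutating acc, mirroring Elixir's immutable maps.
    return {**acc, key: value} if key in acc else acc

# Fold every (key, value) pair of updates into base, starting from base itself.
result = reduce(put_if_present, updates.items(), base)
print(result)
```

The shape is the same as the Enum.reduce/3 call above: a collection to walk, an initial accumulator, and a function that returns the next accumulator.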

The most idiomatic way to accomplish this is to use the Map.merge/2 function, which does exactly this:
Map.merge(map, map_2)
This will merge all key-value pairs from map_2 into map.
Of course, you could also write a custom version of it using Enum.reduce/3, as suggested by @zwippie, but this would be less efficient and more verbose.

Related

In the case of a tie, how do I return the largest and most frequent number in python?

I have a list of numbers. I created this frequency dictionary d:
from collections import Counter
mylist = [10, 8, 12, 7, 8, 8, 6, 4, 10, 12, 10, 12]
d = Counter(mylist)
print(d)
The output is like this:
Counter({10: 3, 8: 3, 12: 3, 7: 1, 6: 1, 4: 1})
I know I can use max(d, key=d.get) to get the value if there is no tie in frequency. If multiple items are maximal, the function returns the first one encountered. How can I return the largest number, in this case 12, instead of 10? Thank you for your help!
Define a lambda function that returns a tuple. Tuples are sorted by their first value, and then tie-broken by subsequent values. Like this:
max(d, key=lambda x:(d.get(x), x))
So for the two example values, the lambda will return (3, 10) and (3, 12). And of course, the second will be considered the max.
Further explanation:
When the max function is given a collection to find the max of, and a key, it will go over the values in the collection, passing each value into the key function. Whatever element from the collection results in the maximal output from the key function is considered the maximal value.
In this case, we're giving it a lambda function. Lambdas are just functions. Literally no difference in their usage, just a different syntax for defining them. The above example could have been written as:
def maxKey(x):
    return (d.get(x), x)
max(d, key=maxKey)
and it would behave the same way.
Using that function, we can see the return values that it would give for your sample data.
maxKey(10) #(3, 10)
maxKey(12) #(3, 12)
The main difference between the anonymous lambda above and using d.get is that the lambda returns a tuple with two values in it.
When max encounters a tie, it returns the first one it saw. But because we're now returning tuples, and because we know that the second value in each tuple is unique (because it comes from a dictionary), we can be sure that there won't be any duplicates. When max encounters a tuple it first compares the first value in the tuple against whatever it has already found to be the maximal value. If there's a tie there, it compares the next value. If there's a tie there, the next value, etc. So when max compares (3, 10) with (3, 12) it will see that (3, 12) is the maximal value. Since that is the value that resulted from 12 going into the key function, max will see 12 as the maximal value.
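To check the tie-breaking described above, here is a small self-contained run on the sample data from the question. The variable names first_seen and largest_tied are mine, chosen for illustration.

```python
from collections import Counter

mylist = [10, 8, 12, 7, 8, 8, 6, 4, 10, 12, 10, 12]
d = Counter(mylist)

# With a plain count as the key, ties go to the first key encountered.
first_seen = max(d, key=d.get)

# With a (count, number) tuple as the key, ties on count fall through
# to the number itself, so the largest tied number wins.
largest_tied = max(d, key=lambda x: (d[x], x))

print(first_seen, largest_tied)
print((3, 10) < (3, 12))  # tuples compare element by element
```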
You can get the max count (using d.most_common), and then get the max of all keys that have the max count:
max_cnt = d.most_common(1)[0][1]
grt_max = max(n for n, cnt in d.items() if cnt == max_cnt)
print(grt_max)
Output:
12

Arangodb: How to use MERGE_RECURSIVE() With an Array

In AQL, the MERGE_RECURSIVE function cannot take an array as an input. How then would I use said function with the result of a previous query, which is, of course, an array?
For example, if the output of my query result is:
[
{
"John": {"city": "Berlin"}
},
{
"John": {"country": "Germany"}
}
]
I want to MERGE_RECURSIVE(myResult) to return:
{
"John": {"city": "Berlin", "country": "Germany"}
}
I just need a way to use MERGE_RECURSIVE with my query's output array
If you have an array like x = [1, 2, 3] but the function you want to pass it to requires each element as a separate argument, so SOME_FUNC(1, 2, 3) instead of SOME_FUNC( [1, 2, 3] ), then there's the APPLY() function to spread the array:
APPLY("SOME_FUNC", [1, 2, 3] )
This is essentially like the following call:
SOME_FUNC(x[0], x[1], x[2])
... but APPLY() spares you from typing all that, and it works with a variable number of elements in the array. So the solution in your case is:
RETURN APPLY("MERGE_RECURSIVE", myResult)
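If it helps to see the idea outside AQL, spreading an array into separate arguments is what Python's * unpacking does. The sketch below is only an analogy: merge_recursive is a hypothetical Python stand-in written for this example, and it is not claimed to match every detail of ArangoDB's MERGE_RECURSIVE.

```python
from functools import reduce

def merge_recursive(*docs):
    """Hypothetical stand-in: deep-merge any number of dicts, later ones winning."""
    def merge_two(left, right):
        merged = dict(left)
        for key, value in right.items():
            if isinstance(merged.get(key), dict) and isinstance(value, dict):
                # Both sides hold objects under this key: merge them recursively.
                merged[key] = merge_two(merged[key], value)
            else:
                merged[key] = value
        return merged
    return reduce(merge_two, docs, {})

my_result = [
    {"John": {"city": "Berlin"}},
    {"John": {"country": "Germany"}},
]

# The * spreads the list into separate arguments, much like APPLY() does in AQL.
merged = merge_recursive(*my_result)
print(merged)
```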

What is the generally preferred way to rank search results when search term includes delimiter

I have two documents 4349 and P 43.
A search string of P 43 returns both in order
4349
P 43
My indexing def is like below
@AnalyzerDefs({
@AnalyzerDef(
name = "ngram",
charFilters = {
@CharFilterDef(factory = HTMLStripCharFilterFactory.class)
},
tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
filters = {
@TokenFilterDef(factory = StandardFilterFactory.class),
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = StopFilterFactory.class, params = {
@Parameter(name = "words", value = "/org/apache/lucene/analysis/snowball/english_stop.txt")}),
@TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = {
@Parameter(name = "minGramSize", value = "1"),
@Parameter(name = "maxGramSize", value = "15")
})
}
),
My search analyzer definition is the same but without the n-gram filter, and I have turned off lengthnorm.
Q: How can I rank the second document higher, or is the returned ranking fine as it is?
Q: Put another way, how do I take the order of tokens in the input query into account, or is it already taken into account?
I was able to use querybuilder.phrase().withSlop(10)...sentence('P 43'), but now it doesn't return the first result anymore, only the second one.
You need to understand how your Analyzer definition will break up your terms.
Using such an EdgeNGramFilterFactory, your 4349 will be converted into a sequence of tokens like this:
4349 -> [4, 43, 434, 4349]
While "P 43" will be similarly split, but only after separating the "P" from the "43" as you also have a StandardTokenizerFactory:
P 43 -> [p, 4, 43]
So those tokens get inserted in your inverted index.
At query time, the sentence "P 43" will be split using the same approach:
P 43 -> [p, 4, 43]
Both your documents will contain all of 4 and 43, like your query is asking for. So both documents match.
Now if you repeat the test but disable the N-Gram based filter we will have a different index:
4349 -> [4349]
P 43 -> [p, 43]
Your query will be:
P 43 -> [p, 43]
Only the second document matches any of the two terms p OR 43, so only the second document will be considered a match.
I would suggest playing with the helper class org.hibernate.search.util.AnalyzerUtils, which is what I used myself to confirm which tokens are going to be produced for each input / analyzer configuration.
Analyzer analyzer = searchFactory.getAnalyzer( "ngram" );
System.out.println( AnalyzerUtils.tokenizedTermValues( analyzer, "description", "4349" ) );
System.out.println( AnalyzerUtils.tokenizedTermValues( analyzer, "description", "P 43" ) );
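If you just want to get a feel for the token lists without setting up Hibernate Search, the rough Python sketch below mimics the relevant steps. It is a simplification under stated assumptions: whitespace splitting stands in for StandardTokenizerFactory, stop words and HTML stripping are ignored, and the helper names edge_ngrams, analyze and matches are invented for this example.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading substrings of token, from min_gram to max_gram characters."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

def analyze(text, ngram=True):
    """Lowercase, split on whitespace, and optionally expand into edge n-grams."""
    tokens = text.lower().split()
    if not ngram:
        return tokens
    return [gram for token in tokens for gram in edge_ngrams(token)]

def matches(query_tokens, doc_tokens):
    """OR semantics: a document matches if it shares at least one token with the query."""
    return bool(set(query_tokens) & set(doc_tokens))

# Index built with the n-gram filter: both documents share "43" with the query.
print(analyze("4349"), analyze("P 43"))
print(matches(analyze("P 43"), analyze("4349")))

# Index built without the n-gram filter: only "P 43" shares a token with the query.
print(analyze("4349", ngram=False), analyze("P 43", ngram=False))
print(matches(analyze("P 43", ngram=False), analyze("4349", ngram=False)))
```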

mapPartitions returns empty array

I have the following RDD which has 4 partitions:-
val rdd=sc.parallelize(1 to 20,4)
Now I try to call mapPartitions on this:-
scala> rdd.mapPartitions(x=> { println(x.size); x }).collect
5
5
5
5
res98: Array[Int] = Array()
Why does it return an empty array? The anonymous function simply returns the same iterator it received, so how can the result be an empty array? Interestingly, if I remove the println statement, it does return a non-empty array:-
scala> rdd.mapPartitions(x=> { x }).collect
res101: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
This I don't understand. How can the presence of println (which simply prints the size of the iterator) affect the final outcome of the function?
That's because x is a TraversableOnce: you traversed it by calling size and then returned it, by which point it was already empty.
You could work around it a number of ways, but here is one:
rdd.mapPartitions(x=> {
val list = x.toList;
println(list.size);
list.toIterator
}).collect
To understand what is going on we have to take a look at the signature of the function you pass to mapPartitions:
(Iterator[T]) ⇒ Iterator[U]
So what is an Iterator? If you take a look at the Iterator documentation you'll see it is a trait which extends TraversableOnce:
trait Iterator[+A] extends TraversableOnce[A]
The above should give you a hint about what happens in your case. Iterators provide two methods, hasNext and next. To get the size of an Iterator you have to iterate over it. After that, hasNext returns false and you get an empty Iterator as the result.
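The same single-pass behavior is easy to reproduce outside Spark. The Python sketch below is only an analogy (Python iterators rather than Scala's Iterator), and the function names count_then_return and materialize_then_return are made up for illustration.

```python
def count_then_return(partition):
    """Count by walking the iterator, then hand back the same (now drained) iterator."""
    size = sum(1 for _ in partition)  # this consumes every element
    return size, partition

def materialize_then_return(partition):
    """Copy the elements into a list first, then hand back a fresh iterator over it."""
    items = list(partition)
    return len(items), iter(items)

size1, it1 = count_then_return(iter(range(1, 6)))
size2, it2 = materialize_then_return(iter(range(1, 6)))

drained = list(it1)  # nothing left to read
kept = list(it2)     # all elements still available

print(size1, drained)
print(size2, kept)
```

The second function corresponds to the toList workaround shown earlier: it trades memory for the ability to read the partition more than once.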

Underscore GroupBy Sort

I have a question about programming in a functional style.
I use the underscore.js library.
Consider this use case: I have an array of labels with repetitions. I need to count how many times each label occurs in the array and sort the labels by number of occurrences.
To count the labels I can use countBy:
_.countBy([1, 2, 3, 4, 5], function(num) {
return num % 2 == 0 ? 'even': 'odd';
});
=> {odd: 3, even: 2}
But the result is a hash, which has no meaningful order, so it cannot be sorted. So I need to convert the hash to an array, sort it, and convert it back to a hash.
I am pretty sure there is an elegant way to do so, however I am not aware of it.
I would appreciate any help.
sort it and convert backward to hash.
No, that would lose the order again.
You could use
var occurences = _.countBy([1, 2, 3, 4, 5], function(num) {
return num % 2 == 0 ? 'even': 'odd';
});
// {odd: 3, even: 2}
var order = _.sortBy(_.keys(occurences), function(k){return occurences[k];})
// ["even", "odd"]
or maybe just
_.sortBy(_.pairs(occurences), 1)
// [["even", 2], ["odd", 3]]
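For comparison, the same count-then-sort pipeline can be sketched in Python, where the sorted result also stays a list of pairs rather than going back into a hash. This is an analogue for illustration, not underscore.js code, and the variable names are my own.

```python
from collections import Counter

# Count labels, analogous to _.countBy with an even/odd classifier.
occurrences = Counter("even" if n % 2 == 0 else "odd" for n in [1, 2, 3, 4, 5])

# Sort the (label, count) pairs by count, analogous to _.sortBy(_.pairs(...), 1).
ascending = sorted(occurrences.items(), key=lambda pair: pair[1])

# Counter also offers the descending order directly.
descending = occurrences.most_common()

print(ascending)
print(descending)
```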
