When I run this formula in Excel, it tells me that there is only 1 argument for the IF statement. I am not sure why it says that when I have 3 arguments. Within the OR statement I have 2 different AND statements. It works just fine if I get rid of the second AND statement. Did I misplace a parenthesis somewhere? I am not sure what is wrong. Any help would be greatly appreciated. Thanks!
=IF(OR(ARRAYFORMULA(SUM(COUNTIF(B19:O19,{"I","Ip","Ia","It","Ih","A","Aa","Ap","At","Ah","X","R","Rt","Rx","Rp","Rh","K","Kt","E","Et","AL","HL","TV*","FFSL","ADM*"})))=10, AND(ARRAYFORMULA(SUM(COUNTIF(B19:O19,{"R-10","Rx-10*","Rp-10","Rt-10*","Rh-10","I-10","Ia-10","Ip-10","It-10","Ih-10","X-10*","A-10*","At-10"})))=4, ARRAYFORMULA(SUM(COUNTIF(B19:O19,{"I","Ip","Ia","It","Ih","A","Aa","Ap","At","Ah","X","R","Rt","Rx","Rp","Rh","K","Kt","E","Et","AL","HL","TV*","FFSL","ADM*"})))=5),AND(ARRAYFORMULA(SUM(COUNTIF(B19:O19,{"HL-9","X-9","N-9","E-9","J-9","Jh-9","Nh-9","Eh-9"})))=8,ARRAYFORMULA(SUM(COUNTIF(B19:O19,{"I","Ip","Ia","It","Ih","A","Aa","Ap","At","Ah","X","R","Rt","Rx","Rp","Rh","K","Kt","E","Et","AL","HL","TV*","FFSL","ADM*"})))=1) ,"80 Hours","Error"))
This question makes me think "If only there was an online Excel Formula Beautifier".
Oh look, there is.
If you copy-and-paste it into the beautifier you get the code below.
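You can now see that your arguments "80 Hours" and "Error" sit inside the OR function, not the IF function, so IF receives only one argument. To fix it, move the closing parenthesis of OR so that it comes directly after the second AND(...), before the comma that precedes "80 Hours".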
=IF(
    OR(
        ARRAYFORMULA(
            SUM(
                COUNTIF(
                    B19:O19,
                    {"I","Ip","Ia","It","Ih","A","Aa","Ap","At","Ah","X","R","Rt","Rx","Rp","Rh","K","Kt","E","Et","AL","HL","TV*","FFSL","ADM*"}
                )
            )
        ) = 10,
        AND(
            ARRAYFORMULA(
                SUM(
                    COUNTIF(
                        B19:O19,
                        {"R-10","Rx-10*","Rp-10","Rt-10*","Rh-10","I-10","Ia-10","Ip-10","It-10","Ih-10","X-10*","A-10*","At-10"}
                    )
                )
            ) = 4,
            ARRAYFORMULA(
                SUM(
                    COUNTIF(
                        B19:O19,
                        {"I","Ip","Ia","It","Ih","A","Aa","Ap","At","Ah","X","R","Rt","Rx","Rp","Rh","K","Kt","E","Et","AL","HL","TV*","FFSL","ADM*"}
                    )
                )
            ) = 5
        ),
        AND(
            ARRAYFORMULA(
                SUM(
                    COUNTIF(
                        B19:O19,
                        {"HL-9","X-9","N-9","E-9","J-9","Jh-9","Nh-9","Eh-9"}
                    )
                )
            ) = 8,
            ARRAYFORMULA(
                SUM(
                    COUNTIF(
                        B19:O19,
                        {"I","Ip","Ia","It","Ih","A","Aa","Ap","At","Ah","X","R","Rt","Rx","Rp","Rh","K","Kt","E","Et","AL","HL","TV*","FFSL","ADM*"}
                    )
                )
            ) = 1
        ),
        "80 Hours",
        "Error"
    )
)
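If no beautifier is at hand, the same diagnosis can be roughly automated. The following is a minimal Python sketch, not a full formula parser: the function name count_arguments and the shortened sample formulas are made up for illustration, and it assumes balanced parentheses and no zero-argument calls. It counts the top-level commas inside each function call, ignoring commas inside quoted strings and {...} array literals.

```python
def count_arguments(formula):
    """Return (function_name, argument_count) pairs in the order the calls close."""
    results = []
    stack = []          # one [name, commas_seen] entry per open parenthesis
    name = ""           # identifier characters read since the last separator
    in_string = False   # True while inside a "quoted" literal
    brace_depth = 0     # > 0 while inside a {...} array literal

    for ch in formula:
        if in_string:
            in_string = ch != '"'
            continue
        if ch == '"':
            in_string, name = True, ""
        elif ch == "{":
            brace_depth += 1
            name = ""
        elif ch == "}":
            brace_depth -= 1
        elif ch == "(":
            stack.append([name, 0])
            name = ""
        elif ch == ")":
            func, commas = stack.pop()
            if func:  # skip plain grouping parentheses
                results.append((func, commas + 1))
            name = ""
        elif ch == ",":
            if brace_depth == 0 and stack:
                stack[-1][1] += 1
            name = ""
        elif ch.isalnum() or ch in "._":
            name += ch
        else:
            name = ""
    return results


# Shortened stand-ins for the formula in the question (hypothetical cells).
broken = '=IF(OR(A1=1,AND(B1=2,C1=3),"80 Hours","Error"))'
fixed = '=IF(OR(A1=1,AND(B1=2,C1=3)),"80 Hours","Error")'

print(count_arguments(broken))  # IF ends up with 1 argument, OR with 4
print(count_arguments(fixed))   # IF ends up with 3 arguments, OR with 2
```

In the broken variant the two strings are counted as arguments of OR, which matches the "only 1 argument for IF" error message.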
I do this operation below to this dataset:
import pandas as pd

d = pd.DataFrame({'id': ["007", "007", "001", "001", "008", "008", "007", "007", "009", "007", "000", "001", "009", "009", "009"],
                  'id_2': ["b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
                           "c", "c", "c", "c"],
                  'userid': ["us1", "us2", "us1", "us2", "us4", "us4", "us5", "us1", "us2", "us1", "us2", "us4", "us1", "us2", "us1"],
                  "timestamp_date": [1589175687010, 1589188715313, 1589187142475, 1589187315368, 1589187155446, 1589187301028, 1589189765216, 1589190375088,
                                     1589364060781, 1589421612029, 1589363453544, 1589364557808, 1589354347548, 1589356096273, 1589273208050]})
df = d.sort_values('timestamp_date')
# For each (id_2, id) group, pair every row with the next one in time order.
df.groupby(['id_2', 'id'], sort=False).apply(
    lambda x: list(zip(x['userid'][:-1], x['userid'][1:],
                       x['timestamp_date'][:-1], x['timestamp_date'][1:]))).reset_index(name='duplicates')
But the problem is that this operation takes a very long time. To give an idea, for 4 million rows it takes around 17 minutes.
I would like to know if there is a more efficient way to do this. Based on my tests and what I've read online, I think the zip is the problem, but I couldn't find another way of achieving the same result :(
Thanks
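One possible direction, offered as a sketch rather than a measured fix: replace the per-group Python lambda with a grouped shift, which pairs every row with the next row of its (id_2, id) group in a vectorized way. The function name consecutive_pairs, the next_userid and next_timestamp columns, and the small sample frame are assumptions made for illustration. The result is one row per pair (long format) rather than one list per group, and groups with a single row produce no rows instead of an empty list.

```python
import pandas as pd


def consecutive_pairs(frame: pd.DataFrame) -> pd.DataFrame:
    """Pair each row with the next row of the same (id_2, id) group, ordered by time."""
    df = frame.sort_values("timestamp_date").copy()
    grouped = df.groupby(["id_2", "id"], sort=False)
    # shift(-1) pulls the following row's value up, without crossing group boundaries.
    df["next_userid"] = grouped["userid"].shift(-1)
    df["next_timestamp"] = grouped["timestamp_date"].shift(-1)
    # The last row of every group has no successor, so it is dropped.
    pairs = df.dropna(subset=["next_userid"]).copy()
    # The shift introduced NaN, which turned the column into floats; restore integers.
    pairs["next_timestamp"] = pairs["next_timestamp"].astype("int64")
    return pairs[["id_2", "id", "userid", "next_userid", "timestamp_date", "next_timestamp"]]


# A three-row sample taken from the data in the question.
sample = pd.DataFrame({
    "id": ["007", "007", "001"],
    "id_2": ["b", "b", "b"],
    "userid": ["us1", "us2", "us1"],
    "timestamp_date": [1589175687010, 1589188715313, 1589187142475],
})
pairs = consecutive_pairs(sample)
print(pairs)
```

If the per-group list layout is still needed, the pair rows can be collected per group afterwards, though that step reintroduces some Python-level work.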
Using SQL syntax, I can add a new column with a subquery like this:
import spark.sqlContext.implicits._
List(
("a", "1", "2"),
("b", "1", "3"),
("c", "1", "4"),
("d", "1", "5")
).toDF("name", "start", "end")
.createOrReplaceTempView("base")
List(
("a", "1", "2"),
("b", "2", "3"),
("c", "3", "4"),
("d", "4", "5"),
("f", "5", "6")
).toDF("name", "number", "_count")
.createOrReplaceTempView("col")
spark.sql(
"""
|select a.name,
| (select Max(_count) from col b where b.number == a.end) - (select Max(_count) from col b where b.number == a.start) as result
|from base a
|""".stripMargin)
.show(false)
How can I do that with the DataFrame API?
I found the syntax:
import spark.sqlContext.implicits._
val b = List(
("a", "1", "2"),
("b", "1", "3"),
("c", "1", "4"),
("d", "1", "5")
).toDF("name", "start", "end")
List(
("a", "1", "2"),
("b", "2", "3"),
("c", "3", "4"),
("d", "4", "5"),
("f", "5", "6")
).toDF("name", "number", "_count")
.createOrReplaceTempView("ref_table")
b.withColumn("result", expr("((select max(_count) from ref_table r where r.number = end) - (select max(_count) from ref_table r where r.number = start)) as result")).show(false)
I think max is not required; we can use the approach below:
val base = List(
("a", "1", "2"),
("b", "1", "3"),
("c", "1", "4"),
("d", "1", "5")
).toDF("name", "start", "end")
val col = List(
("a", "1", "2"),
("b", "2", "3"),
("c", "3", "4"),
("d", "4", "5"),
("f", "5", "6")
).toDF("name", "number", "_count")
val df = base.join(col, col("number") === base("end")).select(base("name"), col("_count"))
val df1 = base.join(col, col("number") === base("start")).select(base("name").alias("nameDf"), col("_count").alias("count"))
df.join(df1, df("name") === df1("nameDf")).select($"name", ($"_count"- $"count").alias("result")).show(false)
The recursive version of the LCS code looks something like this (m and n are the lengths of the strings X and Y respectively):
int lcs( char[] X, char[] Y, int m, int n ) {
    if (m == 0 || n == 0)
        return 0;
    if (X[m-1] == Y[n-1])
        return 1 + lcs(X, Y, m-1, n-1);
    else
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));
}
I notice that this algo deletes characters of the strings from the end and creates various substrings of the original two strings and then tries to look for a match.
What I don't understand is, since it considers only certain substrings and not all possible subsequences, how is this algorithm guaranteed to provide the correct result? Is there a logical/mathematical/intuitive proof behind this?
One of the steps in dynamic programming is guessing (check out the MIT lectures on YouTube).
You keep one pointer into each string, each pointing at the current character.
The guess here is whether to include that character or not.
If both characters are the same, you include the character and advance both pointers.
Otherwise, you drop a character from either string X or string Y, hence the max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n)).
Its recurrence is given as:
lcs(m, n) = 0 if m == 0 or n == 0
lcs(m, n) = 1 + lcs(m-1, n-1) if X[m-1] == Y[n-1]
lcs(m, n) = max(lcs(m, n-1), lcs(m-1, n)) otherwise
Since you are guessing all the possible outcomes, the above recurrence gives the correct answer.
for more reference: https://courses.csail.mit.edu/6.006/fall10/handouts/recitation11-19.pdf
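A quick way to build confidence in the claim is to compare the recursion against an exhaustive search over every subsequence. The sketch below is a Python transcription of the recursive code from the question plus a brute-force checker; the helper names subsequences and lcs_brute_force are made up here, and the brute force is only practical for very short strings. Agreement on a few inputs is evidence, not a proof.

```python
from itertools import combinations


def lcs(x, y, m, n):
    """Direct transcription of the recursive version from the question."""
    if m == 0 or n == 0:
        return 0
    if x[m - 1] == y[n - 1]:
        return 1 + lcs(x, y, m - 1, n - 1)
    return max(lcs(x, y, m, n - 1), lcs(x, y, m - 1, n))


def subsequences(s):
    """Every subsequence of s, including the empty one (exponential in len(s))."""
    return {"".join(chars) for r in range(len(s) + 1) for chars in combinations(s, r)}


def lcs_brute_force(x, y):
    """Length of the longest string that is a subsequence of both x and y."""
    return max(len(s) for s in subsequences(x) & subsequences(y))


x, y = "abce", "bghci"
recursive_result = lcs(x, y, len(x), len(y))
brute_result = lcs_brute_force(x, y)
print(recursive_result, brute_result)  # both are 2 (the subsequence "bc")
```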
It calculates either lcs( X, Y, m-1, n-1) when the last chars are the same.
OR
It calculates both lcs( X, Y, m-1,n ) AND lcs( X, Y, m, n-1) when the characters are different.
So it covers all possible solutions.
If either length is 0, then the lcs is 0.
If the last (or first) items are the same, then the lcs is 1 + lcs(X,Y,m-1,n-1)
This clause reduces the problem by adding the common element to the lcs, and finding the smaller lcs of the 2 shorter strings.
Finally, if the last items differ, the longest common subsequence is the longer of
lcs(X, Y, m-1, n)
or
lcs(X, Y, m, n-1)
So when it is asked to calculate lcs( "abce", "bghci", 4, 5);
if( m(4) == 0 || n(5) == 0 ) // evaluates to false
if( X[3]('e') == Y[4] ) // evaluates to false
return max( lcs( "abc", "bghci", 3, 5 ), // These two recursive calls
lcs( "abce", "bghc", 4, 4 ) ); // cover the whole problem space
e.g.
(A) lcs( "abce", "bghci" ) => (B) lcs( "abc", "bghci" )
(C) lcs( "abce", "bghc" )
(B) lcs( "abc", "bghci" ) => (D) lcs( "ab", "bghci" )
(E) lcs( "abc", "bghc" )
(C) lcs( "abce", "bghc" ) => (F) lcs( "abc", "bghc" )
(G) lcs( "abce", "bgh" )
(D) lcs( "ab" , "bghci") => (H) lcs( "a" , "bghci")
(I) lcs( "ab" , "bghc" )
(E) lcs( "abc", "bghc" ) => 1 + (J) lcs( "ab" , "bgh")
(F) lcs( "abc", "bghc" ) => 1 + (K) lcs( "ab" , "bgh")
(G) lcs( "abce", "bgh" ) => (L) lcs( "abc", "bgh" )
(M) lcs( "abce", "bg" )
(H) lcs( "a", "bghci") => lcs( "", "bghci" ) (0)
(N) lcs( "a", "bghc" )
(I) lcs( "ab", "bghc") => (O) lcs( "a", "bghc" )
(P) lcs( "ab", "bgh" )
(J) lcs( "ab", "bgh" ) => (Q) lcs( "a", "bgh" )
(R) lcs( "ab", "bg" )
(K) lcs( "ab", "bgh" ) => (S) lcs( "a", "bgh" )
(T) lcs( "ab", "bg" )
(L) lcs( "abc", "bgh" ) => (U) lcs( "ab", "bg" )
(V) lcs( "abc", "bg" )
(M) lcs( "abce", "bg" ) => (W) lcs( "abc", "bg" )
(X) lcs( "abce","b" )
(N) lcs( "a", "bghc") => lcs( "", "bghc" ) (0)
(Y) lcs( "a", "bgh" )
(O) lcs( "a", "bghc" ) => lcs( "", "bghc" ) (0)
lcs( "a", "bgh" ) as (Q)
(P) lcs( "ab", "bgh" ) => (Z) lcs( "a", "bgh" )
(AA) lcs( "ab", "bg" )
(Q) lcs( "a", "bgh" ) => lcs( "", "bgh") (0)
(AB) lcs( "a", "bg")
(R) lcs( "ab", "bg" ) => (AC) lcs( "a", "bg")
(AD) lcs( "ab", "b" )
(S) lcs( "a", "bgh" ) => lcs( "", "bgh") (0)
(AE) lcs( "a", "bg" )
(T) lcs( "ab", "bg" ) => (AF) lcs( "a", "bg" )
(AG) lcs( "ab", "b" )
(U) lcs( "ab", "bg" ) => (AH) lcs( "a", "bg" )
(AI) lcs( "ab", "b" )
(V) lcs( "abc","bg" ) => (AJ) lcs( "ab", "bg" )
(AK) lcs( "abc", "b" )
(W) lcs( "abc","bg" ) => (AL) lcs( "ab", "bg" )
(AM) lcs( "abc", "b" )
(X) lcs( "abce", "b") => (AN) lcs( "abc", "b" )
lcs( "abce", "" ) (0)
(Y) lcs( "a", "bgh" ) as (Q)
(Z) lcs( "a", "bgh") => lcs( "", "bgh" ) (0)
(AP) lcs( "a", "bg" )
(AA)lcs( "ab", "bg") => (AQ) lcs( "a", "bg" )
(AR) lcs( "ab", "b" )
(AB)lcs( "a", "bg") => lcs( "", "bg" ) (0)
(AS) lcs( "a", "b" )
(AC)lcs( "a", "bg") => lcs( "", "bg") (0)
(AT) lcs( "a", "b")
(AD)lcs( "ab", "b") => 1 + lcs( "a", "" ) (1)
(AE)lcs( "a", "bg") => lcs( "", "bg") (0)
(AU) lcs( "a", "b" )
(AF)lcs( "a", "bg") => lcs( "", "bg") (0)
(AV) lcs( "a", "b")
(AG)lcs( "ab", "b" ) => 1 + lcs( "a", "" ) (1)
(AH)lcs( "a", "bg") => lcs( "", "bg") (0)
(AW) lcs( "a", "b" )
(AI)lcs( "ab", "b") => 1 + lcs( "a", "" ) (1)
(AJ)lcs( "ab", "bg") => (AX) lcs( "a", "bg")
lcs( "ab", "b") as (AD)
(AK)lcs( "abc", "b") => (AY) lcs( "ab", "b")
lcs( "abc", "" ) (0)
(AL)lcs( "ab", "bg") => (AZ) lcs( "a", "bg")
(BA) lcs( "ab", "b")
(AM)lcs( "abc", "b") => (BB) lcs( "ab", "b")
lcs( "abc", "" ) (0)
(AN)lcs( "abc", "b") => (BC) lcs( "ab", "b")
lcs( "abc", "" ) (0)
(AP)lcs( "a", "bg") => lcs( "", "bg") (0)
(BE) lcs( "a", "b")
(AQ)lcs( "a", "bg") => lcs( "", "bg") (0)
(BF) lcs( "a", "b")
(AR)lcs( "ab", "b") => 1 + lcs( "a", "" ) (1)
(AS)lcs( "a", "b") => lcs( "", "b") (0)
lcs( "a", "" ) (0)
(AT)lcs( "a", "b" ) as (AS)
(AU)lcs( "a", "b" ) as (AS)
(AV)lcs( "a", "b") as (AS)
(AW)lcs( "a", "b") as (AS)
(AX)lcs( "a", "bg") => lcs( "", "bg") (0)
(BG) lcs( "a", "b")
(AY)lcs( "ab", "b") => 1 + lcs( "a", "" ) (1)
(AZ)lcs( "a", "bg") => lcs( "", "bg") (0)
(BH) lcs( "a", "b")
(BA)lcs( "ab", "b") => 1 + lcs( "a", "" ) (1)
(BB)lcs( "ab", "b") => 1 + lcs( "a", "" ) (1)
(BC)lcs( "ab", "b") => 1 + lcs( "a", "" ) (1)
(BE)lcs( "a", "b") as (AS)
(BF)lcs( "a", "b") as (AS)
(BG)lcs( "a", "b") as (AS)
(BH)lcs( "a", "b") as (AS)
So the lcs of 2 comes from the following call stack....
lcs( "abce", "bghci") // (A)
lcs( "abc", "bghci" ) // (B)
lcs( "abc", "bghc" ) // (E)
1 + lcs( "ab", "bgh" ) // (J)
lcs( "ab", "bg" ) // (R)
lcs( "ab", "b" ) // (AD)
1 + lcs( "a", "" )
Which gives an answer of 2. As you can see, it tests a lot more combinations.
What I don't understand is, since it considers only certain substrings
and not all possible subsequences,
The trick in the algorithm is to notice that if the first (or last) character is the same, then the longest common subsequence is 1 + the lcs of the 2 shorter strings. Something like proof by induction or proof by contradiction can help you prove that this division of work is necessary and sufficient to cover all the possible alternatives.
As can be seen in my build out, it considers many alternatives multiple times in calculating the lcs, so it is not a very efficient algorithm.
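The repeated subproblems in the build-out (for example, lcs( "a", "b" ) appears many times) are what memoization removes. Here is a minimal Python sketch of the same recurrence with a cache, so each (m, n) pair is solved once; the name lcs_memo is made up for illustration, and very long inputs could still hit Python's recursion limit.

```python
from functools import lru_cache


def lcs_memo(x, y):
    """Length of the longest common subsequence, caching each (m, n) subproblem."""

    @lru_cache(maxsize=None)
    def solve(m, n):
        if m == 0 or n == 0:
            return 0
        if x[m - 1] == y[n - 1]:
            return 1 + solve(m - 1, n - 1)
        return max(solve(m, n - 1), solve(m - 1, n))

    return solve(len(x), len(y))


print(lcs_memo("abce", "bghci"))  # 2
```

With the cache there are at most (len(x) + 1) * (len(y) + 1) distinct calls, instead of the exponential number of calls in the plain recursion.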
Given a list of maps like this:
def listOfMaps =
[
["car": "A", "color": "A", "motor": "A", "anything": "meh"],
["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]
How am I supposed to find duplicates by car, color and motor? If more than one map has the same car, color and motor values, it should return true. In this case it should return true, since the first and second maps have the same car, color and motor values. The values can be anything, as long as they are the same.
Groovy has a handy Collection.unique(boolean,closure) method that allows you to create a new list by removing the duplicates from an input list based on the comparator defined in a closure. In your case, you could define a closure that firstly compares car field, then color, and lastly - motor. Any element that duplicates values for all these fields will be filtered out.
Consider the following example:
def listOfMaps = [
["car": "A", "color": "A", "motor": "A", "anything": "meh"],
["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]
// false parameter below means that the input list is not modified
def filtered = listOfMaps.unique(false) { a, b ->
a.car <=> b.car ?:
a.color <=> b.color ?:
a.motor <=> b.motor
}
println filtered
boolean hasDuplicates = listOfMaps.size() > filtered.size()
assert hasDuplicates
Output:
[[car:A, color:A, motor:A, anything:meh], [car:A, color:A, motor:B, anything:Anything], [car:A, color:B, motor:A, anything:Anything]]
I'm not sure I have understood the question correctly, but I have come up with the following code snippet:
def listOfMaps = [
["car": "A", "color": "A", "motor": "A", "anything": "meh"],
["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]
static def findDuplicatesByKeys(List<Map<String, String>> maps, List<String> keys) {
Map<String, List<Map<String, String>>> aggregationKeyToMaps = [:].withDefault { key -> []}
maps.each { singleMap ->
def aggregationKey = keys.collect { key -> singleMap[key] }.join('-')
aggregationKeyToMaps.get(aggregationKey).add(singleMap)
}
aggregationKeyToMaps
}
findDuplicatesByKeys(listOfMaps, ['car', 'color', 'motor'])
Basically, it iterates over the list of maps and groups them by the values of the provided keys. The result is a map of lists of maps, something similar to:
def aggregatedMaps = [
"A-A-A": [
["car": "A", "color": "A", "motor": "A", "anything": "meh"],
["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"]
],
"A-A-B": [
["car": "A", "color": "A", "motor": "B", "anything": "Anything"]
],
"A-B-A": [
["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]
]
You can grab .values() for example and apply needed removals (you haven't specified which duplicate should be removed) and finally flatten the list. Hope that's helpful.
You can group the maps by the appropriate fields and then check whether at least one group has more than one element:
boolean result = listOfMaps
.groupBy { [car: it.car, color: it.color, motor: it.motor] }
.any { it.value.size() > 1 }
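The group-then-check idea is not specific to Groovy. As a rough illustration, here is the same logic sketched in Python: count how often each (car, color, motor) combination occurs and report whether any count exceeds one. The function name has_duplicates is hypothetical, and the data mirrors the list from the question.

```python
from collections import Counter


def has_duplicates(maps, keys):
    """True if at least two maps share the same values for all the given keys."""
    counts = Counter(tuple(m.get(k) for k in keys) for m in maps)
    return any(count > 1 for count in counts.values())


list_of_maps = [
    {"car": "A", "color": "A", "motor": "A", "anything": "meh"},
    {"car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"},
    {"car": "A", "color": "A", "motor": "B", "anything": "Anything"},
    {"car": "A", "color": "B", "motor": "A", "anything": "Anything"},
]

print(has_duplicates(list_of_maps, ["car", "color", "motor"]))  # True
```

Using a tuple of the selected values as the grouping key avoids the ambiguity that joining values with a separator such as '-' can introduce when a value itself contains that separator.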
I'm new to Elixir and trying to get a random letter from a function.
I'm trying to define a function that returns a random letter between a and z.
For some reason, this sometimes returns a blank character.
Why?
defp random_letter do
"abcdefghijklmnopqrstuvwxyz"
|> String.split("")
|> Enum.random
end
def process do
Enum.each(1..12, fn(number) ->
IO.puts random_letter
end)
end
Output:
g
m
s
v
r
o
m
x
e
j
w
String.split("abcdefghijklmnopqrstuvwxyz", "")
returns
["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", ""]
Look at the last element in the list and you get your answer :)
To avoid that, you can use trim option like this:
String.split("abcdefghijklmnopqrstuvwxyz", "", trim: true)
When you want to split the string, here are two alternatives. The second one is used when you have Unicode strings.
iex(1)> String.codepoints("abcdefghijklmnopqrstuvwxyz")
["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p",
"q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
iex(2)> String.graphemes("abcdefghijklmnopqrstuvwxyz")
["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p",
"q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
Or you can use
iex(1)> <<Enum.random(?a..?z)>>
"m"