Racket: how to apply filter on a field of a struct in a list of structs

My task is to write a function, inYear, which takes a number called year and a list of event structures, and produces a new list containing only the event structures that occurred during year. I found this and tried defining a lambda function within the filter. See my event definitions/list, function and test below. The test fails: no elements get filtered out, and the function just returns the original list. What am I doing wrong?
(struct event (name day month year xlocation ylocation) #:transparent)
(define e1 (event "new years" 1 "Jan" 2021 0 0))
(define e2 (event "valentines" 14 "Feb" 2021 2 2))
(define e3 (event "my birthday" 6 "Mar" 2021 10 10))
(define e4 (event "tyler's birthday" 10 "Sep" 2020 20 20))
(define l1 '(e1 e2 e3 e4))
(define (inYear year events)
  (filter (lambda (e) (equal? (event-year e) year)) events))
(check-expect (inYear 2021 l1) '(e1 e2 e3))

The definition l1 evaluates to a list of symbols, not a list of structs:
'(e1 e2 e3 e4) = (list 'e1 'e2 'e3 'e4)
You can fix this by changing the definition to (list e1 e2 e3 e4) and the expected test output to (list e1 e2 e3).
Alternatively, you can use a quasi-quote-unquote combination like:
`(,e1 ,e2 ,e3 ,e4)
But this is less idiomatic for simply defining a list of structs.

Related

Combine automatically all values from ranges of cells into A new format range of cells (automated) without VBA/Macros

I am looking to automatically combine all values from ranges of cells into a new range of cells in a different format using Excel formulas, and I wonder if it's possible. Thanks
Here is the input data:
Year
Class
Type
New
Class
Type
Old
2018
13
(Fx)
$ 15 025,16
2018
8
(E)
$ 4 185,07
2018
12
(E)
$ 2 173,79
2018
12s
(E)
$ 75,66
2018
46
(E)
$ 470,92
2018
50
(E)
$ 1 869,14
2018
8
(Fu)
$ 2 111,45
2018
13
(Fu)
$ 28 942,45
2018
13
(Fx)
$ 3 397,14
2018
46
(E)
$ 19,98
2018
50
(E)
$ 570,00
2018
8
(Fu)
$ 3 324,33
2018
13
(Fu)
$ 873,70
2019
13
(Fx)
$ 517,86
2019
8
(E)
$ 4 365,76
2019
12
(E)
$ 1 014,93
2019
50
(E)
$ 3 296,12
2019
8
(Fu)
$ 2 016,51
2019
2019
2020
8
(E)
$ 267,60
2020
50
(E)
$ 998,92
2020
8
(Fu)
$ 251,86
2020
2020
2020
2021
13
(Fx)
$ 1 997,30
2021
8
(E)
$ 7 733,39
2021
50
(E)
$ 2 766,23
2021
8
(Fu)
$ 5 880,03
2021
13
(Fx)
$ 15 693,38
2021
13
(Fu)
$ 22 274,74
2021
46
(E)
$ 399,98
and this is the expected output for Formula 1 (see explanation below):
Year  Class  Type  Old           New
2018  13     (Fx)  $ 3 397,14    $ 15 025,16
2018  46     (E)   $ 19,98       $ 470,92
2018  50     (E)   $ 570,00      $ 1 869,14
2018  8      (Fu)  $ 3 324,33    $ 2 111,45
2018  13     (Fu)  $ 873,70      $ 28 942,45
2018  8      (E)   $ 4 185,07    $ -
2018  12     (E)   $ 2 173,79    $ -
2018  12s    (E)   $ 75,66       $ -
2019  13     (Fx)  $ -           $ 517,86
2019  8      (E)   $ -           $ 4 365,76
2019  12     (E)   $ -           $ 1 014,93
2019  50     (E)   $ -           $ 3 296,12
2019  8      (Fu)  $ -           $ 2 016,51
2020  8      (E)   $ 267,60      $ -
2020  50     (E)   $ 998,92      $ -
2020  8      (Fu)  $ 251,86      $ -
2021  8      (E)   $ 7 733,39    $ -
2021  50     (E)   $ 2 766,23    $ -
2021  8      (Fu)  $ 5 880,03    $ -
2021  13     (Fx)  $ 15 693,38   $ 1 997,30
2021  13     (Fu)  $ 22 274,74   $ -
2021  46     (E)   $ 399,98      $ -
I have uploaded the file in google sheet for convenience:
https://docs.google.com/spreadsheets/d/1SE2B5m-sz-L55Gc5CePbBDMeggDj2cLMZlXHnYT282U/edit?usp=sharing
However I am looking for a solution in Excel (office 2021)
Formula 1
What I am looking for is to create a new range of cells (L to M) from the range of cells on the left side (col A to G).
For each year:
if class (col B) and type (C) match respectively class (col E) and type (col F), write the value in text format of Col B in I and value of Col C in J, and get the value of col D in col M and the value of col G in col L.
if class (col B) and type (C) are not found respectively in class (col E) and type (col F) and are not 0 (i.e. is not blank row), write the value in text format of Col B in I and value of Col C in J, and get the value of col D in col M and put 0 as value in col L.
if class (col E) and type (F) are not found respectively in class (col B) and type (col C) and are not 0 (i.e. is not blank row), write the value in text format of Col E in I and value of Col F in J, and get the value of col G in col L and put 0 as value in col M.
Formula 2
What I am looking for is to create a new range of cells (L to M) from the range of cells on the left side (col A to G).
Equivalent to Google sheets Flatten formula, but with 2 columns in consideration rather than one (where the same row of the 2 columns is seen as "1 value" - like Flatten a pairing values).
For each year:
Obtain the unique values of 2 arrays combined (array 1= col B and C, array 2= col E and F). By unique, it means Col B must match with Col E while for the same row, col C matches col F.
Note: The output will be the first 4 columns of the expected output of Formula 1.
Even though your question doesn't show what you have tried, and could easily have been posted as separate questions, I managed to get the data using Office 365 (note that the expected result in your spreadsheet does not match the one shared as a screenshot).
I managed to sort the unique year / class / type rows and lookup the associated values for old and new:
=LET(
data,A6:G39,
header1,INDEX(data,1,SEQUENCE(1,3)),
header2,HSTACK(INDEX(data,1,7),INDEX(data,1,4)),
d,DROP(data,1),
y,INDEX(d,,1),
nc,INDEX(d,,2),
nt,INDEX(d,,3),
nv,INDEX(d,,4),
oc,INDEX(d,,5),
ot,INDEX(d,,6),
ov,INDEX(d,,7),
ac,VSTACK(HSTACK(y,oc,ot),HSTACK(y,nc,nt)),
uc,SORT(UNIQUE(FILTER(ac,INDEX(ac,,2)<>""))),
formula1,VSTACK(header1,uc),
br,BYROW(uc,LAMBDA(x,TEXTJOIN("",0,x))),
ol,XLOOKUP(br,y&oc&ot,ov,0,0),
nl,XLOOKUP(br,y&nc&nt,nv,0,0),
formula2,VSTACK(HSTACK(header1,header2),HSTACK(uc,ol,nl)),
formula2)
The final argument of the LET function is set to formula2 to get the complete view, including headers and values, for what you described as Formula 2 in your question.
Changing the final argument to formula1 returns the requested result for what you described as Formula 1 in your question.
I made it dynamic, so if you change the range for the data argument, the calculation will adapt to that range.
I first stacked the year y, class oc and type ot of the old values on top of the year y, class nc and type nt of the new values,
then filtered out rows with a blank class column, kept only the unique rows and sorted them.
Then I applied TEXTJOIN by row to this array (br) to build the key used to look up the
associated old values ol and new values nl.
Then I stacked the headers, the unique sorted filtered year/class/type rows and their lookup values into one result.
I recommend splitting the parts into separate questions if you want to calculate this in a different Excel version (that would have been my recommendation anyway).
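For readers who find the LET formula dense, the stacking and lookup steps can be sketched outside Excel. This is only an illustration in Python, not the answer's method itself: the function name combine and the row layout (year, new class, new type, new value, old class, old type, old value) are assumptions, and the three sample rows reuse amounts from the expected output in the question.

```python
def combine(rows):
    """Hypothetical helper: rows are (year, new_class, new_type, new_value,
    old_class, old_type, old_value), with "" standing in for a blank cell."""
    new, old = {}, {}
    for y, nc, nt, nv, oc, ot, ov in rows:
        if nc != "":
            new.setdefault((y, nc, nt), nv)   # first match wins, like XLOOKUP
        if oc != "":
            old.setdefault((y, oc, ot), ov)
    keys = sorted(set(new) | set(old))        # the SORT(UNIQUE(FILTER(...))) step
    # Look up each key on both sides, defaulting to 0 when not found.
    return [(y, c, t, old.get((y, c, t), 0), new.get((y, c, t), 0))
            for y, c, t in keys]


sample = [
    (2018, "13", "(Fx)", 15025.16, "13", "(Fx)", 3397.14),
    (2018, "", "", "", "8", "(E)", 4185.07),
    (2019, "13", "(Fx)", 517.86, "", "", ""),
]
for row in combine(sample):
    print(row)
```

Each output row is (year, class, type, old, new), the same column order the formula produces with HSTACK(uc,ol,nl).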
EDIT:
I also have an alternative version that leaves out the headers and avoids DROP, VSTACK and HSTACK, none of which are available in Excel 2021 as far as I could find.
This formula should work in Excel 2021:
=LET(
data,A6:G39,
r,ROWS(data)-1,
c,COLUMNS(data)+1,
sr,SEQUENCE(r*2,),
sm,MOD((sr-1),r)+2,
dn,INDEX(data,sm,SEQUENCE(1,4)),
_dn1,INDEX(dn,,1),
_dn2,INDEX(dn,,2),
_dn3,INDEX(dn,,3),
_dn4,INDEX(dn,,4),
do,INDEX(data,sm,{1,5,6,7}),
_do1,INDEX(do,,1),
_do2,INDEX(do,,2),
_do3,INDEX(do,,3),
_do4,INDEX(do,,4),
da,IF(sr<=r,do,dn),
_fa,SORT(FILTER(da,ISTEXT(INDEX(da,,3)))),
_ufa,UNIQUE(INDEX(_fa,SEQUENCE(ROWS(_fa)),SEQUENCE(1,3))),
_fa1,INDEX(_ufa,,1),
_fa2,INDEX(_ufa,,2),
_fa3,INDEX(_ufa,,3),
_lo,XLOOKUP(_fa1&_fa2&_fa3,_do1&_do2&_do3,_do4,0,0),
_ln,XLOOKUP(_fa1&_fa2&_fa3,_dn1&_dn2&_dn3,_dn4,0,0),
CHOOSE({1,2,3,4,5},_fa1,_fa2,_fa3,_lo,_ln))
(the _ufa-part is equal to the data for Formula 2, as you described in your question)
And including the headers
=LET(
data,A6:G39,
r,ROWS(data)-1,
c,COLUMNS(data)+1,
sr,SEQUENCE(r*2,),
sm,MOD((sr-1),r)+2,
dn,INDEX(data,sm,SEQUENCE(1,4)),
_dn1,INDEX(dn,,1),
_dn2,INDEX(dn,,2),
_dn3,INDEX(dn,,3),
_dn4,INDEX(dn,,4),
do,INDEX(data,sm,{1,5,6,7}),
_do1,INDEX(do,,1),
_do2,INDEX(do,,2),
_do3,INDEX(do,,3),
_do4,INDEX(do,,4),
da,IF(sr<=r,do,dn),
_fa,SORT(FILTER(da,ISTEXT(INDEX(da,,3)))),
_ufa,UNIQUE(INDEX(_fa,SEQUENCE(ROWS(_fa)),SEQUENCE(1,3))),
_fa1,INDEX(_ufa,,1),
_fa2,INDEX(_ufa,,2),
_fa3,INDEX(_ufa,,3),
_lo,XLOOKUP(_fa1&_fa2&_fa3,_do1&_do2&_do3,_do4,0,0),
_ln,XLOOKUP(_fa1&_fa2&_fa3,_dn1&_dn2&_dn3,_dn4,0,0),
result_data,CHOOSE({1,2,3,4,5},_fa1,_fa2,_fa3,_lo,_ln),
header,INDEX(data,1,{1,2,3,7,4}),
IF(SEQUENCE(ROWS(result_data)+1)<=1,header,INDEX(result_data,SEQUENCE(ROWS(result_data)+1,,0),SEQUENCE(1,COLUMNS(header)))))
Here is a solution. Your input data and rules seem to have some inconsistencies. I checked that my result for formula 1 is the same as @P.b's, so it seems we have the same understanding, but your question and data need to be reviewed.
Here is the formula 1 in O4:
=LET(setY, FILTER(A4:C36, (B4:B36<>"") * (C4:C36<>"")),
amountY, FILTER(D4:D36, D4:D36<>""),
setX, FILTER(HSTACK(A4:A36, E4:F36), (E4:E36<>"") * (F4:F36<>"")),
amountX, FILTER(G4:G36, G4:G36<>""),
lkupY, BYROW(setY, LAMBDA(rowY, CONCAT(rowY))),
lkupX, BYROW(setX, LAMBDA(rowX, CONCAT(rowX))),
notMatchXInY, ISNA(XMATCH(lkupX, lkupY)),
SORT(IFERROR(VSTACK(HSTACK(setY, XLOOKUP(lkupY,lkupX, amountX), amountY),
FILTER(HSTACK(setX, amountX, XLOOKUP(lkupX,lkupY, amountY)),
notMatchXInY)),""),1))
and here is the output:
Note: I think the data highlighted in light yellow (columns E:G) was misplaced in the input data from the question (it was corrected in the question afterwards).
Formula 2 is just a partial result from the data in formula 1:
=LET(setY, FILTER(A4:C36, (B4:B36<>"") * (C4:C36<>"")),
setX, FILTER(HSTACK(A4:A36, E4:F36), (E4:E36<>"") * (F4:F36<>"")),
lkupY, BYROW(setY, LAMBDA(rowY, CONCAT(rowY))),
lkupX, BYROW(setX, LAMBDA(rowX, CONCAT(rowX))),
notMatchXInY, ISNA(XMATCH(lkupX, lkupY)),
VSTACK(setY, FILTER(setX, notMatchXInY))
)
or filtering by the three first columns from formula 1 result, i.e.:
LET(formula2, FILTER(formula1, {1,1,1,0,0}), formula2)
where formula1 represents the output of formula 1.
The output will be just the first three columns from the result of formula 1 (without sorting, which can be added if needed).
Explanation
I use the following suffixes to identify each set and related calculations:
X for Old Data
Y for New Data
We first filter by non empty rows of both sets (New and Old data). SetY, SetX represent such sets (not considering the amount part, just Year, Class and Type):
setY, FILTER(A4:C36, (B4:B36<>"") * (C4:C36<>""))
setX, FILTER(HSTACK(A4:A36, E4:F36), (E4:E36<>"") * (F4:F36<>""))
Next we define the lookup variables, via concatenation of the search criteria:
lkupY, BYROW(setY, LAMBDA(rowY, CONCAT(rowY)))
lkupX, BYROW(setX, LAMBDA(rowX, CONCAT(rowX)))
The corresponding amount for each set:
amountY, FILTER(D4:D36, D4:D36<>"")
amountX, FILTER(G4:G36, G4:G36<>"")
Because the resulting output will be the unique values (Year, Class and Type) from both sets, we need to find the elements in X not in Y and for that we use this variable:
notMatchXInY, ISNA(XMATCH(lkupX, lkupY))
The rest is just to build the final result using VSTACK, HSTACK to build the result in the form we want.
We use the XLOOKUP function to apply the following business rules: for elements in setY found in setX, we use amountX and amountY respectively. All the #N/A values (not-found cases) are replaced at the end with an empty string (""). This rule can be implemented as follows:
HSTACK(setY, XLOOKUP(lkupY,lkupX, amountX), amountY)
For the second set (Old set), we obtain the amounts for the elements of setX that are not in setY as follows:
FILTER(HSTACK(setX, amountX, XLOOKUP(lkupX,lkupY, amountY)), notMatchXInY)
We use XLOOKUP in the same way as for setY and combine it with HSTACK. The only difference is that we exclude the setX elements already present in setY, and for that we use FILTER.
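The same logic (keep every row of the New set, attach the Old amount when the key exists, then append the Old rows whose key was not found) can be sketched in Python for readers who want to check it outside Excel. The name merge_old_new and the tuple layout (year, class, type, amount) are assumptions made for this illustration; the sample amounts come from the expected output in the question.

```python
def merge_old_new(new_rows, old_rows):
    """Hypothetical helper. Each row is (year, class, type, amount).
    Returns rows of (year, class, type, old_amount, new_amount)."""
    old_by_key = {}
    for y, c, t, a in old_rows:
        old_by_key.setdefault((y, c, t), a)          # first match, like XLOOKUP
    new_keys = {(y, c, t) for y, c, t, _ in new_rows}

    # HSTACK(setY, XLOOKUP(lkupY, lkupX, amountX), amountY), "" when not found
    merged = [(y, c, t, old_by_key.get((y, c, t), ""), a)
              for y, c, t, a in new_rows]
    # FILTER(HSTACK(setX, amountX, ...), notMatchXInY): Old rows missing from New
    merged += [(y, c, t, a, "")
               for y, c, t, a in old_rows if (y, c, t) not in new_keys]
    # SORT(..., 1): order by the first column (year); Python's sort is stable
    return sorted(merged, key=lambda row: row[0])


new_rows = [(2018, "13", "(Fx)", 15025.16), (2019, "13", "(Fx)", 517.86)]
old_rows = [(2018, "13", "(Fx)", 3397.14), (2018, "8", "(E)", 4185.07)]
for row in merge_old_new(new_rows, old_rows):
    print(row)
```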

Rank groups without duplicates [duplicate]

I am trying to get a unique rank value (e.g. {1, 2, 3, 4}) from a subgroup in my data. SUMPRODUCT will produce ties {1, 1, 3, 4}; I am trying to add COUNTIFS to the end to adjust the duplicate ranks away.
subgroup
col B   col M   rank
LMN     01      1
XYZ     02
XYZ     02
ABC     03
ABC     01
XYZ     01
LMN     02      3
ABC     01
LMN     03      4
LMN     03      4   'should be 5
ABC     02
XYZ     02
LMN     01      1   'should be 2
So far, I've come up with this.
=SUMPRODUCT(($B$2:$B$38705=B2)*(M2>$M$2:$M$38705))+countifs(B2:B38705=B2,M2:M38705=M2)
What have I done wrong here?
The good news is that you can throw away the SUMPRODUCT function and replace it with a pair of COUNTIFS functions. The COUNTIFS can use full column references without detriment and is vastly more efficient than the SUMPRODUCT even with the SUMPRODUCT cell ranges limited to the extents of the data.
In N2 as a standard function,
=COUNTIFS(B:B, B2,M:M, "<"&M2)+COUNTIFS(B$2:B2, B2, M$2:M2, M2)
Fill down as necessary.
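To see why the two COUNTIFS terms break ties, here is a small Python sketch of the same counting rule. The function name unique_group_ranks is made up for this illustration, and the values from col M are written as integers; the rows are the sample data from the question.

```python
def unique_group_ranks(rows):
    """rows: list of (group, value) in sheet order. Mirrors
    COUNTIFS(B:B, B2, M:M, "<"&M2) + COUNTIFS(B$2:B2, B2, M$2:M2, M2)."""
    ranks = []
    for i, (group, value) in enumerate(rows):
        # First COUNTIFS: rows anywhere in the same group with a smaller value.
        smaller = sum(1 for g, v in rows if g == group and v < value)
        # Second COUNTIFS: identical (group, value) rows from the top down to
        # and including the current row; this is what separates the ties.
        seen_equal = sum(1 for g, v in rows[: i + 1] if g == group and v == value)
        ranks.append(smaller + seen_equal)
    return ranks


rows = [("LMN", 1), ("XYZ", 2), ("XYZ", 2), ("ABC", 3), ("ABC", 1),
        ("XYZ", 1), ("LMN", 2), ("ABC", 1), ("LMN", 3), ("LMN", 3),
        ("ABC", 2), ("XYZ", 2), ("LMN", 1)]
print(unique_group_ranks(rows))
```

For the LMN rows this gives 1, 3, 4, 5 and 2, which matches the "should be" notes in the question.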
      
Solution based on OP
Since your post invited alternatives, I got interested in a solution based on your original approach via the SUMPRODUCT function.
IMO this could show the right way, for the sake of the art:
Applied method
Get
a) all current ids with a group value lower or equal to the current value
MINUS
b) the number of current ids with the identical group value starting count from the current row
PLUS
c) the increment of 1
Formula example, e.g. in cell N5:
=SUMPRODUCT(($B$2:$B$38705=$B5)*($M$2:$M$38705<=$M5))-COUNTIFS($B5:$B$38705,$B5,$M5:$M$38705,$M5)+1
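The a) minus b) plus c) rule can be checked with a short Python sketch. The function name ranks_sumproduct_style is an assumption for this illustration, and only the LMN rows from the question are used as sample data.

```python
def ranks_sumproduct_style(rows):
    """rows: list of (group, value) in sheet order."""
    ranks = []
    for i, (group, value) in enumerate(rows):
        # a) rows in the same group with a value lower than or equal to the current one
        at_most = sum(1 for g, v in rows if g == group and v <= value)
        # b) identical (group, value) rows counted from the current row downwards
        remaining_equal = sum(1 for g, v in rows[i:] if g == group and v == value)
        # c) the increment of 1
        ranks.append(at_most - remaining_equal + 1)
    return ranks


lmn_rows = [("LMN", 1), ("LMN", 2), ("LMN", 3), ("LMN", 3), ("LMN", 1)]
print(ranks_sumproduct_style(lmn_rows))
```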
P.S.
Of course, I agree with you preferring the above posted solution, too :+)

Need help understanding code execution, for nested for loop

I have trouble understanding the element wise execution of the following code. The goal is to define a function, that returns the cartesian product of 2 sets. The problem should be solved using the methods in the code below.
I have tried looking up similar problems, but since I am new to programming and Python I get stuck easily.
A = {1,2,3,4}
B = {3,4,5}
def setprod(m1,m2):
    p=set()
    for e1 in m1:
        for e2 in m2:
            p.add((e1,e2))
    return p
setprod(A,B) returns {(1, 3), (3, 3), (4, 5), (4, 4), (1, 4), (1, 5), (2, 3), (4, 3), (2, 5), (3, 4), (2, 4), (3, 5)}. The Cartesian product is the set containing all the ordered pairs of elements of the two sets. An element of A can be chosen in 4 different ways and an element of B in 3, giving 4x3=12 combinations.
I just can't see why the code above accomplishes this.
If you have access to a debugging tool (perhaps you could install pycharm and use its debugger) you can see what's going on.
Let's step through what's going on in the code together mentally.
A = {1,2,3,4} #Step 1, load a set (1,2,3,4)
B = {3,4,5} #Step 2, load a set (3,4,5)
def setprod(m1,m2): #Step 4, define the function
    p=set()
    for e1 in m1:
        for e2 in m2:
            p.add((e1,e2))
    return p
setprod(A,B) #Step 5, execute function with parameters
At this point if we want to see what setprod does we step into the function.
    p=set() #Stepped in, step 1: create empty set
    for e1 in m1: #Stepped in, step 2: begin for loop iterating through m1,
                  #which contains (1,2,3,4); e1 is set to 1
        for e2 in m2: #Stepped in, step 3: begin inner for loop
                      #iterating through m2, which contains (3,4,5),
                      #e2 is set to 3, e1 contains the value 1
            p.add((e1,e2)) #Stepped in, step 4: add (m1[0],m2[0]), represented by
                           # (e1,e2), to the set.
    return p
After stepped-in step 4, the next step is the same line of code but with different variable values: e2 is no longer m2[0] but m2[1].
            p.add((e1,e2)) #Stepped in, step 5. add (m1[0],m2[1]), represented by
                           # (e1,e2) to the set where e1 = 1 and e2 = 4
.
            p.add((e1,e2)) #Stepped in, step 6. add (m1[0],m2[2]), represented by
                           # (e1,e2) to the set where e1 = 1 and e2 = 5
At this point we return to the parent for loop.
    for e1 in m1: #Stepped in, step 7.
                  #use m1[1] as e1 and repeat previous process but
                  #with the new e1 value set to 2
        for e2 in m2: #Stepped in, step 8. e1 contains 2, e2 is set to 3
            p.add((e1,e2))
(Just a note, if you were debugging this, I believe you'll only see the values for e2 and e1 when you are at the section of the code p.add, saying that e1 is set to some value at #stepped in, step 7, isn't completely accurate but is helpful enough for looking at the idea of what is happening.)
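If no debugger is at hand, the same walk-through can be reproduced by recording every visit of the inner loop. The sketch below is a variation on the original function, not a replacement for it: the name setprod_traced and the use of sorted() are assumptions added so that the order of the trace is predictable.

```python
def setprod_traced(m1, m2):
    """Same loops as setprod, but also records each (e1, e2) visit in order."""
    p = set()
    trace = []
    # sorted() is an addition for this illustration; the original iterates the
    # sets directly, in whatever order they happen to yield their elements.
    for e1 in sorted(m1):          # outer loop: one pass per element of m1
        for e2 in sorted(m2):      # inner loop: runs completely for each e1
            trace.append((e1, e2))
            p.add((e1, e2))
    return p, trace


A = {1, 2, 3, 4}
B = {3, 4, 5}
pairs, trace = setprod_traced(A, B)
for step, pair in enumerate(trace, start=1):
    print(step, pair)
```

The printed trace shows e1 staying at 1 while e2 runs through 3, 4 and 5, then e1 moving to 2 and the inner loop starting over, for 4 x 3 = 12 visits in total.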

Converting a python 3 dictionary into a dataframe (keys are tuples)

I have a dictionary that has the following structure:
Dict[(t1, d1)] = x
(x are integers, t1 and d1 are strings)
I want to convert this Dictionary into a dataframe of the following format:
     d1   d2   d3   d4
t1   x    y    z    x
t2   etc.
t3
t4
...
The following command
pd.DataFrame([[key,value] for key,value in Dict.items()],columns=["key_col","val_col"])
gives me
      key_col  val_col
0  (book, d1)      100
1   (pen, d1)       10
2  (book, d2)       30
3   (pen, d2)        0
How do I make d's my column names and t's my row names?
Pandas automatically assumes tuple keys are multiindex. Pass dictionary to series constructor and unstack.
pd.Series(dct).unstack()
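To make the reshaping concrete without depending on pandas, here is a plain-Python sketch of the same pivot: the first element of each tuple key becomes the row label and the second becomes the column label. The helper name pivot_tuple_keys is made up for this illustration and is not a pandas function; the sample values are the ones shown in the question.

```python
def pivot_tuple_keys(dct):
    """Turn {(row, col): value} into {row: {col: value}}."""
    table = {}
    for (row, col), value in dct.items():
        table.setdefault(row, {})[col] = value
    return table


dct = {("book", "d1"): 100, ("pen", "d1"): 10,
       ("book", "d2"): 30, ("pen", "d2"): 0}
table = pivot_tuple_keys(dct)
for row, cols in table.items():
    print(row, cols)
```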
