I have two tables that share the key column "Num".
I'm using this formula:
=XLOOKUP(1,(NUM=RANGE_NUM)*(DATE=RANGE_DATE),GOAL_ARRAY,,,)
It only works for an exact match on both columns, but I want to fill every row of Table 1 with the matching value from the Goal column.
Current result in table 1
TABLE 1 (several dates with 1-2 days deviance)
Num   Date        Goal
1136  2022-01-01  250
1136  2022-01-02  =N/A
1136  2022-01-03  =N/A
1136  2022-02-01  500
1136  2022-02-02  =N/A
1136  2022-02-03  =N/A
1136  2022-03-01  250
1136  2022-03-02  =N/A
1136  2022-03-03  =N/A
TABLE 2 (exact date)
Num   Date        Goal
1136  2022-01-01  250
1136  2022-02-01  500
1136  2022-03-01  250
Is it possible to make "NUM" exact match with "DATE" approximate?
TABLE 3 (expected result)
Num   Date        Goal
1136  2022-01-01  250
1136  2022-01-02  250
1136  2022-01-03  250
1136  2022-02-01  500
1136  2022-02-02  500
1136  2022-02-03  500
1136  2022-03-01  250
1136  2022-03-02  250
1136  2022-03-03  250
Maybe I have to use a different combination of formulas? Any suggestions, please.
Use FILTER():
=XLOOKUP(B2,FILTER($F$2:$F$5,$E$2:$E$5=A2),FILTER($G$2:$G$5,$E$2:$E$5=A2),"",-1)
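For comparison, the same "exact on Num, approximate on Date" lookup can be sketched in pandas. This is only a sketch and not part of the answer above: it assumes the two tables are loaded as DataFrames called table1 and table2 (hypothetical names). It uses pd.merge_asof with by="Num" for the exact key and direction="backward" to take the latest date that is earlier than or equal to the row's date, which mirrors XLOOKUP's match mode -1 (exact match or next smaller item).

```python
import pandas as pd

# Table 1: rows to fill; dates may fall a few days after a goal date.
# (table1 / table2 are hypothetical names for the question's two tables.)
table1 = pd.DataFrame({
    "Num": [1136] * 9,
    "Date": pd.to_datetime([
        "2022-01-01", "2022-01-02", "2022-01-03",
        "2022-02-01", "2022-02-02", "2022-02-03",
        "2022-03-01", "2022-03-02", "2022-03-03",
    ]),
})

# Table 2: one goal per Num and exact date.
table2 = pd.DataFrame({
    "Num": [1136, 1136, 1136],
    "Date": pd.to_datetime(["2022-01-01", "2022-02-01", "2022-03-01"]),
    "Goal": [250, 500, 250],
})

# merge_asof requires both frames to be sorted by the "on" column.
filled = pd.merge_asof(
    table1.sort_values("Date"),
    table2.sort_values("Date"),
    on="Date",             # approximate match on the date
    by="Num",              # exact match on Num
    direction="backward",  # latest table2 date <= table1 date
)
print(filled)
```

Rows whose date precedes every goal date for that Num would get NaN in Goal, much like the "" fallback in the XLOOKUP formula.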
I have two dataframes.
First one:
Date B
2021-12-31 NaN
2022-01-31 500
2022-02-28 540
Second one:
Date A
2021-12-28 520
2021-12-31 530
2022-01-20 515
2022-01-31 529
2022-02-15 544
2022-02-25 522
I want to combine the two dataframes based on year and month, so that the resulting dataframe looks like this:
Date A B
2021-12-28 520 NaN
2021-12-31 530 NaN
2022-01-20 515 500
2022-01-31 529 500
2022-02-15 544 540
2022-02-25 522 540
You need a left merge on the month period:
df2.merge(df1,
          left_on=pd.to_datetime(df2['Date']).dt.to_period('M'),
          right_on=pd.to_datetime(df1['Date']).dt.to_period('M'),
          suffixes=(None, '_'),
          how='left'
          )
Then chain .drop(columns=['key_0', 'Date_']) to remove the helper columns if needed.
Output:
key_0 Date A Date_ B
0 2021-12 2021-12-28 520 2021-12-31 NaN
1 2021-12 2021-12-31 530 2021-12-31 NaN
2 2022-01 2022-01-20 515 2022-01-31 500.0
3 2022-01 2022-01-31 529 2022-01-31 500.0
4 2022-02 2022-02-15 544 2022-02-28 540.0
5 2022-02 2022-02-25 522 2022-02-28 540.0
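A self-contained variant of the same idea, for anyone who wants to run it end to end. This is a sketch under assumptions, not the answer's exact code: it stores the month period in an explicit helper column named month (a hypothetical name) and drops df1's own Date column before merging, so the result needs no key_0 or Date_ cleanup.

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({
    "Date": ["2021-12-31", "2022-01-31", "2022-02-28"],
    "B": [float("nan"), 500, 540],
})
df2 = pd.DataFrame({
    "Date": ["2021-12-28", "2021-12-31", "2022-01-20",
             "2022-01-31", "2022-02-15", "2022-02-25"],
    "A": [520, 530, 515, 529, 544, 522],
})

# Helper column "month" holds the year-month period of each date.
left = df2.assign(month=pd.to_datetime(df2["Date"]).dt.to_period("M"))
right = (
    df1.assign(month=pd.to_datetime(df1["Date"]).dt.to_period("M"))
       .drop(columns="Date")  # keep only B and the month key
)

# Left merge keeps every row of df2 and attaches that month's B.
result = left.merge(right, on="month", how="left").drop(columns="month")
print(result)
```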
I have the following table and I want to take one random sample for each id where value is equal to 1.
id  date        value
A   2022-01-01  0
A   2022-01-02  1
A   2022-01-04  0
B   2022-01-01  0
B   2022-01-02  0
C   2022-01-02  1
C   2022-01-03  1
C   2022-01-05  1
C   2022-01-06  0
In pandas I would do the below, but how do I do it in spark?
df[df.value==1].groupby('id').apply(lambda df: df.sample(1)).reset_index(drop=True)
This is what the final dataset could look like:
id  date        value
A   2022-01-02  1
C   2022-01-03  1
I have a pandas dataframe as follows:
code title amount_1 amount_2 currency_1 currency_2
0 246 ex 500 550 USD GBP
1 300 am 200 250 USD GBP
2 315 ple 300 325 USD GBP
I'd like to get this into the format
code title amount currency
246 ex 500 USD
246 ex 550 GBP
All of the currencies are the same. How can I get this format? I've tried using melt and reset_index, but neither seemed to do exactly what I need.
Thank you
Use wide_to_long:
df1 = pd.wide_to_long(df,
                      stubnames=['amount','currency'],
                      i=['code','title'],
                      j='measure', sep='_').reset_index()
print(df1)
code title measure amount currency
0 246 ex 1 500 USD
1 246 ex 2 550 GBP
2 300 am 1 200 USD
3 300 am 2 250 GBP
4 315 ple 1 300 USD
5 315 ple 2 325 GBP
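To get exactly the four columns asked for in the question, the measure column can be dropped afterwards. The following is a runnable sketch that rebuilds the sample frame from the question; the explicit sort and the name long_df are choices made for this example, not something the answer prescribes.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "code": [246, 300, 315],
    "title": ["ex", "am", "ple"],
    "amount_1": [500, 200, 300],
    "amount_2": [550, 250, 325],
    "currency_1": ["USD", "USD", "USD"],
    "currency_2": ["GBP", "GBP", "GBP"],
})

long_df = (
    pd.wide_to_long(df,
                    stubnames=["amount", "currency"],
                    i=["code", "title"],
                    j="measure", sep="_")
      .reset_index()
      .sort_values(["code", "measure"])  # make the row order explicit
      .drop(columns="measure")           # suffix number no longer needed
      .reset_index(drop=True)
)
print(long_df)
```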
Good afternoon,
I have a data set that contains reporting dates and various parameters for each report. Each parameter gives a value for the period between two consecutive reports. So the set can contain multiple rows with the same date (but different times), and it can also contain a single row whose values apply to multiple dates.
Example of the set listed below:
Date Distance Price
02-01-2018 04:00 370 26.71
03-01-2018 04:00 357 27.31
04-01-2018 04:00 376 23.47
04-01-2018 04:48 8 0.72
04-01-2018 05:36 0 0.19
05-01-2018 04:00 328 23.77
05-01-2018 17:30 202 15.23
07-01-2018 17:13 7 1.54
08-01-2018 05:00 0 7.44
09-01-2018 05:00 0 3.89
10-01-2018 04:00 333 21.38
10-01-2018 11:00 110 6.40
What I would like to get:
Date Distance covered Price
02-01-2018 04:00 370 26.71
03-01-2018 04:00 357 27.31
04-01-2018 04:00 376 23.47
05-01-2018 04:00 336 24.67
06-01-2018 04:00 204 15.56
07-01-2018 04:00 4 0.77
08-01-2018 04:00 2 7.24
09-01-2018 04:00 0 3.93
10-01-2018 04:00 333 21.55
11-01-2018 04:00 110 6.40
I want to be able to choose a start date/time, and create a macro which automatically creates 24hr intervals after this date till the end of data set and interpolates the parameters.
Any help would be much appreciated. Thanks in advance.
Best regards,
I'm trying to write an Excel formula to return a value from the table below:
Q Y Mean 1 2 3 4 5 6 7 8 9
1 4 1301 <1183 1183 1233 1283 1333 1383 1433 1483 1533
2 4 1306 <1189 1189 1239 1289 1339 1389 1439 1489 1539
3 4 1317 <1200 1200 1250 1300 1350 1400 1450 1500 1550
4 4 1333 <1214 1214 1264 1314 1364 1414 1464 1514 1564
1 5 1346 <1225 1225 1275 1325 1375 1425 1475 1525 1575
2 5 1360 <1235 1235 1285 1335 1385 1435 1485 1535 1585
3 5 1372 <1245 1245 1295 1345 1395 1445 1495 1545 1595
4 5 1390 <1255 1255 1305 1355 1405 1455 1505 1555 1605
1 6 1403 <1266 1266 1316 1366 1416 1466 1516 1566 1616
2 6 1416 <1276 1276 1326 1376 1426 1476 1526 1576 1626
3 6 1425 <1285 1285 1335 1385 1435 1485 1535 1585 1635
4 6 1426 <1291 1291 1341 1391 1441 1491 1541 1591 1641
I want to be able to identify a year, then a quarter, and then according to a pupil's score, return the corresponding standard nine figure in the top line.
What's the best way to do this? I've tried INDEX and MATCH functions without success.
One strategy for a multiple lookup like this is to concatenate the indices to form a unique index. This unique index lets you find the correct row for a given year/quarter combination.
The second piece is using INDEX to return that entire row from your table of scores, which MATCH can then search to find the column the score falls in.
Once you have the score column, you can return the standard nine figure from the top line, using INDEX again.
The end result is an INDEX-MATCH-INDEX-MATCH. It made more sense to me with the formulas split into different cells, but I combined them below.
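The two-stage lookup described above can also be sketched in Python, which may make the logic easier to follow. This is only an illustration under assumptions: the names THRESHOLDS and stanine are made up for this example, only the four Year 4 rows of the question's table are copied in, and treating any score below the first bound as 1 is an assumption of this sketch.

```python
from bisect import bisect_right

# Lower bounds of standard nines 2..9, keyed by (year, quarter).
# Values are copied from the Year 4 rows of the question's table.
THRESHOLDS = {
    (4, 1): [1183, 1233, 1283, 1333, 1383, 1433, 1483, 1533],
    (4, 2): [1189, 1239, 1289, 1339, 1389, 1439, 1489, 1539],
    (4, 3): [1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550],
    (4, 4): [1214, 1264, 1314, 1364, 1414, 1464, 1514, 1564],
}

def stanine(year, quarter, score):
    """Return the standard nine (1-9) for a score in a given year/quarter."""
    # Stage 1 (exact match): the composite key selects one row.
    row = THRESHOLDS[(year, quarter)]
    # Stage 2 (approximate match): count the bounds that are <= score.
    # Assumption: scores below the first bound fall into standard nine 1.
    return bisect_right(row, score) + 1

print(stanine(4, 1, 1301))
```

The tuple key plays the role of the concatenated ID column, and bisect_right plays the role of MATCH with match type 1 on a sorted row.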
Here is what I started with. I added the ID column that combines the year/quarter.
Formula in D3: =B3&"-"&C3, copied down to the end.
Cells C17 and C18 are inputs.
Cell C19: =C17&"-"&C18
Cell C20 (Score) is an input.
Cell C21 is the messy one which combines the logic described above: =INDEX(F2:N2,MATCH(C20,INDEX(F3:N14,MATCH(C19,D3:D14,0),),1))
Here is that formula expanded with color so you can see what is going on:
Say we have:
We place the desired quarter in A1 and the desired year in B1, and in C1 enter:
=INDEX(C1:C14,SUMPRODUCT(--($A$3:$A$14=$A$1)*($B$3:$B$14=$B$1)*(ROW(3:14))))
and copy across. This gives us: