Excel: Index match if day of date matches? - excel

I have created a timeline in Excel like this:
Sheet 2
A B C D E
____________________________
01 02 03 04 05
I have some data in Sheet 1:
Column A Column E
01/01/2017 Supplier X
05/01/2017 Supplier B
I want to return the name of a supplier using INDEX/MATCH where the day (listed on my timeline) matches the day of the date in column A on Sheet 1.
Here's what I'm trying to use, but it produces #N/A and #VALUE! errors:
=INDEX(Sheet1!$E:$E,MATCH(F$22,DAY(Sheet1!$A:$A),0))
Desired result:
A B C D E
Supplier X_____________________ Supplier B
01 02 03 04 05
Can someone please show me where I am going wrong?
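The lookup being asked for can be sketched outside Excel to make the intended logic explicit. This is only an illustration in plain Python, not a replacement for the formula: the names `rows`, `timeline` and `supplier_for_day` are hypothetical, and the dates are assumed to be day/month/year, which is what the desired result (Supplier B under day 05) suggests.

```python
from datetime import date

# Sample rows from Sheet 1: (date in column A, supplier in column E).
# Assumption: the dates are written day/month/year.
rows = [
    (date(2017, 1, 1), "Supplier X"),
    (date(2017, 1, 5), "Supplier B"),
]

# Day numbers from the timeline on Sheet 2.
timeline = [1, 2, 3, 4, 5]


def supplier_for_day(day, rows):
    """Return the first supplier whose date falls on this day of the month, else ""."""
    for when, supplier in rows:
        if when.day == day:
            return supplier
    return ""


result = [supplier_for_day(day, rows) for day in timeline]
print(result)  # ['Supplier X', '', '', '', 'Supplier B']
```

Note that the comparison is made against the day of each date, not the date itself, which is the step the `DAY(...)` call in the formula is meant to perform.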

Related

Summing an array of values based on multiple criteria and look up table

I am given the following sales table, which provides the sales that each employee made, but instead of their name I have their ID, and each ID may have more than one row.
To map the ID back to the name, I have a look up table with each employee's name and ID. One thing to keep in mind is that any given name could potentially have more than one ID assigned to it, as described in the example below:
Sales Table:
Year  ID  North  South  West  East
2020  A   58     30     74    72
2020  A   85     40     90    79
2020  B   9      82     20    5
2020  B   77     13     49    21
2020  C   85     55     37    11
2020  C   29     70     21    22
2021  A   61     37     21    42
2021  A   22     39     2     34
2021  B   62     55     9     72
2021  B   59     11     2     37
2021  C   41     22     64    47
2021  C   83     18     56    83
ID table:
ID  Name
A   Allison
B   Brandon
C   Brandon
I am trying to sum up each employee's sales by a given year, and aggregate all their transactions by their name (rather than ID), so that my result looks like the following:
Result:
Report   2021
Allison  258
Brandon  721
I want the user to be able to select the year, and the report would automatically sum up each person's sales by the year and their name. Again, Brandon was assigned ID B and C, so the report should be able to obtain all 2021 sales under B and C.
I posted a similar question which did not include the added complexity of having a name tied to more than one ID. In that thread, I was provided a solution with the following formula:
=SUMPRODUCT($C$2:$F$13*($B$2:$B$13=INDEX($I$2:$I$4,MATCH(N3,$J$2:$J$4,0)))*($A$2:$A$13=$N$2))
While this formula works for names that have only one ID tied to them, I believe the INDEX and MATCH component breaks down once it encounters a duplicate name in the ID table, because MATCH returns only the first matching row.
I am currently using Excel 2016, so any solution would need to be compatible with that version at least. Thanks in advance for any guidance on this.
Try this formula, which works in Excel 2016.
In cell L4, copied down:
=SUMPRODUCT(($A$3:$A$14=K$3)*(VLOOKUP(T(IF({1},$B$3:$B$14)),$H$3:$I$5,2,0)=K4)*$C$3:$F$14)
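To check what the report should produce, the same aggregation can be sketched in plain Python using the data from the question. This is a rough cross-check rather than part of the Excel solution; the names `sales`, `id_to_name` and `report` are made up for the illustration.

```python
from collections import defaultdict

# Sales table from the question: (year, id, north, south, west, east).
sales = [
    (2020, "A", 58, 30, 74, 72),
    (2020, "A", 85, 40, 90, 79),
    (2020, "B", 9, 82, 20, 5),
    (2020, "B", 77, 13, 49, 21),
    (2020, "C", 85, 55, 37, 11),
    (2020, "C", 29, 70, 21, 22),
    (2021, "A", 61, 37, 21, 42),
    (2021, "A", 22, 39, 2, 34),
    (2021, "B", 62, 55, 9, 72),
    (2021, "B", 59, 11, 2, 37),
    (2021, "C", 41, 22, 64, 47),
    (2021, "C", 83, 18, 56, 83),
]

# ID table from the question; one name may own several IDs.
id_to_name = {"A": "Allison", "B": "Brandon", "C": "Brandon"}


def report(sales, id_to_name, year):
    """Sum all regional sales per employee name for the selected year."""
    totals = defaultdict(int)
    for row_year, emp_id, *regions in sales:
        if row_year == year:
            totals[id_to_name[emp_id]] += sum(regions)
    return dict(totals)


print(report(sales, id_to_name, 2021))  # {'Allison': 258, 'Brandon': 721}
```

Translating each row's ID to a name before summing mirrors what the `VLOOKUP` over the whole ID column does inside `SUMPRODUCT`, so a name with several IDs collects all of its rows.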

Use Switch/Case Statement to build DF2, by Iterating Over Rows in DF1

I've loaded data from a tab-delimited file into a DF. The tab data is a form filled out from a template.
A critical concept is that a variable number of rows makes up one entry in the form. In DF1 below, every time the index is "A", a new record is starting. So the code will need to iterate through the rows to rebuild each record in DF2. Each record will be represented as one row in DF2.
Because each "A" row in DF1 starts a new form entry (and a corresponding row in DF2), DF1 below contains just two entries in my example, so DF2 will have just two rows. Also important: each row carries a different number of pieces of data (columns). Z has 2 (then NAs), A has 3, B has 4.
All of this needs to be mapped to DF2 depending on the index letters Z, A, B (note there are more index letters but this is simplified for this example).
DF 1
- A B C D
Z xyz 5 NA NA
A COA aa bb NA
B RE 01 02 03
B DE 04 05 06
A COB dd ee NA
B RE 01 02 03
B DE 04 05 06
In the past I've done this type of thing in VBA and would have used a CASE statement to transform the data. I've found a good start using dictionaries in this thread:
Replacements for switch statement in Python?
One code example at the above thread suggests using a dictionary type case statement:
return {
    'a': 1,
    'b': 2,
}[x]
This seems like it would work, although I'm not certain how to apply it in practice. In addition, for each A, B, etc. above, I need to output multiple instructions, depending on the index letter. For the most part, the instructions say where to map the data in DF2. For example, in my case:
Index A:
Map column A to DF2.iloc[1]['B']
Map column B to DF2.iloc[1]['C']
Map column C to DF2.iloc[1]['D']
Index B:
Would have four instructions, similar to above.
DF2 would end up looking like so
- A B C D E F G H I J K L
1 xyz COA aa bb RE 01 02 03 DE 04 05 06
2 xyz COB dd ee RE 01 02 03 DE 04 05 06
So for each row in DF1, a different number of instructions is being performed depending on the "index letter." All instructions are telling the code where to put the data in DF2. The mapping instruction for each different index letter will always be the same for the columns, only the row will be changing (some type of counter as you move from one record group to the next in DF2).
How can I handle the different number of instructions for each type of index letter in a switch/case type format?
Thank you
I think you can use:
import numpy as np

# filter only rows with index 2 or 3
df1 = df[df.index.isin([2,3])].copy()
# create a new column holding 'Z' where the index is 2, else the value of column A
df1['new'] = np.where(df1.index == 2, 'Z', df1.A)
# create group ids: a new group starts at every row with index 2
df1['g'] = (df1.index == 2).cumsum()
# convert columns to index and reshape, then change order
df1 = (df1.set_index(['g','new']).unstack()
          .swaplevel(0,1, axis=1)
          .sort_index(axis=1, ascending=[False, True]))
# default column names
df1.columns = range(len(df1.columns))
print (df1)
0 1 2 3 4 5 6 7 8 9 10 11
g
1 ABC aa bb cc R 01 02 NaN D NaN 03 04
2 DEF dd ee ff R 01 02 NaN D NaN 03 04
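For comparison, the dictionary-based "switch" the question asks about can be sketched directly on the DF1 layout shown in the question (index letters Z, A, B). This is a minimal illustration in plain Python, not the approach of the answer above: the handler names, `CELLS_PER_LETTER` and `rebuild` are hypothetical, and Z is assumed to contribute only its first cell because the expected DF2 contains "xyz" but not the 5.

```python
# DF1 from the question as (index letter, cell values); None stands in for NA.
df1_rows = [
    ("Z", ["xyz", "5", None, None]),
    ("A", ["COA", "aa", "bb", None]),
    ("B", ["RE", "01", "02", "03"]),
    ("B", ["DE", "04", "05", "06"]),
    ("A", ["COB", "dd", "ee", None]),
    ("B", ["RE", "01", "02", "03"]),
    ("B", ["DE", "04", "05", "06"]),
]

# How many leading cells each index letter contributes to a DF2 row.
# Assumption: Z contributes one cell, matching the expected DF2 in the question.
CELLS_PER_LETTER = {"Z": 1, "A": 3, "B": 4}


def start_header(state, cells):
    # Z row: remember the shared prefix for every following record
    state["header"] = cells


def start_record(state, cells):
    # A row: open a new DF2 row, beginning with the shared prefix
    state["records"].append(state["header"] + cells)


def extend_record(state, cells):
    # B row: append its cells to the record currently being built
    state["records"][-1].extend(cells)


# The dictionary plays the role of the switch/case statement.
HANDLERS = {"Z": start_header, "A": start_record, "B": extend_record}


def rebuild(rows):
    state = {"header": [], "records": []}
    for letter, values in rows:
        cells = values[:CELLS_PER_LETTER[letter]]
        HANDLERS[letter](state, cells)
    return state["records"]


df2_rows = rebuild(df1_rows)
for row in df2_rows:
    print(row)
```

Each handler can hold as many instructions as its index letter needs, and supporting a further letter only means adding one function and one dictionary entry. The resulting list of rows could then be passed to `pd.DataFrame` with twelve column labels.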

Sample dataframe with number of records sampled per hour predefined

I have to sample a dataframe (df1) and I have another dataframe (df2) that tells me how many records I should retrieve from each hour of the day.
For example,
df1:
Hour number
0. 00 A
1. 00 B
2. 00 C
3. 01 D
4. 01 A
5. 01 B
6. 01 D
df2:
Hour number
0. 00 1
1. 01 2
So that in the end, I would get for example, record number 1 for midnight and records 3 and 5 for 1 am (or any other combination so long as it respects the number in df2)
The thing is that I need to write this in a function in order for me to call this inside another function.
So far I have
def sampling(frame):
    return np.random.choice(frame.index)
but I am failing to add the constraints of the df2.
Could anybody help?
First we add the number of samples required as a new column using merge, then sample each group of Hour values with that group's Number. Finally we drop the added column again:
import pandas as pd

def sampling(df1, df2):
    merged = df1.merge(df2, on='Hour')
    # take as many rows from each hour as that hour's Number requests
    parts = [g.sample(g['Number'].iloc[0]) for _, g in merged.groupby('Hour')]
    # remove the helper column that merge added
    return pd.concat(parts).drop(columns='Number')

df1 = pd.DataFrame({'Hour': [0,0,0,1,1,1,1], 'Value': list('ABCDABD')})
df2 = pd.DataFrame({'Hour': [0,1], 'Number': [1,2]})
sampling(df1, df2)
Result:
Hour Value
2 0 C
4 1 A
5 1 B
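The constraint itself (a fixed number of random records per hour) can also be sketched without pandas, which may help when checking the logic. This is only an illustration under assumptions: `records`, `quotas` and `sample_per_hour` are made-up names, and the optional `seed` parameter is added purely to make runs repeatable.

```python
import random
from collections import defaultdict

# df1 from the question as (hour, value) pairs; list position is the record number.
records = [("00", "A"), ("00", "B"), ("00", "C"),
           ("01", "D"), ("01", "A"), ("01", "B"), ("01", "D")]

# df2 from the question: how many records to draw for each hour.
quotas = {"00": 1, "01": 2}


def sample_per_hour(records, quotas, seed=None):
    """Return sorted record positions, drawing quotas[hour] records per hour."""
    rng = random.Random(seed)
    by_hour = defaultdict(list)
    for position, (hour, _value) in enumerate(records):
        by_hour[hour].append(position)
    chosen = []
    for hour, count in quotas.items():
        # random.sample raises ValueError if an hour has fewer records than requested
        chosen.extend(rng.sample(by_hour[hour], count))
    return sorted(chosen)


picked = sample_per_hour(records, quotas, seed=0)
print(picked)
```

Sampling is done without replacement inside each hour, so no record can be returned twice.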

I am having a column like this 9(05),X(05),X(15). I want to separate this 9,X,X into one column and data in () into another column. How can I do that?

I have a column with values like 9(05), X(05), X(15). I want to split the part before the parentheses (9, X, X) into one column and the value inside the parentheses into another column. How can I do that?
The input column is:
9(05)
x(05)
x(15)
x(15)
s9(07)
Use extract:
pat = r'(.*?)\((.*?)\)'
df[['a','b']] = df['col'].str.extract(pat, expand=True)
print (df)
col a b
0 9(05) 9 05
1 x(05) x 05
2 x(15) x 15
3 x(15) x 15
4 s9(07) s9 07
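To see what the two capture groups pick up, the same pattern can be tried with Python's standard `re` module on the sample values. This is a small sketch for inspection only; `split_value` is a hypothetical helper, not a pandas function.

```python
import re

# Same pattern as in the answer: lazily capture the text before "(",
# then lazily capture the text between "(" and ")".
PATTERN = re.compile(r'(.*?)\((.*?)\)')

values = ["9(05)", "x(05)", "x(15)", "x(15)", "s9(07)"]


def split_value(text):
    """Return (prefix, inner) for one value, or (None, None) if it does not match."""
    match = PATTERN.search(text)
    if match is None:
        return (None, None)
    return match.groups()


pairs = [split_value(v) for v in values]
print(pairs)
# [('9', '05'), ('x', '05'), ('x', '15'), ('x', '15'), ('s9', '07')]
```

Both groups are captured as strings, so a value such as `05` keeps its leading zero unless it is converted to a number afterwards.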

How to set values based on a date range in Excel?

I want to set values based on an arrival and departure date:
Idx    Arrive   Depart   01. Jan  02. Jan  03. Jan  04. Jan  05. Jan  ...
1      01. Jan  04. Jan  1        1        1
2      02. Jan  04. Jan           1        1
3      02. Jan  05. Jan           1        1        1
4      01. Jan  05. Jan  1        1        1        1
5      03. Jan  05. Jan                    1        1
...    ...      ...      ...      ...      ...      ...      ...      ...
Total                    2        4        5        3
For example, Idx 1:
Arrives on 01 January
Departs on 04 January
A total of 3 nights of accommodation is needed (a value of '1' in the columns for 01, 02 and 03 January). Note that a '1' isn't entered in the 04 January column, as this is the date of departure and no accommodation is required that night.
How can I achieve this in Excel?
Assuming that Arrive is in column A and the column headers (Arrive, Depart, 01. Jan) are on row 1, you want to put the following formula into cell C2:
=IF(AND(C$1>=$A2,C$1<$B2),1,"")
From there, you can copy the formula into the other cells. The formula assumes that the dates on the left and at the top are proper date values, i.e. Excel treats them as dates and not as text.
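The condition in the formula (date on or after arrival and strictly before departure) can be checked against the example table with a short plain-Python sketch. The names `stays`, `days` and `night_flags` are made up for the illustration, and the year is an arbitrary assumption because the question gives none.

```python
from datetime import date

YEAR = 2023  # assumption: the question does not state a year

# (idx, arrive, depart) rows from the question.
stays = [
    (1, date(YEAR, 1, 1), date(YEAR, 1, 4)),
    (2, date(YEAR, 1, 2), date(YEAR, 1, 4)),
    (3, date(YEAR, 1, 2), date(YEAR, 1, 5)),
    (4, date(YEAR, 1, 1), date(YEAR, 1, 5)),
    (5, date(YEAR, 1, 3), date(YEAR, 1, 5)),
]

# Column headers: 01. Jan to 05. Jan.
days = [date(YEAR, 1, d) for d in range(1, 6)]


def night_flags(arrive, depart, days):
    """Mirror =IF(AND(C$1>=$A2,C$1<$B2),1,""): 1 for each night stayed, else ""."""
    return [1 if arrive <= day < depart else "" for day in days]


grid = {idx: night_flags(arrive, depart, days) for idx, arrive, depart in stays}
totals = [sum(1 for flags in grid.values() if flags[i] == 1)
          for i in range(len(days))]

print(grid[1])  # [1, 1, 1, '', '']
print(totals)   # [2, 4, 5, 3, 0]
```

The strict `<` on the departure date is what keeps the day of departure empty, and the first four totals agree with the Total row in the question.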
