Choosing the second highest row in an index set - excel

I currently have a data set that looks like this:
ID Function UniqueID
1 .8 11
1 .77 12
1 .75 13
2 .8 21
2 .8 22
2 .75 23
I am trying to return the first row of each "ID" group, but only when the second-highest row in that group has a Function value strictly LESS THAN the highest row.
In this instance the output I want is:
UniqueID: 11

With Excel 365, you could try
=LET(matches,MATCH(A2:A7,A2:A7,0),FILTER(C2:C7,(INDEX(B2:B7,matches)>INDEX(B2:B7,matches+1))*(ROW(A2:A7)-1=matches)*(COUNTIF(A2:A7,A2:A7)>1)))
assuming that the values are sorted in descending order in each ID group and that the data starts in row 2.
So this compares the value in the first row of each group with the value in the second row, selects only the first row of each group and also checks that there are at least two values in the group to compare.
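For comparison, the same rule can be sketched in pandas. This is only an illustration of the logic, not part of the Excel answer; the DataFrame below is rebuilt by hand from the sample data, and the variable name winners is made up for the example.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "Function": [0.8, 0.77, 0.75, 0.8, 0.8, 0.75],
    "UniqueID": [11, 12, 13, 21, 22, 23],
})

winners = []  # hypothetical name: UniqueIDs of the rows that qualify
for _, group in df.groupby("ID"):
    # Sort each group so the highest Function value comes first.
    ordered = group.sort_values("Function", ascending=False, kind="mergesort")
    # Keep the top row only if there is a second row and it is strictly lower.
    if len(ordered) >= 2 and ordered["Function"].iloc[0] > ordered["Function"].iloc[1]:
        winners.append(int(ordered["UniqueID"].iloc[0]))

print(winners)
```

Unlike the formula, this sketch sorts each group itself, so it does not rely on the rows already being in descending order.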

Related

Retrieve column 4 from Column 2 and 3 which contains minimum and maximum conditions along with Column 1 which is a separate value?

Hello I have a table shown below where I have letters in column 1, and min and max ranges for column 2 and 3. I am trying to retrieve the final number in column 4.
I know I can use a VLOOKUP with the range lookup set to TRUE to get the last column. However, how would I factor in multiple columns/criteria to match the correct range with the correct letter?
For example, I would like to get the value 4 from the last column. I would have to match on "B" with an amount between 0 and $50,000.
A 0 $50,000 1
A $50,001 $100,000 2
A $100,001 $250,000 3
B 0 $50,000 4
B $50,001 $100,000 5
B $100,001 $250,000 6
C 0 $50,000 7
C $50,001 $100,000 8
C $100,001 $250,000 9
Thank you!
Two ways:
If the dollar-amount breaks follow the same pattern for every letter, then use this:
=INDEX(D:D,MATCH(G1,A:A,0)+MATCH(H1,$B$1:$B$3)-1)
Where MATCH(G1,A:A,0) returns the first row where the ID is located and MATCH(H1,$B$1:$B$3) finds the relative location of the price in the first pattern. Change $B$1:$B$3 to encompass the whole pattern.
If the patterns are different then you can use this:
=SUMIFS(D:D,A:A,G1,B:B,"<=" & H1,C:C,">=" & H1)
One more for the future when Microsoft releases FILTER():
=FILTER(D:D,(A:A=G1)*(B:B<=H1)*(C:C>=H1))
This is entered normally, and it works regardless of the pattern.
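The FILTER() version reads as three conditions multiplied together. As a rough illustration of the same idea outside Excel, here is a pandas sketch; the column names Letter, Min, Max and Value and the helper lookup are made up for the example, since the original table has no headers.

```python
import pandas as pd

# Table from the question; the column names are assumptions for this sketch.
table = pd.DataFrame({
    "Letter": list("AAABBBCCC"),
    "Min": [0, 50001, 100001] * 3,
    "Max": [50000, 100000, 250000] * 3,
    "Value": range(1, 10),
})

def lookup(letter, amount):
    """Return the Value whose letter matches and whose [Min, Max] range contains amount."""
    hit = table[(table["Letter"] == letter)
                & (table["Min"] <= amount)
                & (table["Max"] >= amount)]
    # None when no row satisfies all three conditions.
    return int(hit["Value"].iloc[0]) if not hit.empty else None

print(lookup("B", 25000))
```

Each boolean mask plays the role of one factor in the FILTER() product, so the ranges do not need to follow any repeating pattern.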

Compare row with all other previous string in one column and change value of another column in Python

I have a csv file named namelist.csv, it includes:
Index String Size Name
1 AAA123000DDD 10 One
2 AAA123DDDQQQ 20 One
3 AAA123000DDD 25 One
4 AAA123D 20 One
5 ABA 15 One
6 FFFrrrSSSBBB 60 Two
7 FFFrrrSSSBBB 30 Two
8 FFFrrrSS 50 Two
9 AAA12 70 Two
I want to compare the rows in column String within each Name group: if the string in a row matches, or is a substring of, any of the rows above it, then remove those previous rows and add their Size values to the Size of the substring row.
Example: take the 3rd row, AAA123000DDD. I compare it to the 1st and 2nd rows and see that it matches the 1st row, so the 1st row is removed and its Size is added to the Size of the 3rd row.
then the table will be like:
Index String Size Name
2 AAA123DDDQQQ 20 One
3 AAA123000DDD 35 One
4 AAA123D 20 One
...
the final result will be:
Index String Size Name
3 AAA123000DDD 35 One
4 AAA123D 40 One
5 ABA 15 One
8 FFFrrrSS 140 Two
9 AAA12 70 Two
I am thinking of using pandas groupby to group on the Name column, but I don't know how to apply the comparison on the String column and the sum on the Size column.
I am new to Python, so I would very much appreciate any help.
Assuming each String belongs to only one Name, here's how you would do the aggregation. I kept Name so that it also shows in the final DataFrame.
df_group = df.groupby(['String', 'Name'])['Size'].sum().reset_index()
Edit:
To match the substrings (assuming, as the example above suggests, that a substring will not match multiple strings), you can build a mapping of substrings to full strings and then group by the full-string column as before:
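all_strings = set(df['String'])  # the column is named 'String'
substring_dict = dict()
for row in df.itertuples():
    for item in all_strings:
        # every string contains itself, so keep the longest string that contains it
        if row.String in item and len(item) > len(substring_dict.get(row.String, '')):
            substring_dict[row.String] = item
def match_substring(x):
    # look up the full string that a (sub)string maps to
    return substring_dict[x]
df['full_strings'] = df.String.apply(match_substring)
df_group = df.groupby(['full_strings', 'Name'])['Size'].sum().reset_index()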
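Note that the mapping above folds a substring into its full string, while the expected output in the question keeps the later (substring) row and works inside each Name group. A minimal sketch of that row-by-row rule follows, assuming the CSV has been loaded into a DataFrame with the columns shown in the question; the helper name collapse_group is made up for this example.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "String": ["AAA123000DDD", "AAA123DDDQQQ", "AAA123000DDD", "AAA123D", "ABA",
               "FFFrrrSSSBBB", "FFFrrrSSSBBB", "FFFrrrSS", "AAA12"],
    "Size": [10, 20, 25, 20, 15, 60, 30, 50, 70],
    "Name": ["One"] * 5 + ["Two"] * 4,
})

def collapse_group(group):
    """Walk the rows in order; a row absorbs every earlier row that contains its String."""
    kept = []
    for row in group.to_dict("records"):
        survivors = []
        for prev in kept:
            if row["String"] in prev["String"]:
                row["Size"] += prev["Size"]  # earlier row is removed, its Size is added
            else:
                survivors.append(prev)
        survivors.append(row)
        kept = survivors
    return pd.DataFrame(kept)

# Apply the rule separately to each Name group, keeping the original group order.
result = pd.concat(
    [collapse_group(g) for _, g in df.groupby("Name", sort=False)],
    ignore_index=True,
)
print(result)
```

This is a plain nested loop, so it is quadratic in the size of each group; that should be fine for small files, but it is not tuned for large ones.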

Calculate a value for 2 different set of IDs in excel

My Excel table has 5 columns: ID, ColA, ColB, Count and Test.
ID A B Count Test
2 a low 5 -
2 b high 6 -
2 c low 7 -
2 d high 8 -
2 e low 9 -
1 a low 1 =(1-5)
1 l high 2 -
1 e low 3 =(3-9)
I want to calculate the value of Test only for rows with ID = 1:
If the value of ColA for ID 1 = the value of ColA for ID 2 and
the value of ColB for ID 1 = the value of ColB for ID 2
then calculate the difference between the Count Values
else
0
The Excel Table is connected to Sql Query. Every time I refresh it the table has a different number of rows.
I tried using VLOOKUP in TEST column where Id = 1 and specified the array table as the first 5 rows (only with Id = 2) but it doesn't seem to work because when I refresh the table the second time there are only 2 rows for Id = 2.
I want the TEST column value to be automatically calculated each time the table is refreshed. Thanks!
Use COUNTIFS to check whether a matching row exists, and SUMIFS to return its value:
=IF(AND(A2=1,COUNTIFS(B:B,B2,C:C,C2,A:A,2)),D2-SUMIFS(D:D,B:B,B2,C:C,C2,A:A,2),0)
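The same lookup-then-subtract logic can be sketched in pandas, which may help when checking the formula against a refreshed table. The DataFrame is typed in from the sample, and RefCount is a made-up helper column name.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [2, 2, 2, 2, 2, 1, 1, 1],
    "A": list("abcdeale"),
    "B": ["low", "high", "low", "high", "low", "low", "high", "low"],
    "Count": [5, 6, 7, 8, 9, 1, 2, 3],
})

# Total Count of the ID = 2 rows per (A, B) pair, like the SUMIFS part.
ref = (df[df["ID"] == 2]
       .groupby(["A", "B"])["Count"].sum()
       .rename("RefCount")
       .reset_index())

# A left merge keeps every original row; RefCount is NaN when no ID = 2 row matches.
merged = df.merge(ref, on=["A", "B"], how="left")

# Only ID = 1 rows with a match get a difference; everything else is 0.
is_match = (merged["ID"] == 1) & merged["RefCount"].notna()
merged["Test"] = (merged["Count"] - merged["RefCount"]).where(is_match, 0).astype(int)
print(merged[["ID", "A", "B", "Count", "Test"]])
```

Because the reference rows are selected by ID rather than by position, the result does not depend on how many ID = 2 rows a refresh returns.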

How to convert repeated measures from rows to columns in Excel

I have a data file of approximately 5000 repeated measures organized with rows containing IDs and repeated measures on weight, BMI, etc for children. I would like to find the maximum value of one variable (BMI) for each individual (out of up to 9 records). How can I do a lookup on multiple rows for each ID and return the max of a value for each person?
A very abbreviated example is as follows:
HAVE:
ID Date BMI
1 1 20
1 2 18
1 3 24
2 4 23
2 5 19
2 6 17
3 7 25
3 8 18
3 9 21
WANT
ID Highest BMI Corresponding date
1 24 3
2 23 4
3 25 7
Alternatively if there is a way to do this in SPSS or JMP (I don't have access to SAS now), please let me know.
Thanks!
Melissa
You can do this easily in Excel in two parts:
A PivotTable to extract the maximum BMI for each ID
Matching the maximum BMI per ID to a date
Part 1 - PivotTable
Create a PivotTable with
A Row Label of ID
Values as Max of BMI
Part 2 - matching the date
In the cell to the right of the first BMI maximum, put this formula
=SUMPRODUCT(--($A$2:$A$10=B14),--($C$2:$C$10=C14),$B$2:$B$10)/SUMPRODUCT(--($A$2:$A$10=B14),--($C$2:$C$10=C14))
(ensure you re-map your ranges if they differ from this example)
This formula returns the date of the record that matches the ID and the maximum BMI (if several records tie, it returns the average of their dates).
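Since the question also asks about other tools, here is a minimal pandas sketch of the same two steps (maximum per ID, then the matching date). It assumes the data has been loaded into a DataFrame with the ID, Date and BMI columns from the example.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Date": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "BMI": [20, 18, 24, 23, 19, 17, 25, 18, 21],
})

# idxmax gives, per ID, the row label of the highest BMI;
# selecting those rows keeps the corresponding Date alongside it.
best = df.loc[df.groupby("ID")["BMI"].idxmax()].reset_index(drop=True)
print(best)
```

When an ID has tied maxima, idxmax picks the first such row, which differs from the formula above.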

Excel - Finding Max value in a column

I have a Summary sheet with data set up as follows:
Cat A Cat B Cat C Cat D
Name 1 0 0 0 0
Name 2 2 3 2 2
Name 3 2 2 2 2
Name 4 3 2 2 3
Name 5 2 3 2 3
I also then have separate tabs for each of Name1 through to Name 5.
The summary sheet contains the maximum values for each category from each tab. So the Cell at Cat A Name 1 should show the maximum value on Sheet(Name1) in the Cat A column.
So far so good. However, each tab may not contain the same categories, so I would like the summary sheet to find the maximum value in each column by searching on the Cat name.
So far I have this-
=MATCH(Overview!S$1,Name1!$C$1:$V$1,0)
Which returns the column number with the right Category, in this case 13. So I can find the right column. What I am struggling with is to now find the maximum value in the column.
Can anyone help?
Thanks
Assuming your search range goes to row 1000:
=MAX(INDEX(Name1!$C$2:$V$1000,0,MATCH(Overview!S$1,Name1!$C$1:$V$1,0)))
The 0 Row argument in Index means to select the entire column.
The Offset function is your key here.
After you've got the value from the match, you can pass it to the offset to get the correct column.
So, for example, you probably want something like:
=Max(Name1!$C1:$C2000)
But you don't know whether you should use column C, column D or some other column. In this case the match was 13, and counting from column C (the 1st column of the range) the 13th column is O (3 + 13 - 1 = 15), so I think you want something like this:
=Max(Offset(Name1!$C$1:$C$2000, 0, [result of your match expression] - 1))
Here's an example of what I think you want in GoogleDocs:
https://docs.google.com/spreadsheet/ccc?key=0Ai45AJPc2AWMdGRlZXNIdlZBaHJxc01qVlJWa1N1WXc
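Both answers do the same two things: find the column by its header, then take the maximum of that column. A small pandas sketch of that idea is below. The sheet contents are invented for illustration (the question does not show the individual tabs), and max_by_header is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical per-name sheets; in practice these might come from
# pd.read_excel(path, sheet_name=None), which returns a dict of DataFrames.
sheets = {
    "Name1": pd.DataFrame({"Cat A": [0, 0], "Cat B": [0, 0]}),
    "Name2": pd.DataFrame({"Cat B": [3, 1], "Cat A": [2, 1], "Cat D": [2, 0]}),
}

def max_by_header(frame, category):
    """Return the maximum of the column named category, or None if the sheet lacks it."""
    if category not in frame.columns:
        return None
    return frame[category].max()

for name, frame in sheets.items():
    for category in ["Cat A", "Cat B", "Cat C", "Cat D"]:
        print(name, category, max_by_header(frame, category))
```

Looking columns up by name means the order of the categories on each tab does not matter, which is the point of the MATCH step in the formulas above.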
