I am trying to transform a hierarchy with 5 Levels into a Parent/Child table with 2 columns.
I need to do this with Excel formulas, not VBA script. Can I do this with Index and Match functions?
The Parent/Child Table should be dynamic and update automatically if Hierarchy changes.
     A          B               C        D            E
1    H I E R A R C H Y
2    Universe
3               North America
4                               USA
5                                        California
6                                                     San Francisco
7                                                     Los Angeles
8                                        Montana
9                               Mexico
10                              Canada
11              Europe
12                              Italy
13                              Spain
14                              France
I was able to populate the Children with the array formula below, i.e. North America in B2:
{=INDEX(A3:E3,MATCH(FALSE,ISBLANK(A3:E3),0))}
However I am looking for a formula to populate the Parents.
Expected Parent / Child Table:
     A               B
1    Parent          Child
2    Universe        North America
3    North America   USA
4    USA             California
5    California      San Francisco
6    California      Los Angeles
7    USA             Montana
8    North America   Mexico
9    North America   Canada
10   Universe        Europe
11   Europe          Italy
12   Europe          Spain
13   Europe          France
I think the INDEX and MATCH method that works for the children will not work for the parents, because:
Your data structure implies that there is a single piece of information per row, so {=INDEX(A3:E3,MATCH(FALSE,ISBLANK(A3:E3),0))} always works for a child value.
But the parent of that child is in the prior column, an unknown number of rows above the child's row. The INDEX/MATCH approach returns the child's value, not its location, so it gives us nothing from which to work out which prior column and which rows to search.
We can think of the parent as the last non-blank value in the prior column, looking only at the rows up to and including the child's row.
E.g. for Spain (C13) the range to check for the parent is B2:B13.
E.g. for Los Angeles (E7) the range to check for the parent is D2:D7.
So to establish the correct range to check for the parent value of a child, you need to offset by -1 column and go from the min row (2) to the max row (the row of the child).
To do this we need to get:
The row/ column coordinates of the children (and therefore we can also know the child value)
The min and max row of the prior column (the parent must exist in a non-blank value in that column)
Resolve values via an INDIRECT(ADDRESS(...)) approach against row/ column indices
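So it will end up as a set of helper columns next to the hierarchy (G to M, as referenced in the formulas below), followed by the Parent and Child output columns, where: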
Child addresses
C_row: =ROW(A2:E2) is simply the row of the hierarchy
C_col: =SUMPRODUCT(--NOT(ISBLANK(A2:E2)),COLUMN(A2:E2)) is a non-array version of your formula that returns the column index of the non-blank cell instead of the value itself (this relies on there being exactly one non-blank cell per row)
C_add: =ADDRESS(G2,H2) is the address of the child value per C_row and C_col above
Parent addresses
P_col: =H2-1 is the column before the child column (C_col), which we know holds the parent value
P_add_min: =IF(J2>0,ADDRESS(2,J2),"zzz") is the address of the first row of the range containing the parent, with a condition to identify the root of the hierarchy (i.e. Universe has no parent)
P_add_max: =IF(J2>0,ADDRESS(G2,J2),"zzz") is the address of the last row of that range (the child's own row), with the same root condition as P_add_min
P_row: =IF(J2>0,AGGREGATE(14,4,(NOT(ISBLANK(INDIRECT(K2&":"&L2)))*ROW(INDIRECT(K2&":"&L2))),1),"zzz") says: where the child is not the root of the hierarchy, get the largest row number of a non-blank cell in the prior column, between the min and max rows established above (function 14 in AGGREGATE is LARGE)
Expected output
Parent: =IF(J2>0,INDIRECT(ADDRESS(M2,J2)),"zzz") gets the value at the address given by P_row and P_col
Child: =INDIRECT(I2) gets the value at the address given by C_add
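The helper-column logic above can be cross-checked outside Excel. What follows is a minimal Python sketch of the same idea, not an Excel formula: the grid literal and the function name parent_child_pairs are made up for illustration, and it assumes exactly one non-blank cell per row, as in the question's hierarchy.

```python
# Hierarchy rows 2..14 of the sheet, columns A..E; None marks a blank cell.
# Assumption: exactly one non-blank cell per row.
HIERARCHY = [
    ["Universe", None, None, None, None],
    [None, "North America", None, None, None],
    [None, None, "USA", None, None],
    [None, None, None, "California", None],
    [None, None, None, None, "San Francisco"],
    [None, None, None, None, "Los Angeles"],
    [None, None, None, "Montana", None],
    [None, None, "Mexico", None, None],
    [None, None, "Canada", None, None],
    [None, "Europe", None, None, None],
    [None, None, "Italy", None, None],
    [None, None, "Spain", None, None],
    [None, None, "France", None, None],
]


def parent_child_pairs(grid):
    """Return (parent, child) pairs; the root is skipped because it has no parent."""
    pairs = []
    for c_row, row in enumerate(grid):
        # C_col: index of the single non-blank cell in this row.
        c_col = next(i for i, cell in enumerate(row) if cell is not None)
        if c_col == 0:
            continue  # root of the hierarchy
        p_col = c_col - 1  # P_col: the column immediately to the left
        # P_row: largest row index at or above the child with a value in P_col.
        p_row = max(r for r in range(c_row + 1) if grid[r][p_col] is not None)
        pairs.append((grid[p_row][p_col], row[c_col]))
    return pairs


pairs = parent_child_pairs(HIERARCHY)
for parent, child in pairs:
    print(parent, "->", child)
```

The max(...) step plays the role of the AGGREGATE(14,...) call in P_row: it picks the lowest-positioned non-blank cell in the prior column that is not below the child.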
Extensibility
The method above handles a hierarchy with any number of rows, as long as the maximum depth of the hierarchy is 5 (i.e. leaves no deeper than column E).
If you want arbitrary depth as well, then you need to move the helper columns and output columns to a different sheet, e.g.:
C_row: =ROW(Sheet1!2:2)
C_col: =SUMPRODUCT(--NOT(ISBLANK(Sheet1!2:2)),COLUMN(Sheet1!2:2))
C_add: =ADDRESS(A2,B2,1,1,"Sheet1")
P_col: =B2-1
P_add_min: =IF(D2>0,ADDRESS(2,D2),"zzz")
P_add_max: =IF(D2>0,ADDRESS(A2,D2),"zzz")
P_row: =IF(D2>0,AGGREGATE(14,4,(NOT(ISBLANK(INDIRECT("Sheet1!"&E2&":"&F2)))*ROW(INDIRECT("Sheet1!"&E2&":"&F2))),1),"zzz")
Parent: =IF(D2>0,INDIRECT(ADDRESS(G2,D2,1,1,"Sheet1")),"zzz")
Child: =INDIRECT(C2)
And you can go ahead and replace "zzz" with "" to tidy things up.
HTH
Related
I have a dynamic table containing the following columns:
COUNTRY(String), ACTIVE (Boolean),NAME(String)
An example will be:
COUNTRY(String), ACTIVE (Boolean),NAME(String)
USA, True, Chair
Canada, False, Table
USA, False, Pen
USA, True, Pencil
Canada, True, Pencil
Canada, True, Basket
I want to create a data validation list with the names for every country that are active. The list should be dynamic as the table is constantly being changed.
For the example, the data validation list should check whether the cell containing the country name is Canada or USA, and offer Chair, Pencil for USA or Pencil, Basket for Canada.
I have resolved this using a three-step organization. The first step is to make a "table" for any growing/shrinking list that will be used for the data validation.
Step 1
List the possible values in columns as a new table. Place the country name at the header row for each column.
    A        B        C    D       E
1   Country  chair         USA     Canada   <----header row
2   USA      pencil        pencil  table
3   Canada   table         Plato   warthog
4   USA      warthog
5   Canada   tuba
6   USA      magic
7   Canada   Plato
Select the columns (for instance, D1:E3 in the example) and click Insert > Table.
Check "my data has headers" and click OK.
Select a cell in the newly-formatted column and click Table Tools > Design > Properties > Table Name: ______ and type Countries.
Step 2
Select the column for which you want data validation. Note which cell is active (the cell inside of the selection which you can write to). In the example, it's Cell B2.
Step 3
Under data validation, choose list and type in this equation:
=INDIRECT("Countries[" & A2 & "]")
If the row that's active is row 3, type this in instead:
=INDIRECT("Countries[" & A3 & "]")
This is actually working for me. Try it!
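For the approach above to match the original question, each country column of the Countries table has to hold only the active names for that country. As a rough illustration of what those columns should contain, here is a small Python sketch (not Excel) using the question's sample rows; the function name active_names_by_country is hypothetical.

```python
from collections import defaultdict

# Rows of the source table from the question: (country, active, name).
ROWS = [
    ("USA", True, "Chair"),
    ("Canada", False, "Table"),
    ("USA", False, "Pen"),
    ("USA", True, "Pencil"),
    ("Canada", True, "Pencil"),
    ("Canada", True, "Basket"),
]


def active_names_by_country(rows):
    """Group the names of active rows under their country, keeping row order."""
    lists = defaultdict(list)
    for country, active, name in rows:
        if active:
            lists[country].append(name)
    return dict(lists)


validation_lists = active_names_by_country(ROWS)
for country, names in validation_lists.items():
    print(country, names)
```

Each resulting list corresponds to one column of the Countries table, with the country name as its header.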
OK, I have two columns in excel that contain city names. I need a rank of how many times a relationship between two cities occurs. For example, the ranking for the data below should be as follows. #1 is Austin to Dallas with 3 occurrences. #2 is Chicago to Boston with 2 occurrences. #3 is Chicago to New York with 1 occurrence.
sample data set
You can use a =COUNTIFS statement to check specific cities.
For example:
Row 1 = Headers
Column A = Origin City
Column B = Destination City
Data in your table should be in A2:B7
You can use:
=COUNTIFS($A$2:$A$7,"Austin",$B$2:$B$7,"Dallas")
=COUNTIFS($A$2:$A$7,"Chicago",$B$2:$B$7,"Boston")
=COUNTIFS($A$2:$A$7,"Chicago",$B$2:$B$7,"New York")
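The COUNTIFS formulas above count one named pair at a time. To see the counting and ranking in one pass, here is a minimal Python sketch (not Excel). The row order in TRIPS is an assumption, since the sample data set is not reproduced here; only the pair counts (3, 2, 1) come from the question.

```python
from collections import Counter

# (origin, destination) pairs standing in for A2:B7.
# Assumption: the order of rows is invented; only the counts match the question.
TRIPS = [
    ("Austin", "Dallas"),
    ("Chicago", "Boston"),
    ("Austin", "Dallas"),
    ("Chicago", "New York"),
    ("Chicago", "Boston"),
    ("Austin", "Dallas"),
]

# most_common() sorts the pairs by how often they occur, highest first.
ranking = Counter(TRIPS).most_common()

for rank, ((origin, destination), count) in enumerate(ranking, start=1):
    print(f"#{rank} {origin} to {destination}: {count}")
```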
I am looking for a VLOOKUP formula that returns multiple matches using two lookup values. I am currently trying to use the concatenate method, but I haven't quite figured it out. The table needs to return all of the multiple matches, not just one. Currently, it's only returning the last match.
For example, let's say I have a list of multiple cities and states. The cities differ, but the states obviously repeat. I want to return the number of people in each city.
City State #OfPeople
Albany NY 10
Orlando FL 5
Tampa FL 3
Seattle WA 1
Queens NY 8
So I concatenated the city and state column.
Join City State #OfPeople
Albany-NY Albany NY 10
Orlando-FL Orlando FL 5
Tampa-FL Tampa FL 3
Seattle-WA Seattle WA 1
Queens-NY Queens NY 8
The purpose of this is to keep an updated log of the people in each city as time progresses. I want to have a grand total of the number of people in a column. (I know this requires another formula. I'm just focused on returning multiple matches for now.) However, I don't want to overwrite the existing data. Hopefully, I explained this well. This is just an example of a larger project I'm working on. I need to be able to build on this list. That's why it's important that I be able to return matches multiple times.
Join City State #OfPeople Total
Albany-NY Albany NY 10 10
Orlando-FL Orlando FL 5 15
Tampa-FL Tampa FL 3 18
Seattle-WA Seattle WA 1 19
Queens-NY Queens NY 8 27
Any help would be greatly appreciated!
Considering you're trying to get some grand totals based on multiple criteria, I would suggest using the SUMIFS() / COUNTIFS() functions, rather than searching for the matching rows themselves.
However, if you need a multiple-criteria lookup for some reason, I believe an INDEX() + MATCH() combination can do the job.
"The table needs to return all of the multiple matches not just one. Currently, it's only returning the last match."
You'll need to use SUMIFS() if there are multiple records for the same city/state combo in your people lookup.
=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...)
Let's assume that you have a cities tab and a people tab. Let's assume you have ten cities that you want to return the total amount of people from.
Cities Tab definition
City range: 'Cities'!A$1:A$10
State range: 'Cities'!B$1:B$10
People Tab definition
City range: 'People'!A$1:A$100
State range: 'People'!B$1:B$100
#OfPeople range: 'People'!C$1:C$100
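Drop this formula in the first row of your cities tab, then drag it down the entire range of cities. Each criteria range comes before its criterion, and the city and state references are relative (A1, B1) so that they move down with each row.
=SUMIFS('People'!C$1:C$100, 'People'!A$1:A$100, 'Cities'!A1, 'People'!B$1:B$100, 'Cities'!B1)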
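To make the two-criteria sum concrete, here is a minimal Python sketch (not Excel) of what SUMIFS computes for one city/state pair. The first five records are the question's sample data; the sixth is a hypothetical later log entry added only to show several matches being summed, and the function name sum_people is made up.

```python
# (city, state, number_of_people) records standing in for the People tab.
PEOPLE = [
    ("Albany", "NY", 10),
    ("Orlando", "FL", 5),
    ("Tampa", "FL", 3),
    ("Seattle", "WA", 1),
    ("Queens", "NY", 8),
    ("Albany", "NY", 4),  # hypothetical later log entry, not in the question
]


def sum_people(records, city, state):
    """Sum the people count over every record matching both city and state."""
    return sum(n for c, s, n in records if c == city and s == state)


albany_total = sum_people(PEOPLE, "Albany", "NY")  # both Albany rows are added
print(albany_total)
```

Like SUMIFS, the function returns 0 when no record matches both criteria.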
Is there a way to have a group of rows related to another row in the same sheet, as more detailed information? They must obviously always stay next to the main row when you filter or sort.
Desired example based on vehicles and travels:
A B C D
1 [ID] [VEHICLE TYPE] [BRAND] [COLOUR]
+ 2 A-171 PICKUP HONDA BLACK
- 3 [TRAVEL] [KM] [STATION]
- 4 12/08/2016 13.000 BARCELONA
- 5 13/08/2016 13.750 DONOSTI
+ 6 B-501 VAN RENAULT WHITE
- 7 [TRAVEL] [KM] [STATION]
- 8 12/08/2016 117.800 PARIS
- 9 13/08/2016 120.000 AMSTERDAM
- 10 14/08/2016 124.320 MUNICH
So when you sort the spreadsheet, the travel rows should always stay next to their vehicle row.
Is that possible? If not, what can I do to get this or something similar? (I don't mind using another sheet tab, but it wouldn't be ideal.)
You can use the Group function (Alt-A-G-G); the grouped rows won't be sorted as usual if you sort on the whole column.
I'm working on Excel with a lot of data and I'm having difficulty with knowing how to sort through it to get some important numbers. I have minimal Excel experience.
Right now I'm struggling to get the average of the difference between two columns. The trick is that I need the average difference when column A is less than column B, and then the same when it's more, all within a category.
So for example let's say I have 3 categories: Football, Soccer, and Basketball (these are just made up ones).
So in column A, I have: Soccer, Football, Basketball. Then, in column B and C, I have the scores for John and Adam for the last 3 months, respectively. Lastly, in column D, I have the differences between their scores.
So, for example:
Category John Adam Differences
Soccer 5 3 2
Soccer 6 2 4
Soccer 3 5 2
Soccer 4 0 4
I want to create a table like the one below for each category:
                          Nº of cases    Avg. difference between John and Adam
When John's score is >
When John's score is <
When they are equal
Is there some type of formula where I can say something like this:
If the category is Soccer (the category being in column A), take the difference between John's score (column B) and Adam's score (column C) when John's score is larger than Adam's score, then calculate the average of those differences? Then, I would use the same formula but tweak it when John's score is smaller.
Additionally, would there be a formula where I can also, calculate within the category Soccer, how many times John's score is bigger than Adam's?
My data is much larger and I can't do this manually.
A B C D
1 Sport John Adam Differences
2 Soccer 5 3 2
3 Soccer 6 2 4
4 Soccer 3 5 -2
5 Soccer 4 0 4
6 Basketball 20 15 5
7 Basketball 7 13 -6
8 Basketball 26 10 16
9 Basketball 8 11 -3
Type in D2 (row 1 holds the headers):
=B2-C2
Drag the formula in column D down to all rows that have values in columns A, B and C.
Create the PivotTable.
Drag Sport to "Row Label" field. Drag Differences to "Row Label" field under Sport.
Drag Differences to the "Values" field as Count of Differences (the same way as in the previous question).
Drag Differences to the "Values" field again (below Count of Differences), and set the calculation to "Average" of Differences (left-click Differences, choose "Value Field Settings" and select "Average").
Right-click cell A5 (see picture below) and select the "Group" option.
Set "Starting at" = 0; "Ending at" = 1000; "By" = 1000 (as in the picture below). Click OK.
For each Sport, you will have the count (frequency) and the average of Differences for two groups:
When the difference (column B minus column C) is negative; and
When the difference (column B minus column C) is zero or positive.
The average of Differences when the scores are equal will always be zero.
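The grouping that the PivotTable produces can be sanity-checked with a short script. This is a minimal Python sketch (not Excel) over the sample table above; the bucket labels and the function name summarize are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# (sport, John's score, Adam's score) rows from the sample table.
SCORES = [
    ("Soccer", 5, 3),
    ("Soccer", 6, 2),
    ("Soccer", 3, 5),
    ("Soccer", 4, 0),
    ("Basketball", 20, 15),
    ("Basketball", 7, 13),
    ("Basketball", 26, 10),
    ("Basketball", 8, 11),
]


def summarize(rows):
    """Return {(sport, bucket): (count, average difference)}."""
    groups = defaultdict(list)
    for sport, john, adam in rows:
        diff = john - adam  # same as column D
        bucket = "negative" if diff < 0 else "zero or positive"
        groups[(sport, bucket)].append(diff)
    return {key: (len(diffs), mean(diffs)) for key, diffs in groups.items()}


summary = summarize(SCORES)
for (sport, bucket), (count, average) in summary.items():
    print(sport, bucket, count, average)
```

Splitting the "zero or positive" bucket into separate "positive" and "equal" buckets would give the three rows asked for in the question.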