Generate a new table from a current table in Excel

I have sample table like this:
ID | 1 | 2 | 3
-------------
1 | 0 | 1 | 0
--------------
2 | 1 | 1 | 1
Then I want to generate a new table from that table. For each data row (for example, the row with ID 1), it should pair the row ID with each column ID (1, 2, 3) and print the corresponding value of the matrix (0, 1, 0). For example:
Row_ID | Column_ID | Value
--------------------------
1 | 1 | 0
--------------------------
1 | 2 | 1
--------------------------
1 | 3 | 0
--------------------------
2 | 1 | 1
--------------------------
2 | 2 | 1
--------------------------
2 | 3 | 1
I'm not sure how or where to start using a formula. Please help. Thanks,

Well. There's no single formula that's going to do the job, obviously, but we have a few options we can use. I'll assume that the new table is going to start in cell A1 of Sheet2. Adjust accordingly.
Start with manually entered headers
Row_ID | Column_ID | Value
In the first column, first data row (A2), enter a 1. In the rows below, starting in A3, use this formula: =IF(B3<B2,A2+1,A2) This will increment the value in the first column by 1 each time the second column resets its numbering.
In the second column, first data row (B2), enter a 1. The formula for the rows below, starting in B3, will need some tweaking, but the basic version is: =IF(MOD(ROW()+1,3)=0,1,B2+1)
This formula counts up to a certain point, then resets its numbering. The point it counts to, and where it resets, depends on the amount of data you have and which row you're starting from, so there are two values to adjust: the 3 and the +1. Replace the 3 with the number of data columns you have. The +1 is an offset that brings ROW() on the first data row up to the SAME NUMBER as your number of data columns. So in my example, with 3 data columns and data starting on row 2, the ROW() function gives us 2, so we need to add 1 to get up to a total of 3. If I had 5 data columns, I would add 3 instead.
These two formulae should give you a set of row and column numbers. Copying the formula down will force the values to increase as needed, thus:
Row_ID | Column_ID | Value
1 | 1 |
1 | 2 |
1 | 3 |
2 | 1 |
...etc.
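To make the reset behaviour easier to check, here is a minimal Python sketch that replays the two helper formulas one worksheet row at a time. The function name and parameters are hypothetical, and it assumes headers in row 1, data starting in row 2, and at least two data columns.

```python
def row_col_counters(n_rows, n_cols, first_row=2):
    """Replay the Row_ID / Column_ID helper formulas row by row.

    Assumes the first data row holds the manually entered 1s (A2 and B2).
    """
    # Offset added to ROW(); this is the "+1" from the text when
    # n_cols == 3 and first_row == 2.
    offset = n_cols - first_row
    a_prev, b_prev = 1, 1
    out = [(a_prev, b_prev)]
    for sheet_row in range(first_row + 1, first_row + n_rows * n_cols):
        # =IF(MOD(ROW()+offset, n_cols)=0, 1, B_prev+1)
        b = 1 if (sheet_row + offset) % n_cols == 0 else b_prev + 1
        # =IF(B_cur<B_prev, A_prev+1, A_prev)
        a = a_prev + 1 if b < b_prev else a_prev
        out.append((a, b))
        a_prev, b_prev = a, b
    return out


if __name__ == "__main__":
    for pair in row_col_counters(n_rows=2, n_cols=3):
        print(pair)
```

With 2 rows and 3 columns this reproduces the Row_ID and Column_ID pairs listed above.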
Finally, to bring in the values, we'll use an OFFSET formula in the Value column: =OFFSET(Sheet1!$A$1,A2,B2) That formula starts from a reference cell - A1, in this case - then moves down x number of rows and across y number of columns to return a value. X and Y are provided by the formulas we already have. Your final structure will be something like this:
Row_ID | Column_ID | Value
1 | 1 |=OFFSET(...
=IF(...|=IF(MOD(...|=OFFSET(...
I hope all that made sense. Please let me know if there's anything that doesn't, and I'll try to troubleshoot.
EDITED TO ADD:
If the Row ID is something like a key that needs to be included with each value, we can get that fairly easily. We'll include another column with a slightly modified OFFSET formula: =OFFSET(Sheet1!$A$1,A2,0)
With this version of the formula we're not changing the column as we go down, just the row when it changes. It allows the values in the first column of the source table to be repeated in every row of the new table, so the ID repeats on each line of the output for the same item.
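For readers who want to see the whole transformation outside Excel, the OFFSET lookups can be sketched in Python. This is only an illustration: the function name is hypothetical, and the source table is assumed to be a list of rows whose first row is the header and whose first column is the ID.

```python
def unpivot(table):
    """Turn a matrix with a header row and an ID column into
    (Row_ID, Column_ID, Value) triples, row by row."""
    header = table[0]
    result = []
    for r in range(1, len(table)):
        for c in range(1, len(header)):
            # table[r][c] plays the role of OFFSET(Sheet1!$A$1, r, c);
            # table[r][0] plays the role of OFFSET(Sheet1!$A$1, r, 0).
            result.append((table[r][0], header[c], table[r][c]))
    return result


if __name__ == "__main__":
    sample = [
        ["ID", 1, 2, 3],
        [1, 0, 1, 0],
        [2, 1, 1, 1],
    ]
    for triple in unpivot(sample):
        print(triple)
```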

Related

SUM columns on the same sheet based on conditions or SUMIFS from another sheet

Here is a small sample table
+--------+-------+--------+
| COL 1 | COL 2 | COL 3 |
+--------+-------+--------+
| abc123 | Total | |
+--------+-------+--------+
| abc123 | cat1 | 100.00 |
+--------+-------+--------+
| abc123 | cat2 | 200.00 |
+--------+-------+--------+
| def123 | Total | |
+--------+-------+--------+
| def123 | cat1 | 100.00 |
+--------+-------+--------+
| def123 | cat2 | 200.00 |
+--------+-------+--------+
In COL 3, if COL 2 is "Total", I need to SUM everything in COL 3 for the rows that have the same value in COL 1 (e.g. the COL 3 Total row should be 300.00 for abc123 and 300.00 for def123). Otherwise, if COL 2 is NOT "Total", I need to do SUMIFS('Sheet3'!N:N,'Sheet3'!A:A,Sheet2!A473,'Sheet3'!Q:Q,Sheet2!Q473)*Sheet4!$U$2
How can I accomplish the first part, the SUM?
Edit:
I think my example is too rigid and appears like it is set.
Let me see if I can explain in more fluid terms. I will have to describe this somewhat in database terms. All of the columns are on one sheet for the purposes of the "Total" portion.
COL 1 is my partition. Each of the IDs in COL 1 consists of 57 rows. One of those 57 rows has "Total" in another column; in the example that is COL 2.
So I have a large table where COL 1 holds, say, 5 different IDs with 57 rows each, resulting in 285 rows.
Now I had a sorting function that would likely make this whole thing easier, but that function is crashing excel and not sorting both required sorts ( https://techcommunity.microsoft.com/t5/excel/sort-function-causes-a-crash-and-does-not-perform-secondary-sort/m-p/1477123#M66205 )
I suppose if I can get the sorting function to stop crashing Excel this becomes slightly easier, as "Total" is then consistently placed in row 2, 59, 116, etc. and I can add up everything below it. Right now, because that sort doesn't work, I have to add up everything from COL 3 that is NOT assigned to "Total" in COL 2 and has the same ID in COL 1.
So in the table above abc123 is 3 rows and I need to add up the two rows that are not total for abc123 and have the formula spit out 300 into COL 3 for total.
Then def123 needs the same treatment.
Here is the tough part: the sorting is inconsistent because the data comes from a Redshift query so it is random for each ID. The IDs themselves are in random order. I think I can get the sort for COL 1 to work without crashing excel, but the secondary sort with the custom order is crashing it.
One way to avoid the Circular Reference error when trying to Total a column is to use two Sums, one above and one below.
So, assuming that your Columns 1, 2 and 3 are A, B and C, and that data starts in Row 2 (Row 1 being a header), you need the Sum of cells above the current row:
SUMIFS(C$1:C1, A$1:A1, A2)
Plus the Sum of the cells below the current row:
SUMIFS(C3:INDEX(C:C, 1+COUNTA(A:A)), A3:INDEX(A:A, 1+COUNTA(A:A)), A2)
(Note that these ranges extend one row beyond the data at each end: the first starts at the header row, and the second ends one row below the last row of data.)
Put this together with an IF statement:
=IF(B2="Total", SUMIFS(C$1:C1, A$1:A1, A2) + SUMIFS(C3:INDEX(C:C, 1+COUNTA(A:A)), A3:INDEX(A:A, 1+COUNTA(A:A)), A2), EXISTING_FORMULA_HERE)
Alternatively, you could try writing an Array Formula to calculate the SUM directly, a bit like when using multiple conditions in a MATCH, something like this: (not enough information in the question to do this exactly)
=SUMPRODUCT('Sheet3'!N:N*(COUNTIFS(A:A,'Sheet3'!$A:A)>0)*(COUNTIFS(B:B,'Sheet3'!$Q:Q)>0))
(Sum of Sheet3!N:N when a row exists in the current sheet that matches columns Sheet3!A:A in Column A and Sheet3!Q:Q in Column B)
Note that working on Entire Columns with Array Formulae is quite slow, so you may want to limit those just to the Used Range
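The intent of the two-SUMIFS approach, summing every other row with the same ID regardless of sort order, can be sketched in Python as follows. The function name and the tuple layout (COL 1, COL 2, COL 3) are assumptions for illustration, and it assumes exactly one "Total" row per ID.

```python
from collections import defaultdict


def fill_totals(rows):
    """Return rows with each 'Total' row's value replaced by the sum of the
    non-Total values that share its ID. Row order does not matter."""
    sums = defaultdict(float)
    for key, label, value in rows:
        if label != "Total":
            sums[key] += value
    return [
        (key, label, sums[key] if label == "Total" else value)
        for key, label, value in rows
    ]


if __name__ == "__main__":
    data = [
        ("abc123", "Total", None),
        ("abc123", "cat1", 100.0),
        ("abc123", "cat2", 200.0),
        ("def123", "cat1", 100.0),
        ("def123", "Total", None),
        ("def123", "cat2", 200.0),
    ]
    for row in fill_totals(data):
        print(row)
```

Because the sums are keyed by ID, the unsorted Redshift output described in the question would not need to be sorted first.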

Sum All VLOOKUP Matches

I have a sheet where I am recording what I eat:
Another where I keep an index of values to lookup
I tried
=SUM(VLOOKUP('Sheet1'!A2:A11,'Sheet2'!A2:E11,2,FALSE))
but that only returned the first match, so then I tried
=SUMPRODUCT(SUMIF('Sheet1'!A2:A11,'Sheet2'!A2:A11,'Sheet2'!B2:B11))
but that isn't working either.
Does anyone have a solution where I can also multiply the value of each returned match by the number of servings in the first sheet?
Thanks!
If you want a single output of calories through SUMPRODUCT then you can use
=SUMPRODUCT(B2:B11*IFERROR(VLOOKUP(A2:A11,Sheet2!A2:B11,2,0),0))
If you are sure that all entries on Sheet 1 can be located on Sheet 2 then you can drop the IFERROR portion:
=SUMPRODUCT(B2:B11*VLOOKUP(A2:A11,Sheet2!A2:B11,2,0))
Beware that with the IFERROR version, if a value is not found on Sheet 2 the formula will silently understate the total, because IFERROR replaces the missing lookup with 0 and the serving quantity is then multiplied by 0.
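The difference between the two variants can be sketched in Python. The function name, the strict flag, and the calorie figures are hypothetical; only the food names and serving counts echo the example further down the page.

```python
def total_calories(log, calories, strict=False):
    """Sum servings * calories-per-serving over a food log.

    strict=False mirrors the IFERROR(..., 0) variant: unknown foods add 0.
    strict=True mirrors the plain VLOOKUP variant: unknown foods are an error.
    """
    total = 0.0
    for food, servings in log:
        if food not in calories:
            if strict:
                raise KeyError(food)
            continue  # a missing lookup contributes nothing to the total
        total += servings * calories[food]
    return total


if __name__ == "__main__":
    cals = {"egg": 70, "oatmeal": 150, "shrimp": 100}  # made-up values
    log = [("egg", 3), ("oatmeal", 1.5), ("shrimp", 2)]
    print(total_calories(log, cals))
```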
I combined the 2 tables into one sheet, with Table 1 in columns A and B and Table 2 in columns D and E.
In G2, "Total Serving Calories", enter this formula:
=SUMPRODUCT(VLOOKUP(T(IF({1},A2:A12)),D2:E12,2,FALSE)*B2:B12)
It's not super-clear what you're trying to get at. But defining the "Calories Per Serving" in a range called "cals",
+---+---------+-----+--------------------------------+
| | A | B | C |
+---+---------+-----+--------------------------------+
| 1 | egg | 3 | =(VLOOKUP(A2,cals,2,FALSE))*B2 |
| 2 | oatmeal | 1.5 | =(VLOOKUP(A3,cals,2,FALSE))*B3 |
| 3 | shrimp | 2 | =(VLOOKUP(A4,cals,2,FALSE))*B4 |
+---+---------+-----+--------------------------------+

SUMIF minimum-so-far condition

I am trying to set up a smart conditional summing within Excel. But the range of functions available doesn't appear to provide what I am looking for.
I have two columns of numbers. In A, I have what we'll call indentation levels. In B, I have values.
For any particular row that has child indentations, I want to use a formula in B that will calculate the sum of values in B from the next row down to the next instance of that row's A value, including a row only if its value in A is the minimum seen so far in that range.
e.g.
row | A | B | calc'd
--------------------
1 | 0 | 9 | y
2 | 2 | 2 |
3 | 1 | 7 | y
4 | 2 | 3 |
5 | 2 | 4 |
6 | 0 | 5 | y
7 | 1 | 5 |
So, for row 1, the sum range will be rows 2 through 5. This part, I can do with an OFFSET MATCH.
The SUMIF should include row 2, as A2 is the minimum value in A2:A2.
Likewise, it should include row 3, as A3 is the minimum value in A2:A3.
But it should not include rows 4 or 5 in the sum, because their A-column values are not the minimum "so far". (These values have already been "summed up" into row 3.)
How do I create a ranged sumif with this "minimum-so-far" condition?
I found a solution to this. Not quite as clean as I wanted, but it does the trick:
Add a helper value to column C, which is the "parent row" that each row will sum into.
For example, in C5: {=MAX(ROW(A$1:A4)*(A$1:A4<A5))} (note: array formula).
Making the final equation for B1 =SUMIF(C2:Cn,"="&ROW(),B2:Bn) (where n is the upper limit of the working range).
As written here, inserting lines messes things up, but it can all be expanded with OFFSETs for row insertability.
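As a rough check of the helper-column idea, here is a Python sketch using the sample data from the question. It reads the helper as "the last earlier row whose level is lower than this row's level", which is an interpretation of the array formula rather than something the answer states outright. Function names are hypothetical and rows are numbered from 1 as in the sheet.

```python
def parent_rows(levels):
    """For each row, find the 1-based number of the last earlier row with a
    strictly lower level, or 0 when there is none (the column C helper)."""
    parents = []
    for i, level in enumerate(levels):
        parent = 0
        for j in range(i):
            if levels[j] < level:
                parent = j + 1
        parents.append(parent)
    return parents


def child_sum(levels, values, row):
    """Sum the values of the rows whose parent is `row`, like
    SUMIF(C:C, ROW(), B:B) placed in that row."""
    parents = parent_rows(levels)
    return sum(v for p, v in zip(parents, values) if p == row)


if __name__ == "__main__":
    levels = [0, 2, 1, 2, 2, 0, 1]
    values = [9, 2, 7, 3, 4, 5, 5]
    print(parent_rows(levels))
    print(child_sum(levels, values, 1))
```

Rows 4 and 5 get row 3 as their parent, so they are left out of the sum for row 1, which is the "minimum-so-far" condition from the question.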

Add page series to every second row

I have list of ID extracted from different page numbers. I wanted to add page number to every second row in Excel as shown below:
ID Number | Page Number
1 | 1
2 | 1
3 | 2
4 | 2
5 | 3
6 | 3
7 | 4
8 | 4
9 | 5
10 | 5
Is there any way to do it?
Simply use this in column B:
=TRUNC((ROW()-x)/2)+1
where x is the row the data starts on.
Or, when matching on the ID:
=TRUNC((A2-1)/2)+1
and then auto-fill down (in the second case, A2 is the ID for that page, so the order of the IDs doesn't matter)
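The ID-based version is plain integer division, which a short Python sketch makes easy to verify. The function name and the per_page parameter are hypothetical; per_page=2 matches the question.

```python
def page_for_id(id_number, per_page=2):
    """Page number for a 1-based ID when each page holds per_page IDs.

    Equivalent to =TRUNC((A2-1)/2)+1 for positive IDs.
    """
    return (id_number - 1) // per_page + 1


if __name__ == "__main__":
    for id_number in range(1, 11):
        print(id_number, page_for_id(id_number))
```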
If you enter values in the first 2 Page Number rows and then set the 3rd row to the value of the first row plus 1, that formula can be copied down for all remaining rows
| |A|
|1|1|
|2|1|
|3|=A1+1|
|4|=A2+1|
|5|=A3+1|
Enter the following formula into cell B2 and copy down the column:
=ROUND(A2/2, 0)
where A2 is a value from the ID Number column.

Counting the number of older siblings in an Excel spreadsheet

I have a longitudinal spreadsheet of adolescent growth.
ID | CollectionDate | DOB | MOTHER ID | Sex
1 | 1Aug03 | 3Apr90 | 12 | 1
1 | 4Sept04 | 3Apr90 | 12 | 1
1 | 1Sept05 | 3Apr90 | 12 | 1
2 | 1Aug03 | 21Dec91 | 12 | 0
2 | 4Sept04 | 21Dec91 | 12 | 0
2 | 1Sept05 | 21Dec91 | 12 | 0
3 | 1Aug03 | 30Jan89 | 23 | 0
3 | 4Sept04 | 30Jan89 | 23 | 0
This is a sample of how my data is formatted and some of the variables that I have. As you can see, since it is longitudinal, each individual has multiple measurements. In the actual database there are over 10 measurements per individual and over 250 individuals.
What I am wanting to do is input a value signifying the number of older brothers and older sisters each individual has. That is why I have included the Mother ID (because it represents genetic relatedness) and sex. These new variable columns would just say how many older siblings of each sex each individual has. Is there a formula that I could use to do this quickly?
=COUNTIFS($B:$B,"<>"&$B2,$H:$H,$H2,$AI:$AI,$AI2,$J:$J,"<"&$J2)
Create a column named Distinct with this formula
=1/COUNTIF([ID],[@ID])
Then you can find all the older 0-sexed siblings like this
=SUMPRODUCT(([DOB]<[@DOB])*([MOTHERID]=[@MOTHERID])*([Sex]=0)*([Distinct]))
Note that I made the data a Table and used table notation. If you're not familiar, [COLUMNNAME] refers to the whole column and [@COLUMNNAME] refers to the value in that column on the current row. It's similar to saying $A:$A and A2 if you're dealing with column A.
The first formula gives you a weight that always adds up to 1 for a particular ID. So ID=1 has three lines and Distinct will be 0.33333 for each line. When you add up the three lines you get 1. This is similar to a SELECT DISTINCT in SQL parlance.
The SUMPRODUCT formula sums [Distinct] for every row where the DOB is earlier than the current DOB, the Mother is the same as the current Mother, and the Sex is zero.
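The weighting trick, where each of an individual's rows counts as 1 divided by that individual's row count, can be sketched in Python. The function name and tuple layout (ID, DOB, mother ID, sex) are assumptions, "older" is taken to mean an earlier date of birth, and exact fractions are used so the weights add up to whole numbers.

```python
from collections import Counter
from datetime import date
from fractions import Fraction


def older_siblings(rows, sex):
    """For each row, count distinct siblings (same mother) of the given sex
    who were born earlier, weighting every row by 1 / rows-per-ID."""
    rows_per_id = Counter(r[0] for r in rows)
    result = []
    for _id, dob, mother, _sex in rows:
        total = sum(
            Fraction(1, rows_per_id[o_id])
            for o_id, o_dob, o_mother, o_sex in rows
            if o_dob < dob and o_mother == mother and o_sex == sex
        )
        result.append(int(total))
    return result


if __name__ == "__main__":
    data = (
        [(1, date(1990, 4, 3), 12, 1)] * 3
        + [(2, date(1991, 12, 21), 12, 0)] * 3
        + [(3, date(1989, 1, 30), 23, 0)] * 2
    )
    print(older_siblings(data, sex=1))
```

Calling the function once per sex value gives the two new columns the question asks for.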
I have a possible solution. It involves adding two columns -- One for "# older siblings" and one for "unique?". So here are all the headings I have currently:
A -- ID
B -- CollectionDate
C -- DOB
D -- MOTHER ID
E -- Sex
F -- # older siblings
G -- unique?
In G2, I added the following formula:
=IF(A2=A1,0,1)
And dragged down. As long as the data is sorted by ID, this will only display "1" once for each unique person.
In F2, I added the following formula:
=COUNTIFS(G:G,"=1",D:D,"="&D2,C:C,"<"&C2)
And dragged down. It seemed to work correctly for the sample data you provided.
The stipulations are:
You would need the two columns.
The data would need to be sorted by ID
I hope this helps.
You need a formula like this (for example, for row 2):
=COUNTIFS($A:$A,"<>"&$A2,$E:$E,$E2,$D:$D,$D2,$C:$C,"<"&$C2)
Assuming E:E is column for sex, D:D is column for mother ID and C:C is column for DOB.
Write this formula in H2 cell for example and drag it down.
