Counting the number of older siblings in an Excel spreadsheet - excel

I have a longitudinal spreadsheet of adolescent growth.
ID | CollectionDate | DOB | MOTHER ID | Sex
1 | 1Aug03 | 3Apr90 | 12 | 1
1 | 4Sept04 | 3Apr90 | 12 | 1
1 | 1Sept05 | 3Apr90 | 12 | 1
2 | 1Aug03 | 21Dec91 | 12 | 0
2 | 4Sept04 | 21Dec91 | 12 | 0
2 | 1Sept05 | 21Dec91 | 12 | 0
3 | 1Aug03 | 30Jan89 | 23 | 0
3 | 4Sept04 | 30Jan89 | 23 | 0
This is a sample of how my data is formatted and some of the variables that I have. As you can see, since it is longitudinal, each individual has multiple measurements. In the actual database there are over 10 measurements per individual and over 250 individuals.
What I am wanting to do is input a value signifying the number of older brothers and older sisters each individual has. That is why I have included the Mother ID (because it represents genetic relatedness) and sex. These new variable columns would just say how many older siblings of each sex each individual has. Is there a formula that I could use to do this quickly?
=COUNTIFS($B:$B,"<>"&$B2,$H:$H,$H2,$AI:$AI,$AI2,$J:$J,"<"&$J2)

Create a column named Distinct with this formula:
=1/COUNTIF([ID],[@ID])
Then you can count the older siblings whose Sex is 0 like this:
=SUMPRODUCT(([DOB]<[@DOB])*([MOTHERID]=[@MOTHERID])*([Sex]=0)*([Distinct]))
Note that I made the data a Table and used table notation. If you're not familiar with it, [COLUMNNAME] refers to the whole column and [@COLUMNNAME] refers to the value in that column on the current row. It's similar to saying $A:$A and A2 if you're dealing with column A. The formulas assume the mother column header is written without a space (MOTHERID).
The first formula gives you a value to count that will always add up to 1 for a particular ID. So ID=1 has three lines and Distinct will result in 0.33333 for each line. When you add up the three lines you get 1. This is similar to a SELECT DISTINCT in SQL parlance.
The SUMPRODUCT formula sums [Distinct] for every row where the DOB is earlier than the current DOB (that is, an older sibling), the Mother is the same as the current Mother, and the Sex is zero.
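If it helps to see the weighting idea outside Excel, here is a minimal Python sketch of the same logic, using the sample rows from the question. It takes "older" to mean an earlier DOB; the names rows, rows_per_id and older_siblings are made up for this illustration.

```python
from collections import Counter
from datetime import date

# Sample rows from the question: (ID, DOB, mother ID, sex), one row per measurement.
rows = (
    [(1, date(1990, 4, 3), 12, 1)] * 3
    + [(2, date(1991, 12, 21), 12, 0)] * 3
    + [(3, date(1989, 1, 30), 23, 0)] * 2
)

# Equivalent of the Distinct column: each row weighs 1 / (number of rows for its ID).
rows_per_id = Counter(r[0] for r in rows)


def older_siblings(current, sex):
    """Sum the weights of rows with the same mother, the given sex code and an earlier DOB."""
    _, dob, mother, _ = current
    total = sum(
        1 / rows_per_id[i]
        for (i, d, m, s) in rows
        if m == mother and s == sex and d < dob
    )
    # The weights for one person add up to 1, so rounding only removes float noise.
    return round(total)


print(older_siblings(rows[3], 1))  # ID 2: older siblings with sex code 1
```

Because every person's rows add up to exactly one, the number of measurements per individual does not inflate the count.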

I have a possible solution. It involves adding two columns -- One for "# older siblings" and one for "unique?". So here are all the headings I have currently:
A -- ID
B -- CollectionDate
C -- DOB
D -- MOTHER ID
E -- Sex
F -- # older siblings
G -- unique?
In G2, I added the following formula:
=IF(A2=A1,0,1)
And dragged down. As long as the data is sorted by ID, this will only display "1" once for each unique person.
In F2, I added the following formula:
=COUNTIFS(G:G,"=1",D:D,"="&D2,C:C,"<"&C2)
And dragged down. It seemed to work correctly for the sample data you provided.
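For anyone who wants to check the two helper columns by hand, this is a rough Python sketch of what G and F compute on the sample data. It assumes the rows are already sorted by ID; the variable names are only for illustration.

```python
from datetime import date

# (ID, DOB, mother ID), sorted by ID, one row per measurement.
rows = (
    [(1, date(1990, 4, 3), 12)] * 3
    + [(2, date(1991, 12, 21), 12)] * 3
    + [(3, date(1989, 1, 30), 23)] * 2
)

# Column G: 1 on the first row of each ID, 0 on the repeats (needs the sort by ID).
unique = [
    1 if i == 0 or rows[i][0] != rows[i - 1][0] else 0
    for i in range(len(rows))
]

# Column F: number of flagged rows with the same mother and an earlier DOB.
older = [
    sum(u for u, (_, d, m) in zip(unique, rows) if m == mother and d < dob)
    for (_, dob, mother) in rows
]

print(unique)
print(older)
```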
The stipulations are:
You would need the two columns.
The data would need to be sorted by ID
I hope this helps.

You need a formula like this (for example, for row 2):
=COUNTIFS($A:$A,"<>"&$A2,$E:$E,$E2,$D:$D,$D2,$C:$C,"<"&$C2)
This assumes A:A is the column for ID, E:E is the column for sex, D:D is the column for mother ID and C:C is the column for DOB.
Write this formula in H2 cell for example and drag it down.

Related

How to extract multiple rows that meet a criteria which is given by 2 drop-down lists in EXCEL

I have a sheet that looks like this:
A | B | C | D | E | F
1 NAME | TASK | ADRESS | ORDER_GIVER | COUNT | NOTE
2 DROPDOWN_2 | move | NY, xy_street | Ann | 1 | ...
3 DROPDOWN_2 | fill | CA, yx_street | Rose | 3 | ...
...
100 NAME | TASK | ADRESS | ORDER_GIVER | COUNT | NOTE
101 DROPDOWN_1
102
103 NAME | TASK | ADRESS | ORDER_GIVER | COUNT | NOTE
104 DROPDOWN_1
INITIALLY:
In rows 1-99 you find the tasks with 1 column empty (NAME).
In rows 100+ you find "Tickets" which can be printed (2 rows for example 100-101)
THEN
1, The ORGANISER (me) makes tickets with names, by ctrl+c/ctrl+v the "ticket structure" and by choosing a name from the DROPDOWN_1 list.
2, Then starts to assign the tasks (row 1-99) to people by choosing them from the DROPDOWN_2 list. (note that dropdown name lists contain the same names.)
After this I would like Excel to fill in the tickets with the rows that contain the same name as the ticket. One person can be assigned to several tasks, but one task can only be assigned to one person. (So tickets can have 1 NAME but more rows, depending on the 1-99 list.)
I am asking you to help me make a formula or function for this "autofill" of tickets because I have been searching for days for a solution however couldn't find a proper one.
In the Similar problems and solutions section you can find 2 links which had the closest answer. Unfortunately neither of them contain dropdown lists. I tried to solve the problem with INDEX(MATCH()) functions, but the problem is that it cannot handle the changes of names.
Thank you,
Max
Similar problems and solutions:
https://www.get-digital-help.com/2009/09/28/extract-all-rows-from-a-range-that-meet-criteria-in-one-column-in-excel/
Extracting all rows based on a value of cell without VBA
Select A101:F392 and enter this as an array formula (ctrl+shift+enter):
=IFERROR(INDEX(A1:F99,ROUND(MOD(SMALL(IFERROR(CHOOSE({1,2},SMALL(IFERROR(1/(1/MMULT(IF(SMALL(COUNTIF(A2:A99,"<="&A2:A99),ROW(INDIRECT("2:98")))=SMALL(COUNTIF(A2:A99,"<="&A2:A99),ROW(INDIRECT("1:97"))),0,ROW(A2:A98)),{1,1}))+{0.001,-0.001},FALSE),ROW(INDIRECT("1:196"))),COUNTIF(A2:A99,"<="&A2:A99)+ROW(A2:A99)/1000),FALSE),ROW(INDIRECT("1:292"))),1)*1000,0),{1,2,3,4,5,6}),"")

Excel spreadsheet check two conditions along different rows & columns

Hey, I'm really stuck on this one. Basically I'm trying to get a result of either 'hired' or 'available' (or 0 and 1 to represent them) in B2, based on BOTH of the conditions in B1 and A2, by looking at the log table A6:B10 of when they are 'hired out'. I've tried VLOOKUP and many IF functions, but neither quite works correctly.
One option here, though perhaps not the most graceful, would be to use VLOOKUP with the car and date as the key. The issue here is that there are multiple lookup values, namely the car and the particular date. To get around this, you could create a new column C which contains the concatenated car and date, e.g.
A | B | C | D
6 Car 1 | 03-01-2017 | Car 1 03-01-2017 | 0
7 Car 2 | 03-01-2017 | Car 2 03-01-2017 | 0
8 Car 3 | 04-01-2017 | Car 3 04-01-2017 | 0
9 Car 4 | 05-01-2017 | Car 4 05-01-2017 | 0
10 Car 2 | 06-01-2017 | Car 2 06-01-2017 | 0
To create column C, simply enter the following formula into cell C6 and then copy it down the entire column:
=CONCATENATE(A6, " ", B6)
Now you can use the following VLOOKUP formula in cell B2 to calculate whether a given ride is available on a certain date:
=IFERROR(VLOOKUP(A2&" "&B1,C6:D10,2,FALSE), 1)
Here I have hard-coded in column D the value 0 for every entry, to represent that these are rides which are already hired-out for that particular car and date. If our VLOOKUP formula finds a match, then it means the ride is hired-out. If VLOOKUP does not find a match, it would throw an error, in which case we display 1 to indicate availability.
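The same composite-key idea can be sketched in Python, in case it makes the lookup easier to follow. This is only an illustration: the dates are kept as plain text so that the joined key matches exactly, and hire_log, status_by_key and availability are names invented for the example.

```python
# Log of hired-out cars from the example table (rows 6 to 10).
hire_log = [
    ("Car 1", "03-01-2017"),
    ("Car 2", "03-01-2017"),
    ("Car 3", "04-01-2017"),
    ("Car 4", "05-01-2017"),
    ("Car 2", "06-01-2017"),
]

# Helper column C is the key (car and date joined by a space); column D is 0 for "hired".
status_by_key = {f"{car} {day}": 0 for car, day in hire_log}


def availability(car, day):
    """Return 0 if the car is hired out on that day, otherwise 1 (the IFERROR fallback)."""
    return status_by_key.get(f"{car} {day}", 1)


print(availability("Car 2", "06-01-2017"))
print(availability("Car 2", "04-01-2017"))
```

The key built for the lookup has to be formatted exactly like the key in the helper column, otherwise no match is found and the fallback value is returned.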

How to get two+ rows to link together? Excel 2010 (Example)

I have a parts list with competitor pricing. One part number brings multiple brands up with the location of the company.
As you can see from the picture, I have part numbers for one item with three companies. I want to sort by part type. So for example I want to list only the brake pads. When I do this the blanks get sent to the bottom, but the blanks are not really blanks because they have additional info with them for that part number.
Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7
Part No | Company A | Price | Company B | Price | Company C | Price
4656546 | Brand A | $5 | Brand A | $5 | Brand A | $5
(BLANK) | Brand b | $8 | Brand b | $8 | Brand b | $8
I have tried to use a helper column, but I have 1,000+ rows.
Does anyone know if you can link or have a relationship between two+ rows?
I hope you understand; if not, I can try to explain better.
I assume that a "blank" in PartNo means "take the PartNo from the cell above" ...
In order to normalize the PartNo (= get rid of the blanks) use another PartNo-Normalized column (e.g. [K:K]) and normalize as following:
K1 ="PartNo-Normalized"
K2:Kxx =IF(A2<>"",A2,K1)
Next convert all formulas in [K:K] into values !!! (Copy / PasteAs - Values) before sorting ... as a sort operation will destroy the calculated values.
After conversion to values it's safe to sort, and you may create a filter on that column.
Depending on how well organized your data is, it might be a good idea to add one more column and fill it with 1, 2, 3, 4, 5 ... before any sorting so you can restore the original sort order just in case something nasty happens.
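As a quick illustration of what the normalized column ends up holding, here is a small Python sketch of the fill-down rule. The second part number is made up for the example, and fill_down is just an illustrative name.

```python
def fill_down(values):
    """Replace each blank with the last non-blank value above it."""
    filled, last = [], ""
    for value in values:
        if value != "":
            last = value
        filled.append(last)
    return filled


# Column A as it appears in the sheet: a blank means "same part as the row above".
part_numbers = ["4656546", "", "", "7781230", ""]  # 7781230 is a hypothetical part number

print(fill_down(part_numbers))
```

Once every row carries its own part number, sorting or filtering can no longer separate a row from the part it belongs to.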

Transpose just some columns in excel

I have a worksheet with columns similar to the below
name | id | contact | category | week 1 | week 2 | week 3 | ... |week 52
What I need to do is transpose the 'week' columns into rows, so I end up with:
name | id | contact | category | week
With an entry for each week as a row in the s/sheet - thus making a long list on rows with the column data for each week.
example current format:
jones | 12345 | simon | electronics | 100 | 120| 130| 110 | ..........150
Required format
jones | 12345 | simon | electronics | 100
jones | 12345 | simon | electronics | 120
jones | 12345 | simon | electronics | 130
jones | 12345 | simon | electronics | 110
...
jones | 12345 | simon | electronics | 150
I have tried the usual excel transpose (via paste) but cannot get the first few columns to stay static, whilst transposing the week columns
Ideally I would like to achieve this within excel, but I can import the data into a mysql database and use that if the solution would be easier that way
Hope this makes sense
[added examples]
I would do the work on a second sheet, which uses the INDIRECT function to do the lookups for you:
http://www.excelfunctions.net/Excel-Indirect-Function.html
Start by setting up some indexes on the new sheet - we will use these to indirectly look up into the original sheet and pull the data across.
I would count up to 52 again and again in column A, starting with a 1 in A2, and using the formula below in A3 (copied down):
=if(A2=52,1,A2+1)
This would be my count of the weeks per person.
In column B, I would count my people, starting with a 1 in B2, and using this formula in B3 (copied down):
=if(A3=1,B2+1,B2)
This gives me the row and column offsets to use in the INDIRECT function to fetch the data from your original sheet.
Now the fun part - matching these row and column offsets to your actual data.
Let's assume your original data is in a sheet called "original". This is where we need to look up the data.
We will map the original column A into the new sheet's column C. So C2 can hold this formula:
=indirect("original!R"&($B2+1)&"C1",false)
What you are doing there is looking in the row that you calculated in the B column (formula above), and looking in the first column of that row (i.e. column A) - this is where the Name is stored.
Similarly, the "id", "contact" and "category" columns get mapped to new sheet columns D, E, F, using modifications of that formula:
=indirect("original!R"&($B2+1)&"C2",false)
=indirect("original!R"&($B2+1)&"C3",false)
=indirect("original!R"&($B2+1)&"C4",false)
Only the column offset gets changed in these updates.
To pull the weekly data across, we use a similar formula; the difference is that now we get to use the newly calculated column A, where we counted up from 1 to 52 over and over.
So G2 becomes:
=indirect("original!R"&($B2+1)&"C"&(4+$A2),false)
Copy this all down as far as you need, and hide columns A and B.
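To make the row and column arithmetic easier to check, here is a short Python sketch of the same indexing. It uses 3 weeks instead of 52 to stay readable, and the second data row is invented for the example.

```python
WEEKS = 3  # 52 in the real sheet; shortened here

original = [
    ["name", "id", "contact", "category", "week 1", "week 2", "week 3"],
    ["jones", 12345, "simon", "electronics", 100, 120, 130],
    ["smith", 67890, "anna", "toys", 10, 20, 30],  # hypothetical second person
]


def unpivot(sheet, weeks):
    """Return one row per person and week, mirroring helper columns A and B."""
    long_rows = []
    people = len(sheet) - 1  # every row except the header
    for k in range(people * weeks):
        week = k % weeks + 1     # column A: 1..weeks, over and over
        person = k // weeks + 1  # column B: goes up each time the week count restarts
        source = sheet[person]   # sheet row B+1, which is index B when counting from 0
        # Fixed columns 1-4, then sheet column 4+A (index 3+A when counting from 0).
        long_rows.append(source[:4] + [source[3 + week]])
    return long_rows


for row in unpivot(original, WEEKS):
    print(row)
```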

Count number of rows where multiple criteria are met

I'm trying to generate a table that shows a count of how many items are in any given status on any given day. My result table has a set of Dates down column A and column headers are various statuses. A sample of my data table with headers looks like this:
Product | Notice | Assigned | Complete | In Office | In Accounting
1 | 5/5/13 | 5/7/13 | 5/9/13 | 5/10/13 | 5/11/13
2 | 5/5/13 | 5/6/13 | 5/8/13 | 5/9/13 | 5/10/13
3 | 5/6/13 | 5/9/13 | 5/10/13 | 5/10/13 | 5/10/13
4 | 5/4/13 | 5/5/13 | 5/7/13 | 5/8/13 | 5/9/13
5 | 5/7/13 | 5/8/13 | 5/10/13 | 5/11/13 | 5/11/13
If my output table were to contain a set of dates in the first column with the statuses as headers, I need a count of how many rows were at the given status and had not yet transitioned to the next status so that in the Notice column, I'd have a count of rows where the Notice Date was <= X AND where the Assigned, Complete, In Office, In Accounting are all greater than X.
I've used a Sum(if(frequency(if statement to get me REALLY close but I feel like I need to have an AND statement within the second IF like this =SUM(IF(FREQUENCY(IF(AND
Here's what I have that won't work:
=SUM(IF(FREQUENCY(IF(AND(Table1[Assigned]<=A279,Table1[[Complete]:[In Accounting]]<=A279),ROW(Table1[[Complete]:[In Accounting]])),ROW(Table1[[Complete]:[In Accounting]]))>0,1))
If I take the "AND" portion out, this works fine except I need it to ONLY count rows where the given status actually has a date so if an "Assigned" date is empty, I don't want that row to be counted for the Assigned column.
Here's an example of what I'd expect to see in the results. I've listed the count in the each column as well as the corresponding product numbers in parenthesis. The corresponding product numbers are for reference only and won't actually be in the result table.
Date | Notice | Assigned | Complete
5/6 | 2 (1,3) | 2 (2,4) | 0
5/7 | 2 (3,5) | 2 (1,2) | 1 (4)
5/8 | 1 (3) | 2 (1,5) | 1 (2)
OK, assuming you have the original data in A1:F6 then with 2nd table headers in B9:D9 and row labels in A10:A12 then you can use this "array formula" in B10
=SUM((B$2:B$6<=$A10)*(MMULT((C$2:$F$6>$A10)+(C$2:$F$6=""),TRANSPOSE(COLUMN(C$2:$F$6)^0))=COLUMNS(C$2:$F$6)))
confirmed with CTRL+SHIFT+ENTER and copied down and across (see screenshot below)
As you can see the results are as per your requirement. If you replace dates with blanks it will still work
MMULT is a way to get a single value from each row even when you are looking at multiple columns.
I used cell references because I think that's easier, especially when copying the formula across and having a reducing range.......but you can use structured references if you want
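If you want to sanity-check the counting rule outside Excel, here is a minimal Python sketch of it on the sample data, with the year assumed to be 2013. A row counts for a status when that status's date is on or before the day and every later status date is either after the day or empty. The sketch also requires the status's own date to be present, which is what the question asks for; count_in_status and the other names are illustrative only.

```python
from datetime import date

STATUSES = ["Notice", "Assigned", "Complete", "In Office", "In Accounting"]


def d(month, day):
    """Shorthand for a 2013 date, matching the sample table."""
    return date(2013, month, day)


# One list per product, in STATUSES order; None would stand for an empty cell.
table = [
    [d(5, 5), d(5, 7), d(5, 9), d(5, 10), d(5, 11)],
    [d(5, 5), d(5, 6), d(5, 8), d(5, 9), d(5, 10)],
    [d(5, 6), d(5, 9), d(5, 10), d(5, 10), d(5, 10)],
    [d(5, 4), d(5, 5), d(5, 7), d(5, 8), d(5, 9)],
    [d(5, 7), d(5, 8), d(5, 10), d(5, 11), d(5, 11)],
]


def count_in_status(status, on_day):
    """Count rows that reached `status` by `on_day` and no later status yet."""
    col = STATUSES.index(status)
    count = 0
    for row in table:
        reached = row[col] is not None and row[col] <= on_day
        later_pending = all(x is None or x > on_day for x in row[col + 1:])
        if reached and later_pending:
            count += 1
    return count


print(count_in_status("Notice", d(5, 6)))
```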
Have you tried using COUNTIFS to count based on multiple criteria? It is fairly well documented here: http://office.microsoft.com/en-us/excel-help/countifs-function-HA010047494.aspx (2007+ only)
Basically, you use it like
=COUNTIFS(first_range_to_check, value_you_want_in_first_range, ...)
where the ... represents as many range/criteria pairs as you want (up to 127 pairs in total). Note that the conditions are combined with AND, so if you have two pairs, both the first pair AND the second pair must be true for a row to be counted.

Resources