Expand a data set using two columns - excel

In Excel, I have two columns of data that I wish to combine.
Current set of data:
+---------+---------+
| column1 | column2 |
+---------+---------+
| a | 1 |
| b | 2 |
| c | 3 |
| d | 4 |
| | 5 |
| | 6 |
| | 7 |
+---------+---------+
For each value in column1, I need to assign all of the values in column2 so it looks like this:
+---------+---------+
| column1 | column2 |
+---------+---------+
| a | 1 |
| a | 2 |
| a | 3 |
| a | 4 |
| a | 5 |
| a | 6 |
| a | 7 |
+---------+---------+
| b | 1 |
| b | 2 |
| b | 3 |
| b | 4 |
| b | 5 |
| b | 6 |
| b | 7 |
+---------+---------+
| c | 1 |
| c | 2 |
| c | 3 |
| c | 4 |
| c | 5 |
| c | 6 |
| c | 7 |
+---------+---------+
| d | 1 |
| d | 2 |
| d | 3 |
| d | 4 |
| d | 5 |
| d | 6 |
| d | 7 |
+---------+---------+
How can I do this?
Do I need to find a macro/VB solution?

Since this seems unlikely to receive any other answer, here is a formula-only approach (no macro needed):
in A1: a
in B1: =MOD(ROW()-1,7)+1
in A2: =IF(MOD(ROW()-1,7)>0,CHAR(CODE(A1)),CHAR(CODE(A1)+1))
Copy both formulae down to suit.
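The expansion asked for is a Cartesian product of the two columns. As a cross-check outside Excel, here is a minimal Python sketch. The list names `column1` and `column2` are made up for illustration, and the index arithmetic only mirrors the MOD(ROW()-1,7) idea in the formulas above; it is not part of the original answer.

```python
from itertools import product

# Sample data from the question (names are illustrative assumptions).
column1 = ["a", "b", "c", "d"]
column2 = [1, 2, 3, 4, 5, 6, 7]

# Every value of column1 paired with every value of column2 (4 * 7 = 28 rows).
expanded = list(product(column1, column2))

# Same result via row arithmetic, like the worksheet formulas:
# 0-based row r uses column1[r // 7] and column2[r % 7].
n = len(column2)
by_index = [(column1[r // n], column2[r % n]) for r in range(len(column1) * n)]

for letter, number in expanded[:8]:
    print(letter, number)
```

The eighth printed row is the first row of the "b" block, which is where the formula in column A switches to the next letter.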

Related

Unique count of values in column per month

Excel-Table:
| A | B | C | D | E | F | G |
-----|----------------|-----------------|------------------|--------|---------|---------|---------|-----
1 | month&year | date | customer | | 2020-01 | 2020-03 | 2020-04 |
-----|----------------|-----------------|------------------|--------|---------|---------|---------|-----
2 | 2020-01 | 2020-01-10 | Customer A | | 3 | 2 | 4 |
3 | 2020-01 | 2020-01-14 | Customer A | | | | |
4 | 2020-01 | 2020-01-17 | Customer B | | | | |
5 | 2020-01 | 2020-01-19 | Customer B | | | | |
6 | 2020-01 | 2020-01-23 | Customer C | | | | |
7 | 2020-01 | 2020-01-23 | Customer B | | | | |
-----|----------------|-----------------|------------------|--------|---------|---------|---------|-----
8 | 2020-03 | 2020-03-18 | Customer E | | | | |
9 | 2020-03 | 2020-03-19 | Customer A | | | | |
-----|----------------|-----------------|------------------|--------|---------|---------|---------|-----
10 | 2020-04 | 2020-04-04 | Customer B | | | | |
11 | 2020-04 | 2020-04-07 | Customer C | | | | |
12 | 2020-04 | 2020-04-07 | Customer A | | | | |
13 | 2020-04 | 2020-04-07 | Customer E | | | | |
14 | 2020-04 | 2020-04-08 | Customer A | | | | |
15 | 2020-04 | 2020-04-12 | Customer A | | | | |
16 | 2020-04 | 2020-04-15 | Customer B | | | | |
17 | |
In my Excel file I want to calculate the unique count of customers per month, as you can see in cells E2:G2.
I already inserted column A as a helper column, which extracts only the month and the year from the date in column B.
Therefore, the date formatting is the same as in the timeline in cells E1:G1.
I guess the formula to get the unique count per month is somehow related to =COUNTIFS($A:$A,E$1) but I have no clue how to modify this formula to get the expected values.
Do you have any idea?
Here's one approach that works in Office 365, or in any version that has UNIQUE (enter it in E2 and fill right):
=COUNTA(UNIQUE(IF($A$2:$A$16=E$1,$C$2:$C$16,""),,FALSE))-1
The -1 removes the empty string that IF returns for rows from other months.
For older versions, the following works when entered as an array formula with CTRL+SHIFT+ENTER:
=SUM(--(FREQUENCY(IFERROR(MATCH($A$2:$A$16&$C$2:$C$16,E$1&$C$2:$C$16,0),"a"),MATCH($A$2:$A$16&$C$2:$C$16,E$1&$C$2:$C$16,0))>0))
You can also do it without any helper column:
=SUM(--(UNIQUE(FILTER($C$2:$C$16,TEXT($B$2:$B$16,"yyyy-mm")=E$1))<>""))
For older versions of Excel, use the formula below with your helper column:
=SUMPRODUCT(--($A$2:$A$16=E$1)*(1/COUNTIFS($A$2:$A$16,$A$2:$A$16,$C$2:$C$16,$C$2:$C$16)))
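To check the expected values (3, 2, 4) independently of any formula, the same count can be sketched in Python. This is only a verification aid under the assumption that the helper column and the customer column are read as (month, customer) pairs; the variable names are invented for the example.

```python
from collections import defaultdict

# (month&year, customer) pairs copied from rows 2-16 of the question's table.
rows = [
    ("2020-01", "Customer A"), ("2020-01", "Customer A"),
    ("2020-01", "Customer B"), ("2020-01", "Customer B"),
    ("2020-01", "Customer C"), ("2020-01", "Customer B"),
    ("2020-03", "Customer E"), ("2020-03", "Customer A"),
    ("2020-04", "Customer B"), ("2020-04", "Customer C"),
    ("2020-04", "Customer A"), ("2020-04", "Customer E"),
    ("2020-04", "Customer A"), ("2020-04", "Customer A"),
    ("2020-04", "Customer B"),
]

# Collect the distinct customers seen in each month.
customers_by_month = defaultdict(set)
for month, customer in rows:
    customers_by_month[month].add(customer)

# The unique count per month is the size of each set.
unique_counts = {month: len(names) for month, names in customers_by_month.items()}
print(unique_counts)  # {'2020-01': 3, '2020-03': 2, '2020-04': 4}
```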

SIGN() formula returns unexpected results

This is a continuation of my previous question: Sumproduct with multiple criteria on one range
Jeeped provided me with a very helpful formula to build a sumproduct() that takes multiple criteria. My current case, however, is a bit broader:
Take these example tables:
First column is the ID number, second column a respondent group(A,B). Column headers are question types (X,Y,Z).
Table Q1
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 1 | A | 2 | 2 | 1 | | 1 |
| 2 | A | 1 | 1 | | | 2 |
| 3 | A | 1 | 1 | | | 1 |
| 4 | A | 2 | 1 | | | 1 |
| 5 | A | 1 | 2 | 1 | | 1 |
| 6 | A | 1 | 1 | | | 1 |
| 7 | A | | | | | |
| 8 | A | | | | | |
| 9 | A | 1 | 1 | | | 1 |
| 10 | A | 2 | 2 | 2 | | 2 |
| 11 | A | | | | | |
| 12 | A | 1 | 2 | 1 | | 2 |
| 13 | B | | | | | |
| 14 | B | 1 | 1 | | | 1 |
| 15 | B | 2 | 2 | 1 | | 1 |
Table Q2
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 1 | A | 1 | 2 | 1 | | 1 |
| 2 | A | 1 | 1 | | | 1 |
| 3 | A | 1 | 1 | | | 1 |
| 4 | A | 1 | 1 | | | 1 |
| 5 | A | 1 | 1 | | | 1 |
| 6 | A | 1 | 1 | | | 1 |
| 7 | A | | | | | |
| 8 | A | | | | | |
| 9 | A | 1 | 1 | | | 1 |
| 10 | A | 1 | 1 | | | 1 |
| 11 | A | | | | | |
| 12 | A | 1 | 2 | 1 | | 1 |
| 13 | B | | | | | |
| 14 | B | 1 | 1 | | | 1 |
| 15 | B | 1 | 2 | 1 | | 1 |
Now I want to know the number of times a respondent answered 1 (yes) on Q2 for each question type (X,Y,Z). The catch is that if someone answered 1 (yes) on Q1, it should "override" the answer on Q2: we assume that when someone answers yes on Q1 (implementation of a measure), their answer on Q2 (knowledge of said measure) has to be yes as well.
The second catch is that for the first two occurrences of Y there can only be a yes in one of the two columns, so in fact there can be at most two yes answers for question type Y per respondent.
I used the following formula (on sheet 3): =SUMPRODUCT(SIGN(('Q1'!$C$2:$G$16=1)+('Q2'!$C$2:$G$16=1))*('Q2'!$B$2:$B$16=Blad3!$D5)*('Q2'!$C$1:$G$1=Blad3!E$4)) to obtain the following results.
| | X | Y | Z |
|---|---|----|---|
| A | 9 | 19 | 0 |
| B | 2 | 4 | 0 |
For X these results are correct, as there are 9 1's for group A in table Q2.
For Y, the result for B is correct but the result for A is not: only 9 respondents in group A gave answers, and with at most 2 yes answers each the maximum is 18, yet the formula returns 19.
It turns out there is nothing wrong with the formula; it just isn't suited to the way this data is organised. If you look at row 5:
Q1
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | 1 | 2 | 1 | | 1 |
Q2
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | 1 | 1 | | | 1 |
If we mark every Y column in which either table has a 1, we get this combined row:
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | | 1 | 1 | | 1 |
When I ask for the sumproduct() of this combined row, the result is 3 instead of the maximum of 2.
To prevent this I added a helper column (between the two Y columns and the Z column) to my tables, with the following formula: IF(OR(D1=1,E1=1),1,""). I then removed the headers from the two original Y columns, and re-running the formula produced the correct results.
The new table Q1 then looks like this:
| | | X | | | Y | Z | Y |
|----|---|---|---|---|---|---|---|
| 1 | A | 2 | 2 | 1 | 1 | | 1 |
| 2 | A | 1 | 1 | | 1 | | 2 |
| 3 | A | 1 | 1 | | 1 | | 1 |
| 4 | A | 2 | 1 | | 1 | | 1 |
| 5 | A | 1 | 2 | 1 | 1 | | 1 |
| 6 | A | 1 | 1 | | 1 | | 1 |
| 7 | A | | | | | | |
| 8 | A | | | | | | |
| 9 | A | 1 | 1 | | 1 | | 1 |
| 10 | A | 2 | 2 | 2 | | | 2 |
| 11 | A | | | | | | |
| 12 | A | 1 | 2 | 1 | 1 | | 2 |
| 13 | B | | | | | | |
| 14 | B | 1 | 1 | | 1 | | 1 |
| 15 | B | 2 | 2 | 1 | 1 | | 1 |
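The effect of the helper column can be reproduced with a small Python sketch. It only uses the three Y columns of group A (rows 1 to 12) from the original Q1 and Q2 tables, with `None` standing in for a blank cell; the variable names and the `yes` helper are assumptions made for this illustration and are not part of the worksheet.

```python
# Y answers (first Y, second Y, third Y) for group A, rows 1-12; None = blank.
q1_y = [(2, 1, 1), (1, None, 2), (1, None, 1), (1, None, 1),
        (2, 1, 1), (1, None, 1), (None, None, None), (None, None, None),
        (1, None, 1), (2, 2, 2), (None, None, None), (2, 1, 2)]
q2_y = [(2, 1, 1), (1, None, 1), (1, None, 1), (1, None, 1),
        (1, None, 1), (1, None, 1), (None, None, None), (None, None, None),
        (1, None, 1), (1, None, 1), (None, None, None), (2, 1, 1)]

def yes(a, b):
    """True when either table has a 1 in this cell (the SIGN(...) term)."""
    return a == 1 or b == 1

# Cell-by-cell count over all three Y columns, as the original formula does.
naive = sum(yes(a, b) for r1, r2 in zip(q1_y, q2_y) for a, b in zip(r1, r2))

# Helper-column version: the first two Y columns count as one question.
with_helper = sum(
    int(yes(r1[0], r2[0]) or yes(r1[1], r2[1])) + int(yes(r1[2], r2[2]))
    for r1, r2 in zip(q1_y, q2_y)
)

print(naive, with_helper)  # 19 18
```

Row 5 is the only row that contributes 3 to the cell-by-cell count, which accounts for the difference between 19 and 18.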

Determine range for one value in a column, use to run function over same range in another

Summary
I want to have a column in my spreadsheet that does 2 things.
1) In an ordered column, it will return the range where the column contains a specified value.
2) It will run a function (i.e., =SUM(), =AVERAGE(), etc.) over that same range in a different column.
Examples
Original
| NAME | VAL | FOO |
|-------|-----|-----|
| A | 3 | |
| A | 2 | |
| A | 4 | |
| A | 3 | |
| B | 2 | |
| B | 2 | |
| B | 1 | |
| C | 6 | |
| C | 5 | |
Average
I would want to get the average of VAL for each NAME. I would want the result to be:
| NAME | VAL | FOO |
|-------|-----|-----|
| A | 3 | 3 |
| A | 2 | 3 |
| A | 4 | 3 |
| A | 3 | 3 |
| B | 2 | 1.7 |
| B | 2 | 1.7 |
| B | 1 | 1.7 |
| C | 6 | 5.5 |
| C | 5 | 5.5 |
Sum
Another example would be to get the sum of VAL for each NAME.
| NAME | VAL | FOO |
|-------|-----|-----|
| A | 3 | 12 |
| A | 2 | 12 |
| A | 4 | 12 |
| A | 3 | 12 |
| B | 2 | 5 |
| B | 2 | 5 |
| B | 1 | 5 |
| C | 6 | 11 |
| C | 5 | 11 |
Having "NAME" ordered makes it easy. Assuming the "NAME" header is in A1, enter this into C2 for the sum, then fill down:
=IF(A2=A3,C3,SUMIF($A$2:A2,A2,$B$2:B2))
Enter this into C2 for the average, then fill down:
=IF(A2=A3,C3,AVERAGEIF($A$2:A2,A2,$B$2:B2))
Note that the result in C2 won't be what you want until you fill down.
Update for a maximum per NAME
If your version of Excel doesn't have MAXIFS, you'll have to use an array formula (commit with Ctrl+Shift+Enter):
=IF(A2=A3,C3,MAX(IF($A$2:A2=A2,$B$2:B2)))
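For comparison, the FOO column from the examples can be sketched in Python as a group-then-broadcast step. This is a minimal illustration using the sample data from the question; the `broadcast` helper and the variable names are hypothetical.

```python
from collections import defaultdict

# Sample data from the question.
names = ["A", "A", "A", "A", "B", "B", "B", "C", "C"]
vals = [3, 2, 4, 3, 2, 2, 1, 6, 5]

# Collect the VAL entries that belong to each NAME.
groups = defaultdict(list)
for name, val in zip(names, vals):
    groups[name].append(val)

def broadcast(func):
    """Apply func to each group and repeat the result on every row of that group."""
    per_group = {name: func(group_vals) for name, group_vals in groups.items()}
    return [per_group[name] for name in names]

foo_sum = broadcast(sum)
foo_avg = broadcast(lambda v: sum(v) / len(v))
foo_max = broadcast(max)

print(foo_sum)  # [12, 12, 12, 12, 5, 5, 5, 11, 11]
```

Unlike the fill-down formulas, this version does not depend on NAME being sorted.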

Adding strings to numbers with different length

I want to build an object like the last table below by matching the two objects on ID and putting the name next to each ID. The two objects have different lengths, so I tried setting names, but it didn't work.
Any suggestions?
First Object
+----+-------+--+
| ID | Test | |
+----+-------+--+
| 1 | C | |
| 1 | M | |
| 1 | C | |
| 1 | M | |
| 2 | C | |
| 2 | M | |
| 2 | C | |
| 2 | M | |
| 4 | C | |
| 4 | M | |
| 4 | C | |
| 4 | M | |
+----+-------+--+
Second Object
+-----------+-----+--+
| Names | ID | |
+-----------+-----+--+
| Pepsi | 1 | |
| Coke | 2 | |
| Acuarious | 3 | |
| Fanta | 4 | |
| Beer | 5 | |
| Fries | 6 | |
+-----------+-----+--+
Desired Object
+----+-------+--------+--+
| ID | Names | Test | |
+----+-------+--------+--+
| 1 | Pepsi | C | |
| 1 | Pepsi | M | |
| 1 | Pepsi | C | |
| 1 | Pepsi | M | |
| 2 | Coke | C | |
| 2 | Coke | M | |
| 2 | Coke | C | |
| 2 | Coke | M | |
| 4 | Fanta | C | |
| 4 | Fanta | M | |
| 4 | Fanta | C | |
| 4 | Fanta | M | |
+----+-------+--------+--+
I think I sorted it out.
a <- merge(firstobject, secondobject, by = "ID", all.x = TRUE, all.y = TRUE)
This creates an object matched by ID and, at the same time, puts NA in the rows that don't match.
To get rid of the NAs (the IDs that appear only in the second object, where Test is NA):
a <- a[!is.na(a$Test), ]
I hope this helps.

excel compare current time value from sheet1 to time value range from sheet2

I am trying to compare the time value in Sheet1 with the time ranges in Sheet2 and return the closest matching row in Sheet1, columns B to E. Whenever cell A1 is refreshed, the results should update automatically (see the expected result below).
Sheet1 shows the current time, i.e. cell A1 contains "=now()".
Sheet1
----------------------------------------------------
| A | B | C | D |
|---------------------------------------------------
| 12:55:00 | | | |
----------------------------------------------------
In Sheet2, the data is in 4 columns as below:
--------------------------------------------------------
| No | Start | End | Date |
|-------------------------------------------------------
| 1 | 07:36:00 | 08:23:10 | 15/05/2015 |
| 2 | 08:23:10 | 09:10:20 | 15/05/2015 |
| 3 | 09:10:20 | 09:57:30 | 15/05/2015 |
| 4 | 09:57:30 | 10:44:40 | 15/05/2015 |
| 5 | 10:44:40 | 11:31:50 | 15/05/2015 |
| 6 | 11:31:50 | 12:19:00 | 15/05/2015 |
| 7 | 12:19:00 | 13:06:10 | 15/05/2015 |
| 8 | 13:06:10 | 13:53:20 | 15/05/2015 |
| 9 | 13:53:20 | 14:40:30 | 15/05/2015 |
| 10 | 14:40:30 | 15:27:40 | 15/05/2015 |
| 11 | 15:27:40 | 16:14:50 | 15/05/2015 |
| 12 | 16:14:50 | 17:02:00 | 15/05/2015 |
| 13 | 17:02:00 | 18:14:50 | 15/05/2015 |
| 14 | 18:14:50 | 19:27:40 | 15/05/2015 |
| 15 | 19:27:40 | 20:40:30 | 15/05/2015 |
| 16 | 20:40:30 | 21:53:20 | 15/05/2015 |
| 17 | 21:53:20 | 23:06:10 | 15/05/2015 |
| 18 | 23:06:10 | 00:19:00 | 16/05/2015 |
| 19 | 00:19:00 | 01:31:50 | 16/05/2015 |
| 20 | 01:31:50 | 02:44:40 | 16/05/2015 |
| 21 | 02:44:40 | 03:57:30 | 16/05/2015 |
| 22 | 03:57:30 | 05:10:20 | 16/05/2015 |
| 23 | 05:10:20 | 06:23:10 | 16/05/2015 |
| 24 | 06:23:10 | 07:36:00 | 16/05/2015 |
---------------------------------------------------------
Expected
Sheet1 - if the current time is 12:55:00 on 15/05/2015
-----------------------------------------------------------------------------
| A | B | C | D | E |
|-----------------------------------------------------------|---------------|
| 12:55:00 | 7 | 12:19:00 | 13:06:10 | 15/05/2015 |
-----------------------------------------------------------------------------
Sheet1 - if the current time is 03:55:00 on 16/05/2015
-----------------------------------------------------------------------------
| A | B | C | D | E |
|-----------------------------------------------------------|---------------|
| 03:55:00 | 21 | 02:44:40 | 03:57:30 | 16/05/2015 |
-----------------------------------------------------------------------------
For numbers I am using the formula below, but I am not sure how to achieve the same for times:
=INDEX(A1:A20,MATCH(MIN(ABS(A1:A20-D1)),ABS(A1:A20-D1),0))
Thanks
If we assume that your dates are entered from different days but you want to treat them as though they are all on the same day, you just need to subtract off the days part before doing the comparison.
Since Excel stores a date-time as a number of days, with the fractional part representing the time of day, you can simply subtract off the integer part of the value.
Here is that formula. This is an array formula, entered with CTRL+SHIFT+ENTER.
=INDEX(A1:A20,MATCH(MIN(ABS(A1:A20-INT(A1:A20)-D1+INT(D1))),ABS(A1:A20-INT(A1:A20)-D1+INT(D1)),0))
For A1:A20 we subtract off INT(A1:A20). Same thing for D1 except D1 is already being subtracted, so the INT part gets added back in.
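The idea of stripping the date and comparing only the time of day can be sketched in Python. This is an illustration under assumptions: `time_fraction` is a made-up helper that plays the role of A1-INT(A1), and the three candidate values are a small sample of the Start times from the question.

```python
from datetime import datetime

def time_fraction(dt):
    """Fraction of the day elapsed, like A1-INT(A1) on an Excel date-time."""
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    return (dt - midnight).total_seconds() / 86400

# A few Start values; the dates differ on purpose.
candidates = [
    datetime(2015, 5, 15, 12, 19, 0),
    datetime(2015, 5, 15, 13, 6, 10),
    datetime(2015, 5, 16, 2, 44, 40),
]
target = datetime(2015, 5, 16, 12, 55, 0)

# Nearest match on the time of day only, ignoring the date part.
nearest = min(
    candidates,
    key=lambda c: abs(time_fraction(c) - time_fraction(target)),
)
print(nearest.time())  # 13:06:10
```

Note that for 12:55:00 the nearest Start value is 13:06:10, not 12:19:00, so a nearest-match does not always pick the row whose interval contains the time.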
Thanks Byron. Based on your solution I got an idea, implemented it, and achieved the desired output; see my solution below.
In Sheet1, convert the time in cell A1 to decimal hours by entering this in cell B1:
=(A1-INT(A1))*24
In Sheet2, convert column B (Start) to decimal hours in column D in the same way:
----------------------------------------------------------
| A | B | C | D |
|---------------------------------------------------------
| 1 | 07:36:00 | 08:23:10 | 7.6 | formula "=(B12-INT(B12))*24"
| 2 | 08:23:10 | 09:10:20 | 8.386111111 |
| 3 | 09:10:20 | 09:57:30 | 9.172222222 |
| 4 | 09:57:30 | 10:44:40 | 9.958333333 |
| 5 | 10:44:40 | 11:31:50 | 10.74444444 |
| 6 | 11:31:50 | 12:19:00 | 11.53055556 |
| 7 | 12:19:00 | 13:06:10 | 12.31666667 |
| 8 | 13:06:10 | 13:53:20 | 13.10277778 |
| 9 | 13:53:20 | 14:40:30 | 13.88888889 |
| 10 | 14:40:30 | 15:27:40 | 14.675 |
| 11 | 15:27:40 | 16:14:50 | 15.46111111 |
| 12 | 16:14:50 | 17:02:00 | 16.24722222 |
| 13 | 17:02:00 | 18:14:50 | 17.03333333 |
| 14 | 18:14:50 | 19:27:40 | 18.24722222 |
| 15 | 19:27:40 | 20:40:30 | 19.46111111 |
| 16 | 20:40:30 | 21:53:20 | 20.675 |
| 17 | 21:53:20 | 23:06:10 | 21.88888889 |
| 18 | 23:06:10 | 00:19:00 | 23.10277778 |
| 19 | 00:19:00 | 01:31:50 | 0.316666667 |
| 20 | 01:31:50 | 02:44:40 | 1.530555556 |
| 21 | 02:44:40 | 03:57:30 | 2.744444444 |
| 22 | 03:57:30 | 05:10:20 | 3.958333333 |
| 23 | 05:10:20 | 06:23:10 | 5.172222222 |
| 24 | 06:23:10 | 07:36:00 | 6.386111111 |
---------------------------------------------------------
Now in Sheet1 C1, enter this array formula (Ctrl+Shift+Enter):
=INDEX(Sheet2!A12:A35,MATCH(MIN(ABS(Sheet2!D12:D35-Sheet1!B1)),ABS(Sheet2!D12:D35-Sheet1!B1),0))
Sheet1 D1 (looks up the Start time for the number found in C1)
=VLOOKUP(C1,Sheet2!A12:C35,2,FALSE)
Sheet1 E1 (looks up the End time for the number found in C1)
=VLOOKUP(C1,Sheet2!A12:C35,3,FALSE)
Output in Sheet1
------------------------------------------------------------------------------------
| A | B | C | D | E |
|-------------------------------------------------------------------|---------------|
| 07:36:58 | 7.615991667 | 1 | 07:36:00 | 08:23:10 |
-------------------------------------------------------------------------------------
Thanks
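The expected results in the question (12:55:00 gives row 7, 03:55:00 gives row 21) correspond to the row whose Start to End interval contains the current time. A minimal Python sketch of that interval test follows. It is an alternative illustration, not the method used in the answers above; `find_slot` is a hypothetical helper and only a few rows of Sheet2 are included.

```python
from datetime import time

def find_slot(now, slots):
    """Return the (no, start, end) row whose interval contains `now`, or None."""
    for no, start, end in slots:
        if start <= end:
            inside = start <= now < end
        else:
            # Interval wraps past midnight, e.g. 23:06:10 to 00:19:00.
            inside = now >= start or now < end
        if inside:
            return no, start, end
    return None

# A few rows from Sheet2: (No, Start, End).
slots = [
    (7, time(12, 19, 0), time(13, 6, 10)),
    (8, time(13, 6, 10), time(13, 53, 20)),
    (18, time(23, 6, 10), time(0, 19, 0)),
    (21, time(2, 44, 40), time(3, 57, 30)),
]

print(find_slot(time(12, 55, 0), slots)[0])  # 7
print(find_slot(time(3, 55, 0), slots)[0])   # 21
```

The wrap-around branch handles row 18, where End is earlier in the day than Start.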
