SIGN() formula returns unexpected results - excel

This continues my previous question: Sumproduct with multiple criteria on one range
Jeeped provided me with a very helpful formula for a SUMPRODUCT() that takes multiple criteria. My current case, however, is a bit broader.
Take these example tables:
The first column is the ID number, the second column a respondent group (A, B). The column headers are question types (X, Y, Z).
Table Q1
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 1 | A | 2 | 2 | 1 | | 1 |
| 2 | A | 1 | 1 | | | 2 |
| 3 | A | 1 | 1 | | | 1 |
| 4 | A | 2 | 1 | | | 1 |
| 5 | A | 1 | 2 | 1 | | 1 |
| 6 | A | 1 | 1 | | | 1 |
| 7 | A | | | | | |
| 8 | A | | | | | |
| 9 | A | 1 | 1 | | | 1 |
| 10 | A | 2 | 2 | 2 | | 2 |
| 11 | A | | | | | |
| 12 | A | 1 | 2 | 1 | | 2 |
| 13 | B | | | | | |
| 14 | B | 1 | 1 | | | 1 |
| 15 | B | 2 | 2 | 1 | | 1 |
Table Q2
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 1 | A | 1 | 2 | 1 | | 1 |
| 2 | A | 1 | 1 | | | 1 |
| 3 | A | 1 | 1 | | | 1 |
| 4 | A | 1 | 1 | | | 1 |
| 5 | A | 1 | 1 | | | 1 |
| 6 | A | 1 | 1 | | | 1 |
| 7 | A | | | | | |
| 8 | A | | | | | |
| 9 | A | 1 | 1 | | | 1 |
| 10 | A | 1 | 1 | | | 1 |
| 11 | A | | | | | |
| 12 | A | 1 | 2 | 1 | | 1 |
| 13 | B | | | | | |
| 14 | B | 1 | 1 | | | 1 |
| 15 | B | 1 | 2 | 1 | | 1 |
Now I want to know the number of times a respondent answered 1 (yes) on Q2 for each question type (X, Y, Z). The catch is that a 1 (yes) on Q1 should "override" the answer on Q2: we assume that when someone answers yes on Q1 (implementation of a measure), their answer on Q2 (knowledge of said measure) has to be yes as well.
The second catch is that for the first two occurrences of Y there can be a yes in only one of the two columns, so each respondent can have at most two yes answers for question type Y.
I used the following formula (on sheet 3): =SUMPRODUCT(SIGN(('Q1'!$C$2:$G$16=1)+('Q2'!$C$2:$G$16=1))*('Q2'!$B$2:$B$16=Blad3!$D5)*('Q2'!$C$1:$G$1=Blad3!E$4)) to obtain the following results.
| | X | Y | Z |
|---|---|----|---|
| A | 9 | 19 | 0 |
| B | 2 | 4 | 0 |
For X these results are correct, as there are nine 1s for group A in column X of table Q2.
For Y the result for B is correct, but the result for A is not: only 9 respondents in group A gave answers, and at most 2 yes answers each gives a maximum of 18, yet the formula returns 19.

It turns out there is nothing wrong with the formula, just that it isn't suited for the way this data is organised. If you look at row 5:
Q1
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | 1 | 2 | 1 | | 1 |
Q2
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | 1 | 1 | | | 1 |
If we condense those two rows, marking a 1 wherever either table has a 1 in one of the Y columns, we get this table:
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | | 1 | 1 | | 1 |
When I ask for the SUMPRODUCT() over this combined table, the result for Y is 3, one more than the intended maximum of 2.
To prevent this I added a helper column to my tables (between the two Y columns and the Z column) with the following formula: IF(OR(D1=1,E1=1),1,""). I then removed the headers from the two original Y columns so that only the helper column is counted for them, and re-running the formula produced the correct results.
The new table Q1 then looks like this:
| | | X | | | Y | Z | Y |
|----|---|---|---|---|---|---|---|
| 1 | A | 2 | 2 | 1 | 1 | | 1 |
| 2 | A | 1 | 1 | | 1 | | 2 |
| 3 | A | 1 | 1 | | 1 | | 1 |
| 4 | A | 2 | 1 | | 1 | | 1 |
| 5 | A | 1 | 2 | 1 | 1 | | 1 |
| 6 | A | 1 | 1 | | 1 | | 1 |
| 7 | A | | | | | | |
| 8 | A | | | | | | |
| 9 | A | 1 | 1 | | 1 | | 1 |
| 10 | A | 2 | 2 | 2 | | | 2 |
| 11 | A | | | | | | |
| 12 | A | 1 | 2 | 1 | 1 | | 2 |
| 13 | B | | | | | | |
| 14 | B | 1 | 1 | | 1 | | 1 |
| 15 | B | 2 | 2 | 1 | 1 | | 1 |
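The double counting on row 5 can also be reproduced outside Excel. The following is a minimal Python sketch, not the spreadsheet solution itself: `None` stands for an empty cell, and the function names and the `paired` parameter are assumptions made for illustration.

```python
# Row 5 of the example tables; None stands for an empty cell.
HEADERS = ["X", "Y", "Y", "Z", "Y"]
Q1_ROW5 = [1, 2, 1, None, 1]
Q2_ROW5 = [1, 1, None, None, 1]


def yes_flags(q1_row, q2_row):
    """Per cell: 1 if either table holds a 1, mirroring SIGN((Q1=1)+(Q2=1))."""
    return [1 if (a == 1 or b == 1) else 0 for a, b in zip(q1_row, q2_row)]


def count_per_cell(flags, headers, qtype):
    """Sum every flagged cell whose header matches qtype (what the formula does)."""
    return sum(f for f, h in zip(flags, headers) if h == qtype)


def count_with_helper(flags, headers, qtype, paired=(1, 2)):
    """Collapse the two paired columns into one OR-ed helper value before summing.

    `paired` holds the zero-based positions of the first two Y columns; the
    name and its default are assumptions made for this sketch.
    """
    first, second = paired
    total = sum(
        f for i, (f, h) in enumerate(zip(flags, headers))
        if h == qtype and i not in paired
    )
    if headers[first] == qtype:
        total += 1 if (flags[first] or flags[second]) else 0
    return total


flags = yes_flags(Q1_ROW5, Q2_ROW5)
print(count_per_cell(flags, HEADERS, "Y"))     # per-cell count for Y
print(count_with_helper(flags, HEADERS, "Y"))  # count with the helper column
```

For row 5 the per-cell count for Y is 3, while the helper-column count is 2, which is the behaviour the helper column in the sheet is meant to enforce.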

Related

How to create a calculated column in access 2013 to detect duplicates

I'm recreating a tool I made in Excel as it's getting bigger and performance is getting out of hand.
The issue is that I only have MS Access 2013 on my work laptop, and I'm fairly new to its Expression Builder, which frankly has a very limited set of functions.
My data has duplicates in the [Location] column, meaning that I have multiple SKUs in the same warehouse location. However, some of my calculations need to be done only once per [Location]. My solution in Excel was a formula (see below) that puts 1 on the first appearance of a location and 0 on subsequent appearances. That works like a charm, because summing over that [Duplicate] column while imposing multiple criteria returns the number of rows matching those criteria, counting each location only once.
Now, the MS Access 2013 Expression Builder has neither SUM nor COUNT functions with which to create a calculated column emulating my [Duplicate] column from Excel. Preferably, I would input only the raw data and let Access populate the calculated fields, rather than inputting the calculated fields as well, since that would defeat my original purpose of reducing the computational cost of creating my dashboard.
The question is: how would you create a calculated column in the MS Access 2013 Expression Builder to recreate the Excel formula below?
= IF($D$2:$D3=$D4,0,1)
For the sake of reducing the file size (over 100K rows), I even replace the 0 with an empty string "".
Thanks in advance for your help
Y
First and foremost, understand that the MS Access Expression Builder is a convenience tool for building an SQL expression. Everything in Query Design ultimately builds an SQL query. For this reason, you have to adopt a set-based mentality that sees data as whole sets of related tables, rather than a cell-by-cell mindset.
Specifically, to achieve:
putting 1 only on the first appearance of that location, putting 0 on next appearances
Consider a set-based approach: join to a separate aggregate query that identifies the first value in each group, then calculate the needed IIF expression. The queries below assume you have an autonumber or primary key field in the table (a standard in relational databases):
Aggregate Query (save as a separate query, adjust columns as needed)
SELECT ColumnD, MIN(AutoNumberID) As MinID
FROM myTable
GROUP BY ColumnD
Final Query (join to original table and build final IIF expression)
SELECT m.*, IIF(agg.MinID = AutoNumberID, 1, 0) As Dup_Indicator
FROM myTable m
INNER JOIN myAggregateQuery agg
ON m.[ColumnD] = agg.ColumnD
To demonstrate with random data:
Original
| ID | GROUP | INT | NUM | CHAR | BOOL | DATE |
|----|--------|-----|--------------|------|-------|------------|
| 1 | r | 9 | 1.424490258 | B6z | TRUE | 7/4/1994 |
| 2 | stata | 10 | 2.591235683 | h7J | FALSE | 10/5/1971 |
| 3 | spss | 6 | 0.560461966 | Hrn | TRUE | 11/27/1990 |
| 4 | stata | 10 | -1.499272175 | eXL | FALSE | 4/17/2010 |
| 5 | stata | 15 | 1.470269177 | Vas | TRUE | 6/13/2010 |
| 6 | r | 14 | -0.072238898 | puP | TRUE | 4/1/1994 |
| 7 | julia | 2 | -1.370405263 | S2l | FALSE | 12/11/1999 |
| 8 | spss | 6 | -0.153684675 | mAw | FALSE | 7/28/1977 |
| 9 | spss | 10 | -0.861482674 | cxC | FALSE | 7/17/1994 |
| 10 | spss | 2 | -0.817222582 | GRn | FALSE | 10/19/2012 |
| 11 | stata | 2 | 0.949287754 | xgc | TRUE | 1/18/2003 |
| 12 | stata | 5 | -1.580841322 | Y1D | TRUE | 6/3/2011 |
| 13 | r | 14 | -1.671303816 | JCP | FALSE | 5/15/1981 |
| 14 | r | 7 | 0.904181025 | Rct | TRUE | 7/24/1977 |
| 15 | stata | 10 | -1.198211174 | qJY | FALSE | 5/6/1982 |
| 16 | julia | 10 | -0.265808162 | 10s | FALSE | 3/18/1975 |
| 17 | r | 13 | -0.264955027 | 8Md | TRUE | 6/11/1974 |
| 18 | r | 4 | 0.518302149 | 4KW | FALSE | 9/12/1980 |
| 19 | r | 5 | -0.053620183 | 8An | FALSE | 4/17/2004 |
| 20 | r | 14 | -0.359197116 | F8Q | TRUE | 6/14/2005 |
| 21 | spss | 11 | -2.211875193 | AgS | TRUE | 4/11/1973 |
| 22 | stata | 4 | -1.718749471 | Zqr | FALSE | 2/20/1999 |
| 23 | python | 10 | 1.207878576 | tcC | FALSE | 4/18/2008 |
| 24 | stata | 11 | 0.548902226 | PFJ | TRUE | 9/20/1994 |
| 25 | stata | 6 | 1.479125922 | 7a7 | FALSE | 3/2/1989 |
| 26 | python | 10 | -0.437245299 | r32 | TRUE | 6/7/1997 |
| 27 | sas | 14 | 0.404746106 | 6NJ | TRUE | 9/23/2013 |
| 28 | stata | 8 | 2.206741458 | Ive | TRUE | 5/26/2008 |
| 29 | spss | 12 | -0.470694096 | dPS | TRUE | 5/4/1983 |
| 30 | sas | 15 | -0.57169507 | yle | TRUE | 6/20/1979 |
SQL (uses aggregate in subquery but can be a stored query)
SELECT r.*, IIF(sub.MinID = r.ID, 1, 0) AS Dup
FROM Random_Data r
LEFT JOIN
(
SELECT t.[GROUP], MIN(t.ID) AS MinID
FROM Random_Data t
GROUP BY t.[GROUP]
) sub
ON r.[GROUP] = sub.[GROUP]
Output (notice the first row of each GROUP value is tagged 1, all others 0)
| ID | GROUP | INT | NUM | CHAR | BOOL | DATE | Dup |
|----|--------|-----|--------------|------|-------|------------|-----|
| 1 | r | 9 | 1.424490258 | B6z | TRUE | 7/4/1994 | 1 |
| 2 | stata | 10 | 2.591235683 | h7J | FALSE | 10/5/1971 | 1 |
| 3 | spss | 6 | 0.560461966 | Hrn | TRUE | 11/27/1990 | 1 |
| 4 | stata | 10 | -1.499272175 | eXL | FALSE | 4/17/2010 | 0 |
| 5 | stata | 15 | 1.470269177 | Vas | TRUE | 6/13/2010 | 0 |
| 6 | r | 14 | -0.072238898 | puP | TRUE | 4/1/1994 | 0 |
| 7 | julia | 2 | -1.370405263 | S2l | FALSE | 12/11/1999 | 1 |
| 8 | spss | 6 | -0.153684675 | mAw | FALSE | 7/28/1977 | 0 |
| 9 | spss | 10 | -0.861482674 | cxC | FALSE | 7/17/1994 | 0 |
| 10 | spss | 2 | -0.817222582 | GRn | FALSE | 10/19/2012 | 0 |
| 11 | stata | 2 | 0.949287754 | xgc | TRUE | 1/18/2003 | 0 |
| 12 | stata | 5 | -1.580841322 | Y1D | TRUE | 6/3/2011 | 0 |
| 13 | r | 14 | -1.671303816 | JCP | FALSE | 5/15/1981 | 0 |
| 14 | r | 7 | 0.904181025 | Rct | TRUE | 7/24/1977 | 0 |
| 15 | stata | 10 | -1.198211174 | qJY | FALSE | 5/6/1982 | 0 |
| 16 | julia | 10 | -0.265808162 | 10s | FALSE | 3/18/1975 | 0 |
| 17 | r | 13 | -0.264955027 | 8Md | TRUE | 6/11/1974 | 0 |
| 18 | r | 4 | 0.518302149 | 4KW | FALSE | 9/12/1980 | 0 |
| 19 | r | 5 | -0.053620183 | 8An | FALSE | 4/17/2004 | 0 |
| 20 | r | 14 | -0.359197116 | F8Q | TRUE | 6/14/2005 | 0 |
| 21 | spss | 11 | -2.211875193 | AgS | TRUE | 4/11/1973 | 0 |
| 22 | stata | 4 | -1.718749471 | Zqr | FALSE | 2/20/1999 | 0 |
| 23 | python | 10 | 1.207878576 | tcC | FALSE | 4/18/2008 | 1 |
| 24 | stata | 11 | 0.548902226 | PFJ | TRUE | 9/20/1994 | 0 |
| 25 | stata | 6 | 1.479125922 | 7a7 | FALSE | 3/2/1989 | 0 |
| 26 | python | 10 | -0.437245299 | r32 | TRUE | 6/7/1997 | 0 |
| 27 | sas | 14 | 0.404746106 | 6NJ | TRUE | 9/23/2013 | 1 |
| 28 | stata | 8 | 2.206741458 | Ive | TRUE | 5/26/2008 | 0 |
| 29 | spss | 12 | -0.470694096 | dPS | TRUE | 5/4/1983 | 0 |
| 30 | sas | 15 | -0.57169507 | yle | TRUE | 6/20/1979 | 0 |
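The aggregate-join idea can be tried without Access. The sketch below is only an approximation under stated assumptions: it uses SQLite through Python's standard `sqlite3` module rather than Access SQL, `CASE WHEN` stands in for `IIF`, and the table and column names (`myTable`, `AutoNumberID`, `ColumnD`) are the answer's placeholders. The function name is hypothetical.

```python
import sqlite3


def flag_first_per_group(rows):
    """Return (id, group, flag) tuples; flag is 1 on the lowest id of each group.

    `rows` is an iterable of (AutoNumberID, ColumnD) pairs. This runs on an
    in-memory SQLite database, which is an assumption of the sketch, not Access.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE myTable (AutoNumberID INTEGER PRIMARY KEY, ColumnD TEXT)"
    )
    conn.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT m.AutoNumberID, m.ColumnD,
               CASE WHEN agg.MinID = m.AutoNumberID THEN 1 ELSE 0 END AS Dup_Indicator
        FROM myTable m
        INNER JOIN (
            SELECT ColumnD, MIN(AutoNumberID) AS MinID
            FROM myTable
            GROUP BY ColumnD
        ) agg
        ON m.ColumnD = agg.ColumnD
        ORDER BY m.AutoNumberID
        """
    ).fetchall()
    conn.close()
    return result


# First six rows of the random data above (ID and GROUP columns only).
sample = [(1, "r"), (2, "stata"), (3, "spss"), (4, "stata"), (5, "stata"), (6, "r")]
for row in flag_first_per_group(sample):
    print(row)
```

On these six rows the flag is 1 for IDs 1, 2 and 3 and 0 for IDs 4, 5 and 6, matching the Dup column in the output table.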

An unconventional transpose in excel

Part of my file looks like this:
Et SF
1 4.4937
1 5.1257
1 5.2018
1 5.3755
1 5.741
1 5.9086
1 6.1399
1 6.2518
2 3.0424
2 3.2744
2 3.883
2 3.9595
2 3.9892
2 4.1603
2 4.2943
2 4.5118
And I would like to transpose it this way:
Et SF SF SF SF SF SF SF SF
1 4.4937 5.1257 5.2018 5.3755 5.741 5.9086 6.1399 6.2518
2 3.0424 3.2744 3.883 3.9595 3.9892 4.1603 4.2943 4.5118
Is it possible to do this in Excel? I tried OFFSET but couldn't get it to work.
You did not mention exactly what you tried. There are many ways:
| | A | B | | Copy | | 4.4937 | 5.1257 | 5.2018 | 5.3755 | 5.741 | 5.9086 | 6.1399 | 6.2518 |
|----|:-:|:------:|:-:|:------:|:------:|--------|-----------|--------|--------|--------|--------|--------|--------|
| 1 | 1 | 4.4937 | = | ===== | ==| | 3.0424 | 3.2744 | 3.883 | 3.9595 | 3.9892 | 4.1603 | 4.2943 | 4.5118 |
| 2 | 1 | 5.1257 | | | | | A | Transpose | | | | | | |
| 3 | 1 | 5.2018 | | | | | | | | | | | | | |
| 4 | 1 | 5.3755 | | | | | | | | | | | | | |
| 5 | 1 | 5.741 | | | | | | | | | | | | | |
| 6 | 1 | 5.9086 | | | | | | | | | | | | | |
| 7 | 1 | 6.1399 | | | | | | | | | | | | | |
| 8 | 1 | 6.2518 | | | V | | | | | | | | | |
| 9 | 2 | 3.0424 | > | 3.0424 | 4.4937 | ==| | | | | | | | |
| 10 | 2 | 3.2744 | > | 3.2744 | 5.1257 | | | | | | | | |
| 11 | 2 | 3.883 | > | 3.883 | 5.2018 | | | | | | | | |
| 12 | 2 | 3.9595 | > | 3.9595 | 5.3755 | | | | | | | | |
| 13 | 2 | 3.9892 | > | 3.9892 | 5.741 | | | | | | | | |
| 14 | 2 | 4.1603 | > | 4.1603 | 5.9086 | | | | | | | | |
| 15 | 2 | 4.2943 | > | 4.2943 | 6.1399 | | | | | | | | |
| 16 | 2 | 4.5118 | > | 4.5118 | 6.2518 | | | | | | | | |
Just copy the first section (Et = 1) to the right of the second section (Et = 2). Then copy the complete selection and paste it transposed (Paste Special > Transpose).
If you would rather do it with a formula on the same sheet:
| | A | B | C | D | E | F | G | H | I | J | K | L | M |
|----|:--:|:------:|:-:|:-:|:---------------------------:|-----------------------------|-----------------------------|-----|---|---|---|---|---|
| 1 | Et | SF | | | | | | | | | | | |
| 2 | 1 | 4.4937 | | 1 | =INDIRECT("B" & COLUMN()-3) | =INDIRECT("B" & COLUMN()-3) | =INDIRECT("B" & COLUMN()-3) | ... | | | | | |
| 3 | 1 | 5.1257 | | 2 | =INDIRECT("B" & COLUMN()+5) | =INDIRECT("B" & COLUMN()+5) | =INDIRECT("B" & COLUMN()+5) | ... | | | | | |
| 4 | 1 | 5.2018 | | | | | | | | | | | |
| 5 | 1 | 5.3755 | | | | | | | | | | | |
| 6 | 1 | 5.741 | | | | | | | | | | | |
| 7 | 1 | 5.9086 | | | | | | | | | | | |
| 8 | 1 | 6.1399 | | | | | | | | | | | |
| 9 | 1 | 6.2518 | | | | | | | | | | | |
| 10 | 2 | 3.0424 | | | | | | | | | | | |
| 11 | 2 | 3.2744 | | | | | | | | | | | |
| 12 | 2 | 3.883 | | | | | | | | | | | |
| 13 | 2 | 3.9595 | | | | | | | | | | | |
| 14 | 2 | 3.9892 | | | | | | | | | | | |
| 15 | 2 | 4.1603 | | | | | | | | | | | |
| 16 | 2 | 4.2943 | | | | | | | | | | | |
| 17 | 2 | 4.5118 | | | | | | | | | | | |
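The reshape itself (one row per Et, with its SF values spread across columns) can be sketched in Python for checking the expected result. This is an illustration, not an Excel technique; the function name `long_to_wide` is hypothetical.

```python
def long_to_wide(pairs):
    """Group (Et, SF) pairs by Et, keeping first-seen order of both keys and values."""
    wide = {}
    for et, sf in pairs:
        wide.setdefault(et, []).append(sf)
    # One output row per Et: the key followed by all of its SF values.
    return [[et, *values] for et, values in wide.items()]


data = [
    (1, 4.4937), (1, 5.1257), (1, 5.2018), (1, 5.3755),
    (1, 5.741), (1, 5.9086), (1, 6.1399), (1, 6.2518),
    (2, 3.0424), (2, 3.2744), (2, 3.883), (2, 3.9595),
    (2, 3.9892), (2, 4.1603), (2, 4.2943), (2, 4.5118),
]
for row in long_to_wide(data):
    print(row)
```

Unlike the INDIRECT formulas, this does not assume that every Et has exactly eight values; each output row simply gets as many columns as its group has entries.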
Hope it helps!

Determine range for one value in a column, use to run function over same range in another

Summary
I want to have a column in my spreadsheet that does 2 things.
1) In an ordered column, it will return the range where the column contains a specified value.
2) It will run a function (i.e., =SUM(), =AVERAGE(), etc.) over that same range in a different column.
Examples
Original
| NAME | VAL | FOO |
|-------|-----|-----|
| A | 3 | |
| A | 2 | |
| A | 4 | |
| A | 3 | |
| B | 2 | |
| B | 2 | |
| B | 1 | |
| C | 6 | |
| C | 5 | |
Average
I would want to get the average of VAL for each NAME. I would want the result to be:
| NAME | VAL | FOO |
|-------|-----|-----|
| A | 3 | 3 |
| A | 2 | 3 |
| A | 4 | 3 |
| A | 3 | 3 |
| B | 2 | 1.7 |
| B | 2 | 1.7 |
| B | 1 | 1.7 |
| C | 6 | 5.5 |
| C | 5 | 5.5 |
Sum
Another example would be to get the sum of VAL for each NAME.
| NAME | VAL | FOO |
|-------|-----|-----|
| A | 3 | 12 |
| A | 2 | 12 |
| A | 4 | 12 |
| A | 3 | 12 |
| B | 2 | 5 |
| B | 2 | 5 |
| B | 1 | 5 |
| C | 6 | 11 |
| C | 5 | 11 |
Having "NAME" ordered makes it easy. Assuming the "NAME" header is in A1, enter this into C2 for the sum, then fill down:
=IF(A2=A3,C3,SUMIF($A$2:A2,A2,$B$2:B2))
Enter this into C2 for the average, then fill down:
=IF(A2=A3,C3,AVERAGEIF($A$2:A2,A2,$B$2:B2))
Note that the result in C2 won't be what you want until you fill down, because every row except the last one of its group takes its value from the row below.
Update for MAXIF
If you don't have Excel 2016, you'll have to use an array formula (commit with ctrl+shift+enter):
=IF(A2=A3,C3,MAX(IF($A$2:A2=A2,$B$2:B2)))
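To check what the filled-down FOO column should contain, the same per-group aggregation can be sketched in Python. This is a rough equivalent rather than a translation of the formulas: it does not rely on NAME being sorted, and the function name `broadcast_group_stat` is hypothetical.

```python
from statistics import mean


def broadcast_group_stat(names, vals, func):
    """Apply func to the values of each name, then repeat the result on every row."""
    groups = {}
    for name, val in zip(names, vals):
        groups.setdefault(name, []).append(val)
    stats = {name: func(group_vals) for name, group_vals in groups.items()}
    return [stats[name] for name in names]


names = ["A", "A", "A", "A", "B", "B", "B", "C", "C"]
vals = [3, 2, 4, 3, 2, 2, 1, 6, 5]

print(broadcast_group_stat(names, vals, sum))
print(broadcast_group_stat(names, vals, max))
print([round(x, 1) for x in broadcast_group_stat(names, vals, mean)])
```

Passing `sum`, `mean` or `max` corresponds to the SUMIF, AVERAGEIF and MAX(IF(...)) variants above.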

Adding strings to numbers with different length

I want to build an object like the third table below by matching the two objects on ID and attaching the name to each ID. The two objects have different lengths, so I tried set names, but it didn't work.
Any suggestions?
First Object
+----+-------+--+
| ID | Test | |
+----+-------+--+
| 1 | C | |
| 1 | M | |
| 1 | C | |
| 1 | M | |
| 2 | C | |
| 2 | M | |
| 2 | C | |
| 2 | M | |
| 4 | C | |
| 4 | M | |
| 4 | C | |
| 4 | M | |
+----+-------+--+
Second Object
+-----------+-----+--+
| Names | ID | |
+-----------+-----+--+
| Pepsi | 1 | |
| Coke | 2 | |
| Acuarious | 3 | |
| Fanta | 4 | |
| Beer | 5 | |
| Fries | 6 | |
+-----------+-----+--+
+----+-------+--------+--+
| ID | Names | Test | |
+----+-------+--------+--+
| 1 | Pepsi | C | |
| 1 | Pepsi | M | |
| 1 | Pepsi | C | |
| 1 | Pepsi | M | |
| 2 | Coke | C | |
| 2 | Coke | M | |
| 2 | Coke | C | |
| 2 | Coke | M | |
| 4 | Fanta | C | |
| 4 | Fanta | M | |
| 4 | Fanta | C | |
| 4 | Fanta | M | |
+----+-------+--------+--+
I think I sorted it out.
a <- merge(firstobject, secondobject, by.x = "ID", by.y = "ID", all.x = TRUE, all.y = TRUE)
This creates an object matched by ID and at the same time puts NA in the rows that don't match.
To get rid of the NAs, filter on the Test column, which is where they end up (IDs 3, 5 and 6 have no rows in the first object):
a <- a[!is.na(a$Test), ]
I hope this helps!
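The same matching can be sketched in Python as a plain key lookup, which may help to see what the merge is doing. This is only an illustration with a hypothetical function name; it keeps the rows of the first object whose ID has a name, so no NA clean-up step is needed.

```python
def attach_names(first_object, names_by_id):
    """Return (ID, Names, Test) rows for every row whose ID has a known name."""
    return [
        (id_, names_by_id[id_], test)
        for id_, test in first_object
        if id_ in names_by_id
    ]


first_object = [(i, t) for i in (1, 2, 4) for t in ("C", "M", "C", "M")]
names_by_id = {1: "Pepsi", 2: "Coke", 3: "Acuarious", 4: "Fanta", 5: "Beer", 6: "Fries"}

for row in attach_names(first_object, names_by_id):
    print(row)
```

IDs 3, 5 and 6 never appear in the result because the first object has no rows for them.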

Expand a data set using two columns

In Excel, I have two columns of data that I wish to combine.
Current set of data:
+---------+---------+
| column1 | column2 |
+---------+---------+
| a | 1 |
| b | 2 |
| c | 3 |
| d | 4 |
| | 5 |
| | 6 |
| | 7 |
+---------+---------+
For each value in column1, I need to assign all of the values in column2 so it looks like this:
+---------+---------+
| column1 | column2 |
+---------+---------+
| a | 1 |
| a | 2 |
| a | 3 |
| a | 4 |
| a | 5 |
| a | 6 |
| a | 7 |
+---------+---------+
| b | 1 |
| b | 2 |
| b | 3 |
| b | 4 |
| b | 5 |
| b | 6 |
| b | 7 |
+---------+---------+
| c | 1 |
| c | 2 |
| c | 3 |
| c | 4 |
| c | 5 |
| c | 6 |
| c | 7 |
+---------+---------+
| d | 1 |
| d | 2 |
| d | 3 |
| d | 4 |
| d | 5 |
| d | 6 |
| d | 7 |
+---------+---------+
How can I do this?
Do I need to find a macro/VB solution?
Since it seems unlikely that this will receive any other answer:
in A1: a
in B1: =MOD(ROW()-1,7)+1
in A2: =IF(MOD(ROW()-1,7)>0,CHAR(CODE(A1)),CHAR(CODE(A1)+1))
Copy both formulae down to suit.
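The row arithmetic behind those two formulae can be checked with a short Python sketch. It assumes, as the formulae do, a block size of 7 and consecutive single letters starting at "a"; the function names are hypothetical. `itertools.product` gives the expected expansion to compare against.

```python
from itertools import product


def expand(column1, column2):
    """Pair every value of column1 with every value of column2, column1 varying slowest."""
    return list(product(column1, column2))


def formula_row(row, block=7, start="a"):
    """Mirror the sheet formulae for a 1-based row number.

    Column B is MOD(ROW()-1, 7) + 1; column A moves to the next letter each
    time a new block of `block` rows begins.
    """
    letter = chr(ord(start) + (row - 1) // block)
    number = (row - 1) % block + 1
    return letter, number


expected = expand("abcd", range(1, 8))
from_formulae = [formula_row(r) for r in range(1, 29)]
print(from_formulae == expected)
```

The comparison holds for the 28 rows of the example, but the formula approach depends on column1 being consecutive letters, whereas `expand` works for arbitrary values.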
