Replace/Update Value in Column Using If statement

Replace/Update Value in Column Using If statement - excel

Consider this example. There are two columns one called grade with a number 1-10 and another called status that says pass or fail. Originally, any number 1-6 resulted in fail and everything from 7-10 said pass.
Grade | Status
1 | Fail
2 | Fail
3 | Fail
4 | Fail
5 | Fail
6 | Fail
7 | Pass
8 | Pass
9 | Pass
10 | Pass
Now I lowered the passing grade to a 6. I want to replace every fail in the status column with pass if it has a grade of 6. How would I do this using an if statement in Excel?

It sounds as if you're looking for a simple IF statement for your worksheet. If so, try this:
=IF(A2<6,"Fail","Pass")
Take this formula and drag it down your range, adjusting where necessary.
Below are some screens I threw together to show the formula a bit better.
This is how I assumed you had your original worksheet set up:
And this is how I set it up based on your request:

Related

Calculate sum of differences between rows but only if the previous cell is bigger than the current one

I started to get a headache around my problem that I cannot figure out for the love of me.
There are unknown amounts of column if that makes any difference, but basically each row needs to be compared to the previous one and ONLY when the previous value is greater, the difference between them gets added to the sum.
So for example I have this table
| A |
--|-----|
1 | 100 |
2 | 90 |
3 | 80 |
4 | 100 |
5 | 70 |
6 | 20 |
7 | 100 |
...
Expected result: 100, derived from ((100-90) + (90-80) + (100-70) + (70-20))
I have spent a whole day browsing every single excel tutorial page and cannot find a single helpful answer. Please help :(

Formula for Cell B2: (pull down through the rows).
=IF(A1>B1;A1-B1;0)+B1
Logic: If previous value is larger than current value, add the difference to the total.

If you want to do it in one formula, a basic way would be two use two ranges offset by one cell:
=SUMPRODUCT((A1:A6-A2:A7)*(A1:A6>A2:A7))
If you wanted to make a bit more dynamic (assuming there are no gaps in the data) you could try
=SUMPRODUCT((A1:INDEX(A:A,COUNT(A:A)-1)-A2:INDEX(A:A,COUNT(A:A)))*(A1:INDEX(A:A,COUNT(A:A)-1)>A2:INDEX(A:A,COUNT(A:A))))
If there are blanks between numbers, this won't work and you would probably need to go back to a simpler pull-down formula

How can I get an average formula that omits errors

I'm trying to get the average of four data points.
The problem is that one or more data points could be missing.
The average should be the average of the last four mondays or four last tuesdays etc.
Each data point is about 1000 rows apart so my idea was to "list" the dates needed and use vlookup and average.
Generic formula
// I only add two dates, but the same formula is repeated for four dates
=AVERAGE(VLOOKUP(DATE_1;Table;25;FALSE);VLOOKUP(DATE_2;Table;25;FALSE))
The DATE_1 and DATE_2 is dynamic calculations of the previous two, lets say mondays.
This works if all dates are there, but if one monday is missing VLOOKUP returns an error and the error can't be calculated as an average.
I figured I could wrap VLOOKUP with IFERROR, but I can't get that working either
// for simplicity I removed the average and only show one.
IFERROR(VLOOKUP(DATE_1;Table;25;FALSE);"") // returns empty string, can't calculate
IFERROR(VLOOKUP(DATE_1;Table;25;FALSE);0) // Works, but it skews the result with a zero.
I know AVERAGE skips empty cells, but how can I "emulate" a empty cell. "" is empty string and that is not the same.
Is there formula that can handle errors and still give me the average, or a formula that returns "empty cell"?

This is the whole point of the AGGREGATE function.
Instead of AVERAGE(SomeRange), use AGGREGATE(1, 6, SomeRange). Instead of AVERAGE(Value1, Value2), use AGGREGATE(1, 6, Value1, Value2)
The 1 tells AGGREGATE to calculate the AVERAGE, and the 6 tells it "Ignore error values". A full list of the values is at the bottom of this post
=AGGREGATE(1,6,VLOOKUP(DATE_1;Table;25;FALSE);VLOOKUP(DATE_2;Table;25;FALSE))
(As people have pointed out, this doesn't quite work properly without interim calculation cells - when you use a Formula in the function, Excel refuses to accept it in Reference Form)
Reference form: AGGREGATE(function_num, options, ref1, [ref2], …)
Array form: AGGREGATE(function_num, options, array, [k])
Function_num | Function
1 | AVERAGE
2 | COUNT
3 | COUNTA
4 | MAX
5 | MIN
6 | PRODUCT
7 | STDEV.S
8 | STDEV.P
9 | SUM
10 | VAR.S
11 | VAR.P
12 | MEDIAN
13 | MODE.SNGL
14 | LARGE
15 | SMALL
16 | PERCENTILE.INC
17 | QUARTILE.INC
18 | PERCENTILE.EXC
19 | QUARTILE.EXC
Option | Behaviour
0 | Ignore nested SUBTOTAL and AGGREGATE functions
1 | Ignore hidden rows, nested SUBTOTAL and AGGREGATE functions
2 | Ignore error values, nested SUBTOTAL and AGGREGATE functions
3 | Ignore hidden rows, error values, nested SUBTOTAL and AGGREGATE functions
4 | Ignore nothing
5 | Ignore hidden rows
6 | Ignore error values
7 | Ignore hidden rows and error values

Just giving you a different approach even though it is not exactly the same way as your question is going, I just thought to share how i've solved a similar issue.
No lookup table in this one, I personnally try to avoid these in these situations, as you always have to update them given some conditions.
{=AVERAGE(IF((WEEKDAY(A1:A276,2)=1)*((L1:L276)>0)*((A1:A276)>((TODAY())-29)),L1:L276,""))}
array formula, so ctrl+shift+enter
(WEEKDAY(A1:A276,2)=1) tests if it's a monday
(L1:L276)>0) is where I have values and therefore want to ignore zeros
((A1:A276)>((TODAY())-29)) added this one for you to check if it's less than 4 weeks old
if these conditions are fulfilled the respective value in L:L is take for the average (A:A being the date in this example)

Here's one more solution for you to try (array formula - Ctrl+Shift+Enter):
=AVERAGE(IF(ISNUMBER(MATCH($A$1:$A$10,CHOOSE({1,2,3,4},DATE_1,DATE_2,DATE_3,DATE_4),0)),$B$1:$B$10))
Result:

INDEX & MATCH doesn't produce the same problem, try this:
=AVERAGE(INDEX(YourColumnRange;MATCH(DATE_1;Table));INDEX(YourColumnRange;MATCH(DATE_2;Table)))
EDIT: To match your dataset you can manually calculate an average like this:
=SUM(IFERROR(INDEX(Y:Y,MATCH(Date_1,A:A,0)),0), IFERROR(INDEX(Y:Y,MATCH(Date_2,A:A,0)),0))/
COUNT(INDEX(Y:Y,MATCH(Date_1,A:A,0)), INDEX(Y:Y,MATCH(Date_2,A:A,0)))
This will allow you to skip empty rows. The formula is simplified for a generic case to make it easier to read.

Finding % based on multiple cells and using MATCH formula

Trying to come up with a formula for a spreadsheet I am working on, but hit a road block.
I need to use an INDEX and MATCH formula to ensure that the cell is going to lookup a sheet full of data and find all matches based on an ID number. I then need to work out the % value of how often that column was populated with data for that ID number.
Essentially as an example:
The data table would look like:
ID | Notes Added
1 | Yes
7 | Yes
12 | Yes
6 | Yes
10 |
12 |
And the result would be:
ID | Populated %
12 | 50 %
As you can see, the values that it should return would be 50%, as there are 2 entries that are listed for the ID of 12 (this is where the MATCH and INDEX would come in), however in the Notes Added, only 1 of the entries has a value.
The data table that it draws from will either have a value, or have no value, calculating the % would be based on if there is/isn't a value.

You'll want to use COUNTIFS Function. It would look something like this:
=COUNTIFS(B4:B9,C16,C4:C9,"Yes") / COUNTIF(B4:B9,C16)
=COUNTIFS(<ID Values Range>,<Id Value>,<notes added range>,"Yes") / COUNTIF(<ID Values Range>,<Id Value>)
Then just set the cell formatting to %.

How to resolve duplicate column names in excel file with Alteryx?

I have a wide excel file with price data, looking like this
Product | 2015-08-01 | 2015-09-01 | 2015-09-01 | 2015-10-01
ABC | 13 | 12 | 15 | 14
CDE | 69 | 70 | 71 | 67
FGH | 25 | 25 | 26 | 27
The date 2015-09-01 can be found twice, which in the context is valid but obviously messes up my workflow.
It can be understood that the first value is the minimum price, the second one the maximum price. If there is only one column, min and max are the same.
Is there a way to resolve this issue?
An idea I had was the following:
I also have cells that contain a value like "38 - 42", again indicating min and max. I resolved this by spliting it based on a Regex expression. What could be a solution is to join two columns that have the same header, to afterwards split the values according to my rules. That however would require me to detect dynamically if the headers are duplicates.
Is that something that is possible in Alteryx or is there an easier solution for this problem?
And of course asking the supplier of the file to change it is not really an option, unfortunatelly.
Thanks
EDIT:
Just got another idea:
I transpose the table to have the format
Product | Date | Price Low | Price High
So if I could check for duplicates in that table and somehow merge these records into one, that would do the trick as well.
EDIT2:
Since I seem to haven't made that clear, my final result should look like the transposed table in EDIT1. If there is only one value it should go in "Price Low" (and then I will probably copy it to "Price High" anyway. If there are two values they should go in the according columns. #Poornima's suggestion resolves the duplicate issue in a more sophisticated form than putting a "_2" behind the column name, but doesn't put the value in the required column.

If this format works for you:
Product | Date | Price Low | Price High
Then:
- Transpose with Product as a key field
- Use a select tool to truncate your Name field to 10 characters. This will remove any _2 values that Alteryx has automatically renamed.
- Summarize:
Group by Product
Group by Name
Then apply Min and Max operations to value.
Result is:
Product | Name | Min_Value | Max_Value
ABC | 2015-08-01 | 13 | 13
ABC | 2015-09-01 | 12 | 15
ABC | 2015-10-01 | 14 | 14

For this problem, you can leverage the native Excel (.xlsx) driver available in Alteryx 9.1. If multiple columns in Excel use the same string, then they are renamed by the native driver with an underscore at the end e.g., 2015-09-01, 2015-09-01_1. By leveraging this, we can reformat the data in three steps:
As you suggested, we start by transposing the data so that we can leverage the column headers.
We can then write a formula with the Formula Tool that evaluates whether the column header for the date is the first or the last one based on the header length.
The final step would be to bring the data back into the same format as before, which can be via the Crosstab Tool.
You can review the configurations for each of these tools here. The end result would be as follows.
Hope this helps.
Regards,
Poornima

Count number of rows where multiple criteria are met

I'm trying to generate a table that shows a count of how many items are in any given status on any given day. My result table has a set of Dates down column A and column headers are various statuses. A sample of my data table with headers looks like this:
Product | Notice | Assigned | Complete | In Office | In Accounting
1 | 5/5/13 | 5/7/13 | 5/9/13 | 5/10/13 | 5/11/13
2 | 5/5/13 | 5/6/13 | 5/8/13 | 5/9/13 | 5/10/13
3 | 5/6/13 | 5/9/13 | 5/10/13 | 5/10/13 | 5/10/13
4 | 5/4/13 | 5/5/13 | 5/7/13 | 5/8/13 | 5/9/13
5 | 5/7/13 | 5/8/13 | 5/10/13 | 5/11/13 | 5/11/13
If my output table were to contain a set of dates in the first column with the statuses as headers, I need a count of how many rows were at the given status and had not yet transitioned to the next status so that in the Notice column, I'd have a count of rows where the Notice Date was <= X AND where the Assigned, Complete, In Office, In Accounting are all greater than X.
I've used a Sum(if(frequency(if statement to get me REALLY close but I feel like I need to have an AND statement within the second IF like this =SUM(IF(FREQUENCY(IF(AND
Here's what I have that won't work:
=SUM(IF(FREQUENCY(IF(AND(Table1[Assigned]<=A279,Table1[[Complete]:[In Accounting]]<=A279),ROW(Table1[[Complete]:[In Accounting]])),ROW(Table1[[Complete]:[In Accounting]]))>0,1))
If I take the "AND" portion out, this works fine except I need it to ONLY count rows where the given status actually has a date so if an "Assigned" date is empty, I don't want that row to be counted for the Assigned column.
Here's an example of what I'd expect to see in the results. I've listed the count in the each column as well as the corresponding product numbers in parenthesis. The corresponding product numbers are for reference only and won't actually be in the result table.
Date | Notice | Assigned | Complete
5/6 | 2 (1,3) | 2 (2,4) | 0
5/7 | 2 (3,5) | 2 (1,2) | 1 (4)
5/8 | 1 (3) | 2 (1,5) | 1 (2)

OK, assuming you have the original data in A1:F6 then with 2nd table headers in B9:D9 and row labels in A10:A12 then you can use this "array formula" in B10
=SUM((B$2:B$6<=$A10)*(MMULT((C$2:$F$6>$A10)+(C$2:$F$6=""),TRANSPOSE(COLUMN(C$2:$F$6)^0))=COLUMNS(C$2:$F$6)))
confirmed with CTRL+SHIFT+ENTER and copied down and across (see screenshot below)
As you can see the results are as per your requirement. If you replace dates with blanks it will still work
MMULTis a way to get a single value from each row even when you are looking at multiple columns.
I used cell references because I think that's easier, especially when copying the formula across and having a reducing range.......but you can use structured references if you want

Have you tried using COUNTIFS to count based on multiple criteria. It is fairly well documented here: http://office.microsoft.com/en-us/excel-help/countifs-function-HA010047494.aspx (2007+ only)
Basically, you use it like
=COUNTIFS(first_range_to_check, value_you_want_in_first_range, ...)
where the ... represents as many pairs as you want (up to 127 total pairs), note the conditions are AND connection so if you have two pairs, the first pair AND the second pair must return true for that row to count.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string