Distinct Values based on two columns in Excel - excel

I must be tired because I feel like I've done this before, but just banging my head tonight.
Anyway, I have a data sheet: 10 columns, 800 rows (an Excel table, if you will).
On a separate sheet I have various drop-downs. I want a drop-down that is based on the first two selections the user makes.
So if the user selects 10 in cell A2 and 12 in cell B2, I want to filter the Excel table for values in column D based on those two selections.
I've looked at formulas and data validation. I'm using data validation for cells A2 and B2, but those are just plain lists.
I tried a third data validation using an IF statement, but that keeps failing.
datasheet could contain:
column1 | column2 | column3 | etc...
10 | 12 | 22
11 | 13 | 23
11 | 33 | 23
10 | 12 | 25
11 | 13 | 24
10 | 12 | 26
What I think should work, something like:
if(A2=10, list1, list2)
My issue is that list1 is where I need to select distinct values from column 3 after filtering on columns 1 and 2, but I can't get it to work to save my life...
When I try the formula listed above, it barfs, telling me I need a delimited list...
Just looking for pointers in the right direction. Like I said, I feel like I've done this before, I just can't recall it tonight. Thanks in advance for any direction. Respectfully.
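To pin down the result the third drop-down should offer, the filtering can be sketched outside Excel. This is only a minimal Python sketch of the logic, not an Excel solution: the rows are copied from the sample above, and `distinct_third_column` is a hypothetical helper name.

```python
# Sample rows copied from the question: (column1, column2, column3).
rows = [
    (10, 12, 22),
    (11, 13, 23),
    (11, 33, 23),
    (10, 12, 25),
    (11, 13, 24),
    (10, 12, 26),
]


def distinct_third_column(rows, first, second):
    """Return distinct column3 values, in first-seen order, for rows
    whose column1 and column2 match the two selections."""
    seen = []
    for col1, col2, col3 in rows:
        if col1 == first and col2 == second and col3 not in seen:
            seen.append(col3)
    return seen


# A2 = 10 and B2 = 12, as in the question's example.
choices = distinct_third_column(rows, 10, 12)
print(choices)
```

With the sample data, selecting 10 and 12 should leave 22, 25 and 26 as the choices for the third list.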

Related

Excel countif and sumif together

I am trying to write a formula in Excel which will count how many times we have sold less than 50 of a particular product. For example, here is a day's sales:
Order | Product | Qty
1 | A | 5
2 | A | 5
3 | A | 5
4 | B | 30
5 | C | 75
I want a formula in a cell which says how many times we have a requirement for less than 50 of a certain product. So in the example above, there is a total of 15 As, 30 Bs and 75 Cs, so 2 of those are less than 50.
I think it will need to be an array function of COUNTIF and SUM, but can't figure it out.
You could use this formula:
=SUMPRODUCT(--(IF(ROW($B$2:$B$10)=MATCH($B$2:$B$10,$B$1:$B$10,0),SUMIF($B$2:$B$10,$B$2:$B$10,$C$2:$C$10),"")<50))
Note: It's an array formula and must be entered through Ctrl+Shift+Enter
Product order placement can be randomized and does not have to be in order.
Another way
=SUMPRODUCT((SUMIF(B2:B10,B2:B10,C2:C10)<50)/COUNTIF(B2:B10,B2:B10))
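The division by COUNTIF is what makes each product count only once: every row of a product contributes 1/n, where n is the number of rows for that product. A minimal Python sketch of that arithmetic, using the sample data from the question (the function name `count_products_below` is hypothetical, and exact fractions stand in for Excel's floating-point division):

```python
from fractions import Fraction

# Sample data from the question: (order, product, qty).
orders = [(1, "A", 5), (2, "A", 5), (3, "A", 5), (4, "B", 30), (5, "C", 75)]


def count_products_below(orders, limit):
    """Mimic SUMPRODUCT((SUMIF(...) < limit) / COUNTIF(...)) row by row."""
    total = Fraction(0)
    for _, product, _ in orders:
        # SUMIF: total quantity for this row's product.
        product_total = sum(q for _, p, q in orders if p == product)
        # COUNTIF: number of rows sharing this row's product.
        product_rows = sum(1 for _, p, _ in orders if p == product)
        # Each row contributes 1/n, so the n rows of a product add up to 1.
        if product_total < limit:
            total += Fraction(1, product_rows)
    return int(total)


result = count_products_below(orders, 50)
print(result)
```

For the sample, the three A rows contribute 1/3 each and the B row contributes 1, so the total is 2.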
Maybe something like that will help:
=SUMPRODUCT(--IF($B$2:$B$11<>$B$1:$B$10,SUMIF($B$2:$B$11,$B$2:$B$11,$C$2:$C$11)<50,0))
Note that this is an array formula so needs to be entered with Ctrl+Shift+Enter. Data needs to be sorted by Product (i.e. product A cannot appear in random rows, like row 2, 20 and 100; it needs to be grouped together).

How to resolve duplicate column names in excel file with Alteryx?

I have a wide excel file with price data, looking like this
Product | 2015-08-01 | 2015-09-01 | 2015-09-01 | 2015-10-01
ABC | 13 | 12 | 15 | 14
CDE | 69 | 70 | 71 | 67
FGH | 25 | 25 | 26 | 27
The date 2015-09-01 can be found twice, which in the context is valid but obviously messes up my workflow.
It can be understood that the first value is the minimum price, the second one the maximum price. If there is only one column, min and max are the same.
Is there a way to resolve this issue?
An idea I had was the following:
I also have cells that contain a value like "38 - 42", again indicating min and max. I resolved this by splitting it with a regex. A possible solution would be to join two columns that have the same header and afterwards split the values according to my rules. That, however, would require me to detect dynamically whether the headers are duplicates.
Is that something that is possible in Alteryx or is there an easier solution for this problem?
And of course asking the supplier of the file to change it is not really an option, unfortunately.
Thanks
EDIT:
Just got another idea:
I transpose the table to have the format
Product | Date | Price Low | Price High
So if I could check for duplicates in that table and somehow merge these records into one, that would do the trick as well.
EDIT2:
Since I don't seem to have made that clear: my final result should look like the transposed table in EDIT1. If there is only one value, it should go in "Price Low" (and then I will probably copy it to "Price High" anyway). If there are two values, they should go in the corresponding columns. @Poornima's suggestion resolves the duplicate issue in a more sophisticated way than putting a "_2" behind the column name, but doesn't put the value in the required column.
If this format works for you:
Product | Date | Price Low | Price High
Then:
- Transpose with Product as a key field
- Use a select tool to truncate your Name field to 10 characters. This will remove any _2 values that Alteryx has automatically renamed.
- Summarize:
  - Group by Product
  - Group by Name
  - Then apply Min and Max operations to Value.
Result is:
Product | Name | Min_Value | Max_Value
ABC | 2015-08-01 | 13 | 13
ABC | 2015-09-01 | 12 | 15
ABC | 2015-10-01 | 14 | 14
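The transpose, truncate and summarize steps above can be sketched in plain Python to check the expected output. This is not Alteryx configuration, just the same logic under stated assumptions: the duplicate header is assumed to arrive renamed with a "_2" suffix, and `min_max_by_date` is a hypothetical name.

```python
# Headers and rows from the question; the "_2" suffix is an assumption
# about how the duplicate header is renamed on import.
headers = ["Product", "2015-08-01", "2015-09-01", "2015-09-01_2", "2015-10-01"]
data = [
    ["ABC", 13, 12, 15, 14],
    ["CDE", 69, 70, 71, 67],
    ["FGH", 25, 25, 26, 27],
]


def min_max_by_date(headers, data):
    """Transpose, truncate names to 10 characters, then group with min/max."""
    grouped = {}
    for row in data:
        product = row[0]
        for name, value in zip(headers[1:], row[1:]):
            key = (product, name[:10])  # "2015-09-01_2" -> "2015-09-01"
            grouped.setdefault(key, []).append(value)
    return [(p, d, min(v), max(v)) for (p, d), v in grouped.items()]


summary = min_max_by_date(headers, data)
for line in summary:
    print(line)
```

Truncating to 10 characters works here only because every header is an ISO date of exactly that length.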
For this problem, you can leverage the native Excel (.xlsx) driver available in Alteryx 9.1. If multiple columns in Excel use the same string, then they are renamed by the native driver with an underscore at the end e.g., 2015-09-01, 2015-09-01_1. By leveraging this, we can reformat the data in three steps:
- As you suggested, we start by transposing the data so that we can work with the column headers.
- We can then write a formula with the Formula Tool that evaluates whether the column header for the date is the first or the last one, based on the header length.
- The final step is to bring the data back into the same format as before, which can be done via the Crosstab Tool.
You can review the configurations for each of these tools here. The end result would be as follows.
Hope this helps.
Regards,
Poornima

Find value using multiple criteria

I've got a table with columns, each containing customer contact information. I've also got a formula that finds a phone number using multiple criteria: customer ID, type (mobile, home etc), and primary Y/N. The problem is this information can occur several times but with a different date, in which case the newest occurrence needs to be selected. The current CSE formula is:
=INDEX($C$6:$BZ$18;10;MATCH(<client_ID>;IF(($C$8:$BZ$8=<client_ID>)*($C$17:$BZ$17="home")*($C$18:$BZ$18="Y");$C$8:$BZ$8);0))
where
$C$6:$BZ$18 contains all data
$C$8:$BZ$8 contains all client IDs
$C$17:$BZ$17 contains the types of phone numbers
$C$18:$BZ$18 contains whether this number is the primary number of that type
$C$16:$BZ$16 contains the date a number was entered
The data looks like this:
   B            | C          | D          |
---------------------------------------------------------------------
 8 CLIENTID     | Client1    | Client1    |
 9 other        |            |            |
10 other        |            |            |
11 other        |            |            |
12 other        |            |            |
13 other        |            |            |
14 other        |            |            |
15 PHONE NUMBER | 9876543210 | 1234567890 |
16 DATE         | 2015-04-15 | 2015-04-16 |
17 TYPE         | Home       | Home       |
18 Primary      | Y          | Y          |
The above formula selects phone number 9876543210 but it needs to select 1234567890 because that is the latest entry.
Any ideas on how to proceed from here?
The underlying values of dates are numbers, so we can find the furthest-right date in a row by using the MATCH function to search for an impossibly high number without requiring an exact match.
      
The array formula in F6 is,
=INDEX($B$8:$BZ$18, MATCH(F$5, $B$8:$B$18, 0), MATCH(1E+99, IF($B$8:$BZ$8=$C6, IF($B$17:$BZ$17=$D6, IF($B$18:$BZ$18=$E6, $B$16:$BZ$16)))))
Array formulas need to be finalized with Ctrl+Shift+Enter↵.
If your dates are not in ascending order (left-to-right), then an exact match will have to be sought. A three-criteria pseudo-MAXIF formula can feed the latest date into the original formula, modified to look for an exact match. If the maximum date is duplicated, the first one is returned.
=INDEX($C$8:$BZ$18, MATCH(F$5, $B$8:$B$18, 0), MATCH(MAX(INDEX($C$16:$BZ$16*($C$8:$BZ$8=$C6)*($C$17:$BZ$17=$D6)*($C$18:$BZ$18=$E6), , )), IF($C$8:$BZ$8=$C6, IF($C$17:$BZ$17=$D6, IF($C$18:$BZ$18=$E6, $C$16:$BZ$16))), 0))
In order to provide some maths without errors, I've shifted the calculation ranges to C:BZ. Array formulas still need to be finalized with Ctrl+Shift+Enter↵.
By appropriately locking either the row, column or both of the cell addresses, we can use the column header to identify a different category from column B as I have done with DATA LINE. The formula can be simply filled right.
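The lookup logic behind the exact-match variant (filter on three criteria, take the latest date, return that column's phone number) can be sketched in Python. This is an illustration only, with the two columns from the question as data; the dictionary keys and the name `latest_phone` are hypothetical.

```python
from datetime import date

# One dict per worksheet column, using the rows shown in the question.
columns = [
    {"client": "Client1", "phone": "9876543210", "date": date(2015, 4, 15),
     "type": "Home", "primary": "Y"},
    {"client": "Client1", "phone": "1234567890", "date": date(2015, 4, 16),
     "type": "Home", "primary": "Y"},
]


def latest_phone(columns, client, phone_type, primary):
    """Return the phone number from the newest matching column, or None."""
    matches = [
        c for c in columns
        if c["client"] == client
        # Excel's = comparison ignores case, so compare case-insensitively.
        and c["type"].lower() == phone_type.lower()
        and c["primary"].lower() == primary.lower()
    ]
    if not matches:
        return None
    # max() keeps the first column when the latest date is duplicated.
    return max(matches, key=lambda c: c["date"])["phone"]


phone = latest_phone(columns, "Client1", "home", "Y")
print(phone)
```

Because the maximum is taken over dates rather than positions, the result does not depend on the columns being sorted.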

PowerPivot transpose column headers on rows

I have two data sources. One comes from an Excel file I cannot change, with this format:
item | week 1 | week 2 | ...
ITM01| 12 | 23 | ...
My second datasource comes from a query and looks like
item | Week | Value
ITM01| 1 | 5
ITM02| 2 | 10
...
I need to merge the two tables to have, hopefully, something like
item | Week | Value 1 | Value 2
ITM01| 1 | 5 | 12
ITM01| 2 | 10 | 23
...
I'd like to achieve this in PowerPivot, considering that I cannot change the Excel data source, and I would like it to be updatable using Excel's refresh button. I think that means I should not create custom tables to handle the transposition, as that might break the refresh.
I'm really lost on how to achieve this and some help would be much appreciated.
I'd also add that I can change the second data source to look like the first one (weeks on columns), but while I might be able to connect the two tables, I would still not know how to achieve the desired output with weeks on rows.
Thanks a lot for your time.
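The reshaping being asked for is an unpivot of the wide table followed by a join on item and week. A minimal Python sketch of that logic, not a PowerPivot solution: the sample values come from the tables above, except that both query rows are assumed to belong to ITM01 so that they match the desired output, and the function names are hypothetical.

```python
# Wide source, as in the Excel file: one column per week.
wide_headers = ["item", "week 1", "week 2"]
wide_rows = [["ITM01", 12, 23]]

# Long source, as returned by the query: (item, week, value).
# Assumption: both rows belong to ITM01, matching the desired output.
long_rows = [("ITM01", 1, 5), ("ITM01", 2, 10)]


def unpivot(headers, rows):
    """Turn 'week N' columns into {(item, N): value}."""
    result = {}
    for row in rows:
        item = row[0]
        for header, value in zip(headers[1:], row[1:]):
            week = int(header.split()[-1])  # "week 2" -> 2
            result[(item, week)] = value
    return result


def merge(long_rows, wide_lookup):
    """Join on (item, week); Value 2 is None when the wide source has no match."""
    return [
        (item, week, value, wide_lookup.get((item, week)))
        for item, week, value in long_rows
    ]


merged = merge(long_rows, unpivot(wide_headers, wide_rows))
print(merged)
```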

Excel Formulas: Show total based on date entry

I've got a spreadsheet with two columns that represent the number of processed records, and the date the records were processed. In some cases, the records can be processed in multiple batches, so the document looks something like this:
33 4/1/2009
22 4/1/2009
12 4/2/2009
13 4/4/2009
36 4/4/2009
I'm trying to add a new set of columns that contain a date and show the total number of records for that date, automagically:
4/1/2009 55
4/2/2009 12
4/3/2009 0
4/4/2009 49
I know how to do this algorithmically, and I could probably manipulate the spreadsheet outside of Excel, but I'm trying to do this in the live spreadsheet, and am a bit bewildered as to how to pull it off.
Any ideas?
Thanks!
IVR Avenger
Will the SUMIF function work for you? SUMIF([range],[criteria],[sum_range]) I think you could set range = the set of cells containing dates in your first listing, criteria would be the cell containing the date in the second listing, and sum_range would be the counts in the first column of your first listing.
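What SUMIF computes here can be sketched in Python as a conditional sum per date. This is an illustration of the semantics only, assuming the dates are month/day/year; the `sumif` helper is a hypothetical stand-in, not Excel's function.

```python
from datetime import date

# (records, processed date) rows from the question.
batches = [
    (33, date(2009, 4, 1)),
    (22, date(2009, 4, 1)),
    (12, date(2009, 4, 2)),
    (13, date(2009, 4, 4)),
    (36, date(2009, 4, 4)),
]


def sumif(batches, criteria):
    """Sum the record counts whose date equals the criteria date."""
    return sum(count for count, day in batches if day == criteria)


# The second listing: one row per calendar day, including days with no batches.
totals = {date(2009, 4, d): sumif(batches, date(2009, 4, d)) for d in range(1, 5)}
for day, total in totals.items():
    print(day.isoformat(), total)
```

A date with no matching rows, such as April 3, simply sums to 0.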
I would suggest using a Pivot Table. Put the dates into the row area and 'sum of' records in the data area. Nothing in the columns area.
A pivot table will be more dynamic than a formula solution because it will only show you dates that exist.
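The pivot-table behaviour described here amounts to a group-by with a sum, which is why dates without records never show up. A small Python sketch of that grouping, using the question's data (the name `pivot_sum` is hypothetical):

```python
from collections import defaultdict

# (records, date string) rows from the question.
batches = [(33, "4/1/2009"), (22, "4/1/2009"), (12, "4/2/2009"),
           (13, "4/4/2009"), (36, "4/4/2009")]


def pivot_sum(batches):
    """Group by date and sum; dates with no rows never appear."""
    totals = defaultdict(int)
    for count, day in batches:
        totals[day] += count
    return dict(totals)


pivot = pivot_sum(batches)
print(pivot)
```

Note that 4/3/2009 is absent from the result, so this approach does not produce the zero row shown in the question's desired output.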
Assuming your dates are in column B and the numbers to be accumulated are in A, you could try something like this:
  | A  | B        | C         | D                            |
1 | 33 | 4/1/2009 | =MIN(B:B) | {=SUM(IF(B1:B5=C1,A1:A5,0))} |
2 | 22 | 4/1/2009 | =C1+1     | {=SUM(IF(B1:B5=C2,A1:A5,0))} |
3 | 12 | 4/2/2009 | =C2+1     | {=SUM(IF(B1:B5=C3,A1:A5,0))} |
4 | 13 | 4/4/2009 | =C3+1     | {=SUM(IF(B1:B5=C4,A1:A5,0))} |
5 | 36 | 4/4/2009 | =C4+1     | {=SUM(IF(B1:B5=C5,A1:A5,0))} |
Note the {}, which signifies an array formula (entered with Ctrl+Shift+Enter). For any non-trivial amount of data it's heaps faster than SUMIF().
I'd be inclined to define dynamic names for the A1:A5 and B1:B5 parts, something like
=OFFSET(A1,0,0,COUNT(A:A),1)
so that I didn't have to keep fixing up my formulae.
There's still a manual element: adding new rows for extra dates, for example - that might be a good place for a little VBA. Alternatively, if you can get away with showing, for example, the last 90 days' totals, then you could fix the number of rows used.
