PowerPivot transpose column headers on rows - excel

I have two data sources. One comes from an Excel file that I cannot change, in this format:
item | week 1 | week 2 | ...
ITM01| 12 | 23 | ...
My second data source comes from a query and looks like this:
item | Week | Value
ITM01| 1 | 5
ITM02| 2 | 10
...
I need to merge the two tables to have, hopefully, something like
item | Week | Value 1 | Value 2
ITM01| 1 | 5 | 12
ITM01| 2 | 10 | 23
...
I'd like to achieve this in PowerPivot. I cannot change the Excel data source, and I would like the result to be updateable with Excel's Refresh button. I think that means I should not create custom tables to handle the transposition, as that might break the refresh.
I'm really lost on how to achieve this and some help would be much appreciated.
I'd also add that I can change the second data source to look like the first one (weeks on columns). But while I might then be able to connect the two tables, I still would not know how to achieve the desired output with weeks on rows.
Thanks a lot for your time.
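To make the target shape concrete, here is a minimal sketch in Python (not PowerPivot, and it does not address the refresh requirement). It unpivots the "week N" columns into rows and then joins on item and week. The function names `unpivot` and `merge` are made up for illustration, and the sample data assumes both query rows belong to ITM01, as the desired output suggests.

```python
# Wide table, shaped like the Excel source: one column per week.
wide_rows = [
    {"item": "ITM01", "week 1": 12, "week 2": 23},
]

# Long table, shaped like the query source.
# Assumption: both rows belong to ITM01, matching the desired output.
long_rows = [
    {"item": "ITM01", "Week": 1, "Value": 5},
    {"item": "ITM01", "Week": 2, "Value": 10},
]


def unpivot(rows, key="item", prefix="week "):
    """Turn 'week N' columns into a {(item, week): value} lookup."""
    lookup = {}
    for row in rows:
        for column, value in row.items():
            if column.startswith(prefix):
                week = int(column[len(prefix):])
                lookup[(row[key], week)] = value
    return lookup


def merge(long_rows, wide_rows):
    """Join the long table to the unpivoted wide table on (item, week)."""
    lookup = unpivot(wide_rows)
    return [
        (r["item"], r["Week"], r["Value"], lookup.get((r["item"], r["Week"])))
        for r in long_rows
    ]


merged = merge(long_rows, wide_rows)
for row in merged:
    print(row)
```

Rows with no matching week in the wide table get `None` in the last position, which corresponds to a left join from the query table.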

Related

Distinct Values based on two columns in Excel

I must be tired because I feel like I've done this before, but just banging my head tonight.
Anyway, I have a data sheet: 10 columns, 800 rows (an Excel table, if you will).
On a separate sheet I have various drop-downs. I want a drop-down that is based on the first two selections the user makes.
So if the user selects 10 in cell A2 and 12 in cell B2, I want to filter the Excel table for values in column D based on those two selections.
I've looked at formulas and data validation. I'm using data validation for cells A2 and B2, but those are just plain lists.
I tried a third data validation using an IF statement, but that keeps failing.
datasheet could contain:
column1 |column2 | column3|etc...
10 | 12 | 22
11 | 13 | 23
11 | 33 | 23
10 | 12 | 25
11 | 13 | 24
10 | 12 | 26
What I think should work, something like:
if(A2=10, list1, list2)
My issue is that list1 is where I need to select distinct values from column 3 after filtering on columns 1 and 2, but I can't get it to work to save my life...
When I try the formula above, it barfs, telling me I need a delimited list...
Just looking for pointers in the right direction. Like I said, I feel like I've done this before; I just can't recall it tonight. Thanks in advance for any direction. Respectfully.
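The logic being asked for (filter on the first two columns, then take the distinct values of the third) can be sketched in Python as follows. This only illustrates the filtering step, not an Excel data-validation formula; the function name `distinct_third` is hypothetical, and the rows are the sample data from the question.

```python
# Sample rows from the question: (column1, column2, column3)
rows = [
    (10, 12, 22),
    (11, 13, 23),
    (11, 33, 23),
    (10, 12, 25),
    (11, 13, 24),
    (10, 12, 26),
]


def distinct_third(rows, first, second):
    """Return distinct column3 values where column1 and column2 match,
    keeping the order in which they first appear."""
    seen = []
    for c1, c2, c3 in rows:
        if c1 == first and c2 == second and c3 not in seen:
            seen.append(c3)
    return seen


# Values that would feed the third drop-down when A2 = 10 and B2 = 12.
options = distinct_third(rows, 10, 12)
print(options)
```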

Pruning specific duplicate rows in VBA

I have a list of purchase orders that's updated daily (it's a large list). As orders are complete, they are added to the list as "closed" instead of "open", but the "open" entry stays on the list. I want to find duplicate entries (duplicate rows of data except open/closed status) and remove the duplicate "Open" entry.
I know how to use the RemoveDuplicates method, but I suspect that method does not have the ability to do what I want. I can do this by looping through the Open orders to find duplicates, but I'm assuming there's a slightly cleaner more efficient method. Also order numbers are recycled so Order Number and Date have to match for it to be a duplicate order.
The data is structured like this:
A | B | C
Order Number | Date | Closed/Open
--------------------------------------
1 | 1 | Open
2 | 1 | Open
1 | 2 | Open
4 | 2 | Open
1 | 2 | Open
1 | 1 | Closed
In the above case, the first and last rows have duplicate columns A and B. In this case, I want to remove only the one that's "Open" and keep the other.
Does anyone know of a slick way of doing this? Or should I just loop through all the open orders to see whether any are closed?
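One way to avoid a nested loop is to collect the (order number, date) pairs of all closed orders first, then drop any open row whose pair is in that set. The sketch below shows the idea in Python rather than VBA; `prune_open_duplicates` is a made-up name, and the rows are the sample data from the question.

```python
# Sample rows from the question: (order_number, date, status)
orders = [
    (1, 1, "Open"),
    (2, 1, "Open"),
    (1, 2, "Open"),
    (4, 2, "Open"),
    (1, 2, "Open"),
    (1, 1, "Closed"),
]


def prune_open_duplicates(orders):
    """Drop 'Open' rows whose (order_number, date) also appears as 'Closed'."""
    closed = {(number, date) for number, date, status in orders if status == "Closed"}
    return [
        (number, date, status)
        for number, date, status in orders
        if not (status == "Open" and (number, date) in closed)
    ]


pruned = prune_open_duplicates(orders)
for row in pruned:
    print(row)
```

This is two passes over the data instead of one pass per open order. Note that it leaves the two identical open rows for order 1 on date 2 alone, since the question only asks to remove open entries that have a closed counterpart.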

How to resolve duplicate column names in excel file with Alteryx?

I have a wide excel file with price data, looking like this
Product | 2015-08-01 | 2015-09-01 | 2015-09-01 | 2015-10-01
ABC | 13 | 12 | 15 | 14
CDE | 69 | 70 | 71 | 67
FGH | 25 | 25 | 26 | 27
The date 2015-09-01 can be found twice, which in the context is valid but obviously messes up my workflow.
It can be understood that the first value is the minimum price, the second one the maximum price. If there is only one column, min and max are the same.
Is there a way to resolve this issue?
An idea I had was the following:
I also have cells that contain a value like "38 - 42", again indicating min and max. I resolved this by splitting the value with a regular expression. One possible solution is to join two columns that have the same header and then split the values according to my rules. That, however, would require me to detect dynamically whether the headers are duplicates.
Is that something that is possible in Alteryx or is there an easier solution for this problem?
And of course, asking the supplier of the file to change it is not really an option, unfortunately.
Thanks
EDIT:
Just got another idea:
I transpose the table to have the format
Product | Date | Price Low | Price High
So if I could check for duplicates in that table and somehow merge these records into one, that would do the trick as well.
EDIT2:
Since I don't seem to have made that clear: my final result should look like the transposed table in EDIT1. If there is only one value, it should go in "Price Low" (and then I will probably copy it to "Price High" anyway). If there are two values, they should go in the corresponding columns. @Poornima's suggestion resolves the duplicate issue in a more sophisticated way than putting a "_2" behind the column name, but doesn't put the value in the required column.
If this format works for you:
Product | Date | Price Low | Price High
Then:
- Transpose with Product as a key field
- Use a select tool to truncate your Name field to 10 characters. This will remove any _2 values that Alteryx has automatically renamed.
- Summarize:
  - Group by Product
  - Group by Name
  - Then apply Min and Max operations to Value.
Result is:
Product | Name | Min_Value | Max_Value
ABC | 2015-08-01 | 13 | 13
ABC | 2015-09-01 | 12 | 15
ABC | 2015-10-01 | 14 | 14
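The three steps above (transpose, truncate the name to 10 characters, then group and take min and max) can be sketched in Python to show the logic outside Alteryx. The function name `low_high` is hypothetical, and the "_2" suffix on the duplicate header is an assumption about how the column was renamed on import.

```python
# Header row as it might look after import; the "_2" suffix is an assumption.
headers = ["Product", "2015-08-01", "2015-09-01", "2015-09-01_2", "2015-10-01"]
data = [
    ["ABC", 13, 12, 15, 14],
    ["CDE", 69, 70, 71, 67],
    ["FGH", 25, 25, 26, 27],
]


def low_high(headers, data):
    """Transpose with Product as key, truncate names to 10 characters,
    then take min and max per (product, date)."""
    grouped = {}
    for record in data:
        product = record[0]
        for name, value in zip(headers[1:], record[1:]):
            date = name[:10]  # "2015-09-01_2" becomes "2015-09-01"
            grouped.setdefault((product, date), []).append(value)
    return [
        (product, date, min(values), max(values))
        for (product, date), values in grouped.items()
    ]


result = low_high(headers, data)
for row in result:
    print(row)
```

When a date appears only once, min and max are the same value, which matches the rule that a single price is both the low and the high.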
For this problem, you can leverage the native Excel (.xlsx) driver available in Alteryx 9.1. If multiple columns in Excel use the same string, then they are renamed by the native driver with an underscore at the end e.g., 2015-09-01, 2015-09-01_1. By leveraging this, we can reformat the data in three steps:
- As you suggested, we start by transposing the data so that we can leverage the column headers.
- We can then write a formula with the Formula Tool that evaluates whether the column header for the date is the first or the last one, based on the header length.
- The final step is to bring the data back into the same format as before, which can be done via the Crosstab Tool.
You can review the configurations for each of these tools here. The end result would be as follows.
Hope this helps.
Regards,
Poornima

Pivot table to return the FIRST value in a range

Glad to be joining the forum.
My question deals with returning the FIRST value that occurs over several columns of data, using a pivot table that is filtered to a narrow time range. My current pivot table works by counting values in each column over the time rows. However, I'm really only interested in the FIRST value that I come across for each person. The raw data looks something like this:
Person|TimeValue|Variable1|Variable2
1 | 1 | 1 | 0
1 | 2 | 1 | 0
2 | 1 | 1 | 0
2 | 2 | 0 | 1
What I currently get for a pivot using a range of time1 to time 2 is
1 | |2 | 0
2 | |1 | 1
Clearly, the time range I select includes MULTIPLE values in the same column, leading to counts of >1. What I'm thinking is that there is a way to use the same time sorting, but count only the FIRST time a value occurs in that variable, so that the pivot reports only the first time a value occurs within the range for the variables of interest.
Is there a simple way, or am I going to have to do this in VBA?
Much appreciated for any and all help. This is my first more complicated attempt with the newer pivots.
This is probably not a problem you would want to solve with a pivot table. You could use Excel's VLOOKUP function to solve it more simply. In exact-match mode (fourth argument FALSE), VLOOKUP returns the value from the first row in the lookup range that matches the lookup value.
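The "first match wins" idea can be sketched in Python as follows. This assumes that "first" means the row with the earliest TimeValue for each person within the selected range, which is one reading of the question; `first_per_person` is a made-up name, and the rows are the sample data above.

```python
# Sample rows from the question: (person, time_value, variable1, variable2)
rows = [
    (1, 1, 1, 0),
    (1, 2, 1, 0),
    (2, 1, 1, 0),
    (2, 2, 0, 1),
]


def first_per_person(rows, start, end):
    """For each person, keep only the earliest row with start <= time <= end."""
    first = {}
    for person, time, v1, v2 in sorted(rows, key=lambda r: (r[0], r[1])):
        if start <= time <= end and person not in first:
            first[person] = (time, v1, v2)
    return first


# With the full range 1..2, each person contributes only their time-1 row.
firsts = first_per_person(rows, 1, 2)
print(firsts)
```

Counting over `firsts` instead of over all rows in the range gives at most one hit per person per variable.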

Excel Formulas: Show total based on date entry

I've got a spreadsheet with two columns that represent the number of processed records, and the date the records were processed. In some cases, the records can be processed in multiple batches, so the document looks something like this:
33 4/1/2009
22 4/1/2009
12 4/2/2009
13 4/4/2009
36 4/4/2009
I'm trying to add a new set of columns that contain a date, and shows the total number of records for that date, automagically:
4/1/2009 55
4/2/2009 12
4/3/2009 0
4/4/2009 49
I know how to do this algorithmically, and I could probably manipulate the spreadsheet outside of Excel, but I'm trying to do this in the live spreadsheet, and am a bit bewildered as to how to pull it off.
Any ideas?
Thanks!
IVR Avenger
Will the SUMIF function work for you? SUMIF([range],[criteria],[sum_range]) I think you could set range = the set of cells containing dates in your first listing, criteria would be the cell containing the date in the second listing, and sum_range would be the counts in the first column of your first listing.
I would suggest using a Pivot Table. Put the dates into the row area and 'sum of' records in the data area. Nothing in the columns area.
A pivot table will be more dynamic than a formula solution because it will only show you dates that exist.
Assuming your dates are in column B and the numbers to be accumulated are in A, you could try something like this:
  | A | B | C | D |
1 | 33 | 4/1/2009 | =MIN(B:B) | {=SUM(IF(B1:B5=C1,A1:A5,0))} |
2 | 22 | 4/1/2009 | =C1+1 | {=SUM(IF(B1:B5=C2,A1:A5,0))} |
3 | 12 | 4/2/2009 | =C2+1 | {=SUM(IF(B1:B5=C3,A1:A5,0))} |
4 | 13 | 4/4/2009 | =C3+1 | {=SUM(IF(B1:B5=C4,A1:A5,0))} |
5 | 36 | 4/4/2009 | =C4+1 | {=SUM(IF(B1:B5=C5,A1:A5,0))} |
Note the {}, which signifies an array formula (entered using Control-Shift-Enter). For any non-trivial amount of data, it's heaps faster than SUMIF().
I'd be inclined to define dynamic names for the A1:A5 and B1:B5 parts, something like
=OFFSET($A$1,0,0,COUNT($A:$A),1)
so that I didn't have to keep fixing up my formulae.
There's still a manual element: adding new rows for extra dates, for example - that might be a good place for a little VBA. Alternatively, if you can get away with showing, for example, the last 90 days' totals, then you could fix the number of rows used.
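For comparison, the same layout (start at the earliest date, step one day at a time, and sum the counts that match each date) can be sketched in Python. This only illustrates the logic behind columns C and D, not something that runs inside Excel; `totals_by_day` is a hypothetical name, and the records are the sample data from the question.

```python
from datetime import date, timedelta

# Sample records from the question: (count, processed_date)
records = [
    (33, date(2009, 4, 1)),
    (22, date(2009, 4, 1)),
    (12, date(2009, 4, 2)),
    (13, date(2009, 4, 4)),
    (36, date(2009, 4, 4)),
]


def totals_by_day(records):
    """Sum counts per date and fill in missing days with 0."""
    if not records:
        return []
    totals = {}
    for count, day in records:
        totals[day] = totals.get(day, 0) + count
    first, last = min(totals), max(totals)
    span = (last - first).days
    return [
        (first + timedelta(days=n), totals.get(first + timedelta(days=n), 0))
        for n in range(span + 1)
    ]


daily = totals_by_day(records)
for day, total in daily:
    print(day.isoformat(), total)
```

Because the date list is generated from the earliest to the latest date, days with no records (4/3/2009 here) show up with a total of 0, as in the desired output.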
