I have around 50,000 random entries in a worksheet, categorized by Zip Code, and I need to group them by Zone. I have a reference list that shows which Zip corresponds to which Zone. How can I add a column to the worksheet with the Zone that corresponds to each Zip, without manually looking it up and typing it in?
This is what the reference list looks like:
Zone Zip
1 03227
1 03254
1 03269
...
2 05687
2 05691
etc
Here's an example using VLOOKUP.
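A VLOOKUP of this kind is an exact-match lookup keyed on Zip. As a rough sketch of the same idea outside Excel (Python is used here purely for illustration; the `add_zone_column` helper and the sample `entries` are hypothetical, only the Zip/Zone pairs come from the reference list in the question):

```python
# Reference list from the question, keyed by Zip. Zips are kept as strings
# so that leading zeros (e.g. "03227") are not lost.
ZIP_TO_ZONE = {
    "03227": 1,
    "03254": 1,
    "03269": 1,
    "05687": 2,
    "05691": 2,
}


def add_zone_column(rows, zip_to_zone=ZIP_TO_ZONE):
    """Return copies of the rows with a 'zone' entry added.

    A Zip that is missing from the reference list gets zone None,
    roughly what an exact-match VLOOKUP reports as #N/A.
    """
    return [{**row, "zone": zip_to_zone.get(row["zip"])} for row in rows]


# Hypothetical worksheet rows; only the 'zip' field matters here.
entries = [
    {"id": 1, "zip": "03254"},
    {"id": 2, "zip": "05691"},
    {"id": 3, "zip": "99999"},
]
with_zones = add_zone_column(entries)
```

Whichever tool is used, storing the Zip as text matters: if "03227" is read as the number 3227, the exact match fails.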
I need to create an SSIS package that extracts data from an Excel source and loads it into a SQL Server destination.
The Excel file name would have a date, typically the file name would look like emp_20110909.xls where 11 is the Month, 09 is the Day and 09 is the Year. I want to capture this date, add another column named "Extracted_Date" to the destination table, and populate it with the captured date for all the records extracted from this Excel file.
Can anyone tell me how to do that process?
Excel as a data source offers no explicit functionality for this whereas the Flat File Source does. I blogged about this under What is the name of a file
What you're looking to do is have a Foreach File Enumerator look in a folder for your Excel file(s). Assign the value of the currently found file to a variable like @[User::CurrentFileName]. That would look something like C:\ssisdata\mySource\Input\emp_110909.xls
You would update the Excel Connection Manager to have an expression on the ExcelFilePath property so that, as the value of @[User::CurrentFileName] changes, so does the actual referenced file. You can find plenty of references to using the foreach enumerator on the web, or search my answers.
The last bit you need is to parse the value of CurrentFileName to find the year (11), month (09) and day (09) elements - or maybe you want it as one big value (110909). For this, I would create 4 variables: FileDate, FileYear, FileMonth, FileDay all as string. Yes, they're numbers but for our usage, treating them as string is going to be easier.
FileDate will correspond to everything between the underscore following emp up until the period of xls. We're going to use the Expression language of SSIS to do this and the particular elements will be SUBSTRING, FINDSTRING and LEN:
SUBSTRING(@[User::CurrentFileName], FINDSTRING(@[User::CurrentFileName], "emp_", 1) + LEN("emp_"), 6)
Here, I was lazy and just "knew" the length was 6 and hardcoded it as such. If someone gives us an emp_20110909.xls, this will return the wrong value (201109). To handle that, the preceding expression would be modified to find the position of the period and calculate the length from the end of the emp_ prefix up to it.
Now that we know FileDate, we can use SUBSTRING to slice out the first 2 elements for year, next 2 for month and final two for day.
You can then inject those values into your Data Flow via a Derived Column transformation or push them into an audit table via an Execute SQL Task.
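The parsing logic is easier to check outside the SSIS designer. The following is only a sketch in Python, not SSIS expression syntax; the `parse_file_date` name is hypothetical, and it assumes the date is laid out year, then month, then day, with the last four characters always being month and day:

```python
def parse_file_date(path, prefix="emp_"):
    """Return (file_date, year, month, day) as strings.

    Mirrors the SUBSTRING/FINDSTRING idea: take everything between the
    prefix and the final period instead of hardcoding a length of 6.
    """
    start = path.find(prefix)
    end = path.rfind(".")
    if start == -1 or end == -1 or end < start:
        raise ValueError(f"unexpected file name: {path!r}")
    file_date = path[start + len(prefix):end]
    # Assumption: the last two characters are the day, the two before
    # that are the month, and whatever precedes them is the year.
    year, month, day = file_date[:-4], file_date[-4:-2], file_date[-2:]
    return file_date, year, month, day


short_form = parse_file_date(r"C:\ssisdata\mySource\Input\emp_110909.xls")
long_form = parse_file_date(r"C:\ssisdata\mySource\Input\emp_20110909.xls")
```

Slicing from the right-hand end is what lets the same code cope with both a two-digit and a four-digit year.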
I need to write a complex formula in excel (or if someone has a suggestion as to another program to use I'm open to it!) with multiple conditions based on where the item is stored.
Each item has a minimum and maximum par level calculated, but can be stored in multiple locations. The percentage of that par is calculated based on where that item is stored (See last image below). For example:
Item A is stored in Central location 1, Central location 2, and 2 External (aka non-central) locations. There is a total quantity of 100 Item A's.
Based on our scenarios, we would find that:
Central Location 1: 70%
Central Location 2: 20%
External Location 1: 10% / # of external locations
External Location 2: 10% / # of external locations
So our par level for that item in each location would be:
Central Location 1: 70 of Item A
Central Location 2: 20 of Item A
External Location 1: 5 of Item A
External Location 2: 5 of Item A
The left side are the storage locations for each item ID #. I need to distribute the total Min and Max to each location depending on the scenarios below
I could go through and do this manually for each item (where it is stored, what the scenario is, calculate), but there are 1,500 items all stored in various places. Is there any formula I could write to work out where the item is and how much of the item would go to each area it's stored in?
I've tried using various IF and matching functions but feel like I don't have any clue where to start.
Any help would be great!
Different Scenarios of where an item can be stored. Depending on storage locations, each location will get a different percentage of the total (to the right in the image above)
OPTION 1
Build out your reference table as shown above, using the following formulas in M2 and N2 respectively, then copy them down for as many items as you have:
=SUMPRODUCT(($C$2:$C$11=$J2)*(LEFT($A$2:$A$11)="c"))
=SUMPRODUCT(($C$2:$C$11=$J2)*(LEFT($A$2:$A$11)="e"))
After that, use the following formula in cell D2 (the Min column of the data table) and copy it down:
=IF($A2="external",IF(INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),4)=1,0.2,0.1)*INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),2)/INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),5),IF(--RIGHT($A2)=1,IF(INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),4)=1,IF(INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),5)=0,1,0.8),0.7)*INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),2),IF(INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),5)=0,0.3,IF(INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),4)=1,0.8,0.2))*INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),2)))
To get your max values, repeat the same formula but change
*INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),2)
to
*INDEX($J$2:$N$5,MATCH($C2,$J$2:$J$5,0),3)
The change tells it to grab the value from the 3rd column instead of the 2nd. This multiplier appears in more than one place in the formula, so change each occurrence.
OPTION 2
Build out your table on the right to look like the following using the formula beneath the picture.
Build out columns M and N as OPTION 1
Build out columns O:Q using the following:
=IF(AND(M3=0,N3=0),0,IF(M3=2,0.7,IF(N3=0,1,0.8))*K3)
=IF(AND(M3=0,N3=0),0,IF(M3=2,IF(N3=0,0.3,0.2),0.8)*K3)
=IF(AND(M3=0,N3=0),0,IF(M3=1,0.2,0.1)*K3/N3)
Repeat these formulas in R:T, changing K3 to L3.
Then in columns D and E use the respective formulas:
=INDEX($O$3:$Q$6,MATCH($C2,$J$3:$J$6,0),MATCH($A2,$O$2:$Q$2,0))
=INDEX($R$3:$T$6,MATCH($C2,$J$3:$J$6,0),MATCH($A2,$R$2:$T$2,0))
Copy the formulas down as required.
OPTION 2, while more spread out, is probably easier to read and thus maintain.
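To sanity-check the percentages before filling 1,500 rows, the three OPTION 2 column formulas can be written out as plain conditionals. This is only a sketch in Python; the `split_par` name is hypothetical and the percentages are the ones that appear in the formulas above, not an independent reading of the scenario table:

```python
def split_par(total, n_central, n_external):
    """Return the share for (Central 1, Central 2, each External location).

    n_central and n_external play the role of columns M and N, and total
    plays the role of K (min) or L (max). As in the sheet, all three
    values are computed even if an item does not use every location.
    """
    if n_central == 0 and n_external == 0:
        return (0.0, 0.0, 0.0)
    central1_pct = 70 if n_central == 2 else (100 if n_external == 0 else 80)
    central2_pct = (30 if n_external == 0 else 20) if n_central == 2 else 80
    external_pct = 20 if n_central == 1 else 10
    central1 = total * central1_pct / 100
    central2 = total * central2_pct / 100
    # The sheet formula divides by the external count directly; the guard
    # here is an addition so that zero externals does not raise an error.
    each_external = total * external_pct / 100 / n_external if n_external else 0.0
    return (central1, central2, each_external)


# Item A from the question: 100 units, 2 central and 2 external locations.
item_a = split_par(100, 2, 2)
```

For Item A this reproduces the 70 / 20 / 5 / 5 split given in the question, which is a quick way to confirm the lookup table is set up as intended.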
The issue is sorting an array that is generated automatically from a data source using a formula that extracts unique data points. (Data points are date/time.)
The data is being extracted with this formula:
=INDEX(Table_ExternalData_1[SampleDateTime],MATCH(0,INDEX(COUNTIF($G$2:G2,Table_ExternalData_1[SampleDateTime]),0,0),0))
Once extracted, the data is not sorted right away. The current data is extracted from a database via an SQL string that pulls in data corresponding to the date and time that the data point was created.
Because of this, the extracted points are not in the correct order. I am attempting to sort the extracted data points from earliest to latest to continue with the data sorting, but need the date/times to be sorted in a separate row.
I have attempted to use a pivot table, but it isn't exactly what I need and ends up being a messier end product than I need.
All assistance is appreciated.
Example is below.
1
2
3
5
1
2
3
4
6
5
3
I need this.
1
2
3
4
5
6
I did end up finding a solution that I will be able to modify. Using a single row of a pivot table, I took just the date/time column and had the PivotTable function sort the data to be utilized as necessary.
Thank you.
The fact that the range in the example you give:
1) Consists of entries of a numeric datatype only
2) Does not contain any blanks
means that the solution is relatively simple.
Assuming that data is in A1:A11, first use a single cell somewhere within the worksheet to count the number of expected returns. For example, using B1 for this purpose, enter this formula in that cell:
=SUM(IF(FREQUENCY(A1:A11,A1:A11),1))
Your main formula is then:
=IF(ROWS($1:1)>B$1,"",SMALL(IF(FREQUENCY(A$1:A$11,A$1:A$11),A$1:A$11),ROWS($1:1)))
the latter being copied down until you start to get blanks for the results.
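For comparison, the result these two formulas build up (distinct values, smallest first) can be sketched in a few lines of Python. The `sorted_unique` name is hypothetical, and the sample values are the ones from the question:

```python
def sorted_unique(values):
    """Return the distinct values in ascending order."""
    return sorted(set(values))


# Example column from the question.
sample = [1, 2, 3, 5, 1, 2, 3, 4, 6, 5, 3]
result = sorted_unique(sample)
```

The same call works for date/time values, provided they are all of one comparable type.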
Regards
I am working on a distribution problem, analysing the volumes delivered to a set of stores (75 stores).
I have an Excel file as follows:
As you can see, each day does not contain the same stores, given that each store does not receive a delivery every day.
I want to get a new table that has the code of the store in the columns, and the information about volume and miles in the rows. Furthermore, I want to sum the values of the volumes given that they belong to the same store. In this example this would look like this:
As you can imagine, my spreadsheet is way bigger, having a total of 6500 rows and 800 columns. I was thinking about using the function combination of INDEX/MATCH, but I cannot see how to make it sum the multiple values for a given store in a given date.
You will need to extend the ranges to fit your data, but you could use:
=SUMIF(INDEX($C:$F,MATCH($J2,$A:$A,0),),L$1,INDEX($C:$F,MATCH($J2,$A:$A,0)+MOD(ROW(),2)+1,))
if the table is built up like this:
From L2 you can simply drag down and to the right as needed ;)
EDIT
To also get the stores:
L1: {=MIN(IF(MOD(ROW($C$1:$F$6),3)=1,$C$1:$F$6))}
This is an array formula: enter it without the {} and confirm it with Ctrl+Shift+Enter!
M1: =SUMPRODUCT(SMALL(IF(MOD(ROW($C$1:$F$6),3)=1,$C$1:$F$6),SUM((IF(MOD(ROW($C$1:$F$6),3)=1,$C$1:$F$6)<=L$1)*1)+1))
from M1 you can simply copy to the right.
And to get the dates (for example, if they are not continuous):
J2: =MIN(A:A)
J3: =J2
J4: =SMALL(A:A,COUNTIF(A:A,"<="&J3)+1)
J5: =J4
then simply copy J4:J5 down :)
Don't put the stores in the columns. Use VBA or similar to read the input files and normalize the data so that the output is a table looking like
Store | Date | Volume | Miles
101 | 10/06/2016 | 520 | 120
102 | 11/06/2016 | 500 | 100
Then you can always look up a store and date or pivot the data later.
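Once the data is in that long format, summing volumes per store and date is a simple grouping step. A minimal sketch in Python (the answer suggests VBA or similar; Python is used here only for illustration, `total_volume` is a hypothetical name, and the third record is made up to show two deliveries being added together):

```python
from collections import defaultdict

# Normalized rows: (store, date, volume, miles).
records = [
    (101, "10/06/2016", 520, 120),
    (102, "11/06/2016", 500, 100),
    (101, "10/06/2016", 80, 120),  # hypothetical second delivery, same store and day
]


def total_volume(rows):
    """Sum the volume for each (store, date) pair."""
    totals = defaultdict(int)
    for store, date, volume, _miles in rows:
        totals[(store, date)] += volume
    return dict(totals)


volumes = total_volume(records)
```

Because every row carries its own store and date, it no longer matters that different days list different stores.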
In Excel I have an old list of downloads by country like this:
Country | Downloads
USA | 20
Canada | 50
etc....
Now I have a new list of downloads (since the previous list was created) in the same format. I need to combine the two and add the values of the old list with the new. There are also downloads from countries that didn't download before and they need to be added at the bottom.
This is one of the rare places where Data->Consolidate is useful. It does exactly what you want. Add the two ranges as references, and it will output the combined table using whatever function you want, here SUM.
Starting point
Result
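What Consolidate does with SUM is add values that share a label and append labels that appear in only one list. As a sketch of that behaviour in Python (the `consolidate` name and the figures in `new_list` are hypothetical; only the USA and Canada rows of the old list come from the question):

```python
from collections import Counter

old_list = {"USA": 20, "Canada": 50}
new_list = {"USA": 5, "Mexico": 7}  # hypothetical new downloads


def consolidate(*tables):
    """Add up downloads per country across any number of tables."""
    combined = Counter()
    for table in tables:
        combined.update(table)  # adds counts for existing keys, creates new ones
    return dict(combined)


merged = consolidate(old_list, new_list)
```

Countries that only appear in the new list end up after the existing ones, which matches the "added at the bottom" requirement.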