Calculation and search in ranges of codes - excel

I have a question about a calculation.
I used to use the function below to calculate the available hours of working colleagues:
Function BerekenUrenPercentage(OptelGebied As Range, TriggerWoord As String, PercentageGebied As Range) As Double
    Dim Cel As Range, Uren As Double, Waarde As String, Teller As Double, Factor As Integer
    For Each Cel In OptelGebied.Cells
        Teller = Teller + 1
        If InStr(Cel.Formula, TriggerWoord) > 0 Then
            Waarde = Cel.Formula
            Waarde = Replace(Waarde, TriggerWoord, "")
            Factor = PercentageGebied.Formula(Teller, 1)
            Uren = Uren + ((Int(Waarde) * Factor) / 100)
        End If
    Next Cel
    BerekenUrenPercentage = Uren
End Function
But I need more flexibility and efficiency in my calculations, so I reworked the whole thing, but I can't figure out the formula I need to use.
It should search a whole week of attendance (7 columns) for the desired code (see below); if the code is found, it strips the code from the cell so that only the hours remain.
It then multiplies those hours by the employability percentage (looked up for the correct week and person, by Pers ID) and continues searching.
I would also like (later on) to enter multiple codes in one cell, like DAV4/VK4 or DAV4/DEV4, and still find both codes correctly.
We have an attendance schedule at my work:
Attendance table, where we note whether someone is working, in which discipline, and for how long.
Like: DAA8 -> DAA is a code and 8 is 8 hours
Examples of codes: DVP8, DVA, DVV, DPP, DPA, DPV, DAP, DAA, DAV, DEV, DLOG, VK8, ZK8, BK8, BL6, etc.
VK = Vacation, ZK = Sick
There is also a table for employability in percent, where all the workers are listed with their percentage of employability. We use this to calculate how many hours a person contributes towards our goal.
And the hours-available table: this is where I will use the function I am searching for.
Can anyone help me?
Example: the first cell of the attendance table is one of the cells I want to count.
I want the lookup function in the cell that has the value 120 in the picture (in the row with the code DVP, which is to the left of that cell) to search for the code, extract the numeric value, and multiply it by the percentage for the corresponding week in the Employability table. It should do that for every occurrence of the code in the week that the cell belongs to (shown at the top of the tables).
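The code-stripping step described above could be sketched in VBA roughly as follows. This is only a sketch under assumptions: the names HoursForCode and WeekHours are hypothetical, a cell is assumed to hold entries separated by "/" (as in DAV4/VK4), each entry is assumed to be a code followed directly by a number, and one percentage is assumed to apply to the whole week of one person.

```vba
' Hypothetical helper: returns the hours booked on one code in a cell
' such as "DAA8" or "DAV4/VK4". Entries without a number add nothing.
Function HoursForCode(ByVal cellText As String, ByVal code As String) As Double
    Dim parts() As String
    Dim part As String
    Dim rest As String
    Dim total As Double
    Dim i As Long

    parts = Split(cellText, "/")
    For i = LBound(parts) To UBound(parts)
        part = Trim$(parts(i))
        ' The entry must start with the code (ignoring case) ...
        If StrComp(Left$(part, Len(code)), code, vbTextCompare) = 0 Then
            rest = Mid$(part, Len(code) + 1)
            ' ... and the remainder must be a number, so that the code DV
            ' does not match the entry DVP8.
            ' Note: CDbl follows the regional decimal separator.
            If IsNumeric(rest) Then total = total + CDbl(rest)
        End If
    Next i
    HoursForCode = total
End Function

' Hypothetical wrapper: sums one code over a week (for example 7 cells)
' and applies an employability percentage such as 80.
Function WeekHours(weekCells As Range, ByVal code As String, ByVal percentage As Double) As Double
    Dim cel As Range
    Dim total As Double

    For Each cel In weekCells.Cells
        If Not IsError(cel.Value2) Then
            total = total + HoursForCode(CStr(cel.Value2), code)
        End If
    Next cel
    WeekHours = total * percentage / 100
End Function
```

With these in a standard module, HoursForCode("DAV4/VK4", "VK") returns 4, and a worksheet call could look like =WeekHours(B2:H2, "DVP", 80), where the cell addresses are placeholders. Looking up the percentage by week and Pers ID is left out on purpose, since the layout of that table is not shown.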

Related

VBA function for Upside/Downside Capture

apologies for my ignorance, I'm brand new to VBA - I'm sure this is a simple problem...
I'm trying to write a fn. for up/down side capture in VBA. This is the problem:
There are two columns. One has fund performance in % (I've labelled 'returns'). The other has index performance in % (labelled 'index'). Both are same length / same number of rows. I need both to be variables to enter to the fn.
For UpsideCapture fn., for all nos. in the index column >0, I want to find the corresponding number in the returns column (which will be on the same row). Once I have those numbers I can compound them.
I've tried using Offset, assuming the returns column is 15 columns to the left of the index column, but it doesn't return anything, and I don't really want to rely on it always being 15 columns apart (it's arbitrary).
Many thanks!
One of my rubbish attempts is below. Any help is much appreciated. It's really just a case of finding the correct corresponding row based on the value in the index column...
Function UpsideCapture(returns As Range, index As Range) As Variant
    Dim n As Integer
    Dim m As Integer
    Dim i As Integer
    n = returns.Rows.Count
    m = index.Rows.Count
    For i = 1 To m
        If index(i) > 0 Then
            Upsidecap = ((1 + Upsidecap) * (1 + Offset(returns(i), -15))) - 1
        End If
    Next
    UpsideCapture = Upsidecap
End Function
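Since both ranges are passed in and have the same number of rows, the matching value can be read by position instead of by Offset. A minimal sketch, assuming both arguments are single columns and the cells hold fractions (a cell formatted as 5% stores 0.05); the name UpsideCompound is hypothetical, and it only compounds the fund returns on rows where the index is positive:

```vba
Function UpsideCompound(returns As Range, index As Range) As Variant
    Dim i As Long
    Dim total As Double

    ' The two ranges must line up row for row.
    If returns.Cells.Count <> index.Cells.Count Then
        UpsideCompound = CVErr(xlErrRef)
        Exit Function
    End If

    For i = 1 To index.Cells.Count
        ' Cells(i) is the i-th cell of each range, so no Offset is needed.
        If index.Cells(i).Value2 > 0 Then
            total = (1 + total) * (1 + returns.Cells(i).Value2) - 1
        End If
    Next i

    UpsideCompound = total
End Function
```

Because the row is matched by position within each range, the two columns can sit any distance apart on the sheet.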

Sum of a specific range that changes on each iteration of a loop

I have a sheet where the values of a range change each time I change a specific cell. Let's say that cell C8 is the identity of a person and column H holds the scheduled monthly repayments. I need the aggregate monthly repayments: for each row i in column H, the sum of Hi over every possible value of C8 (that is, over every person). In other words, keeping row i constant and changing C8, I always need to sum Hi, so if C8 takes values from 1 to 200, I need sum(Hi(C8)) for each row i. Hi(C8) is just notation to show that Hi depends on the value of C8. The actual formula in cell H10 is INDEX('Sheet2'!R:R,MATCH('Sheet1'!$C$8,'Sheet2'!F:F,0)). H11 and onwards have the same formula with slight twists, because the repayments are not always equal, but the INDEX function remains the same.
Then the total of H10 over all possible values of C8 is pasted in C17, the total of H11 is pasted in C18, etc. Please find some images below; maybe they help to show what I am trying to achieve.
I have the following code for that purpose. Note that the example above was just to explain the background; the actual cells and the range that changes are different.
Sub sumloop()
    Application.ScreenUpdating = False
    Application.DisplayStatusBar = False
    Sheets("Sheet1").Range("C8").Value = 1
    Dim i, k As Integer
    i = 1
    k = Sheets("Sheet1").Range("C9").Value
    Dim LR As Long
    LR = Sheets("Sheet1").Range("C" & _
        Sheets("Sheet1").Rows.Count).End(xlUp).Row
    Sheets("Sheet1").Range("C17:C" & LR).ClearContents
    Do While i <= k
        If (Sheets("Sheet1").Range("J9").Value = "") Then
            Sheets("Sheet1").Range("h10:h200").Copy
            Sheets("Sheet1").Range("c17").PasteSpecial _
                Paste:=xlValues, Operation:=xlAdd, SkipBlanks:= _
                False, Transpose:=False
        Else
            Sheets("Sheet1").Range("h9:h200").Copy
            Sheets("Sheet1").Range("c17").PasteSpecial _
                Paste:=xlValues, Operation:=xlAdd, SkipBlanks:= _
                False, Transpose:=False
        End If
        Sheets("Sheet1").Range("C8").Value = Sheets("Sheet1").Range("C8").Value + 1
        i = i + 1
    Loop
    Sheets("Sheet1").Range("C8").Value = 1
    Application.ScreenUpdating = True
    Application.DisplayStatusBar = True
End Sub
The If inside the loop is needed because the location of the first value of the range depends on some criteria that have nothing to do with the code. Also, k denotes the maximum number of possible values; what I need is approximately 250.
While the code works, it takes approximately 40 seconds to run for 84 values of cell C8 and approximately 1.5 minutes for 250. I tried a few things: I changed Do While to For, and I used variable ranges instead of fixed ones like h10:h100, much as I do with Sheet1.Range("C17:C" & LR). None of this made a significant difference. As I am very new to VBA, I don't know whether 1.5 minutes is a lot for such simple code, but it seems a lot to me, and this analysis is needed for 10 different combinations of 250 different values of cell C8, which means approximately 15 minutes.
I would appreciate it if anyone could suggest something faster.
Thank you very much in advance.
Here is a complete solution, with explanations in the comments.
Because we do not have your source spreadsheet, I could not run any tests on this.
Option Explicit 'This forces you to declare all your variables correctly. It may seem annoying at first glance, but it will quickly save you time in the future.
Sub sumloop()
    Application.ScreenUpdating = False
    'Application.DisplayStatusBar = False -> The status bar does not noticeably slow down your code as long as you do not refresh its value more than, say, 5-10 times per second.
    'Save the existing Calculation Mode to restore it at the end of the Macro
    Dim xlPreviousCalcMode As XlCalculation
    xlPreviousCalcMode = Application.Calculation
    Application.Calculation = xlCalculationManual
    'Conveniently store the Sheet into a variable (object variables need Set). You might want to do the same with your cells, for example: Set MyCellWhichCounts = MySheet.Range("c17")
    Dim MySheet As Worksheet
    Set MySheet = ActiveWorkbook.Sheets("Sheet1")
    MySheet.Range("C8").Value2 = 1 'It is recommended to use .Value2 instead of .Value (notably in case your data type is Currency, but it is good practice to use that one all the time)
    Dim LR As Long
    LR = MySheet.Range("C" & MySheet.Rows.Count).End(xlUp).Row 'Be careful with "MySheet.Rows.Count", it may go beyond your data range, for example if you modify the formatting of a cell below your "last" row.
    MySheet.Range("C17:C" & LR).Value2 = vbNullString 'It is recommended to use vbNullString instead of ""; although I agree it makes it more difficult to read.
    Dim i As Long, k As Long 'Long avoids the 32,767 limit of Integer
    k = MySheet.Range("C9").Value2
    For i = 1 To k 'Use a For whenever you can, it is easier to maintain (i.e. avoid errors and also for you to remember when you go back to it years later)
        'Little extra so you can track progress of your calcs
        Dim z As Long
        z = 10 'This can have any value > 0. If the value is low, you will refresh your app often but it will slow down. If the value is high, it won't affect performance but your app might freeze and/or you will not have your Statusbar updated as often as you might like. As a rule of thumb, I aim to refresh around 5 times per second, which is enough for the end user not to notice anything.
        If i Mod z = 0 Then 'Each time i is a multiple of z
            Application.StatusBar = "Calculating i = " & i & " of " & k 'We refresh the Statusbar
            DoEvents 'We prevent the Excel App from freezing and throwing messages like: The application is not responding.
        End If
        'Set the range
        Dim MyResultRange As Range
        If (MySheet.Range("J9").Value2 = vbNullString) Then
            Set MyResultRange = MySheet.Range("h10:h200")
        Else
            Set MyResultRange = MySheet.Range("h9:h200")
        End If
        '# Extract Result Data
        MyResultRange.Calculate 'Refresh the Range values (if they depend on other formula cells, recalculate the whole sheet with MySheet.Calculate instead)
        Dim MyResultData As Variant
        MyResultData = MyResultRange.Value2 'Store the values in VBA all at once
        '# Extract Original Data
        Dim MyOriginalRange As Range
        Set MyOriginalRange = MySheet.Range("c17").Resize(MyResultRange.Rows.Count, MyResultRange.Columns.Count) 'This produces a Range of the same size as MyResultRange
        Dim MyOriginalData As Variant
        MyOriginalData = MyOriginalRange.Value2
        '# Sum Both Data Arrays
        Dim MySumData() As Variant
        ReDim MySumData(LBound(MyResultData, 1) To UBound(MyResultData, 1), LBound(MyResultData, 2) To UBound(MyResultData, 2))
        Dim j As Long
        For j = LBound(MySumData, 1) To UBound(MySumData, 1)
            MySumData(j, 1) = MyResultData(j, 1) + MyOriginalData(j, 1)
        Next j
        'Instead of the "For j = a To b" loop you could try a worksheet-function approach such as MMult, but it is likely slower and is untested here.
        '# Write the running totals back to the sheet all at once
        MyOriginalRange.Value2 = MySumData
        MySheet.Range("C8").Value2 = MySheet.Range("C8").Value2 + 1
    Next i
    MySheet.Range("C8").Value2 = 1
    Application.ScreenUpdating = True
    Application.StatusBar = False 'Give back the status bar control to the Excel App
    Application.Calculation = xlPreviousCalcMode 'Do not forget to restore the Calculation Mode to its previous state
End Sub
Added by OP (see comments)
Image 1: code written in the initial question.
Image 2: code above.
OK, a few things.
Firstly, Dim i, k As Integer doesn't do what you think it does: it declares only k as an Integer, while i becomes a Variant. You need to write: Dim i As Integer, k As Integer
Secondly, don't use Integer in VBA; use Long, so: Dim i As Long, k As Long
Third, the calculations are killing you. Turn them off with Application.Calculation = xlCalculationManual at the start of your code and back on with Application.Calculation = xlCalculationAutomatic at the end of your code.
Now we have really fast code, but the problem is that it doesn't update on each iteration, which you need it to do. You can calculate just a range like so: Sheets("Sheet1").Range("h10:h200").Calculate, so put that in just before you copy the range.
There will be an even faster way to do this, but I just can't seem to wrap my head around your requirements, so I am unable to assist further.
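The first point about Dim can be checked with a few lines in the Immediate window. This is a small illustrative sketch; the procedure name ShowDimPitfall is made up for the example:

```vba
Sub ShowDimPitfall()
    Dim i, k As Integer   ' only k gets the type; i is left as a Variant

    Debug.Print TypeName(i)   ' prints Empty (an unassigned Variant)
    Debug.Print TypeName(k)   ' prints Integer
End Sub
```

The code usually still runs with a Variant counter, which is why this mistake tends to go unnoticed.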
Welcome to StackOverflow.
I must admit I got a bit confused by your narrative, as I did not fully understand whether you are doing a sum(a,b,c) or a sum(sum(a,b,c), sum(d,e,f), ...).
In any case, a trick that will dramatically accelerate your script is the use of arrays.
Performing calcs with VBA is not slow, but retrieving the data from Excel (communicating with the application) IS slow, and the cost depends mostly on the number of "requests" rather than on the quantity of data requested.
You can use arrays to request the data from a range all at once, instead of requesting the value of each cell separately.
Dim Arr() As Variant
Arr = Range("A1:E999")
It is as simple as this.
Give it a try and if you are still struggling let us know.
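As a rough illustration of the "one request" idea, here is a minimal sketch that reads a whole range in a single call and then works only on the in-memory array. The function name SumViaArray is hypothetical, and the range is assumed to contain more than one cell (a single cell returns a plain value, not an array):

```vba
Function SumViaArray(target As Range) As Double
    Dim data As Variant
    Dim r As Long, c As Long
    Dim total As Double

    ' One read from Excel: a multi-cell range gives a 1-based 2-D Variant array.
    data = target.Value2

    For r = LBound(data, 1) To UBound(data, 1)
        For c = LBound(data, 2) To UBound(data, 2)
            ' Value2 returns numbers as Double; this test skips text,
            ' blanks, Booleans and error values.
            If VarType(data(r, c)) = vbDouble Then
                total = total + data(r, c)
            End If
        Next c
    Next r

    SumViaArray = total
End Function
```

Writing results back works the same way in reverse: assign a 2-D array of matching size to the Value2 property of the destination range in one statement.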
BONUS
If you are new to arrays, keep in mind you can have a two-dimensional array:
Dim Array2D(0 To 10, 0 To 50) As Variant
Or a stacked array (an array of arrays):
Dim StackedArray() As Variant 'each element can itself hold an array
You will need a 2D array for extracting the data from a range, but I feel you may need an array of 2D arrays for your sum of sums.
Some recommended reading: https://excelmacromastery.com/excel-vba-array/
How to achieve the same through pivot charts (no VBA)
Step 1
First, you must organize your data in a specific way, where each column is a field, and each row is a data entry. If you are not familiar with databases, this is the most tricky point as you may arrange your data in different ways.
Long story short, we will take an example where you have 3 customers and 4 dates.
So that is 12 data entries, which will provide the repayment value for each of the possible customer ID and date.
Step 2
Select that data and insert a PivotChart.
Note: you could insert a PivotTable alone, or a PivotChart alone. I recommend the option where you insert both, as managing your data will be more intuitive when working on the chart. The table is updated at the same time as you update the chart.
Step 3
Make sure that all your data is selected, including the top row, which dictates the name of each field (the name of each column).
Step 4
A new sheet has just been created, and you can see where both your PivotTable and PivotChart will appear. Select the chart.
Step 5
A menu to the right will appear (it might have already been there, so make sure you selected the Chart and not the Table, as that menu would be slightly different).
Step 6
Drag and drop the field names into the categories as shown.
What you are doing here is telling Excel what data you want to see (Values) and how you want to break it down (per date, and per customer).
Step 7
By default, date data is always grouped by quarter and year. To see all the dates we have data for, you can click the [+] near the date in the table: this will show more details for both the table and the chart.
Step 8
But we want to get rid of the quarters and years completely. To achieve this, right-click any value of your date column in the table and choose "Ungroup" as displayed.
Step 9
Your data now looks like this.
Note that the time axis is not to scale. For example, if you have monthly data and a month is missing, there will be no gap. This is one of the difficulties with pivot data. It can be overcome, but that is off topic here.
Step 10
Now we want a cumulative view of the data, so we want to change the way the values are processed by Excel.
Select the chart, then in the right panel: right click on the "Sum of Repayment" field, and select "Value Field Settings".
Step 11
In the "Show Values As" tab, select "Show values as" "Running Total In".
Then choose "Date".
Here we are telling Excel that the value to display should be a cumulative total, cumulated according to the "Date" field.
Press OK.
Step 12
You now have what you are looking for. If you look in the Table, you have one column per Customer ID, and one row per date. For a given Date, you have the cumulative repayment made by a given Customer ID. At the very right, you have the Grand Total, which is, for a given date, the sum of all the Customer ID values.
Step 13
The chart keeps showing the cumulative payment per Customer ID, and we cannot see the grand total.
To see it, simply remove the "Customer ID" field from the "Legend (Series)" category area in the Fields panel, as shown (you can untick the Customer ID [x] box, or you can drag and drop it from the category area to the main list area).
Step 14
Now we only have the Grand total in the chart. But why?
If you display the "Value Field Settings" of "Sum of Repayment" (Step 10), the first tab, "Summarize Values By", tells Excel what to do when several values share the same Legend and Axis values.
Now that we removed the Customer ID field from the Legend area, for each date, we have 3 repayment values (one for each Customer ID). In the field settings, we tell Excel to use a "Sum". So it returns the sum of the 3 values.
But you could play around and return the Average, or even use "Count", which will show you how many records you have (it will return 3).
That is why pivot charts are so powerful: with only a few clicks and/or drag and drop, you can display a myriad of different graphics for your data.
For future interest, you should look online for Filters and "Insert Slicer" (which is equivalent to filtering, but adds buttons directly on your chart: great when showing the data to colleagues and switching from one setting to another).
Hope this helped!

How to fill data in excel sheet where date lies between a series of date ranges given in another sheet ? Also, a particular column should match

I'm working on a social survey project. Due to discrepancies in the data, I'm stuck at a certain place. The volunteers conducting the survey were given tablets with unique IDs. On different dates, the tablets were used in different cities.
Sheet 1 contains a list of around thousands of responses for which city names are missing and Sheet 2 contains a list of tablets in use in different cities on different dates.
Sheet 1
City DeviceID StartDate EndDate
Delhi 25 21-08-2014 26-08-2014
Mumbai 39 14-05-2014 21-05-2014
Chennai 91 17-11-2014 21-11-2014
Bangalore 91 11-10-2014 21-10-2014
Delhi 91 26-05-2015 29-05-2015
Hyderabad 25 23-05-2015 28-05-2015
Sheet 2
S.Id DeviceId SurveyDate City
203 91 15-10-2014 ?
204 25 24-08-2014 ?
I need to somehow fill in the values for the City column in Sheet 2.
I tried using VLOOKUP but, being a beginner in Excel, was unable to get things working. I managed to convert the strings in the date columns to dates,
but am unsure about how to pursue this further.
From my understanding, VLOOKUP requires the date ranges to be continuous, with no missing values in between. That is not so in this case: this is real-world data and hence imperfect.
What would be the right approach to this problem? Can this be done with Excel macros?
I also read up a bit about nested IF statements, but am confused, being a beginner in Excel formulas and data manipulation.
There are two ways to do what you want.
The first one is to use VBA and create a macro to do the job, BUT you will have to iterate through all your data multiple times (n1*n2 loops in the worst case, where n1 and n2 are the numbers of rows in each table), which is really slow if you have a lot of data.
The other way is a little more complicated and uses array formulas, but it is much faster than VBA because it uses Excel's built-in functions (which are already optimized).
So I will use a much simpler example, and you can adapt it to your data.
I have the following tables:
Table1
city ID start end
A 1 3 5
B 3 4 6
C 3 5 8
Table 2
ID point city
3 5 ?
So we want a formula that completes the second table, where ID matches exactly and point is between start and end. We are going to use MATCH and INDEX to get it.
Here it is:
=INDEX(A$2:A$4;MATCH(1;(B$2:B$4=G2)*(C$2:C$4<=H2)*(D$2:D$4>=H2);0))
First of all, after you write it you should not press Enter but Ctrl+Shift+Enter instead, to tell Excel to run it as an array formula; otherwise it will not work.
Now that we have that out of the way, let me explain what is going on here:
The MATCH does the following:
It looks for the value 1 (TRUE) in the array I created, and it must be an exact match. But how is the array created? Let's take it part by part:
B$2:B$4=G2 gives {1;3;3}=3 --> {FALSE;TRUE;TRUE}
Similarly, the second condition, C$2:C$4<=H2, gives {3;4;5}<=5 --> {TRUE;TRUE;TRUE}
So now we have (keep in mind that the * acts like a logical AND, turning TRUE into 1 and FALSE into 0):
{FALSE;TRUE;TRUE}*{TRUE;TRUE;TRUE} --> {0;1;1}
and the third condition, D$2:D$4>=H2, gives {5;6;8}>=5 --> {TRUE;TRUE;TRUE}, so the combined result is still {0;1;1}
So now we have MATCH(1;{0;1;1};0) --> 2, because the second row is the first one that matches the 1 (in this example the third row satisfies the conditions as well, but MATCH returns the first match).
So now we just use INDEX to get whatever is on row 2 of another range.
You can use the above on your own data to get the expected results.
Good luck!
If the deviceId values should match and the survey date should be between the start date and end date, VLookup won't suffice. The following pointers, however, should get you started:
1) Define the date ranges from which the date comparisons should be made.
2) Use an overlap date checking function to determine if the date in question overlaps the start and end dates.
3) Loop through the date ranges and insert in Sheet2 when a match is found, i.e. when the deviceId values match and the date overlaps.
The following function takes as parameters the date to be checked, the start and end date and returns True, if dateVal overlaps the start and end date:
Function dateOverlap(dateVal As String, startDate As String, endDate As String) As Boolean
    If DateDiff("d", startDate, dateVal) >= 0 And DateDiff("d", endDate, dateVal) <= 0 Then _
        dateOverlap = True
End Function
Example usage
Debug.Print dateOverlap("05-10-2016", "01-10-2016", "10-10-2016") (returns true).
Here we use MEDIAN() as an easy way to test for "in-between".
Sub FillInTheBlanks()
    Dim s1 As Worksheet, s2 As Worksheet
    Dim N1 As Long, N2 As Long, i As Long, j As Long
    Dim rc As Long, DeId As Long, sDate As Date
    Dim wf As WorksheetFunction
    Set s1 = Sheets("Sheet1")
    Set s2 = Sheets("Sheet2")
    Set wf = Application.WorksheetFunction
    rc = Rows.Count
    N1 = s1.Cells(rc, "A").End(xlUp).Row
    N2 = s2.Cells(rc, "A").End(xlUp).Row
    For i = 2 To N2
        DeId = s2.Cells(i, "B").Value
        sDate = s2.Cells(i, "C").Value
        For j = 2 To N1
            If DeId = s1.Cells(j, 2).Value Then
                If sDate = wf.Median(sDate, s1.Cells(j, "C").Value, s1.Cells(j, "D").Value) Then
                    s2.Cells(i, "D").Value = s1.Cells(j, "A").Value
                End If
            End If
        Next j
    Next i
End Sub

Need formula for excel, to subtract the number "9" to each number individually and

I want you to have some fun. I need something specific.
First I must explain what I do. I use a simple code for product prices at my retail store, because I don't want people to work out the real prices for themselves. So I change the original numbers to other ones by subtracting each digit from 9.
Normally I manually write down all the prices in this code for every product.
So, for example, the number 10 would be 89 (9-1 = 8 and 9-0 = 9).
Other examples:
$128 = 871
$75 = 24
$236 = 763
$9 = 0
Finally, I also put two nines (99) at the beginning of the coded price, to confuse people who might think that the number could be the price.
So the examples i used before are like this:
99871 (means $128)
9924 (means $75)
99763 (means $236)
990 (means $9)
Remember that I need 2 (two) nines before the real price. The real prices never start with 0, so the nines at the beginning exist only to confuse people.
OK. So, now that you understand, here comes the second part.
I have an Excel file with hundreds of my products added, with prices, descriptions, etc. I have decided it is time to use a printer and start printing this information from Excel. I have software to do that, but first I need to have the coded prices in the Excel file as well.
The fun part begins when I want to convert the real prices that are already written in my Excel document into a new column AUTOMATICALLY. That way I don't have to type all the prices again in coded form, for the old items and for the new ones I add in the future.
Can someone help me with this? Is it even possible?
I tried =9999-A1, but it works only with two-digit numbers. If the real price is 5, I get three nines: 9994 (code). And if the price is 234, I get only one nine: 9765 (code). And it is a condition that I need to have the TWO nines at the start.
Thank you very much in advance!
Though you asked for a formula, I am suggesting a VBA program, which seems very convenient to me.
You have to open the VBE, insert a module, and copy the program in. Change the code lines wherever indicated to suit your requirements for sheets etc.
Sub NumberCode()
    Dim c As Range
    Dim LR As Long
    Dim numProbs As Long
    Dim sht As Worksheet
    Dim s As Long
    Dim v As String
    Dim v1 As String
    Set sht = Worksheets("Sheet1") ' change as per your requirement
    numProbs = 0
    LR = sht.Cells(sht.Rows.Count, "A").End(xlUp).Row
    For Each c In sht.Range("A1:A" & LR).Cells
        v = CStr(c.Value)   ' the price as text, one character per digit
        v1 = "99"           ' every code starts with two nines
        For s = 1 To Len(v)
            ' subtract each digit from 9 and append the result to the code
            v1 = v1 & (9 - CLng(Mid(v, s, 1)))
        Next
        c.Offset(0, 1).Value = v1   ' write the code next to the price
        numProbs = numProbs + 1
    Next
    MsgBox "Number coding finished"
End Sub
Sample sheet of results is appended below.
I will be using helper cells but you could dump it all into one cell if you want since you are only dealing with 4 characters.
For the purpose of this example, I am assuming your original price list starts in B11.
=IFERROR(9-MID($B11,COLUMN(A1),1),"")
Place that in D11 and copy to the right three more times so you have it from D11 to G11. That formula strips off 1 character from your price and subtracts that character from 9. When you go the next column it repeats itself. If you do not have that many characters, it will return "".
In C11 you will build your number based on the adjacent 4 columns using this formula:
="99"&D11&E11&F11&G11
It places 99 in front then adds the numbers from the adjacent 4 columns.
Select cells C11 to G11 and copy and paste downward beside your data column as far as you need to go.
An alternate more concise method would be:
=REPT(9,LEN(B11)+2)-B11
Perhaps I'm missing something, though simply:
=REPT(9,2+LEN(A1))-A1
seems good to me.
Regards
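The REPT formulas above work because subtracting a number from a run of nines subtracts every digit from 9 without any borrowing, and the two extra nines stay untouched in front. The same rule can be sketched as a pair of VBA functions usable from a cell. The names EncodePrice and DecodePrice are hypothetical, and prices are assumed to be positive whole numbers:

```vba
' Hypothetical helper: 128 -> "99871", 9 -> "990".
Function EncodePrice(ByVal price As Long) As String
    Dim digits As String
    Dim result As String
    Dim i As Long

    digits = CStr(price)
    result = "99"   ' the two decoy nines
    For i = 1 To Len(digits)
        result = result & CStr(9 - CLng(Mid$(digits, i, 1)))
    Next i
    EncodePrice = result
End Function

' Hypothetical helper: reverses the code, "99871" -> 128.
Function DecodePrice(ByVal code As String) As Long
    Dim digits As String
    Dim i As Long

    ' Skip the two leading nines, then undo the per-digit subtraction.
    For i = 3 To Len(code)
        digits = digits & CStr(9 - CLng(Mid$(code, i, 1)))
    Next i
    DecodePrice = CLng(digits)
End Function
```

In a worksheet this would be used as =EncodePrice(A1). The result is text, which keeps it from being reformatted as a number.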

How can I lookup data from one column, when the value I'm referencing changes columns?

I want to do an INDEX-MATCH-like lookup between two documents, except my MATCH's index array doesn't stay in one column.
In Vague-English: I want a value from a known column that matches another value that may be found in any column.
Refer to the image below. Let's call everything to the left of the bold vertical line on column H doc1, and the right side will be doc2.
Doc2 has a column "Find This", which holds the values to look up. It is compared with "ID1" from doc1 (note that the values in "Find This" will not be in the same order as column ID1, but it's easier to understand this way).
The "[Result]" column in doc2 will be the value from doc1's "Want This" column from the row that matches "FIND THIS" ...However, sometimes the value from "FIND THIS" is not in the "ID1" column, and is instead in "ID2","ID3", etc.
So, I'm trying to generate Col K from Col J. This would be like pressing Ctrl+F and searching for a value in Col J, then taking the value from Col D in that row and copying it to Col K.
I made identical values from a column the same color in the other doc to make it easier to visualize where they are coming from.
Note also that in column F of doc1, the same value from doc2's "Find This" can be found after some other text.
Also note that the column headers are only there as examples, the ID columns are not actually numbered.
I would simply hard-code the correct column to search, but I'm not in control of doc1, and I'm worried that future versions may have new "ID" columns, with others being removed.
I'd prefer this to be a solution in the form of a formula, but VB will do.
To generate column K based on the given values of column J, you could use the following:
=INDEX(doc1!$D$2:$D$14,SUMPRODUCT((doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14))-1)
Copy that formula down as far as you need to go.
It basically returns the row where the matching column J value is found. We then look up that row in the INDEX over your D range to get your value in K.
Proof of concept:
UPDATE:
If you are working with non-unique entries in column J, that is, the value on its own can be found in multiple rows and columns, consider using the following to return the last row where the J value is found:
=INDEX(doc1!$D$2:$D$14,AGGREGATE(14,6,(doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14),1)-1)
UPDATE 2:
And to return the first row where what you are looking in column J is found use:
=INDEX($D$2:$D$14,AGGREGATE(15,6,1/($B$2:$H$14=J2)*ROW($B$2:$H$14)-1,1))
Thanks to Scott Craner for the hint on the minimum formula.
To determine whether the data from column J is UNIQUE in your range B2:H14, you can enter this array formula. To enter an array formula, you need to press CTRL+SHIFT+ENTER at the same time and not just ENTER. You will know you have done it right when you see {} around your formula in the formula bar. You cannot add the {} manually.
=IF(MAX(COUNTIF($B$2:$H$14,J2:J14))>1,"DUPLICATES","UNIQUE")
UPDATE 3
AGGREGATE - A relatively new function to me, but it goes back to Excel 2010. AGGREGATE is 19 functions rolled into 1. It would be nice if they all worked the same way, but they do not. I think it is the functions numbered 14 and up that perform the same way as an array formula, or a CSE formula if you prefer. The nice thing is you do not need to use CSE when entering or editing them. SUMPRODUCT is another example of a regular formula that performs array formula calculations.
The meat of this explanation, I believe, is what is happening inside the AGGREGATE brackets. If you click on the link you will get an idea of what the first two arguments are. The first defines which function you are using, and the second tells AGGREGATE how to deal with errors, hidden rows, and some other nested functions. That is the relatively easy part. What I believe you want to know is what is happening with this:
(doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14)
For illustrative purposes, let's reduce this formula to something a little smaller in scale that does the same thing. I'll avoid starting in A1, as that makes life a little too easy when counting, since it is the first row and first column. By placing the example range outside of it, you can see some more special considerations.
What I want to know is in which row of column B each of the items listed in column C occurs.
  | B        | C
3 | DOG      | PLATYPUS
4 | CAT      | DOG
5 | PLATYPUS |
The full formula for our mini example would be:
=INDEX($B$3:$B$5,AGGREGATE(14,6,($B$3:$B$5=C3)*ROW($B$3:$B$5),1)-2)
And we are going to look at the following part as an array:
{=($B$3:$B$5=C3)*ROW($B$3:$B$5)}
So the first bracket is going to be a Boolean array, as you noted. Every cell that is TRUE will stay TRUE until it is forced into a math calculation. When that happens, TRUE becomes 1 and FALSE becomes 0. If that formula was entered as a CSE formula and placed in D3, it would break down as follows:
FALSE X 3
FALSE X 4
TRUE X 5
The 3, 4 and 5 come from ROW() returning the row number that it is dealing with at the time of the array math operation. Little trick: we could have had ROW(1:3). You just need to make sure the size of the array matches! This is not matrix math; it is just straight-across multiplication. And since the Boolean is now experiencing a math operation, we are now looking at:
0 X 3 = 0
0 X 4 = 0
1 X 5 = 5
So the array {0;0;5} gets handed back to AGGREGATE for more processing. The important thing to note here is that it contains ONLY 0 and the individual row numbers where we had a match. In the first AGGREGATE formula, function 14 was chosen, which is the LARGE function. We also told it to ignore errors, which in this particular case does not matter. After the array, there is a ,1) to finish off the AGGREGATE function. The 1 tells AGGREGATE that we want the 1st largest number when the array is sorted from largest to smallest. If that number was 2, it would be the 2nd largest number, and so on. So the last row, or the only row, on which something is found is returned. In our small example that is 5.
But wait that 5 was buried inside another function called Index. and in our small example that INDEX formula would be:
=INDEX($B$3:$B$5,AGGREGATE(...)-2)
Well, we know that the range is only 3 rows long, so asking for the 5th row would have Excel smacking you upside the head with an error because your index number is out of range. So in comes the header row correction: -1 in the original formula, or -2 for the small example. What we really see for the small example is:
=INDEX($B$3:$B$5,5-2)
=INDEX($B$3:$B$5,3)
And here is a weird bit of info: that last one does not become PLATYPUS. It becomes a reference to cell B5, which pulls PLATYPUS. But that little nuance is a story for another time.
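For anyone who finds loops easier to read than array formulas, here is a minimal VBA sketch of the same arithmetic. LastMatchRow and its parameters are hypothetical names made up for this illustration, and it works on a plain array rather than a worksheet range. It is only meant to mirror the steps above, not to replace the formula. Note that VBA string comparison is case-sensitive by default, while the worksheet = comparison is not.

```vba
Option Explicit

' Hypothetical helper: mimics AGGREGATE(14,6,(rng=target)*ROW(rng),1)
' for a one-column list whose first item sits on sheet row firstRow.
Public Function LastMatchRow(items As Variant, firstRow As Long, target As String) As Long
    Dim i As Long, product As Long, largest As Long
    largest = 0
    For i = LBound(items) To UBound(items)
        ' TRUE/FALSE coerced to 1/0, then multiplied by the sheet row number
        product = IIf(items(i) = target, 1, 0) * (firstRow + i - LBound(items))
        ' LARGE with k = 1 keeps the biggest element
        If product > largest Then largest = product
    Next i
    LastMatchRow = largest
End Function

Public Sub DemoLastMatchRow()
    Dim items As Variant
    items = Array("DOG", "CAT", "PLATYPUS")   ' stands in for B3:B5
    Debug.Print LastMatchRow(items, 3, "PLATYPUS")       ' sheet row 5
    Debug.Print LastMatchRow(items, 3, "PLATYPUS") - 2   ' position 3 inside the range
End Sub
```

The second Debug.Print line is the -2 correction: it turns the sheet row into the position that INDEX expects inside a range starting on row 3.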
Now, in the comments Scott essentially told me to invert to force an error in order to get the first row. This is an important step for AGGREGATE, and it had me running in circles for a while. So the full equation for the first-row option in our mini example is:
=INDEX($B$3:$B$5,AGGREGATE(15,6,1/($B$3:$B$5=C2)*ROW($B$3:$B$5),1)-2)
And what Scott Craner was actually suggesting, which skips one math step, is:
=INDEX($B$3:$B$5,AGGREGATE(15,6,ROW($B$3:$B$5)/($B$3:$B$5=C2),1)-2)
However, since I only realized this after writing all of this up, the explanation will continue with the first of these two equations.
So the important thing to note here is the change from function 14 to function 15, which is SMALL. Think of it as finding the minimum. And this time that 6 plays a huge role, along with the 1/. So our array in the middle this time equates to:
1/FALSE X 3
1/FALSE X 4
1/TRUE X 5
Which then becomes:
1/0 X 3
1/0 X 4
1/1 X 5
Which then has Excel slapping you upside the head again because you are trying to divide by 0:
#DIV/0! X 3
#DIV/0! X 4
1/1 X 5
But you were smart and protected yourself from that slap upside the head when you told AGGREGATE to ignore errors by using 6 as the second argument! Therefore what is above becomes:
{5}
Since we are performing a SMALL, and we passed ,1) as the closing part of the AGGREGATE, we have essentially said give me the minimum row number or the 1st smallest number of the resulting array when sorted in ascending order.
The rest plays out the same as it did for the LARGE AGGREGATE method. The pitfall I fell into originally was that I did not use the 1/ to force an error. As a result, every time I tried getting the SMALL of the array I was getting 0 from all the FALSE results.
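The same idea can be sketched in VBA, again with a plain array standing in for the range. FirstMatchRow is a hypothetical name, and the second list in the demo, with DOG appearing twice, is an assumption added only to show that the first match wins. Skipping a non-matching element plays the role of the #DIV/0! error that option 6 throws away.

```vba
Option Explicit

' Hypothetical helper: mimics AGGREGATE(15,6,1/(rng=target)*ROW(rng),1).
' Returns 0 when nothing matches (the worksheet formula would return an error).
Public Function FirstMatchRow(items As Variant, firstRow As Long, target As String) As Long
    Dim i As Long, sheetRow As Long, smallest As Long
    smallest = 0
    For i = LBound(items) To UBound(items)
        If items(i) = target Then
            ' 1/TRUE * row = row, so the row number survives
            sheetRow = firstRow + i - LBound(items)
            ' SMALL with k = 1 keeps the lowest surviving row number
            If smallest = 0 Or sheetRow < smallest Then smallest = sheetRow
        End If
        ' A non-match would be 1/FALSE = #DIV/0!; ignoring errors is
        ' modelled here by simply skipping the element.
    Next i
    FirstMatchRow = smallest
End Function

Public Sub DemoFirstMatchRow()
    Debug.Print FirstMatchRow(Array("DOG", "CAT", "PLATYPUS"), 3, "PLATYPUS")   ' 5
    Debug.Print FirstMatchRow(Array("DOG", "CAT", "DOG"), 3, "DOG")             ' 3, not 5
End Sub
```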
SUMPRODUCT works in a very similar fashion, but only works when the result array in the middle returns exactly 1 non-zero answer. The reason is that the last step of the SUMPRODUCT function is to add up all the individual elements of the resulting array. So if you only have 1 non-zero element, you get that non-zero number. If you had two rows that matched, for instance 12 and 31, the SUMPRODUCT method would return 43, which is not any of the row numbers you wanted, whereas AGGREGATE LARGE would have told you 31 and AGGREGATE SMALL would have told you 12.
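To make the 12 and 31 example concrete, here is a rough VBA sketch that takes the middle array and reports what each of the three approaches would hand back. SummariseMatches is a hypothetical name, and the array layout (0 for a non-match, the row number for a match) is an assumption that mirrors the LARGE walkthrough.

```vba
Option Explicit

' Hypothetical helper: products holds 0 for a non-match and the row number
' for a match. The three ByRef arguments receive the three results.
Public Sub SummariseMatches(products As Variant, ByRef total As Long, _
                            ByRef largest As Long, ByRef smallest As Long)
    Dim p As Variant
    total = 0: largest = 0: smallest = 0
    For Each p In products
        total = total + p                   ' SUMPRODUCT: add everything up
        If p > largest Then largest = p     ' AGGREGATE 14 (LARGE), k = 1
        If p <> 0 Then                      ' zeros stand in for the errors SMALL skips
            If smallest = 0 Or p < smallest Then smallest = p   ' AGGREGATE 15 (SMALL), k = 1
        End If
    Next p
End Sub

Public Sub DemoSummariseMatches()
    Dim total As Long, largest As Long, smallest As Long
    SummariseMatches Array(0, 12, 0, 31), total, largest, smallest
    Debug.Print total, largest, smallest    ' 43, 31 and 12
End Sub
```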
Something like this maybe, starting in K2 and copied down:
=IFERROR(INDEX(D:D,MAX(IFERROR(MATCH(J2,B:B,0),-1),IFERROR(MATCH(J2,E:E,0),-1),IFERROR(MATCH(J2,G:G,0),-1),IFERROR(MATCH(J2,H:H,0),-1))),"")
If you want to keep the positions of the columns for the Match variable, consider creating generic range names for each column you want to check, like "Col1", "Col2", "Col3". Create a few more range names than you think you will need and reference them to =$B:$B, =$E:$E etc. Plug all range names into Match functions inside the Max() statement as above.
When columns are added or removed from the table, adjust the range name definitions to the columns you want to check.
For example, if you set up the formula with five Matches inside the Max(), and the table changes so you only want to check three columns, point three of the range names to the same column. The Max() will only return one result and one lookup, even if the same column is matched several times.
I came up with a VBA solution, if I understood correctly:
Sub DisplayActiveRange()
    Dim sheetToSearch As Worksheet
    Set sheetToSearch = Sheet2
    Dim sheetToOutput As Worksheet
    Set sheetToOutput = Sheet1
    Dim search As Range
    Dim searchCol As String
    searchCol = "J"
    Dim outputCol As String
    outputCol = "K"
    Dim valueCol As String
    valueCol = "D"
    Dim r As Range
    ' Long instead of Integer so sheets with more than 32,767 rows do not overflow
    Dim currentRow As Long
    Dim maxRow As Long
    maxRow = sheetToOutput.UsedRange.Rows.Count
    For currentRow = 1 To maxRow
        ' Qualify the range so the lookup value always comes from the output sheet
        Set search = sheetToOutput.Range(searchCol & currentRow)
        If search.Value <> "" Then
            For Each r In sheetToSearch.UsedRange
                If r.Value = search.Value Then
                    ' Take the value column from the row where the match was found
                    sheetToOutput.Range(outputCol & currentRow).Value = _
                        sheetToSearch.Range(valueCol & r.Row).Value
                    ' Stop at the first match instead of changing the loop counter
                    Exit For
                End If
            Next r
        End If
    Next currentRow
End Sub
There might be better ways of doing it, but this will give you what you want. We assume headers in both the "source" and "destination" sheets. You will need to adapt the Const declarations to match how your sheets are named. Press Alt+F11 in Excel to bring up the VBA window, copy and paste this code into "ThisWorkbook" under the "VBA Project" group, then select "Run" from the menu:
Option Explicit
Private Const sourceSheet = "Source"
Private Const destSheet = "Destination"

Public Sub FindColumns()
    Dim rowCount As Long
    Dim foundValue As String
    Sheets(destSheet).Select
    rowCount = 1 'Assume a header row
    Do While Range("J" & rowCount + 1).value <> ""
        rowCount = rowCount + 1
        foundValue = FncFindText(Range("J" & rowCount).value)
        Sheets(destSheet).Select
        Range("K" & rowCount).value = foundValue
    Loop
End Sub

Private Function FncFindText(value As String) As String
    Dim rowLoop As Long
    Dim colLoop As Integer
    Dim found As Boolean
    Dim pos As Long
    Sheets(sourceSheet).Select
    rowLoop = 1
    colLoop = 0
    Do While Range(alphaCon(colLoop + 1) & rowLoop + 1).value <> "" And found = False
        rowLoop = rowLoop + 1
        Do While Range(alphaCon(colLoop + 1) & rowLoop).value <> "" And found = False
            colLoop = colLoop + 1
            pos = InStr(Range(alphaCon(colLoop) & rowLoop).value, value)
            If pos > 0 Then
                FncFindText = Mid(Range(alphaCon(colLoop) & rowLoop).value, pos, Len(value))
                found = True
            End If
        Loop
        colLoop = 0
    Loop
End Function

Private Function alphaCon(aNumber As Integer) As String
    Dim letterArray As String
    Dim iterations As Integer
    letterArray = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    If aNumber <= 26 Then
        alphaCon = (Mid$(letterArray, aNumber, 1))
    Else
        If aNumber Mod 26 = 0 Then
            iterations = Int(aNumber / 26)
            alphaCon = (Mid$(letterArray, iterations - 1, 1)) & (Mid$(letterArray, 26, 1))
        Else
            'we deliberately round down using 'Int' as anything with decimal places is not a full iteration.
            iterations = Int(aNumber / 26)
            alphaCon = (Mid$(letterArray, iterations, 1)) & (Mid$(letterArray, (aNumber - (26 * iterations)), 1))
        End If
    End If
End Function
