Excel: create sortable compound ID - excel

All, I asked a question "Excel VBA: Sort, then Copy and Paste" and received two excellent answers. However, because I failed to provide sufficient user requirements, they won't work: I asked for a fix to the existing solution I created, instead of specifying the actual business need and seeing if anyone has a better way.
(sigh) Here goes:
My boss asked me to create a ss to log issues. He wants a compound ID that concatenates the "Assigned Date" with a number that indicates what number issue it is for that day only. A new day, the count must restart at 1. E.g.:
Assigned Issue Concatenated
Date & Count = ID
5/11/2011 & 1 = 5112011-1
5/11/2011 & 2 = 5112011-2
5/11/2011 & 3 = 5112011-3
5/12/2011 & 1 = 5122011-1
I solved this with a hidden column C that calculates =IF(D2<>D1,1,C1+1), thus calculating the Issue Count by incrementing the previous issue count if the assigned date in column D is the same as the previous date, and starting over at 1 when the date changes. Another column concatenates the assigned date and the issue count, and I have my issue ID.
Quick, easy, elegant, in, out, and done. Right? But when I delivered the ss, he pointed out that if you (that is, he) sorts any part of the spreadsheet, the issue ID goes out of sequence. Of course---each formula isn't referencing the previous date in sequence if the rows are sorted out of Assigned Date order.
My immediate thought, which prompted my previous question, was to first re-sort the Assigned Date order correctly, then copy and paste the value of the calculated Issue Count to lock it in, and thus preserve the concatenated ID.
The only other way I can see to do this (in VBA, natch) is to:
evaluate all the dates in the Assigned Date column
evaluate all the numbers in the Issue Count column
calculate the latest sequential Issue Count for an a new item assigned on a given Assigned Date
Assign that sequential Issue Count to the new item
It'd be nice to then place the cursor into the next cell that the user would ordinarily go to, which would be the one right adjacent to the just-entered Assigned Date; however, that isn't necessary
That would avoid the need to re-sort the physical ss. However, besides a hazy guess that this would involve VLOOKUP, I got nothing. I couldn't find anything through searching.
Can anyone help? Or suggest a place to go? Thanks!!!

Sounds like you just want to automate a Paste Special action. The following replaces the formulas in a1:a100 with their calculated values:
Set src = ActiveSheet.Range("a1:a100")
src.Copy
src.Select
Selection.PasteSpecial Paste:=xlPasteValues, _
Operation:=xlNone, _
SkipBlanks:=False, _
Transpose:=False

I think the formula =IF(D2<>D1,1,C1+1) could be improved as this relies on dates being in order. The following will preserve the count for any order that is sorted
Assume
ColA ColB ColC
Row1 Assigned_Date Issue Count Concatenate
Row2 05/11/2011 =COUNTIF($A$1:A2,A2) =TEXT(A2,"ddmmyyyy")&"-"&B2
Row3 05/11/2011 =COUNTIF($A$1:A3,A3) =TEXT(A3,"ddmmyyyy")&"-"&B3
Row4 05/12/2011 =COUNTIF($A$1:A4,A4) =TEXT(A4,"ddmmyyyy")&"-"&B4
Row5 05/11/2011 =COUNTIF($A$1:A5,A5) =TEXT(A5,"ddmmyyyy")&"-"&B5
Essentially enter B2 and C2 formulae and drag down. You might need to swap ddmmyyyy to mmddyyyy as we use dates first rather than months :)
Also, note the locking of the first part of the range only using $ - $A$1:Ax
This works perfectly for your current question but does not work if the Issue Count is assigned in time order per date.

How about using a procedure? Just click a button to add the next entry.
I've assumed that the entries will be given today's date and that the sheet layout is:
Rows: 1 = Title / 2 = left blank / 3 = Headings of the data block
Columns: A = Date / B = Issue Count / C = Combined ID / D etc = other data
Sub AddEntry()
Dim iDayRef As Long, iNumRows As Long, n As Long
With Range("A3")
iNumRows = .CurrentRegion.Rows.Count
For n = 2 To iNumRows
If .Cells(n, 1).Value = Date Then
If .Cells(n, 2).Value > iDayRef Then iDayRef = .Cells(n, 2).Value
End If
Next
.Cells(iNumRows + 1, 1).Value = Date
.Cells(iNumRows + 1, 2).Value = iDayRef + 1
.Cells(iNumRows + 1, 3).Value = Format(Date, "mm/dd/yyyy") & " - " & iDayRef + 1
.Cells(iNumRows + 1, 4).Select
End With
End Sub
And do you really need three columns for Date, Count, and Combined ID? If you went with a
yyyy/mm/dd - xx
ID format, one column could replace all three, and you could easily sort on it.

Related

How to fill data in excel sheet where date lies between a series of date ranges given in another sheet ? Also, a particular column should match

I'm working on Social Survey project.Due to discrepancies in data I'm stuck at a certain place. The survey conducting volunteers were given tablets with unique IDs. On different dates, the tablets were used in different cities
Sheet 1 one contains a list of around thousands of responses for which city names are missing and Sheet 2 contains a list of tablets in use in different cities on different dates.
Sheet 1
City DeviceID StartDate EndDate
Delhi 25 21-08-2014 26-08-2014
Mumbai 39 14-05-2014 21-05-2014
Chennai 91 17-11-2014 21-11-2014
Bangalore 91 11-10-2014 21-10-2014
Delhi 91 26-05-2015 29-05-2015
Hyderabad 25 23-05-2015 28-05-2015
Sheet 2
S.Id DeviceId SurveyDate City
203 91 15-10-2014 ?
204 25 24-08-2014 ?
I need to somehow fill up the values for the city column in Sheet 2.
I tried using Vlookup but being a beginner to excel, was unable to get things working. I managed to format the string in date columns as date.
But am unsure about how to pursue this further.
From my understanding, Vlookup requires that the date ranges to be continuous, with no missing values in between. It is not so in this case. This is real world data and hence imperfect.
What would be the right approach to this problem ? Can this be done with excel macros ?
I also read up a bit about nested if statements but am confused being a beginner to excel formulas and data manipulation.
There is two ways to do what you want.
The first one is using vba and create a macro to do the job BUT you will have to iterate through all your data multiple time (n1*n2 loops in the worst case scenario where n1 and n2 is the number of rows in it's table respectively) which is really slow if you have a lot of data.
The other way is a little more complicated and includes array formulas but is really faster than vba because it uses the build in functions of excel (which are optimized already).
So I will use a much simpler example and you can use that as you wish on your data.
I have the following tables:
Table1
city ID start end
A 1 3 5
B 3 4 6
C 3 5 8
Table 2
ID point city
3 5 ?
So we want a formula that completes the second table. where ID match exactly and point is between start-end. We are going to use MATCH and INDEX to get it.
Here it is:
=INDEX(A$2:A$4;MATCH(1;(B$2:B$4=G2)*(C$2:C$4<=H2)*(D$2:D$4>=H2);0))
First of all to run this after you write it you should not press enter but instead ctrl+shift+enter to tell excel to run it as an array formula otherwise it will not run at all.
Now we got that out of the way let me explain what is going on here:
The MATCH does the following:
match the value 1 (TRUE) in the range I created and that should be an exact match. But how the range is created? Lets take that part for example:
This B$2:B$4=G2 -gives-> {1;3;3}=3 --> {FALSE;TRUE;TRUE}
Similarly the second thing in the MATCH gives: {TRUE;TRUE;FALSE}
So now we have (keep in mind that the * is similar to logical AND):
{FALSE;TRUE;TRUE}*{TRUE;TRUE;FALSE} --> {FALSE;TRUE;FALSE}
and this combined with the third gives {FALSE;TRUE;FALSE}
So now we have MATCH(1;{FALSE;TRUE;FALSE};0) --> 2 because in the range only the second row matches the 1 (first row that it matches).
So now we just use index to get from another range whatever is on row 2.
You can use the above on your own data to get the expected results.
Good luck!
If the deviceId values should match and the survey date should be between the start date and end date, VLookup won't suffice. The following pointers, however, should get you started:
1) Define the date ranges from which the date comparisons should be made.
2) Use an overlap date checking function to determine if the date in question overlaps the start and end dates.
3) Loop through the date ranges and insert in Sheet2 when a match is found, i.e. when the deviceId values match and the date overlaps.
The following function takes as parameters the date to be checked, the start and end date and returns True, if dateVal overlaps the start and end date:
Function dateOverlap(dateVal As String, startDate As String, endDate As String) As Boolean
If DateDiff("d", startDate, dateVal) >= 0 And DateDiff("d", endDate, dateVal) <= 0 Then _
dateOverlap = True
End Function
Example usage
Debug.Print dateOverlap("05-10-2016", "01-10-2016", "10-10-2016") (returns true).
Here we use MEDIAN() as an easy way to test for "in-between".
Sub FillInTheBlanks()
Dim s1 As Worksheet, s2 As Worksheet
Dim N1 As Long, N2 As Long, i As Long, j As Long
Dim rc As Long, DeId As Long, sDate As Date
Dim wf As WorksheetFunction
Set s1 = Sheets("Sheet1")
Set s2 = Sheets("Sheet2")
Set wf = Application.WorksheetFunction
rc = Rows.Count
N1 = s1.Cells(rc, "A").End(xlUp).Row
N2 = s2.Cells(rc, "A").End(xlUp).Row
For i = 2 To N2
DeId = s2.Cells(i, "B").Value
sDate = s2.Cells(i, "C").Value
For j = 2 To N1
If DeId = s1.Cells(j, 2).Value Then
If sDate = wf.Median(sDate, s1.Cells(j, "C").Value, s1.Cells(j, "D").Value) Then
s2.Cells(i, "D").Value = s1.Cells(j, "A").Value
End If
End If
Next j
Next i
End Sub
Sheet2:
starting from Sheet1:

Need formula for excel, to subtract the number "9" to each number individually and

I want you to have some fun. I need something specific.
First i must explain what i do. I use a simple codification for product prices at retail store, because i dont want people know the real price for themselves. So i change the original numbers to another subtracting the number 9 for each number.
Normally I manually write down all the prices with this codification for every product.
So.. for example number 10 would be 89. (9-1 = 8) and (9-0 = 9)
Other examples:
$128 = 871
$75 = 24
$236 = 763
$9 = 0
Finally i put 2 number nines (9) at the beginning of the codified price also, to confuse people who might think that number could be the price.
So the examples i used before are like this:
99871 (means $128)
9924 (means $75)
99763 (means $236)
990 (means $9)
Remember that i need 2 (two) nines before the real price. The real prices never start with 0 so, the nines at the beginning exist only to confuse people.
Ok. So, now that you understand, here comes the 2nd part.
I have an excel whith hundreds of my products added, with prices, description, etc. And i decided it is time to use a printer and start to print this information from excel. I have a software to do that, but first i need to have the codified prices in the excel also.
The fun part begins when i want to convert the real prices that are already written in my excel document into a new column AUTOMATICALLY. So that way i don´t have to type again all the prices in codified form for the old and new items i add in the future.
Can someone help me with this? Is it even possible?
I tried with =A1-9999 but, it works well with 2 character number only. Because if the real price is 5, i will get 3 nines: 9994(code). And if the price is 234 i will get only 1 nine 9765(code). And it is a condition i need to have the TWO nines at first.
Thank you very much in advanced!
Though you have requested for formula , I am suggesting VBA program which seems to me very convenient.
You have to open VBE and insert a module and copy the program. Change the code lines wherever indicated to suit your requirements for sheets etc.
Sub NumberCode()
Dim c As Range
Dim LR As Integer
Dim numProbs As Long
Dim sht As Worksheet
Dim s As Integer
Dim v As Long
Dim v1 As Long
Set sht = Worksheets("Sheet1") ' change as per yr requirement
numProbs = 0
LR = sht.Cells(Rows.Count, "A").End(xlUp).Row
For Each c In sht.Range("A1:A" & LR).Cells
s = Len(c)
v = c.Value
v1 = 99
For s = 1 To Len(c)
v1 = v1 & (9 - Mid(c, s, 1))
Next
c.Offset(0, 1).Value = v1
v1 = 99
numProbs = numProbs + 1
Next
MsgBox "Number coding finished"
End Sub
Sample sheet of results is appended below.
I will be using helper cells but you could dump it all into one cell if you want since you are only dealing with 4 characters.
For the purpose of this example, I am assuming your original price list starts in B11.
=IFERROR(9-MID($B11,COLUMN(A1),1),"")
Place that in D11 and copy to the right three more times so you have it from D11 to G11. That formula strips off 1 character from your price and subtracts that character from 9. When you go the next column it repeats itself. If you do not have that many characters, it will return "".
In C11 you will build your number based on the adjacent 4 columns using this formula:
="99"&D11&E11&F11&G11
It places 99 in front then adds the numbers from the adjacent 4 columns.
Select cells C11 to G11 and copy and paste downward beside your data column as far as you need to go.
An alternate more concise method would be:
=REPT(9,LEN(B11)+2)-B11
Perhaps I'm missing something, though simply:
=REPT(9,2+LEN(A1))-A1
seems good to me.
Regards

How can I lookup data from one column, when the value I'm referencing changes columns?

I want to do an INDEX-MATCH-like lookup between two documents, except my MATCH's index array doesn't stay in one column.
In Vague-English: I want a value from a known column that matches another value that may be found in any column.
Refer to the image below. Let's call everything to the left of the bold vertical line on column H doc1, and the right side will be doc2.
Doc2 has a column "Find This", which will be the INDEX's array. It is compared with "ID1" from doc1 (Note that the values in "Find This" will not be in the same order as column ID1, but it's easier to undertsand this way).
The "[Result]" column in doc2 will be the value from doc1's "Want This" column from the row that matches "FIND THIS" ...However, sometimes the value from "FIND THIS" is not in the "ID1" column, and is instead in "ID2","ID3", etc.
So, I'm trying to generate Col K from Col J. This would be like pressing Ctrl+F and searching for a value in Col J, then taking the value from Col D in that row and copying it to Col K.
I made identical values from a column the same color in the other doc to make it easier to visualize where they are coming from.
Note also that in column F of doc1, the same value from doc2's "Find This" can be found after some other text.
Also note that the column headers are only there as examples, the ID columns are not actually numbered.
I would simply hard-code the correct column to search from, but I'm not in control of doc1, and I'm worried that future versions may have new "ID" columns, with other's being removed.
I'd prefer this to be a solution in the form of a formula, but VB will do.
To generate column K based on given values of column J then you could use the following:
=INDEX(doc1!$D$2:$D$14,SUMPRODUCT((doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14))-1)
Copy that formula down as far as you need to go.
It basically only returns the row of the where a matching column J is found. we then find that row in the index of your D range to get your value in K.
Proof of concept:
UPDATE:
If you are working with non unique entities n column J. That is the value on its own can be found in multiple rows and columns. Consider using the following to return the Last row where there J value is found:
=INDEX(doc1!$D$2:$D$14,AGGREGATE(14,6,(doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14),1)-1)
UPDATE 2:
And to return the first row where what you are looking in column J is found use:
=INDEX($D$2:$D$14,AGGREGATE(15,6,1/($B$2:$H$14=J2)*ROW($B$2:$H$14)-1,1))
Thanks to Scott Craner for the hint on the minimum formula.
To determine if you have UNIQUE data from column J in your range B2:H14 you can enter this array formula. In order to enter an array formula you need to press CTRL+SHFT+ENTER at the same time and not just ENTER. You will know you have done it right when you see {} around your formula in the formula bar. You cannot at the {} manually.
=IF(MAX(COUNTIF($B$2:$H$14,J2:J14))>1,"DUPLICATES","UNIQUE")
UPDATE 3
AGGREGATE - A relatively new function to me but goes back to Excel 2010. Aggregate is 19 functions rolled into 1. It would be nice if they all worked the same way but they do not. I think it is functions numbered 14 and up that will perform the same way an array formula or a CSE formula if you prefer. The nice thing is you do not need to use CSE when entering or editing them. SUMPRODUCT is another example of a regular formula that performs array formula calculations.
The meat of this explanation I believe is what is happening inside of the AGGREGATE brackets. If you click on the link you will get an idea of what the first two arguments are. The first defines which function you are using, and the second tell AGGREGATE how to deal with Errors, hidden rows, and some other nested functions. That is the relatively easy part. What I believe you want to know is what is happening with this:
(doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14)
For illustrative purpose lets reduce this formula to something a little smaller in scale that does the same thing. I'll avoid starting in A1 as that can make life a little easier when counting since it the 1st row and first column. So by placing the example range outside of it you can see some more special considerations potentially.
What I want to know is what row each of the items list in Column C occurs in column B
| B | C
3 | DOG | PLATYPUS
4 | CAT | DOG
5 | PLATYPUS |
The full formula for our mini example would be:
{=($B$3:$B$5=C2)*ROW($B$3:$B$5)}
And we are going to look at the following as an array
=INDEX($B$3:$B$5,AGGREGATE(14,6,($B$3:$B$5=C2)*ROW($B$3:$B$5),1)-2)
So the first brackets is going to be a Boolean array as you noted. Every cell that is TRUE will TRUE until its forced into a math calculation. When that happens, True becomes 1 and False becomes 0.I that formula was entered as a CSE formula and place in D2, it would break down as follows:
FALSE X 3
FALSE X 4
TRUE X 5
The 3, 4 and 5 come from ROW() returning the value of the row number that it is dealing with at the time of the array math operation. Little trick, we could have had ROW(1:3). Just need to make sure the size of the array matches! This is not matrix math is just straight across multiplication. And since the Boolean is now experiencing a math operation we are now looking at:
0 X 3 = 0
0 X 4 = 0
1 X 5 = 5
So the array of {0,0,5} gets handed back to the aggregate for more processing. The important thing to note here is that it contains ONLY 0 and the individual row numbers where we had a match. So with the first aggregate formula, formula 14 was chosen which is the LARGE function. And we also told it to ignore errors, which in this particular case does not matter. So after providing the array to the aggregate function, there was a ,1) to finish off the aggregate function. The 1 tells the aggregate function that we want the 1st larges number when the array is sorted from smallest to largest. If that number was 2 it would be the 2nd largest number and so on. So the last row or the only row that something is found on is returned. So in our small example it would be 5.
But wait that 5 was buried inside another function called Index. and in our small example that INDEX formula would be:
=INDEX($B$3:$B$5,AGGREGATE(...)-2)
Well we know that the range is only 3 rows long, so asking for the 5th row, would have excel smacking you up side the head with an error because your index number is out of range. So in comes the header row correction of -1 in the original formula or -2 for the small example and what we really see for the small example is:
=INDEX($B$3:$B$5,5-2)
=INDEX($B$3:$B$5,3)
and here is a weird bit of info, That last one does not become PLATYPUS...it becomes the cell reference to =B5 which pulls PLATYPUS. But that little nuance is a story for another time.
Now in the comments Scott essentially told me to invert for the error to get the first row. And this is important step for the aggregate and it had me running in circles for awhile. So the full equation for the first row option in our mini example is
=INDEX($B$3:$B$5,AGGREGATE(15,6,1/($B$3:$B$5=C2)*ROW($B$3:$B$5),1)-2)
And what Scott Craner was actually suggesting which Skips one math step is:
=INDEX($B$3:$B$5,AGGREGATE(15,6,ROW($B$3:$B$5)/($B$3:$B$5=C2),1)-2)
However since I only realized this after writing this all up the explanation will continue with the first of these two equations
So the important thing to note here is the change from function 14 to function 15 which is SMALL. Think of it a finding the minimum. And this time that 6 plays a huge factor along with the 1/. So our array in the middle this time equates to:
1/FALSE X 3
1/FALSE X 4
1/TRUE X 5
Which then becomes:
1/0 X 3
1/0 X 4
1/1 X 5
Which then has excel slapping you up side the head again because you are trying to divide by 0:
#div/0 X 3
#div/0 X 4
1/1 X 5
But you were smart and you protected yourself from that slap upside the head when you told AGGREGATE to ignore error when you used 6 as the second argument/reference! Therefore what is above becomes:
{5}
Since we are performing a SMALL, and we passed ,1) as the closing part of the AGGREGATE, we have essentially said give me the minimum row number or the 1st smallest number of the resulting array when sorted in ascending order.
The rest plays out the same as it did for the LARGE AGGREGATE method. The pitfall I fell into originally is I did not use the 1/ to force an error. As a result, every time I tried getting the SMALL of the array I was getting 0 from all the false results.
SUMPRODUCT works in a very similar fashion, but only works when your result array in the middle only returns 1 non zero answer. The reason being is the last step of the SUMPRODUCT function is to all the individual elements of the resulting array. So if you only have 1 non zero, you get that non zero number. If you had two rows that matched for instance 12 and 31, then the SUMPRODUCT method would return 43 which is not any of the row numbers you wanted, where as aggregate large would have told you 31 and aggregate small would have told you 12.
Something like this maybe, starting in K2 and copied down:
=IFERROR(INDEX(D:D,MAX(IFERROR(MATCH(J2,B:B,0),-1),IFERROR(MATCH(J2,E:E,0),-1),IFERROR(MATCH(J2,G:G,0),-1),IFERROR(MATCH(J2,H:H,0),-1))),"")
If you want to keep the positions of the columns for the Match variable, consider creating generic range names for each column you want to check, like "Col1", "Col2", "Col3". Create a few more range names than you think you will need and reference them to =$B:$B, =$E:$E etc. Plug all range names into Match functions inside the Max() statement as above.
When columns are added or removed from the table, adjust the range name definitions to the columns you want to check.
For example, if you set up the formula with five Matches inside the Max(), and the table changes so you only want to check three columns, point three of the range names to the same column. The Max() will only return one result and one lookup, even if the same column is matched several times.
I came up with a vba solution if I understood correctly:
Sub DisplayActiveRange()
Dim sheetToSearch As Worksheet
Set sheetToSearch = Sheet2
Dim sheetToOutput As Worksheet
Set sheetToOutput = Sheet1
Dim search As Range
Dim output As Range
Dim searchCol As String
searchCol = "J"
Dim outputCol As String
outputCol = "K"
Dim valueCol As String
valueCol = "D"
Dim r As Range
Dim currentRow As Integer
currentRow = 1
Dim maxRow As Integer
maxRow = sheetToOutput.UsedRange.Rows.Count
For currentRow = 1 To maxRow
Set search = Range("J" & currentRow)
For Each r In sheetToSearch.UsedRange
If r.Value <> "" Then
If r.Value = search.Value Then
Set output = sheetToOutput.Range(outputCol & currentRow)
output.Value = sheetToSearch.Range(valueCol & currentRow).Value
currentRow = currentRow + 1
Set search = sheetToOutput.Range(searchCol & currentRow)
End If
End If
Next
Next currentRow
End Sub
There might be better ways of doing it, but this will give you what you want. We assume headers in both "source" and "destination" sheets. You will need to adapt the "Const" declarations according to how your sheets are named. Press Control & G in Excel to bring up the VBA window and copy and paste this code into "This Workbook" under the "VBA Project" group, then select "Run" from the menu:
Option Explicit
Private Const sourceSheet = "Source"
Private Const destSheet = "Destination"
Public Sub FindColumns()
Dim rowCount As Long
Dim foundValue As String
Sheets(destSheet).Select
rowCount = 1 'Assume a header row
Do While Range("J" & rowCount + 1).value <> ""
rowCount = rowCount + 1
foundValue = FncFindText(Range("J" & rowCount).value)
Sheets(destSheet).Select
Range("K" & rowCount).value = foundValue
Loop
End Sub
Private Function FncFindText(value As String) As String
Dim rowLoop As Long
Dim colLoop As Integer
Dim found As Boolean
Dim pos As Long
Sheets(sourceSheet).Select
rowLoop = 1
colLoop = 0
Do While Range(alphaCon(colLoop + 1) & rowLoop + 1).value <> "" And found = False
rowLoop = rowLoop + 1
Do While Range(alphaCon(colLoop + 1) & rowLoop).value <> "" And found = False
colLoop = colLoop + 1
pos = InStr(Range(alphaCon(colLoop) & rowLoop).value, value)
If pos > 0 Then
FncFindText = Mid(Range(alphaCon(colLoop) & rowLoop).value, pos, Len(value))
found = True
End If
Loop
colLoop = 0
Loop
End Function
Private Function alphaCon(aNumber As Integer) As String
Dim letterArray As String
Dim iterations As Integer
letterArray = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
If aNumber <= 26 Then
alphaCon = (Mid$(letterArray, aNumber, 1))
Else
If aNumber Mod 26 = 0 Then
iterations = Int(aNumber / 26)
alphaCon = (Mid$(letterArray, iterations - 1, 1)) & (Mid$(letterArray, 26, 1))
Else
'we deliberately round down using 'Int' as anything with decimal places is not a full iteration.
iterations = Int(aNumber / 26)
alphaCon = (Mid$(letterArray, iterations, 1)) & (Mid$(letterArray, (aNumber - (26 * iterations)), 1))
End If
End If
End Function

How can I add a 1 to the most recent, repeated row in Excel?

I have a dataset with 60+ thousand rows in excel and about 20 columns. The "ID column" sometimes repeats itself and I want to add a column that will return 1 only in the row that is the most recent only IF it repeats itself.
Here is the example. I have…
ID DATE ColumnX
AS1 Jan-2013 DATA
AS2 Feb-2013 DATA
AS3 Jan-2013 DATA
AS4 Dec-2013 DATA
AS2 Dec-2013 DATA
I want…
ID DATE ColumnX New Column
AS1 Jan-2013 DATA 1
AS2 Feb-2013 DATA 0
AS3 Jan-2013 DATA 1
AS4 Dec-2013 DATA 1
AS2 Dec-2013 DATA 1
I've been trying with a combination of sort and nested if's, but it depends on my data being always in the same order (so that it looks up the ID in the previous row).
Bonus points: consider my dataset if fairly large for excel, so the most efficient code that won't eat up processor would be appreciated!
An approach you could use is to point MSQuery at your table and use SQL to apply the business rules. On the positive side, this runs very quickly (a couple seconds in my tests against 64k rows). A huge minus is the query engine does not seem to support Excel tables exceeding 64k rows, but there might be ways to work around this. Regardless, I offer the solution in case it gives you some ideas.
To set up first give your data set a named range. I called it MYTABLE. Save. Next select a cell to the right of your table in row 1, and click through Data | From other sources | from Microsoft Query. Choose Excel Files* | OK, browse for your file. The Query Wiz should open, showing MYTABLE available, add all the columns. Click Cancel (really), and click Yes, you want to continue editing.
The MSQuery interface should open, click the SQL button and replace the code with the following. You will need to edit some specifics, such as the file path. (Also, note I used different column names. This was sheer paranoia on my part. The Jet engine is very finicky and I wanted to rule out conflicts with reserved words as I built this.)
SELECT
MYTABLE.ID_X,
MYTABLE.DATE_X,
MYTABLE.COLUMN_X,
IIF(MAXDATES.ID_x IS NULL,0,1) * IIF(DUPTABLE.ID_X IS NULL,0,1) AS NEW_DATA
FROM ((`C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE
LEFT OUTER JOIN (
SELECT MYTABLE1.ID_X, MAX(MYTABLE1.DATE_X) AS MAXDATE
FROM `C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE1
GROUP BY MYTABLE1.ID_X
) AS MAXDATES
ON MYTABLE.ID_X = MAXDATES.ID_X
AND MYTABLE.DATE_X = MAXDATES.MAXDATE)
LEFT OUTER JOIN (
SELECT MYTABLE2.ID_X
FROM `C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE2
GROUP BY MYTABLE2.ID_X
HAVING COUNT(1) > 1
) AS DUPTABLE
ON MYTABLE.ID_X = DUPTABLE.ID_X)
With the code in place MSQuery will complain the query can't be represented graphically. It's OK. The query will execute -- it might take longer than expected to run at this stage. I'm not sure why, but it should run much faster on subsequent refreshes. Once results return, File | Return data to Excel. Accept the defaults on the Import Data dialog.
That's the technique. To refresh the query against new data simply Data | Refresh. If you need to tweak the query you can get back to it though Excel via Data | Connections | Properties | Definition tab.
The code I provided returns your original data plus the NEW_DATA column, which has value 1 if the ID is duplicated and the date is the maximum date for that ID, otherwise 0. This code will not sort out ties if an ID's maximum date is on several rows. All such rows will be tagged 1.
Edit: The code is easily modified to ignore the duplication logic and show most recent row for all IDs. Simply change the last bit of the SELECT clause to read
IIF(MAXDATES.ID_x IS NULL,0,1) AS NEW_DATA
In that case, you could also remove the final LEFT JOIN with alias DUPTABLE.
Sort by ID, then by DATE (ascending). Define entries in new column to be 1 if previous row has the same ID and next row has a different ID or is empty (for last row), 0 otherwise.
It could be done in VBA. I'd be interested to know if this is possible just using formulas, I had to do something similar once before.
Sub Macro1()
Dim rowCount As Long
Sheets("Sheet1").Activate
rowCount = Cells(Rows.Count, 1).End(xlUp).Row
Columns("A:D").Select
Selection.AutoFilter
Range("D2:D" & rowCount).Select
Selection.ClearContents
Columns("A:D").Select
ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort.SortFields.Add Key:=Range _
("B1:B" & rowCount), SortOn:=xlSortOnValues
ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort.SortFields.Add Key:=Range _
("A1:A" & rowCount), SortOn:=xlSortOnValues
ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort.Apply
Dim counter As Integer
For counter = 2 To rowCount
Cells(counter, 4) = 1
If Cells(counter, 1) = Cells(counter + 1, 1) Then Cells(counter, 4) = 0
Next counter
End Sub
So you activate the sheet and get the count of rows.
Then select and autofilter the results, and clear out Column D which has the 0s or 1s. Then filter on the values mbroshi suggested that you say you're already using. Then execute a loop for each record, changing the value to 1, but then back to 0 if the value ahead of it has the same ID.
Depending on your processor I dont think this would take more than a minute or two to run. If you do find something using formulas I would be interested to see it!

Turning an excel formula into a VBA function

I'm a bit new to trying to program and originally was just trying to improve a spreadsheet but it's gone beyond using a basic function in excel. I have a table that I am having a function look at to find a building number in the first column and then look at start and finish dates in two other respective columns to find out if it should populate specific blocks on a calendar worksheet. The problem occurs because the same building number may appear multiple times with different dates and I need to to find an entry that matches the correct dates.
I was able to create a working though complicated formula to find the first instance and learned I can add a nested if of that formula again in the false statement with a slight change. I can continue doing that but it becomes very large and cumbersome. I'm trying to find a way to make a function for the formula with a variable in it that would look at how many times the it has already been used so it keeps searching down the table for an answer that fits the parameters.
This is currently my formula:
=IFERROR(IF(AND(DATE('IF SHEET (2)'!$F$7,MATCH('IF SHEET (2)'!$C$2,'IF SHEET (2)'!$C$2:'IF SHEET (2)'!$N$2,0),'IF SHEET (2)'!C$4)>=VLOOKUP("2D11"&1,A2:F6,4,0),DATE('IF SHEET (2)'!$F$7,MATCH('IF SHEET (2)'!$C$2,'IF SHEET (2)'!$C$2:'IF SHEET (2)'!$N$2,0),'IF SHEET (2)'!C$4)<=VLOOKUP("2D11"&1,A2:F6,4,0)),IF(VLOOKUP("2D11"&1,A2:F6,3,0)="2D11",VLOOKUP("2D11"&1,A2:F6,6,FALSE)),"NO ANSWER"),"ERROR")
Where you see 2D11&1 is where I need the variable for 1 so it would be "number of times it's been used in the function +1" then I could just loop it so it would keep checking till it ran out of 2D11's or found one that matched. I haven't posted before and I'm doing this through a lot of trial and error so if you need more info please post and say so and I'll try to provide it.
So rather than have someone try to make sense of the rediculous formula I posted I though I would try to make it simpler by just stating what I need to accomplish and trying to see how to turn that into a VBA function. So I'm kinda looking at a few steps:
Matches first instance of building name in column A with
building name for the row of the output cell.
Is date connected with the output cell >= start date of first entry(which is user entered in column D).
Is date connected with the output cell <= end date of first entry(which is user entered in column E).
Enters Unit name(located in column F) for first instance of the building if Parts 1, 2, and 3 are all True.
If parts 1, 2, or 3 are False then loops to look at next instance of the building name down column 1.
Hopefully this makes things clearer than the formula so I'm able to get help as I'm still pretty stuck due to low knowledge of VBA.
Here is a simple solution...
Building_name = ???
Date = ???
Last_Row = Range("A65536").End(xlUp).Row
For i = 1 To Last_Row
if cells(i,1).value = Building_Name Then
if date >= cells(i,4).value Then
if date <= cells(i,5).value Then
first instance = cells(i,6).value
end if
end if
end if
next
you should add a test at the end to avoid the case where there is no first instance in the table
If I understand correctly, you have a Table T1 made of 3 columns: T1.building, T1.start date, T1.end date.
Then you have 3 parameters: P1=building, P2=start date, P3=end date.
You need to find the first entry in table T1 that "fits" within the input parameters dates, that is:
P1=T1.building
P2<=T1.start date
P3>=T1.end date
If so, you can define a custom function like this
Public Function MyLookup(Key As Variant, DateMin As Variant, DateMax As Variant, LookUpTable As Range, ResultColumn As Integer) As Range
Dim iIndx As Integer
Dim KeyValue As Variant
Dim Found As Boolean
On Error GoTo ErrHandler
Found = False
iIndx = 1
Do While (Not Found) And (iIndx <= LookUpTable.Rows.Count)
KeyValue = LookUpTable.Cells(iIndx, 1)
If (KeyValue = Key) And _
(DateMin <= LookUpTable.Cells(iIndx, 2)) And _
(DateMax >= LookUpTable.Cells(iIndx, 3)) Then
Set MyLookup = LookUpTable.Cells(iIndx, ResultColumn)
Found = True
End If
iIndx = iIndx + 1
Loop
Exit Function
ErrHandler:
MsgBox "Error in MyLookup: " & Err.Description
End Function
That may not be the most performant piece of code in the world, but I think it's explanatory.
You can download this working example

Resources