Related
I have an excel table that contain values in these formats. The tables span over 30000 entries.
I need to clean this data so that only the numbers directly after V- are left. This would mean that when the value is SV-51140r3_rule, V-4407..., I would only want 4407 to remain and when the value is SV-245744r822811_rule, I would only want 245744 to remain. I have about 10 formulas that can handle these variations, but it requires a lot of manual labor. I've also used the text to column feature of excel to clean this data as well, but it takes about 30 minutes to an hour to go through the whole document. I'm looking for ways that I can streamline this process so that one formula or function can handle all of these different variations. I'm open to using VBA but don't have a whole lot of experience with it and I am unable to use Pandas or any IDE or programming language. Help please!!
I've used text to columns to clean data that way and I've used a variation of this formula
=IFERROR(RIGHT(A631,LEN(A631)-FIND("#",SUBSTITUTE(A631,"-","#",LEN(A631)-LEN(SUBSTITUTE(A631,"-",""))))),A631)
Depending on your version of Excel, either of these should work. If you have the ability to use the Let function, it will improve your performance, as this outstanding article articulates.
If you're on a really old version of excel, you'll need to hit ctl shift enter to make array formula work.
While these look daunting, all these functions are doing is finding the last V (by this function) =SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄","") and then looping through each character and only returning numbers.
Obviously the mushroom 🍄 could be any character that one would consider improbable to appear in the actual data.
Old School
=TEXTJOIN("",TRUE,IF(ISNUMBER(MID(MID(SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄",""),
FIND("-",SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄","")),9^9),
FILTER(COLUMN($1:$1),COLUMN($1:$1)<=LEN(MID(SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄",""),
FIND("-",SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄","")),9^9))),1)+0),
MID(MID(SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄",""),
FIND("-",SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄","")),9^9),
FILTER(COLUMN($1:$1),COLUMN($1:$1)<=LEN(MID(SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄",""),
FIND("-",SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄","")),9^9))),1),""))
Let Function
(use this if you can)
=LET(zText,SUBSTITUTE(RIGHT(SUBSTITUTE(A2,"V",REPT("🍄",999)),999),"🍄",""),
TEXTJOIN("",TRUE,IF(ISNUMBER(MID(MID(zText,FIND("-",zText),9^9),
FILTER(COLUMN($1:$1),COLUMN($1:$1)<=LEN(MID(zText,FIND("-",zText),9^9))),1)+0),
MID(MID(zText,FIND("-",zText),9^9),
FILTER(COLUMN($1:$1),COLUMN($1:$1)<=LEN(MID(zText,FIND("-",zText),9^9))),1),"")))
VBA Custom Function
You could also use a VBA custom function to accomplish what you want.
Function getNumbersAfterCharcter(aCell As Range, aCharacter As String) As String
Const errorValue = "#NoValuesInText"
Dim i As Long, theValue As String
For i = Len(aCell.Value) To 1 Step -1
theValue = Mid(aCell.Value, i, 1)
If IsNumeric(theValue) Then
getNumbersAfterCharcter = Mid(aCell.Value, i, 1) & getNumbersAfterCharcter
ElseIf theValue = aCharacter Then
Exit Function
End If
Next i
If getNumbersAfterCharcter = "" Then getNumbersAfterCharcter = errorValue
End Function
I am having an issue with VLookup in my VBA code. When I didn't have the range as a dynamic size using CurrentRegion it worked flawlessly, now for whatever reason, it only works for a single loop over and fails at the first VLookup. I have tried with and without the with block. I have a variable that will replace the 5 in the for loop and works but have kept it as its original so that isn't affecting it.
I apologize if this is a duplicate but I could not find an answer. Any help would be appreciated.
Set jobTypeData = Worksheets("JobType").Range("A1").CurrentRegion
For i = 1 To 5
jobTypeTemp = Worksheets("Employees2").Cells(i + 1, 16).Value
jobRoleTemp = Worksheets("Employees2").Cells(i + 1, 17).Value
'Creates variables relevant to the job type
With Worksheets("JobType")
minHours = CDec(WorksheetFunction.VLookup(jobTypeTemp, jobTypeData, 2))
maxHours = CDec(WorksheetFunction.VLookup(jobTypeTemp, jobTypeData, 3))
minShift = CDec(WorksheetFunction.VLookup(jobTypeTemp, jobTypeData, 5))
maxShift = CDec(WorksheetFunction.VLookup(jobTypeTemp, jobTypeData, 6))
shiftGap = CDec(WorksheetFunction.VLookup(jobTypeTemp, jobTypeData, 7))
End With
There is more code under this but it all works a-okay and worked fine before using the dynamic data size.
The code that works does not change the contents or size of jobTypeData.
I have also checked the values VLookup returns and they are all correct. The only thing I can think of is that it can't find the second jobType however I have checked and they are identical (no hidden spaces).
The answer appears to be related to the way VLookup uses 'approximate match'.
I always thought VLookup would still provide the exact match if it was an option and because I had strict data validation I never thought using approximate match would be a problem. However, turns out, it's not just mildly unpredictable, but actually unreliable.
After checking it was assuming "Role Name" was a closer match to "Team Manager" than "Team Manager". Upon changing it to using 'exact match', everything worked fine. No changes to the code which worked before either.
Well, lesson learned...
When you have an exact match ALWAYS use (no matter how confident you are, use me as an example):
VLOOKUP(arg1, arg2, arg3, FALSE)
Not:
VLOOKUP(arg1, arg2, arg3)
Document: https://docs.google.com/spreadsheets/d/1N4cGw5eUq_3gCJh1w39qVatX9KV1_Hr-AqRHj_nbckA/edit#gid=1770960621
Question
How can I convert the following simple formulas at Schedule!C20:I29 into a single, simple ARRAYFORMULA at Schedule!C20?
=Count(Filter(Students!$B$5:$B, Find(C6, Filter(Students!$J$5:$O,Students!$J$4:$O$4 = 'Current Class'!$B$3))))
.
NOTE:
The above code is only a partial solution. I will substitute the ARRAYFORMULA version of the code into the correct part of the code at Current Class!L6
The C6 reference above can take on any cell between Schedule!C6:I15. I have named that range Timetable_Code. I thought I could do the following, but I was wrong...
=Arrayformula(Count(Filter(Students!$B$5:$B, Find(Timetable_Code, Filter(Students!$J$5:$O,Students!$J$4:$O$4 = 'Current Class'!$B$3)))))
Background
Originally, I created a table that now resides at 1st Version - Current Class!L6. This tab is only for your reference and will be deleted soon. Each cell has a formula with a slight modification. This formula works correctly; however, it is a behemoth and would be hard to modify...
=if(COUNTIF(Meta!$B$5:$B, CONCATENATE("=",if(L$5 = "THURSDAY", "TH", if(L$5 = "SUNDAY", "SU", left(L$5,1))), if(left($K6, 2) = "12", 0, left($K6, 1)))), CONCATENATE(if(L$5 = "THURSDAY", "TH", if(L$5 = "SUNDAY", "SU", left(L$5,1))), if(left($K6, 2) = "12", 0, left($K6, 1)), " ( ", Count(Filter(Students!$B$5:$B, Find(CONCATENATE(if(L$5 = "THURSDAY", "TH", if(L$5 = "SUNDAY", "SU", left(L$5,1))), if(left($K6, 2) = "12", 0, left($K6, 1))), Filter(Students!$J$5:$O,Students!$J$4:$O$4 = $B$3)))), " )") ,"")
.
Pros
I don't have to create any helper data.
All calculations are "in-memory"
Cons
Too large
Hard to modify
I like the output, but I don't like the cons, so I started to create a more edit-friendly version of the code that I am mostly OK with. This code is located at Current Class!L6 (and a secondary copy at Schedule!C33 - it will be deleted.) It has a single formula at Current Class!L6...
=arrayformula(if(COUNTIF(Meta!$B$5:$B, ("=" & Timetable_Code)), (Timetable_Code & " ( " & Timetable_StudentCount & " )") ,""))
.
Pros
Very easy to understand
Very easy to modify
No need to copy formula over to other cells
Cons
Two ( 2 ) helper tables were created ( one of which I think is unneeded)
Again, I like the output, but I really don't like the second helper table (Schedule!C20). I feel like this table can be eliminated, but I have not been able to figure out how.
If you really want to use arrayformula, here it is. For Schedule!C20.
=arrayformula((len(concatenate(index(Students!J5:O, , match('Current Class'!$B$3, Students!J4:O4, 0))))-len(substitute(concatenate(index(Students!J5:O, , match('Current Class'!$B$3, Students!J4:O4, 0))),C6:I15,"")))/len(C6:I15))
Probably you can use filter(as you did before) instead of index & match part, but I prefer index & match and don't want to dig more. Also you can use one help cell to store filter or index & match result to shorten the formula.
The core idea is from counting occurrences of given character in a string, ie len(a1) - len(substitute(a1, .... You can find many documents about it in the net.
Anyway, if I were you, I'd be satified with the current state. Just lock and hide the help tables or sheets. Nobody cares hidden sheets and if something bad happens, you can revert any change.
After getting a good answer from #Sangbok Lee, I decided to break apart each part of the function he gave to me. While doing that I found a highly unlikely connection to some work I did in the Google Sheets last week. A helper column I had in another tab kind of did what Sangbok Lee was trying to do. All I had to do was split that helper column into two columns (1 for the previous final calculation, 1 for) and calculate an additional count column
After reworking both of our formulas, and testing the result, I found a solution that I am even more satisfied with!
=arrayformula(if(countif(Meta!$B$5:$B, (Timetable_Code)), (Timetable_Code & " ( " & vlookup(Timetable_Code, StudentCount_Lookup, 2, false) & " )") ,""))
.
Check out the differences in the Google Sheet
Look at 1st Version - Current Class!L6 tab for the 1st version
Look at Current Class!L6 for the 2nd version
Look at Current Class!U6 for the 3rd and final version
Also look at tab Meta and Schedule for the differences.
Note: Green is old data, Red is new data
I am attempting to combine the if and mid functions in excel. If a value in a specific cell is true, then return the first six characters from another cell; otherwise return nothing. The following are the options I have tried:
=if(AA2=TRUE), MID(Y2,1,5), "")
=if(AA2=TRUE), MID(Y2,1,5), ""))
=if(mid(AA2=True), (Y2, 1, 5), "")
Could someone point out the errors in this syntax? I am new to programming and any help would be appreciated.
Thanks.
You close the parentheses too soon. Instead of
=if(AA2=TRUE), MID(Y2,1,5), "")
Try
=if((AA2=TRUE), MID(Y2,1,5), "")
i want to create the "cases" formula for excel to simulate Select case behavior (with multiple arguments and else optional).
If A1 and A2 are excel cells, this is the goal:
A1 Case: A2 Formula: A2 Result
5 cases({A1>5,"greather than 5"}, {A1<5, "less than 5"},{else,"equal to 5"}) equal to 5
Hi cases({A1="","there is nothing"},{else,A1}) Hi
1024 cases({5<A1<=10,10},{11<=A1<100,100},{A1>100,1000}) 1000
12 cases({A1=1 to 9, "digit"}, {A1=11|22|33|44|55|66|77|88|99, "11 multiple"}) (empty)
60 cases({A1=1 to 49|51 to 99,"not 50"}) not 50
If it could, It must accept excel formulas or vba code, to make an operation over the cell before take a case, i.g.
cases({len(A1)<7, "too short"},{else,"good length"})
If it could, it must accept to or more cells to evaluate, i.g.
if A2=A3=A4=A5=1 and A1=2, A6="one", A7="two"
cases(A1!=A2|A3|A4|A5, A6}, {else,A7}) will produce "two"
By the way, | means or, != means different
Any help?
I'm grateful.
What I could write was this:
Public Function arr(ParamArray args()) 'Your function, thanks
arr = args
End Function
Public Function cases(arg, arg2) 'I don't know how to do it better
With Application.WorksheetFunction
cases = .Choose(.Match(True, arg, 0), arg2)
End With
End Function
I call the function in this way
=cases(arr(A1>5, A1<5, A1=5),arr( "gt 5", "lt 5", "eq 5"))
And i can't get the goal, it just works for the first condition, A1>5.
I fixed it using a for, but i think it's not elegant like your suggestion:
Function selectCases(cases, actions)
For i = 1 To UBound(cases)
If cases(i) = True Then
selectCases = actions(i)
Exit Function
End If
Next
End Function
When i call the function:
=selectCases(arr(A1>5, A1<5, A1=5),arr( "gt 5", "lt 5", "eq 5"))
It works.
Thanks for all.
After work a little, finally i get a excel select case, closer what i want at first.
Function cases(ParamArray casesList())
'Check all arguments in list by pairs (case, action),
'case is 2n element
'action is 2n+1 element
'if 2n element is not a test or case, then it's like the "otherwise action"
For i = 0 To UBound(casesList) Step 2
'if case checks
If casesList(i) = True Then
'then take action
cases = casesList(i + 1)
Exit Function
ElseIf casesList(i) <> False Then
'when the element is not a case (a boolean value),
'then take the element.
'It works like else sentence
cases = casesList(i)
Exit Function
End If
Next
End Function
When A1=5 and I call:
=cases(A1>5, "gt 5",A1<5, "lt 5","eq 5")
It can be read in this way: When A1 greater than 5, then choose "gt 5", but when A1 less than 5, then choose "lt 5", otherwise choose "eq 5". After run it, It matches with "eq 5"
Thank you, it was exciting and truly educative!
O.K., there's no way at all to do exactly what you want. You can't use anything other than Excel syntax within a formula, so stuff like 'A1 = 1 to 9' is just impossible.
You could write a pretty elaborate VBA routine that took strings or something and parsed them, but that really amounts to designing and implementing a complete little language. And your "code" wouldn't play well with Excel. For example, if you called something like
=cases("{A1="""",""there is nothing""},{else,A1}")
(note the escaped quotes), Excel wouldn't update your A1 reference when it moved or the formula got copied. So let's discard the whole "syntax" option.
However, it turns out you can get much of the behavior I think you actually want with regular Excel formulas plus one tiny VBA UDF. First the UDF:
Public Function arr(ParamArray args())
arr = args
End Function
This lets us create an array from a set of arguments. Since the arguments can be expressions instead of just constants, we can call it from a formula like this:
=arr(A1=42, A1=99)
and get back an array of boolean values.
With that small UDF, you can now use regular formulas to "select cases". They would look like this:
=CHOOSE(MATCH(TRUE, arr(A1>5, A1<5, A1=5), 0), "gt 5", "lt 5", "eq 5")
What's going on is that 'arr' returns a boolean array, 'MATCH' finds the position of the first TRUE, and 'CHOOSE' returns the corresponding "case".
You can emulate an "else" clause by wrapping the whole thing in 'IFERROR':
=IFERROR(CHOOSE(MATCH(TRUE, arr(A1>5, A1<5), 0), "gt 5", "lt 5"), "eq 5")
If that is too verbose for you, you can always write another VBA UDF that would bring the MATCH, CHOOSE, etc. inside, and call it like this:
=cases(arr(A1>5, A1<5, A1=5), "gt 5", "lt 5", "eq 5")
That's not far off from your proposed syntax, and much, much simpler.
EDIT:
I see you've already come up with a (good) solution that is closer to what you really want, but I thought I'd add this anyway, since my statement above about bringing MATCH, CHOOSE, etc. inside the UDF made it look easier thatn it really is.
So, here is a 'cases' UDF:
Public Function cases(caseCondResults, ParamArray caseValues())
On Error GoTo EH
Dim resOfMatch
resOfMatch = Application.Match(True, caseCondResults, 0)
If IsError(resOfMatch) Then
cases = resOfMatch
Else
Call assign(cases, caseValues(LBound(caseValues) + resOfMatch - 1))
End If
Exit Function
EH:
cases = CVErr(xlValue)
End Function
It uses a little helper routine, 'assign':
Public Sub assign(ByRef lhs, rhs)
If IsObject(rhs) Then
Set lhs = rhs
Else
lhs = rhs
End If
End Sub
The 'assign' routine just makes it easier to deal with the fact that users can call UDFs with either values or range references. Since we want our 'cases' UDF to work like Excel's 'CHOOSE', we'd like to return back references when necessary.
Basically, within the new 'cases' UDF, we do the "choose" part ourselves by indexing into the param array of case values. I slapped an error handler on there so basic stuff like a mismatch between case condition results and case values will result in a return value of #VALUE!. You would probably add more checks in a real function, like making sure the condition results were booleans, etc.
I'm glad you reached an even better solution for yourself, though! This has been interesting.
MORE ABOUT 'assign':
In response to your comment, here is more about why that is part of my answer. VBA uses a different syntax for assigning an object to a variable than it does for assigning a plain value. Look at the VBA help or see this stackoverflow question and others like it: What does the keyword Set actually do in VBA?
This matters because, when you call a VBA function from an Excel formula, the parameters can be objects of type Range, in addition to numbers, strings, booleans, errors, and arrays. (See Can an Excel VBA UDF called from the worksheet ever be passed an instance of any Excel VBA object model class other than 'Range'?)
Range references are what you describe using Excel syntax like A1:Q42. When you pass one to an Excel UDF as a parameter, it shows up as a Range object. If you want to return a Range object from the UDF, you have to do it explicitly with the VBA 'Set' keyword. If you don't use 'Set', Excel will instead take the value contained within the Range and return that. Most of the time this doesn't matter, but sometimes you want the actual range, like when you've got a named formula that must evaluate to a range because it's used as the source for a validation list.